
Re: [Condor-users] graceful coexistence of vanilla and mpi jobs

Wolf-Dieter Klotz wrote:
Hi all,
I have a Condor pool of 41 machines, 5 of which I have declared as dedicated MPI resources. If the scheduler finds these dedicated machines idle and unclaimed, it runs vanilla jobs on them. When an MPI job is then submitted, the vanilla jobs are brutally killed, and they are restarted only once the MPI job has finished executing. Is there a way to make the MPI job wait for the vanilla jobs to complete?

If you have set up Condor as recommended for MPI jobs, you will have a line like this in your config file:

RANK = Scheduler =?= $(DedicatedScheduler)

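If you comment this out, the machines no longer prefer jobs from the dedicated scheduler over other jobs, so the vanilla jobs should not get preempted. However, the MPI jobs may then wait a long time to start, because they cannot run until the vanilla jobs on the machines they need have finished.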
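
As a rough sketch, the change in the config file of each dedicated machine might look like the following. It assumes the RANK line appears exactly as shown above and that no other RANK setting elsewhere in your configuration takes effect instead.

```
## Commented out so that jobs from the dedicated scheduler are no longer
## ranked above vanilla jobs already running on this machine:
# RANK = Scheduler =?= $(DedicatedScheduler)
```

After editing the file, running condor_reconfig on the affected machines should make the daemons re-read their configuration.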