[Condor-users] Condor a viable/realistic option for MPI job submission in a cluster?


Some months back I tried to configure Condor as the queue management system for
our cluster (used only for MPI jobs), but ended up abandoning it and installing
OpenPBS instead.

Having since gained a lot of experience with our workstation pool, I'm thinking
of going back to Condor for our cluster, but I'm not sure whether it could be a
real contender for MPI work.

The two main things we needed, and that I couldn't figure out how to do with
Condor, were:

1. How to specify the distribution of processors for a job. When I tried it,
   Condor made that decision itself and I couldn't find a way to override it. I
   would like to be able to run a job on four CPUs, for example, either all on
   the same node or with just one CPU per node, etc.

2. How to have two queues, a fast one and a slow one. Following an example from
   the Bologna system paper, I doubled the number of CPUs per node in the
   configuration, so that half of the virtual CPUs could serve as one queue and
   the other half as the other. This seemed to work for vanilla jobs, but it
   didn't for MPI jobs.
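
To make point 1 concrete, the two layouts I have in mind can be sketched as
below. This is only an illustration in Python, not Condor syntax: the function
name place_ranks, the policy names "pack" and "spread", and the node names are
all made up for the example.

```python
def place_ranks(nodes, cpus_per_node, n_ranks, policy):
    """Return a list of (node, cpu_index) pairs for n_ranks MPI processes.

    policy="pack":   fill every CPU of one node before moving to the next.
    policy="spread": hand out one CPU per node in turn (round-robin).
    All names here are hypothetical; nothing in this sketch is a Condor API.
    """
    if n_ranks > len(nodes) * cpus_per_node:
        raise ValueError("not enough CPUs for the requested number of ranks")
    if policy == "pack":
        # Node-major order: node1 cpu0, node1 cpu1, ..., then node2 cpu0, ...
        slots = [(n, c) for n in nodes for c in range(cpus_per_node)]
    elif policy == "spread":
        # CPU-major order: cpu0 of every node first, then cpu1 of every node.
        slots = [(n, c) for c in range(cpus_per_node) for n in nodes]
    else:
        raise ValueError(f"unknown policy: {policy}")
    return slots[:n_ranks]


nodes = ["node1", "node2", "node3", "node4"]
packed = place_ranks(nodes, cpus_per_node=4, n_ranks=4, policy="pack")
spread = place_ranks(nodes, cpus_per_node=4, n_ranks=4, policy="spread")
```

With four 4-CPU nodes and a 4-process job, "pack" puts all four processes on
node1, while "spread" puts one process on each of the four nodes. What I am
after is a way to ask the queue system for one layout or the other.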

Does anybody have experience with these issues, or at least know whether this is

Thanks a lot,
Angel de Vicente

PostDoc Software Support
Instituto de Astrofisica de Canarias