
Re: [Condor-users] Limiting number of concurrently running jobs from a cluster.



On Tue, Aug 26, 2008 at 12:56 PM, Johan Bengtsson <teofrastius@xxxxxxxxx> wrote:
> Hi,
> I am trying to configure Condor to limit the number of simultaneously
> running jobs from an individual cluster. For instance I have a cluster
> with 1000 jobs but as they all share a centralized resource there is no
> point in running more than 50 jobs concurrently.
>
> Does anyone know how to do this (or if it is even possible)?

Short answer: use DAGMan. I believe configuration that allows just what
you describe was added recently.

Long answer: submit everything on hold and have a management job which
maintains the "Idle + Running <= 50" invariant. (We do something
similar, but we are flexible on the exact number, so we simply release n
jobs whenever idle + running < threshold.)
To avoid polling the queue you could poll the collector, or use
condor_wait on the jobs' log files.
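
A rough sketch of such a management job, in Python, might look like the
following. It is only an illustration under some assumptions: the jobs were
submitted with "hold = True" in the submit file, the cluster id is known, and
polling condor_q is acceptable. The function names (procs_to_release,
query_cluster, release_once) are made up for this example; the JobStatus
codes (1 = Idle, 2 = Running, 5 = Held) are the standard HTCondor ones.

```python
import subprocess

# HTCondor JobStatus codes.
IDLE, RUNNING, HELD = 1, 2, 5


def procs_to_release(jobs, limit=50):
    """Pick held procs to release so that idle + running stays <= limit.

    jobs is a list of (proc_id, job_status) pairs for one cluster.
    """
    active = sum(1 for _, status in jobs if status in (IDLE, RUNNING))
    held = sorted(proc for proc, status in jobs if status == HELD)
    room = max(0, limit - active)
    return held[:room]


def query_cluster(cluster):
    """Ask condor_q for 'ProcId JobStatus' pairs (hypothetical helper)."""
    out = subprocess.run(
        ["condor_q", str(cluster),
         "-format", "%d ", "ProcId",
         "-format", "%d\n", "JobStatus"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [tuple(int(x) for x in line.split())
            for line in out.splitlines() if line.strip()]


def release_once(cluster, limit=50):
    """One pass of the management loop: top the cluster up to the limit."""
    for proc in procs_to_release(query_cluster(cluster), limit):
        subprocess.run(["condor_release", f"{cluster}.{proc}"], check=True)
```

Calling release_once(cluster) from cron or a sleep loop would keep the cluster
topped up. Note that this sketch cannot tell jobs held on purpose from jobs
that went on hold because of an error, so a real version would probably also
want to check HoldReason before releasing.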

If you can, you might want to consider getting the DAGMan approach to work,
though, since that is officially supported and has features that do what you
want directly.
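
For what it is worth, here is a small Python sketch that writes a DAG file
with one node per job and a throttle on how many may run at once. I am
assuming the feature meant above is DAGMan's CATEGORY / MAXJOBS throttling
(or, alternatively, the -maxjobs option of condor_submit_dag); the submit
file name "job.sub", the category name "shared" and the "index" macro are
placeholders invented for this example.

```python
def make_dag(n_jobs, submit_file="job.sub", category="shared", limit=50):
    """Return DAG file text: one node per job, all in one throttled category."""
    lines = []
    for i in range(n_jobs):
        node = f"job{i}"
        # Every node reuses the same submit file ...
        lines.append(f"JOB {node} {submit_file}")
        # ... and gets its own value for $(index) inside that file.
        lines.append(f'VARS {node} index="{i}"')
        lines.append(f"CATEGORY {node} {category}")
    # At most `limit` nodes of this category are submitted at any time.
    lines.append(f"MAXJOBS {category} {limit}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    with open("throttled.dag", "w") as f:
        f.write(make_dag(1000, limit=50))
```

The resulting file would then be submitted with condor_submit_dag. Each node
becomes its own cluster in this layout, which differs from a single
1000-job cluster but gives DAGMan something it can throttle.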