
Re: [Condor-users] Scheduling N of M jobs



If you put these jobs into a DAG, then the DAG can throttle the number of concurrently running jobs.  The setup we use looks like:

jobset.dag:
JOB handle1 foo.ca
JOB handle2 bar.ca
...

config:
DAGMAN_MAX_JOBS_SUBMITTED=20
DAGMAN_MAX_SUBMITS_PER_INTERVAL=5
DAGMAN_USER_LOG_SCAN_INTERVAL=5

submit command:

$ condor_submit_dag -config config -debug 2 jobset.dag

You fill jobset.dag with the list of classads for your jobs.  If the jobs can easily be parameterized, you can specify the same classad template for every job, and then include a VARS line for each job with the parameters to use to fill in the classad.  The config file says to have at most 20 jobs submitted at once, to submit in blocks of 5 at a time, and to scan the logs every 5 seconds to track changes in state.  Since a job has to be submitted before it can run, capping the number of submitted jobs also caps the number of running jobs at the same value.  You'd have to read the docs to see if there is an option to throttle directly on the number of running jobs, rather than the number of submitted jobs.
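
For the parameterized case, here is a rough sketch of generating jobset.dag from a list of inputs rather than writing it by hand.  The template name "template.ca" and the macro name "infile" are made up for illustration (they are not from our actual setup); the template would refer to the value as $(infile).

```python
# Sketch: build the text of jobset.dag with one shared template and a
# VARS line per job. "template.ca" and "infile" are hypothetical names.

def make_dag(inputs, template="template.ca"):
    """Return DAG file text with one JOB and one VARS line per input."""
    lines = []
    for i, path in enumerate(inputs, start=1):
        handle = f"handle{i}"
        # Every node points at the same template file.
        lines.append(f"JOB {handle} {template}")
        # The template can pick this value up as $(infile).
        lines.append(f'VARS {handle} infile="{path}"')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the DAG text; redirect stdout to jobset.dag to use it.
    print(make_dag(["data/a.dat", "data/b.dat"]), end="")
```

Redirect the output to jobset.dag and submit it with the same condor_submit_dag command as above.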

HTH,

Ian

On 2/5/10 11:04 AM, Nathan Whitehorn wrote:
I am working with a set of jobs that are I/O bound, and need to submit several hundred of them to our cluster. Once 10 or 20 are running, the storage system they are using becomes saturated, so running more than that just wastes CPUs in the cluster. Is there a way, as a normal user, to tell the scheduler that, of this group of 400 jobs, I want 10 to be running at any given time?

This seems to be a fairly common problem, but all I have run into as a solution are scripts that periodically check the current queue state and either submit or release jobs if the running number has dropped below a threshold, which seems like an ugly hack.
-Nathan
-- 
Ian Stokes-Rees, PhD                       W: http://hkl.hms.harvard.edu
ijstokes@xxxxxxxxxxxxxxxxxxx               T: +1 617 432-5608 x75
NEBioGrid, Harvard Medical School          C: +1 617 331-5993
