
Re: [HTCondor-users] scheduler universe and dagman



On Wed, 9 Apr 2014, Walid Saad wrote:

Hi Keith, try to use the SPLICE command as follows:

# ALL-DAG.dag File.
SPLICE DAG1 DAG1.submit
SPLICE DAG2 DAG2.submit
SPLICE DAG3 DAG3.submit

Note that DAG1.submit, etc., need to be DAG files, not HTCondor submit files, for this to work. So it's probably better to call them something like DAG1.dag, etc., to avoid confusion.
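For example (file and splice names here are just placeholders), the top-level DAG could look like this once the sub-DAG files are named *.dag:

# ALL-DAG.dag -- top-level DAG that splices three sub-DAGs
SPLICE DAG1 DAG1.dag
SPLICE DAG2 DAG2.dag
SPLICE DAG3 DAG3.dag

You'd then submit just the one top-level DAG with something like: condor_submit_dag ALL-DAG.dag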

2014-04-08 13:06 GMT+01:00 Keith Brown <keith6014@xxxxxxxxx>:
      I have a set of 60 different DAGs (condor_dagman) and it seems
      it's hitting the scheduler heavily. Unfortunately, I can't
      consolidate them into one DAG since that would take a considerable
      amount of time.

Is there a DAGMan option to not schedule jobs so aggressively, or to poll
the scheduler every few seconds?

DAGMan doesn't actually poll the scheduler -- it looks at the relevant log file(s). You can set the rate at which it does this with the DAGMAN_USER_LOG_SCAN_INTERVAL configuration macro. The default is 5 (one scan every 5 seconds). You can set it to anything from 1 to MAX_INT.
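For example, to slow the scan down to once every 30 seconds (30 is just an example value), you could put something like this in the condor configuration (e.g., the local config file) on your submit machine:

# Have condor_dagman scan the node job log(s) every 30 seconds
# instead of the default 5.
DAGMAN_USER_LOG_SCAN_INTERVAL = 30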

If you want to reduce the number of jobs that DAGMan puts into the queue, you can do that with the maxjobs or maxidle throttles:
http://research.cs.wisc.edu/htcondor/manual/v8.1/3_3Configuration.html#23736
http://research.cs.wisc.edu/htcondor/manual/v8.1/3_3Configuration.html#23715
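As a rough sketch (the numbers are just for illustration), you can set these per-DAG on the condor_submit_dag command line:

# Submit a DAG, allowing at most 20 of its node jobs in the queue
# and at most 5 of them idle at any one time.
condor_submit_dag -maxjobs 20 -maxidle 5 DAG1.dag

The DAGMAN_MAX_JOBS_SUBMITTED and DAGMAN_MAX_JOBS_IDLE configuration macros set the same limits as defaults for all DAGs.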

You might also want to look at node category throttles:
http://research.cs.wisc.edu/htcondor/manual/v8.1/2_10DAGMan_Applications.html#SECTION003108400000000000000
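As a sketch (node and category names here are just placeholders), within a DAG file you assign nodes to a category and then throttle that category:

# Allow at most 2 of the "heavy" nodes to run at once
CATEGORY NodeA heavy
CATEGORY NodeB heavy
CATEGORY NodeC heavy
MAXJOBS heavy 2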

Is your issue just that you don't want so many jobs in your queue, or is it more complicated than that? Recent versions of HTCondor should be able
to handle at least 1000s of queued jobs without much problem.

Kent Wenger
CHTC Team