Re: [HTCondor-users] Order in which dagman queues jobs



On Thu, 5 Dec 2013, Brian Candler wrote:

From experimentation, it seems that dagman queues up jobs in the order that they become ready. Is this true, and is there any way to change this?

Well, it's really the order they're defined in the DAG file.

Yes, there is a way to change it -- see below.

Let me explain what I'm doing. I have a DAG that has a number of independent job threads, each of which is a linear chain of nodes, i.e. something like this:

A1 -> B1 -> C1 -> D1
A2 -> B2 -> C2 -> D2
A3 -> B3 -> C3 -> D3
...
A1000 -> B1000 -> C1000 -> D1000

The 'A' jobs complete very quickly, each within a second or two; dagman can't submit them into the queue fast enough. The B and C jobs are relatively long-running and compute intensive, and the D jobs are quite short.

What I'm discovering from watching progress is:

- Even when some of the A jobs have completed (and therefore the related B jobs are ready to run), dagman continues to submit all the remaining A jobs before it starts to submit any B jobs. Therefore these compute-heavy jobs don't start to run as soon as they might.

- Things move more or less in lock step (i.e. there's a phase when A jobs are running, then B jobs, then C jobs, etc.).

- At the end, when the D jobs are running, because these are short the queue empties out and again dagman can't submit jobs fast enough.

Obviously one thing I need to do is to get dagman to push jobs into the queue faster, and I'm going to investigate some of the ideas at https://www-auth.cs.wisc.edu/lists/htcondor-users/2013-August/msg00002.shtml

However, in my case it would also be helpful if dagman queued up jobs in a different order - for example, when an 'A' job completes, queue up its corresponding 'B' job in preference to another 'A' job. This would mix the workload better through the lifetime of the jobs, and some of the completed results would also come out sooner.

If I'm correctly understanding what you want to do, this is pretty simple to accomplish with node priorities. What you need to do is make your B nodes higher priority than your A nodes, and probably make your C nodes higher than your B nodes, and your D nodes the highest of all.

So something like this in your DAG file:

Job A1 ...
Job A2 ...
...
Job B1 ...
Priority B1 10
Job B2 ...
Priority B2 10
...
Job C1 ...
Priority C1 20
...
Job D1 ...
Priority D1 30
...

should do what you want.
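
With 1000 chains, a DAG file like this is easier to generate than to write by hand. The following is only a sketch of one way to do that in Python. The function name make_dag, the per-stage submit file names (a.sub, b.sub, ...), and the Parent/Child lines are assumptions for illustration and do not come from the original DAG. Any per-node variables the submit files would need are left out.

```python
def make_dag(n_chains, stages=("A", "B", "C", "D"), step=10):
    """Return DAG-file text for n_chains linear chains.

    Nodes in later stages get higher priorities (0 is left implicit for
    the first stage, then step, 2*step, ...), mirroring the 10/20/30
    values in the example above.
    """
    lines = []
    for i in range(1, n_chains + 1):
        names = [f"{stage}{i}" for stage in stages]
        for depth, (stage, name) in enumerate(zip(stages, names)):
            # Hypothetical naming: one submit file per stage (a.sub, b.sub, ...).
            lines.append(f"Job {name} {stage.lower()}.sub")
            if depth > 0:
                # First-stage nodes keep the default priority.
                lines.append(f"Priority {name} {depth * step}")
        # Link each node to its successor in the same chain.
        for parent, child in zip(names, names[1:]):
            lines.append(f"Parent {parent} Child {child}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print a small two-chain DAG for inspection.
    print(make_dag(2), end="")
```

Redirecting the output to a file (for example, chains.dag, again a made-up name) gives something that can be passed to condor_submit_dag once the submit files exist.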

Or else just set DAGMAN_SUBMIT_DEPTH_FIRST to true in a per-DAG config file.
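
As a rough sketch of the per-DAG config approach, assuming a configuration file named my.dag.config (a hypothetical name) in the same directory as the DAG file, the DAG file would reference it with a Config line:

```
Config my.dag.config
```

and my.dag.config itself would contain the setting:

```
# my.dag.config (hypothetical file name)
DAGMAN_SUBMIT_DEPTH_FIRST = True
```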

Kent Wenger
CHTC Team