Re: [Condor-users] slow scheduling of dagman jobs
- Date: Wed, 7 Sep 2011 14:51:28 -0500 (CDT)
- From: "R. Kent Wenger" <wenger@xxxxxxxxxxx>
- Subject: Re: [Condor-users] slow scheduling of dagman jobs
On Wed, 7 Sep 2011, Patty Bragger wrote:
> I'm running into a performance issue of sorts with submitting dagman jobs.
> When submitting a dagman job of, say, 100 nodes, I find that it takes quite a
> while for all 100 nodes to show up in the queue. After an initial wait of
> about 12 seconds, the nodes are added to the queue at a rate of about 7 per
> second. The nodes have no dependencies on each other; they are completely
> standalone and could be submitted without using dag. When I do submit jobs
> without using dag, the jobs are added to the queue much faster, about
> 100/second. I can get that submission rate whether submitting one job with
> a "queue 100" or submitting 100 separate jobs in one submit file.
Well, keep in mind that DAGMan is doing a separate condor_submit for each
node. When I do that (outside of DAGMan) it's much slower than doing a
single condor_submit that queues 100 jobs.
So I think you're basically seeing the overhead of a condor_submit call
for every job versus a single condor_submit call.
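As a rough back-of-the-envelope check, here is a small Python sketch that uses only the figures quoted above (an initial wait of about 12 seconds, about 7 nodes per second through DAGMan, about 100 jobs per second with a direct submit). The function name and the simple "initial wait plus steady rate" model are assumptions made for illustration; they are not a description of how DAGMan actually paces its submits.

```python
def seconds_to_queue(num_jobs, jobs_per_second, initial_wait=0.0):
    """Estimate time to queue num_jobs at a steady rate after an initial wait.

    This is a simplified linear model (an assumption), not DAGMan's real logic.
    """
    return initial_wait + num_jobs / jobs_per_second


# Figures taken from the report above: 100 nodes, ~12 s wait, ~7 nodes/s.
dagman_estimate = seconds_to_queue(100, 7, initial_wait=12)

# Direct condor_submit figure from the report: ~100 jobs/s, no initial wait.
direct_estimate = seconds_to_queue(100, 100)

print(f"via DAGMan:    about {dagman_estimate:.1f} s")
print(f"direct submit: about {direct_estimate:.1f} s")
```

Under that model, 100 nodes take roughly 26 seconds through DAGMan versus about 1 second with a single condor_submit, which is consistent with a fixed per-call cost dominating.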
Also note that (at least with recent versions of DAGMan) you can queue
multiple jobs in a single node's submit file (as long as they are all part
of the same cluster). I'm pretty sure (but not 100% sure) that this feature
was in 7.4.4. Of course, depending on exactly how you are using DAGMan, this
may not be a good idea, but the option is there if one of your main goals
is to get jobs into the queue as fast as possible.
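To make the multiple-jobs-per-node option concrete, here is a minimal Python sketch that generates a submit description with a single "queue 100" statement and a one-node DAG file that refers to it. The file names (all_jobs.sub), the node name (AllJobs), the executable (my_job.sh), and the output/error/log names are all hypothetical; only the "queue N" statement, the $(Process) macro, and the "JOB name submit-file" DAG line are standard syntax.

```python
def make_submit_description(executable, num_jobs):
    """Build a submit description that queues num_jobs procs in one cluster."""
    lines = [
        "universe = vanilla",
        f"executable = {executable}",        # hypothetical executable name
        "arguments = $(Process)",            # each job gets its proc number
        "output = job.$(Process).out",       # hypothetical file names
        "error = job.$(Process).err",
        "log = jobs.log",
        f"queue {num_jobs}",                 # one cluster, num_jobs procs
    ]
    return "\n".join(lines) + "\n"


def make_dag(node_name, submit_file):
    """Build a DAG file with a single node pointing at the submit file."""
    return f"JOB {node_name} {submit_file}\n"


submit_text = make_submit_description("my_job.sh", 100)
dag_text = make_dag("AllJobs", "all_jobs.sub")

print(submit_text)
print(dag_text)
```

With this layout DAGMan makes one condor_submit call for the node instead of 100. The trade-off, as far as I can tell, is that the whole cluster is then a single DAG node, so anything you want DAGMan to handle per job would have to be done some other way.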