Re: [Condor-users] maxidle for a dag with one node?
- Date: Thu, 22 Sep 2011 15:18:44 -0500 (CDT)
- From: "R. Kent Wenger" <wenger@xxxxxxxxxxx>
- Subject: Re: [Condor-users] maxidle for a dag with one node?
On Thu, 22 Sep 2011, Rob de Graaf wrote:
I have a job consisting of one cluster with several hundred thousand
processes. The individual processes use $(Process) as an argument. I can't
submit them all at once, so I made a DAG with one JOB node and tried to use
condor_submit_dag's -maxidle throttling capability. According to the manual,
each individual process counts as a job, so this matches what I want to do,
but it doesn't seem to work; the entire cluster is submitted regardless of
what I set -maxidle to. I've also tried -maxjobs just in case, but that does
it says and throttles whole clusters, not the processes within.
Some of the subtle differences between maxidle and maxjobs have been
difficult to explain -- I'll take another shot at it...
First of all, keep in mind that DAGMan only controls things at the submit
file level of granularity. In other words, if DAGMan submits a submit
file that has 'queue 10' in it, you get a cluster with 10 procs, and
DAGMan doesn't try to do anything to the individual procs.
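As a sketch (the file and script names here are hypothetical), a submit file
like this produces one cluster of 10 procs every time DAGMan submits it:

```
# work.submit -- hypothetical example
executable = work.sh
arguments  = $(Process)
output     = work.$(Process).out
error      = work.$(Process).err
log        = work.log
queue 10
```

A DAG node pointing at this file (JOB A work.submit) results in a single
condor_submit call; DAGMan does not manage the 10 procs individually.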
One of the differences between maxidle and maxjobs, though, is how things
are counted towards the specified total. If you have a submit file that
has 'queue 10' and DAGMan submits it, and all of the procs are idle, that
counts as 10 towards maxidle. But with maxjobs set, that whole cluster counts
as only 1 towards the limit. Once a limit is hit, DAGMan simply stops
submitting more node jobs; it doesn't remove or hold jobs or procs that are
already in the queue.
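For instance (all numbers and names hypothetical), suppose a DAG's nodes each
use a submit file containing 'queue 1000':

```
# big.dag -- hypothetical two-node DAG; both nodes use big.submit,
# which contains 'queue 1000'
#   JOB A big.submit
#   JOB B big.submit
#
# condor_submit_dag -maxidle 500 big.dag
#   A is submitted; while its 1000 procs sit idle they count as 1000
#   toward maxidle, so B is held back -- but A's cluster itself is
#   left alone, not trimmed down to 500 idle procs.
#
# condor_submit_dag -maxjobs 1 big.dag
#   A counts as just 1 toward maxjobs; B waits until node A is done.
```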
Is there a way to throttle processes in a single-node DAG? I realize that I
could split the cluster into many single-process clusters and use -maxjobs,
but then I wouldn't be able to use $(Process) anymore. Ideally I'd like to
avoid having to generate that many submit files.
Are you using $(Process) for things like output file names? You can do
something similar with arbitrary names in your submit file that have
values assigned in the DAG file:
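For example (node and macro names hypothetical), using DAGMan's VARS syntax:

```
# my.dag -- each node reuses the same submit file with its own value
JOB  node0 work.submit
VARS node0 runid="0"
JOB  node1 work.submit
VARS node1 runid="1"

# work.submit then uses $(runid) wherever it used $(Process), e.g.:
#   arguments = $(runid)
#   output    = work.$(runid).out
#   queue 1
```

Each node then gets its own one-proc cluster, and the many JOB/VARS pairs can
be generated by a small script, so you avoid writing many submit files.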
That would allow you to use the same submit file for many nodes in your
DAG -- would that solve the problem for you?
(Basically, if you want DAGMan's throttling to work right you have to have
a small number of procs per submit file, ideally one per submit file...)