Re: [HTCondor-users] Question about new foreach condor_submit syntax and dagman
- Date: Fri, 5 Jun 2015 14:21:02 -0500 (CDT)
- From: "R. Kent Wenger" <wenger@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Question about new foreach condor_submit syntax and dagman
On Fri, 5 Jun 2015, Gonzalo Merino wrote:
> I want to submit 200 jobs, each to process a different file. I also
> want to throttle their execution so that I can set the maximum number
> of running jobs at any given time. So far, pretty much a bread-and-butter
> use case for everyday data processing.
>
> Until now, I have been doing this with DAGMan, which provides the
> maxjobs throttling functionality.
>
> I learned at HTCondor Week about the new features of condor_submit,
> in particular the one that allows me to submit my 200 jobs with one
> submit file, using syntax like:
>
>     Queue <vars> from <filename>
>
> Nice. I would like to use the new condor_submit feature, but still
> use DAGMan to throttle with maxjobs. How should I do this? Now I have
> one submit file that will generate 200 jobs, so I tried a dummy DAG
> file like:
>
>     JOB Job1 my200jobs.submit
>
> This way I see my 200 jobs get submitted, but DAGMan does not seem to
> apply the maxjobs constraint to them.
>
> What is the way to get DAGMan maxjobs throttling to work with the new
> "queue ... from" syntax in condor_submit?
DAGMan can only throttle jobs to the granularity of a single submit file.
So if your submit file queues 200 jobs, a maxjobs of, say, 10, won't do
you any good.
Also, "maxjobs" should really be something like "maxclusters" -- a single
cluster counts as one "item" for maxjobs.
There are basically two approaches you can take:
1) Use the foreach feature in condor_submit, and throttle your jobs
in other ways.
2) Keep using DAGMan, but don't use "foreach" in your submit files.
(We may eventually add the "foreach" capability to DAGMan, but that's
somewhere down the road.)
If you go with #2, you can make things a little easier on yourself by
using the DAGMan VARS feature
to re-use a single submit file for all of your nodes.
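As a sketch of approach #2 (file and node names here are hypothetical): give each input file its own node in the DAG, all pointing at one shared submit file, and pass the per-node file name with VARS:

```
# my.dag -- one node per input file, all sharing onejob.submit
JOB  Job001 onejob.submit
VARS Job001 myfile="input001.dat"
JOB  Job002 onejob.submit
VARS Job002 myfile="input002.dat"
# ... one JOB/VARS pair per input file, up to Job200
```

In onejob.submit you then refer to $(myfile) (e.g., arguments = $(myfile)) and end with a plain "queue". Since each node is now a separate cluster, submitting with condor_submit_dag -maxjobs 10 my.dag throttles you to 10 running nodes at a time.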
If you go with #1, you may be able to throttle your jobs with Concurrency Limits, but this requires admin permissions for your HTCondor pool.
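A sketch of the Concurrency Limits route, assuming your pool admin is willing to add a limit (the limit name "MERINO_JOBS" is hypothetical):

```
# In the pool configuration (set by the admin):
MERINO_JOBS_LIMIT = 10

# In your submit file, each job claims one unit of the limit:
concurrency_limits = MERINO_JOBS
```

With this in place, the negotiator will never let more than 10 jobs claiming MERINO_JOBS run at once, regardless of how they were queued.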