
Re: [HTCondor-users] Question about new foreach condor_submit syntax and dagman



Thanks a lot Kent.

The situation is clear now.

I will go for option 2. That is what I was doing in the past: generating a .dag file with VARS directives via a script with a loop. I had hoped the new condor_submit features would let me drop that little loop script, but no problem. This works fine.
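The "little loop script" mentioned above might be sketched as follows. It emits a DAGMan file with one JOB/VARS pair per input file, all sharing a single submit file. The file names (process.sub, input_*.dat, my200jobs.dag) are illustrative placeholders, not anything from the original thread.

```python
def make_dag(input_files, submit_file="process.sub"):
    """Return the text of a .dag file with one node per input file.

    Each node reuses the same submit file; the per-node VARS line
    passes that node's input file to the submit file as $(infile).
    """
    lines = []
    for i, infile in enumerate(input_files):
        node = "Job%d" % i
        lines.append("JOB %s %s" % (node, submit_file))
        lines.append('VARS %s infile="%s"' % (node, infile))
    return "\n".join(lines) + "\n"

# Build a tiny example DAG for two hypothetical input files.
example = make_dag(["input_000.dat", "input_001.dat"])
print(example)
```

Writing the result to my200jobs.dag and running it with `condor_submit_dag -maxjobs 10 my200jobs.dag` would give DAGMan one cluster per node to throttle.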

thanks!
Gonzalo

On 5 June 2015 at 14:21, R. Kent Wenger <wenger@xxxxxxxxxxx> wrote:
On Fri, 5 Jun 2015, Gonzalo Merino wrote:

I want to submit 200 jobs, each to process a different file. I also
want to throttle their execution so that I can set the maximum number
of running jobs at any given time. So far, pretty much a bread and
butter use case for every day data processing business.

Until now, I have been doing this with dagman, which provides the
maxjobs throttling functionality.

I learned in the htcondor week about the new features of
condor_submit, in particular the one that allows me to submit my 200
jobs with one submit file using the syntax like:

Queue <vars> from <filename>
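A submit file using this syntax might look like the sketch below (the executable, macro, and list-file names are illustrative); filelist.txt would contain one input file name per line, and condor_submit queues one proc per line:

```
# my200jobs.submit -- one cluster, one proc per line of filelist.txt
executable = process.sh
arguments  = $(infile)
queue infile from filelist.txt
```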

Nice. I would like to use the new condor_submit feature, but then
still using dagman to throttle the maxjobs. How should I do this? Now
I have one submit file that will generate 200 jobs, so I tried a dummy
dagman file like:

JOB Job1 my200jobs.submit

this way I see my 200 jobs get submitted, but dagman does not seem to
apply the maxjobs constraints to them.

What is the way to get dagman maxjobs throttling to work with the new
"queue from" syntax in condor_submit?

DAGMan can only throttle jobs to the granularity of a single submit file. So if your submit file queues 200 jobs, a maxjobs of, say, 10, won't do you any good.

Also, "maxjobs" should really be something like "maxclusters" -- a single cluster counts as one "item" for maxjobs.

There are basically two approaches you can take:

1) Use the foreach feature in condor_submit, and throttle your jobs
in other ways.

2) Keep using DAGMan, but don't use "foreach" in your submit files.

(We may eventually add the "foreach" capability to DAGMan, but that's somewhere down the road.)

If you go with #2, you can make things a little easier on yourself by
using the DAGMan VARS feature (http://research.cs.wisc.edu/htcondor/manual/v8.3/2_10DAGMan_Applications.html#SECTION003108200000000000000)
to re-use a single submit file for all of your nodes.
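As a sketch of approach #2, the DAG file lists one node per input file, each pointing at the same submit file and passing its own input via VARS (node names and file names below are illustrative):

```
# my200jobs.dag -- 200 nodes sharing one submit file
JOB Job1 process.sub
VARS Job1 infile="input_001.dat"
JOB Job2 process.sub
VARS Job2 infile="input_002.dat"
# ... and so on up to Job200

# process.sub -- the shared submit file, referencing the VARS macro
#   executable = process.sh
#   arguments  = $(infile)
#   queue
```

Since each node is now its own cluster, submitting with `condor_submit_dag -maxjobs 10 my200jobs.dag` throttles to 10 nodes at a time.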

If you go with #1, you may be able to throttle your jobs with
MAX_JOBS_RUNNING
(http://research.cs.wisc.edu/htcondor/manual/v8.3/3_3Configuration.html#param:MaxJobsRunning)
or Concurrency Limits
(http://research.cs.wisc.edu/htcondor/manual/v8.3/3_12Setting_Up.html#43259)
but both of these require admin permissions for your HTCondor installation.
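For reference, a concurrency limit is defined in the pool configuration and requested per job in the submit file; the limit name below is an illustrative placeholder:

```
# In the pool's condor_config (admin access required):
#   MYPROJECT_LIMIT = 10

# In the submit file; each queued job consumes one unit of the limit:
#   concurrency_limits = myproject
```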

Kent Wenger
CHTC Team

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/