
Re: [Condor-users] maxidle for a dag with one node?



On Thu, 22 Sep 2011, Rob de Graaf wrote:

> I have a job consisting of one cluster with several hundred thousand processes. The individual processes use $(Process) as an argument. I can't submit them all at once, so I made a DAG with one JOB node and tried to use condor_submit_dag's -maxidle throttling capability. According to the manual, each individual process counts as a job, so this matches what I want to do, but it doesn't seem to work; the entire cluster is submitted regardless of what I set -maxidle to. I've also tried -maxjobs just in case, but that does what it says and throttles whole clusters, not the processes within.

Some of the subtle differences between maxidle and maxjobs have been difficult to explain -- I'll take another shot at it...

First of all, keep in mind that DAGMan only controls things at the granularity of submit files. In other words, if DAGMan submits a submit file that has 'queue 10' in it, you get a cluster with 10 procs, and DAGMan doesn't try to do anything to the individual procs.
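For example, a node's submit file like this (file and program names are made up) produces a single 10-proc cluster that DAGMan submits, and later counts, as one unit:

```
# node.sub (hypothetical) -- DAGMan submits this file as one DAG node;
# the 10 procs it queues are invisible to DAGMan's throttling decisions
executable = myprog
arguments  = $(Process)
output     = out.$(Process)
queue 10
```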

One of the differences between maxidle and maxjobs, though, is how things are counted towards the specified total. If you have a submit file that has 'queue 10' and DAGMan submits it, and all of the procs are idle, that counts as 10 towards the maxidle limit. But if you have maxjobs set, the same cluster counts as only 1 towards that limit. Once DAGMan hits either limit, it simply stops submitting any more node jobs; it doesn't remove or hold specific jobs or procs that are already in the queue.

> Is there a way to throttle processes in a single-node DAG? I realize that I could split the cluster into many single-process clusters and use -maxjobs, but then I wouldn't be able to use $(Process) anymore. Ideally I'd like to avoid having to generate that many submit files.

Are you using $(process) for things like output file names? You can do something similar with arbitrary names in your submit file that have values assigned in the DAG file:
http://www.cs.wisc.edu/condor/manual/v7.7/2_10DAGMan_Applications.html#SECTION003106200000000000000
That would allow you to use the same submit file for many nodes in your DAG -- would that solve the problem for you?
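As a sketch (the file, node, and macro names here are made up), the DAG file assigns a per-node value with VARS, and the one shared submit file references it the way it previously referenced $(Process):

```
# big.dag -- one single-proc node per former proc, all sharing common.sub
JOB N0 common.sub
VARS N0 procid="0"
JOB N1 common.sub
VARS N1 procid="1"
```

```
# common.sub -- uses the DAG-supplied macro in place of $(Process)
executable = myprog
arguments  = $(procid)
output     = out.$(procid)
queue 1
```

With one proc per node, -maxidle and -maxjobs both throttle at the level you want.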

(Basically, if you want DAGMan's throttling to work right you have to have a small number of procs per submit file, ideally one per submit file...)
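Since hand-writing hundreds of thousands of JOB/VARS pairs isn't practical, a small script can generate the DAG file. This is a minimal sketch assuming the VARS approach above, with made-up node and macro names:

```python
# Sketch: emit a DAG with one single-proc node per "process", all sharing
# one submit file, so DAGMan's -maxidle/-maxjobs throttle per process.
# Node names (N0, N1, ...) and the "procid" macro are arbitrary choices.
def make_dag(n_procs, submit_file="common.sub"):
    lines = []
    for i in range(n_procs):
        lines.append("JOB N{0} {1}".format(i, submit_file))
        lines.append('VARS N{0} procid="{0}"'.format(i))
    return "\n".join(lines) + "\n"

# Usage: write the DAG file, then run condor_submit_dag -maxidle <n> big.dag
# with open("big.dag", "w") as f:
#     f.write(make_dag(300000))
```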

Kent Wenger
Condor Team