
Re: [Condor-users] restricting the number of jobs



I think this is because of "maxidle" versus "maxjobs".

I saw a similar thing when I used a small number for "maxidle" like 3.

I had a dagman with 10 jobs and tried to limit "maxidle" to 3. It appears that dagman submits 5 jobs at a time, and a job isn't counted as "idle" until it has actually been submitted. So with "maxidle" set to 3 I still got 5 idle jobs rather than 3, but at least I didn't get all 10.

However, if I set "maxjobs" to 3, I get just 3 submitted at a time, since dagman can check maxjobs before submitting them.

At least I think that's how it works because that is what I interpreted from the documentation and what I saw happen on my 10 job test dagman script.
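
For anyone who wants to repeat the test, here is a minimal sketch of how a 10-node DAG like that could be generated. The node names, the submit file name "job.sub", and the macro names (infile, outfile, initdir) are made up for illustration; the submit file would have to reference them as $(infile) and so on.

```python
def build_dag(n_jobs, submit_file="job.sub"):
    """Return the text of a DAG file with one node per job.

    All names here (job0.., job.sub, infile/outfile/initdir) are
    hypothetical placeholders, not taken from a real setup.
    """
    lines = []
    for i in range(n_jobs):
        node = f"job{i}"
        # One JOB line per node, so each one counts separately for -maxjobs.
        lines.append(f"JOB {node} {submit_file}")
        # Per-node macros let a single submit file vary its file names.
        lines.append(
            f'VARS {node} infile="in_{i}.dat" outfile="out_{i}.dat" initdir="run_{i}"'
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_dag(10), end="")
```

Saving the output as test.dag and running "condor_submit_dag -maxjobs 3 test.dag" should reproduce the setup described above.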

However, the documentation appears to indicate that "maxjobs" only counts each "job" entry in the dagman script and does not count the individual jobs within a single submit script (i.e. "queue 500" would count as 1 "job"). It does indicate that the "maxidle" option will throttle a "single" dagman job that contains many individual jobs (i.e. queue xxx), since it counts them AFTER they are submitted and so sees each job in the queue separately. Kind of confusing, but that is why "maxidle" doesn't appear to hold to the exact number: dagman submits the jobs in "groups" of x (5 in my case, but I'm not sure exactly where that comes from yet), and "maxidle" doesn't take effect until after they are submitted, when it suppresses additional submissions.

It seems to work okay though, especially if you keep "maxidle" greater than the number of jobs dagman submits in a single group.
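
To make that concrete, here is a toy model of the behavior as I understand it. It is not DAGMan's actual code, only a sketch under these assumptions: dagman submits up to 5 jobs per submit cycle, checks "maxjobs" before every single submission, only sees the idle count as it was at the start of the cycle, and no job ever starts running.

```python
def simulate(ready_nodes, cycles, batch_size=5, maxjobs=0, maxidle=0):
    """Toy model: return the number of queued jobs after each submit cycle.

    A limit of 0 means "no limit". Assumption: submitted jobs stay idle
    forever, so the queue only grows.
    """
    in_queue = 0
    history = []
    for _ in range(cycles):
        # The idle count is a snapshot taken before this cycle's submissions.
        idle_seen = in_queue
        submitted = 0
        while ready_nodes > 0 and submitted < batch_size:
            if maxjobs and in_queue >= maxjobs:
                break  # maxjobs is known before each submission
            if maxidle and idle_seen >= maxidle:
                break  # maxidle only reacts to what was already queued
            ready_nodes -= 1
            in_queue += 1
            submitted += 1
        history.append(in_queue)
    return history


if __name__ == "__main__":
    print("maxidle=3:", simulate(10, 3, maxidle=3))
    print("maxjobs=3:", simulate(10, 3, maxjobs=3))
    print("no limit: ", simulate(10, 3))
```

In this model "maxidle" of 3 still lets the first group of 5 through and then stops further submissions, while "maxjobs" of 3 stops at exactly 3, which matches what I saw with my 10-job test.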

If anyone can explain how this works in more detail, I would love to hear about it to save some time on experimentation to figure it out. :-)

Thanks,

Kim

------------------------------------------------------------------------------
Kim Dillman
Research Programmer – Rosen Center for Advanced Computing
Purdue TeraGrid Campus Champion
YONG 956
Phone: 765-494-5446
Email: kadillma@xxxxxxxxxx


-----Original Message-----
From: Miroslav V.Shaltev [mailto:miroslav.shaltev@xxxxxxxxxx] 
Sent: Monday, October 12, 2009 10:08 AM
To: condor-users@xxxxxxxxxxx
Cc: Dillman, Kimberley A; atlas-users@xxxxxxxxxx
Subject: Re: [Condor-users] restricting the number of jobs

hi all,

from what i see the user's way to control the number of jobs in a single 
dagman is 

condor_submit_dag -maxjobs X some.dag

the -maxidle option is not working for me. i set it to 1 and still have X (in 
my case 3) idle jobs coming from the dag.

cheers,
miroslav

On Monday 12 October 2009, Dillman, Kimberley A wrote:
> I have a similar issue (not I/O intensive, but a reasonably large number of
> jobs).
>
> I set up a dagman job to control it since you can provide a command line
> option to condor_submit_dag that allows you to specify "maxidle" and
> "maxjobs". I think (from what I read), "maxjobs" only applies to each
> separate "job" inside the dagman and not the jobs inside the individual
> submit script (such as you have here) but I believe that the documentation
> says that "maxidle" does apply to each individual job submitted even though
> it is within a single submit script (such as you have).
>
> I haven't tried it though. I just created a dagman job that would have all
> of the jobs separated out in the dagman submit script because I wanted to
> vary the input/output file names and initial directory per job as well as
> control how many jobs were submitted at any one time.
>
> Maybe you can try that.
>
> Kim
>
> ------------------------------------------------------------------------------
> Kim Dillman
> Research Programmer – Rosen Center for Advanced Computing
> Purdue TeraGrid Campus Champion
> YONG 956
> Phone: 765-494-5446
> Email: kadillma@xxxxxxxxxx
>
> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx
> [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Mag Gam
> Sent: Saturday, October 10, 2009 2:06 PM
> To: Condor-Users Mail List
> Subject: Re: [Condor-users] restricting the number of jobs
>
> how can I let the user control it?
>
> We have some users whose jobs are very I/O intensive. They want to run
> only 10 at a time.
>
> On Sat, Oct 10, 2009 at 2:55 PM, dawnsong <dawnsong.tsinghua@xxxxxxxxx> wrote:
> > set
> > MAX_JOBS_RUNNING = 10
> > in Condor global configuration file.
> >
> > 2009/10/10 Mag Gam <magawake@xxxxxxxxx>
> >
> >> Is it possible to restrict the number of jobs to run?
> >>
> >> For example?
> >>
> >> I have something like this:
> >>
> >> Universe       = vanilla
> >> Executable     = hello_world.sh
> >>
> >> input   = /dev/null
> >> output  = hello.out
> >> error   = hello.error
> >>
> >> Queue 5000
> >>
> >>
> >> What if I want only 10 to run at a time?
> >>
> >> Is that possible to do?
> >> _______________________________________________
> >> Condor-users mailing list
> >> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with
> >> a subject: Unsubscribe
> >> You can also unsubscribe by visiting
> >> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> >>
> >> The archives can be found at:
> >> https://lists.cs.wisc.edu/archive/condor-users/


-- 
SHALTEV.ORG @ http://www.shaltev.de