
Re: [Condor-users] Maximum jobs on submit machine



Lukas, Micheal, Matthew, and Mark,

Thank you for your responses.  I will respond to all of you in a single email if possible.  

First, this is a Windows pool.  The problem I am having is a cap on the number of jobs that run concurrently from a submit machine.  All of the execute machines are capped at the number of available CPUs, and they are working fine.  Like most places, each machine is set up with anti-virus software, in this case Symantec.  The anti-virus utility is set up to handle the firewall, so Windows Firewall is disabled.  I have had to get IT to enable exceptions for all Condor processes.  I have been running the pool for about 8-9 months now, but only recently have I recruited enough CPUs for this problem to surface.

I have verified that MaxJobsRunning is not the limiter by setting it first to 30, which definitely capped the number of running jobs at 30, and then to 2000, in which case the number of running jobs simply floated up to the maximums I initially reported: about 85 on one machine and about 50 on the other.

Mark, if I were to temporarily disable Symantec, then this would test whether or not it's a firewall issue, correct?

Thank you all for your ideas.  Hopefully we can find a resolution here.

Eric


-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Lukas Slebodnik
Sent: Friday, January 13, 2012 8:03 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] Maximum jobs on submit machine

On Fri, Jan 13, 2012 at 10:49:32AM -0500, Matthew Farrellee wrote:
> On 01/13/2012 10:22 AM, Eric Abel wrote:
> >Fellow condor users,
> >
> >I am finding that there is a limit to the number of jobs that will run
> >on a given submit machine, and that number is different depending on the
> >machine. I have already verified that this limit is well below the
> >default MaxJobsRunning value. For example on one machine the maximum
> >seems to be about 85, and on another it’s about 50. Any ideas on this?
> >
> >Thanks,
> >
> >Eric
> 
> [MAX_JOBS_RUNNING]
> default=ceiling(ifThenElse( $(DETECTED_MEMORY)*0.8*1024/800 < 10000,
> $(DETECTED_MEMORY)*0.8*1024/800, 10000 ))
> 
> So the MaxJobsRunning is a function of RAM in the box. If you're on
> Windows it is more complicated. Generally, I recommend using a
> non-Windows machine for hosting the condor_schedd.

You can view the values for all schedd daemons by executing the command
condor_status -sched -f "%s " Name -f "%s\n" MaxJobsRunning

On Windows platforms, the number of running jobs is capped at 200.
A 64-bit version of Windows is recommended in order to raise the value above
the default.

Details:
http://research.cs.wisc.edu/condor/manual/v7.6/3_3Configuration.html#18253
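
For a rough sense of what the default formula quoted above works out to on a given box, here is a small Python sketch. It only mirrors the arithmetic of the quoted expression and is not Condor code. The function name and the assumption that DETECTED_MEMORY is given in megabytes are mine, and the Windows cap of 200 mentioned above is deliberately not modeled.

```python
def default_max_jobs_running(detected_memory_mb: int) -> int:
    """Mirror the quoted default:
    ceiling(ifThenElse(MEM*0.8*1024/800 < 10000, MEM*0.8*1024/800, 10000))

    Assumption: detected_memory_mb is the machine's RAM in megabytes.
    Integer arithmetic is used (0.8 written as 8/10) to avoid
    floating-point rounding surprises in the ceiling.
    """
    numerator = detected_memory_mb * 8 * 1024
    denominator = 10 * 800
    # Ceiling division on integers.
    estimate = -(-numerator // denominator)
    # The quoted expression never returns more than 10000.
    return min(estimate, 10000)


if __name__ == "__main__":
    for mem in (1000, 2048, 16384):
        print(mem, "MB ->", default_max_jobs_running(mem))
```

If the value this gives for a submit machine is well above the 85 or 50 actually observed, that would suggest the memory-based default is not the limiter either.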

Regards,
Lukas

> 
> Best,
> 
> 
> matt