
Re: [Condor-users] Maximum jobs on submit machine



Hi all,

A much-delayed follow-up here: I tried increasing the HEAP size to 8 MB, but the maximum number of simultaneous jobs (for this particular machine) is still 105-110.  I checked the Microsoft website, and apparently about 40 MB is the upper limit for this value.  I'm not sure what setting it that high would do, but it doesn't matter, because increasing it to 8 MB did nothing to help my problem.  Any other ideas?  It would be nice to reach that 300 machine limit!

Thanks,

Eric

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Gore, Brooklin
Sent: Thursday, January 19, 2012 7:40 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] Maximum jobs on submit machine

Try some Google searches. I didn't get anything in the first hit, but I did
find this additional tip:

http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r1/index.jsp?topic=/com.ibm.swg.im.iis.productization.iisinfsv.install.doc/topics/wsisinst_config_winreg.html

I suggest trying 2048, 4096, etc. to see if this helps get you more jobs.

~B

On 1/18/12 5:12 PM, "Eric Abel" <Eric.Abel@xxxxxxxxxx> wrote:

>Thanks for the tip.  I changed the SharedSection value from 512 to 1280
>following the instructions on the link you provided, and now the number
>of jobs seems to peak at about 110.  However, I am not able to go much
>higher...is there a maximum to the value SharedSection can have?
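
For anyone following along, here is a minimal Python sketch of reading and changing the SharedSection field. It assumes the usual layout of three comma-separated sizes in KB, with the third being the heap for non-interactive desktops (the value changed from 512 to 1280 above). The sample string and both helper names are hypothetical, and nothing here touches the registry.

```python
import re

# Matches the three comma-separated sizes (in KB) of the SharedSection field.
_SHARED_SECTION = re.compile(r"SharedSection=(\d+),(\d+),(\d+)")


def parse_shared_section(value):
    """Return the three SharedSection sizes in KB as a tuple of ints."""
    match = _SHARED_SECTION.search(value)
    if match is None:
        raise ValueError("no three-field SharedSection entry found")
    return tuple(int(size) for size in match.groups())


def set_noninteractive_heap(value, size_kb):
    """Return a copy of value with the third SharedSection size replaced."""
    if _SHARED_SECTION.search(value) is None:
        raise ValueError("no three-field SharedSection entry found")
    return _SHARED_SECTION.sub(
        lambda m: f"SharedSection={m.group(1)},{m.group(2)},{size_kb}",
        value,
        count=1,
    )


# Hypothetical, shortened sample of the registry string (assumption, not
# copied from a real machine).
sample = "ObjectDirectory=\\Windows SharedSection=1024,3072,512 Windows=On"
updated = set_noninteractive_heap(sample, 1280)
print(parse_shared_section(updated))
```

Only the third field changes; the first two sizes are carried over untouched.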
>
>Eric
>
>-----Original Message-----
>From: condor-users-bounces@xxxxxxxxxxx
>[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Gore, Brooklin
>Sent: Friday, January 13, 2012 11:49 AM
>To: Condor-Users Mail List
>Subject: Re: [Condor-users] Maximum jobs on submit machine
>
>Eric,
>
>While your maximum jobs running (50-85) is a bit lower than the 120
>usually associated with the Windows HEAP size issue, it could be related.
>
>Check the last article here:
>http://research.cs.wisc.edu/condor/manual/v6.8/7_4Condor_on.html
>
>A silly question: There are more than 50-85 machines available to actually
>run these jobs, right?
>
>Best, ~Brooklin
>
>On 1/13/12 10:17 AM, "Eric Abel" <Eric.Abel@xxxxxxxxxx> wrote:
>
>>Lukas, Micheal, Matthew, and Mark,
>>
>>Thank you for your responses.  I will respond to all of you in a single
>>email if possible.
>>
>>First, this is a Windows pool.  The problem I am having is a maximum
>>number of jobs running concurrently on a submit machine.  All of the
>>execute machines are capped at the number of available CPUs, and they
>>are working fine.  Like most places, each machine is set up with
>>anti-virus software, in this case Symantec.  The anti-virus utility is
>>set up to handle the firewall, so Windows Firewall is disabled.  I have
>>had to get IT to enable exceptions for all Condor processes.  I have been
>>running the pool for about 8-9 months now, but only recently have I
>>recruited enough CPUs for this problem to surface.
>>
>>I have validated that the MaxJobsRunning value is not the limiter by
>>setting its value first to 30, which definitely capped the number of
>>running jobs at 30, then setting it to 2000, in which case the number of
>>jobs simply floated up to its maximum, which was the 85 and 50 that I
>>initially reported.
>>
>>Mark, if I were to temporarily disable Symantec, then this would test
>>whether or not it's a firewall issue, correct?
>>
>>Thank you all for your ideas.  Hopefully we can find a resolution here.
>>
>>Eric
>>
>>
>>-----Original Message-----
>>From: condor-users-bounces@xxxxxxxxxxx
>>[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Lukas Slebodnik
>>Sent: Friday, January 13, 2012 8:03 AM
>>To: Condor-Users Mail List
>>Subject: Re: [Condor-users] Maximum jobs on submit machine
>>
>>On Fri, Jan 13, 2012 at 10:49:32AM -0500, Matthew Farrellee wrote:
>>> On 01/13/2012 10:22 AM, Eric Abel wrote:
>>> >Fellow condor users,
>>> >
>>> >I am finding that there is a limit to the number of jobs that will run
>>> >on a given submit machine, and that number is different depending on the
>>> >machine. I have already verified that this limit is well below the
>>> >default MaxJobsRunning value. For example on one machine the maximum
>>> >seems to be about 85, and on another it's about 50. Any ideas on this?
>>> >
>>> >Thanks,
>>> >
>>> >Eric
>>> 
>>> [MAX_JOBS_RUNNING]
>>> default=ceiling(ifThenElse( $(DETECTED_MEMORY)*0.8*1024/800 < 10000,
>>> $(DETECTED_MEMORY)*0.8*1024/800, 10000 ))
>>> 
>>> So the MaxJobsRunning is a function of RAM in the box. If you're on
>>> Windows it is more complicated. Generally, I recommend using a
>>> non-Windows machine for hosting the condor_schedd.
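
To make the quoted default concrete, here is a small Python sketch that evaluates the same expression by hand. It assumes DETECTED_MEMORY is given in megabytes; the function name is hypothetical, and the optional cap parameter is only there to model the Windows limit of 200 mentioned elsewhere in this thread, not something the expression itself contains.

```python
import math


def default_max_jobs_running(detected_memory_mb, windows_cap=None):
    """Evaluate the quoted MAX_JOBS_RUNNING default for a given RAM size."""
    # Same arithmetic as the quoted expression: 80% of RAM, 800 KB per job.
    estimate = detected_memory_mb * 0.8 * 1024 / 800
    # ifThenElse(estimate < 10000, estimate, 10000), then ceiling().
    limit = math.ceil(estimate if estimate < 10000 else 10000)
    # Optional platform cap (assumption: passed in by the caller).
    if windows_cap is not None:
        limit = min(limit, windows_cap)
    return limit


for memory_mb in (2048, 4096, 16384):
    print(
        memory_mb,
        default_max_jobs_running(memory_mb),
        default_max_jobs_running(memory_mb, windows_cap=200),
    )
```

With or without the cap, these values sit above the 50 and 85 running jobs reported here, which fits the observation that MaxJobsRunning is not the limiting factor.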
>>
>>You can view the values for all schedd daemons by executing the command
>>condor_status -sched -f "%s " Name -f "%s\n" MaxJobsRunning
>>
>>On Windows platforms, the number of running jobs is capped at 200.
>>A 64-bit version of Windows is recommended in order to raise the value
>>above the default.
>>
>>Details:
>>http://research.cs.wisc.edu/condor/manual/v7.6/3_3Configuration.html#18253
>>
>>Regards,
>>Lukas
>>
>>> 
>>> Best,
>>> 
>>> 
>>> matt
>>_______________________________________________
>>Condor-users mailing list
>>To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
>>subject: Unsubscribe
>>You can also unsubscribe by visiting
>>https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>>
>>The archives can be found at:
>>https://lists.cs.wisc.edu/archive/condor-users/
>
