
Re: [Condor-users] Only allow 40 of my runs to run at once



On 04/13/2011 05:14 AM, Rob Stevenson wrote:
Hi all,
One of our Condor users has a particular type of job which puts a lot of
pressure on a database server. We've worked out that any more than 40
runs at once and the db server will fall over and all jobs fail completely.
So far I've got a partial solution of adding a ClassAd attribute to 40
machines' config files, e.g.:
ONLY40 = True
STARTD_EXPRS = $(STARTD_EXPRS) , ONLY40
And the user adds a requirement (ONLY40 =?= True) to their submit file.
This works great while the same 40 CPUs remain in the pool and there's
not much else going on. But if many other users' jobs happen to be
running on these 40 CPUs or I have to remove some of these from the
pool, then these jobs will not be able to run on anywhere near 40 CPUs
at once.
Of course, the ideal solution is to look at the code and optimise the
SQL queries. I'm sure there's scope for this. But for now:
Does anyone have any ideas about how I might be able to modify the limit
so we're limiting to ANY 40 CPUs rather than a SPECIFIC set of 40?
Many thanks!
Rob

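Concurrency limits are meant for this: they cap how many jobs of a given
kind run at once, on any machines in the pool -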

http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_MRG/1.3/html-single/Grid_User_Guide/index.html#chap-Grid_User_Guide-Concurrency_Limits
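
A minimal sketch of how that could look for the case above. The limit
name DBSERVER is only an example chosen here, not something from the
original setup; any name works as long as both sides agree.

In the configuration read by the negotiator on the central manager:

    # At most 40 units of the DBSERVER limit may be in use pool-wide.
    # "DBSERVER" is an illustrative name.
    DBSERVER_LIMIT = 40

In the user's submit file:

    # Each job of this type consumes one unit of DBSERVER while it runs.
    concurrency_limits = DBSERVER

After changing the configuration, run condor_reconfig on the central
manager so the negotiator picks it up. With this in place the ONLY40
attribute and the matching requirement should no longer be needed, since
the limit is counted across the pool rather than tied to specific
machines.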

Best,


matt