
Re: [Condor-users] Alternative to MAX_JOBS_RUNNING?

On Tue, 14 Mar 2006, Rob Pieké wrote:

This is primarily to deal with license issues, where we may have far
more machines than licenses and, as such, want to limit the number of
running jobs, since jobs that can't get licenses will not be doing useful
work.

The only solutions I know of so far are:

* to split up jobs based on the type of software they use and send them
through unique schedulers that each have MAX_JOBS_RUNNING based on the
license count for the particular type of job they represent.

I am using the following approach for this: I map the users who need
a license to a special accounting group and give that accounting
group a quota of only 24 jobs, which is the number of licenses
they hold.  The licenses are served by the lmgrd daemon and
are available to any of the 197 nodes in the cluster.

Steve Timm
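
For readers who want to try the accounting-group approach, here is a
minimal sketch of the settings involved, written as a small Python helper
that prints them. The group name "group_licensed" and the user name
"someuser" are made up for illustration; only the quota of 24 comes from
the message above. GROUP_NAMES and GROUP_QUOTA_<groupname> are negotiator
configuration settings and +AccountingGroup is a submit-file attribute.
Check the exact behaviour against the manual for your Condor version.

```python
def render_group_quota_config(group, quota):
    """Return config lines that define one accounting group and its quota.

    Note: this writes GROUP_NAMES with a single group. If the pool already
    defines other groups, the new name has to be added to the existing
    list instead (assumption: one licensed group only).
    """
    return "\n".join([
        f"GROUP_NAMES = {group}",
        f"GROUP_QUOTA_{group} = {quota}",
    ])


def render_submit_line(group, user):
    """Return the submit-file line that places a job in the group."""
    return f'+AccountingGroup = "{group}.{user}"'


# Hypothetical names; 24 is the license count mentioned above.
print(render_group_quota_config("group_licensed", 24))
print(render_submit_line("group_licensed", "someuser"))
```

The first two printed lines would go into the configuration read by the
negotiator, and the last one into the submit file of each job that needs
a license.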

* have some external piece of software monitoring the farm and
"intelligently" holding and releasing jobs
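
As a rough illustration of what such a monitor would have to decide, the
sketch below takes a snapshot of the queue and a license count per
category and works out which jobs to hold and which to release. The
function name, the tuple layout and the status strings are assumptions
made for this example, not anything Condor provides. The sketch only
plans the actions; the resulting job ids would still have to be passed
to condor_hold and condor_release by whatever runs it.

```python
from collections import defaultdict


def plan_hold_release(jobs, license_limits):
    """Decide which jobs to hold or release for a license limit.

    jobs: list of (job_id, category, status) tuples in queue order, where
          status is "running", "idle" or "held" (hypothetical layout).
    license_limits: dict mapping category -> number of licenses.
    Categories without an entry in license_limits are left alone.
    Returns (to_hold, to_release) as two lists of job ids.
    """
    # Running jobs already occupy licenses in their category.
    allowed = defaultdict(int)
    for _, category, status in jobs:
        if status == "running":
            allowed[category] += 1

    to_hold, to_release = [], []
    for job_id, category, status in jobs:
        if status == "running" or category not in license_limits:
            continue
        if allowed[category] < license_limits[category]:
            # A license is still free: this job may be matched.
            allowed[category] += 1
            if status == "held":
                to_release.append(job_id)
        elif status == "idle":
            # No license left: keep the job away from the matchmaker.
            to_hold.append(job_id)
    return to_hold, to_release


# Example snapshot with two hypothetical categories, X and Y.
queue = [
    ("1.0", "X", "running"),
    ("1.1", "X", "idle"),
    ("1.2", "X", "idle"),
    ("2.0", "Y", "held"),
    ("2.1", "Y", "held"),
]
print(plan_hold_release(queue, {"X": 2, "Y": 1}))
```

One caveat with this kind of planner: it cannot tell a job it held
itself from a job a user held for some other reason, so a real monitor
would need to track which holds are its own.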

The first way definitely works, but every time we get another piece of
software we have to set aside another scheduler. I'm pretty sure the
second way will work, but I'd really like to avoid having to do this.

Is there an easier way to tell the matchmaker to only consider the first
M jobs of category X and the first N jobs of category Y, etc?

Similarly, is there an "easy" way to run multiple schedulers on the same
machine other than repeatedly setting CONDOR_CONFIG to point to unique
config files and running condor_master?


Steven C. Timm, Ph.D  (630) 840-8525  timm@xxxxxxxx  http://home.fnal.gov/~timm/
Fermilab Computing Div/Core Support Services Dept./Scientific Computing Section
Assistant Group Leader, Farms and Clustered Systems Group
Lead of Computing Farms Team