
Re: [Condor-users] Better control over negotiator?



I'll answer this one 'cause it's what we do and I know the answer.

> Is it possible, in an almost idle pool, to claim slots "in
> alphabetical order"?

So we fill our machines breadth first ("width first"). If the pool is
empty, jobs start on all the slot1@<machine> slots and only then start
filling up the slot2@<machine> slots. That way, under light load, each
job gets to run on as free a machine as possible. We do this with:

##  The NEGOTIATOR_POST_JOB_RANK expression chooses between
##  resources that are equally preferred by the job.
##  The following example expression steers jobs toward
##  faster machines and tends to fill a cluster of multi-processors
##  breadth-first instead of depth-first.  In this example,
##  the expression is chosen to have no effect when preemption
##  would take place, allowing control to pass on to
##  PREEMPTION_RANK.
#ALTERA_NEGOTIATOR_POST_JOB_RANK = (RemoteOwner =?= UNDEFINED) * (KFlops - VirtualMachineID)
##
##  Break ties by looking for machines that have Idle longer than others
##  and use them first. Also try and use faster machines before slower
##  machines and assign jobs to separate machines before we start putting
##  two jobs on a machine.
ALTERA_NEGOTIATOR_POST_JOB_RANK = (((State =?= "Owner") * (Activity =?= "Idle")) * 1000000000) + \
    ((State =?= "Unclaimed") * 100000000) + \
    (KFlops * 0.001) - (VirtualMachineID * 10)

The second expression (the uncommented one) is the one that's actually
used. I just included both for general information.
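
To make the breadth-first effect concrete, here is a rough Python sketch of
just the tie-breaking arithmetic, (KFlops * 0.001) - (VirtualMachineID * 10),
for slots that are otherwise equal. It is a toy model, not how the negotiator
is implemented: the machine names, the KFlops values and the helper functions
are all made up for illustration. Note that the slot penalty of 10 only wins
when machines differ by less than 10000 KFlops; a much faster machine would
have its second slot picked before a slower machine's first.

```python
# Toy model of the tie-breaking term (KFlops * 0.001) - (VirtualMachineID * 10).
# Machine names and KFlops values are invented; this is not Condor code.

def post_job_rank(kflops, slot_id):
    """Tie-breaking part of the rank for an unclaimed slot; higher wins."""
    return kflops * 0.001 - slot_id * 10


def fill_order(machines, slots_per_machine):
    """Return slot names in the order successive single jobs would claim them."""
    free = [
        (name, slot_id, kflops)
        for name, kflops in machines.items()
        for slot_id in range(1, slots_per_machine + 1)
    ]
    order = []
    while free:
        # Each new job takes the free slot with the highest rank.
        best = max(free, key=lambda s: post_job_rank(s[2], s[1]))
        free.remove(best)
        order.append(f"slot{best[1]}@{best[0]}")
    return order


# Hypothetical pool: three similar machines with two slots each.
machines = {"nodeA": 1003000, "nodeB": 1002000, "nodeC": 1001000}
order = fill_order(machines, slots_per_machine=2)
print(order)
```

In this made-up pool every slot1 is taken before any slot2, with the slightly
faster machines going first within each round.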

As for your other question:

> Is there a (simple but consistent) way to handle the small
> slots the same (by rounding MIPS and KFLOPS values?) while reserving
> the big ones for big jobs as long as possible?

Well, here's our pre-job rank expression for your perusal:

##  The NEGOTIATOR_PRE_JOB_RANK expression overrides all other ranks
##  that are used to pick a match from the set of possibilities.
##  Try running jobs on machines that are unclaimed. Also try putting
##  jobs on machines that are in the state Owner+Idle because these
##  machines may just have very strict START requirements.
ALTERA_NEGOTIATOR_PRE_JOB_RANK = (((State =?= "Owner") * (Activity =?= "Idle")) * 1000000000) + \
    ((State =?= "Unclaimed") * 100000000)

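You could add an inverse KFlops or Mips term to that expression (for
example, subtracting a scaled KFlops value) so that slower slots rank
higher and the big ones are held back. Ditto for the post-job rank
expression.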
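
As a sketch of what such an inverse term could do, the Python below rounds
KFlops down into coarse buckets and ranks slower buckets higher, so small
slots of similar speed tie and the fast ones are picked last. It only models
the arithmetic. The bucket width, the slot names and the function name are
assumptions of mine, and how you would write the rounding inside the ClassAd
expression itself is not shown here.

```python
# Hypothetical "inverse speed" term: coarse buckets, slower buckets rank higher.
# BUCKET_KFLOPS and the slot names are assumptions, not from the original post.
BUCKET_KFLOPS = 500000


def inverse_speed_term(kflops, bucket=BUCKET_KFLOPS):
    """Negative bucket index: same-bucket slots tie, slower buckets win."""
    return -(kflops // bucket)


slots = {"small1": 900000, "small2": 980000, "big1": 2100000}

# Highest term first, i.e. the order in which slots would be preferred.
ranked = sorted(slots, key=lambda name: inverse_speed_term(slots[name]), reverse=True)
print(ranked)
```

The two small slots fall into the same bucket and so are treated alike, while
the big slot is only used once the small ones are gone.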

- Ian

