
Re: [Condor-users] feasable policy?



Assumptions:

1. Machines purchased by group G have a custom class ad attribute
advertised by their startds:

IsOwnedBy = "G"

2. Jobs submitted by people from group G have a custom class ad
attribute:

+IsSubmittedBy = "G"
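
The machine-side attribute is only visible to matchmaking if the startd is
told to publish it. A minimal sketch, assuming a 6.7-era configuration in
which STARTD_EXPRS lists the config macros to copy into the machine ad
(newer releases name this setting STARTD_ATTRS); "G" is just the
placeholder group name used in this message:

```
##  Local config on each machine purchased by group G.
IsOwnedBy = "G"

##  Copy IsOwnedBy into the machine ClassAd, keeping whatever is
##  already listed in STARTD_EXPRS.
STARTD_EXPRS = $(STARTD_EXPRS), IsOwnedBy
```

After a reconfig, condor_status -l on one of those machines should show
the IsOwnedBy line if the attribute is being advertised.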


> 	1) Prefer to run on machines in GM if available.

When you submit your job, write the following in the submit description file:

rank = (TARGET.IsOwnedBy =?= "G") * 100

Now your jobs will prefer machines that were purchased by group G. But
it's just a preference, not an exclusive requirement.

> 	2) otherwise, if any machine is available, run it there

No problem. This is handled by using rank in your submit description file
instead of requirements to steer your jobs.
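
Putting the job-side pieces together, a submit description file might look
like the sketch below. The universe and the executable name my_job are
hypothetical placeholders, not taken from the original question; only the
rank and +IsSubmittedBy lines come from this message.

```
##  Hypothetical job; replace universe and executable with your own.
universe   = vanilla
executable = my_job

##  Tag the job as belonging to group G (assumption 2 above).
+IsSubmittedBy = "G"

##  Prefer G's machines (rank 100) over all others (rank 0).
##  No requirements line mentions IsOwnedBy, so any machine still matches.
rank = (TARGET.IsOwnedBy =?= "G") * 100

queue
```

Because =?= never evaluates to undefined, a machine that does not
advertise IsOwnedBy at all simply gets rank 0 rather than being excluded.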

> 	3) otherwise, preempt a job running on GM owned by someone not in G
> Jobs submitted by people not in G:

Startd RANK-based preemption is your friend here. On your machines:

RANK = (TARGET.IsSubmittedBy =?= "G") * 100

If there's a job running and it's not submitted by group "G" it will
have a lower rank on this machine than a job in the queue that *was*
submitted by group "G". Startd RANK preemption happens regardless of
what you have PREEMPTION_REQUIREMENTS set to -- that setting is only for
userprio preemption.

We use this setup at Altera and we set:

PREEMPTION_REQUIREMENTS = False

Because we don't do userprio-based preemption. Only startd RANK
preemption (and we use it to allow groups to "own" machines -- exactly
like what you're looking for here).
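
As a rough sketch of where the two settings live and how the startd rank
works out, assuming the attribute names from the assumptions at the top of
this message:

```
##  Local config on each machine owned by group G (read by the startd).
RANK = (TARGET.IsSubmittedBy =?= "G") * 100
##    job with IsSubmittedBy = "G":   TRUE  * 100 = 100
##    job from anyone else:           FALSE * 100 = 0
##  A queued job ranked 100 outranks a running job ranked 0, so the
##  startd preempts the running job in favor of the one from G.

##  Central manager config (read by the negotiator).
##  Turns off user-priority preemption only; startd RANK preemption
##  above is unaffected.
PREEMPTION_REQUIREMENTS = False
```

A job that carries no IsSubmittedBy attribute at all also evaluates to 0
here, since =?= treats an undefined value as simply not equal.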

> 	4) If a machine not in GM is available, run there

So you only want to preempt if there's not a free machine available? You
can use NEGOTIATOR_PRE_JOB_RANK and NEGOTIATOR_POST_JOB_RANK to sort the
machines so that matchmaking considers free machines first and only then
looks at machines that are occupied. Try:

##  The NEGOTIATOR_PRE_JOB_RANK expression overrides all other ranks
##  that are used to pick a match from the set of possibilities.
##  Try running jobs on machines that are unclaimed. Also try putting
##  jobs on machines that are in the state Owner+Idle because these
##  machines may just have very strict START requirements.
##  (Owner and Unclaimed are State values; Idle is an Activity value.)
NEGOTIATOR_PRE_JOB_RANK = \
    (((State =?= "Owner") * (Activity =?= "Idle")) * 1000000000) + \
    ((State =?= "Unclaimed") * 100000000)

##  The NEGOTIATOR_POST_JOB_RANK expression chooses between
##  resources that are equally preferred by the job.
##  Break ties by looking for machines that have been idle longer than
##  others and use them first. Also try to use faster machines before
##  slower machines and assign jobs to separate machines before we
##  start putting two jobs on a machine.
##  (Owner and Unclaimed are State values; Idle is an Activity value.)
NEGOTIATOR_POST_JOB_RANK = \
    (((State =?= "Owner") * (Activity =?= "Idle")) * 1000000000) + \
    ((State =?= "Unclaimed") * 100000000) + \
    (KFlops * 0.001) - (VirtualMachineID * 10)

> 	5) otherwise, run on GM

Everything I've just given does exactly this.
 
> I believe I know how to implement 1,2,4,5. I'm having trouble with 3; I
> get the impression that it involves PREEMPTION_REQUIREMENTS, but can't
> see how that can be used to distinguish between machines.

Caveat: what I'm describing has been tested and works well for us but
we're using 6.7.x. Mileage may vary on 6.6.x.

Cheers!

- Ian