
Re: [condor-users] The Role of Machine-Side RANK

It's a bit complicated, but I'll dare to give my understanding:

It seems the right way to think about a RANK expression is as a
preference specification for a search, much like getting the best
matches first when you search the Internet.
This is how RANK in the job's classad works. It is evaluated during the
negotiation cycle in the context of the resource classad to find the
best match for the job when more than one resource matches the
request. Now, if the matched resources are already busy running
other jobs, the machine's RANK comes into play. If the busy resource
prefers the job being matched over the one already running on it (that
is, the machine's RANK expression evaluates to a higher value in the
context of the new job's classad), the running job is preempted and the
new one is started. If the busy resource has no RANK expression
defined, or both jobs rank the same, the negotiator evaluates
PREEMPTION_REQUIREMENTS (defined in the negotiator's config) to decide
whether the new job's priority justifies preempting the running one. If
several busy resources are equally eligible for preemption,
PREEMPTION_RANK is evaluated to decide which one is the better choice.
Complicated, eh? If anyone from the team thinks that I'm wrong, I'd be
happy to hear the comments.
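
As a rough illustration of where each of the expressions named above lives, the fragment below puts them side by side. It is only a sketch: the user name "alice" is hypothetical, the values are not recommended settings, and the attribute names RemoteUserPrio and SubmittorPrio are assumed to be the ones used in example configurations of this era, so check the manual for your version.

```conf
# --- job submit file ---
# Job-side rank: among the machines that match, prefer the one
# with the most memory.
rank = Memory

# --- machine (startd) config ---
# Machine-side RANK: this machine prefers jobs owned by "alice"
# (hypothetical user). A job that ranks higher than the one currently
# running can preempt it.
RANK = (Owner == "alice")

# --- negotiator config ---
# Allow priority-based preemption only when the running job's user has
# a clearly worse (numerically larger) priority than the submitter.
PREEMPTION_REQUIREMENTS = RemoteUserPrio > SubmittorPrio * 1.2

# Among machines that could be preempted, prefer the one whose running
# job belongs to the user with the worst priority.
PREEMPTION_RANK = RemoteUserPrio
```

The point of the layout is that the first two expressions are preferences of the job and of the machine, while the last two are pool-wide policy that only the negotiator evaluates.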

I think that Raman's PhD thesis (available in the Research publications
section) describes this fairly well; see Section 3.2.4.
There is also a nice explanation I got from Peter Keller regarding
preemption policy; see below.
Hope it helps,

Peter's explanation:

> There are four things you need to control about Condor's preemption
> policy:
> 1. Preemption of the job because of local events on the compute
> machine.
> 2. Preemption of a job due to user priority.
> 3. Preemption of a job due to user action: condor_vacate, condor_hold,
> etc.
> 4. Preemption of a job due to startd rank.
> 1. 
> This is how you would stop a job being preempted for some local (load
> too high, suspended too long, some other user-definable expression)
> reason:
> PREEMPT = False
> Also, if you do not want your job to be wasting time in suspension (say,
> because your resources are more dedicated than interactive), then also
> do this:
> SUSPEND = False
> 2. 
> This happens when a startd is running a job, and the negotiator wants
> to preempt the job for another job with a BETTER user priority. You
> can turn this feature off like this (in the config file your negotiator
> reads):
> 3. 
> Usually, if a user types condor_hold, they wanted to. :)
> Also, things like condor_vacate are (currently by default) controlled
> under HOSTALLOW_ADMIN host-based authentication, so if the machine that
> the condor_vacate was typed on isn't in that list, it doesn't happen.
> 4. 
> This happens when the startd is running a job, and the negotiator
> asks the startd if it would like to run a certain job, and the startd
> decides it likes the new job better due to the RANK of the new job
> being better than the rank of the current job.
> It is here that you shall implement your black and white policy.
> Suppose we imagine a "tier" system where jobs with higher tier numbers
> always preempt jobs with lower tier numbers.
> Now, in the submit file for your job, add the "tier" number of the job
> or cluster:
> +Tier = 10
> Now, the RANK expression (in the main config file) looks like this:
> If the rank is undefined due to Tier not existing in the job ad, then
> it will be considered to be zero.
> Condor Admin
> ========================================
> ========================================
> * From: Peter Keller <psilord@xxxxxxxxxxx>
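
A minimal sketch of how items 2 and 4 of Peter's list might be written out, assuming the standard PREEMPTION_REQUIREMENTS and RANK settings and the Tier attribute from his example. These lines are an illustration of the policy he describes and are not taken from his message.

```conf
# --- negotiator config (item 2) ---
# Never preempt a running job just because another user has a better
# user priority.
PREEMPTION_REQUIREMENTS = False

# --- job submit file (item 4) ---
# Custom job attribute, as in Peter's example.
+Tier = 10

# --- startd config (item 4) ---
# Rank candidate jobs by their Tier attribute, so that a job with a
# higher Tier outranks a running job with a lower Tier.
# TARGET.Tier refers to Tier in the candidate job's classad.
RANK = TARGET.Tier
```

With settings along these lines, user priority alone would not cause preemption, while a job submitted with a higher Tier could still displace a lower-tier job through the machine's RANK.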
On Tue, 2004-02-10 at 15:58, Alexander Klyubin wrote:
> Hello!
> I would like to check if I correctly understand the role the 
> machine-side RANK plays in Condor. I assume that a job running on a 
> machine can only be preempted by a job whose rank *on the machine* is at 
> least as high irrespective of negotiator-side PREEMPTION_RANK. Is this 
> correct?
> Kind Regards,
> Alexander Klyubin
> Condor Support Information:
> http://www.cs.wisc.edu/condor/condor-support/
> To Unsubscribe, send mail to majordomo@xxxxxxxxxxx with
> unsubscribe condor-users <your_email_address>
