Re: [condor-users] RE: clarification required please
- Date: Mon, 10 May 2004 17:24:43 -0500
- From: Alain Roy <roy@xxxxxxxxxxx>
- Subject: Re: [condor-users] RE: clarification required please
While attempting to debug some bizarre behaviour on our Windows farm, I found the
3.6.1 Startd ClassAd Attributes
... first ...
CurrentRank: A float which represents this machine owner's affinity for
running the Condor job which it is currently hosting. If not currently
hosting a Condor job, CurrentRank is -1.0.
... a bit further down ...
CurrentRank: The value of the RANK expression when evaluated against the
ClassAd of the ``current'' job using this machine. If the resource has
been claimed but no job is running, the ``current'' job ClassAd is the
one that was used when claiming the resource. If a job is currently
running, that job's ClassAd is the ``current'' one. If the resource is
between jobs, the ClassAd of the last job that was run is used for
which is true?
This is easily discovered by examining the computer's ClassAd. Notice that
you can find the current rank of a machine by doing:
condor_status -l <name> | grep -i rank
Uhh... Does grep work on Windows? If not, just skim the output and look for
It appears to me that the CurrentRank is 0 unless a job is running. It is
not -1, nor the last rank that was used. You can do your own easy test to
Perhaps the real question is "what should CurrentRank be?".
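For the Windows case raised above, where grep may not be available, the same case-insensitive filtering can be sketched in a few lines of Python. This is only an illustration: the function name rank_lines and the sample ClassAd text are made up for the example, and real input would be the saved output of condor_status -l <name>.

```python
def rank_lines(classad_text):
    """Return the lines of a ClassAd dump that mention 'rank', ignoring case."""
    return [line for line in classad_text.splitlines()
            if "rank" in line.lower()]


# Hypothetical sample text, NOT real condor_status output.
sample = """\
Name = "machine.example.org"
Rank = 0.0
CurrentRank = 0.0
State = "Unclaimed"
"""

for line in rank_lines(sample):
    print(line)
```

The filter matches any line containing "rank" in any letter case, which is the same selection that grep -i rank makes.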
from the supplied config file
## The negotiator will not preempt a job running on a given machine
## unless the PREEMPTION_REQUIREMENTS expression evaluates to true
## and the owner of the idle job has a better priority than the owner
## of the running job. This expression defaults to true.
UWCS_PREEMPTION_REQUIREMENTS = $(StateTimer) > (1 * $(HOUR)) && \
    RemoteUserPrio > SubmittorPrio * 1.2
Does this mean that, in addition to PREEMPTION_REQUIREMENTS
evaluating to true, the user priority must also be better, or that this
particular expression is what enforces the priority check?
The condor_negotiator checks the priority internally as well.
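As a rough sketch of how the two conditions combine, the Python below models the negotiator's decision as "internal priority check AND PREEMPTION_REQUIREMENTS". It is not Condor code: the function names are hypothetical, the values are passed in by hand rather than read from ClassAds, and it assumes that a numerically lower user priority value is the better priority.

```python
HOUR = 3600  # seconds; stands in for the $(HOUR) config macro


def preemption_requirements(state_timer, remote_user_prio, submittor_prio):
    """Model of the UWCS_PREEMPTION_REQUIREMENTS expression quoted above."""
    return state_timer > 1 * HOUR and remote_user_prio > submittor_prio * 1.2


def may_preempt(state_timer, remote_user_prio, submittor_prio):
    """Hypothetical model of the combined decision.

    Assumption: a lower number is a better priority, so the idle job's
    owner is "better" when submittor_prio < remote_user_prio.
    """
    better_priority = submittor_prio < remote_user_prio
    return better_priority and preemption_requirements(
        state_timer, remote_user_prio, submittor_prio)


# Two hours in the current state, running user clearly worse off.
print(may_preempt(2 * HOUR, 10.0, 5.0))
# Better priority, but not by the 20% margin the expression demands.
print(may_preempt(2 * HOUR, 5.5, 5.0))
```

Under these assumptions, and for positive priority values, the 1.2 factor in the expression is stricter than the internal check alone: a submitter whose priority is only slightly better would pass the internal check but still fail PREEMPTION_REQUIREMENTS.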
Also, I believe there is a bug in the Windows port:
I doubt it's Windows specific.
Vanilla jobs do not immediately go to the killing state; they remain in the
preempting state until the timeout expires (we were using the default UWCS
value for KILL, as I thought it would not matter).
I suspect that this is a bug in the documentation, not in the code. I will
ask someone else to weigh in on this, and maybe fix the documentation if