
[condor-users] RE: clarification required please



Apologies for sending the controlled version originally; resending in the clear.

While attempting to debug some bizarre behaviour on our Windows farm, I found the following inconsistencies.

A) CurrentRank
http://www.cs.wisc.edu/condor/manual/v6.6/3_6Startd_Policy.html

3.6.1 Startd ClassAd Attributes 

... first ...

CurrentRank 
	: A float which represents this machine owner's affinity for running the Condor job which it is currently hosting. If not currently hosting a Condor job, CurrentRank is -1.0. 

... a bit further down ...

CurrentRank 
	: The value of the RANK expression when evaluated against the ClassAd of the ``current'' job using this machine. If the resource has been claimed but no job is running, the ``current'' job ClassAd is the one that was used when claiming the resource. If a job is currently running, that job's ClassAd is the ``current'' one. If the resource is between jobs, the ClassAd of the last job that was run is used for CurrentRank. 

Which of these two definitions is correct?

B) Preemption
From the supplied config file:

##  The negotiator will not preempt a job running on a given machine
##  unless the PREEMPTION_REQUIREMENTS expression evaluates to true
##  and the owner of the idle job has a better priority than the owner
##  of the running job.  This expression defaults to true.
UWCS_PREEMPTION_REQUIREMENTS = $(StateTimer) > (1 * $(HOUR)) && RemoteUserPrio > SubmittorPrio * 1.2

Does this mean that, in addition to PREEMPTION_REQUIREMENTS evaluating to true, the user priority must also be better? Or is it this particular expression (the RemoteUserPrio > SubmittorPrio * 1.2 term) that enforces the priority check?
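
To make the question concrete, here is a minimal Python sketch of what the quoted expression alone evaluates to. It is not the negotiator's actual logic. The function name is hypothetical, and I am assuming $(HOUR) expands to 3600 seconds and that $(StateTimer) is the number of seconds the machine has spent in its current state. As far as I understand it, a numerically lower user priority value is the better one in Condor, so the second term reads as "the running job's owner is at least 20% worse off than the submitter".

```python
# Assumption: $(HOUR) is 3600 seconds, as in the stock config macros.
HOUR = 60 * 60


def uwcs_preemption_requirements(state_timer, remote_user_prio, submittor_prio):
    """Evaluate only the quoted UWCS_PREEMPTION_REQUIREMENTS expression.

    state_timer      -- seconds spent in the current state (assumed meaning)
    remote_user_prio -- priority value of the owner of the running job
    submittor_prio   -- priority value of the owner of the idle job
    """
    return state_timer > 1 * HOUR and remote_user_prio > submittor_prio * 1.2


# Running for two hours, running user's value is well above 1.2x the submitter's.
print(uwcs_preemption_requirements(2 * HOUR, 10.0, 5.0))   # True
# Same run time, but 5.5 is not greater than 5.0 * 1.2 = 6.0.
print(uwcs_preemption_requirements(2 * HOUR, 5.5, 5.0))    # False
# Priority gap is large, but the job has only run for 30 minutes.
print(uwcs_preemption_requirements(30 * 60, 10.0, 5.0))    # False
```

Whether the negotiator applies a separate "better priority" test on top of this result is exactly what I am unsure about.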

C) Vacation
Also, I believe there is a bug in the Windows port.

We have:

want_vacate = False

There is no definition for want_vacate_vanilla (condor_config_val confirms this).

Vanilla jobs do not go to the killing state immediately; they remain in the preempting state until the timeout expires. (We were using the default UWCS value for KILL, as I thought it would not matter.)

I have therefore set KILL to True, which mitigates the problem, but it makes me wonder whether other bugs like this exist.

Thanks for any information that can be supplied,
Matt



