
Re: [HTCondor-users] disable preemption




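RANK is an execute-node setting: it is evaluated by the startd on each machine that runs jobs, so setting it only on the negotiator and collector host is not enough. You need to set it on, and reconfigure, all execute machines in your cluster.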

--Dan

On 7/31/13 10:18 AM, Pek Daniel wrote:
Thank you for your answer!

Do I have to make this change on all of my nodes, or is it enough to
add this parameter on the negotiator and collector host?

I ask because I added RANK = 0 on my negotiator and collector host, and I
still get these preemptions:

07/31/13 17:07:20 match_info called
07/31/13 17:07:20 Preempting claim has correct ClaimId.
07/31/13 17:07:20 New claim has sufficient rank, preempting current claim.
07/31/13 17:07:20 State change: preempting claim based on user priority
07/31/13 17:07:20 State change: claim retirement ended/expired
07/31/13 17:07:20 Changing state and activity: Claimed/Busy -> Preempting/Vacating
07/31/13 17:07:20 Got DEACTIVATE_CLAIM_FORCIBLY while in Preempting state, ignoring.
07/31/13 17:07:20 Starter pid 29790 exited with status 0
07/31/13 17:07:20 State change: starter exited
07/31/13 17:07:20 State change: preempting claim exists - START is true or undefined
07/31/13 17:07:20 Remote owner is submitter@condorschedd
07/31/13 17:07:20 State change: claiming protocol successful
07/31/13 17:07:20 Changing state and activity: Preempting/Vacating -> Claimed/Idle
07/31/13 17:07:20 Got activate_claim request from shadow (zzz.zzz.zzz.zzz)
07/31/13 17:07:20 Remote job ID is 3840.0
07/31/13 17:07:20 Got universe "VANILLA" (5) from request classad
07/31/13 17:07:20 State change: claim-activation protocol successful
07/31/13 17:07:20 Changing activity: Idle -> Busy
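
To check whether preemptions like the ones in this excerpt continue after a configuration change, the startd log can be summarised with a small shell function. This is only a sketch: the function name preemption_summary and the default log path are assumptions (the real path is whatever STARTD_LOG is set to on the execute node), and it simply counts the two message strings that appear in the log above.

```shell
# Hypothetical helper: count preemption-related messages in a startd log.
# The default path below is an assumption; pass the real StartLog path as $1.
preemption_summary() {
    log=${1:-/var/log/condor/StartLog}
    awk '
        # A new claim outranked the running one.
        /New claim has sufficient rank, preempting current claim/ { rank++ }
        # The slot actually left Claimed/Busy to vacate the running job.
        /Changing state and activity: Claimed\/Busy -> Preempting/ { vacated++ }
        END { printf "rank_preemptions=%d vacated_claims=%d\n", rank, vacated }
    ' "$log"
}
```

On an execute node this could be called as preemption_summary "$(condor_config_val STARTD_LOG)". If both counters stop increasing after the change, the rank-based preemption is presumably gone.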

My jobs are identical, and I submit them all in exactly the same way.

2013/7/31 Dan Bradley <dan@xxxxxxxxxxxx>:
The preemption noted in your logs is due to the RANK expression ranking one
job higher than another.  If you don't want that, set RANK to some constant
value.  Example:

RANK = 0

You will need to run condor_reconfig after making that change.  To verify
that it worked, look at the Rank attribute in the machine ClassAd:

condor_status -long | grep -i Rank
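
Because grep -i Rank also matches other attributes whose names contain "rank" (CurrentRank, for example), it can help to print only the Rank expression next to each slot name. The following is a minimal sketch, not an HTCondor tool: rank_by_slot is a made-up name, and it assumes the usual "Attribute = value" layout of condor_status -long, with ads separated by blank lines.

```shell
# Hypothetical helper: read "condor_status -long" output on stdin and
# print one "slot-name rank-expression" line per machine ad.
rank_by_slot() {
    awk '
        # Remember the slot name, without the surrounding quotes.
        /^Name = / { name = $3; gsub(/"/, "", name) }
        # Keep everything after "Rank = " (anchored, so CurrentRank is skipped).
        /^Rank = / { rank = substr($0, 8) }
        # A blank line ends one ad: emit it and reset.
        /^$/       { if (name != "") print name, rank; name = ""; rank = "" }
        END        { if (name != "") print name, rank }
    '
}
```

Usage: condor_status -long | rank_by_slot. Every execute slot should then show the constant you configured (0 in the example above); any slot still showing a different expression has not picked up the change.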

--Dan


On 7/31/13 6:41 AM, Pek Daniel wrote:
Hi!

This is my condor_config.local:

CONDOR_HOST = condorcm
CONDOR_ADMIN = foo@bar
COLLECTOR_NAME = HTCondor testbench
ALLOW_WRITE = *
MASTER_NAME = $(FULL_HOSTNAME)
START = TRUE
SUSPEND = FALSE
SUBMIT_TIMEOUT_MULTIPLIER = 3
TOOL_TIMEOUT_MULTIPLIER = 3
NEGOTIATOR_INTERVAL = 1
NEGOTIATOR_CYCLE_DELAY = 1
SCHEDD_INTERVAL = 10
UPDATE_INTERVAL = 10
PREEMPT = FALSE
PREEMPTION_REQUIREMENTS = False
SCHEDD_NAME = $(FULL_HOSTNAME)
SEC_DEFAULT_AUTHENTICATION_METHODS = CLAIMTOBE
SEC_CLIENT_AUTHENTICATION_METHODS = CLAIMTOBE
KILL = FALSE
DAEMON_LIST = ......

I'd like to completely disable every kind of preemption and, because
the aim right now is to stress-test the system, every kind of
prioritization or fair share as well. I have only one user, so none of
it is necessary.

But for some reason I keep getting messages like this in my StartLog:
07/31/13 13:33:31 match_info called
07/31/13 13:33:31 Preempting claim has correct ClaimId.
07/31/13 13:33:31 New claim has sufficient rank, preempting current claim.
07/31/13 13:33:31 State change: preempting claim based on user priority
07/31/13 13:33:31 State change: claim retirement ended/expired
07/31/13 13:33:31 Changing state and activity: Claimed/Busy -> Preempting/Vacating
07/31/13 13:33:31 Starter pid 23274 exited with status 0
07/31/13 13:33:31 State change: starter exited
07/31/13 13:33:31 State change: preempting claim exists - START is true or undefined
07/31/13 13:33:31 Remote owner is submitter@condorschedd
07/31/13 13:33:31 State change: claiming protocol successful
07/31/13 13:33:31 Changing state and activity: Preempting/Vacating -> Claimed/Idle
07/31/13 13:33:31 Error: can't find resource with ClaimId (<zzz.zzz.zzz.zzz:57640>#1375175876#309#...) -- perhaps this claim was already removed?
07/31/13 13:33:31 Error: problem finding resource for 404 (DEACTIVATE_CLAIM_FORCIBLY)
07/31/13 13:33:31 Got activate_claim request from shadow (xxx.xxx.xxx.xxx)
07/31/13 13:33:31 Remote job ID is 3639.0
07/31/13 13:33:31 Got universe "VANILLA" (5) from request classad
07/31/13 13:33:31 State change: claim-activation protocol successful
07/31/13 13:33:31 Changing activity: Idle -> Busy

How can I completely turn off this kind of behaviour for testing purposes?

Thanks,
Daniel
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/
