
Re: [HTCondor-users] condor_status -total, Preempting




Hi Daniel,

With the help of Brian looking over the logs on your test pool, we think we know what is going on. There appears to be a rare race condition that can occur in the matchmaking protocol.

Short Story / Work Around -

There are several easy ways to fix the protocol, and we will patch the code base for future HTCondor releases. See
 https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=4183
The odds of hitting the race condition increase if you changed NEGOTIATOR_CYCLE_DELAY in your config file, especially if you are using a "tree" of collectors. So for an immediate workaround, do not set NEGOTIATOR_CYCLE_DELAY to small values (i.e., lower than 20), which minimizes the chances of hitting the race condition, and also use MaxJobRetirementTime to disable preemption. With MaxJobRetirementTime set, the race condition bug could unfortunately still put the slot into Retiring activity when it should not, but it will at least prevent preemption of the job.


Longer Story / Details -

In the collector, for each startd ad you can see with condor_status, there is also a "startd private ad" that you cannot see and that is used by the negotiator. The startd private ad contains a secret key that must be presented in order to claim a slot. Once a key is used to claim a slot, the startd will ensure that the same key cannot be reused. Once the slot is claimed, the startd will generate and advertise a new private ad with a new "preempting secret" that can be used to preempt the claimed slot.

At the start of a negotiation cycle, the negotiator 1) fetches the startd ads, then it 2) fetches all the corresponding startd private ads. What happens if the ads are updated between steps 1 and 2? In this case, the negotiator may see a stale startd ad that says the slot is Unclaimed, and claim it with an up-to-date "preempting secret" from the private ad. The result is what you are seeing - preemptions that should not occur!
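
To make the sequence above concrete, here is a minimal toy simulation in Python. It is only a sketch of the two-step fetch described here, not HTCondor code: the Collector class, the secret strings, and the function names are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Collector:
    """Toy collector holding one public and one private ad per slot."""
    public: dict   # slot name -> state advertised in the startd ad
    private: dict  # slot name -> secret advertised in the startd private ad

    def startd_claimed(self, slot):
        # Once claimed, the startd advertises a new state and a new
        # "preempting secret" in its private ad (hypothetical format).
        self.public[slot] = "Claimed"
        self.private[slot] = "preempt-secret-" + slot


def negotiator_snapshot(collector, update_between_fetches=None):
    """Fetch the public ads, then the private ads, as two separate steps."""
    public_view = dict(collector.public)      # step 1
    if update_between_fetches is not None:
        update_between_fetches()              # ads change between the steps
    private_view = dict(collector.private)    # step 2
    return public_view, private_view


def is_inconsistent(public_view, private_view, slot):
    # A stale "Unclaimed" state paired with a preempting secret is the
    # combination that leads to a preemption that should not occur.
    return (public_view[slot] == "Unclaimed"
            and private_view[slot].startswith("preempt-secret"))


collector = Collector(public={"slot5": "Unclaimed"},
                      private={"slot5": "claim-secret-slot5"})
pub, priv = negotiator_snapshot(
    collector, lambda: collector.startd_claimed("slot5"))
print(is_inconsistent(pub, priv, "slot5"))  # True
```

If nothing changes between the two fetches, the same check returns False, which is why the problem only shows up occasionally.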

regards,
Todd

On 1/29/2014 4:27 AM, Pek Daniel wrote:
Now, I've set MAXJOBRETIREMENTTIME to a high value, and now I can't
see any machines in "Preempting" state, instead, they're in Claimed
with Retiring activity.

01/29/14 10:43:23 slot5: Request accepted.
01/29/14 10:43:23 slot5: Remote owner is xxx@xxxxxx
01/29/14 10:43:23 slot5: State change: claiming protocol successful
01/29/14 10:43:23 slot5: Changing state: Unclaimed -> Claimed
01/29/14 10:43:24 slot5: Got activate_claim request from shadow (xxx.xxx.xxx.xxx)
01/29/14 10:43:24 slot5: Remote job ID is 3933.36
01/29/14 10:43:24 slot5: Got universe "VANILLA" (5) from request classad
01/29/14 10:43:24 slot5: State change: claim-activation protocol successful
01/29/14 10:43:24 slot5: Changing activity: Idle -> Busy
01/29/14 10:43:34 slot5: Preempting claim has correct ClaimId.
01/29/14 10:43:34 slot5: New claim has sufficient rank, preempting current claim.
01/29/14 10:43:34 slot5: State change: preempting claim based on user priority
01/29/14 10:43:34 slot5: State change: retiring due to preempting claim
01/29/14 10:43:34 slot5: Changing activity: Busy -> Retiring
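
For anyone who wants to look for this pattern across a whole StartLog, the following Python sketch reports how many seconds passed between a slot being claimed and the next preemption of that claim. It assumes the timestamp and message format shown in the excerpt above, with each log entry on a single line; the function name is made up for illustration.

```python
import re
from datetime import datetime

# Timestamp, slot name, and message, as they appear in the excerpt above.
LINE = re.compile(r"^(\d\d/\d\d/\d\d \d\d:\d\d:\d\d) (slot[^:\s]+): (.*)$")


def claim_to_preempt_gaps(log_text):
    """Return (slot, seconds) for each preemption that follows a claim."""
    claimed_at = {}
    gaps = []
    for raw in log_text.splitlines():
        m = LINE.match(raw)
        if not m:
            continue  # continuation lines or unrelated messages
        stamp = datetime.strptime(m.group(1), "%m/%d/%y %H:%M:%S")
        slot, msg = m.group(2), m.group(3)
        if msg == "Changing state: Unclaimed -> Claimed":
            claimed_at[slot] = stamp
        elif (msg.startswith("State change: preempting claim")
              and slot in claimed_at):
            delta = stamp - claimed_at.pop(slot)
            gaps.append((slot, int(delta.total_seconds())))
    return gaps


sample = """\
01/29/14 10:43:23 slot5: Changing state: Unclaimed -> Claimed
01/29/14 10:43:24 slot5: Changing activity: Idle -> Busy
01/29/14 10:43:34 slot5: State change: preempting claim based on user priority
"""
print(claim_to_preempt_gaps(sample))  # [('slot5', 11)]
```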

Also, during the negotiation, there are some fluctuations in the
number of claimed machines. It should be monotonically increasing, but
sometimes it drops to a lower value and then increases again...



2014/1/28 Pek Daniel <pekdaniel@xxxxxxxxx>:
Hi,

2014/1/27 Todd Tannenbaum <tannenba@xxxxxxxxxxx>:

Hi Daniel -

The below looks really unexpected.  Your settings should indeed disable
preemption, assuming you did a successful condor_reconfig after the
changes and they are set at the right host (the PREEMPTION_REQUIREMENTS
change is read by the condor_negotiator, and the other settings are read
by all the execute hosts running condor_startds).  Note that the
preferred way to disable preemption on HTCondor v8.0+ is via
MaxJobRetirementTime, see


http://research.cs.wisc.edu/htcondor/manual/current/3_5Policy_Configuration.html#SECTION00459500000000000000

But what you have below should work as well.

HTCondor may preempt a job in favor of another job from the same user, but
only in the case of a higher startd RANK.

Very strange.

Is the below regularly reproducible, or do you only see it very rarely?

Yes, this happens regularly and I can reproduce it. I submit 4000 jobs
spread across 10 schedds with the negotiator turned off, then turn it on
and poll condor_status -total. From time to time, the Preempting column
shows a value other than zero.



Note that starting with HTCondor v8.1.3, the machine classads will report
some helpful/insightful attributes regarding preemption; I copied the
below from the manual at
http://research.cs.wisc.edu/htcondor/manual/latest/12_Appendix_A.html
These statistics were added for just such an occurrence, i.e. so admins
can confirm that preemption is disabled. So, if you are running v8.1.3 or
above, are the statistics below reporting preemptions as occurring?  If
so, are they reporting user preemptions or rank preemptions? Maybe it is
only happening on some specific nodes?

JobPreemptions:
     The total number of times a running job has been preempted on this
     machine.

JobRankPreemptions:
     The total number of times a running job has been preempted on this
     machine due to the machine's rank of jobs since the condor_startd
     started running.

JobUserPrioPreemptions:
     The total number of times a running job has been preempted on this
     machine based on a fair share allocation of the pool since the
     condor_startd started running.

RecentJobPreemptions:
     The total number of jobs which have been preempted from this machine
     in the last twenty minutes.

RecentJobRankPreemptions:
     The total number of times a running job has been preempted on this
     machine due to the machine's rank of jobs in the last twenty minutes.

RecentJobUserPrioPreemptions:
     The total number of times a running job has been preempted on this
     machine based on a fair share allocation of the pool in the last
     twenty minutes.
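
One way to check these counters across a pool is to dump them as plain columns and filter the result. The Python sketch below only parses such text; it assumes the columns came from something like "condor_status -af Name JobPreemptions JobRankPreemptions JobUserPrioPreemptions" (assuming your condor_status supports -af/-autoformat). The sample rows and function names are made up for illustration.

```python
def parse_autoformat(output, columns):
    """Parse whitespace-separated rows, one per slot, into dicts."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != len(columns):
            continue  # skip blank or malformed lines
        rows.append(dict(zip(columns, fields)))
    return rows


def slots_with_preemptions(rows, counter="JobPreemptions"):
    """Return the names of slots whose counter is a positive integer."""
    hits = []
    for row in rows:
        value = row.get(counter, "undefined")
        # Slots that do not advertise the attribute print "undefined".
        if value.isdigit() and int(value) > 0:
            hits.append(row["Name"])
    return hits


# Hypothetical output, not taken from a real pool.
sample = """\
slot1@node01 0 0 0
slot5@node01 16 0 16
slot2@node02 undefined undefined undefined
"""
columns = ["Name", "JobPreemptions", "JobRankPreemptions",
           "JobUserPrioPreemptions"]
rows = parse_autoformat(sample, columns)
print(slots_with_preemptions(rows))  # ['slot5@node01']
```

Passing counter="JobRankPreemptions" or counter="JobUserPrioPreemptions" separates rank preemptions from user priority preemptions.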

Yes, recent userprio and total values are around 16 (out of 4000 jobs).
These happen on different schedds and startds, not always the same. They
have exactly the same configuration btw.

Ah, sorry, I've just noticed that this value is per machine (or per
slot?). So this means ~16 preemptions / machine.

Also I found these in my NegotiatorLog which might be relevant:

01/28/14 16:43:39 PREEMPTION_REQUIREMENTS = FALSE
01/28/14 16:43:39 NEGOTIATOR_INTERVAL = 1 sec
01/28/14 16:43:39 NEGOTIATOR_TIMEOUT = 30 sec
01/28/14 16:43:39 MAX_TIME_PER_SUBMITTER = 31536000 sec
01/28/14 16:43:39 MAX_TIME_PER_PIESPIN = 31536000 sec
01/28/14 16:43:39 PREEMPTION_RANK = (RemoteUserPrio * 1000000) - TARGET.ImageSize
01/28/14 16:43:39 NEGOTIATOR_PRE_JOB_RANK = RemoteOwner =?= UNDEFINED
01/28/14 16:43:39 NEGOTIATOR_POST_JOB_RANK = (RemoteOwner =?= UNDEFINED) * (ifthenElse(isUndefined(KFlops), 1000, Kflops) - SlotID - 1.0e10*(Offline=?=True))

And at the beginning of new cycles:
01/28/14 16:43:54 Not considering preemption, therefore constraining idle machines with ifThenElse(State == "Claimed","Name State Activity StartdIpAddr AccountingGroup Owner RemoteUser Requirements SlotWeight ConcurrencyLimits","")

Can any of these cause the preemptions?




regards,
Todd


Thanks,
Daniel


On 1/27/2014 9:53 AM, Pek Daniel wrote:

Some lines from the StartLog:

01/27/14 16:45:42 slot22: Request accepted.
01/27/14 16:45:42 slot22: Remote owner is xxx
01/27/14 16:45:42 slot22: State change: claiming protocol successful
01/27/14 16:45:42 slot22: Changing state: Unclaimed -> Claimed
01/27/14 16:45:46 slot22: Got activate_claim request from shadow (xxx.xxx.xxx.xxx)
01/27/14 16:45:46 slot22: Remote job ID is 3920.25
01/27/14 16:45:46 slot22: Got universe "VANILLA" (5) from request classad
01/27/14 16:45:47 slot22: State change: claim-activation protocol successful
01/27/14 16:45:47 slot22: Changing activity: Idle -> Busy
01/27/14 16:45:55 slot22: Preempting claim has correct ClaimId.
01/27/14 16:45:55 slot22: New claim has sufficient rank, preempting current claim.
01/27/14 16:45:55 slot22: State change: preempting claim based on user priority
01/27/14 16:45:55 slot22: State change: claim retirement ended/expired
01/27/14 16:45:55 slot22: Changing state and activity: Claimed/Busy -> Preempting/Vacating

2014/1/27 Pek Daniel <pekdaniel@xxxxxxxxx>:

Hi,

I tried my best to turn off preemption completely:
PREEMPT = FALSE
SUSPEND = FALSE
KILL = FALSE
PREEMPTION_REQUIREMENTS = FALSE
NEGOTIATOR_CONSIDER_PREEMPTION = FALSE
RANK = 0

But sometimes during negotiation, I can still see a non-zero value in
the Preempting column of the output of condor_status -total.

According to the docs:

``Preempting'': A Condor job is being preempted (possibly via
checkpointing) in order to clear the machine for either a higher
priority job or because the machine owner wants the machine back.

Given that I have only a single user and completely identical
jobs, I don't think the preemption would happen because of a higher
priority job. Any idea why this happens?

Thanks,
Daniel

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/



--
Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
Center for High Throughput Computing   Department of Computer Sciences
HTCondor Technical Lead                1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132                  Madison, WI 53706-1685