
Re: [HTCondor-users] How to set a worker node offline in HTCondor



> On Apr 8, 2021, at 10:47 AM, John M Knoeller <johnkn@xxxxxxxxxxx> wrote:
> 
> In general the mechanism that we use to avoid cancelling a drain that was not started by defrag is to look for the DrainReason attribute of the p-slot.   
> 
> Draining can be cancelled by the Defrag daemon if there is no DrainReason, or if the DrainReason is "defrag". 
> 
> There should always be a DrainReason attribute if draining was started by an 8.9.11 or later condor_drain command, or by an 8.9.11 or later DEFRAG daemon. 

OK, then there appears to be a bug in 8.9.11 (or I need to enable another HTCondor setting). In particular, I ran "condor_drain machine-name" using version 8.9.11, and the DEFRAG daemon allowed jobs to start on the machine again once it had drained.
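
To check that I am reading the rule correctly, here is a minimal sketch of it as described above. This is only an illustration, not HTCondor's actual code: the function name and the dict standing in for the p-slot ad are hypothetical, and I am assuming an exact string match on "defrag".

```python
def defrag_may_cancel(pslot_ad):
    """Return True if, per the rule quoted above, the Defrag daemon
    is allowed to cancel draining on this p-slot.

    pslot_ad is a plain dict standing in for the p-slot ClassAd
    (hypothetical representation, for illustration only).
    """
    reason = pslot_ad.get("DrainReason")
    # No DrainReason at all: drain was started by something older than 8.9.11,
    # so Defrag treats it as cancellable.
    if reason is None:
        return True
    # Drain started by the Defrag daemon itself (exact match assumed).
    return reason == "defrag"


# A drain started by an 8.9.11 condor_drain should carry some other DrainReason.
# The value below is a made-up placeholder, not the string HTCondor really uses.
manual_drain = {"Name": "slot1@machine-name", "DrainReason": "manual-placeholder"}
print(defrag_may_cancel(manual_drain))
```

If that reading is right, a p-slot drained with an 8.9.11 condor_drain should fall into the last case and be left alone by Defrag, which is not what I observed.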

Note, I don't see a condor_drain option to specify DrainReason. If you agree the above is a bug, then once it is fixed, how should I specify a DrainReason indicating that a manual drain should be canceled by the Defrag daemon when the machine is done draining?

Thanks.

--
Stuart Anderson
sba@xxxxxxxxxxx