
Re: [HTCondor-users] How to set a worker node offline in HTCondor



You will be glad to hear that as of 8.9.11, both DEFRAG and the condor_drain command set a DrainReason attribute in the machine ClassAd. DEFRAG checks this attribute and only resumes running jobs on machines that it drained itself.
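
As a rough illustration of that check, here is a toy sketch in Python. It is not the DEFRAG implementation: plain dicts stand in for machine ClassAds, the function name is made up, and the DrainReason values shown ("Defrag", "admin maintenance") are assumed placeholders rather than values confirmed in this thread.

```python
# Toy model only: plain dicts stand in for machine ClassAds.
# The reason string below is an assumed placeholder for illustration.
DEFRAG_REASON = "Defrag"


def machines_defrag_may_resume(machine_ads, defrag_reason=DEFRAG_REASON):
    """Return names of drained machines whose drain was started by DEFRAG.

    Machines drained for any other reason (for example, by an
    administrator running condor_drain) are left alone.
    """
    resumable = []
    for ad in machine_ads:
        # Only fully drained machines are candidates for resuming.
        if ad.get("State") != "Drained":
            continue
        # Resume only if the recorded reason matches DEFRAG's own.
        if ad.get("DrainReason") == defrag_reason:
            resumable.append(ad["Machine"])
    return resumable


# Hypothetical ads: one drained by DEFRAG, one by an admin, one still busy.
ads = [
    {"Machine": "node01", "State": "Drained", "DrainReason": "Defrag"},
    {"Machine": "node02", "State": "Drained", "DrainReason": "admin maintenance"},
    {"Machine": "node03", "State": "Claimed"},
]

print(machines_defrag_may_resume(ads))
```

In this sketch only node01 is returned, so a node that an administrator drained by hand stays drained until the administrator resumes it.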


-tj

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Anderson, Stuart B. <sba@xxxxxxxxxxx>
Sent: Wednesday, March 31, 2021 6:27 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] How to set a worker node offline in HTCondor
 
DEFRAG was another surprise (at least for me) with condor_drain. The DEFRAG daemon also drains jobs, and when it starts running new jobs on completely drained nodes, it makes no distinction between draining that it started and manual administrative draining.

> On Mar 31, 2021, at 12:03 PM, Michael Pelletier via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:
>
> The “condor_off -peaceful” is what I usually use to get this behavior. The drawback is that it closes down the daemons and stops reporting to the collector when all the jobs finish, rather than leaving the startd active in the “Drained” state, but that’s reasonably straightforward to work with. I keep looking for a condor_drain -peaceful.

--
Stuart Anderson
sba@xxxxxxxxxxx



