Re: [HTCondor-users] peaceful node drain and shutdown



Argh!

I think Brian is right. Unless you also change IS_OWNER, you will want to set

  START = UNDEFINED

instead of START = FALSE, so that the machine avoids the Owner state and START simply evaluates to UNDEFINED.
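
For anyone who wants to script this, here is a minimal sketch of the drop-in approach using START = UNDEFINED rather than START = FALSE. It assumes the /etc/condor/config.d directory and the 00shutdown file name from Kevin's commands quoted below; the two function names are hypothetical helpers, not HTCondor tools, and I have not tested this on a production pool.

```shell
#!/bin/sh
# Sketch only: write (or remove) a drain drop-in for HTCondor.
# Assumptions: the config.d directory and the 00shutdown file name come from
# the commands quoted in this thread; the function names are hypothetical.

# write_drain_config DIR: create DIR/00shutdown containing START = UNDEFINED,
# so no new jobs start and the Owner state is avoided.
write_drain_config() {
    dir=$1
    printf 'START = UNDEFINED\n' > "$dir/00shutdown"
}

# remove_drain_config DIR: delete the drop-in so the normal START policy
# applies again.
remove_drain_config() {
    dir=$1
    rm -f "$dir/00shutdown"
}

# Typical use as root (not run here); afterwards signal the master to re-read
# its configuration, as in the quoted commands:
#   write_drain_config /etc/condor/config.d
#   kill -HUP <PID OF MASTER>
```

To undo it, call remove_drain_config on the same directory and send the HUP again. Running jobs should still be left alone as long as PREEMPT does not reference $(START).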

> On Jul 13, 2016, at 3:28 PM, Brian Bockelman <bbockelm@xxxxxxxxxxx> wrote:
> 
> You sure about this?
> 
> I also recall the same behavior that Bob describes - if START goes to FALSE instead of UNDEFINED, then the node transitions to Owner state, which then kills off running jobs.
> 
> (Again, might have changed at some point)
> 
> Brian
> 
>> On Jul 13, 2016, at 3:09 PM, Todd Tannenbaum <tannenba@xxxxxxxxxxx> wrote:
>> 
>> On 7/13/2016 3:03 PM, Bob Ball wrote:
>>> Maybe this info is now obsolete, but I remember once setting the START
>>> to an expression that evaluated "FALSE" and caused all the running jobs
>>> to terminate....
>>> 
>>> bob
>> 
>> Only if $(START) is referenced in the PREEMPT expression....
>> 
>> START just controls when new jobs can be launched.
>> 
>> PREEMPT controls when to kick off jobs (really would be more accurate to have named it "Evict" instead of "Preempt", sigh...).
>> 
>> regards
>> Todd
>> 
>> 
>>>> On 7/13/2016 3:56 PM, Fox, Kevin M wrote:
>>>> I'm guessing the condor_drain command will have similar issues to the
>>>> condor_off -peaceful command? That is, you have to have all the
>>>> permissions set up correctly?
>>>> 
>>>> The nice thing about the START=FALSE config trick is you only need
>>>> root on the machine to do it.
>>>> 
>>>> Thanks,
>>>> Kevin
>>>> ________________________________________
>>>> From: HTCondor-users [htcondor-users-bounces@xxxxxxxxxxx] on behalf of
>>>> Todd Tannenbaum [tannenba@xxxxxxxxxxx]
>>>> Sent: Wednesday, July 13, 2016 12:46 PM
>>>> To: HTCondor-Users Mail List
>>>> Subject: Re: [HTCondor-users] peaceful node drain and shutdown
>>>> 
>>>>> On 7/13/2016 2:29 PM, Fox, Kevin M wrote:
>>>>> Ah. I had seen the docs for START but didn't realize it would affect new
>>>>> job startup too. They seemed to imply that it's for eviction.
>>>>> 
>>>>> But, the following seems to work to drain the node gracefully, as you
>>>>> suggested:
>>>>> echo START=FALSE > /etc/condor/config.d/00shutdown
>>>>> kill -HUP <PID OF MASTER>
>>>>> 
>>>>> and to reverse it
>>>>> rm -f /etc/condor/config.d/00shutdown
>>>>> kill -HUP <PID OF MASTER>
>>>>> 
>>>>> Thanks for the help. :)
>>>> Hi Kevin,
>>>> 
>>>> If the above satisfies your needs, great.  But just wanted to point out
>>>> you can do the same thing (drain a node gracefully) with the
>>>> condor_drain tool.  Do "man condor_drain", or see
>>>>  http://htcondor.org/manual/v8.4/condor_drain.html
>>>> 
>>>> Also in the upcoming HTCondor v8.5.6, the condor_drain functionality is
>>>> exposed via HTCondor's Python API. :)
>>>> 
>>>> regards,
>>>> Todd
>>>> 
>>>> 
>>>> _______________________________________________
>>>> HTCondor-users mailing list
>>>> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx
>>>> with a
>>>> subject: Unsubscribe
>>>> You can also unsubscribe by visiting
>>>> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>>>> 
>>>> The archives can be found at:
>>>> https://lists.cs.wisc.edu/archive/htcondor-users/
>>> 
>> 
>> 
>> -- 
>> Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
>> Center for High Throughput Computing   Department of Computer Sciences
>> HTCondor Technical Lead                1210 W. Dayton St. Rm #4257
>> Phone: (608) 263-7132                  Madison, WI 53706-1685
> 