
Re: [HTCondor-users] identifying (gracefully/peacefully) shutting down daemons



Hi Steve,

yes, that is what I found as well. The thing is that, as far as I can see,
draining can also be initiated by the defrag daemon, so I *guess* that
draining is not a sure sign that a daemon is shutting down. Ideally, I would
like to get the shutdown state of the master(?) to avoid pitfalls(?) with
startd/defrag.
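
For reference, here is a minimal sketch of the startd-side check discussed in this thread: it flags slots whose Activity, or any entry in ChildActivity, is "Retiring". The helper names are hypothetical, and fetch_startd_ads() assumes the htcondor Python bindings are installed and a collector is reachable. Note that this only shows that a slot is retiring; it cannot tell a condor_off from a defrag-initiated drain, which is exactly the open question.

```python
from collections import defaultdict


def slot_is_retiring(ad):
    """Return True if the slot ad, or any child slot it lists, is Retiring."""
    if ad.get("Activity") == "Retiring":
        return True
    # ChildActivity is assumed to be a list of activity strings
    # (advertised by partitionable slots); treat a missing value as empty.
    children = ad.get("ChildActivity") or []
    return any(activity == "Retiring" for activity in children)


def machines_with_retiring_slots(ads):
    """Group the names of retiring slots by their Machine attribute."""
    by_machine = defaultdict(list)
    for ad in ads:
        if slot_is_retiring(ad):
            by_machine[ad.get("Machine", "<unknown>")].append(ad.get("Name"))
    return dict(by_machine)


def fetch_startd_ads():
    """Query the collector for startd ads (needs the htcondor bindings)."""
    import htcondor  # imported here so the helpers above work without it

    return htcondor.Collector().query(
        htcondor.AdTypes.Startd,
        projection=["Name", "Machine", "State", "Activity", "ChildActivity"],
    )
```

With a live pool one would call machines_with_retiring_slots(fetch_startd_ads()); the helpers also accept plain dicts, which makes them easy to try out offline.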

Cheers,
  Thomas

On 2016-08-04 22:01, Steven Timm wrote:
> It has been a while but I believe that there is a change in the value of
> the START expression in the condor_startd classad when it is in drain mode.
> 
> Steve
> 
> 
> On Thu, 4 Aug 2016, Brian Bockelman wrote:
> 
>> Hi Thomas,
>>
>> I don’t know of any way to do this myself.
>>
>> Note there are two levels here:
>> - Determining the state of the master.  Is it trying to shut down
>> daemons?
>> - Determining the state of the child daemons.  Did they successfully
>> get the signal from the master?
>>
>> Brian
>>
>>> On Aug 4, 2016, at 10:39 AM, Thomas Hartmann
>>> <thomas.hartmann@xxxxxxx> wrote:
>>>
>>> Hi all,
>>>
>>> I would like to find all daemons in my pool that got a condor_off
>>> (-graceful -peaceful) and are draining and shutting down. So,
>>> is_owner/draining status is implied. As far as I can see, draining-related
>>> ads are no positive indicator, since the defrag daemon may also initiate
>>> draining.
>>>
>>>
>>> For the startds, I suppose
>>>
>>> Activity = Retiring
>>> ChildActivity = { "Retiring"
>>>
>>> would be the ads to check for.
>>>
>>> But I am not sure whether this is sufficient to distinguish a node whose
>>> daemon(s) are shutting down from a job/slot that is preempting/draining
>>> for any other reason, where the daemons/slots(?) would subsequently stay
>>> alive [1].
>>>
>>> Cheers and thanks,
>>>  Thomas
>>>
>>> [1]
>>> http://research.cs.wisc.edu/htcondor/manual/latest/3_5Policy_Configuration.html
>>>
>>>
>>> _______________________________________________
>>> HTCondor-users mailing list
>>> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx
>>> with a
>>> subject: Unsubscribe
>>> You can also unsubscribe by visiting
>>> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>>>
>>> The archives can be found at:
>>> https://lists.cs.wisc.edu/archive/htcondor-users/
>>
> 
> ------------------------------------------------------------------
> Steven C. Timm, Ph.D  (630) 840-8525
> timm@xxxxxxxx  http://home.fnal.gov/~timm/
> Office: Feynman Computing Center 243
> Fermilab Scientific Computing Division,
> Scientific Computing Facilities Quadrant.,
> Experimental Computing Facilities Dept.,
> Grid and Cloud Operations Group
> 
