
Re: [Condor-users] Persistent offline machines ads when using a third party power management tool?



Ah. Reading the ticket, I thought it had made it into 7.6.4. Thanks.

- Ian

On 2012-02-01, at 6:44 AM, Lukas Slebodnik <slebodnik@xxxxxxxx> wrote:

> Just one note: the patch from ticket
> https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2564
> is not included in any Condor release.
> 
> Regards,
> Lukas
> 
> On Tue, Jan 31, 2012 at 03:07:51PM -0500, Ian Chesal wrote:
>> Hi Lukas,
>> 
>> Maybe. I'm not sure OFFLINE=True is ever being set. But I never considered that it could be getting set and then unset before hibernation kicked in.
>> 
>> I'll try extending HIBERNATION_WAIT_INTERVAL to see if it helps the situation.
>> 
>> Regards,
>> - Ian
>> 
>> 
>> ---
>> Ian Chesal
>> 
>> Cycle Computing, LLC
>> Leader in Open Compute Solutions for Clouds, Servers, and Desktops
>> Enterprise Condor Support and Management Tools
>> 
>> http://www.cyclecomputing.com
>> http://www.cyclecloud.com
>> http://twitter.com/cyclecomputing
>> 
>> 
>> On Tuesday, 31 January, 2012 at 2:56 PM, slebodnik wrote:
>> 
>>> Are the symptoms similar to those in this ticket?
>>> https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2564
>>> 
>>> Regards,
>>> Lukas
>>> 
>>> On Tue, 31 Jan 2012 13:39:26 -0500, Ian Chesal
>>> <ichesal@xxxxxxxxxxxxxxxxxx> wrote:
>>>> It seems like it should work, but in my 7.6.5 pool any machine that's
>>>> hibernated by third-party (external to Condor) power management
>>>> software fails to end up with an OFFLINE=True attribute in the
>>>> machine ad at the collector. It subsequently disappears from my list
>>>> of machines, so it cannot be woken up by Rooster.
>>>> 
>>>> Condor appears to know that it's being hibernated. I see the
>>>> following in the MasterLog for the machine when the third-party tool
>>>> starts the machine hibernation procedure:
>>>> 
>>>> 01/26/12 09:17:13 PowerEventHander: Some driver/application is asking
>>>> if we can enter hibernation
>>>> 01/26/12 09:17:15 PowerEventHander: Machine entering hibernation
>>>> 
>>>> Looking through the source, it doesn't appear that the handler for
>>>> this event does anything. There's no sign that it updates the
>>>> machine ad to let the collector know the machine is going offline.
>>>> When the collector-side OfflineCollectorPlugin runs, the ad is
>>>> purged, not offlined. If I set the offline attribute on the machine
>>>> ad to true before hibernating the machine by hand, everything works.
>>>> Unfortunately, I don't seem to be able to run a script from the
>>>> hibernation tool that's in use, so I can't (at least not without
>>>> great difficulty) follow this approach in the third-party tool.
>>>> 
>>>> Is it not possible to have this third party hibernation offline the
>>>> machine ad when the hibernate signal is trapped?
>>>> 
>>>> (This is on Windows BTW…)
>>>> 
>>>> Regards,
>>>> - Ian
>>>> 
>>> 
>>> 
>>> 
>> 
>> 
> 
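
The MasterLog excerpt in the original report suggests that the hibernation event is at least visible on the machine, even though nothing updates the machine ad. As a minimal sketch, assuming the log format is exactly what the two quoted lines show, the following Python picks out the "Machine entering hibernation" entries. The function name find_hibernation_events and the sample list are hypothetical and not part of Condor; the spelling "PowerEventHander" is copied verbatim from the quoted log. The sketch only detects the event; it does not advertise anything to the collector.

```python
import re
from datetime import datetime

# Pattern for the MasterLog lines quoted in the report. The spelling
# "PowerEventHander" is taken verbatim from those lines; the overall
# line layout is an assumption based on that two-line excerpt.
POWER_EVENT = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) PowerEventHander: (.*)$"
)


def find_hibernation_events(lines):
    """Return (timestamp, message) pairs for 'entering hibernation' lines."""
    events = []
    for line in lines:
        match = POWER_EVENT.match(line.strip())
        if match and "entering hibernation" in match.group(2):
            stamp = datetime.strptime(match.group(1), "%m/%d/%y %H:%M:%S")
            events.append((stamp, match.group(2)))
    return events


# The two log lines from the report, with the wrapped line rejoined.
sample = [
    "01/26/12 09:17:13 PowerEventHander: Some driver/application is asking "
    "if we can enter hibernation",
    "01/26/12 09:17:15 PowerEventHander: Machine entering hibernation",
]
events = find_hibernation_events(sample)
```

Only the second sample line is returned, because the first is the query ("can we enter hibernation") and not the confirmation that the machine is actually going down.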