
Re: [HTCondor-users] Send offline classAD on execute node graceful shutdown



I managed to solve this.  

Adding EXPIRE_INVALIDATED_ADS = True to condor_config.local on the collector basically resolves this. Combined with ABSENT_REQUIREMENTS = True and a location set for COLLECTOR_PERSISTENT_AD_LOG, it means that machines shut down by power management software, or by students trying to be mindful, end up in the persistent ad log. They can then be woken by roosters in their subnet, and condor users can see them with condor_status -absent.

The documentation explains all this, but the chain of logic wasn't completely clear without digging around in the logs:
Nodes that shut down gracefully send INVALIDATE_MASTER_ADS to the collector.
You need to set the collector to expire those ads rather than invalidate them (EXPIRE_INVALIDATED_ADS).
And then, rather than letting the ads actually expire, have the collector make them absent ads (ABSENT_REQUIREMENTS).
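
A minimal sketch of how the three collector-side settings named above might sit together in condor_config.local. The knob names and the True values come from this message. The log path is only a placeholder assumption and should be replaced with a location that suits the collector host.

```
# condor_config.local on the collector (sketch; the path below is hypothetical)

# Treat invalidated ads as expired instead of discarding them immediately
EXPIRE_INVALIDATED_ADS = True

# Mark every expiring ad as absent rather than dropping it
ABSENT_REQUIREMENTS = True

# Where the collector keeps its persistent ads (placeholder path, adjust as needed)
COLLECTOR_PERSISTENT_AD_LOG = /var/lib/condor/spool/persistent_ads.log
```

Once the collector has picked up the change, a gracefully shut down node should show up under condor_status -absent.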

I thought I should reply to my own question in case it is useful to anyone else.

Cheers,
Andre Geldenhuis


From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Andre Geldenhuis <Andre.Geldenhuis@xxxxxxxxx>
Sent: Monday, 2 March 2015 2:27 p.m.
To: htcondor-users@xxxxxxxxxxx
Subject: [HTCondor-users] Send offline classAD on execute node graceful shutdown
 

Hi,

I'm building a small HTCondor cluster to test its interaction with our university's power management software. The final stumbling block is getting an execute node to send an offline ClassAd when it is shut down gracefully. This would allow the rooster to wake nodes shut down by students or by our power management software.

Currently a graceful shutdown sends INVALIDATE_MASTER_ADS to the collector. Is it possible to configure the node to send its offline ClassAd at the same time?

I can do this manually by saving the machine ClassAd to disk and resending it to the collector after shutdown with condor_advertise (after editing the requisite fields). This would be easy to automate, but it seems that it shouldn't be necessary.

Absent ClassAds also work, but only if I kill the condor processes on the node, which prevents INVALIDATE_MASTER_ADS from being sent. This would obviously not be the usual case.

Any help with this would be greatly appreciated.
Regards,
Andre Geldenhuis