
Re: [Condor-users] offline compute nodes and Rooster





On 10/16/10 8:24 AM, Paul Haldane wrote:
3.  Offline slots _should_ (I think they should, but would like
confirmation) continue to appear in the output of condor_status (using
-constraint Offline to just see offline slots).  In our environment
they only appear for 10/20 minutes after powering off.  This isn't what
I expect because OFFLINE_EXPIRE_ADS_AFTER defaults to maxint.

Yes, the offline ads should remain visible in condor_status.  They
should not expire in 30 minutes if you are using the default
OFFLINE_EXPIRE_ADS_AFTER.
I've just been able to grab (using condor_status -l yard10.campus.ncl.ac.uk) the ClassAd for a machine that's unpingable (so it is hibernating) but still visible in condor_status output.

I won't include all 109 lines of output here (unless that would be useful; the full version is at http://www.staff.ncl.ac.uk/paul.haldane/yard10.txt).  It all looks plausible to me apart from:

Offline = ((CurrentTime - EnteredCurrentState) >= 60 && MachineLastMatchTime =?= UNDEFINED && State =?= "Unclaimed")

Is that correct or should it just be a simple Boolean value?

I know why it's showing that value (the config file on the compute nodes contains "Offline = $(ShouldHibernate)"), but I'm perfectly willing to believe that it's rubbish.

I would expect Offline to just be a simple boolean value. If you are setting it in the config file, I'd recommend getting rid of that setting. Condor should set Offline automatically when it publishes the final ad before hibernating.
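
As a rough sketch of that change, the compute-node config might end up looking something like the fragment below. The ShouldHibernate expression is taken from the ad output quoted above. The HIBERNATE line is only an assumption about how the macro is used at your site (the "S3" sleep state is illustrative), so adjust it to match whatever you actually have.

```
# Site-defined macro; this is the expression that showed up as the
# value of Offline in the condor_status -l output.
ShouldHibernate = ( (CurrentTime - EnteredCurrentState) >= 60 \
                    && MachineLastMatchTime =?= UNDEFINED \
                    && State =?= "Unclaimed" )

# Assumption: the hibernation policy itself keeps using the macro.
# "S3" is illustrative; use whichever sleep state your nodes support.
HIBERNATE = ifThenElse( $(ShouldHibernate), "S3", "NONE" )

# Removed, so that Condor sets Offline itself in the final ad it
# publishes before hibernating:
# Offline = $(ShouldHibernate)
```

After reconfiguring the nodes and letting one hibernate, condor_status -l for that machine should show Offline as a plain boolean rather than the expression.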

--Dan