
Re: [HTCondor-users] Daylight savings put all our jobs on hold?



Russell,

When the U.S. switched to daylight saving time last month, I had a
customer who discovered that all of their execute nodes died. The
culprit there turned out to be that condor_master thought the
timestamp of the condor_master.exe binary had changed and attempted to
restart. HTCondor was running as an Active Directory user that wasn't
correctly configured, so condor_master was not able to restart. Ticket
3572[1] has some information on that.

Without knowing what version of HTCondor and what OS you're running, I
can't say whether that's related in any way. In general, I'm not sure why
the time change would cause your jobs to remain in the idle state. Would
it be possible for you to share the output of condor_q -better-analyze
for a representative job?

[1] https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=3572


Thanks,
BC

-- 
Ben Cotton
main: 888.292.5320

Cycle Computing
Leader in Utility HPC Software

http://www.cyclecomputing.com
twitter: @cyclecomputing