Re: [HTCondor-users] Condor held jobs should retry/release after certain configured timeout automatically



On Tue, Apr 7, 2015 at 11:16 AM, Sridhar Thumma <deadman.den@xxxxxxxxx> wrote:

> Attached wrong image?
>
Hah, oops. I need more coffee. What I meant to copy was:

04/07/15 10:38:13 (pid:12519) Evaluated periodic expressions in 0.000s, scheduling next run in 61s

Looking at your job's ClassAd, I notice that the JobRunCount attribute
isn't in there, which may explain why your expression isn't evaluating
as expected: a reference to a missing attribute evaluates to UNDEFINED,
so a comparison against it never comes out true. According to the docs,
that attribute is on the way out the door:

JobRunCount: This attribute is retained for backwards compatibility.
It may go away in the future. It is equivalent to NumShadowStarts for
all universes except scheduler. For the scheduler universe, this
attribute is equivalent to NumJobStarts.
http://research.cs.wisc.edu/htcondor/manual/v8.2/12_Appendix_A.html

You might try using NumJobStarts or NumSystemHolds instead.
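
For illustration, a submit-file line along these lines might do what
the subject of this thread asks for. It is only a sketch: the limit of
3 holds and the 600-second wait are made-up values, and it assumes the
holds in question are system holds (NumSystemHolds does not count jobs
held by hand with condor_hold).

```
# Hypothetical thresholds; adjust for your site.
# periodic_release is only evaluated while the job is held.
# Release the job once it has been held for more than 10 minutes,
# but stop releasing after 3 system holds.
periodic_release = (NumSystemHolds <= 3) && ((time() - EnteredCurrentStatus) > 600)
```

The release would happen on the first periodic evaluation after the
condition becomes true, so expect up to a minute or so of extra delay
given the interval in the log line above. Running condor_q -long on
the job is a quick way to confirm that every attribute the expression
references is actually present in the job ad.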


Thanks,
BC

-- 
Ben Cotton
main: 888.292.5320

Cycle Computing
Better Answers. Faster.

http://www.cyclecomputing.com
twitter: @cyclecomputing