
Re: [HTCondor-users] Condor held jobs should retry/release after certain configured timeout automatically



Hey,

Thanks a lot. I used NumJobStarts and it worked like a charm. :)

On Tue, Apr 7, 2015 at 8:53 PM, Ben Cotton <ben.cotton@xxxxxxxxxxxxxxxxxx> wrote:
On Tue, Apr 7, 2015 at 11:16 AM, Sridhar Thumma <deadman.den@xxxxxxxxx> wrote:

> Attached wrong image?
>
Hah, oops. I need more coffee. What I meant to copy was:

04/07/15 10:38:13 (pid:12519) Evaluated periodic expressions in 0.000s, scheduling next run in 61s

Looking at your job's classad, I notice that the JobRunCount attribute
isn't in there, which may explain why your expression isn't evaluating
as expected. According to the docs, that attribute is on the way out
the door:

JobRunCount: This attribute is retained for backwards compatibility.
It may go away in the future. It is equivalent to NumShadowStarts for
all universes except scheduler. For the scheduler universe, this
attribute is equivalent to NumJobStarts.
http://research.cs.wisc.edu/htcondor/manual/v8.2/12_Appendix_A.html

You might try using NumJobStarts or NumSystemHolds instead.
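
As a rough sketch of what that could look like in a submit description file, assuming the goal is to release held jobs automatically after a fixed wait and to stop retrying after a few attempts. The thresholds (5 starts, 600 seconds) are illustrative assumptions, not values from this thread:

```
# Release a held job once it has been on hold for more than 10 minutes,
# but only while it has started fewer than 5 times (both limits are assumptions).
# EnteredCurrentStatus is the time the job entered its current (held) state.
periodic_release = (NumJobStarts < 5) && ((time() - EnteredCurrentStatus) > 600)
```

Periodic expressions are only checked at each periodic evaluation pass (the "scheduling next run in 61s" in the log line above), so the actual release may lag the configured wait by up to one interval.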


Thanks,
BC

--
Ben Cotton
main: 888.292.5320

Cycle Computing
Better Answers. Faster.

http://www.cyclecomputing.com
twitter: @cyclecomputing