Re: [HTCondor-users] Setting up a minimum delay before retrying a job that has failed?
- Date: Fri, 12 Apr 2019 14:12:32 +0200 (CEST)
- From: "Beyer, Christoph" <christoph.beyer@xxxxxxx>
- Subject: Re: [HTCondor-users] Setting up a minimum delay before retrying a job that has failed?
I think you could do that inside the node start expression, something like:
START = (NumJobStarts < 1) || ((CurrentTime - EnteredCurrentStatus) > (<time in secs>))
That would probably do the trick. It needs testing, of course, as I am too lazy to do that right now, and you will most likely have to combine it with your other START dependencies ;)
Maybe it's not the most intelligent place to do that either but it's one way to get around your problem for sure ....
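To make the intent of that expression concrete, here is a rough sketch of the same boolean logic in plain Python. It is not HTCondor code and has not been run against a pool; the function name `may_start` and its parameters are made up for illustration, and only the attribute names come from the expression above.

```python
def may_start(num_job_starts, current_time, entered_current_status, min_delay_secs):
    """Mirror the suggested START expression in plain Python (illustrative only)."""
    # First clause: a job that has never started is accepted right away.
    first_attempt = num_job_starts < 1
    # Second clause: a retried job is accepted only after it has spent more
    # than min_delay_secs in its current status.
    waited_long_enough = (current_time - entered_current_status) > min_delay_secs
    return first_attempt or waited_long_enough


# A fresh job is accepted immediately.
print(may_start(0, 1000, 990, 300))
# A retried job that changed status 10 s ago is held back by a 300 s delay.
print(may_start(2, 1000, 990, 300))
# The same job is accepted once more than 300 s have passed.
print(may_start(2, 1400, 990, 300))
```

The first clause lets fresh jobs through without any wait, so only retries are delayed.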
Building 02b, Room 009
----- Original Message -----
From: "Nicolas Arnaud" <narnaud@xxxxxxxxxxxx>
To: "htcondor-users" <htcondor-users@xxxxxxxxxxx>
Sent: Friday, 12 April 2019 13:15:02
Subject: [HTCondor-users] Setting up a minimum delay before retrying a job that has failed?
Is there an easy way to set a (minimum) delay between two (re)tries of
an HTCondor job? That would help in case the failure is due to a
transient problem (like the unavailability of the input data) that is
likely to be solved after O(few minutes) at most. Currently the retries
are so quick that the maximum number of retries is reached before the
transient problem gets cleared.
Thanks in advance,
= Nicolas ARNAUD =
= Laboratoire de l'Accelerateur Lineaire =
= CNRS/IN2P3 & Université Paris-Sud =
= Virgo Experiment =
= European Gravitational Observatory =
= Via E. Amaldi, 5 =
= 56021 Santo Stefano a Macerata =
= Cascina (PI) -- Italia =
= Tel: + 39 050 752 314 =