Re: [Condor-users] Disabling the restarting of jobs

In the submit file:
periodic_hold = NumJobStarts > 1

Unfortunately, there is no way to update the hold reason. However, it will state that the above expression evaluated to true, which will hopefully be self-explanatory.
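
For context, here is a minimal sketch of where that line would sit in a complete submit file. The executable and log file names are placeholders of my own, not anything from the original question:

```
# Placeholder executable and log file; substitute your own.
executable    = my_job.sh
log           = my_job.log

# Expression from above: the job is placed on hold once
# NumJobStarts is greater than 1.
periodic_hold = NumJobStarts > 1

queue
```

Held jobs and their hold reasons can then be listed with condor_q -hold.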


Robert E. Parrott wrote:

Hi Folks,

We have a certain set of users whose parallel code is fairly strict in the way it names and manages files. They are looking for an option to disable the restarting of jobs after the jobs have been damaged (for example by an OOM condition on a node), because a restarted job will otherwise overwrite the partial, and still usable, data files.

Is there a submit file expression, or a config expression with a boolean to be added to the config file, that would put all jobs that would otherwise restart into a "hold" state?

As a secondary question, would there be a way to update the "HoldReason" ClassAd attribute with something relevant?


