
Re: [Condor-users] Disabling the restarting of jobs



never mind ... saw your prev. answer.

Thanks all.

rob

On Feb 29, 2008, at 12:55 PM, Robert E. Parrott wrote:

Thanks for the reply.

Will this expression keep the job from restarting, or put it on hold
after it restarts?

My concern is that, if there is any delay, the restarted job will
overwrite the initial output files before it is put on hold.

Otherwise this would be the simplest approach.

rob


On Feb 29, 2008, at 12:27 PM, Dan Bradley wrote:


In the submit file:
periodic_hold = NumJobStarts > 1

Unfortunately, there is no way to update the hold reason. However, it
will state that the above expression evaluated to true, which will
hopefully be self-explanatory.

--Dan
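
For context, here is a minimal sketch of where that line might sit in a
complete submit description file. Only the periodic_hold line comes from
the reply above; the executable and file names are hypothetical
placeholders, and the rest is an assumed bare-bones job, not the poster's
actual setup.

```
# Hypothetical job; replace the names below with your own.
executable = my_job
output     = my_job.out
error      = my_job.err
log        = my_job.log

# From the reply above: put the job on hold once the NumJobStarts
# attribute in the job ad exceeds 1.
periodic_hold = NumJobStarts > 1

queue
```

Held jobs and their hold reasons can be listed with condor_q -hold, and a
held job can be returned to the queue with condor_release. This sketch
does not settle the timing question raised later in the thread, namely
whether the hold takes effect before a restarted job touches its output
files.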

Robert E. Parrott wrote:

Hi Folks,

We have a certain set of users whose parallel code is fairly strict
in the way it names and manages files. They are looking for an option
to disable the restarting of jobs after the jobs have been damaged
(for example by an OOM condition on a node), because a restarted job
will otherwise overwrite the partial, and still usable, data files.

Is there a submit file expression, or a config expression with a
boolean to be added to the config file, that would put all jobs that
would otherwise restart into a "hold" state?

As a secondary question, would there be a way to update the
"HoldReason" classad expression with something relevant?

thanks,
rob


==========================
Robert E. Parrott, Ph.D. (Phys. '06)
Associate Director, Grid and
        Supercomputing Platforms
Project Manager, CrimsonGrid Initiative
Harvard University Sch. of Eng. and App. Sci.
Maxwell-Dworkin  211,
33 Oxford St.
Cambridge, MA 02138
(617)-495-5045




_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/

