Re: [HTCondor-users] [resubmit jobs]
- Date: Tue, 01 Jul 2014 12:09:22 -0500
- From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] [resubmit jobs]
On 7/1/2014 11:03 AM, Sunshine wrote:
I submitted some jobs.
A few of the jobs took 2 hours to complete, but I think they should take about 20 minutes, and some similar jobs did finish within 20 minutes.
I think something is wrong with my jobs or the cluster.
My question: how do I make a job restart after a specific time?
For example, if a job hasn't finished within 5 minutes, can it be resubmitted or restarted on a different machine?
For example, I submit 100 jobs; 99 finish within 20 minutes, but one takes 2 hours, and I want to resubmit that one job.
I used the following:
periodic_remove = (CurrentTime - EnteredCurrentStatus > 60*20)
Then I check the log and resubmit the failed job by hand.
Any better ideas?
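
One possible alternative to removing and manually resubmitting, sketched here as a submit-file fragment and not taken from the thread itself: put the job on hold when it has been running too long, then release it automatically so it is rescheduled. The 20-minute limit mirrors the expression above; the cap of 3 starts is an assumed value chosen only for illustration. Note that the original periodic_remove expression does not test JobStatus, so it would also match a job that has been idle in the queue for more than 20 minutes.

```
# Sketch only: the 20-minute limit and the retry cap of 3 are assumptions.

# Hold a job that has been in the running state (JobStatus == 2)
# for more than 20 minutes.
periodic_hold = (JobStatus == 2) && ((time() - EnteredCurrentStatus) > (20 * 60))

# Release a job that is held (JobStatus == 5) because periodic_hold fired
# (HoldReasonCode == 3), as long as it has started fewer than 3 times.
periodic_release = (JobStatus == 5) && (HoldReasonCode == 3) && (NumJobStarts < 3)
```

A released job returns to the idle state and is matched again, so it may land on a different machine, although nothing in this fragment guarantees that. After the assumed third start the job stays on hold, where it can be inspected with condor_q -hold.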