[HTCondor-users] [resubmit jobs]
- Date: Wed, 2 Jul 2014 00:03:11 +0800
- From: "Sunshine" <492096437@xxxxxx>
- Subject: [HTCondor-users] [resubmit jobs]
I submitted some jobs.
A few of them took 2 hours to complete, but I expected them to take about 20 minutes, and some similar jobs did finish within 20 minutes.
I think something is wrong with my jobs or my cluster.
My question: how can I make a job restart after a specific time?
For example, if a job doesn't finish within 5 minutes, can it be resubmitted or restarted on a different machine?
For example, I submit 100 jobs; 99 of them finish within 20 minutes, but one takes 2 hours, and I want to resubmit that one job.
I used the following:
periodic_remove = (CurrentTime - EnteredCurrentStatus > 60*20)
Then I check the log and resubmit the failed job by hand.
Any better ideas?
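
One possible alternative, as a rough sketch rather than a tested recipe: put the slow job on hold instead of removing it, then release it automatically so it is matched and started again without a manual resubmit. The 20-minute limit is taken from the example above; the cap of 3 starts is an assumed value added only to keep a job from retrying forever. The JobStatus == 2 test restricts the timer to running jobs; the periodic_remove expression above has no such test, so it would also remove jobs that sit idle for more than 20 minutes.

```
# Sketch only: the 20-minute limit and the cap of 3 starts are assumed values.
# Hold a job once it has been running (JobStatus == 2) for more than 20 minutes.
periodic_hold = (JobStatus == 2) && (CurrentTime - EnteredCurrentStatus > 60*20)
# Release jobs that were held by periodic_hold (HoldReasonCode == 3),
# as long as they have started fewer than 3 times.
periodic_release = (JobStatus == 5) && (HoldReasonCode == 3) && (NumJobStarts < 3)
```

A released job returns to the idle state and is matched again, so it may run on a different machine, but nothing here guarantees that. The expressions are only evaluated periodically, so the 20-minute limit is approximate.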