Re: [Condor-users] Automatic Restart of Failed jobs.
- Date: Tue, 5 Oct 2010 23:16:46 -0400
- From: Ian Chesal <ichesal@xxxxxxxxxxxxxxxxxx>
- Subject: Re: [Condor-users] Automatic Restart of Failed jobs.
You bet. This is where on_exit_remove comes in handy:
on_exit_remove = ExitCode =?= 0
This says: remove the job from the queue when it ends only if it exits with a value of zero. Otherwise it goes back into the queue in the idle state and will be run again.
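For context, here is a minimal sketch of how that line might sit in a complete submit file. Only the on_exit_remove line comes from the advice above; the executable name, file names and job count are hypothetical placeholders, not taken from the original poster's setup.

```
# Hypothetical submit description file. Every name below is a placeholder.
universe   = vanilla
executable = run_exonerate.sh
arguments  = query_$(Process).fa
output     = exonerate_$(Process).out
error      = exonerate_$(Process).err
log        = exonerate.log

# Remove the job from the queue only when it exits with status 0.
# =?= is the "is identical to" operator, so if ExitCode is undefined
# (for example, the job was killed by a signal) the expression is
# False rather than Undefined, and the job stays in the queue.
on_exit_remove = ExitCode =?= 0

# Placeholder count: one job per split query file.
queue 100
```

Note that with this expression a job that fails every time is put back in the queue every time, so it may be worth watching the log for jobs that keep restarting.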
On 2010-10-05, at 10:32 PM, Edier Alberto Zapata Hernández <edalzap@xxxxxxxxx> wrote:
> Good night,
> Today I was running some tests with Exonerate using Condor. I split the queries file into many files, each one with only 1 sequence in it. The problem is that some of the jobs failed: some because the node was down when I put the database on them, others because they crashed, and so on.
> I got the error files for all the jobs, but checking them one by one, finding each job's files and restarting it is a little slow (the queries file has 13,600+ sequences). Is there a parameter in the submit file to specify that if a job fails (and only if it fails; if the job finishes OK, no action should be taken) Condor should try to restart it?
> Thank you.
> Edier Alberto Zapata Hernández
> Est. Ingeniería de Sistemas
> Universidad de Valle