[Condor-users] [newbie question: using DAGman how can I restart a job that failed after that another script solve it]
- Date: Fri, 09 May 2008 15:30:12 -0400
- From: Jean-Pierre Ocalan <jean-pierre.ocalan@xxxxxxxxxxxxxxxx>
- Subject: [Condor-users] [newbie question: using DAGman how can I restart a job that failed after that another script solve it]
I'm trying some exercises with Condor to better understand how this
huge system works.
Let's say I have a few jobs organized as follows:
A -> B -> C -> F
     B -> D -> E
(D depends on B)
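
As a DAGMan input file, this layout could be sketched roughly as
follows. The submit-file names (a.sub, b.sub, ...) are hypothetical
placeholders, not files from my actual setup:

```
# Hypothetical DAG input file; the *.sub submit-file names are placeholders.
JOB A a.sub
JOB B b.sub
JOB C c.sub
JOB D d.sub
JOB E e.sub
JOB F f.sub

# A runs first, then B; C and D both wait for B.
PARENT A CHILD B
PARENT B CHILD C D
PARENT C CHILD F
PARENT D CHILD E
```
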
Now suppose B fails. I don't want to retry B immediately with the
command RETRY B <number of times>.
Instead, I want to launch another script that repairs the problem and
then restarts B.
I guess I can work with the PRE and POST scripts.
Let's say my POST script, launched after B has run, checks the returned
value and, if there is a problem, fixes it. But how can I then tell
DAGMan to restart B?
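
To make the POST script idea concrete, the wiring in the DAG file might
look like the line below. The script name fix_b.sh is a hypothetical
placeholder; $RETURN is the DAGMan macro that passes the job's exit
code to a POST script. This only shows how the script gets to see B's
return value, not how B would be restarted:

```
# fix_b.sh is a hypothetical script name. DAGMan replaces $RETURN with
# the exit code of B's job, so the script can check it and repair if needed.
SCRIPT POST B fix_b.sh $RETURN
```
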
Do I have to create a new workflow of jobs like this?