Re: [HTCondor-users] Register a job as failed from it's exit code



On Mon, Apr 01, 2013 at 12:12:05PM +1000, Robert McMillan wrote:
> Thanks for your help Nathan,
> 
> I can see in the job log that it is now retrying the job. 
> "(1) Normal termination (return value -532462766)
> The job attribute OnExitRemove expression 'ExitCode =?= 0' evaluated to
> FALSE"
> 
> Is there also a way to set the maximum number of times the job will retry?

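Something like this should work, I think. It lets the job leave the queue
once it exits with code 0 or has been started more than 3 times: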

on_exit_remove = ExitCode =?= 0 || NumJobStarts > 3
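
For context, here is a minimal sketch of a complete submit file using that
policy. The executable name and the log/output file names are placeholders
made up for illustration, so adjust them to your setup. The parentheses are
only there for readability.

```
# Hypothetical submit file; my_job.exe and the file names are placeholders.
universe       = vanilla
executable     = my_job.exe
log            = my_job.log
output         = my_job.out
error          = my_job.err

# Leave the queue only on exit code 0, or once the job has been started
# more than 3 times; otherwise the job goes back to idle and is rerun.
on_exit_remove = (ExitCode =?= 0) || (NumJobStarts > 3)

queue
```

With this policy a job that keeps failing is started at most 4 times. After
the last attempt it leaves the queue even though its exit code is non-zero,
so check the job log for the final return value.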

> 
> Regards,
> Robert
> 
> -----Original Message-----
> From: htcondor-users-bounces@xxxxxxxxxxx
> [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Nathan Panike
> Sent: Monday, 1 April 2013 11:34 AM
> To: HTCondor-Users Mail List
> Subject: Re: [HTCondor-users] Register a job as failed from it's exit code
> 
> Place the following line in a submit file:
> 
> on_exit_remove = ExitCode =?= 0
> 
> Nathan Panike
> 
> On Sun, Mar 31, 2013 at 12:31:01PM +1000, Robert McMillan wrote:
> > Hi,
> > 
> > I currently have condor running on a Windows 7 x64 environment. My 
> > Condor jobs call a small executable file, and if the executable fails 
> > it returns a non-zero exit code. Is there a way for Condor to register 
> > this as a "Failed Job" and potentially retry it?

Nathan Panike