
Re: [Condor-users] criteria for non-DAG job failures?



On Thu, Jul 05, 2012 at 08:30:14AM -0500, Nathan Panike wrote:
> On Tue, Jul 03, 2012 at 03:04:19PM -0400, Vlad wrote:
> > Nathan,
> > 
> > The expression you gave me had the effect that the jobs with non-zero
> > return codes get placed on hold. That is good. However, with "notification
> > = error" I would have liked Condor to send me an email about such job
> > failures, but that does not seem to happen.
> > 
> 
> Looking at the code, I see that a periodic hold, as in this case, is not
> considered an error. Maybe it should be.
> 
> > Someone I discussed this with has found this:
> > https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=1976 The behavior
> > described there is exactly what I get. The issue is known but also appears
> > as fixed in v7.7.5. I am on v7.8. Could this be a regression or am I
> > missing more configuration settings?
> 
> At the moment, CondorWiki is down and I cannot track this down.  I will
> check this when CondorWiki is working again.
> 

The fix for #1976 above does not actually address the issue you want
fixed, so your issue is still outstanding.

Nathan Panike