
Re: [HTCondor-users] Email Error notication



On 11/15/2016 3:04 PM, Uchenna Ojiaku - NOAA Affiliate wrote:

> In the command reference manual it states this concerning error email
> notification:
> "If defined by /Error/, the owner will only be notified if the job
> terminates abnormally, or if the job is placed on hold because of a
> failure, and not by user request".
> 
> I've ran multiple jobs, in this job below the log file returned a
> non-zero value yet the job was complete. *How do I get an error email
> notification when there is an "error' with the job, i.e. a non-zero value?*
>

Hi Uche,

As you discovered, when notification=Error, HTCondor sends email if there is an error launching the job (for instance, if the initial working directory or job executable is missing) or if the job exits with a signal.

If you want HTCondor to do something based upon a normal exit status code, you need to explicitly tell HTCondor which exit codes are considered "success" and which are considered failure.

In the upcoming HTCondor v8.5.8+ release, things are made more intuitive with the introduction of the "success_exit_code=X" macro in the job submit file. See https://is.gd/vsQvJk 

In earlier versions of HTCondor, you can still achieve what you want via the power of ClassAds by replacing your "notification=error" line with one other line, although it is a bit non-obvious.  Appendix A of the HTCondor Manual lists and describes many of the job ClassAd attributes, including the attribute JobNotification ( see https://is.gd/PeDlhv ).  When you put "notification=complete" in your job submit file, condor_submit sets "JobNotification=2" in the job ClassAd, and when you put "notification=error" in the submit file, condor_submit sets "JobNotification=3".

All ClassAd attributes can be set to literals (like the integers 2 and 3), or they can be set to expressions that use a range of functions, including conditionals.  So to achieve what you want, whereby email is sent even if a job runs ok but exits with a non-zero exit code, you can explicitly set JobNotification in your job submit file, as in the following example:

  executable = /bin/bash 
  # Make notification=complete if ExitCode is non-zero, else make it error
  +JobNotification = IfThenElse(ExitCode=!=UNDEFINED && ExitCode=!=0, 2, 3)
  # So this job will not send email
  arguments = "-c 'exit 0'"
  queue
  # And this job will send email
  arguments = "-c 'exit 1'"
  queue
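
For anyone less familiar with ClassAd expressions, the selection logic of the +JobNotification line above can be mirrored in a few lines of Python. This is only an illustrative sketch, not HTCondor code: the function name and constants are hypothetical, and Python's None stands in for the ClassAd value UNDEFINED (the state of ExitCode before the job has exited).

```python
from typing import Optional

# Values taken from the explanation above:
# notification=complete -> JobNotification=2, notification=error -> JobNotification=3
NOTIFY_COMPLETE = 2
NOTIFY_ERROR = 3


def job_notification(exit_code: Optional[int]) -> int:
    """Hypothetical mirror of
    IfThenElse(ExitCode=!=UNDEFINED && ExitCode=!=0, 2, 3).

    None plays the role of UNDEFINED (no exit code recorded yet).
    """
    if exit_code is not None and exit_code != 0:
        # Non-zero exit: behave like notification=complete, so email is
        # sent when the job finishes.
        return NOTIFY_COMPLETE
    # Zero or not-yet-known exit code: behave like notification=error, so
    # email is sent only on a launch failure or death by signal.
    return NOTIFY_ERROR
```

With this sketch, job_notification(1) selects the "complete" behavior and job_notification(0) selects the "error" behavior, matching the two queued jobs in the submit file.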

Hope the above helps.  I realize the above is non-obvious, which is why we made things easier starting in HTCondor v8.5.8.  But I hope it is instructive about the flexibility and power that ClassAds give end users and administrators.  Details about the ClassAd language are in section 4.1 of the Manual.

regards
Todd