
Re: [Condor-users] Signal definitions



On Tue, Oct 04, 2005 at 02:23:00PM -0500, Alexander Dietz wrote:
> Hi,
> 
> is there a list somewhere explaining the different signals which the 
> jobs can fail with? So, I think 13 is segmentation fault, but what is 
> signal 113?
> 

The output status of a job comes from the job, not Condor. 

Your job can exit either with a signal or an exit value.

Signals are defined by the OS, and the list is usually in
an include file somewhere. (On Linux, it's
/usr/include/asm/signal.h.) Signal 11 is traditionally a
segmentation violation (SIGSEGV).
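
If you'd rather not dig through header files, a scripting
language can usually do the lookup for you. Here is a rough
sketch in Python using the standard signal module. The
helper name signal_name is made up for this example, and the
answers depend on the platform you run it on.

```python
import signal


def signal_name(number):
    """Return the symbolic name of a signal number on this platform.

    Returns None when the number is not a named signal here.
    (signal_name is a hypothetical helper, not part of Condor.)
    """
    try:
        return signal.Signals(number).name
    except ValueError:
        return None


if __name__ == "__main__":
    # 11 and 13 are named signals on Linux; 113 should not be.
    for number in (11, 13, 113):
        print(number, signal_name(number))
```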

Exit values are determined entirely by the job. 0 is
traditionally success.

I doubt that your job is exiting with signal 113. It's
probably exiting with status 113 (which is different from
signal 113). Perhaps your job is not calling exit() or does
not return from main()?
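
To see the difference between the two for yourself, you can
run a small program outside of Condor and look at how it
ended. The following is only a sketch, assuming a POSIX
system, where Python's subprocess module reports death by
signal N as a return code of -N. The names run_child and
describe are invented for this example.

```python
import subprocess
import sys


def run_child(code):
    """Run a snippet of Python in a child process and return its return code."""
    return subprocess.run([sys.executable, "-c", code]).returncode


def describe(returncode):
    """Say whether a child exited normally or was killed by a signal.

    Assumes POSIX behaviour: a negative return code -N means signal N.
    """
    if returncode < 0:
        return f"killed by signal {-returncode}"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    # A child that chooses its own exit value.
    print(describe(run_child("import sys; sys.exit(113)")))
    # A child that sends itself SIGSEGV, so it dies from a signal.
    print(describe(run_child(
        "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)")))
```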

On rare occasions, you'll see Condor reporting a job exiting
with status 139 (or some other status bigger than 128). That's
usually a bug in Condor; your job really exited with a signal
whose number is the status Condor reports minus 128. So when
Condor says status 139, it usually should be saying signal 11.
I think we've tracked all of those bugs down, so a job that
exits with a signal now gets reported as such.
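
The arithmetic is simple enough to write down. This is just
an illustration of the "subtract 128" rule of thumb, not
anything Condor itself runs, and decode_reported_status is a
made-up name. Keep in mind it is a guess: a job is free to
call exit() with a value above 128 on its own.

```python
def decode_reported_status(status):
    """Guess what a reported status means using the 128 + N rule of thumb.

    A status above 128 is treated as "killed by signal (status - 128)";
    anything else is treated as an ordinary exit value.
    """
    if status > 128:
        return ("signal", status - 128)
    return ("exit", status)


if __name__ == "__main__":
    print(decode_reported_status(139))  # ('signal', 11)
    print(decode_reported_status(113))  # ('exit', 113)
```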

-Erik