[Condor-users] Abnormal Termination received Signal 11
- Date: Sat, 22 Mar 2008 05:38:47 -0400
- From: vinai sundaram <vsundar@xxxxxxxxxx>
- Subject: [Condor-users] Abnormal Termination received Signal 11
I have two jobs. One of the jobs runs fine when submitted to Condor.
However, the other job receives a signal 11. I ran the other job
manually on the same machine where Condor scheduled it (worker node) and
the job runs fine for the same input. It also runs fine for the same
input in the local machine I am using to submit the jobs. Moreover, if I
submit the job through globus ( universe = globus ), the job runs fine.
However I have other issues with globus universe and hence, need to use
I notice that the job sometimes won't start. At other times it runs for
a bit, gets a signal 11, and exits.
According to an earlier post in this list, different library versions
can be a potential reason. However, I manually verified the versions of
the libraries on the worker node and my local machine and they are
identical. LD_LIBRARY_PATH is not set on either machine and hence
doesn't make a difference.
Since the code runs fine for the same input, I believe the signal
11 is due to the environment the job runs in. Other than the library
versions, I don't know what to check for. Are there any other
environment variables or condor-specific settings missing that need to
be set in the submission file?
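
One way to narrow this down might be to record the environment the job
actually sees under Condor and compare it with the one from a manual run.
The script below is only a rough sketch, not anything Condor provides: the
function names snapshot and diff_snapshots and the choice of resource
limits are my own assumptions. It prints the environment variables,
a few resource limits (a small stack limit is one possible cause of a
signal 11), and the working directory as JSON.

```python
import json
import os
import resource

# Resource limits that may differ between an interactive shell and a batch slot.
# The selection here is an assumption; extend it as needed.
LIMITS = {
    "stack": resource.RLIMIT_STACK,
    "core": resource.RLIMIT_CORE,
    "address_space": resource.RLIMIT_AS,
    "data": resource.RLIMIT_DATA,
    "open_files": resource.RLIMIT_NOFILE,
}


def snapshot():
    """Collect environment variables, resource limits, and the working directory."""
    return {
        "env": dict(os.environ),
        # Stored as [soft, hard] lists so a JSON round trip compares equal.
        "limits": {name: list(resource.getrlimit(res)) for name, res in LIMITS.items()},
        "cwd": os.getcwd(),
    }


def diff_snapshots(a, b):
    """Return {section: {key: (value_in_a, value_in_b)}} for every key that differs."""
    out = {}
    for section in ("env", "limits"):
        keys = set(a[section]) | set(b[section])
        changed = {
            k: (a[section].get(k), b[section].get(k))
            for k in keys
            if a[section].get(k) != b[section].get(k)
        }
        if changed:
            out[section] = changed
    if a["cwd"] != b["cwd"]:
        out["cwd"] = (a["cwd"], b["cwd"])
    return out


if __name__ == "__main__":
    # Print the snapshot; redirect stdout to a file in each setting.
    print(json.dumps(snapshot(), indent=2, sort_keys=True))
```

The idea would be to submit this script as a job on the same worker node,
run it once by hand there, load both outputs with json.load, and pass them
to diff_snapshots. If the environments turn out to differ, setting
getenv = True in the submission file (which, as far as I know, copies the
submitting shell's environment into the job) may be worth trying.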
Do let me know if the problem is not clear or if you need any details. I
will appreciate any tips/pointers.