I am currently having trouble running Java jobs on my
Condor installation. I have a central manager running on Mac
OS X and 60 processing nodes running Windows XP. It all works perfectly for short jobs (less than about 30
minutes of execution time). However, when I try to run jobs that take about 4 hours
to complete, most of them fail to return any output. In the
job log file, I see this:
Normal termination (return value 143)
When a job completes successfully, I see this:
Normal termination (return value 0)
So, my question is: what does
return value 143 mean? How do I find out what may be causing my longer jobs to
fail? It seems to me that they are being evicted. I have
all my run options set to UCWS_TESTINGMODE so that they run all the time and, as
far as I understand, never get evicted. Is this correct?
Thanks for any help you may be able to provide.