Re: [Condor-users] Java universe and memory (moved from devel to user)
- Date: Wed, 12 Mar 2008 10:02:46 -0500
- From: Jaime Frey <jfrey@xxxxxxxxxxx>
- Subject: Re: [Condor-users] Java universe and memory (moved from devel to user)
On Mar 7, 2008, at 2:27 PM, Craig Bruce wrote:
>> the on_exit_remove or on_exit_hold can trap this and place it on hold
>> for you to deal with.
>
> I can't use either of these as the job never gets as far as exiting; it
> goes back to idle and will resubmit to get the same error, ad infinitum.
> The exit code is 1, as an abnormal termination, so I tried this in
> on_exit_hold and periodic_hold, but the first doesn't run and the second
> runs before the exit code is defined.
>
> Is there something like on_evict_hold? I couldn't find anything in the
Condor evaluates on_exit_hold/remove when the job completes and is
ready to leave the queue. Since Condor leaves the job in the queue on
OutOfMemory, the on_exit expressions are not evaluated.
Here's how you can use periodic_hold:
periodic_hold = NumJobStarts =!= Undefined && NumJobStarts > 2
The first half of the expression is required because NumJobStarts
isn't defined in the job ad until it starts running for the first time.
This will also catch jobs that re-execute for other reasons, but it
will stop infinite re-execution.
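
For context, here is a rough sketch of where that line might sit in a
complete Java universe submit file. The class name and the output, error,
and log file names are placeholders invented for illustration, not taken
from the original job; only the periodic_hold line comes from above.

```
# Hypothetical Java universe job; Example.class and the file names
# below are placeholders.
universe   = java
executable = Example.class
arguments  = Example
output     = example.out
error      = example.err
log        = example.log

# Put the job on hold once it has started more than twice.
# The =!= test guards against NumJobStarts being undefined
# before the job has run for the first time.
periodic_hold = NumJobStarts =!= Undefined && NumJobStarts > 2

queue
```

With something like this in place, a job that keeps failing should end up
held rather than idle, where it can be inspected and then released or
removed by hand.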
Thanks and regards,
UW-Madison Condor Team