
Re: [Condor-users] Java universe and memory (moved from devel to user)



>  check whether the resulting exit code is consistent and happens only
>  in this or similar events.
>  If not, and you can alter your application, use something like:
>   try
>   {
>      // run the application code here
>   }
>   catch (OutOfMemoryError e)
>   {
>      // log it however you normally would
>      System.exit(KNOWN_EXIT_CODE); // some constant number you know
>   }

After reading up some more it seems java.lang.OutOfMemoryError is a
non-fatal error, which is presumably why condor doesn't exit the job.

I'm reluctant to edit the source as it is a 3rd party application and I
don't want to have to modify it for each release.
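One way to get a known exit code without touching the third-party source might be a small launcher class that calls the application's main method by reflection and maps an OutOfMemoryError to a fixed exit status. The following is only a sketch under assumptions: the class name OomGuard and the exit code 42 are made up for illustration, and it only sees errors thrown on the thread that runs main, not on threads the application starts itself.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.Arrays;

// Hypothetical launcher: runs another class's main() and turns an
// OutOfMemoryError into a fixed exit code that a job policy can test for.
class OomGuard {
    static final int OOM_EXIT_CODE = 42;   // arbitrary value chosen for this sketch
    static final int GENERIC_FAILURE = 1;  // any other uncaught throwable

    // Map a throwable to an exit code. Reflection wraps whatever the target
    // threw in an InvocationTargetException, so unwrap that first.
    static int exitCodeFor(Throwable t) {
        Throwable cause = t;
        if (t instanceof InvocationTargetException && t.getCause() != null) {
            cause = t.getCause();
        }
        // Keep logging minimal: the heap may still be nearly full here.
        System.err.println("OomGuard: " + cause);
        return (cause instanceof OutOfMemoryError) ? OOM_EXIT_CODE : GENERIC_FAILURE;
    }

    // Invoke mainClass.main(args) and return the exit code to use.
    static int runGuarded(String mainClass, String[] args) {
        try {
            Method main = Class.forName(mainClass).getMethod("main", String[].class);
            main.invoke(null, (Object) args);
            return 0;
        } catch (Throwable t) {
            return exitCodeFor(t);
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.err.println("usage: java OomGuard <main-class> [args...]");
            System.exit(2);
        }
        int code = runGuarded(args[0], Arrays.copyOfRange(args, 1, args.length));
        // On success, return normally so any non-daemon threads the
        // application started are allowed to finish.
        if (code != 0) {
            System.exit(code);
        }
    }
}
```

The job would then run OomGuard with the real main class as its first argument, and the remaining arguments passed through unchanged. I have not verified how Condor treats the resulting exit, so whether on_exit_hold then sees this code would need testing.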

>  the on_exit_remove or on_exit_hold can trap this and place it on hold
>  for you to deal with.

I can't use either of these, as the job never gets as far as exiting. It just
goes back to idle, is rerun, and hits the same error again, ad infinitum.

The exit code is 1, as for an abnormal termination, so I tried testing for it
in on_exit_hold and periodic_hold, but the first is never evaluated and the
second is evaluated before the exit code is defined.

Is there something like on_evict_hold? I couldn't find anything in the
manual.

Cheers
Craig

PS apologies for posting to the wrong list originally.

