
Re: [Condor-users] howto avoid that a job is being evicted



Hello,

Is this question so trivial that I should have found the answer in the manual?
Or is it so complicated that no one can answer it?

Because of pipes, we must run most of our jobs in the vanilla universe,
where evicting a job means it is restarted. In most cases (all the ones I know of)
the restart fails. It is therefore not a good idea to evict a job that has been running
for roughly a day.

Apart from this problem, Condor works great. The distribution of jobs to free nodes
alone greatly improves the performance of our cluster.
At the moment we have a heavy load. If a user has to wait, that is no different from before we used Condor.
If a user loses a job, the system administrator gets the blame!

Thanks for your help
Harald



Harald van Pee wrote:

Hi all,

I just saw the following in the ShadowLog

3/1 22:25:34 (770.0) (22282): Job 770.0 is being evicted
3/1 22:25:34 (770.0) (22282): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107

Can anybody explain what is happening? I have tried to prevent any job from being evicted. There are still enough VMs available, and I use the following in the condor_config.local of each machine.

SUSPEND = FALSE
START = TRUE
RELEASE_DIR = /condor/home/condor-6.6.10
PREEMPT = FALSE
VACATE = KILL = False
WANT_SUSPEND   = False
WANT_VACATE    = False


Am I missing something, or has a serious problem occurred? How can I find out what is happening?

With Best Regards
Harald
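
As a starting point for finding out which jobs were evicted and when, the ShadowLog could be scanned for lines like the one quoted above. The following is only a minimal sketch: the regular expression is modelled on the two log lines in this message and may not match other Condor versions or log formats, and the function name find_evictions is made up for illustration.

```python
import re

# Pattern modelled on the ShadowLog eviction line quoted in this message:
#   3/1 22:25:34 (770.0) (22282): Job 770.0 is being evicted
# Assumption: other Condor versions may format this line differently.
EVICT_RE = re.compile(
    r"^(?P<date>\d+/\d+) (?P<time>\d+:\d+:\d+) "
    r"\((?P<job>\d+\.\d+)\) \((?P<pid>\d+)\): "
    r"Job (?P=job) is being evicted"
)


def find_evictions(lines):
    """Return a (date, time, job_id) tuple for every eviction line found."""
    hits = []
    for line in lines:
        match = EVICT_RE.match(line.strip())
        if match:
            hits.append((match.group("date"), match.group("time"), match.group("job")))
    return hits


# The two ShadowLog lines from this message, used as sample input.
sample = [
    "3/1 22:25:34 (770.0) (22282): Job 770.0 is being evicted",
    "3/1 22:25:34 (770.0) (22282): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107",
]
evictions = find_evictions(sample)
```

To run it against a real log, pass an open file object for the ShadowLog instead of the sample list. This only shows which jobs were evicted and when, not why.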

_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users