Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab


[Condor-users] jobs vacating reason

Date: Thu, 9 Dec 2010 13:03:48 -0500
From: Erik Aronesty <erik@xxxxxxx>
Subject: [Condor-users] jobs vacating reason

I'm very new to Condor. I seem to have gotten it working (one submit node, 6 compute nodes, 36 slots) and am running jobs, but I have a couple of questions:

1. Where can I look to find out precisely why jobs are vacating and restarting?

2. For now, I'm using dedicated machines, so I don't want vanilla jobs to "vacate/kill/die", since that just means they get restarted, usually 90% of the way through. (I haven't yet tried compiling with the Condor libs and running standard universe jobs, but I'd like the config to be set up nicely for them.) If a job without checkpointing is preempted, or if the CPU gets busy, I'd like it to SUSPEND, never vacate.

Here are the relevant config settings I can think of. I suspect KILL_VANILLA and VACATE_VANILLA won't do what I expect, and Condor may use "more drastic measures" anyway (although I'm not sure what "more drastic" means).

SUSPEND = $(CPUBusy)
WANT_SUSPEND = True
MAXVACATETIME = 20 * $(MINUTE)
VACATE = $(ActivityTimer) > $(MaxSuspendTime)
VACATE_VANILLA = False
WANT_VACATE = True
KILL = $(UWCS_KILL)
KILL_VANILLA = False
PREEMPT = $(UWCS_PREEMPT)
PREEMPT_VANILLA = False

Yet I still see things like this when looking at the queue:

LastVacateTime = 1291916587
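
The LastVacateTime value reads most naturally as a Unix epoch timestamp (seconds since 1970-01-01 UTC), which is an assumption here rather than something the post states. Under that assumption, a minimal Python sketch for turning it into a readable time, so it can be lined up against log entries, might look like this. The helper name epoch_to_utc is made up for illustration.

```python
from datetime import datetime, timezone


def epoch_to_utc(epoch_seconds: int) -> str:
    """Render a Unix epoch timestamp as an ISO 8601 string in UTC.

    Assumes the value is seconds since 1970-01-01 00:00:00 UTC.
    """
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()


# The value quoted from the queue above.
print(epoch_to_utc(1291916587))  # 2010-12-09T17:43:07+00:00
```

That is 12:43:07 at the -0500 offset of this message, roughly twenty minutes before it was sent.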

and this when grepping the logs...

Changing state and activity: Claimed/Idle -> Preempting/Vacating
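
To see how often each transition occurs rather than eyeballing grep output, something like the following sketch could tally them. It assumes only that each relevant log line contains the text shown above; any timestamp or prefix before it is ignored. The function name count_transitions and the sample list are illustrative, not part of Condor.

```python
import re
from collections import Counter

# Matches only the message form quoted above: "<from> -> <to>".
TRANSITION = re.compile(r"Changing state and activity: (\S+) -> (\S+)")


def count_transitions(lines):
    """Count (from, to) state/activity transitions found in log lines."""
    counts = Counter()
    for line in lines:
        match = TRANSITION.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Illustrative input: the one line quoted in this message plus a non-matching line.
sample = [
    "Changing state and activity: Claimed/Idle -> Preempting/Vacating",
    "some unrelated log line",
]

for (old, new), n in count_transitions(sample).items():
    print(f"{old} -> {new}: {n}")
```

To run it over a real log, pass an open file object instead of the sample list, since iterating a file yields its lines.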

Follow-Ups:
- Re: [Condor-users] jobs vacating reason
  - From: Matthew Farrellee
