Re: [Condor-users] Clarification on Job suspension, holding, and vacating

On Wed, Sep 22, 2010 at 2:24 PM, David Arthur <mumrah@xxxxxxxxx> wrote:
My use case is: I have a few low priority long running jobs that will
always be running, as well as occasional short running high priority
jobs. I would like for the high priority jobs to be able to preempt
the lower priority jobs, but I don't want to lose any progress on the
low priority ones (since they are costly). I feel like this is
possible, but I'm a bit confused on the vocabulary.

Once a job is running in a slot, it owns the slot, and Condor can't suspend it and give that same slot to another job. So to achieve what you're after you have to create slots that each accept only certain types of jobs, with policies that interact with each other. It's not impossible, but it's not trivial either.

Let's say you've got a 2 CPU machine that you'd normally advertise 2 slots from. To achieve your goals you'll want to force the machine to advertise 4 slots instead. "Slot pairs", if you will. Slots 1 & 2 will be a pair and slots 3 & 4 will be a pair.

To advertise 4 identical slots:
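
The setting that followed here did not survive in the archived copy. A minimal sketch, assuming the usual approach of overriding the detected CPU count with NUM_CPUS so the startd carves out one slot per advertised CPU:

```
# Sketch, not the original text: tell Condor the machine has 4 CPUs
# so that it advertises 4 slots instead of 2.
NUM_CPUS = 4
```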


This has the side effect of dividing the machine's memory and disk 4 ways instead of 2. So you may also want to double the memory Condor thinks the machine has with:
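
The original value is missing from the archived copy. As a sketch, assuming a machine with 4096 MB of physical RAM (a made-up figure), the MEMORY setting, which is given in megabytes, could be doubled like this:

```
# Sketch: 4096 MB is an assumed figure, substitute your machine's real RAM.
# Advertising twice that amount gives each of the 4 slots the same share
# that each of the 2 original slots would have had.
MEMORY = 8192
```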


There's not much you can do about disk except perhaps write your job requirement expressions to reference TotalDisk instead of Disk from the machine's ad.

In a slot pair the first slot (the lower numbered slot) will *only* run long running jobs. How do we know a job is long running? You'll have to tell the system when you submit a job:

+LongRunningJob = True
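
For context, a minimal submit description carrying that attribute might look like the following. The executable name is hypothetical; only the +LongRunningJob line comes from this thread.

```
# Sketch of a submit file for a long running job.
# long_simulation.sh is a placeholder name.
universe       = vanilla
executable     = long_simulation.sh
+LongRunningJob = True
queue
```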

And the START expression for the slot will be:

START = LongRunningJob == True && ...whatever other slot stuff you usually have...

The other slot in the pair will only run fast running jobs. Same deal: you'll need to identify them at submit time and tune your START expression to look for the attribute in jobs.
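
One way to write that clause, sketched under the assumption that a fast running job is simply any job that does not set LongRunningJob to True (no separate attribute for short jobs is named in this thread). SHORT_SLOT_START is a hypothetical helper macro meant to be OR'ed into the full START expression, not a Condor setting:

```
# Sketch: clause for the short-job slot of a pair (slot 2 here).
# =!= is the ClassAd "is not identical to" operator, so jobs that never
# define LongRunningJob match as well.
SHORT_SLOT_START = (SlotID == 2 && LongRunningJob =!= True)
```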

You'll also want to cross-advertise the state of each slot in every other slot's ad, so that you can write START/SUSPEND/CONTINUE expressions for slot 1 that reference the state of slot 2.

Still with me?

To advertise the necessary attributes across all the slots you use STARTD_SLOT_ATTRS:

STARTD_SLOT_ATTRS = State, Activity, EnteredCurrentActivity

That would make the state and activity of Slot 2 available in the Slot 1 ad as:

Slot2_State
Slot2_Activity
Slot2_EnteredCurrentActivity

So let's try writing a bit of policy around this. First: let's say that we won't start a long running job while a short running job is using the other slot in the pair. This translates to: jobs won't run in Slot 1 if Slot 2 is running a job already. So:

START = (SlotID == 1 && LongRunningJob =?= True && Slot2_State == "Unclaimed" && Slot2_Activity == "Idle") || (SlotID != 1)

Interesting, eh? Because settings are shared among all the slots (we don't have per-slot config files) we need to write an expression that's different depending on the slot ID. In this case Slot 1 gets the first bit, and every other slot gets True.

Now what if Slot 1 is running a job and something lands in Slot 2? We want to write a policy that suspends the job in Slot 1 while Slot 2 is busy. Not a problem:

SUSPEND = (SlotID == 1 && (Slot2_State == "Claimed" && Slot2_Activity == "Busy")) || (SlotID != 1 && False)
CONTINUE = (SlotID == 1 && (Slot2_State == "Unclaimed" && Slot2_Activity == "Idle")) || (SlotID != 1 && True)

That's, more or less, right I think. I haven't actually tested it but it's in the ballpark of what you're after.

And hopefully you can extrapolate from that to see how you'd expand your setup to control the other slot pair (slots 3 & 4) to behave the same way.
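
As a rough, untested sketch of that extrapolation: LONG_SLOT, PARTNER_BUSY and PARTNER_FREE below are hypothetical helper macros introduced only for this example, and the Slot4_* names assume STARTD_SLOT_ATTRS publishes slot 4's attributes the same way it does slot 2's. I've also assumed WANT_SUSPEND needs to be True for SUSPEND to be evaluated, which is worth checking against the manual for your version.

```
# Sketch only, untested. The three helper macros are made up for this example.
STARTD_SLOT_ATTRS = State, Activity, EnteredCurrentActivity

# True on the long-job slot of each pair (slots 1 and 3).
LONG_SLOT = (SlotID == 1 || SlotID == 3)

# State of the partner slot: slot 2 for slot 1, slot 4 for slot 3.
PARTNER_BUSY = ((SlotID == 1 && Slot2_State == "Claimed" && Slot2_Activity == "Busy") || (SlotID == 3 && Slot4_State == "Claimed" && Slot4_Activity == "Busy"))
PARTNER_FREE = ((SlotID == 1 && Slot2_State == "Unclaimed" && Slot2_Activity == "Idle") || (SlotID == 3 && Slot4_State == "Unclaimed" && Slot4_Activity == "Idle"))

# Long-job slots start only flagged jobs, and only while the partner is free.
START = ($(LONG_SLOT) && LongRunningJob =?= True && $(PARTNER_FREE)) || ($(LONG_SLOT) == False)

# Assumed to be required before the startd will look at SUSPEND.
WANT_SUSPEND = True
SUSPEND = $(LONG_SLOT) && $(PARTNER_BUSY)
CONTINUE = ($(LONG_SLOT) && $(PARTNER_FREE)) || ($(LONG_SLOT) == False)
```

On slots 2 and 4 LONG_SLOT is False, so START and CONTINUE come out True and SUSPEND comes out False, matching the single-pair expressions above.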

I'd like to point out some caveats though:

1. This is infinite suspension. As long as you have jobs running in slot 2, the slot 1 job stays suspended (suspended, not "held", which is a different job state in Condor). You can use the PREEMPT setting to evict a slot 1 job that's been suspended for a long time and give it a chance to run on some other machine.
2. Suspending a job just gets you back CPU. It doesn't get you back the memory used by the suspended job. And, depending on the tool, it sometimes doesn't get you back the licenses it's using either. Worth keeping in the back of your mind if you find you're running out of machine or shared resources.
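
For caveat 1, a sketch of a PREEMPT expression that evicts a job once it has been suspended too long. The 30 minute limit is an arbitrary assumption, and the helper macro names follow the pattern used in the stock condor_config. Keep in mind that a job with no checkpointing of its own starts over after being evicted, so a generous limit may be appropriate for costly jobs.

```
# Sketch only. The 30 minute limit is an arbitrary example value.
MINUTE = 60
MaxSuspendTime = (30 * $(MINUTE))
ActivityTimer = (CurrentTime - EnteredCurrentActivity)

# Evict a job that has been suspended for longer than MaxSuspendTime.
PREEMPT = (Activity == "Suspended") && ($(ActivityTimer) > $(MaxSuspendTime))
```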

Hopefully that wasn't too much to follow.

- Ian

Cycle Computing, LLC
The Leader in Open Compute Solutions for Clouds, Servers, and Desktops
Enterprise Condor Support and Management Tools