
Re: [Condor-users] Advice on job suspension vs preemption with partitionable slots



On Wed, Feb 22, 2012 at 05:46:42PM +0100, Michael Hanke wrote:
> Apparently not, as it seems SUSPEND is not evaluated into the context of
> a candidate job matching. The question remains, how I can achieve
> particular jobs getting suspended instead of being evicted (until other
> resources except cpus in a partitionable slot are exhausted).

I tried to extend the approach from

  https://condor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToSuspendJobs

to work with partitionable slots. Obviously one cannot have dedicated
pairs of slots, as the partitions may have arbitrary "sizes". Instead I
defined two partitionable slots each offering all CPUs (ignoring memory
for now). The idea was to limit the actual number of utilized CPUs to
the number of physical CPUs by suspending jobs from the low-priority
slot if necessary (matching the current demand on the high priority
slot).
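
The slot layout described above could look roughly like this (a sketch,
not my tested config -- the exact knob names should be checked against
the manual):

# Two partitionable slots, each advertising all CPUs of the machine.
SLOT_TYPE_1 = cpus=100%
SLOT_TYPE_1_PARTITIONABLE = TRUE
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_2 = cpus=100%
SLOT_TYPE_2_PARTITIONABLE = TRUE
NUM_SLOTS_TYPE_2 = 1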

Directly modifying the last segment of the above sketch, I defined:


# Slot 2 is for jobs that get suspended while slot 1 is busy
slot2_cpus_suspended = slot2_Cpus - ( (slot2_1_Activity =?= "Suspended") + (slot2_2_Activity =?= "Suspended") )
SLOT2_START    = TARGET.IsSuspensionJob =?= true
SLOT2_CONTINUE = MY.slot1_cpus + ($(slot2_cpus_suspended)) >= TARGET.RequestCpus
SLOT2_PREEMPT  = FALSE
SLOT2_SUSPEND  = slot1_Cpus + slot2_cpus + ($(slot2_cpus_suspended)) < TotalCpus / 2


This (somewhat) has the desired effect: jobs on partitions of slot2 get
suspended whenever cpus from slot1 are claimed -- and only as many jobs
are suspended as cpus are required.

There are three problems:

1) The above assumes only one cpu per job (which makes the whole thing
   pointless).
2) The slot2_cpus_suspended expression needs to be extended to match the
   theoretical maximum number of partitions (doable).
3) Whenever slot1 is utilized, but not completely, Condor cycles through
   all suspended jobs on slot2, continuously suspending and resuming
   them (I am not sure how much negative impact that has).
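
For problem 1, one could perhaps weight each suspended dynamic slot by
its cpu count instead of counting slots -- ClassAd booleans coerce to
0/1 in arithmetic. Untested sketch (extend the pattern per problem 2):

# Cpus held by suspended dynamic slots of slot2, weighted by each
# slot's Cpus attribute, so multi-cpu jobs are counted correctly.
slot2_cpus_suspended = ( (slot2_1_Activity =?= "Suspended") * slot2_1_Cpus \
                       + (slot2_2_Activity =?= "Suspended") * slot2_2_Cpus )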

I'd be glad if someone could comment on this problem. Is there a proper
solution, or am I wasting my time here?

The only other solution seems to be to assign dedicated resources to
"uninterruptible" jobs, with the obvious disadvantage that jobs relying
on this property can only use a fraction of the resources, even if
everything else is idle.
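
For completeness, that fallback would look something like the following
(the 4-cpu split and the IsUninterruptible attribute are illustrative
assumptions, following the SLOTn_START macro convention from the wiki
recipe above):

# Reserve a fixed set of cpus for uninterruptible jobs; everything
# else goes to the remaining slot.
SLOT_TYPE_1 = cpus=4
SLOT_TYPE_1_PARTITIONABLE = TRUE
NUM_SLOTS_TYPE_1 = 1
SLOT1_START = TARGET.IsUninterruptible =?= True
SLOT2_START = TARGET.IsUninterruptible =!= True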

Michael

-- 
Michael Hanke
http://mih.voxindeserto.de