
Re: [Condor-users] Suspend and resume jobs by on demand




Here is an example of a configuration that suspends the job on one batch slot while the other batch slot is busy. It is based on a working configuration, but it is untested in the specific form below. It is designed for a single-CPU system and uses the new 6.9.3 "slot" terminology in place of the old "vm" terminology. It can be extended to SMP machines in a fairly straightforward way, and it can be translated into the old "vm" syntax for Condor 6.8 easily enough.


# You may want to advertise double the amount of system memory
# if you have enough virtual memory to allow the foreground job
# to consume all of memory while the suspended job gets pushed
# into swap memory.  There is currently no convenient way to
# tell Condor you want to oversubscribe your memory, so you
# have to hard-code the amount of memory you want to advertise
# by uncommenting and filling in the following:
# Memory = TWICE_YOUR_SYSTEM_MEMORY
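# For example, on a hypothetical machine with 2048 MB of RAM
# (the value is in megabytes and, by default, is divided evenly
# among the slots, so each of the two slots would then advertise
# the full 2048 MB):
# Memory = 4096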

NUM_CPUS = 2

# So that the suspension slot can see the state
# of the other slot, we need to have some things
# advertised about each slot in the ClassAds of
# all the other slots on the same machine:
STARTD_SLOT_EXPRS = State, RemoteUser, CurrentRank

# For informational purposes, put IsSuspensionSlot
# in the startd ClassAd:
STARTD_ATTRS = IsSuspensionSlot

# Slot 1 is the "normal" batch slot
SLOT1_IsSuspensionSlot = False
# Slot 2 suspends its jobs, rather than preempting them
SLOT2_IsSuspensionSlot = True


START    = ($(SLOT1_START))    || ($(SLOT2_START))
CONTINUE = ($(SLOT1_CONTINUE)) || ($(SLOT2_CONTINUE))
PREEMPT  = ($(SLOT1_PREEMPT))  || ($(SLOT2_PREEMPT))
SUSPEND  = ($(SLOT1_SUSPEND))  || ($(SLOT2_SUSPEND))


# The purpose of the following expression is to prevent a
# job from starting on slot 1 if it has less priority to run
# than the job already running on slot 2, because once we let
# a job run on slot 1, the slot 2 job will be suspended.
# This expression refers to attributes that are only defined
# when requirements are being evaluated by the Negotiator:
# SubmittorPrio [sic] and RemoteUserPrio

SLOT1_HAS_PRIO = SubmittorPrio =?= UNDEFINED || \
                 slot2_RemoteUserPrio =?= UNDEFINED || \
                 SubmittorPrio < 1.2 * slot2_RemoteUserPrio || \
                 slot2_CurrentRank =?= UNDEFINED || \
                 MY.Rank > slot2_CurrentRank

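# Slot 1 is a normal execution slot. The TESTINGMODE_* macros
# used here are assumed to be defined elsewhere in your
# configuration (the stock condor_config typically defines them).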
SLOT1_START = SlotID == 1 && TARGET.IsSuspensionJob =!= true && ($(SLOT1_HAS_PRIO))
SLOT1_CONTINUE = SlotID == 1 && ($(TESTINGMODE_CONTINUE))
SLOT1_PREEMPT  = SlotID == 1 && ($(TESTINGMODE_PREEMPT))
SLOT1_SUSPEND  = SlotID == 1 && ($(TESTINGMODE_SUSPEND))

# Slot 2 is for jobs that get suspended while slot 1 is busy
SLOT2_START    = SlotID == 2 && TARGET.IsSuspensionJob =?= true
SLOT2_CONTINUE = SlotID == 2 && (slot1_State =?= "Unclaimed" || slot1_State =?= "Owner")
SLOT2_PREEMPT  = FALSE
SLOT2_SUSPEND  = SlotID == 2 && slot1_State =?= "Claimed"
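
As a rough, untested sketch of how the same idea might extend to a two-CPU machine, you could advertise four slots and pair each normal slot with its own suspension slot. The pairing (slot 3 watches slot 1, slot 4 watches slot 2) is just an assumption made for illustration; the rest follows the single-CPU pattern above:

NUM_CPUS = 4

STARTD_SLOT_EXPRS = State, RemoteUser, CurrentRank
STARTD_ATTRS = IsSuspensionSlot

# Slots 1 and 2 are normal batch slots; slots 3 and 4 suspend
SLOT1_IsSuspensionSlot = False
SLOT2_IsSuspensionSlot = False
SLOT3_IsSuspensionSlot = True
SLOT4_IsSuspensionSlot = True

START    = ($(SLOT1_START))    || ($(SLOT2_START))    || \
           ($(SLOT3_START))    || ($(SLOT4_START))
CONTINUE = ($(SLOT1_CONTINUE)) || ($(SLOT2_CONTINUE)) || \
           ($(SLOT3_CONTINUE)) || ($(SLOT4_CONTINUE))
PREEMPT  = ($(SLOT1_PREEMPT))  || ($(SLOT2_PREEMPT))  || \
           ($(SLOT3_PREEMPT))  || ($(SLOT4_PREEMPT))
SUSPEND  = ($(SLOT1_SUSPEND))  || ($(SLOT2_SUSPEND))  || \
           ($(SLOT3_SUSPEND))  || ($(SLOT4_SUSPEND))

# Each normal slot compares the incoming job against the job
# on its paired suspension slot
SLOT1_HAS_PRIO = SubmittorPrio =?= UNDEFINED || \
                 slot3_RemoteUserPrio =?= UNDEFINED || \
                 SubmittorPrio < 1.2 * slot3_RemoteUserPrio || \
                 slot3_CurrentRank =?= UNDEFINED || \
                 MY.Rank > slot3_CurrentRank
SLOT2_HAS_PRIO = SubmittorPrio =?= UNDEFINED || \
                 slot4_RemoteUserPrio =?= UNDEFINED || \
                 SubmittorPrio < 1.2 * slot4_RemoteUserPrio || \
                 slot4_CurrentRank =?= UNDEFINED || \
                 MY.Rank > slot4_CurrentRank

# Normal execution slots
SLOT1_START    = SlotID == 1 && TARGET.IsSuspensionJob =!= true && ($(SLOT1_HAS_PRIO))
SLOT1_CONTINUE = SlotID == 1 && ($(TESTINGMODE_CONTINUE))
SLOT1_PREEMPT  = SlotID == 1 && ($(TESTINGMODE_PREEMPT))
SLOT1_SUSPEND  = SlotID == 1 && ($(TESTINGMODE_SUSPEND))
SLOT2_START    = SlotID == 2 && TARGET.IsSuspensionJob =!= true && ($(SLOT2_HAS_PRIO))
SLOT2_CONTINUE = SlotID == 2 && ($(TESTINGMODE_CONTINUE))
SLOT2_PREEMPT  = SlotID == 2 && ($(TESTINGMODE_PREEMPT))
SLOT2_SUSPEND  = SlotID == 2 && ($(TESTINGMODE_SUSPEND))

# Suspension slots: slot 3 yields to slot 1, slot 4 yields to slot 2
SLOT3_START    = SlotID == 3 && TARGET.IsSuspensionJob =?= true
SLOT3_CONTINUE = SlotID == 3 && (slot1_State =?= "Unclaimed" || slot1_State =?= "Owner")
SLOT3_PREEMPT  = FALSE
SLOT3_SUSPEND  = SlotID == 3 && slot1_State =?= "Claimed"
SLOT4_START    = SlotID == 4 && TARGET.IsSuspensionJob =?= true
SLOT4_CONTINUE = SlotID == 4 && (slot2_State =?= "Unclaimed" || slot2_State =?= "Owner")
SLOT4_PREEMPT  = FALSE
SLOT4_SUSPEND  = SlotID == 4 && slot2_State =?= "Claimed"

With a fixed pairing like this, a suspension job is paused only when its partner slot is claimed, even if the other normal slot is idle. Whether that is the behavior you want depends on your workload.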


To submit a suspension job, you could put something like the following in your submit file:

+IsSuspensionJob = True
requirements = TARGET.IsSuspensionSlot
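
For context, a complete submit file for a long-running suspension job might look roughly like the following. The executable and log file names are made up for illustration; only the last two lines before "queue" come from the policy above:

# Hypothetical submit description file for a suspension job
universe     = vanilla
executable   = long_job.exe
log          = long_job.log
output       = long_job.out
error        = long_job.err
+IsSuspensionJob = True
requirements = TARGET.IsSuspensionSlot
queue

Short jobs need no special attributes: because they do not set IsSuspensionJob, the START expressions above allow them only on slot 1.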


The example policy above does not prevent preemption of suspension jobs by other suspension jobs. If you want to prevent that, you could do something like this:

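# Do not preempt suspension jobs (for up to 24 hours).
# This relies on the comparison acting as a number: 1 on the
# suspension slot (giving 86400 seconds) and 0 on other slots.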
MaxJobRetirementTime = (MY.IsSuspensionSlot =?= True) * 3600 * 24


Hope that helps.

--Dan

Rick Lan wrote:
Hi all
I was wondering if anyone has experience with or suggestions for the following setup. We have Windows machines, so checkpointing is not supported. Preemption is off because we don't want to lose running progress. Is there a way to suspend running jobs (which usually take days) to run newly submitted jobs (which usually take minutes or hours), and to resume the suspended jobs once the short jobs finish? I was thinking that I could set NUM_CPUS to double the actual number of CPUs and set the STARTD policy so that when half of the CPUs are running a job, the other half can't match a job. When short jobs come in, identified either by accounting group or by a config variable, suspend the running jobs and run the short jobs on the other half of the CPUs. Is this configuration feasible? Thanks
Rick