
Re: [HTCondor-users] Preempt jobs which exceed their request_memory - but no parallel universe?



On 03/03/2015 05:31 AM, Steffen Grunewald wrote:
I'm confused.

I have a couple of users who underestimate the memory their jobs
would attempt to allocate, and as a result some worker nodes end
up swapping heavily.
I tried to get those jobs preempted and sent back into the queue
with their request_memory updated from the observed ImageSize:

# Let job use its declared amount of memory and some more
MEMORY_EXTRA            = 2048
MEMORY_ALLOWED          = (Memory + $(MEMORY_EXTRA)*Cpus)
# Get the current footprint
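# (ImageSize is reported in KiB while the slot's Memory attribute is
#  in MiB, hence the division by 1024)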
MEMORY_CURRENT          = (ImageSize/1024)
# Exceeds expectations?
MEMORY_EXCEEDED         = $(MEMORY_CURRENT) > $(MEMORY_ALLOWED)
# If exceeding, preempt
#[preset]PREEMPT        = False
PREEMPT                 = ($(PREEMPT)) || ($(MEMORY_EXCEEDED))
WANT_SUSPEND            = False


This should all work. Can you wrap your PREEMPT expression in the debug() function like this:

PREEMPT = debug($(PREEMPT) || ($(MEMORY_EXCEEDED)))
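
The trace from debug() should land in the log of whichever daemon evaluates
the expression; for PREEMPT that is the startd's StartLog. If nothing shows
up at the default verbosity, raising the startd's debug level should surface
it. A minimal sketch:

STARTD_DEBUG = D_FULLDEBUG

It can also help to sanity-check the macro expansion beforehand with
"condor_config_val -verbose PREEMPT", which prints the expanded expression
and where it was defined.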

What are WANT_VACATE and KILL set to? If you don't want to give these jobs a grace period, you
probably want WANT_VACATE = false.
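
For reference, a no-grace-period version of the policy might look like this
(a sketch only, reusing the MEMORY_EXCEEDED macro from the config quoted
above):

# Preempt as soon as the measured footprint exceeds the allowance
PREEMPT      = ($(PREEMPT)) || ($(MEMORY_EXCEEDED))
WANT_SUSPEND = False
# Skip the graceful vacate stage entirely: jobs are hard-killed at once
WANT_VACATE  = False

If a bounded grace period is preferable instead, leaving WANT_VACATE at True
and setting MachineMaxVacateTime should cap how long an evicted job can
linger before the hard kill.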

-greg