
Re: [HTCondor-users] Preempt jobs which exceed their request_memory - but no parallel universe?

On 03/03/2015 05:31 AM, Steffen Grunewald wrote:
I'm confused.

I have a couple of users who underestimate the memory their jobs
will attempt to allocate, and as a result some worker nodes end
up swapping heavily.
I tried to get those jobs preempted and sent back into the queue
with an updated request_memory (based on ImageSize):

# Let job use its declared amount of memory and some more
MEMORY_EXTRA            = 2048
MEMORY_ALLOWED          = (Memory + $(MEMORY_EXTRA)*Cpus)
# Get the current footprint
MEMORY_CURRENT          = (ImageSize/1024)
# Exceeds expectations?
MEMORY_EXCEEDED         = ($(MEMORY_CURRENT) > $(MEMORY_ALLOWED))
# If exceeding, preempt
#[preset]PREEMPT        = False
PREEMPT                 = ($(PREEMPT)) || ($(MEMORY_EXCEEDED))
WANT_SUSPEND            = False

This should all work. Can you wrap your PREEMPT expression in the debug() function like this:
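Something like this (a sketch -- debug() should write the result of each
evaluation of the wrapped expression to the StartLog, provided the startd's
debug level includes D_FULLDEBUG):

# Log every evaluation of the preemption expression
PREEMPT = debug( ($(PREEMPT)) || ($(MEMORY_EXCEEDED)) )

That should tell you whether the expression is being evaluated at all, and
which half of it is (or isn't) firing.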


What are WANT_VACATE and KILL set to? If you don't want to give these jobs a grace period, you
probably want WANT_VACATE = false.
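If I'm reading the policy knobs right, a minimal no-grace-period setup
would look something like this (adjust to taste):

WANT_SUSPEND  = False
# Hard-kill immediately instead of sending the soft vacate signal first
WANT_VACATE   = False
# KILL is only consulted while a job is vacating, so it can stay False here
KILL          = False

With WANT_VACATE = False the startd should hard-kill the job as soon as
PREEMPT becomes True, so it goes straight back into the queue.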