Re: [HTCondor-users] Preempt jobs which exceed their request_memory - but no parallel universe?
- Date: Tue, 03 Mar 2015 10:36:30 -0600
- From: Greg Thain <gthain@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Preempt jobs which exceed their request_memory - but no parallel universe?
On 03/03/2015 05:31 AM, Steffen Grunewald wrote:
> I have a couple of users who underestimate the memory their jobs
> would attempt to allocate, and as a result some worker nodes end
> up swapping heavily.
>
> I tried to get those jobs preempted, and sent back into the queue
> with their updated (ImageSize) request_memory:
>
> # Let job use its declared amount of memory and some more
> MEMORY_EXTRA = 2048
> MEMORY_ALLOWED = (Memory + $(MEMORY_EXTRA)*Cpus)
> # Get the current footprint
> MEMORY_CURRENT = (ImageSize/1024)
> # Exceeds expectations?
> MEMORY_EXCEEDED = $(MEMORY_CURRENT) > $(MEMORY_ALLOWED)
> # If exceeding, preempt
> #[preset]PREEMPT = False
> PREEMPT = ($(PREEMPT)) || ($(MEMORY_EXCEEDED))
> WANT_SUSPEND = False

This should all work. Can you wrap your PREEMPT expression in the
debug() function, like this:

PREEMPT = debug($(PREEMPT) || ($(MEMORY_EXCEEDED)))
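(An aside, and my assumption rather than something stated in this thread:
debug() writes the evaluation of the wrapped expression to the log of the
daemon that evaluates it, which for PREEMPT is the startd's StartLog. While
testing it may help to raise the startd's log verbosity, e.g.:

# Assumed setting, for testing only; revert once the policy is debugged
STARTD_DEBUG = D_FULLDEBUG

Worth double-checking against the manual for your HTCondor version.)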
What are WANT_VACATE and KILL set to? If you don't want to give these
jobs a grace period, you probably want WANT_VACATE = False.
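Putting that together, a minimal sketch of the no-grace-period variant
(this is my reading of the startd policy semantics, not a tested config;
check it against the manual before deploying):

# Sketch: skip suspend and skip the graceful vacate entirely
WANT_SUSPEND = False
WANT_VACATE = False
# KILL only applies while a job is vacating; with WANT_VACATE = False the
# job should be hard-killed right away, so KILL can stay at its default
KILL = False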