Hello
I am seeing jobs killed when they exceed their requested memory.
I believe I have shut off any preemption or eviction, but that does not
seem to be the case. Below are our condor_local, a typical submit file
(we are running under DAGMan), and a submit.log. Note that our
request_memory is 24 GB, and the jobs seem to exit prematurely at
approximately 24 GB. I believe the process may be requesting more memory
than that, and these particular nodes have plenty of additional (unused)
memory on them.
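For reference, the memory request is just the standard submit command (the
actual submit file is pasted in full below); paraphrased, it amounts to:

    request_memory = 24 GB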
Is there a way to ensure a job is never killed because of memory usage? Or
am I misreading the logs?