Re: [HTCondor-users] jobs killed due to memory?
- Date: Wed, 30 Sep 2020 18:57:09 -0500
- From: dmaziuk <dmaziuk@xxxxxxxxxxxxx>
On 9/29/2020 9:18 PM, Kristian Kvilekval wrote:
> I am seeing jobs killed when they exceed their requested memory. I
> believe I have shut off any preemption or eviction, but that does not
> seem to be the case. Below are our condor_local, a typical submit file
> (we are running under DAGMan), and a submit.log. Note that
> request_memory is 24GB, and the jobs seem to be exiting at
> approximately (and prematurely) 24GB. I believe the process may be
> requesting more, and these particular nodes have a lot more (unused)
> memory on them.
> Is there a way to never kill a job due to memory? Or am I misreading the
request_memory = 0

worked the last time I tried it. Use at your own risk, etc.
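For reference, a minimal submit-file sketch of that workaround; the executable name and the other lines here are hypothetical placeholders, not taken from the thread:

```
# Sketch of a submit description using the workaround above. With
# request_memory = 0 the job advertises no memory request, so there is
# no requested-memory ceiling for the startd's policy to enforce.
# "analyze" is a placeholder executable name.
executable     = analyze
request_cpus   = 1
request_memory = 0
queue
```

On the admin side, if the execute nodes use cgroup enforcement, the condor_config knob CGROUP_MEMORY_LIMIT_POLICY (hard/soft/none) governs kernel-level memory kills; setting it to none should disable them, but check the manual for your HTCondor version before relying on that.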