
Re: [HTCondor-users] jobs killed due to memory?

On 9/29/2020 9:18 PM, Kristian Kvilekval wrote:

I am seeing jobs killed when they exceed their requested memory.
I believe I have turned off all preemption and eviction, but that does not
seem to be the case. Below are our condor_local, a typical submit file
(we are running under DAGMan), and a submit.log. Note that our
request_memory is 24GB, and jobs seem to exit at approximately (and
prematurely) 24GB. I believe the process may be requesting more, and
these particular nodes have much more (unused) memory available.

Is there a way to never kill a job due to memory? Or am I misreading the
[...]
> request_memory=24000

Setting

request_memory = 0

in the submit file worked the last time I tried it. Use at your own risk, etc.
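
For context, a minimal submit-file sketch of that workaround (the executable name and universe here are hypothetical, not from the original post):

```
# Hypothetical HTCondor submit file illustrating the workaround.
# With request_memory = 0, the job advertises no memory request,
# so memory-based policy expressions keyed off RequestMemory
# have nothing meaningful to compare against.
universe       = vanilla
executable     = my_job.sh
request_memory = 0
queue
```

Note the trade-off: the negotiator can no longer match the job to a slot with enough memory, so an over-consuming job may instead be killed by the OS OOM killer or starve other jobs on the node. Whether usage is also enforced by cgroups depends on the pool's configuration, which is why "use at own risk" applies.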