[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[HTCondor-users] Limiting HTCondor total RAM usage



Does anyone have experience in hard-limiting condor's *total* RAM usage, e.g. by putting the condor_startd process (and its children) inside a cgroup?

I have some machines which need to share with some critical background tasks, and I need to avoid those tasks being hit by the OOM killer (which can happen if I throw a load of jobs into the queue and those jobs have underestimated their RAM usage).

I found this:
https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToLimitMemoryUsage

But if I understand it rightly, this is for enforcing a hard memory limit on individual jobs. That might work, but (a) it depends on all the RequestMemory values being individually correct, and (b) I have a lot of jobs which share memory, e.g. they mmap() the same file, and I'm not sure how a hard RequestMemory limit would interact with that.
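(For context, if I read that wiki page correctly, the per-job approach amounts to something like the following in the condor config; I haven't tested this, and the cgroup name is just the documented default:)

```
# condor_config.local -- per-job cgroup enforcement (untested sketch)
BASE_CGROUP = htcondor
# "hard" kills/holds a job exceeding its RequestMemory; "soft" only
# reclaims when the machine is under memory pressure
CGROUP_MEMORY_LIMIT_POLICY = hard
```

It's that per-job "hard" behaviour combined with shared mmap()ed pages that I'm unsure about.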

So for now I'd rather limit the total HTCondor usage only. Putting HTCondor inside a VM is one option (messy); deploying an HTCondor Docker container is another (more stuff to learn); so I wonder if using a cgroup directly might be the way to go.
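What I had in mind is something like a systemd drop-in capping the whole condor service tree (the 16G value is purely illustrative, and I believe the directive is MemoryLimit= on cgroup v1 / older systemd, MemoryMax= on cgroup v2):

```
# /etc/systemd/system/condor.service.d/memlimit.conf  (illustrative)
[Service]
MemoryLimit=16G
```

followed by a daemon-reload and a restart of the condor service. But I don't know how HTCondor behaves when the whole daemon tree, rather than an individual job, hits such a limit.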

Any experiences gratefully received.

Regards,

Brian Candler.