Re: [HTCondor-users] Out of memory killer & cgroups



On 10/2/14 6:23 AM, andrew.lahiff@xxxxxxxxxx wrote:
> and a job uses so much memory that the system runs out of memory,
> the OOM killer kills the job:

The Linux kernel OOM killer is independent of cgroups. The only
interaction is that a cgroup memory limit may cause the OOM killer to
activate sooner than it would without that limit.

SIGKILL (kill -9) is immediate and cannot be trapped or ignored. The
killed process gets no chance to write out logs or otherwise clean up
after itself, so it cannot tell users why it was killed. What the parent
does is up to the parent, although on its own it has no way to
distinguish between a KILL signal sent by the kernel and a KILL signal
sent by a user.

So, on the face of it, the behavior that you are seeing is something
that I would expect to see. Whether or not it's the intended behavior is
something that I will leave to the Condor devs to address.

-- 
Rich Pieri <ratinox@xxxxxxx>
MIT Laboratory for Nuclear Science