Re: [HTCondor-users] spontaneous reboots after enabling cgroups

On Jul 23, 2013, at 7:58 AM, Chris Filo Gorgolewski <krzysztof.gorgolewski@xxxxxxxxx> wrote:

> On Thu, Jun 27, 2013 at 2:31 AM, Jason Ferrara <jason.ferrara@xxxxxxxxxxxxx> wrote:
> > I have a pool of machines running CentOS 6.4, Kernel 2.6.32-358, and HTCondor 7.9.4.
> > Today, in order to try to stop jobs which underestimate their memory usage from making the machines swap a lot and get slow, I enabled cgroups and set
> I have a similar issue (jobs which underestimate their memory usage), but I wasn't sure that cgroups would solve it. Do I understand correctly that setting CGROUP_MEMORY_LIMIT_POLICY to either "soft" or "hard" will disable swapping for all Condor jobs?

- "hard" will kill the job as soon as it goes over its requested memory.
- "soft" will kill jobs that are over their requested memory only when the kernel believes memory is tight.
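
To make the difference concrete, here is a toy model of the two policies exactly as summarized above. It is only a sketch: the function name `would_kill` and its parameters are made up for illustration, and this is neither HTCondor code nor the kernel's actual logic.

```python
def would_kill(policy, usage_mb, request_mb, memory_tight):
    """Toy model of the "hard" and "soft" policies as described in this message.

    policy       -- "hard" or "soft" (the CGROUP_MEMORY_LIMIT_POLICY values)
    usage_mb     -- memory the job is currently using (illustrative number)
    request_mb   -- memory the job requested
    memory_tight -- stand-in for "the kernel believes memory is tight"
    """
    over_request = usage_mb > request_mb
    if policy == "hard":
        # Over the request is enough, regardless of the rest of the system.
        return over_request
    if policy == "soft":
        # Over the request only matters once memory is contended.
        return over_request and memory_tight
    raise ValueError(f"unknown policy: {policy!r}")
```

So a job using 2048 MB against a 1024 MB request is at risk under "hard" in every case, but under "soft" only when `memory_tight` is true.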

> Furthermore, the free memory used in the "soft" policy is calculated based on the current system state not taken from the RESERVED_MEMORY variable?

No -- soft-memory kills are controlled by the kernel.  From https://www.kernel.org/doc/Documentation/cgroups/memory.txt:

7. Soft limits

Soft limits allow for greater sharing of memory. The idea behind soft limits
is to allow control groups to use as much of the memory as needed, provided

a. There is no memory contention
b. They do not exceed their hard limit

When the system detects memory contention or low memory, control groups
are pushed back to their soft limits. If the soft limit of each control
group is very high, they are pushed back as much as possible to make
sure that one control group does not starve the others of memory.

Please note that soft limits is a best-effort feature; it comes with
no guarantees, but it does its best to make sure that when memory is
heavily contended for, memory is allocated based on the soft limit
hints/setup. Currently soft limit based reclaim is set up such that
it gets invoked from balance_pgdat (kswapd).

I'm not a kernel programmer, but looking at the relevant kernel code, it seems that the cgroup is checked prior to swapping in most cases.
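
If you want to see which limits a job's cgroup actually ended up with, a small script along these lines may help. It is a minimal sketch that assumes the cgroup v1 memory controller and its standard control files (memory.soft_limit_in_bytes, memory.limit_in_bytes, memory.usage_in_bytes). The helper names are my own, and the directory you pass in depends on where cgroups are mounted on your machines and what you set BASE_CGROUP to, so treat any path as an assumption.

```python
from pathlib import Path

# Standard cgroup v1 memory controller files, keyed by a friendlier name.
MEMORY_FILES = {
    "soft_limit": "memory.soft_limit_in_bytes",
    "hard_limit": "memory.limit_in_bytes",
    "usage": "memory.usage_in_bytes",
}

def read_memory_limits(cgroup_dir):
    """Return the soft limit, hard limit and current usage (in bytes)
    for a single cgroup v1 memory directory."""
    base = Path(cgroup_dir)
    return {
        key: int((base / filename).read_text().strip())
        for key, filename in MEMORY_FILES.items()
    }

def over_soft_limit(limits):
    """True if the cgroup is currently using more than its soft limit."""
    return limits["usage"] > limits["soft_limit"]
```

Calling `read_memory_limits()` on the directory of a running job's cgroup and then `over_soft_limit()` on the result shows whether that job is one the kernel would try to push back under memory contention.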