
Re: [HTCondor-users] spontaneous reboots after enabling cgroups



On 8/15/2013 5:10 AM, Chris Filo Gorgolewski wrote:
Yes, that's what I was afraid of - that this will cause swapping and
slow down the whole machine anyway.


Not sure how you conclude that swapping for a small subset of processes (i.e. perhaps one slot on a machine with many slots) will slow down execution of other processes/slots that have their entire image in RAM. Is virtual memory management I/O in the kernel synchronous? I doubt it...

If you are really concerned, you can set your HTCondor startd PREEMPT expression to kill off jobs that start swapping after just a few seconds, i.e. jobs whose MemoryUsage > Memory.
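
For example, a minimal sketch (not a drop-in recipe - if your execute nodes already define a PREEMPT expression, OR this clause into it rather than replacing it):

# vacate any job whose measured usage exceeds the memory provisioned for its slot
PREEMPT = (MemoryUsage > Memory)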

With regards to the original topic of this thread, HTCondor v8.0.2 (scheduled for release next week) will include a patch to work around the kernel bug that can result in a reboot. See
https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=3847

regards,
Todd



On Fri, Aug 2, 2013 at 11:57 PM, Martin Bukatovic <martin.bukatovic@xxxxxxxxx> wrote:

Dear condor list,

On Thu, Aug 1, 2013 at 3:37 PM, Brian Bockelman <bbockelm@xxxxxxxxxxx> wrote:

On Jul 23, 2013, at 7:58 AM, Chris Filo Gorgolewski <krzysztof.gorgolewski@xxxxxxxxx> wrote:

On Thu, Jun 27, 2013 at 2:31 AM, Jason Ferrara <jason.ferrara@xxxxxxxxxxxxx> wrote:
I have a pool of machines running CentOS 6.4, Kernel 2.6.32-358, and
HTCondor 7.9.4.

Today, in order to try to stop jobs which underestimate their memory
usage from making the machines swap a lot and get slow, I enabled cgroups
and set

CGROUP_MEMORY_LIMIT_POLICY = soft
RESERVED_MEMORY = 1024

I have a similar issue (jobs which underestimate their memory usage),
but I wasn't sure that cgroups will solve this. Do I understand
correctly that setting CGROUP_MEMORY_LIMIT_POLICY to either "soft" or
"hard" will disable swapping for all condor jobs?

- "hard" will kill the job as soon as it goes over its requested memory.
- "soft" will kill jobs that are over their requested memory only when
the kernel believes memory is tight.

This is not a correct description.

First of all, it's worth noting that neither a hard nor a soft limit
should kill any process on its own; the OOM killer only intervenes when
there is some other good reason for it - such as running out of memory
completely - and definitely not just because a soft or hard memory
limit was exceeded. And even when the OOM killer is triggered, you can
specify what it will mean for the processes in a particular cgroup via
the memory.oom_control file of the memory cgroup controller.
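
For illustration (a sketch assuming cgroup v1, as on these RHEL/CentOS 6
kernels; the exact cgroup path is site-specific): reading the
memory.oom_control file of a job's memory cgroup shows something like

  oom_kill_disable 0
  under_oom 0

and writing 1 into it sets oom_kill_disable, so that when the OOM
condition is hit, tasks in that cgroup are paused instead of killed.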

Unfortunately, there is an exception to this: a known bug in the
RHEL/CentOS kernel will kill processes of a cgroup with a small hard
memory limit when the limit is breached. But this *is not* correct
behaviour. Kernel developers are working on a fix; see the bugzilla
for details: https://bugzilla.redhat.com/show_bug.cgi?id=870011

Back to CGROUP_MEMORY_LIMIT_POLICY: see what the upstream condor
documentation states:


http://research.cs.wisc.edu/htcondor/manual/v8.0/3_12Setting_Up.html#SECTION0041212000000000000000

<from condor docs>
If the hard limit is in force, then the total amount of physical
memory used by the sum of all processes in this job will not be
allowed to exceed the limit. If the processes try to allocate more
memory, the allocation will succeed, and virtual memory will be
allocated, but no additional physical memory will be allocated. The
system will keep the amount of physical memory constant by swapping
some page from that job out of memory.

if the soft limit is in place, the job will be allowed to go over the
limit if there is free memory available on the system. Only when there
is contention between other processes for physical memory will the
system force physical memory into swap and push the physical memory
used towards the assigned limit.
</from condor docs>

Note that this description is consistent with the kernel documentation
for cgroups, which was referenced earlier in this thread.

Martin B.


Furthermore, is the free memory used in the "soft" policy calculated
from the current system state, rather than taken from the
RESERVED_MEMORY variable?


No -- soft-memory kills are controlled by the kernel.  From
https://www.kernel.org/doc/Documentation/cgroups/memory.txt:

"""
7. Soft limits

Soft limits allow for greater sharing of memory. The idea behind soft
limits is to allow control groups to use as much of the memory as
needed, provided

a. There is no memory contention
b. They do not exceed their hard limit

When the system detects memory contention or low memory, control groups
are pushed back to their soft limits. If the soft limit of each control
group is very high, they are pushed back as much as possible to make
sure that one control group does not starve the others of memory.

Please note that soft limits is a best-effort feature; it comes with
no guarantees, but it does its best to make sure that when memory is
heavily contended for, memory is allocated based on the soft limit
hints/setup. Currently soft limit based reclaim is set up such that
it gets invoked from balance_pgdat (kswapd).
"""

I'm not a kernel programmer, but looking at the relevant kernel code, it
seems that the cgroup is checked prior to swapping in most cases.





_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/