Re: [HTCondor-users] spontaneous reboots after enabling cgroups

Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab

On Fri, Aug 16, 2013 at 4:04 AM, Brian Bockelman <bbockelm@xxxxxxxxxxx> wrote:

Hi,

A few thoughts-
- Why do you want swap on your worker nodes? We found it much more useful to just disable swap and kill jobs when they went over their memory limit.

Yes, this is what I would like to do, but only for the condor jobs.

- You can set the swappiness of the /condor cgroup to 0, disabling swap only for condor jobs and processes.
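
For illustration, a minimal Python sketch of setting the swappiness knob mentioned above. It assumes the cgroup v1 memory controller, where each memory cgroup directory has a memory.swappiness file. The helper name set_cgroup_swappiness is hypothetical, and the demo writes into a temporary directory rather than a real cgroup.

```python
import tempfile
from pathlib import Path


def set_cgroup_swappiness(cgroup_dir: Path, value: int = 0) -> int:
    """Write `value` to memory.swappiness in a cgroup v1 memory cgroup.

    Returns the value read back from the file.
    """
    knob = cgroup_dir / "memory.swappiness"
    knob.write_text(f"{value}\n")
    return int(knob.read_text().strip())


# Demo against a temporary directory standing in for the cgroup directory.
# On a real node the directory would be the condor cgroup under the memory
# controller mount point, e.g. /sys/fs/cgroup/memory/condor (an assumption;
# the mount point and cgroup name depend on the distro and configuration),
# and writing to it requires root.
with tempfile.TemporaryDirectory() as tmp:
    print(set_cgroup_swappiness(Path(tmp)))
```

The same effect can be had by writing 0 to that file from a shell; the function form is only meant to make the file name and value explicit.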

Ha, this would be perfect. However, I was wondering: if a job runs out of memory in this setup, would it just fail, or would it get preempted and sent back to the pool to be executed later on a machine with more memory?

Best,

Chris

- cgroups, depending on the kernel and distro, may also track memory+swap usage. We don't do this currently, but it is a very simple change.
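
As a rough sketch of what tracking memory+swap could look like, assuming the cgroup v1 memory controller with swap accounting enabled: memory.usage_in_bytes reports memory, and memory.memsw.usage_in_bytes reports memory plus swap, so the difference approximates the swap used by the cgroup. The helper names are hypothetical, and the demo uses fake counter files in a temporary directory.

```python
import tempfile
from pathlib import Path


def read_counter(cgroup_dir: Path, name: str) -> int:
    """Read a single integer counter file from a cgroup directory."""
    return int((cgroup_dir / name).read_text().strip())


def swap_usage_bytes(cgroup_dir: Path) -> int:
    """Approximate swap usage as (memory + swap) minus memory."""
    mem = read_counter(cgroup_dir, "memory.usage_in_bytes")
    memsw = read_counter(cgroup_dir, "memory.memsw.usage_in_bytes")
    # The two counters are read at slightly different times, so clamp at 0.
    return max(memsw - mem, 0)


# Demo with made-up counter values; a real job cgroup directory would be
# used instead of the temporary one.
with tempfile.TemporaryDirectory() as tmp:
    fake = Path(tmp)
    (fake / "memory.usage_in_bytes").write_text("1048576\n")
    (fake / "memory.memsw.usage_in_bytes").write_text("1572864\n")
    print(swap_usage_bytes(fake))
```

If the memsw file is missing, swap accounting is most likely not enabled on that kernel, and the sketch raises FileNotFoundError.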

- We already listen for events about OOM issues in the cgroup and disable the OOM-killer. It should be straightforward to add a listener for when memory boundaries are crossed.
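
To make the OOM state concrete, here is a small Python sketch that parses the contents of a cgroup v1 memory.oom_control file, which lists key/value pairs such as oom_kill_disable and under_oom. This is not how HTCondor does it: per the kernel documentation, real notification goes through an eventfd registered via cgroup.event_control, whereas this sketch only interprets the file contents. The function names are hypothetical.

```python
from typing import Dict


def parse_oom_control(text: str) -> Dict[str, int]:
    """Parse 'key value' lines from memory.oom_control into a dict."""
    state: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            state[parts[0]] = int(parts[1])
    return state


def waiting_on_oom(text: str) -> bool:
    """True if the OOM killer is disabled and the cgroup is under OOM.

    In that state the tasks in the cgroup may be stopped until memory is
    freed or the limit is raised, so something else has to act on the job.
    """
    state = parse_oom_control(text)
    return state.get("oom_kill_disable") == 1 and state.get("under_oom") == 1


# Sample contents as they might appear for a job that has hit its limit.
sample = "oom_kill_disable 1\nunder_oom 1\n"
print(parse_oom_control(sample))
print(waiting_on_oom(sample))
```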

Food for thought,

Brian


On Aug 15, 2013, at 12:02 PM, Todd Tannenbaum <tannenba@xxxxxxxxxxx> wrote:

> On 8/15/2013 5:10 AM, Chris Filo Gorgolewski wrote:
>> Yes, that is what I was afraid of: that this will cause swapping and slow
>> down the whole machine anyway.
>
> Not sure how you conclude that swapping for a small subset of processes (aka perhaps one slot on a machine with many slots) will slow down execution of other processes/slots that have their entire image in RAM. Is virtual memory management I/O in the kernel synchronous? I doubt it...
>
> If you are really concerned, you can set your HTCondor startd PREEMPT expression to kill off jobs that are swapping after just a few seconds, i.e. jobs whose MemoryUsage > Memory.
>
> With regard to the original topic of this thread, HTCondor v8.0.2 (scheduled for release next week) will include a patch to work around the kernel bug that can result in a reboot. See
> https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=3847
>
> regards,
> Todd
>
>
>>
>> On Fri, Aug 2, 2013 at 11:57 PM, Martin Bukatovic <
>> martin.bukatovic@xxxxxxxxx> wrote:
>>
>>> Dear condor list,
>>>
>>> On Thu, Aug 1, 2013 at 3:37 PM, Brian Bockelman <bbockelm@xxxxxxxxxxx>
>>> wrote:
>>>>
>>>> On Jul 23, 2013, at 7:58 AM, Chris Filo Gorgolewski <
>>> krzysztof.gorgolewski@xxxxxxxxx> wrote:
>>>>
>>>>> On Thu, Jun 27, 2013 at 2:31 AM, Jason Ferrara <
>>> jason.ferrara@xxxxxxxxxxxxx> wrote:
>>>>> I have a pool of machines running CentOS 6.4, Kernel 2.6.32-358, and
>>> HTCondor 7.9.4.
>>>>>
>>>>> Today, in order to try to stop jobs which underestimate their memory
>>> usage from making the machines swap a lot and get slow, I enabled cgroups
>>> and set
>>>>>
>>>>> CGROUP_MEMORY_LIMIT_POLICY = soft
>>>>> RESERVED_MEMORY = 1024
>>>>>
>>>>> I have a similar issue (jobs which underestimate their memory usage), but
>>> I wasn't sure that CGROUP will solve this. Do I understand correctly that
>>> setting CGROUP_MEMORY_LIMIT_POLICY to either "soft" or "hard" will disable
>>> swapping for all condor jobs?
>>>>
>>>> - "hard" will kill the job as soon as it goes over its requested memory.
>>>> - "soft" will kill jobs that are over their requested memory only when
>>> the kernel believes memory is tight.
>>>
>>> This is not a correct description.
>>>
>>> First of all, it's worth noting that neither hard nor soft limits should
>>> kill any process unless there is some other good reason for the OOM killer
>>> to intervene. Such a situation may be running out of memory
>>> completely, but definitely not just going over a soft or hard memory
>>> limit. And even when the OOM killer is triggered, you can specify what it
>>> will mean for the processes in a particular cgroup via the
>>> memory.oom_control file of the memory cgroup controller.
>>>
>>> Unfortunately there is an exception to this: a known bug in the
>>> RHEL/CentOS kernel will kill processes of a cgroup with a small hard
>>> memory limit when the limit is breached. But this *is not* correct
>>> behaviour. Kernel developers are working on the fix, see the bugzilla
>>> for details: https://bugzilla.redhat.com/show_bug.cgi?id=870011
>>>
>>> Back to CGROUP_MEMORY_LIMIT_POLICY: see what the upstream condor
>>> documentation states:
>>>
>>>
>>> http://research.cs.wisc.edu/htcondor/manual/v8.0/3_12Setting_Up.html#SECTION0041212000000000000000
>>>
>>> <from condor docs>
>>> If the hard limit is in force, then the total amount of physical
>>> memory used by the sum of all processes in this job will not be
>>> allowed to exceed the limit. If the processes try to allocate more
>>> memory, the allocation will succeed, and virtual memory will be
>>> allocated, but no additional physical memory will be allocated. The
>>> system will keep the amount of physical memory constant by swapping
>>> some page from that job out of memory.
>>>
>>> If the soft limit is in place, the job will be allowed to go over the
>>> limit if there is free memory available on the system. Only when there
>>> is contention between other processes for physical memory will the
>>> system force physical memory into swap and push the physical memory
>>> used towards the assigned limit.
>>> </from condor docs>
>>>
>>> Note that this description is consistent with kernel documentation for
>>> cgroups which was referenced in this thread before.
>>>
>>> Martin B.
>>>
>>>>>
>>>>> Furthermore, is the free memory used in the "soft" policy calculated
>>>>> based on the current system state, rather than taken from the
>>>>> RESERVED_MEMORY variable?
>>>>
>>>> No -- soft-memory kills are controlled by the kernel. From
>>>> https://www.kernel.org/doc/Documentation/cgroups/memory.txt:
>>>>
>>>> """
>>>> 7. Soft limits
>>>>
>>>> Soft limits allow for greater sharing of memory. The idea behind soft limits
>>>> is to allow control groups to use as much of the memory as needed, provided
>>>>
>>>> a. There is no memory contention
>>>> b. They do not exceed their hard limit
>>>>
>>>> When the system detects memory contention or low memory, control groups
>>>> are pushed back to their soft limits. If the soft limit of each control
>>>> group is very high, they are pushed back as much as possible to make
>>>> sure that one control group does not starve the others of memory.
>>>>
>>>> Please note that soft limits is a best-effort feature; it comes with
>>>> no guarantees, but it does its best to make sure that when memory is
>>>> heavily contended for, memory is allocated based on the soft limit
>>>> hints/setup. Currently soft limit based reclaim is set up such that
>>>> it gets invoked from balance_pgdat (kswapd).
>>>> """
>>>>
>>>> I'm not a kernel programmer, but looking at the relevant kernel code, it
>>> seems that the cgroup is checked prior to swapping in most cases.
>>>
>>>> The archives can be found at:
>>>> https://lists.cs.wisc.edu/archive/htcondor-users/
>>> _______________________________________________
>>> HTCondor-users mailing list
>>> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
>>> subject: Unsubscribe
>>> You can also unsubscribe by visiting
>>> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
