A few thoughts-
- Why do you want swap on your worker nodes? We found it much more useful to just disable swap and kill jobs when they went over their memory limit.
- You can set the swappiness of the /condor cgroup to 0, disabling swap only for condor jobs and processes.
- cgroups, depending on the kernel and distro, may also track memory+swap usage. We don't do this currently, but it is a very simple change.
- We already listen for events about OOM issues in the cgroup and disable the OOM-killer. It should be straightforward to add a listener for when memory boundaries are crossed.
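
A minimal sketch of the swappiness and memory+swap points above, assuming a cgroup v1 memory controller mounted at /sys/fs/cgroup/memory and a cgroup named "condor" (the path and both helper names are assumptions for illustration, not anything HTCondor ships). memory.swappiness and memory.memsw.usage_in_bytes are cgroup v1 control files; the latter is only present when swap accounting is enabled in the kernel.

```python
from pathlib import Path
from typing import Optional

# Assumed location of the HTCondor cgroup (cgroup v1); adjust for your system.
CONDOR_CGROUP = Path("/sys/fs/cgroup/memory/condor")


def disable_swap(cgroup: Path = CONDOR_CGROUP) -> None:
    """Write 0 to memory.swappiness, asking the kernel not to swap this cgroup."""
    (cgroup / "memory.swappiness").write_text("0\n")


def memory_plus_swap_bytes(cgroup: Path = CONDOR_CGROUP) -> Optional[int]:
    """Return memory+swap usage in bytes, or None if swap accounting is off."""
    usage_file = cgroup / "memory.memsw.usage_in_bytes"
    if not usage_file.exists():
        return None
    return int(usage_file.read_text().strip())
```

Both helpers take the cgroup directory as a parameter, so they can be pointed at a per-job cgroup instead of the parent one. Writing to the real control files requires root.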
Food for thought,
On Aug 15, 2013, at 12:02 PM, Todd Tannenbaum <tannenba@xxxxxxxxxxx> wrote:
> On 8/15/2013 5:10 AM, Chris Filo Gorgolewski wrote:
>> Yes, that is what I was afraid of: that this will cause swapping and slow
>> down the whole machine anyway.
> Not sure how you conclude that swapping for a small subset of processes (aka perhaps one slot on a machine with many slots) will slow down execution of other processes/slots that have their entire image in RAM. Is virtual memory management I/O in the kernel synchronous? I doubt it...
> If you are really concerned, you can set your HTCondor startd PREEMPT expression to kill off jobs that are swapping after just a few seconds, i.e. jobs whose MemoryUsage > Memory.
> With regard to the original topic of this thread, HTCondor v8.0.2 (scheduled for release next week) will include a patch to work around the kernel bug that can result in a reboot. See
>> On Fri, Aug 2, 2013 at 11:57 PM, Martin Bukatovic <
>> martin.bukatovic@xxxxxxxxx> wrote:
>>> Dear condor list,
>>> On Thu, Aug 1, 2013 at 3:37 PM, Brian Bockelman <bbockelm@xxxxxxxxxxx>
>>>> On Jul 23, 2013, at 7:58 AM, Chris Filo Gorgolewski <
>>> krzysztof.gorgolewski@xxxxxxxxx> wrote:
>>>>> On Thu, Jun 27, 2013 at 2:31 AM, Jason Ferrara <
>>> jason.ferrara@xxxxxxxxxxxxx> wrote:
>>>>> I have a pool of machines running CentOS 6.4, Kernel 2.6.32-358, and
>>> HTCondor 7.9.4.
>>>>> Today, in order to try to stop jobs which underestimate their memory
>>> usage from making the machines swap a lot and get slow, I enabled cgroups
>>> and set
>>>>> CGROUP_MEMORY_LIMIT_POLICY = soft
>>>>> RESERVED_MEMORY = 1024
>>>>> I have similar issue (jobs which underestimate their memory usage), but
>>> I wasn't sure that CGROUP will solve this. Do I understand correctly that
>>> setting CGROUP_MEMORY_LIMIT_POLICY to either "soft" or "hard" will disable
>>> swapping for all condor jobs?
>>>> - "hard" will kill the job as soon as it goes over its requested memory.
>>>> - "soft" will kill jobs that are over their requested memory only when
>>> the kernel believes memory is tight.
>>> This is not a correct description.
>>> First of all, it's worth noting that neither hard nor soft limits should
>>> kill any process unless there is some other good reason for the OOM killer
>>> to intervene. Such a situation may be running out of memory
>>> completely, but definitely not just going over a soft or hard memory
>>> limit. And even when the OOM killer is triggered, you can specify what it
>>> will mean for processes in a particular cgroup via the memory.oom_control
>>> file of the memory cgroup controller.
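
As a rough illustration of the memory.oom_control file mentioned above, here is a sketch that reads and sets it for one cgroup. It assumes the cgroup v1 format, where the file reads back as "name value" lines such as oom_kill_disable and under_oom, and where writing 1 disables the OOM killer for that cgroup. The function names are hypothetical.

```python
from pathlib import Path
from typing import Dict


def read_oom_control(cgroup: Path) -> Dict[str, int]:
    """Parse memory.oom_control into a dict, e.g. {'oom_kill_disable': 0, 'under_oom': 0}."""
    fields: Dict[str, int] = {}
    for line in (cgroup / "memory.oom_control").read_text().splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip anything that is not a "name value" pair
            fields[parts[0]] = int(parts[1])
    return fields


def set_oom_killer_disabled(cgroup: Path, disabled: bool) -> None:
    """Write 1 to disable the OOM killer for the cgroup, or 0 to re-enable it."""
    (cgroup / "memory.oom_control").write_text("1\n" if disabled else "0\n")
```

With the OOM killer disabled, tasks in a cgroup that hits its limit are paused rather than killed until memory is freed or the limit is raised, so something has to watch the under_oom field and act on it.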
>>> Unfortunately there is an exception to this: a known bug in the
>>> RHEL/CentOS kernel will kill processes of a cgroup with a small hard
>>> memory limit when the limit is breached. But this *is not* correct
>>> behaviour. Kernel developers are working on the fix, see the bugzilla
>>> for details: https://bugzilla.redhat.com/show_bug.cgi?id=870011
>>> Back to the CGROUP_MEMORY_LIMIT_POLICY, see what upstream condor
>>> documentation states:
>>> <from condor docs>
>>> If the hard limit is in force, then the total amount of physical
>>> memory used by the sum of all processes in this job will not be
>>> allowed to exceed the limit. If the processes try to allocate more
>>> memory, the allocation will succeed, and virtual memory will be
>>> allocated, but no additional physical memory will be allocated. The
>>> system will keep the amount of physical memory constant by swapping
>>> some page from that job out of memory.
>>> If the soft limit is in place, the job will be allowed to go over the
>>> limit if there is free memory available on the system. Only when there
>>> is contention between other processes for physical memory will the
>>> system force physical memory into swap and push the physical memory
>>> used towards the assigned limit.
>>> </from condor docs>
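
To make the hard/soft distinction in the quoted documentation concrete, here is a sketch of which cgroup v1 control file each policy would correspond to: memory.limit_in_bytes for a hard limit and memory.soft_limit_in_bytes for a soft one. This is an assumption about the mapping for illustration only, not HTCondor's actual implementation, and apply_memory_limit is a hypothetical helper.

```python
from pathlib import Path

MIB = 1024 * 1024


def apply_memory_limit(cgroup: Path, limit_mb: int, policy: str) -> Path:
    """Write the limit to the hard or soft control file and return that file's path."""
    if policy == "hard":
        # Physical memory use is capped here; beyond it the kernel reclaims or swaps.
        target = cgroup / "memory.limit_in_bytes"
    elif policy == "soft":
        # Only enforced (best effort) when the system is under memory contention.
        target = cgroup / "memory.soft_limit_in_bytes"
    else:
        raise ValueError(f"unknown policy: {policy!r}")
    target.write_text(f"{limit_mb * MIB}\n")
    return target
```

Neither file asks the kernel to kill anything by itself, which matches the description above: going over either limit leads to reclaim or swapping, not to an immediate kill.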
>>> Note that this description is consistent with kernel documentation for
>>> cgroups which was referenced in this thread before.
>>> Martin B.
>>>>> Furthermore, the free memory used in the "soft" policy is calculated
>>> based on the current system state not taken from the RESERVED_MEMORY
>>>> No -- soft-memory kills are controlled by the kernel. From
>>>> 7. Soft limits
>>>> Soft limits allow for greater sharing of memory. The idea behind soft limits
>>>> is to allow control groups to use as much of the memory as needed, provided
>>>> a. There is no memory contention
>>>> b. They do not exceed their hard limit
>>>> When the system detects memory contention or low memory, control groups
>>>> are pushed back to their soft limits. If the soft limit of each control
>>>> group is very high, they are pushed back as much as possible to make
>>>> sure that one control group does not starve the others of memory.
>>>> Please note that soft limits is a best-effort feature; it comes with
>>>> no guarantees, but it does its best to make sure that when memory is
>>>> heavily contended for, memory is allocated based on the soft limit
>>>> hints/setup. Currently soft limit based reclaim is set up such that
>>>> it gets invoked from balance_pgdat (kswapd).
>>>> I'm not a kernel programmer, but looking at the relevant kernel code, it
>>> seems that the cgroup is checked prior to swapping in most cases.
>>>> The archives can be found at:
>>> HTCondor-users mailing list
>>> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with
>>> subject: Unsubscribe
>>> You can also unsubscribe by visiting
>>> The archives can be found at: