Christoph Beyer: "I thought the jobs that by far exceed the memory limit would be killed and go on hold but that seems only to happen from time to time (?)"

Hi Christoph, I've been using cgroups for about the last two and a half years, and my name is on a few of the patches for them. From personal experience, I can tell you that the cgroups configuration of HTCondor doesn't inherently constrain the memory and processor utilization of the jobs. Rather, it gives HTCondor a 100% accurate way to track that utilization across all processes involved in the job (except for condor_ssh_to_job processes).

By default, oversubscription of CPU shares is permitted - the cgroup just ensures that when there's contention for available CPUs, each job will get at least the number of processors it specified in request_cpus. That is to say, if you run a "make -j 8" in a "request_cpus = 1" slot, the make job will be able to use 8 CPUs as long as nobody else on the machine wants them, but if the machine is full it will still get at least the one CPU it requested. This holds unless you enable CPU affinity via ASSIGN_CPU_AFFINITY or ENFORCE_CPU_AFFINITY, which prevents oversubscription.
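
As a rough sketch of the affinity option, assuming a reasonably recent startd and that the setting goes in the execute node's local configuration, something like the following should pin each slot to its own cores. Only the knob name comes from the discussion above; treat the comments as my understanding and check the manual for your version before relying on them.

```
# Execute-node configuration (sketch, not a tested setup).
# Ask the startd to bind each slot to distinct CPU cores, so a
# request_cpus = 1 job cannot spread onto idle cores.
ASSIGN_CPU_AFFINITY = True
```

The trade-off is that idle cores stay idle: the "make -j 8" job above would be confined to its single core even on an otherwise empty machine.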

The same goes for memory - unless you set up enforcement, a job will be able to use as much memory as it wants until it wedges the machine into a swap-thrashing state (Red Hat 5) or runs afoul of the Out-of-Memory Killer (Red Hat 6), both of which I've encountered. The new Docker universe in 8.3/8.4 does, however, enforce a hard memory limit by default.

The details of limiting resource usage via cgroups are in the 8.2.9 manual section 3.12.14, referencing CGROUP_MEMORY_LIMIT_POLICY and the like. The manual should probably mention CPU affinity in that section too, in the second paragraph on page 446 regarding CPU usage.
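
For illustration, a minimal sketch of what that enforcement setup might look like on an execute node. The cgroup name "htcondor" is an assumption and has to match whatever cgroup actually exists on the host. As I understand the policy values, "none" sets no limit, "soft" lets a job exceed its request while memory is free, and "hard" caps the job at its requested memory; verify against the manual section for your version.

```
# Execute-node configuration (sketch, assuming the cgroup is named "htcondor").
# Parent cgroup under which the starter creates one cgroup per job.
BASE_CGROUP = htcondor

# Memory enforcement policy: none | soft | hard.
CGROUP_MEMORY_LIMIT_POLICY = hard
```

Without a setting along these lines, the cgroup is only used for accounting, which would be consistent with jobs far over their limit only occasionally being killed.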


