Re: [HTCondor-users] Cgroups problem



On 18/06/14 12:13, Dave Macey wrote:
Hi,

I'm testing cgroups in HTCondor version 8.0.7, following section 3.12.12 of the manual, but according to the ProcLog it fails with:

06/17/14 15:52:40 : Setting cgroup to "htcondor"/condor_home_condor_execute_slot1_1@xxxxx for ProcFamily 3853.
06/17/14 15:52:40 : Cannot attach pid 3853 to cgroup "htcondor"/condor_home_condor_execute_slot1_1@xxxxx for ProcFamily 3853: 50016 No space left on device

Since I'm testing under Debian 7.5, which by default sets up all subsystems in the same directory, my /etc/cgconfig.conf file looks like:

mount {
        cpu    = /sys/fs/cgroup;
        cpuset    = /sys/fs/cgroup;
        cpuacct = /sys/fs/cgroup;
        memory  = /sys/fs/cgroup;
        freezer = /sys/fs/cgroup;
        blkio   = /sys/fs/cgroup;
}

group htcondor {
  cpu {}
  cpuacct {}
  memory {}
  freezer {}
  blkio {}
  cpuset {
    cpuset.cpus = 0-3;   # It's a four core machine
    cpuset.mems = 0;   # Recommended by numerous online posts
  }
}

Another Debian quirk is that the memory subsystem is not available by default and has to be enabled via a kernel boot option. That all works, and I can see the directory /sys/fs/cgroup/htcondor suitably populated. Searching online suggests that this error is usually due to an empty cpuset.mems field, but I've covered that, so I'd be grateful for any ideas about where the problem might be.

Dave,
The problem is that although you are setting cpuset.cpus and cpuset.mems at the /sys/fs/cgroup/htcondor level, they are not propagated to the subdirectories that HTCondor creates, so those end up with the problematic empty values. I got round this by setting the cgroup.clone_children property, e.g. in my /etc/rc.local I have:

/usr/bin/cgconfigparser -l /etc/cgconfig.conf
/bin/echo 1 > /sys/fs/cgroup/htcondor/cgroup.clone_children
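To confirm that the per-slot cgroups really do pick up the parent's values, something like the following can be used. This is only a rough sketch, not part of HTCondor or libcgroup: the function names check_cpuset and check_tree are made up for illustration, and it assumes the cgroup v1 layout described above, where each cgroup directory contains cpuset.cpus and cpuset.mems files. Note that clone_children should only affect child cgroups created after it is set, so slots that already exist may still show up as empty.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from HTCondor or libcgroup).

# check_cpuset DIR
# Prints a line for each of cpuset.cpus / cpuset.mems that is missing or
# empty in DIR. Returns 0 if both have a value, 1 otherwise.
check_cpuset() {
    dir=$1
    rc=0
    for f in cpuset.cpus cpuset.mems; do
        # Read the content rather than testing file size, since files in a
        # cgroup filesystem do not report a meaningful size.
        val=$(cat "$dir/$f" 2>/dev/null)
        if [ -z "$val" ]; then
            echo "EMPTY: $dir/$f"
            rc=1
        fi
    done
    return $rc
}

# check_tree ROOT
# Runs check_cpuset on ROOT and on each immediate subdirectory of ROOT.
# Returns 1 if any of them has a missing or empty cpuset file.
check_tree() {
    root=$1
    bad=0
    for d in "$root" "$root"/*/; do
        [ -d "$d" ] || continue
        check_cpuset "${d%/}" || bad=1
    done
    return $bad
}
```

With those defined, running `check_tree /sys/fs/cgroup/htcondor` while a job is active should print nothing if the slot cgroups inherited their settings, and an EMPTY line for each file that did not.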

Hope that helps,
Mark