
Re: [HTCondor-users] HTCondor/cgroups: limiting CPUs/pinning processes to CPUs with hyperthreaded CPUs

From: Thomas Hartmann <thomas.hartmann@xxxxxxx>
Date: 02/04/2016 10:08 AM
> If I understand the cgroup documentation correctly, a cgroup cannot be
> limited to a "general number of cores" but can only be pinned to certain
> cores. I.e., limiting the number of cores for a cgroup means pinning the
> cgroup to that many dedicated cores on the system, right?
> So, I guess the startd pins a job with a core limit to the corresponding
> cores in a cgroup?
> Isn't it actually a drawback that processes cannot be moved between
> cores (wouldn't the scheduler otherwise migrate processes between cores)?
> And how does a hyperthreaded system look - if a process is pinned to
> "a hyperthreaded core", I guess the process would still be moved across
> the physical cores by the CPU, or?

With the default cgroup setup, the startd does not pin jobs to specific
processors, but instead uses the cpu.shares functionality. The share
assigned to a job is the number of requested cpus times 100, so a single
core job gets 100, two-core gets 200, and so on.
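A quick sketch of that share arithmetic (the cgroup path in the comment is illustrative, not taken from a real node):

```shell
# Sketch: the cpu.shares value is simply RequestCpus * 100
REQUEST_CPUS=2
CPU_SHARES=$((REQUEST_CPUS * 100))
echo "cpu.shares for a ${REQUEST_CPUS}-core job: ${CPU_SHARES}"

# On a live execute node you could compare against the slot's cgroup,
# e.g. (the exact path depends on your BASE_CGROUP setting):
#   cat /sys/fs/cgroup/cpu/htcondor/<slot-cgroup>/cpu.shares
```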

This limit is only applied when there is contention for CPU time, however:
if a job requested one core but wants to use eight, it can use eight as
long as there's idle capacity on other cores, but once the machine fills
up it will be dialed back to its cpu.shares value of 100 - roughly a
single core's worth of CPU time.
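As a rough sketch of what that throttling works out to, assuming a hypothetical 8-core machine where a 1-core job (100 shares) contends with a 7-core job (700 shares), cpu.shares divides CPU time in proportion to each group's share:

```shell
# Hypothetical machine: 8 cores, two competing jobs, both CPU-hungry
MACHINE_CORES=8
JOB_A_SHARES=100   # job that requested 1 core
JOB_B_SHARES=700   # job that requested 7 cores
TOTAL=$((JOB_A_SHARES + JOB_B_SHARES))

# Under full contention, job A is dialed back to roughly its
# proportional slice: cores * (its shares / total shares)
JOB_A_CORES=$((MACHINE_CORES * JOB_A_SHARES / TOTAL))
echo "job A gets about ${JOB_A_CORES} core(s) under contention"
```

With idle capacity on the machine, job A could still burst well past that slice; the share only bites when everyone wants CPU at once.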

See this page:

This has been important for compiled MATLAB jobs: unless the MATLAB code
specifies a maximum compute thread count or the job is launched with the
-singleCompThread command-line option, MATLAB will use all available
cores, which is a bummer if your machine has a lot of cores and is also
trying to run such a MATLAB job on each of them. The cpu.shares approach
doesn't require any specific constraint in the MATLAB code, which means
the job will run full bore on the user's desktop and full bore on an
underutilized exec node, but won't step on everything else on a busy
exec node.
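For reference, the command-line route might look like the submit-file sketch below; the executable name and MCR path are placeholders I've made up, not anything from the original message:

```
# HTCondor submit file sketch (names and paths are hypothetical)
executable   = run_myapp.sh
arguments    = /opt/mcr/v90 -singleCompThread
request_cpus = 1
queue
```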

If you do want affinity, for processor cache coherence considerations
or the like, you can do that too, though.

There's a knob called "ENFORCE_CPU_AFFINITY" which causes each job and
all its children to stay on specific cores, and "ASSIGN_CPU_AFFINITY",
which makes affinity work with dynamic slots and overrides the
ENFORCE setting.
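A minimal condor_config sketch of those knobs (values are illustrative; check the HTCondor manual for the exact semantics on your version):

```
# condor_config sketch: pick ONE of these affinity knobs.
# ASSIGN_CPU_AFFINITY works with partitionable/dynamic slots and,
# when set, overrides ENFORCE_CPU_AFFINITY.
ASSIGN_CPU_AFFINITY = True
# ENFORCE_CPU_AFFINITY = True   # older, static-slot style pinning
```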

        -Michael Pelletier.