
Re: [HTCondor-users] soft/hard limiting cpu.shares ?


You understand the cpu shares mechanism correctly. It's a soft limit with a policy for resolving conflict when conflict arises.
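To make the "soft limit" behaviour concrete, here is a minimal sketch of a work-conserving proportional-share model. It is not the kernel's CFS implementation, only a simplified illustration; the function name `allocate_cpu`, the group names, and the demand figures are all made up for the example. Shares only matter once the groups together ask for more CPU than the machine has.

```python
def allocate_cpu(total_cores, groups):
    """Simplified model of share-based CPU distribution (not the real scheduler).

    groups maps a name to (shares, demand_in_cores); shares must be positive.
    Returns a dict mapping each name to the cores it ends up with.
    """
    granted = {name: 0.0 for name in groups}
    active = set(groups)
    remaining = float(total_cores)

    while active and remaining > 1e-9:
        total_shares = sum(groups[n][0] for n in active)
        # Groups that want no more than their proportional slice are fully served.
        satisfied = {
            n for n in active
            if groups[n][1] <= remaining * groups[n][0] / total_shares
        }
        if not satisfied:
            # Real contention: everyone is capped at the share-proportional slice.
            for n in active:
                granted[n] = remaining * groups[n][0] / total_shares
            break
        for n in satisfied:
            granted[n] = float(groups[n][1])
            remaining -= groups[n][1]
        # Leftover capacity is redistributed among the still-hungry groups.
        active -= satisfied

    return granted


# Hypothetical 4-core node with two equally weighted groups.
idle_neighbour = allocate_cpu(4, {"job_a": (100, 4), "job_b": (100, 0)})
busy_neighbour = allocate_cpu(4, {"job_a": (100, 4), "job_b": (100, 4)})
```

In this model `job_a` gets all four cores while `job_b` is idle, and is pushed back to two cores as soon as `job_b` wants its half, which matches the behaviour described in the question.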

If you really want to nail down HTCondor jobs to a total number of cores, you want to use cpu.cfs_quota_us (and optionally cpu.cfs_period_us) on the parent htcondor cgroup. This is an honest to goodness hard limit on CPU usage that works in parallel with the shares mechanism.


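Short version: to assign 1 core to the cgroup, set the quota equal to the period. With the default cpu.cfs_period_us of 100000 that means a quota of 100000; a quota of 1000000 corresponds to 1 core only if the period is also raised to 1000000.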

Within the htcondor cgroup, shares will be enforced by HTCondor but the overall limit will be applied at the parent level.
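As a rough sketch of the arithmetic: the hard limit in cores is the quota divided by the period, so the quota for N cores is N times the period. The helper names below are hypothetical, the kernel default period of 100000 microseconds is assumed (with a period of 1000000 the one-core quota would be 1000000), and the file names are the cgroup v1 ones mentioned above. The cgroup directory is passed in by the caller, since where the htcondor cgroup lives depends on the system.

```python
from pathlib import Path

# Assumption: kernel default for cpu.cfs_period_us (100 ms).
DEFAULT_PERIOD_US = 100_000


def quota_for_cores(cores, period_us=DEFAULT_PERIOD_US):
    """Return the cpu.cfs_quota_us value that caps a cgroup at `cores` cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return int(round(cores * period_us))


def apply_cpu_limit(cgroup_dir, cores, period_us=DEFAULT_PERIOD_US):
    """Write period and quota into a cgroup v1 cpu controller directory.

    cgroup_dir is supplied by the caller (hypothetical example:
    "/sys/fs/cgroup/cpu/htcondor"); writing there normally requires root.
    """
    cgroup = Path(cgroup_dir)
    (cgroup / "cpu.cfs_period_us").write_text(f"{period_us}\n")
    (cgroup / "cpu.cfs_quota_us").write_text(
        f"{quota_for_cores(cores, period_us)}\n"
    )


# Example: cap the parent cgroup at 16 cores with the default period.
sixteen_core_quota = quota_for_cores(16)
```

Since the quota sits on the parent cgroup, the shares of the individual job cgroups still decide how that capped total is split when jobs compete.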

On Wed, Jun 16, 2021 at 10:20 AM Thomas Hartmann <thomas.hartmann@xxxxxxx> wrote:
Hi all,

a short question regarding jobs core time scaling via cgroup cpu.shares:

The relative share of a job's cgroup is only limiting with respect to
the total core-scaled CPU time, or?

I.e., we are running our nodes with hyperthreading 2x enabled for
simplicity, since we use the same machines for production jobs as well
as for user job sub-clusters.

Since users occasionally have odd jobs (that tend to work better
without overbooking), on user nodes we broker only 1/2 of the HT-core
count for jobs.

now, the condor parent cgroup has assigned
  htcondor/cpu.shares = 1024
with respect to the total system share of
so all condor child processes (without further sub-groups) could in
principle use up to 100% of the total HT-core scaled CPU time.

A single core job gets a relative share like

where we broker only 50% of the total HT-core scaled time - as far as I see.

However, user jobs can utilize more than their nominally assigned CPU share.
My understanding is that the kernel notices that the total CPU time is
not fully utilized, and thus allows processes to use more than
their nominal time limit, as there is still CPU time available.
Is this correct?

When we scale the condor parent cgroup to a reasonable fraction of the
system cpu.share (taking HT efficiency into account), we should be able
to scale CPU times per job to (roughly) core-equivalents - without the
need to bind jobs to specific cores, or?

