
[HTCondor-users] HT cores utilized to 100% although HT core count is false



Hi all,

I am puzzled by a few nodes that show all (HT) cores fully utilized, although they should only be using 50%, i.e., just the physical core count.

The nodes have AMD Epycs with HT/SMT active, but since we have
  COUNT_HYPERTHREAD_CPUS = false
set, Condor should be using only 50% of the (virtual) core count [1], right?

What worries me a bit is that the CPU time shares of the jobs look fine [2], i.e., currently just <48 single-core jobs, each with a relative weight of '100'. However, I am no longer sure how the kernel distributes the CPU time slices here if the parent's relative share is 100%(?) of the overall(??) time share.

Is the CPU time weighting perhaps misleading here if one tries to 'match' only the physical core count?
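
To make the question concrete, here is a rough sketch of the arithmetic I have in mind. It assumes that cgroup v1 cpu.shares are purely relative weights among sibling cgroups, that they only matter when CPUs are contended, and that they do not cap a job at one core. The slot names are hypothetical; the counts (45 job cgroups, weight 100 each, 96 logical CPUs) are taken from [1] and [2].

```python
def contended_fractions(shares):
    """Fraction of the parent's CPU time each sibling cgroup would get
    if all siblings were runnable and the CPUs were fully contended."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


def cpu_equivalents(fraction, logical_cpus):
    """Rough number of logical CPUs that a given fraction corresponds to."""
    return fraction * logical_cpus


# Hypothetical slot names; 45 job cgroups with weight 100 each, as in [2].
slots = {f"slot1_{i}": 100 for i in range(1, 46)}
fractions = contended_fractions(slots)

# 96 logical CPUs are visible to the kernel (DETECTED_CORES in [1]).
per_job = cpu_equivalents(fractions["slot1_1"], 96)
print(f"share per job: {fractions['slot1_1']:.4f}, "
      f"roughly {per_job:.2f} logical CPUs under full contention")
```

If that assumption holds, equal weights only say that the jobs are treated equally; they say nothing about a job staying within one physical core's worth of CPU time.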

Cheers and thanks for ideas,
  Thomas



[1]
COUNT_HYPERTHREAD_CPUS = false
...
DETECTED_CORES = 96
DETECTED_CPUS = 48
DETECTED_MEMORY = 257656
DETECTED_PHYSICAL_CPUS = 48
..
NUM_CPUS = $(DETECTED_CPUS)


[2]
[root@batch1071 htcondor]# cat /sys/fs/cgroup/cpu,cpuacct/cpu.shares
1024
[root@batch1071 htcondor]# cat /sys/fs/cgroup/cpu,cpuacct/htcondor/cpu.shares
1024
[root@batch1071 htcondor]# cat /sys/fs/cgroup/cpu,cpuacct/htcondor/condor_var_lib_condor_execute_slot*/cpu.shares | sort | wc -l
45
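
The line count above only shows how many slot cgroups exist, not which weights they carry. A small sketch for tallying the actual values is given below; it assumes the cgroup v1 layout and directory naming seen in the commands above, and the function name is my own.

```python
from collections import Counter
from pathlib import Path


def slot_share_histogram(htcondor_cgroup):
    """Count how many slot cgroups carry each cpu.shares value."""
    counts = Counter()
    pattern = "condor_var_lib_condor_execute_slot*/cpu.shares"
    for shares_file in Path(htcondor_cgroup).glob(pattern):
        counts[int(shares_file.read_text().strip())] += 1
    return counts


if __name__ == "__main__":
    # Path as used in the shell commands above (cgroup v1 assumed).
    root = "/sys/fs/cgroup/cpu,cpuacct/htcondor"
    for shares, n in sorted(slot_share_histogram(root).items()):
        print(f"{n} slot cgroup(s) with cpu.shares = {shares}")
```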

Attachment: batch1071_load_6h_20210115.png
Description: PNG image
