
[HTCondor-users] ASSIGN_CPU_AFFINITY does not work as expected



Dear Condor experts,

I am using HTCondor 8.2.2. According to the manual, setting 'ASSIGN_CPU_AFFINITY = True' for the STARTD daemon should keep a multi-threaded job from using more cores than it requests. I ran a quick test, but 'ASSIGN_CPU_AFFINITY = True' does not seem to take effect:

1. I added 'ASSIGN_CPU_AFFINITY = True' to the STARTD configuration and then restarted the condor daemon on node008:
[root@node008 ~]# condor_config_val -dump | grep AFFINI
ASSIGN_CPU_AFFINITY = True

2. I submitted a job that runs 4 stress workers to node008, but set 'RequestCpus = 1' in the submit file:

[scotg001@node003 test]$ condor_submit submit.stress
Submitting job(s).
1 job(s) submitted to cluster 41.

[scotg001@node003 test]$ cat submit.stress
Universe       = vanilla
Executable     = stress.sh
Output         = stress.out
Log            = stress.log

RequestMemory  = 1000
RequestCpus = 1
requirements = ((Arch == "INTEL" || ARCH == "X86_64") && (machine =="node008"))

Queue 1


[scotg001@node003 test]$ cat stress.sh
#!/bin/bash
echo $HOSTNAME
echo `date`
stress --vm 4  --vm-bytes 800M --timeout 60s


3.  Condor shows that this job requested 1 CPU:

[scotg001@node003 test]$ condor_history  41.0 -af RequestCpus
1

But top on node008 shows that the job is actually occupying 4 cores:
 5232 scotg001  30  10  806m 767m  140 R 94.7  9.8   0:07.60 stress
 5233 scotg001  30  10  806m 651m  140 R 94.7  8.3   0:07.48 stress
 5230 scotg001  30  10  806m  15m  140 R 92.9  0.2   0:07.46 stress
 5231 scotg001  30  10  806m 201m  140 R 92.9  2.6   0:07.54 stress
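
The top output shows CPU usage but not the affinity mask itself. A minimal sketch for checking it directly, assuming a Linux /proc filesystem that exposes Cpus_allowed_list in /proc/<pid>/status; the helper name allowed_cpus is made up for illustration, and the PIDs to pass are the stress PIDs from top:

```shell
#!/bin/bash
# Print the list of cores each given PID is allowed to run on.
# Assumption: Linux /proc with a Cpus_allowed_list line in /proc/<pid>/status.
allowed_cpus() {
    # The second field of the Cpus_allowed_list line is a range list such as "0-7" or "3".
    awk '/^Cpus_allowed_list:/ {print $2}' "/proc/$1/status"
}

# Usage (script name is hypothetical): ./affinity.sh 5230 5231 5232 5233
for pid in "$@"; do
    echo "$pid: $(allowed_cpus "$pid")"
done
```

If the affinity were applied, I would expect each stress process to list a single core here rather than the full range of the machine.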

Any idea why 'ASSIGN_CPU_AFFINITY = True' does not work as expected? I use partitionable slots, but the earlier bug should already be fixed in 8.2.2.

Cheers,
Gang