Re: [Condor-users] making condor aware of isolcpus/cpu subset
- Date: Fri, 15 Jun 2012 12:07:59 -0500
- From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
- Subject: Re: [Condor-users] making condor aware of isolcpus/cpu subset
On 6/15/2012 10:27 AM, Vlad wrote:
I am using Condor 7.8 on RHEL 5 machines that use isolcpus boot option
to exclude a couple of physical cores from regular process scheduling
(these cores are dedicated to I/O processing). It appears that Condor is
having a hard time recognizing that these CPUs are not available for job
scheduling: it tries to run its benchmark on each core and only figures
out that a core is unavailable when it times out (takes hours to settle).
We have attempted to configure Condor with 1 slot per available core,
but there does not seem to be a way to bind slots to specific physical
core indices -- is that true or have we just not found the right
configuration options? I would appreciate any insight into how to make
Condor aware of a restricted cpuset available for scheduling.
Thank you in advance,
Hi Vlad -
I think Condor can do what you want. Condor v7.8 can indeed bind slots
to specific physical cores; below I copied out of the Manual the config
knobs of interest. So I think/hope you can easily achieve what you want
by setting in condor_config
NUM_CPUS = X
(where X is the number of physical cores you want condor to control),
and then set the cpu affinity knobs as documented below. I think you
will have to do a condor_restart (I don't think cpu affinity edits work
with just a reconfig, but I cannot recall off the top of my head for sure).
Hope the above helps,
ENFORCE_CPU_AFFINITY A boolean value that defaults to False. When False,
the affinity of jobs and their descendants to a CPU is not enforced.
When True, Condor jobs and their descendants maintain their affinity to
a CPU. When True, more fine-grained affinities may
be specified with SLOT<N>_CPU_AFFINITY.
SLOT<N>_CPU_AFFINITY A comma-separated list of cores to which a Condor
job running on a specific slot given by the value of <N> shows affinity.
Note that slots are numbered beginning with the value 1, while CPU cores
are numbered beginning with the value 0. This affinity list only takes
effect if ENFORCE_CPU_AFFINITY = True
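As a rough sketch of how these knobs might fit together: assume a hypothetical 8-core machine booted with isolcpus=0,1, so that cores 2 through 7 are the ones left for Condor. The core numbers and slot count below are illustrative assumptions only, not taken from Vlad's actual setup, and I have not tested this exact fragment.

```
# Hypothetical layout: 8 cores total, cores 0 and 1 reserved via isolcpus
# for I/O processing, leaving 6 cores (2-7) for Condor jobs.
NUM_CPUS = 6

# Required for the per-slot affinity lists below to take effect.
ENFORCE_CPU_AFFINITY = True

# Slots are numbered from 1, CPU cores from 0.
SLOT1_CPU_AFFINITY = 2
SLOT2_CPU_AFFINITY = 3
SLOT3_CPU_AFFINITY = 4
SLOT4_CPU_AFFINITY = 5
SLOT5_CPU_AFFINITY = 6
SLOT6_CPU_AFFINITY = 7
```

After editing condor_config along these lines, a condor_restart on that machine is probably the safest way to make sure the affinity settings are picked up.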