
Re: [HTCondor-users] Help needed understanding cpu core usage with cgroups



On 13/04/15 19:00, Greg Thain wrote:
On 04/10/2015 09:19 AM, Roderick Johnstone wrote:
Hi

I have a condor job with 5 threads (4 cpu bound) running with
request_cpus = 2 in the submit file.

When I have 2 foreground (Owner) jobs running at 100%cpu the condor
job is only getting the equivalent of 1 cpu between its threads.

I'm measuring this by looking at the aggregate nice cpu percentage,
which is 25% in the output of the top program (the condor jobs are
niced to 16 while the foreground jobs run at nice 0). This result
is confirmed by the cpu percentages of the condor job
threads adding up to approximately 100%, indicating that only one core
is being used.

From the wiki page above, I was expecting that the condor job would
access 2 cpus rather than 1 under these circumstances. Did I
misunderstand something here?

HTCondor with cgroups uses the "cpu shares" parameter to limit cpu
usage.  HTCondor will set the cpu shares of a cgroup to 100 *
number_of_cores_assigned_to_the_slot.  This works well if the only
cpu-bound activity on the machine is from HTCondor jobs.
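As a rough illustration (not HTCondor code), the cpu.shares arithmetic described above can be sketched as follows. Shares are relative weights, so when every slot is fully cpu-bound, each slot's cgroup receives cpu time in proportion to its shares. The slot names and request_cpus values below are hypothetical examples:

```python
# Sketch of the cgroup v1 cpu.shares arithmetic described above.
# Slot names and request_cpus values are illustrative, not real config.

def condor_cpu_shares(request_cpus):
    """HTCondor sets a slot cgroup's cpu.shares to 100 * request_cpus."""
    return 100 * request_cpus

def contended_allocation(slots, total_cpus):
    """Expected cpu allocation when every slot is fully cpu-bound:
    shares are relative weights, so each slot gets its proportional
    fraction of the machine's cores."""
    shares = {name: condor_cpu_shares(n) for name, n in slots.items()}
    total = sum(shares.values())
    return {name: total_cpus * s / total for name, s in shares.items()}

# Two condor slots competing on a fully loaded 3-core machine,
# with request_cpus = 2 and request_cpus = 1 respectively:
alloc = contended_allocation({"slot1": 2, "slot2": 1}, total_cpus=3)
print(alloc)  # slot1 gets the equivalent of 2 cpus, slot2 gets 1
```

Note this proportionality only holds between processes that are actually inside cgroups with these weights, which is exactly the limitation discussed below.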


When you say "foreground (Owner)" jobs -- are these processes running
under HTCondor, or not? If not, and they aren't in any cgroup, then I
would expect the behavior that you see: their cpu shares are
effectively unlimited, and the condor jobs just get the leftovers.

Greg

Thanks for your responses.

Yes, the "foreground (Owner)" jobs are non-condor jobs that are not using cgroups. We use condor to soak up spare cycles on people's desktops as well as on dedicated compute servers, and this was a test of the former situation: I was simulating a workstation owner running their own code outside of condor.

So, just to double check that I understand this: cgroups really works to allocate the relative share of cpu time between different condor jobs running in different slots, according to the request_cpus value in each job's submit file, regardless of the number of threads running in each job.

The actual number of cpus that a condor job might run on is not really constrained by cgroups: non-condor (non-cgroup) processes can squeeze out the condor jobs, and if there are no non-condor jobs, the condor jobs can take over all available cpus.

Is that right?


You could fix this by putting the foreground jobs into their own cgroup,
or running them as a condor job proper.
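For reference, one way to do the former on a cgroup-v1 system (the path, share value, and job name below are illustrative; this requires root):

```shell
# Illustrative only: confine a foreground process to its own cgroup so
# it competes with HTCondor slots via cpu.shares (cgroup v1 layout).
mkdir /sys/fs/cgroup/cpu/foreground                  # create the cgroup
echo 200 > /sys/fs/cgroup/cpu/foreground/cpu.shares  # weight ~ 2 cores
echo $$ > /sys/fs/cgroup/cpu/foreground/tasks        # move this shell in
./my_cpu_bound_job &                    # children inherit the cgroup
```

With the foreground work weighted like this, the kernel's CFS scheduler would divide cpu time between it and the condor slots in proportion to their shares, rather than letting the un-contained process win outright.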

That's not really an option for us in the general case.



One point that I'm not sure about is the first paragraph in Option 2.
HTCondor is started as root (from init scripts; condor is installed
from the condor repository rpm) but runs as the condor user. Does
that count as "condor daemons being started as root"?

If condor is started from init, that counts as "started as root".

Thanks for the clarification.

Roderick