Re: [HTCondor-users] OMP_NUM_THREADS forced to request_cpus value



On 03/14/2018 08:18 PM, Alex Nitz wrote:

As a general matter, I think I'd disagree. Not so much on the grounds that it doesn't make sense for some applications, but rather that it seems an odd way to ensure that an application doesn't exceed its resource allotment. If that were the goal, I would think something like taskset would be more appropriate.

Note that the reason condor sets OMP_NUM_THREADS isn't primarily to ensure that CPU usage doesn't exceed the requested usage; the cgroup cpu shares option we set enforces that. And there's nothing condor can do to prevent a job from changing this environment variable after condor has spawned the job. However, we see a large number of applications linked with OpenMP, often, in the case of third-party code, without the user knowing OpenMP is under the hood. The default for OpenMP is to spawn as many threads as cores detected, and this, combined with the CPU limiting, causes massive slowdowns. The machine wasn't overloaded, and neighboring jobs on the same machine weren't affected, but when there are 32 threads competing for one core, we could see orders-of-magnitude performance impacts.
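
To make the arithmetic concrete, here is a rough Python sketch of the thread-count choice described above. It is a simplified model, not the logic of any particular OpenMP runtime: the function names are made up for illustration, and the fallback to "one thread per detected core" is the typical default rather than a guarantee.

```python
import os


def openmp_style_thread_count(env, detected_cores):
    """Thread count an OpenMP-like runtime would pick (simplified model).

    Hypothetical helper for illustration; real runtimes have more rules.
    """
    value = env.get("OMP_NUM_THREADS", "").strip()
    if value:
        # OMP_NUM_THREADS may be a comma-separated list (one entry per
        # nesting level); only the first entry matters for this sketch.
        first = value.split(",")[0].strip()
        if first.isdigit() and int(first) > 0:
            return int(first)
    # Unset or unusable: fall back to one thread per detected core.
    return detected_cores


def threads_per_allotted_core(threads, allotted_cpus):
    """How many runnable threads compete for each CPU the job may use."""
    return threads / allotted_cpus


if __name__ == "__main__":
    detected = os.cpu_count() or 1
    allotted = 1  # assumed: the job asked for a single CPU
    for env in ({}, {"OMP_NUM_THREADS": str(allotted)}):
        threads = openmp_style_thread_count(env, detected)
        ratio = threads_per_allotted_core(threads, allotted)
        print(env, "->", threads, "threads,", ratio, "per allotted core")
```

On a 32-core machine with a one-CPU allotment, the unset case gives 32 threads sharing one core's worth of CPU time, which is the situation described above; with the variable set to the allotment, the ratio drops to one.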

I think that trusting the user when OMP_NUM_THREADS is set explicitly in the condor submit file is an appropriate approach.
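
As a sketch of what that policy might look like (an illustration only, not condor's actual code; the function name and arguments are hypothetical), the default would be applied only when the submit-file environment leaves the variable unset:

```python
def build_job_environment(submit_env, request_cpus):
    """Return the job environment under a 'trust the user' policy.

    Hypothetical helper: submit_env stands for the environment given in
    the submit file, request_cpus for the job's CPU request.
    """
    env = dict(submit_env)  # copy so the caller's mapping is untouched
    # Only fill in a default; an explicit user setting is left as-is.
    env.setdefault("OMP_NUM_THREADS", str(request_cpus))
    return env


if __name__ == "__main__":
    print(build_job_environment({}, 4))
    print(build_job_environment({"OMP_NUM_THREADS": "8"}, 4))
```

The first call fills in the request_cpus value; the second keeps the user's explicit 8 even though it exceeds the request, leaving the cgroup limit to bound actual CPU usage.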

-greg