
Re: [HTCondor-users] OMP_NUM_THREADS forced to request_cpus value





On Thu, Mar 15, 2018 at 3:00 PM, Greg Thain <gthain@xxxxxxxxxxx> wrote:
On 03/14/2018 08:18 PM, Alex Nitz wrote:

As a general matter, I think I'd disagree. Not so much on grounds that it doesn't make sense for some applications, but rather that it seems an odd way to ensure that an application doesn't exceed its resource allotment. If that were the goal, I would think something like taskset would be more appropriate.

Note that the reason condor sets OMP_NUM_THREADS isn't primarily to ensure that cpu usage doesn't exceed the requested usage -- the cgroup cpu shares option we set enforces that. And there's nothing condor can do to prevent a job from changing this environment variable after condor has spawned the job. However, we see a large number of applications linked with OpenMP, often, in the case of 3rd party code, without the user knowing OpenMP is under the hood. The default for OpenMP is to spawn as many threads as cores detected, and this, combined with the cpu limiting, causes massive code slowdowns. The machine wasn't overloaded, and neighboring jobs on the same machine weren't affected, but when there are 32 threads competing for one core, we could see orders of magnitude performance impacts.
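
To make the failure mode concrete, here is a minimal Python sketch of how an application might choose its thread count: use OMP_NUM_THREADS when it is set to a usable value, and otherwise fall back to the CPUs the process is allowed to run on. The function name effective_threads is made up for illustration and is not part of HTCondor or any OpenMP runtime. Note that the fallback only sees CPU affinity, not cgroup cpu shares, so on a 32-core machine it can still return 32 for a job that was given one core, which is the oversubscription described above.

```python
import os


def effective_threads(env=None):
    """Pick a worker-thread count (illustrative helper, not a real API).

    Uses OMP_NUM_THREADS when it holds a positive integer; otherwise
    falls back to the number of CPUs this process may be scheduled on.
    """
    env = os.environ if env is None else env
    raw = env.get("OMP_NUM_THREADS", "").strip()
    if raw:
        # OpenMP allows a comma-separated list for nested parallelism;
        # only the first (outermost) level is used here.
        first = raw.split(",")[0].strip()
        if first.isdigit() and int(first) > 0:
            return int(first)
    try:
        # Linux only: reflects taskset/affinity masks, but NOT cgroup cpu shares.
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Platforms without sched_getaffinity: all detected cores.
        return os.cpu_count() or 1
```

With OMP_NUM_THREADS=1 in the job environment this returns 1 regardless of how many cores the machine has; with the variable unset it returns whatever the fallback detects.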

Thanks for the explanation. Yeah, that makes sense.
I think that trusting the user when OMP_NUM_THREADS is set explicitly in the condor submit file is an appropriate approach.

That would be great!
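
For anyone following along, the precedence being proposed could be modelled roughly as below. This is only a sketch of the policy as I understand it from this thread, not HTCondor's actual code; job_omp_threads and its arguments are hypothetical names, and submit_env stands for whatever environment the user spelled out in the submit file.

```python
def job_omp_threads(request_cpus, submit_env):
    """Model of the proposed policy (hypothetical, not HTCondor source).

    An OMP_NUM_THREADS value given explicitly by the user is trusted
    as-is; otherwise the value defaults to the job's request_cpus.
    """
    explicit = submit_env.get("OMP_NUM_THREADS")
    if explicit is not None:
        # The user asked for a specific value: leave it alone.
        return str(explicit)
    # No explicit setting: match the thread count to the requested cores.
    return str(request_cpus)


# A one-core job with no explicit setting is pinned to one thread,
# while an explicit user setting is passed through unchanged.
default_case = job_omp_threads(1, {})
explicit_case = job_omp_threads(1, {"OMP_NUM_THREADS": "8"})
```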


-greg


