
Re: [HTCondor-users] Limiting number of jobs of specific user to N per node





On 24 October 2017 at 00:51, Greg Thain <gthain@xxxxxxxxxxx> wrote:
On 10/23/2017 01:23 AM, Sean Crosby wrote:
Hi all,

We run jobs for the Belle experiment, and at the moment their jobs are very IO intensive. If, say, all 12 cores on a 12-core node are taken up by Belle jobs, the node suffers heavily from IO problems.

We'd like to limit the number of Belle jobs on each node to, say, 4, while keeping the other 8 slots open for other users' jobs.

What's the easiest way to do this?

Machine custom resources are the best way to do this. On the execute side, you can define how many resources that machine has, in some arbitrary unit, like this:

MACHINE_RESOURCE_Belle = 4

and in the job ad, the belle jobs should say

Request_belle = 1

which means "only match to machines which have 1 or more belle resources remaining, and consume 1 for the duration of my job".
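
A minimal sketch of how the two settings might fit together, assuming the execute node is configured with a single partitionable slot (an assumption; the thread does not say how the slots are set up). The slot-type lines, the executable name belle_job.sh and the split into two files are illustrative and not taken from the original messages. With static slots the custom resource is, as far as I know, divided among the slots rather than pooled, so the per-node cap may behave differently there.

```
# --- Execute node configuration (local config file; exact location is an assumption) ---
# Advertise 4 units of a custom resource named "Belle" on this machine.
MACHINE_RESOURCE_Belle = 4

# Assumed layout: one partitionable slot owning all of the machine's resources,
# so the 4 Belle units are shared by every dynamic slot carved from it.
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = 100%
SLOT_TYPE_1_PARTITIONABLE = TRUE

# --- Submit description file for a Belle job (hypothetical job) ---
executable    = belle_job.sh
request_cpus  = 1
# Consume one Belle unit while the job runs; at most 4 such jobs fit per node.
request_belle = 1
queue
```

Jobs from other users simply leave out request_belle, so they should not consume any Belle units and can still match the remaining cores on the node.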

This worked great. Many thanks for the tip.

Cheers,
Sean

-greg



--
Sean Crosby
Research Computing
ARC Centre of Excellence for Particle Physics at the Terascale
School of Physics | University of Melbourne VIC 3010 Australia