
Re: [HTCondor-users] Possible for user to limit number of jobs per physical machine?



Hi,

I think one way would be to use startd_cron to check periodically how many jobs of each user are currently running on the node.

Export the values into the machine ClassAd, e.g. as 'running_jobs_user_xy', and alter the start configuration (the START expression) of your slot accordingly so that it does not start any more jobs of user xy if running_jobs_user_xy > number ...

It takes a couple of lines in the config but is not very complicated ...
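
To make the idea a bit more concrete, here is a rough, untested sketch of the part of such a startd_cron script that turns a list of job owners into ClassAd attribute lines. How the list of owners is collected on the node is left open here; the function names and the sample users are made up for illustration, only the attribute prefix 'running_jobs_user_' is taken from the suggestion above.

```python
import re
from collections import Counter


def attr_name(user):
    """Map a user such as 'xy@example.org' to a ClassAd-safe attribute name.

    Hypothetical helper: drops the domain part and replaces every character
    that is not a letter, digit or underscore with an underscore.
    """
    local = user.split("@", 1)[0]
    return "running_jobs_user_" + re.sub(r"[^A-Za-z0-9_]", "_", local)


def format_ad(remote_users):
    """Return 'attribute = value' lines, one per user with running jobs.

    remote_users holds one entry per running job (the job's owner).
    A startd_cron job would print these lines to stdout.
    """
    counts = Counter(attr_name(u) for u in remote_users)
    return [f"{name} = {n}" for name, n in sorted(counts.items())]


if __name__ == "__main__":
    # Made-up sample input: two jobs of user xy, one job of user ab.
    sample = ["xy@example.org", "xy@example.org", "ab@example.org"]
    print("\n".join(format_ad(sample)))
```

Note that a user without running jobs gets no attribute at all in this sketch, so the START expression would have to treat an undefined value like zero. The script itself would be registered through the STARTD_CRON_* configuration knobs.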

Best
christoph

-- 
Christoph Beyer
DESY Hamburg
IT-Department

Notkestr. 85
Building 02b, Room 009
22607 Hamburg

phone:+49-(0)40-8998-2317
mail: christoph.beyer@xxxxxxx

----- Original Message -----
From: "Carsten Aulbert" <carsten.aulbert@xxxxxxxxxx>
To: "htcondor-users" <htcondor-users@xxxxxxxxxxx>
Sent: Thursday, 10 September 2020 17:30:56
Subject: [HTCondor-users] Possible for user to limit number of jobs per physical machine?

Hi all,

One of our users needs to run very I/O-intensive jobs and would like
to limit himself to one or two jobs per defined slot. As we currently
define only a single slot per physical machine, that in itself should
not be a problem.

However, we admins do not want to change the nodes' configuration on
a per-user basis or that often, especially since each user only has a
subset of jobs that are this demanding.

Hence the question: does anyone have a recipe for how a user could
limit himself to a limited number of running jobs per node, regardless
of how many subslots the machine's partitionable main slot may have?

While browsing the docs and the mailing list archive, the only place
I found where this information may be readily available is the machine
ad attribute "ChildRemoteUser" of the PartitionableSlot. However, given
that this seems to be a stringified list, I do not know whether and how
it could be used in the Requirements section of a submit file.[1]

Anyone with an idea?

Cheers

Carsten

[1] While writing this email - thus without having tested it so far - I
wondered whether it would be possible to use any of the predefined
functions[2] in the user's submit file to target only machines where
this particular user has nothing running so far. Or would that in the
end lead to a situation where the Negotiator proposes a match but the
node refuses to run the job?

[2]
https://htcondor.readthedocs.io/en/latest/misc-concepts/classad-mechanism.html#predefined-functions


-- 
Dr. Carsten Aulbert, Max Planck Institute for Gravitational Physics,
Callinstraße 38, 30167 Hannover, Germany
Phone: +49 511 762 17185

