
Re: [HTCondor-users] Possible for user to limit number of jobs per physical machine?



The problem with trying to use ChildRemoteUser in the job's requirements is that the job needs to match both
the p-slot and the d-slot that is created from it. So you have to express this very carefully, or else the job
will get a match but be unable to claim the d-slot.

The requirements for the job would have to be something like

Requirements = TARGET.DynamicSlot || !member(MY.user, TARGET.ChildRemoteUser)

or

Requirements = TARGET.DynamicSlot || !stringListMember(MY.user, join(",", TARGET.ChildRemoteUser))
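
To illustrate why the TARGET.DynamicSlot clause matters, here is a rough Python sketch of the two-stage match (first the p-slot, then the d-slot carved from it). It is a toy model, not HTCondor code: the dictionaries, function names, and user names are all made up for illustration, ClassAd undefined/error semantics are reduced to plain booleans, and it assumes the goal is to match only p-slots on which the user has no job yet, i.e. that the membership test is negated.

```python
# Toy model of p-slot / d-slot matching. Everything here is hypothetical
# and only mimics the shape of the ClassAd expressions discussed above.

def job_requirements(job, slot):
    """Mimics: TARGET.DynamicSlot || !member(MY.user, TARGET.ChildRemoteUser)."""
    if slot.get("DynamicSlot", False):
        # d-slots have no ChildRemoteUser list, so accept them unconditionally
        return True
    return job["User"] not in slot.get("ChildRemoteUser", [])

def naive_requirements(job, slot):
    """Same test without the DynamicSlot clause.

    Simplification: a missing ChildRemoteUser is treated as 'no match',
    standing in for an expression that does not evaluate to true.
    """
    return "ChildRemoteUser" in slot and job["User"] not in slot["ChildRemoteUser"]

def try_start(job, pslot, requirements):
    """Return True if the job matches the p-slot AND can claim the new d-slot."""
    if not requirements(job, pslot):
        return False  # no match against the p-slot at all
    dslot = {"DynamicSlot": True, "RemoteUser": job["User"]}
    if not requirements(job, dslot):
        return False  # matched the p-slot, but the d-slot claim fails
    pslot["ChildRemoteUser"].append(job["User"])  # p-slot now lists this user
    return True

pslot = {"PartitionableSlot": True, "ChildRemoteUser": []}
job = {"User": "alice@example.org"}

first = try_start(job, pslot, job_requirements)   # machine is free of alice's jobs
second = try_start(job, pslot, job_requirements)  # alice already runs a job here
print(first, second)
```

In this model the first job starts and the second one from the same user is turned away by the p-slot. Swapping in naive_requirements on a fresh p-slot shows the failure mode described above: the p-slot matches, but the job cannot claim the d-slot.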

-tj

-----Original Message-----
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Carsten Aulbert
Sent: Thursday, September 10, 2020 10:31 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] Possible for user to limit number of jobs per physical machine?

Hi all,

One of our users needs to run very I/O-intensive jobs and
would like to limit himself to one or two jobs per defined slot. As we
currently define only a single slot per physical machine, that should
not be a problem.

However, we admins do not want to change the nodes' configuration on
a per-user basis or that often, especially since each user only has a
subset of jobs that are this demanding.

Hence the question: does anyone have a recipe for how a user could limit
himself to running only a limited number of jobs per node, regardless of
how many sub-slots a machine's partitionable main slot may have?

While browsing around the docs and mailing list archive, the only place
I found where this information may be readily available is the machine
ad attribute "ChildRemoteUser" of the partitionable slot. However, given
that this seems to be a stringified list, I do not know if and how it
could be used in the Requirements section of a submit file.[1]

Anyone with an idea?

Cheers

Carsten

[1] While writing this email - thus without testing it so far - I
wondered if it were possible to use any of the predefined functions[2]
in the user's submit file to target only machines where this particular
user has nothing running so far? Or would that in the end lead to a
situation where the Negotiator would propose a match but the node may
refuse the job to run?

[2]
https://htcondor.readthedocs.io/en/latest/misc-concepts/classad-mechanism.html#predefined-functions


-- 
Dr. Carsten Aulbert, Max Planck Institute for Gravitational Physics,
Callinstraße 38, 30167 Hannover, Germany
Phone: +49 511 762 17185