
Re: [HTCondor-users] Negotiation with partitionable slots



On Fri, 2017-04-28 at 08:31:38 +0200, Mathieu Bahin wrote:
> Hi Steffen,
> 
> Thanks for your answer (and yes it was a typo).
> 
> Your formula is interesting too. Though I'm not sure I made myself clear
> on my main question...
> I'll try to illustrate it with a very simple example.
> 
> Let's have a cluster of 4 nodes: 2 "small" with 5 CPUs and 2 "big" with
> 10 CPUs. If I set
>   NEGOTIATOR_PRE_JOB_RANK = floor(TotalCpus / 10) * -10000000
> then the small nodes should be assigned to jobs first (if they match the
> requirements).
> But if I run 6 jobs requesting 1 CPU each, then in the negotiation cycle
> each of the 4 nodes will advertise only 1 free slot (the partitionable
> slot gathering the free resources of that node). So only 2 jobs will be
> assigned to the small nodes, then 2 others will be assigned to the big
> ones, and finally the last 2 will be assigned to the small nodes during
> the next negotiation cycle.
> 
> Am I right? At least that's what I understand is happening on our
> cluster with this new formula. What I would like in this situation,
> though, is for all 6 jobs to be assigned to the small nodes.
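
To make the scenario above concrete, here is a toy Python model of it. This
is only a sketch of the behaviour described in the message, not HTCondor's
actual matchmaking algorithm. The function names, the node names and the
one_match_per_cycle switch are made up for illustration; the only thing
taken from the message is the rank expression and the 2 small / 2 big node
layout. It assumes a higher rank value is preferred and that ties go to the
node listed first.

```python
import math


def pre_job_rank(total_cpus):
    # Rank expression quoted in the message:
    #   floor(TotalCpus / 10) * -10000000
    # Higher values are assumed to be preferred.
    return math.floor(total_cpus / 10) * -10000000


def simulate(nodes, n_jobs, cpus_per_job=1, one_match_per_cycle=True):
    """Toy model: place n_jobs identical jobs on partitionable nodes.

    nodes maps a (hypothetical) node name to its total CPU count.
    Returns a list of (cycle, job, node) tuples in placement order.
    """
    free = dict(nodes)                      # free CPUs left on each node
    pending = list(range(1, n_jobs + 1))    # job ids still waiting
    placements = []
    cycle = 0
    while pending:
        cycle += 1
        matched_this_cycle = set()
        still_pending = []
        for job in pending:
            candidates = [
                name for name in nodes
                if free[name] >= cpus_per_job
                and not (one_match_per_cycle and name in matched_this_cycle)
            ]
            if not candidates:
                still_pending.append(job)
                continue
            # max() keeps the first candidate on ties (dict insertion order)
            best = max(candidates, key=lambda name: pre_job_rank(nodes[name]))
            free[best] -= cpus_per_job
            matched_this_cycle.add(best)
            placements.append((cycle, job, best))
        if len(still_pending) == len(pending):
            break                           # nothing fits any more; stop
        pending = still_pending
    return placements


cluster = {"small1": 5, "small2": 5, "big1": 10, "big2": 10}

# One match per partitionable slot per cycle, as described in the message
for placement in simulate(cluster, 6):
    print(placement)

# The desired outcome: a slot may be split several times within one cycle
for placement in simulate(cluster, 6, one_match_per_cycle=False):
    print(placement)
```

In this model the first call spreads the jobs over all four nodes across two
cycles (two on each small node, one on each big node), while the second call
keeps all six jobs on the small nodes within a single cycle.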

Since I made this tweak to the config a couple of years ago, I may have
forgotten some of the details. I vaguely remember that there is a setting
that forces further splitting of an already partially partitioned
slot. Now if I knew where to find it in 1100 pages of manual...
Perhaps you'll have more luck than I did?

- S

> 
> Cheers,
> Mathieu
> 
> On 27/04/17 19:00, htcondor-users-request@xxxxxxxxxxx wrote:
> > Today's Topics:
> >
> >    1. Negotiation with partitionable slots (Mathieu Bahin)
> >    2. Re: Negotiation with partitionable slots (Steffen Grunewald)
> >
> >
> > ----------------------------------------------------------------------
> >
> > Message: 1
> > Date: Thu, 27 Apr 2017 09:23:16 +0200
> > From: Mathieu Bahin <mathieu.bahin@xxxxxxxxxxxxxxx>
> > To: htcondor-users@xxxxxxxxxxx
> > Subject: [HTCondor-users] Negotiation with partitionable slots
> > Message-ID: <59019C64.1000805@xxxxxxxxxxxxxxx>
> > Content-Type: text/plain; charset=utf-8
> >
> > Hi,
> >
> > Our cluster, configured with partitionable slots and having very
> > heterogeneous nodes, has been more heavily loaded for a few months, and
> > we are facing a new issue: it is difficult for a user to run a job
> > requesting many CPUs and/or a lot of memory.
> >
> > Ideally, we would have liked to have the smallest machines first filled
> > up to ~80% (not to overload them when there is still space anywhere
> > else) in order to leave free a few of the biggest machines when possible.
> >
> > We set NEGOTIATOR_POST_JOB_RANK to something like this:
> > (RemoteOwner =!= UNDEFINED) * ((floor(TotalCpus / 10) * -10000000) +
> > KFlops - 1.0e10 * (Offline =?= True))
> > Our nodes are thus divided into 4 classes (our biggest node has 32
> > CPUs): those with fewer than 10 CPUs now have the highest priority, and
> > within each class the most powerful nodes come first (via KFlops).
> >
> > However, when we run a few dozen jobs, from what I observed, the
> > priority order is respected but only one job is allocated to each node
> > of the highest-priority classes (I guess because, with partitionable
> > slots, a node only advertises one big free slot per negotiation cycle),
> > and the next jobs are allocated, also one by one, to the lower-priority
> > nodes.
> > How can we get several jobs to really fill the highest-priority nodes
> > before the others are considered?
> >
> > Cheers,
> > Mathieu
> >
> 
> -- 
> ---------------------------------------------------------------------------------------
> | Mathieu Bahin
> | IE CNRS
> |
> | Institut de Biologie de l'Ecole Normale Supérieure (IBENS)
> | Biocomp team
> | 46 rue d'Ulm
> | 75230 PARIS CEDEX 05
> | 01.44.32.23.56
> ---------------------------------------------------------------------------------------
> 

-- 
Steffen Grunewald, Cluster Administrator
Max Planck Institute for Gravitational Physics (Albert Einstein Institute)
Am Mühlenberg 1
D-14476 Potsdam-Golm
Germany
~~~
Fon: +49-331-567 7274
Fax: +49-331-567 7298
Mail: steffen.grunewald(at)aei.mpg.de
~~~