
Re: [HTCondor-users] Changing slotweight for few nodes in pool



I like the 2GB chunks idea.

The approach I took a few years ago was to allow each requested CPU to claim up to one CPU's share of the system's memory without penalty, so the "excess memory" threshold varies depending on the number of CPU cores in the system.

For a 56-core 128GB system, it's about 2.3GB, while on the 96-core 512GB system it's 5.3GB. So on the former system, a 4-core job can claim up to 9.2GB without penalty, and on the latter it can claim up to 21.2GB. I'll have to keep an eye on this as core density continues to climb, I reckon - I don't have a floor value for it, so it could wind up less than a 2GB chunk depending on the machine configuration.

The excess (the job's memory request minus its CPUs' combined per-CPU share) is added to the "Cpus" number to get the final slot weight value. I don't reduce the weight for jobs using less memory than the per-CPU share, though perhaps I should.
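
Sketched as a SLOT_WEIGHT expression it comes out roughly like the following; the attribute names and the choice to express the excess in units of the per-CPU share are illustrative rather than a verbatim copy of my config:

  # Illustrative sketch only. Per-CPU memory share for this machine, in MB
  # (1.0 * forces real arithmetic instead of integer division).
  PER_CPU_MB = (1.0 * TotalMemory / TotalCpus)
  # Charge the requested CPUs plus any memory above their share,
  # with the excess expressed here in units of the per-CPU share.
  SLOT_WEIGHT = Cpus + ifThenElse(Memory > Cpus * $(PER_CPU_MB), \
                    (Memory - Cpus * $(PER_CPU_MB)) / $(PER_CPU_MB), 0)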

Michael V. Pelletier
Information Technology
Digital Transformation & Innovation
Integrated Defense Systems
Raytheon Company

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Steven C Timm
Sent: Tuesday, August 13, 2019 9:30 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [External] Re: [HTCondor-users] Changing slotweight for few nodes in pool

We are doing something like what you are doing at Fermilab. Basically our slot-weight
expression charges the user by the CPUs or by the number of 2GB memory chunks, whichever is higher: 1 CPU / 2GB = 1, 1 CPU / 3GB = 1.5, 1 CPU / 4GB = 2, 2 CPU / 2GB = 2, and so forth (a rough sketch follows below).
What I don't understand is why you would set the weight of the Partitionable slot to 1;
it should be set to how many CPUs remain in it at the time.
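
As a rough sketch, with Memory in MB so a chunk is 2048, and ifThenElse standing in for a max (this is illustrative rather than our exact expression):

  # Charge by CPUs or by 2GB memory chunks, whichever is higher (illustrative).
  SLOT_WEIGHT = ifThenElse(Memory / 2048.0 > Cpus, Memory / 2048.0, Cpus)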

The trap is that if you have a lot of small submitters, the slot weight of the slots can sometimes be so big that a submitter with a low limit never gets a slot. In theory the negotiator could hand either an existing dynamic slot or the whole Partitionable slot to the schedd; in practice it is mostly the latter that we see. The effect is that small submitters sometimes get frozen out, but a patch went in recently that fixes most of the problem.

The other thing that can happen is that condor_userprio doesn't accurately show the effects of a floating-point slot weight; it only reports integer resources used, but the underlying math is right.

Finally, be sure the slot weight expression never ends up undefined; major craziness can happen then.
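
One simple guard looks something like this (the helper macro name and the fallback value of 1 are just for illustration, not what we actually run):

  # Guard against an UNDEFINED slot weight by falling back to a constant.
  CHUNK_WEIGHT = ifThenElse(Memory / 2048.0 > Cpus, Memory / 2048.0, Cpus)
  SLOT_WEIGHT = ifThenElse(isUndefined($(CHUNK_WEIGHT)), 1, $(CHUNK_WEIGHT))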

Steve Timm