Re: [HTCondor-users] condor_defrag only some machines?
- Date: Thu, 12 Jan 2017 13:21:38 -0500
- From: Michael Di Domenico <mdidomenico4@xxxxxxxxx>
- Subject: Re: [HTCondor-users] condor_defrag only some machines?
Having let the pool run for a while longer, defrag does appear to have
pulled in some of the nodes that it originally wasn't touching.
So I guess what this really boils down to is that I don't understand what
DEFRAG_RANK = -ExpectedMachineGracefulDrainingBadput
really means as it relates to the current state of my pool.
I can see that ExpectedMachineGracefulDrainingBadput is a ClassAd attribute
attached to each of the machines in my pool, and that it holds a calculated
number, but I don't fully understand what it represents.
I see the explanation in the manual, but it's still not clear. Does
anyone have a pointer to something that explains how this expression
actually chooses which machines to put into the draining state?
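For what it's worth, my understanding is that the defrag daemon evaluates DEFRAG_RANK against each candidate machine's ad and drains the highest-ranked ones first, so negating ExpectedMachineGracefulDrainingBadput prefers the machines whose running jobs would lose the least work if drained. A rough way to preview that ordering (the attribute name is real; the machine names and numbers below are made-up sample data standing in for real condor_status output):

```shell
#!/bin/sh
# Preview the ordering implied by
#   DEFRAG_RANK = -ExpectedMachineGracefulDrainingBadput
# Candidates are sorted by the rank expression, highest value first, so
# negating the badput puts the machine with the LEAST expected wasted
# work at the top of defrag's list.
#
# In a real pool, the data would come from:
#   condor_status -af Machine ExpectedMachineGracefulDrainingBadput
# The sample values below are invented for illustration.
printf '%s\n' \
    'node01 120000' \
    'node02 300' \
    'node03 45000' |
sort -k2 -n   # ascending badput == descending -badput == defrag's preference
```

With these sample numbers, node02 (least expected badput) sorts to the top, i.e. it would be defrag's first pick.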
On Thu, Jan 12, 2017 at 11:57 AM, Michael Di Domenico
> I just turned on condor_defrag on my pool. We have a mixture of
> single-core jobs and full-node jobs in the queue at present, and as
> expected the full-node jobs are backed up and not running behind the
> single-core jobs.
> The defragging seems to be churning along, but it only seems to drain
> a particular subset of nodes, passing over all the others.
> My defrag config is the out-of-the-box defaults, and the pool is 100%
> partitionable slots (cpu/memory/gpus), configured the same on all
> nodes. We're running HTCondor 8.4.7 on Linux.
> Is there something I can run that will tell me why defrag is picking
> certain nodes, or, conversely, something that will tell me why defrag
> is ignoring other nodes?
> As best I can tell it's trying to defrag nodes, except the waiting
> full-node jobs require a certain set of nodes, and defrag doesn't seem
> to be touching those.
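One thing worth checking on the "only some machines" question: the defrag daemon only considers machines matching DEFRAG_REQUIREMENTS, so any node failing that expression is silently skipped, never ranked. A sketch of the relevant condor_config knobs (the macro names are real HTCondor defrag settings; the values shown are illustrative, not necessarily your site's or the shipped defaults):

```
# condor_config fragment -- illustrative values, not defaults verbatim.

# Only machines matching this expression are draining candidates;
# a node that doesn't match is never touched by condor_defrag.
DEFRAG_REQUIREMENTS = PartitionableSlot && Offline =!= True

# Candidates are sorted by this expression, highest value first;
# negating the expected badput prefers machines that would waste
# the least work from running jobs.
DEFRAG_RANK = -ExpectedMachineGracefulDrainingBadput

# Caps on how aggressively defrag acts.
DEFRAG_MAX_WHOLE_MACHINES = 8
DEFRAG_MAX_CONCURRENT_DRAINING = 2
```

You can confirm what your pool is actually using with `condor_config_val DEFRAG_REQUIREMENTS DEFRAG_RANK`.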