
Re: [Condor-users] Partitionable Slot Starvation



Hi Dan,

On 08/16/2012 10:42 AM, Dan Bradley wrote:
> 
> If the problem was caused by DEFRAG_REQUIREMENTS and/or 
> DEFRAG_WHOLE_MACHINE_EXPR, the defrag log would indicate so with a 
> message like the following:
> 
> "Drained 0 machines (wanted to drain X machines)."
> 
> "Doing nothing, because DEFRAG_MAX_WHOLE_MACHINES=X and there are Y 
> whole machines."

Right, I'm not seeing that message.

> 
> 
> As a sanity check, what numbers do you see in the following line in the 
> log when defrag starts up or is reconfigured?
> 
> "polling interval %ds, DEFRAG_DRAINING_MACHINES_PER_HOUR = %f/hour = 
> %d/interval + %d/hour + %d/day"


08/15/12 15:07:13 polling interval 90s, DEFRAG_DRAINING_MACHINES_PER_HOUR = 12.000000/hour = 0/interval + 12/hour + 0
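
For anyone reading along, the numbers in that line are consistent with a simple split of the hourly rate into a whole number of drains per polling interval, plus a leftover number per hour, plus a leftover per day. The sketch below is only my reading of the log line, not the actual condor_defrag code; the function name, the small rounding epsilon, and the round-to-nearest step for the daily remainder are all assumptions made for illustration.

```python
import math

def split_drain_rate(per_hour, polling_interval_s):
    """Hypothetical helper: split an hourly drain rate into
    (per-interval, extra per-hour, extra per-day) integer parts."""
    # Exact (fractional) number of drains per polling interval.
    per_interval_exact = per_hour * polling_interval_s / 3600.0
    # Whole drains that fit in every interval (epsilon guards float error).
    per_interval = math.floor(per_interval_exact + 1e-5)

    # What the per-interval part misses, expressed per hour.
    leftover_per_hour = (per_interval_exact - per_interval) * 3600.0 / polling_interval_s
    extra_per_hour = math.floor(leftover_per_hour + 1e-5)

    # What is still missing, expressed per day and rounded to nearest.
    leftover_per_day = (leftover_per_hour - extra_per_hour) * 24.0
    extra_per_day = math.floor(leftover_per_day + 0.5)

    return per_interval, extra_per_hour, extra_per_day

# Values from the log line above: 12 machines/hour, 90 s polling interval.
print(split_drain_rate(12.0, 90))
```

With a 90 s interval there are 40 polls per hour, so 12/hour is 0.3 per poll: nothing can be drained on every poll, and the whole budget ends up in the per-hour part, which matches "0/interval + 12/hour + 0".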

> 
> And what numbers do you see in the most recent log line of the following 
> form:
> 
> "There are currently %d draining and %d whole machines."
> 

08/16/12 12:09:31 There are currently 0 draining and 0 whole machines.


> One word of warning: defrag drains the whole startd, partitionable slots 
> and static slots alike.  If you only want it to drain some slots and not 
> others, you need to run multiple startds and set DEFRAG_REQUIREMENTS to 
> only match the slots of the startd to be drained and not the slots of 
> the other startd.

OK, so should I infer that defrag will only work on machines that have
a single whole-machine slot? Or just that it will drain single-core
slots in addition to the partitionable ones?

Thanks,
-Will