
Re: [Condor-users] Partitionable Slot Starvation



On 8/15/2012 4:49 PM, William Strecker-Kellogg wrote:

Each execute machine is 24-core and configured thus:

SLOT_TYPE_1 = cpus=16, ram=2/3, swap=2/3, disk=2/3
SLOT_TYPE_2 = cpus=auto, ram=auto, swap=auto, disk=auto

SLOT_TYPE_1_PARTITIONABLE = True

NUM_SLOTS_TYPE_1 = 1
NUM_SLOTS_TYPE_2 = 8

so we have a combination of partitionable and regular slots.

[snip]
Enter the defrag daemon. As I understand it, it was designed to prevent
this kind of starvation.  I configured it as follows:

DAEMON_LIST = $(DAEMON_LIST) DEFRAG
DEFRAG_INTERVAL = 90
DEFRAG_DRAINING_MACHINES_PER_HOUR = 12.0
DEFRAG_MAX_WHOLE_MACHINES = 4
DEFRAG_MAX_CONCURRENT_DRAINING = 4

Does anyone who has had experience setting up & running this have any
pointers or feedback?


Hi Will -

Warning: this feedback comes with a grand total of 10 seconds of thought, sent out right before I walk out the door. Having said that, my initial thought is that, because you are mixing static and partitionable slots on each machine (a perfectly reasonable thing to do, btw), the defrag daemon's default settings for DEFRAG_REQUIREMENTS and/or DEFRAG_WHOLE_MACHINE_EXPR may not be appropriate and should be tweaked in your config. That is, the default values for these two knobs may assume that all the slots on a startd are partitionable.
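
As a rough, untested sketch of the kind of tweak meant here: the expressions below are assumptions, not verified against your pool or your HTCondor version. The idea is to have both knobs look only at the partitionable slot, and to count it as "whole" once all 16 of the cpus given to SLOT_TYPE_1 are free again, rather than comparing against all 24 cores on the machine.

# Assumption: only partitionable slots are candidates for draining.
DEFRAG_REQUIREMENTS = PartitionableSlot && Offline =!= True
# Assumption: the partitionable slot counts as "whole" when its 16 cpus are unclaimed.
DEFRAG_WHOLE_MACHINE_EXPR = PartitionableSlot && Cpus >= 16 && Offline =!= True

The threshold of 16 is taken from the cpus=16 in the SLOT_TYPE_1 definition above and would need to change if that slot type changes.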

regards,
Todd