
Re: [HTCondor-users] Troubleshooting job eviction by machine RANK





On 2/3/2016 5:47 PM, Todd Tannenbaum wrote:
On 2/3/2016 5:09 PM, Graham Allan wrote:


but my interpretation was that RANK by itself should achieve the desired
effect.

Have I written enough for anyone to say where I'm going wrong?

Thanks, Graham


What version of HTCondor?

Mostly HTCondor 8.2 (there are a few 8.4 nodes).

And is your startd using static slots (default) or partitionable slots
on your execute nodes?  If you don't know the answer to this question,
it is probably static slots, but you can do
   condor_status -cons partitionableslot
and if you see any results then you have partitionable slots...
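
For anyone who wants to script the check above, here is a minimal Python sketch. It assumes the condor_status tool is on the PATH and that the installed version accepts the -constraint and -autoformat options. The helper names partitionable_slot_names and query_condor_status are made up for this illustration and are not part of HTCondor.

```python
import subprocess


def partitionable_slot_names(condor_status_output: str) -> list:
    """Return the non-empty lines of condor_status output, one slot name each."""
    return [line.strip() for line in condor_status_output.splitlines() if line.strip()]


def query_condor_status() -> str:
    """Ask the collector for slots whose PartitionableSlot attribute is true.

    Assumption: condor_status is installed and can reach the pool's collector.
    """
    result = subprocess.run(
        ["condor_status",
         "-constraint", "PartitionableSlot =?= True",  # keep only partitionable slots
         "-autoformat", "Name"],                       # print one slot name per line
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On a host with the HTCondor tools installed, partitionable_slot_names(query_condor_status()) returns an empty list when the query matches no slots, and the slot names otherwise. An empty list only means no partitionable slots were reported by the collector that was queried.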

It is using partitionable slots, fairly straightforwardly:

NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpus=100%
SLOT_TYPE_1_PARTITIONABLE = true

Now, I have seen allusions to issues between partitionable slots and preemption, but not exactly what they are. My impression was that the problem had to do with evicted jobs leaving the slots fragmented, rather than with preemption simply not happening.

Graham