
[HTCondor-users] Trying to set-up DedicatedScheduler for parallel universe, but not preempting serial jobs



Hi all

I'm currently stuck on this problem and don't really know where to go
from here.

I've set-up two execute nodes with four cores each to only run my jobs (

START = Owner == "carsten"

). The set-up is with fully partitionable slots, i.e.

SLOT_TYPE_1 = ram=15023, swap=0%
SLOT_TYPE_1_PARTITIONABLE = True

With no other jobs running on the execute machines, a parallel job gets
both of them via

machine_count = 2
request_cpus = 4
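
A minimal sketch of a complete submit description around those two
lines. The executable and file names are placeholders chosen for
illustration, not the actual job:

```
# Parallel universe job spanning both execute nodes (sketch only)
universe      = parallel
# Placeholder payload; any short-running program would do for a test
executable    = /bin/hostname
# Two nodes, four cores claimed on each
machine_count = 2
request_cpus  = 4
# One output/error file per node; $(Node) expands to the node number
output        = parallel.$(Node).out
error         = parallel.$(Node).err
log           = parallel.log
queue
```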

So far so good. To allow both parallel and serial jobs I followed the
second policy of
https://research.cs.wisc.edu/htcondor/manual/v8.4/3_12Setting_Up.html#SECTION004128000000000000000
with the only exception for START described above and

PREEMPT = Scheduler =!= $(DedicatedScheduler)

which I tried after neither PREEMPT = True nor PREEMPT = False worked
out:

I started four single-core jobs on one of the execute nodes and
hoped/expected HTCondor to preempt them in order to launch the parallel
job, but so far to no avail. The empty second execute node is fully
matched, but no preemption occurs on the first one.
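
For anyone wanting to reproduce this, the serial load can be sketched
with a submit description like the following. The host name and the
sleep payload are hypothetical placeholders, and pinning the jobs via
requirements is just one way to get them onto a single node:

```
# Four single-core filler jobs pinned to one execute node (sketch only)
universe     = vanilla
# Placeholder payload that simply occupies a core for an hour
executable   = /bin/sleep
arguments    = 3600
request_cpus = 1
# Hypothetical host name; replace with the first execute node
requirements = (Machine == "node1.example.org")
log          = serial.log
queue 4
```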

Other things tried so far:

* setting ALLOW_PSLOT_PREEMPTION = True on the negotiator, schedd and
execute nodes
* reducing various timers to check whether they have any effect on
preemption, but so far without success:

MaxJobRetirementTime = 1
MachineMaxVacateTime = 10
CLAIM_WORKLIFE = 60

As I'm running out of options: has anyone succeeded with such a set-up?

Cheers

Carsten

-- 
Dr. Carsten Aulbert, Max Planck Institute for Gravitational Physics,
Callinstraße 38, 30167 Hannover, Germany
Phone: +49 511 762 17185