Re: [HTCondor-users] Trying to set-up DedicatedScheduler for parallel universe, but not preempting serial jobs
- Date: Tue, 06 Dec 2016 20:05:00 +0000
- From: Jaime Frey <jfrey@xxxxxxxxxxx>
> On Dec 6, 2016, at 1:08 PM, Carsten Aulbert <carsten.aulbert@xxxxxxxxxx> wrote:
> On 12/06/16 18:14, Jaime Frey wrote:
>> For this setup to work, you will need ALLOW_PSLOT_PREEMPTION=True, since you want one 4-core request to preempt four 1-core requests.
> Yepp, tried but that was not just enough - though I stumbled over that
> option only by coincidence.
>> It appears that pslot preemption doesn't work for parallel jobs, but that would be easy to fix. I will work on getting this in for a future release.
> Great :)
It turns out fixing this is more involved than I initially thought. The dedicated scheduler won't try to match a parallel job if no single slot ad has enough resources to satisfy each node of the job. Logic to preempt and merge multiple small slots into a single slot for a large job is in the negotiator, but we need equivalent logic in the dedicated scheduler.
>> With the current release, setting ALLOW_PSLOT_PREEMPTION=False, your parallel job could request 8 single-core allocations, like this:
>> machine_count = 8
>> request_cpus = 1
> Hmm, I thought I tried that (will try again tomorrow), but wouldn't that
> only work with 8 different machines and get a single core each?
The "machines" counted by machine_count are Machine ads in the collector, not distinct physical machines, so several nodes of the job can be matched to slots on the same host.
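
For reference, here is a minimal sketch of what the complete submit description for that workaround might look like. Only machine_count and request_cpus come from the suggestion above; the executable name is a hypothetical placeholder, and your job will likely need further settings (arguments, output and log files, and so on).

```
# Sketch only: my_parallel_job.sh is a placeholder, not a real script.
universe      = parallel
executable    = my_parallel_job.sh

# Ask for 8 nodes of 1 core each rather than fewer multi-core nodes,
# so each node can be satisfied by a single existing slot ad.
machine_count = 8
request_cpus  = 1

queue
```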