
Re: [HTCondor-users] Preemption



Hi Antonio,

Nothing immediately jumps out, other than the fact that partitionable slots were not preemptible until a fairly recent release (you might have to dig through the version notes to figure out when this changed).

You may additionally want to provide a negotiator-based ranking expression so that preemption is avoided unless no other machines are available.
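
As a rough sketch of what that could look like (untested against your pool, and assuming these settings go in the configuration of the central manager, where the negotiator runs, rather than on the special machine):

    # Prefer slots that nobody has claimed, so that a preempting match
    # is only chosen when no idle slot can run the job.
    # Note: this replaces any existing NEGOTIATOR_PRE_JOB_RANK value.
    NEGOTIATOR_PRE_JOB_RANK = (RemoteOwner =?= UNDEFINED)

    # Assumption: your release has this knob. It lets the negotiator
    # preempt dynamic slots carved out of a partitionable slot.
    ALLOW_PSLOT_PREEMPTION = True

Whether ALLOW_PSLOT_PREEMPTION is available depends on the version you are running, so please check it against the manual for your release before relying on it.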

What errors / behaviors are you seeing when running with this configuration?  Is the negotiator handing the schedd preempting matches for the otherwise-claimed slot?

Brian

> On Aug 8, 2016, at 11:15 AM, Antonio Dorta <adorta@xxxxxx> wrote:
> 
> Hi!
> 
> In our HTCondor pool there is a special multi-core machine that belongs to a research group X. They want to run MPI parallel programs on that machine and right now that's properly working with partitionable slots and vanilla universe: all jobs submitted by users belonging to X (X-users) run immediately while there are still available cores (using request_cpus) and they are never preempted.
> 
> Since this group is not using that machine every day, it would be useful if other users could run MPI or sequential programs on it, as long as they don't disturb X-users. So the idea is to run "non-X-users" jobs while there are available cores, but to be able to preempt those jobs as soon as X-users submit jobs and there are no available cores... Is that possible in an easy way?
> 
> I've been reading the documentation and running some tests, but the results are not quite right yet... So far I've been trying config files like the following one, applied only on the special machine:
> 
> # Partitionable slot
> SLOT_TYPE_1               = cpu=100%
> SLOT_TYPE_1_PARTITIONABLE = TRUE
> NUM_SLOTS_TYPE_1          = 1
> 
> X_USERS         = (Owner == "a" || Owner == "b" || ...)
> RANK            = $(X_USERS) * 1000
> START           = $(X_USERS) || $(START)
> PREEMPT         = !$(X_USERS) && $(PREEMPT)
> WANT_SUSPEND    = False
> 
> PREEMPTION_RANK = - (X_USERS * 10000) - TotalJobRunTime
> 
> Thanks a lot!!
> 
> 
> 
> 
> 
> -- 
> Antonio Dorta
> Servicios Informáticos Específicos (SIE)
> Investigación y Enseñanza
> Instituto de Astrofísica de Canarias (IAC)
> C/ Vía Láctea, s/n. 38205 - La Laguna, Santa Cruz de Tenerife
> Despacho: 1124. Tfno: 922 60 5278. email: adorta@xxxxxx
> Supercomputing at IAC: http://www.iac.es/sieinvens/SINFIN/Main/supercomputing.php
> ----------------------------------------------------------------
> ADVERTENCIA: Sobre la privacidad y cumplimiento de la Ley de Proteccion de Datos, acceda a http://www.iac.es/disclaimer.php
> WARNING: For more information on privacy and fulfilment of the Law concerning the Protection of Data, consult http://www.iac.es/disclaimer.php?lang=en
> 