
Re: [HTCondor-users] Preemption



Hi!

At present it is working reasonably well with the latest stable version, 8.4.8.

Right now, jobs from X-users are never preempted, and jobs from non-X-users are preempted only when some activity is detected on that machine (ssh connection, mouse, keyboard, etc.). That more or less achieves the goal since X-users connect via ssh to this machine in order to submit their jobs, so at that moment ALL non-X-users jobs are killed.

Ideally, to maximize the use of this machine, we would like to preempt non-X-user jobs ONLY when X-users need the cores. For instance, if an X-user submits a job with request_cpus=4 and there are no free cores, it would be great to preempt one or more non-X-user jobs to free the required 4 cores... But I don't think that's easy to manage with partitionable slots...

I've seen an example in the documentation of how to preempt jobs after a defined running time when a job with better priority is submitted, using RemoteUserPrio and SubmitterUserPrio (or SubmittorPrio; I've seen different names for that):

PREEMPTION_REQUIREMENTS = $(StateTimer) > (1 * $(HOUR)) && RemoteUserPrio > SubmitterUserPrio * 1.2

but I don't know if it's possible to do something like that using usernames instead of priorities, so we can preempt jobs depending on who the Owner of the current running job is and who the Submitter is...
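
One way this might be expressed is sketched below. It is untested and rests on several assumptions: that the negotiator evaluates PREEMPTION_REQUIREMENTS with the claimed slot's ad as MY and the candidate job's ad as TARGET, that the slot ad carries RemoteOwner in "user@domain" form, and that "a" and "b" stand in for the real X-user names (they are the placeholders from the config quoted further down).

```
# Sketch only: allow preemption when the candidate job belongs to an X-user
# and the job currently holding the slot does not.
# stringListMember() and regexp() are ClassAd functions; "a" and "b" are
# placeholder usernames, not real accounts.
PREEMPTION_REQUIREMENTS = stringListMember(TARGET.Owner, "a,b") && \
                          !regexp("^(a|b)@", MY.RemoteOwner)
```

Because PREEMPTION_REQUIREMENTS is read by the negotiator, a setting like this would go on the central manager rather than on the special machine, and it would then apply to every slot in the pool unless the expression is further restricted (for example by machine name).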

Thanks a lot for your help!

Best regards,





Quoting Brian Bockelman <bbockelm@xxxxxxxxxxx>:

Hi Antonio,

Nothing immediately jumps out - other than the fact that partitionable slots are not preemptible until a fairly recent release (might have to dig through the version notes to figure out when this happened).
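
The message does not name the setting involved. Assuming it refers to the negotiator's ALLOW_PSLOT_PREEMPTION knob, which defaults to False and lets the negotiator preempt one or more dynamic slots to free resources in their parent partitionable slot, enabling it would look roughly like this:

```
# Assumption: this is the feature alluded to above.
# Set on the machine running condor_negotiator, not on the execute node.
ALLOW_PSLOT_PREEMPTION = True
```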

You may additionally want to provide a negotiator-based ranking expression to avoid preemption unless there are no other available machines.
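
As an illustration of such a ranking expression, and not necessarily the one meant here, the negotiator can be told to prefer slots that nobody has claimed. The sketch assumes NEGOTIATOR_PRE_JOB_RANK is the knob in question and that RemoteOwner is undefined on unclaimed slots:

```
# Sketch: rank unclaimed slots (RemoteOwner undefined) above claimed ones,
# so a preempting match is chosen only when no idle slot fits the job.
# Note: this replaces any NEGOTIATOR_PRE_JOB_RANK value already configured.
NEGOTIATOR_PRE_JOB_RANK = isUndefined(RemoteOwner)
```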

What errors / behaviors are you seeing when running with this configuration? Is the negotiator handing the schedd preempting matches for the otherwise-claimed slot?

Brian

On Aug 8, 2016, at 11:15 AM, Antonio Dorta <adorta@xxxxxx> wrote:

Hi!

In our HTCondor pool there is a special multi-core machine that belongs to a research group X. They want to run MPI parallel programs on that machine, and right now that works properly with partitionable slots and the vanilla universe: all jobs submitted by users belonging to X (X-users) run immediately as long as cores are available (using request_cpus), and they are never preempted.

Since this group does not use that machine every day, it would be useful to let other users run MPI or sequential programs on it, as long as they don't disturb X-users. So the idea is to run "non-X-user" jobs while there are available cores, but to be able to preempt those jobs as soon as X-users submit something and there are no available cores... Is there an easy way to do that?

I've been reading the documentation and running some tests, but the results are not quite right yet... So far I've been trying config files like the following, applied only on the special machine:

# Partitionable slot
SLOT_TYPE_1               = cpu=100%
SLOT_TYPE_1_PARTITIONABLE = TRUE
NUM_SLOTS_TYPE_1          = 1

X_USERS         = (Owner == "a" || Owner == "b" || ...)
RANK            = $(X_USERS) * 1000
START           = $(X_USERS) || $(START)
PREEMPT         = !$(X_USERS) && $(PREEMPT)
WANT_SUSPEND    = False

PREEMPTION_RANK = - (X_USERS * 10000) - TotalJobRunTime

Thanks a lot!!





--
Antonio Dorta
Servicios Informáticos Específicos (SIE)
Investigación y Enseñanza
Instituto de Astrofísica de Canarias (IAC)
C/ Vía Láctea, s/n. 38205 - La Laguna, Santa Cruz de Tenerife
Despacho: 1124. Tfno: 922 60 5278. email: adorta@xxxxxx
Supercomputing at IAC: http://www.iac.es/sieinvens/SINFIN/Main/supercomputing.php
----------------------------------------------------------------
ADVERTENCIA: Sobre la privacidad y cumplimiento de la Ley de Proteccion de Datos, acceda a http://www.iac.es/disclaimer.php WARNING: For more information on privacy and fulfilment of the Law concerning the Protection of Data, consult http://www.iac.es/disclaimer.php?lang=en

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/




--
Antonio Dorta
Servicios Informáticos Específicos (SIE)
Investigación y Enseñanza
Instituto de Astrofísica de Canarias (IAC)
C/ Vía Láctea, s/n. 38205 - La Laguna, Santa Cruz de Tenerife
Despacho: 1124. Tfno: 922 60 5278. email: adorta@xxxxxx
Supercomputing at IAC: http://www.iac.es/sieinvens/SINFIN/Main/supercomputing.php