
Re: [HTCondor-users] Preemption to priorize parallel jobs



Hi Carles,

Unfortunately, I do not have real-life experience with preemption, but as far as I know the negotiator triggers preemption mainly on the basis of PREEMPTION_REQUIREMENTS evaluating to 'true'.

Hence I suppose you have to tag these slots in some way as 'mpi' or 'preemptable' and reference that tag in the PREEMPTION_REQUIREMENTS expression. Then you need to distinguish between the two job types (which you may also have to tag) in the same expression (if the running job is low priority and the queued job is high priority, then preempt) and that's it - easy ;)
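
A rough, untested sketch of that idea follows. The attribute names MPI_CAPABLE and JobPrioClass are made up for illustration; only STARTD_ATTRS, STARTD_JOB_ATTRS and PREEMPTION_REQUIREMENTS are real configuration knobs. It assumes the jobs set the tag themselves in the submit file (for example +JobPrioClass = "low" or +JobPrioClass = "high").

```
# --- On the six MPI-capable execute nodes ---
# Hypothetical tag that marks the slots as preemptable for MPI work.
MPI_CAPABLE = True
STARTD_ATTRS = $(STARTD_ATTRS) MPI_CAPABLE

# Copy the (hypothetical) job tag of the running job into the slot ad,
# so that the negotiator can see it when it evaluates the expression.
STARTD_JOB_ATTRS = $(STARTD_JOB_ATTRS) JobPrioClass

# --- On the central manager ---
# MY is the slot ad (including the copied tag of the running job),
# TARGET is the queued job that is a candidate to take over the slot.
PREEMPTION_REQUIREMENTS = (MY.MPI_CAPABLE =?= True) && \
                          (MY.JobPrioClass =?= "low") && \
                          (TARGET.JobPrioClass =?= "high")
```

As far as I understand the manual, the negotiator only considers this expression when the submitter of the queued job also has a better user priority than the owner of the running job, so the expression alone may not be enough to get the MPI job in.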

Best
Christoph


--
Christoph Beyer
DESY Hamburg
IT-Department

Notkestr. 85
Building 02b, Room 009
22607 Hamburg

phone:+49-(0)40-8998-2317
mail: christoph.beyer@xxxxxxx


From: "Carles Acosta" <cacosta@xxxxxx>
To: "htcondor-users" <htcondor-users@xxxxxxxxxxx>
Sent: Friday, 21 February 2020 14:51:33
Subject: [HTCondor-users] Preemption to priorize parallel jobs

Hi all,
I have 6 nodes in my pool that are available for MPI jobs. My idea is to have these nodes run other jobs and, when a parallel job is queued, have those jobs evicted so that the MPI job can start. Thus, I understand that preemption is where I have to look, but I'm not sure how to do it. My first question: is this really something that I can do using negotiator preemption? Or is there another option (for example, with eviction, i.e. startd preemption, could I create a PREEMPT expression that is only true when a parallel job is queued)?

Has anyone done something like this? 

My first idea is to add a RANK expression on the startd like this:

RANK = Scheduler =?= $(DedicatedScheduler) * RequestCpus > 8 * JobUniverse == 11

And then on the negotiator side, add the PREEMPTION_REQUIREMENTS:

PREEMPTION_REQUIREMENTS = DedicatedScheduler =?= "DedicatedScheduler@xxxxxxxxxxxxxxx" && JobUniverse =!= 11 && (SubmitterGroup =?= RemoteGroup)

But my first tests are not working as expected.

Thank you in advance.

Best regards,

Carles


--
Carles Acosta i Silva
PIC (Port d'Informació Científica)
Campus UAB, Edifici D
E-08193 Bellaterra, Barcelona
Tel: +34 93 581 33 08
Fax: +34 93 581 41 10
Avís - Aviso - Legal Notice: http://www.ifae.es/legal.html
