
Re: [HTCondor-users] Dynamic slots not resizing



Thanks a lot everybody,

 

Setting CLAIM_WORKLIFE to a value lower than the parallel Job0 duration did the trick for this example.
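
For reference, a minimal sketch of that setting in the execute node's configuration. The 600-second value is only an illustrative assumption (it presumes Job0 runs for longer than ten minutes) and is not the value used here; CLAIM_WORKLIFE is given in seconds:

    # Assumed value: shorter than Job0's run time, so the claim stops accepting further jobs once Job0 ends
    CLAIM_WORKLIFE = 600

The startd has to reread its configuration (for example with condor_reconfig) before the new value applies.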

I am really liking this job scheduler, hope to deploy it in our department soon. :)

 

Óscar

 

 

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Steven C Timm
Sent: Thursday, March 22, 2018 16:01
To: htcondor-users@xxxxxxxxxxx
Subject: Re: [HTCondor-users] Dynamic slots not resizing

 

You can also tune variables such as CLAIM_WORKLIFE to force the negotiator to get involved when the new job starts. Generally, if a single schedd holds a claim on the partitionable slot, jobs will keep running in the existing dynamic slot. In some odd corner cases they will keep matching and running there anyway.

 

Steve

 


From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Laborda Sanchez, Oscar (Volkswagen Group Services) <extern.Oscar.Laborda@xxxxxxx>
Sent: Thursday, March 22, 2018 9:07:17 AM
To: htcondor-users@xxxxxxxxxxx
Subject: [HTCondor-users] Dynamic slots not resizing

 

Hello,

 

I have an execution server with 16 cores using a single partitionable slot:

NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = 100%
SLOT_TYPE_1_PARTITIONABLE = true

 

When I submit several jobs like this:

- Job0: request_cpus=16
- Job1: request_cpus=1
- Job2: request_cpus=1
- ...
- JobN: request_cpus=1
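
The submissions above could be written as submit description files along these lines. This is only a sketch: the file names, the executable names, and the number of 1-CPU jobs are hypothetical and stand in for the real ones.

    # job0.sub: the 16-CPU job (executable name is hypothetical)
    executable   = run_parallel.sh
    request_cpus = 16
    queue

    # jobs.sub: the 1-CPU jobs (executable name and job count are hypothetical)
    executable   = run_serial.sh
    request_cpus = 1
    queue 10

Each file would be submitted separately with condor_submit, Job0 first.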

 

I find that Job0 starts running in a newly created dynamic slot with all 16 CPUs, as expected. But when it finishes, the 1-CPU jobs run sequentially, one at a time, in that same 16-CPU slot (the environment variable OMP_NUM_THREADS is set to 16 for those jobs too).

I expected the 16 cpu dynamic slot to be destroyed at the end of Job0 and several new ones to be created with 1 cpu each. This is what happens when I do not send Job0 first.

 

Can this be fixed?

 

Thank you

Oscar

