
[HTCondor-users] parallel universe: some questions about node allocation and preemption



Hello,

I am doing some tests with the parallel universe on a small test pool (4 worker nodes). I am using static slots on the WNs, and each WN has 8 cores.

I am not very optimistic about the feasibility of what I am asking for, but before giving up I would like to get confirmation from the experts. Below are my questions.

1) For an MPI job, is it possible to force the number of nodes and also balance the slot allocation between these nodes? For example, if I submit a 16-core MPI job (machine_count=16), is it possible to tell HTCondor to allocate only 2 WNs with 8 cores each? With Torque/Maui we do it with "#PBS -l nodes=2:ppn=8". We plan to migrate our parallel cluster from Torque/Maui to HTCondor; the migration is already done for the single/multi-core cluster.
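For reference, the submit file I am testing with looks roughly like this (a minimal sketch; the wrapper script and MPI program names are just placeholders):

    universe      = parallel
    executable    = openmpiscript      # wrapper that launches the MPI run (placeholder name)
    arguments     = my_mpi_program
    machine_count = 16
    request_cpus  = 1
    queue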

In the NEGOTIATOR_PRE_JOB_RANK expression I use a ranking based on a WN_ID (IP address converted to an integer) to get depth-first allocation. This works fine. But suppose now that I submit a 10-core MPI job while all the slots are idle: I will get 8 cores claimed on one WN and 2 cores on the next WN, based on the WN_ID ranking. I would prefer to have 5 cores allocated on each WN (balanced allocation) and avoid the other combinations.
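Concretely, the depth-first ranking I use is roughly this (WN_ID is the custom attribute I advertise from each startd, the IP address converted to an integer):

    # prefer filling machines in WN_ID order (depth-first allocation)
    NEGOTIATOR_PRE_JOB_RANK = WN_ID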

2) In my tests, one user (puser) submits parallel jobs and another user (vuser) submits vanilla single-core jobs. puser has higher priority than vuser. My PREEMPTION_REQUIREMENTS allows preemption of vuser's jobs. It works, but the problem is the following: suppose that 32 vuser jobs are already running; if puser submits a 2-core MPI job, all 32 of vuser's jobs will be preempted and put back in the queue. Is it possible to configure HTCondor to preempt only the required number of vanilla jobs? In my example, I would like only 2 vanilla jobs to be preempted instead of 32.

What I have observed is the following: at each negotiation cycle HTCondor preempts n slots (when possible) if the MPI job needs n slots in total and the already preempted slots have not yet finished retiring/vacating. In the end there may be n+n+n+... slots preempted, and the MPI job will use only n of them while the others stay 'Claimed Idle'.
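For context, the preemption policy I have is roughly of this form (a simplified sketch based on the usual user-priority example; my exact expression differs):

    # allow preemption when the submitting user's priority is sufficiently better
    # (numerically lower) than that of the user currently running on the slot
    PREEMPTION_REQUIREMENTS = ( RemoteUserPrio > SubmitterUserPrio * 1.2 )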

3) In relation to 2):
When the MPI job starts, the preempted but unused slots remain 'Claimed Idle' for ~10 minutes before becoming 'Unclaimed Idle' or 'Claimed Busy'. Setting 'UNUSED_CLAIM_TIMEOUT = 120' on the scheduler has no effect. Is there an explanation for that?

Thanks in advance for your help,

Christophe.

--
Christophe DIARRA
Institut de Physique Nucleaire
15 Rue Georges Clemenceau
S2I/D2I - Bat 100A - Piece A108
F91406 ORSAY Cedex
Tel:    +33 (0)1 69 15 65 60 / +33 (0)6 31 26 23 69
Fax:    +33 (0)1 69 15 64 70 / E-mail: diarra@xxxxxxxxxxxxx