
Re: [HTCondor-users] Negotiator only allocating 1 job per machine per cycle



Hi Julio,

I think you are overthinking this a bit ;)

Condor can take care of most of the issues you describe through ClassAd matching. You just need to define the needs of the job in the submit file in terms of 'request_<resource>', e.g. request_memory, request_disk, request_cpus.
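
As a rough sketch, a submit file along these lines would declare the job's needs. The executable name, the job count and all the sizes are placeholders I made up for illustration (10 GB is just the 40 GB of the smallest machine divided by the 4 jobs mentioned in this thread), so they would need tuning against what the startd actually reports:

```
# Hypothetical submit description file; my_job.sh and all sizes are placeholders
executable     = my_job.sh

# Resources each job asks for; Condor matches these against the slot's resources
request_cpus   = 1
request_memory = 10GB
request_disk   = 1GB

# Submit 20 instances of the job
queue 20
```

With requests like these the matchmaking decides how many jobs fit on a machine, so there should be no need to assign job counts per machine by hand.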

On the worker node you can create one partitionable slot, and Condor will create child (dynamic) slots for each job according to the job's needs. At the same time Condor automatically keeps track of the resources on the worker nodes, so that, for example, once all the memory you gave to the partitionable slot is reserved by claimed slots, no additional jobs will start on that worker.
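
A minimal sketch of the worker-node configuration for this, assuming you want a single partitionable slot that owns all of the machine's detected resources (the percentage is an assumption, and you may prefer to hand over less than 100% to keep something free for other tasks):

```
# condor_config fragment for the worker node (sketch, adjust to your site)
# One slot of type 1 that owns all detected cpus, memory and disk
NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = 100%

# Let Condor carve dynamic child slots out of it, sized by each job's request_* values
SLOT_TYPE_1_PARTITIONABLE = TRUE
```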

I may be wrong here though (and might have misunderstood you) - my wife assures me I often am ;)

Best
christoph


--
Christoph Beyer
DESY Hamburg
IT-Department

Notkestr. 85
Building 02b, Room 009
22607 Hamburg

phone:+49-(0)40-8998-2317
mail: christoph.beyer@xxxxxxx


From: "Valdes, Julio" <Julio.Valdes@xxxxxxxxxxxxxx>
To: "htcondor-users" <htcondor-users@xxxxxxxxxxx>
Sent: Thursday, 2 September 2021 17:42:46
Subject: Re: [HTCondor-users] Negotiator only allocating 1 job per machine per cycle

Hello Greg:

 

The machines where the jobs would run have memory in the 40-96 GB range, and a machine with the smallest amount of memory can run 4 jobs simultaneously (I have done these tests manually, so I know it for sure). That is one of the reasons why I want to control the number of jobs assigned to each machine. If more than 4 jobs are allocated simultaneously, the machines with the minimum amount of memory would be maxed out.

The other reason is that if all cores of a given machine are running jobs, the machine would be incapable of doing other tasks, or would do them very slowly.

Disk space is not an issue at all, as they have large disks, more than sufficient to store the files generated by the jobs.

An additional question that I have is whether it is possible, when submitting a job to a given machine, to specify the slot that should run the job. I do not know if condor allows that kind of control and if it does, how to write the submission file to achieve such behavior.

Thank you for considering my problem, and do not hesitate to ask any questions that would help you understand the situation.

I very much appreciate that you have taken the time to consider the problem.

Sincerely

 

Julio J. Valdés

National Research Council Canada | Conseil National de Recherches Canada

Digital Technologies Research Centre | Centre de Recherche en Technologies Numériques

Data Science for Complex Systems Group | Science des Données pour les Systèmes Complexes

M-50, 1200 Montreal Road, Ottawa, Ontario K1A 0R6 | M-50, 1200 chemin Montréal, Ottawa, Ontario K1A 0R6

Canada | Canada

julio.valdes@xxxxxxxxxxxxxx

tel/tél: (1)613-993-0257

 

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Greg Thain
Sent: Wednesday, September 01, 2021 3:48 PM
To: htcondor-users@xxxxxxxxxxx
Subject: Re: [HTCondor-users] Negotiator only allocating 1 job per machine per cycle

 


On 9/1/21 2:20 PM, Valdes, Julio wrote:

Hello Todd:

 

Excuse me for jumping into the discussion that you are having with Kneller and Ho.

My big need and question is how to send a given number of jobs to a specific machine.

For example: I have 20 jobs to submit but I want 4 to go to machine A, 7 to machine B and so on.

I assume that the number of jobs assigned to a given machine has to be less than or equal to its number of cores, but correct me if I am wrong.

 

Can we get a few more details about your requirements?  e.g. Do you want only (I assume "at most") 4 jobs from any user of any kind of job to ever run at the same time on machine A?  What are the cpu & memory requirements for these jobs?

 

-greg

 

