
Re: [HTCondor-users] TARGET.Cpus < RequestCpus



Hi Sam,

condor_status -master reports the total number of CPU cores detected on each machine, but a job is not matched against the machine as a whole. It is matched against a slot on an execution point. Slots are what you see when you run plain condor_status. A base HTCondor install creates one static slot per CPU core, with all other resources (disk, RAM, etc.) divided evenly among them. That is why your analysis shows 64 slots (2 execution points x 32 cores) passing every condition except TARGET.Cpus >= RequestCpus. Since each base slot has 1 CPU core and the job requests 16, it will never match and will sit idle forever (or at least until a matching slot is created or added somehow).

One thing you can do is switch the execution point over to a partitionable slot by adding 'use FEATURE:PartitionableSlot' to its configuration. This creates a single parent slot on the execution point that holds all available resources (CPU cores, disk, RAM, etc.), from which dynamic slots of the appropriate size are created for each job. Note that any configuration change affecting how job execution resources/slots are created or managed requires a restart of the daemons. A reconfiguration will not work.
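
As a minimal sketch of that change, assuming the line goes into whatever local configuration file your execution points already read (the file location depends on the install and is not given in this thread), the addition would look something like:

```
# Goes in the configuration of each execution point.
# (Which file that is depends on your install; this is an assumption.)
# Replaces the default one-static-slot-per-core layout with a single
# partitionable slot that owns all of the machine's CPUs, memory, and disk.
use FEATURE : PartitionableSlot
```

After the daemons on that machine have been restarted, plain condor_status should list one partitionable slot per execution point (named like slot1@hostname) whose Cpus value is the full core count, so a 16-core request can be carved out of it.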

Cheers,
Cole Bollig

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Sam.Dana@xxxxxxxxxxx <Sam.Dana@xxxxxxxxxxx>
Sent: Friday, September 15, 2023 7:23 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] TARGET.Cpus < RequestCpus
 
My submit file has
request_cpus = 16   (>1)

condor_status -master
Name                       Version  Cpus  Memory    Uptime

Test-Daryll-00.cvasil.tld  10.0.6     32  255.8 GB  0+23:25:10   <-- CM / Execute
test-daryll-81.cvasil.tld  10.0.6     32  255.8 GB  0+23:25:03   <-- Execute
test-daryll-82.cvasil.tld  10.0.6     32  255.8 GB  1+00:05:03   <-- Schedd

condor_q -better-analyze 
The Requirements expression for job 207.000 reduces to these conditions:

         Slots
Step    Matched  Condition
-----  --------  ---------
[0]          64  TARGET.Arch == "X86_64"
[1]          64  TARGET.OpSys == "WINDOWS"
[3]          64  TARGET.Disk >= RequestDisk
[5]          64  TARGET.Memory >= RequestMemory
[7]           0  TARGET.Cpus >= RequestCpus

No successful match recorded.

Windows 10 / Server2019
How do I resolve? 
ELI5, please.

Thanks, 
Sam


