
Re: [HTCondor-users] Problems Defining Additional Slot Types



When you define NUM_SLOTS_TYPE_1, the default type 0 slot will be disabled, so you would still only have one slot type.   I would expect that with a config of

 

NUM_SLOTS_TYPE_1 = $(MEMORY)/16384

SLOT_TYPE_1 = ram=16384

 

you would end up with 12 slots, each having 16384 MB of memory.
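As a quick check of that arithmetic, here is a minimal Python sketch. It assumes the machine reports exactly 192 GB (196608 MB), as described in the original message, and that the division is rounded down; both are assumptions for illustration, not statements about how HTCondor evaluates the expression.

```python
# Values taken from the thread; the exact detected memory is an assumption.
total_memory_mb = 192 * 1024   # 196608 MB, the nominal 192 GB
slot_memory_mb = 16384         # memory requested per type 1 slot

# Floor division: only whole slots can be created (rounding is assumed).
num_slots = total_memory_mb // slot_memory_mb
print(num_slots)
```

If the detected memory comes out slightly below the nominal 192 GB, the same floor division would give 11 rather than 12.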

 

If you don’t configure NUM_SLOTS_TYPE_1, then SLOT_TYPE_1 will be ignored.

Note that I used NUM_SLOTS_TYPE_1, not NUM_SLOT_TYPE_1: plural, not singular.

 

Similarly, the knob for setting the number of CPUs is NUM_CPUS, not NUM_CPU.

 

-tj

 

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Vechinski, Douglas
Sent: Monday, April 4, 2022 10:05 AM
To: htcondor-users@xxxxxxxxxxx
Subject: [HTCondor-users] Problems Defining Additional Slot Types

 

I am attempting to define additional slot types on some of our execute machines in a condor pool. For example, a particular machine has 20 real cores and 192 GB of memory. So by default, when condor starts on this machine it has 20 slots, each “allotted” ~9.6 GB. In addition to these default slots, I am attempting to define other slots that would advertise as having 16 GB of memory, so at most 12 of these types of slots. In the configuration file for this machine I have the following

 

NUM_CPU=20

 

SLOT_TYPE_1 = ram=16384

 

Where I assume that the memory is supposed to be specified in MB. I then have tried each of the following

 

% condor_reconfig -name machine_name

% condor_restart -name machine_name -daemon startd

% condor_restart -name machine_name

 

None of these appears to work; they do not create the additional slots designated for larger-memory jobs. First question: when making this type of change, which of the commands above is necessary to implement it? Second, is there any reason why the additional slots are not being created? I’ve also tried adding

 

NUM_SLOT_TYPE_1 = 12

 

to specify the maximum number of these slot types, but I thought Condor would figure this out for itself. Either way, that didn’t appear to have any effect either.

 

An additional follow-on question, assuming that I can get the above to begin working. Suppose 8 jobs get submitted and are allocated to these type 1 slots. Next, suppose 20+ jobs are submitted that fall under the default type 0 slots. How many will begin running on this machine: a) 12 (based upon the cores remaining out of the maximum of 20), or b) 6 (based upon the memory that is “left”)?
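For reference, the two candidate answers can be reproduced with a short Python sketch. It only restates the arithmetic behind options a) and b) and does not indicate which one HTCondor would actually apply. It assumes that each of the 8 large-memory jobs occupies one core and 16384 MB, and that a default slot gets an equal share of the nominal 192 GB.

```python
# Figures from the message above; the one-core-per-job accounting is assumed.
total_cores = 20
total_memory_mb = 192 * 1024        # 196608 MB
big_slot_mb = 16384                 # memory of one type 1 slot
busy_big_slots = 8                  # type 1 slots already running jobs

# Memory share of one default slot: 196608 / 20 = 9830.4 MB (~9.6 GB).
default_slot_mb = total_memory_mb / total_cores

# Option a): limited by the cores that remain.
cores_left = total_cores - busy_big_slots

# Option b): limited by the memory that remains.
memory_left_mb = total_memory_mb - busy_big_slots * big_slot_mb
by_memory = int(memory_left_mb // default_slot_mb)

print(cores_left, by_memory)
```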