
Re: [HTCondor-users] Negotiator only allocating 1 job per machine per cycle



Hi,

"...ith one new slot per machine assigned on each negotiate loop although as jobs finish that slot is immediately refilled with an idle job in the batch (not waiting for the negotiate loop). Is this to be expected? ..."

This is most likely due to CLAIM_WORKLIFE, which is the time during which the schedd will keep running identical jobs on the claimed slot. The default is 20 min (1200s). You can set it with 'CLAIM_WORKLIFE = <xxx>' in the slot definition; a value of '0' means only one job will run during the claim and the slot will be renegotiated as soon as possible, which is more like the behaviour you expect, I suspect?
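To illustrate, a minimal startd configuration sketch (only the knob name CLAIM_WORKLIFE is taken from the documentation; applying it globally as shown is just one option):

```
# Release the claim after each job, so the slot goes back
# through the negotiator instead of being reused by the schedd
CLAIM_WORKLIFE = 0
```

With the default of 1200 seconds, the schedd keeps the claim and reuses the slot for further identical jobs for up to 20 minutes without renegotiating.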

best
christoph


--
Christoph Beyer
DESY Hamburg
IT-Department

Notkestr. 85
Building 02b, Room 009
22607 Hamburg

phone:+49-(0)40-8998-2317
mail: christoph.beyer@xxxxxxx


Von: "Lachlan Palmer" <LPalmer@xxxxxxxxxxxx>
An: "htcondor-users" <htcondor-users@xxxxxxxxxxx>
CC: "Isaac Ho" <IHo@xxxxxxxxxxxx>
Gesendet: Dienstag, 31. August 2021 00:17:37
Betreff: Re: [HTCondor-users] Negotiator only allocating 1        job        per        machine        per        cycle

TJ,

 

Yes, I looked at both the schedd and the execute nodes, and we had not specified either of those knobs. I have added them explicitly just to be sure, although when I checked with condor_config_val the default values were already shown. We are still seeing the behavior.

 

One other thing: we do have a concurrency limit set up for this particular set of jobs. I ran a quick test and noticed that when this concurrency limit is off, the slots are filled up immediately. However, when the concurrency limit is on, we see the staggered launching, with one new slot per machine assigned on each negotiate loop, although as jobs finish that slot is immediately refilled with an idle job in the batch (not waiting for the negotiate loop). Is this to be expected?
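For context, a sketch of the kind of concurrency-limit setup we mean (the limit name and value here are placeholders, not our actual config). On the central manager:

```
# Cap the named resource at 10 concurrent units pool-wide
MYLIMIT_LIMIT = 10
```

and in the submit description:

```
concurrency_limits = MYLIMIT
```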

 

If not then I will set up the logging and get back to you with more information.

 

Cheers,

 

Lachlan

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of John M Knoeller
Sent: Monday, August 30, 2021 8:16 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Isaac Ho <IHo@xxxxxxxxxxxx>
Subject: Re: [HTCondor-users] Negotiator only allocating 1 job per machine per cycle

 

Just to double check, you looked at the configuration of the Schedd and the Execute node for these knobs, correct?

The Negotiator never looks at CLAIM_PARTITIONABLE_LEFTOVERS. 

 

I should also mention that if you have CONSUMPTION_POLICY enabled on the EXECUTE_NODE, then CLAIM_PARTITIONABLE_LEFTOVERS is ignored.

 

If you still want to pursue this question, I'll need to see the SchedLog from the time that the match is returned from the negotiator. That would be 08/27/21 16:22:01 in the logs below, but you will need to have at least D_FULLDEBUG logging enabled in the Schedd to be sure of seeing the necessary messages. Having D_COMMAND logging will also be helpful.

 

Add this to the configuration of the Schedd.

 

SCHEDD_DEBUG =  $(SCHEDD_DEBUG) D_CAT D_COMMAND:1 D_FULLDEBUG

 

Then find a match message from the Negotiator and send me the SchedLog for the time period from that match until the job starts up.

 

You might also want to turn up the logging on the execute node for this:

 

STARTD_DEBUG = $(STARTD_DEBUG) D_CAT D_COMMAND:1 D_FULLDEBUG

 

There will be a lot more output in the log, and it's difficult to wade through, but it should tell us why you are starting up a single job per negotiation cycle on your partitionable slots.

 

-tj

 


From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Lachlan Palmer <LPalmer@xxxxxxxxxxxx>
Sent: Friday, August 27, 2021 6:35 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Isaac Ho <IHo@xxxxxxxxxxxx>
Subject: Re: [HTCondor-users] Negotiator only allocating 1 job per machine per cycle

 

Thanks TJ

 

We have not changed the defaults. I explicitly added those lines in to be sure but still seeing the same:

 

08/27/21 16:22:01 Group group1 - BEGIN NEGOTIATION

08/27/21 16:22:01 Phase 3:  Sorting submitter ads by priority ...

08/27/21 16:22:01 Starting prefetch round; 1 potential prefetches to do.

08/27/21 16:22:01 Starting prefetch negotiation for group1.YYYY@xxxxxxxxxxxx.

08/27/21 16:22:01     Got NO_MORE_JOBS;  schedd has no more requests

08/27/21 16:22:01 Prefetch summary: 1 attempted, 1 successful.

08/27/21 16:22:01 Phase 4.1:  Negotiating with schedds ...

08/27/21 16:22:01   Negotiating with group1.YYYY@xxxxxxxxxxxx at <10.1.23.222:9618?addrs=10.1.23.222-9618&noUDP&sock=8396_d680_3>

08/27/21 16:22:01 0 seconds so far for this submitter

08/27/21 16:22:01 0 seconds so far for this schedd

08/27/21 16:22:01     Request 400355.00040: autocluster 5891 (request count 1 of 41)

08/27/21 16:22:01       Matched 400355.40 group1.YYYY@xxxxxxxxxxxx <10.1.23.222:9618?addrs=10.1.23.222-9618&noUDP&sock=8396_d680_3> preempting none <10.1.26.84:9618?addrs=10.1.26.84-9618&noUDP&sock=4916_7db8_3> slot1@xxxxxxxxxxxxxxxxxxxxx

08/27/21 16:22:01       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx

08/27/21 16:22:01     Request 400355.00040: autocluster 5891 (request count 2 of 41)

08/27/21 16:22:01       Matched 400355.40 group1.YYYY@xxxxxxxxxxxx <10.1.23.222:9618?addrs=10.1.23.222-9618&noUDP&sock=8396_d680_3> preempting none <10.1.26.85:9618?addrs=10.1.26.85-9618&noUDP&sock=4212_6bc2_3> slot1@xxxxxxxxxxxxxxxxxxxxx

08/27/21 16:22:01       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx

08/27/21 16:22:01     Request 400355.00040: autocluster 5891 (request count 3 of 41)

08/27/21 16:22:01       Matched 400355.40 group1.YYYY@xxxxxxxxxxxx <10.1.23.222:9618?addrs=10.1.23.222-9618&noUDP&sock=8396_d680_3> preempting none <10.1.27.73:9618?addrs=10.1.27.73-9618&noUDP&sock=14808_7a16_3> slot1@xxxxxxxxxxxxxxxxxxxxx

08/27/21 16:22:01       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx

08/27/21 16:22:01     Request 400355.00040: autocluster 5891 (request count 4 of 41)

08/27/21 16:22:01       Matched 400355.40 group1.YYYY@xxxxxxxxxxxx <10.1.23.222:9618?addrs=10.1.23.222-9618&noUDP&sock=8396_d680_3> preempting none <10.1.26.86:9618?addrs=10.1.26.86-9618&noUDP&sock=5004_7ff6_3> slot1@xxxxxxxxxxxxxxxxxxxxx

08/27/21 16:22:01       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx

08/27/21 16:22:01     Request 400355.00040: autocluster 5891 (request count 5 of 41)

08/27/21 16:22:01       Rejected 400355.40 group1.YYYY@xxxxxxxxxxxx <10.1.23.222:9618?addrs=10.1.23.222-9618&noUDP&sock=8396_d680_3>: no match found

08/27/21 16:22:01  negotiateWithGroup resources used submitterAds length 1

 

For the startd configuration we do have these settings:  

# Slot Configuration

NUM_SLOTS = 1

NUM_SLOTS_TYPE_1 = 1

SLOT_TYPE_1 = auto

SLOT_TYPE_1_PARTITIONABLE = True

 

MODIFY_REQUEST_EXPR_REQUESTCPUS = quantize(RequestCpus, {1})

MODIFY_REQUEST_EXPR_REQUESTMEMORY = quantize(RequestMemory, {TotalSlotMem / TotalSlotCpus / 4})

MODIFY_REQUEST_EXPR_REQUESTDISK = quantize(RequestDisk, {1024})

 

Is something with that incorrect?

 

Cheers,

 

Lachlan

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of John M Knoeller
Sent: Friday, August 27, 2021 1:15 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Isaac Ho <IHo@xxxxxxxxxxxx>
Subject: Re: [HTCondor-users] Negotiator only allocating 1 job per machine per cycle

 

 

A message like this

 

08/26/21 17:02:01       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx

08/26/21 17:02:01     Request 400158.00040: autocluster 4430 (request count 3 of 41)

08/26/21 17:02:01       Matched 400158.40 group_1.YYYY@xxxxxxxxxxxx <10.1.23.222:9618?addrs=10.1.23.222-9618&noUDP&sock=8396_d680_3> preempting none <10.1.26.86:9618?addrs=10.1.26.86-9618&noUDP&sock=5004_7ff6_3> slot1@xxxxxxxxxxxxxxxxxxxxx

 

does not necessarily mean that the Schedd is only going to start a single job on slot1@xxxxxxxxxxxxxxxxxxxxx.

It can start more than one job on that slot if the slot is a partitionable slot and it has at least 3*2 CPUs free. We would need to look at the interaction between the Schedd and the Execute node to see that part of the process.

 

If you look closely, the various match messages from the negotiator keep referring to the same job id 400158.40: autocluster 4430 (request count N of 41). This indicates that there are 41 jobs that are all the same as job 400158.40 for the purposes of matchmaking. The Negotiator can just match that job and let the Schedd decide whether to run that job or one of the other 40 jobs in that batch on the slot that the Negotiator hands back.

 

CLAIM_PARTITIONABLE_LEFTOVERS controls whether the Schedd will try to start multiple jobs on a matching p-slot. Your negotiator log looks a lot like what you would see if your schedd config had CLAIM_PARTITIONABLE_LEFTOVERS=false.

The defaults for these are:

 

CLAIM_PARTITIONABLE_LEFTOVERS = $(NEGOTIATOR_USE_SLOT_WEIGHTS)

NEGOTIATOR_USE_SLOT_WEIGHTS = true

 

Did you change one of these two config knobs from the default?
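One quick way to verify what the daemons actually see is condor_config_val; something along these lines, run on the schedd host:

```
condor_config_val CLAIM_PARTITIONABLE_LEFTOVERS
condor_config_val NEGOTIATOR_USE_SLOT_WEIGHTS
```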

 

-tj

 


From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Lachlan Palmer <LPalmer@xxxxxxxxxxxx>
Sent: Friday, August 27, 2021 2:32 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Isaac Ho <IHo@xxxxxxxxxxxx>
Subject: [HTCondor-users] Negotiator only allocating 1 job per machine per cycle

 

Hi Everyone,

 

We have been submitting a batch of jobs with the following characteristics:

- A single variable within the batch, e.g. queue var in (var1, var2, var3, ...)

- A priority equal to the process number, e.g. priority=$(Process)

- A request for 3 CPUs, e.g. request_cpus=3

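Put together, the submit description looks roughly like this (the executable name and var values are placeholders):

```
executable   = myjob.sh
request_cpus = 3
priority     = $(Process)
arguments    = $(var)
queue var in (var1, var2, var3)
```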
 

What we are seeing is that the jobs are being launched by HTCondor in a staggered manner, one on each machine in the grid every cycle. Here is an example from the negotiator log:

 

08/26/21 17:02:01 ---------- Started Negotiation Cycle ----------

08/26/21 17:02:01 Phase 1:  Obtaining ads from collector ...

08/26/21 17:02:01   Getting startd private ads ...

08/26/21 17:02:01   Getting Scheduler, Submitter and Machine ads ...

08/26/21 17:02:01   Sorting 7 ads ...

08/26/21 17:02:01 Got ads: 7 public and 3 private

08/26/21 17:02:01 Public ads include 1 submitter, 3 startd

08/26/21 17:02:01 Phase 2:  Performing accounting ...

08/26/21 17:02:01 group quotas: assigning 1 submitters to accounting groups

08/26/21 17:02:01 group quotas: assigning group quotas from 84 available weighted slots

08/26/21 17:02:01 group quotas: allocation round 1

08/26/21 17:02:01 group quotas: groups= 6  requesting= 1  served= 1  unserved= 0  slots= 84  requested= 123  allocated= 84  surplus= 0  maxdelta= 84

08/26/21 17:02:01 group quotas: entering RR iteration n= 84

08/26/21 17:02:01 Group group_0 - skipping, zero slots allocated

08/26/21 17:02:01 Group group_1 - BEGIN NEGOTIATION

08/26/21 17:02:01 Phase 3:  Sorting submitter ads by priority ...

08/26/21 17:02:01 Starting prefetch round; 1 potential prefetches to do.

08/26/21 17:02:01 Starting prefetch negotiation for group_1.YYYY@xxxxxxxxxxxx.

08/26/21 17:02:01     Got NO_MORE_JOBS;  schedd has no more requests

08/26/21 17:02:01 Prefetch summary: 1 attempted, 1 successful.

08/26/21 17:02:01 Phase 4.1:  Negotiating with schedds ...

08/26/21 17:02:01   Negotiating with group_1.YYYY@xxxxxxxxxxxx at <10.1.23.222:9618?addrs=10.1.23.222-9618&noUDP&sock=8396_d680_3>

08/26/21 17:02:01 0 seconds so far for this submitter

08/26/21 17:02:01 0 seconds so far for this schedd

08/26/21 17:02:01     Request 400158.00040: autocluster 4430 (request count 1 of 41)

08/26/21 17:02:01       Matched 400158.40 group_1.YYYY@xxxxxxxxxxxx <10.1.23.222:9618?addrs=10.1.23.222-9618&noUDP&sock=8396_d680_3> preempting none <10.1.26.85:9618?addrs=10.1.26.85-9618&noUDP&sock=4212_6bc2_3> slot1@xxxxxxxxxxxxxxxxxxxxx

08/26/21 17:02:01       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx

08/26/21 17:02:01     Request 400158.00040: autocluster 4430 (request count 2 of 41)

08/26/21 17:02:01       Matched 400158.40 group_1.YYYY@xxxxxxxxxxxx <10.1.23.222:9618?addrs=10.1.23.222-9618&noUDP&sock=8396_d680_3> preempting none <10.1.26.84:9618?addrs=10.1.26.84-9618&noUDP&sock=4916_7db8_3> slot1@xxxxxxxxxxxxxxxxxxxxx

08/26/21 17:02:01       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx

08/26/21 17:02:01     Request 400158.00040: autocluster 4430 (request count 3 of 41)

08/26/21 17:02:01       Matched 400158.40 group_1.YYYY@xxxxxxxxxxxx <10.1.23.222:9618?addrs=10.1.23.222-9618&noUDP&sock=8396_d680_3> preempting none <10.1.26.86:9618?addrs=10.1.26.86-9618&noUDP&sock=5004_7ff6_3> slot1@xxxxxxxxxxxxxxxxxxxxx

08/26/21 17:02:01       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx

08/26/21 17:02:01     Request 400158.00040: autocluster 4430 (request count 4 of 41)

08/26/21 17:02:01       Rejected 400158.40 group_1.YYYY@xxxxxxxxxxxx <10.1.23.222:9618?addrs=10.1.23.222-9618&noUDP&sock=8396_d680_3>: no match found

08/26/21 17:02:01  negotiateWithGroup resources used submitterAds length 1

08/26/21 17:02:01 Group group_2 - skipping, zero slots allocated

08/26/21 17:02:01 Group group_3 - skipping, zero slots allocated

08/26/21 17:02:01 Group group_4 - skipping, zero slots allocated

08/26/21 17:02:01 Group <none> - skipping, zero slots allocated

08/26/21 17:02:01 Round 1 totals: allocated= 84  usage= 84

08/26/21 17:02:01 ---------- Finished Negotiation Cycle ----------

08/26/21 17:03:01 ---------- Started Negotiation Cycle ----------

08/26/21 17:03:01 Phase 1:  Obtaining ads from collector ...

08/26/21 17:03:01   Getting startd private ads ...

08/26/21 17:03:01   Getting Scheduler, Submitter and Machine ads ...

08/26/21 17:03:01   Sorting 10 ads ...

08/26/21 17:03:01 Got ads: 10 public and 6 private

08/26/21 17:03:01 Public ads include 1 submitter, 6 startd

08/26/21 17:03:01 Phase 2:  Performing accounting ...

08/26/21 17:03:01 group quotas: assigning 1 submitters to accounting groups

08/26/21 17:03:01 group quotas: assigning group quotas from 84 available weighted slots

08/26/21 17:03:01 group quotas: allocation round 1

08/26/21 17:03:01 group quotas: groups= 6  requesting= 1  served= 1  unserved= 0  slots= 78  requested= 123  allocated= 84  surplus= 0  maxdelta= 75

08/26/21 17:03:01 group quotas: entering RR iteration n= 75

08/26/21 17:03:01 Group group_0 - skipping, zero slots allocated

08/26/21 17:03:01 Group group_2 - skipping, zero slots allocated

08/26/21 17:03:01 Group group_1 - BEGIN NEGOTIATION

08/26/21 17:03:01 Phase 3:  Sorting submitter ads by priority ...

08/26/21 17:03:01 Starting prefetch round; 1 potential prefetches to do.

08/26/21 17:03:01 Starting prefetch negotiation for group_1.YYYY@xxxxxxxxxxxx.

08/26/21 17:03:01     Got NO_MORE_JOBS;  schedd has no more requests

08/26/21 17:03:01 Prefetch summary: 1 attempted, 1 successful.

08/26/21 17:03:01 Phase 4.1:  Negotiating with schedds ...

08/26/21 17:03:01   Negotiating with group_1.YYYY@xxxxxxxxxxxx at <10.1.23.222:9618?addrs=10.1.23.222-9618&noUDP&sock=8396_d680_3>

08/26/21 17:03:01 0 seconds so far for this submitter

08/26/21 17:03:01 0 seconds so far for this schedd

08/26/21 17:03:01     Request 400158.00037: autocluster 4430 (request count 1 of 38)

08/26/21 17:03:01       Matched 400158.37 group_1.YYYY@xxxxxxxxxxxx <10.1.23.222:9618?addrs=10.1.23.222-9618&noUDP&sock=8396_d680_3> preempting none <10.1.26.85:9618?addrs=10.1.26.85-9618&noUDP&sock=4212_6bc2_3> slot1@xxxxxxxxxxxxxxxxxxxxx

08/26/21 17:03:01       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx

08/26/21 17:03:01     Request 400158.00037: autocluster 4430 (request count 2 of 38)

08/26/21 17:03:01       Matched 400158.37 group_1.YYYY@xxxxxxxxxxxx <10.1.23.222:9618?addrs=10.1.23.222-9618&noUDP&sock=8396_d680_3> preempting none <10.1.26.84:9618?addrs=10.1.26.84-9618&noUDP&sock=4916_7db8_3> slot1@xxxxxxxxxxxxxxxxxxxxx

08/26/21 17:03:01       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx

08/26/21 17:03:01     Request 400158.00037: autocluster 4430 (request count 3 of 38)

08/26/21 17:03:01       Matched 400158.37 group_1.YYYY@xxxxxxxxxxxx <10.1.23.222:9618?addrs=10.1.23.222-9618&noUDP&sock=8396_d680_3> preempting none <10.1.26.86:9618?addrs=10.1.26.86-9618&noUDP&sock=5004_7ff6_3> slot1@xxxxxxxxxxxxxxxxxxxxx

08/26/21 17:03:01       Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx

08/26/21 17:03:01     Request 400158.00037: autocluster 4430 (request count 4 of 38)

08/26/21 17:03:01       Rejected 400158.37 group_1.YYYY@xxxxxxxxxxxx <10.1.23.222:9618?addrs=10.1.23.222-9618&noUDP&sock=8396_d680_3>: no match found

08/26/21 17:03:01  negotiateWithGroup resources used submitterAds length 1

08/26/21 17:03:01 Group group_3 - skipping, zero slots allocated

08/26/21 17:03:01 Group group_4 - skipping, zero slots allocated

08/26/21 17:03:01 Group <none> - skipping, zero slots allocated

08/26/21 17:03:01 Round 1 totals: allocated= 84  usage= 84

08/26/21 17:03:01 ---------- Finished Negotiation Cycle ----------

 

 

Previously we had not been seeing this when we were launching jobs in individual batches, but also without the priority or the cpu request commands. Is either of these job commands responsible for the staggered launch?

 

Thanks,

 

Lachlan

 

This communication (both the message and any attachments or links) is confidential and only intended for the use of the person or persons to whom it is addressed unless we have expressly authorized otherwise. It also may contain information that is protected by solicitor-client privilege. If you are reading this communication and are not an addressee or authorized representative of an addressee, we hereby notify you that any distribution, copying or other use of it without our express authorization is strictly prohibited. If you have received this communication in error, please delete both the message and any attachments from your system and notify us immediately by e-mail or phone. In addition, we note that this communication and its transmission of data have not been secured by encryption. Therefore, we are not able to confirm or guarantee that the communication has not been intercepted, amended, or read by an unintended third party.



_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/