
Re: [HTCondor-users] Flock level issues with 8.8.12 compared with 8.6.12



OK, I've answered my own question. This was bugging me over the weekend so I've been checking the version changelogs

and came across this:

 

Version 8.8.10

Bugs Fixed:

Fixed a bug that prevented the condor_schedd from effectively flocking to pools when resource request list prefetching is enabled, which is the default in HTCondor version 8.8. (Ticket #7754)

 

Version 8.9.7

Bugs Fixed:

Fixed a bug that prevented the condor_schedd from effectively flocking to pools when resource request list prefetching is enabled, which is the default in HTCondor version 8.9 (Ticket #7549) (Ticket #7539)

 

So in our current setup with the versions listed in my original email below I added:

NEGOTIATOR_PREFETCH_REQUESTS = false

to the config of all the central managers and voila, flocking is working again as it did previously.

 

I assume once all submit nodes and execute nodes (maybe just submit nodes) are also at version 8.8.12 like the CMs then we can

go back to the new defaults for 8.8.12 of:

 

NEGOTIATOR_PREFETCH_REQUESTS = true

# at: <Default>

# expanded: true

# default: true

NEGOTIATOR_PREFETCH_REQUESTS_MAX_TIME = 60

# at: <Default>

# expanded: 60

# default: 60
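For anyone wanting to work out which submit nodes would still need the workaround, here is a minimal Python sketch. It assumes the fix lives in the condor_schedd and is present from 8.8.10 in the 8.8 series and from 8.9.7 onwards, which is only what the two changelog entries above suggest. The function name and the node inventory are made up for illustration.

```python
def schedd_has_flocking_fix(version: str) -> bool:
    """Return True if this schedd version should contain the flocking fix.

    Assumption (based only on the 8.8.10 and 8.9.7 changelog entries):
    8.8.10+ in the 8.8 series and 8.9.7+ otherwise are fixed;
    anything older (e.g. 8.6.x) is treated as not fixed.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if (major, minor) == (8, 8):
        return patch >= 10
    return (major, minor, patch) >= (8, 9, 7)


# Hypothetical inventory of submit nodes and their schedd versions.
submit_nodes = {"submit-a": "8.6.13", "submit-b": "8.8.12"}
needs_workaround = [
    name for name, version in submit_nodes.items()
    if not schedd_has_flocking_fix(version)
]
print(needs_workaround)
```

If the list comes back empty, it would presumably be safe to try the default of NEGOTIATOR_PREFETCH_REQUESTS = true again, though that is still my assumption rather than something the docs confirm.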

 

I can't seem to find any docs explaining what these config values do but I assume they speed up job matches on the negotiator?

 

Cheers

 

Greg

 

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Hitchen, Greg (IM&T, Kensington WA)
Sent: Friday, 26 February 2021 4:14 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [ExternalEmail] [HTCondor-users] Flock level issues with 8.8.12 compared with 8.6.12

 

Hi again

 

Around 2 weeks ago we were running 8.6.* on our HTCondor system. Jobs would happily flock to all our State pools (9 in total).

Windows submit nodes - 8.6.13

Windows execute nodes - 8.6.12

Linux collectors/negotiators - 8.6.10

 

Since then we attempted an upgrade to 8.8.12.

We got as far as upgrading the execute nodes and collectors/negotiators before we noticed an issue with the IS_OWNER parameter,

or rather noticed that execute nodes were no longer taking notice of owner activity (cpu, mouse, keyboard). Before understanding

what the issue was the upgrade was rolled back on the execute nodes so as not to impact execute node owners.

We did not roll back the Linux central managers.

We never got as far as upgrading the submit nodes.

 

So now we have:

Windows submit nodes - 8.6.13

Windows execute nodes - 8.6.12

Linux collectors/negotiators - 8.8.12

 

and jobs are not flocking to all the pools anymore. Note that no config files were changed on any systems.

The SchedLog on the Windows submit nodes shows info like below. The schedd never seems to get above a

flock level of 3. I can't quite understand how this has happened, given that we are back to the original

8.6.* setup, apart from the Linux collectors/negotiators. Before downgrading these to see if this fixes the issue

(I don't know how? Surely this is a schedd thing?) is there something I can try? Or config knobs to look at?

Or other log file entries to look for? It also seems to take much longer than I remember to increase the flock levels.

 

Thanks.

 

Cheers

 

Greg

 

*****************************************************************************************

C:\Program Files\condor\log>findstr -i flock SchedLog.txt

02/26/21 17:26:48 Increasing flock level for na-hit023 to 1 from 0. (Due to lack of activity from negotiator)

02/26/21 17:26:48 Decreasing flock level for na-hit023 to 0 from 1.

02/26/21 17:26:48 Increasing flock level for na-hit023 to 1 from 0.

02/26/21 17:27:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:27:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:27:48 Decreasing flock level for na-hit023 to 0 from 1.

02/26/21 17:27:48 Increasing flock level for na-hit023 to 1 from 0.

02/26/21 17:28:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:28:29 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:28:48 Decreasing flock level for na-hit023 to 0 from 1.

02/26/21 17:29:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:29:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:29:48 Increasing flock level for na-hit023 to 1 from 0.

02/26/21 17:30:29 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:30:29 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:30:47 Decreasing flock level for na-hit023 to 0 from 1.

02/26/21 17:31:29 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:31:29 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:31:47 Increasing flock level for na-hit023 to 1 from 0.

02/26/21 17:32:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:32:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:32:48 Decreasing flock level for na-hit023 to 0 from 1.

02/26/21 17:33:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:33:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:33:48 Increasing flock level for na-hit023 to 1 from 0.

02/26/21 17:34:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:34:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:34:48 Decreasing flock level for na-hit023 to 0 from 1.

02/26/21 17:35:29 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:35:29 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:35:48 Increasing flock level for na-hit023 to 1 from 0.

02/26/21 17:36:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:36:29 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:36:48 Decreasing flock level for na-hit023 to 0 from 1.

02/26/21 17:36:48 Increasing flock level for na-hit023 to 1 from 0.

02/26/21 17:37:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:37:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:37:48 Decreasing flock level for na-hit023 to 0 from 1.

02/26/21 17:38:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:38:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:38:47 Increasing flock level for na-hit023 to 1 from 0.

02/26/21 17:39:29 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:39:29 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:39:47 Decreasing flock level for na-hit023 to 0 from 1.

02/26/21 17:40:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:40:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:40:48 Increasing flock level for na-hit023 to 1 from 0.

02/26/21 17:41:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:42:29 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:43:28 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:44:21 Decreasing flock level for na-hit023 to 0 from 1.

02/26/21 17:44:21 Increasing flock level for na-hit023 to 1 from 0.

02/26/21 17:44:29 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:44:29 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:44:29 Increasing flock level for na-hit023 to 2 from 1.

02/26/21 17:45:10 Negotiating for owner: na-hit023@xxxxxxxx (flock level 2, pool condor-vic.csiro.au)

02/26/21 17:45:10 Negotiating for owner: na-hit023@xxxxxxxx (flock level 2, pool condor-vic.csiro.au)

02/26/21 17:45:10 Increasing flock level for na-hit023 to 3 from 2.

02/26/21 17:45:21 Decreasing flock level for na-hit023 to 0 from 3.

02/26/21 17:45:21 Increasing flock level for na-hit023 to 1 from 0.

02/26/21 17:45:30 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:45:30 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:45:30 Increasing flock level for na-hit023 to 2 from 1.

02/26/21 17:45:59 Negotiating for owner: na-hit023@xxxxxxxx (flock level 3, pool condor-nsw.csiro.au)

02/26/21 17:45:59 Negotiating for owner: na-hit023@xxxxxxxx (flock level 3, pool condor-nsw.csiro.au)

02/26/21 17:46:09 Negotiating for owner: na-hit023@xxxxxxxx (flock level 2, pool condor-vic.csiro.au)

02/26/21 17:46:09 Negotiating for owner: na-hit023@xxxxxxxx (flock level 2, pool condor-vic.csiro.au)

02/26/21 17:46:09 Increasing flock level for na-hit023 to 3 from 2.

02/26/21 17:46:21 Decreasing flock level for na-hit023 to 0 from 3.

02/26/21 17:46:21 Increasing flock level for na-hit023 to 1 from 0.

02/26/21 17:46:30 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:46:30 Negotiating for owner: na-hit023@xxxxxxxx (flock level 1, pool condor-act.csiro.au)

02/26/21 17:46:30 Increasing flock level for na-hit023 to 2 from 1.

02/26/21 17:46:59 Negotiating for owner: na-hit023@xxxxxxxx (flock level 3, pool condor-nsw.csiro.au)

02/26/21 17:46:59 Negotiating for owner: na-hit023@xxxxxxxx (flock level 3, pool condor-nsw.csiro.au)

02/26/21 17:47:10 Negotiating for owner: na-hit023@xxxxxxxx (flock level 2, pool condor-vic.csiro.au)

02/26/21 17:47:10 Negotiating for owner: na-hit023@xxxxxxxx (flock level 2, pool condor-vic.csiro.au)

02/26/21 17:47:10 Increasing flock level for na-hit023 to 3 from 2.

02/26/21 17:47:22 Decreasing flock level for na-hit023 to 0 from 3.

02/26/21 17:47:22 Increasing flock level for na-hit023 to 1 from 0.
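
A log excerpt like the one above is easier to read when summarized. Below is a rough Python sketch that reports the highest flock level each user reached and which pools were negotiated with at each level. The regular expressions are based only on the line formats visible in this excerpt, so other SchedLog variants may not match, and the function name is my own.

```python
import re
from collections import defaultdict

# Patterns inferred from the SchedLog lines shown above (an assumption,
# not an official description of the log format).
INCREASE = re.compile(r"Increasing flock level for (\S+) to (\d+) from (\d+)")
NEGOTIATE = re.compile(
    r"Negotiating for owner: (\S+) \(flock level (\d+), pool ([^)]+)\)"
)


def summarize_flocking(lines):
    """Return (max flock level per user, pools negotiated with per flock level)."""
    max_level = defaultdict(int)
    pools_by_level = defaultdict(set)
    for line in lines:
        match = INCREASE.search(line)
        if match:
            user, new_level = match.group(1), int(match.group(2))
            max_level[user] = max(max_level[user], new_level)
            continue
        match = NEGOTIATE.search(line)
        if match:
            pools_by_level[int(match.group(2))].add(match.group(3))
    return dict(max_level), dict(pools_by_level)


# Three lines copied from the excerpt above.
sample = [
    "02/26/21 17:45:10 Increasing flock level for na-hit023 to 3 from 2.",
    "02/26/21 17:45:21 Decreasing flock level for na-hit023 to 0 from 3.",
    "02/26/21 17:45:59 Negotiating for owner: na-hit023@xxxxxxxx "
    "(flock level 3, pool condor-nsw.csiro.au)",
]
levels, pools = summarize_flocking(sample)
print(levels)
print(pools)
```

To run it against a real log, pass an open file object for SchedLog.txt instead of the sample list. With 9 pools configured, a maximum level that stays at 3 would show up immediately in the first dictionary.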