Re: [HTCondor-users] Job stays in queue for approx 20m before match making

Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab

On Fri, Oct 6, 2023 at 4:51 PM Vikrant Aggarwal <ervikrant06@xxxxxxxxx> wrote:

Hello Experts,

We are seeing issues on a 9.0.17 submitter box (all-in-one) that has multiple pools in its flocking list. The flocking pools are running version 8.8.5.

A job was submitted, but the negotiator did not even consider it for matchmaking.
Below are the logs from the submit node. I don't see any attempt to match the job in the Negotiator logs during this time.

10/06/23 10:18:30 (pid:1811906) job_transforms for 1129266.0: 5 considered, 5 applied
===== Lot of logs =====
10/06/23 10:29:50 (pid:1811906) Starting add_shadow_birthdate(1129266.0)
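
The gap between those two log lines can be measured with a small script. This is only a minimal sketch, assuming the SchedLog line format shown above (timestamp, then "(pid:N)", then the message); the helper names job_events and queue_delay are hypothetical and not part of HTCondor.

```python
import re
from datetime import datetime

# Matches the SchedLog prefix seen in the excerpt above, e.g.
# "10/06/23 10:18:30 (pid:1811906) <message>". The format is assumed
# from that excerpt only.
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \(pid:\d+\) (.*)$")


def job_events(lines, job_id):
    """Return (timestamp, message) pairs for log lines mentioning job_id.

    This is a plain substring match, so a longer ID that contains job_id
    would also match; good enough for a quick look at one job.
    """
    events = []
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m and job_id in m.group(2):
            ts = datetime.strptime(m.group(1), "%m/%d/%y %H:%M:%S")
            events.append((ts, m.group(2)))
    return events


def queue_delay(lines, job_id):
    """Time between the first and last log line that mentions job_id."""
    events = job_events(lines, job_id)
    if len(events) < 2:
        return None
    return events[-1][0] - events[0][0]


# Lines copied from the SchedLog excerpts in this message.
sample = [
    "10/06/23 10:18:30 (pid:1811906) job_transforms for 1129266.0: 5 considered, 5 applied",
    "10/06/23 10:18:34 (pid:1811906) Rebuilt prioritized runnable job list in 0.014s.",
    "10/06/23 10:29:50 (pid:1811906) Starting add_shadow_birthdate(1129266.0)",
]
delay = queue_delay(sample, "1129266.0")
print(delay)  # 0:11:20
```

To run it against the real log, pass open("/var/log/condor/SchedLog") as the lines argument instead of the sample list.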

I do see messages about "Rebuilt prioritized runnable job list":
# awk '/10\/06\/23 10:18:30/,/10\/06\/23 10:29:51/ {print $0}' /var/log/condor/SchedLog | grep 'Rebuilt prioritized runnable job list in' | head
10/06/23 10:18:34 (pid:1811906) Rebuilt prioritized runnable job list in 0.014s.
10/06/23 10:18:52 (pid:1811906) Rebuilt prioritized runnable job list in 0.004s.
The bug in [1] is already fixed in the version we are using on the submitter, and as far as I understand it only affects the submitter, not the master or worker nodes. Is there anything else that could cause this issue?
[1] https://opensciencegrid.atlassian.net/browse/HTCONDOR-769
Thanks & Regards,

Vikrant Aggarwal
