[HTCondor-users] Fair-share limits reached while there are whole machines are available and idle jobs
- Date: Tue, 20 Nov 2018 15:47:34 -0600
- From: Alec Sheperd <alec.sheperd@xxxxxxxxxxxxxxxx>
- Subject: [HTCondor-users] Fair-share limits reached while there are whole machines are available and idle jobs
I recently noticed something strange with our condor pool. There are a
lot of idle jobs in the queue, and yet there are nearly as many
available slots. There are even whole machines with no jobs running,
yet none of the idle jobs get matched to these empty slots.
After digging around in the negotiator logs and ClassAds, it seems that
a lot of jobs are being rejected. There are many more rejections
happening than matches, and as far as I can tell they are all due to
fair-share limits.
Judging by the LastNegotiationCycleSubmittersShareLimit* ClassAd
attribute, all of the ones being rejected appear in the list it provides.
These jobs are all submitted from the default <none> group, which has
the surplus flag set. The negotiator log shows "Group <none> is using
its quota 2629 - halting negotiation".
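In case it helps anyone check the same thing in their own pool, here is
a minimal Python sketch that counts how often each group hits that
message. It assumes the log wording matches the single line quoted
above exactly; the function name count_quota_halts and the sample lines
are made up for illustration, and other HTCondor versions may word the
message differently.

```python
import re
from collections import Counter

# Pattern built only from the one message quoted above (an assumption:
# the exact wording may differ between HTCondor versions).
HALT_RE = re.compile(
    r"Group (\S+) is using its quota (\d+(?:\.\d+)?) - halting negotiation"
)


def count_quota_halts(lines):
    """Count 'halting negotiation' messages per group name."""
    halts = Counter()
    for line in lines:
        match = HALT_RE.search(line)
        if match:
            halts[match.group(1)] += 1
    return halts


if __name__ == "__main__":
    # Hypothetical sample input; in practice, pass the lines of the
    # negotiator log file instead.
    sample = [
        "Group <none> is using its quota 2629 - halting negotiation",
        "an unrelated log line",
    ]
    for group, count in count_quota_halts(sample).items():
        print(group, count)
```

Comparing that count against the number of negotiation cycles in the
same log should show whether the group stops at its quota on every
cycle or only occasionally.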
Could something be wrong with user priorities and quotas that is
preventing slot matches? I also wonder whether it is related to a bug
fixed in 8.7.10.
Thanks for any help or thoughts,