
[HTCondor-users] Looking for negotiator optimization setting



Dear Condor experts,

I've noticed a scenario in our NegotiatorLog that leads to a longer-than-usual negotiation cycle, and I was wondering if there's a setting/knob that could help.

Example: (using HTCondor v8.6.2)

A submitter has several thousand idle, ready jobs (that are not clustered well) and a few hundred running jobs. The running jobs/slots are weighted by Cpus (the default) and are consuming almost all of the submitter's quota, so when negotiation with this submitter starts it receives a small limit. The submitter belongs to an accounting group, but is the only member of that group. The group uses a dynamic quota and accepts surplus, so fractional slot quotas can be assigned.
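For context, the group setup looks roughly like the sketch below; the group name and quota fraction are placeholders rather than our real values, and the knob names are the standard dynamic-quota ones as I understand them:

    # Rough sketch of the group configuration (illustrative values only)
    GROUP_NAMES = group_example
    # Dynamic quota: a fraction of the pool rather than a fixed slot count,
    # which is why the negotiator ends up with fractional quotas like 1284.74.
    GROUP_QUOTA_DYNAMIC_group_example = 0.25
    # Let the group pick up surplus left over from other groups.
    GROUP_ACCEPT_SURPLUS = True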

Here are the relevant lines from the start of negotiation, for reference:
06/14/18 18:06:54 0 seconds so far for this submitter
06/14/18 18:06:54 1 seconds so far for this schedd
06/14/18 18:06:54   maxAllowed= 60.7366  groupQuota= 1284.74  groupusage= 1224
06/14/18 18:06:54   Calculating submitter limit with the following parameters
06/14/18 18:06:54     SubmitterPrio       = 1344791.875000
06/14/18 18:06:54     SubmitterPrioFactor = 1000.000000
06/14/18 18:06:54     submitterShare      = 1.000000
06/14/18 18:06:54     submitterAbsShare   = 1.000000
06/14/18 18:06:54     submitterLimit      = 60.736572
06/14/18 18:06:54     submitterUsage      = 1224.000000

This submitter then quickly matches three of the first ~20 ready jobs, costing 32, 19, and 9, which uses up almost all of its limit.

The next job considered after the third match:
06/14/18 18:06:54 matchmakingAlgorithm: limit 60.736572 used 60.000000 pieLeft 0.736572
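That pieLeft is just the arithmetic from above: the limit of 60.736572 minus the three matches (32 + 19 + 9 = 60) leaves 0.736572.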

Negotiation with this submitter has taken less than a second at this point, yet it can no longer match any additional jobs, because the smallest possible weighted request it can make is one core. (It's also possible for this situation to occur with pieLeft > 1 if the submitter only has larger/heavier jobs ready to run.) We use an expression for RequestCpus, but set SCHEDD_SLOT_WEIGHT such that the schedds are still able to accurately weigh the jobs before sending them to the negotiator.
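To make that last point concrete, the idea behind our schedd-side weighting is roughly the following; the expression is simplified here and is not our production one:

    # Sketch only: RequestCpus in the submit files is an expression, so we
    # point SCHEDD_SLOT_WEIGHT at it so the schedd can hand the negotiator
    # an accurate numeric per-job weight up front.
    SCHEDD_SLOT_WEIGHT = RequestCpus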

I'm looking for a setting that would cause the negotiator to immediately end negotiation with a submitter when all of that submitter's remaining individual job weights are greater than the remaining quota (pieLeft). This already happens if pieLeft hits zero, but submitters usually end up with some fraction of a slot left over, as shown here.

Alternatively, if I could define a minimum pieLeft required for negotiation to continue, that would also cover the most common case.

In this example the negotiator continued to consider jobs from this submitter for another 44 seconds without making any additional matches. That seems long to me even if every job has to be considered, because each one could be rejected immediately against the submitter limit. Other submitters also hit this issue, but this one had the most idle jobs and therefore took the longest.

The last few lines before it moves to the next submitter:
06/14/18 18:07:38     Sending SEND_RESOURCE_REQUEST_LIST/20/eom
06/14/18 18:07:38     Getting reply from schedd ...
06/14/18 18:07:38     Got NO_MORE_JOBS; schedd has no more requests
06/14/18 18:07:38   This submitter hit its submitterLimit.

My short-term mitigation is to lower the per-submitter timeout, since none of our submitters legitimately need this long.
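If it helps anyone else, this is roughly what I have in mind for the negotiator config; I'm going from memory on the knob names, so please correct me if they're wrong, and the values are only examples:

    # Cap how long the negotiator spends on a single submitter / schedd
    # per cycle, in seconds (example values, not a recommendation).
    NEGOTIATOR_MAX_TIME_PER_SUBMITTER = 60
    NEGOTIATOR_MAX_TIME_PER_SCHEDD = 120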

It's always possible I'm approaching this wrong, or there's some other setting I'm missing. I'd love to hear your thoughts.

Thanks,
Collin