Re: [HTCondor-users] jobprio isn't global?



On Mar 6, 2014, at 10:42 AM, Pek Daniel <pekdaniel@xxxxxxxxx> wrote:

> Hi,
> 
> First, I turned off the negotiator.
> 
> I've submitted 80 000 identical jobs from 10 schedd nodes with the
> same user, 8000 jobs / schedd.
> 
> Afterwards, I turned on the negotiator.
> 
> I noticed that during negotiation, all of the jobs from a single
> schedd are dispatched first, then those from the second one, and so
> on, sequentially.
> 
> Then I tried a trick: I assigned a randomized JobPrio (0-1000000)
> to every job with the 'priority' submit file command. I saw the
> same behaviour.
> 
> I can imagine two explanations:
> - jobprio is local to a specific schedd, and doesn't have any effect
> on the order of dispatching across different schedds.
> 
> - jobprio is ignored for some reason, maybe a global setting which
> overrides it...
> 
> The possible configuration settings which - in my opinion - can affect this:
> 
> ##  When is this machine willing to start a job?
> START = TRUE
> 
> ## We don't want preemption ever to be used
> PREEMPT = FALSE
> SUSPEND = FALSE
> KILL = FALSE
> PREEMPTION_REQUIREMENTS = FALSE
> NEGOTIATOR_CONSIDER_PREEMPTION = FALSE
> RANK = 0
> 
> Any idea what can cause this, and how to circumvent the original problem?
> 

Hi Daniel,

It's a bit of an undocumented knob, but you can set:

USE_GLOBAL_JOB_PRIOS = true

in the configuration of both the negotiator and the schedds, IIRC.

You're not going to be very happy though - that protocol was designed to work with dozens of different priorities, not tens of thousands.  I bet it won't work well (would love to hear I'm wrong!).

I have no clever ideas on how to randomly order the schedd list.  I suppose one could argue that if the jobs have identical priority, then HTCondor can do whatever it wants.
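
To make the two orderings concrete, here is a toy Python model.  It is only a sketch of the behaviour described in this thread, not HTCondor's negotiator code: the function names, the schedd names, and the (priority, job id) tuples are all made up for illustration.

```python
import heapq
import random


def make_queues(n_schedds=3, jobs_per_schedd=4, seed=0):
    """Build hypothetical per-schedd queues of (job_prio, job_id) tuples."""
    rng = random.Random(seed)
    return {
        f"schedd{i}": [(rng.randint(0, 1_000_000), j) for j in range(jobs_per_schedd)]
        for i in range(n_schedds)
    }


def dispatch_per_schedd(queues):
    """Drain one schedd completely before moving to the next.

    JobPrio only orders jobs inside a single schedd's queue.
    """
    order = []
    for schedd, jobs in queues.items():
        for prio, job_id in sorted(jobs, reverse=True):
            order.append((schedd, job_id, prio))
    return order


def dispatch_global(queues):
    """Merge every queue and dispatch strictly by JobPrio across schedds."""
    heap = [
        (-prio, schedd, job_id)  # negate so the highest priority pops first
        for schedd, jobs in queues.items()
        for prio, job_id in jobs
    ]
    heapq.heapify(heap)
    order = []
    while heap:
        neg_prio, schedd, job_id = heapq.heappop(heap)
        order.append((schedd, job_id, -neg_prio))
    return order
```

Calling dispatch_per_schedd(make_queues()) yields all of one schedd's jobs before any of the next schedd's, which matches what Daniel observed; dispatch_global(make_queues()) interleaves schedds according to the random priorities.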

However, I also suspect the schedd ordering would be pretty easy to make configurable.

What do others think?

Brian