
Re: [HTCondor-users] Priority submission.



I believe quotas can be either percentages or quantities, although if you use both the math is pretty complicated, because the effective quotas are normalized based on the actual pool size.

So yes, .75 does say 3375 out of 4500. The important thing here is that any job that isn't in a group that has a quota gets to share the remainder, so all non-high jobs have a 'quota' of (4500-3375) = 1125.
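
As a quick check of that arithmetic, here is a minimal Python sketch. It is toy arithmetic only, not HTCondor code; the variable names are made up for illustration, and the pool size and fraction are the ones from this thread.

```python
# Toy arithmetic only; this is not how the negotiator is implemented.
# pool_slots and high_fraction are the numbers quoted in this thread.
pool_slots = 4500
high_fraction = 0.75  # the value given to GROUP_QUOTA_DYNAMIC_High

# Slots the High group is entitled to under a 75% dynamic quota.
high_quota = round(pool_slots * high_fraction)

# Whatever is left over is shared by jobs that are not in a quota'd group.
remainder = pool_slots - high_quota

print(high_quota, remainder)
```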

This "high priority" group gets first crack at machines as long as it is furthest from filling its quota, which we assume will always be true because the quota exceeds the expected number of high priority jobs. (It does, right?)
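
To illustrate "furthest from filling its quota", here is a rough Python sketch that orders groups by the fraction of their quota currently in use. This is a simplified model and an assumption about the ordering, not the negotiator's actual code; the group name "other" and the in_use numbers are hypothetical.

```python
def starvation_order(groups):
    """Return group names, most starved first (smallest in_use / quota ratio)."""
    return sorted(groups, key=lambda name: groups[name]["in_use"] / groups[name]["quota"])


# Quotas come from the 75% example in this thread; the in_use values are
# invented purely for illustration, as is the group name "other".
groups = {
    "other": {"quota": 1125, "in_use": 900},
    "High": {"quota": 3375, "in_use": 200},
}

order = starvation_order(groups)
print(order)
```

As long as the High group uses a smaller fraction of its quota than everyone else, it sorts to the front in this model.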

Then, if you have 

   GROUP_AUTOREGROUP = false

then the 75% quota acts as a reservation, leaving slots idle so that high priority jobs can start right away.

If you have 

   GROUP_AUTOREGROUP = true

then once all of the high priority jobs have been matched, the negotiator will distribute the leftovers to the remaining groups and then do another matchmaking pass.
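
The difference between the two settings can be sketched with a toy model in Python. This is only an illustration under simplifying assumptions (two groups, high priority demand below its quota, one leftover pass); the function name and the demand numbers are hypothetical, and the real negotiator does considerably more.

```python
def match(pool, high_quota, high_demand, other_demand, autoregroup):
    """Toy model of one negotiation cycle with a High group and everyone else."""
    # First pass: each side is capped by its own quota.
    high = min(high_demand, high_quota)
    other = min(other_demand, pool - high_quota)
    if autoregroup:
        # Second pass: slots nobody claimed are handed to the remaining jobs.
        leftover = pool - high - other
        other += min(leftover, other_demand - other)
    return {"high": high, "other": other, "idle": pool - high - other}


# 4500 slots and a 3375 quota are from this thread; the demands are made up.
reserved = match(4500, 3375, 1000, 5000, autoregroup=False)
shared = match(4500, 3375, 1000, 5000, autoregroup=True)
print(reserved)
print(shared)
```

With autoregroup off, the unused part of the High quota stays idle in this model; with it on, the other jobs soak up the leftovers.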


-tj

-----Original Message-----
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Amy Bush
Sent: Wednesday, February 28, 2018 3:59 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Priority submission.

Okay. I've got that set up, but I have another question, as I try to
figure out if it's working.

You mentioned the variable "GROUP_QUOTA_DYNAMIC", and setting it to a
high number. But that particular variable is actually a percentage,
right?

Now, if I set it to, say, 0.75, what exactly am I saying? Am I saying
"we have 4500 condor nodes available, and anyone in the High accounting
group should always have access to 3375 of those nodes"?

Or am I completely misunderstanding? I know I'm asking for a thing
("priority") that condor doesn't really do, but it seems like I should
be able to puzzle out a way to make this work.


On Tue, Feb 27, 2018 at 06:50:47PM +0000, John M Knoeller wrote:
> In that case, I recommend having the high-priority schedd set (or force) an accounting_group into the jobs, either by using SUBMIT_ATTRS (for pre-8.6) or a JOB_TRANSFORM (8.6 and later).
> 
> === pre 8.6 config
>    SUBMIT_ATTRS = $(SUBMIT_ATTRS) AcctGroup
>    AcctGroup = "High"
> 
> 
> === 8.6 config
>    JOB_TRANSFORM_NAMES = $(JOB_TRANSFORM_NAMES) SetHighGroup
>    JOB_TRANSFORM_SetHighGroup @=END
>        [
>        requirements = AcctGroup is undefined;
>        copy_Owner="AcctGroupUser";
>        set_AcctGroup = "High";
>        ]
>     @END
> 
> Then in the negotiator, configure the High group with a large quota relative to the size of your pool
> 
> GROUP_NAMES = $(GROUP_NAMES) High
> GROUP_QUOTA_DYNAMIC_High = 10000
> 
> There are lots of variables here, but this should get you started.   
> 
> -tj
> 
> -----Original Message-----
> From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Amy Bush
> Sent: Tuesday, February 27, 2018 12:14 PM
> To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
> Subject: Re: [HTCondor-users] Priority submission.
> 
> Thanks so much for your reply, John. If I'm going to do something, I'd
> definitely prefer to do it Right. I've never done anything with
> accounting groups and quotas, so hopefully google can find me an example
> or two to point me in the correct direction.
> 
> Off to break things!
> 
> On Tue, Feb 27, 2018 at 03:50:07PM +0000, John M Knoeller wrote:
> > A high priority submit machine is not really a thing in HTCondor, although you can use various configuration knobs to approximate it.   But in general the negotiator sorts users by user priority, then takes the highest priority user and tries to match its highest priority job.  So the best way to get a "high priority submission" is to have a high priority user.   This has nothing to do with the RANK expression in either the startd or the job.
> > 
> > I think you would have better success having the high priority submit machine put all of the jobs submitted there into an accounting group and then configuring that accounting group with a very large quota. Accounting groups are considered in starvation order by the negotiator.
> > 
> > Regarding your existing configuration.
> > 
> > By default, the name of the submit machine isn't even available in the context of matchmaking, so you can't RANK on it.  This
> > 
> >    RANK =  ((TARGET.Machine =?= "azog.cs.utexas.edu") * 20)
> > 
> > Will have no effect because Machine is not a job attribute, so this is the same as 
> > 
> >    RANK =  ((undefined =?= "azog.cs.utexas.edu") * 20)
> > 
> > Now you could make it a job attribute by adding this to the configuration of the submit node
> > 
> > SUBMIT_ATTRS = $(SUBMIT_ATTRS) Machine
> > Machine = "$(FULL_HOSTNAME)"
> > 
> > The double quotes are required here, otherwise the result will parse ok, but still evaluate to undefined. 
> > 
> > But then nothing prevents any user on any submit machine from adding this to their submit file
> > 
> >    +Machine = "azog.cs.utexas.edu"
> > 
> > Which would give them the same Rank boost, and in any case, boosting how a job is RANK'ed by the startd doesn't have any effect unless NEGOTIATOR_CONSIDER_PREEMPTION is true. 
> > 
> > The intent of the RANK expression in the startd is to let the negotiator know when it should hand out *preempting* matches because there is an idle job in the queue that the startd should be running INSTEAD of the one it is currently running.
> > 
> > -tj
> > 
> > -----Original Message-----
> > From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Amy Bush
> > Sent: Monday, February 26, 2018 8:35 AM
> > To: htcondor-users@xxxxxxxxxxx
> > Subject: [HTCondor-users] Priority submission.
> > 
> > I fear that maybe I'm just being dense, but maybe you guys can help me.
> > 
> > We have a machine (azog) that is intended to be a priority submit node.
> > Submit jobs from there, and your jobs have higher priority. It used to
> > work. I swear it used to work. Now it doesn't appear to.
> > 
> > Formerly the higher priority jobs would be more likely to run and more
> > likely to preempt other jobs. When I noticed a lot of priority jobs
> > sitting idle, I began to investigate. Doing a condor_q -l of one of the
> > priority jobs, the Rank is 0.
> > 
> > Here's a snippet of what I have in my condor_config file:
> > 
> > RANK =  10 \
> >       + ((TARGET.Group =?= "PRIORITY") * 3)   \
> >       + ((TARGET.Group =?= "PROF") * 3)     \
> >       + ((TARGET.Group =?= "GRAD") * 3)     \
> >       + ((TARGET.Group =?= "UNDER") * 3)    \
> >       + ((TARGET.Machine =?= "azog.cs.utexas.edu") * 20)
> > 
> > That's a slightly modified version of what used to be in there. I added the 10 at the beginning to see if that impacted the rank reported by condor_q, but it still reports 0.
> > 
> > The config file for azog has this:
> > 
> > RANK_FACTOR     = 100000
> > RANK    = (($(RANK_FACTOR)) + $(RANK))
> > 
> > 
> > Which is identical to what all the other, non-priority condor nodes have. So the work should be done in the RANK definition in my main condor_config. But it isn't.
> > 
> > So... does anyone have any ideas? How can I test it? Is this a rigorous enough test to prove that it isn't working, or am I completely misunderstanding how Rank works?
> > 
> > azog 08:33:14$ condor_q -l 53343 | grep ^Rank
> > Rank = 0.0
> > 
> > If it makes any difference, I also set NEGOTIATOR_PRE_JOB_RANK and NEGOTIATOR_POST_JOB_RANK, initially to 0, and then to 11 and 12 respectively, in case it impacted RANK at all, so I'd know which one was impacting it. (Neither did.)
> > 
> > Any help or ideas would be desperately appreciated.
> > 
> > --
> > amy 
> > 
> > 
> > _______________________________________________
> > HTCondor-users mailing list
> > To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> > subject: Unsubscribe
> > You can also unsubscribe by visiting
> > https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
> > 
> > The archives can be found at:
> > https://lists.cs.wisc.edu/archive/htcondor-users/
> > 