
Re: [HTCondor-users] condor_userprio



Hi Thomas,

The accounting groups and user priorities control different steps of the allocation process. The groups do not have priorities and will always receive their requested allocation up to their GROUP_QUOTA. If you have GROUP_ACCEPT_SURPLUS = True, any leftover allocation is redistributed in proportion to the original quotas. This happens first among sub-groups and then moves up the hierarchy. It lets a group be allocated more than its quota if other groups are not using theirs. The setting can also be applied per group, so you can hard-cap individual groups at their quotas if needed.
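
To make the quota-plus-surplus step concrete, here is a rough Python sketch. It is not HTCondor's actual code: it assumes a flat list of groups (no sub-group hierarchy) and fractional slot counts, and the function and argument names are made up for illustration.

```python
def allocate_with_surplus(quotas, demands, accept_surplus):
    """Illustrative only. quotas/demands map group -> slots,
    accept_surplus maps group -> bool (a per-group surplus switch).
    Assumes every quota is positive."""
    # Step 1: each group gets what it requests, capped at its quota.
    alloc = {g: float(min(demands[g], quotas[g])) for g in quotas}
    surplus = sum(quotas.values()) - sum(alloc.values())

    # Step 2: hand leftover slots to groups that accept surplus and still
    # have unmet demand, in proportion to their original quotas.
    while surplus > 1e-9:
        hungry = [g for g in quotas
                  if accept_surplus[g] and demands[g] - alloc[g] > 1e-9]
        if not hungry:
            break  # nobody can take more; the rest stays unused
        total_quota = sum(quotas[g] for g in hungry)
        given = 0.0
        for g in hungry:
            share = surplus * quotas[g] / total_quota
            extra = min(share, demands[g] - alloc[g])
            alloc[g] += extra
            given += extra
        surplus -= given
    return alloc


quotas = {"a": 60, "b": 30, "c": 10}
demands = {"a": 20, "b": 100, "c": 100}
# Group "c" is hard-capped at its quota; "b" may take what "a" leaves idle.
print(allocate_with_surplus(quotas, demands, {"a": True, "b": True, "c": False}))
```

With "c" capped, all 40 idle slots from "a" go to "b". If "c" also accepted surplus, the 40 slots would be split 30/10 between "b" and "c", following their quotas.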

If you have any users/submitters that don't belong to a group, they get assigned to the base level '<none>' group.

After all of the groups have received an allocation they are ordered by GROUP_SORT_EXPR. The default 'starvation' ordering puts groups using the smallest percentage of their quota first, and the <none> group last.
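
As a simplified picture of that ordering (the real GROUP_SORT_EXPR is a ClassAd expression evaluated by the negotiator; the Python below only mimics the behaviour described here, and the group names and numbers are invented):

```python
def starvation_order(usage, quotas):
    """Illustrative only: order groups by the fraction of quota in use,
    smallest first, with the '<none>' group always last."""
    def key(group):
        if group == "<none>":
            return (1, 0.0)  # sorts after every real group
        return (0, usage[group] / quotas[group])
    return sorted(usage, key=key)


usage = {"<none>": 0, "atlas": 50, "belle": 5, "lhcb": 30}
quotas = {"atlas": 100, "belle": 50, "lhcb": 40}
print(starvation_order(usage, quotas))
# ['belle', 'atlas', 'lhcb', '<none>']
```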

As the negotiator iterates through the groups, it negotiates with each submitter/user in the current group, ordered by priority (lower first). Each user receives a portion of their group's allocation based on their priority, with smaller priority values receiving a larger portion. If you only have one user in a group this step is very simple, because that user gets the group's entire allocation and their user priority is irrelevant.
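
A small sketch of that per-group split, assuming each user's share is proportional to 1/priority (my reading of "smaller priorities get more", not code taken from HTCondor; names and values are hypothetical):

```python
def split_group_allocation(group_slots, priorities):
    """Illustrative only: split a group's slots among its users.
    Assumes share ~ 1 / priority. Returns (user, slots) pairs in
    negotiation order, lowest priority value first."""
    weights = {user: 1.0 / prio for user, prio in priorities.items()}
    total = sum(weights.values())
    ordered = sorted(priorities, key=priorities.get)
    return [(user, group_slots * weights[user] / total) for user in ordered]


# Two users: alice's priority value is half of bob's, so she gets twice the slots.
print(split_group_allocation(300, {"bob": 1000.0, "alice": 500.0}))

# A single user gets the whole group allocation regardless of priority.
print(split_group_allocation(300, {"carol": 123456.0}))
```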

So how is the user priority calculated?

The priority reported by condor_userprio is the Effective Priority, which is just the Real User Priority * Priority Factor. The priority factor is a constant that can be set for each user; the default is 1000. The Real User Priority (RUP) starts at 0.5 and halves the distance between its current value and that user's current resource usage every PRIORITY_HALFLIFE seconds (set on the central manager). If a user does not use any resources for long enough, their RUP ends up at 0.5 again.
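
In Python, the decay described above might look like the sketch below. The 0.5 floor and the factor of 1000 come from the description here; the 86400-second half-life is the documented default for PRIORITY_HALFLIFE, and the function names are made up for illustration.

```python
def update_rup(rup, usage, dt, halflife=86400.0):
    """Illustrative only: move the Real User Priority toward current usage.
    After one half-life, the gap between rup and usage is cut in half.
    Assumes the RUP never drops below 0.5."""
    beta = 0.5 ** (dt / halflife)
    return max(0.5, beta * rup + (1.0 - beta) * usage)


def effective_priority(rup, factor=1000.0):
    """Effective Priority = Real User Priority * Priority Factor."""
    return rup * factor


rup = 0.5                              # fresh user
rup = update_rup(rup, 100, 86400)      # one half-life at 100 cores
print(rup, effective_priority(rup))    # 50.25 50250.0

for _ in range(20):                    # then idle for 20 half-lives
    rup = update_rup(rup, 0, 86400)
print(rup)                             # 0.5
```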

If you want certain users to get a larger share of their group's allocation, you should lower those users' priority factors. For example, a user with a priority factor of 500 will be able to use twice the resources of a user with the default factor of 1000. This isn't a hard cap; it just means the two users' priorities will equalize at that point.

GROUP_AUTOREGROUP makes things more complicated because the second time it negotiates with the submitters they're all in the <none> group. Having them all in one group means their priority is now affecting the resource distribution between groups. I'll assume you have this disabled for now (set to False).

Hopefully this helps,
Collin

On Tue, Oct 2, 2018 at 3:22 PM Thomas Hartmann <thomas.hartmann@xxxxxxx> wrote:
Hi Alessandra et al.,

I am also trying to get an understanding of how Condor calculates user
prios over time.

Thing is, that we have two large user(s)/group(s) and a few smaller
ones, of which three are noticeable (a bit complicated by the fact that
some of these have either pure multicore jobs or a varying mix of single
and multicores).

The two larger ones have GROUP_QUOTA_DYNAMIC of about the same value --
and the three smaller ones have each about ~1/10th of the larger users
shares.

So, I have been tracking userprios over time to compare to the number of
allocated resources aka cores.

Thing is, that for example user 'LHCb' started in the middle of last week to
submit significant jobs. As they had been dormant for some time, I would
understand some higher allocation during the first few days.
However, Condor still allocates ~20% of the resources to the group,
while its nominal share should be <6%, and the group prio is still below
that of the two dominant users/groups (ignoring for the moment subgroups and
job run times).

User 'Belle' is supposed to get ~10% of the resources, i.e., ~1.5k cores.
While they are submitting somewhat constantly, their share moves
between ~1.5k and up to ~3k of used cores.

Tuning the GROUP_QUOTA_DYNAMIC values seems to have no significant impact
on the short/mid term prios AFAIS, so I wonder how the nominal shares
can be enforced a bit more strictly?

Cheers,
 Thomas

--
Collin Mehring | PE-JoSE - Software Engineer