Re: [Condor-users] group <none> appeared after upgrade from 7.4.3 to 7.6.2

On 10/18/11 4:02 PM, Erik Erlandson wrote:
> If you search for the first "Matched" line what you'll see is that the
> jobs that were submitted without a group are now apparently in the group
> "<none>" and that group actually has a quota somehow (the group doesn't
> actually exist so it certainly doesn't have a quota).  Jobs for that
> "group" get run in front of the groups that haven't filled their quota.

Hi Joe,

Accounting groups were enhanced to support fully generalized
Hierarchical Accounting Groups (HGQ), as of 7.5.6:

https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=1393

There is always a root node in the accounting group hierarchy, whose
name is "<none>", and any job that does not map to some other accounting
group will be assigned to <none>.  This group always accepts any surplus
quota not used by other groups.

You may also be interested in:
https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=1926
(dev release only, not on current stable series 7.6)

Joe,

It sounds like in your case, the <none> group is being considered before most other groups because it is more "starved", meaning it is using a smaller fraction of its share of the pool compared to most other groups. If I understand the new group quota system correctly, group <none>'s share of the pool is determined by computing the share of the pool for all the other groups and counting what is left. In your pool there are 10216 slots. 5682 of these are being assigned to group <none>.
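
To make that arithmetic concrete, here is a rough Python sketch of how I understand it, not the negotiator's actual code. The function names and the two example groups (and their quotas) are made up for illustration; only the 10216 and 5682 figures come from your pool. The "starvation" measure is likewise a simplification of whatever ordering the negotiator really uses.

```python
def root_group_share(total_slots, other_group_quotas):
    """Slots left over for the root group "<none>" once every named
    group's quota has been subtracted from the pool size."""
    claimed = sum(other_group_quotas.values())
    return max(total_slots - claimed, 0)


def usage_fraction(slots_in_use, share):
    """Fraction of a group's share currently in use.
    A lower value means the group is more "starved"."""
    return slots_in_use / share if share else float("inf")


TOTAL_SLOTS = 10216  # pool size reported in this thread

# Hypothetical named groups; their quotas are invented so that
# they add up to 10216 - 5682 = 4534.
other_groups = {"group_a": 3000, "group_b": 1534}

none_share = root_group_share(TOTAL_SLOTS, other_groups)
print(none_share)

# Hypothetical usage: <none> runs few jobs relative to its large share,
# so it looks more starved than a group that is close to its quota.
print(usage_fraction(100, none_share) < usage_fraction(2900, 3000))
```

With a share that large, <none> only needs to be modestly under-used to sort ahead of groups that are near their quotas.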

One thing that can cause trouble is if you have special slots that are not available to all jobs. In this case, the size of the pool may be effectively overestimated. The result is that dynamic quotas are too big, and groups which are considered first may get too many slots, while groups that follow will starve. Either GROUP_DYNAMIC_MACH_CONSTRAINT or GROUP_QUOTA_ROUND_ROBIN_RATE can be used to attempt to work around this problem.
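
A toy Python illustration of the overestimation effect, under heavy simplification: all numbers, group names, and the `allocate` function are hypothetical, and the real negotiator does considerably more than this. It only shows why counting slots that jobs cannot actually use inflates the quotas and starves whichever group is considered later.

```python
def allocate(usable_slots, counted_slots, fractions):
    """Hand out slots group by group, in dictionary order.

    Each group's dynamic quota is its fraction of the *counted* pool,
    but only *usable* slots can actually be given out."""
    remaining = usable_slots
    granted = {}
    for group, fraction in fractions.items():
        quota = int(fraction * counted_slots)
        granted[group] = min(quota, remaining)
        remaining -= granted[group]
    return granted


fractions = {"first": 0.5, "second": 0.5}

# Pool counted as 1000 slots, but 200 are special slots these jobs
# cannot use: quotas are 500 each, and the second group comes up short.
print(allocate(usable_slots=800, counted_slots=1000, fractions=fractions))

# If only the 800 usable slots are counted, quotas are 400 each
# and both groups are served evenly.
print(allocate(usable_slots=800, counted_slots=800, fractions=fractions))
```

Restricting which machines are counted toward the pool size is the idea behind the second call; how that maps onto your particular configuration is something I have not checked.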

I haven't had a chance to consider your case carefully enough to make a specific recommendation. If you continue to have trouble, I recommend opening a help ticket with condor-admin@xxxxxxxxxxxx

--Dan