
Re: [HTCondor-users] question about accounting groups



On 23/03/21 12:04, Jeff Templon wrote:
Hi
Hello Jeff,


I am looking into setting up accounting and plotting on our condor setup. We've traditionally done this by unix groups, see this plot:

https://www.nikhef.nl/grid/stats/stbc/grisview-week

For how the plots look now under the torque batch system. On the top plot, right-hand side, is a list of unix user groups and how much of the system was used by each during the past 7 days. It also serves as the color-code legend for the plot, which is a stacked histogram of the number of jobs running for each of those unix groups at each sample point.

Condor does not have, AFAICT, this concept of accounting by unix groups - what I read is that the user needs to specify an accounting group (completely unrelated to unix groups). How can I have this automatically set to the unix group, except for cases where the user overrides it?

I went through steps similar to the ones you seem to be going through now. With HTCondor our current solution is to define a text mapfile filled with lines like this:

* <username> <group>,<group>

Example:

* pilatlas011 atlas,atlas

In our case "atlas" is the main gid for the pilatlas011 user.
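
A map file like this could be generated from the local account database. The following is only a minimal sketch, assuming the submit node can enumerate its users through the standard Unix passwd/group lookups; the helper names and the min_uid threshold are hypothetical and not part of our actual setup:

```python
import grp
import pwd


def map_line(username: str, group: str) -> str:
    """Format one map-file line as '* <username> <group>,<group>'."""
    return f"* {username} {group},{group}"


def build_map(min_uid: int = 1000) -> list[str]:
    """Return one map line per account, using its primary Unix group.

    min_uid is an assumed cut-off for skipping system accounts.
    """
    lines = []
    for entry in pwd.getpwall():
        if entry.pw_uid < min_uid:
            continue
        try:
            group = grp.getgrgid(entry.pw_gid).gr_name
        except KeyError:
            # The primary gid has no group entry; skip this account.
            continue
        lines.append(map_line(entry.pw_name, group))
    return lines


if __name__ == "__main__":
    print("\n".join(build_map()))
```

The output can be redirected to the file referenced by the schedd configuration.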

Then, in the configuration of the SCHEDD (aka Submit Node):


#[1]
use FEATURE:AssignAccountingGroup($(T1_SHARED_SCRIPT_DIR)/Hgroups.txt)

#[2]
JOB_TRANSFORM_NAMES = $(JOB_TRANSFORM_NAMES) SetAccountingGroup

JOB_TRANSFORM_SetAccountingGroup @=end
[ eval_set_AcctGroup=usermap("AssignAccountingGroup",AcctGroupUser,AcctGroup,defaultGroup); eval_set_AccountingGroup=join(".",usermap("AssignAccountingGroup",AcctGroupUser,AcctGroup,defaultGroup),AcctGroupUser); ]
@end


Notes:
[1] This should be all you need. [2] has been added here because of an unexpected behaviour of [1] in one particular case, explained further down.


Having set that, running jobs should have the following ClassAd attributes set:

AccountingGroup = "atlas.atlasprd011"
AcctGroup = "atlas"
AcctGroupUser = "atlasprd011"

If a user defines his own AcctGroup in the submit file, this should be moved to "RequestedAcctGroup"
and AcctGroup should be set by [1].
We set [2] because in this particular case the AcctGroup remains unexpectedly unset.
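
To illustrate what the transform in [2] computes, here is a small Python model of the lookup. It is a sketch based on my understanding of the documented usermap() behaviour (return the requested group if it is in the user's mapped list, otherwise the first mapped group, otherwise the default); the function names are hypothetical and this is not HTCondor code:

```python
def resolve_group(user_map, user, preferred=None, default=None):
    """Model of usermap(map, user, preferred, default) as assumed above."""
    groups = user_map.get(user, [])
    if preferred is not None and preferred in groups:
        return preferred
    if groups:
        return groups[0]
    return default


def accounting_group(user_map, user, preferred=None, default=None):
    """Model of join(".", <resolved group>, <user>)."""
    group = resolve_group(user_map, user, preferred, default)
    if group is None:
        # Simplification: no mapping and no default means no value here.
        return None
    return f"{group}.{user}"


# Parsed form of the map line "* atlasprd011 atlas,atlas"
user_map = {"atlasprd011": ["atlas", "atlas"]}
print(accounting_group(user_map, "atlasprd011"))
print(accounting_group(user_map, "atlasprd011", preferred="cms"))
```

Under this model a user who asks for a group that is not in their mapped list still ends up in a group from the map, so users cannot pick an arbitrary group.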

####

With this in place you can configure fairshare; mine is set as follows:

In the Central Manager:

PRIORITY_HALFLIFE = 26000
# Accept surplus and regroup
GROUP_ACCEPT_SURPLUS = true
#GROUP_AUTOREGROUP = false
DEFAULT_PRIO_FACTOR = 100000.0

include ifexist : /usr/share/htc/prod/conf/htc_shares.conf

htc_shares.conf is script-generated and contains:

GROUP_NAMES = \
    atlas, \
    alice, \
    belle, \
[...]
    lhcb


GROUP_QUOTA_DYNAMIC_belle = 0.041403
[...]
GROUP_QUOTA_DYNAMIC_cms = 0.154328

The values sum to 1.0, and each number is the "share" for that group.
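
A generator for such a file could look roughly like the sketch below. This is not our actual script: the function name and the input weights (for example, pledged resources per group) are hypothetical, and the only thing it demonstrates is normalising the weights so that the dynamic quotas sum to 1.0:

```python
def render_shares(weights: dict[str, float]) -> str:
    """Render GROUP_NAMES and GROUP_QUOTA_DYNAMIC_* from raw weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    names = sorted(weights)
    lines = ["GROUP_NAMES = \\"]
    for i, name in enumerate(names):
        # Every entry except the last gets a comma and a continuation.
        suffix = ", \\" if i < len(names) - 1 else ""
        lines.append(f"    {name}{suffix}")
    lines.append("")
    for name in names:
        share = weights[name] / total
        lines.append(f"GROUP_QUOTA_DYNAMIC_{name} = {share:.6f}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up weights, for illustration only.
    print(render_shares({"atlas": 3.0, "cms": 1.0}))
```

The result would be written to the file pulled in by the "include ifexist" line on the Central Manager.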

I hope these notes, plus a bit of "Read That Fantastic Manual", will help.
Stefano




Also : we ultimately need to consider doing fair share on these same unix groups - are the numbers going into the fair share calculations the same set going into accounting? I would like to avoid setting up parallel infrastructures for things that are identical.

Also : does the user have complete freedom to put any group they want? I hope not; I would not want to have to police the system. Not all groups have the same allocation here, and users are quite opportunistic when they've found shortcuts to getting their jobs running.

Thanks,

JT