Re: [HTCondor-users] question about accounting groups




Jeff:

The accounting information is stored in the "AccountantNew.log" file, which is maintained by the condor_negotiator. This file is written to in a transaction-log style, where all changes to the state are appended to the file, and periodically the file is rewritten with the current state.
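To make "transaction-log style" concrete, here is a minimal Python sketch of the general pattern: every change is appended to a file, and the file is occasionally rewritten so that it holds only the current state. This is an illustration only. The class name, the JSON line format and the `rewrite_every` threshold are all assumptions made for the example and say nothing about the actual on-disk format of AccountantNew.log.

```python
import json
import os


class TransactionLog:
    """Toy key/value store persisted as an append-only log with periodic rewrite."""

    def __init__(self, path, rewrite_every=100):
        self.path = path
        self.rewrite_every = rewrite_every  # hypothetical compaction threshold
        self.state = {}
        self.appended = 0
        self._replay()

    def _replay(self):
        # Rebuild the in-memory state by replaying every logged change in order.
        if not os.path.exists(self.path):
            return
        with open(self.path) as f:
            for line in f:
                key, value = json.loads(line)
                self.state[key] = value

    def set(self, key, value):
        # Append the change to the log first, then apply it in memory.
        with open(self.path, "a") as f:
            f.write(json.dumps([key, value]) + "\n")
        self.state[key] = value
        self.appended += 1
        if self.appended >= self.rewrite_every:
            self.rewrite()

    def rewrite(self):
        # Write one line per key holding its current value, then swap the file in.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            for key, value in self.state.items():
                f.write(json.dumps([key, value]) + "\n")
        os.replace(tmp, self.path)
        self.appended = 0
```

The point of the pattern is that a single file can serve as the long-term store: restarting means replaying the log, and the periodic rewrite keeps the file from growing without bound.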

-greg

On 4/28/21 7:52 AM, Jeff Templon wrote:

Hi Folks,

Coming back to this one, I think I "get it" so far. The next question is this: where does the accounting information get stored, long-term? I check our CM machine and the only file in SPOOL there is AccountantNew.log. The schedd machine does have history files. This would seem to go against what was said below, about "on the Central Manager".

JT

On 23 Mar 2021, at 21:43, Todd Tannenbaum wrote:

On 3/23/2021 10:46 AM, Jeff Templon wrote:
Hi Stefano

Thanks for the hints! Do you understand why the Condor model seems to be to handle all the accounting etc on the submit node? The Central Manager is the thing that knows about the resources (and agrees to let the Schedd contact it) so it should be the thing handling the accounting, not the submit node.

Hi Jeff,

Hope time finds you well.

The accounting group management is handled on the central manager. Note that in the config Stefano shared, everything about naming the groups and the fair-share policy across the groups is prefaced with "In the Central Manager ....".

What is handled on the submit node is "filling out the order form", specifically inserting an attribute into each job ad stating which group to "charge". The configuration Stefano shared below that is specific to the submit node was all about how to map an incoming job into a group (based upon a list that looks at the unix login of the user, and assigns a default group and also allowed groups for that user). The submit node (schedd) is the component that needs to translate from the user namespace of the submission environment into the user/group namespace of the computing pool as a whole; after all, in HTCondor each schedd in a pool may live in a different organization and thus not share passwd files....

It's like asking the customers to keep track of their account balance - "please upload your balance to the bank at the end of the month".


I think a more accurate analogy would be asking the customers to fill out a form whenever they deposit money into the bank stating which account to use. The bank teller (schedd) will look at the form to ensure the account requested by the customer is correct, i.e. one they are allowed to use, but there is no need to put balances on this form....

Hope this helps
Todd


On 23 Mar 2021, at 12:39, Stefano Dal Pra wrote:

On 23/03/21 12:04, Jeff Templon wrote:
Hi
Hello Jeff,


I am looking into setting up accounting and plotting on our condor setup. We've traditionally done this by unix groups, see this plot:

https://www.nikhef.nl/grid/stats/stbc/grisview-week

For how the plots look now under the torque batch system - on the top plot, right hand side, is a list of unix user groups and how much of the system was used by each during the past 7 days, giving also the color code legend for the plot, which is a stacked histogram of the number of jobs running by each of those unix groups at each sample point.

Condor does not have, AFAICT, this concept of accounting by unix groups - what I read is that the user needs to specify an accounting group (completely unrelated to unix groups). How can I have this automatically set to the unix group, except for cases where the user overrides it?

I went through similar steps as you seem to be going now. With HTCondor our current solution is to define a text mapfile filled with lines like this:

* <username> <group>,<group>

Example:

* pilatlas011 atlas,atlas

In our case "atlas" is the main gid for the pilatlas011 user.
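As an illustration of the mapfile layout above, here is a small Python sketch that reads such lines into a dictionary. It assumes three whitespace-separated fields per line with a literal `*` first, and it skips blank lines and `#` comments; the function name `parse_group_map` and the comment handling are assumptions for the example, not something HTCondor provides.

```python
def parse_group_map(text):
    """Return {username: [group, ...]} from lines like '* user g1,g2'."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumption: blank lines and '#' comments are ignored
        fields = line.split()
        if len(fields) != 3 or fields[0] != "*":
            raise ValueError(f"unexpected mapfile line: {raw!r}")
        _, user, groups = fields
        # Keep the groups in file order; the first entry is listed first.
        mapping[user] = [g for g in groups.split(",") if g]
    return mapping


example = "* pilatlas011 atlas,atlas\n"
print(parse_group_map(example))  # {'pilatlas011': ['atlas', 'atlas']}
```

A script like this can be handy for sanity-checking a generated mapfile before it is distributed to the submit nodes.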

Then, in the configuration of the SCHEDD (aka Submit Node)


#[1]
use FEATURE:AssignAccountingGroup($(T1_SHARED_SCRIPT_DIR)/Hgroups.txt)

#[2]
JOB_TRANSFORM_NAMES = $(JOB_TRANSFORM_NAMES) SetAccountingGroup

JOB_TRANSFORM_SetAccountingGroup @=end
[ eval_set_AcctGroup=usermap("AssignAccountingGroup",AcctGroupUser,AcctGroup,defaultGroup);
eval_set_AccountingGroup=join(".",usermap("AssignAccountingGroup",AcctGroupUser,AcctGroup,defaultGroup),AcctGroupUser); ]
@end


Notes:
[1] This should be all you need. [2] has been added here because of an unexpected behaviour of [1] in some particular cases, described below.


Having set that, running jobs should have the following Classad set:

AccountingGroup = "atlas.atlasprd011"
AcctGroup = "atlas"
AcctGroupUser = "atlasprd011"

If a user defines his own AcctGroup in the submit file, this should be moved to "RequestedAcctGroup"
and AcctGroup should be set by [1].
We set [2] because in this particular case the AcctGroup remains unexpectedly unset.
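The selection logic can be pictured with the following Python sketch. It assumes the behaviour I understand the usermap() call in [2] to have: use the requested group if it is in the user's list, otherwise the first group in the list, otherwise a fallback. The function names and the "other" fallback value are hypothetical and only serve the illustration; this is not HTCondor code.

```python
def pick_group(group_map, user, requested=None, default="other"):
    """Choose the group to charge for this user ('other' is a made-up fallback)."""
    allowed = group_map.get(user)
    if not allowed:
        return default          # user is not in the map at all
    if requested in allowed:
        return requested        # the user asked for a group they may use
    return allowed[0]           # otherwise fall back to the first listed group


def accounting_attrs(group_map, user, requested=None):
    """Build the three job attributes shown above for one job."""
    group = pick_group(group_map, user, requested)
    return {
        "AcctGroup": group,
        "AcctGroupUser": user,
        "AccountingGroup": f"{group}.{user}",  # same shape as join(".", group, user)
    }


group_map = {"atlasprd011": ["atlas"]}
print(accounting_attrs(group_map, "atlasprd011"))
print(accounting_attrs(group_map, "atlasprd011", requested="cms"))
```

In both calls the job ends up charged to "atlas.atlasprd011": a request for a group outside the user's list is simply not honoured, which is what keeps users from picking an arbitrary group.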

####

With this in place you can configure fairshare; mine is set as follows:

In the Central Manager:

PRIORITY_HALFLIFE = 26000
# Accept surplus and regroup
GROUP_ACCEPT_SURPLUS = true
#GROUP_AUTOREGROUP = false
DEFAULT_PRIO_FACTOR = 100000.0

include ifexist : /usr/share/htc/prod/conf/htc_shares.conf

htc_shares.conf is script-generated and contains:

GROUP_NAMES = \
     atlas, \
     alice, \
     belle, \
[...]
     lhcb


GROUP_QUOTA_DYNAMIC_belle = 0.041403
[...]
GROUP_QUOTA_DYNAMIC_cms = 0.154328

The total sum yields 1.0 and these numbers are the "share" for each group.
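A generator script for such a file could look roughly like the Python sketch below. It takes per-group weights, normalises them so the shares sum to 1.0, and prints the corresponding lines. The weights in the example are invented, and the function names are hypothetical; only the GROUP_NAMES and GROUP_QUOTA_DYNAMIC_<group> names come from the config above.

```python
def dynamic_quotas(weights):
    """Normalise raw per-group weights into shares that sum to 1.0."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return {name: w / total for name, w in weights.items()}


def render_conf(quotas):
    """Render the shares as config lines, one GROUP_QUOTA_DYNAMIC_<group> per group."""
    names = sorted(quotas)
    lines = ["GROUP_NAMES = " + ", ".join(names)]
    lines += [f"GROUP_QUOTA_DYNAMIC_{n} = {quotas[n]:.6f}" for n in names]
    return "\n".join(lines)


# Invented weights, purely for illustration.
quotas = dynamic_quotas({"atlas": 3, "cms": 1})
assert abs(sum(quotas.values()) - 1.0) < 1e-9
print(render_conf(quotas))
```

Checking the sum inside the script catches a missing or mistyped group before the file reaches the central manager.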

Hope these notes and a bit of "Read That Fantastic Manual" help
Stefano




Also: we ultimately need to consider doing fair share on these same unix groups - are the numbers going into the fair share calculations the same set going into accounting? I would like to avoid setting up parallel infrastructures for things that are identical.

Also: does the user have complete freedom to put any group they want? I hope not; I would not want to have to police the system. Not all groups have the same allocation here, and users are quite opportunistic when they've found shortcuts to getting their jobs running.

Thanks,

JT
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/


-- 
Todd Tannenbaum <tannenba@xxxxxxxxxxx>  University of Wisconsin-Madison
Center for High Throughput Computing    Department of Computer Sciences
Calendar: https://tinyurl.com/yd55mtgd  1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132                   Madison, WI 53706-1685 