
Re: [HTCondor-users] Groups and SubGroups issue



On 10/12/2020 9:52 AM, Mihai Ciubancan wrote:
...also, do you have any suggestions for OPS?

Mihai

Hi Mihai,

Just a random guess regarding your troubles with OPS:

From the configuration below, it looks like you are injecting the group attributes into the job ad using SUBMIT_ATTRS. This config knob is read by clients such as condor_submit and the Python job submission APIs; it is not used by the condor_schedd.

Perhaps whatever is submitting your OPS jobs (script? human?) is doing so from a server that is different from the server where your schedd is running, and thus may have a condor_config that does not include your SUBMIT_ATTRS customization?

If you want the attributes to appear in every job entering a condor_schedd, regardless of the client's configuration, I suggest you use a schedd job transform to instruct the schedd itself to insert the attributes; see:
   https://htcondor.readthedocs.io/en/latest/admin-manual/policy-configuration.html#job-transforms

regards
Todd



      
The problem is that

                   ifThenElse(regexp("lhcb01",Owner), "lhcb",

matches the last 6 characters of pillhcb01, so that branch wins before the
pillhcb01 test is ever reached.

You need to put the longer test first, or change your regex to use a
match that is anchored at the start of the string, like "^lhcb01".

alice01 and pilalice01 have the same problem.

-tj
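
The substring match described above can be reproduced outside HTCondor. The following is a minimal sketch, assuming that Python's re.search behaves like the ClassAd regexp() function for these literal patterns (both search anywhere in the string unless the pattern is anchored). The rule lists and the first_match helper are hypothetical names used only for illustration; they mirror the order of the LHCb and ALICE tests in the original ifThenElse chain.

```python
import re

# Ordered (pattern, subgroup) pairs, in the same order as the LHCb and
# ALICE tests of the original ifThenElse chain.
UNANCHORED = [
    ("lhcb01", "lhcb"),
    ("pillhcb01", "pilotlhcb"),
    ("prdlhcb01", "prodlhcb"),
    ("alice01", "alice"),
    ("pilalice01", "pilotalice"),
]

# Same order, but every pattern is anchored at the start of the string.
ANCHORED = [("^" + pattern, subgroup) for pattern, subgroup in UNANCHORED]


def first_match(rules, owner):
    """Return the subgroup of the first rule whose pattern is found in owner."""
    for pattern, subgroup in rules:
        if re.search(pattern, owner):
            return subgroup
    return None


for owner in ("lhcb01", "pillhcb01", "prdlhcb01", "pilalice01"):
    print(owner, first_match(UNANCHORED, owner), first_match(ANCHORED, owner))
```

With the unanchored rules, pillhcb01 and prdlhcb01 both fall into "lhcb" and pilalice01 falls into "alice"; with the anchored rules each owner reaches its own subgroup without reordering the tests.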



-----Original Message-----
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Mihai Ciubancan
Sent: Monday, October 12, 2020 6:11 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] Groups and SubGroups issue

Hello,

I have configured a bunch of groups for my cluster, depending on users and
the number of cores used by their jobs:

RO07AcctGroup = ifThenElse(NordugridQueue =?= "atlas", "group_ATLAS", \
                ifThenElse(NordugridQueue =?= "lhcb",  "group_LHCB", \
                ifThenElse(NordugridQueue =?= "alice", "group_ALICE", \
                ifThenElse(NordugridQueue =?= "ops", "group_OPS" ))))

RO07AcctSubGroup = ifThenElse(regexp("atlas01",Owner) && RequestCpus >1, "atlas_multicore", \
                   ifThenElse(regexp("atlas01", Owner), "atlas", \
                   ifThenElse(regexp("lhcb01",Owner), "lhcb", \
                   ifThenElse(regexp("pillhcb01",Owner), "pilotlhcb", \
                   ifThenElse(regexp("prdlhcb01",Owner), "prodlhcb", \
                   ifThenElse(regexp("alice01",Owner), "alice", \
                   ifThenElse(regexp("pilalice01",Owner), "pilotalice", \
                   ifThenElse(regexp("ops01",Owner), "ops" ))))))))

AccountingGroup = strcat(RO07AcctGroup, ".", RO07AcctSubGroup, ".", Owner)
ConcurrencyLimits = strcat(RO07AcctGroup, ",", RO07AcctSubGroup, ",", Owner)
SUBMIT_ATTRS = $(SUBMIT_ATTRS), RO07AcctGroup, RO07AcctSubGroup, AccountingGroup, ConcurrencyLimits

For ATLAS and ALICE it is working properly (as far as I can see), but for
LHCb and OPS the mapping is wrong, as you can see from the output of the
"condor_status -submitter" command:

Name                         Machine            RunningJobs IdleJobs HeldJobs

alice01@xxxxxxxx             arc6atlas1.nipne.r          36        0        0
atlas01@xxxxxxxx             arc6atlas1.nipne.r           0        0        0
group_ALICE.alice.alice01@ni arc6atlas1.nipne.r         306        1        0
group_ATLAS.atlas.atlas01@ni arc6atlas1.nipne.r          32        7        0
group_ATLAS.atlas_multicore. arc6atlas1.nipne.r         305       79        1
group_LHCB.lhcb.lhcb01@nipne arc6atlas1.nipne.r           0        0        0
group_LHCB.lhcb.pillhcb01@ni arc6atlas1.nipne.r         712      129        0
lhcb01@xxxxxxxx              arc6atlas1.nipne.r           0        0        0
ops01@xxxxxxxx               arc6atlas1.nipne.r           1        0        0
pillhcb01@xxxxxxxx           arc6atlas1.nipne.r           0        0        0

                           RunningJobs           IdleJobs           HeldJobs

    alice01@xxxxxxxx                36                  0                  0
    atlas01@xxxxxxxx                 0                  0                  0
group_ALICE.alice.al               306                  1                  0
group_ATLAS.atlas.at                32                  7                  0
group_ATLAS.atlas_mu               305                 79                  1
group_LHCB.lhcb.lhcb                 0                  0                  0
group_LHCB.lhcb.pill               712                129                  0
     lhcb01@xxxxxxxx                 0                  0                  0
      ops01@xxxxxxxx                 1                  0                  0
  pillhcb01@xxxxxxxx                 0                  0                  0

               Total              1392                216                  1

So the jobs run by user pillhcb01 should be mapped under
group_LHCB.pilotlhcb and not group_LHCB.lhcb, while the ops jobs are running
under the <none> group instead of group_OPS.

Do you know what I'm doing wrong?

Regards,
Mihai



_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/


Dr. Mihai Ciubancan
IT Department
National Institute of Physics and Nuclear Engineering "Horia Hulubei"
Str. Reactorului no. 30, P.O. BOX MG-6
077125, Magurele - Bucharest, Romania
http://www.ifin.ro
Work:   +40214042360
Mobile: +40761345687
Fax:    +40214042395


-- 
Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
Center for High Throughput Computing   Department of Computer Sciences
HTCondor Technical Lead                1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132                  Madison, WI 53706-1685