
[HTCondor-users] Possibly a bug: subgroups do not work without surplus and autoregroup



Hello Experts,

I am exploring the use of accounting groups and accounting subgroups. I see odd behavior with subgroups: if I don't set GROUP_ACCEPT_SURPLUS or GROUP_AUTOREGROUP, jobs submitted under a subgroup never run. If I submit the job under the parent group "cdp", it runs without any issue. Is this expected behavior? I also tried setting GROUP_ACCEPT_SURPLUS explicitly to false, but no luck. If this is expected, does it mean we can't use subgroups without over-commitment?

I added this to my configuration file:

GROUP_NAMES = cdp, cdp.cdp1, cdp.cdp2, cdp.cdp3
GROUP_QUOTA_DYNAMIC_cdp = .5
GROUP_QUOTA_DYNAMIC_cdp.cdp1 = .3
GROUP_QUOTA_DYNAMIC_cdp.cdp2 = .3
GROUP_QUOTA_DYNAMIC_cdp.cdp3 = .3
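
As a rough sanity check of what these quotas should work out to, here is a minimal Python sketch. It assumes that a top-level group's dynamic quota is a fraction of the whole pool and that a subgroup's dynamic quota is a fraction of its parent's quota. The slot count of 18 is taken from the negotiator log further down. The helper `effective_quota` is hypothetical and only for illustration, it is not part of HTCondor.

```python
# Assumption: a top-level dynamic quota is a fraction of the pool, and a
# subgroup's dynamic quota is a fraction of its parent's quota.
TOTAL_WEIGHTED_SLOTS = 18  # "assigning group quotas from 18 available weighted slots"

DYNAMIC_QUOTAS = {
    "cdp": 0.5,
    "cdp.cdp1": 0.3,
    "cdp.cdp2": 0.3,
    "cdp.cdp3": 0.3,
}


def effective_quota(group, quotas, total_slots):
    """Multiply the fractions along the path from the top-level group to `group`."""
    slots = float(total_slots)
    parts = group.split(".")
    for depth in range(1, len(parts) + 1):
        slots *= quotas[".".join(parts[:depth])]
    return slots


for name in DYNAMIC_QUOTAS:
    quota = effective_quota(name, DYNAMIC_QUOTAS, TOTAL_WEIGHTED_SLOTS)
    print(f"{name}: {quota:.2f} weighted slots")
```

Under that assumption, cdp gets 9 weighted slots and each subgroup gets about 2.7, so a single one-slot job in cdp.cdp2 would be expected to fit within its own quota even without surplus sharing.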

After a reconfig, I submitted a job with the following line in the submit file:

Accounting_group = cdp.cdp2

The submitted jobs never ran. The negotiator was not able to make a match:

08/01/19 04:14:29 ---------- Started Negotiation Cycle ----------
08/01/19 04:14:29 Phase 1:  Obtaining ads from collector ...
08/01/19 04:14:29 Not considering preemption, therefore constraining idle machines with ifThenElse(State == "Claimed","Name State Activity StartdIpAddr AccountingGroup Owner RemoteUser Requirements SlotWeight ConcurrencyLimits","")
08/01/19 04:14:29   Getting startd private ads ...
08/01/19 04:14:29   Getting Scheduler, Submitter and Machine ads ...
08/01/19 04:14:29   Sorting 12 ads ...
08/01/19 04:14:29 Got ads: 12 public and 6 private
08/01/19 04:14:29 Public ads include 1 submitter, 6 startd
08/01/19 04:14:29 Phase 2:  Performing accounting ...
08/01/19 04:14:29 group quotas: assigning 1 submitters to accounting groups
08/01/19 04:14:29 group quotas: assigning group quotas from 18 available weighted slots
08/01/19 04:14:29 group quotas: allocation round 1
08/01/19 04:14:29 group quotas: groups= 5  requesting= 1  served= 1  unserved= 0  slots= 18  requested= 1  allocated= 1  surplus= 25.1  maxdelta= 9
08/01/19 04:14:29 group quotas: entering RR iteration n= 9
08/01/19 04:14:29 Group cdp - skipping, zero slots allocated
08/01/19 04:14:29 Group cdp.cdp1 - skipping, zero slots allocated
08/01/19 04:14:29 Group cdp.cdp2 - BEGIN NEGOTIATION
08/01/19 04:14:29 Phase 3:  Sorting submitter ads by priority ...
08/01/19 04:14:29 Phase 4.1:  Negotiating with schedds ...
08/01/19 04:14:29   Negotiating with cdp.cdp2.vaggarwal@xxxxxxxx at <xx.xx.xx.57:9618?addrs=xx.xx.xx.57-9618&noUDP&sock=9516_13b9_3>
08/01/19 04:14:29 0 seconds so far for this submitter
08/01/19 04:14:29 0 seconds so far for this schedd
08/01/19 04:14:29     Got NO_MORE_JOBS;  schedd has no more requests
08/01/19 04:14:29     Request 00149.00000: autocluster 34 (request count 1 of 1)
08/01/19 04:14:29       Rejected 149.0 cdp.cdp2.vaggarwal@xxxxxxxx <xx.xx.xx.57:9618?addrs=xx.xx.xx.57-9618&noUDP&sock=9516_13b9_3>: submitter limit exceeded
08/01/19 04:14:29     Got NO_MORE_JOBS;  schedd has no more requests
08/01/19 04:14:29  negotiateWithGroup resources used scheddAds length 1
08/01/19 04:14:29 Group cdp.cdp3 - skipping, zero slots allocated
08/01/19 04:14:29 Group <none> - skipping, zero slots allocated
08/01/19 04:14:29 Round 1 totals: allocated= 1  usage= 0
08/01/19 04:14:29 group quotas: allocation round 2
08/01/19 04:14:29 group quotas: groups= 5  requesting= 0  served= 0  unserved= 0  slots= 18  requested= 0  allocated= 0  surplus= 26.1  maxdelta= 9
08/01/19 04:14:29 group quotas: entering RR iteration n= 9
08/01/19 04:14:29 Group cdp - skipping, zero slots allocated
08/01/19 04:14:29 Group cdp.cdp1 - skipping, zero slots allocated
08/01/19 04:14:29 Group cdp.cdp2 - skipping, zero slots allocated
08/01/19 04:14:29 Group cdp.cdp3 - skipping, zero slots allocated
08/01/19 04:14:29 Group <none> - skipping, zero slots allocated
08/01/19 04:14:29 Round 2 totals: allocated= 0  usage= 0
08/01/19 04:14:29 ---------- Finished Negotiation Cycle ----------
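
To pull the rejection reason out of a longer negotiator log, something like the following Python sketch may help. It only assumes the "Rejected <job> <submitter> <address>: <reason>" line shape seen in the log above. The sample lines are copied from that log, and the `rejections` helper is a hypothetical name, not an HTCondor tool.

```python
import re

# Sample lines copied from the negotiator log above.
LOG = """\
08/01/19 04:14:29 Group cdp.cdp2 - BEGIN NEGOTIATION
08/01/19 04:14:29     Request 00149.00000: autocluster 34 (request count 1 of 1)
08/01/19 04:14:29       Rejected 149.0 cdp.cdp2.vaggarwal@xxxxxxxx <xx.xx.xx.57:9618?addrs=xx.xx.xx.57-9618&noUDP&sock=9516_13b9_3>: submitter limit exceeded
"""

# Assumed line shape: "Rejected <cluster.proc> <submitter> <sinful string>: <reason>"
REJECTED = re.compile(
    r"Rejected\s+(?P<job>\d+\.\d+)\s+(?P<submitter>\S+)\s+<[^>]*>:\s*(?P<reason>.+)$"
)


def rejections(log_text):
    """Return (job id, submitter, reason) for every 'Rejected' line."""
    found = []
    for line in log_text.splitlines():
        match = REJECTED.search(line)
        if match:
            found.append(
                (match.group("job"), match.group("submitter"), match.group("reason").strip())
            )
    return found


for job, submitter, reason in rejections(LOG):
    print(f"job {job} ({submitter}): {reason}")
```

For the cycle above, the only rejection it reports is job 149.0 with the reason "submitter limit exceeded".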


Working conf:

GROUP_NAMES = cdp, cdp.cdp1, cdp.cdp2, cdp.cdp3
GROUP_QUOTA_DYNAMIC_cdp = .5
GROUP_QUOTA_DYNAMIC_cdp.cdp1 = .3
GROUP_QUOTA_DYNAMIC_cdp.cdp2 = .3
GROUP_QUOTA_DYNAMIC_cdp.cdp3 = .3
GROUP_ACCEPT_SURPLUS_cdp.cdp1 = true
GROUP_ACCEPT_SURPLUS_cdp.cdp2 = true
GROUP_ACCEPT_SURPLUS_cdp.cdp3 = true


# condor_version
$CondorVersion: 8.6.13 Oct 30 2018 BuildID: 453497 $
$CondorPlatform: x86_64_RedHat6 $

Regards,
Vikrant