
Re: [HTCondor-users] Concurrency limit does not work on primary group



From Jaime Frey, in regard to a problem we were having a few weeks back.
Concurrency limit names are expected to only contain letters, digits,
and '.'. But nothing prevents a job from asking for limits that
contain other characters. If a job does so, then the negotiator starts
tracking the limit. It also includes the limits in its response to
condor_userprio. condor_userprio will fail in the way you've observed
if any of the limits include unexpected characters.
The underscore in 'group_NONLHC' may be what is breaking this.
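
As a rough illustration of the rule Jaime describes, the sketch below checks a comma-separated limit string against the allowed character set. The function name find_bad_limit_names and the regular expression are my own assumptions, built only from the description above; this is not HTCondor code.

```python
import re

# Assumption: a valid limit name contains only letters, digits, and '.',
# as described in the message above. Not taken from the HTCondor source.
VALID_LIMIT_NAME = re.compile(r"[A-Za-z0-9.]+")


def find_bad_limit_names(concurrency_limits):
    """Return the limit names that contain unexpected characters."""
    names = [name.strip() for name in concurrency_limits.split(",")]
    return [name for name in names if not VALID_LIMIT_NAME.fullmatch(name)]


# The value reported by condor_q in the original question.
print(find_bad_limit_names("group_NONLHC,none,ilc023"))
```

Run against the value from the original question, only 'group_NONLHC' is flagged, which is consistent with the primary group being the one limit that misbehaves.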

bob


On 7/23/2015 8:57 AM, Brian Bockelman wrote:
Hi Gang,

Try changing the concurrency limit to:

debug(strcat(AcctGroup, ",", AcctSubGroup, ",", Owner))

This will dump the expression evaluation information into the negotiator log.

Brian

On Jul 23, 2015, at 5:44 AM, Gang Qin <Gang.Qin@xxxxxxxxxxxxx> wrote:

Dear Condor experts:

  Our negotiator is running condor-8.2.2-265643.x86_64. In our current configuration we set ConcurrencyLimits = strcat(AcctGroup, ",", AcctSubGroup, ",", Owner), so a job's limits look like this:

svr011:~# condor_q -global -autoformat ConcurrencyLimits | grep group_NONLHC  | tail -n 1
group_NONLHC,none,ilc023

'group_NONLHC' is the AcctGroup, which is the primary group.
'none' is the AcctSubGroup, a subgroup of 'group_NONLHC'.
'ilc023' is an Owner in the subgroup 'none'.
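
To make the mapping explicit, here is a minimal sketch of how that string breaks into three separate limits and which configuration knob would cap each one. The helper name limit_knobs is hypothetical, and the upper-cased NAME_LIMIT pattern is an assumption extrapolated from the ILC023_LIMIT and NONE_LIMIT settings used in this message.

```python
def limit_knobs(concurrency_limits):
    """Split a ConcurrencyLimits string and pair each name with its knob.

    Assumption: the knob for a limit is '<NAME>_LIMIT', written in upper
    case to match the ILC023_LIMIT and NONE_LIMIT settings in this thread.
    """
    names = [n.strip() for n in concurrency_limits.split(",") if n.strip()]
    return [(name, name.upper() + "_LIMIT") for name in names]


for name, knob in limit_knobs("group_NONLHC,none,ilc023"):
    print(name, "->", knob)
```

Under that assumption, the knob for the primary group would be GROUP_NONLHC_LIMIT.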


Currently we have 453 ilc023 jobs running and 420 ilc023 jobs idle in the queue:
svr019:~# condor_ls --vo
Owner            Idle  Running  Held  Suspended Completed Removed Transferring_output
ilc023         420     453     4     0     0     0     0

  When I add the line 'ILC023_LIMIT = 400' and reconfigure the negotiator, I see this in NegotiatorLog:

     Rejected 183823.0 group_NONLHC.none.ilc023@xxxxxxxxxxxxxxx <10.141.255.10:60599>: concurrency limit none reached

  When I add the line 'NONE_LIMIT = 400' and reconfigure the negotiator, I see this in NegotiatorLog:

     Rejected 186612.0 group_NONLHC.none.ilc023@xxxxxxxxxxxxxxx <10.141.255.10:60599>: concurrency limit none reached

  But when I add the line 'GROUP_NONLHC = 400' and reconfigure the negotiator, all I see in NegotiatorLog is:

     Rejected 79352.0 group_NONLHC.none.ilc023@xxxxxxxxxxxxxxx <10.141.255.11:46371>: no match found

  This message means the job has already passed the concurrency limit check but could not find a free node.

  Any idea why the concurrency limit does not work on the primary group?

  Cheers,
  Gang
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/


