Re: [Condor-users] scaling problems with condor_negotiator


-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Dan Bradley
Sent: 05 March 2007 16:15
To: Condor-Users Mail List
Subject: Re: [Condor-users] scaling problems with condor_negotiator

Matt Hope wrote:

>On 2/28/07, Smith, Ian <I.C.Smith@xxxxxxxxxxxxxxx> wrote:
>>Does the negotiator treat a cluster as a single entity since all its 
>>component process have the same requirements hence putting a smaller 
>>load on it (I moved from cluster to individual jobs with the wrapper 
>Been on holiday so only just seen this.
>In 6.6 condor can treat clusters differently by considering a false 
>result for one job/node pair in a cluster a false result for that node 
>with the entire cluster (with a massive performance boost to 
>negotiation if enabled and clustering is heavily used).
>The new features trialed in later 6.7 releases and added by default in
>6.8 attempt to make such manual intervention unnecessary but I can't 
>remember off the top of my head if they still rely on clustering for 
>maximum benefit.

In 6.8, the thing to look at is how well your jobs are getting "auto
clustered".  This auto clustering happens, for the most part, behind the
scenes and helps improve the efficiency of negotiation by grouping
equivalent jobs together.

You can see how the jobs are getting grouped together by looking at the
job attribute AutoClusterID.  Example:

condor_q -f "%s" ProcID -f ".%s" ClusterID -f " %s\n" AutoClusterID
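If the queue is large, it may be easier to tally that output than to read it by eye. The following is only a rough sketch in Python, not anything that ships with Condor: it assumes the command's output has been captured as text, with one job per line in the form "<job id> <AutoClusterID>". The function name count_autoclusters and the sample data are made up for illustration.

```python
from collections import Counter


def count_autoclusters(condor_q_output):
    """Tally jobs per AutoClusterID from lines like '0.101 1'."""
    counts = Counter()
    for line in condor_q_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        _job_id, auto_cluster_id = fields
        counts[auto_cluster_id] += 1
    return counts


# Made-up sample standing in for real condor_q output.
sample = "0.101 1\n1.101 1\n0.102 2\n0.103 1\n"

# Largest groups first; a long tail of groups holding one job each
# suggests the jobs are not being auto clustered well.
for auto_cluster_id, n_jobs in count_autoclusters(sample).most_common():
    print(auto_cluster_id, n_jobs)
```
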

Jobs with the same AutoClusterID are in the same group for negotiation
purposes.  If you see that many small groups are being created, take a
look at the attribute AutoClusterAttrs.  This will tell you what
attributes are being used to group jobs together.  All jobs in a group
have identical values for these attributes.  In some cases, it may be
necessary to tweak the way a particular attribute is being rounded.  See
SCHEDD_ROUND_ATTR in the manual for more information on that.
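To illustrate why rounding matters, here is a small Python sketch of the general idea. It is not the schedd's actual code, and it assumes the rounding setting means "round up to a multiple of 10**digits"; check the manual for the exact semantics in your version. The function name round_up_digits and the ImageSize values are hypothetical.

```python
def round_up_digits(value, digits):
    """Round a non-negative integer up to the next multiple of 10**digits.

    Assumed behaviour for illustration only; not Condor's implementation.
    """
    step = 10 ** digits
    return ((value + step - 1) // step) * step


# Hypothetical ImageSize values for four otherwise identical jobs.
image_sizes = [1203, 1250, 1299, 1310]

# Unrounded, every job has a distinct value, so each would land in its own group.
unrounded_groups = set(image_sizes)

# Rounded up with digits=2, nearby values collapse into shared groups.
rounded_groups = {round_up_digits(v, 2) for v in image_sizes}

print(len(unrounded_groups), "groups without rounding")
print(len(rounded_groups), "groups with rounding:", sorted(rounded_groups))
```

Coarser rounding gives fewer, larger groups, at the cost of jobs advertising a somewhat larger value than they actually need.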



I've just upgraded the central manager to 6.8.4 and the auto clustering
seems to be working really well. The load has dropped substantially on
both the central manager and the submit host.

many thanks,