
RE: [Condor-users] condor_status taking ages to report



Matt,

This was the system that Jaime and I were trying to sort out at Condor Week,
without success. Having seen the Condor_ID settings in the central node's
config file, I suggested changing these to the condor user, since the daemons
were not running as the user you would expect. I think the problems came about
as a result of this. There are a LOT of 30s timeouts at the new machine's end.
I haven't checked the central node's machine again to see what is happening there.

Cheers

JK

> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx
> [mailto:condor-users-bounces@xxxxxxxxxxx]On Behalf Of Matt Hope
> Sent: 23 March 2005 15:02
> To: Condor-Users Mail List
> Subject: Re: [Condor-users] condor_status taking ages to report
> 
> 
> On Wed, 23 Mar 2005 10:52:43 +0000, Dr Ian C. Smith
> <i.c.smith@xxxxxxxxxxxxxxx> wrote:
> > Hi,
> > 
> > I've had a Condor pool working fine now for several months
> > but after making a small change to the condor_config
> > on the central manager, condor_status and condor_q -global
> > are now taking over five minutes to respond (if at all!).
> > 
> > The manager is running Condor 6.6.5 on a Sun-Blade-1000
> > with Solaris 8. We have around 100 Wintel execute hosts in the pool.
> > The load average is < 0.1, so I don't see this as a problem.
> > The condor_collector has been taking up to ~500 MB of memory,
> > which seems a huge amount and makes me suspect a memory leak.
> > Has anyone else seen anything similar?
> > 
> > Any help on this would be very much appreciated !
> 
> Perhaps an indication of what the small change you made was
> would be useful...
> 
> Note that condor_q -global is a BAD thing to do, especially if your
> pool is running slowly, since on your version it locks the schedd,
> slowing down negotiation, job starting, preemptions, etc.
> 
> The collector's memory use sounds far too high (have you tried restarting
> it?). You haven't accidentally upped the number of startds running per
> machine, or added some horrifically large value to all classads, have
> you?
> 
> Matt