
Re: [Condor-users] condor_status taking ages to report





--On 23 March 2005 16:00 +0000 Matt Hope <matthew.hope@xxxxxxxxx> wrote:
[snip]

I'm still puzzled as to why the collector is taking up so much memory
( getting on for 500 MB ). I've restarted the daemons, rebooted the
machine but no change. How does this scale with the number of startds
in the pool ? At present we have ~ 100 but this is small compared to
some sites. If we run out of real memory and are into swap presumably
it's going to crawl along.

Take a look at a cross-section of the ads in the collector for the startds with condor_status -l vm1@xxxxxxx (ensure you include some which have an active claim). Are you perhaps exposing a job attribute which can be extremely large?
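Something along these lines will pick out the longest attributes in a slot's ad (just a sketch; the slot/host name is a placeholder for one of your own startds):

    # Print each attribute line's length next to its name, longest first
    condor_status -l vm1@<startd-host> | awk '{ print length($0), $1 }' | sort -rn | head

If nothing there is more than a few hundred bytes, the startd ads themselves aren't the problem.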

The only attributes are pretty much the standard Condor ones; I don't think any of them are excessive.


Note that the collector also holds an ad for each schedd and master.
Do you have any unnecessary schedds cluttering things up (and possibly
slowing down negotiation, albeit not by much since each one will be passed
over pretty quickly)?
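If you want to see how many ads of each type the collector is actually holding, something like this should do it (a sketch; it simply counts the blank-line-separated ads in the long output):

    # Count startd, schedd and master ads held by the collector
    condor_status -l         | awk 'BEGIN { RS = "" } END { print NR }'
    condor_status -schedd -l | awk 'BEGIN { RS = "" } END { print NR }'
    condor_status -master -l | awk 'BEGIN { RS = "" } END { print NR }'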

No, I don't think so. Jobs are all submitted from a single host. I haven't
really looked at the memory usage before, so I'm not sure if this is a recent
thing or not. It's up to 534 MB at the moment; luckily this is on a 1 GB server.
The fact that it has gone up another 60 MB since yesterday, despite no
changes to the pool, strongly suggests a memory leak to me.
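To confirm that, it might be worth logging the collector's resident size over a day or two. A minimal sketch, assuming a Linux central manager where the process shows up as condor_collector:

    # Append a timestamped RSS reading (in KB) every 10 minutes
    while true; do
        echo "`date` `ps -C condor_collector -o rss=`" >> collector_rss.log
        sleep 600
    done

If that climbs steadily while the pool stays the same size, a leak looks likely.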


It would be useful if _the_people_who_wrote_this_stuff_ could tell me
how the collector's dynamic memory allocation scales with the number of
startds, schedds, etc.  At least that way we'd have a handle on the
requirements for the central manager.
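In the meantime, a crude lower bound is the total serialized size of everything the collector holds, which can be compared against its resident size (just a sketch; the in-memory ClassAds will be bigger than their text form, but presumably not 500 MB worth for ~100 startds):

    # Total bytes of all startd, schedd and master ads as plain text
    ( condor_status -l ; condor_status -schedd -l ; condor_status -master -l ) | wc -c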

-ian.




Matt

_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users