[Condor-users] memory leak in Condor 7.4.2 schedd ???



Dear All,

I've recently moved to Condor 7.4.2 on our central manager/submit host running
Solaris 10 and found that the schedd seems to be taking a worrying amount
of memory. For instance, at present there are only ~150 jobs in the queue and
the schedd is taking over 900 MB. The documentation seems to suggest that it
should only be using around 10 kB per job! Since this has been rising monotonically,
seemingly ever since I restarted the daemons just a few days ago, I can only
assume that this is down to a leak.
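
To put rough numbers on the mismatch, here is a minimal sketch in Python. It assumes the ~10 kB per job figure scales linearly with queue length, which is a simplification rather than anything the documentation promises; the function name is purely illustrative.

```python
# Rough sanity check of the figures quoted above.
# Assumption: schedd memory grows linearly at ~10 kB per queued job.
KB_PER_JOB = 10        # approximate per-job figure from the documentation
JOBS_IN_QUEUE = 150    # approximate current queue length
OBSERVED_MB = 900      # observed schedd memory use (lower bound)


def expected_mb(jobs, kb_per_job=KB_PER_JOB):
    """Expected schedd memory in MB for a given number of queued jobs."""
    return jobs * kb_per_job / 1024.0


expected = expected_mb(JOBS_IN_QUEUE)
ratio = OBSERVED_MB / expected
print(f"expected ~{expected:.1f} MB, observed {OBSERVED_MB} MB, ratio ~{ratio:.0f}x")
```

On these figures the expected footprint is about 1.5 MB, so the observed usage is several hundred times larger. Even a 10 000 job queue would only account for roughly 100 MB under the same assumption.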

The net result of this is that condor_q etc. can be very slow to respond (more than
five minutes on occasion) and it is difficult to submit more than ~1000 jobs
at once, whereas before there was no problem with 10 000 jobs. As far as I can
see the auto-clustering is working fine, although I do sometimes see messages in
the schedd log about rebuilding tables, which puzzle me.

Has anyone else seen this on other systems?

Any suggestions for a fix/workaround?

regards,

-ian.

--------------------------------------------
Dr Ian C. Smith,
Advanced Research Computing (e-Science) Team,
The University of Liverpool,
Computing Services Department.