Re: [Condor-users] collector memory leak / history truncation
- Date: Tue, 1 Jul 2008 09:57:38 +0200
- From: <ucarlino@xxxxxxxxxx>
- Subject: Re: [Condor-users] collector memory leak / history truncation
The master host runs Red Hat Enterprise Linux (RHEL) 3.
Total machines:
101:condor@lnxgen7/home/condor> condor_status -t
                 Total Owner Claimed Unclaimed Matched Preempting Backfill
     INTEL/LINUX   204    24      88        92       0          0        0
   INTEL/WINNT51  2347   204     477      1658       8          0        0
   INTEL/WINNT52    12     5       0         7       0          0        0
 SUN4u/SOLARIS28    15    11       0         4       0          0        0
 SUN4u/SOLARIS29     6     2       0         4       0          0        0
           Total  2584   246     565      1765       8          0        0
105:condor@lnxgen7/home/condor> ps auxw | grep condor_
condor 3606 0.1 0.4 21084 17332 ? S Mar12 166:59 /home/condor/6.8.6/Linux-2.4-i386/sbin/condor_master
condor 3674 0.0 0.1 7992 3912 ? S Mar12 19:16 condor_startd -f
condor 3675 0.0 0.0 8168 3572 ? S Mar12 1:36 condor_schedd -f
condor 3676 43.8 3.6 143232 139488 ? S Mar12 69754:49 condor_negotiator -f
condor 25070 16.0 2.5 102648 98528 ? S Jun30 181:47 condor_collector -f
condor 2537 0.0 0.0 1616 472 pts/1 S 01:52 0:00 grep condor_
109:condor@lnxgen7/home/condor> top -bn1 | grep condor_
3676 condor 25 0 136M 136M 2512 R 22.5 3.6 69755m 2 condor_negotiat
25070 condor 15 0 98560 96M 2460 S 3.1 2.5 182:11 0 condor_collecto
3606 condor 15 0 17332 16M 2596 S 0.0 0.4 166:59 0 condor_master
3674 condor 15 0 3912 3912 2840 S 0.0 0.1 19:16 0 condor_startd
3675 condor 15 0 3572 3572 2792 S 0.0 0.0 1:36 3 condor_schedd
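To track these numbers over time rather than reading them off by eye, the `ps auxw` lines can be parsed. The following is only a minimal sketch: the function names `parse_ps_line` and `daemon_rss_kb` are made up for illustration, and it assumes the usual `ps aux` column order (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND) with VSZ and RSS in KB. The sample lines are copied from the output above.

```python
def parse_ps_line(line):
    """Return pid, VSZ, RSS (both in KB) and command for one `ps auxw` line, or None."""
    # Assumed columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    fields = line.split(None, 10)
    if len(fields) < 11 or not fields[1].isdigit():
        return None  # header line or an unexpected format
    return {
        "pid": int(fields[1]),
        "vsz_kb": int(fields[4]),
        "rss_kb": int(fields[5]),
        "command": fields[10],
    }


def daemon_rss_kb(ps_output, daemon):
    """Map pid -> RSS in KB for processes whose executable name equals `daemon`."""
    sizes = {}
    for line in ps_output.splitlines():
        rec = parse_ps_line(line)
        if rec is None:
            continue
        # Compare only the executable's base name, so "grep condor_" is not matched.
        exe = rec["command"].split()[0].rsplit("/", 1)[-1]
        if exe == daemon:
            sizes[rec["pid"]] = rec["rss_kb"]
    return sizes


# Lines taken from the ps output in this message.
SAMPLE = """\
condor 3606 0.1 0.4 21084 17332 ? S Mar12 166:59 /home/condor/6.8.6/Linux-2.4-i386/sbin/condor_master
condor 3676 43.8 3.6 143232 139488 ? S Mar12 69754:49 condor_negotiator -f
condor 25070 16.0 2.5 102648 98528 ? S Jun30 181:47 condor_collector -f
condor 2537 0.0 0.0 1616 472 pts/1 S 01:52 0:00 grep condor_
"""

print(daemon_rss_kb(SAMPLE, "condor_collector"))
```

Feeding it live output (for example from `subprocess.run(["ps", "auxw"], capture_output=True, text=True).stdout`) and logging the result at intervals would show how fast the collector grows between restarts.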
The collector is restarted by cron twice a week:
110:condor@lnxgen7/home/condor> crontab -l
# Cron entries for Micron 'is' Condor pool
# Activate on the pool controller using: crontab /home/condor/cron/collector_crontab
#
# Restart the Collector at 7:00 AM on Mondays and Thursdays
0 7 * * mon,thu /home/condor/6.8.6/Linux-2.4-i386/sbin/condor_restart -subsystem Collector >/tmp/collector_restart.log 2>&1
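The crontab entry above restarts the collector on a fixed schedule. As a rough sketch of a variation, a script could check the collector's resident size first and restart only when it exceeds a limit. This is not what we run: `RSS_LIMIT_KB` is a hypothetical threshold, the function names are invented for illustration, and the `ps auxw` column order is assumed. The `condor_restart` path and arguments are the ones from the crontab entry.

```python
import subprocess

# Path and arguments as used in the crontab entry.
CONDOR_RESTART = "/home/condor/6.8.6/Linux-2.4-i386/sbin/condor_restart"
# Hypothetical threshold; not a value taken from this thread.
RSS_LIMIT_KB = 256 * 1024


def collector_rss_kb(ps_output):
    """Return the RSS (KB) of the first condor_collector in `ps auxw` output, else None."""
    for line in ps_output.splitlines():
        # Assumed columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[5].isdigit():
            continue
        exe = fields[10].split()[0].rsplit("/", 1)[-1]
        if exe == "condor_collector":
            return int(fields[5])
    return None


def restart_if_too_big(ps_output, limit_kb=RSS_LIMIT_KB, run=subprocess.call):
    """Call condor_restart for the collector only if its RSS exceeds limit_kb."""
    rss = collector_rss_kb(ps_output)
    if rss is None or rss <= limit_kb:
        return False
    run([CONDOR_RESTART, "-subsystem", "Collector"])
    return True
```

The `run` parameter exists only so the decision logic can be exercised without actually restarting anything; in real use the default `subprocess.call` would execute the command.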
Regards,
Umberto
-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Steffen Grunewald
Sent: 01 July 2008 09:47
To: condor-users@xxxxxxxxxxx
Subject: Re: [Condor-users] collector memory leak / history truncation
On Tue, Jul 01, 2008 at 09:37:11AM +0200, ucarlino@xxxxxxxxxx wrote:
> We've been experiencing the same problem for a long time now,
> with 6.8.6, 7.0.1, 7.0.2, 7.1.0 and 7.0.3. They all have the same
> problem.
> And I agree that it seems proportional to the number
> of machines in the pool.
How many are there? We're running a 600+ node cluster, and after more than 1 million hours of accumulated usage:
# ps auxw | grep condor
condor 491 0.0 0.1 17020 3260 ? Ss Jun02 28:43 /usr/sbin/condor_master
condor 492 14.2 2.8 71480 57960 ? Ss Jun02 5930:59 condor_collector -f
condor 495 1.7 3.0 74520 61300 ? Ss Jun02 730:49 condor_negotiator -f
condor 496 0.0 0.1 18152 3916 ? Ss Jun02 0:24 condor_schedd -f
root 497 0.0 0.1 11812 2492 ? S Jun02 9:03 condor_procd -A /usr/share/condor/local/log/procd_pipe.SCHEDD -C 666
root 9903 0.0 0.0 2748 604 pts/1 R+ 09:44 0:00 grep condor
# top -bn1 | grep condor
492 condor 20 0 71480 56m 2564 R 34 2.8 5931:00 condor_collecto
491 condor 20 0 17020 3260 2168 S 0 0.2 28:43.55 condor_master
495 condor 20 0 74520 59m 2764 S 0 3.0 730:49.44 condor_negotiat
496 condor 20 0 18152 3916 3144 S 0 0.2 0:24.44 condor_schedd
497 root 20 0 11812 2492 1064 S 0 0.1 9:03.47 condor_procd
Do you define any extra ClassAd attributes? (We don't have any...)
Steffen
--
Steffen Grunewald * MPI Grav.Phys.(AEI) * Am Mühlenberg 1, D-14476 Potsdam
Cluster Admin * http://pandora.aei.mpg.de/merlin/ * http://www.aei.mpg.de/
* e-mail: steffen.grunewald(*)aei.mpg.de * +49-331-567-{fon:7233,fax:7298}
No Word/PPT mails - http://www.gnu.org/philosophy/no-word-attachments.html
_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users
The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/