Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab


[Condor-users] Lots of TIME_WAIT sockets killing server

Date: Tue, 1 Jun 2010 14:19:39 +0200
From: "J.A. Gutierrez" <spd@xxxxxxxxxxxxxxxxxxxx>
Subject: [Condor-users] Lots of TIME_WAIT sockets killing server

	Hello

	I've found a problem in condor and I can't find the cause:

	Since we upgraded our Linux condor slave ("execute") nodes
	from Fedora Core 2 to CentOS 5.2 (and then to CentOS 5.4),
	whenever condor has been active for a couple of days, the condor
	master host gets its connection table filled with thousands of
	sockets in the "TIME_WAIT" state, so no new connections can be
	opened and the server (which also acts as the central NFS/NIS+
	server) is effectively brought down.
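
	To put numbers on the symptom described above, a rough sketch like
	the following could tally sockets per TCP state from `netstat -an`
	output. It is not part of Condor; the function names are my own,
	and it assumes only that the state name is the last column of each
	connection line, which holds for the default output on both Linux
	and Solaris as far as I know.

```python
import subprocess
from collections import Counter

# TCP state names as printed by common netstat implementations
# (spellings differ slightly between Linux and Solaris).
TCP_STATES = {
    "ESTABLISHED", "SYN_SENT", "SYN_RECV", "SYN_RCVD",
    "FIN_WAIT1", "FIN_WAIT2", "FIN_WAIT_1", "FIN_WAIT_2",
    "TIME_WAIT", "CLOSE", "CLOSED", "CLOSE_WAIT",
    "LAST_ACK", "LISTEN", "CLOSING", "IDLE", "BOUND",
}


def count_tcp_states(netstat_output):
    """Count connection lines per TCP state.

    Assumption: the state is the last whitespace-separated field
    of each connection line; all other lines are ignored.
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if fields and fields[-1] in TCP_STATES:
            counts[fields[-1]] += 1
    return counts


def snapshot():
    """Return the raw text of `netstat -an` on the local host."""
    result = subprocess.run(
        ["netstat", "-an"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

	Calling `count_tcp_states(snapshot())["TIME_WAIT"]` every few
	minutes on the master would show how quickly the table fills up
	and whether the growth tracks the number of active execute nodes.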


	Our current setup is:

	* NFS/NIS+/Condor master server:

	- Sun SPARC server running Solaris 8.
	- Condor master version 7.4.2

	* NFS/NIS+/Condor clients:

	- x86 PCs running Linux CentOS 5.4
	- Condor 7.4.2
	(when the server starts becoming unresponsive, there are usually
	no more than 6 PCs running condor)


	Condor configuration:

	- Common FILESYSTEM_DOMAIN/UID_DOMAIN on master and slaves
	- USE_NFS = False 
	- USE_AFS = False
	- ~condor is local on every PC
	- mostly default settings for everything


	IIRC, the problem started with the upgrade from Fedora Core 2
	to CentOS 5.2, while keeping the same condor installation.
	Then I upgraded condor to the current release, but I got the same
	problem.


	Any ideas?


	Thanks...


-- 
PGP and other useless info at      \
http://webdiis.unizar.es/~spd/      \
finger://daphne.cps.unizar.es/spd    \       Timeo Danaos et dona ferentes
ftp://ivo.cps.unizar.es/pub/          \                         (Virgilio)
