
Re: [Condor-users] Very slow response from condor_q and condor_status



We moved the {log, execute, spool} directories onto local disks, but
that didn't seem to help.  We then raised the debug level for the
daemon logs and took a closer look.  The logs showed lots of errors
from attempts to send information back to the folk at wisc.edu.

	12/6 15:34:41 Can't connect to <128.105.143.14:9618>:0, errno = 110
	12/6 15:34:41 Will keep trying for 10 seconds...
	12/6 15:34:42 Connect failed for 10 seconds; returning FALSE
	12/6 15:34:42 ERROR:
	SECMAN:2003:TCP connection to <128.105.143.14:9618> failed

	12/6 15:34:42 Can't send UPDATE_COLLECTOR_AD to collector (condor.cs.wisc.edu):\
	 Failed to send UDP update command to collector
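
For anyone wanting to check their own logs for the same symptom, here is
a minimal sketch that tallies failed connection targets. The regular
expression is inferred only from the excerpt above and is an assumption,
not a documented Condor log format; the function name is hypothetical.
(On Linux, errno 110 is ETIMEDOUT, i.e. the connection attempt timed out.)

```python
import re
from collections import Counter

# Matches lines such as:
#   12/6 15:34:41 Can't connect to <128.105.143.14:9618>:0, errno = 110
# Assumption: the pattern is guessed from the excerpt, not from a format spec.
CONNECT_FAILURE = re.compile(r"Can't connect to <([\d.]+):(\d+)>")


def count_connect_failures(log_text):
    """Return a Counter mapping (host, port) to the number of failed connects."""
    failures = Counter()
    for line in log_text.splitlines():
        match = CONNECT_FAILURE.search(line)
        if match:
            failures[(match.group(1), int(match.group(2)))] += 1
    return failures


sample = """\
12/6 15:34:41 Can't connect to <128.105.143.14:9618>:0, errno = 110
12/6 15:34:41 Will keep trying for 10 seconds...
12/6 15:34:42 Connect failed for 10 seconds; returning FALSE
"""

# Print the targets that failed most often first.
for (host, port), n in count_connect_failures(sample).most_common():
    print(f"{host}:{port} failed {n} time(s)")
```

A target outside your own pool that shows up repeatedly is a hint that a
daemon is spending its time waiting on an unreachable host.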

I'm not sure whether this is related, but as soon as we disabled this
reporting our problem went away.  We're now getting very good response
from both 'condor_q -global' and 'condor_status'.
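
To put a number on "good response" before and after a change like this,
a rough timing sketch may help. The helper name, the status strings and
the 120-second timeout are all hypothetical choices; the only commands
assumed are the two named in this thread.

```python
import subprocess
import time


def time_command(argv, timeout=120.0):
    """Run argv, discarding its output; return (elapsed_seconds, status).

    status is "exit N", "missing" (command not found) or "timeout".
    """
    start = time.monotonic()
    try:
        completed = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        status = f"exit {completed.returncode}"
    except FileNotFoundError:
        status = "missing"
    except subprocess.TimeoutExpired:
        status = "timeout"
    return time.monotonic() - start, status


# The two commands discussed in this thread.
for argv in (["condor_q", "-global"], ["condor_status"]):
    elapsed, status = time_command(argv)
    print(f"{' '.join(argv)}: {elapsed:.2f}s ({status})")
```

Running it a few times on the submit machine gives a baseline to compare
against after any configuration change.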

Thanks for all the help
-Colin Little


-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of John Horne
Sent: Tuesday, December 06, 2005 4:20 PM
To: condor-users@xxxxxxxxxxx
Subject: Re: [Condor-users] Very slow response from condor_q and
condor_status


On Sat, December 3, 2005 20:37, Erik Paulson wrote:
> On Sat, Dec 03, 2005 at 08:32:07PM -0000, Chris Miles wrote:
>> I have the same issue and have put it down to the mounted home
>> directory: when I occasionally get the slow response from
>> condor_status, copying a file from /home/condor to the local drive
>> is also terribly slow and unresponsive. It never bothered me enough
>> to find a solution, though.
>>
>
> Problems with NFS file locking, maybe?
>
> Try putting your log files on a local disk - I could believe that
> the problem is Condor writing to a logfile that's on NFS, and either
> the lock takes a long time to acquire, or the write itself takes a
> long time to finish. Either of those would cause a daemon to just
> freeze up.
>
We see the same problem with condor_q/condor_status but have no NFS;
everything runs from local disk. We're running Condor 6.7.6.


John.

-- 
---------------------------------------------------------------
John Horne, University of Plymouth, UK  Tel: +44 (0)1752 233914
E-mail: John.Horne@xxxxxxxxxxxxxx       Fax: +44 (0)1752 233839
