
[condor-users] condor_status and condor_q failing

On our cluster, when a lot of work is running and there are a lot of jobs
in the queue, condor_q and condor_status occasionally have a hard time
connecting to the collector.  Is there a known cause or fix for this?
Below is one of the messages we get, with more comments to follow
(hostnames/IPs removed):

CEDAR:6001:Failed to connect to <###.##.#.##:9618>
Error: Couldn't contact the condor_collector on hostname.domainname.

Extra Info: the condor_collector is a process that runs on the central
manager of your Condor pool and collects the status of all the machines and
jobs in the Condor pool. The condor_collector might not be running, it might
be refusing to communicate with you, there might be a network problem, or
there may be some other problem. Check with your system administrator to fix
this problem.

If you are the system administrator, check that the condor_collector is
running on hostname.domainname, check the HOSTALLOW configuration in
your condor_config, and check the MasterLog and CollectorLog files in your
log directory for possible clues as to why the condor_collector is not
responding. Also see the Troubleshooting section of the manual.

I'm running these commands from the condor master itself, so HOSTALLOW
isn't the issue (and I know it's not, because most of the time the
commands work; they fail maybe 10% of the time under load).  Also, when
this happens, there is no corresponding entry in the MasterLog or
CollectorLog to indicate a problem.
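
To put a firmer number on that "maybe 10%", something like the rough
sketch below could be left running on the master while the pool is busy.
It is only an assumption about how one might measure this, not anything
shipped with Condor: the function names probe_collector and failure_rate
are made up for illustration, and a plain TCP connect only shows whether
the collector's port (9618 in the message above) accepts connections in
time, not whether the collector answers an actual query.

```python
import socket
import time


def probe_collector(host, port=9618, attempts=20, timeout=5.0, delay=0.0):
    """Attempt a plain TCP connect `attempts` times.

    Returns a list of (ok, seconds) tuples, one per attempt.
    Hypothetical helper: it speaks no Condor protocol at all.
    """
    results = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            # Open and immediately close the connection.
            with socket.create_connection((host, port), timeout=timeout):
                ok = True
        except OSError:
            # Covers refused connections, timeouts, and lookup failures.
            ok = False
        results.append((ok, time.monotonic() - start))
        if delay:
            time.sleep(delay)
    return results


def failure_rate(results):
    """Fraction of attempts that failed (0.0 for an empty list)."""
    if not results:
        return 0.0
    failed = sum(1 for ok, _ in results if not ok)
    return failed / len(results)
```

Calling failure_rate(probe_collector("hostname.domainname", delay=1.0))
during a busy period and again when the pool is idle would show whether
the failures really track load, and the per-attempt timings would show
whether the connects are slow or refused outright.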

This is Condor 6.5.5 (the RedHat 9 dynamic package), running under Gentoo Linux.


--
Corey Shields - IU Unix Systems Support Group

My PGP/GPG public encryption key is at:
GPG fingerprint: 78A8 E5EB E455 0A90 F392 59BC A6AF F8A3 A304 1453