
Re: [Condor-users] failing remote condor_q

It turns out the problem was that the collector on machine B
had HOSTALLOW_READ restricted to a list of IP addresses in its
condor_config.local, even though the schedd on machine A did not.
That was the source of the confusion.
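
For anyone who hits the same error, here is a rough sketch of the kind of
entry involved. The addresses are made up for illustration (the actual
list on machine B is not reproduced here), and the exact form on your
collector host may differ:

```
# condor_config.local on the collector host (machine B)
# Hypothetical addresses, for illustration only.
# Only these hosts may send read-only queries such as condor_q or
# condor_status to the daemons on this machine.
HOSTALLOW_READ = 192.0.2.10, 192.0.2.11

# A remote condor_q from any other host is refused until that host is
# added to the list, for example (again a hypothetical address):
# HOSTALLOW_READ = 192.0.2.10, 192.0.2.11, 198.51.100.7
```

The setting is evaluated per daemon, so a permissive value on the schedd
host does not help if the collector host is more restrictive. After a
change, the daemons on that host need to reread their configuration
(for example with condor_reconfig) before it takes effect.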

Steve Timm

On Wed, 24 May 2006, Steven Timm wrote:

I am trying to do a remote query on a condor pool which is not
listed in my condor_config file at all. Its schedd is running
on machine A and the collector is running on machine B.

I try to do

condor_q -name A -pool B

and get the following error:

Error: Couldn't contact the condor_collector on .

Extra Info: the condor_collector is a process that runs on the central
manager of your Condor pool and collects the status of all the machines
and jobs in the Condor pool. The condor_collector might not be running, it
might be refusing to communicate with you, there might be a network
problem, or there may be some other problem. Check with your system
administrator to fix this problem.

If you are the system administrator, check that the condor_collector is
running on , check the HOSTALLOW configuration in your condor_config, and
check the MasterLog and CollectorLog files in your log directory for
clues as to why the condor_collector is not responding. Also see the
Troubleshooting section of the manual.


It's like the condor_q command is looking
for the condor_collector in the wrong place somehow.
From inside that same cluster, the command works fine.
As far as I can tell, all the hostallow settings of the config
are correct, and the same as other pools where the same command
works fine.

Steve Timm

Steven C. Timm, Ph.D  (630) 840-8525  timm@xxxxxxxx  http://home.fnal.gov/~timm/
Fermilab Computing Div/Core Support Services Dept./Scientific Computing Section
Assistant Group Leader, Farms and Clustered Systems Group
Lead of Computing Farms Team