
[Condor-users] problem with condor_status on non-central-manager condor machine



Dear all,

I installed Condor 6.8.1 on two machines, one of which is the central
manager.

Checking the status on the central manager works:
[condor@nini ~]$ condor_master
[condor@nini ~]$ condor_status

Name          OpSys       Arch   State      Activity   LoadAv Mem   ActvtyTime

vm1@nini      LINUX       INTEL  Owner      Idle       0.350   503  0+00:00:09
vm2@nini      LINUX       INTEL  Owner      Idle       0.000   503  0+00:00:10

                     Total Owner Claimed Unclaimed Matched Preempting Backfill

         INTEL/LINUX     2     2       0         0       0          0        0

               Total     2     2       0         0       0          0        0

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
However, checking the status on the other machine fails:
[condor@condor3 ~]$ condor_master
[condor@condor3 ~]$ condor_status
CEDAR:6001:Failed to connect to <129.254.175.78:9618>
Error: Couldn't contact the condor_collector on nini.

Extra Info: the condor_collector is a process that runs on the central
manager of your Condor pool and collects the status of all the machines and
jobs in the Condor pool. The condor_collector might not be running, it might
be refusing to communicate with you, there might be a network problem, or
there may be some other problem. Check with your system administrator to fix
this problem.

If you are the system administrator, check that the condor_collector is
running on nini, check the HOSTALLOW configuration in your condor_config, and
check the MasterLog and CollectorLog files in your log directory for possible
clues as to why the condor_collector is not responding. Also see the
Troubleshooting section of the manual.
[condor@condor3 ~]$

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Judging from the status output on the central manager, is it running
properly? How can I check whether condor_collector is running on nini
(the central manager's hostname)?
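
One way to narrow this down from condor3 is to try a plain TCP connection to the collector address shown in the error. The following is only a minimal sketch, not part of Condor: the function name probe_tcp is made up for illustration, and the address and port are simply the ones from the error message above. A refused connection would suggest that nothing is listening on that port, while "No route to host" or a timeout would point at the network path between the two machines.

```python
import socket


def probe_tcp(host, port, timeout=5.0):
    """Attempt a single TCP connection and report what happened.

    Returns (True, "connected") on success, or (False, description) when the
    attempt raises an OSError (refused, unreachable, timed out, bad hostname).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # The exception text carries the OS-level reason for the failure.
        return False, f"{type(exc).__name__}: {exc}"


# Example call, run on condor3 (address and port taken from the error above):
#   print(probe_tcp("129.254.175.78", 9618))
```

Running the same probe on nini itself against the same port would show whether anything is listening there at all.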

All of the HOSTALLOW settings in condor_config are set to *.

The MasterLog on the non-central-manager shows:

10/18 10:53:53 ERROR: SECMAN:2003:TCP connection to <129.254.175.78:9618> failed
10/18 10:53:53 Failed to start non-blocking update to <129.254.175.78:9618>.
10/18 10:58:32 attempt to connect to <129.254.175.78:9618> failed: No route to host (connect errno = 113).  Will keep trying for 20 total seconds (20 to go).

10/18 10:58:53 attempt to connect to <129.254.175.78:9618> failed: No route to host (connect errno = 113).
10/18 10:58:53 ERROR: SECMAN:2003:TCP connection to <129.254.175.78:9618> failed
10/18 10:58:53 Failed to start non-blocking update to <129.254.175.78:9618>.
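
To see at a glance which address and error keep recurring in a longer MasterLog, the "attempt to connect" lines can be tallied. This is a rough sketch that assumes the log lines look like the excerpt above; the function name summarize_connect_failures and the regular expression are illustrative, not anything Condor provides. On Linux, errno 113 is EHOSTUNREACH, which is the "No route to host" text in the log.

```python
import re
from collections import Counter

# Matches e.g. "attempt to connect to <1.2.3.4:9618> failed: No route to
# host (connect errno = 113)" once whitespace has been normalized.
CONNECT_FAILURE = re.compile(
    r"attempt to connect to <([\d.]+):(\d+)> failed: "
    r"([^()]*?) \(connect errno = (\d+)\)"
)


def summarize_connect_failures(log_text):
    """Count failed connection attempts per (ip, port, reason, errno)."""
    # Collapse line wraps and repeated spaces so wrapped entries still match.
    flat = " ".join(log_text.split())
    counts = Counter()
    for ip, port, reason, err in CONNECT_FAILURE.findall(flat):
        counts[(ip, int(port), reason, int(err))] += 1
    return counts
```

Applied to the excerpt above, this counts the two "No route to host" attempts against 129.254.175.78:9618; the SECMAN and "Failed to start non-blocking update" lines are ignored because they carry no errno.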

The condor_config.local file on the non-central-manager machine is empty.
What is wrong with the non-central-manager machine? Was it installed or
configured incorrectly?