I am a Linux newbie trying to install Condor for an academic project.
I have been stuck on this problem for quite a while, and after trying to find the cause I have given up.
I installed Condor 6.6.10 on Fedora Core 4, which runs under VMware Workstation 5.0 on my laptop.
I have two copies of Fedora Core 4 (a central manager and a working node) running on a Windows host, and I installed Condor on both.
I can ping and ssh from the central manager to the working node and back, so they seem to be communicating well.
Then I set the CONDOR_CONFIG environment variable to /usr/local/condor-6.6.10/etc/condor_config.
Then I start the master daemon and run ps aux | egrep condor_. I can see the required daemons running.
./condor_status shows me the central manager as available.
On the working node the remaining steps are the same. The startd, schedd and master daemons start properly.
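A minimal sketch of a sanity check for the CONDOR_CONFIG step, assuming Python is available on the node. The helper name check_condor_config is hypothetical and not part of Condor; it only confirms that the variable is set and points at a readable file, which would catch a typo such as a space in the path.

```python
import os


def check_condor_config(environ=None):
    """Return (ok, message) describing the CONDOR_CONFIG setting.

    check_condor_config is a hypothetical helper, not a Condor tool.
    Pass a dict as `environ` to check something other than os.environ.
    """
    env = os.environ if environ is None else environ
    path = env.get("CONDOR_CONFIG")
    if not path:
        return False, "CONDOR_CONFIG is not set"
    if not os.path.isfile(path):
        return False, "CONDOR_CONFIG points to a missing file: %r" % path
    if not os.access(path, os.R_OK):
        return False, "CONDOR_CONFIG is not readable: %r" % path
    return True, "CONDOR_CONFIG looks usable: %r" % path


if __name__ == "__main__":
    ok, message = check_condor_config()
    print(message)
```

Running it in the same shell that starts condor_master shows whether the daemons would see the intended config file.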
MASTER LOG
2/11 10:22:37 ******************************************************
2/11 10:22:37 ** condor_master (CONDOR_MASTER) STARTING UP
2/11 10:22:37 ** /usr/local/condor-6.6.10/sbin/condor_master
2/11 10:22:37 ** $CondorVersion: 6.6.10 Jun 13 2005 $
2/11 10:22:37 ** $CondorPlatform: I386-LINUX_RH9 $
2/11 10:22:37 ** PID = 2738
2/11 10:22:37 ******************************************************
2/11 10:22:37 Using config file: /usr/local/condor-6.6.10/etc/condor_config
2/11 10:22:37 Using local config files: /usr/local/condor-6.6.10/local.slave/condor_config.local
2/11 10:22:37 DaemonCore: Command Socket at <192.168.60.129:32770>
2/11 10:22:37 Started DaemonCore process "/usr/local/condor-6.6.10/sbin/condor_schedd", pid and pgroup = 2739
2/11 10:22:37 Started DaemonCore process "/usr/local/condor-6.6.10/sbin/condor_startd", pid and pgroup = 2740
2/11 10:22:43 Can't connect to <192.168.60.128:9618>:0, errno = 113
2/11 10:22:43 Will keep trying for 10 seconds...
2/11 10:23:01 Connect failed for 10 seconds; returning FALSE
2/11 10:23:01 ERROR: SECMAN:2003:TCP connection to <192.168.60.128:9618> failed
2/11 10:23:01 Can't send UPDATE_MASTER_AD to collector <192.168.60.128:9618>: Failed to send UDP update command to collector
2/11 10:28:01 Can't connect to <192.168.60.128:9618>:0, errno = 113
2/11 10:28:01 Will keep trying for 10 seconds...
2/11 10:28:13 Connect failed for 10 seconds; returning FALSE
2/11 10:28:13 ERROR: SECMAN:2003:TCP connection to <192.168.60.128:9618> failed
2/11 10:28:13 Can't send UPDATE_MASTER_AD to collector <192.168.60.128:9618>: Failed to send UDP update command to collector
2/11 10:33:13 Can't connect to <192.168.60.128:9618>:0, errno = 113
2/11 10:33:13 Will keep trying for 10 seconds...
2/11 10:33:24 Connect failed for 10 seconds; returning FALSE
2/11 10:33:24 ERROR: SECMAN:2003:TCP connection to <192.168.60.128:9618> failed
SchedLog
2/11 10:22:38 ******************************************************
2/11 10:22:38 ** condor_schedd (CONDOR_SCHEDD) STARTING UP
2/11 10:22:38 ** /usr/local/condor-6.6.10/sbin/condor_schedd
2/11 10:22:38 ** $CondorVersion: 6.6.10 Jun 13 2005 $
2/11 10:22:38 ** $CondorPlatform: I386-LINUX_RH9 $
2/11 10:22:38 ** PID = 2739
2/11 10:22:38 ******************************************************
2/11 10:22:38 Using config file: /usr/local/condor-6.6.10/etc/condor_config
2/11 10:22:38 Using local config files: /usr/local/condor-6.6.10/local.slave/condor_config.local
2/11 10:22:38 DaemonCore: Command Socket at <192.168.60.129:32771>
2/11 10:22:39 Can't connect to <192.168.60.128:9618>:0, errno = 113
2/11 10:22:39 Will keep trying for 10 seconds...
2/11 10:22:50 Connect failed for 10 seconds; returning FALSE
2/11 10:22:50 ERROR: SECMAN:2003:TCP connection to <192.168.60.128:9618> failed
2/11 10:27:50 Can't connect to <192.168.60.128:9618>:0, errno = 113
2/11 10:27:50 Will keep trying for 10 seconds...
2/11 10:28:02 Connect failed for 10 seconds; returning FALSE
2/11 10:28:02 ERROR: SECMAN:2003:TCP connection to <192.168.60.128:9618> failed
StartLog
2/11 10:22:38 ******************************************************
2/11 10:22:38 ** condor_startd (CONDOR_STARTD) STARTING UP
2/11 10:22:38 ** /usr/local/condor-6.6.10/sbin/condor_startd
2/11 10:22:38 ** $CondorVersion: 6.6.10 Jun 13 2005 $
2/11 10:22:38 ** $CondorPlatform: I386-LINUX_RH9 $
2/11 10:22:38 ** PID = 2740
2/11 10:22:38 ******************************************************
2/11 10:22:38 Using config file: /usr/local/condor-6.6.10/etc/condor_config
2/11 10:22:38 Using local config files: /usr/local/condor-6.6.10/local.slave/condor_config.local
2/11 10:22:38 DaemonCore: Command Socket at <192.168.60.129:32772>
2/11 10:22:50 New machine resource allocated
2/11 10:22:50 About to run initial benchmarks.
2/11 10:22:55 Completed initial benchmarks.
2/11 10:22:55 State change: IS_OWNER is false
2/11 10:22:55 Changing state: Owner -> Unclaimed
2/11 10:23:01 Can't connect to <192.168.60.128:9618>:0, errno = 113
2/11 10:23:01 Will keep trying for 10 seconds...
2/11 10:23:12 Connect failed for 10 seconds; returning FALSE
2/11 10:23:12 ERROR: SECMAN:2003:TCP connection to <192.168.60.128:9618> failed
2/11 10:23:12 Error sending update to the collector <192.168.60.128:9618>: Failed to send UDP update command to collector
2/11 10:23:12 Error sending update to collector(s)
2/11 10:27:59 Can't connect to <192.168.60.128:9618>:0, errno = 113
2/11 10:27:59 Will keep trying for 10 seconds...
2/11 10:28:11 Connect failed for 10 seconds; returning FALSE
2/11 10:28:11 ERROR: SECMAN:2003:TCP connection to <192.168.60.128:9618> failed
2/11 10:28:11 Error sending update to the collector <192.168.60.128:9618>: Failed to send UDP update command to collector
2/11 10:28:11 Error sending update to collector(s)
2/11 10:32:59 Can't connect to <192.168.60.128:9618>:0, errno = 113
2/11 10:32:59 Will keep trying for 10 seconds...
2/11 10:33:11 Connect failed for 10 seconds; returning FALSE
2/11 10:33:11 ERROR: SECMAN:2003:TCP connection to <192.168.60.128:9618> failed
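All three logs report errno = 113 when connecting to the collector at 192.168.60.128:9618. On Linux that value is EHOSTUNREACH ("No route to host"). A successful ping or ssh does not exercise TCP port 9618, so the same connection can be tried by hand outside Condor. The following is only a minimal sketch, assuming Python is installed on the working node; probe_tcp is a hypothetical helper and not a Condor tool.

```python
import errno
import os
import socket


def probe_tcp(host, port, timeout=5.0):
    """Attempt one TCP connection and report how it went.

    Returns (ok, errno_value, description). probe_tcp is a hypothetical
    helper written for this post; it is not part of Condor.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
    except socket.timeout:
        # No answer at all within the timeout.
        return False, None, "timed out"
    except OSError as exc:
        code = exc.errno
        if code is None:
            return False, None, str(exc)
        # Translate the numeric errno into its symbolic name and message.
        name = errno.errorcode.get(code, "unknown errno")
        return False, code, "%s: %s" % (name, os.strerror(code))
    finally:
        sock.close()
    return True, 0, "connected"
```

Calling probe_tcp("192.168.60.128", 9618) from the working node, with the address taken from the logs, should show whether the same errno appears without any Condor daemon involved.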
Any advice on this would be really helpful.