
[Condor-users] condor_startd doesn't start



Dear Condor users,

I am trying to install Condor 6.6.5 on a cluster of G5 machines.

As a first step, I installed Condor on a single machine for local use, following the newer installation method.

I checked the first two sections of condor_config, and everything looks good.

When I run condor_master to start the Condor daemons and then run ps -aux | egrep condor_ to check that they are running, the result is:

[11:34am condor ~/Programmes/condor-6.6.5/sbin]% sudo ./condor_master
Password:
[11:35am condor ~/Programmes/condor-6.6.5/sbin]% ps -aux | egrep condor_
condor 27613 0.3 -0.1 30992 2076 ?? Ss 11:35AM 0:00.08 condor_schedd -f
condor 27612 0.0 -0.0 39412 984 ?? Ss 11:35AM 0:00.03 ./condor_master
condor 27615 0.0 -0.1 30228 1756 ?? Ss 11:35AM 0:00.05 condor_collector -f
condor 27616 0.0 -0.1 30108 1600 ?? Ss 11:35AM 0:00.07 condor_negotiator -f
condor 27660 0.0 -0.0 0 0 std UV+ 11:35AM 0:00.00 egrep condor_

As you can see, the condor_startd daemon is missing.

The MasterLog shows that the daemon exits shortly after being launched:

6/9 11:10:43 ******************************************************
6/9 11:10:43 ** condor_master (CONDOR_MASTER) STARTING UP
6/9 11:10:43 ** $CondorVersion: 6.6.5 May 3 2004 $
6/9 11:10:43 ** $CondorPlatform: PPC-DARWIN-6_8 $
6/9 11:10:43 ** PID = 12988
6/9 11:10:43 ******************************************************
6/9 11:10:43 Using config file: /Users/condor/Programmes/condor-6.6.5/etc/condor_config
6/9 11:10:43 Using local config files: /Users/condor/Programmes/condor-6.6.5/local.e6-g5/condor_config.local
6/9 11:10:43 DaemonCore: Command Socket at <172.18.45.82:62167>
6/9 11:10:43 Started DaemonCore process "/Users/condor/Programmes/condor-6.6.5/sbin/condor_schedd", pid and pgroup = 12989
6/9 11:10:43 Started DaemonCore process "/Users/condor/Programmes/condor-6.6.5/sbin/condor_startd", pid and pgroup = 12990
6/9 11:10:43 Create_Process:Failed to post listen on command socket(s) (port 9618)
6/9 11:10:43 ERROR: Create_Process failed trying to start /Users/condor/Programmes/condor-6.6.5/sbin/condor_collector
6/9 11:10:43 restarting /Users/condor/Programmes/condor-6.6.5/sbin/condor_collector in 10 seconds
6/9 11:10:43 Create_Process:Failed to post listen on command socket(s) (port 9614)
6/9 11:10:43 ERROR: Create_Process failed trying to start /Users/condor/Programmes/condor-6.6.5/sbin/condor_negotiator
6/9 11:10:43 restarting /Users/condor/Programmes/condor-6.6.5/sbin/condor_negotiator in 10 seconds
6/9 11:10:43 The STARTD (pid 12990) exited with status 4
6/9 11:10:43 Sending obituary for "/Users/condor/Programmes/condor-6.6.5/sbin/condor_startd"
6/9 11:10:43 restarting /Users/condor/Programmes/condor-6.6.5/sbin/condor_startd in 10 seconds
6/9 11:10:53 Started DaemonCore process "/Users/condor/Programmes/condor-6.6.5/sbin/condor_startd", pid and pgroup = 13045
6/9 11:10:53 Create_Process:Failed to post listen on command socket(s) (port 9614)
6/9 11:10:53 ERROR: Create_Process failed trying to start /Users/condor/Programmes/condor-6.6.5/sbin/condor_negotiator
6/9 11:10:53 restarting /Users/condor/Programmes/condor-6.6.5/sbin/condor_negotiator in 120 seconds
6/9 11:10:53 Create_Process:Failed to post listen on command socket(s) (port 9618)
6/9 11:10:53 ERROR: Create_Process failed trying to start /Users/condor/Programmes/condor-6.6.5/sbin/condor_collector
6/9 11:10:53 restarting /Users/condor/Programmes/condor-6.6.5/sbin/condor_collector in 120 seconds
6/9 11:10:53 The STARTD (pid 13045) exited with status 4
...

I saw that other users have run into this error, so I changed "START = " to "START = TRUE" in the local config file, but that did not help. Does anybody have an idea?

Thanks,

Jérôme