[Condor-users] New User: Condor_status problem
Hi,
I've just installed Condor (6.8.5) on a small test network of two PCs (both
running Windows XP). PC1 is the central manager and only submits jobs, whilst
PC2 executes them. When I run 'condor_status', I only see PC2 in the pool.
Running 'condor_status -direct <PC1_name>' gives me the error:
"CEDAR:6001:Failed to connect to <192.168.xx.xxx:2907>"
"Error: Couldn't contact the condor_collector on <192.168.xx.xxx:2907>
Extra Info...."
It seems strange to me that the collector (running on PC1) is communicating
with PC2, but not with PC1 itself.
I have checked that the condor_collector is running (it is). I have
turned off the firewalls on both PCs, tried manually opening port 2907
on PC1, checked that DAEMON_LIST in my condor_config file contains all
the daemons (master, collector, negotiator, schedd, and startd for PC2),
checked COLLECTOR_HOST (which is set to $(CONDOR_HOST), i.e. PC1), and
confirmed that HOSTALLOW_READ/WRITE both contain PC1. Still no good. I also
checked the log files; the last few lines from CollectorLog and MasterLog
are attached at the end.
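A plain TCP connect test can confirm whether anything is accepting connections on the port named in the error above, independent of Condor. A minimal sketch in Python; the function name `can_connect` and the placeholder host in the usage comment are assumptions for illustration, and 2907 is simply the port quoted in the error:

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False

# Hypothetical usage (substitute PC1's real address for the placeholder):
#   can_connect("PC1", 2907)
```

A False result from both PCs would suggest nothing is listening on that port at all; True from PC1 but False from PC2 would point back at filtering between the machines.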
Also, after I do a 'condor_restart', executing 'condor_status' just echoes a
blank line (on both PC1 and PC2) for about 15 minutes or so, after which it
reverts to showing me PC2 in the pool.
I have read and re-read the manual and the user-lists and can't find any
answers. Would really appreciate any ideas on what I may be doing wrong.
Many thanks in advance.
Bobane
Bristol -UK.
.................................
NegotiatorAd : Inserting ** "< PC1.mydomain >"
7/27 17:28:38 DC_AUTHENTICATE: attempt to open invalid session PC1:3716:xxxxxxxxxx:11, failing.
7/27 17:32:46 (Sending 7 ads in response to query)
7/27 17:32:46 Got QUERY_STARTD_PVT_ADS
7/27 17:32:46 (Sending 1 ads in response to query)
7/27 17:32:46 NegotiatorAd : Inserting ** "< PC1.mydomain>"
7/27 17:33:17 Got QUERY_STARTD_ADS
7/27 17:33:17 (Sending 1 ads in response to query)
7/27 17:33:37 ** Master < mesh1 > rejuvenated from recently down
7/27 17:33:37 stats: Inserting new hashent for 'Master':'PC2':'192.168.xx.xxx'
7/27 17:37:46 (Sending 8 ads in response to query)
7/27 17:37:46 Got QUERY_STARTD_PVT_ADS
7/27 17:37:46 (Sending 1 ads in response to query)
7/27 17:37:46 NegotiatorAd : Inserting ** "< PC1.mydomain >"
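To see at a glance which kinds of entries the collector recorded, an excerpt like the one above can be tallied by message prefix. A rough Python sketch, assuming each entry starts with an `M/D HH:MM:SS` timestamp as in the lines above; the function name `tally_inserts` and the regular expression are illustrative only and not part of Condor:

```python
import re
from collections import Counter

# Assumed entry shape: "7/27 17:32:46 NegotiatorAd : Inserting ** ..."
INSERT_RE = re.compile(r"^\d+/\d+ \d+:\d+:\d+ (\w+)\s*: Inserting")

def tally_inserts(lines):
    """Count 'Inserting' messages per prefix (e.g. NegotiatorAd, stats)."""
    counts = Counter()
    for line in lines:
        match = INSERT_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Entries without a leading timestamp (such as a line cut off at the start of an excerpt) are skipped rather than guessed at.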
And from MasterLog:
7/27 17:14:01 ******************************************************
7/27 17:14:01 ** condor_master (CONDOR_MASTER) STARTING UP
7/27 17:14:01 ** C:\condor\bin\condor_master.exe
7/27 17:14:01 ** $CondorVersion: 6.8.5 May 17 2007 $
7/27 17:14:01 ** $CondorPlatform: INTEL-WINNT50 $
7/27 17:14:01 ** PID = 1204
7/27 17:14:01 ** Log last touched 7/27 17:13:46
7/27 17:14:01 ******************************************************
7/27 17:14:01 Using config source: C:\condor\condor_config
7/27 17:14:01 Using local config sources:
7/27 17:14:01 C:\condor/condor_config.local
7/27 17:14:01 DaemonCore: Command Socket at <192.168.0.xxx>
7/27 17:14:01 Started DaemonCore process "C:\condor/bin/condor_collector.exe", pid and pgroup = 2204
7/27 17:14:01 Started DaemonCore process "C:\condor/bin/condor_negotiator.exe", pid and pgroup = 1332
7/27 17:14:01 Started DaemonCore process "C:\condor/bin/condor_schedd.exe", pid and pgroup = 2544
BEGIN:VCARD
VERSION:2.1
N:Tameh;Eustace K
FN:Eustace K Tameh
EMAIL;PREF;INTERNET:Tek.Tameh@xxxxxxxxxxxxx
REV:20070727T165454Z
END:VCARD