
[Condor-users] Condor Version 6.9.0 X86_64 - GCB clients fail to start



Hi All,

I'm using the X86_64 build of Condor version 6.9.0 and I'm trying to set up
GCB.
I've configured my central_master machine according to the documentation,
and it successfully starts the gcb_broker and the gcb_relay_server.
My network layout is as follows: one NIC on the central_master is connected
to a switch which, for now, serves 10 boxes. These boxes are all configured
as execute machines.
This works like a charm (without GCB). This little network is private; the
other NIC on the central_master is on a public network. On this public
network, among many other things, is my notebook, dying to submit a job to
the cluster.
Without GCB this won't work, since my laptop on the public side can't
connect directly to the execute machines on the private side, and so the
files necessary to start the job can't be transferred. GCB jumps to mind.

Well, when I add the following lines to my notebook's configuration (or to
that of one of the execute machines):

NET_REMAP_ENABLE = true
NET_REMAP_SERVICE = gcb
NET_REMAP_INAGENT = ip.address.of.gcbserver

I get the following error in the MasterLog and, as you guessed, Condor
won't start.

12/7 22:11:45 ******************************************************
12/7 22:11:45 ** condor_master (CONDOR_MASTER) STARTING UP
12/7 22:11:45 ** /opt/condor/sbin/condor_master
12/7 22:11:45 ** $CondorVersion: 6.9.0 Oct 18 2006 $
12/7 22:11:45 ** $CondorPlatform: X86_64-LINUX_RHEL3 $
12/7 22:11:45 ** PID = 21763
12/7 22:11:45 ** Log last touched 12/7 22:11:43
12/7 22:11:45 ******************************************************
12/7 22:11:45 Using config source: /opt/condor/etc/condor_config
12/7 22:11:45 Using local config sources:
12/7 22:11:45    /var/condor/condor_config.local
12/7 22:11:45 GCB: GCB_bind: _myIP failed
12/7 22:11:45 Failed to bind to command ReliSock
12/7 22:11:45 (Make sure your IP address is correct in /etc/hosts.)
12/7 22:11:45 ERROR "BindAnyCommandPort failed" at line 6808 in file daemon_core.C
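
The hint in the log points at /etc/hosts, so one quick sanity check is whether this machine's hostname resolves to a non-loopback address. Below is a minimal Python sketch of that check. The function name diagnose_local_address is hypothetical and not part of Condor or GCB, and it is only an assumption that a loopback or unresolvable hostname is related to the GCB_bind failure; the log does not confirm the cause.

```python
import ipaddress
import socket


def diagnose_local_address(hostname=None, resolve=socket.gethostbyname):
    """Return (hostname, address, problem); problem is None if nothing looks wrong.

    `resolve` is injectable so the check can be exercised without touching DNS
    or /etc/hosts. This is an illustrative helper, not a Condor/GCB API.
    """
    name = hostname or socket.gethostname()
    try:
        addr = resolve(name)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return name, None, f"cannot resolve {name!r}: {exc}"
    # Any address in 127.0.0.0/8 counts as loopback.
    if ipaddress.ip_address(addr).is_loopback:
        return name, addr, f"{name!r} resolves to loopback address {addr}"
    return name, addr, None


if __name__ == "__main__":
    name, addr, problem = diagnose_local_address()
    print(problem or f"{name} resolves to {addr}")
```

If the script reports a loopback address or a resolution failure, the hostname entry in /etc/hosts would be the first thing to look at, as the log message itself suggests.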

I've tried googling this error but nothing came up. Has anybody seen this
before? Any suggestions?

Kind Regards,

Cor

-- 
A lie told often enough becomes the truth.

Lenin (1870 - 1924)