
[Condor-users] GCB Socket connection issues (others never had this resolved?)



Hi all,

I'm trying to set up flocking between two small Condor pools as a test case for a more complicated arrangement. I'm using a GCB broker to mediate connections, because one of the pools is essentially private (or at least firewalled enough that it can be regarded as such).

However, after enabling the NET_REMAP variables in condor_config, the Condor services no longer run on those nodes; the MasterLog shows errors like:

GCB: GCB_bind(6[GCB_SOCKET], 0x7fbffff910-><0.0.0.0:0>): Unable to determine local IP address (_myIP failed).
8/13 11:49:19 Failed to bind to command ReliSock
8/13 11:49:19 (Make sure your IP address is correct in /etc/hosts.)
8/13 11:49:19 ERROR "BindAnyCommandPort() failed" at line 8403 in file daemon_core.C

My /etc/hosts contains an entry for localhost and also (in a bid to fix this issue) an explicit mapping for the network IP of the machine.
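For anyone wanting to reproduce the check, a minimal sketch of a resolver sanity test follows. It only reports what the system resolver returns for the local hostname and whether that address is a loopback one. It assumes, without confirmation, that the GCB failure is related to this lookup; it does not reproduce GCB's own _myIP logic, and the function names are hypothetical.

```python
import ipaddress
import socket


def is_loopback(ip):
    """Return True if the given IP address string is a loopback address."""
    return ipaddress.ip_address(ip).is_loopback


def resolve_local_address(hostname=None):
    """Return (hostname, ip) as seen by the system resolver.

    The ip is None if the hostname does not resolve at all.
    """
    name = hostname or socket.gethostname()
    try:
        return name, socket.gethostbyname(name)
    except socket.gaierror:
        return name, None


if __name__ == "__main__":
    name, ip = resolve_local_address()
    if ip is None:
        print(f"{name} does not resolve; check /etc/hosts")
    elif is_loopback(ip):
        print(f"{name} resolves to loopback address {ip}")
    else:
        print(f"{name} resolves to {ip}")
```

If the hostname resolves to a loopback address or not at all, the explicit /etc/hosts mapping is probably not being picked up; if it resolves to the expected network IP, the cause likely lies elsewhere.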

I notice in the archives that Cor Cornelisse had similar problems, which they determined to be specific to certain NICs. However, no conclusion was ever posted on any thread I can find, despite the developers expressing an interest in looking into the matter last year (see, for example, https://lists.cs.wisc.edu/archive/condor-users/2007-January/msg00179.shtml ).

Does anyone know what the issue is with this? Cor, did you solve the issue yourself?
Or is this not, in fact, the same problem as mine?

Any assistance welcome,
Sam Skipsey