
Re: [Condor-users] Condor Version 6.9.0 X86_64 - GCB clients fail to start



> Cor Cornelisse <ccorneli@xxxxxxxx> wrote:
>> 12/7 22:11:45 GCB: GCB_bind: _myIP failed
>
> The most likely cause is that your machine (the one with the
> master) doesn't have any active IP addresses beyond loopback
> (127.0.0.1).  That seems plausible on your laptop if you tried to
> start Condor before attaching to a network.
>
> That doesn't explain why you would see that error message on your
> execute nodes, which presumably are working fine.  To take a wild
> guess, are you starting Condor in your init scripts?  If so, is
> Condor possibly higher priority than initializing the network?
> Having Condor start before the network is up is a recipe for
> problems.
>
> If that's not the case for your execute nodes, you might want to
> double check that you're not seeing a different error.
>
> --
> Alan De Smet                              Condor Project Research
> adesmet@xxxxxxxxxxx                 http://www.condorproject.org/

Hi,

I'm sure networking is up before Condor starts. I do start the service
through init scripts, but to test your hypothesis I simply restarted the
Condor service, which produced the same error. So the network is
definitely up and running. I set the master log debug option to D_ALL,
which gives a little more debug information, but still not enough for me
to understand what's going wrong (it looks like it's trying to bind to
0.0.0.0 :s)

Anyone?

12/8 18:29:12 (fd:3) (pid:4559) Using config source: /opt/condor/etc/condor_config
12/8 18:29:12 (fd:3) (pid:4559) Using local config sources:
12/8 18:29:12 (fd:3) (pid:4559)    /var/condor/condor_config.local
12/8 18:29:12 (fd:5) (pid:4559) Attempting to lock /tmp/condor-lock.portal0.998036533202143/InstanceLock.
12/8 18:29:12 (fd:6) (pid:4559) Obtained lock on /tmp/condor-lock.portal0.998036533202143/InstanceLock.
12/8 18:29:12 (fd:6) (pid:4559) Setting up command socket
12/8 18:29:12 (fd:6) (pid:4559) CONDOR_INHERIT: is NULL
12/8 18:29:12 (fd:7) (pid:4559) GCB: GCB_socket(fd = 6, TCP)
12/8 18:29:12 (fd:7) (pid:4559) PRIV_CONDOR --> PRIV_ROOT at sock.C:526
12/8 18:29:12 (fd:7) (pid:4559) GCB: GCB_bind(6[GCB_SOCKET], <0.0.0.0:0>)
12/8 18:29:12 (fd:7) (pid:4559) GCB: GCB_bind: _myIP failed
12/8 18:29:12 (fd:7) (pid:4559) PRIV_ROOT --> PRIV_CONDOR at sock.C:532
12/8 18:29:12 (fd:7) (pid:4559) bind failed errno = 0
12/8 18:29:12 (fd:7) (pid:4559) Failed to bind to command ReliSock
12/8 18:29:12 (fd:7) (pid:4559) (Make sure your IP address is correct in /etc/hosts.)
12/8 18:29:12 (fd:7) (pid:4559) ERROR "BindAnyCommandPort failed" at line 6808 in file daemon_core.C
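
Since the log hints at /etc/hosts, one thing that can be checked
independently of Condor is whether this machine's hostname resolves to
anything other than a loopback address. Below is a minimal Python sketch
of such a check. It uses only the standard library; the function names
are made up for illustration, and it does not reproduce what GCB or
Condor do internally to pick their own IP address, so treat it as a rough
diagnostic only.

```python
import ipaddress
import socket


def drop_loopback(addresses):
    """Return the given IP address strings minus loopback ones (127.0.0.0/8, ::1)."""
    return [a for a in addresses if not ipaddress.ip_address(a).is_loopback]


def non_loopback_addresses(hostname):
    """Resolve hostname to its IPv4 addresses and drop the loopback ones.

    Returns an empty list if the name does not resolve at all.
    (Hypothetical helper for illustration, not part of Condor or GCB.)
    """
    try:
        _, _, addresses = socket.gethostbyname_ex(hostname)
    except OSError:  # socket.gaierror and socket.herror both derive from OSError
        return []
    return drop_loopback(addresses)


if __name__ == "__main__":
    name = socket.gethostname()
    usable = non_loopback_addresses(name)
    if usable:
        print(f"{name} resolves to: {', '.join(usable)}")
    else:
        print(f"{name} resolves only to loopback (or not at all); check /etc/hosts")
```

If this reports only loopback, the hostname is probably mapped to a
127.x.x.x entry in /etc/hosts (some distributions add such a line by
default), which would at least be consistent with the hint in the log.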


-- 
A lie told often enough becomes the truth.

Lenin (1870 - 1924)