[HTCondor-users] Avoiding the docker interface for primary IP address



[ubuntu 14.04, condor 8.3.8-338845-deb7, lxc-docker 1.7.1]

I am trying to set up a test personal condor node for use with the docker universe. It initially had two interfaces:

br-lan    inet addr:192.168.56.13
br-wan    inet addr:10.0.2.15

and now because docker is there, it has created a third internal one:

docker0   inet addr:172.17.42.1

The machine's hostname is set to "trusty.ws.nsrc.org" and /etc/hosts includes the following mapping to what I expect to be the primary interface:

192.168.56.13    trusty.ws.nsrc.org trusty

The problem is: jobs are failing to start, and the reason is that Condor is apparently using the docker0 IP address when communicating with services on the same machine - and this is being refused. NegotiatorLog says:

09/01/15 16:38:00 Phase 4.1:  Negotiating with schedds ...
09/01/15 16:38:00 Negotiating with brian@xxxxxxxxxxxxxxxxxx at <172.17.42.1:52390?addrs=172.17.42.1-52390>
09/01/15 16:38:00 0 seconds so far
09/01/15 16:38:00 SECMAN: FAILED: Received "DENIED" from server for user unauthenticated@unmapped using method (no authentication).
09/01/15 16:38:00 ERROR: SECMAN:2010:Received "DENIED" from server for user unauthenticated@unmapped using method (no authentication).
09/01/15 16:38:00 Failed to send NEGOTIATE command to brian@xxxxxxxxxxxxxxxxxx (<172.17.42.1:52390?addrs=172.17.42.1-52390>)

And I also see:

# condor_status -l | grep 172
AddressV1 = "{[ p=\"primary\"; a=\"172.17.42.1\"; port=44341; n=\"Internet\"; ], [ p=\"IPv4\"; a=\"172.17.42.1\"; port=44341; n=\"Internet\"; ]}"
StartdIpAddr = "<172.17.42.1:44341?addrs=172.17.42.1-44341>"
MyAddress = "<172.17.42.1:44341?addrs=172.17.42.1-44341>"

The problem looks similar to https://www-auth.cs.wisc.edu/lists/htcondor-users/2015-July/msg00027.shtml
so I added to /etc/condor/condor_config.local:

IP_ADDRESS = 192.168.56.13
ALLOW_ADMINISTRATOR = *
ALLOW_OWNER = *
ALLOW_READ = *
ALLOW_WRITE = *
ALLOW_NEGOTIATOR = *
ALLOW_NEGOTIATOR_SCHEDD = *

and then shut down and restarted Condor. With these settings jobs now run. However, the advertised primary address is still the Docker one:

# condor_status -l | grep Addr
AddressV1 = "{[ p=\"primary\"; a=\"172.17.42.1\"; port=16100; n=\"Internet\"; ], [ p=\"IPv4\"; a=\"172.17.42.1\"; port=16100; n=\"Internet\"; ]}"
StartdIpAddr = "<172.17.42.1:16100?addrs=172.17.42.1-16100>"
MyAddress = "<172.17.42.1:16100?addrs=172.17.42.1-16100>"
HardwareAddress = "...."
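
The check I am doing by eye on that output could be scripted. Below is a minimal sketch in Python, not anything Condor provides. It assumes the address strings always have the `<ip:port?...>` IPv4 form shown above and that docker0 lives in 172.17.0.0/16; the names `advertised_ip`, `DOCKER_NET` and `EXPECTED` are made up for illustration.

```python
import ipaddress
import re

# Assumption: docker0 uses the 172.17.0.0/16 subnet on this host.
DOCKER_NET = ipaddress.ip_network("172.17.0.0/16")
# The address I expect Condor to advertise (from /etc/hosts).
EXPECTED = ipaddress.ip_address("192.168.56.13")


def advertised_ip(sinful):
    """Extract the IPv4 address from a string like
    "<172.17.42.1:16100?addrs=172.17.42.1-16100>".
    Returns None if the string does not have that shape."""
    m = re.match(r"<(\d{1,3}(?:\.\d{1,3}){3}):\d+", sinful)
    return ipaddress.ip_address(m.group(1)) if m else None


if __name__ == "__main__":
    # MyAddress value copied from the condor_status output above.
    addr = advertised_ip("<172.17.42.1:16100?addrs=172.17.42.1-16100>")
    print("advertised:", addr)
    print("on docker0 subnet:", addr in DOCKER_NET)
    print("matches expected:", addr == EXPECTED)
```

For the MyAddress value above, this reports that the address is on the docker0 subnet and does not match the expected one, which is exactly the symptom.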

# netstat -natp | grep 172
tcp        0      0 172.17.42.1:53    0.0.0.0:*               LISTEN      2157/named
tcp        0      0 172.17.42.1:34365 172.17.42.1:49112       ESTABLISHED 1842/condor_negotia
tcp        0      0 172.17.42.1:44959 172.17.42.1:49112       TIME_WAIT   -
tcp        0      0 172.17.42.1:49112 172.17.42.1:34365       ESTABLISHED 1843/condor_schedd

How can I fix this? I can see that condor is listening on all interfaces:

# netstat -natp | grep condor | grep LISTEN
tcp        0      0 0.0.0.0:18644     0.0.0.0:*               LISTEN      1843/condor_schedd
tcp        0      0 0.0.0.0:49112     0.0.0.0:*               LISTEN      1843/condor_schedd
tcp        0      0 0.0.0.0:63712     0.0.0.0:*               LISTEN      1825/condor_master
tcp        0      0 0.0.0.0:62305     0.0.0.0:*               LISTEN      1833/condor_collect
tcp        0      0 0.0.0.0:16100     0.0.0.0:*               LISTEN      1844/condor_startd
tcp        0      0 0.0.0.0:43781     0.0.0.0:*               LISTEN      1843/condor_schedd
tcp        0      0 0.0.0.0:49356     0.0.0.0:*               LISTEN      1842/condor_negotia
tcp        0      0 0.0.0.0:9618      0.0.0.0:*               LISTEN      1833/condor_collect

but I'd prefer the advertised address to be one of the "real" interfaces, not the docker0 one which is behind NAT and hence unreachable from elsewhere.

Thanks,

Brian.