
Re: [HTCondor-users] execute hosts advertise loopback address



Just chiming in quickly to say I have a similar-sounding issue on my Win10 Condor clients, running Condor 8.8.3.  I have a quick audit script running across my machines at present, and around 12 percent of them were advertising 127.0.0.1 today.

Running 'condor_config_val IP_ADDRESS' on an affected machine always returns the correct IP address.

It seems to be related to machines coming out of sleep.  A service restart or PC restart always fixes it, and honestly all I've done with it so far is to automate a restart of the Condor service if the client's 'shared_port_ad' file has the loopback address in it.
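
A minimal sketch of that kind of workaround, in Python, might look like the following. This is not the actual script used here. The path to the 'shared_port_ad' file is passed in by the caller, the Windows service name 'condor' is an assumption, and the helper names has_loopback and restart_service are made up for illustration.

```python
import ipaddress
import re
import subprocess
import sys

# Matches dotted-quad tokens such as 127.0.0.1 anywhere in the ad text.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def has_loopback(ad_text):
    """Return True if any IPv4 address found in the text is a loopback address."""
    for token in IPV4_RE.findall(ad_text):
        try:
            if ipaddress.ip_address(token).is_loopback:
                return True
        except ValueError:
            # Not a valid address (e.g. an octet above 255); ignore it.
            continue
    return False


def restart_service(name="condor"):
    """Restart a Windows service; the name 'condor' is an assumption."""
    subprocess.run(["net", "stop", name], check=False)
    subprocess.run(["net", "start", name], check=True)


# Only acts when a file path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        if has_loopback(fh.read()):
            restart_service()
```

Run from a scheduled task with the ad file path as the only argument, this restarts the service only when a loopback address is present in the file.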

We're back on campus now though, with a little time on our hands, so I hope to investigate this properly in the near future.  I'll report any findings here.

Cheers, Craig

On 22/05/2020, at 11:23 PM, Smith, Ian <I.C.Smith@xxxxxxxxxxxxxxx> wrote:

Hi again,

I've not been able to log in directly to the execute hosts yet, although I hope to try remote login next week. I have, though, tried getting the
config values remotely using something like this:

$ condor_config_val -address "<138.253.107.4:9612>" IP_ADDRESS

I'm assuming that this does actually contact the startd on the execute host to retrieve the info?

On the machines that advertise the loopback address I do see a value of 127.0.0.1 most of the time (although sometimes it is UNDEFINED).  Also
I get:

$ condor_config_val -address "<138.253.107.4:9612>" -dump NETWORK
# Configuration from master on (null) <138.253.107.4:9612>

# Parameters with names that match NETWORK:
NETWORK_HOSTNAME =
NETWORK_INTERFACE = *
NETWORK_MAX_PENDING_CONNECTS = 0
PRIVATE_NETWORK_INTERFACE =
PRIVATE_NETWORK_NAME = $(FULL_HOSTNAME)
VM_NETWORKING = false
VM_NETWORKING_DEFAULT_TYPE = nat
VM_NETWORKING_MAC_PREFIX =
VM_NETWORKING_TYPE = nat
VMWARE_BRIDGE_NETWORKING_TYPE = bridged
VMWARE_NAT_NETWORKING_TYPE = nat
VMWARE_NETWORKING_TYPE = nat
# Contributing configuration file(s):
#       <Default>
#       C:\Condor\condor_config

for all the hosts (working properly or not). 
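
The remote IP_ADDRESS query above could be scripted across many hosts roughly as follows. This is only a sketch: it runs the same 'condor_config_val -address ... IP_ADDRESS' command shown earlier, and it assumes that any output which does not parse as an IP address (for example an undefined value) should be reported as "undefined". The function names classify and query_ip_address are hypothetical.

```python
import ipaddress
import subprocess


def classify(output):
    """Classify condor_config_val output as 'ok', 'loopback' or 'undefined'."""
    value = output.strip()
    try:
        addr = ipaddress.ip_address(value)
    except ValueError:
        # Assumption: anything that is not a bare IP address counts as undefined.
        return "undefined"
    return "loopback" if addr.is_loopback else "ok"


def query_ip_address(sinful):
    """Ask the daemon at the given address string for its IP_ADDRESS value."""
    result = subprocess.run(
        ["condor_config_val", "-address", sinful, "IP_ADDRESS"],
        capture_output=True,
        text=True,
        check=False,
    )
    return classify(result.stdout)
```

For example, query_ip_address("<138.253.107.4:9612>") would return one of the three labels for that host, which makes it easy to tally how many machines are affected after each wake-up.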


Interestingly, some hosts on the same subnet advertise the loopback whereas others advertise the correct address, and *this is not
consistent*. On subsequent forced wake-ups I see different machines advertising the correct/incorrect address. This strongly
suggests a race condition at service startup to me, and I'll see if I can check this by remotely starting the htcondor service.

One other thing - is it possible to get the daemons to bind to a specific interface in Windows - similar to eth0 in Linux?

thanks again,

-ian.

  


From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Todd L Miller <tlmiller@xxxxxxxxxxx>
Sent: 21 May 2020 17:59
To: HTCondor-Users Mail List
Subject: Re: [HTCondor-users] execute hosts advertise loopback address
 
> The comment about the service startup order is interesting. If this 
> isn't explicitly set then I could imagine a race condition between 
> htcondor and the network service which would explain why some machines 
> get the correct interface address and some get the loopback. I'll get 
> back to you when I have some more information.

         On Linux, this was the cause of a lot of problems with HTCondor 
advertising loopback addresses, particularly because some distributions 
considered the network to be up when the loopback interface was ready, not 
when DHCP (or whatever) had finished.  I don't know what the case is on 
Windows.
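
One way to test the race-condition idea is to delay the service start until the host has a non-loopback address. The Python sketch below illustrates that, with some assumptions: it uses a UDP connect to a documentation-only address (192.0.2.1) to ask the OS which local address it would route from, and the retry count and delay are arbitrary. The function names are hypothetical, and this is not something HTCondor itself provides.

```python
import ipaddress
import socket
import time


def outbound_ipv4(target="192.0.2.1"):
    """Return the local IPv4 address the OS would use to reach target, or None."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        try:
            # Connecting a UDP socket selects a route without sending packets.
            s.connect((target, 9))
        except OSError:
            return None  # No usable route yet.
        return s.getsockname()[0]


def wait_for_real_address(probe=outbound_ipv4, attempts=30, delay=2.0,
                          sleep=time.sleep):
    """Poll until probe() yields a non-loopback address; return it, or None."""
    for _ in range(attempts):
        addr = probe()
        if addr:
            ip = ipaddress.ip_address(addr)
            if not (ip.is_loopback or ip.is_unspecified):
                return addr
        sleep(delay)
    return None
```

A wrapper could call wait_for_real_address() and start the service only once it returns an address. If the loopback advertisements then stop, that would support the startup-ordering explanation.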

- ToddM
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/