
Re: [HTCondor-users] execute hosts advertise loopback address

Hi Ian


Have you tried putting ip subnet info in NETWORK_INTERFACE, rather than just *?


e.g. NETWORK_INTERFACE = 138.253.*


I think in the dim dark past we had a similar intermittent issue but have never had
problems since adding our network subnets, at least on our windows machines.
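
To illustrate what a subnet pattern such as 138.253.* selects compared with a bare *, here is a minimal Python sketch. It only demonstrates shell-style wildcard matching on address strings with the standard fnmatch module; it is not HTCondor's own interface-selection code, and the candidate addresses (apart from the 138.253 prefix) are made up for illustration.

```python
from fnmatch import fnmatchcase


def matches_interface_pattern(address: str, pattern: str) -> bool:
    """Return True if a dotted IPv4 string matches a shell-style wildcard."""
    return fnmatchcase(address, pattern)


# Hypothetical addresses a multi-homed host might have.
candidates = ["127.0.0.1", "138.253.10.20", "10.0.0.5"]

# A subnet pattern narrows the choice to addresses on that network.
selected = [a for a in candidates if matches_interface_pattern(a, "138.253.*")]
print(selected)

# A bare "*" matches every address string, the loopback address included.
print([a for a in candidates if matches_interface_pattern(a, "*")])
```

The point is simply that a subnet pattern cannot textually match 127.0.0.1, whereas * can.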


Linux VMs (VMWare, vSphere, ESX servers) still require a cron job to check the
condor network binding as they occasionally come up bound to the loopback
address after outages/rebooting.






From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Smith, Ian
Sent: Thursday, 28 May 2020 4:48 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] execute hosts advertise loopback address


Hello Again,


I've now had the chance to log in remotely to a few of the Windows execute hosts and find pretty much the same as below.



condor_config_val IP_ADDRESS


always returns the correct IP address even if the loopback address is advertised.  On restarting the HTCondor service the
correct address then gets advertised (this seems to be repeatable).


The service is set as Automatic (delayed start) with a dependency on DHCP. If anyone knows a way of delaying this further (or
restarting it automatically), I'd be grateful to hear it.


As a workaround, I'm going to set things up so that I can restart the HTCondor processes on the execute hosts remotely where
machines advertise the loopback address. Not ideal - but hopefully an improvement.
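
One way to find the machines that need a restart might be to list what each host advertises and pick out the loopback entries. The sketch below is only an illustration under assumptions: it expects text in the shape produced by something like `condor_status -af Machine MyAddress` (one "machine <ip:port?...>" pair per line), handles IPv4 addresses only, and the host names in the sample are invented.

```python
import ipaddress
from typing import List


def hosts_advertising_loopback(status_output: str) -> List[str]:
    """Return machine names whose advertised IPv4 address is a loopback address."""
    hosts: List[str] = []
    for line in status_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        machine, address = parts
        # Assumed address shape: <ip:port?params>; keep only the ip part.
        ip_text = address.strip().lstrip("<").split(":", 1)[0]
        try:
            is_loopback = ipaddress.ip_address(ip_text).is_loopback
        except ValueError:
            continue  # not a plain IPv4 address (e.g. bracketed IPv6)
        if is_loopback and machine not in hosts:
            hosts.append(machine)  # one entry per machine, even with many slots
    return hosts


# Invented sample output, for illustration only.
sample = """\
pc-001.example.org <127.0.0.1:9618?sock=startd_1>
pc-002.example.org <138.253.10.20:9618?sock=startd_2>
pc-001.example.org <127.0.0.1:9618?sock=startd_1>
"""
print(hosts_advertising_loopback(sample))
```

The resulting list could then be fed to whatever remote restart mechanism is available on the site.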






From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Craig Parker <craig.parker@xxxxxxxxx>
Sent: 26 May 2020 04:37
To: HTCondor-Users Mail List
Subject: Re: [HTCondor-users] execute hosts advertise loopback address


Just chiming in quickly to say I have a similar sounding issue on my Win10 Condor clients, running Condor 8.8.3.  I have a quick audit script running across my machines at present, and I had around 12 percent of them advertising today.


Running 'condor_config_val IP_ADDRESS' on an affected machine always returns the correct IP address.


It seems to be related to machines coming out of sleep.  A service restart or PC restart always fixes it, and honestly all I’ve done with it so far is to automate a restart of the Condor service if the client's 'shared_port_ad' file has the loopback address in it.  
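
A check of this kind could look roughly like the following Python sketch. It is not the actual script: the ad line in the example is a guess at what such a file might contain, the file location is left to the caller, and the service name "condor" used with `net stop` / `net start` is an assumption about the Windows install.

```python
import ipaddress
import re
import subprocess
from typing import List

# Dotted-quad candidates; validity is checked by ipaddress below.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_loopback_addresses(text: str) -> List[str]:
    """Return every IPv4 loopback address that appears in the given text."""
    found = []
    for candidate in IPV4_RE.findall(text):
        try:
            address = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # e.g. 999.1.1.1 looks like an address but is not one
        if address.is_loopback:
            found.append(candidate)
    return found


def restart_condor_service() -> None:
    """Restart the service; the name 'condor' is an assumption."""
    subprocess.run(["net", "stop", "condor"], check=False)
    subprocess.run(["net", "start", "condor"], check=True)


# Hypothetical ad content, for illustration only.
bad_ad = 'MyAddress = "<127.0.0.1:9618?sock=1234_abcd>"'
good_ad = 'MyAddress = "<138.253.10.20:9618?sock=1234_abcd>"'
print(find_loopback_addresses(bad_ad))
print(find_loopback_addresses(good_ad))
```

A scheduled task could read the ad file, call find_loopback_addresses on its contents, and call restart_condor_service only when the result is non-empty.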


We’re back on campus now though, with a little time on our hands, so I hope to investigate this properly in the near future.  I’ll report any findings here.


Cheers, Craig

On 22/05/2020, at 11:23 PM, Smith, Ian <I.C.Smith@xxxxxxxxxxxxxxx> wrote:


Hi again,


I've not been able to log in directly to the execute hosts yet although I hope to try remote login next week. I have though tried getting the
config values remotely using something like this:


$ condor_config_val -address "<>" IP_ADDRESS


I'm assuming that this does actually contact the startd on the execute host to retrieve the info ?


On the machines that advertise the loopback address I do see a value of most of the time (although sometimes it is UNDEFINED).  Also

I get:


$ condor_config_val -address "<>" -dump NETWORK

# Configuration from master on (null) <>


# Parameters with names that match NETWORK:













# Contributing configuration file(s):

#       <Default>

#       C:\Condor\condor_config


for all the hosts (working properly or not). 


Interestingly some hosts on the same subnet advertise the loopback whereas others advertise the correct address and *this is not
consistent*. On subsequent forced wake ups I see different machines advertising the correct/incorrect address. This strongly
suggests a race condition on the service start up to me and I'll see if I can check this by remotely starting the htcondor service.
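
If the cause is a start-up race, one possible mitigation is a small wrapper that waits until the host has a non-loopback address before the service is started. The following Python sketch shows the idea only; it is not part of HTCondor. The probe uses the common trick of connecting a UDP socket (which sends no packet) to see which local address the OS would pick, the target 192.0.2.1 is a documentation-range placeholder, and the timeout and interval values are arbitrary.

```python
import ipaddress
import socket
import time
from typing import Callable, Optional


def probe_outbound_ipv4(target: str = "192.0.2.1") -> Optional[str]:
    """Return the local IPv4 address the OS would use to reach target, if any."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.connect((target, 9))  # UDP connect only selects a route
            return sock.getsockname()[0]
    except OSError:
        return None  # no route yet, network not ready


def wait_for_routable_address(
    probe: Callable[[], Optional[str]] = probe_outbound_ipv4,
    timeout: float = 120.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll until probe() yields a usable address, or return None on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        text = probe()
        if text:
            address = ipaddress.ip_address(text)
            if not (address.is_loopback or address.is_unspecified):
                return text
        if time.monotonic() >= deadline:
            return None
        sleep(interval)
```

A wrapper would call wait_for_routable_address() and start the service only once it returns an address; passing a different probe makes the loop easy to exercise without a network.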


One other thing - is it possible to get the daemons to bind to a specific interface in Windows - similar to eth0 in Linux ?


thanks again,




From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Todd L Miller <tlmiller@xxxxxxxxxxx>
Sent: 21 May 2020 17:59
To: HTCondor-Users Mail List
Subject: Re: [HTCondor-users] execute hosts advertise loopback address


> The comment about the service startup order is interesting. If this 
> isn't explicitly set then I could imagine a race condition between 
> htcondor and the network service which would explain why some machines 
> get the correct interface address and some get the loopback. I'll get 
> back to you when I have some more information.

         On Linux, this was the cause of a lot of problems with HTCondor 
advertising loopback addresses, particularly because some distributions 
considered the network to be up when the loopback interface was ready, not 
when DHCP (or whatever) had finished.  I don't know what the case is on 

- ToddM
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting

The archives can be found at:
