Re: [HTCondor-users] execute hosts advertise loopback address

Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab

If you log in to one of these machines and run

condor_config_val IP_ADDRESS

is the result 127.0.0.1?

This would indicate that HTCondor is unable to determine which interface is external, or that it has been explicitly configured to bind only to the loopback.
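
To run this check across several machines, the output of the command above can be classified with a short script. This is only a sketch: the helper names `is_loopback` and `advertised_ip` are made up for illustration, and it assumes `condor_config_val` is on the PATH and prints a bare address.

```python
import ipaddress
import subprocess


def is_loopback(value: str) -> bool:
    """Return True if value parses as a loopback address (127.0.0.0/8 or ::1)."""
    try:
        return ipaddress.ip_address(value.strip()).is_loopback
    except ValueError:
        # Not an IP address at all (empty output, error text, a hostname, ...).
        return False


def advertised_ip() -> str:
    """Hypothetical helper: return what `condor_config_val IP_ADDRESS` prints."""
    result = subprocess.run(
        ["condor_config_val", "IP_ADDRESS"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

On an affected execute host, `is_loopback(advertised_ip())` would be expected to return True.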

Try

condor_config_val -dump NETWORK

Is NETWORK_INTERFACE set to something?

Do the public interfaces of these machines perhaps have IPv4 disabled, so they are IPv6 only?

A newer HTCondor like 8.8.9 will have better support for IPv6, including the ability to prefer it or to prefer IPv4.

If you restart condor on the machine, does it continue to advertise the loopback? If so, the problem may be that the network is not initialized, so only the loopback can be found at the time that condor starts up.

You might also want to check in the services control panel to make sure that HTCondor is not started until after the network service. This should be set up automatically by the MSI installer package, but it's worth checking.
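
One way to test the "network not ready yet" theory is to poll, right after the machine wakes, until the host resolves to something other than the loopback. The following is a rough diagnostic sketch, not part of HTCondor: the function names, the timeout and the polling interval are all assumptions, and it relies on the local hostname resolving through the normal system resolver.

```python
import ipaddress
import socket
import time


def non_loopback_addresses(hostname=None, resolve=socket.getaddrinfo):
    """List the non-loopback addresses the given (or local) hostname resolves to."""
    hostname = hostname or socket.gethostname()
    try:
        infos = resolve(hostname, None)
    except OSError:
        # Resolution failed, e.g. the network stack is not up yet.
        return []
    found = []
    for info in infos:
        # info[4] is the sockaddr tuple; drop any IPv6 scope id ("%eth0").
        text = info[4][0].split("%", 1)[0]
        try:
            ip = ipaddress.ip_address(text)
        except ValueError:
            continue
        if not ip.is_loopback and str(ip) not in found:
            found.append(str(ip))
    return found


def wait_for_network(timeout=60.0, interval=2.0,
                     probe=non_loopback_addresses, sleep=time.sleep):
    """Poll until probe() returns an address or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        addresses = probe()
        if addresses:
            return addresses
        if time.monotonic() >= deadline:
            return []
        sleep(interval)
```

If `wait_for_network()` takes noticeably long to return after a resume from hibernation, that would be consistent with the daemons starting before the interface is usable.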

-tj

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Smith, Ian
Sent: Thursday, May 21, 2020 8:49 AM
To: htcondor-users@xxxxxxxxxxx
Subject: [HTCondor-users] execute hosts advertise loopback address

Hello All,

I have come across a very strange problem with our HTCondor pool whereby *some* execute hosts advertise the loopback address as the address of the startd, as evidenced by this from CollectorLog:

05/21/20 14:09:35 StartdAd : Inserting ** "< slot1@xxxxxxxxxxxxxxxxxxxxxxxx , 127.0.0.1 >"
05/21/20 14:09:35 StartdPvtAd : Inserting ** "< slot1@xxxxxxxxxxxxxxxxxxxxxxxx , 127.0.0.1 >"
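
To see which hosts are affected, lines like the two above can be pulled out of CollectorLog with a short script. This is only a sketch: the regular expression is inferred from these two sample lines rather than from any documented log format, and the names `INSERT_RE` and `loopback_ads` are made up for illustration.

```python
import ipaddress
import re

# Pattern inferred from the sample lines: <AdType> : Inserting ** "< name , address >"
INSERT_RE = re.compile(
    r'(\w+Ad)\s*:\s*Inserting \*\* "<\s*(\S+)\s*,\s*(\S+)\s*>"'
)


def loopback_ads(lines):
    """Return (ad_type, slot_name, address) for each ad inserted with a loopback address."""
    hits = []
    for line in lines:
        match = INSERT_RE.search(line)
        if not match:
            continue
        ad_type, name, address = match.groups()
        try:
            if ipaddress.ip_address(address).is_loopback:
                hits.append((ad_type, name, address))
        except ValueError:
            # The address field was not a plain IP address; skip it.
            continue
    return hits
```

Passing an open CollectorLog file handle to `loopback_ads` and collecting the distinct slot names gives a list of hosts to compare against the ones that advertise correctly.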

Some execute hosts work fine and advertise their correct address, whereas a substantial number advertise the loopback, and I believe there are even examples of both on the same subnet. The execute hosts all run Windows 10 and HTCondor version 8.4.6, and employ power saving so that idle machines (viz. no local user activity or HTCondor use) go into hibernation after approximately 10 minutes.

A typical scenario is that I wake a machine to run a job and the machine advertises its loopback address to the collector. The negotiator either finds a match or ignores the loopback (not quite sure which), but in any case the job never starts on the execute host and so the host returns to hibernation.

I turned up this submission to htcondor-users in the archives, but it seems pretty old (Windows XP) and doesn't seem to come up with a satisfactory solution:

https://lists.cs.wisc.edu/archive/htcondor-users/2006-October/msg00069.shtml

Any suggestions would be extremely useful as I'm totally baffled by this.

regards,

-ian.

Dr Ian C. Smith,

Condor Manager,

Advanced Research Computing,

University of Liverpool

UK.
