
Re: [HTCondor-users] HTcondor High-Availability and dual-stack



Hi Jaime,

Thank you very much. I will check with your workaround.

Cheers,

Carles

On 06/14/2016 08:50 PM, Jaime Frey wrote:
This is a bug in HTCondor that occurs when the first DNS record for a hostname in HAD_LIST is an IPv6 address. The HAD daemon is comparing this IP address to the first IP address in its own contact information, which is an IPv4 address, and not recognizing that they belong to the same machine.

We will fix this in a future release:
https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=5728
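To make the mismatch described above concrete, here is a minimal Python sketch. It is not HTCondor's actual code: the function names and the documentation-range addresses are made up for illustration. It only shows why comparing the first address on each side fails on a dual-stack host when DNS returns the IPv6 record first, while comparing across all addresses would succeed.

```python
import ipaddress

def first_record_matches(own_addrs, resolved_addrs):
    """Compare only the first address on each side (the failing behaviour described above)."""
    return ipaddress.ip_address(own_addrs[0]) == ipaddress.ip_address(resolved_addrs[0])

def any_record_matches(own_addrs, resolved_addrs):
    """Treat the host as matching if any resolved address is one of our own addresses."""
    own = {ipaddress.ip_address(a) for a in own_addrs}
    return any(ipaddress.ip_address(a) in own for a in resolved_addrs)

# Hypothetical dual-stack machine: its own contact info lists IPv4 first,
# while DNS returns the IPv6 record first.
own = ["192.0.2.10", "2001:db8::10"]
dns = ["2001:db8::10", "192.0.2.10"]

print(first_record_matches(own, dns))  # False
print(any_record_matches(own, dns))    # True
```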

If you need PREFER_IPV4=False for your HTCondor configuration, then you can work around this problem by enabling this configuration parameter for just the HAD daemons, like so:

HAD.PREFER_IPV4 = True
REPLICATION.PREFER_IPV4 = True

Or, you can use IPv4 addresses in HAD_LIST.
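For the second option, a small helper along these lines could generate a HAD_LIST value with IPv4 literals instead of hostnames. This is only a sketch: `resolve_ipv4` and `build_had_list` are hypothetical helpers, the hostnames and addresses are placeholders, and the "host:port, host:port" layout simply mirrors the HAD_LIST shown in the error message in this thread.

```python
import socket

def resolve_ipv4(hostname):
    """Return the first IPv4 address the system resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    return infos[0][4][0]

def build_had_list(hosts, port, resolver=resolve_ipv4):
    """Build a comma-separated 'address:port' list from hostnames."""
    return ", ".join(f"{resolver(h)}:{port}" for h in hosts)

# Placeholder lookup table standing in for real DNS, so the example is repeatable.
fake_dns = {"cm1.example.org": "192.0.2.10", "cm2.example.org": "192.0.2.11"}
line = "HAD_LIST = " + build_had_list(list(fake_dns), 51450,
                                      resolver=fake_dns.__getitem__)
print(line)  # HAD_LIST = 192.0.2.10:51450, 192.0.2.11:51450
```

Dropping the `resolver` argument makes it use the real resolver, restricted to IPv4 records.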

  - Jaime

On Jun 8, 2016, at 2:56 AM, Carles Acosta <cacosta@xxxxxx> wrote:

Hello again,

I've updated my pool to version 8.5.5 and, with ENABLE_IPV4=auto, ENABLE_IPV6=auto and PREFER_IPV4=true, the error is gone. However, when I change to PREFER_IPV4=false, I still get this error from the HA daemon:

HAD CONFIGURATION ERROR:  my address '<ipv4:51450?addrs=[ipv6]-51450+ipv4-51450>'is not present in HAD_LIST 'xxxx.pic.es:51450, xxxx.pic.es:51450'

Cheers,

Carles

On 06/07/2016 12:33 PM, Carles Acosta wrote:
Hi,

Ok, thank you very much Brian.

Cheers,

Carles

On 06/07/2016 12:14 PM, Brian Bockelman wrote:
Hi Carles,

This is a known bug:

https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=5659

The fix was released (yesterday) in v8.5.5.

Brian

On Jun 7, 2016, at 10:46 AM, Carles Acosta <cacosta@xxxxxx> wrote:

Dear all,

We are testing a small dual-stack HTCondor pool running the development version 8.5.4.

At the beginning, we were using the options ENABLE_IPV4 = auto, ENABLE_IPV6 = auto and PREFER_IPV4 = false; our idea was to make HTCondor prefer IPv6. Communication between the execution nodes and the central managers was fine, as was communication with the schedds, but we had problems with the condor_had daemon on our central managers.

In the HADLog, we see (where ipv4 and ipv6 stand for the corresponding addresses):

HAD CONFIGURATION ERROR:  my address '<ipv4:51450?addrs=[ipv6]-51450+ipv4-51450>'is not present in HAD_LIST 'xxxx.pic.es:51450, xxxx.pic.es:51450'

The High Availability daemon fails, and as a result the negotiator daemon is not running on any of our central managers.

Similarly, changing to PREFER_IPV4 = true doesn't solve the problem and we see:

HADStateMachine::setReplicationDaemonSinfulStringhost names of machine and replication daemon do not match: ipv4:51450?addrs=ipv4-51450+[ipv6 vs. ipv4

Thus, we have to set ENABLE_IPV4 = false and PREFER_IPV4 = false to get High Availability working again with IPv6 (or ENABLE_IPV6 = false to use IPv4).

I'm not sure whether I'm using the correct options or whether this is a known issue.

Thanks in advance.

Best regards,

Carles

--
Carles Acosta i Silva
PIC (Port d'Informació Científica)
Campus UAB, Edifici D
E-08193 Bellaterra, Barcelona
Tel: +34 93 581 33 22
Fax: +34 93 581 41 10
http://www.pic.es
Avís - Aviso - Legal Notice: http://www.ifae.es/legal.html