
Re: [HTCondor-users] Schedd dies with an exception when communicating with IPv6 startd



Hi Todd, Greg,

Thank you for the info. For now we are working around the problem by setting
ENABLE_IPV6=False
PREFER_IPV4=True
and by forbidding users from including the site in their site whitelists.

I did, however, revert the settings to reproduce the problem. Below is the SchedLog leading up to the exception. I masked the host and user names because everything will become public on the internet; I hope you don't mind. The placeholders are:
grid_collector, A.B.C.D = the grid collector we flock to, and its IP address
W.X.Y.Z = schedd IP address
worker_ip = matched worker node (supposedly IPv6 only)
another_remote_ip = another node at the grid site.
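
Since the matched worker is described only as "supposedly" IPv6-only, a quick check of what its name resolves to from the submit host may help. The following is a minimal sketch, assuming Python 3 is available there; the function name address_families and its resolver parameter are hypothetical helpers for illustration and are not part of HTCondor. It reports only what DNS returns, not what the startd actually advertises.

```python
import socket


def address_families(host, resolver=socket.getaddrinfo):
    """Return the set of IP protocols ("IPv4", "IPv6") that `host` resolves to.

    `resolver` defaults to socket.getaddrinfo and can be swapped out for testing.
    """
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    try:
        results = resolver(host, None)
    except socket.gaierror:
        # The name did not resolve at all.
        return set()
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr).
    return {names[family] for family, *_ in results if family in names}
```

Calling address_families() with the real worker hostname would return a set containing only "IPv6" if the name resolves solely to IPv6 addresses.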

Cheers,
Yutaro

=================================
07/11/17 02:34:50 Finished negotiating for analysis.userx in pool grid_collector:9620: 1 matched, 0 rejected
07/11/17 02:34:50 Setting delay until next queue scan to 2 seconds
07/11/17 02:34:50 CONNECT bound to <W.X.Y.Z:19788> fd=19 peer=<A.B.C.D:9665>
07/11/17 02:34:50 condor_write(fd=19 collector A.B.C.D:9665?addrs=A.B.C.D-9665,,size=557,timeout=20,flags=0,non_blocking=0)
07/11/17 02:34:50 condor_read(fd=19 collector A.B.C.D:9665?addrs=A.B.C.D-9665,,size=5,timeout=20,flags=0,non_blocking=1)
07/11/17 02:34:50 Reading header would have blocked.
07/11/17 02:34:50 msgReady would have blocked.
07/11/17 02:34:50 condor_read(fd=19 collector A.B.C.D:9665?addrs=A.B.C.D-9665,,size=5,timeout=20,flags=0,non_blocking=1)
07/11/17 02:34:50 condor_read(fd=19 collector A.B.C.D:9665?addrs=A.B.C.D-9665,,size=275,timeout=20,flags=0,non_blocking=1)
07/11/17 02:34:50 condor_read(fd=19 collector A.B.C.D:9665?addrs=A.B.C.D-9665,,size=5,timeout=20,flags=0,non_blocking=1)
07/11/17 02:34:50 Reading header would have blocked.
07/11/17 02:34:50 msgReady would have blocked.
07/11/17 02:34:50 condor_read(fd=19 collector A.B.C.D:9665?addrs=A.B.C.D-9665,,size=5,timeout=20,flags=0,non_blocking=1)
07/11/17 02:34:50 condor_read(fd=19 collector A.B.C.D:9665?addrs=A.B.C.D-9665,,size=193,timeout=20,flags=0,non_blocking=1)
07/11/17 02:34:50 Address rewriting: refused for attribute MyAddress (MyAddress = "<W.X.Y.Z:9615?addrs=W.X.Y.Z-9615&noUDP&sock=1044306_42d0_7>"): clients now choose addresses.
07/11/17 02:34:50 condor_write(fd=19 collector A.B.C.D:9665?addrs=A.B.C.D-9665,,size=276,timeout=20,flags=0,non_blocking=0)
07/11/17 02:34:50 ACCEPT bound to  fd=25 peer=
07/11/17 02:34:50 condor_read(fd=25 ,,size=5,timeout=0,flags=0,non_blocking=0)
07/11/17 02:34:50 condor_read(fd=25 ,,size=8,timeout=0,flags=0,non_blocking=0)
07/11/17 02:34:50 CONNECT bound to <W.X.Y.Z:9615> fd=26 peer=<worker_ip:6725>
07/11/17 02:34:50 condor_write(fd=25 ,,size=13,timeout=5,flags=0,non_blocking=0)
07/11/17 02:34:50 condor_read(fd=26 <worker_ip:6725>,,size=5,timeout=1,flags=2,non_blocking=0)
07/11/17 02:34:50 condor_read(): fd=26
07/11/17 02:34:50 condor_read(): select returned 1
07/11/17 02:34:50 condor_read(fd=26 <worker_ip:6725>,,size=5,timeout=0,flags=0,non_blocking=1)
07/11/17 02:34:50 condor_read(fd=26 <worker_ip:6725>,,size=174,timeout=0,flags=0,non_blocking=1)
07/11/17 02:34:50 CCBClient: received reversed (non-blocking) connection <worker_ip:6725> (intended target is startd slot1@worker_node <another_remote_ip:6346>#1499692222#205#... for analysis.userx)
07/11/17 02:34:50 (bt:8f9e:20) Failed to assert (sockProto == objectProto) at /slots/02/dir_3420274/userdir/.tmpTjNgI4/BUILD/condor-8.6.3/src/condor_io/sock.cpp, line 539; aborting.



> On Jul 10, 2017, at 6:07 PM, Todd L Miller <tlmiller@xxxxxxxxxxx> wrote:
> 
>> Note that currently, HTCondor does not support an IPv4 only schedd (or central manager, for that matter) with mixed-mode startds.
> 
> 	To be clear: HTCondor supports pure IPv6 pools; pure IPv4 pools; and pools which use a mixture of protocols, as long as each schedd and the central manager support both IPv4 and IPv6.  If they do, each startd may use IPv4 or IPv6, or both -- it's not required that each startd also be mixed-mode.
> 
> - ToddM
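
The support rule quoted above can be written down as a small predicate. This is only a sketch that paraphrases Todd's statement; the function pool_is_supported and its arguments are hypothetical, and it does not reflect any check that HTCondor itself performs. Each daemon is described by the set of protocols it uses.

```python
IPV4, IPV6 = "IPv4", "IPv6"
BOTH = frozenset({IPV4, IPV6})


def pool_is_supported(central_manager, schedds, startds):
    """Apply the quoted rule to a pool described by per-daemon protocol sets.

    central_manager: set of protocols used by the central manager
    schedds, startds: lists of such sets, one per daemon
    """
    core = [frozenset(central_manager), *map(frozenset, schedds)]
    daemons = core + [frozenset(s) for s in startds]
    for protos in daemons:
        if not protos or not protos <= BOTH:
            raise ValueError(f"invalid protocol set: {set(protos)}")

    # Pure pool: every daemon uses the same single protocol.
    first = daemons[0]
    if len(first) == 1 and all(d == first for d in daemons):
        return True

    # Mixed pool: the central manager and every schedd must support both
    # protocols; each startd may then use IPv4, IPv6, or both.
    return all(d == BOTH for d in core)
```

Under this reading, an IPv4-only schedd combined with an IPv6-only or mixed-mode startd returns False, which corresponds to the situation described in this thread.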