
[HTCondor-users] Worker nodes (startds) behind firewall



Dear all,

I'm trying to run a Condor worker node in a virtual machine at home,
so the subject line pretty much describes my use case.

The central manager is reachable over the WAN; it's an Amazon EC2
server. When the worker nodes are within the EC2 network, where I have
control over the firewall and can open ports, everything works just fine.

I'm using the usual shared port daemon on port 4080, so that each node
only needs a single port.

If I have understood the problem correctly, I think it is an
architectural one and there is probably no small fix, unless (that's
my hope) people have found workarounds.

Once I bring up everything and finally bring up my condor_master in
the VM, I can see this in the Collector's log [1].

What I understand from that is that the collector tries to connect
back to <whatever_lan_ip>:4080, which obviously will never work. I
could advertise my public IP and forward the port in my firewall, but
the problem is that I won't be able to open ports in every case. The
idea was to have a fire-and-forget worker-node VM that runs anywhere,
not only in controlled environments.
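
To make the diagnosis concrete, here is a minimal sketch (plain Python,
not part of HTCondor; the helper names return_address and
is_private_address are made up for illustration) that pulls the
advertised return address out of a log line like the ones in [1] and
checks whether it lies in a private range, using only the standard
ipaddress module:

```python
import ipaddress
import re

# Matches the "return address <ip:port?...>" part of a collector log line.
# \s+ also covers the case where the mail client wrapped the line there.
RETURN_ADDR = re.compile(r"return address\s+<(\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def return_address(log_line):
    """Return (ip, port) of the advertised return address, or None."""
    match = RETURN_ADDR.search(log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def is_private_address(ip):
    """True if ip is in a private or otherwise non-global range."""
    return ipaddress.ip_address(ip).is_private


# First entry from [1], with the wrapped lines joined back together.
line = ("11/10/12 13:56:40 DC_AUTHENTICATE: attempt to open invalid session "
        "ec2-54-247-144-149:30913:1352584280:7, failing; this session was "
        "requested by <86.209.136.220:52676> with return address "
        "<192.168.1.15:4080?noUDP&sock=2660_9f84>")

ip, port = return_address(line)
print(ip, port, is_private_address(ip))
```

The request arrives from a public address (86.209.136.220), but the
return address the node advertises is its LAN address, which the
collector cannot reach from outside. A private return address is only
a problem when the collector sits outside that network: the
10.208.37.243 entries in [1] are private too, but presumably come from
nodes inside EC2.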

That said, is there a way to run a setup where no socket needs to be
exposed on the worker-node side (the node connects to the central
manager and everything is handled through that connection)?

I read the "Networking" part of the FAQ (Manual) but didn't find
anything that I could use here.

Thanks!
Samir

[1] :

11/10/12 13:56:40 DC_AUTHENTICATE: attempt to open invalid session
ec2-54-247-144-149:30913:1352584280:7, failing; this session was
requested by <86.209.136.220:52676> with return address
<192.168.1.15:4080?noUDP&sock=2660_9f84>

11/10/12 13:57:01 attempt to connect to <192.168.1.15:4080> failed:
timed out after 20 seconds.
11/10/12 13:57:01 Failed to send DC_INVALIDATE_KEY to daemon at
<192.168.1.15:4080>: SECMAN:2003:TCP connection to daemon at
<192.168.1.15:4080> failed.
11/10/12 13:57:12 DC_AUTHENTICATE: attempt to open invalid session
ec2-54-247-144-149:30913:1352584280:7, failing; this session was
requested by <86.209.136.220:48799> with return address
<192.168.1.15:4080?noUDP&sock=2660_9f84>
11/10/12 13:57:33 attempt to connect to <192.168.1.15:4080> failed:
timed out after 20 seconds.
11/10/12 13:57:33 Failed to send DC_INVALIDATE_KEY to daemon at
<192.168.1.15:4080>: SECMAN:2003:TCP connection to daemon at
<192.168.1.15:4080> failed.

11/10/12 13:59:22 DC_AUTHENTICATE: attempt to open invalid session
ec2-54-247-144-149:30913:1352584162:5, failing; this session was
requested by <10.208.37.243:58066> with return address
<10.208.37.243:4080?noUDP&sock=25849_2cfb>
11/10/12 13:59:35 DC_AUTHENTICATE: attempt to open invalid session
ec2-54-247-144-149:30913:1352584175:6, failing; this session was
requested by <10.208.37.243:55358> with return address
<10.208.37.243:4080?noUDP&sock=25849_2cfb_2>