
Re: [HTCondor-users] Worker nodes (startds) behind firewall




Have you tried using HTCondor's CCB (Condor Connection Broker) feature? It is intended to help in cases where you cannot open an incoming port on the firewalled node. It does, however, require that all the other parts of the system that the firewalled node needs to talk to can accept incoming connections.

--Dan

On 11/10/12 4:47 PM, Samir Cury wrote:
Dear all,

I'm trying to run a Condor worker node in a virtual machine at home,
and that looks like pretty much my use case.

The central manager is reachable over the WAN; it's an Amazon EC2
server. When worker nodes are within the EC2 network, where I have
control over the firewall and can open ports, everything works just
fine.

I'm using the usual shared port daemon on 4080 to make it easier.

If the problem I spotted makes sense, I think it's an architectural
problem and likely nothing small can be done about it, unless (that's
my hope) people have found workarounds.

Once I bring up everything and finally bring up my condor_master in
the VM, I can see this in the Collector's log [1].

What I understand from that is that the collector tries to connect
back to <whatever_lan_ip>:4080, which obviously will never work. I
could use my public IP and forward the port in my firewall; the
problem is that I won't be able to open ports in all cases. The idea
was to have a fire-and-forget worker-node VM that works anywhere, not
only in controlled environments.
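
For anyone checking the same symptom, here is a minimal Python sketch
of that diagnosis applied to the collector log entries in [1]. It is
not part of HTCondor: the function name, the regular expression and
the "private and different from the peer" rule are my own assumptions,
based only on the log format shown in [1]. It flags sessions whose
advertised return address is a private IP that differs from the
address the connection actually came from, which is what a node behind
NAT looks like from the collector's side.

```python
import ipaddress
import re

# Hypothetical pattern for the two addresses in a DC_AUTHENTICATE entry,
# inferred from the log excerpt in [1] (not an official HTCondor format spec).
SESSION_RE = re.compile(
    r"requested by <(?P<peer>[\d.]+):\d+> "
    r"with return address <(?P<ret>[\d.]+):(?P<port>\d+)"
)


def unreachable_return_addresses(log_text):
    """Return (peer_ip, return_address) pairs that look like NATed nodes."""
    # Log entries wrap across several lines, so collapse whitespace first.
    flat = " ".join(log_text.split())
    found = []
    for m in SESSION_RE.finditer(flat):
        peer = ipaddress.ip_address(m["peer"])
        ret = ipaddress.ip_address(m["ret"])
        # Heuristic: a private return address that is not the address the
        # collector saw cannot be reached from outside that LAN.
        if ret.is_private and ret != peer:
            found.append((str(peer), f"{ret}:{m['port']}"))
    return found


SAMPLE = """\
11/10/12 13:56:40 DC_AUTHENTICATE: attempt to open invalid session
ec2-54-247-144-149:30913:1352584280:7, failing; this session was
requested by <86.209.136.220:52676> with return address
<192.168.1.15:4080?noUDP&sock=2660_9f84>
11/10/12 13:59:22 DC_AUTHENTICATE: attempt to open invalid session
ec2-54-247-144-149:30913:1352584162:5, failing; this session was
requested by <10.208.37.243:58066> with return address
<10.208.37.243:4080?noUDP&sock=25849_2cfb>
"""

if __name__ == "__main__":
    for peer, ret in unreachable_return_addresses(SAMPLE):
        print(f"node seen as {peer} advertises unreachable address {ret}")
```

With the two sample entries, only the home VM (192.168.1.15 behind
86.209.136.220) is flagged; the EC2 node, whose return address matches
the address the collector saw, is not.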

That said, is there a way to run a setup where no socket needs to be
exposed on the worker-node side (the node connects to the central
manager and everything is handled through there)?

I tried to read the "Networking" part of the FAQ (Manual) but didn't
find anything that I could use here.

Thanks!
Samir

[1] :

11/10/12 13:56:40 DC_AUTHENTICATE: attempt to open invalid session
ec2-54-247-144-149:30913:1352584280:7, failing; this session was
requested by <86.209.136.220:52676> with return address
<192.168.1.15:4080?noUDP&sock=2660_9f84>

11/10/12 13:57:01 attempt to connect to <192.168.1.15:4080> failed:
timed out after 20 seconds.
11/10/12 13:57:01 Failed to send DC_INVALIDATE_KEY to daemon at
<192.168.1.15:4080>: SECMAN:2003:TCP connection to daemon at
<192.168.1.15:4080> failed.
11/10/12 13:57:12 DC_AUTHENTICATE: attempt to open invalid session
ec2-54-247-144-149:30913:1352584280:7, failing; this session was
requested by <86.209.136.220:48799> with return address
<192.168.1.15:4080?noUDP&sock=2660_9f84>
11/10/12 13:57:33 attempt to connect to <192.168.1.15:4080> failed:
timed out after 20 seconds.
11/10/12 13:57:33 Failed to send DC_INVALIDATE_KEY to daemon at
<192.168.1.15:4080>: SECMAN:2003:TCP connection to daemon at
<192.168.1.15:4080> failed.

11/10/12 13:59:22 DC_AUTHENTICATE: attempt to open invalid session
ec2-54-247-144-149:30913:1352584162:5, failing; this session was
requested by <10.208.37.243:58066> with return address
<10.208.37.243:4080?noUDP&sock=25849_2cfb>
11/10/12 13:59:35 DC_AUTHENTICATE: attempt to open invalid session
ec2-54-247-144-149:30913:1352584175:6, failing; this session was
requested by <10.208.37.243:55358> with return address
<10.208.37.243:4080?noUDP&sock=25849_2cfb_2>