[Condor-users] Trouble sending commands to nodes inside NAT network



Hi all,

Does anyone have any idea why commands sent to the localhost do not work?

I am running coLinux/Condor with the tap network card attached to the
ethernet card using Internet Connection Sharing (NAT), so I am using
GCB to connect to the central manager.  From what I can see, Condor
depends on the nodes being registered in DNS.  The problem with that is
this: the guest OS that is running Condor has a 192.168.x.x address, the
host that it is on has a live IP address, and the Condor installation is
also associated with the live IP address of the GCB machine.  I'm not
sure what, if anything, I need to register in DNS.  The local DHCP server
also hands out hostnames, but Windows XP ignores them.

When I ssh to the localhost and run condor_off I get:
Sent "Kill-All-Daemons" command to local master
but nothing happens, the node just keeps executing the job and all the
daemons/processes keep running (SCHEDD, STARTD and the running job
processes if it's busy).  If I run condor_off on a machine that is
native Linux, everything dies except the MASTER, as it's supposed to.

Also, when I run condor_off computenode3 from the central manager,
Condor has no idea how to find the node and just reports:
condor_off: unknown host computenode3
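
One way to tell whether the "unknown host" error is plain name
resolution failing, rather than anything GCB-specific, is to try the
same lookup outside Condor.  This is only a minimal sketch, assuming
Python is available on the central manager; check_hosts is a made-up
helper name, not part of Condor, and it only tests hostname lookup,
not whether the node is reachable through the NAT.

```python
import socket

def check_hosts(names, resolver=socket.gethostbyname):
    """Map each host name to its resolved address, or None if lookup fails.

    `resolver` is a hypothetical hook so the lookup can be swapped out;
    by default the system resolver (DNS, hosts file) is used.
    """
    results = {}
    for name in names:
        try:
            results[name] = resolver(name)
        except OSError:
            # socket.gaierror and socket.herror are subclasses of OSError
            results[name] = None
    return results
```

Calling check_hosts(["localhost", "computenode3"]) on the central
manager would show None for any name the system resolver cannot find,
which would suggest the problem is in DNS or the hosts file rather than
in the Condor tools themselves.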

I am using condor_off as an example but the same thing happens with all
the tools such as condor_reconfig, condor_on, etc.

Any ideas?


Thanks
===============================
Dave Schulz
dschulz@xxxxxxxxxxx
1 (403) 220-2102
High Performance Computing Group
University of Calgary
Alberta, Canada