
Re: [condor-users] Problems Re-adding a Condor Client



On Thu, Sep 11, 2003 at 09:45:31AM -0400, Jess Cannata wrote:
> I have a 48-node Linux cluster running Condor. One node's hard 
> drive crashed and was rebuilt. I have tried, unsuccessfully, to get 
> Condor running again on the rebuilt node (it worked fine before the node 
> crashed, and it works fine for the other 47 machines). The Condor base 
> install is on /home/condor, which is shared across all of the nodes via 
> NFS. The condor user exists on the new node. All I did was create 
> /var/lock/condor/InstanceLock with the proper permissions and make it so 
> the Condor services would start.
> 
> The Condor services start on the rebuilt node without any errors, but 
> the Condor master never sees the new node (condor_status doesn't report 
> the rebuilt node). It is as if no information were being sent from the 
> rebuilt node to the master node. However, I know that the network 
> communication is fine (the new node is loading the Condor services off 
> the NFS mount).
> 
> The following services are running on the rebuilt node:
> 
> condor_master
> condor_startd
> condor_schedd
> 
> and their log files show no errors.
> 
> Has anyone seen this problem before? How exactly do the Condor clients 
> communicate with the master node? Is it via a specific TCP/UDP port?

Yes. The collector on the Condor central manager listens on port 9618 by
default for updates. The rest of Condor knows which machine this is from the
COLLECTOR_HOST setting in the config file (which, by default, is set to the
value of CONDOR_HOST).
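
Since the rebuilt node may be picking up a different local config than the
other 47, it can help to confirm what COLLECTOR_HOST actually resolves to
there. Running condor_config_val COLLECTOR_HOST on the node is the
authoritative check. The Python below is only a rough sketch of the
$(MACRO) substitution involved, under simplifying assumptions: it handles
plain "NAME = value" lines only, and the hostname and config fragment are
hypothetical.

```python
import re

# Matches Condor-style macro references such as $(CONDOR_HOST).
MACRO = re.compile(r"\$\((\w+)\)")

def parse_config(text):
    """Parse 'NAME = value' lines into a dict, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        # Setting names are treated as case-insensitive here.
        settings[name.strip().upper()] = value.strip()
    return settings

def expand(name, settings, depth=0):
    """Return a setting's value with $(MACRO) references expanded recursively."""
    if depth > 10:
        raise ValueError("macro nesting too deep: " + name)
    value = settings.get(name.upper(), "")
    return MACRO.sub(lambda m: expand(m.group(1), settings, depth + 1), value)

# Hypothetical fragment; the hostname is a placeholder, not a real machine.
sample = """
# central manager for the pool
CONDOR_HOST = master.example.org
COLLECTOR_HOST = $(CONDOR_HOST)
"""

settings = parse_config(sample)
print(expand("COLLECTOR_HOST", settings))
```

If the value printed on the rebuilt node differs from what the working
nodes report, the startd is sending its updates to the wrong machine.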

> I've 
> disabled IPTABLES on both the master and the client to no avail. The 
> weird thing is that all of the other clients are showing up.
> 
> Any help would be appreciated.
> 

If everything looks OK on the missing node, the next place to check is the
CollectorLog file on the central manager.
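
A quick way to do that check is to pull out every CollectorLog line that
mentions the rebuilt node by name or address. The sketch below makes no
assumption about the log's line format; it only does a case-insensitive
substring match. The path, hostname, and IP in the usage note are
hypothetical placeholders.

```python
def lines_mentioning(lines, *needles):
    """Return (line_number, line) pairs where any needle occurs, ignoring case."""
    wanted = [n.lower() for n in needles if n]
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(n in lowered for n in wanted):
            hits.append((number, line.rstrip("\n")))
    return hits

def scan_log(path, *needles):
    """Scan a log file on disk; undecodable bytes are replaced rather than fatal."""
    with open(path, errors="replace") as handle:
        return lines_mentioning(handle, *needles)
```

For example, scan_log("/path/to/CollectorLog", "node17", "192.0.2.17")
would list every line that names that node. No hits at all suggests the
updates never reach the collector; hits with error text point at a
problem on the central manager side.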

-Erik