I am still having the same problem.
Here is a partial listing of the condor_config file,
/home/condor/condor_config:
CONDOR_HOST = thebeast

## Where have you installed the bin, sbin and lib condor directories?
RELEASE_DIR = /home/condor/release

## Where is the local condor directory for each host?
LOCAL_DIR = /home/condor/hosts/$(HOSTNAME)

## Where is the machine-specific local config file for each host?
LOCAL_CONFIG_FILE = /home/condor/release/etc/$(HOSTNAME).local
------
With respect to $(HOSTNAME).local, I have edited the following file for my
central manager:
/home/condor/release/etc/thebeast.local
DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, STARTD, SCHEDD
The hostname command and the HOSTNAME environment variable both look correct:
thebeast:/home/condor # hostname
thebeast
thebeast:/home/condor # echo $HOSTNAME
thebeast
thebeast:/home/condor #
So I assumed I could run the condor daemons on the central manager
machine successfully.
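As a sanity check on the macro expansion, here is a minimal sketch of which local config file the LOCAL_CONFIG_FILE setting above should resolve to. The helper `expand_local_config` is hypothetical (it is not a Condor tool), and it assumes that Condor's $(HOSTNAME) macro matches the short name printed by `hostname`, which may not hold on every setup.

```shell
#!/bin/sh
# Hypothetical helper, not part of Condor: substitute a host name for the
# literal text $(HOSTNAME) in a config value.
# Assumption: Condor's $(HOSTNAME) equals the short name from `hostname`.
expand_local_config() {
    template=$1   # config value containing the literal text $(HOSTNAME)
    host=$2       # host name to substitute
    printf '%s\n' "$template" | sed 's/\$(HOSTNAME)/'"$host"'/g'
}

# Path the central manager would be expected to read, per the listing above.
expand_local_config '/home/condor/release/etc/$(HOSTNAME).local' thebeast
```

On a live install, `condor_config_val LOCAL_CONFIG_FILE` reports the value Condor itself sees, which is the more reliable check.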
thebeast:/home/condor # condor_master
thebeast:/home/condor # ps -fe | grep condor
condor    5272     1  0 16:34 ?        00:00:00 condor_master
condor    5273  5272  0 16:34 ?        00:00:00 condor_schedd -f -n root@xxxxxxxxxxxxxxxxxxx
This is all I'm getting, and running the collector and negotiator
manually doesn't make any difference.
They are running in memory, but no node can connect: condor_status
on any node returns an error saying that
the collector on thebeast cannot be contacted, even after I started
them manually.
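The gap between DAEMON_LIST and the ps listing above can be made explicit with a small sketch. The function `missing_daemons` is hypothetical, and it assumes each DAEMON_LIST entry corresponds to a process named `condor_` plus the lower-cased entry, which holds for the five daemons listed here but is not guaranteed in general.

```shell
#!/bin/sh
# Hypothetical helper, not part of Condor: print the daemons named in
# DAEMON_LIST that do not appear in a captured process listing.
# Assumption: entry FOO corresponds to a process called condor_foo.
missing_daemons() {
    daemon_list=$1   # e.g. "MASTER, COLLECTOR, NEGOTIATOR, STARTD, SCHEDD"
    ps_output=$2     # text captured from: ps -fe | grep condor
    for d in $(printf '%s' "$daemon_list" | tr ',' ' ' | tr 'A-Z' 'a-z'); do
        case "$ps_output" in
            *"condor_$d"*) ;;                      # found in the listing
            *) printf '%s\n' "condor_$d" ;;        # not running
        esac
    done
}

# The two processes shown in the ps listing above.
running='condor_master
condor_schedd -f -n'
missing_daemons 'MASTER, COLLECTOR, NEGOTIATOR, STARTD, SCHEDD' "$running"
```

With the listing from this message, the sketch reports condor_collector, condor_negotiator and condor_startd as absent, which suggests that condor_master is not using the DAEMON_LIST from thebeast.local. Running `condor_config_val DAEMON_LIST` on the central manager shows the list Condor actually reads.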
Each node can ping thebeast fine, and the host read and write permissions are
set properly.
Any ideas?
thanks
Chris
----- Original Message -----
Sent: Tuesday, September 20, 2005 12:19 AM
Subject: [Condor-users] Problem installing condor on cluster.
I have a cluster with 24 nodes and a manager node.
The names are:
mgmnt.cluster.int
node1.cluster.int
node2.cluster.int
..
node24.cluster.int
mgmnt.cluster.int has an alias, thebeast.cluster.int (it was set up this
way by the company that installed the cluster).
So when I log into the manager I get the prompt [thebeast], but if I ping
thebeast it starts pinging mgmnt.cluster.int (192.168.1.1).
On the cluster, /home/condor is shared between all nodes.
I ran condor_install and set up the various options, selecting that
/home/condor was shared, along with the relevant options for config files etc.
During condor_install I set the condor central manager to
mgmnt.cluster.int.
The cluster manager machine is also my condor central manager.
I then ran condor_init and then condor_master, but the collector and a few
other processes associated with the central manager
are not running, so I can only presume that when I execute condor_master
on this machine it only recognises itself as a normal
node and not as the central manager, which is what I want it to be.
Is there something simple I am missing or overlooking?
Many thanks in advance
Chris Miles
University Of Paisley, Scotland
_______________________________________________ Condor-users mailing
list Condor-users@xxxxxxxxxxx https://lists.cs.wisc.edu/mailman/listinfo/condor-users