
Re: [Condor-users] Installation problems in a future Grid testbed



You should have a global configuration file on all the machines, and
they should be identical. You must have a local configuration file on
all the machines if the REQUIRE_LOCAL_CONFIG_FILE macro in the global
configuration file is set to true, which it is by default. However,
that local config file can be empty on machines that are not the
central manager.
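
As a rough sketch of that layout (the scratch directory and its
subdirectory names are made up for illustration, and the DAEMON_LIST
value is the one discussed further down in this thread, not copied
from your install):

```shell
#!/bin/sh
# Illustration only: writes example files into a scratch directory,
# not into a real Condor installation. All paths here are hypothetical.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/manager" "$demo_dir/worker"

# Central manager: the local config adds the collector and negotiator.
cat > "$demo_dir/manager/condor_config.local" <<'EOF'
DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, STARTD, SCHEDD
EOF

# Submit/execute machine: the local config exists but is empty, so
# whatever daemon list the global config file defines is what applies.
: > "$demo_dir/worker/condor_config.local"

ls -l "$demo_dir/manager" "$demo_dir/worker"
```

On a real pool these would be the condor_config.local files under each
machine's own LOCAL_DIR, and the global condor_config stays untouched.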

(You are not using NFS, right? That seems to be the root of most of
your installation problems.)
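
To check which config files a given machine actually reads, and which
daemons it will start, something like the following may help. I am
assuming condor_config_val is on the PATH and that your version
accepts the -config option; the guard is only there so the snippet
does nothing on a machine where the tool is missing.

```shell
#!/bin/sh
# Run this on each machine in the pool and compare the output.
if command -v condor_config_val >/dev/null 2>&1; then
    # Show which configuration files this machine is reading.
    condor_config_val -config
    # Show the daemons condor_master is configured to start here.
    condor_config_val DAEMON_LIST
    checked=yes
else
    echo "condor_config_val not found on PATH" >&2
    checked=no
fi
```

If two machines report the same local config file over a shared
mount, that would explain both of them coming up as central managers.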

-Avi

On 8/25/05, Fabiano Portella <fabiano_portella@xxxxxxxxxxxx> wrote:
> So, you mean that I must have only one global
> config file on the central manager and a local config file
> with data too. All other machines in the pool
> (submitters/executers) must have ONLY one empty local
> config file. Is that correct?
> Please let me know if I'm wrong.
> Thanks one more time for your help.
> Regards,
> Fabiano.
> 
> --- Avi Flamholz <flamholz@xxxxxxxxx> wrote:
> 
> > Each machine must have a local config file, but it
> > need not have
> > anything in it. You should empty out the local
> > config files for the
> > machines that you do not want to be the central
> > manager. You do not
> > want to have an empty global config file - how else
> > would you define
> > global settings for condor? You should undo that if
> > possible, or
> > reinstall.
> >
> > I believe, also, that the condor_configure script will
> > set up the
> > appropriate local config files for you if you run it
> > with the correct
> > parameters.
> >
> > -Avi
> >
> > On 8/24/05, Fabiano Portella
> > <fabiano_portella@xxxxxxxxxxxx> wrote:
> > > Thanks for the fast response Avi!
> > > I'm sorry about the confusion! You're right: I
> > have 2 managers instead
> > > of 1.
> > > But I couldn't understand your point about the
> > configuration files: which
> > > must be an empty file (condor_config or
> > condor_config.local)?
> > > I tried applying this tip to condor_config on
> > the non-manager machine:
> > > 1. Replace condor_config with an empty file on the
> > non-manager machine of
> > > the pool
> > > 2. Start condor_master on the pool manager
> > > 3. Start condor_master on the pool non-manager (just
> > submitter/executer)
> > > But it seems that nothing happened (no daemons
> > started).
> > > =============================================
> > > [globus@crithidia sbin]$ condor_master
> > > [globus@crithidia sbin]$ ps -ef | egrep condor_
> > > globus    6639  6585  0 23:47 pts/0    00:00:00
> > egrep condor_
> > > [globus@crithidia sbin]$
> > > =============================================
> > > I assume that an empty condor_config.local is not
> > what you meant, since the
> > > Condor documentation says each machine must have a
> > local configuration file.
> > > Regards,
> > > Fabiano.
> > >
> > >
> > > Avi Flamholz <flamholz@xxxxxxxxx> wrote:
> > > Do you mean that both machines are executing as a
> > central manager?
> > > (You wrote "execute both machines as masters," but
> > the condor_master
> > > daemon is supposed to run on all machines, so I
> > assume you meant
> > > central manager.)
> > >
> > > If this is the case, you should look at the local
> > configuration files
> > > for the individual machines. It will probably have
> > a comment on the
> > > top saying something like "this is the config file
> > for the central
> > > manager." It will also probably have the macro
> > DAEMON_LIST set to
> > > MASTER, NEGOTIATOR, COLLECTOR, STARTD, SCHEDD.
> > This means that all
> > > your machines are running all the daemons, which
> > is not what you want.
> > > On the machines that you do not want to be central
> > managers, you
> > > should replace this file with an empty file. The
> > default, given an
> > > empty local config file, is that the condor_master
> > daemon will start
> > > the condor_startd and condor_schedd daemons,
> > making the local machine
> > > a submit/execute node.
> > >
> > > -Avi
> > >
> > > On 8/24/05, Fabiano Portella wrote:
> > > >
> > > >
> > > > Hi Condor community!
> > > > I'm trying to create a Grid with machines from 2
> > research labs. First one
> > > > contains 3 Linux FC3 machines (one is the
> > central manager) and the other
> > > lab
> > > > contains 2 Linux FC1 machines.
> > > > I've just tried to install condor-6.7.10.
> > Following the condor
> > > > documentation, I've issued the following
> > commands (as the globus user, with sudo
> > > > permission):
> > > >
> > > >
> > > > $tar xzf
> > condor-6.7.10-linux-x86-glibc23-dynamic.tar.gz
> > > > $cd condor-6.7.10
> > > > $sudo ./condor_configure
> > --type=manager,submit,execute
> > > > --install-dir=/usr/local/condor-6.7.10/
> > --owner=globus
> > > > --install
> > > >
> > > > WARNING: Unable to determine local IP address.
> > Condor might not work
> > > > propertly until you set NETWORK_INTERFACE=
> > > >
> > > > Use of uninitialized value in concatenation (.)
> > or string at
> > > > ./condor_configure line 908.
> > > >
> > > > Condor has been installed into:
> > > > /usr/local/condor-6.7.10
> > > >
> > > > It seems strange to me, since NETWORK_INTERFACE
> > was set to the IP address
> > > of
> > > > the specified machine.
> > > > Anyway, I continued the process, updating
> > condor_config and
> > > > condor_config.local properly:
> > > >
> > > > #######/etc/condor/condor_config#############
> > > > RELEASE_DIR = /usr/local/condor-6.7.10
> > > > LOCAL_DIR = /usr/local/condor-6.7.10/local.vivax
> > > > CONDOR_ADMIN = globus@xxxxxxxxxxxxxxxxxx
> > > > MAIL = /bin/mail
> > > > FULL_HOSTNAME = vivax.biowebdb.org
> > > > UID_DOMAIN = $(FULL_HOSTNAME)
> > > > FILESYSTEM_DOMAIN = $(FULL_HOSTNAME)
> > > > COLLECTOR_NAME = BioWebDB Pool
> > > > CONDOR_IDS = 504.504
> > > > QUEUE_SUPER_USERS = root, condor, globus
> > > > #############################################
> > > >
> > > >
> > > > ##/usr/local/condor-6.7.10/local.vivax/condor_config.local###
> > > > CONDOR_HOST = vivax.biowebdb.org vivax
> > > > CONDOR_ADMIN = globus@xxxxxxxxxxxxxxxxxx
> > > > UID_DOMAIN = $(FULL_HOSTNAME)
> > > > FILESYSTEM_DOMAIN = $(FULL_HOSTNAME)
> > > > CONDOR_IDS = 504.504
> > > > ##############################################################
> > > >
> > > > After that I decided to move forward to
> > install/configure Condor on another
> > > > machine. I was aware of the type parameter
> > for condor_install, so it
> > > was
> > > > "--type=submit,execute".
> > > > I don't have a shared file system nor a common
> > UID, so I changed
> > > > FULL_HOSTNAME to its name as well.
> > > > The problems come now. After updating all
> > necessary fields in
> > > > condor_config.local and condor_config files for
> > both machines, I tried to
> > > > start daemons.
> > > > But both machines, after the "condor_master"
> > command issued in each one,
> > > > execute both machines as masters!
> > > > So, how could I deal with that? Is there any
> > configuration missing?
> > > What's
> > > > wrong? Should I reinstall all the stuff?
> > > > Please, any clue will be important! I really
> > need this feedback to go on!
> > > > Thanks in advance.
> > > > Regards,
> > > > Fabiano.
> > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > Condor-users mailing list
> > > > Condor-users@xxxxxxxxxxx
> > > > https://lists.cs.wisc.edu/mailman/listinfo/condor-users