
Re: [Condor-users] starter process exits



> The processes are started as root but run as the user condor. The
> execute nodes get their /home/condor served up by NFS and the dirs are
> auto-mounted. My global condor_config is in
> /opt/condor/etc/condor_config and there is a symlink from
> /home/condor/condor_config to this file.

It sounds like you may have a problem in your config setup -- you realize that there are two config files, right?


The main config file is in /opt/condor/etc, and the other in /opt/condor/local.X, where X is the name of your machine. You're supposed to set the CONDOR_CONFIG environment variable to point to the location of the main config file (the one in /opt/condor/etc), which, in turn, has a macro pointing to the local config file. By default, you need both for condor to start properly.
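
As a rough way to check that two-file layout, here is a minimal Python sketch (not part of Condor itself). It reads CONDOR_CONFIG from the environment, looks up the LOCAL_CONFIG_FILE macro in the main file, and reports whether both files exist. The helper names (parse_config, expand, check_condor_config) are made up for illustration. It assumes simple "NAME = value" lines, a single path in LOCAL_CONFIG_FILE, and only $(NAME) references that are defined in the same file, so predefined macros such as $(HOSTNAME) would be left unexpanded.

```python
import os
import re


def parse_config(path):
    """Read simple NAME = value lines, skipping comments and blank lines."""
    macros = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, value = line.split("=", 1)
            macros[name.strip().upper()] = value.strip()
    return macros


def expand(value, macros, depth=10):
    """Expand $(NAME) references using macros defined in the same file."""
    pattern = re.compile(r"\$\(([A-Za-z0-9_]+)\)")
    for _ in range(depth):
        # Unknown names are left as-is rather than guessed.
        new = pattern.sub(
            lambda m: macros.get(m.group(1).upper(), m.group(0)), value
        )
        if new == value:
            break
        value = new
    return value


def check_condor_config(environ=os.environ):
    """Return a list of problems; an empty list means both files were found."""
    main = environ.get("CONDOR_CONFIG")
    if not main:
        return ["CONDOR_CONFIG is not set"]
    if not os.path.isfile(main):
        return ["main config file not found: " + main]
    macros = parse_config(main)
    local = macros.get("LOCAL_CONFIG_FILE")
    if local is None:
        return ["LOCAL_CONFIG_FILE is not defined in " + main]
    local = expand(local, macros)
    if not os.path.isfile(local):
        return ["local config file not found: " + local]
    return []


if __name__ == "__main__":
    problems = check_condor_config()
    for line in problems or ["main and local config files both found"]:
        print(line)
```

Run it in the same environment that starts condor_master (same user, same shell setup), since a CONDOR_CONFIG set only in an interactive login would not show up otherwise.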

> I'm seeing some strange behavior both when I start up condor_master and
> when I submit jobs to the pool. In the case of condor_master, if I start
> this process without first doing an 'ls /home/condor', it dies with a
> complaint about not having CONDOR_CONFIG set, not being able to find
> /etc/condor/condor_config, or not being able to find
> /local/condor/condor_config. The complaint also mentions not finding
> ~/condor. When I trace condor_master with strace, however, it doesn't
> look like an open() attempt is ever made on ~/condor_config. Even though
> df shows /home/condor as already mounted, condor_master only succeeds in
> checking for and finding this directory after I 'ls /home/condor'. It
> seems there is some reason condor is not even attempting to open
> /home/condor/condor_config.

I don't know why condor would start properly once you've listed the /home/condor directory. If your CONDOR_CONFIG variable is set correctly, and your local config file is in the right place, condor should start.


If you have everything correctly set up, perhaps it's a problem with your NFS automounting configuration -- I've seen similar problems before, but never on my own systems.
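
One way to narrow down the automount theory is to check whether the config file is readable before and after the directory is listed, which is what your 'ls /home/condor' does by hand. The sketch below is only a diagnostic idea, not a fix. The probe function is a made-up helper, and the path is the symlink location from your message.

```python
import os


def probe(path):
    """Check readability of path before and after listing its parent directory."""
    parent = os.path.dirname(path) or "."
    readable_before = os.access(path, os.R_OK)
    try:
        os.listdir(parent)  # same effect as running 'ls' on the directory
        parent_listable = True
    except OSError:
        parent_listable = False
    readable_after = os.access(path, os.R_OK)
    return {
        "path": path,
        "readable_before": readable_before,
        "parent_listable": parent_listable,
        "readable_after": readable_after,
    }


if __name__ == "__main__":
    # Path taken from the original message; adjust for your own setup.
    print(probe("/home/condor/condor_config"))
```

If readable_before is False and readable_after is True on a freshly booted execute node, that would point at the automounter rather than at the Condor configuration.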

Best,
Tim