Re: [Condor-users] shared OS file system

On 08/09/2010 09:00 AM, Dave STREET wrote:

I have been using Condor for a while now and it was all going fine, then
we decided to try booting multiple Debian systems from a single
NFS-mounted image.

This all works fine and we can run multiple machines; the issue comes
when trying to run Condor.

I booted one instance of the machine and installed Condor, and
everything ran fine: it joined the pool and was happy enough.

But when I booted up a second machine, I got the error message:

/var/run/condor does not exist or is not a directory?

I can see this in the config:

    The location of the local Condor directory on each machine in your
    pool. One common option is to use the condor user's home directory
    which may be specified with $(TILDE). There is no default value for
    LOCAL_DIR . For example:

             LOCAL_DIR = $(tilde)

    On machines with a shared file system, where either the $(TILDE)
    directory or another directory you want to use is shared among all
    machines in your pool, you might use the $(HOSTNAME) macro and have a
    directory with many subdirectories, one for each machine in your
    pool, each named by host names. For example:

             LOCAL_DIR = $(tilde)/hosts/$(hostname)

    or:

             LOCAL_DIR = $(release_dir)/hosts/$(hostname)

So I have LOCAL_DIR = $(release_dir)/hosts/$(hostname).

Do I need to manually create and populate each directory for each machine, or is
there a way to get them to auto-create as a new machine joins the pool?



Condor isn't going to create all those directories for you on startup, and it is good that you are not pointing your LOCAL_DIR to a shared filesystem!

Since RELEASE_DIR should be local to the node there should be no need to use $(HOSTNAME) at all. Just create /var/{run,log}/condor in your image and maybe a /var/lib/condor/{execute,spool} for your SPOOL and EXECUTE config. I set LOCAL_DIR to /var/lib/condor.
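
For what it's worth, creating those directories in the image could look roughly like the sketch below. The function name make_condor_dirs and the IMAGE_ROOT variable are made up for this example (they are not part of Condor), and ownership and permissions are left out: you would most likely still need to chown the directories to the condor user as root.

```shell
#!/bin/sh
# Sketch only: pre-create the Condor directories named above.
# make_condor_dirs and IMAGE_ROOT are hypothetical names for this example.

make_condor_dirs() {
    # $1 is the root of the image ("/" on a booted node); strip a trailing slash.
    root="${1%/}"
    for d in var/run/condor \
             var/log/condor \
             var/lib/condor/execute \
             var/lib/condor/spool; do
        # mkdir -p is a no-op when the directory already exists.
        mkdir -p "$root/$d" || return 1
    done
}

# Only act when IMAGE_ROOT is set explicitly, so sourcing this file is harmless.
if [ -n "${IMAGE_ROOT:-}" ]; then
    make_condor_dirs "$IMAGE_ROOT"
fi
```

Saved under some name of your choosing, it would be run as root with IMAGE_ROOT set to wherever the image is mounted, or to / on a booted node. With the directories in place, LOCAL_DIR = /var/lib/condor no longer needs $(HOSTNAME).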