
[Condor-users] New installation of Condor : questions about configuration files and other general things



Hello,

I'm doing a work placement involving Condor, but as I'm new to Condor, I need your help. I have to set up a desktop grid composed of virtual machines running on dual-core computers (Windows hosts running VirtualBox or VMware Server virtual machines, with Debian Etch as the guest OS), and I want to use Condor to manage these virtual machines.

I set up Condor 6.9.2 as a central manager (cm) on a Linux server (Debian Etch) that also runs an NFS server. This lets me share the release directory (/usr/local/condor in my case) and a unique home directory per host (/home/condor/hosts/$hostname/) with the virtual machines (node1, node2, ...).

To install the central manager, I used the Older Unix Installation Procedure (condor_install).

Unfortunately, there are a lot of things I don't know how to deal with, even with the Condor Manual.

First, about file sharing: do you think my setup is correct?
On each virtual machine, I mount cm:/home/condor/ on /home/condor (I have a condor user on each VM). Is that correct? Likewise for the release directory: is it a good idea to share the same directory between the central manager and the nodes?
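
For reference, the mounts described above correspond roughly to /etc/fstab entries like the following on each node. This is only a sketch: the mount options, and the assumption that the release directory is mounted at the same path on the nodes as on the cm, are illustrative and not copied from my machines.

```
# /etc/fstab on a node (illustrative sketch; the options are assumptions)
# <server:export>        <mount point>       <type>  <options>  <dump>  <pass>
cm:/home/condor          /home/condor        nfs     defaults   0       0
cm:/usr/local/condor     /usr/local/condor   nfs     defaults   0       0
```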

After the installation, I ran condor_status, but it only showed the status of my central manager; there was nothing about my nodes.

http://pastebin.ca/448387

When I run "condor_status -direct node1", "condor_status -direct node2", and so on, I get the following error: "condor_status:can't find address for startd ...".
So I checked which processes were running with ps -ef | egrep condor_ :
- on my cm: condor_master, condor_procd, condor_negotiator, condor_collector, condor_startd, condor_schedd
- on my nodes: condor_master, condor_procd, condor_startd, condor_schedd

So I think my problem must be related to my configuration files.

Here is my condor_config : http://pastebin.ca/448390
My cm condor_config.local : http://pastebin.ca/448397
The condor_config.local files on the nodes are empty, though. I wonder whether it is correct to have empty configuration files. Is there another script I should run on my virtual machines? I ran condor_init, but it didn't fix my problem.

Condor is a very complex project, and I'm a bit lost! I apologize if my questions seem stupid or have already been asked.

I'm French, so I also apologize for my English.

Thank you in advance!