Re: [Condor-users] Having trouble with installation

I forgot to mention that there is nothing suspicious in any log file, and there is no StartLog on either machine.


On Wed, Jan 11, 2012 at 10:04 PM, Jordan Perr-Sauer <perr@xxxxxxxxxx> wrote:

I am trying to install Condor on some machines on my network and am running into a problem. I currently have one master node (on a unix box) and two worker nodes (one unix box and one Windows box). I installed Condor 4.6 "stable" from the deb and msi packages. I'm focusing on the two unix machines for now, even though neither platform is working.

I can't seem to get the master node to "see" the available slots on the worker node. When I run condor_status from either machine with the configuration specified below, nothing is printed. When I add the STARTD daemon to the master node (for testing), condor_status shows the two slots available on the master node, and it does so when run from either machine.

Is there any obvious reason why the master node can't see the STARTD daemon on the worker node? I believe that all ports are open, I can reach the master node from the worker using ping, and I have double checked the CONDOR_HOST value in the configuration files.

Secondly, I am confused as to how one enables security features with Condor. I was never prompted for a cluster or computing pool password during installation. This worries me. I read through the security section of the manual, but it is unclear to me what I must do to secure the cluster. I would ideally like to enable all security features, as I cannot trust the network.

Thanks in advance! I hope this question isn't asked too often... I searched through the archives and didn't find anything useful.

================= CONDOR CONFIGURATION ===============

I have left the configuration as default for the most part, but have modified the following values:

In condor_config on the master node:
    - CONDOR_HOST (as the full name of this machine)
    - CONDOR_ADMIN (as my email)

In condor_config.local on the master node:
    - CONDOR_HOST (as the full name of this machine)
    - COLLECTOR_NAME (I made up a name for my pool)

In condor_config on the unix worker node:
    - CONDOR_HOST (as the full name of the master node machine)

In condor_config.local on the unix worker node:
    - CONDOR_HOST (as the full name of the master node machine)
    - COLLECTOR_NAME (I made up a name for my pool, same name as before)
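
To make the list above concrete, the modified lines look roughly like the sketch below. The host name, email address, and pool name shown here are placeholders standing in for my real values, not the actual settings on my machines.

```
# Sketch of the modified settings (all values are placeholders).

# condor_config on the master node
CONDOR_HOST  = master.example.org
CONDOR_ADMIN = admin@example.org

# condor_config.local on the master node
CONDOR_HOST    = master.example.org
COLLECTOR_NAME = Example Pool

# condor_config and condor_config.local on the unix worker node
# (CONDOR_HOST is the master node's full name, not the worker's own name)
CONDOR_HOST    = master.example.org
COLLECTOR_NAME = Example Pool
```

Everything else is as shipped by the packages.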

=================== ps -ef | grep condor_ =======================

On the master node:
condor    5931     1  0 21:59 ?        00:00:00 /usr/sbin/condor_master -pidfile /var/run/condor/condor.pid
condor    5932  5931  0 21:59 ?        00:00:00 condor_collector -f
condor    5933  5931  0 21:59 ?        00:00:00 condor_negotiator -f
condor    5934  5931  0 21:59 ?        00:00:00 condor_schedd -f
root      5935  5934  0 21:59 ?        00:00:00 condor_procd -A /var/run/condor/procd_pipe.SCHEDD -R 10000000 -S 60 -C 112

On the worker node:
condor   25833     1  0 Jan10 ?        00:00:36 /usr/sbin/condor_master -pidfile /var/run/condor/condor.pid
condor   25834 25833  0 Jan10 ?        00:00:00 condor_schedd -f
condor   25835 25833  0 Jan10 ?        00:00:01 condor_startd -f
root     25836 25834  0 Jan10 ?        00:00:17 condor_procd -A /var/run/condor/procd_pipe.SCHEDD -R 10000000 -S 60 -C 120