Re: [Condor-users] Problem with openmpi in parallel universe on CENTOS5/Condor 7.2.1
- Date: Wed, 25 Feb 2009 18:34:45 -0600
- From: David Anderson <mr.anderson.neo@xxxxxxxxx>
- Subject: Re: [Condor-users] Problem with openmpi in parallel universe on CENTOS5/Condor 7.2.1
It seems the problem was related to USE_NFS=TRUE, which I had set in my configuration; it was keeping condor_chirp from working correctly.
Once I removed that setting, everything appears to set up fine for openmpi in the parallel universe.
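For anyone hitting the same chirp error, here is a minimal sketch of one way to find where USE_NFS is set. The helper name find_use_nfs and the config file paths in the usage comments are assumptions for illustration, not taken from the original setup.

```shell
# Hypothetical helper: print every uncommented USE_NFS assignment
# (with file name and line number) found in the given config files.
# Matching is case-insensitive; lines starting with '#' are skipped
# because the pattern requires the name to come first on the line.
find_use_nfs() {
    grep -HniE '^[[:space:]]*USE_NFS[[:space:]]*=' "$@"
}

# Example usage (paths are assumptions; adjust to your installation):
#   find_use_nfs /etc/condor/condor_config /etc/condor/condor_config.local
# The effective value can also be queried directly with:
#   condor_config_val USE_NFS
```

The function returns a non-zero status when no assignment is found, so it can be used in a shell conditional. After removing the setting, the daemons would still need to pick up the new configuration.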
On Tue, Feb 24, 2009 at 2:50 PM, David Anderson <mr.anderson.neo@xxxxxxxxx> wrote:
I am having trouble getting Condor to start an openmpi job. From previous notes to the list I have obtained my submit script and submit code, but whenever I submit the job it fails with the chirp error "error 0 chirp putting identity keys back". It appears to be a problem with setting up the special sshd.sh environment. I am including a tarball of my directory, including stdout/stderr from a run with set -x in the ompscript file.
Any insight would be appreciated, as I am having a hard time trying to figure out what is wrong.
The Condor version is 7.2, the OS is CentOS 5.2, and the openmpi version is 1.2.5, as packaged with CentOS 5.
[danders5@hal9000-server mpitest]$ condor_config_val CONDOR_SSHD CONDOR_SSH_KEYGEN
Thanks in advance.