
Re: [Condor-users] Problem with openmpi in parallel universe on CENTOS5/Condor 7.2.1

It seems the problem was caused by USE_NFS=TRUE, which I had set; it was preventing condor_chirp from working correctly.

Once I removed that setting, everything appears to set up fine for openmpi in the parallel universe.
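
For anyone hitting the same chirp error, the change can be sketched roughly as below. The file name is an assumption for illustration only; the message does not say which configuration file held the setting.

```
# condor_config.local  (hypothetical location, not stated in the message)
#
# Before:
#   USE_NFS = TRUE
#
# After: the line is removed or commented out, so the value is no longer forced on.
# USE_NFS = TRUE
```

After editing, running condor_reconfig and then checking with condor_config_val USE_NFS should show whether the change took effect.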

On Tue, Feb 24, 2009 at 2:50 PM, David Anderson <mr.anderson.neo@xxxxxxxxx> wrote:
Hello All,

    I am having trouble getting Condor to start an openmpi job. From previous notes to the list I have obtained my submit script and submit code, but whenever I submit the job it fails with a chirp error: "error 0 chirp putting identity keys back". It appears to be a problem with setting up the special sshd.sh environment. I am including a tarball of my directory, including stdout/stderr from a run with set -x in the ompscript file.

Any insight would be appreciated, as I am having a hard time trying to figure out what is wrong.  

Condor version is 7.2. The OS is CentOS 5.2, and the openmpi version is 1.2.5, as packaged with CentOS 5.

[danders5@hal9000-server mpitest]$ condor_config_val CONDOR_SSHD CONDOR_SSH_KEYGEN
[danders5@hal9000-server mpitest]$

Thanks in advance.

David Anderson
