
Re: [Condor-users] which mpi is used ?



On Tue, Feb 01, 2005 at 12:42:23PM +0100, Tobias Edler wrote:
> 
> Hello everybody !
> 
> I am trying to set up Condor to use MPI, so I installed MPI to
> /usr/global/mpi and linked mpirun into /usr/bin/.
> 
> No matter whether I have MPI installed or not, whenever I try to run
> the simplempi example from the users manual, all I get is this output:
> 
> p0_2957:  p4_error: Child process exited while making connection to
> remote process on c029.cip.physik.local: 0
> p0_2957: (6.333597) net_send: could not write to fd=4, errno = 32
> 
> As far as I understand, Condor uses /home/condor/condor/sbin/rsh to
> start the job, right? This doesn't work because, for security reasons,
> rsh is not allowed here.

You'll note that /home/condor/condor/sbin/rsh is not really rsh; it's
just named rsh. It does not have the security problems of the Berkeley
rsh.

> So i set up ssh :
> 
> bash-2.05b$ whoami
> condor
> bash-2.05b$ ssh c029 date
> Tue Feb  1 12:41:20 CET 2005
>   
> and linked it there, but this didn't help either.
> 

That was your mistake. Put the Condor program named 'rsh' back.

> So 
> a) how do i tell condor where to look for mpi 
> b) how do i tell condor to use ssh ?
> 

a) you don't need to
b) you can't

Link your job with MPICH 1.2.4 built for the ch_p4 device. Condor does
not need any MPI runtime support (we don't use mpirun).
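
For reference, a submit description file for this kind of job might look
roughly like the sketch below. It follows the general shape of the MPI
example in the Condor manual. The file names (logfile, infile, outfile,
errfile) and the machine count of 4 are placeholders, not anything taken
from your setup, and details may differ between Condor versions.

```
# Sketch of a submit description file for an MPICH (ch_p4) job.
# File names and machine_count below are placeholder assumptions.
universe      = MPI
executable    = simplempi
log           = logfile
# $(NODE) is replaced per process, so each one gets its own files.
input         = infile.$(NODE)
output        = outfile.$(NODE)
error         = errfile.$(NODE)
machine_count = 4
queue
```

You would hand a file like this to condor_submit as usual. There is no
mpirun call anywhere in it.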

-Erik