
[Condor-users] Re: which mpi is used ?



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, Feb 01, 2005, Erik Paulson wrote:
> > p0_2957:  p4_error: Child process exited while making connection to
> > remote process on c029.cip.physik.local: 0
> > p0_2957: (6.333597) net_send: could not write to fd=4, errno = 32
> You'll note that /home/condor/condor/sbin/rsh is not really rsh, it's
> just named rsh. It does not have the security problems of the Berkeley
> rsh.
> That was your mistake. Put the condor program named 'rsh' back.
OK, just did that.
> Link your job with MPICH 1.2.4 for the ch_p4 device. Condor does not
> need any MPI runtime support (we don't use mpirun)
Just did so; the problem stays the same. Is there any configuration
needed for this 'rsh'? It doesn't seem to do anything:

login:condor>./condor/sbin/rsh localhost date
login:condor>
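
Since the command prints nothing at all, its exit status may say more than
its output. A minimal sketch in Python for checking that, assuming the same
path and arguments as in the session above; `run_and_report` is just a
made-up helper name, not anything shipped with Condor:

```python
import subprocess


def run_and_report(argv, timeout=30):
    """Run argv and return (exit status, stdout, stderr).

    Returns a status of None if the program could not be started
    or did not finish within `timeout` seconds.
    """
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return None, "", str(exc)
    return result.returncode, result.stdout, result.stderr


# Path and arguments copied from the shell session above.
status, out, err = run_and_report(["./condor/sbin/rsh", "localhost", "date"])
print(f"exit status: {status}")
print(f"stdout: {out!r}")
print(f"stderr: {err!r}")
```

An empty stdout together with a non-zero status would at least show that the
wrapper is failing rather than silently succeeding.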

I still get:
login:mpi>cat outfile.0
p0_6373:  p4_error: Child process exited while making connection to
remote process on c029.cip.physik.local: 0
p0_6373: (6.026546) net_send: could not write to fd=4, errno = 32
login:mpi>cat outfile.1 
rm_20830: (-) net_recv failed for fd = 3
rm_20830:  p4_error: net_recv read, errno = : 104
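
For what it's worth, the two errno values can be looked up by name. On Linux,
32 should be EPIPE (broken pipe) and 104 should be ECONNRESET (connection
reset by peer), which would fit a remote process that went away while the
connection was being set up. A small sketch in Python, assuming it is run on
the same platform that produced the errors, since errno numbering differs
between systems; `describe_errno` is a hypothetical helper name:

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno on this platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# The two values reported in outfile.0 and outfile.1.
for code in (32, 104):
    name, message = describe_errno(code)
    print(f"errno {code}: {name} ({message})")
```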

Here's the submit file:
universe = MPI
executable = /home/toedler/mpi/simplempi
log = logfile
input = infile.$(NODE)
output = outfile.$(NODE)
error = errfile.$(NODE)
machine_count = 4
should_transfer_files = yes
when_to_transfer_output = on_exit


Regards, 
Tobias
- -- 
________ This message is made of 100 % recycled electrons
\..|     PGP Key: www.stud.uni-goettingen.de/~s242275/pgpkey.pub     (o_
.\.|--   Jabber:  te_linuxguru at jabber.fsinf.de            (o  (o  //\
..\|____ ICQ:     124557012                                  (/)_(/)_V_/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQFB/5K+KpMiVYRJv9YRAtHjAJ0RCetMefOckkWeI5FbS2oazvfJgACdGVxc
a3VoJVGH+mYBYnMdUOAvjAI=
=+S+7
-----END PGP SIGNATURE-----