[Condor-users] MPI job problem

I'm having a problem using the MPI universe.

My Condor jobs run just fine in the vanilla universe,
but the problem appears when I use the MPI universe.

The job finishes,
but the output is not right.
Here are the contents of the two output files:

[condor@hiroyuki 9-simplempijap]$ cat out.0
p0_7107:  p4_error: Timeout in making connection to remote process on hiroyuki.hiroyuki4: 0
p0_7107: (302.014591) net_send: could not write to fd=4, errno = 32

[condor@hiroyuki 9-simplempijap]$ cat out.1
rm_23723:  p4_error: Could not gethostbyname for host hiroyuki.hiroyuki2; may be invalid name
: 61
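
Since out.1 reports a failed gethostbyname call, one thing that could be checked is whether each machine can resolve the other machines' names. Below is a minimal sketch of such a check, assuming Python 3 is available on the machines. The function name check_hosts is made up for illustration, and the two host names are simply the ones that appear in the error output above. It only tests name lookup and does not involve Condor or MPICH in any way.

```python
import socket


def check_hosts(hostnames):
    """Map each host name to its IPv4 address, or to None if lookup fails."""
    results = {}
    for name in hostnames:
        try:
            # Same kind of lookup that the p4 error message complains about.
            results[name] = socket.gethostbyname(name)
        except OSError:
            # socket.gaierror (a subclass of OSError) is raised when the
            # name cannot be resolved.
            results[name] = None
    return results


if __name__ == "__main__":
    # Host names taken from out.0 and out.1; replace with your own machines.
    hosts = ["hiroyuki.hiroyuki2", "hiroyuki.hiroyuki4"]
    for name, addr in check_hosts(hosts).items():
        print(f"{name}: {addr if addr else 'could not resolve'}")
```

Running this on each of the machines would show which names, if any, fail to resolve from that machine.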

At first I thought the problem was the version
of MPICH, so I downloaded:

The problem stays the same.
I have three machines.

I really need some help, thank you.

P.S. I am having the same trouble
this guy is having.