
[Condor-users] MPI job problem



Hi,
I'm having a problem with the MPI universe.

My Condor jobs run just fine in the vanilla universe,
but the problem appears when I use the MPI universe.

The job finishes,
but the output is not right.
Here are the contents of the two output files:

[condor@hiroyuki 9-simplempijap]$ cat out.0
p0_7107:  p4_error: Timeout in making connection to
remote process on hiroyuki.hiroyuki4: 0
p0_7107: (302.014591) net_send: could not write to
fd=4, errno = 32

[condor@hiroyuki 9-simplempijap]$ cat out.1
rm_23723:  p4_error: Could not gethostbyname for host
hiroyuki.hiroyuki2; may be invalid name
: 61

At first I thought the problem was the version
of MPICH, so I downloaded:
mpich-1.2.2.1.tar.gz
mpich-1.2.4.tar.gz

The problem stays the same.
I have three machines.
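
Since the error in out.1 is a gethostbyname failure, one thing that could be checked on each of the machines is whether the host names from the error output resolve at all. The following is only a minimal sketch, assuming Python is available on the machines; the helper name check_hosts is made up for illustration, and the two host names are copied from the error messages above. A failed lookup here would not prove that this is the cause of the MPI problem, only that name resolution needs a closer look.

```python
import socket


def check_hosts(hostnames):
    """Map each host name to its resolved IPv4 address, or None if the lookup fails."""
    results = {}
    for name in hostnames:
        try:
            # Same kind of lookup that the p4 error message reports as failing.
            results[name] = socket.gethostbyname(name)
        except (OSError, UnicodeError):
            # socket.gaierror is a subclass of OSError; treat any failure as "unresolved".
            results[name] = None
    return results


if __name__ == "__main__":
    # Host names taken from the out.0 and out.1 error output.
    hosts = ["hiroyuki.hiroyuki2", "hiroyuki.hiroyuki4"]
    for name, addr in check_hosts(hosts).items():
        print(f"{name}: {addr if addr else 'does not resolve'}")
```

Running this on every machine in the pool would show whether they all agree on the names, for example through /etc/hosts or DNS.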

I really need some help, thank you.

P.S. I am having the same trouble as the poster in this thread:
https://lists.cs.wisc.edu/archive/condor-users/2005-February/msg00263.shtml
