
Re: Re: [Condor-users] MPI job problem



Oh my,
thanks a lot,
it works now.
The problem seems to have been the domain name.
=) I appreciate your help.
--- Maxim Kovgan <maxk@xxxxxxxxxxxxxxxxx> wrote:

> young董 wrote:
> 
> >Hi,
> >I'm having a problem using the MPI universe.
> >
> >My Condor runs just fine in the vanilla universe,
> >but problems appear when using the MPI universe.
> >
> >The job will finish,
> >but the output is not right.
> >here are the outputs of the two output files:
> >
> >[condor@hiroyuki 9-simplempijap]$ cat out.0
> >p0_7107:  p4_error: Timeout in making connection to remote process on hiroyuki.hiroyuki4: 0
> >p0_7107: (302.014591) net_send: could not write to fd=4, errno = 32
> >
> >[condor@hiroyuki 9-simplempijap]$ cat out.1
> >rm_23723:  p4_error: Could not gethostbyname for host hiroyuki.hiroyuki2; may be invalid name
> >: 61
> >  
> >
> I'm not sure this is THE main reason, but:
> 1. The error line above indicates a name resolution issue.
> Name resolution is one of the network services Condor relies on.
> You should set up an environment that allows both forward and
> reverse lookups (either via /etc/hosts or a DNS server).
> 2. Also, note that your host names indicate different DNS domains:
> hiroyuki.hiroyuki2 and hiroyuki.hiroyuki4 are in different domains
> (hiroyuki2 and hiroyuki4).
> If they were:
> hiroyuki2.hiroyuki
> hiroyuki4.hiroyuki
> it would be much better.
> 
> 3. And if the machines have multiple network interfaces, you must
> specify the interface you want Condor to use with the
> NETWORK_INTERFACE configuration setting.
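
For anyone hitting the same p4_error messages, points 1 and 2 above can be checked with a small script. This is only a rough sketch in Python, not anything that ships with Condor or MPICH: the function names check_name_resolution, domain_of and domains_in_use are made up for illustration, and the lookup functions are passed in as parameters so the logic can be tried without touching the real resolver.

```python
import socket


def check_name_resolution(hostname,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Return a list of problems with forward/reverse lookup of hostname.

    An empty list means the name resolves to an address and that address
    maps back to the same name (as canonical name or alias).
    """
    try:
        address = forward(hostname)
    except OSError as exc:  # socket.gaierror and socket.herror are OSErrors
        return [f"forward lookup of {hostname} failed: {exc}"]
    try:
        canonical, aliases, _addresses = reverse(address)
    except OSError as exc:
        return [f"reverse lookup of {address} failed: {exc}"]
    known = {name.lower() for name in (canonical, *aliases)}
    if hostname.lower() not in known:
        return [f"{address} maps back to {canonical}, not {hostname}"]
    return []


def domain_of(hostname):
    """Everything after the first dot, or '' for a bare host name."""
    return hostname.partition(".")[2].lower()


def domains_in_use(hostnames):
    """Set of distinct domains; more than one entry hints at point 2."""
    return {domain_of(name) for name in hostnames}


if __name__ == "__main__":
    # Host names taken from the error output quoted in this thread.
    pool = ["hiroyuki.hiroyuki2", "hiroyuki.hiroyuki4"]
    print("domains:", sorted(domains_in_use(pool)))
    for host in pool:
        for problem in check_name_resolution(host):
            print(problem)
```

Run on each machine in the pool, an empty problem list for every host name and a single domain would suggest that the lookups described in points 1 and 2 are in order.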
> 
> Max.
> 
> 
> >At first I thought the problem was the version
> >of MPICH, so I downloaded:
> >mpich-1.2.2.1.tar.gz  
> >mpich-1.2.4.tar.gz
> >
> >the problem stays the same.
> >I have three machines.
> >
> >I really need some help, thank you.
> >
> >p.s. I am having the same trouble this guy is having:
> >https://lists.cs.wisc.edu/archive/condor-users/2005-February/msg00263.shtml
> >
> >_______________________________________________
> >Condor-users mailing list
> >Condor-users@xxxxxxxxxxx
> >https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> >  
> >
> 

