
Re: [Condor-users] Quick n Dumb MPI Question



On Mon, Apr 25, 2005 at 11:07:58PM -0500, Matt Baker wrote:
> I'm currently setting up a 10-node cluster to run MPI under Condor. This
> cluster has the latest stable ROCKS configuration, with MPI 1.2.6.
> I know I need to use 1.2.4 for Condor to run jobs, but I'm not that fluent
> in MPI yet.
> 
> 1. Does 1.2.4 need to be installed on compute nodes, or is just mpicc needed
> on the head node?
> 

Just mpicc on the head node - or wherever you plan on compiling (it doesn't
have to be on the cluster, of course).

> 2. Are there plans for 1.2.x / MPICH2 / whatever MPI under Condor?
> 

Yes, it's a feature planned for 6.8.0, so it will show up in a 6.7 release.
Not 6.7.7 though. 

> 3. What's the (probably obvious) problem in condor_submitting a shell script
> that calls mpirun instead of using the MPI universe?
> 

Your shell script is only sort of told which nodes are allocated to it,
and it can't easily tell Condor that it's alive. If the job isn't in the
MPI universe, the execute nodes will never hear from the Condor daemons
on the submit machine that things are going well, so they'll time out
and tear down the allocation after a few minutes.

If you really want to try, see this:

https://lists.cs.wisc.edu/archive/condor-users/pre-2004-June/msg00355.shtml
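
For comparison, the MPI universe route only needs a short submit
description file. What follows is a rough sketch from memory of the manual,
not a tested configuration: the executable name, the machine count and the
file names are placeholders, and it assumes the binary was built with the
MPICH 1.2.4 mpicc and that the execute nodes are already set up for
dedicated scheduling.

```
# Hypothetical MPI universe submit file; all names are placeholders.
universe      = MPI
executable    = my_mpi_program
# Number of nodes Condor should claim before starting the job.
machine_count = 4
# $(NODE) expands to the node number, giving one output file per node.
output        = out.$(NODE)
error         = err.$(NODE)
log           = my_mpi_program.log
queue
```

You'd hand that to condor_submit as usual. Here Condor starts the job on
the claimed machines itself, so there is no mpirun line in a wrapper script.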

-Erik