
Re: [Condor-users] condor: mpi programs under windows



If mpiexec works on its own, then Condor should be able to start an MPI
process as well. However, the parallel universe can only use machines
that are 'dedicated'. This means that once a job has started on such a
machine, it will not be preempted or suspended before it finishes.
You'll need to set the corresponding ClassAd attributes in the
configuration of the machines you want in the MPI pool. Also, the
machine used to submit the job must be a specially configured submit
machine. Unfortunately I can't tell you all the precise details, since
our sysadmin figured out the right settings. But if you haven't done
anything along these lines yet, you may want to pick up the manual and
look up how to set this up. I'm sure it's in there, somewhere... :P
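
For what it's worth, here is a rough sketch of what the 'dedicated'
settings on an execute machine usually look like. I'm going by the
dedicated-scheduler section of the manual, not by our own
configuration, so treat it as a starting point only. The host name
submit-host.example.org is a placeholder for whichever machine you
submit from, and the exact attribute names should be checked against
the manual for your Condor version.

```
# Local config sketch for each machine in the MPI pool (assumptions, not a tested setup).
# submit-host.example.org is a hypothetical name for the dedicated submit machine.
DedicatedScheduler = "DedicatedScheduler@submit-host.example.org"

# Advertise the attribute in the machine's ClassAd.
STARTD_ATTRS = $(STARTD_ATTRS), DedicatedScheduler

# Always willing to run jobs; never suspend, preempt, or kill them.
START        = True
SUSPEND      = False
CONTINUE     = True
PREEMPT      = False
KILL         = False
WANT_SUSPEND = False
WANT_VACATE  = False

# Prefer jobs that come from the dedicated scheduler.
RANK = Scheduler =?= $(DedicatedScheduler)
```

After changing the config, a condor_reconfig on those machines should
pick it up, and the job then has to be submitted from the host named
in DedicatedScheduler.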

Greetings, Jakob

Sangamesh B wrote:
> Hello,
>       Does condor-7.2 support mpich2 on windows?
> Thanks in advance,
> Sangamesh
> 
> On Mon, Apr 27, 2009 at 12:08 PM, Sangamesh B <forum.san@xxxxxxxxx> wrote:
> 
>     Dear condor users,
>         Has anybody succeeded in running MPI (mpich2-1.0.8p1:
>     http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads)
>     programs with Condor (7.2.1) under MS Windows (XP)?
>         The problem is that the job gets submitted but remains in the "idle" state.
>     Without Condor, the program runs fine:
>     E:\mpich2\examples>mpiexec -n 2 himpi.exe
>     User credentials needed to launch processes:
>     account (domain\user) [SUPPORT-2\Administrator]:
>     password:
>     Greetings: 1 of 2 from the node support-2
>     Greetings: 0 of 2 from the node support-2
> 
>     Under condor:
>     D:\condor\CON-MP~1>condor_q
> 
>     -- Submitter: support-2 : <10.129.150.53:1044> : support-2
>      ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
>       4.0 Administrator 4/27 11:38 0+00:00:00 I 0 0.0 job.cmd
>     1 jobs; 1 idle, 0 running, 0 held
> 
>     D:\condor\CON-MP~1>condor_q -analyze
> 
>     -- Submitter: support-2 : <10.129.150.53:1044> : support-2
>      ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
>     ---
>     004.000: Run analysis summary. Of 3 machines,
>       3 are rejected by your job's requirements
>       0 reject your job because of their own requirements
>       0 match but are serving users with a better priority in the pool
>       0 match but reject the job for unknown reasons
>       0 match but will not currently preempt their existing job
>       0 are available to run your job
> 
>     WARNING: Be advised:
>       No resources matched request's constraints
>       Check the Requirements expression below:
> 
>     Requirements = ((machine == "support-2.locuzblr.co.in")) &&
>     (Arch == "INTEL") && (OpSys == "WINNT51") && (Disk >= DiskUsage) &&
>     ((Memory * 1024) >= ImageSize) && (HasMPI) && (HasFileTransfer)
> 
>     WARNING: Analysis is meaningless for MPI universe jobs.
>     1 jobs; 1 idle, 0 running, 0 held
>     D:\condor\CON-MP~1>
> 
>     The job submit files are:
>     D:\condor\CON-MP~1>type job.sub
> 
>       universe = MPI
>       executable = job.cmd
>       log = mpilog
>       output = mpioutput
>       error = mpierr
>       machine_count = 1
>       requirements = ( machine == "support-2.locuzblr.co.in" )
>       queue
> 
>     D:\condor\CON-MP~1>type job.cmd
>     E:\mpich2\bin\mpiexec.exe -n 1 E:\mpich2\examples\himpi.exe > proout.txt
> 
>     D:\condor\CON-MP~1>
> 
>     Any clue what's going wrong here?
>     Thanks,
>     Sangamesh
> 
> 
> 
> 