On Jun 22, 2004, at 11:31 AM, Arun nayar wrote:

I have a Cactus executable compiled with MPICH-G2 which I can run using
the mpirun command. If I want to use condor to submit the job, which
universe do I use: vanilla, mpi, or globus?

Is it possible to compile the Cactus executable without the MPI option and
then use the condor job submit to specify that it run as an mpi job? Are
there any advantages/disadvantages to doing so?

If anybody has used these 3 tools (cactus, condor and mpich-g2) together, can
they tell me how mpirun, condor_job_submit and globus_job_run are used in
the above scenario?

I assume you want to run your program on a remote globus-accessible resource. In that case, you have three options for using condor-g to run it.

1) Run as an ordinary mpi job via globus universe: If a single instance of your program doesn't have to span multiple execution sites, this is probably the best option. You'll need to recompile your program with an ordinary version of mpi (i.e. one other than mpich-g2). Then, you'll submit your jobs as globus universe jobs and include the following line in your submit file:

globusrsl = (jobtype=mpi)

2) Use mpi universe and glide-in: With this option, you'll submit your jobs as mpi universe jobs. Then you'll submit glide-in jobs to multiple globus resources, which start up the condor daemons as user jobs. When they report back, condor will start your mpi jobs just like in a local condor pool. You'll have to recompile your program with a version of mpich that the condor mpi universe works with.

3) Use coordinator: If you want to use mpich-g2 and condor-g, this is the option you'll need. To coordinate the nodes of your mpi job across multiple sites, mpich-g2 expects your submit machine to act as a coordinator. globus-job-run/globusrun can do this, but condor-g doesn't. We've written a coordinator that runs on top of condor-g. It's very much beta-quality software, but one set of users has been using it successfully for a while now. You'll write a globus universe submit file for each resource that you want to include in your job. Then you'll run the coordinator as a scheduler universe job, giving the submit files as arguments.
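
To make options 1 and 2 a bit more concrete, here are rough sketches of what the submit files might look like. Treat them as starting points only: the gatekeeper contact string, the executable names, the file names, and the process counts are all placeholders I made up, not values from your setup, and you should check the submit commands against the manual for the condor version you have installed.

```
# Option 1 sketch: globus universe, with the remote jobmanager launching
# the job as an mpi job. gatekeeper.example.edu/jobmanager-pbs, cactus_mpi
# and count=4 are placeholder values.
universe        = globus
globusscheduler = gatekeeper.example.edu/jobmanager-pbs
executable      = cactus_mpi
globusrsl       = (jobtype=mpi)(count=4)
output          = cactus.out
error           = cactus.err
log             = cactus.log
queue
```

```
# Option 2 sketch: mpi universe job that runs on glided-in resources.
# cactus_mpich and machine_count = 4 are placeholder values.
universe      = mpi
executable    = cactus_mpich
machine_count = 4
output        = cactus.out
error         = cactus.err
log           = cactus.log
queue
```

In the first sketch, the count attribute in the RSL is what asks the remote site for the number of processes. In the second, machine_count plays that role on the condor side. The glide-ins for option 2 are started separately (condor ships a condor_glidein tool for that), so they don't appear in the submit file. I haven't sketched option 3, since the coordinator's arguments depend on the beta version you'd get from us.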

The big question is how much you need mpich-g2 (versus another version of mpi). Is it alright for each job to run at a single site?

