
Re: [condor-users] synchronously starting multiple jobs in Condor



On Wed, Nov 05, 2003 at 11:50:09PM +0200, Mark Silberstein wrote:
> The only way (that I am aware of) to force jobs to start at once on
> all machines is the MPI universe. For our programs, which have to be
> started this way, we wrap them in MPI_Init and MPI_Finalize and run
> them as if they were MPI applications in the MPI universe. This start
> barrier functionality is implemented in the dedicated schedd, which
> runs for the MPI universe only, and I don't think there is any way to
> hack it to do the same for non-MPI jobs.
> I remember that there were talks about a generic parallel universe.
> Where is it now, and are there any future plans for it? Developers -
> answer our call! I think this 'start barrier' feature is a rather
> necessary thing.
> Mark

Mark, you're right, we don't yet have a true "start barrier" that we're
happy with. However, you're on the right track with using the MPI universe.
Note that this approach only works on UNIX - I'd have to read the code for
a while to tell you how to do it on Windows, but it should be possible there.

It turns out that there's nothing in the MPI universe that forces the job to
be an MPI program. What is magical is the way we start jobs up - the MPI
universe assumes that the job will follow the startup method of the
ch_p4 device of MPICH. If your parallel application can follow that, you
can use Condor to start a parallel job roughly synchronously.

Here's how it works: When you say in your submit file:

universe = mpi
executable = testparallel
log = testparallel.log
output = output.$(NODE)
error = error.$(NODE)
machine_count = 4
should_transfer_files = yes
when_to_transfer_output = onexit
queue

The testparallel job will get four machines allocated to it, and Condor will
run testparallel on the "first" node of the four. That job is then responsible
for telling Condor when to start the job on the other nodes, and Condor gives
you two things to help:

1. The first process that starts up will get two new arguments:
   -p4pg filename

The file specified by filename will have the names of all of the machines
allocated by Condor for the job, formatted one per line:
firstmachinename 0 condor_exec
secondmachinename 1 condor_exec
thirdmachinename 1 condor_exec
fourthmachinename 1 condor_exec
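For illustration, here is a hypothetical sketch of how node 0 might parse
that file. Nothing here is code that Condor ships - the function name and
tuple layout are made up; only the three-field line format comes from the
listing above:

```python
# Hypothetical sketch: parse the file named by the -p4pg argument.
# Assumes the three-field format shown above: hostname, a 0/1 flag
# (0 marks the node that is already running), and the executable name.
import sys

def parse_p4pg(path):
    """Return a list of (hostname, flag, executable) tuples."""
    nodes = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) == 3:
                host, flag, prog = fields
                nodes.append((host, int(flag), prog))
    return nodes

if __name__ == "__main__":
    # Node 0 would find "-p4pg <file>" somewhere in its argv:
    if "-p4pg" in sys.argv:
        machine_file = sys.argv[sys.argv.index("-p4pg") + 1]
        for host, flag, prog in parse_p4pg(machine_file):
            print(host, flag, prog)
```

The first entry (flag 0) is the node your process is already running on;
the remaining entries are the machines still to be started.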

2. Condor provides a "replacement" for rsh that uses the Condor job
management machinery to start up the next process in the parallel job. It
should be the first rsh in your path, but you can always use the
environment variable P4_RSHCOMMAND to find it. When your job decides
to start the next machine, it can pass arguments to that machine. The one
caveat: one of your arguments MUST be condor_exec, and we don't pass any
arguments before condor_exec to the job. So, if your job invokes rsh like
this:

system("rsh nextnode -l something -t whatever condor_exec -port 1234")

your job will only get
"condor_exec" "-port" "1234" as arguments.
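Putting the two pieces together, node 0's launcher loop might look roughly
like this. This is a hypothetical sketch - the function, host list, and
port argument are invented for illustration; only P4_RSHCOMMAND and the
condor_exec argument convention come from the description above:

```python
# Hypothetical sketch of node 0 launching the remaining nodes through
# Condor's rsh replacement. P4_RSHCOMMAND and the condor_exec convention
# are described in the text; everything else here is illustrative.
import os
import subprocess

def launch_remaining(hosts, port):
    """Start the job on every host after the first via the rsh replacement."""
    rsh = os.environ.get("P4_RSHCOMMAND", "rsh")
    procs = []
    for host in hosts[1:]:  # hosts[0] is the node already running
        # Anything before "condor_exec" is stripped by Condor's rsh
        # replacement; the remote job sees only: condor_exec -port <port>
        procs.append(subprocess.Popen(
            [rsh, host, "condor_exec", "-port", str(port)]))
    return procs
```

In a real job the host list would come from the -p4pg file, and the port
would be whatever rendezvous mechanism your application uses to let the
other nodes connect back to node 0.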


We're not happy with this solution, and we've written some of the beginnings
of a general parallel job solution. The big thing that our current MPI
implementation doesn't allow you to do is make any sort of decision about
how to start up the job before ANY nodes have run. We have plans to allow
jobs to have a script that runs on the submit machine, underneath the
condor_shadow, with access to the ClassAds of the machines that were
allocated to your job. For example, maybe you've got a dedicated cluster
with Myrinet, and you want to carefully control the order your nodes are
started in (maybe you do some sort of Cartesian domain decomposition,
and you want to place jobs to match the Myrinet switch topology).

I'm pretty sure that my instructions aren't 100% complete - I'm mostly writing
from memory and glancing at the code as I go, and I've never actually
tried this, but I think it should work. If you try it, please respond to
the list and let us know what else we need to do.

-Erik
	
> On Tue, 2003-11-04 at 14:57, Hahn Kim wrote:
> > My group has developed a Matlab library, called MatlabMPI, which 
> > implements a subset of the MPI library.  Currently, it launches Matlab 
> > on multiple machines by sending commands via rsh.  Currently, we are 
> > trying to integrate MatlabMPI with Condor.
> > 
> > Like MPI, all processes in a MatlabMPI program must start executing at 
> > the same time.  Otherwise, any process that needs to communicate with an 
> > idle process will cause the MatlabMPI program to hang.
> > 
> > We have been trying to figure out if there is a way to force Condor to 
> > synchronously start executing a set of Matlab processes distributed 
> > across a cluster.  Does any one have any ideas?  Is this functionality 
> > built into Condor, or will this require a hack?
> 
Condor Support Information:
http://www.cs.wisc.edu/condor/condor-support/
To Unsubscribe, send mail to majordomo@xxxxxxxxxxx with
unsubscribe condor-users <your_email_address>