Re: [condor-users] synchronously starting multiple jobs in Condor
- Date: 05 Nov 2003 23:50:09 +0200
- From: Mark Silberstein <marks@xxxxxxxxxxxxxxxxxxxxxxx>
- Subject: Re: [condor-users] synchronously starting multiple jobs in Condor
The only way (that I am aware of) to force jobs to start at once on
all machines is the MPI universe. For our programs, which have to be
started this way, we wrap them in MPI_Init and MPI_Finalize and run
them as if they were MPI applications in the MPI universe. This
start-barrier functionality is implemented in the dedicated schedd,
which runs for the MPI universe only, and I don't think there is any
way to hack it to do the same for non-MPI jobs.
I remember there was talk about a generic parallel universe. Where
does it stand now, and are there any future plans? Developers, answer
our call! I think this 'start barrier' feature is a rather necessary
thing.
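As a sketch of what this looks like in practice (the executable name and
machine count here are made up for illustration), a submit description
file for the MPI universe might look like:

```
# MPI universe: the dedicated schedd claims all machines first,
# then starts every node of the job at once (the "start barrier").
universe      = MPI
executable    = mpi_wrapper     # hypothetical wrapper that calls MPI_Init,
                                # runs the real program, then MPI_Finalize
machine_count = 4
log           = wrapper.log
output        = wrapper.out.$(NODE)
error         = wrapper.err.$(NODE)
queue
```

Note that MPI universe jobs only match machines managed by the dedicated
scheduler, so the pool must be configured with dedicated resources for
this to work.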
On Tue, 2003-11-04 at 14:57, Hahn Kim wrote:
> My group has developed a Matlab library, called MatlabMPI, which
> implements a subset of the MPI library. It currently launches Matlab
> on multiple machines by sending commands via rsh. We are now trying
> to integrate MatlabMPI with Condor.
> Like MPI, all processes in a MatlabMPI program must start executing at
> the same time. Otherwise, any process that needs to communicate with an
> idle process will cause the MatlabMPI program to hang.
> We have been trying to figure out if there is a way to force Condor to
> synchronously start executing a set of Matlab processes distributed
> across a cluster. Does anyone have any ideas? Is this functionality
> built into Condor, or will this require a hack?
Condor Support Information:
To Unsubscribe, send mail to majordomo@xxxxxxxxxxx with
unsubscribe condor-users <your_email_address>