
Re: [Condor-users] New to Condor - Difficult (I think) problem...



What you want to do is tricky.  For type 3 (Multi-CPU, Multi-Machine),
check out the MPI or PVM universes; they might be able to be wrangled
into what you want.

For type 1 vs 2, check the archives for a message from me, subject
"Condor and Blast on multi-processor compute nodes".  I've been trying
to do much the same thing for a while, and have given up (in my case I
can rearrange the original problem in a way that avoids this particular
technique being required), but my message details a couple of techniques
I tried and why they failed.  It might spark something for you (and if
you do get it going, *please* let me know ;-))
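
To make the type 1 vs 2 conflict concrete, here is a rough sketch in Python of the placement rules described in the quoted message below. It is only a toy model of the constraint, not Condor configuration or any Condor API; the `Machine` class, `pick_machines`, and the machine names are all hypothetical, and the dual-CPU default is taken from the original question.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Hypothetical model of one dual-CPU node; not a Condor object."""
    name: str
    cpus: int = 2
    running: list = field(default_factory=list)  # CPUs held by each running job

    def free_cpus(self) -> int:
        return self.cpus - sum(self.running)

    def can_start(self, job_type: int) -> bool:
        if job_type == 1:
            # Type 1 only needs one free CPU.
            return self.free_cpus() >= 1
        # Types 2 and 3 need the machine completely idle.
        return not self.running

    def start(self, job_type: int) -> bool:
        if not self.can_start(job_type):
            return False
        # Type 1 holds one CPU; types 2 and 3 hold every CPU on the machine.
        self.running.append(1 if job_type == 1 else self.cpus)
        return True


def pick_machines(pool, job_type, slaves=1):
    """Return machines the job could start on now, or [] if it has to wait."""
    # Type 3 needs a master plus `slaves` further idle machines.
    needed = 1 + slaves if job_type == 3 else 1
    candidates = [m for m in pool if m.can_start(job_type)]
    return candidates[:needed] if len(candidates) >= needed else []


pool = [Machine("node-a"), Machine("node-b")]
pool[0].start(1)                                   # one type 1 job on node-a
print([m.name for m in pick_machines(pool, 2)])    # only an idle machine qualifies
print([m.name for m in pick_machines(pool, 3)])    # needs two idle machines
```

In this model a type 2 job can wait indefinitely if type 1 jobs keep taking the CPU that has just been freed, since the machine is then never fully idle. That starvation is the part that makes this hard to arrange.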

Craig Miskell


> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx 
> [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Bob Mortensen
> Sent: Tuesday, 16 May 2006 5:24 a.m.
> To: Condor-Users Mail List
> Subject: Re: [Condor-users] New to Condor - Difficult (I think) problem...
> 
> One thing I failed to mention is that this is an existing application that
> implements its own multi-CPU/multi-machine architecture, and we do not want
> to have to build any "condor smarts" into the application itself, although
> wrapper scripts are perfectly reasonable.
> 
> Thanks again,
> Bob Mortensen
> 
> On Mon, 15 May 2006 09:59:01 -0700, Bob Mortensen  
> <condor@xxxxxxxxxxxxxxxxxxxx> wrote:
> 
> > Hi All,
> >
> > I'm new to condor and distributed computing, so the problem I'm trying to
> > solve may be trivial, difficult or impossible; briefly, here is what I
> > need to do.
> >
> > We have a pool of multi-CPU (actually dual-CPU) Windows machines that we
> > would like to maximize the use of CPU time on. We have three types of jobs
> > to be run with the following requirements for each job type:
> >
> > 1. Single-CPU (about 80% of jobs). These jobs require only one CPU and
> > thus can run concurrently on the same multi-CPU machine up to the number
> > of CPUs on the machine. This seems easy enough and should work "straight
> > out of the box".
> >
> > 2. Multi-CPU (about 15% of jobs).  These jobs require all the CPUs on the
> > machine and no other job running on the machine. The application will take
> > care of starting its own processes/threads to make full use of all CPUs.
> >
> > 3. Multi-CPU, Multi-Machine (about 5% of jobs). These jobs require
> > multiple multi-CPU machines, one master and one or more "slaves". Each
> > machine will be dedicated to this job (i.e. no other jobs on these
> > machines). The application, running on the "master" machine, will take care
> > of starting its own processes/threads (local and remote) to fully utilize
> > the machines assigned to the job. In addition, the "master" machine needs
> > to get a list of all the "slave" machines. (It may be sufficient to limit
> > this to one slave.)
> >
> > Once started, each job must complete before another is started. If it
> > helps, we may be able to identify two machines to handle the "Multi-CPU,
> > Multi-Machine" case, as long as they can also run type 1 and 2 jobs when
> > type 3 jobs are not in the queue. Writing scripts around the application
> > to gather information to pass to the application is also a possible
> > solution (we have MKS and perl available on all machines).
> >
> > If this is fairly straightforward, please say so, but also point in the
> > direction of some documentation and preferably examples.
> >
> > Any pointers and/or advice will be greatly appreciated.
> >
> > Thanks,
> > Bob Mortensen
> > _______________________________________________
> > Condor-users mailing list
> > Condor-users@xxxxxxxxxxx
> > https://lists.cs.wisc.edu/mailman/listinfo/condor-users