
Re: [Condor-users] Condor on Mac OS X



On Sat, Jan 29, 2005 at 03:18:52PM +0100, Eric Circlaeys wrote:
> On Jan 28, 2005, at 5:50 PM, Erik Paulson wrote:
> 
> >>I would like to use different versions of MPI with different
> >>interconnects (GigE & Myrinet, for example). Is that possible?
> >>I would also like to know whether it is possible to use LAM MPI?
> >>
> >
> >Not (easily) in the currently released version of Condor. In later
> >6.7 releases, we'll have the generic parallel universe and support
> >any type of MPI you can think of.
> 
> I would like to set this up because I would prefer to use Condor 
> instead of PBS.
> 
> Is it possible to get access to the beta version? Which release is 
> planned to support this feature?
> Do you have a roadmap that gives the release date?
> On Mac OS X I would definitely like to use LAM with GigE and IB, based 
> on the IBM compilers, and use MVAPICH and the GM stuff from Myrinet...
> 

We don't have a date set. 

> I have a quick question too:
> I am trying to find the requirements syntax to submit a job that requires a 
> dedicated machine even if it only uses one CPU.
> I have a pool of dual-processor hosts and I would like to run a single job 
> on one processor of one of these hosts, but be sure to get all of the 
> computer's resources, so that other jobs cannot use the other, free CPU.
> Can you help me?
> 

The easiest way is to configure Condor to start only one job per
machine, regardless of the number of processors: just put
NUM_CPUS = 1 in your config file.
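
A minimal sketch of that setting, assuming it goes into the machine's
local config file (the file name and the restart note are assumptions,
not something stated above):

```
# condor_config.local on each dual-processor host (assumed location)
# Advertise a single CPU so that only one job is started per machine.
NUM_CPUS = 1
```

The startd would most likely need to be restarted for a change in the
number of CPUs to take effect.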

You can also write policy expressions for VMs that are aware of what is
happening on the other CPUs, so you can have the second CPU refuse to start
a job if the first CPU on the machine is busy. Or, you can create a "third"
CPU, and have the first and second CPUs evict the jobs running there and
refuse to start new ones if a job that wants the entire machine to itself
arrives. I would point you at this writeup:

http://www.cs.wisc.edu/~pfc/bologna_batch_system.html

that shows some of these tricks.
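
As a rough, untested illustration of the first trick (the second CPU
refusing work while the first is busy), a policy could look something like
the sketch below. STARTD_VM_EXPRS, VirtualMachineID and vm1_State are not
named in this message; they are assumptions to check against the manual
for your Condor version.

```
# Hypothetical sketch for a dual-CPU machine (two VMs); untested.
# Publish each VM's State into the ClassAds of the other VMs on the
# same machine, where it is expected to show up as vm1_State, vm2_State.
STARTD_VM_EXPRS = State

# VM 1 starts jobs as usual; VM 2 refuses to start a job while VM 1
# is claimed. =!= also evaluates to TRUE if vm1_State is undefined.
START = (VirtualMachineID == 1) || (vm1_State =!= "Claimed")
```

This only covers refusing new jobs. It does not evict a job already
running on the second CPU when the first one becomes busy; that would
need the eviction-related expressions as well.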


> >>I installed Condor version 6.7.3 on Mac OS X Server 10.3.7.
> >>At first I failed to use the same shared condor user home directory with
> >>./condor_configure. In the end I installed it manually on each node.
> >>Do you have a tutorial or an example of how to use the same shared home
> >>directory?
> >>
> >
> >How did it fail?
> 
> I will try again on Monday, but as I remember, I ran 
> ./condor_configure on the master, installing everything in the shared condor 
> user homedir. Then I logged in on a node as the condor user and ran 
> ./condor_configure again to set up the node as a worker, and I got file 
> conflicts; this seemed to modify the master's earlier configuration.
> Do you have guidelines for installation with the same repository? I did not 
> find any help in the documentation.
> 

condor_configure creates a directory called local.<hostname>, and puts a 
condor_config.local in that directory. You can use some of the guidelines
found in

http://www.cs.wisc.edu/condor/manual/v6.7.3/3_10Setting_Up.html#SECTION004102000000000000000

They should help you use LOCAL_CONFIG_FILE to set up a shared 
installation of Condor.
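
A minimal sketch of what this could look like in the shared, global
condor_config, assuming the local.<hostname> directories sit directly
under the release directory (that location is an assumption; adjust it
to wherever condor_configure created them):

```
# Shared condor_config, read by every machine (sketch, untested)
# $(HOSTNAME) expands to the local machine's name, so each host picks
# up its own directory and its own condor_config.local.
LOCAL_DIR         = $(RELEASE_DIR)/local.$(HOSTNAME)
LOCAL_CONFIG_FILE = $(LOCAL_DIR)/condor_config.local
```

With this layout, machine-specific settings go in each host's
condor_config.local, while the global file stays identical on all hosts.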

-Erik