
Re: [Condor-users] use free machines first but overload cpu's



On Thu, Jan 19, 2006 at 06:26:14PM +0100, van Pee wrote:
> Hi all,
> 
> My problem is the following: All users should have the same priority and 
> can use all machines. It's intended to
> give all users maximum throughput. If there are small jobs which can be 
> parallelised they should always run!

Harald,

I'm a bit puzzled: first you're talking about vanilla (which is fine for
a lot of applications), then you want to parallelise. Condor vanilla is
meant for *serialised* tasks. If you want parallel execution, you will
need the MPI universe.
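
Just for reference, an MPI universe submit file looks roughly like this
(binary name and node count are placeholders, and the executable has to be
an MPICH-linked binary as the MPI universe expects):

  universe      = MPI
  executable    = my_mpi_prog
  machine_count = 6
  log           = mpi.log
  queue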

Let me assume that you meant "split a task into n subtasks which can run
independent of each other" - then it can happen that the same CPU (or
virtual machine, in Condor-speak) will process all n jobs if no other
resources are free. 
Remember that a maximum throughput solution may be unfair to individual 
users! It's the overall throughput that counts - there's no guarantee that
your individual job batch will be finished within a given time frame.
(Of course there are means to tweak the configuration to favor certain
classes of tasks, but that's not what you'd like to have at the very
beginning of your Condor experience.)
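
Such a batch of n independent subtasks is exactly what the vanilla universe
handles well - a single submit file along these lines (names are
placeholders) queues all of them at once:

  universe   = vanilla
  executable = mytask
  arguments  = $(Process)
  output     = out.$(Process)
  error      = err.$(Process)
  log        = batch.log
  queue 6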

> If I use just as many CPUs as there are (6 at the moment) then I can 
> run just 6 jobs at once. If there were
> a user who wants to run a job split across 6 CPUs (on a file basis) which 
> takes 5 minutes in total, it could happen that
> he has to wait for hours or days for this job, which is not acceptable.

If you have n CPUs, and don't redefine virtual machines, there will be a 
one-to-one mapping of CPUs to VMs, correct. Each of those VMs will get
negotiated (by the negotiator on the central manager) and matched with a
job, and once it has finished its work it will receive the next chunk of
work. In our setup, a VM negotiated for a certain user will stay assigned
to that user until it runs out of work - so if you manage to grab at least
one CPU the odds are good that the whole batch finishes in limited time.
 
> with NUM_CPUS = ,
> I can change this, but it seems that Condor first uses all 6 (of course 
> virtual) CPUs of the first machine
> and then starts with the next one!

That depends on the negotiation cycles and will randomize over time.
You may prioritise lightly loaded machines using Rank = (7 - VirtualMachineID)
in the submit file - a sketch follows.
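
A sketch of both ends (assuming every box should advertise 6 VMs; the
constant 7 is simply that number plus one):

  # local condor_config on the execute nodes
  NUM_CPUS = 6

  # submit file: prefer the lowest-numbered VM on any machine, so jobs
  # spread across machines before they stack up on one box
  Rank = (7 - VirtualMachineID)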

> What I want to have is:
> I allow a maximum of 4 jobs per real CPU. We have 2 types (later 3 or 4 
> types) of CPUs: fast and faster.
> Condor should use
> 1. all faster CPUs with one job each
> 2. all fast CPUs with one job each

Use Rank to prefer faster CPUs, based on the ClassAd attributes 
related to speed (Mips or the like). To prefer the slow machines, use
100000 - Mips :-)
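
In the submit file that could be as simple as (Mips and KFlops are the
benchmark figures the startd advertises for each machine):

  Rank = Mips
  # or, to prefer the slow boxes: Rank = 100000 - Mips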

Are you sure you want to run 4 jobs on a single CPU? What about real
and virtual memory? If the machine starts swapping, your execution 
times may explode.
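
If you do overcommit the CPUs, at least make the jobs match only VMs that
still advertise enough memory - something along these lines (the 500 MB
is just a guess at your job size):

  Requirements = (Memory >= 500)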

> if there are 6 jobs each real cpu should run one of them.
> if there are 12 jobs, each real cpu should run two of them
> and so on!

What's the point? If every real CPU has a single job to run it will do
so 100% of the time, and finish after time T. If the same real CPU 
(split into 2 VMs) has to run 2 jobs, it will run each job at max 50%,
and finish both after 2*T (or later, if swapping has to be accounted for).
In both cases, 2 jobs will be done after 2*T - the one-to-one solution
is far more predictable.
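
To put numbers on it: with T = 10 minutes per job, the one-to-one schedule
hands back the first result after 10 minutes and the second after 20, while
two VMs on the same CPU hand back nothing before minute 20 (and later still
once swapping sets in). Same total throughput, but the serial schedule gets
the first result out twice as early.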

> For me the condor configuration is too sophisticated and I don't find the
> correct setting for the above task. Therefore it would be very helpful 
> if someone can lead me in the right direction.

Don't try to do everything at the same time. Serialisation is a good
thing (unless you're MPIing). If you need dependencies between your jobs,
DAGMan will be your friend...
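
A minimal DAG description (file names invented) would be

  # my.dag
  JOB  A  prepare.sub
  JOB  B  compute.sub
  PARENT A CHILD B

submitted with condor_submit_dag my.dag.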

Cheers,
 Steffen

-- 
Steffen Grunewald * MPI fuer Gravitationsphysik (Albert-Einstein-Institut)
SciencePark Golm, Am Mühlenberg 1, D-14476 Potsdam * http://www.aei.mpg.de
* e-mail: steffen.grunewald(*)aei.mpg.de * +49-331-567-{fon:7233,fax:7298}
No Word/PPT mails - http://www.gnu.org/philosophy/no-word-attachments.html