
Re: [Condor-users] Condor and GPUs



Our experiences and requirements broadly echo what's already been discussed in this thread. The need for a form of dynamic multi-core support was raised at the HTC workshop in Edinburgh last autumn. Ideas were converging on what Ian Chesal mentioned in his post in this thread, i.e. the startd would subtract whatever is currently being used by a multi-core job and advertise the remainder. Such support would also allow jobs to be discriminated between on the number of cores they use, e.g. I may want to preempt a running job if another comes along that uses more cores.
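
To make the idea concrete, here is a purely illustrative sketch of what we have in mind; none of these knobs or the RequestCpus attribute exist today, so the names are invented for the example:

    ## startd side: advertise the whole machine once and let Condor carve
    ## off whatever a matched job asks for, re-advertising the remainder
    SLOT_TYPE_1               = cpus=100%, memory=100%
    SLOT_TYPE_1_PARTITIONABLE = TRUE
    NUM_SLOTS_TYPE_1          = 1

    ## and a startd RANK that favours wider jobs, so that (via the usual
    ## rank preemption) a job asking for more cores can displace a narrower one
    RANK = TARGET.RequestCpus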

One way that our users make use of multi-core machines is to run parallel/MPI jobs that don't span more than one SMP host. Performance-wise these are great, but because they operate under the parallel universe they necessitate a dedicated scheduler, which is a pain in our flocked environment (we currently have 13 pools and climbing); ideally we'd like this restriction removed for such single-host jobs.
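
For illustration (hostnames and filenames invented, and the exact knobs may vary by version), this is roughly what such a single-host MPI job involves today; it runs fine, but only via a submit host that has been set up as the dedicated scheduler:

    ## on each execute node, in the local config (dedicated-scheduler setup)
    DedicatedScheduler = "DedicatedScheduler@submit.example.ac.uk"
    STARTD_ATTRS = $(STARTD_ATTRS), DedicatedScheduler

    ## submit file: an MPI job pinned to a single SMP host
    universe      = parallel
    executable    = mpi_wrapper.sh
    machine_count = 4
    requirements  = (Machine == "smp01.example.ac.uk")
    queue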

On the GPU front, quite a few of our users have caught the bug and would like Condor to recognise a video card that can be scavenged and advertise it suitably ('cuda', 'ati', 'OpenCL', etc.), maybe even the card type itself. However, we realise that this is quite a new and fast-moving field, and so tricky for the Condor team to work against.
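
Something along these lines can already be hand-rolled per node (the attribute names and card string below are just examples); what we're really after is Condor discovering and advertising this by itself:

    ## execute-node config: publish home-made GPU attributes in the machine ad
    HAS_CUDA     = True
    HAS_OPENCL   = True
    GPU_CARD     = "GeForce GTX 280"
    STARTD_ATTRS = $(STARTD_ATTRS), HAS_CUDA, HAS_OPENCL, GPU_CARD

    ## a user's submit file can then match on it
    requirements = (TARGET.HAS_CUDA =?= True)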

Just my/our two cents worth...

Mark

Steffen Grunewald wrote:
On Thu, Jul 03, 2008 at 11:41:50AM -0400, Ian Chesal wrote:
Exactly. Pre-defined static slots have to be replaced by
something that's aware of the whole picture (sees the
*machine* that hosts the slots which can be dynamically
reconfigured, or even created on demand).
I'm at the point where this is almost becoming a need, not a want. We're
parallelizing code left right and center to take advantage of multi-core
CPUs and our admins are flipping configurations on an almost daily basis
to load balance parallel v. serial jobs in our pools.
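
To give a flavour of what gets flipped: the per-machine split is a static slot-type definition along these lines (numbers invented), re-edited whenever the parallel/serial balance shifts:

    ## e.g. on an 8-core box: one 4-core slot for parallel jobs today, plus
    ## four single-core slots for serial work -- until someone edits it again
    SLOT_TYPE_1      = cpus=4, memory=1/2
    NUM_SLOTS_TYPE_1 = 1
    SLOT_TYPE_2      = cpus=1, memory=1/8
    NUM_SLOTS_TYPE_2 = 4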

IMHO multi-core support, and the possible addition of other resources
to manage, is the most pressing need these days (and Intel's recent
announcement cited at http://www.c0t0d0s0.org/archives/4571-Intel-finally-admits-it....html seems to give us only a couple of years). GPU support will automatically show up when the set of resources is
extended, as will SPE support for the Cell.

Of course, with dynamic slot creation, another problem comes along:
If a machine is already partially taken, how to define a
ranking among machines to allow for maximum flexibility in
the future?
Prior to using Condor, our home-grown solution allowed for dynamic
machine/slot allocations *but* we handled the scenario you described by
simplifying things down to only a handful of constraints per job: OS,
number of CPUs required, and memory. We always negotiated for the
biggest-CPU, biggest-memory jobs in the queue first, taking the approach
that you fill the jar with rocks, then pebbles, then sand.

Future Condor development/extension should be aware of multi-threaded applications (requiring multi-core slots!) -- i.e. a "thread count" resource.
Currently I've got a user who is running 2-thread apps and forcing his
way onto the corresponding nodes by faking a huge memory requirement.
That's ugly, but at the moment it's the only way to get a single-slot machine with 2 CPU cores.
Needless to say, the CPUs are wasted when he's not around.
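
To make the ugliness concrete, the trick amounts to a submit file like this (sizes invented): the inflated image_size abuses the default (Memory * 1024 >= ImageSize) matching so that only the big-memory, single-slot dual-core nodes qualify:

    universe   = vanilla
    executable = threaded_app
    ## claim a footprint of ~3.5 GB (image_size is in KiB) although the app
    ## needs far less -- only the fat nodes match, so he gets both cores
    image_size = 3500000
    queue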

What Carsten suggested in a previous mail: have "network bandwidth" as
a resource, too. In particular with multi-core (multi >> 2) machines,
this cannot be neglected anymore.

On the other hand, MPI applications would profit from locality, even
in a NUMA setup (such as multiple Opterons provide: their HyperTransport
links are faster than the standard network, which is attached via another
HT link anyway).
Other apps would prefer to be matched to different machines as long as this
is possible (to spread the generated heat and reduce silicon wear).

I certainly don't envy the Condor Team -- I know Derek has talked about
adaptive machine setups but how it'd work in the face of all those
constraints I can't imagine. Maybe it'd make a good thesis? Whoever
does get this into Condor is my hero though. :)

:-) Indeed, sounds like a major transition. The "vm -> slot" one was
nothing compared with that, and it took quite a while...

IMHO all boils down to dynamic slot definition. Something
that would no longer happen on the execute node but on the master ...
Interesting. So the startds would tell the collector what they have in
total. And the negotiator would read this, assign a job, subtract what
the job estimates it will use (or what it says it wants), and update the
machine's ad in the collector. Sort of a "best guess" ad. And then
the startd can correct anything the negotiator got wrong at a later
point in time. Interesting...
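
As a sketch of those two ads (the resource attributes below exist in today's machine ads; the subtraction step is the invented part):

    ## what the startd reports for the whole machine
    TotalCpus   = 8
    TotalMemory = 16384
    Cpus        = 8
    Memory      = 16384

    ## after the negotiator matches a job estimating 2 cores / 4 GB, the
    ## "best guess" ad left in the collector until the startd reports reality
    Cpus   = 6
    Memory = 12288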

Actually, since there is no defined number of slots anymore, there would
probably be only one startd per machine. But basically you're right.

Currently, "only" the shadow gets notified of changes in the memory
footprint. In the future, the whole of Condor should know about it.
This undoubtedly requires users to be honest about their actual requirements,
and (to keep them honest) mechanisms to track them.
Count me among the Condor users who really, really need dynamic machine
slots. Multi-core machines and parallel software are the future in the
EDA industry.

All available hands raised here.

Cheers,
 Steffen


--
Cambridge eScience Centre, University of Cambridge
Centre for Mathematical Sciences, Wilberforce Road, Cambridge CB3 0WA
Tel. (+44/0) 1223 765317, Fax  (+44/0) 1223 765900
http://www.escience.cam.ac.uk/~mcal00