On Thu, Jul 03, 2008 at 11:41:50AM -0400, Ian Chesal wrote:
Exactly. Pre-defined static slots have to be replaced by
something that's aware of the whole picture (sees the
*machine* that hosts the slots which can be dynamically
reconfigured, or even created on demand).
I'm at the point where this is almost becoming a need, not a want. We're
parallelizing code left, right, and center to take advantage of multi-core
CPUs, and our admins are flipping configurations on an almost daily basis
to load-balance parallel vs. serial jobs in our pools.
IMHO multi-core support, and the possible addition of other resources
to manage, is the most pressing need these days (and Intel's recent
announcement cited at
http://www.c0t0d0s0.org/archives/4571-Intel-finally-admits-it....html
seems to give us only a couple of years).
GPU support will automatically show up when the set of resources is
extended, as will SPE support for the Cell.
Of course, with dynamic slot creation, another problem comes along:
If a machine is already partially taken, how to define a
ranking among machines to allow for maximum flexibility in
the future?
Prior to using Condor, our home-grown solution allowed for dynamic
machine/slot allocations, *but* we handled the scenario you described by
simplifying things down to only a handful of constraints per job: OS,
number of CPUs required, and memory. We always negotiated for the
biggest-CPU, biggest-memory jobs in the queue first, taking the approach
of filling the jar with rocks first, then pebbles, then sand.
Future Condor development/extension should be aware of multi-threaded
applications (which require multi-core slots!), i.e. a resource
"thread count". Currently I've got a user who runs 2-thread apps and
forces his way onto the corresponding nodes by faking a huge memory
requirement. That's ugly, but at the moment it's the only way to get a
single-slot machine with 2 CPU cores.
Needless to say, those CPUs are wasted when he's not around.
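For illustration, the workaround looks something like this hypothetical
submit-file fragment (the memory figure is invented; the oversized
Requirements expression is really just a proxy for "give me the
dual-core boxes", since only they advertise that much memory):

    universe     = vanilla
    executable   = my_2thread_app
    requirements = (Memory > 3500)
    queue

With a real "thread count" (or CPU-core) resource, the job could ask
for what it actually needs and the spare core wouldn't sit idle.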
What Carsten suggested in a previous mail: have "network bandwidth" as
a resource, too. In particular with multi-core (multi >> 2) machines,
this cannot be neglected anymore.
On the other hand, MPI applications would profit from locality, even
in a NUMA setup (such as multiple Opterons provide: their HyperTransport
links are faster than the standard network, which is attached via
another HT link anyway).
Other apps would prefer to be matched to different machines as long as
that is possible (to spread the generated heat and reduce silicon wear).
I certainly don't envy the Condor Team -- I know Derek has talked about
adaptive machine setups, but how it'd work in the face of all those
constraints I can't imagine. Maybe it'd make a good thesis? Whoever
does get this into Condor is my hero, though. :)
:-) Indeed, sounds like a major transition. The "vm -> slot" one was
nothing compared with that, and it took quite a while...
IMHO all boils down to dynamic slot definition. Something
that would no longer happen on the execute node but on the master ...
Interesting. So the startds would tell the collector what they have in
total. The negotiator would read this, assign a job, subtract what the
job estimates it will use (or what it says it wants), and update the
machine's ad in the collector -- sort of a "best guess" ad. And then
the startd can correct anything the negotiator got wrong at a later
point in time. Interesting...
Actually, since there would no longer be a defined number of slots, there
would probably be only one startd per machine.
But basically you're right. Currently, "only" the shadow gets notified of
changes in the memory footprint. In the future, the whole of Condor
should know about it.
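A minimal sketch of that "best guess" bookkeeping, with invented names
(nothing here is Condor's actual API): the startd advertises totals, the
negotiator debits an ad by a job's stated requirements, and the startd
can later overwrite the guess with what was actually consumed.

```python
class MachineAd:
    """What a startd advertises: resources currently believed free."""
    def __init__(self, name, cpus, memory):
        self.name = name
        self.cpus = cpus
        self.memory = memory

collector = {}  # machine name -> latest MachineAd

def advertise(ad):
    """startd -> collector: publish (or later correct) the machine ad."""
    collector[ad.name] = ad

def assign(name, req_cpus, req_memory):
    """negotiator: debit the ad by the job's *stated* requirements,
    producing the 'best guess' ad the next negotiation cycle sees."""
    ad = collector[name]
    if ad.cpus < req_cpus or ad.memory < req_memory:
        return False
    ad.cpus -= req_cpus
    ad.memory -= req_memory
    return True
```

The honesty problem mentioned below shows up directly in `assign`: it
trusts `req_cpus`/`req_memory`, so the startd's correction step is the
only place an understated request would ever be caught.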
This undoubtedly requires users to be honest about their actual
requirements, and (to keep them honest) mechanisms to track those
requirements.
Count me among the Condor users who really, really need dynamic machine
slots. Multi-core machines and parallel software are the future in the
EDA industry.
All available hands raised here.
Cheers,
Steffen