Re: [Condor-users] Looking for something like CpuBusy for Disk
- Date: Fri, 03 Sep 2010 12:24:55 -0700
- From: Lans Carstensen <Lans.Carstensen@xxxxxxxxxxxxxx>
- Subject: Re: [Condor-users] Looking for something like CpuBusy for Disk
Ian Chesal wrote:
On Fri, Sep 3, 2010 at 2:30 PM, Lans Carstensen
The seriously cool way would be to standardize a global resource
classad type and allow for expression logic to have a way to
directly address all resource ads that are deemed to apply to a
particular slot and job.
That certainly takes it a step further. To that end: a way to put *any*
type of ad in to the collector would really be interesting. So you could
collect and reference anything you want and use it during matchmaking
and execution decisions.
Baby steps though. :) I don't want to dream my way out of this being possible.
This reminds me that the other way we address the common use case
for this class of problem is with concurrency limits on jobs - but
that doesn't handle slot-level use cases, only job-level ones.
Partitionable slot-level resource/concurrency limits would be a
useful addition towards that goal. Then you might want to apply
some feedback mechanism to make the upper bound of that
partitionable slot dynamic based upon some business logic (like a
global resource classad).
Ahh...it's fun to dream. In all honesty though, while it's a bit
cumbersome to set up once, Startd cron jobs can handle a lot of what
we're dreaming about here when it comes to slot-level concurrency
controls. And once you get one set up, the rest follow pretty easily.
True, startd crons can be (and have been) used to do all sorts of things -
but I haven't seen one applied to do host-level concurrency limits
(successfully). The "cron"/time-period nature makes doing resource
counters unsafe for resource reservation. Do you have an example of one
of those, or is it (like I currently believe) a gap in functionality
that you end up having to build up custom slot types around to handle?
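
To make the concern about periodic updates concrete, here is a minimal
Python sketch. It is a toy model, not Condor code: the function names, the
tick-based clock, and the assumption that jobs never finish are all
hypothetical simplifications. It compares admission decisions made against
a count that is only refreshed every few ticks (as a periodic probe would
do) with decisions made against a count updated at admission time.

```python
# Toy model (hypothetical, not Condor code): a host admits jobs against a
# limit. Jobs never finish in this model, to keep the arithmetic simple.

def admit_with_stale_counter(arrivals, limit, probe_period):
    """Admit jobs using a usage count refreshed only every probe_period ticks."""
    running = 0      # true number of running jobs
    advertised = 0   # value published by the most recent probe run
    for tick, count in enumerate(arrivals):
        if tick % probe_period == 0:
            advertised = running       # the periodic probe refreshes the value
        for _ in range(count):
            if advertised < limit:     # decision is made on the stale value
                running += 1
    return running


def admit_with_atomic_counter(arrivals, limit):
    """Admit jobs using a count that is updated at the moment of admission."""
    running = 0
    for count in arrivals:
        for _ in range(count):
            if running < limit:
                running += 1
    return running


arrivals = [2, 2, 2, 0]   # jobs arriving per tick
print(admit_with_stale_counter(arrivals, limit=3, probe_period=3))  # 6
print(admit_with_atomic_counter(arrivals, limit=3))                 # 3
```

With a limit of 3, the periodically refreshed count lets 6 jobs in before
the next probe notices, while the counter updated at admission time stops
at 3.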
For instance, say I have a SAN-attached host and want to enable no more
than 3 concurrent SAN IO jobs while also enabling other job types.
Today I'd have to set up a special partitionable slot with a SAN
attribute and start expression to only allow SAN jobs and do something
like dedicating 3 CPUs and some amount of RAM towards that
partitionable slot. Or make 3 SAN slots with dedicated memory
resources. And then add a partitionable slot for all remaining CPU,
memory, and local disk resources. There's no way to apply a "SAN"
resource counter, and no way to alter the number "3" live based on
actual SAN link utilization or storage subsystem latency. Right?
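
As an illustration of the kind of host-based counter meant here, the
following Python sketch shows a counter with an adjustable upper bound.
This is purely hypothetical: the class `HostResourceCounter` and its
methods are made up for this example and are not part of Condor or any
other scheduler. The point is only that claims are checked and recorded
in one atomic step, and that the limit can be changed while jobs run.

```python
import threading


class HostResourceCounter:
    """Hypothetical per-host counter for a named resource such as "SAN"."""

    def __init__(self, limit):
        self._lock = threading.Lock()
        self._limit = limit
        self._in_use = 0

    def try_acquire(self):
        """Claim one unit if the limit allows it; check and update atomically."""
        with self._lock:
            if self._in_use < self._limit:
                self._in_use += 1
                return True
            return False

    def release(self):
        """Return one unit when a job finishes."""
        with self._lock:
            if self._in_use > 0:
                self._in_use -= 1

    def set_limit(self, new_limit):
        """Change the upper bound live, e.g. from a link-utilization probe.

        Lowering the limit does not evict running jobs; it only blocks new
        claims until usage drops below the new bound.
        """
        with self._lock:
            self._limit = max(0, new_limit)

    def in_use(self):
        with self._lock:
            return self._in_use


san = HostResourceCounter(limit=3)
print([san.try_acquire() for _ in range(4)])  # [True, True, True, False]
san.set_limit(2)         # e.g. the SAN is getting slow, so tighten the bound
san.release()            # one SAN job finishes, leaving 2 in use
print(san.try_acquire()) # False, since 2 in use already meets the new limit
```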
If you have a startd cron for that class of use case, we'd be interested
in seeing it. Other competing resource schedulers have host-based
resource counters for this reason.
-- Lans Carstensen