
Re: [Condor-users] counting licenses

On 5/4/05, Joshua Kolden <joshua@xxxxxxxxxxxxxxxxx> wrote:
> Thanks, but the problem here is that other jobs not restricted by
> licenses may have control of these machines when a maya (to use your
> example) job is submitted.  In theory the maya job is free to run on
> another cpu in the queue because licenses are available, however it does
> not run.  This is the problem we are running into now, along with more
> complex examples with multiple licensed software packages interacting.
> I need to have a dynamic expression in which I can tick off licenses use
> independently of the cpu I'm running on.  Are dynamic expressions like
> this even possible?

Dynamic user-defined expressions are not really supported by Condor itself.

Hawkeye allows dynamic behaviour but I don't know if it would fit well
to this particular problem.

Conceptually (ignoring implementation and physical location for now) you need:

A 'licence server' which hands out a licence, takes it back and
reports how many are free to use.

While you could knock such a thing up relatively quickly (especially
if you didn't really send out licences but simply tokens indicating
permission), the tricky bit is integrating such a thing into Condor's
matchmaking process.
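
To make the 'tokens indicating permission' idea concrete, here is a
minimal sketch of such a counter. It is only an illustration: the
LicencePool class and its method names are made up for this example,
it lives in a single process, and it does nothing about the hard part
(getting Condor's matchmaking to consult it).

```python
import threading


class LicencePool:
    """Hypothetical token pool: hands out permission tokens, not real licences."""

    def __init__(self, total):
        self._total = total          # number of licences owned
        self._held = set()           # tokens currently handed out
        self._next_id = 0
        self._lock = threading.Lock()

    def acquire(self):
        """Hand out a token, or return None if no licence is free."""
        with self._lock:
            if len(self._held) >= self._total:
                return None
            token = self._next_id
            self._next_id += 1
            self._held.add(token)
            return token

    def release(self, token):
        """Take a token back; unknown tokens are ignored."""
        with self._lock:
            self._held.discard(token)

    def free(self):
        """Report how many licences are free to use."""
        with self._lock:
            return self._total - len(self._held)


pool = LicencePool(2)
first = pool.acquire()
second = pool.acquire()
third = pool.acquire()   # None: both licences are already out
pool.release(first)
```

The three methods map directly onto the three duties listed above:
hand out, take back, report how many are free.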

Ignore SMP licensing for now and just think of the 'easy' solution
where 1 job == 1 licence.

Say that you have X free nodes and Y free licences where X >> Y.
Condor will simply schedule X jobs (since any externally inserted
attributes will be unchanged during the scheduling pass). You might then
be able (with some horrific expressions) to make it kill jobs to come
back down to Y, but I would think that any progress to a steady state
would be torturous and wasteful, not to mention prone to race
conditions with whatever external process was updating the licences...
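
A toy model of that pass shows the oversubscription. This is not how
Condor's negotiator is implemented; schedule_pass and the numbers are
invented here purely to illustrate what happens when every match in a
pass sees the same unchanged licence count.

```python
def schedule_pass(free_nodes, advertised_free_licences):
    """Toy model of one scheduling pass.

    The licence count is a snapshot inserted from outside before the
    pass starts, so it is never decremented while jobs are matched.
    """
    started = 0
    for _ in range(free_nodes):
        # Every node sees the same stale value, so the check always passes.
        if advertised_free_licences > 0:
            started += 1
    return started


X, Y = 50, 5                      # X free nodes, Y free licences, X >> Y
started = schedule_pass(X, Y)     # jobs started in a single pass
oversubscribed = started - Y      # jobs that would have to be killed again
```

With these numbers all 50 jobs start and 45 of them have no licence,
which is the work that would be thrown away getting back down to Y.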

Realistically Condor could do with a condor_licence daemon and a
simple way for jobs to indicate that they require the licence. Condor
would then be able to control the whole thing much more effectively.
Of course making this a reality is a slightly more complex issue...

I know you don't want to hear this but at the moment any dynamic
solution you come up with is likely to be more wasteful/error prone
than simply assigning a set of machines as maya enabled and having
jobs requiring these machines preempt any non maya jobs running on