Re: [Condor-users] Suspending other jobs when a privileged user submits a new job
- Date: Thu, 18 Oct 2007 14:12:27 +0100
- From: "Kewley, J (John)" <j.kewley@xxxxxxxx>
- Subject: Re: [Condor-users] Suspending other jobs when a privileged user submits a new job
> Hi Matt,
> Thanks for your reply.
> matthew.hope@xxxxxxxxx said:
> > I fear you have misunderstood the capabilities of the SUSPEND functionality.
> > It does not allow you to free up the slot for use by another job. Only to stop
> > a job doing anything for a while (the original idea presumably being that if
> > someone wants to use the machine interactively they can be largely unaffected
> > by the job without totally killing it).
> Hmmm. So you're saying that the job will be suspended, but it
> will continue to occupy a slot?
> I'll take a look at multiple slots and see what I can do. Is
> there no other way to ask condor to overcommit a processor (i.e.,
> have two jobs -- in my case, one suspended -- assigned to the same
There still seems to be a little confusion. Maybe this will help:
* The number of "slots" seen by Condor can be whatever you like:
    o typically 1 per processor core
    o this doubles if hyperthreading is enabled
  You can affect these slots:
    o tell Condor not to consider hyperthreading when allocating slots
    o for multi-core processors, set up a complicated group of overlapping
      slots (see previous posts on this),
      e.g. for a quad-core machine with 4GB RAM, pretend you have a single 4GB RAM proc,
      2x 2GB procs and 4x 1GB procs, and then do some smart config setup so that
      certain combinations of these are disallowed.
    o declare (say) twice as many slots per processor so more jobs run concurrently
      (you might know that jobs spend a long time in I/O, for instance, and your tests
      have shown this gives better throughput).
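
As a rough sketch of the simpler options above (the overlapping-slots trick is not shown), something like the following could go in the local config file on the execute machine. It assumes a 4-core box with hyperthreading, and the value 8 is purely illustrative. COUNT_HYPERTHREAD_CPUS and NUM_CPUS are the settings I believe apply here, but check the manual for your Condor version before relying on them.

```
# Assumption: 4 physical cores with hyperthreading, so Condor would
# otherwise detect 8 CPUs and advertise 8 slots.

# Option 1: ignore hyperthreaded (virtual) CPUs when counting processors,
# giving one slot per physical core.
COUNT_HYPERTHREAD_CPUS = False

# Option 2 (alternative, commented out): overcommit by telling Condor there
# are twice as many CPUs as physical cores, so twice as many slots appear.
# NUM_CPUS = 8
```

Note that with more slots, the machine's memory is shared out between them by default, so each slot advertises less RAM.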
* When non-Condor activity happens on the machine, you can configure Condor to
  behave in a variety of ways by tuning values such as SUSPEND and PREEMPT.
  These values will allow you to specify things like:
    o don't give the interactive user priority; the job continues regardless
    o suspend the job (I believe this will "swap" the job out of memory)
    o kill the job (allowing it to be restarted elsewhere)
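
To make the suspend and kill behaviours above a bit more concrete, here is a minimal policy sketch for the execute machine. The thresholds (60, 300 and 600 seconds) are made-up values for illustration, not recommendations, and the expressions are simplified from the style used in the default config file, so treat this as a starting point only.

```
# Assumption: all time thresholds below are illustrative.

# Allow jobs to be suspended rather than vacated straight away.
WANT_SUSPEND = True

# Suspend the job if the keyboard or mouse was used in the last minute.
SUSPEND = (KeyboardIdle < 60)

# Resume the job once the console has been idle for five minutes.
CONTINUE = (KeyboardIdle > 300)

# If the job has been suspended for more than ten minutes, preempt it
# so that it can be restarted elsewhere.
PREEMPT = (Activity == "Suspended") && ((CurrentTime - EnteredCurrentActivity) > 600)

# To ignore interactive use and continue regardless, you would instead set:
# SUSPEND = False
# PREEMPT = False
```

A suspended job still holds its slot throughout, which is the point Matt was making.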
I hope this helps, although it doesn't directly answer any of your questions.