Re: [Condor-users] preempt and then hold?



> On Fri, Dec 03, 2004 at 11:42:16AM -0600, Scott Koranda wrote:
> > Hi,
> > 
> > For a long time we have set up our pool with 
> > 
> > PREEMPT = False
> > 
> > so that the nodes in our cluster would not preempt a running
> > job for any reason (of course, the negotiator could still
> > cause jobs to be preempted).
> > 
> > Lately, however, a few users have been running jobs that
> > malloc() a lot of memory and then eventually run the machine
> > in full swap, which eventually takes them into the weeds.
> > 
> > So we plan to change our configuration to
> > 
> > PREEMPT = (TARGET.ImageSize > ( 512 * 1024))
> > 
> > since each machine has 512 MB of physical memory (yes, the OS
> > uses some but we don't mind a little use of swap).
> > 
> > The idea is that when the job's memory usage grows, and Condor
> > notices, it will preempt the running job.
> > 
> > Two questions:
> > 
> > 1) Will this work?
> > 
> 
> It should.

How often is ImageSize recomputed for

a) standard universe jobs?
b) vanilla universe jobs?

If we do not do periodic checkpointing, will ImageSize still be
updated for standard universe jobs so that Condor can act on the
PREEMPT expression?
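
For reference, here is a small sketch of the comparison the quoted
PREEMPT expression makes. It assumes ImageSize is reported in KiB,
which is why the threshold is written as 512 * 1024. The function and
constant names are hypothetical and are not part of Condor.

```python
# Sketch only: mirrors the comparison in the quoted PREEMPT expression.
# Assumption: ImageSize is reported in KiB.
# PHYSICAL_MEMORY_MB, THRESHOLD_KIB and would_preempt are hypothetical names.

PHYSICAL_MEMORY_MB = 512
THRESHOLD_KIB = PHYSICAL_MEMORY_MB * 1024  # 512 * 1024 = 524288 KiB


def would_preempt(image_size_kib: int, threshold_kib: int = THRESHOLD_KIB) -> bool:
    """Return True when the image size strictly exceeds the threshold."""
    return image_size_kib > threshold_kib


if __name__ == "__main__":
    # A 300 MB job, a job exactly at the limit, and a 600 MB job.
    for size_kib in (300 * 1024, 512 * 1024, 600 * 1024):
        print(size_kib, would_preempt(size_kib))
```

Note that the comparison is strict, so a job sitting exactly at
524288 KiB would not trigger preemption under this expression.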

Thanks,

Scott