
Re: [Condor-users] Avoiding preemption ping-pong



On Wed, 2012-04-25 at 11:37 +0200, Martin Billinger wrote:
> Hi there,
> 
> I have recently observed that the jobs of two users keep preempting
> each other every few hours.
> 
> We are running a small Condor pool with some dedicated machines.
> Two users have submitted a large number of long-running jobs. Both users 
> have roughly the same priority.
> 
> Now, what happens is that first one of the users gets assigned all 
> available resources. Once EUPs differ by 20%, all of that user's jobs 
> are preempted and the other user gets all resources. This repeats every 
> few hours and causes many cycles to be lost.

You can change that '20%' threshold by editing this:
UWCS_PREEMPTION_REQUIREMENTS = ( $(StateTimer) > (1 * $(HOUR)) && \
    RemoteUserPrio > TARGET.SubmitterUserPrio * 1.2 ) || \
    (MY.NiceUser == True)

So, you could increase '1.2' to something like 1.5, or 2.0, etc.
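
To see why a larger factor helps, here is a rough Python model of the expression above. It is only a sketch, not how the negotiator is implemented: the function name and its parameters are made up for illustration, the priorities are plain numbers standing in for RemoteUserPrio and SubmitterUserPrio (larger means worse priority), and the one-hour StateTimer condition is reduced to a seconds comparison.

```python
def should_preempt(state_timer_s, remote_user_prio, submitter_user_prio,
                   factor=1.2, min_state_time_s=3600, nice_user=False):
    """Simplified stand-in for the UWCS_PREEMPTION_REQUIREMENTS expression.

    remote_user_prio    -- priority value of the user currently running
    submitter_user_prio -- priority value of the user who wants the slot
    factor              -- the '1.2' in the expression (hypothetical name)
    """
    # Preempt only after the slot has been in its state for over an hour
    # and the running user's priority value exceeds the waiting user's
    # by more than the factor; nice-user jobs can always be preempted.
    prio_gap_exceeded = remote_user_prio > submitter_user_prio * factor
    return (state_timer_s > min_state_time_s and prio_gap_exceeded) or nice_user


# Running user is 30% worse than the waiting user, slot busy for 2 hours.
print(should_preempt(7200, 130.0, 100.0))              # factor 1.2
print(should_preempt(7200, 130.0, 100.0, factor=1.5))  # factor 1.5
```

With the default factor the 30% gap triggers preemption; with 1.5 it does not, so the running user keeps the machines until the gap grows wider.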

Also, you can make use of:
# allow up to 24 hours for a job to complete after initial preemption:
MAXJOBRETIREMENTTIME = 3600 * 24