[Condor-users] Preemption issues

Hi All,

I'm looking to disable preemption on some of the systems in my cluster:

$ condor_version 
$CondorVersion: 7.4.0 Nov  1 2009 BuildID: 193173 $
$CondorPlatform: X86_64-LINUX_DEBIAN50 $

The goal is for running jobs never to be interrupted (which I know
isn't quite the same as not preempting claims).

My first attempt, using the example from the manual:

#Disable preemption by machine activity.
#Disable preemption by user priority.
#Disable preemption by machine RANK by ranking all jobs equally.
RANK = 0

still gets jobs preempted due to user priority (I checked the runtime
values with condor_config_val, and the values I expect are the ones
actually in use).
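
For reference, a sketch of what that example looks like when each
comment is followed by a setting. The PREEMPT and
PREEMPTION_REQUIREMENTS lines are an assumption inferred from the
comments, since only RANK appears above:

```
# Assumed reconstruction; only the RANK line is quoted in this message.

# Disable preemption by machine activity (read by the startd).
PREEMPT = False

# Disable preemption by user priority (read by the negotiator, so it
# has to be set in the configuration the central manager uses).
PREEMPTION_REQUIREMENTS = False

# Disable preemption by machine RANK by ranking all jobs equally.
RANK = 0
```

Because PREEMPTION_REQUIREMENTS is evaluated by the negotiator, checking
it only on an execute node may not show the value that is actually
applied during matchmaking.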

My second attempt was to set a high MAXJOBRETIREMENTTIME, as suggested
in the same section. This "works", but queued jobs seem to get stuck to
a node that is doing this slow preemption and are not reassigned to
other resources if they become available. Since some jobs in the
cluster run for minutes and some for weeks, this is not really what I'm
looking for.
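
A minimal sketch of the retirement-time approach, assuming a purely
illustrative limit of 30 days (the value actually used is not stated
here):

```
# Illustrative value only: give a running job up to 30 days (expressed
# in seconds) to finish on its own before it may be killed.
MAXJOBRETIREMENTTIME = 3600 * 24 * 30
```

As far as I understand it, the claim is still preempted under this
setting and the slot simply waits for the running job to retire, which
would be consistent with the queued job staying attached to that node.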

I had thought this was working previously, and it has been an
advertised feature of our cluster for years, but I'm honestly not
certain whether the behaviour has changed or whether it was simply
insufficiently tested in the past.