[Condor-users] Upper bound on Job Run-time



My Condor grid consists mostly of WinXP machines running Condor 6.6.5.  Playing around with condor_userlog, I noticed that there seems to be a 2-hour upper bound on the running time of my jobs.  I didn't configure this, and I don't want it.  Is this controlled by a Condor config file entry?

A typical condor_userlog output:
Job statistics from submission log C:\Jobs\...\Current Jobs\sim11\sim.log
446.0    10.81.1.206     10/19 14:38 10/19 16:38   0+01:59   0+00:00   0+00:00
446.0    10.81.2.53      10/20 14:26 10/20 16:26   0+02:00   0+00:00   0+00:00
446.0    10.64.2.55      10/20 16:44 10/20 18:44   0+02:00   0+00:00   0+00:00
446.0    10.25.2.99      10/20 19:29 10/20 21:29   0+01:59   0+00:00   0+00:00
446.0    10.25.2.99      10/20 21:49 10/20 23:49   0+02:00   0+00:00   0+00:00
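
To make the pattern explicit, here is a minimal Python sketch that recomputes each run length from the start and evict columns of the rows above. The column layout (job, host, start date/time, evict date/time, then three duration fields) is assumed from the output shown, and the `year` argument is an arbitrary placeholder, needed only because the log prints no year.

```python
from datetime import datetime, timedelta

# Rows copied verbatim from the condor_userlog output above.
ROWS = """\
446.0    10.81.1.206     10/19 14:38 10/19 16:38   0+01:59   0+00:00   0+00:00
446.0    10.81.2.53      10/20 14:26 10/20 16:26   0+02:00   0+00:00   0+00:00
446.0    10.64.2.55      10/20 16:44 10/20 18:44   0+02:00   0+00:00   0+00:00
446.0    10.25.2.99      10/20 19:29 10/20 21:29   0+01:59   0+00:00   0+00:00
446.0    10.25.2.99      10/20 21:49 10/20 23:49   0+02:00   0+00:00   0+00:00
"""

def run_lengths(text, year=2004):
    """Return (host, evict - start) per row.

    `year` is a placeholder assumption; the log lines carry only month/day.
    Runs that cross a year boundary are not handled in this sketch.
    """
    fmt = "%Y/%m/%d %H:%M"
    results = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or header lines
        start = datetime.strptime(f"{year}/{fields[2]} {fields[3]}", fmt)
        evict = datetime.strptime(f"{year}/{fields[4]} {fields[5]}", fmt)
        results.append((fields[1], evict - start))
    return results

for host, length in run_lengths(ROWS):
    print(host, length)
```

At minute resolution, every run in this sample ends exactly two hours after it starts, regardless of the execute host.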

Here are the relevant config file entries from the execute machines.  The "RANDOM_CHOICE" macros are intended to spread out the submission and preemption of jobs, so as not to overload our network.

IS_WEEKDAY          = ( ClockDay >= 1 && ClockDay <= 5 )
IS_SCHOOL_HOURS     = ( ClockMin >= $RANDOM_CHOICE(300, 315, 330, 345, 360, 375, 390, 415) \
                     && ClockMin <= $RANDOM_CHOICE(930, 945, 960, 975, 990) )
IS_DURING_SCHOOL    = ( $(IS_WEEKDAY) && $(IS_SCHOOL_HOURS) )

WANT_SUSPEND    = False
WANT_VACATE     = True
START           = ( $(UWCS_START) && $(IS_DURING_SCHOOL) != True )
SUSPEND         = $(UWCS_SUSPEND)
CONTINUE        = $(UWCS_CONTINUE)
PREEMPT         = ( $(UWCS_PREEMPT) || $(IS_DURING_SCHOOL) )
KILL            = $(UWCS_KILL)
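
For readers less familiar with the policy macros, the following Python sketch mirrors the logic of the entries above. It is an illustration only, not Condor's evaluator: it assumes ClockDay counts from 0 (Sunday) to 6 (Saturday), that ClockMin is minutes since midnight, and that each $RANDOM_CHOICE is resolved once per machine when the configuration is read. The function names and the boolean stand-ins for $(UWCS_START) and $(UWCS_PREEMPT) are hypothetical.

```python
import random

# Candidate thresholds copied from the two $RANDOM_CHOICE macros above.
START_CHOICES = (300, 315, 330, 345, 360, 375, 390, 415)
END_CHOICES = (930, 945, 960, 975, 990)

def make_policy(rng=None):
    """Pick one lower and one upper ClockMin threshold, as one machine would
    (assumption: the choice is made once, at configuration time)."""
    rng = rng or random.Random()
    lower = rng.choice(START_CHOICES)
    upper = rng.choice(END_CHOICES)

    def is_during_school(clock_day, clock_min):
        is_weekday = 1 <= clock_day <= 5               # IS_WEEKDAY
        is_school_hours = lower <= clock_min <= upper  # IS_SCHOOL_HOURS
        return is_weekday and is_school_hours          # IS_DURING_SCHOOL

    return lower, upper, is_during_school

def start_ok(uwcs_start, during_school):
    # START = ( $(UWCS_START) && $(IS_DURING_SCHOOL) != True )
    return uwcs_start and not during_school

def preempt_now(uwcs_preempt, during_school):
    # PREEMPT = ( $(UWCS_PREEMPT) || $(IS_DURING_SCHOOL) )
    return uwcs_preempt or during_school

lower, upper, is_during_school = make_policy()
# Wednesday at 10:00 (ClockMin 600) lies inside every possible window.
print(lower, upper, is_during_school(3, 600))
```

Under these assumptions, the policy forces preemption only on weekdays between the chosen thresholds; outside that window, PREEMPT falls back to whatever $(UWCS_PREEMPT) evaluates to.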