
[Condor-users] Eviction due to keyboard busy?

I've been going through a lot of log files recently to try to
figure out why jobs have lately stopped running to completion
overnight.

The logs show many suspend/unsuspend events.

We are currently using the default UWCS suspend and continue
configs, i.e.

# Suspend jobs if:
# 1) the keyboard has been touched, OR
# 2a) The cpu has been busy for more than 2 minutes, AND
# 2b) the job has been running for more than 90 seconds
UWCS_SUSPEND = ( $(KeyboardBusy) || \
                 ( (CpuBusyTime > 2 * $(MINUTE)) \
                   && $(ActivationTimer) > 90 ) )

# Continue jobs if:
# 1) the cpu is idle, AND 
# 2) we've been suspended more than 10 seconds, AND
# 3) the keyboard hasn't been touched in a while
UWCS_CONTINUE = ( $(CPUIdle) && ($(ActivityTimer) > 10) \
                  && (KeyboardIdle > $(ContinueIdleTime)) )

where ContinueIdleTime = 5 * $(MINUTE).
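
To make the timing implied by these two expressions concrete, here is a
rough Python sketch of when UWCS_CONTINUE can first become true after a
suspend. It is a simplified model, not Condor code: the function name
and its arguments are made up for illustration, it assumes nobody
touches the keyboard again after the suspend, and it ignores the polling
interval and the strict ">" comparisons.

```python
MINUTE = 60
CONTINUE_IDLE_TIME = 5 * MINUTE  # ContinueIdleTime from the config above


def earliest_continue(keyboard_idle_at_suspend, cpu_idle_after, min_suspended=10):
    """Approximate seconds after a suspend at which UWCS_CONTINUE first holds.

    keyboard_idle_at_suspend: value of KeyboardIdle (seconds) at the moment
        the job was suspended; 0 means the keyboard was just touched.
    cpu_idle_after: seconds after the suspend at which the CPU becomes idle.
    min_suspended: the "suspended more than 10 seconds" term.

    Hypothetical helper: a simplified model of the policy, not Condor code.
    """
    # KeyboardIdle keeps counting up, so the keyboard term is satisfied once
    # the remaining part of ContinueIdleTime has elapsed.
    wait_keyboard = max(0, CONTINUE_IDLE_TIME - keyboard_idle_at_suspend)
    # All three terms are ANDed, so the slowest one decides.
    return max(wait_keyboard, cpu_idle_after, min_suspended)


# Suspend caused by a keyboard touch: KeyboardIdle was just reset to 0.
print(earliest_continue(keyboard_idle_at_suspend=0, cpu_idle_after=0))

# Suspend caused by a busy CPU on a machine whose keyboard has been idle
# for an hour: the job can continue as soon as the CPU calms down.
print(earliest_continue(keyboard_idle_at_suspend=3600, cpu_idle_after=30))
```

Under these assumptions a keyboard-triggered suspend lasts about
ContinueIdleTime, while a CPU-triggered suspend on an otherwise idle
desktop lasts only as long as the CPU stays busy.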

All the logs show the unsuspend happening 5 minutes after the
suspend. To me this means that the suspends were caused by
keyboard activity and NOT by the cpu being busy. Does this
logic sound correct? If so, I now need to figure out why there
is keyboard activity on many, many machines overnight that was
not happening previously. And no, we don't have people working
during those hours! :)
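
For anyone wanting to run the same check on their own logs, below is a
minimal Python sketch that pairs suspend and unsuspend events and
reports how long each suspension lasted. It assumes the job user log
format in which event 010 is "Job was suspended" and 011 is "Job was
unsuspended", with an "MM/DD HH:MM:SS" timestamp after the job id. The
sample log text, the job ids, and the function name are invented for
illustration; adjust the regular expression if your log lines differ.

```python
import re
from datetime import datetime

# Assumed event header: "010 (cluster.proc.subproc) MM/DD HH:MM:SS text"
EVENT_RE = re.compile(r"^(\d{3}) \((\d+\.\d+\.\d+)\) (\d\d/\d\d \d\d:\d\d:\d\d) ")


def suspend_durations(log_text):
    """Return a list of (job_id, seconds_suspended) pairs found in log_text."""
    suspended_at = {}
    durations = []
    for line in log_text.splitlines():
        match = EVENT_RE.match(line)
        if not match:
            continue  # detail lines and "..." separators
        code, job, stamp = match.groups()
        # The log carries no year, so a fixed one is supplied; suspensions
        # that span New Year would be computed incorrectly.
        when = datetime.strptime("2000/" + stamp, "%Y/%m/%d %H:%M:%S")
        if code == "010":  # job was suspended
            suspended_at[job] = when
        elif code == "011" and job in suspended_at:  # job was unsuspended
            started = suspended_at.pop(job)
            durations.append((job, (when - started).total_seconds()))
    return durations


# Made-up sample data, not taken from a real log.
SAMPLE = """\
010 (123.000.000) 03/14 01:10:00 Job was suspended.
\tNumber of processes actually suspended: 1
...
011 (123.000.000) 03/14 01:15:20 Job was unsuspended.
...
010 (124.000.000) 03/14 02:00:00 Job was suspended.
\tNumber of processes actually suspended: 1
...
011 (124.000.000) 03/14 02:00:40 Job was unsuspended.
...
"""

for job, seconds in suspend_durations(SAMPLE):
    print(job, seconds)
```

Suspensions that cluster just above 300 seconds would be consistent with
the keyboard explanation; much shorter ones would point at the cpu-busy
branch instead.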




P.S. We have changed all the update type parameters from
300 secs (5 mins) to 30 secs.