
Re: [Condor-users] Unexpected length of time in 'Owner / Idle' state



Burnett, Ben wrote:
Hey Mark:

Are these machines being used exclusively for Condor use?  If so, you could simply set the START expression to TRUE, and skip the Owner state entirely.

-B

On 2010-06-10, at 5:14 AM, Mark Whidby wrote:

Hi,

We have a couple of dual boot clusters (Windows XP/Scientific Linux) which are
booted into Linux each night at 19:00 for the purpose of running Condor jobs.
When these were initially set up they would spend about 10 minutes in
the 'Owner' state after booting into Linux before entering the 'Unclaimed' state.

However, they now consistently spend about 70 minutes (i.e. 1 hour longer)
in the 'Owner' state (with 'Idle' activity) before entering the 'Unclaimed' state.
*As far as I can tell*, this change in behaviour occurred when we moved to
British Summer Time (I'm writing from the UK) at the end of March.

Hi,
That isn't really a possibility because the clusters are also used for
non-Condor work at other times.

However, I've done a bit more digging around and it would seem that I'm
seeing exactly the same problem as described in this post:-

https://lists.cs.wisc.edu/archive/condor-users/2005-February/msg00198.shtml

Unfortunately, no solution was ever suggested there. This extract from the StartLog shows the same behaviour:-

06/13 11:59:41 ******************************************************
06/13 11:59:41 ** condor_startd (CONDOR_STARTD) STARTING UP
06/13 11:59:41 ** /opt/condor/condor-7.4.2-install/sbin/condor_startd
06/13 11:59:41 ** SubsystemInfo: name=STARTD type=STARTD(7) class=DAEMON(1)
06/13 11:59:41 ** Configuration: subsystem:STARTD local:<NONE> class:DAEMON
06/13 11:59:41 ** $CondorVersion: 7.4.2 Mar 29 2010 BuildID: 227044 $
06/13 11:59:41 ** $CondorPlatform: X86_64-LINUX_RHEL5 $
06/13 11:59:41 ** PID = 3064
06/13 11:59:41 ** Log last touched 6/13 11:57:57
06/13 11:59:41 ******************************************************
06/13 11:59:41 Using config source: /opt/condor/condor-7.4.2-install/etc/condor_config
06/13 11:59:41 Using local config sources:
06/13 11:59:41 /opt/condor/condor-7.4.2-local/condor_config.local
06/13 11:59:41 DaemonCore: Command Socket at <xxx.xxx.xxx.xxx:9644>
06/13 11:59:42 sscanf didn't parse correctly
06/13 11:59:53 VM-gahp server reported an internal error
06/13 11:59:53 VM universe will be tested to check if it is available
06/13 11:59:53 History file rotation is enabled.
06/13 11:59:53 Maximum history file size is: 20971520 bytes
06/13 11:59:53 Number of rotated history files is: 2
06/13 11:59:53 slot1: New machine resource allocated
06/13 11:59:53 slot2: New machine resource allocated
06/13 11:59:53 sscanf didn't parse correctly
06/13 11:59:53 slot1: Idle time: Keyboard: 0        Console: 0
06/13 11:59:53 slot2: Idle time: Keyboard: 0        Console: 0
06/13 11:59:53 About to run initial benchmarks.
06/13 11:59:57 Completed initial benchmarks.
06/13 12:04:57 slot1: Idle time: Keyboard: 0        Console: 0
06/13 12:04:57 slot2: Idle time: Keyboard: 0        Console: 0
06/13 12:09:57 slot1: Idle time: Keyboard: 0        Console: 0
06/13 12:09:57 slot2: Idle time: Keyboard: 0        Console: 0
06/13 12:14:57 slot1: Idle time: Keyboard: 0        Console: 0
06/13 12:14:57 slot2: Idle time: Keyboard: 0        Console: 0
06/13 12:19:57 slot1: Idle time: Keyboard: 0        Console: 0
06/13 12:19:57 slot2: Idle time: Keyboard: 0        Console: 0
06/13 12:24:57 slot1: Idle time: Keyboard: 0        Console: 0
06/13 12:24:57 slot2: Idle time: Keyboard: 0        Console: 0
06/13 12:29:57 slot1: Idle time: Keyboard: 0        Console: 0
06/13 12:29:57 slot2: Idle time: Keyboard: 0        Console: 0
06/13 12:34:57 slot1: Idle time: Keyboard: 0        Console: 0
06/13 12:34:57 slot2: Idle time: Keyboard: 0        Console: 0
06/13 12:39:57 slot1: Idle time: Keyboard: 0        Console: 0
06/13 12:39:57 slot2: Idle time: Keyboard: 0        Console: 0
06/13 12:44:57 slot1: Idle time: Keyboard: 0        Console: 0
06/13 12:44:57 slot2: Idle time: Keyboard: 0        Console: 0
06/13 12:49:57 slot1: Idle time: Keyboard: 0        Console: 0
06/13 12:49:57 slot2: Idle time: Keyboard: 0        Console: 0
06/13 12:54:57 slot1: Idle time: Keyboard: 0        Console: 0
06/13 12:54:57 slot2: Idle time: Keyboard: 0        Console: 0
06/13 12:59:57 slot1: Idle time: Keyboard: 85       Console: 85
06/13 12:59:57 slot2: Idle time: Keyboard: 85       Console: 85
06/13 13:04:57 slot1: Idle time: Keyboard: 385      Console: 385
06/13 13:04:57 slot2: Idle time: Keyboard: 385      Console: 385
06/13 13:09:57 slot1: Idle time: Keyboard: 685      Console: 685
06/13 13:09:57 slot2: Idle time: Keyboard: 685      Console: 685
06/13 13:14:57 slot1: Idle time: Keyboard: 985      Console: 985
06/13 13:14:57 slot2: Idle time: Keyboard: 985      Console: 985

So KeyboardIdle only starts to be incremented about one hour after a reboot. Is this a bug?
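
One rough way to check the extract above is to subtract each non-zero KeyboardIdle value from its log timestamp, which gives the moment the startd apparently believes the keyboard was last touched. The sketch below does that in Python. The function name `inferred_last_activity`, the regular expression, and the year (2010, assumed from the date of this thread) are illustrative only and are not part of Condor.

```python
import re
from datetime import datetime, timedelta

# Matches StartLog lines such as:
#   06/13 12:59:57 slot1: Idle time: Keyboard: 85       Console: 85
IDLE_RE = re.compile(
    r"^(\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (slot\d+): "
    r"Idle time: Keyboard: (\d+)\s+Console: (\d+)"
)

def inferred_last_activity(lines, year=2010):
    """Return (slot, log_time, log_time - keyboard_idle) for non-zero samples.

    The StartLog timestamps carry no year, so one is assumed (hypothetical
    default: 2010, the year of this thread).
    """
    results = []
    for line in lines:
        m = IDLE_RE.match(line.strip())
        if not m:
            continue  # not an "Idle time" line
        stamp = datetime.strptime(f"{year}/{m.group(1)}", "%Y/%m/%d %H:%M:%S")
        keyboard_idle = int(m.group(3))
        if keyboard_idle > 0:
            last_touch = stamp - timedelta(seconds=keyboard_idle)
            results.append((m.group(2), stamp, last_touch))
    return results

# A few lines copied from the StartLog extract above.
sample = [
    "06/13 12:54:57 slot1: Idle time: Keyboard: 0        Console: 0",
    "06/13 12:59:57 slot1: Idle time: Keyboard: 85       Console: 85",
    "06/13 13:04:57 slot1: Idle time: Keyboard: 385      Console: 385",
    "06/13 13:09:57 slot1: Idle time: Keyboard: 685      Console: 685",
    "06/13 13:14:57 slot1: Idle time: Keyboard: 985      Console: 985",
]

for slot, stamp, last_touch in inferred_last_activity(sample):
    print(slot, stamp.time(), "-> last activity at", last_touch.time())
```

For the lines shown, every non-zero sample points back to the same instant, 12:58:32, roughly an hour after the 11:59 startup. That is at least consistent with a fixed one-hour offset in whatever timestamp the idle time is computed from, rather than with real keyboard activity, though the log alone does not prove the cause.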