
[Condor-users] timer killfamily running every 60 seconds



We have a Condor pool running Windows 7 Enterprise x64 with Condor 7.6.1, and we are seeing condor_master and condor_procd each consume one full processor core for about 20 seconds out of every 60 while Condor is in the Owner and Unclaimed states.

 

After setting the debug level to D_ALL, I found that Condor logs “Calling Timer handler 8 (KillFamily::takesnapshot)” over and over again, and these log entries line up with the CPU activity. Does anyone know how to fix this problem?

 

08/10/11 13:18:04 (fd:3) (pid:150184) Calling Timer handler 8 (KillFamily::takesnapshot)
08/10/11 13:18:04 (fd:3) (pid:150184) PRIV_CONDOR --> PRIV_CONDOR at c:\condor\execute\dir_2156\userdir\src\condor_utils\killfamily.cpp:279
08/10/11 13:18:09 (fd:3) (pid:150184) KillFamily: parent: 43464 family: 43464
08/10/11 13:18:09 (fd:3) (pid:150184) KillFamily: alive_cpu_user = 0, exited_cpu = 0, max_image = 9120k
08/10/11 13:18:09 (fd:3) (pid:150184) PRIV_CONDOR --> PRIV_CONDOR at c:\condor\execute\dir_2156\userdir\src\condor_utils\killfamily.cpp:480
08/10/11 13:18:09 (fd:3) (pid:150184) Return from Timer handler 8 (KillFamily::takesnapshot)
08/10/11 13:18:09 (fd:3) (pid:150184) PRIV_CONDOR --> PRIV_CONDOR at c:\condor\execute\dir_2156\userdir\src\condor_daemon_core.v6\daemon_core.cpp:3812
08/10/11 13:18:09 (fd:3) (pid:150184) DaemonCore Timeout() Complete, returning 3
08/10/11 13:18:09 (fd:3) (pid:150184) selector 03ADF950 resetting
08/10/11 13:18:09 (fd:3) (pid:150184) selector 03ADF950 adding fd 592 ()
08/10/11 13:18:09 (fd:3) (pid:150184) selector 03ADF950 adding fd 596 ()
08/10/11 13:18:09 (fd:3) (pid:150184) selector 03ADF950 adding fd 584 ()
08/10/11 13:18:09 (fd:3) (pid:150184) PERF: entering select
08/10/11 13:18:09 (fd:3) (pid:150184) Entering thread safe start [select] in selector.cpp:313 150668()
08/10/11 13:18:09 (fd:3) (pid:150184) Leaving thread safe start [select] in selector.cpp:313 150668()
08/10/11 13:18:12 (fd:3) (pid:150184) Entering thread safe stop [select] in selector.cpp:319 150668()
08/10/11 13:18:12 (fd:3) (pid:150184) Leaving thread safe stop [select] in selector.cpp:319 150668()
08/10/11 13:18:12 (fd:3) (pid:150184) PERF: leaving select
08/10/11 13:18:12 (fd:3) (pid:150184) State = TIMED_OUT
08/10/11 13:18:12 (fd:3) (pid:150184) max_fd = 596
08/10/11 13:18:12 (fd:3) (pid:150184) Selection FD's
08/10/11 13:18:12 (fd:3) (pid:150184)            Read {584 592 596 } = 3
08/10/11 13:18:12 (fd:3) (pid:150184)            Write {} = 0
08/10/11 13:18:12 (fd:3) (pid:150184)            Except {} = 0
08/10/11 13:18:12 (fd:3) (pid:150184) Timeout = 3.000000 seconds
08/10/11 13:18:12 (fd:3) (pid:150184) In DaemonCore Timeout()
08/10/11 13:18:13 (fd:3) (pid:150184)
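
To check how long each timer handler call runs and how that lines up with the CPU spikes, something like the following might help. It is only a rough sketch, not a Condor tool: it assumes the log lines look exactly like the excerpt above, that the timestamps are month/day/year, and that each "Calling Timer handler" line is later followed by a matching "Return from Timer handler" line from the same pid. The names handler_durations and SAMPLE are made up for this example.

```python
import re
import sys
from datetime import datetime

# Matches lines such as:
# 08/10/11 13:18:04 (fd:3) (pid:150184) Calling Timer handler 8 (KillFamily::takesnapshot)
LINE_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \(fd:\d+\) \(pid:(\d+)\) "
    r"(Calling|Return from) Timer handler (\d+) \((.+)\)$"
)


def handler_durations(lines):
    """Yield (start_time, pid, handler_name, seconds) for each completed call."""
    pending = {}  # (pid, handler id) -> time of the "Calling" line
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a timer handler entry/exit line
        stamp, pid, kind, handler_id, name = match.groups()
        # Assumption: timestamps are month/day/year with a two-digit year.
        when = datetime.strptime(stamp, "%m/%d/%y %H:%M:%S")
        key = (pid, handler_id)
        if kind == "Calling":
            pending[key] = when
        elif key in pending:
            start = pending.pop(key)
            yield start, int(pid), name, (when - start).total_seconds()


# Two lines copied from the excerpt above, used as a small self-check.
SAMPLE = """\
08/10/11 13:18:04 (fd:3) (pid:150184) Calling Timer handler 8 (KillFamily::takesnapshot)
08/10/11 13:18:09 (fd:3) (pid:150184) Return from Timer handler 8 (KillFamily::takesnapshot)
"""

if __name__ == "__main__":
    # With a log file path as the first argument, read that file; otherwise use SAMPLE.
    if len(sys.argv) > 1:
        with open(sys.argv[1], errors="replace") as handle:
            source = handle.readlines()
    else:
        source = SAMPLE.splitlines()
    for start, pid, name, seconds in handler_durations(source):
        print(f"{start} pid={pid} {name} took {seconds:.0f}s")
```

Run with the path to the daemon log as the only argument, it prints one line per completed handler call. The timestamps only have one-second resolution, so the durations are approximate; in the excerpt above the KillFamily::takesnapshot call spans 13:18:04 to 13:18:09, about 5 seconds.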

 

Thanks,

 

Sam Beckler

Imaging and PC Management

Email: Beckle2@xxxxxxxxxxx

Phone: (864)656-5885

Cell: (864)650-1251