
[Condor-users] Startd's crashing with fatal error getting process info for starter and descendants



This problem started affecting a handful of my startds this
morning. The machines had been up and running for about 2 weeks
without any problems. Now they crash every 20 minutes or so, with the
startd log showing:

7/5 14:13:28 ProcAPI: getProcInfo() failed to get performance info.
7/5 14:13:30 ProcAPI: getProcInfo() failed to get performance info.
7/5 14:13:43 ProcAPI: getProcInfo() failed to get performance info.
7/5 14:13:45 ProcAPI: getProcInfo() failed to get performance info.
7/5 14:14:04 ProcAPI::getProcSetInfo failed to get performance info.
7/5 14:14:06 ProcAPI::getProcSetInfo failed to get performance info.
7/5 14:14:30 ProcAPI: getProcInfo() failed to get performance info.
7/5 14:14:32 ProcAPI: getProcInfo() failed to get performance info.
7/5 14:14:45 ProcAPI::getProcSetInfo failed to get performance info.
7/5 14:14:45 ERROR "Starter::percentCpuUsage(): Fatal error getting
process info for the starter and decendents" at line 859 in file
..\src\condor_startd.V6\Starter.C

This is 6.7.12 on Windows XP Pro SP2. I know: we need to get to 6.7.20,
and we will any day now. But in the meantime, any idea what might be
causing this? There have been no changes to our configurations; the only
change is that new machines were added to the pool.

- Ian

--
Ian R. Chesal <ichesal@xxxxxxxxxx>
Senior Software Engineer

Altera Corporation
Toronto Technology Center
Tel: (416) 926-8300