Re: [Condor-users] busy&load calculation problems
- Date: Tue, 7 Mar 2006 08:24:38 +0000
- From: "Matt Hope" <matthew.hope@xxxxxxxxx>
- Subject: Re: [Condor-users] busy&load calculation problems
On 3/6/06, Andrey Kaliazin <A.Kaliazin@xxxxxxxxxxx> wrote:
> Dear all,
> We have various Windows boxes running Condor 6.7.10-16, and most of
> them, but not all, have this repeating error in their StartLog file -
> 3/6 17:32:30 loadavg thread died, restarting. (exit code=2)
> 3/6 17:32:35 no loadavg samples this minute, maybe thread died???
> I suppose that some misconfiguration of the WMI subsystem causes these
> errors, which in turn leads Condor to the wrong conclusions:
> workstations appear to be idle in terms of CPU load, which is not good -
> $ condor_status -l |grep 'oad\|usy'
> CpuBusy = ((LoadAvg - CondorLoadAvg) >= 0.500000)
> CondorLoadAvg = 0.000000
> LoadAvg = 0.000000
> TotalLoadAvg = 0.000000
> TotalCondorLoadAvg = 0.000000
> CpuBusyTime = 0
> CpuIsBusy = FALSE
> Activity = "Busy"
> Who knows how to fix it? Any help is appreciated.
Are these 64-bit Windows machines?
The WMI query stuff has been broken in the 6.6.x series for a while
(at least on AMD-based 64-bit Windows machines). I have no idea if it's
fixed in the 6.7.x series.
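If it helps to work out which machines are affected, a rough sketch along these lines could tally the two loadavg messages per day from a StartLog. The function name and the regular expression are my own assumptions, based only on the two log lines quoted above; other Condor versions may format the lines differently.

```python
import re
from collections import Counter

# Pattern inferred from the two quoted StartLog lines (assumption):
# "M/D HH:MM:SS <message>". Only the two loadavg messages are matched.
LOADAVG_ERROR = re.compile(
    r"^(\d{1,2}/\d{1,2}) \d{2}:\d{2}:\d{2} "
    r"(loadavg thread died|no loadavg samples this minute)"
)

def count_loadavg_errors(lines):
    """Count loadavg error messages, keyed by (date, message)."""
    counts = Counter()
    for line in lines:
        match = LOADAVG_ERROR.match(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts

# Demo on the two lines from the original report plus one unrelated line.
sample = [
    "3/6 17:32:30 loadavg thread died, restarting. (exit code=2)",
    "3/6 17:32:35 no loadavg samples this minute, maybe thread died???",
    "3/6 17:33:00 some unrelated message",
]
for (date, message), n in sorted(count_loadavg_errors(sample).items()):
    print(date, message, n)
```

To run it against a real log, pass an open file object instead of the sample list. Where the StartLog lives depends on your LOG setting, so I have left the path out.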
Note that machines on older 6.6.x releases (sorry, I forget which
specific release) suffered from a serious memory and handle leak in these
situations (sufficient to kill the Condor subsystem after a few
weeks). Again I would guess that the latest 6.7 includes the fix for
Sorry I can't be more help than that