Re: [Condor-users] busy&load calculation problems



Thanks Matt, 

This WMI query stuff is working fine on my AMD64 PC running Windows x64
and Condor v6.7.16, but it does not seem to work on a P4 with HT running
Windows XP Pro (32-bit) with the same Condor version. It also seems to
be fine on older boxes with a good old P4 2GHz and Condor versions from
6.7.2 and up.

I wonder if anyone else is experiencing similar problems
in Windows pools?
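
For anyone who wants to check their own machines, here is a minimal Python
sketch that counts the two loadavg messages in a StartLog. The message
patterns are taken from the log excerpt quoted below; the function name
count_loadavg_errors is just something I made up for illustration, and
other Condor versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns copied from the StartLog excerpt in this thread (assumption:
# the wording is the same across the Condor versions being checked).
LOADAVG_ERRORS = {
    "thread_died": re.compile(r"loadavg thread died, restarting"),
    "no_samples": re.compile(r"no loadavg samples this minute"),
}


def count_loadavg_errors(lines):
    """Count how often each loadavg error message appears in the given lines."""
    counts = Counter()
    for line in lines:
        for name, pattern in LOADAVG_ERRORS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

To use it, open the StartLog from the machine's Condor log directory and
pass the file object in, e.g. count_loadavg_errors(open(path)). A box
without the problem should report zero for both counters.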

Andrey

> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx 
> [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Matt Hope
> Sent: Tuesday, March 07, 2006 8:25 AM
> To: Condor-Users Mail List
> Subject: Re: [Condor-users] busy&load calculation problems
> 
> On 3/6/06, Andrey Kaliazin <A.Kaliazin@xxxxxxxxxxx> wrote:
> > Dear all,
> >
> > We have various Windows boxes running Condor 6.7.10-16 and most of
> > them, but not all, have this repeating error in their StartLog file -
> >
> > ...
> > 3/6 17:32:30 loadavg thread died, restarting. (exit code=2)
> > 3/6 17:32:35 no loadavg samples this minute, maybe thread died???
> > ...
> >
> >
> > I suppose that some misconfiguration of the WMI subsystem leads to
> > those errors, which in turn leads Condor to the wrong conclusion:
> > workstations appear to be idle in terms of CPU load, which is not good -
> >
> > $ condor_status -l |grep 'oad\|usy'
> > ...
> > CpuBusy = ((LoadAvg - CondorLoadAvg) >= 0.500000)
> > CondorLoadAvg = 0.000000
> > LoadAvg = 0.000000
> > TotalLoadAvg = 0.000000
> > TotalCondorLoadAvg = 0.000000
> > CpuBusyTime = 0
> > CpuIsBusy = FALSE
> > Activity = "Busy"
> > ...
> >
> > Who knows how to fix it? Any help is appreciated.
> 
> Are these 64bit windows machines?
> 
> The WMI query stuff has been broken in the 6.6.x series for a while
> (at least on AMD-based 64-bit Windows machines). I have no idea if it's
> fixed in the 6.7.x series.
> 
> Note that older 6.6.x releases (sorry, I forget the specific release)
> suffered from a serious memory and handle leak in these
> situations (sufficient to kill the Condor subsystem after a few
> weeks). Again, I would guess that the latest 6.7 includes the fix for
> this.
> 
> Sorry I can't be more help than that
> 
> Matt
> 