
Re: [Condor-users] wrong/negative values from collector



Hmm, actually I had that setting. I will restart the startd to see whether jobs consume more memory than is available.
 



On Sat, Feb 4, 2012 at 10:40 AM, Rita <rmorgan466@xxxxxxxxx> wrote:
It does detect 32G of memory. I am wondering if the dynamic slots setting is messed up. I removed SLOT_TYPE_1_PARTIONABLE=TRUE; I suspect that could do it.
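
The numbers in the condor_status output quoted further down are consistent with the partitionable slot being overcommitted: if slot1 reports the detected memory minus what its dynamic slots hold, two 28000 MB claims against roughly 32G leave a negative remainder. A minimal sketch of that arithmetic, assuming the accounting is a plain subtraction and assuming a detected value of 32060 MB (a hypothetical figure picked to match "32G"; the real number is whatever the StartLog reports). `remaining_mb` is an illustrative helper name, not a Condor tool:

```shell
#!/bin/sh
# Hypothetical helper: memory left on a partitionable slot after
# subtracting each dynamic slot's memory (all values in MB).
remaining_mb() {
    total=$1
    shift
    for used in "$@"; do
        total=$((total - used))
    done
    echo "$total"
}

# 32060 is an assumed detected value; 28000 is the Mem shown for
# slot1_1 and slot1_2 in the condor_status output.
remaining_mb 32060 28000 28000   # prints -23940
```

If the detected value in the StartLog differs, the same subtraction with the real number shows whether the dynamic slots alone explain the negative Mem.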


On Fri, Feb 3, 2012 at 5:03 PM, Lukas Slebodnik <slebodnik@xxxxxxxx> wrote:
The startd daemon prints information about the amount of detected memory at startup.

You should enable verbose logging, restart the startd daemon, and look into
the StartLog.

STARTD_DEBUG = D_FULLDEBUG
condor_restart -startd      # this command stops all running jobs on this machine
cd `condor_config_val LOG`
# wait a while until the daemon starts
grep "Memory: Detected" StartLog

I would expect a positive value in the output.
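
To list every affected slot instead of checking hosts one at a time, the Name and Memory attributes can be filtered for negative values. A rough sketch, assuming condor_status accepts one -format option per attribute (as in the CondorVersion query quoted below) and that Memory is the slot's memory in MB; `negative_memory_slots` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read "name memory" pairs on stdin and print
# the names of slots whose memory value is below zero.
negative_memory_slots() {
    awk '($2 + 0) < 0 { print $1 }'
}

# Usage against a live pool (not run here):
#   condor_status -format "%s " Name -format "%d\n" Memory | negative_memory_slots
```

The filter reads from stdin, so it can be tried on saved output before pointing it at the collector.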

Regards,
Lukas

On Fri, Feb 03, 2012 at 07:47:17AM -0500, Rita wrote:
> I am beginning to wonder if it's related to this bug.
> https://bugzilla.redhat.com/show_bug.cgi?id=565501
>
> One thing I noticed is that it's affecting only several nodes and not all of
> them. Perhaps it's a configuration issue?
>
>
>
>
>
> On Thu, Feb 2, 2012 at 11:13 PM, Rita <rmorgan466@xxxxxxxxx> wrote:
>
> > It seems condor_status is giving me negative values which is also
> > causing thrashing on several of our servers.
> >
> > I am hoping someone can shed some light....
> >
> > $ condor_status hostA
> > Name               OpSys      Arch   State     Activity LoadAv Mem     ActvtyTime
> >
> > slot1@hostA        LINUX      X86_64 Owner     Idle      1.000 -23940  0+06:00:46
> > slot1_1@hostA      LINUX      X86_64 Claimed   Retiring 29.770  28000  0+04:19:44
> > slot1_2@hostA      LINUX      X86_64 Claimed   Retiring  5.030  28000  0+04:19:44
> >                     Total Owner Claimed Unclaimed Matched Preempting Backfill
> >
> >        X86_64/LINUX     3     1       2         0       0          0        0
> >
> >               Total     3     1       2         0       0          0        0
> >
> >
> > $ condor_status -collector -format "%s\n" CondorVersion
> > $CondorVersion: 7.4.3 Aug  4 2010 BuildID: 261829 $
> >
> > $ condor_status hostA -format "%s\n" CondorVersion
> > $CondorVersion: 7.6.5 Dec 27 2011 BuildID: 397396
> >
> >
> > Any thoughts?
> >
> >
> >
> > --
> > --- Get your facts first, then you can distort them as you please.--
> >
>

> _______________________________________________
> Condor-users mailing list
> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/condor-users/
