
Re: [Condor-users] Condor jobs leaving behind Windows Desktops and exhausting Window's heap?



Hey Matt,

> I would guess the machines non paged pool is either exhausted 
> or fragmented to crap.

Yup. Your guess is good.

> At that stage pretty much any attempt to start doing more 
> stuff is screwed (we get this on submit machines 
> occasionally) I just get the box bounced (windows seems 
> entirely incapable of recovering from this state)

It's funny how often my calls end with: just reboot it. Works magic for
Windows executors. We currently monitor the boxes and reboot them when
we detect any of a number of problematic states. This one has been added.

> The desktops hanging around might be the cause but they might 
> also just be a symptom (once the box gets screwed they start 
> accumulating)

I actually think it's a symptom as well, since we've been running fine
with 6.8.6 for quite some time now on WinXPSP2 64-bit.

> It might be worth monitoring the system to see how many open 
> desktop sessions there are and when jobs start dying with the 
> 128 exit codes.
> If the numbers rise before the errors that would suggest the 
> leaking is the problem, if not it's something else.

Will do. I'll let everyone know.
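
For the record, here is a rough sketch of the check Matt describes: did
the open desktop count rise before the first 128 exit code, or only after?
Everything in it is an assumption on my part. The Sample record, the
leak_precedes_errors name and the min_rise threshold are made up for
illustration, and it does not cover how the counts get collected on the
box in the first place.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    """One monitoring observation (hypothetical record layout)."""
    timestamp: float    # seconds since some fixed start point
    desktop_count: int  # open desktops seen at that moment


def leak_precedes_errors(samples: List[Sample],
                         first_error_time: float,
                         min_rise: int = 1) -> Optional[bool]:
    """Compare the desktop count before the first exit-code-128 failure.

    Returns True if the count rose by at least min_rise between the
    earliest and latest samples taken before the first error, False if
    it did not, and None if there are fewer than two samples to compare.
    """
    # Only look at observations made before the first failing job.
    ordered = sorted(samples, key=lambda s: s.timestamp)
    before = [s.desktop_count for s in ordered
              if s.timestamp < first_error_time]
    if len(before) < 2:
        return None
    return before[-1] - before[0] >= min_rise


if __name__ == "__main__":
    # Made-up numbers, purely to show the call.
    history = [Sample(0, 3), Sample(60, 5), Sample(120, 9), Sample(180, 12)]
    print(leak_precedes_errors(history, first_error_time=150))
```

A True result would point at the leak as the cause. False would suggest
the desktops only pile up once the box is already in trouble.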

Thanks!

- Ian

