
[Condor-users] Suppress Windows error dialogs popping up for crashing Condor jobs



Hi,

I am working on fault tolerance for our system.  When our jobs run, they sometimes crash.  I told the developers to fix the code, but they told me to rerun the job because they can't reproduce the problem... I will work on their attitude later.

My problem was Windows popping up various error-reporting and crash dialogs.  When a dialog pops up, the process won't exit until the user clicks OK, and eventually Condor restarts the job.  The first process is still holding resources and the second process keeps failing.  After mucking with 4 different places in the registry on XP, Vista and 7 (as well as every place in the UI where I could control error reporting, and disabling the error reporting service), I was still seeing popups.  I then started using the Windows SetErrorMode function, which in practice only worked for me on Windows 7 and Vista.  On XP I was still seeing an "Application Error" popup saying the memory could not be "read" on a simple null pointer dereference.
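
For reference, a minimal sketch of the SetErrorMode call, written in Python via ctypes rather than in the job's own code.  The function name suppress_crash_dialogs and the choice of which three flags to combine are assumptions for illustration; the flag values are the ones defined in the Windows SDK headers.  On non-Windows platforms the function does nothing.

```python
import ctypes
import sys

# Flag values as defined in the Windows SDK headers.
SEM_FAILCRITICALERRORS = 0x0001   # no critical-error message boxes
SEM_NOGPFAULTERRORBOX = 0x0002    # no fault / Windows Error Reporting dialog
SEM_NOOPENFILEERRORBOX = 0x8000   # no "file not found" message boxes

CRASH_DIALOG_FLAGS = (
    SEM_FAILCRITICALERRORS | SEM_NOGPFAULTERRORBOX | SEM_NOOPENFILEERRORBOX
)


def suppress_crash_dialogs():
    """Set the process error mode so crashes do not block on a dialog.

    Hypothetical helper. Returns the previous error mode on Windows,
    or None on any other platform.
    """
    if sys.platform != "win32":
        return None

    kernel32 = ctypes.windll.kernel32
    kernel32.SetErrorMode.argtypes = [ctypes.c_uint]
    kernel32.SetErrorMode.restype = ctypes.c_uint

    # SetErrorMode returns the previous mode; call it a second time so
    # that any flags that were already set are kept alongside ours.
    previous = kernel32.SetErrorMode(CRASH_DIALOG_FLAGS)
    kernel32.SetErrorMode(previous | CRASH_DIALOG_FLAGS)
    return previous


if __name__ == "__main__":
    print("previous error mode:", suppress_crash_dialogs())
```

As far as I know, a child process inherits the error mode of its parent unless it is created with CREATE_DEFAULT_ERROR_MODE, so a small wrapper that makes this call and then launches the real job may be enough when the job's source cannot be changed.  I have not verified that against Condor's own process creation.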

Finally I came across the article
http://support.microsoft.com/kb/128642
which tells you to set in the registry: 
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Windows\ErrorMode = 2

This seems to suppress the failure dialog on the XP systems.  
Note: I am still not sure whether you also need to disable the Dr. Watson debugger, but I had already done that on the way to finding this solution.
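
The registry change above can also be scripted.  Below is a rough sketch using Python's standard winreg module; the helper name set_system_error_mode is made up for illustration, and the meaning of the three values in the comments is my reading of the KB article, not something I have tested beyond the value 2.  It needs administrator rights, and a reboot may be needed before the setting takes effect.

```python
import sys

# Location and value name taken from the KB article.
ERROR_MODE_KEY = r"SYSTEM\CurrentControlSet\Control\Windows"
ERROR_MODE_VALUE_NAME = "ErrorMode"

# As I understand the article:
#   0 = errors are serialized and wait for a response (default)
#   1 = system errors are logged without a popup, application errors still pop up
#   2 = errors are logged to the event log and no popups are shown
VALID_MODES = (0, 1, 2)


def set_system_error_mode(mode=2):
    """Write ErrorMode under HKEY_LOCAL_MACHINE (hypothetical helper).

    Returns True if the value was written, False on non-Windows platforms.
    Raises ValueError for a mode other than 0, 1 or 2.
    """
    if mode not in VALID_MODES:
        raise ValueError("ErrorMode must be 0, 1 or 2, got %r" % (mode,))
    if sys.platform != "win32":
        return False

    import winreg  # only available on Windows

    with winreg.OpenKey(
        winreg.HKEY_LOCAL_MACHINE, ERROR_MODE_KEY, 0, winreg.KEY_SET_VALUE
    ) as key:
        winreg.SetValueEx(key, ERROR_MODE_VALUE_NAME, 0, winreg.REG_DWORD, mode)
    return True
```

Calling set_system_error_mode(2) from an elevated prompt on each execute node should give the same result as editing the value by hand.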

--Derrick