
Re: [Condor-users] assertion, hang



Matt,

In addition to the UNC cmd.exe registry change, we also make a registry
change that suppresses the modal dialog that Windows pops up when an
application crashes.  I believe your application may be triggering this
dialog, and until the dialog is dismissed, Condor will think the
application is still running.

Change the following registry value (ErrorMode, under the Windows key):

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Windows\ErrorMode

MS KB229012 explains the value settings:

http://support.microsoft.com/?scid=http%3a%2f%2fwww.support.microsoft.com%2fkb%2f229012%2fen-us%2f
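
As a rough illustration of the change, the Python sketch below builds the
text of a .reg file that sets ErrorMode. The choice of 2 is an assumption,
not something stated above: as I understand Microsoft's description, 0 is
the default (errors pop up and wait for a response), 1 suppresses popups
for system errors only, and 2 suppresses the popups and logs to the event
log. Check the KB article before relying on these values. The function
name error_mode_reg is made up for this example.

```python
# Sketch only: generates .reg text; it does not touch the registry itself.
# The key path comes from the message above; the value meanings (0, 1, 2)
# are an assumption to be checked against the Microsoft KB article.
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Windows"


def error_mode_reg(mode=2):
    """Return .reg file text that sets the ErrorMode DWORD to `mode`."""
    if mode not in (0, 1, 2):
        raise ValueError("ErrorMode must be 0, 1, or 2")
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        f"[{KEY}]\r\n"
        f'"ErrorMode"=dword:{mode:08x}\r\n'
    )


if __name__ == "__main__":
    # Print the text; redirect it to a file such as errormode.reg.
    print(error_mode_reg(2), end="")
```

The resulting file could be imported on each execute machine with
"reg import errormode.reg" from an administrator prompt. A reboot may be
needed before the new setting takes effect.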

Hope this helps.

-Bryan

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Matthew Galati
Sent: Friday, March 17, 2006 10:14 PM
To: Condor-Users Mail List
Subject: [Condor-users] assertion, hang

My Condor pool consists of a set of machines running Windows 2003
Server. All of my input, executables and output are on a shared Windows
drive. Here is part of my submit file:

====
environment = PATH=\\ordsrv3\ormpdata\bin\WinXP-Debug;c:\WINDOWS\system32;c:\WINNT\system32
executable  = condor_exec.bat
initialdir  = \\ordsrv3\ormpdata\milprun\test_win
transfer_executable = false
should_transfer_files = NO
requirements = (OpSys=="WINNT52")

   output   = 10teams.out
   error    = 10teams.err
   log      = 10teams.log
   universe = vanilla
   arguments = --parm \\ordsrv3\ormpdata\parm\milpwin.parm --instance 10teams
 queue 1

  
   output   = 22433.out
   error    = 22433.err
   log      = 22433.log
   universe = vanilla
   arguments = --parm \\ordsrv3\ormpdata\parm\milpwin.parm --instance 22433
 queue 1
====

I am using condor_exec.bat as a wrapper for my executable. If I try to
run the executable directly, I get a Shadow Exception at "CreateProcess".
The .bat file was suggested on this mailing list, and it seems to work.

condor_exec.bat:
\\ordsrv3\ormpdata\bin\WinXP-Debug\exemilpNET.exe %*


If my executable dies due to an assertion failure (this is a C app
using assert()), the failure is correctly reported to stderr. However,
the job then hangs: it stays in the Condor queue indefinitely, as if
Condor does not know that it is done, even after the assertion. Is there
some way to handle this situation? I want Condor to treat the assertion
as a completion so that it moves on to the next job in the queue.

Thanks,
Matt

