Re: [Condor-users] Dagman error with 6.9.1 on windows


I investigated your report and found a bug in 6.9.1. I'm very sorry about that!

I have yet to identify the full effects of this bug, but it certainly strikes in the case you found, where OnExitRemove evaluates to UNDEFINED, and also when OnExitHold evaluates to UNDEFINED.

The bug is fixed for 6.9.2.

Now the question is why your OnExitRemove expression is evaluating to UNDEFINED. I assume your dag condor.sub file contains the usual expression:

on_exit_remove  = ( ExitSignal == 11 || (ExitCode >=0 && ExitCode <= 2))

Unfortunately, I can't answer that myself by looking at your report, because the log message is not reliable when it claims the OnExitRemove expression was never set. I've fixed that too for the next release.

What I observe about this expression is that it evaluates to UNDEFINED when ExitSignal is undefined and (ExitCode < 0 || ExitCode > 2). I really doubt that is intended. I'll find out and get this expression fixed if it is indeed broken.
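
To make that reasoning concrete, here is a rough Python sketch of how I understand the three-valued evaluation to go. It is only an illustration, not Condor code: None stands in for UNDEFINED, and the helper names (eq, ge, le, or3, and3, on_exit_remove) are made up for this example. It assumes that a comparison against UNDEFINED yields UNDEFINED, that || is TRUE whenever either side is TRUE, and that && is FALSE whenever either side is FALSE.

```python
UNDEFINED = None  # stand-in for the ClassAd UNDEFINED value (assumption for this sketch)

def eq(a, b):
    # Comparing against UNDEFINED yields UNDEFINED.
    return UNDEFINED if a is UNDEFINED or b is UNDEFINED else a == b

def ge(a, b):
    return UNDEFINED if a is UNDEFINED or b is UNDEFINED else a >= b

def le(a, b):
    return UNDEFINED if a is UNDEFINED or b is UNDEFINED else a <= b

def or3(a, b):
    # TRUE wins over UNDEFINED; UNDEFINED wins over FALSE.
    if a is True or b is True:
        return True
    if a is UNDEFINED or b is UNDEFINED:
        return UNDEFINED
    return False

def and3(a, b):
    # FALSE wins over UNDEFINED; UNDEFINED wins over TRUE.
    if a is False or b is False:
        return False
    if a is UNDEFINED or b is UNDEFINED:
        return UNDEFINED
    return True

def on_exit_remove(exit_signal, exit_code):
    # ( ExitSignal == 11 || (ExitCode >= 0 && ExitCode <= 2) )
    return or3(eq(exit_signal, 11),
               and3(ge(exit_code, 0), le(exit_code, 2)))

# Normal exit (no signal, so ExitSignal is undefined):
print(on_exit_remove(UNDEFINED, 0))            # exit code in 0..2
print(on_exit_remove(UNDEFINED, 3))            # exit code outside 0..2
print(on_exit_remove(UNDEFINED, -1073741502))  # negative exit status
```

With an exit code of 0 the right-hand side is TRUE and the whole expression is TRUE. With an exit code outside 0..2 the right-hand side is FALSE, the left-hand side is UNDEFINED, and the result is UNDEFINED rather than FALSE. If the negative exit status in the log quoted below was used as ExitCode, it would fall into that second case.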


Horvátth Szabolcs wrote:
I forgot to say that, though this error was reported earlier, it now crashes the scheduler instantly.


Horvátth Szabolcs wrote:

First experience with 6.9.1 on winXP, after the first submitted dagman job:

1/11 14:07:53 scheduler universe job (44192.0) pid 2112 exited with status -1073741502
1/11 14:07:53 (44192.0) Problem parsing user policy for job: The UNKNOWN (never set) OnExitRemove expression '' evaluated to UNDEFINED. Putting job on hold.
1/11 14:07:53 Job 44192.0 put on hold: The UNKNOWN (never set) OnExitRemove expression '' evaluated to UNDEFINED
1/11 14:07:57 ERROR "Unexpected pending status for fake message delivery.
" at line 4238 in file ..\src\condor_daemon_core.V6\daemon_core.C


