
[Condor-users] Windows Condor Pool: job exit code -1073741502




Hi all,

I am having a problem with just one Windows machine in the pool: it refuses to run jobs, and every job sent to it exits with a return value (exit code) of -1073741502.

To investigate, I am submitting the following batch file:

@echo off
setlocal
set THINKING_TIME=2
set COUNT=10
if not A%1 == A ( set THINKING_TIME=%1 )
if not A%2 == A ( set COUNT=%2 )
echo Thinking really hard for %THINKING_TIME% seconds...
rem We use ping here as a hack because "sleep" is non-standard.
ping -n %THINKING_TIME% 127.0.0.1 >NUL 2>&1
rem ping -n %THINKING_TIME% 127.0.0.1
echo Our result:
if %COUNT% GEQ 1 (
    for /L %%x in (1,1,%COUNT%) do (
         echo %%x
    )
)
endlocal

The submit description file, which forces the job to run on the machine that gives the error, is:

Universe   = vanilla
Executable = simple.bat
Requirements = (Machine == "pc269265.corp.ad.emb") && (OpSys == "WINNT51" || OpSys == "WINNT52")
Arguments  = 4 12
Log        = simple-pc269265.log
Output     = simple-pc269265.out
Error      = simple-pc269265.err
Queue

The job log file for that run is:

000 (012.000.000) 03/24 15:08:36 Job submitted from host: <10.3.28.8:2266>
...
001 (012.000.000) 03/24 15:08:45 Job executing on host: <10.20.12.146:2329>
...
005 (012.000.000) 03/24 15:08:47 Job terminated.
        (1) Normal termination (return value -1073741502)
                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
        0  -  Run Bytes Sent By Job
        473  -  Run Bytes Received By Job
        0  -  Total Bytes Sent By Job
        473  -  Total Bytes Received By Job
...

When I force the job to run on another machine in the pool, the log file is:

000 (014.000.000) 03/24 15:20:54 Job submitted from host: <10.3.28.8:2266>
...
001 (014.000.000) 03/24 15:20:56 Job executing on host: <10.3.28.243:4687>
...
005 (014.000.000) 03/24 15:21:00 Job terminated.
        (1) Normal termination (return value 0)
                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
        92  -  Run Bytes Sent By Job
        466  -  Run Bytes Received By Job
        92  -  Total Bytes Sent By Job
        466  -  Total Bytes Received By Job
...

And the job output file from that machine is correct:

Thinking really hard for 4  seconds...
Our result:
1
2
3
4
5
6
7
8
9
10
11
12

I don't know what this exit code means, so could someone help me?
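
One way to start decoding the value is sketched below in Python. It assumes (the job log does not say so) that the number is a Windows NTSTATUS code printed as a signed 32-bit integer, which is common when a process fails before its own code runs. The helper names and the small lookup table are hypothetical and only for illustration; the table holds a handful of well-known codes, not a complete list.

```python
# Sketch: reinterpret a signed 32-bit exit code as an unsigned NTSTATUS value.
# Assumption: the exit code reported in the job log is an NTSTATUS code.

# A few well-known NTSTATUS values (illustrative, far from complete).
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000135: "STATUS_DLL_NOT_FOUND",
    0xC000013A: "STATUS_CONTROL_C_EXIT",
    0xC0000142: "STATUS_DLL_INIT_FAILED",
}


def to_unsigned32(code: int) -> int:
    """Map a signed 32-bit integer onto its unsigned 32-bit bit pattern."""
    return code & 0xFFFFFFFF


def describe_exit_code(code: int) -> str:
    """Return the hex form of the code plus a name if the table knows it."""
    value = to_unsigned32(code)
    name = KNOWN_NTSTATUS.get(value, "unknown (not in this small table)")
    return f"0x{value:08X} {name}"


if __name__ == "__main__":
    # The return value from the failing machine's job log.
    print(describe_exit_code(-1073741502))
```

Under that assumption the value corresponds to 0xC0000142, whose symbolic name is STATUS_DLL_INIT_FAILED. That would suggest the process failed while loading a DLL at startup on that machine rather than inside the batch file itself, but this is only a lead to check, not a confirmed diagnosis.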

Thanks, Klaus
