Re: [Condor-users] scheduler universe job exited with status -1073741502



>Does the job run far enough to generate a dagman.out file?  If so, can
>you please send that?  Also, please send the SchedLog from that machine.

The job does generate the dag.lib.out file, but it's always empty, no matter which -debug level I use.
This is the relevant part of the sched.log:

1/26 16:40:07 Sent ad to central manager for szabolcs@xxxxxxxxxxxxxxxxxxx
1/26 16:40:07 Sent ad to 1 collectors for szabolcs@xxxxxxxxxxxxxxxxxxx
1/26 16:40:07 Successfully created sched universe process
1/26 16:40:07 Starting add_shadow_birthdate(134530.0)
1/26 16:40:07 Successfully created sched universe process
1/26 16:40:07 Starting add_shadow_birthdate(134523.0)
1/26 16:40:07 scheduler universe job (134530.0) pid 2532 exited with status -1073741502
1/26 16:40:07 Starting add_shadow_birthdate(134612.0)
1/26 16:40:08 Started shadow for job 134612.0 on "<192.168.0.104:1039>", (shadow pid = 8884)
1/26 16:40:08 DaemonCore: Command received via UDP from host <192.168.0.71:4976>
1/26 16:40:08 DaemonCore: received command 60011 (DC_NOP), calling handler (handle_nop())
1/26 16:40:08 scheduler universe job (134523.0) pid 14784 exited with status -1073741502
1/26 16:40:08 Starting add_shadow_birthdate(134628.0)
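
A note on reading the exit status above: the negative number looks like a 32-bit Windows status code printed as a signed integer. Assuming the scheduler machine runs Windows, the small sketch below reinterprets the value as unsigned hex so it can be looked up. The helper name to_ntstatus_hex is made up for illustration and is not part of Condor.

```python
def to_ntstatus_hex(exit_status: int) -> str:
    """Reinterpret a signed 32-bit exit status as an unsigned hex code.

    Hypothetical helper for illustration only; not a Condor function.
    """
    # Masking with 0xFFFFFFFF maps a negative value to its 32-bit
    # two's-complement bit pattern.
    return f"0x{exit_status & 0xFFFFFFFF:08X}"


# Status taken from the sched.log lines above.
print(to_ntstatus_hex(-1073741502))  # prints 0xC0000142
```

Microsoft's NTSTATUS list gives 0xC0000142 as STATUS_DLL_INIT_FAILED, which would mean the process failed during DLL initialization, before any of the job's own code could run. That would fit the empty output file, but it is only an interpretation of the code, not a confirmed diagnosis.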



>Just to clarify -- are you saying that you only see this problem on one
>specific machine in your pool?

I used to see it on one machine only, but in the last few weeks more schedulers
have produced the same problem. The only solution I found was restarting the computer;
after that, the same condor_submit_dag command works nicely.

Cheers,
Szabolcs