
[Condor-users] exited with STATUS 100



All - I have a simulation that starts running on Condor and reads in all of its data files just fine. But once it starts executing, it issues a "SIGQUIT" and Condor thinks it's done. If I run it by hand on cf02012 (one of our Condor boxes), it runs just fine. Nothing in the error file or log file indicates that anything is wrong.

 

tec232 is our "submit only" machine; cf02012 is the box the job tries to run on. We just upgraded to RHEL 4.0 and Condor 6-6-10. We've been running Condor for almost a year now, with great results.

 

Similar sims have run just fine. There is just something quirky about this one.

 

Any ideas would be a tremendous help.

 

Thanks,

Jim

 

tec232 Shadow log

8/18 11:18:58 (884.0) (20447): Request to run on <172.31.2.12:33355> was ACCEPTED

8/18 11:19:06 (884.0) (20447): DaemonCore: PERMISSION DENIED to unknown user from host <172.31.2.12:33478> for command 71000 (SHADOW_UPDATEINFO)

8/18 11:39:06 (884.0) (20447): DaemonCore: PERMISSION DENIED to unknown user from host <172.31.2.12:33478> for command 71000 (SHADOW_UPDATEINFO)

8/18 11:55:25 (884.0) (20447): Job 884.0 terminated: exited with status 0

8/18 11:55:25 (884.0) (20447): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 100

 

tec232 SchedLog

8/18 11:18:57 Started shadow for job 884.0 on "<172.31.2.12:33355>", (shadow pid = 20447)

8/18 11:18:59 Sent ad to central manager for solis@xxxxxxxxxxxxxxxxxxxxxxxxxxxx

8/18 11:23:59 Sent ad to central manager for solis@xxxxxxxxxxxxxxxxxxxxxxxxxxxx

8/18 11:28:59 Sent ad to central manager for solis@xxxxxxxxxxxxxxxxxxxxxxxxxxxx

8/18 11:33:59 Sent ad to central manager for solis@xxxxxxxxxxxxxxxxxxxxxxxxxxxx

8/18 11:38:59 Sent ad to central manager for solis@xxxxxxxxxxxxxxxxxxxxxxxxxxxx

8/18 11:43:59 Sent ad to central manager for solis@xxxxxxxxxxxxxxxxxxxxxxxxxxxx

8/18 11:48:59 Sent ad to central manager for solis@xxxxxxxxxxxxxxxxxxxxxxxxxxxx

8/18 11:53:59 Sent ad to central manager for solis@xxxxxxxxxxxxxxxxxxxxxxxxxxxx

8/18 11:55:25 Shadow pid 20447 for job 884.0 exited with status 100

 

cf02012 StartLog

8/18 11:55:23 Failed to obtain keyboard or mouse idle information.

8/18 11:55:23 Assuming the keyboard and mouse to be infinitely idle.

8/18 11:55:25 DaemonCore: Command received via TCP from host <192.56.136.232:45119>

8/18 11:55:25 DaemonCore: received command 404 (DEACTIVATE_CLAIM_FORCIBLY), calling handler (command_handler)

8/18 11:55:25 vm1: Called deactivate_claim_forcibly()

8/18 11:55:25 Starter pid 7409 exited with status 0

 

cf02012 StarterLog.vm1

8/18 11:18:58 Starting a VANILLA universe job with ID: 884.0

8/18 11:18:58 IWD: /home/wbs/studies/pmma/scenario/CS20/runs/PMMA-Production

8/18 11:18:58 Error file: /home/wbs/studies/pmma/scenario/CS20/output/cs20-bc2.uav/error.cs20-bc2

8/18 11:18:58 About to exec /home/wbs/studies/pmma/scenario/CS20/runs/PMMA-Production/cmd.uav cs20-bc2.uav 01

8/18 11:18:58 Create_Process succeeded, pid=7412

8/18 11:55:25 Process exited, pid=7412, status=0

8/18 11:55:25 Got SIGQUIT.  Performing fast shutdown.

8/18 11:55:25 ShutdownFast all jobs.

8/18 11:55:25 **** condor_starter (condor_STARTER) EXITING WITH STATUS 0
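
For anyone wanting to cross-check the excerpts above, here is a minimal sketch (not a Condor tool) that pulls the job's wall-clock runtime and exit status out of the StarterLog.vm1 lines. The helper names `parse` and `job_summary` are made up for illustration, and the sketch assumes every line starts with an `M/D HH:MM:SS` stamp and that all events fall on the same day.

```python
import re

# Three lines copied from the StarterLog.vm1 excerpt above.
STARTER_LOG = """\
8/18 11:18:58 Create_Process succeeded, pid=7412
8/18 11:55:25 Process exited, pid=7412, status=0
8/18 11:55:25 Got SIGQUIT.  Performing fast shutdown.
"""

# Assumed line layout: "M/D HH:MM:SS message"
LINE_RE = re.compile(r"^\d+/\d+ (\d+):(\d+):(\d+) (.*)$")
EXIT_RE = re.compile(r"Process exited, pid=(\d+), status=(\d+)")


def parse(log_text):
    """Return (seconds_since_midnight, message) for each stamped line."""
    events = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            h, mi, s = int(m.group(1)), int(m.group(2)), int(m.group(3))
            events.append((h * 3600 + mi * 60 + s, m.group(4)))
    return events


def job_summary(events):
    """Return (runtime_seconds, exit_status) of the job process.

    Either value is None if the matching line is missing.
    Assumes start and exit happen on the same day.
    """
    start = end = status = None
    for secs, msg in events:
        if msg.startswith("Create_Process succeeded"):
            start = secs
        m = EXIT_RE.match(msg)
        if m:
            end = secs
            status = int(m.group(2))
    runtime = end - start if start is not None and end is not None else None
    return runtime, status


if __name__ == "__main__":
    runtime, status = job_summary(parse(STARTER_LOG))
    print(f"runtime={runtime}s status={status}")
```

On these lines it reports a runtime of 2187 seconds (about 36 minutes) and exit status 0, which matches the "exited with status 0" line in the shadow log.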