[Condor-users] Dagman and rescue files



Dear All,

I have a very simple DAG:

Job condor_script1 /home/condor_script1.sub
Script POST condor_script1 /home/data2db.py $RETURN $JOB

but it always creates a rescue file. The POST script currently does
something simple, e.g. nothing at all or writing to a file. When the
script writes to a file, it does produce the desired output.
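
For illustration, here is a minimal sketch of a POST script of this kind.
It is not the actual data2db.py; the helper name record_result and the
output path /tmp/data2db.log are hypothetical. It assumes only what the
DAG file above shows: DAGMan expands $RETURN to the job's exit status and
$JOB to the node name, and passes both as command-line arguments.

```python
#!/usr/bin/env python
"""Sketch of a DAGMan POST script invoked as: data2db.py $RETURN $JOB."""
import sys

# Hypothetical output location, chosen only for this example.
OUT_PATH = "/tmp/data2db.log"


def record_result(return_code, job_name, out_path):
    """Append one line describing the finished node to out_path."""
    line = "%s exited with status %s\n" % (job_name, return_code)
    with open(out_path, "a") as handle:
        handle.write(line)
    return line


def main(argv):
    """Return the exit status that DAGMan will see for the POST script."""
    # argv[1] is the expanded $RETURN, argv[2] is the expanded $JOB.
    if len(argv) != 3:
        sys.stderr.write("usage: data2db.py RETURN JOB\n")
        return 1
    record_result(argv[1], argv[2], OUT_PATH)
    # A zero exit status marks the node as successful.
    return 0


# Only act when arguments were actually supplied, as DAGMan does.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

When a POST script is defined, its exit status decides whether DAGMan
treats the node as successful, so a script like this should exit with 0
explicitly.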

Can anyone suggest why a rescue file is being created?

Many thanks 

Colin

The rescue file is:
# Rescue DAG file, created after running
#   the /home/condor_scriptDag1.dag DAG file
#
# Total number of Nodes: 1
# Nodes premarked DONE: 0
# Nodes that failed: 0
#   <ENDLIST>

JOB condor_script1 /home/condor_script1.sub
SCRIPT POST condor_script1 /home/data2db.py $RETURN $JOB


The dagman out file is:
<snip>
9/14 10:47:46 Job condor_script1 completed successfully.
9/14 10:47:46 Running POST script of Job condor_script1...
9/14 10:47:46 Of 1 nodes total:
9/14 10:47:46  Done     Pre   Queued    Post   Ready   Un-Ready   Failed
9/14 10:47:46   ===     ===      ===     ===     ===        ===      ===
9/14 10:47:46     0       0        0       1       0          0        0
9/14 10:47:46 UserLog::initialize: open("") failed - errno 2 (No such file or directory)
9/14 10:47:51 Of 1 nodes total:
9/14 10:47:51  Done     Pre   Queued    Post   Ready   Un-Ready   Failed
9/14 10:47:51   ===     ===      ===     ===     ===        ===      ===
9/14 10:47:51     0       0        0       0       0          1        0
9/14 10:47:51 ERROR: a cycle exists in the DAG
9/14 10:47:51 Aborting DAG...
9/14 10:47:51 Writing Rescue DAG to /home1/ncsg3/basis/simulator/condor_scriptDag1.dag.rescue...
9/14 10:47:51 **** condor_scheduniv_exec.738.0 (condor_DAGMAN) EXITING WITH STATUS 1