RE: [Condor-users] Dagman and rescue files
- Date: Tue, 14 Sep 2004 16:15:12 +0100
- From: "Colin Gillespie" <C.Gillespie@xxxxxxxxxxxxxxx>
- Subject: RE: [Condor-users] Dagman and rescue files
Hi Peter,
My log line is:
log=/home1/ncsg3/basis/simulator/condor_script1.log
After the run the log file has this in it:
000 (833.000.000) 09/14 16:07:38 Job submitted from host: <X:32773>
DAG Node: condor_script1
...
001 (833.000.000) 09/14 16:08:01 Job executing on host: <X:32772>
...
005 (833.000.000) 09/14 16:08:03 Job terminated.
(1) Normal termination (return value 0)
Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
0 - Run Bytes Sent By Job
0 - Run Bytes Received By Job
0 - Total Bytes Sent By Job
0 - Total Bytes Received By Job
...
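For anyone who wants to check a log like the one above from a script, here is a minimal Python sketch. It assumes the event header layout is exactly as shown in this excerpt (a three-digit event code, the job ID in parentheses, date, time, then free text); `parse_events` and `terminated_normally` are hypothetical helper names, not part of Condor.

```python
import re

# Matches event header lines as they appear in the excerpt above, e.g.
#   000 (833.000.000) 09/14 16:07:38 Job submitted from host: <X:32773>
# The layout is inferred from this one log, so treat it as an assumption.
EVENT_HEADER = re.compile(
    r"^(?P<code>\d{3}) \((?P<cluster>\d+)\.(?P<proc>\d+)\.(?P<subproc>\d+)\) "
    r"(?P<date>\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) (?P<text>.*)$"
)


def parse_events(log_text):
    """Return one dict per event header; detail lines and '...' are skipped."""
    events = []
    for line in log_text.splitlines():
        m = EVENT_HEADER.match(line.strip())
        if m:
            events.append({
                "code": m.group("code"),
                "cluster": int(m.group("cluster")),
                "proc": int(m.group("proc")),
                "time": m.group("date") + " " + m.group("time"),
                "text": m.group("text"),
            })
    return events


def terminated_normally(log_text):
    """True if the first 005 event is followed by a return-value-0 line."""
    lines = [l.strip() for l in log_text.splitlines()]
    for i, line in enumerate(lines):
        m = EVENT_HEADER.match(line)
        if m and m.group("code") == "005" and i + 1 < len(lines):
            return "Normal termination (return value 0)" in lines[i + 1]
    return False
```

Run against the log above, this should report three events (000, 001, 005) for cluster 833 and a normal termination.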
Thanks
Colin
>>You wrote
DAGMan is dumping a rescue file in this case because it is unable to
make any forward progress in the DAG. It apparently cannot make progress
because it can't open the userlog (the job event log) for your DAG node:
the filename of the userlog is "" (i.e., an empty string).
This (admittedly cryptic) error is right in the log you included:
> 9/14 10:47:46 UserLog::initialize: open("") failed - errno 2 (No such file or directory)
> 9/14 10:47:51 Of 1 nodes total:
What does the "log =" line in your job submit file look like?
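
A quick way to answer that question for a given submit file is to scan it for the log setting. The Python sketch below is only an illustration: `find_log_setting` is a hypothetical helper, not DAGMan's own parser, and the matching rules (case-insensitive key, `#` comment lines skipped, last setting wins) are assumptions rather than documented behaviour.

```python
import re

# Hypothetical check, not DAGMan's parser: matches lines such as
#   log=/home1/ncsg3/basis/simulator/condor_script1.log
# and captures everything after the '=' with surrounding whitespace removed.
LOG_LINE = re.compile(r"^\s*log\s*=\s*(?P<value>.*?)\s*$", re.IGNORECASE)


def find_log_setting(submit_text):
    """Return the value of the last 'log = ...' line, or None if absent."""
    value = None
    for line in submit_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # assumed comment syntax
        m = LOG_LINE.match(line)
        if m:
            value = m.group("value")
    return value
```

A result of `None` (no log line at all) or `""` (a log line with nothing after the `=`) would be consistent with the empty filename in the error above.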