[Condor-users] 'ERROR while bootstrapping' in subdag


On Mon, May 10, 2010 at 17:19, R. Kent Wenger <wenger@xxxxxxxxxxx> wrote:
> On Mon, 10 May 2010, Alexander Dietz wrote:
>> does anyone have new information on my reported problem? I need to
>> finish this DAG soon, so without any reply soon I have to restart the
>> DAG from scratch (and will not be able to make tests regarding my
>> reported problem).
> I haven't figured out yet exactly what happened.  But here's one thing to
> try that's better than starting over from scratch:  if there's a lock file
> (t.lock), remove that and re-submit the DAG.  (I'm assuming that the DAGMan
> job is no longer in the queue.)  That should run the rescue DAG, so you
> won't be starting from scratch, but it won't go into recovery mode, so
> you'll bypass the problems with events that are goofing things up.

This procedure more or less works. The DAG may not have resumed exactly
where it stopped, but it did at least continue from the rescue DAG.
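
For anyone following along, here is a minimal sketch of the steps described
above. It assumes the lock file is named "<DAG file>.lock", as the t.lock
example suggests. The helper name remove_dag_lock and the file name my.dag
are placeholders of mine, not anything from Condor itself. The Condor
commands appear only as comments, since they need a working installation.

```shell
#!/bin/sh
# Hypothetical helper: remove the DAGMan lock file left behind for a DAG.
# Assumption: the lock file is "<dag file>.lock" (cf. t.lock in the advice).
remove_dag_lock() {
    dag_file="$1"
    lock_file="${dag_file}.lock"
    if [ -e "$lock_file" ]; then
        rm -f "$lock_file"
        echo "removed $lock_file"
    else
        echo "no lock file at $lock_file"
    fi
}

# Intended sequence (not executed here):
#   condor_q                    # confirm the DAGMan job is no longer queued
#   remove_dag_lock my.dag      # "my.dag" is a placeholder DAG file name
#   condor_submit_dag my.dag    # re-submit; this should run the rescue DAG
```

Checking the queue first matters: removing the lock file while DAGMan is
still running would defeat the purpose of the lock.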

> Before you do that, if you have space, could you tar up all of the node job
> log files and put them someplace I can grab them?  That would help in
> figuring out what has gone wrong.

Which log files exactly do you mean? Maybe I can still grab them...?


