Re: [Condor-users] DAGMan log file bug....
- Date: Thu, 24 Aug 2006 09:53:40 -0500 (CDT)
- From: "R. Kent Wenger" <wenger@xxxxxxxxxxx>
- Subject: Re: [Condor-users] DAGMan log file bug....
On Wed, 23 Aug 2006, Bob Mortensen wrote:
> I think I've found a bug with dagman managing the log files for a large
> DAG. Actually, it has to do with parsing the DAG and .sub files.
> Ultimately it causes the DAG to hang without ever completing. I'm running
> 6.8.0 on WindowsXP. Here are some details:
> I have a DAG with 82 nodes, no dependencies. In the .dag.dagman.out log
> file I can see that for a few of my nodes, the log file name is not being
> read correctly from the .sub file. A few of the pertinent lines from the
> .dag.dagman.out file are included below. Since dagman never gets the name
> correct, it is unable to read the file and therefore the usual ULOG events
> never show up for those nodes and it doesn't know that they complete. The
> nodes' log files are created and contain reasonable information. Finally,
> if I create a DAG of a subset of the nodes, the problem goes away (or at
> least moves).
We're looking into this.
Could you also send a tarfile of your .sub files? I'd like to get a look
at what's different between the ones that work and the ones that don't.
Also, the complete dagman.out file would be good.
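
One way to compare the submit files before sending them is to list the log file name each one declares and check it against what the .dag.dagman.out file reports. The sketch below is only an illustration, not part of DAGMan: the function names are hypothetical, it assumes the .sub files sit in the current directory, it assumes each file has a plain "log = <name>" line, and it does not expand macros such as $(Cluster).

```python
import re
from pathlib import Path

# Matches a submit-file line such as "log = node1.log".
# The key is matched case-insensitively; "log_xfer = ..." will not match.
LOG_LINE = re.compile(r"^\s*log\s*=(.*)$", re.IGNORECASE)


def find_log_name(sub_text):
    """Return the value of the first 'log = ...' line, or None if absent.

    Assumption: the first such line is the one of interest; this sketch
    does not try to reproduce how Condor itself resolves repeated keys.
    """
    for line in sub_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip comment lines
        match = LOG_LINE.match(line)
        if match:
            return match.group(1).strip()
    return None


def report(directory="."):
    """Print each .sub file alongside the log name it declares."""
    for path in sorted(Path(directory).glob("*.sub")):
        # latin-1 decodes any byte sequence, so odd bytes cannot raise here.
        text = path.read_bytes().decode("latin-1")
        # repr() makes stray control characters in the name visible.
        print(f"{path.name}: {find_log_name(text)!r}")


if __name__ == "__main__":
    report()
```

Printing the name with repr() is a deliberate choice: an unexpected character in the value would show up as an escape sequence rather than being invisible in the output.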