RE: [condor-users] dprintf hit fatal errors

I should post the resolution of this issue, for the benefit of anyone else who might be seeing it.  To recap, the symptoms were that jobs we submitted would almost always be matched with a grid node and transition to the "running" state, then be killed shortly thereafter.  We would find "dprintf hit fatal errors" in the job's log file.

The resolution was to decrease the amount of debugging information we were logging.  We had set the debugging level on the Central Manager to D_ALL for all daemons, and the Central Manager just couldn't keep up.  When we returned the debugging level to its normal setting, the problem went away.
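
For anyone looking for the relevant knob, here is a rough sketch of the kind of setting involved.  This is illustrative only and not a copy of our actual configuration file; it assumes the ALL_DEBUG macro is what applies a debug level to every daemon, and that leaving it empty falls back to each daemon's default level.  Check your own condor_config before changing anything.

    # Illustrative only: full debug output for every daemon.
    # This is the sort of setting that caused the trouble for us.
    ALL_DEBUG = D_ALL

    # Assumed fix: leave the value empty so each daemon logs at its default level.
    ALL_DEBUG =

The daemons need to re-read their configuration (for example with condor_reconfig) before a change like this takes effect.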

Many thanks to Colin Stolley for confirming this to be the problem.


