
RE: [condor-users] dprintf hit fatal errors



I should post the resolution of this issue for the benefit of anyone else who might be seeing it.  To recap, the symptoms were that jobs we submitted would almost always be matched with a grid node, transition to the "running" state, then be killed shortly thereafter.  We would find "dprintf hit fatal errors" in the job's log file.

The resolution was to decrease the amount of debugging information we were logging.  We had set the debugging level on the central manager to D_ALL for all daemons, and the central manager just couldn't keep up with the volume of log output.  When we put the debugging level back down to normal levels, the problem went away.
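
For anyone wanting to check their own setup, here is a rough sketch of what such a change might look like in the central manager's Condor configuration file.  ALL_DEBUG and the per-daemon <SUBSYS>_DEBUG macros are standard Condor settings, but the specific lines below are an assumption for illustration only, not the configuration actually used in this thread; check the manual for your Condor version before relying on them.

```
# Before (illustrative): maximum verbosity for every daemon on this host.
# ALL_DEBUG = D_ALL

# After (illustrative): clear the global setting and keep the
# central manager daemons at a low level.  An empty value should
# leave only the basic, always-logged messages.
ALL_DEBUG =
MASTER_DEBUG =
COLLECTOR_DEBUG =
NEGOTIATOR_DEBUG =
```

After editing the file, running condor_reconfig on the central manager should make the daemons pick up the new levels without a full restart.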

Many thanks to Colin Stolley for confirming this to be the problem.

-David

-----Original Message-----
From: Erik Paulson [mailto:epaulson@xxxxxxxxxxx]
Sent: Friday, April 02, 2004 11:59 AM
To: condor-users@xxxxxxxxxxx
Subject: Re: [condor-users] dprintf hit fatal errors


On Fri, Apr 02, 2004 at 10:07:31AM -0500, David Vestal wrote:
> To all,
> 
> "dprintf hit fatal errors."  I'm getting this on a very regular basis in my job log files.
Condor Support Information:
http://www.cs.wisc.edu/condor/condor-support/
To Unsubscribe, send mail to majordomo@xxxxxxxxxxx with
unsubscribe condor-users <your_email_address>