
Re: [Condor-users] Show job duration history of machines

On 10/23/2012 11:14 AM, Hermann Fuchs wrote:

Currently we have some problems with our grid.
It seems some machines abort jobs after about 20 minutes. In order to identify the faulty machines, I need
a command that shows the job duration history of a machine.

There's no easy way to do this with Condor today that I can think of. This is because the job history file contains only a summary of all the execution attempts for a given job, not a separate record for each attempt on each machine. If the jobs write to a single user log file, parsing that log is probably the best approach.
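
As a rough illustration of the log-parsing approach, here is a minimal Python sketch that pairs each "Job executing on host" event with the next event that ends that execution attempt and groups the resulting durations by execute host. It assumes the classic text user log format, where each event starts with a header such as "001 (123.000.000) 10/23 11:15:00 Job executing on host: <192.168.1.2:5678>". The function name durations_by_host, the choice of end events, and the year parameter are assumptions made for this example, not part of Condor; check them against your own log before relying on the output.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed event header layout: "NNN (cluster.proc.subproc) MM/DD HH:MM:SS message"
EVENT_RE = re.compile(
    r"^(\d{3}) \((\d+)\.(\d+)\.\d+\) (\d\d/\d\d \d\d:\d\d:\d\d) (.*)$"
)
HOST_RE = re.compile(r"<([^>]+)>")

EXECUTE = "001"  # "Job executing on host: <ip:port>"
# Assumption: these event numbers end an execution attempt
# (004 evicted, 005 terminated, 007 shadow exception, 009 aborted).
END_EVENTS = {"004", "005", "007", "009"}


def durations_by_host(log_text, year=2012):
    """Return {host: [((cluster, proc), end_event, duration), ...]}.

    The assumed log format carries no year, so one is supplied by the
    caller; attempts that span a year boundary are not handled.
    """
    running = {}                 # (cluster, proc) -> (host, start time)
    result = defaultdict(list)
    for line in log_text.splitlines():
        match = EVENT_RE.match(line)
        if not match:
            continue             # detail lines and "..." separators
        code, cluster, proc, stamp, message = match.groups()
        job = (int(cluster), int(proc))
        when = datetime.strptime(f"{year}/{stamp}", "%Y/%m/%d %H:%M:%S")
        if code == EXECUTE:
            host = HOST_RE.search(message)
            running[job] = (host.group(1) if host else "unknown", when)
        elif code in END_EVENTS and job in running:
            host, started = running.pop(job)
            result[host].append((job, code, when - started))
    return dict(result)
```

Calling durations_by_host(open("jobs.log").read()) (with jobs.log standing in for your own user log) and looking for hosts whose attempts repeatedly end after roughly 20 minutes should point at the suspect machines. The host is reported as the address from the log, so you may still need to map it back to a machine name.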

Or, if you can turn on the startd_history on each execute machine, each startd will write out a history-like file, but you'd need to concatenate those yourself.
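
For the concatenation step, something along these lines might do. It is only a sketch: it assumes you have already copied the per-machine startd history files into one directory tree, and the function name concatenate_histories, the "startd_history*" file name pattern, and the paths in the usage note are made up for the example.

```python
from pathlib import Path


def concatenate_histories(source_dir, combined_path, pattern="startd_history*"):
    """Append every file matching `pattern` under `source_dir` to one file.

    Returns the list of source files that were combined, in sorted order.
    """
    combined = Path(combined_path)
    sources = sorted(
        p for p in Path(source_dir).rglob(pattern)
        if p.is_file() and p.resolve() != combined.resolve()
    )
    with combined.open("w") as out:
        for src in sources:
            text = src.read_text()
            out.write(text)
            # Keep records from different machines on separate lines.
            if text and not text.endswith("\n"):
                out.write("\n")
    return sources
```

For example, concatenate_histories("collected", "all_startd_history") would merge collected/node01/startd_history, collected/node02/startd_history, and so on into a single file that you can then inspect in one place.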