Re: [Condor-users] Show job duration history of machines
- Date: Tue, 23 Oct 2012 13:33:09 -0500
- From: Greg Thain <gthain@xxxxxxxxxxx>
- Subject: Re: [Condor-users] Show job duration history of machines
On 10/23/2012 11:14 AM, Hermann Fuchs wrote:
Currently we have some problems with our grid.
It seems some machines abort jobs after about 20 minutes. In order to
identify the erroneous machines I would need
some command to show the job duration history of a machine.
There's no easy way to do this with Condor today that I can think of.
This is because the job history file contains only a summary of all the
execution attempts for a given job, not one record per attempt. If there
is one user log file, parsing it is probably the best approach.
Or, if you can turn on the startd_history on each execute machine, each
startd will write out a history-like file, but you'd need to
concatenate those yourself.