
Re: [Condor-users] Condor and monitoring performance





On 7/1/10 9:07 AM, Michael O'Donnell wrote:

I was curious if anyone has suggestions on how to monitor the health of a Condor pool? I am trying to track down an error (Q3) and was also trying to develop a set of commands for monitoring Condor.

Doug Thain, at Notre Dame, has a Condor Log Analyzer tool which may be useful to you. I'm working offline now, but if you search for it, I'm sure you'll find it. It mostly deals with standard Condor job log files, and possibly also DAG log files, rather than service log files, but it may help.

If it is any consolation, we also find it hard to figure out what is going on with Condor.  Staring at service log files (with "tail -f") seems to be about the best we can do.  We agree this is suboptimal.
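A minimal sketch of the "tail -f" approach mentioned above, written in Python so the output can be filtered down to lines worth reading. Everything named here is an assumption for illustration: the keyword list, the function names `interesting` and `follow`, and the log path passed on the command line are not part of Condor, and the sketch does not handle log rotation or partially written lines.

```python
import sys
import time
from typing import Iterable, Iterator, Sequence

# Hypothetical keywords; adjust to the messages your daemons actually log.
KEYWORDS = ("ERROR", "FAIL", "exception")


def interesting(lines: Iterable[str], keywords: Sequence[str] = KEYWORDS) -> Iterator[str]:
    """Yield only the lines containing one of the keywords (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    for line in lines:
        text = line.lower()
        if any(k in text for k in lowered):
            yield line.rstrip("\n")


def follow(path: str, poll_seconds: float = 1.0) -> Iterator[str]:
    """Yield lines as they are appended to path, similar to tail -f."""
    with open(path, "r", errors="replace") as f:
        f.seek(0, 2)  # jump to the end of the file, so only new lines are seen
        while True:
            line = f.readline()
            if line:
                yield line
            else:
                time.sleep(poll_seconds)  # nothing new yet; wait and retry


if __name__ == "__main__":
    # Usage: python watch_log.py /path/to/some/daemon/log
    for hit in interesting(follow(sys.argv[1])):
        print(hit)
```

Run against one service log at a time, this only reduces the noise; it does not replace reading the surrounding lines once something suspicious shows up.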

Ian

-- 
Ian Stokes-Rees, PhD                       W: http://abitibi.sbgrid.org
ijstokes@xxxxxxxxxxxxxxxxxxx               T: +1.617.432.5608 x75
NEBioGrid, Harvard Medical School          C: +1.617.331.5993