[Condor-users] Monitoring Condor servers (and Nagios)

We're in the early stages of updating and moving our Condor system.  As part of this we'd like to add monitoring of the system to our existing central monitoring system (using Nagios).  I was surprised that I didn't find any plugins/recipes for this in either the Condor or Nagios communities (but I may not have found the right place to look yet).  I've spotted some passing references to monitoring central nodes using Nagios but nothing substantial.

Obviously we could knock something together using the standard Nagios plugins to check for processes etc., but I can't help thinking that several people must have been through this already.  Can anyone point me at references/useful sites?
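
As a rough illustration of the kind of stopgap I mean, a process check that follows the Nagios plugin exit-code convention (0 = OK, 2 = CRITICAL, 3 = UNKNOWN) might look something like the Python sketch below.  The daemon list, the message wording and the function names are assumptions made up for this example, not an existing Condor or Nagios plugin, and it assumes the daemons show up in `ps` under their usual names.

```python
#!/usr/bin/env python3
"""Sketch of a Nagios-style check that the expected Condor daemons are running."""
import os
import subprocess

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumption: daemons expected on a central manager; adjust per host role.
REQUIRED_DAEMONS = ("condor_master", "condor_collector", "condor_negotiator")


def running_process_names():
    """Return the set of program names found in the process table.

    Uses `ps -e -o args=` rather than `comm=`, because some systems
    truncate `comm` and would cut off longer daemon names.
    """
    out = subprocess.run(
        ["ps", "-e", "-o", "args="],
        capture_output=True, text=True, check=True,
    ).stdout
    names = set()
    for line in out.splitlines():
        parts = line.split()
        if parts:
            # Keep only the program name, without any leading directory.
            names.add(os.path.basename(parts[0]))
    return names


def evaluate(required, running):
    """Map required-vs-running daemons to a (exit code, status line) pair."""
    missing = [name for name in required if name not in running]
    if missing:
        return CRITICAL, "CONDOR CRITICAL - not running: " + ", ".join(missing)
    return OK, "CONDOR OK - running: " + ", ".join(required)


def main():
    """Print one status line and return the Nagios exit code."""
    try:
        running = running_process_names()
    except (OSError, subprocess.CalledProcessError) as exc:
        print(f"CONDOR UNKNOWN - could not list processes: {exc}")
        return UNKNOWN
    code, message = evaluate(REQUIRED_DAEMONS, running)
    print(message)
    return code
```

To run it as a plugin, the script would end with `import sys; sys.exit(main())` so that Nagios sees the exit code.  This only shows that the processes exist, not that the pool is actually answering queries, which is part of why I'm hoping someone has something better.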

To be clear, at the moment I'm interested in monitoring the system for availability and alerting system staff about problems.  Recording usage is a separate project (and I've seen more leads for that sort of thing).