
[HTCondor-users] Ganglia Heartbeats sent 0



Hello,

I'm trying to learn how to set up Ganglia to monitor a Condor pool.

I'm currently working on localhost to make things easier. I configured Ganglia and it is monitoring this 1-node cluster. The default metrics from gmond.conf work fine and appear on the web frontend, but I'm having trouble getting the Condor metrics.

In the GangliadLog I have:

03/09/15 14:20:28 ******************************************************
03/09/15 14:20:28 ** condor_gangliad (CONDOR_GANGLIAD) STARTING UP
03/09/15 14:20:28 ** /home/condor/condor-8.3.4-x86_64_Ubuntu14-unstripped/libexec/condor_gangliad
03/09/15 14:20:28 ** SubsystemInfo: name=GANGLIAD type=DAEMON(12) class=DAEMON(1)
03/09/15 14:20:28 ** Configuration: subsystem:GANGLIAD local:<NONE> class:DAEMON
03/09/15 14:20:28 ** $CondorVersion: 8.3.4 Mar 02 2015 BuildID: 304666 $
03/09/15 14:20:28 ** $CondorPlatform: x86_64_Ubuntu14 $
03/09/15 14:20:28 ** PID = 8922
03/09/15 14:20:28 ** Log last touched 3/9 14:20:11
03/09/15 14:20:28 ******************************************************
03/09/15 14:20:28 Using config source: /home/condor/condor-8.3.4-x86_64_Ubuntu14-unstripped/etc/condor_config
03/09/15 14:20:28 Using local config sources:
03/09/15 14:20:28    /home/condor/condor-8.3.4-x86_64_Ubuntu14-unstripped/local.xxxx/condor_config.local
03/09/15 14:20:28 config Macros = 58, Sorted = 58, StringBytes = 1697, TablesBytes = 2136
03/09/15 14:20:28 CLASSAD_CACHING is ENABLED
03/09/15 14:20:28 Daemon Log is logging: D_ALWAYS D_ERROR
03/09/15 14:20:28 Daemoncore: Listening at <0.0.0.0:45401> on TCP (ReliSock) and UDP (SafeSock).
03/09/15 14:20:28 DaemonCore: command socket at <xxx.xxx.xx.xx:45401>
03/09/15 14:20:28 DaemonCore: private command socket at <xxx.xxx.xx.xx:45401>
03/09/15 14:20:28 Testing /usr/bin/gmetric
03/09/15 14:20:28 Loading libganglia libganglia.so
03/09/15 14:20:28 Will use libganglia to interact with ganglia.
03/09/15 14:20:28 Will perform stats publication every GANGLIAD_INTERVAL=60 seconds.
03/09/15 14:20:28 Reading metric definitions from /home/condor/condor-8.3.4-x86_64_Ubuntu14-unstripped/etc/condor/ganglia.d/00_default_metrics
03/09/15 14:20:48 Starting update...
03/09/15 14:20:48 Ganglia is monitoring 1 hosts
03/09/15 14:20:48 Got 8 daemon ads
03/09/15 14:20:48 Heartbeats sent: 0
03/09/15 14:21:08 Starting update...
03/09/15 14:21:08 Heartbeats sent: 0
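
To make it easier to check whether the count ever changes, here is a minimal Python sketch that pulls the "Heartbeats sent" values out of log lines in the format shown above. The function name `heartbeat_counts` and the regular expression are my own illustration, not part of HTCondor, and the timestamp format is assumed from this log only.

```python
import re

# Assumed line format, taken from the log above:
#   "03/09/15 14:20:48 Heartbeats sent: 0"
HEARTBEAT_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) Heartbeats sent: (\d+)\s*$"
)


def heartbeat_counts(lines):
    """Return a list of (timestamp, count) pairs, one per heartbeat line."""
    results = []
    for line in lines:
        match = HEARTBEAT_RE.match(line)
        if match:
            results.append((match.group(1), int(match.group(2))))
    return results


if __name__ == "__main__":
    sample = [
        "03/09/15 14:20:48 Starting update...",
        "03/09/15 14:20:48 Got 8 daemon ads",
        "03/09/15 14:20:48 Heartbeats sent: 0",
        "03/09/15 14:21:08 Heartbeats sent: 0",
    ]
    for timestamp, count in heartbeat_counts(sample):
        print(timestamp, count)
```

The same function can be given an open file object for the real log instead of the sample list.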


Here is my Ganglia-related configuration:

$ condor_config_val -dump | grep -i ganglia
DAEMON_LIST = COLLECTOR MASTER NEGOTIATOR SCHEDD STARTD GANGLIAD
GANGLIA_CONFIG = /etc/ganglia/gmond.conf
GANGLIA_GMETRIC = /usr/bin/gmetric
GANGLIA_GSTAT_COMMAND = gstat --all --mpifile --gmond_ip=localhost --gmond_port=8649
GANGLIA_LIB = libganglia.so
GANGLIA_LIB64_PATH = /lib64,/usr/lib64,/usr/local/lib64
GANGLIA_LIB_PATH = /lib,/usr/lib,/usr/local/lib
GANGLIA_SEND_DATA_FOR_ALL_HOSTS = false
GANGLIAD = $(LIBEXEC)/condor_gangliad
GANGLIAD_INTERVAL = 60
GANGLIAD_LOG = $(LOG)/GangliadLog
GANGLIAD_METRICS_CONFIG_DIR = /home/condor/condor-8.3.4-x86_64_Ubuntu14-unstripped/etc/condor/ganglia.d
GANGLIAD_PER_EXECUTE_NODE_METRICS = true
GANGLIAD_REQUIREMENTS =
GANGLIAD_VERBOSITY = 10
MAX_GANGLIAD_LOG = $(MAX_DEFAULT_LOG)
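
For comparing this dump against another machine, a small Python sketch like the following can turn the `NAME = value` lines into a dictionary. The helper names `parse_config_dump` and `ganglia_settings` are hypothetical, and the parsing assumes only the simple one-setting-per-line layout shown above.

```python
def parse_config_dump(text):
    """Parse 'NAME = value' lines into a dict; empty values become ''."""
    settings = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip anything that is not a setting line
        name, _, value = line.partition("=")
        settings[name.strip()] = value.strip()
    return settings


def ganglia_settings(settings):
    """Keep entries whose name or value mentions ganglia (like grep -i)."""
    return {
        name: value
        for name, value in settings.items()
        if "ganglia" in name.lower() or "ganglia" in value.lower()
    }


if __name__ == "__main__":
    dump = """\
DAEMON_LIST = COLLECTOR MASTER NEGOTIATOR SCHEDD STARTD GANGLIAD
GANGLIA_SEND_DATA_FOR_ALL_HOSTS = false
GANGLIAD_INTERVAL = 60
GANGLIAD_REQUIREMENTS =
"""
    for name, value in sorted(ganglia_settings(parse_config_dump(dump)).items()):
        print(f"{name} = {value!r}")
```

Printing the values with `repr` makes an empty setting such as GANGLIAD_REQUIREMENTS stand out.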

The GANGLIAD daemon is running, but I think it is not transmitting its monitoring data to Ganglia.
Do I have to do something to include the default Condor metrics in Ganglia?

I'm not sure why, but I keep getting "Heartbeats sent: 0". I would appreciate some help.

Thanks in advance,
Ricardo Oda