
Re: [HTCondor-users] HTCondor and condor_ganglia issues



More about the condor_gangliad process:

I stopped condor (systemctl stop), and condor_gangliad was still running afterwards, so I killed it. I then added GANGLIAD to DAEMON_LIST and restarted condor. Sure enough, condor_gangliad was one of the processes. Strangely, though, less than a second later a second condor_gangliad appeared.

[root@simclu-ce ~]# ps -ea|grep gangliad
2592326 ?        00:00:00 condor_gangliad
2592334 ?        00:00:00 condor_gangliad

Would it be because I have a wrong configuration?

Secondly, Gangliadlog has this error:

07/27/21 21:40:23 my_popenv: Failed to exec “gstat, errno=2 (No such file or directory)
07/27/21 21:40:23 Failed to execute “gstat --all --mpifile --gmond_ip=192.168.55.79 --gmond_port=8652”: No such file or directory

What file is it complaining about? I replaced "gstat" with "/bin/gstat", but the same error shows up again: Failed to exec "/bin/gstat, ..
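
Since errno=2 (ENOENT) persists even with an absolute path, one thing worth ruling out is the command string itself, for example a stray quote character or other non-ASCII byte that has become part of the program name. This is only a guess; the thread does not confirm it. The sketch below uses a hypothetical helper, check_command_string, and assumes the command in the log comes from the GANGLIA_GSTAT_COMMAND setting.

```shell
#!/bin/sh
# Hypothetical helper (not part of HTCondor): sanity-check a command string.
check_command_string() {
    cmd=$1
    # Delete every printable ASCII byte (space through tilde); anything left
    # over is a control or non-ASCII byte, such as a pasted curly quote.
    rest=$(printf '%s' "$cmd" | LC_ALL=C tr -d '\040-\176')
    if [ -n "$rest" ]; then
        echo "non-printable"
        return 1
    fi
    # The first space-separated word is the program that would be exec'd.
    exe=${cmd%% *}
    if ! command -v "$exe" >/dev/null 2>&1; then
        echo "not-found"
        return 1
    fi
    echo "ok"
}

# Example use (assumes the setting name mentioned above):
#   check_command_string "$(condor_config_val GANGLIA_GSTAT_COMMAND)"
```

A result of "non-printable" would point at the configuration value rather than at a missing gstat binary, while "not-found" would suggest that gstat is not installed or not on the PATH seen by the daemon.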

-
Nagaraj




On 2021-07-27 21:15, John M Knoeller wrote:
I'm not sure why the condor_gangliad would be running if you did not
add it to your daemon list.   But the error is because you need to put
GANGLIAD in your daemon list not GANGLIA_D.
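
A minimal sketch of that daemon-list change, written as a small shell helper. The helper name write_gangliad_dropin, the drop-in file name 99-gangliad.conf, and the /etc/condor/config.d directory in the usage comment are assumptions for illustration and are not taken from this thread.

```shell
#!/bin/sh
# Hypothetical helper: write a drop-in config file that appends GANGLIAD
# (not GANGLIA_D) to the existing DAEMON_LIST.
write_gangliad_dropin() {
    conf_dir=$1                               # e.g. /etc/condor/config.d (assumed)
    conf_file="$conf_dir/99-gangliad.conf"    # hypothetical file name
    # Single quotes keep $(DAEMON_LIST) literal, so HTCondor expands it
    # instead of the shell.
    printf '%s\n' 'DAEMON_LIST = $(DAEMON_LIST) GANGLIAD' > "$conf_file"
    printf '%s\n' "$conf_file"
}

# Example use on the collector host (as root), then restart and verify:
#   write_gangliad_dropin /etc/condor/config.d
#   systemctl restart condor
#   condor_config_val DAEMON_LIST
```

Appending with $(DAEMON_LIST) keeps whatever daemons are already configured (MASTER, COLLECTOR, NEGOTIATOR, SCHEDD in this case) and adds GANGLIAD at the end.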

 Instructions for handling the case where the metad is on a
different machine than the condor_collector are here:

 Monitoring - HTCondor Manual 9.1.0 documentation [1]

 -tj

-------------------------

FROM: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of
Nagaraj Panyam <pn@xxxxxxxxxxx>
SENT: Tuesday, July 27, 2021 6:34 AM
TO: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
SUBJECT: [HTCondor-users] HTCondor and condor_ganglia issues

Hi,

I am trying to configure HTCondor's Ganglia monitoring. In that
context, I see something I do not understand.

Firstly, I see the process condor_gangliad even though it is not in
the DAEMON_LIST (config_val_dump shows DAEMON_LIST = MASTER COLLECTOR
NEGOTIATOR SCHEDD). Is this expected?

Secondly, when I specifically add GANGLIA_D to DAEMON_LIST in the condor
config file, the error given below shows up in MasterLog. Where do I
add the executable path? We have CONDOR_VERSION = 8.9.13.

GANGLIA_D is in the DAEMON_LIST parameter, but there is no
executable path for it defined in the config files!
ERROR "Must have the path to GANGLIA_D defined." at line 1606 in file
/var/lib/condor/execute/slot1/dir_19111/userdir/.tmp9djsO9/BUILD/condor-8.9.13/src/condor_master.V6/masterDaemon.cpp

Thirdly, after resolving the above issues, what is the scheme to hook up
HTCondor's monitoring to an existing Ganglia installation? We will have
condor_gangliad on the same machine as the Collector, and Ganglia's metad
running on a different host.

Thanks

Nagaraj



Links:
------
[1]
https://htcondor.readthedocs.io/en/latest/admin-manual/monitoring.html?highlight=gangliad#ganglia