
Re: [HTCondor-users] Setting up HTCondorView



Yes, I should have looked into it in more detail. I was looking for a utility where admins and users can check the load on the cluster from the browser, ideally split up by resource (CPU, memory, GPU). Even better if it could also show the job queue, instead of having to use condor_q.

Even though the client is outdated, the server (which I assume is what the condor_stats command talks to) is still reporting errors, as I described. Is it outdated too?


    
On 24/06/2022 at 22:37, Jason Patton wrote:

Hi Javier,

The HTCondor View client is a very old contributed module that we have not updated to keep up with the latest HTCondor releases. For example, the "view_client-2.1-Any-Java.tar.Z" file linked from the wiki was last modified in 2011 (the actual code changes may be even older). It's not part of the HTCondor code base, and it is probably due to be removed from our documentation.

Can you let us know what kind of monitoring you had in mind and maybe we can help you find a more recent and better supported solution?

Jason Patton

On 6/23/22 2:30 PM, Javier Barbero wrote:

Hi everyone, I'm trying to set up the HTCondorView server in our cluster by following the instructions in the documentation (https://htcondor.readthedocs.io/en/latest/admin-manual/setting-up-special-environments.html#configuring-a-machine-to-be-a-htcondorview-server). By the way, even knowing that it is there, this section is very hard to find, and it is separate from the Client information.

I decided to use my existing collector as the HTCondorView collector by adding the following configuration:

POOL_HISTORY_DIR = $(LOCAL_DIR)/log/condorview
POOL_HISTORY_MAX_STORAGE = 322122547
KEEP_POOL_HISTORY = True

(also, is the POOL_HISTORY_MAX_STORAGE in bytes? It is not very well specified in https://htcondor.readthedocs.io/en/latest/admin-manual/configuration-macros.html#POOL_HISTORY_MAX_STORAGE)
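
For reference, here is a small sketch of what the configured value would amount to if the unit is bytes. That unit is an assumption, not something confirmed in this thread, and `describe_bytes` is just a hypothetical helper name used for illustration.

```python
# Hypothetical helper: express a byte count in binary units (MiB, GiB).
# Assumption: POOL_HISTORY_MAX_STORAGE is measured in bytes.
def describe_bytes(n_bytes: int) -> str:
    mib = n_bytes / 1024**2  # 1 MiB = 1024 * 1024 bytes
    gib = n_bytes / 1024**3  # 1 GiB = 1024 MiB
    return f"{n_bytes} bytes = {mib:.1f} MiB = {gib:.2f} GiB"


# Value taken from the configuration above.
POOL_HISTORY_MAX_STORAGE = 322122547
print(describe_bytes(POOL_HISTORY_MAX_STORAGE))
# prints "322122547 bytes = 307.2 MiB = 0.30 GiB"
```

Under that assumption the setting caps the history files at roughly 0.3 GiB in total.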

POOL_HISTORY_DIR resolves to "/var/log/condorview" and the "condor" user has been given ownership of this directory. After a few mistakes with the configuration, the View server appears to be running, but the response I get from condor_stats is very weird. For example, if I run "condor_stats -userlist" I get the following (I have anonymized the user, host and domain, but the output is the same every time I try):

failed to receive data from the CondorView server
user@xxxxxxxxxx/main.domain.com

I have tried to set up the Client following its corresponding instructions (https://htcondor.readthedocs.io/en/latest/contrib-source-modules/view-client-contrib-module.html) and while running the setup command I get several messages saying "failed to receive data from the CondorView server". If I run the "./make_stats hour" command I also get the same message once.

I cannot see any error messages in the collector logs and the condorview logs look fine, although I don't really know what they should look like. Am I missing something?


_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/
