
Re: [HTCondor-users] HTCondor on Amazon Linux



Thank you for the bug report. I will be able to look at this and probably fix it tomorrow.

...Tim

On 1/2/24 16:00, Wakefield, Brendan F via HTCondor-users wrote:

Hello HTCondor community!

First time poster: I work with the US Geological Survey with Mike Fienen, and we are working on ways to deploy HTCondor clusters in AWS. I believe there is an issue with the get_htcondor script that's referenced as the best way to install HTCondor on Linux machines: https://htcondor.readthedocs.io/en/latest/getting-htcondor/install-linux-as-root.html. Now, I'll also say that we are learning and either a) I could be wrong or b) it could be there is a more ideal method for installing HTCondor for our use case. In summary:

So here's what I tried and what led me to the above conclusion: we're running HTCondor in AWS, and most recently we've been working with Amazon Linux 2 for our HTCondor clusters. I noticed the get_htcondor script started failing on Amazon Linux 2 because it appears that the script tries to install HTCondor v23:

"+ sh -c 'yum install -y https://research.cs.wisc.edu/htcondor/repo/23.0/htcondor-release-current.amzn2.noarch.rpm || yum reinstall -y https://research.cs.wisc.edu/htcondor/repo/23.0/htcondor-release-current.amzn2.noarch.rpm'", "Cannot open: https://research.cs.wisc.edu/htcondor/repo/23.0/htcondor-release-current.amzn2.noarch.rpm. Skipping.", "Error: Nothing to do", "Cannot open file: https://research.cs.wisc.edu/htcondor/repo/23.0/htcondor-release-current.amzn2.noarch.rpm. Skipping.", "Error: Nothing to do"

Looking at the location of the .rpm files, I assumed you dropped Amazon Linux 2 support for HTCondor v23, since there is no AL2-specific RPM file here: https://research.cs.wisc.edu/htcondor/repo/23.0/ like there is for v10: https://research.cs.wisc.edu/htcondor/repo/10/10.0/amzn2/
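One way to check this without reading the directory listing by hand is to probe the exact URL that the log above shows being requested. The following is only a sketch: the helper names release_rpm_url and url_exists are hypothetical, and the URL pattern is taken from the failing yum command rather than from any documented layout.

```shell
#!/bin/sh
# Sketch: probe for the release RPM that the failing yum command requested.
# release_rpm_url and url_exists are hypothetical helper names.

REPO_ROOT="https://research.cs.wisc.edu/htcondor/repo"

# Build the release-RPM URL for a release series (e.g. 23.0)
# and a platform tag (e.g. amzn2), following the pattern seen in the log.
release_rpm_url() {
    printf '%s/%s/htcondor-release-current.%s.noarch.rpm\n' \
        "$REPO_ROOT" "$1" "$2"
}

# Succeed only if a HEAD request for the URL returns no HTTP error.
url_exists() {
    curl -fsSIL -o /dev/null "$1"
}

# Example (needs network access):
#   if url_exists "$(release_rpm_url 23.0 amzn2)"; then
#       echo "release RPM is present"
#   else
#       echo "release RPM is missing"
#   fi
```

Running the commented example before calling the install script would show whether a release RPM exists for the platform in question.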

So, we figured we'd try migrating to AL 2023 to stay current, but we noticed that the get_htcondor script also fails on AL 2023 because it requires installing EPEL: https://github.com/htcondor/htcondor/blob/master/src/condor_scripts/get_htcondor#L460. Sadly, EPEL has been dropped from AL 2023 and is not compatible, so the script simply fails.
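To illustrate the kind of platform check that would avoid the EPEL step, here is a minimal sketch. It is not how get_htcondor is written: the function names is_al2023 and wants_epel are hypothetical, and it assumes that AL 2023 reports ID="amzn" and VERSION_ID="2023" in its os-release file.

```shell
#!/bin/sh
# Sketch: decide from an os-release file whether an EPEL step should be attempted.
# is_al2023 and wants_epel are hypothetical names, not part of get_htcondor.
# Assumption: Amazon Linux 2023 sets ID="amzn" and VERSION_ID="2023".

# Succeed if the given os-release file describes Amazon Linux 2023.
# The file is sourced in a subshell so its variables do not leak out.
is_al2023() {
    (
        . "$1"
        [ "${ID:-}" = "amzn" ] && [ "${VERSION_ID:-}" = "2023" ]
    )
}

# Succeed if EPEL should be installed, i.e. on anything except AL 2023.
wants_epel() {
    ! is_al2023 "${1:-/etc/os-release}"
}
```

With no argument, wants_epel inspects /etc/os-release on the running machine; passing a path makes the check easy to try against a sample file.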

So, I've made some progress on AL2023 by abandoning the get_htcondor script and installing manually: using curl to fetch the RPM file at https://research.cs.wisc.edu/htcondor/repo/23.0/amzn2023/x86_64/release/condor-23.0.2-1.amzn2023.x86_64.rpm and then running "yum install condor". I'm thinking this should be fine, since we are trying to do as much of the configuration for the Controller and Worker nodes as possible using static files in /etc/condor.d/config (I think that is the path...). However, I do believe we passed some environment variables to the get_htcondor script when things were working previously.
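The manual route described above could be scripted roughly as follows. This is a minimal sketch, not a tested deployment script: the names condor_rpm_url, run, and install_condor_rpm and the DRY_RUN switch are hypothetical, and only the URL layout and the use of curl and yum come from this message. It installs the downloaded file by its local path, and by default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: fetch a pinned HTCondor RPM for Amazon Linux 2023 and install it with yum.
# Function names and DRY_RUN are illustrative assumptions, not part of HTCondor.

REPO_BASE="https://research.cs.wisc.edu/htcondor/repo/23.0/amzn2023"

# Build the RPM URL from a version-release string (e.g. 23.0.2-1)
# and an architecture (default x86_64).
condor_rpm_url() {
    version="$1"
    arch="${2:-x86_64}"
    printf '%s/%s/release/condor-%s.amzn2023.%s.rpm\n' \
        "$REPO_BASE" "$arch" "$version" "$arch"
}

# Print a command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Download the RPM to /tmp and install it from the local file.
install_condor_rpm() {
    url="$(condor_rpm_url "$1" "${2:-x86_64}")"
    rpm_file="/tmp/$(basename "$url")"
    run curl -fsSL -o "$rpm_file" "$url"
    run yum install -y "$rpm_file"
}
```

Calling `install_condor_rpm 23.0.2-1` prints the two commands; running it as root with `DRY_RUN=0` would execute them. Pinning the version this way means upgrades become an explicit change to the argument.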

 

So, to recap, the purpose of reaching out is two-fold. First, we wanted to let you know about the bug in the get_htcondor script relating to AL2/AL2023 and HTCondor v10/v23. Please let me know if you'd like any other logs or have follow-up questions.

 

Second, I'd like to ask if you have a recommendation for the most robust method of installing HTCondor. We liked using the get_htcondor script because we didn't have to maintain it and it performed some of the configuration steps at installation time, but it does seem like it might be better practice to download the RPM file from a hardcoded URL and then force ourselves to perform all configuration from the static config files at /etc/condor.d/config. Let me know if you have any thoughts! I appreciate the help!

 

 

Brendan Wakefield (he/his)

USGS – Cloud Hosting Solutions

DevOps Team

bwakefield@xxxxxxxx


-- 
Tim Theisen (he, him, his)
Release Manager
HTCondor & Open Science Grid
Center for High Throughput Computing
Department of Computer Sciences
University of Wisconsin - Madison
4261 Computer Sciences and Statistics
1210 W Dayton St
Madison, WI 53706-1685
+1 608 265 5736