
Re: [HTCondor-users] 10.8.0 / 10.7.0 (Follow up, upgrading running nodes)



The condor_master will notice that new binaries have been installed and automatically restart the daemons.

For the Central Manager: It contains the collector and the negotiator. The collector maintains an in-memory database of the HTCondor pool. After a restart, that database is rebuilt as the nodes in the HTCondor pool periodically report to the collector. After several minutes, all the nodes will have reported back in and you are back to where you were before the restart. In the meantime, the Access Points (submit nodes) can still send jobs to slots that they have already claimed, until the claim lifetime expires. So, when jobs complete, similar jobs can continue to start. However, no new matches can be made until the Central Manager comes back online.
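To watch the collector repopulate after the Central Manager restarts, something like the following could be used. This is only a rough sketch: the function names count_slots and wait_for_slots, the polling interval, and the retry count are made up for illustration, and it assumes condor_status is on the PATH and can reach the collector.

```shell
# Count the slot ads currently known to the collector.
count_slots() {
    condor_status -af Name 2>/dev/null | wc -l | tr -d ' '
}

# Poll until at least $1 slots are visible.
# $2 = seconds between checks (default 30), $3 = maximum checks (default 20).
wait_for_slots() {
    expected=$1
    interval=${2:-30}
    tries=${3:-20}
    while [ "$tries" -gt 0 ]; do
        seen=$(count_slots)
        if [ "$seen" -ge "$expected" ]; then
            echo "collector sees $seen slots"
            return 0
        fi
        tries=$((tries - 1))
        sleep "$interval"
    done
    echo "still below $expected slots" >&2
    return 1
}

# Example (needs a live pool): wait_for_slots 200 30 20
```

The expected slot count is whatever the pool normally shows before the upgrade; recording it beforehand gives the loop something to compare against.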

For the Access Point (your submit node): When the schedd restarts, the new schedd will read the on-disk job queue and begin reconnecting to the jobs that were running when the old schedd shut down. After a while, the condor_schedd emails out a restart report that details how many running jobs it was able to reconnect to.
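Independently of the restart report, the set of running jobs can be compared before and after the upgrade. A minimal sketch, assuming condor_q is run on the Access Point by a user allowed to see all jobs; the function names snapshot_running and lost_jobs and the file names are hypothetical.

```shell
# List running jobs as "ClusterId.ProcId", one per line, sorted.
# JobStatus == 2 means the job is running.
snapshot_running() {
    condor_q -allusers -constraint 'JobStatus == 2' -af ClusterId ProcId |
        awk '{ print $1 "." $2 }' | sort
}

# Print jobs that are in the "before" snapshot ($1) but not in the
# "after" snapshot ($2). Both files must be sorted, as snapshot_running does.
lost_jobs() {
    comm -23 "$1" "$2"
}

# Example (needs a live schedd):
#   snapshot_running > before.txt    # before the upgrade
#   snapshot_running > after.txt     # after the schedd has restarted
#   lost_jobs before.txt after.txt
```

Jobs that simply finished between the two snapshots also show up in the difference, so the list is a starting point for checking, not proof of a failed reconnect.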

In either case, the impact on running jobs should be minimal.

...Tim

On 9/18/23 11:49, Weatherby,Gerard wrote:

Thanks.

What's the impact on the cluster if components are upgraded? Specifically in our case we have:


deb [arch=amd64] https://research.cs.wisc.edu/htcondor/repo/ubuntu/10.x focal main

deb-src https://research.cs.wisc.edu/htcondor/repo/ubuntu/10.x focal main

 

in /etc/apt/sources.list.d/htcondor10.list

 

and we'd do:

apt-get update && apt-get install htcondor


We would be looking to update:

our central manager

our dedicated submit node




 

From: Tim Theisen <tim@xxxxxxxxxxx>
Date: Friday, September 15, 2023 at 10:54 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>, Weatherby,Gerard <gweatherby@xxxxxxxx>
Subject: Re: [HTCondor-users] 10.8.0 / 10.7.0


Yes, you can mix 10.7.0 and 10.8.0 in a pool. We ran the configuration you described in our production pool during testing.

In fact, we also test the latest 10.0 release against the latest 10.x release. (HTCondor 10.0.8 interoperates with HTCondor 10.8.0.)

We strive to make HTCondor versions work well together. In particular, the latest LTS release must be able to interoperate with the previous LTS.
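To see which versions are actually present in a mixed pool, the CondorVersion attribute that the daemons advertise can be tallied. A rough sketch, assuming the attribute carries a string of the form "$CondorVersion: X.Y.Z ... $"; the function name tally_versions is made up for illustration.

```shell
# Read lines containing a "$CondorVersion: X.Y.Z ..." string on stdin
# and print each version with the number of lines that reported it.
tally_versions() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "$CondorVersion:") { count[$(i + 1)]++; break }
        }
    }
    END { for (v in count) print v, count[v] }' | sort
}

# Examples (need a live pool):
#   condor_status -af Name CondorVersion | tally_versions           # execution points
#   condor_status -schedd -af Name CondorVersion | tally_versions   # access points
```

Each output line is a version followed by a count, so a pool that is partway through an upgrade shows one line per version still in use.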

...Tim

 

On 9/15/23 07:02, Weatherby,Gerard wrote:

Can 10.8.0 nodes be mixed with 10.7.0 components? e.g. Can 10.7.0 execution points be mixed with a 10.8.0 central manager and access point?

 



_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
 
The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/
-- 
Tim Theisen (he, him, his)
Release Manager
HTCondor & Open Science Grid
Center for High Throughput Computing
Department of Computer Sciences
University of Wisconsin - Madison
4261 Computer Sciences and Statistics
1210 W Dayton St
Madison, WI 53706-1685
+1 608 265 5736