Re: [HTCondor-users] Condor and Docker Live-Restore



Hi Greg,

 

Indeed, while the Docker daemon restarts, the Docker CLI is unavailable and gives the standard warning ‘Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?’. However, the cgroups and processes of the containers continue to run. Once the Docker daemon starts again, it re-establishes the ‘docker run’ processes.
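
As a rough illustration of how a supervising process might tell "daemon temporarily away" apart from a real container failure, here is a minimal Python sketch. It is not how HTCondor does it. The function names are hypothetical, and matching on the warning text is an assumption that only holds as long as the Docker CLI keeps that wording.

```python
import socket

# Lower-cased prefix of the warning quoted above. Matching on message text is
# an assumption for illustration, not a stable Docker interface.
DAEMON_DOWN_MARKER = "cannot connect to the docker daemon"


def daemon_unavailable(stderr_text: str) -> bool:
    """Return True if CLI stderr looks like the 'daemon not reachable' warning."""
    return DAEMON_DOWN_MARKER in stderr_text.lower()


def socket_reachable(path: str = "/var/run/docker.sock", timeout: float = 1.0) -> bool:
    """Return True if something accepts connections on the daemon's Unix socket."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return True
    except OSError:
        # Missing socket file, connection refused, timeout, or permission denied.
        return False
    finally:
        sock.close()
```

A caller could poll socket_reachable() after seeing the warning and only treat the job as failed if the daemon comes back and no longer knows about the container.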

 

My use case is just restarting the Docker daemon to apply small incremental patches to Docker, without having to drain the entire node.

 

Many thanks,

 

Tom

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Greg Thain via HTCondor-users <htcondor-users@xxxxxxxxxxx>
Date: Monday, 11 December 2023 at 23:02
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Cc: Greg Thain <gthain@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Condor and Docker Live-Restore

On 12/11/23 03:33, Thomas Birkett - STFC UKRI via HTCondor-users wrote:

Hi all,

 

Apologies for the follow-up. Does anyone have any experience with the aforementioned use case?

 

 

Hi Thomas:

This is very interesting, thanks for pointing it out.  Presumably the 'docker run' process that starts the container exits in this case, when the docker daemon goes away.  It would be very useful for HTCondor to be able to differentiate that the 'docker run' has gone away, and is reconnectable -- do you know if that is the case?  I assume that for your usage you just want to restart the docker daemon, and not any of the rest of HTCondor?

-greg