Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab


Re: [HTCondor-users] condor_rm & the docker universe

Date: Thu, 30 Jul 2015 14:45:16 -0500
From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] condor_rm & the docker universe

On 7/30/2015 2:31 PM, Brian Bockelman wrote:

On Jul 30, 2015, at 11:40 AM, Dimitri Maziuk <dmaziuk@xxxxxxxxxxxxx> wrote:

On 07/30/2015 10:01 AM, andrew.lahiff@xxxxxxxxxx wrote:

Hi Greg,

Ok, I didn't realize it worked like this - I had assumed HTCondor
would do something like "docker stop", rather than send a signal to the
actual executable running inside the container. Isn't this rather
unsafe? It makes it very easy for people to run jobs which escape
HTCondor's control - according to HTCondor the job has been killed but
the Docker container continues running for as long as it wants.

Greg can correct me if I am wrong, but I believe the signal sending is only to give the job a chance to "gracefully" shut down (vacate). After HTCondor sends the signals, it sets a timer to follow up with a docker stop. Thus nothing is allowed to continue running forever. See the manual for MachineMaxVacateTime and JobMaxVacateTime - I think the default on these is 10 minutes. So to achieve today what you stated above, I think you could submit your docker universe job with something like

  job_max_vacate_time = 2

and then HTCondor should do a docker-stop two seconds after sending the signal if the instance is still lingering. I think Greg is thinking about changing the default JobMaxVacateTime to be much smaller for docker universe than the default of 10 minutes...
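
For illustration only, a complete docker universe submit file using that setting might look roughly like the sketch below. The image name, executable, arguments, and file names are hypothetical placeholders rather than anything taken from this thread; only the job_max_vacate_time line comes from the suggestion above.

  # Minimal sketch of a docker universe job (placeholder values throughout)
  universe                = docker
  docker_image            = debian
  # Assumes /bin/sleep exists inside the chosen image
  executable              = /bin/sleep
  arguments               = 600
  should_transfer_files   = YES
  when_to_transfer_output = ON_EXIT
  output                  = job.out
  error                   = job.err
  log                     = job.log
  # Seconds allowed for a graceful shutdown before the follow-up stop
  job_max_vacate_time     = 2
  queue

With a file like this, removing the job with condor_rm should first deliver the signal and then, per the description above, follow up with the stop about two seconds later if the container is still running.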


regards
Todd

