Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab


Re: [HTCondor-users] condor_rm & the docker universe

Date: Wed, 29 Jul 2015 11:30:45 -0500
From: Greg Thain <gthain@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] condor_rm & the docker universe


Andrew:

I think what's going on here is that Docker uses Linux pid namespaces, and your job runs with pid 1 inside the namespace. The Linux kernel has a (mis)feature wherein it does not deliver signals to pid 1 if there is no signal handler for that signal installed (for handle-able signals).

Condor, by default, sends SIGTERM on remove (and on preemption and eviction), in order for the job to be able to clean up gracefully. To be sure, if the job hasn't exited after a longer timeout, condor will send SIGKILL, which can't be caught, and which the kernel will deign to correctly deliver. During this interval, the job will be in the X state.
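
The SIGTERM-then-SIGKILL escalation described above can be sketched in a few lines of Python. This is only an illustration of the general pattern, not HTCondor's actual code: the names stop_gracefully, describe and spawn_stubborn_child are hypothetical, and the grace period is an arbitrary parameter.

```python
import signal
import subprocess
import sys


def stop_gracefully(proc, grace_seconds):
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL."""
    proc.terminate()  # SIGTERM on POSIX: lets the child clean up
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL: cannot be caught or ignored
        proc.wait()
    return proc.returncode


def describe(returncode):
    """Translate a Popen return code into a readable outcome."""
    if returncode < 0:
        # On POSIX a negative return code is the negated signal number.
        return "terminated by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


def spawn_stubborn_child():
    """Start a child that ignores SIGTERM (stand-in for an unresponsive job)."""
    code = (
        "import signal, time;"
        "signal.signal(signal.SIGTERM, signal.SIG_IGN);"
        "print('ready', flush=True);"
        "time.sleep(60)"
    )
    proc = subprocess.Popen([sys.executable, "-c", code], stdout=subprocess.PIPE)
    proc.stdout.readline()  # wait until SIGTERM is being ignored
    proc.stdout.close()
    return proc
```

A child that ignores SIGTERM survives the first signal and is only stopped once the grace period expires and SIGKILL is sent, which corresponds to the interval in which the job sits in the X state.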

I believe the job will exit promptly if it catches SIGTERM/SIGQUIT. Perhaps the easiest way to do this is to run it under a shell.
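
If the job itself is a script, another option is to have it catch the signals directly. A minimal sketch, assuming a Python job: the names handle_term, install_handlers and run_job are hypothetical, and the loop is a stand-in for real work. Once a handler is installed, the kernel will deliver SIGTERM even when the process is pid 1 in its namespace.

```python
import signal
import sys
import time


def handle_term(signum, frame):
    # Hypothetical cleanup point: flush output, remove scratch files, etc.
    # Exit with the conventional 128 + signal number status.
    sys.exit(128 + signum)


def install_handlers():
    """Install handlers so SIGTERM and SIGQUIT are delivered and acted on."""
    for sig in (signal.SIGTERM, signal.SIGQUIT):
        signal.signal(sig, handle_term)


def run_job(work_steps=3600):
    """Stand-in for the real job: install handlers first, then do the work."""
    install_handlers()
    for _ in range(work_steps):
        time.sleep(1)  # placeholder for one unit of real work
```

Calling run_job() from the script's entry point means a condor_rm should end the job at the first SIGTERM rather than after the SIGKILL timeout.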


-Greg
