Re: [HTCondor-users] condor_rm & the docker universe

> On Jul 30, 2015, at 10:01 AM, andrew.lahiff@xxxxxxxxxx wrote:
> Hi Greg,
> Ok, I didn't realize it worked like this - I had assumed HTCondor would do something like "docker stop", rather than send a signal to the actual executable running inside the container. Isn't this rather unsafe? It makes it very easy for people to run jobs which escape HTCondor's control - according to HTCondor the job has been killed but the Docker container continues running for as long as it wants.

PID 1 is a very special creature.

In addition to having the somewhat-bizarre signal handling mentioned below (PS Greg - why don't we use the same trick here as in vanilla universe to avoid the problem there?), if PID1 dies, the kernel will immediately kill all processes in the namespace.  Hence, there's not really much of a need to worry about leaky processes.

> Just running the job under a shell doesn't seem to work either. I've also been trying scripts which will catch SIGTERM but I haven't managed to get this to have any effect either. Still looking at it...

One way to check Greg's theory would be to strace the lead process and see what signal comes in (and how it responds).
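
For the "script which catches SIGTERM" experiment, here is a minimal Python sketch of a lead process that installs its own handlers. The relevant PID 1 quirk is that the kernel does not deliver a signal to the init process of a PID namespace unless that process has registered a handler for it (SIGKILL and SIGSTOP sent from outside the namespace are the exceptions). Everything named here (StopRequest, run, HANDLED_SIGNALS) is made up for illustration, and the choice of SIGTERM and SIGINT is an assumption: which signal HTCondor actually sends is exactly what the strace would tell you.

```python
import os
import signal
import time

# Assumption: the removal arrives as SIGTERM (or SIGINT). Adjust this tuple
# to whatever strace shows is really being sent to the lead process.
HANDLED_SIGNALS = (signal.SIGTERM, signal.SIGINT)


class StopRequest:
    """Records the first handled signal so the main loop can exit cleanly."""

    def __init__(self):
        self.signum = None

    def install(self):
        # Registering a handler is what makes these signals deliverable
        # when this process is PID 1 inside the container.
        for sig in HANDLED_SIGNALS:
            signal.signal(sig, self._on_signal)

    def _on_signal(self, signum, frame):
        # Only remember the first signal; later ones are ignored.
        if self.signum is None:
            self.signum = signum


def run(stop, step, poll_seconds=0.1):
    """Call step() repeatedly until a handled signal arrives.

    Returns a shell-style exit code (128 + signal number).
    """
    while stop.signum is None:
        step()
        time.sleep(poll_seconds)
    return 128 + stop.signum
```

A job wrapper would create a StopRequest, call install() before starting any work, and finish with sys.exit(run(stop, do_one_unit_of_work)), where do_one_unit_of_work is a placeholder for the real payload. Once PID 1 exits, the kernel tears down the rest of the namespace as described above.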