[HTCondor-users] condor_rm & the docker universe
- Date: Tue, 28 Jul 2015 11:37:07 +0000
- From: andrew.lahiff@xxxxxxxxxx
- Subject: [HTCondor-users] condor_rm & the docker universe
Has anyone been able to successfully kill a docker universe job using "condor_rm"? When I try this (with 8.3.6 and also 8.3.7) the job just stays in the X state:
[root@vm168 condor]# condor_q
-- Schedd: vm168.nubes.stfc.ac.uk : <188.8.131.52:38993>
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
115.0 alahiff 7/28 11:42 0+00:00:00 X 0 0.0 sleep 10000
1 jobs; 0 completed, 1 removed, 0 idle, 0 running, 0 held, 0 suspended
while the container keeps on running:
[root@vm168 condor]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
261d23866b71 busybox "/bin/sleep 10000" 3 minutes ago Up 3 minutes HTCJob115_0_slot1_1_PID8325
In ShadowLog I see:
07/28/15 11:42:34 (fd:6) (pid:8324) (D_ALWAYS) (115.0) (8324): Requesting graceful removal of job.
but nothing else.
Note that the job eventually disappears from condor_q about 10 minutes later (i.e. condor thinks that the job has finished being removed), but the container itself continues running (!)
I'm using Docker 1.7.1 on SL7.
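In case it helps anyone trying to reproduce this, here is a rough sketch of how a leftover container can be matched back to its job. It assumes the container name always follows the pattern seen in the docker ps output above (HTCJob<cluster>_<proc>_<slot>_PID<pid>), which I have only inferred from that one example; the function name htc_job_id is made up for illustration.

```shell
# Hypothetical helper: derive the HTCondor job ID (cluster.proc) from a
# container name. Assumes the naming pattern observed above:
#   HTCJob<cluster>_<proc>_<slot>_PID<pid>
htc_job_id() {
    name=$1
    # Reject names that do not look like HTCondor-created containers.
    case "$name" in
        HTCJob[0-9]*_[0-9]*_*) ;;
        *) return 1 ;;
    esac
    rest=${name#HTCJob}      # e.g. 115_0_slot1_1_PID8325
    cluster=${rest%%_*}      # text before the first underscore
    rest=${rest#*_}          # drop the cluster and its underscore
    proc=${rest%%_*}         # text before the next underscore
    printf '%s.%s\n' "$cluster" "$proc"
}
```

Feeding it the NAMES column from docker ps and then checking each resulting ID against condor_q should show which containers have outlived their jobs.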