Re: [Condor-users] condor_rm not killing subprocesses
- Date: Fri, 03 Jun 2005 13:52:55 -0400
- From: Jacob Joseph <jmjoseph@xxxxxxxxxxxxxx>
- Subject: Re: [Condor-users] condor_rm not killing subprocesses
Thanks for the reply. I'm not sure it solves my problem, though. Does
Condor send SIGTERM only to the parent bash process it spawned? If
so, I can reproduce the behavior outside of Condor by simply sending
SIGTERM to the bash script. Bash does not forward this signal to
processes started from within a loop. I believe the correct terminology
is that it is no longer the controlling shell. The end result is that
the signal from Condor never reaches the subprocess, which continues
to run.

What does work is to send the signal to every process in the same
process group. (kill does this when given a negative <pgid> argument.)
Is there a way to have Condor do this as well? Can Condor be modified?
Can Condor run a script of my own to accomplish this?
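
To make the difference concrete, here is a minimal bash sketch of the two
cases. It is my own stand-in, not anything Condor provides: the function
names, the MARKER variable and the inline loop are all hypothetical, and I
am assuming bash with job control available (set -m), so that the
background job gets its own process group whose ID equals the job's PID.

```shell
#!/bin/bash
# Sketch only: every name here is a hypothetical stand-in, nothing is Condor's.
set -m   # job control: each background job is placed in its own process group

# Start a stand-in for a user's wrapper script: a loop whose body runs a
# child subshell. The child creates the file named by $1 if it is still
# alive after 2 seconds.
start_wrapper() {
    MARKER=$1 bash -c 'for i in 1 2; do ( sleep 2; touch "$MARKER" ); done' &
    wrapper=$!   # under set -m this PID is also the job's process group ID
    sleep 1      # crude: give the loop time to start its first child
}

# The case described above: signal only the wrapper's PID.
kill_wrapper_only() {
    kill -TERM "$wrapper"
    wait "$wrapper" 2>/dev/null || true   # reap the wrapper, ignore its exit status
}

# The alternative: a negative argument addresses the whole process group.
kill_wrapper_group() {
    kill -TERM -- "-$wrapper"
    wait "$wrapper" 2>/dev/null || true   # reap the wrapper, ignore its exit status
}
```

Calling start_wrapper with a scratch file name, then kill_wrapper_only, and
checking a couple of seconds later, I would expect the file to exist,
because the child outlived the wrapper. Doing the same with
kill_wrapper_group, I would expect the file never to appear.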
Mark Silberstein wrote:
> It seems that your Condor setup doesn't give the program time to
> finish nicely when Condor is evicting it - look at the KILL expression.
> Usually Condor first tries to kill with SIGTERM, and then, when the KILL
> expression is true, it will kill with -9. It seems that bash doesn't
> have a chance to clean up all its processes, which it does when you kill
> with Ctrl-C.
> You may also want to specify kill_sig=SIGQUIT, which will cause Condor
> to kill it with SIGQUIT first.
> On Fri, 2005-06-03 at 01:18 -0400, Jacob Joseph wrote:
>>Hi. I have a number of users who have taken to wrapping their jobs
>>within shell scripts. Often, they'll use a for or while loop to execute
>>a single command with various permutations. When such a job is removed
>>with condor_rm, the main script is killed, but subprocesses spawned from
>>inside a loop will not be killed and will continue to run on the compute
>>machine. This naturally interferes with jobs which are later assigned
>>to that machine.
>>Does anyone know of a way to force bash subprocesses to be killed along
>>with the parent upon removal with condor_rm? (This behavior is not
>>unique to condor_rm. A kill to the parent also leaves the subprocess
>>running.)