Re: [HTCondor-users] condor_interactive & condor_ssh_to_job & /usr/libexec/condor/condor_ssh_to_job_shell_setup & PID namespaces
- Date: Thu, 28 Feb 2019 18:52:15 +0000
- From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] condor_interactive & condor_ssh_to_job & /usr/libexec/condor/condor_ssh_to_job_shell_setup & PID namespaces
On 2/28/2019 3:42 AM, Bert DeKnuydt wrote:
>
> Hello all people Condorese,
>
Hi Bert, thank you for sharing your thoughts! More below inline...
> 1) There is still a thinko in
> /usr/libexec/condor/condor_ssh_to_job_shell_setup if
> I understand things correctly.
>
> *) There's a code snippet meant to kill the dummy sleep, killing
>    whatever is in _CONDOR_JOB_PIDS, when the job is 'Interactive'.
>
> *) However, if I launch an Interactive job, and later, possibly much
>    much later, do a 'condor_ssh_to_job' to that Interactive Job, that
>    code is run again and a process with _CONDOR_JOB_PIDS is killed.
>    That process can be literally anything, as pids can easily have
>    rolled over by then.
>
>    This is obviously unintentional and can pose a risk to other processes.
>    (Luckily it all runs with the user's credentials only).
>
> *) That could be fixed by assuring that the target of the kill is indeed
>    that sleep; or alternatively by disallowing ssh_to_job to an
>    interactive job. (But that is really used here though). Or just leave
>    the sleep to die by itself.
>
You say it is common at your site to ssh_to_job to an interactive
job. I am curious: what is the motivating use case for this (I
could guess, but I'd rather hear your real-world scenario)?
This is indeed a use case we did not anticipate.
Weighing the available options, I am inclined to simply leave the sleep
job to die by itself. I guess the only downsides of this are a)
potential user confusion (if they do a ps they may wonder why they
see some sleep process running), and b) the slot will remain claimed for
a minimum of 3 minutes by default even if the user quits the interactive
session in 5 seconds. I think I can live with both of these downsides.
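For anyone who would rather patch the script locally in the meantime,
the first option Bert lists (making sure the target of the kill really
is the sleep) could be sketched roughly as below. This is only an
illustration, not the code in condor_ssh_to_job_shell_setup:
kill_if_sleep is a hypothetical helper name, and the sketch assumes
_CONDOR_JOB_PIDS holds a whitespace-separated list of PIDs and that a
procps-style ps is available.

```shell
#!/bin/sh
# Sketch only: signal a PID from _CONDOR_JOB_PIDS only if it is still 'sleep'.
# kill_if_sleep is a hypothetical helper, not part of HTCondor.

kill_if_sleep() {
    pid="$1"
    # Reject anything that is not a plain non-empty string of digits.
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    # 'ps -o comm=' prints only the command name, or nothing if the PID is gone.
    comm=$(ps -p "$pid" -o comm= 2>/dev/null | tr -d ' ')
    if [ "$comm" = "sleep" ]; then
        kill "$pid"
    else
        return 1
    fi
}

# Assumption: _CONDOR_JOB_PIDS is a whitespace-separated list of PIDs.
for pid in ${_CONDOR_JOB_PIDS:-}; do
    kill_if_sleep "$pid" || echo "skipping PID $pid: not the placeholder sleep" >&2
done
```

Note this only narrows the window: an unrelated sleep owned by the same
user that happens to have reused the PID would still match, so a
stricter check (parent PID, start time) would be needed to close it
completely.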
> 2) Apart from that, there's another inconsistency: when a startd runs
>    with PID isolation (i.e. USE_PID_NAMESPACES = True), then neither
>    ssh_to_job nor a plain 'Interactive Job' really run under the PID
>    namespace; only the dummy sleep did.
>
> In other words, for practical purposes, you should not allow ssh_to_job nor
> interactive jobs, if you really need PID isolation.
>
Yes, currently that is the case - batch jobs run in a pid namespace,
interactive ssh sessions run in the global namespace. The wisdom behind
this is in this ticket (http://tinyurl.com/y398eugh) for those who
really are interested. Of course, there is isolation in the fact that
each slot runs with the user credentials, or you can even configure each
slot to use its own unique uid.
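If you want to see the namespace difference for yourself on a Linux
execute node, one rough way is to compare the /proc/<pid>/ns/pid links
of two processes. The sketch below is only an illustration: same_pid_ns
is a hypothetical helper, not an HTCondor tool, and it assumes that the
PIDs in _CONDOR_JOB_PIDS are valid as seen from the shell running the
check, which may not hold in every setup.

```shell
#!/bin/sh
# Sketch only: report whether two processes share a PID namespace (Linux).
# same_pid_ns is a hypothetical helper, not an HTCondor tool.

same_pid_ns() {
    # /proc/<pid>/ns/pid is a symlink such as 'pid:[4026531836]';
    # two processes are in the same namespace when the link targets match.
    ns_a=$(readlink "/proc/$1/ns/pid") || return 2
    ns_b=$(readlink "/proc/$2/ns/pid") || return 2
    [ "$ns_a" = "$ns_b" ]
}

# Compare this shell with the first PID listed in _CONDOR_JOB_PIDS, if any.
# Assumption: that PID is meaningful in the namespace this shell runs in.
for job_pid in ${_CONDOR_JOB_PIDS:-}; do
    if same_pid_ns "$$" "$job_pid"; then
        echo "shell and PID $job_pid share a PID namespace"
    else
        echo "shell and PID $job_pid differ (or PID $job_pid is not readable)"
    fi
    break
done
```

The helper returns 0 when the namespaces match, 1 when they differ, and
2 when either link cannot be read (process gone or no permission).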
Thanks and regards,
Todd