[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [HTCondor-users] condor_ssh_to_job broken with 8.8 on CentOS 7



Am 27.02.19 um 10:14 schrieb Steffen Grunewald:
On Wed, 2019-02-27 at 09:57:12 +0100, Oliver Freyermuth wrote:
Good morning,

Am 27.02.19 um 09:43 schrieb Steffen Grunewald:
- The argument "-a" to nsenter not being present on CentOS 7

also not in Debian Stretch (and now that I check it with Jessie,
it hasn't been there too - why dod nobody notice?)

It's present in Ubuntu 18.04 LTS, maybe all container users (apart from us) have that on their servers,
or have not upgraded yet.

I've still got to find a man page that lists -a as a valid option;
https://www.systutorials.com/docs/linux/man/1-nsenter/ doesn't have it.

This one has it:
http://manpages.ubuntu.com/manpages/bionic/man1/nsenter.1.html
Here's the commit which added that feature to nsenter:
https://github.com/karelzak/util-linux/commit/974cc006f122f36e2187cedb9d3e58dc2d24814c


The problem was introduced with 8.8.0, as I could find by comparing
src/condor_starter.V6.1/os_proc.cpp of 8.6 and 8.8 versions.
Since some of the biggest Condor users refuse to run x.y.0 this wasn't
discover earlier...

The switch to nsenter in general is something I strongly congratulate on (and actually asked for).
Without nsenter, the only way to have interactive jobs at all was to run sshd inside the container -
with all the problems this causes (SELinux,
you need setuid root for Singularity due to permission issues when not all uids are mapped
[ https://bugzilla.mindrot.org/show_bug.cgi?id=2813 ], dirty bind mounts are needed).

So nsenter is definitely the way to go, but I had hoped it was tested better beforehand.
But I'm positivie it will start to work in one of the next versions if we provide sufficient input ;-).


I'm lacking a test case at the moment, but I'm fearing the worst.
Still nobody has attempted to run containers, it seems (or they failed
and failed to report it?)

I think Greg is correct and "-U" only fails with setuid root Singularity
(the code "only" affects Singularity users in any case). Probably "-a" would do the correct thing,
since Singularity with setuid root does not create a new user namespace so there's no need to attach.

According to both the man page referred to above, and the Debian one,
replacing "-a" with "-m -u -i -n -p -U" should do the trick (this
leaves to be discussed whether all of those are needed - but if "-a"
is supposed to work in the given context, then its "full expansion"
should also do).

The man page does not match the code, apparently.
The diff I linked shows that for "-U" a special handling is done, and
it is dropped if the to-be-attached-to-PID is in the very same user namespace.


Here's the diff:

condor-8.8.1# diff -u src/condor_starter.V6.1/os_proc.cpp{.ORIG,}
--- src/condor_starter.V6.1/os_proc.cpp.ORIG    2019-02-19 05:08:49.000000000 +0100
+++ src/condor_starter.V6.1/os_proc.cpp 2019-02-27 10:09:43.513715435 +0100
@@ -1106,7 +1106,13 @@
         }
         ArgList args;
         args.AppendArg("/usr/bin/nsenter");
-       args.AppendArg("-a"); // all namespaces
+       #args.AppendArg("-a"); // all namespaces
+       args.AppendArg("-m");
+       args.AppendArg("-u");
+       args.AppendArg("-i");
+       args.AppendArg("-n");
+       args.AppendArg("-p");
+       args.AppendArg("-U");
         args.AppendArg("-t"); // target pid
         char buf[32];
         sprintf(buf,"%d", pid);

Greg, did I overlook something?

I'll make this a Debian patch, and rebuild, if there's no veto...

It will break (at least) singularity with setuid root, since "-U" does not work there if /proc/<pid>/user
is the same as the host's namespace.


I'll try to find out the best working combination in the next days. Potentially, disabling the automatic killing of the "sleep" job
before nsenter attaches, wrappering nsenter correctly for CentOS 7 and setting some environment variables to have a well-defined PATH and working
bash initialization when attaching could work around all discovered issues.

Since adding the reaper seems to be the second change made to the same source file,
perhaps we can learn about the rationale behind that?

That would indeed be interesting.


At least, now our users are out of the game and there's less stress from people flooding me with mails since their interactive jobs don't start,
and we have a good way to test out such things before rolling them out to the full cluster ;-).

Thanks (to you and them) for being the Guinea pigs...

;-)

I'm now down to seeing:
-------------------------------------
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
-------------------------------------
when attaching to a job, after injecting an nsenter-wrapper (not using -U) and patching
/usr/libexec/condor/condor_ssh_to_job_shell_setup
to not kill the "sleep" process, and doing those exports in the wrapper:
  export SHELL=/bin/bash
  export PATH=$PATH:/bin
I probably need to revisit what exactly is happening to the TTYs now.

Cheers,
	Oliver


- S



Attachment: smime.p7s
Description: S/MIME Cryptographic Signature