
Re: [HTCondor-users] Help on running HTCondor as root



Ciao, continuing with the debugging.
The first sign of trouble comes at startup, when condor tries to run some benchmarks.

In the startd log I get:

10/21/15 12:11:46 (pid:175) BenchMgr:StartBenchmarks()
10/21/15 12:11:47 (pid:175) Create_Process(/usr/libexec/condor/condor_mips): child failed because PRIV_USER_FINAL process was still root before exec()
10/21/15 12:11:47 (pid:175) CronJob: Error running job 'mips'
10/21/15 12:11:47 (pid:175) State change: benchmarks completed
10/21/15 12:11:47 (pid:175) slot1: Changing activity: Benchmarking -> Idle
10/21/15 12:11:48 (pid:175) Create_Process(/usr/libexec/condor/condor_kflops): child failed because PRIV_USER_FINAL process was still root before exec()
10/21/15 12:11:48 (pid:175) CronJob: Error running job 'kflops'
10/21/15 12:11:48 (pid:175) State change: benchmarks completed

Can you point me to the user switch that condor attempts in a case like this? Which user is it trying to change to?

I was also trying to implement the change you suggested, by copying /bin/su somewhere else.

Indeed:

[root@12ef368dde51 root]# ls -l /bin/su
-rwsr-xr-x 1 root root 34904 Oct 14  2014 /bin/su

is setuid

cp /bin/su /root/su

bash-4.1# ls -l /root/su
-rwxr-xr-x 1 root root 34904 Oct 21 12:19 /root/su
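
The same check can be done programmatically. This is only a minimal sketch in Python: the helper names mode_has_setuid and file_has_setuid are made up for illustration, and the two paths are simply the ones from the listings above.

```python
import os
import stat


def mode_has_setuid(mode):
    """Return True if the setuid bit (the 's' in -rwsr-xr-x) is set in a mode value."""
    return bool(mode & stat.S_ISUID)


def file_has_setuid(path):
    """Return True if the file at 'path' has the setuid bit set."""
    return mode_has_setuid(os.stat(path).st_mode)


if __name__ == "__main__":
    # Paths taken from the ls -l output in this message.
    for path in ("/bin/su", "/root/su"):
        try:
            print(path, "setuid" if file_has_setuid(path) else "not setuid")
        except OSError as err:
            print(path, "could not be checked:", err)
```

A plain cp does not preserve the setuid bit, which is why the copy shows -rwxr-xr-x.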

The copy has no setuid bit, as intended, but still:

bash-4.1# id glidein_pilot
uid=313(glidein_pilot) gid=313(glidein_pilot) groups=313(glidein_pilot)
bash-4.1# ./su glidein_pilot
2015/10/21 12:22:23.36 parrot_run[1] <child:120> notice: warning: system call 250 (keyctl) not supported for program /root/./su
[root@12ef368dde51 root]# id
uid=0(root) gid=0(root) groups=0(root),313(glidein_pilot)

It does not seem to do the job: after su the uid is still 0, and only the glidein_pilot group has been added.

tom




On Mon, Oct 19, 2015 at 4:26 PM, Greg Thain <gthain@xxxxxxxxxxx> wrote:
On 10/19/2015 08:50 AM, Tommaso Boccali wrote:
(sorry if it is a double posting, I have multiple email addresses and often I mistake them ...)
ciao, thanks a lot.
I need some more time to identify the problem: for the moment it is not even clear which command is problematic.
I would bet on 'su': condor runs as root, receives a payload, and wants to run it. To do so it probably wants to change user, using either 'su' or 'sudo'.
The fact is that on the client side I do not see the error very clearly, so I need the collector operators' help, and that is slowing things down quite a bit.


To be clear, HTCondor _really_ does not want to start user jobs as root, because such jobs could compromise the machine. It double-checks that the uid of the job is not zero at the last minute before exec'ing it, just to be sure that no bugs have snuck in. This particular check is what we are hitting here -- HTCondor thinks it is root, changes uid to the job in question (which fails, leaving us at root), and right before we would run the job, we notice that the job would be run as root, so we throw our hands up.
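
The pattern described above can be sketched roughly as follows. This is not HTCondor's actual code, only a Python illustration of the idea: the names drop_to_user and StillRootError are hypothetical, the system calls are passed in as parameters so the logic can be exercised without being root, and supplementary groups are left out for brevity.

```python
import os


class StillRootError(RuntimeError):
    """Raised when the process is still root right before it would exec the job."""


def drop_to_user(uid, gid, setgid=os.setgid, setuid=os.setuid, geteuid=os.geteuid):
    """Try to switch to (uid, gid), then verify the switch before running anything.

    The final check does not trust the return status of the switch: in a
    sandboxed environment the calls may appear to succeed and change nothing.
    """
    try:
        setgid(gid)  # the group must be changed first, while still privileged
        setuid(uid)
    except OSError:
        # Deliberately not fatal here: the check below is the last line of defence.
        pass

    effective = geteuid()
    if effective == 0:
        # Refuse to go on; the caller must not exec the job as root.
        raise StillRootError("process was still root before exec()")
    return effective
```

A caller would invoke drop_to_user(313, 313) in the child process just before exec and give up on the job if StillRootError is raised, which corresponds to the "child failed because PRIV_USER_FINAL process was still root before exec()" lines in the log.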

-greg
