[HTCondor-users] startd doesn't start


I'm in a bit of a pickle and can't understand what I'm doing wrong. I have two small testbeds which should have the same configuration, yet one works and the other doesn't. Both are configured with Puppet.

The one that doesn't work runs condor-8.6.1; the one that works runs condor-8.4.11.

On both testbeds the daemons are started by root, the UID domain is set to the same value on the head node and the pool node (as a matter of fact, startd doesn't start on the head node either), and they both have the same pool_password.

There are some differences, though. For example, on 8.6.1 condor_shared_p starts automatically while on 8.4.11 it doesn't. The pool_password files are also created differently, which is why I stuck with the one that worked on at least one testbed.

I can see startd starting for a few seconds and then dying or, according to the logs, getting killed.

In the StartLog file I have this error:

03/17/17 08:20:35 ERROR: Attempt to initialize user_priv with root privileges rejected
03/17/17 08:20:35 ERROR "Programmer Error: attempted switch to user privilege, but user ids are not initialized" at line 1500 in file

In the MasterLog, meanwhile, I have an endless series of these messages:

03/17/17 03:20:33 restarting /usr/sbin/condor_startd in 3600 seconds
03/17/17 04:20:33 Started DaemonCore process "/usr/sbin/condor_startd", pid and pgroup = 2717119
03/17/17 04:20:34 DefaultReaper unexpectedly called on pid 2717119, status 1024.
03/17/17 04:20:34 The STARTD (pid 2717119) exited with status 4
03/17/17 04:20:34 restarting /usr/sbin/condor_startd in 3600 seconds
03/17/17 05:20:34 Started DaemonCore process "/usr/sbin/condor_startd", pid and pgroup = 2723991
03/17/17 05:20:35 DefaultReaper unexpectedly called on pid 2723991, status 1024.
03/17/17 05:20:35 The STARTD (pid 2723991) exited with status 4
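
A side note on reading the MasterLog lines above: the "status 1024" reported by DefaultReaper and the "exited with status 4" line appear to describe the same exit, assuming the first number is a raw POSIX wait status with the exit code stored in the high byte. Here is a minimal Python sketch of that decoding; decode_exit_status is just a hypothetical helper name, not anything from HTCondor:

```python
import os


def decode_exit_status(raw_status):
    """Return the exit code held in a raw POSIX wait status.

    Returns None if the status says the process did not exit normally
    (for example, if it was terminated by a signal).
    """
    if os.WIFEXITED(raw_status):
        return os.WEXITSTATUS(raw_status)
    return None


# 1024 is the raw status value from the MasterLog lines above.
print(decode_exit_status(1024))  # prints 4
```

If that reading is right, the startd is exiting on its own with code 4 rather than being killed by a signal.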

The only references to these errors that I can find are either pretty old or not applicable.

Thanks for any help.


