Re: [Condor-users] credentials issue .. until putting CREDD on that host briefly.



On Wed, 2008-07-30 at 04:51 +0000, Steve Shaw wrote:
> Hmm.. the formatting of my email certainly looks pretty messed up.
> Let me try that again :).
> 
> Hey all, 
> 
> I have been having an ongoing weird issue with Condor related to
> credentials as in my previous post and am still hoping somebody out
> there has an inkling as to what is happening (using 7.0.4). This is the
> scenario:

Hi Steve,

So is it enough to just briefly start a CredD on a machine then stop it,
or are you also re-running "condor_store_cred add" as part of this
process?

> Machine A has CREDD, SCHEDD and STARTD on it 
> Machine B has SCHEDD and STARTD on it 
> Machine C has SCHEDD and STARTD on it 
> 
> I'm running jobs with run_as_owner (and with CREDD_CACHE_LOCALLY set
> to true as well as STARTER_ALLOW_RUN_AS_OWNER set to true) 
> 
> If I do a condor_store_cred add and add my username@DOMAIN, I can
> successfully query from any machine for that username. But to start,
> if I do a condor_submit from Machine B or Machine C, then I will get a
> failure: 
> 
> ShadowLog (of submitter): 
> 7/29 18:51:48 (1.0) (1816): condor_read(): recv() returned -1, errno =
> 10054, assuming failure reading 5 bytes from . 
> 7/29 18:51:48 (1.0) (1816): IO: Failed to read packet header
> 7/29 18:51:48 (1.0) (1816): ERROR: Could not locate valid credential
> for user 'steveshaw89@XYCANADA' 
> 7/29 18:51:48 (1.0) (1816): init_user_ids() failed!
> 
> Schedlog reports something similar...
> 
> CredLog: 
> 7/29 21:04:48 DaemonCore: Command received via TCP from host , access
> level DAEMON 
> 7/29 21:04:48 DaemonCore: received command 81099 (CREDD_GET_PASSWD),
> calling handler (get_passwd_handler) 

Perhaps the CredD is refusing to give the password to the Shadow and
SchedD because the connections are not encrypted. Are there any log
messages like this in the CredLog:

"WARNING - password fetch attempt without encryption from ..."
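One quick way to check is a small script along these lines. It is only
a sketch: it assumes the warning text appears in the log exactly as
quoted above, and the function names and the path you pass in are
placeholders, not anything Condor provides.

```python
import sys
from pathlib import Path
from typing import Iterable, List

# Substring of the warning quoted above. Nothing else about the log line
# (timestamp format, peer address) is assumed.
WARNING_TEXT = "password fetch attempt without encryption"


def find_unencrypted_fetches(lines: Iterable[str]) -> List[str]:
    """Return every log line that contains the CredD encryption warning."""
    return [line.rstrip("\r\n") for line in lines if WARNING_TEXT in line]


def scan_credlog(path: str) -> List[str]:
    """Scan a CredLog file; the caller supplies the path (hypothetical helper)."""
    with Path(path).open(encoding="utf-8", errors="replace") as handle:
        return find_unencrypted_fetches(handle)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_credlog.py <path-to-CredLog>
    for hit in scan_credlog(sys.argv[1]):
        print(hit)
```

Run it against the CredLog on the machine hosting the CredD (Machine A
in your setup); the log directory should be whatever
"condor_config_val LOG" reports there. No output means no such warning
was logged, which would point away from the encryption theory.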

Greg

> However, if I then take Machine B and put the CREDD on it, then I can
> now successfully communicate between Machine A and Machine B, but
> Machine C will keep the same failures. But if I then move the CREDD to
> Machine C, I can now successfully send jobs from A,B or C and have it
> received by A, B, or C. Then I can put the CREDD on any machine and
> everything works great. They all communicate with each other a-ok. Note
> that the only settings that I changed on all 3 machines to get them to
> work was the DAEMON_LIST and the CREDD_HOST. Anybody know why this
> behaviour might be occurring? 
> 
> So, now I can add machines to my network by installing, configuring,
> briefly adding the CREDD to them and then relinquishing the CREDD from
> it. There may very well be something that I haven't configured
> properly as I did not change the authentication from what is there at
> default (which is the old HOSTALLOW, etc. configurations -- maybe I
> need to start using ALLOW?), though through reading all the docs, if I
> don't want/need any authentication, etc. then it didn't appear that I
> needed to make any changes in addition to the few changes I made. 
> 
> Any help is appreciated greatly, of course :) 
> Steve