
Re: [Condor-users] Shared credentials host, shadow user/password issues






Hey all. I'm still really stuck on this issue. Any insight into what might have caused it (i.e. something I can dig into) would be hugely appreciated. At the moment, it's driving me nuts :|.

Steve


________________________________
> From: steveshaw89@xxxxxxxxxxx
> To: condor-users@xxxxxxxxxxx
> Subject: Shared credentials host, shadow user/password issues
> Date: Fri, 4 Jul 2008 01:11:18 +0000
> 
> Hello all,
> 
> This issue is hopefully a cinch for somebody out there ;).  I'm using Condor 7.0.1.  I've got a credentials server running as well as a bunch of submit/execute machines.  Everything was running great a month ago, but Condor hasn't been used since then.  Now, getting back into it, I've run into a problem where no job will run: once a machine claims a job, the submitter sends a RELEASE_CLAIM to the startd of the execute machine.  The RELEASE_CLAIM appears to be sent because of exceptions in the shadow daemon:
> 
> 7/3 17:54:51 (pid:3528) Started shadow for job 2.0 on "", (shadow pid = 2284)
> 7/3 17:54:51 (pid:3528) Shadow pid 2284 for job 2.0 exited with status 4
> 7/3 17:54:51 (pid:3528) ERROR: Shadow exited with job exception code!
> 7/3 17:54:51 (pid:3528) Match for cluster 2 has had 5 shadow exceptions, relinquishing.
> 7/3 17:54:51 (pid:3528) Sent RELEASE_CLAIM to startd at 
> 7/3 17:54:51 (pid:3528) Match record (, 2, 0) deleted
> 7/3 17:54:51 (pid:3528) Got VACATE_SERVICE from 
> 
> And sure enough, the ShadowLog is showing a whole lot of errors regarding the user privileges:
> 
> 7/3 17:54:51 Initializing a VANILLA shadow for job 2.0
> 7/3 17:54:51 (2.0) (2284): condor_read(): recv() returned -1, errno = 10054, assuming failure reading 5 bytes from .
> 7/3 17:54:51 (2.0) (2284): IO: Failed to read packet header
> 7/3 17:54:51 (2.0) (2284): ERROR: Could not locate valid credential for user 'steveshaw89@XYCANADA'
> 7/3 17:54:51 (2.0) (2284): init_user_ids() failed!
> 7/3 17:54:51 (2.0) (2284): condor_read(): recv() returned -1, errno = 10054, assuming failure reading 5 bytes from .
> 7/3 17:54:51 (2.0) (2284): IO: Failed to read packet header
> 7/3 17:54:51 (2.0) (2284): ERROR: Could not locate valid credential for user 'steveshaw89@XYCANADA'
> 7/3 17:54:51 (2.0) (2284): init_user_ids() failed!
> 7/3 17:54:51 (2.0) (2284): ERROR "set_user_priv() failed!" at line 522 in file ..\src\condor_c++_util\uids.C
> 
> What confuses me is that, on the credentials daemon side, the CredLog looks like it successfully received and processed the user/password request:
> 
> 7/3 17:54:51 DaemonCore: Command received via TCP from host , access level DAEMON
> 7/3 17:54:51 DaemonCore: received command 81100 (CREDD_NOP), calling handler (nop_handler)
> 7/3 17:54:51 DaemonCore: in SendAliveToParent()
> 7/3 17:54:51 DaemonCore: Leaving SendAliveToParent() - success
> 7/3 17:54:51 DaemonCore: Command received via TCP from host , access level WRITE
> 7/3 17:54:51 DaemonCore: received command 479 (STORE_CRED), calling handler (store_cred_handler)
> 7/3 17:54:51 Checking for steveshaw89@XYCANADA in credential storage.
> 7/3 17:54:51 Succeeded to log in steveshaw89@XYCANADA
> 7/3 17:54:51 Switching back to old priv state.
> 
> However, the same log also contains the following:
> 
> 7/3 17:54:52 DaemonCore: Command received via TCP from host , access level DAEMON
> 7/3 17:54:52 DaemonCore: received command 81099 (CREDD_GET_PASSWD), calling handler (get_passwd_handler)
> 7/3 17:54:52 WARNING - password fetch attempt without authentication from 
> 7/3 17:54:52 DaemonCore: Command received via TCP from host , access level DAEMON
> 7/3 17:54:52 DaemonCore: received command 81099 (CREDD_GET_PASSWD), calling handler (get_passwd_handler)
> 7/3 17:54:52 WARNING - password fetch attempt without authentication from 
> 7/3 17:54:52 DaemonCore: Command received via TCP from host , access level DAEMON
> 7/3 17:54:52 DaemonCore: received command 81099 (CREDD_GET_PASSWD), calling handler (get_passwd_handler)
> 7/3 17:54:52 WARNING - password fetch attempt without authentication from 
> 7/3 17:54:52 DaemonCore: Command received via TCP from host , access level DAEMON
> 7/3 17:54:52 DaemonCore: received command 81099 (CREDD_GET_PASSWD), calling handler (get_passwd_handler)
> 7/3 17:54:52 WARNING - password fetch attempt without authentication from 
> 7/3 17:54:52 DaemonCore: Command received via TCP from host , access level DAEMON
> 7/3 17:54:52 DaemonCore: received command 81099 (CREDD_GET_PASSWD), calling handler (get_passwd_handler)
> 7/3 17:54:52 WARNING - password fetch attempt without authentication from 
> 7/3 17:54:52 DaemonCore: Command received via TCP from host , access level DAEMON
> 7/3 17:54:52 DaemonCore: received command 81099 (CREDD_GET_PASSWD), calling handler (get_passwd_handler)
> 7/3 17:54:52 WARNING - password fetch attempt without authentication from 
> 7/3 17:54:52 DaemonCore: Command received via TCP from host , access level DAEMON
> 7/3 17:54:52 DaemonCore: received command 81099 (CREDD_GET_PASSWD), calling handler (get_passwd_handler)
> 7/3 17:54:52 WARNING - password fetch attempt without authentication from 
> 7/3 17:54:52 DaemonCore: Command received via TCP from host , access level DAEMON
> 7/3 17:54:52 DaemonCore: received command 81099 (CREDD_GET_PASSWD), calling handler (get_passwd_handler)
> 7/3 17:54:52 WARNING - password fetch attempt without authentication from 
> 7/3 17:54:52 DaemonCore: Command received via TCP from host , access level DAEMON
> 7/3 17:54:52 DaemonCore: received command 81099 (CREDD_GET_PASSWD), calling handler (get_passwd_handler)
> 7/3 17:54:52 WARNING - password fetch attempt without authentication from 
> 7/3 17:54:52 DaemonCore: Command received via TCP from host , access level DAEMON
> 7/3 17:54:52 DaemonCore: received command 81099 (CREDD_GET_PASSWD), calling handler (get_passwd_handler)
> 7/3 17:54:52 WARNING - password fetch attempt without authentication from 
> 
> That's all I can come up with, but it doesn't really look like an error per se.  Does anybody know what might be going wrong in the communication between my submitters and the credentials host?  *Any* suggestions / help / etc. are greatly appreciated :).
> 
> Thanks,
> Steve S.
> 
> 
