
Re: [Condor-users] credentials issue - condor_submit credential verification?



Hi all,

On Wed, 2008-07-30 at 15:36 -0400, Thompson, Cooper wrote:
> The comment that condor_submit will verify that a credential is stored
> before allowing a job to be submitted piqued my interest.  We've
> actually run into a problem where, if our users submit a job while
> their credentials are not stored, the submission succeeds but
> condor_credd repeatedly crashes (producing a core.CREDD.WIN32 file).

Well, this is most certainly a bug we should fix. Could you provide us
with a core.CREDD.WIN32 file, along with a D_FULLDEBUG CredLog from the
time period where the crashes occur? Please mail these to
condor-admin@xxxxxxxxxxxx. Thanks!
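
For anyone gathering the same information, here is a minimal Python sketch
for trimming a CredLog down to the window around the crashes. It assumes the
"M/D HH:MM:SS" timestamp prefix seen in the log excerpts quoted further down
this thread. The function name lines_in_window and the year parameter are
made up for illustration (the log lines carry no year), and none of this is
part of Condor.

```python
import re
from datetime import datetime

# Matches the "M/D HH:MM:SS " prefix seen in the quoted log excerpts.
_STAMP = re.compile(r"^(\d{1,2})/(\d{1,2}) (\d{2}):(\d{2}):(\d{2}) ")


def lines_in_window(lines, start, end, year=2008):
    """Yield log lines whose timestamp lies in [start, end].

    Lines without a timestamp are treated as continuations and follow
    whatever decision was made for the last stamped line.
    The year is an assumption supplied by the caller.
    """
    keep = False
    for line in lines:
        match = _STAMP.match(line)
        if match:
            month, day, hh, mm, ss = map(int, match.groups())
            keep = start <= datetime(year, month, day, hh, mm, ss) <= end
        if keep:
            yield line
```

Passing an open file object as lines works, since it only needs to be an
iterable of strings.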

> Our workaround was to implement a gatekeeper that checked for a stored
> credential - and as a "just in case" measure we set the
> MASTER_CREDD_BACKOFF_CEILING to a fairly low value to make sure our
> credd gets back up quickly in case we let a job through.
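
A gatekeeper along those lines could be sketched roughly as below. This is
not the poster's actual implementation, just an illustration. It assumes that
"condor_store_cred query -u <user>" exits with status 0 when a credential is
stored and non-zero otherwise, which should be verified against the Condor
version in use. The names credential_is_stored and guarded_submit, and the
injectable run parameter, are hypothetical.

```python
import subprocess
import sys


def credential_is_stored(user, run=subprocess.run):
    """Return True if `condor_store_cred query` exits 0 for `user`.

    Assumption: exit status 0 means a credential is stored.
    """
    result = run(
        ["condor_store_cred", "query", "-u", user],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def guarded_submit(user, submit_file, run=subprocess.run):
    """Run condor_submit only when a credential is stored.

    Returns 1 without submitting if the check fails, otherwise the
    exit status of condor_submit.
    """
    if not credential_is_stored(user, run=run):
        print(
            f"No stored credential for {user}; "
            "run condor_store_cred add first.",
            file=sys.stderr,
        )
        return 1
    return run(["condor_submit", submit_file]).returncode
```

A wrapper script would call guarded_submit with the submitting user's
username@DOMAIN and the submit description file. The run parameter is there
only so the logic can be exercised without a Condor installation.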

So does the job eventually make it through or do you end up having to
manually remove it and/or tell the user to run condor_store_cred?

> Does condor_submit really do the check?  If so - any ideas what we are
> doing wrong?

condor_submit does the check provided neither the "-n" nor "-r" options
are given. You don't seem to be doing anything wrong; again, there's at
the very least a bug here causing the CredD to crash.

Greg

> Thanks - and sorry to butt in,
> Coop
> 
> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx
> [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Greg Quinn
> Sent: Wednesday, July 30, 2008 3:24 PM
> To: Condor-Users Mail List
> Subject: Re: [Condor-users] credentials issue .. until putting CREDD on
> that host briefly.
> 
> Hello again,
> 
> On Wed, 2008-07-30 at 19:11 +0000, Steve Shaw wrote:
> > Really appreciate the response Greg, thank-you.  There were such
> > entries and adding encryption worked (i.e. I can now start a new
> > machine in my pool and not have to do that credd trick).  
> 
> Great!
> 
> > Regarding the scenario, I do a condor_store_cred add when I start up
> > a new machine with credd.  Probably something to do with the local
> > cache?
> 
> Yes, that's what I was thinking.
> 
> > ... although I can see the requests actually sent from the submitter
> > to the credd host fail first then become successful after giving it
> > credd briefly.
> 
> Hmm. Password fetch requests definitely shouldn't succeed unless
> encryption is enabled. Perhaps these log entries correspond to other
> types of requests. For example, condor_submit will verify that
> a credential is stored before allowing a job to be submitted. This can
> succeed even if encryption is off.
> 
> Greg
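
To tell refused password fetches apart from other kinds of requests, one
option is to search the CredLog for the warning text quoted further down
this thread. A minimal sketch follows; it assumes the message appears
verbatim in the log, and the function name unencrypted_fetch_attempts is
made up for illustration.

```python
# Warning text as quoted in this thread; assumed to appear verbatim.
WARNING_TEXT = "WARNING - password fetch attempt without encryption from"


def unencrypted_fetch_attempts(lines):
    """Return the log lines that contain the CredD warning text."""
    return [line.rstrip("\n") for line in lines if WARNING_TEXT in line]
```

An empty result would suggest that the entries seen in the log correspond to
some other type of request, under the assumption above.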
> 
> > Thanks again,
> > Steve
> > 
> > 
> > > From: gquinn@xxxxxxxxxxx
> > > To: condor-users@xxxxxxxxxxx
> > > Date: Wed, 30 Jul 2008 10:23:01 -0500
> > > Subject: Re: [Condor-users] credentials issue .. until putting CREDD
> > on that host briefly.
> > > 
> > > On Wed, 2008-07-30 at 04:51 +0000, Steve Shaw wrote:
> > > > Hmm.. the formatting of my email certainly looks pretty messed up.
> > > > Let me try that again :).
> > > > 
> > > > Hey all, 
> > > > 
> > > > I have been having an ongoing weird issue with Condor related to
> > > > credentials, as in my previous post, and am still hoping somebody out
> > > > there has an inkling as to what is happening (using 7.0.4). This is the
> > > > scenario:
> > > 
> > > Hi Steve,
> > > 
> > > So is it enough to just briefly start a CredD on a machine then stop it,
> > > or are you also re-running "condor_store_cred add" as part of this
> > > process?
> > > 
> > > > Machine A has CREDD, SCHEDD and STARTD on it 
> > > > Machine B has SCHEDD and STARTD on it 
> > > > Machine C has SCHEDD and STARTD on it 
> > > > 
> > > > I'm running jobs with run_as_owner (and with CREDD_CACHE_LOCALLY set
> > > > to true as well as STARTER_ALLOW_RUN_AS_OWNER set to true) 
> > > > 
> > > > If I do a condor_store_cred add and add my username@DOMAIN, I can
> > > > successfully query from any machine for that username. But to start,
> > > > if I do a condor_submit from Machine B or Machine C, then I will get a
> > > > failure: 
> > > > 
> > > > ShadowLog (of submitter): 
> > > > 7/29 18:51:48 (1.0) (1816): condor_read(): recv() returned -1, errno = 10054, assuming failure reading 5 bytes from . 
> > > > 7/29 18:51:48 (1.0) (1816): IO: Failed to read packet header
> > > > 7/29 18:51:48 (1.0) (1816): ERROR: Could not locate valid credential for user 'steveshaw89@XYCANADA' 
> > > > 7/29 18:51:48 (1.0) (1816): init_user_ids() failed!
> > > > 
> > > > Schedlog reports something similar...
> > > > 
> > > > CredLog: 
> > > > 7/29 21:04:48 DaemonCore: Command received via TCP from host , access level DAEMON 
> > > > 7/29 21:04:48 DaemonCore: received command 81099 (CREDD_GET_PASSWD), calling handler (get_passwd_handler) 
> > > 
> > > Perhaps the CredD is refusing to give the password to the Shadow and
> > > SchedD because the connections are not encrypted. Are there any log
> > > messages like this in the CredLog:
> > > 
> > > "WARNING - password fetch attempt without encryption from ..."
> > > 
> > > Greg
> > > 
> > > > However, if I then take Machine B and put the CREDD on it, I can
> > > > now successfully communicate between Machine A and Machine B, but
> > > > Machine C will keep the same failures. But if I then move the CREDD to
> > > > Machine C, I can now successfully send jobs from A, B, or C and have them
> > > > received by A, B, or C. Then I can put the CREDD on any machine and
> > > > everything works great. They all communicate with each other a-ok. Note
> > > > that the only settings that I changed on all 3 machines to get them to
> > > > work were the DAEMON_LIST and the CREDD_HOST. Anybody know why this
> > > > behaviour might be occurring? 
> > > > 
> > > > So, now I can add machines to my network by installing, configuring,
> > > > briefly adding the CREDD to them and then relinquishing the CREDD from
> > > > them. There may very well be something that I haven't configured
> > > > properly, as I did not change the authentication from the
> > > > default (which is the old HOSTALLOW, etc. configurations -- maybe I
> > > > need to start using ALLOW?), though from reading all the docs,
> > > > if I don't want/need any authentication, etc., it didn't appear that I
> > > > needed to make any changes in addition to the few changes I made. 
> > > > 
> > > > Any help is appreciated greatly, of course :) 
> > > > Steve 
> > > 
> > > _______________________________________________
> > > Condor-users mailing list
> > > To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> > > subject: Unsubscribe
> > > You can also unsubscribe by visiting
> > > https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> > > 
> > > The archives can be found at: 
> > > https://lists.cs.wisc.edu/archive/condor-users/
> 