
Re: [HTCondor-users] Kerberos AS-REPs for Daemon communication not cached



Dear HTCondor experts,

reading through the source of "kinit" itself:
https://github.com/krb5/krb5/blob/master/src/clients/kinit/kinit.c
I see two main differences from the HTCondor code:
- kinit always calls "krb5_cc_resolve" first to check whether a credential cache exists, and if it does, it calls
  "krb5_get_init_creds_opt_set_in_ccache" so that the initial credential request uses it. 
- kinit calls "krb5_get_init_creds_opt_set_out_ccache" so that the fetched credential is stored in the cache. 

Maybe that is already sufficient? 
I am unsure about the parameters and implications, but maybe the HTCondor authentication expert can use this information. 

For reference, the manual steps would be:
------------------------------------------
$ kinit -k host/condor-cm1.domain@REALM
$ kvno host/schedd1.domain@REALM
$ kvno host/condor-cm1.domain@REALM
------------------------------------------

This gives the desired behaviour (i.e. with kinit and kvno, credentials go to the cache by default):
------------------------------------------
$ klist -Af
Ticket cache: KEYRING:persistent:0:0
Default principal: host/condor-cm1.domain@REALM

Valid starting     Expires            Service principal
05/23/19 02:54:29  05/24/19 02:52:58  host/condor-cm1.domain@REALM
        renew until 05/30/19 02:52:58, Flags: FRT
05/23/19 02:53:01  05/24/19 02:52:58  host/schedd1.domain@REALM
        renew until 05/30/19 02:52:58, Flags: FRT
05/23/19 02:52:58  05/24/19 02:52:58  krbtgt/REALM@REALM
        renew until 05/30/19 02:52:58, Flags: FRI
------------------------------------------

Hope this helps! 

Cheers,
	Oliver

Am 23.05.19 um 02:06 schrieb Oliver Freyermuth:
> Dear HTCondor experts,
> 
> we've observed a heavy load of AS-REQs (Kerberos Authentication Service Requests), at rates of up to several hundred requests per second,
> issued by the central manager node (running negotiator and collector) when a lot of jobs are started and daemons (using Kerberos auth) need to talk to each other. 
> 
> I can also reproduce that more easily by running "condor_q -all -global" as "root" user who does not have Kerberos credentials on our condor-cm (central manager),
> but can access the host principal (and hence use the service credentials to authenticate). A snippet from the debug logs running condor_q confirms my observation:
> 
> 05/23/19 01:48:15 (fd:4) (pid:2411) (D_SECURITY) KERBEROS: Server principal is host/schedd1.domain@REALM
> 05/23/19 01:48:15 (fd:4) (pid:2411) (D_SECURITY) init_daemon: client principal is 'host/condor-cm1.domain@REALM'
> 05/23/19 01:48:15 (fd:4) (pid:2411) (D_SECURITY) init_daemon: Using default keytab FILE:/etc/krb5.keytab
> 05/23/19 01:48:15 (fd:4) (pid:2411) (D_SECURITY) init_daemon: Trying to get tgt credential for service host/schedd1@REALM
> 05/23/19 01:48:15 (fd:4) (pid:2411) (D_PRIV) PRIV_UNKNOWN --> PRIV_ROOT at /slots/10/dir_2560730/userdir/.tmpV7H12D/BUILD/condor-8.8.2/src/condor_io/condor_auth_kerberos.cpp:632
> 05/23/19 01:48:15 (fd:4) (pid:2411) (D_PRIV) PRIV_ROOT --> PRIV_UNKNOWN at /slots/10/dir_2560730/userdir/.tmpV7H12D/BUILD/condor-8.8.2/src/condor_io/condor_auth_kerberos.cpp:634
> 05/23/19 01:48:15 (fd:4) (pid:2411) (D_SECURITY) init_daemon: gic_kt creds_->client is 'host/condor-cm1.domain@REALM'
> 05/23/19 01:48:15 (fd:4) (pid:2411) (D_SECURITY) init_daemon: gic_kt creds_->server is 'host/schedd1.domain@REALM'
> 05/23/19 01:48:15 (fd:4) (pid:2411) (D_SECURITY) Success..........................
> 
> It seems that in daemon authentication, a fresh credential is fetched for every single daemon-to-daemon interaction. We noticed this because the KDC of our computing centre was DoSed by these requests
> and the service failed (twice so far). 
> Fetching a credential means, in "Kerberos speak", issuing an AS-REQ and having the KDC generate an AS-REP. This is computationally quite expensive on the KDC end. 
> 
> Our computing centre is trying to improve the situation on their end to withstand this load better, but it is still best practice in Kerberos to cache AS-REPs. 
> 
> Could caching be added? 
> Sadly, I do not have a straightforward suggestion as to what the implementation is missing to achieve that. For user credentials, the Kerberos library takes care of caching automatically
> (by using credential caches in files or the persistent kernel keyring), but that does not seem to happen for host / service credentials with HTCondor. Maybe HTCondor purges them after use? 
> But I did not find that explicitly in the code. 
> However, issuing:
> kinit -k host/condor-cm1.domain@REALM
> successfully adds a TGT to the credential cache (in our case, the persistent kernel keyring), as I would expect. But that does not happen with HTCondor. 
> 
> Cheers,
> 	Oliver
> 


-- 
Oliver Freyermuth
Universität Bonn
Physikalisches Institut, Raum 1.047
Nußallee 12
53115 Bonn
--
Tel.: +49 228 73 2367
Fax:  +49 228 73 7869
--
