[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [HTCondor-users] CondorCE token subject mapping not working anymore



Hi Brian,

networking between our CEs and the WLCG IAM seems in principle OK with responses taking <1s [1] But the timeout really seems to be the trigger breaking things - knowing that the log [2] with 4.409412s looks really suspicious ð

I had also tried to generate a client/tockens at another instance [3] but failed so far to find one working and offering compute scopes (the EGI instances seem to be down not returning expected JSONs and the Helmholtz and such do not know about compute scopes).
Maybe there is a OSG instance I could try?

Cheers,
  Thomas

[1]
root@grid-htcondorce1: [~] time curl https://wlcg.cloud.cnaf.infn.it/.well-known/openid-configuration
4{"request_parameter_supported":true,"introspection_endpoint":"https://wlcg.cloud.cnaf.infn.it/introspect","claims_parameter_supported":false,"scopes_supported":["openid","profile","email","offline_access","wlcg","wlcg.groups","storage.read:/","storage.create:/","compute.read","compute.modify","compute.create","compute.cancel","storage.modify:/","eduperson_scoped_affiliation","eduperson_entitlement","eduperson_assurance","storage.stage:/","entitlements"],"issuer":"https://wlcg.cloud.cnaf.infn.it/","userinfo_encryption_enc_values_supported":["XC20P","A256CBC+HS512","A256GCM","A192GCM","A128GCM","A128CBC-HS256","A192CBC-HS384","A256CBC-HS512","A128CBC+HS256"],"id_token_encryption_enc_values_supported":["XC20P","A256CBC+HS512","A256GCM","A192GCM","A128GCM","A128CBC-HS256","A192CBC-HS384","A256CBC-HS512","A128CBC+HS256"],"authorization_endpoint":"https://wlcg.cloud.cnaf.infn.it/authorize","request_object_encryption_enc_values_supported":["XC20P","A256CBC+HS512","A256GCM","A192GCM","A128GCM","A128CBC-HS256","A192CBC-HS384","A256CBC-HS512","A128CBC+HS256"],"device_authorization_endpoint":"https://wlcg.cloud.cnaf.infn.it/devicecode","userinfo_signing_alg_values_supported":["HS256","HS384","HS512","RS256","RS384","RS512","ES256","ES384","ES512","PS256","PS384","PS512"],"claims_supported":["sub","name","preferred_username","given_name","family_name","middle_name","nickname","profile","picture","zoneinfo","locale","updated_at","email","email_verified","organisation_name","groups","wlcg.groups","external_authn"],"op_policy_uri":"https://wlcg.cloud.cnaf.infn.it/about","claim_types_supported":["normal"],"token_endpoint_auth_methods_supported":["client_secret_basic","client_secret_post","client_secret_jwt","private_key_jwt","none"],"token_endpoint":"https://wlcg.cloud.cnaf.infn.it/token","response_types_supported":["code","token"],"request_uri_parameter_supported":false,"userinfo_encryption_alg_values_supported":["RSA-OAEP-512","RSA-OAEP","RSA-OAEP-256","RSA1_5","RSA-OAEP-384"],"grant_types_supported":["authorization_code","implicit","refresh_token","client_credentials","password","urn:ietf:params:oauth:grant-type:token-exchange","urn:ietf:params:oauth:grant-type:device_code"],"revocation_endpoint":"https://wlcg.cloud.cnaf.infn.it/revoke","userinfo_endpoint":"https://wlcg.cloud.cnaf.infn.it/userinfo","op_tos_uri":"https://wlcg.cloud.cnaf.infn.it/about","token_endpoint_auth_signing_alg_values_supported":["HS256","HS384","HS512","RS256","RS384","RS512","ES256","ES384","ES512","PS256","PS384","PS512"],"require_request_uri_registration":false,"code_challenge_methods_supported":["plain","S256"],"id_token_encryption_alg_values_supported":["RSA-OAEP-512","RSA-OAEP","RSA-OAEP-256","RSA1_5","RSA-OAEP-384"],"jwks_uri":"https://wlcg.cloud.cnaf.infn.it/jwk","subject_types_supported":["public","pairwise"],"id_token_signing_alg_values_supported":["HS256","HS384","HS512","RS256","RS384","RS512","ES256","ES384","ES512","PS256","PS384","PS512","none"],"registration_endpoint":"https://wlcg.cloud.cnaf.infn.it/iam/api/client-registration","request_object_signing_alg_values_supported":["HS256","HS384","HS512","RS256","RS384","RS512","ES256","ES384","ES512","PS256","PS384","PS512"],"request_object_encryption_alg_values_supported":["RSA-OAEP-512","RSA-OAEP","RSA-OAEP-256","RSA1_5","RSA-OAEP-384"]}
real	0m0.392s
user	0m0.055s
sys	0m0.068s


[2]
04/18/23 16:19:17.811 (D_COMMAND) Calling Handler <DaemonCommandProtocol::WaitForSocketData> (5)
04/18/23 16:19:17.811 (D_SECURITY) Writing round 1.
04/18/23 16:19:17.811 (D_SECURITY) SSL write has succeeded.
04/18/23 16:19:17.811 (D_SECURITY) SSL Auth: Receive message.
04/18/23 16:19:17.811 (D_SECURITY) Received message (4).
04/18/23 16:19:17.811 (D_SECURITY) Status: c: 4, s: 4
04/18/23 16:19:17.811 (D_SECURITY:2) CRYPTO: simple reset m_ivec(len 8) and m_num
04/18/23 16:19:17.811 (D_SECURITY) Reading SciTokens round 0.
04/18/23 16:19:17.811 (D_SECURITY) SSL Auth: SciToken: continue read/write.
04/18/23 16:19:17.811 (D_SECURITY) SSL Auth: Would block when trying to receive message
04/18/23 16:19:17.811 (D_SECURITY) AUTHENTICATE: auth would still block
04/18/23 16:19:17.811 (D_SECURITY) Will return to DC to continue authentication.. 04/18/23 16:19:17.811 (D_COMMAND) Return from Handler <DaemonCommandProtocol::WaitForSocketData> 0.000085s 04/18/23 16:19:17.811 (D_COMMAND) Calling Handler <DaemonCommandProtocol::WaitForSocketData> (5)
04/18/23 16:19:17.811 (D_SECURITY) Reading SciTokens round 0.
04/18/23 16:19:17.811 (D_SECURITY) SSL Auth: SciToken: continue read/write.
04/18/23 16:19:17.811 (D_SECURITY) SSL Auth: Receive message.
04/18/23 16:19:17.811 (D_SECURITY) Received message (4).
04/18/23 16:19:17.811 (D_SECURITY) SciToken exchange server status: c: 4, s: 2
04/18/23 16:19:17.811 (D_SECURITY) Reading SciTokens round 1.
04/18/23 16:19:17.811 (D_SECURITY:2) Peeked at the sent token; 730 bytes long; SSL status 4.
04/18/23 16:19:17.811 (D_SECURITY) SciToken SSL read is successful.
04/18/23 16:19:22.220 (D_SECURITY) SCITOKENS:2:Failed to verify token and generate ACLs: Timeout was reached 04/18/23 16:19:22.220 (D_SECURITY:2) AUTHENTICATION: map file already loaded. 04/18/23 16:19:22.220 (D_ERROR) Failed to map SCITOKENS authenticated identity '', failing authentication to give another authentication method a go.
04/18/23 16:19:22.220 (D_SECURITY) Send message (3).
04/18/23 16:19:22.220 (D_SECURITY) SciToken exchange server status: c: 4, s: 3 04/18/23 16:19:22.220 (D_SECURITY) SSL Auth: SciToken Authentication failed at token exchange.
04/18/23 16:19:22.220 (D_SECURITY) AUTHENTICATE: do_authenticate is 0.
04/18/23 16:19:22.220 (D_SECURITY) AUTHENTICATE: method 4096 (SCITOKENS) failed. 04/18/23 16:19:22.220 (D_SECURITY) AUTHENTICATE: can still try these methods: SCITOKENS 04/18/23 16:19:22.220 (D_SECURITY) HANDSHAKE: in handshake(my_methods = 'SCITOKENS')
04/18/23 16:19:22.220 (D_SECURITY) AUTHENTICATE: handshake would block
04/18/23 16:19:22.220 (D_SECURITY) Will return to DC to continue authentication.. 04/18/23 16:19:22.220 (D_COMMAND) Return from Handler <DaemonCommandProtocol::WaitForSocketData> 4.409412s 04/18/23 16:19:22.221 (D_ALWAYS:2) Evaluated periodic expressions in 0.000s, scheduling next run in 60s

[3]
[1] https://wlcg.cloud.cnaf.infn.it/
[2] https://iam-test.indigo-datacloud.eu/
[3] https://iam.deep-hybrid-datacloud.eu/
[4] https://iam.extreme-datacloud.eu/
[5] https://iam-demo.cloud.cnaf.infn.it/
[6] https://b2access.eudat.eu:8443/oauth2
[7] https://b2access-integration.fz-juelich.de/oauth2
[8] https://login-dev.helmholtz.de/oauth2
[9] https://login.helmholtz.de/oauth2
[10] https://services.humanbrainproject.eu/oidc/
[11] https://accounts.google.com
[12] https://aai.egi.eu/oidc/
[13] https://aai-dev.egi.eu/oidc/
[14] https://aai-dev.egi.eu/auth/realms/egi
[15] https://aai-demo.egi.eu/auth/realms/egi
[16] https://aai.egi.eu/auth/realms/egi
[17] https://login.elixir-czech.org/oidc/
[18] https://oidc.scc.kit.edu/auth/realms/kit


On 18/04/2023 15.12, Bockelman, Brian wrote:
Ah-ha!  I know this one.  This is the relevant error:

04/18/23 12:24:22.680 (D_SECURITY) SCITOKENS:2:Failed to verify token and generate ACLs: Timeout was reached

The remote endpoint is given 4 seconds to respond to a request for the public key.  Is there potentially some networking issue between you and the remote endpoint?  For example, how long does

curl https://wlcg.cloud.cnaf.infn.it/.well-known/openid-configuration

Take to complete?

Brian

On Apr 18, 2023, at 5:35 AM, Thomas Hartmann <thomas.hartmann@xxxxxxx> wrote:

Hi Jamie,

with the Security logging on debug level 2 I see a bit more. However, it is still not really clear to me, what is the cause. The error is due to the token not being verified and not mapped [1].
But from my opinion, the existing mapping [2] should match on the token subject.

Cheers,
  Thomas

[1]
04/18/23 12:24:18.282 (D_SECURITY) SciToken SSL read is successful.
04/18/23 12:24:22.680 (D_SECURITY) SCITOKENS:2:Failed to verify token and generate ACLs: Timeout was reached
04/18/23 12:24:22.681 (D_SECURITY:2) AUTHENTICATION: map file already loaded.
04/18/23 12:24:22.681 (D_ERROR) Failed to map SCITOKENS authenticated identity '', failing authentication to give another authentication method a go.


[2]
root@grid-htcondorce-dev02: [/etc/condor-ce/config.d] cat /etc/condor-ce/mapfiles.d/11_99_token-mapping_DEBUG.conf
SCITOKENS /^https:\/\/wlcg\.cloud\.cnaf\.infn\.it\/,8ec82f26\-a407\-44d7\-aa32\-19cd985cd2d1$/ desyusr009
SCITOKENS /^https:\/\/wlcg\.cloud\.cnaf\.infn\.it\/,1ec796cb\-250b\-479d\-a9e9\-6509995adab0$/ desyusr007
# SCITOKENS /^https:\/\/wlcg\.cloud\.cnaf\.infn\.it\/,bf47638b-5be1-4cda-a156-c2b9d2d1d352$/ desyusr009
# SCITOKENS /^https:\/\/wlcg\.cloud\.cnaf\.infn\.it\/,bc2de59f-c564-4fef-9614-d89c1819426b$/ desyusr009
SCITOKENS /^https:\/\/wlcg\.cloud\.cnaf\.infn\.it\/,737b9ec0\-fb66\-472d\-9ce3\-e943a677464f$/ desyusr008
# SCITOKENS /^https\:\/\/.*,.*/ desyprd004


On 14/04/2023 22.36, Jaime Frey via HTCondor-users wrote:
Can you add the D_SECURITY logging level on the CE daemons? Many SciTokens-related errors are not recorded otherwise.

   - Jaime

On Apr 14, 2023, at 9:20 AM, Thomas Hartmann <thomas.hartmann@xxxxxxx> wrote:

Hi all,

preparing the migration from CondorCE 5 to GSI-less 6, we noticed that the WLCGToken mapping has been failing for some time.

Cross-checking on our production v5 CEs, we realized, that token mapping has been failing for some time and that authz fall back to GSI, which had been unnoticed so far.
Since token authz had worked in the past, I am currently struggling to identify, what change or config broke the mapping.

Starting with a fresh CondorCE installation from scratch and adding configs & mappings, I have not been able to get the token mapping working again.

It is a CondorCE v6, Condor v10.4 installation on EL7 [1].

Mapping rules are tokens only with a test client mapped to (existing) local users [2], so that tokens like [3] should get mapped onto the local `desyusr007`.

However, trace and write pings always fail due to an allegedly broken mapping [4]. Judging from the SchedLog and AuditLog [5,6] the tokens are received and parsed - but then something(??) is not to the CE's liking :-/

Submitting a job to a friendly site's CE works with the mapping rule as of [2] deployed - so I would rule out an issue with the client/tokens. The other way round, a job from the remote site (running under a token from a client of the other site) fails, so that it is most probable something local with my CE.

Also a very trusting catch all map rule
   SCITOKENS /^https\:\/\/.*,.*/ desyprd004
failed.

Daemon output is already on `ALL_DEBUG = D_FULLDEBUG` but maybe there is a way to increase the audit logging to get an idea, why the matching fails?

SELinux or so seems not involved so far.

Maybe someone has an idea, where I might find the underlying issue? (probably something system related and not directly CondorCE config specific??).

Cheers and thanks for any idea,
   Thomas

[1]
condor-procd-10.4.0-1.el7.x86_64
condor-classads-10.4.0-1.el7.x86_64
python2-condor-10.4.0-1.el7.x86_64
htcondor-ce-6.0.0-1.el7.noarch
condor-stash-plugin-6.10.0-1.x86_64
python3-condor-10.4.0-1.el7.x86_64
htcondor-ce-client-6.0.0-1.el7.noarch
condor-externals-9.0.15-1.el7.x86_64
condor-10.4.0-1.el7.x86_64
condor-blahp-10.4.0-1.el7.x86_64
htcondor-ce-apel-6.0.0-1.el7.noarch
htcondor-release-10.x-1.el7.noarch

[2]
grep include /etc/condor-ce/condor_mapfile
@include /etc/condor-ce/mapfiles.d/
@include /usr/share/condor-ce/mapfiles.d/

cat /etc/condor-ce/mapfiles.d/11_99_token-mapping_DEBUG.conf
SCITOKENS /^https:\/\/wlcg\.cloud\.cnaf\.infn\.it\/,8ec82f26\-a407\-44d7\-aa32\-19cd985cd2d1$/ desyusr009
SCITOKENS /^https:\/\/wlcg\.cloud\.cnaf\.infn\.it\/,1ec796cb\-250b\-479d\-a9e9\-6509995adab0$/ desyusr007
# SCITOKENS /^https:\/\/wlcg\.cloud\.cnaf\.infn\.it\/,bf47638b-5be1-4cda-a156-c2b9d2d1d352$/ desyusr009
# SCITOKENS /^https:\/\/wlcg\.cloud\.cnaf\.infn\.it\/,bc2de59f-c564-4fef-9614-d89c1819426b$/ desyusr009
SCITOKENS /^https:\/\/wlcg\.cloud\.cnaf\.infn\.it\/,737b9ec0\-fb66\-472d\-9ce3\-e943a677464f$/ desyusr008


[3]
{
   "wlcg.ver": "1.0",
   "sub": "1ec796cb-250b-479d-a9e9-6509995adab0",
   "aud": "https://wlcg.cern.ch/jwt/v1/any";,
   "nbf": 1681479491,
   "scope": "openid compute.create offline_access compute.read compute.cancel compute.modify",
   "iss": "https://wlcg.cloud.cnaf.infn.it/";,
   "exp": 1681483091,
   "iat": 1681479491,
   "jti": "78d1ad5a-2be0-4367-88bb-6a9f59939bc5",
   "client_id": "8ec82f26-a407-44d7-aa32-19cd985cd2d1"
}

[4]
export _condor_SEC_CLIENT_AUTHENTICATION_METHODS=SCITOKENS
export BEARER_TOKEN_FILE=/tmp/token_$(id -u)
date; oidc-token -f --time=720 belle-desydebug-group > /tmp/token_$(id -u); condor_ce_trace --debug grid-htcondorce-dev.desy.de
Fri Apr 14 15:46:32 CEST 2023
...
04/14/23 15:46:32 SharedPortClient: sent connection request to daemon at <131.169.223.131:9619> for shared port id schedd_1298351_f7d0
04/14/23 15:46:32 Looking for token in file /tmp/token_14053
04/14/23 15:46:37 SECMAN: required authentication with daemon at <131.169.223.131:9619> failed, so aborting command DC_SEC_QUERY.
********************************************************************************
2023-04-14 15:46:41 ERROR: WRITE access failed for scheduler daemon at
<131.169.223.131:9619?addrs=131.169.223.131-9619+[2001-638-700-10df--
1-83]-9619&alias=grid-htcondorce-dev.desy.de&noUDP&sock=schedd_1298351_f7d0>.
WRITE failed!
AUTHENTICATE:1003:Failed to authenticate with any method
AUTHENTICATE:1004:Failed to authenticate using SCITOKENS


********************************************************************************


[5]
04/14/23 15:46:32 Examining SciToken with payload {"wlcg.ver":"1.0","sub":"1ec796cb-250b-479d-a9e9-6509995adab0","aud":"https:\/\/wlcg.cern.ch\/jwt\/v1\/any","nbf":1681479491,"scope":"openid compute.create offline_access compute.read compute.cancel compute.modify","iss":"https:\/\/wlcg.cloud.cnaf.infn.it\/","exp":1681483091,"iat":1681479491,"jti":"78d1ad5a-2be0-4367-88bb-6a9f59939bc5","client_id":"8ec82f26-a407-44d7-aa32-19cd985cd2d1"}.
04/14/23 15:46:37 DC_AUTHENTICATE: required authentication of 131.169.223.90 failed: AUTHENTICATE:1003:Failed to authenticate with any method|AUTHENTICATE:1004:Failed to authenticate using SCITOKENS
04/14/23 15:46:37 Examining SciToken with payload {"wlcg.ver":"1.0","sub":"1ec796cb-250b-479d-a9e9-6509995adab0","aud":"https:\/\/wlcg.cern.ch\/jwt\/v1\/any","nbf":1681479491,"scope":"openid compute.create offline_access compute.read compute.cancel compute.modify","iss":"https:\/\/wlcg.cloud.cnaf.infn.it\/","exp":1681483091,"iat":1681479491,"jti":"78d1ad5a-2be0-4367-88bb-6a9f59939bc5","client_id":"8ec82f26-a407-44d7-aa32-19cd985cd2d1"}.
04/14/23 15:46:41 DC_AUTHENTICATE: required authentication of 131.169.223.90 failed: AUTHENTICATE:1003:Failed to authenticate with any method|AUTHENTICATE:1004:Failed to authenticate using SCITOKENS
04/14/23 15:46:49 Evaluated periodic expressions in 0.000s, scheduling next run in 61s


[6]
04/14/23 15:46:32 (cid:21) Examining SciToken with payload {"wlcg.ver":"1.0","sub":"1ec796cb-250b-479d-a9e9-6509995adab0","aud":"https:\/\/wlcg.cern.ch\/jwt\/v1\/any","nbf":1681479491,"scope":"openid compute.create offline_access compute.read compute.cancel compute.modify","iss":"https:\/\/wlcg.cloud.cnaf.infn.it\/","exp":1681483091,"iat":1681479491,"jti":"78d1ad5a-2be0-4367-88bb-6a9f59939bc5","client_id":"8ec82f26-a407-44d7-aa32-19cd985cd2d1"}.
04/14/23 15:46:37 (cid:23) Examining SciToken with payload {"wlcg.ver":"1.0","sub":"1ec796cb-250b-479d-a9e9-6509995adab0","aud":"https:\/\/wlcg.cern.ch\/jwt\/v1\/any","nbf":1681479491,"scope":"openid compute.create offline_access compute.read compute.cancel compute.modify","iss":"https:\/\/wlcg.cloud.cnaf.infn.it\/","exp":1681483091,"iat":1681479491,"jti":"78d1ad5a-2be0-4367-88bb-6a9f59939bc5","client_id":"8ec82f26-a407-44d7-aa32-19cd985cd2d1"}.
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/


_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/
<CEclient.txt><SchedLog.20230416>_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/


_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/

Attachment: smime.p7s
Description: S/MIME Cryptographic Signature