
Re: [HTCondor-users] CondorCE route for local Condor submission with Credential



The log snippet from the Job Router shows that its attempt to authenticate with the collector failed. I am guessing that your security configuration only allows Kerberos authentication. The Job Router has no logic to retrieve a stored user credential and use it to authenticate with the collector or schedd. I think you will need to alter your security configuration so that a condor_submit would work without a Kerberos credential.
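
For illustration only, and assuming the pool currently lists just KERBEROS as an authentication method (the log does not show this), the change could look roughly like the fragment below. IDTOKENS is only an example of a method that a daemon can use without a user's Kerberos ticket; which method fits is a site decision.

    # Hypothetical sketch for the central manager and the schedd:
    # keep KERBEROS, but also accept a method the Job Router can use.
    SEC_DEFAULT_AUTHENTICATION_METHODS = KERBEROS, IDTOKENS

The Job Router side would need to offer the same method and hold a matching credential. If your configuration sets more specific knobs, such as SEC_READ_AUTHENTICATION_METHODS or SEC_CLIENT_AUTHENTICATION_METHODS, those would need the same treatment.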

In addition to the authentication issue, you will want to ensure your Job Router routes set this in the transformed job ad:
SendCredential = True

This will tell the schedd and shadow to use and forward the pre-stored credential to the execute machine.
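
As a rough sketch, and not taken from your actual configuration, a route in the older ClassAd-style syntax could set the attribute with a set_ entry. The route name is the one from your log; the other attributes of your route are omitted here, and the exact syntax depends on the HTCondor-CE version in use.

    JOB_ROUTER_ENTRIES @=jre
    [
      name = "DESYNAFGPUSpecifics";
      TargetUniverse = 5;
      # Added to the routed job ad so that the stored credential is forwarded
      set_SendCredential = True;
    ]
    @jre

The newer transform-style routes have an equivalent SET command. Either way, the routed job's ad can be checked afterwards with condor_q -l to confirm that SendCredential is present.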

 - Jaime

> On Aug 13, 2021, at 9:14 AM, Thomas Hartmann <thomas.hartmann@xxxxxxx> wrote:
> 
> Hi all,
> 
> I am trying to write a CE route to submit local jobs, which require the injection of credentials.
> 
> As credential producer, I am trying a simple script, that basically exports a keytab cred cache and calls condor_aklog.
> 
> With it, local Condor submission authorization works for the user - onto which incoming DNs are mapped to by the CE.
> 
> But submissions through the CE fail with authentication being broken [1]. It seems somewhat like the credential handling is bypassed and not attempted - as I do not see related messages in the Condor's Cred*Log (assuming, that the Condor's SEC_CREDENTIAL_* ads get applied globally)
> 
> Since the credentials handling is part of the Condor side, I assume that I cannot influence it from a CE route, or?
> 
> Maybe somebody has an idea, how I can make the CE's local submission also to use the Condor's credential handling?
> 
> Cheers,
>  Thomas
> 
> 
> [1]
> 08/13/21 15:58:56 (D_ALWAYS:2) JobRouter: polling state of (0) managed jobs.
> 08/13/21 15:58:56 (D_ALWAYS:2) TimerHandler_JobLogPolling() called
> 08/13/21 15:58:56 (D_ALWAYS:2) === Current Probing Information ===
> 08/13/21 15:58:56 (D_ALWAYS:2) fsize: 6125		mtime: 1628863126
> 08/13/21 15:58:56 (D_ALWAYS:2) first log entry: 16 CreationTimestamp 1623248791
> 08/13/21 15:58:56 (D_ALWAYS:2) TimerHandler_JobLogPolling() called
> 08/13/21 15:58:56 (D_ALWAYS:2) === Current Probing Information ===
> 08/13/21 15:58:56 (D_ALWAYS:2) fsize: 24396		mtime: 1628859724
> 08/13/21 15:58:56 (D_ALWAYS:2) first log entry: 21 CreationTimestamp 1623242507
> 08/13/21 15:58:56 (D_ALWAYS) JobRouter: Checking for candidate jobs. routing table is:
> 08/13/21 15:58:56 (D_ALWAYS:2) JobRouter: Umbrella constraint: (target.JobUniverse =?= 5 || target.JobUniverse =?= 1) && ( (regexp("Thomas Hartmann",x509userproxysubject ?: "") || regexp("Andreas Gellrich",x509userproxysubject ?: "")) || (true) ) && (target.ProcId >= 0 && target.JobStatus == 1 && (target.StageInStart is undefined || target.StageInFinish isnt undefined) && target.Managed isnt "ScheddDone" && target.Managed isnt "External" && target.Owner isnt Undefined && target.RoutedBy isnt "htcondor-ce")
> 08/13/21 15:58:56 (D_ALWAYS:2) JobRouter: Found candidate job src=20.0,route=DESYNAFGPUSpecifics
> 08/13/21 15:58:56 (D_ALWAYS:2) SharedPortClient: sent connection request to schedd at <131.169.223.51:9619> for shared port id schedd_515519_8f68
> 08/13/21 15:58:56 (D_ALWAYS:2) JobRouter (src=20.0,route=DESYNAFGPUSpecifics): claimed job
> 08/13/21 15:58:56 (D_ALWAYS:2) Will use TCP to update collector bird-htc-master01.desy.de <131.169.56.44:9618?alias=bird-htc-master01.desy.de>
> 08/13/21 15:58:56 (D_ALWAYS:2) Trying to query collector <131.169.56.44:9618?alias=bird-htc-master01.desy.de>
> 08/13/21 15:58:56 (D_ALWAYS) SECMAN: required authentication with collector at <131.169.56.44:9618> failed, so aborting command QUERY_SCHEDD_ADS.
> 08/13/21 15:58:56 (D_ALWAYS) ERROR: AUTHENTICATE:1003:Failed to authenticate with any method
> 08/13/21 15:58:56 (D_ALWAYS) ERROR (schedd naf-htcondorce1.desy.de at pool bird-htc-master01.desy.de:9618) Can't find address of schedd
> 08/13/21 15:58:56 (D_ALWAYS) JobRouter failure (src=20.0,route=DESYNAFGPUSpecifics): failed to submit job
> 08/13/21 15:58:56 (D_ALWAYS:2) SharedPortClient: sent connection request to schedd at <131.169.223.51:9619> for shared port id schedd_515519_8f68
> 08/13/21 15:58:56 (D_ALWAYS:2) JobRouter (src=20.0,route=DESYNAFGPUSpecifics): yielded job (done=0)
> 08/13/21 15:58:56 (D_ALWAYS:2) JobRouter (src=20.0,route=DESYNAFGPUSpecifics): Cleaned up and removed routed job.
> 08/13/21 15:58:56 (D_ALWAYS:2) JobRouter (src=20.0): job mirror synchronized; removing job from internal 'retirement' status
> 
> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
> 
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/htcondor-users/