
Re: [HTCondor-users] CondorCE token subject mapping not working anymore



Great detective work, thanks!



From: Thomas Hartmann
Sent: Thursday, April 20, 2023 11:48 AM
To: HTCondor-Users Mail List
Subject: Re: [HTCondor-users] CondorCE token subject mapping not working anymore

Hi all,

it seems we have a lead.

The issue might be a missing token cache DB
   `.cache/scitokens/scitokens_cpp.sqllite`

But in the default installation I do not find these paths being created.
There was a thread on another list at the end of last year that mentions
the `.cache` dir being either in the `condor` user's home directory or
under `/var/lib/condor`.
However, we create the condor user from Puppet (either home-less as a
functional user or explicitly with a home that is not enabled), create
the /var/lib/condor{-ce} etc. dirs, and chown them to the condor user as
a requirement for the package installation.
So far, the .cache dir is not created explicitly in our Puppet, nor is
it in the RPMs AFAIS.
Apparently, the freedesktop envvar ${XDG_CACHE_HOME} is assumed to
define the path, but since it is not exported in the unit definition,
no default dir like `/var/lib/condor/.cache` is created.
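
As a rough sketch of the lookup order described above: the function below
follows the freedesktop XDG Base Directory convention (use
${XDG_CACHE_HOME} if set, otherwise $HOME/.cache). That the token library
resolves its cache exactly this way is an assumption, not something taken
from its source; `resolve_cache_db` and `DB_RELATIVE` are hypothetical
names used only for illustration.

```python
from pathlib import Path
from typing import Mapping, Optional

# Relative location of the token cache DB, as named earlier in this mail.
DB_RELATIVE = Path("scitokens") / "scitokens_cpp.sqllite"


def resolve_cache_db(env: Mapping[str, str]) -> Optional[Path]:
    """Return the expected cache DB path for a given environment.

    Assumed lookup order (XDG Base Directory convention):
      1. $XDG_CACHE_HOME, if set and non-empty
      2. $HOME/.cache, if $HOME is set and non-empty
    Returns None when neither variable yields a base directory,
    which matches the home-less functional-user case.
    """
    xdg = env.get("XDG_CACHE_HOME", "")
    if xdg:
        return Path(xdg) / DB_RELATIVE
    home = env.get("HOME", "")
    if home:
        return Path(home) / ".cache" / DB_RELATIVE
    return None
```

Calling `resolve_cache_db(os.environ)` from within the service's
environment would show which path, if any, the daemon could be expected
to use.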

After explicitly creating a `condor` user home and restarting the
services, the .cache dir is created in there and populated with the
sqlite DB file. With the sqlite DB available, token submission to the CE
works now on my test CE.

Since the condor user is a functional user, I would prefer not to use a
$HOME and am now preparing to set XDG_CACHE_HOME in the unit envs. But
creating a condor $HOME seems to be a working stopgap solution.
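
One possible way to set the variable for a unit is a systemd drop-in with
an `Environment=` line. The sketch below only renders and writes such a
file; the unit name `condor-ce.service`, the drop-in file name
`10-xdg-cache.conf` and the cache path are assumptions for illustration
and need to be adapted to the actual setup.

```python
from pathlib import Path


def render_dropin(cache_home: str) -> str:
    """Render a systemd drop-in that exports XDG_CACHE_HOME to the unit."""
    return "[Service]\nEnvironment=XDG_CACHE_HOME={}\n".format(cache_home)


def write_dropin(unit: str, cache_home: str,
                 root: Path = Path("/etc/systemd/system")) -> Path:
    """Write the drop-in to <root>/<unit>.d/ and return the file path.

    The file name 10-xdg-cache.conf is a hypothetical choice; systemd
    reads any *.conf file in the unit's drop-in directory.
    """
    dropin_dir = root / (unit + ".d")
    dropin_dir.mkdir(parents=True, exist_ok=True)
    target = dropin_dir / "10-xdg-cache.conf"
    target.write_text(render_dropin(cache_home))
    return target
```

For example, `write_dropin("condor-ce.service", "/var/lib/condor/.cache")`
run as root, followed by `systemctl daemon-reload` and a restart of the
service so that the new environment is picked up. The target directory
may also have to exist and be writable by the condor user.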

Cheers,
   Thomas



On 19/04/2023 17.30, Thomas Hartmann wrote:
> Hi all,
>
> our situation got somewhat more troublesome as a large user (ATLAS)
> switched their submission infrastructure to Condor 10.
> Their jobs no longer get authzed properly, as token validation
> from the CE at the IAM times out after 4s and there is no proxy fallback
> anymore :-/
>
> So far I have not made any progress identifying the cause for the
> validation running into this 4s timeout.
>
> Cheers,
>    Thomas
>
> On 18/04/2023 17.29, Thomas Hartmann wrote:
>> short update - DNS lookup delays are not the cause AFAIS (hard-wiring
>> the IAM IP in hosts)
>>
>> I see a few packets between timeout starts/ends in dumps, but have
>> yet to try to decrypt the TLS handshake & the packets...
>>
>> _______________________________________________
>> HTCondor-users mailing list
>> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx
>> with a
>> subject: Unsubscribe
>> You can also unsubscribe by visiting
>> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>>
>> The archives can be found at:
>> https://lists.cs.wisc.edu/archive/htcondor-users/