
Re: [HTCondor-users] Multiple CredDs in a single pool



> Date: Wed, 21 Dec 2022 00:13:07 +0000
> From: John M Knoeller <johnkn@xxxxxxxxxxx>
> To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
> Subject: Re: [HTCondor-users] Multiple CredDs in a single pool
>
> Hi Miguel.
>
> You can have multiple CREDD daemons running, but HTCondor is not really expecting them to be set up as backups for each other.  The purpose of the CredD classad is to allow the daemons to find the address of a CREDD given the name.  I think you are saying that both of the CREDD daemons are advertising the same IP address?
>
> In that case I think you can configure CREDD_HOST to be the IP address and port of the credd rather than setting it to a name.
>
> CREDD_HOST=a.b.c.d:port
>
> I can't say for certain that this will work, but I think it will work if you make sure that both of the CREDDs have the same address and all of the same credentials stored.
>
> The other issue is that condor_submit will automatically add a clause to the job Requirements expression so that jobs that want to run_as_owner will only match to machines that advertise a value for LocalCredd that is the same as the CREDD_HOST of condor_submit.
>
> The job will have something like
>  && (TARGET.LocalCredd =?= "$(CREDD_HOST)")
>
> where $(CREDD_HOST) is the value from the configuration.  This is only used for matchmaking, so it won't matter what the value actually is, only that it's the same string on both the submit machines and the execute machines.
>
> As for pool password being required for the credd daemon, I do think that having them both use the same signing key name and key value will work just fine.
>
> Hope this helps,
> -tj
>

Hi John,

This is helpful, thank you. The way it is currently set up is to have
both credds advertise the same CREDD_HOST value which points to a load
balanced address (virtual IP). This requires a load balancer which
adds complexity for little benefit since Condor can locate services by
way of the collector inherently.

I was originally hoping to have two distinct CredD advertisements and
wanted Condor to do the discovery/load balancing for me, but after
reading your explanation it makes sense that it won't work that way,
mainly because I would have to match the submit side's CREDD_HOST to
each machine's LocalCredd for matchmaking, and even then there's no
guarantee the workers would be able to locate a credential, as they
wouldn't know where to get it from (unless I misunderstood how that
works).

I've done further testing and think I've achieved a result similar to
the existing setup, but without the load balancer, by specifying
CREDD_HOST as "credd@", with the trailing @ forcing Condor to omit the
hostname lookup. Each of the credd systems is configured with the same
value, so its name in the pool is consistent. Both systems continue to
advertise the same CredD, and the last system to advertise the
ClassAd wins. If the system currently owning the CredD ClassAd goes
away, the other system will eventually become the pool's CredD on a
future advertisement. I have found that I can speed this process up by
restarting the daemon on any one system (in my HA scenario, the one
that is still up), after which it becomes the CredD for the pool.

I think this, in addition to CREDD_CACHE_LOCALLY, will achieve my
goal. Unless I miscalculated somewhere.
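
For reference, a minimal sketch of the configuration described above.
It assumes the same fragment is deployed on both credd systems as well
as on the submit and execute machines, and that CREDD_CACHE_LOCALLY is
simply set to True. Treat it as an illustration of the idea rather
than a verified recipe:

```
# Same literal value everywhere; the trailing @ is what makes Condor
# skip the hostname lookup, so both credds share one name in the pool.
CREDD_HOST = credd@

# Cache credentials on the execute side so a job does not depend on
# reaching the credd each time (assumed value: True).
CREDD_CACHE_LOCALLY = True
```

Since the Requirements clause added by condor_submit only compares
strings, the exact value matters less than it being identical on the
submit and execute machines.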

Thanks