
[HTCondor-users] Setting up HAD-enabled central Managers using IDTokens



Hi,

After some tests on selected nodes at the end of last year, I’ve upgraded our flock to the most recent release of HTCondor.
I am now starting to investigate moving our host-based flock to the IDTokens security scheme used since HTCondor 9.

So far, I am still testing on a few nodes while leaving the rest of the flock intact.

I was able to set up a basic flock as described in the manual at https://htcondor.readthedocs.io/en/latest/getting-htcondor/admin-quick-start.html and submit a working job.

 

My current concern is to replicate, on a restricted number of nodes, the High Availability Daemon (HAD) central manager (CM) setup we’ve had for the last 8 years.
However, the manual only covers setting up a flock with a single CM, and the HAD page gives little guidance on how to do this beyond the same configuration file as before.

 

I’ve set up new test CMs on nodes A and B and I am attempting to have them talk to each other before I add an AP and an Execute node.

I’ve installed CM A and CM B using the script from https://get.htcondor.org/ with the central manager role and the same password for each (A is set up with A as its CM and B is set up with B as its CM).

I’ve added a file named 99-spc-cm.config to /etc/condor/config.d/ on each CM with the following content:

 

DAEMON_LIST     = MASTER, COLLECTOR, NEGOTIATOR, HAD, REPLICATION

NEGOTIATOR_HOST =
CONDOR_HOST     =

CENTRAL_MANAGER1 = A
CENTRAL_MANAGER2 = B

COLLECTOR_HOST  = $(CENTRAL_MANAGER1),$(CENTRAL_MANAGER2)

HAD_USE_PRIMARY = TRUE
HAD_USE_REPLICATION = TRUE

 

I’ve copied the IDTokens from A to B (/etc/condor/token.id) and from B to A (same location) and checked that the permissions were OK, but so far the machines do not appear to be able to talk to each other (see below).

I also tried to generate tokens manually on each machine for the opposite CM instead of copying files around, with the same result.
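
A permission check of this kind can be sketched as follows. This is only an illustration under stated assumptions: the helper name list_loose_token_files is made up and is not part of HTCondor, it assumes GNU find, and it assumes that token files should be accessible to their owner only.

```shell
#!/bin/sh
# Hypothetical helper (not part of HTCondor): print every regular file under
# the given directory that grants any permission bit to group or others.
# Assumes GNU find, which supports the "-perm /MODE" syntax.
list_loose_token_files() {
    find "$1" -type f -perm /077 -print
}

# Usage: list_loose_token_files <directory holding the token files>
# No output means no file under that directory is group- or world-accessible.
```

The directory is passed as an argument on purpose, so the sketch does not assume where the tokens live on a given installation.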

 

Upon reconfiguration, on A, in /var/log/condor/MasterLog, I got errors similar to:

 

01/12/24 12:07:15 SECMAN: FAILED: Received "DENIED" from server for user condor@A using method IDTOKENS.
01/12/24 12:07:15 ERROR: SECMAN:2010:Received "DENIED" from server for user condor@A using method IDTOKENS.
01/12/24 12:07:15 Failed to start non-blocking update to B
01/12/24 12:07:16 Token requested not yet approved; please ask collector B admin to approve request ID 9651637.

 

Similar errors appear in the logs on B.
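
To compare the two machines, the relevant lines can be pulled out of each log with a small script. The following is a minimal sketch: the function name secman_summary is hypothetical, and it relies only on the two message formats quoted in this message, so other HTCondor versions may word these messages differently.

```shell
#!/bin/sh
# Hypothetical helper: summarise IDTOKENS problems found in an HTCondor log.
# It matches only the message formats quoted in this post.
secman_summary() {
    log="$1"
    # Count the SECMAN "DENIED" failure lines (grep -c prints 0 when none match).
    denied=$(grep -c 'SECMAN: FAILED: Received "DENIED"' "$log" || true)
    echo "denied: $denied"
    # Print each distinct pending token request ID exactly once.
    sed -n 's/.*approve request ID \([0-9][0-9]*\)\..*/\1/p' "$log" | sort -u |
        while read -r id; do
            echo "pending request: $id"
        done
}

# Usage: secman_summary /var/log/condor/MasterLog
```

Running it on both A and B shows which request IDs each side is still waiting on.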

 

If I go on B and do:

 

$ condor_token_request_approve
Remote daemon has no request to approve.
$ condor_token_request_approve -reqid 9651637
Remote daemon did not provide information for request ID 9651637.
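
For anyone trying to reproduce this, the state on each CM can be inspected with read-only commands along these lines. This is a hedged sketch, not a diagnosis: the wrapper functions run_if_present and inspect_token_state are hypothetical, the choice of TRUST_DOMAIN and COLLECTOR_HOST as the settings worth printing is an assumption, and the -pool argument is used on the assumption that the request has to be looked up on one specific collector.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command if it is installed, otherwise say so.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $*"
        "$@" || echo "(exit status $?)"
    else
        echo "== $1: not found in PATH"
    fi
}

# Hypothetical helper: print token-related state for one collector host.
# $1 is the collector to query (for example A or B).
inspect_token_state() {
    # Effective values of settings that may matter for IDTOKENS (assumption).
    run_if_present condor_config_val COLLECTOR_HOST TRUST_DOMAIN
    # Tokens visible to the current user.
    run_if_present condor_token_list
    # Token requests still pending at the given collector.
    run_if_present condor_token_request_list -pool "$1"
}

# Usage (as root on each CM): inspect_token_state B
```

Every command used here only reads state, so running it on a test CM changes nothing.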

 

I am not quite sure where to go from here.

 

Thanks for your help.

 

–––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––-

Fabrice Bouyé

IT Specialist (Scientific Computing) - Fisheries, Aquaculture and Marine Ecosystems Division
Spécialiste des technologies de l'information (informatique scientifique) - Division pêche, aquaculture et écosystèmes marin

Pacific Community | Communauté du Pacifique

CPS – B.P. D5 | 98848 Noumea, New Caledonia | Nouméa, Nouvelle-Calédonie

Tel: (687) 26 20 00 | Ext: 31411 | Mob: (687) 77 91 25 | Fax: (687) 26 38 18

E: fabriceb@xxxxxxx

–––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––-
