
[HTCondor-users] best way to switch a main negotiator/collector head towards a secondary?



Hi all,

What would probably be the best way to gracefully move from the master negotiator/collector to the secondary fallback master in an HA setup?

E.g., with something like [1], where the collector/negotiator on `primary.site.foo` is the default and `fallback.site.foo` only takes over when the primary does not answer within the backoff constant.

Now, for maintenance/rebuilds/... of `primary` I would like to play it safe and gracefully switch the negotiation to the secondary for the whole cluster, i.e., something like "draining" the resource updates and negotiations without affecting the existing jobs and shadows on the startds and schedds.

Would it be sufficient to just switch the ranking of the masters
  CENTRAL_MANAGER1 = fallback.site.foo
  CENTRAL_MANAGER2 = primary.site.foo
?

I am a bit unsure how best to take the backoff constant and the negotiation cycle durations into account with respect to the deployment time on the cluster. Since we run Puppet every 30m per node, this is the worst-case time a config update might take to reach a node in the cluster. I.e., if the cluster runs for up to 30m in a mixed state, with some nodes still on the default and some already on the inverted master ranking, would we have two active collectors/negotiators (which is probably not a good thing...)? Is this something to worry about, and is there a better approach, or am I maybe overthinking it?
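
To put numbers on the timing concern, here is a small sketch that only compares the two durations mentioned in this mail. It assumes the backoff constant is given in seconds and does not model how HAD or the master actually use it; the function name is made up for illustration.

```python
# Values taken from this mail. Assumption: MASTER_HAD_BACKOFF_CONSTANT is in seconds.
PUPPET_INTERVAL_S = 30 * 60          # worst-case config rollout time per node
MASTER_HAD_BACKOFF_CONSTANT_S = 360  # from 01masterd_ha.conf in [1]


def backoff_intervals_in_rollout(rollout_s: int, backoff_s: int) -> float:
    """Return how many backoff intervals fit into the rollout window (hypothetical helper)."""
    return rollout_s / backoff_s


ratio = backoff_intervals_in_rollout(PUPPET_INTERVAL_S, MASTER_HAD_BACKOFF_CONSTANT_S)
print(f"mixed-state window: up to {PUPPET_INTERVAL_S} s, "
      f"i.e. {ratio:g} backoff intervals of {MASTER_HAD_BACKOFF_CONSTANT_S} s")
```

So the mixed state could last several times longer than the backoff constant, which is why I do not think I can just rely on the rollout being "fast enough".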

Cheers,
  Thomas


[1]
> cat 01masterd_ha.conf
CENTRAL_MANAGER1 = primary.site.foo
CENTRAL_MANAGER2 = fallback.site.foo
CONDOR_HOST = $(CENTRAL_MANAGER1), $(CENTRAL_MANAGER2)

DAEMON_LIST = $(DAEMON_LIST), HAD, REPLICATION
MASTER_HAD_BACKOFF_CONSTANT = 360
...
