
[HTCondor-users] collector update infinite loop



Hi,

I'm trying to set up a collector hierarchy in which multiple collector
instances run as "main collectors" on the same node.

It worked fine as long as I had only two main collectors on two
different nodes, but now it seems that the collectors I selected to be
the new main collectors send updates even to themselves and are kept
busy all the time.

I suspect that the parameters I need to correct are COLLECTOR_HOST and
CONDOR_VIEW_HOST, so that the collector update path becomes loop-free.
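
One way to reason about "loop-free" is to treat each collector as a node
and each forwarding target as a directed edge, then search for a cycle.
The following is only a sketch of that check, not an HTCondor tool: the
function name find_forwarding_cycle and the "host:port" address strings
are made up for illustration, and the forwarding map has to be written
out by hand from the configuration.

```python
from typing import Dict, List


def find_forwarding_cycle(forwards: Dict[str, List[str]]) -> List[str]:
    """Return one forwarding cycle (first element == last), or [] if none.

    `forwards` maps a collector address to the addresses it forwards ads to.
    This is a hypothetical, hand-built model of the setup, not HTCondor output.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in forwards}
    path: List[str] = []

    def visit(node: str) -> List[str]:
        color[node] = GREY
        path.append(node)
        for target in forwards.get(node, []):
            state = color.get(target, WHITE)
            if state == GREY:
                # target is already on the current path: a loop is closed
                return path[path.index(target):] + [target]
            if state == WHITE:
                cycle = visit(target)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return []

    for node in list(forwards):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return []
```

A collector that lists its own address as a forwarding target shows up
as a cycle of length one, which is the situation described above.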

Here's the relevant part of my configuration on one central manager node:
CONDOR_HOST =
CENTRAL_MANAGER1 = condormaster1.domain.tld
CENTRAL_MANAGER2 = condormaster2.domain.tld

COLLECTOR_HOST = $(FULL_HOSTNAME)

CONDOR_VIEW_CLASSAD_TYPES = Machine,Submitter,DaemonMaster
UPDATE_COLLECTOR_WITH_TCP = True

# define sub collectors
COLLECTOR2 = $(COLLECTOR)
COLLECTOR3 = $(COLLECTOR)
...

# specify the ports for the sub collectors
COLLECTOR2_ARGS = -f -p 10002
COLLECTOR3_ARGS = -f -p 10003


# specify the logs for the sub collectors
COLLECTOR2_ENVIRONMENT = "_CONDOR_COLLECTOR_LOG=$(LOG)/Collector2Log"
COLLECTOR3_ENVIRONMENT = "_CONDOR_COLLECTOR_LOG=$(LOG)/Collector3Log"
...

## Update collector at random intervals
UPDATE_INTERVAL = $RANDOM_INTEGER(230, 370)
MASTER_UPDATE_INTERVAL = $RANDOM_INTEGER(230, 370)

CONDOR_VIEW_HOST = $(CENTRAL_MANAGER1), $(CENTRAL_MANAGER2), \
    $(CENTRAL_MANAGER1):10002, $(CENTRAL_MANAGER2):10002, \
    $(CENTRAL_MANAGER1):10003, $(CENTRAL_MANAGER2):10003, \
    $(CENTRAL_MANAGER1):10004, $(CENTRAL_MANAGER2):10004, \
    $(CENTRAL_MANAGER1):10005, $(CENTRAL_MANAGER2):10005
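
To illustrate why this list may be the source of the self-updates, here
is a small sketch that checks which local collector ports appear in the
macro-expanded CONDOR_VIEW_HOST value. It assumes that every collector
on the node reads the same CONDOR_VIEW_HOST setting and that an entry
without a port means the default collector port 9618. The helper names
normalize and self_targets are hypothetical.

```python
DEFAULT_COLLECTOR_PORT = 9618  # HTCondor's default collector port


def normalize(address, default_port=DEFAULT_COLLECTOR_PORT):
    """Turn 'host' or 'host:port' into a comparable (host, port) tuple."""
    host, sep, port = address.strip().partition(":")
    return (host.lower(), int(port) if sep else default_port)


def self_targets(view_hosts, local_host, local_ports):
    """List the local collector ports that also appear as forwarding targets.

    view_hosts  -- the expanded, comma-separated CONDOR_VIEW_HOST value
    local_host  -- hostname of the node running the collectors
    local_ports -- ports of the collectors running on that node
    """
    targets = {normalize(entry) for entry in view_hosts.split(",") if entry.strip()}
    return sorted(p for p in local_ports if (local_host.lower(), p) in targets)


view_hosts = (
    "condormaster1.domain.tld, condormaster2.domain.tld, "
    "condormaster1.domain.tld:10002, condormaster2.domain.tld:10002, "
    "condormaster1.domain.tld:10003, condormaster2.domain.tld:10003, "
    "condormaster1.domain.tld:10004, condormaster2.domain.tld:10004, "
    "condormaster1.domain.tld:10005, condormaster2.domain.tld:10005"
)

# Ports 9618, 10002 and 10003 are the collectors shown in the excerpt above.
print(self_targets(view_hosts, "condormaster1.domain.tld", [9618, 10002, 10003]))
# prints [9618, 10002, 10003]
```

Under those assumptions, every collector on condormaster1 finds its own
address in the list it forwards to.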

Thanks,
Daniel