
[HTCondor-users] Nodes missing in condor_status list



Hi,

 

Some of my STARTD/SCHEDD nodes don’t show up in the condor_status list.

This probably has something to do with the fact that these nodes belong to a different network.

1) Do I have to use the flocking mechanism in order to add such an “external node” (see setup below)?

2) If I do not have to use the flocking mechanism, how do I track down the error? I’ve already checked all the logs (on the invisible nodes as well as on the collector), but I can’t find any clue.
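
One check that might help narrow this down is whether the BAR nodes can open a TCP connection to the collector at all. The following is only a minimal sketch, not part of HTCondor: the helper name `can_connect` is hypothetical, and it assumes the collector is reached on the shared port 9614 from the config below.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration; it only tests raw TCP reachability
    and says nothing about HTCondor authorization (e.g. ALLOW_WRITE).
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Run on start3.BAR.my.com, `can_connect("condor.FOO.my.com", 9614)` returning False would suggest a name resolution, routing, or firewall problem between the two networks. A True result only shows that the port is reachable, so the cause would have to be looked for elsewhere.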

 

 

This is how my pool is set up:

 

FOO network:

 

Collector/Negotiator:

condor.FOO.my.com (Debian 6)

 

“Internal” Startd/Schedd nodes:

start1.FOO.my.com (CentOS 6)  <- LISTED

start2.FOO.my.com (Windows 7) <- LISTED

 

BAR network:

 

“External” Startd/Schedd nodes:

start3.BAR.my.com (openSUSE 13) <- NOT LISTED

start4.BAR.my.com (Windows 7)   <- NOT LISTED

 

 

 

Collector/Negotiator condor_config.local:

 

CONDOR_HOST = condor.FOO.my.com

DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR

ALLOW_WRITE = *.FOO.my.com, *.BAR.my.com

 

SGE_GAHP      = $(GLITE_LOCATION)/bin/batch_gahp

GLIDEIN_SITES = *.FOO.my.com

 

HOSTALLOW_WRITE = $(HOSTALLOW_WRITE), $(GLIDEIN_SITES)

 

USE_SHARED_PORT  = TRUE

SHARED_PORT_ARGS = -p 9614

DAEMON_LIST      = $(DAEMON_LIST), SHARED_PORT

 

 

 

Startd/Schedd condor_config.local:

 

CONDOR_HOST = condor.FOO.my.com

DAEMON_LIST = MASTER, STARTD, SCHEDD

ALLOW_WRITE = *.FOO.my.com, *.BAR.my.com

 

SGE_GAHP      = $(GLITE_LOCATION)/bin/batch_gahp

GLIDEIN_SITES = *.FOO.my.com

 

HOSTALLOW_WRITE = $(HOSTALLOW_WRITE), $(GLIDEIN_SITES)

 

USE_SHARED_PORT  = TRUE

SHARED_PORT_ARGS = -p 9614

DAEMON_LIST      = $(DAEMON_LIST), SHARED_PORT

 

START = TRUE

 

 

 

Best regards,

Lukas