
Re: [HTCondor-users] Condor Configuration Trouble



It appears your ALLOW_READ, ALLOW_WRITE, etc. configuration settings are causing the "PERMISSION DENIED" errors you are seeing.

 

Do you have DNS entries for what you call "abxx.xxx"?  Reverse DNS?  This line in your log seems like trouble:

                WARNING: forward resolution of abxx.xxx doesn't match 10.0.0.30!
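
One way to check this outside of HTCondor is a small script that compares what the hostname resolves to against the address you expect. This is only a rough sketch: the helper names (forward_matches, reverse_name) are made up for illustration, and the hostname and IP in the comments are the placeholders from your log, not verified values.

```python
import socket


def forward_matches(hostname, expected_ip, resolve=socket.gethostbyname_ex):
    """Return True if expected_ip is among the IPv4 addresses hostname resolves to."""
    try:
        _name, _aliases, addresses = resolve(hostname)
    except OSError:
        # Covers socket.gaierror and socket.herror: the name did not resolve.
        return False
    return expected_ip in addresses


def reverse_name(ip, lookup=socket.gethostbyaddr):
    """Return the primary hostname for ip, or None if there is no reverse entry."""
    try:
        name, _aliases, _addresses = lookup(ip)
    except OSError:
        return None
    return name


# Example calls (placeholders taken from the log above):
#   forward_matches("abxx.xxx", "10.0.0.30")
#   reverse_name("10.0.0.30")
```

If the forward check comes back False, or the reverse lookup returns None or a different name, that would be consistent with the warning above.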

 

 

If the DNS isn't set up, try setting the configuration like this:

 

ALLOW_READ = 10.0.*

[similar for other ALLOW_ settings]
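
Spelled out, that might look something like the fragment below. Treat it as a sketch rather than a recommended policy: it assumes every machine in the pool lives in 10.0.*, and the lines beyond ALLOW_READ are extrapolated from the access levels named in the CollectorLog denials (ADVERTISE_MASTER, ADVERTISE_STARTD, NEGOTIATOR). Tighten the pattern if the subnet is shared with machines you don't trust.

```
# Assumption: all pool members have addresses in 10.0.*
ALLOW_READ             = 10.0.*
ALLOW_WRITE            = 10.0.*
# Access levels that show up as denied in the CollectorLog
ALLOW_NEGOTIATOR       = 10.0.*
ALLOW_ADVERTISE_MASTER = 10.0.*
ALLOW_ADVERTISE_STARTD = 10.0.*
```

After editing the file, run condor_reconfig (or restart the condor_master) so the daemons pick up the change.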

 

 

Cheers,

-zach

 

 

 

 

On 2/7/17, 11:18 PM, "HTCondor-users on behalf of Uchenna Ojiaku" <htcondor-users-bounces@xxxxxxxxxxx on behalf of ucojiaku@xxxxxxxxx> wrote:

 

Hi,

 

I've tried to set up Condor between two nodes.

 

When I run "condor_status" I get:

 

Error: communication error

CEDAR:6001:Failed to connect to <159.203.152.145:9618>

 

 

My /etc/condor/condor_config file:

 

 

MY_FULL_HOSTNAME = abxx.xxx (here I put my hostname)

 

##  Pathnames

RUN     = $(LOCAL_DIR)/run/condor

LOG     = $(LOCAL_DIR)/log/condor

LOCK    = $(LOCAL_DIR)/lock/condor

SPOOL   = $(LOCAL_DIR)/lib/condor/spool

EXECUTE = $(LOCAL_DIR)/lib/condor/execute

BIN     = $(RELEASE_DIR)/bin

LIB = $(RELEASE_DIR)/lib64/condor

INCLUDE = $(RELEASE_DIR)/include/condor

SBIN    = $(RELEASE_DIR)/sbin

LIBEXEC = $(RELEASE_DIR)/libexec/condor

SHARE   = $(RELEASE_DIR)/share/condor

 

PROCD_ADDRESS = $(RUN)/procd_pipe

 

JAVA_CLASSPATH_DEFAULT = $(SHARE) $(SHARE)/scimark2lib.jar .

 

##  What machine is your central manager?

 

CONDOR_HOST = $(MY_FULL_HOSTNAME)

 

##  This macro determines what daemons the condor_master will start and keep its watchful eyes on.

##  The list is a comma or space separated list of subsystem names

 

NETWORK_INTERFACE = 10.0.x.x (here I put my ip address)

 

DAEMON_LIST = COLLECTOR, MASTER, NEGOTIATOR, SCHEDD, STARTD

 

 

My /etc/condor/condor_config.local file:

 

 

CONDOR_ADMIN                    = prometheus.abxx.xxx

 

#FILESYSTEM_DOMAIN               = 10.0.x.x

#CONDOR_ADMIN                    = prometheus@xxxxxxxx

 

FILESYSTEM_DOMAIN               = abxx.xxx

UID_DOMAIN                      = abxx.xxx

 

# each slot gets a CPU

NUM_SLOTS                       = 1

NUM_SLOTS_TYPE_1                = 1

SLOT_TYPE_1                     = cpus=100%

SLOT_TYPE_1_PARTITIONABLE       = True

USE_NFS                         = True

DAGMAN_LOG_ON_NFS_IS_ERROR      = FALSE

 

KEEP_POOL_HISTORY               = True

POOL_HISTORY_DIR                = /var/spool/condor

POOL_HISTORY_MAX_STORAGE        = 100000000

POOL_HISTORY_SAMPLING_INTERVAL  = 60

 

 

ALLOW_READ                      = abxx.xxx

ALLOW_WRITE                     = abxx.xxx

ALLOW_ADMINISTRATOR             = $(CONDOR_HOST)

ALLOW_OWNER                     = abxx.xxx, $(ALLOW_ADMINISTRATOR)

HOSTALLOW_ADMINISTRATOR         = abuo.com

 

DAEMON_LIST                     = $(DAEMON_LIST)

#START                          = ($(START)) && target.AcctGroup =?= "group_pseudo_operational_processing"

NEGOTIATOR_MATCHLIST_CACHING    = FALSE

NEGOTIATOR_ALLOW_QUOTA_OVERSUBSCRIPTION = TRUE

PRIORITY_HALFLIFE               = 1.79769e+308

 

 

Condor MasterLog:

 

02/03/17 15:56:46 restarting /usr/sbin/condor_collector in 10 seconds

02/03/17 15:56:46 attempt to connect to <10.0.2.15:9618> failed: Connection refused (connect errno = 111).

02/03/17 15:56:46 ERROR: SECMAN:2003:TCP connection to collector abxx.xxx failed.

02/03/17 15:56:46 Failed to start non-blocking update to <10.0.2.15:9618>.

02/03/17 15:56:56 Started DaemonCore process "/usr/sbin/condor_collector", pid and pgroup = 65480

02/03/17 15:56:58 SECMAN: FAILED: Received "DENIED" from server for user unauthenticated@unmapped using method (no authentication).

02/03/17 15:56:58 ERROR: SECMAN:2010:Received "DENIED" from server for user unauthenticated@unmapped using method (no authentication).

02/03/17 15:56:58 Failed to start non-blocking update to <10.0.2.15:9618>.

02/03/17 15:57:11 WARNING: forward resolution of abxx.xxx doesn't match 10.0.0.30!

02/03/17 15:57:11 Got SIGTERM. Performing graceful shutdown.

02/03/17 15:57:18 SECMAN: FAILED: Received "DENIED" from server for user unauthenticated@unmapped using method (no authentication).

02/03/17 15:57:18 ERROR: SECMAN:2010:Received "DENIED" from server for user unauthenticated@unmapped using method (no authentication).

02/03/17 15:57:18 Failed to send update to collector abuo.com.

02/03/17 15:57:18 Sent SIGTERM to STARTD (pid 64673)

02/03/17 15:57:18 AllReaper unexpectedly called on pid 64673, status 0.

02/03/17 15:57:18 The STARTD (pid 64673) exited with status 0

02/03/17 15:57:19 All STARTDs are gone.  Stopping other daemons Gracefully

02/03/17 15:57:19 Sent SIGTERM to COLLECTOR (pid 65480)

02/03/17 15:57:19 Sent SIGTERM to NEGOTIATOR (pid 64671)

02/03/17 15:57:19 Sent SIGTERM to SCHEDD (pid 64672)

02/03/17 15:57:19 AllReaper unexpectedly called on pid 65480, status 0.

02/03/17 15:57:19 The COLLECTOR (pid 65480) exited with status 0

02/03/17 15:57:19 AllReaper unexpectedly called on pid 64671, status 0.

02/03/17 15:57:19 The NEGOTIATOR (pid 64671) exited with status 0

02/03/17 15:57:19 AllReaper unexpectedly called on pid 64672, status 0.

02/03/17 15:57:19 The SCHEDD (pid 64672) exited with status 0

02/03/17 15:57:19 All daemons are gone.  Exiting.

02/03/17 15:57:19 **** condor_master (condor_MASTER) pid 4179 EXITING WITH STATUS 0

 

 

My CollectorLog:

 

02/03/17 15:56:58 PERMISSION DENIED to unauthenticated@unmapped from host 10.0.2.15 for command 2 (UPDATE_MASTER_AD), access level ADVERTISE_MASTER: reason: ADVERTISE_MASTER authorization policy contains no matching ALLOW entry for this request; identifiers used for this host: 10.0.2.15, hostname size = 0, original ip address = 10.0.2.15

02/03/17 15:56:58 DC_AUTHENTICATE: Command not authorized, done!

02/03/17 15:56:58 CollectorAd  : Inserting ** "< My Pool - abxx.xxx@xxxxxxxx >"

02/03/17 15:56:58 stats: Inserting new hashent for 'Collector':'My Pool - abxx.xxx@xxxxxxxx':'10.0.x.x'

02/03/17 15:57:18 attempt to connect to <159.203.152.145:9618> failed: timed out after 20 seconds.

02/03/17 15:57:18 Failed to send update to collector abxx.xxx.

02/03/17 15:57:18 Unable to send UPDATE_COLLECTOR_AD to all configured collectors

02/03/17 15:57:18 WARNING: forward resolution of abxx.xxx doesn't match 10.0.2.15!

02/03/17 15:57:18 PERMISSION DENIED to unauthenticated@unmapped from host 10.0.2.15 for command 10 (QUERY_STARTD_PVT_ADS), access level NEGOTIATOR: reason: NEGOTIATOR authorization policy contains no matching ALLOW entry for this request; identifiers used for this host: 10.0.2.15, hostname size = 0, original ip address = 10.0.2.15

02/03/17 15:57:18 DC_AUTHENTICATE: Command not authorized, done!

02/03/17 15:57:18 PERMISSION DENIED to unauthenticated@unmapped from host 10.0.2.15 for command 15 (INVALIDATE_MASTER_ADS), access level ADVERTISE_MASTER: reason: cached result for ADVERTISE_MASTER; see first case for the full reason

02/03/17 15:57:18 DC_AUTHENTICATE: Command not authorized, done!

02/03/17 15:57:18 WARNING: forward resolution of abxx.xxx doesn't match 10.0.2.15!

02/03/17 15:57:18 PERMISSION DENIED to unauthenticated@unmapped from host 10.0.2.15 for command 13 (INVALIDATE_STARTD_ADS), access level ADVERTISE_STARTD: reason: ADVERTISE_STARTD authorization policy contains no matching ALLOW entry for this request; identifiers used for this host: 10.0.2.15, hostname size = 0, original ip address = 10.0.2.15

02/03/17 15:57:18 DC_AUTHENTICATE: Command not authorized, done!

02/03/17 15:57:19 Got SIGTERM. Performing graceful shutdown.

02/03/17 15:57:19 **** condor_collector (condor_COLLECTOR) pid 65480 EXITING WITH STATUS 0