
[HTCondor-users] SharedPortEndpoint fails to accept connection



Hello,

I am unable to get Condor's shared port to function properly on a
CentOS 7 client machine (MASTER, STARTD, SCHEDD, SHARED_PORT, KBDD daemons
are active). My shared port configuration is the following:

DAEMON_LIST = $(DAEMON_LIST), SHARED_PORT
USE_SHARED_PORT = TRUE
SHARED_PORT_PORT = 9618
COLLECTOR_HOST = $(CONDOR_HOST)
UPDATE_COLLECTOR_WITH_TCP = TRUE

The SCHEDD daemon logs are full of the following:

<Omitted for brevity>

08/09/16 15:38:29 (pid:227709) SharedPortEndpoint: failed to accept
connection on
637e42b175e7a0a6281c03f433343d5911f698ca2da42a477b3b6a4e58d2f771/227681_ce89_3
08/09/16 15:38:29 (pid:227709) SharedPortEndpoint: failed to accept
connection on
637e42b175e7a0a6281c03f433343d5911f698ca2da42a477b3b6a4e58d2f771/227681_ce89_3
08/09/16 15:38:29 (pid:227709) SharedPortEndpoint: failed to accept
connection on
637e42b175e7a0a6281c03f433343d5911f698ca2da42a477b3b6a4e58d2f771/227681_ce89_3
08/09/16 15:38:29 (pid:227709) SharedPortEndpoint: failed to accept
connection on
637e42b175e7a0a6281c03f433343d5911f698ca2da42a477b3b6a4e58d2f771/227681_ce89_3
08/09/16 15:38:29 (pid:227709) MaxLog = 10485760 bytes, length = 10485845
08/09/16 15:38:29 (pid:227709) Saving log file to
"/var/log/condor/SchedLog.old"
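
To put a number on how often this message repeats, and for which named socket, here is a minimal Python sketch. It is not part of HTCondor; the function names are made up for illustration, and it assumes the log uses the "MM/DD/YY HH:MM:SS" timestamp prefix seen above. It also rejoins lines that a mail client has wrapped, as in this excerpt.

```python
import re
import sys
from collections import Counter

# Assumption: every log entry starts with "MM/DD/YY HH:MM:SS ", as in the
# excerpt above. Any other non-empty line is treated as a wrapped continuation.
ENTRY_START = re.compile(r"^\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} ")
ACCEPT_FAILURE = re.compile(
    r"SharedPortEndpoint: failed to accept connection on (\S+)"
)


def unwrap_entries(lines):
    """Join wrapped continuation lines back onto the entry they belong to."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def count_accept_failures(lines):
    """Count 'failed to accept connection' entries per named socket path."""
    counts = Counter()
    for entry in unwrap_entries(lines):
        match = ACCEPT_FAILURE.search(entry)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Usage: python count_failures.py /var/log/condor/SchedLog.old
    with open(sys.argv[1], errors="replace") as handle:
        for socket_path, n in count_accept_failures(handle).most_common():
            print(n, socket_path)
```

Run against the rotated log named above, this prints one line per socket path with its failure count, which shows whether a single endpoint is responsible for filling the log.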

The MasterLog doesn't show the result of the Schedd connectivity issue:

08/09/16 15:38:24 ******************************************************
08/09/16 15:38:24 ** condor_master (CONDOR_MASTER) STARTING UP
08/09/16 15:38:24 ** /usr/sbin/condor_master
08/09/16 15:38:24 ** SubsystemInfo: name=MASTER type=MASTER(2)
class=DAEMON(1)
08/09/16 15:38:24 ** Configuration: subsystem:MASTER local:<NONE>
class:DAEMON
08/09/16 15:38:24 ** $CondorVersion: 8.3.8 Jan 14 2016 BuildID:
RH-8.3.8-1.el7 $
08/09/16 15:38:24 ** $CondorPlatform: X86_64-RedHat_7.2 $
08/09/16 15:38:24 ** PID = 227681
08/09/16 15:38:24 ** Log last touched time unavailable (No such file or
directory)
08/09/16 15:38:24 ******************************************************
08/09/16 15:38:24 Using config source: /etc/condor/condor_config
08/09/16 15:38:24 Using local config sources:
08/09/16 15:38:24    /etc/condor/config.d/00-IERUS_WorkstationNode.conf
08/09/16 15:38:24    /etc/condor/config.d/41-sharedport.conf
08/09/16 15:38:24 config Macros = 114, Sorted = 114, StringBytes = 4901,
TablesBytes = 4160
08/09/16 15:38:24 CLASSAD_CACHING is OFF
08/09/16 15:38:24 Daemon Log is logging: D_ALWAYS D_ERROR
08/09/16 15:38:25 SharedPortEndpoint: waiting for connections to named
socket 227681_ce89
08/09/16 15:38:25 SharedPortEndpoint: failed to open
/var/lock/condor/shared_port_ad: No such file or directory
08/09/16 15:38:25 SharedPortEndpoint: did not successfully find
SharedPortServer address. Will retry in 60s.
08/09/16 15:38:25 DaemonCore: private command socket at
<192.168.6.135:0?sock=227681_ce89>
08/09/16 15:38:25 Master restart (GRACEFUL) is watching
/usr/sbin/condor_master (mtime:1452815958)
08/09/16 15:38:25 Collector port not defined, will use default: 9618
08/09/16 15:38:25 Started DaemonCore process
"/usr/libexec/condor/condor_shared_port", pid and pgroup = 227708
08/09/16 15:38:25 Waiting for /var/lock/condor/shared_port_ad to appear.
08/09/16 15:38:26 Found /var/lock/condor/shared_port_ad.
08/09/16 15:38:26 Started DaemonCore process "/usr/sbin/condor_schedd",
pid and pgroup = 227709
08/09/16 15:38:26 Started DaemonCore process "/usr/sbin/condor_startd",
pid and pgroup = 227710
08/09/16 15:38:26 Started DaemonCore process "/usr/sbin/condor_kbdd",
pid and pgroup = 227711
08/09/16 15:43:30 condor_write(): Socket closed when trying to write
1421 bytes to collector boss.hq.ierustech.com, fd is 12
08/09/16 15:43:30 Buf::write(): condor_write() failed

Where should I start looking to fix this? I am by no means a Condor pro;
I just enjoy it when it works.

Michael McInerny Murphy
Engineer & Physicist
IERUS Technologies, Inc.
2904 Westcorp Blvd. Ste 210
Huntsville, AL  35805
(256) 319-2026 ext 107