
Re: [HTCondor-users] SharedPortEndpoint fails to accept connection



Is SELinux enabled?

Because the required SELinux policies are missing, shared port gets blocked by default on CentOS 7.
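
To answer the SELinux question quickly, something like the following may help. It is only a minimal sketch: it assumes selinuxfs is mounted at /sys/fs/selinux (the usual location on CentOS 7), and the function name selinux_mode is made up for illustration, not part of HTCondor or any SELinux library.

```python
from pathlib import Path


def selinux_mode(selinuxfs="/sys/fs/selinux"):
    """Return 'enforcing', 'permissive', or 'disabled'.

    Assumption: selinuxfs lives at /sys/fs/selinux. The 'enforce' file
    there holds "1" when SELinux is enforcing and "0" when permissive.
    """
    enforce_file = Path(selinuxfs) / "enforce"
    try:
        value = enforce_file.read_text().strip()
    except OSError:
        # No readable 'enforce' file: selinuxfs is not mounted here,
        # so treat SELinux as disabled.
        return "disabled"
    return "enforcing" if value == "1" else "permissive"


if __name__ == "__main__":
    print(selinux_mode())
```

If this reports enforcing, it is worth looking for AVC denials in /var/log/audit/audit.log around the time of the SharedPortEndpoint errors, or switching to permissive mode temporarily (setenforce 0) to see whether the errors stop.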


> On Aug 9, 2016, at 3:50 PM, Michael Murphy <Michael.Murphy@xxxxxxxxxxxxx> wrote:
> 
> Hello,
> 
> I am unable to get Condor's shared port to function properly on a
> CentOS 7 client machine (the MASTER, STARTD, SCHEDD, SHARED_PORT, and
> KBDD daemons are active). My shared port configuration is the following:
> 
> DAEMON_LIST = $(DAEMON_LIST), SHARED_PORT
> USE_SHARED_PORT = TRUE
> SHARED_PORT_PORT = 9618
> COLLECTOR_HOST = $(CONDOR_HOST)
> UPDATE_COLLECTOR_WITH_TCP = TRUE
> 
> The SCHEDD daemon logs are full of the following:
> 
> <Omitted for brevity>
> 
> 08/09/16 15:38:29 (pid:227709) SharedPortEndpoint: failed to accept
> connection on
> 637e42b175e7a0a6281c03f433343d5911f698ca2da42a477b3b6a4e58d2f771/227681_ce89_3
> 08/09/16 15:38:29 (pid:227709) SharedPortEndpoint: failed to accept
> connection on
> 637e42b175e7a0a6281c03f433343d5911f698ca2da42a477b3b6a4e58d2f771/227681_ce89_3
> 08/09/16 15:38:29 (pid:227709) SharedPortEndpoint: failed to accept
> connection on
> 637e42b175e7a0a6281c03f433343d5911f698ca2da42a477b3b6a4e58d2f771/227681_ce89_3
> 08/09/16 15:38:29 (pid:227709) SharedPortEndpoint: failed to accept
> connection on
> 637e42b175e7a0a6281c03f433343d5911f698ca2da42a477b3b6a4e58d2f771/227681_ce89_3
> 08/09/16 15:38:29 (pid:227709) MaxLog = 10485760 bytes, length = 10485845
> 08/09/16 15:38:29 (pid:227709) Saving log file to
> "/var/log/condor/SchedLog.old"
> 
> The MasterLog doesn't show the result of the Schedd connectivity issue:
> 
> 08/09/16 15:38:24 ******************************************************
> 08/09/16 15:38:24 ** condor_master (CONDOR_MASTER) STARTING UP
> 08/09/16 15:38:24 ** /usr/sbin/condor_master
> 08/09/16 15:38:24 ** SubsystemInfo: name=MASTER type=MASTER(2)
> class=DAEMON(1)
> 08/09/16 15:38:24 ** Configuration: subsystem:MASTER local:<NONE>
> class:DAEMON
> 08/09/16 15:38:24 ** $CondorVersion: 8.3.8 Jan 14 2016 BuildID:
> RH-8.3.8-1.el7 $
> 08/09/16 15:38:24 ** $CondorPlatform: X86_64-RedHat_7.2 $
> 08/09/16 15:38:24 ** PID = 227681
> 08/09/16 15:38:24 ** Log last touched time unavailable (No such file or
> directory)
> 08/09/16 15:38:24 ******************************************************
> 08/09/16 15:38:24 Using config source: /etc/condor/condor_config
> 08/09/16 15:38:24 Using local config sources:
> 08/09/16 15:38:24    /etc/condor/config.d/00-IERUS_WorkstationNode.conf
> 08/09/16 15:38:24    /etc/condor/config.d/41-sharedport.conf
> 08/09/16 15:38:24 config Macros = 114, Sorted = 114, StringBytes = 4901,
> TablesBytes = 4160
> 08/09/16 15:38:24 CLASSAD_CACHING is OFF
> 08/09/16 15:38:24 Daemon Log is logging: D_ALWAYS D_ERROR
> 08/09/16 15:38:25 SharedPortEndpoint: waiting for connections to named
> socket 227681_ce89
> 08/09/16 15:38:25 SharedPortEndpoint: failed to open
> /var/lock/condor/shared_port_ad: No such file or directory
> 08/09/16 15:38:25 SharedPortEndpoint: did not successfully find
> SharedPortServer address. Will retry in 60s.
> 08/09/16 15:38:25 DaemonCore: private command socket at
> <192.168.6.135:0?sock=227681_ce89>
> 08/09/16 15:38:25 Master restart (GRACEFUL) is watching
> /usr/sbin/condor_master (mtime:1452815958)
> 08/09/16 15:38:25 Collector port not defined, will use default: 9618
> 08/09/16 15:38:25 Started DaemonCore process
> "/usr/libexec/condor/condor_shared_port", pid and pgroup = 227708
> 08/09/16 15:38:25 Waiting for /var/lock/condor/shared_port_ad to appear.
> 08/09/16 15:38:26 Found /var/lock/condor/shared_port_ad.
> 08/09/16 15:38:26 Started DaemonCore process "/usr/sbin/condor_schedd",
> pid and pgroup = 227709
> 08/09/16 15:38:26 Started DaemonCore process "/usr/sbin/condor_startd",
> pid and pgroup = 227710
> 08/09/16 15:38:26 Started DaemonCore process "/usr/sbin/condor_kbdd",
> pid and pgroup = 227711
> 08/09/16 15:43:30 condor_write(): Socket closed when trying to write
> 1421 bytes to collector boss.hq.ierustech.com, fd is 12
> 08/09/16 15:43:30 Buf::write(): condor_write() failed
> 
> Where should I start looking to fix this? I am by no means a Condor pro.
> I just enjoy it when it works.
> --
> 
> Michael McInerny Murphy
> Engineer & Physicist
> IERUS Technologies, Inc.
> 2904 Westcorp Blvd. Ste 210
> Huntsville, AL  35805
> (256) 319-2026 ext 107