
Re: [HTCondor-users] Communication error when trying to add second machine



Hi Zachary,

Indeed - there's a ridiculous number of SELinux issues in the default packaging.  It's being worked on, but it's probably best to do permissive mode for now (I'd recommend disabled if you use Docker universe).

For this error message:

> Error: communication error
> CEDAR:6001:Failed to connect to <10.0.7.10:9618>
> 

I would look at /var/log/condor/CollectorLog on 10.0.7.10 for any hints of a problem.  If there's nothing obvious, try again with D_FULLDEBUG.  Additionally, from your worker, try to connect over TCP directly to the socket to see if that works (telnet 10.0.7.10 9618).  If it doesn't - but the condor_collector process is running - then you might have network firewalls in the way.
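
If telnet isn't installed on the worker, the same TCP check can be sketched in a few lines of Python. This is only a rough illustration using the standard library; the function name can_connect and the 5-second default timeout are my own choices, not anything HTCondor provides.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds.

    Hypothetical helper (not part of HTCondor). It only tests that the
    TCP handshake completes; it does not speak the CEDAR protocol.
    """
    try:
        # create_connection resolves the host and performs the handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, unreachable hosts,
        # and name-resolution failures.
        return False
```

Run from the worker, can_connect("10.0.7.10", 9618) returning False while condor_collector is running on 10.0.7.10 would point toward a firewall or routing problem rather than an HTCondor configuration problem.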

Brian

> On Aug 4, 2016, at 5:32 PM, Hughes, Zachary <zdhughes@xxxxxxxxx> wrote:
> 
> Hi all,
> 
> So I'm working on two CentOS 7 machines and used yum to install condor, following the steps at https://research.cs.wisc.edu/htcondor/yum/ . The first machine (the central manager) seems to be working just fine, condor_status . I used the configuration file located at /etc/condor/condor_config on machine 0 to set it up.
> 
> For the second machine the configuration file is nearly identical (/etc/condor/condor_config on machine 1), but when I start the service:
> 
> [root@herc1 ~]# systemctl start condor.service 
> 
> I get SELinux Alerts:
> 
> ###########################################################################
> SELinux is preventing /usr/bin/bash from write access on the file ip_local_port_range.
> 
> *****  Plugin catchall (100. confidence) suggests   **************************
> 
> If you believe that bash should be allowed write access on the ip_local_port_range file by default.
> Then you should report this as a bug.
> You can generate a local policy module to allow this access.
> Do
> allow this access for now by executing:
> # grep linux_kernel_tu /var/log/audit/audit.log | audit2allow -M mypol
> # semodule -i mypol.pp
> 
> Additional Information:
> Source Context                system_u:system_r:condor_master_t:s0
> Target Context                system_u:object_r:sysctl_net_t:s0
> Target Objects                ip_local_port_range [ file ]
> Source                        linux_kernel_tu
> Source Path                   /usr/bin/bash
> Port                          <Unknown>
> Host                          herc1.lexas
> Source RPM Packages           bash-4.2.46-19.el7.x86_64
> Target RPM Packages           
> Policy RPM                    selinux-policy-3.13.1-60.el7.noarch
> Selinux Enabled               True
> Policy Type                   targeted
> Enforcing Mode                Permissive
> Host Name                     herc1.lexas
> Platform                      Linux herc1.lexas 3.10.0-327.el7.x86_64 #1 SMP Thu
>                              Nov 19 22:10:57 UTC 2015 x86_64 x86_64
> Alert Count                   5
> First Seen                    2016-08-04 16:22:05 CDT
> Last Seen                     2016-08-04 17:04:50 CDT
> Local ID                      a0bb55c9-60a0-442b-8f3a-0ce083c46d22
> 
> Raw Audit Messages
> type=AVC msg=audit(1470348290.276:490): avc:  denied  { write } for  pid=5436 comm="linux_kernel_tu" name="ip_local_port_range" dev="proc" ino=19975 scontext=system_u:system_r:condor_master_t:s0 tcontext=system_u:object_r:sysctl_net_t:s0 tclass=file
> 
> 
> type=SYSCALL msg=audit(1470348290.276:490): arch=x86_64 syscall=open success=yes exit=ESRCH a0=2542a80 a1=241 a2=1b6 a3=fffffff0 items=0 ppid=5433 pid=5436 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm=linux_kernel_tu exe=/usr/bin/bash subj=system_u:system_r:condor_master_t:s0 key=(null)
> 
> Hash: linux_kernel_tu,condor_master_t,sysctl_net_t,file,write
> ###########################################################################
> 
> and condor_status gives:
> 
> Error: communication error
> CEDAR:6001:Failed to connect to <10.0.7.10:9618>
> 
> I've set the SELinux policy to permissive (and also tried disabled), but nothing has changed. Here is my configuration file:
> 
> 
> RELEASE_DIR = /usr
> LOCAL_DIR = /var
> LOCAL_CONFIG_FILE = /etc/condor/condor_config.local
> REQUIRE_LOCAL_CONFIG_FILE = false
> LOCAL_CONFIG_DIR = /etc/condor/config.d
> use SECURITY : HOST_BASED
> 
> RUN     = $(LOCAL_DIR)/run/condor
> LOG     = $(LOCAL_DIR)/log/condor
> LOCK    = $(LOCAL_DIR)/lock/condor
> SPOOL   = $(LOCAL_DIR)/lib/condor/spool
> EXECUTE = $(LOCAL_DIR)/lib/condor/execute
> BIN     = $(RELEASE_DIR)/bin
> LIB = $(RELEASE_DIR)/lib64/condor
> INCLUDE = $(RELEASE_DIR)/include/condor
> SBIN    = $(RELEASE_DIR)/sbin
> LIBEXEC = $(RELEASE_DIR)/libexec/condor
> SHARE   = $(RELEASE_DIR)/share/condor
> 
> PROCD_ADDRESS = $(RUN)/procd_pipe
> JAVA_CLASSPATH_DEFAULT = $(SHARE) $(SHARE)/scimark2lib.jar .
> CONDOR_HOST = herc0.lexas
> DAEMON_LIST = MASTER, SCHEDD, STARTD (+ NEGOTIATOR and COLLECTOR on machine 0)
> UID_DOMAIN		= lexas
> FILESYSTEM_DOMAIN	= lexas
> COLLECTOR_NAME 		= HERC Condor Pool
> CONDOR_IDS=987.982
> ALLOW_READ = herc*.lexas, 10.0.7.*, *.cs.wisc.edu
> ALLOW_WRITE = herc*.lexas, 10.0.7.*
> USE_NFS		= True NEGOTIATOR
> USE_AFS		= False
> LOCK		= $(LOG)
> TRUST_UID_DOMAIN = True
> 
> Does anyone have any ideas?