
[HTCondor-users] htcondor/execute container cannot connect to central manager



Hi all,


I have been trying to use the htcondor/execute container to connect to a central manager with a minimal config. After many attempts, the container-based execute node still does not spawn the required processes. Running the execute node as a normal service works, but giving the same config to the htcondor/execute container does not.


Here is the command I use. I gave the container exactly the same config as my working service. Using the examples in https://github.com/htcondor/htcondor/tree/master/build/docker/services also doesn't work. I have tried both the el8 and ubuntu containers; neither works.


docker run --rm --network host --env-file=env --name condor -v /etc/condor:/etc/condor htcondor/execute


cat /etc/redhat-release
AlmaLinux release 9.2 (Turquoise Kodkod)



Here is the log file when using the container:


root@worker0:/var/log/condor# cat StartLog
09/19/23 10:07:25 (D_ALWAYS:2) Result of reading /etc/issue:  Ubuntu 20.04.4 LTS \n \l
 
09/19/23 10:07:25 (D_ALWAYS:2) Using IDs: 1 processors, 1 CPUs, 0 HTs
09/19/23 10:07:25 (D_ALWAYS:2) Reading condor configuration from '/etc/condor/condor_config'
09/19/23 10:07:25 (D_ALWAYS:2) Enumerating interfaces: lo 127.0.0.1 up
09/19/23 10:07:25 (D_ALWAYS:2) Enumerating interfaces: enp0s3 10.0.2.15 up
09/19/23 10:07:25 (D_ALWAYS:2) Enumerating interfaces: enp0s8 192.168.56.101 up
09/19/23 10:07:25 (D_ALWAYS:2) Enumerating interfaces: docker0 172.17.0.1 up
09/19/23 10:07:25 (D_ALWAYS:2) Enumerating interfaces: lo ::1 up
09/19/23 10:07:25 (D_ALWAYS:2) Enumerating interfaces: enp0s3 fe80::a00:27ff:fe5c:373e up
09/19/23 10:07:25 (D_ALWAYS:2) Enumerating interfaces: enp0s8 fe80::4f26:c:cb9d:5de4 up
09/19/23 10:07:25 (D_ALWAYS:2) Enumerating interfaces: docker0 fe80::42:56ff:fe11:aeef up
09/19/23 10:07:25 (D_ALWAYS) ******************************************************
09/19/23 10:07:25 (D_ALWAYS) ** condor_startd (CONDOR_STARTD) STARTING UP
09/19/23 10:07:25 (D_ALWAYS) ** /usr/sbin/condor_startd
09/19/23 10:07:25 (D_ALWAYS) ** SubsystemInfo: name=STARTD type=STARTD(6) class=DAEMON(1)
09/19/23 10:07:25 (D_ALWAYS) ** Configuration: subsystem:STARTD local:<NONE> class:DAEMON
09/19/23 10:07:25 (D_ALWAYS) ** $CondorVersion: 10.1.1 2022-11-10 BuildID: 612938 PackageID: 10.1.1-1.1 RC $
09/19/23 10:07:25 (D_ALWAYS) ** $CondorPlatform: X86_64-Ubuntu_20.04 $
09/19/23 10:07:25 (D_ALWAYS) ** PID = 1
09/19/23 10:07:25 (D_ALWAYS) ** Log last touched time unavailable (No such file or directory)
09/19/23 10:07:25 (D_ALWAYS) ******************************************************
09/19/23 10:07:25 (D_ALWAYS) Using config source: /etc/condor/condor_config
09/19/23 10:07:25 (D_ALWAYS) Using local config sources:
09/19/23 10:07:25 (D_ALWAYS)    /etc/condor/config.d/01-env.conf
09/19/23 10:07:25 (D_ALWAYS)    /etc/condor/config.d/02-execute.config
09/19/23 10:07:25 (D_ALWAYS)    /etc/condor/config.d/10-stash-plugin.conf
09/19/23 10:07:25 (D_ALWAYS)    /etc/condor/condor_config.local
09/19/23 10:07:25 (D_ALWAYS) config Macros = 71, Sorted = 71, StringBytes = 1912, TablesBytes = 2620
09/19/23 10:07:25 (D_ALWAYS) CLASSAD_CACHING is ENABLED
09/19/23 10:07:25 (D_ALWAYS) Daemon Log is logging: D_ALWAYS:2 D_ERROR D_STATUS
09/19/23 10:07:25 (D_ALWAYS:2) Not using shared port because USE_SHARED_PORT=false
09/19/23 10:07:25 (D_ALWAYS) Daemoncore: Listening at <0.0.0.0:44747> on TCP (ReliSock) and UDP (SafeSock).
09/19/23 10:07:25 (D_ALWAYS) DaemonCore: command socket at <192.168.56.101:44747?addrs=192.168.56.101-44747&alias=worker0>
09/19/23 10:07:25 (D_ALWAYS) DaemonCore: private command socket at <192.168.56.101:44747?addrs=192.168.56.101-44747&alias=worker0>
09/19/23 10:07:25 (D_ALWAYS:2) Setting maximum accepts per cycle 8.
09/19/23 10:07:25 (D_ALWAYS:2) Setting maximum UDP messages per cycle 100.
09/19/23 10:07:25 (D_ALWAYS:2) Will use TCP to update collector <192.168.56.1:9618>
09/19/23 10:07:25 (D_ALWAYS:2) Not using shared port because USE_SHARED_PORT=false
09/19/23 10:07:25 (D_ALWAYS:2) Memory: Detected 1024 megs RAM
09/19/23 10:07:25 (D_ALWAYS:2) Found interface enp0s8 that matches <192.168.56.101:0>
09/19/23 10:07:25 (D_ALWAYS:2) Found interface enp0s8 with ip 192.168.56.101
09/19/23 10:07:25 (D_ALWAYS:2) enp0s8 supports Wake-on: no (raw: 0x00)
09/19/23 10:07:25 (D_ALWAYS:2) enp0s8 enabled Wake-on: no (raw: 0x00)
09/19/23 10:07:25 (D_ALWAYS:2) Using network interface enp0s8 for hibernation

====================================================================================================

And here is the log file when using the standard service. The last four lines, from "Initially invoking hibernation plugin" onward, are not written in the container log above, so I suspect something is stuck at this stage.


09/19/23 11:14:11 (D_ALWAYS:2) Result of reading /etc/issue:  \S

09/19/23 11:14:11 (D_ALWAYS:2) Result of reading /etc/redhat-release:  AlmaLinux release 9.2 (Turquoise Kodkod)

09/19/23 11:14:11 (D_ALWAYS:2) Using IDs: 1 processors, 1 CPUs, 0 HTs
09/19/23 11:14:11 (D_ALWAYS:2) Reading condor configuration from '/etc/condor/condor_config'
09/19/23 11:14:11 (D_ALWAYS:2) Enumerating interfaces: lo 127.0.0.1 up
09/19/23 11:14:11 (D_ALWAYS:2) Enumerating interfaces: enp0s3 10.0.2.15 up
09/19/23 11:14:11 (D_ALWAYS:2) Enumerating interfaces: enp0s8 192.168.56.101 up
09/19/23 11:14:11 (D_ALWAYS:2) Enumerating interfaces: docker0 172.17.0.1 up
09/19/23 11:14:11 (D_ALWAYS:2) Enumerating interfaces: lo ::1 up
09/19/23 11:14:11 (D_ALWAYS:2) Enumerating interfaces: enp0s3 fe80::a00:27ff:fe5c:373e up
09/19/23 11:14:11 (D_ALWAYS:2) Enumerating interfaces: enp0s8 fe80::4f26:c:cb9d:5de4 up
09/19/23 11:14:11 (D_ALWAYS:2) Enumerating interfaces: docker0 fe80::42:56ff:fe11:aeef up
09/19/23 11:14:11 (D_ALWAYS) ******************************************************
09/19/23 11:14:11 (D_ALWAYS) ** condor_startd (CONDOR_STARTD) STARTING UP
09/19/23 11:14:11 (D_ALWAYS) ** /usr/sbin/condor_startd
09/19/23 11:14:11 (D_ALWAYS) ** SubsystemInfo: name=STARTD type=STARTD(6) class=DAEMON(1)
09/19/23 11:14:11 (D_ALWAYS) ** Configuration: subsystem:STARTD local:<NONE> class:DAEMON
09/19/23 11:14:11 (D_ALWAYS) ** $CondorVersion: 10.7.0 2023-07-31 BuildID: 665155 PackageID: 10.7.0-1 $
09/19/23 11:14:11 (D_ALWAYS) ** $CondorPlatform: x86_64_AlmaLinux9 $
09/19/23 11:14:11 (D_ALWAYS) ** PID = 334599
09/19/23 11:14:11 (D_ALWAYS) ** Log last touched time unavailable (No such file or directory)
09/19/23 11:14:11 (D_ALWAYS) ******************************************************
09/19/23 11:14:11 (D_ALWAYS) Using config source: /etc/condor/condor_config
09/19/23 11:14:11 (D_ALWAYS) Using local config sources:
09/19/23 11:14:11 (D_ALWAYS)    /etc/condor/config.d/01-env.conf
09/19/23 11:14:11 (D_ALWAYS)    /etc/condor/config.d/02-execute.config
09/19/23 11:14:11 (D_ALWAYS)    /etc/condor/config.d/10-stash-plugin.conf
09/19/23 11:14:11 (D_ALWAYS)    /etc/condor/condor_config.local
09/19/23 11:14:11 (D_ALWAYS) config Macros = 73, Sorted = 73, StringBytes = 2019, TablesBytes = 2692
09/19/23 11:14:11 (D_ALWAYS) CLASSAD_CACHING is ENABLED
09/19/23 11:14:11 (D_ALWAYS) Daemon Log is logging: D_ALWAYS:2 D_ERROR D_STATUS
09/19/23 11:14:11 (D_ALWAYS:2) Internal pipe for signals resized to 4096 from 65536
09/19/23 11:14:11 (D_ALWAYS:2) Not using shared port because USE_SHARED_PORT=false
09/19/23 11:14:11 (D_ALWAYS) Daemoncore: Listening at <0.0.0.0:33407> on TCP (ReliSock) and UDP (SafeSock).
09/19/23 11:14:11 (D_ALWAYS) DaemonCore: command socket at <192.168.56.101:33407?addrs=192.168.56.101-33407&alias=wor>
09/19/23 11:14:11 (D_ALWAYS) DaemonCore: private command socket at <192.168.56.101:33407?addrs=192.168.56.101-33407&a>
09/19/23 11:14:11 (D_ALWAYS:2) Setting maximum accepts per cycle 8.
09/19/23 11:14:11 (D_ALWAYS:2) Setting maximum UDP messages per cycle 100.
09/19/23 11:14:11 (D_ALWAYS:2) Will use TCP to update collector <192.168.56.1:9618>
09/19/23 11:14:11 (D_ALWAYS:2) Not using shared port because USE_SHARED_PORT=false
09/19/23 11:14:11 (D_ALWAYS:2) Memory: Detected 1024 megs RAM
09/19/23 11:14:11 (D_ALWAYS:2) Found interface enp0s8 that matches <192.168.56.101:0>
09/19/23 11:14:11 (D_ALWAYS:2) Found interface enp0s8 with ip 192.168.56.101
09/19/23 11:14:11 (D_ALWAYS:2) enp0s8 supports Wake-on: yes (raw: 0x2e)
09/19/23 11:14:11 (D_ALWAYS:2) enp0s8 enabled Wake-on: no (raw: 0x00)
09/19/23 11:14:11 (D_ALWAYS:2) Using network interface enp0s8 for hibernation
09/19/23 11:14:11 (D_ALWAYS:2) Initially invoking hibernation plugin '/usr/libexec/condor/condor_power_state ad'
09/19/23 11:14:11 (D_ALWAYS:2) Detected hibernation states: S3,S4,S5
09/19/23 11:14:18 (D_ALWAYS) VM universe will be tested to check if it is available
09/19/23 11:14:18 (D_ALWAYS) History file rotation is enabled.
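
For anyone who wants to compare the two logs mechanically rather than by eye, here is a minimal Python sketch. It is not an HTCondor tool; the function names (normalize, missing_messages) are hypothetical, and it assumes every log line starts with the "MM/DD/YY HH:MM:SS (D_CATEGORY)" prefix seen above. It strips that prefix, masks digit runs (ports, PIDs, counters) that differ between any two runs, and reports the messages that appear in one log but never in the other.

```python
import re

# Matches the "MM/DD/YY HH:MM:SS (D_CATEGORY[:level]) " prefix of a log line.
PREFIX = re.compile(r"^\d{1,2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} \([A-Z_]+(?::\d+)?\)\s*")
# Digit runs (ports, PIDs, byte counts) legitimately differ between runs.
VOLATILE = re.compile(r"\d+")


def normalize(line):
    """Strip the timestamp/category prefix and mask every digit run with 'N'."""
    return VOLATILE.sub("N", PREFIX.sub("", line.strip()))


def missing_messages(reference_log, other_log):
    """Return messages of reference_log whose normalized form is absent from other_log."""
    seen = {normalize(l) for l in other_log.splitlines() if l.strip()}
    missing = []
    for l in reference_log.splitlines():
        if l.strip() and normalize(l) not in seen:
            # Report the original message text, without the prefix.
            missing.append(PREFIX.sub("", l.strip()))
    return missing


if __name__ == "__main__":
    # Short excerpts taken from the two StartLog files quoted in this message.
    service = (
        "09/19/23 11:14:11 (D_ALWAYS:2) Using network interface enp0s8 for hibernation\n"
        "09/19/23 11:14:11 (D_ALWAYS:2) Initially invoking hibernation plugin "
        "'/usr/libexec/condor/condor_power_state ad'\n"
        "09/19/23 11:14:11 (D_ALWAYS:2) Detected hibernation states: S3,S4,S5\n"
    )
    container = (
        "09/19/23 10:07:25 (D_ALWAYS:2) Using network interface enp0s8 for hibernation\n"
    )
    for message in missing_messages(service, container):
        print(message)
```

Run on the full files, this also flags lines that differ only in wording or values (for example the Wake-on and CondorVersion lines), so the output is a starting point for inspection, not a diagnosis.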

Kind regards,
Reza