Re: [HTCondor-users] Failing to start: Debian 8 (assertion error)



Additional info. On the Debian 8 server (which fails):

[/var/log/condor/SharedPortLog]
01/04/16 21:23:33 Using config source: /etc/condor/condor_config
01/04/16 21:23:33 Using local config sources:
01/04/16 21:23:33    /etc/condor/condor_config.local
01/04/16 21:23:33 config Macros = 77, Sorted = 77, StringBytes = 2146, TablesBytes = 2820
01/04/16 21:23:33 CLASSAD_CACHING is ENABLED
01/04/16 21:23:33 Daemon Log is logging: D_ALWAYS D_ERROR
01/04/16 21:23:33 DaemonCore: command socket at <192.168.5.11:9618?noUDP>
01/04/16 21:23:33 DaemonCore: private command socket at <192.168.5.11:9618>
01/04/16 21:23:33 main_init() called
01/04/16 21:23:33 About to update statistics in shared_port daemon ad file at /var/lock/condor/shared_port_ad :
ForkedChildrenPeak = 0
RequestsBlocked = 0
ForkedChildrenCurrent = 0
RequestsSucceeded = 0
RequestsPendingPeak = 0
RequestsPendingCurrent = 0
MyAddress = "<192.168.5.11:9618?noUDP>"
RequestsFailed = 0
CurrentTime = time()
01/04/16 21:23:33 SharedPortServer: failed to connect to /var/lock/condor/daemon_sock/3708_2819 as requested by <192.168.5.11:34288>: No such file or directory (err=2)
01/04/16 21:23:33 ChildAliveMsg: failed to send DC_CHILDALIVE to parent daemon at <192.168.5.11:0> (try 1 of 3): CEDAR:6001:Failed to connect to <192.168.5.11:0?sock=3708_2819>
01/04/16 21:23:33 SharedPortServer: failed to connect to /var/lock/condor/daemon_sock/3708_2819 as requested by <192.168.5.11:44882>: No such file or directory (err=2)
01/04/16 21:23:33 ChildAliveMsg: failed to send DC_CHILDALIVE to parent daemon at <192.168.5.11:0> (try 2 of 3): CEDAR:6001:Failed to connect to <192.168.5.11:0?sock=3708_2819>|CEDAR:6001:Failed to connect to <192.168.5.11:0?sock=3708_2819>
01/04/16 21:23:33 SharedPortServer: failed to connect to /var/lock/condor/daemon_sock/3708_2819 as requested by <192.168.5.11:34435>: No such file or directory (err=2)
01/04/16 21:23:33 ChildAliveMsg: failed to send DC_CHILDALIVE to parent daemon at <192.168.5.11:0> (try 3 of 3): CEDAR:6001:Failed to connect to <192.168.5.11:0?sock=3708_2819>|CEDAR:6001:Failed to connect to <192.168.5.11:0?sock=3708_2819>|CEDAR:6001:Failed to connect to <192.168.5.11:0?sock=3708_2819>
01/04/16 21:23:33 ERROR "FAILED TO SEND INITIAL KEEP ALIVE TO OUR PARENT <192.168.5.11:0?sock=3708_2819>" at line 9474 in file /home/matth/condor_temp/htcondor/src/condor_daemon_core.V6/daemon_core.cpp


But on the Ubuntu 14.04 server (which works):

[/var/log/condor/SharedPortLog]
01/04/16 16:24:13 Using config source: /etc/condor/condor_config
01/04/16 16:24:13 Using local config sources:
01/04/16 16:24:13    /etc/condor/condor_config.local
01/04/16 16:24:13 config Macros = 77, Sorted = 77, StringBytes = 2146, TablesBytes = 2820
01/04/16 16:24:13 CLASSAD_CACHING is ENABLED
01/04/16 16:24:13 Daemon Log is logging: D_ALWAYS D_ERROR
01/04/16 16:24:13 Daemoncore: Listening at <0.0.0.0:9618> on TCP (ReliSock).
01/04/16 16:24:13 DaemonCore: command socket at <192.168.5.41:9618?addrs=192.168.5.41-9618&noUDP>
01/04/16 16:24:13 DaemonCore: private command socket at <192.168.5.41:9618?addrs=192.168.5.41-9618>
01/04/16 16:24:13 main_init() called
01/04/16 16:24:13 About to update statistics in shared_port daemon ad file at /var/lock/condor/shared_port_ad :
ForkedChildrenPeak = 0
RequestsBlocked = 0
ForkedChildrenCurrent = 0
RequestsSucceeded = 0
RequestsPendingPeak = 0
RequestsPendingCurrent = 0
RequestsFailed = 0
SharedPortCommandSinfuls = "<192.168.5.41:9618>"
MyAddress = "<192.168.5.41:9618?addrs=192.168.5.41-9618&noUDP>"
01/04/16 16:29:13 About to update statistics in shared_port daemon ad file at /var/lock/condor/shared_port_ad :
ForkedChildrenPeak = 0
RequestsBlocked = 0
ForkedChildrenCurrent = 0
RequestsSucceeded = 2
RequestsPendingPeak = 1
RequestsPendingCurrent = 0
RequestsFailed = 0
SharedPortCommandSinfuls = "<192.168.5.41:9618>"
MyAddress = "<192.168.5.41:9618?addrs=192.168.5.41-9618&noUDP>"
01/04/16 16:34:13 About to update statistics in shared_port daemon ad file at /var/lock/condor/shared_port_ad :
... etc


Both machines have created /var/lock/condor/shared_port_ad:

[Debian]
ForkedChildrenPeak = 0
RequestsBlocked = 0
ForkedChildrenCurrent = 0
RequestsSucceeded = 0
RequestsPendingPeak = 0
RequestsPendingCurrent = 0
MyAddress = "<192.168.5.11:9618?noUDP>"
RequestsFailed = 0
CurrentTime = time()

[Ubuntu]
ForkedChildrenPeak = 0
RequestsBlocked = 0
ForkedChildrenCurrent = 0
RequestsSucceeded = 145479
RequestsPendingPeak = 69
RequestsPendingCurrent = 4
RequestsFailed = 0
SharedPortCommandSinfuls = "<192.168.5.41:9618>"
MyAddress = "<192.168.5.41:9618?addrs=192.168.5.41-9618&noUDP>"
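
The two ad files differ in which attributes they carry, not only in the counter values. A minimal sketch for listing the attribute names that appear in only one of them, assuming each ad is plain "Name = value" lines as pasted above (parse_ad and diff_ad_names are illustrative helper names, not part of HTCondor):

```python
def parse_ad(text):
    """Parse 'Name = value' lines into a dict; lines without '=' are skipped."""
    ad = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        # Split on the first '=' only, since values may contain '=' themselves.
        name, _, value = line.partition("=")
        ad[name.strip()] = value.strip()
    return ad


def diff_ad_names(left, right):
    """Return (names only in left, names only in right), each sorted."""
    return sorted(set(left) - set(right)), sorted(set(right) - set(left))


# Ad contents copied from the two shared_port_ad files quoted in this message.
DEBIAN_AD = """\
ForkedChildrenPeak = 0
RequestsBlocked = 0
ForkedChildrenCurrent = 0
RequestsSucceeded = 0
RequestsPendingPeak = 0
RequestsPendingCurrent = 0
MyAddress = "<192.168.5.11:9618?noUDP>"
RequestsFailed = 0
CurrentTime = time()
"""

UBUNTU_AD = """\
ForkedChildrenPeak = 0
RequestsBlocked = 0
ForkedChildrenCurrent = 0
RequestsSucceeded = 145479
RequestsPendingPeak = 69
RequestsPendingCurrent = 4
RequestsFailed = 0
SharedPortCommandSinfuls = "<192.168.5.41:9618>"
MyAddress = "<192.168.5.41:9618?addrs=192.168.5.41-9618&noUDP>"
"""

only_debian, only_ubuntu = diff_ad_names(parse_ad(DEBIAN_AD), parse_ad(UBUNTU_AD))
print("only in Debian ad:", only_debian)
print("only in Ubuntu ad:", only_ubuntu)
```

On the ads quoted here this reports CurrentTime as present only on Debian and SharedPortCommandSinfuls as present only on Ubuntu. Whether that difference is related to the failure is not something the comparison itself can show.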

*Neither* machine has a directory /var/lock/condor/daemon_sock/, which the error message refers to.

Regards,

Brian.