[HTCondor-users] SharedPortEndpoint error in dag.dagman.out



Hi all,

Since upgrading to HTCondor 8.6.0, users have been reporting the following error in their dag.dagman.out files:

02/10/17 13:20:41 SharedPortEndpoint: failed to open ./shared_port_ad: No such file or directory
02/10/17 13:20:41 SharedPortEndpoint: did not successfully find SharedPortServer address. Will retry in 60s.
02/10/17 13:21:41 SharedPortEndpoint: failed to open ./shared_port_ad: No such file or directory
02/10/17 13:21:41 SharedPortEndpoint: did not successfully find SharedPortServer address. Will retry in 60s.

I see some discussion of this error in the list archives for the regular daemons, but nothing for DAGMan.

The first occurrence is at the top of the log, right after DAGMan starts up, and it then repeats every 60 seconds:

02/08/17 11:34:32 ******************************************************
02/08/17 11:34:32 ** condor_scheduniv_exec.5452054.0 (CONDOR_DAGMAN) STARTING UP
02/08/17 11:34:32 ** /usr/bin/condor_dagman
02/08/17 11:34:32 ** SubsystemInfo: name=DAGMAN type=DAGMAN(10) class=DAEMON(1)
02/08/17 11:34:32 ** Configuration: subsystem:DAGMAN local:<NONE> class:DAEMON
02/08/17 11:34:32 ** $CondorVersion: 8.6.0 Jan 26 2017 BuildID: 395190 $
02/08/17 11:34:32 ** $CondorPlatform: x86_64_RedHat7 $
02/08/17 11:34:32 ** PID = 1344275
02/08/17 11:34:32 ** Log last touched 2/8 11:03:46
02/08/17 11:34:32 ******************************************************
02/08/17 11:34:32 Using config source: /etc/condor/condor_config
02/08/17 11:34:32 Using local config sources: 
02/08/17 11:34:32    /etc/condor/config.d/00_gwms_general.config
02/08/17 11:34:32    /etc/condor/config.d/02_gwms_schedds.config
02/08/17 11:34:32    /etc/condor/config.d/03_gwms_local.config
02/08/17 11:34:32    /etc/condor/config.d/90_gwms_dns.config
02/08/17 11:34:32    /etc/condor/config.d/92_flocking_osg_ligo.config
02/08/17 11:34:32    /etc/condor/config.d/99_gratia-gwms.conf
02/08/17 11:34:32    /etc/condor/config.d/99_gratia.conf
02/08/17 11:34:32    /etc/condor/condor_config.local
02/08/17 11:34:32 config Macros = 170, Sorted = 170, StringBytes = 8181, TablesBytes = 6224
02/08/17 11:34:32 CLASSAD_CACHING is ENABLED
02/08/17 11:34:32 Daemon Log is logging: D_ALWAYS D_ERROR
02/08/17 11:34:32 DaemonCore: No command port requested.
02/08/17 11:34:32 SharedPortEndpoint: waiting for connections to named socket 1344275_18a2
02/08/17 11:34:32 SharedPortEndpoint: failed to open ./shared_port_ad: No such file or directory
02/08/17 11:34:32 SharedPortEndpoint: did not successfully find SharedPortServer address. Will retry in 60s.

Any ideas?

Cheers,
Duncan.

-- 

Duncan Brown                         http://dbrown10.expressions.syr.edu
Charles Brightman Professor of Physics     Room 263-1 Physics Department
Director of the Graduate Program      Syracuse University, NY 13244, USA
Phone: 315 443 5993                                    Fax: 315 443 9103