Re: [HTCondor-users] SharedPortEndpoint: failed to bind to /var/lock/condor/daemon_sock/25689_90ae: Permission denied



Hello,

I've fixed the issue.

For some reason the owner and group of the /var/lock/condor directory had been changed to root.

Changing the ownership back to the condor user and group and restarting was enough for the affected worker nodes to register in the pool.
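
For anyone hitting the same error, the check and fix could look roughly like the sketch below. It assumes GNU stat and that the daemons run as a user and group both named condor; the helper names owner_of and check_owner are made up for illustration, and the restart command is a guess that depends on the init system.

```shell
#!/bin/sh
# Print "owner:group" for a path (GNU stat syntax; assumed platform).
owner_of() {
    stat -c '%U:%G' "$1"
}

# Succeed only if the path is owned by the expected user and group.
check_owner() {
    [ "$(owner_of "$1")" = "$2:$3" ]
}

# Example, to be run as root on the affected node (path taken from the log):
#   check_owner /var/lock/condor condor condor \
#       || chown -R condor:condor /var/lock/condor
#   service condor restart   # assumption: service name/command may differ
```

Running owner_of on /var/lock/condor and /var/lock/condor/daemon_sock first shows whether ownership is actually the problem before anything is changed.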

Thanks,

Iain

From: HTCondor-users [htcondor-users-bounces@xxxxxxxxxxx] on behalf of Iain Bradford Steers [iain.steers@xxxxxxx]
Sent: 10 March 2015 08:45
To: htcondor-users@xxxxxxxxxxx
Subject: [HTCondor-users] SharedPortEndpoint: failed to bind to /var/lock/condor/daemon_sock/25689_90ae: Permission denied

Hi,

I noticed some of my worker nodes never showed up in condor_status after creating them.

Running pstree on the nodes showed that the startd wasn't running. I tried to start it by hand and got the following:

~]# condor_startd
03/10/15 08:38:05 Can't open "/var/log/condor/StartLog"
ERROR "Cannot open log file '/var/log/condor/StartLog'" at line 208 in file /slots/01/dir_21000/userdir/src/condor_utils/dprintf_setup.cpp

So I temporarily renamed the log file, and I'm now getting the following in the new StartLog:

03/10/15 08:24:38 ERROR: SharedPortEndpoint: failed to bind to /var/lock/condor/daemon_sock/25689_90ae: Permission denied
03/10/15 08:24:38 ERROR "Failed to start local listener (USE_SHARED_PORT=true)" at line 2897 in file /slots/01/dir_21000/userdir/src/condor_daemon_core.V6/daemon_core.cpp

I'm using Puppet to configure HTCondor, so the configuration should not differ between the worker nodes that joined successfully and this one.

Regards,

Iain