
Re: [HTCondor-users] Sock::bind failed ShadowLog



I don't believe this is a permission issue, because the job actually executes fine and runs for hours, and then suddenly I see "CEDAR:6001:Failed to connect to ..."

I am at a loss here.



On Wed, Apr 30, 2014 at 6:48 AM, Keith Brown <keith6014@xxxxxxxxx> wrote:
Has anyone seen this?


On Tue, Apr 29, 2014 at 7:02 AM, Keith Brown <keith6014@xxxxxxxxx> wrote:
On a large pool, I noticed that several of my jobs keep getting rescheduled.

Looking at the ShadowLog, I noticed:

Sock::bind failed: errno = 98 Address already in use
RemoteResource::killStarter(): Could not send command to startd
Sock::bind failed: errno = 98 Address already in use
Can't connect to queue manager: CEDAR:6001:Failed to connect to <scheduler>
Failed to perform final update to job queue!


After checking the mailing list, I have this setting:
NO_DNS = false

Anything I should be looking at?
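
For reference, errno 98 on Linux is EADDRINUSE: the address and port the process tried to bind were already taken. The sketch below is not HTCondor code and does not show what the shadow does internally. It is only a minimal, standalone Python illustration (the function name `bind_twice` is made up for this example) that reproduces the same class of error by binding two sockets to the same local address and port.

```python
import errno
import socket


def bind_twice():
    """Bind two TCP sockets to the same address; return the errno of the second bind.

    Hypothetical helper for illustration only, unrelated to HTCondor internals.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the kernel to pick any free port.
        first.bind(("127.0.0.1", 0))
        first.listen(1)
        port = first.getsockname()[1]
        try:
            # Same address and port, no SO_REUSEADDR: this bind should fail.
            second.bind(("127.0.0.1", port))
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    code = bind_twice()
    # Print the numeric errno and its symbolic name as known to this platform.
    print(code, errno.errorcode.get(code))
```

Whether the shadow hits this error because a specific port is in use or because the machine has run out of available ports cannot be told from the log lines above alone.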