Re: [Condor-users] watchdog pipe file missing
- Date: Wed, 28 Jan 2009 13:53:28 -0600
- From: Greg Quinn <gquinn@xxxxxxxxxxx>
- Subject: Re: [Condor-users] watchdog pipe file missing
The "watchdog" pipe is created by the ProcD when it starts up, and is
only ever deleted by Condor when the ProcD shuts down.
Is it possible that something outside of Condor is deleting the pipe? We
have seen problems like this before with programs like tmpwatch
(although I guess it's doubtful that tmpwatch is running over your
Come to think of it, /home/condor/hosts/wolf10/log sounds like it could
be on NFS. It's perfectly fine to have your LOG directory on NFS, but in
that case you must have a separate local LOCK directory (where things
like the ProcD's pipes are stored). Please make sure that your LOCK
setting refers to a local directory.
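
To make that check concrete, here is a rough Python sketch (not part of
Condor) that reports which filesystem type a directory lives on and
whether a given path is a named pipe. It assumes a Linux host with
/proc/mounts; the helper names fs_type and is_fifo are made up for this
example, and the paths you pass in are whatever your own LOG and LOCK
settings expand to.

```python
import os
import stat
import sys


def is_fifo(path):
    """Return True if path exists and is a named pipe (FIFO)."""
    try:
        return stat.S_ISFIFO(os.stat(path).st_mode)
    except OSError:
        # Missing file, permission problem, etc.
        return False


def fs_type(path, mounts_text):
    """Return the filesystem type of the longest mount point containing path.

    mounts_text is text in /proc/mounts format (device, mount point, type, ...).
    Symlinks are not resolved here; pass os.path.realpath(path) if needed.
    Mount points containing escaped characters (e.g. \\040) are not decoded.
    """
    path = os.path.normpath(path)
    best_len, best_type = -1, None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # A path is under a mount point if it equals it or starts with "<mp>/".
        if (path + "/").startswith(prefix) and len(mount_point) > best_len:
            best_len, best_type = len(mount_point), fstype
    return best_type


if __name__ == "__main__":
    # Usage (hypothetical): python check_lock.py <dir-or-pipe> [...]
    with open("/proc/mounts") as f:
        mounts = f.read()
    for p in sys.argv[1:]:
        print(p, "fstype:", fs_type(os.path.realpath(p), mounts),
              "fifo:", is_fifo(p))
```

If the directory your LOCK setting points at reports an nfs type, set
LOCK to a directory on a local disk in the Condor configuration and
restart Condor on that node.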
Fernando Rannou wrote:
I'm getting the following error in one of the StarterLog files:
1/28 11:20:04 About to exec /home/mpetct/sampproc --universal
1/28 11:20:04 error opening watchdog pipe
/home/condor/hosts/wolf10/log/procd_pipe.STARTD.watchdog: No such file
or directory (2)
1/28 11:20:04 ProcFamilyClient: error initializing LocalClient
1/28 11:20:04 ProcFamilyProxy: error initializing ProcFamilyClient
1/28 11:20:04 ERROR "ProcD has failed" at line 599 in file
1/28 11:20:04 ShutdownFast all jobs.
Clearly the "pipe" files are not there. What should I do?
We restarted condor on all nodes but the files did not appear.
This has happened on a couple of nodes. All other nodes do have the
prw-rw---- 1 root isl 0 Nov 4 16:08 procd_pipe.STARTD
prw-rw---- 1 root isl 0 Nov 4 16:08