
Re: [Condor-users] Could not get other processes started except condor_master



Yup. Given that errno 37 is ENOLCK (No locks available), it is possible that /home is mounted in a way that does not allow file locking. A quick search for ENOLCK suggests this sometimes happens with NFS mounts.
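
One way to check this is to try taking a lock on a scratch file in the suspect directory. The sketch below is only an illustration: the function name can_lock is made up here, and it assumes an fcntl-style lock behaves like whatever call condor_master makes, which this thread does not confirm.

```python
import errno
import fcntl
import os
import tempfile


def can_lock(directory):
    """Try an exclusive, non-blocking fcntl lock on a scratch file.

    Returns (True, None) if the lock was granted, otherwise
    (False, <errno name>), e.g. (False, "ENOLCK").
    """
    # Hypothetical helper: creates and removes its own temporary file,
    # so it needs write permission in `directory`.
    fd, path = tempfile.mkstemp(dir=directory, prefix="locktest.")
    try:
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            return False, errno.errorcode.get(exc.errno, str(exc.errno))
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True, None
    finally:
        os.close(fd)
        os.unlink(path)
```

Calling can_lock("/home/863/condor-7.4.2/local.n3/lock") on the affected node and getting (False, "ENOLCK") would point at the mount rather than at Condor.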

Best,


matt

On 01/06/2011 12:17 AM, hailong.yang1115 wrote:
Dear Matthew,
I have removed all the files under the lock directory
"/home/863/condor-7.4.2/local.n3/lock/", but the problem remains.
Any ideas?
Hailong
2011-01-06
------------------------------------------------------------------------
***********************************************
* Hailong Yang, PhD. Candidate
* Sino-German Joint Software Institute,
* School of Computer Science&Engineering, Beihang University
* Phone: (86-010)82315908
* Email: hailong.yang1115@xxxxxxxxx
* Address: G413, New Main Building in Beihang University,
* No.37 XueYuan Road,HaiDian District,
* Beijing,P.R.China,100191
***********************************************
------------------------------------------------------------------------
*From:* Matthew Farrellee
*Sent:* 2011-01-06 06:52:59
*To:* Condor-Users Mail List
*Cc:* hailong.yang1115
*Subject:* Re: [Condor-users] Could not get other processes started except condor_master
On 12/31/2010 07:22 AM, hailong.yang1115 wrote:
 > Dear all,
 > When I started the condor_master process, no other processes were
 > started, and condor_master exited after a while. I also got the
 > following error in the MasterLog:
 > 12/31 19:51:43 ******************************************************
 > 12/31 19:51:43 ** condor_master (CONDOR_MASTER) STARTING UP
 > 12/31 19:51:43 ** /home/863/condor-7.4.2/sbin/condor_master
 > 12/31 19:51:43 ** SubsystemInfo: name=MASTER type=MASTER(2) class=DAEMON(1)
 > 12/31 19:51:43 ** Configuration: subsystem:MASTER local:<NONE> class:DAEMON
 > 12/31 19:51:43 ** $CondorVersion: 7.4.2 Mar 29 2010 BuildID: 227044 $
 > 12/31 19:51:43 ** $CondorPlatform: I386-LINUX_RHEL5 $
 > 12/31 19:51:43 ** PID = 20607
 > 12/31 19:51:43 ** Log last touched time unavailable (No such file or directory)
 > 12/31 19:51:43 ******************************************************
 > 12/31 19:51:43 Using config source: /home/863/condor-7.4.2/etc/condor_config
 > 12/31 19:51:43 Using local config sources:
 > 12/31 19:51:43 /home/863/condor-7.4.2/local.n3/condor_config.local
 > 12/31 19:52:13 FileLock::obtain(1) failed - errno 37 (No locks available)
 > 12/31 19:52:13 ERROR "Can't get lock on "/home/863/condor-7.4.2/local.n3/lock/InstanceLock"" at line 956 in file master.cpp
 > The InstanceLock file did exist and had the following permissions:
 > -rw------- 1 863 863 0 May 30 2010 InstanceLock
 > Any clue?
 > Hailong
 > 2010-12-31
The condor_master maintains an "instance lock" to prevent you from
accidentally running it multiple times. It may be that the master exited
and didn't clean up the lock. If you have no master running, try
removing the file manually.
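
For illustration, the general instance-lock pattern looks roughly like the sketch below. This is not Condor's code: acquire_instance_lock is a hypothetical name, the lock file lives in a temporary directory, and flock is used only so that the conflict is visible inside a single process.

```python
import fcntl
import os
import tempfile


def acquire_instance_lock(path):
    """Open `path` and take an exclusive, non-blocking lock on it.

    Returns the open file descriptor on success, or None if the lock
    is already held. Other errors (for example ENOLCK) propagate.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd  # the lock is held for as long as this descriptor stays open


# Hypothetical lock file in a scratch directory, not the real Condor path.
lock_path = os.path.join(tempfile.mkdtemp(), "InstanceLock")
first = acquire_instance_lock(lock_path)   # first "instance" gets the lock
second = acquire_instance_lock(lock_path)  # second attempt should be refused
```

On a local Linux file system the first call is expected to return a descriptor and the second to return None, which is the "already running" case.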
Best,
matt