
Re: [Condor-users] Fw: Problems about condor slots




hailong.yang1115 wrote:
> 1. The number of slots on some nodes in the Condor pool does not match
> the number of logical CPU cores shown in /proc/cpuinfo. For node9,
> condor_status reports 6 slots, while /proc/cpuinfo shows 4 logical CPU
> cores.

Check the output of the following commands on the machine where you see
this problem:

condor_config_val -v NUM_CPUS

condor_config_val DETECTED_CORES

condor_config_val -v COUNT_HYPERTHREAD_CPUS
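
To compare those values against what the kernel reports, something like
the following might help. It is only a rough sketch: the function names
are made up for illustration, and it assumes a Linux /proc/cpuinfo that
includes "physical id" and "core id" lines (some VMs and non-x86
machines omit them, in which case the physical core count prints as 0).

```shell
#!/bin/sh
# Hypothetical helpers (not part of Condor) for comparing kernel CPU
# counts with the values Condor is configured to use.

# Count logical processors listed in a cpuinfo-format file.
count_logical_cpus() {
    grep -c '^processor[[:space:]]*:' "${1:-/proc/cpuinfo}"
}

# Count distinct (physical id, core id) pairs, i.e. physical cores.
# Prints 0 if the file has no "core id" lines.
count_physical_cores() {
    awk -F': *' '
        /^physical id/ { pkg = $2 }
        /^core id/     { seen[pkg "/" $2] = 1 }
        END            { n = 0; for (k in seen) n++; print n }
    ' "${1:-/proc/cpuinfo}"
}

# Print the kernel's view next to Condor's view, if Condor is on PATH.
report_cpu_counts() {
    echo "logical CPUs (kernel):   $(count_logical_cpus)"
    echo "physical cores (kernel): $(count_physical_cores)"
    if command -v condor_config_val >/dev/null 2>&1; then
        for knob in NUM_CPUS DETECTED_CORES COUNT_HYPERTHREAD_CPUS; do
            echo "$knob: $(condor_config_val "$knob" 2>&1)"
        done
    else
        echo "condor_config_val not found on PATH"
    fi
}
```

Run report_cpu_counts on the affected node. If NUM_CPUS comes back as 6
there, the slot count is most likely being set in the configuration
rather than detected from the hardware.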


> 2. After installing Condor on some nodes, we started condor_master but
> nothing happened. We checked the MasterLog file, which gave the
> following error:
> 12/27 10:48:41 ERROR "can't
> safe_open_wrapper(/tmp/condor-lock.ddgrid0.745993478763015/InstanceLock,O_WRONLY|O_CREAT|O_APPEND
> ,S_IRUSR|S_IWUSR) - errno 2" at line 946 in file master.cpp

errno 2 is ENOENT (no such file or directory), so I'm guessing that your
LOCK directory /tmp/condor-lock.ddgrid0.745993478763015 has been deleted.
Running condor_init should recreate it. However, I would recommend
reconfiguring LOCK to be somewhere else--not in /tmp--so it doesn't get
accidentally deleted again in the future.
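
A quick way to confirm the guess is to check whether the configured LOCK
directory still exists. The helper below is a hypothetical sketch, not a
Condor tool. It only tests the path it is given and flags anything under
/tmp.

```shell
#!/bin/sh
# Hypothetical helper: report whether a directory exists and whether it
# lives under /tmp, where periodic cleanup jobs may remove it.
check_lock_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    case "$dir" in
        /tmp/*) echo "exists, but under /tmp: $dir" ;;
        *)      echo "ok: $dir" ;;
    esac
}

# Usage on a Condor node (requires condor_config_val on PATH):
#   check_lock_dir "$(condor_config_val LOCK)"
```

If it reports "missing", that matches the errno 2 in your MasterLog.
After pointing LOCK at a directory outside /tmp, restart condor_master
so the new setting takes effect.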

--Dan