[HTCondor-users] Grid Computing, resource is still down



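Hello everyone,

I have recently run into a problem while using HTCondor. I am trying to set up a grid computing environment by following Chapter 5 of the manual. I have a local host, 181, and a second host, 188, on which a Condor pool is already set up. After I submit a job, it never runs.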

Here is part of the submit description file:
  universe                               = grid
  executable                            = /data/condor_test/CondorTest.class
  input                                    = /data/condor_test/list.txt
  arguments                            = CondorTest 181937_2014-05-14_162956.mp4
  log                                       = /data/condor_test/condor.log
  error                                    = /data/condor_test/condor.error
  grid_resource                       = condor zhanglei@xxxxxxxxxxx CPLiJian
  +remote_universe                 = 10
  +remote_requirements          = True
  +remote_ShouldTransferFiles = 'YES'
  queue

Contents of condor.log:
020 (034.000.000) 06/16 15:16:05 Detected Down Globus Resource
    RM-Contact: zhanglei@xxxxxxxxxxx
...
026 (034.000.000) 06/16 15:16:05 Detected Down Grid Resource
    GridResource: condor zhanglei@xxxxxxxxxxx CPLiJian
...
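
To see how often the job log records these events, they can be pulled out with a short script. This is only a minimal sketch: the helper name find_down_events is hypothetical (it is not part of HTCondor), and it assumes the event layout shown in the excerpt above, that is, a header line with event code, job ID, date, time and message, followed by indented detail lines and a "..." terminator.

```python
import re

# Header line of a job event log entry, for example:
# "026 (034.000.000) 06/16 15:16:05 Detected Down Grid Resource"
EVENT_HEADER = re.compile(
    r"^(?P<code>\d{3}) \((?P<cluster>\d+)\.(?P<proc>\d+)\.(?P<sub>\d+)\) "
    r"(?P<date>\S+) (?P<time>\S+) (?P<text>.*)$"
)


def find_down_events(log_text):
    """Collect every event whose message contains 'Detected Down'.

    Hypothetical helper; assumes the log layout shown in the excerpt.
    Returns a list of dicts with keys: code, job, timestamp, message, details.
    """
    events = []
    current = None  # the matching event being read, if any
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped == "...":
            # "..." closes the current event
            if current is not None:
                events.append(current)
            current = None
            continue
        match = EVENT_HEADER.match(line)
        if match:
            if "Detected Down" in match.group("text"):
                current = {
                    "code": match.group("code"),
                    "job": "%d.%d" % (int(match.group("cluster")),
                                      int(match.group("proc"))),
                    "timestamp": match.group("date") + " " + match.group("time"),
                    "message": match.group("text"),
                    "details": [],
                }
            else:
                current = None  # some other event type; skip its details
        elif current is not None and stripped:
            # indented detail line belonging to the current event
            current["details"].append(stripped)
    if current is not None:
        # the log ended without a closing "..."
        events.append(current)
    return events
```

Calling find_down_events(open("/data/condor_test/condor.log").read()) would return one entry per down event, each carrying the job ID, timestamp and the resource line that follows the header.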

Every five minutes, the log file /var/log/condor/GridmanagerLog.zhanglei prints: resource zhanglei@xxxxxxxxxxx is still down


The relevant configuration files follow.
Part of the configuration file on 181:
  FLOCK_TO =188.nodeljB
  FLOCK_COLLECTOR_HOSTS = $(FLOCK_TO)
  FLOCK_NEGOTIATOR_HOSTS = $(FLOCK_TO)
  ALLOW_NEGOTIATOR_SCHEDD = $(CONDOR_HOST), $(FLOCK_NEGOTIATOR_HOSTS), $(IP_ADDRESS)

  CONDOR_GAHP = $(SBIN)/condor_c-gahp
  C_GAHP_LOG  = /tmp/CGAHPLog.$(USERNAME)
  C_GAHP_WORKER_THREAD_LOG = /tmp/CGAHPWorkerLog.$(USERNAME)
  C_GAHP_WORKER_THREAD_LOCK = /tmp/CGAHPWorkerLock.$(USERNAME)

Part of the configuration file on 188:
UID_DOMAIN=nodeljB
COLLECTOR_NAME=CPLiJian

CONDOR_HOST=188.nodeljB
FLOCK_FROM=181.nodeljA
FLOCK_TO=
FLOCK_NEGOTIATOR_HOSTS = $(FLOCK_TO)
FLOCK_COLLECTOR_HOSTS = $(FLOCK_TO)
ALLOW_ADMINISTRATOR = $(CONDOR_HOST), $(IP_ADDRESS)
ALLOW_OWNER = $(FULL_HOSTNAME), $(ALLOW_ADMINISTRATOR)
ALLOW_READ=*.nodeljB
ALLOW_WRITE=*.nodeljB
ALLOW_NEGOTIATOR = zhanglei@$(CONDOR_HOST), $(IP_ADDRESS)
ALLOW_NEGOTIATOR_SCHEDD = $(CONDOR_HOST), $(FLOCK_NEGOTIATOR_HOSTS), $(IP_ADDRESS)
ALLOW_WRITE_COLLECTOR = $(ALLOW_WRITE), $(FLOCK_FROM)
ALLOW_WRITE_STARTD    = $(ALLOW_WRITE), $(FLOCK_FROM)
ALLOW_READ_COLLECTOR  = $(ALLOW_READ), $(FLOCK_FROM)
ALLOW_READ_STARTD     = $(ALLOW_READ), $(FLOCK_FROM)
USE_NFS         = True
LOCK            = $(LOCAL_DIR)/lock/condor

SEC_DEFAULT_NEGOTIATION = OPTIONAL
SEC_DEFAULT_AUTHENTICATION_METHODS = CLAIMTOBE