[Condor-users] Parallel Jobs Don't Start



Good evening,

I have been trying to test Condor's parallel universe with the test jobs from the manual, but every job I submit stays in the Idle state forever. Below is the configuration of the central manager (manager, submit) and of the execute nodes:
>>>> Master(Master, Submit) Config:
ALLOW_WRITE=*.uni.edu,10.1.*
UNUSED_CLAIM_TIMEOUT = 0
MPI_CONDOR_RSH_PATH = $(LIBEXEC)
ALTERNATE_STARTER_2 = $(SBIN)/condor_starter
STARTER_2_IS_DC = TRUE
SHADOW_MPI = $(SBIN)/condor_shadow

>>>> Execute nodes config:
ALLOW_WRITE=*.uni.edu,10.1.*
DedicatedScheduler = "DedicatedScheduler@xxxxxxxxxx"
STARTD_ATTRS = $(STARTD_ATTRS), DedicatedScheduler
SUSPEND = False
CONTINUE = True
PREEMPT = False
KILL = False
WANT_SUSPEND = False
WANT_VACATE = False
RANK = Scheduler =?= $(DedicatedScheduler)
MPI_CONDOR_RSH_PATH = $(LIBEXEC)
CONDOR_SSHD = /usr/sbin/sshd
CONDOR_SSH_KEYGEN = /usr/bin/ssh-keygen
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler
START=TRUE
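
>>>> For reference, a minimal parallel universe submit file of the kind shown in the manual (a sketch only, not necessarily the exact file submitted; the machine_count value is an assumption):
# Run the same trivial command on several dedicated machines at once
universe = parallel
executable = /bin/sleep
arguments = 30
# assumed value; must not exceed the number of dedicated slots available
machine_count = 2
queue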

Any idea why the jobs (with and without MPI) don't start?

Thank you.
----
Edier Alberto Zapata Hernández
Est. Ingeniería de Sistemas
Universidad de Valle