
[Condor-users] sh_loop test failing



Hi again. I've just installed Condor on a small intranet cluster and run the
test programs shipped with the distribution. All of them ran fine, with the
exception of sh_loop. It fails with an error that appears to be caused by the
file /home/condor/execute/dir_4576/condor_exec.exe not existing.

Here is the corresponding output from StarterLog on the execute machine:

---------------------------------------------------
- 11/29 18:22:09 ******************************************************
- 11/29 18:22:09 ** condor_starter (CONDOR_STARTER) STARTING UP
- 11/29 18:22:09 ** /usr/local/condor-6.8.6/sbin/condor_starter
- 11/29 18:22:09 ** $CondorVersion: 6.8.6 Sep 13 2007 $
- 11/29 18:22:09 ** $CondorPlatform: X86_64-LINUX_RHEL3 $
- 11/29 18:22:09 ** PID = 4576
- 11/29 18:22:09 ** Log last touched 11/29 18:14:04
- 11/29 18:22:09 ******************************************************
- 11/29 18:22:09 Using config source: /home/condor/condor_config
- 11/29 18:22:09 Using local config sources:
- 11/29 18:22:09    /home/condor/condor_config.local
- 11/29 18:22:09 DaemonCore: Command Socket at <192.168.1.22:39483>
- 11/29 18:22:09 Done setting resource limits
- 11/29 18:22:09 Communicating with shadow <192.168.1.21:41839>
- 11/29 18:22:09 Submitting machine is "bioxeon.ibmcp-cluster.upv.es"
- 11/29 18:22:09 File transfer completed successfully.
- 11/29 18:22:10 Starting a VANILLA universe job with ID: 12.0
- 11/29 18:22:10 IWD: /home/condor/execute/dir_4576
- 11/29 18:22:10 Output file: /home/condor/execute/dir_4576/sh_loop.out
- 11/29 18:22:10 Error file: /home/condor/execute/dir_4576/sh_loop.err
- 11/29 18:22:11 About to exec /home/condor/execute/dir_4576/condor_exec.exe 600
- 11/29 18:22:11 Create_Process: child failed with errno 2 (No such file or
directory) before exec()
- 11/29 18:22:11 ERROR
"Create_Process(/home/condor/execute/dir_4576/condor_exec.exe,600, ...) failed"
at line 393 in file os_proc.C
- 11/29 18:22:11 ShutdownFast all jobs.
-----------------------------------------------------

Similar information appears, for instance, in ShadowLog on the submit
machine:

------------------------------------------------------
- 11/29 16:21:28 ******************************************************
- 11/29 16:21:28 ** condor_shadow (CONDOR_SHADOW) STARTING UP
- 11/29 16:21:28 ** /usr/local/condor-6.8.6/sbin/condor_shadow
- 11/29 16:21:28 ** $CondorVersion: 6.8.6 Sep 13 2007 $
- 11/29 16:21:28 ** $CondorPlatform: X86_64-LINUX_RHEL3 $
- 11/29 16:21:28 ** PID = 8360
- 11/29 16:21:28 ** Log last touched 11/29 16:13:22
- 11/29 16:21:28 ******************************************************
- 11/29 16:21:28 Using config source: /home/condor/condor_config
- 11/29 16:21:28 Using local config sources:
- 11/29 16:21:28    /home/condor/condor_config.local
- 11/29 16:21:28 DaemonCore: Command Socket at <192.168.1.21:41839>
- 11/29 16:21:28 Initializing a VANILLA shadow for job 12.0
- 11/29 16:21:28 (12.0) (8360): Request to run on <192.168.1.22:40613> was
ACCEPTED
- 11/29 16:21:29 (12.0) (8360): Job 12.0 going into Hold state (code 6,2): Error
from starter on vm1@xxxxxxxxxxxxxxxxxxxxxxxxxx: Failed to execute
'/home/condor/execute/dir_4576/condor_exec.exe' with arguments 600: No such
file or directory
- 11/29 16:21:29 (12.0) (8360): **** condor_shadow (condor_SHADOW) EXITING WITH
STATUS 112
---------------------------------------------------
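
One thing that might be worth checking, assuming sh_loop is a shell script
whose first line names an interpreter: exec reports errno 2 not only when the
file itself is missing, but also when the interpreter on the #! line cannot be
found on the execute machine (or when the script has DOS line endings, so that
the interpreter path ends in a carriage return). The sketch below is only an
illustration of that check. The function names are made up for this example
and are not part of Condor; it would be run by hand on the execute machine
against a copy of the script.

```python
import os
import sys


def shebang_interpreter(path):
    """Return the interpreter named on the script's #! line, or None."""
    with open(path, "rb") as f:
        first = f.readline()
    if not first.startswith(b"#!"):
        return None
    # Drop only the newline, so a stray carriage return stays visible.
    body = first[2:].rstrip(b"\n").strip(b" \t")
    if not body:
        return None
    # The interpreter path is the first space- or tab-separated token.
    token = body.replace(b"\t", b" ").split(b" ")[0]
    return token.decode("utf-8", errors="replace")


def diagnose(path):
    """Describe the most likely reason exec() on `path` would give ENOENT."""
    if not os.path.exists(path):
        return "script itself is missing"
    interp = shebang_interpreter(path)
    if interp is None:
        return "no #! line"
    if interp.endswith("\r"):
        return "interpreter path ends in a carriage return (DOS line endings?)"
    if not os.path.exists(interp):
        return "interpreter not found: " + interp
    return "ok: " + interp


if __name__ == "__main__":
    # Hypothetical usage: python check_shebang.py /path/to/sh_loop
    for script in sys.argv[1:]:
        print(script, "->", diagnose(script))
```

If this reports a missing interpreter, that would be consistent with the log
above, where file transfer succeeds but exec still fails with "No such file or
directory".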

Any ideas?

Thanks,

Javier.


-- 
Javier Forment Millet
Instituto de Biología Celular y Molecular de Plantas (IBMCP) CSIC-UPV
 Ciudad Politécnica de la Innovación (CPI) Edificio 8 E, Escalera 7 Puerta E
 Calle Ing. Fausto Elio s/n. 46022 Valencia, Spain
Tlf.:+34-96-3877858
FAX: +34-96-3877859
jforment@xxxxxxxxxxxx