[Condor-users] Jobs not starting - file transfer problem?



I'm trying to add a new box running Fedora 12 to my pool.

I see the same problem with 7.2.4, 7.4.0 and 7.4.1: submitted jobs
(which work on other nodes) fail like this:

01/06 15:52:44 Using config source: /etc/condor/condor_config
01/06 15:52:44 Using local config sources:
01/06 15:52:44    /var/lib/condor/condor_config.local
01/06 15:52:44 DaemonCore: Command Socket at <127.0.0.1:52330>
01/06 15:52:44 Done setting resource limits
01/06 15:52:44 Communicating with shadow <x.x.x.x:36250>
01/06 15:52:44 Submitting machine is "x.x.x.x"
01/06 15:52:44 setting the orig job name in starter
01/06 15:52:44 setting the orig job iwd in starter
01/06 15:52:44 File transfer completed successfully.
01/06 15:52:45 Job 61.0 set to execute immediately
01/06 15:52:45 Starting a VANILLA universe job with ID: 61.0
01/06 15:52:45 IWD: /var/lib/condor/execute/dir_20983
01/06 15:52:45 Input file: /var/lib/condor/execute/dir_20983/dammin-sHA30.0.inp
01/06 15:52:45 Output file: /var/lib/condor/execute/dir_20983/dammin-sHA30.0.out
01/06 15:52:45 Error file: /var/lib/condor/execute/dir_20983/dammin-sHA30.0.err
01/06 15:52:45 About to exec /var/lib/condor/execute/dir_20983/condor_exec.exe
01/06 15:52:45 Create_Process(/var/lib/condor/execute/dir_20983/condor_exec.exe): child failed with errno 2 (No such file or directory) before exec()
01/06 15:52:45 ERROR "Create_Process(/var/lib/condor/execute/dir_20983/condor_exec.exe,, ...) failed: No such file or directory" at line 530 in file os_proc.cpp
01/06 15:52:45 ShutdownFast all jobs.

While I was watching, the execution directories were created, then
disappeared after a few seconds.

I have should_transfer_files = YES in the submit file, and I'm
confused by the message saying file transfer completed successfully.
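
One general point that may bear on the apparent contradiction (this is an
assumption, nothing in the log above confirms it): on Linux, exec can report
errno 2 for a file that does exist when the interpreter or loader that file
names is missing, for example a script with a bad "#!" line, or a binary
whose dynamic loader is not installed on the new box. The Python sketch
below reproduces that behaviour with a throwaway script. The helper name
exec_errno and the path /no/such/interpreter are hypothetical, chosen only
for illustration.

```python
import errno
import os
import stat
import subprocess
import tempfile


def exec_errno(shebang):
    """Write a tiny script with the given shebang line, try to run it,
    and return the errno from the failed exec (0 if it ran and exited 0)."""
    with tempfile.TemporaryDirectory() as d:
        # Name borrowed from the log only for familiarity; any name works.
        path = os.path.join(d, "condor_exec.exe")
        with open(path, "w") as f:
            f.write(shebang + "\nexit 0\n")
        os.chmod(path, stat.S_IRWXU)
        # The file is definitely present at this point.
        assert os.path.exists(path)
        try:
            subprocess.run([path], check=True)
            return 0
        except OSError as e:
            return e.errno


# Hypothetical interpreter path that should not exist on any system.
missing = exec_errno("#!/no/such/interpreter")
# Control case: same file contents, but a real interpreter.
present = exec_errno("#!/bin/sh")

print("missing interpreter:", errno.errorcode.get(missing, missing))
print("real interpreter:   ", present)
```

If the first case reports ENOENT while the second runs cleanly, the error
message refers to the interpreter rather than to the file itself, which
would be consistent with a transfer that really did succeed.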

I'd be grateful for any suggestions on how to track this down.

Adam