Re: [Condor-users] "Failed to execute" error message
- Date: Thu, 16 Feb 2006 13:34:57 -0800
- From: Rok Roskar <roskar@xxxxxxxxxxxxxxxxxxxx>
If it could be a "file not found" problem, make sure your environment
variables are getting passed to your job the way you think they should
be. You can see what environment variables are set for a given job by
using condor_q -l.
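
As a rough sketch of that check, something like the following could pull the environment line out of the condor_q -l output for one job. The attribute names "Environment" and "Env" are an assumption about what your Condor version prints, and job_attribute and query_job are just illustrative helper names, so adjust them to match what you actually see.

```python
import subprocess
import sys


def job_attribute(classad_text, names=("Environment", "Env")):
    """Return (name, raw_value) for the first 'Name = value' line whose
    name matches one of `names` (case-insensitive), or None if absent.

    The default names are an assumption; check your own condor_q -l output.
    """
    wanted = {n.lower() for n in names}
    for line in classad_text.splitlines():
        # Split only on the first '=' so values like PATH=/usr/bin survive.
        name, sep, value = line.partition("=")
        if sep and name.strip().lower() in wanted:
            return name.strip(), value.strip().strip('"')
    return None


def query_job(job_id):
    """Run `condor_q -l <job_id>` and return its output as text."""
    result = subprocess.run(
        ["condor_q", "-l", job_id],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Example: python check_env.py 122.0   (the script name is hypothetical)
    print(job_attribute(query_job(sys.argv[1])))
```

If this prints None, or the value lacks a variable your wrapper script depends on, the job is probably not seeing the environment you expect.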
University of Washington
Department of Astronomy
On Feb 16, 2006, at 1:14 PM, Stephen Creps wrote:
I have a newly-installed cluster (SUSE Linux 10.0 x86_64) on which I
had Condor working. Without going into a long story, suffice it to say
it was necessary to reinstall the master node's OS to work around some
hardware support issues.
Now I can't get Condor to work. When we submit a job it just sits
there, periodically trying to run again but failing. The job log
repeats the following two messages:
001 (122.000.000) 02/16 16:07:47 Job executing on host:
007 (122.000.000) 02/16 16:07:47 Shadow exception!
Error from starter on vm2@xxxxxxxxxxxxxxxxxxxxxxxxxxxx: Failed
to execute '/data/PRD/bin/perfwrap condor_exec.exe
No such file or directory
0 - Run Bytes Sent By Job
0 - Run Bytes Received By Job
The "No such file or directory" is apparently my problem, but I can
see all the files on the given command line except for condor_exec.exe.
It is my impression that this file is a temporary copy of the job. If I
knew where this file is supposed to be located it might help me track
down the problem. Can anyone tell me where to look, or give me other
ideas to try?
- - - - - -
Coordinator of UNIX Systems
Information Technology Group