Re: [Condor-users] Job Submission Problem
- Date: Fri, 29 Feb 2008 14:15:43 -0600
- From: Jaime Frey <jfrey@xxxxxxxxxxx>
- Subject: Re: [Condor-users] Job Submission Problem
On Feb 28, 2008, at 9:01 PM, Saurabh Agarwal wrote:
I have a job which is run as ./executable_name < in.input.
The "in.input" file is a plain text file, but it contains a reference
to another file called "in.data". All the files are stored in the
same directory along with the executable. However, when the job
starts executing, it fails to read the "in.data" file, even though it
is in the same directory. Kindly let me know what I am doing wrong.
Following is the condor script which I am using to submit this job:
Universe = parallel
initialdir = /backup/benchmark_condor/
Executable = mp1script
#WantIOProxy = True
Output = benchmark.out
Error = benchmark.err
Log = benchmark.log
machine_count = 11
getenv = True
should_transfer_files = yes
when_to_transfer_output = on_exit_or_evict
Queue
The executable mp1script is just a simple script like:
#!/bin/sh
# Set this to the bin directory of MPICH installation
MPDIR=/usr/local/bin
PATH=$MPDIR:.:$PATH
export PATH
## run the actual mpijob
mpirun -v -np $_CONDOR_NPROCS /backup/benchmark_condor/executableName < /backup/benchmark_condor/in.input
I have Condor 6.8.6 installed on 3 of my machines, and it is
successfully running other MPI jobs in which I just redirect stdin
and do not read any other file.
Condor can handle a job's data files in two fashions and you're mixing
them. Things will be less confusing if you pick just one.
The first method is to rely on a shared filesystem. This is used when
should_transfer_files is set to 'no'. The directory set by initialdir
must be on the shared filesystem and Condor will run the job in that
directory.
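As an illustration, a shared-filesystem submit file might look like the
following sketch (paths and values here are hypothetical, not taken from
your setup):

```
universe              = parallel
executable            = mp1script
initialdir            = /backup/benchmark_condor/
# Rely on the shared filesystem; initialdir must be visible
# on every execute machine.
should_transfer_files = no
output                = benchmark.out
error                 = benchmark.err
log                   = benchmark.log
machine_count         = 11
queue
```

With this method the job runs directly in initialdir, so relative
references like "in.data" resolve as you expect.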
The second method is to transfer all of the job's files between the
submit and execute machines. This is used when should_transfer_files
is set to 'yes'. The job's executable and standard input, output, and
error are transferred. You can specify additional files to be
transferred using transfer_input_files and transfer_output_files. The
job is run in a temporary directory on the execute machine. The job's
files are transferred into and out of that directory before and after
the job runs.
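By contrast, a file-transfer submit file based on the job described
above might read (again a hedged sketch, not a drop-in replacement):

```
universe                = parallel
executable              = mp1script
# Explicitly list every data file the job reads; Condor copies
# them into the job's temporary directory on the execute machine.
transfer_input_files    = in.input, in.data
should_transfer_files   = yes
when_to_transfer_output = on_exit_or_evict
machine_count           = 11
queue
```

Because the job's working directory is the temporary directory the
files were copied into, the job should then refer to them by relative
path only.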
You have file transfer enabled, but your script is pulling the
executable and standard input directly from your initialdir (which I'm
assuming is shared). I'll bet the trouble you're having is that
'in.data' in your standard input file doesn't have the full path to
the file. Since the job is running in a temporary directory, it would
then be looking in that directory for the file.
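Based on that diagnosis, one untested fix along the second method would
be to tell Condor to transfer the missing file, so it appears in the
job's temporary working directory:

```
# hypothetical addition to the submit file:
# copy in.data into the job's temporary run directory
transfer_input_files = in.data
```

Alternatively, put the full path /backup/benchmark_condor/in.data
inside in.input, or switch entirely to the shared-filesystem method by
setting should_transfer_files = no.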
Thanks and regards,
Jaime Frey
UW-Madison Condor Team