
[Condor-users] Define where to create the log, output and error files



Hi!

 

I am submitting jobs under the vanilla universe from a Windows machine. The jobs are executed on Linux machines.

 

By default, Condor executes each job in a temporary folder.

However, the jobs that I am running are batch files that change directory to a mounted folder, run a script from that folder, and should create their output files there.

So I don’t need to transfer the output files back to the submitting machine. They are created in the mounted folder, which is where I want to keep them.

 

I have defined should_transfer_files = YES in the submit file because I need to transfer one input file (the batch file that has the instructions described above).

I have also defined when_to_transfer_output = ON_EXIT.

I haven’t defined transfer_output_files.
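
For readers following along, a submit file matching the description above might look roughly like the sketch below. It is only an illustration: the executable name and the output and error file names are hypothetical placeholders, and only the should_transfer_files and when_to_transfer_output settings and the log file name Run_condor.log come from this message.

```
# Sketch only: run_job.sh, Run_condor.out and Run_condor.err are
# hypothetical placeholder names, not taken from the real submit file.
universe                = vanilla
executable              = run_job.sh

# Settings stated in the message
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
# transfer_output_files is intentionally left undefined

# Relative paths here are resolved against the initial working
# directory on the submitting machine.
log                     = Run_condor.log
output                  = Run_condor.out
error                   = Run_condor.err

queue
```

Because log, output and error are given as relative paths in this sketch, they would be written under the directory the job was submitted from, which is the \\server\folder share in the situation described here.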

 

My questions are:

 

1.       Which output files will Condor try to transfer to the submitting machine?

 

2.       Can I tell Condor to create the log, output and error files in the same mounted folder where I am running the script and creating the other output files?

 

3.       Can I tell Condor to transfer an input file but no output files?

 

 

The problem is that I am getting the following error messages in the SchedLog on the submitting machine:

 

01/17 17:59:52 (pid:3068) WriteUserLog::initialize: safe_fopen_wrapper("\\server\folder\Run_condor.log",a+tc) failed - errno 22 (Invalid argument)

01/17 17:59:52 (pid:3068) WriteUserLog::initialize: failed to open file

01/17 17:59:52 (pid:3068) WARNING: Invalid user log file specified: \\ server\folder \Run_condor.log

01/17 17:59:52 (pid:3068) Starting add_shadow_birthdate(3494.0)

01/17 17:59:52 (pid:3068) Started shadow for job 3494.0 on slot3@xxxxxxx <ip adress> for submitter, (shadow pid = 7636)

01/17 17:59:52 (pid:3068) Shadow pid 6436 for job 3462.0 exited with status 112

01/17 17:59:52 (pid:3068) Putting job 3462.0 on hold

 

Every time, all but 19 of the jobs are put on hold for this reason.

 

I don’t understand the reason for this error message. What is the problem?

 

 

condor_q -better-analyze says:

 

3964.000:  Request is held.

 

Hold reason: Cannot access initial working directory \\server\folder: Invalid argument

 

 

The initial working directory is indeed \\server\folder, since the job is submitted from this directory. But the job itself looks like this:

 

#!/bin/sh
ls
cd /
/home/user/mount/folder/run.py
echo ...Done

 

And /home/user/mount is the mounted directory that has all the files needed for the execution of run.py.

The files produced by run.py should be created in the directory /home/user/mount/folder. Condor does not have to transfer them to the submitting machine.

 

So I believe that Condor tries to access the initial working directory \\server\folder to write the log, error and output files, but this fails for all but 19 jobs.

Is there a maximum number of simultaneous transfers to the submitting machine?

Can I do as I wrote in question 2 above?

 

 

Best regards,

Sónia

 

 

Sónia Liléo
O2 Strandvägen 5B 114 51 Stockholm
Tel: +46 8 559 310 37 Mobile: +46 73 752 95 74

www.o2.se