
Re: [Condor-users] The /var/lib/condor/execute folder 1,000,000 question? :-)



Dear David,
Thanks for the answer :-)
I tried defining the FILESYSTEM_DOMAIN macro on both the submit and execute machines, and also set should_transfer_files in my job file.
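For reference, this is roughly what I put in the local configuration on
both machines (the domain string below is just a placeholder I made up
for this mail; it only has to be identical on the machines that share
/home):

 # condor_config.local, same on the submit and execute machines
 # "filer.example.com" is an arbitrary label, not my real hostname
 FILESYSTEM_DOMAIN = filer.example.com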


Here is the job file: 

 executable = /home/user.name/work/chip_sim.pl
 should_transfer_files = ALWAYS
 when_to_transfer_output = ON_EXIT_OR_EVICT
 universe = vanilla
 getenv = True
 notification = Error
 run_as_owner = True
 load_profile = True
 initialdir = /home/user.name/work/
 concurrency_limits = VCS
 transfer_output_files = c_example
 stream_error = true
 stream_output = true
 priority = 0
 args = -no_sva -t c_example -report Report_2011-07-21_17-52-44_c_example -parent_root  /home/user.name/work/
 Error = chip_sim_c_example.err
 Output = chip_sim_c_example.out
 Log = condor_c_example.log
 Queue


I have two issues here:
1. When defining the transfer_output_files macro I do get the files I specified, including even the sub-directories I need, but I have to write out the full list of files and directories I want. The problem is that sometimes I don't know the names of the files, or even of the directories, that are going to be created, which makes it hard to specify them in transfer_output_files.
If I do not define transfer_output_files I get all the files from the current execute dir, including all the log, error and output files, but without the sub-directories.

As the manual points out:

transfer_output_files = < file1,file2,file... > This command forms an explicit list of output files and directories to be transferred back from the temporary working directory on the execute
machine to the submit machine. If there are multiple files, they must be delimited with commas. For Condor-C jobs and all other non-grid universe jobs, if transfer_output_files is not specified,
Condor will automatically transfer back all files in the job’s temporary working directory which have been modified or created by the job. Subdirectories are not scanned for output, so
if output from subdirectories is desired, the output list must be explicitly specified. For grid universe jobs other than Condor-C, desired output files must also be explicitly listed. Another
reason to explicitly list output files is for a job that creates many files, and the user wants only a subset transferred back.
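
So to get the sub-directories back today I have to spell everything out
explicitly, e.g. something like this (c_example is real; the other two
names are made up for illustration, since in practice I often don't know
them in advance):

 transfer_output_files = c_example, logs_dir, reports_dir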

2. The job still runs in the /var/lib/condor/execute directory on the execute machine. It does not run in my local home directory, where I was when I submitted the job (even though this is a shared NFS mount point). What I want is to see, while the job is running, all the files and directories it has created (I know I can do that using condor_ssh_to_job, but that is exactly what I want to avoid).
So if, for example, I run gcc on a project, which compiles object files, headers, etc., I would like to see them via Condor the same way as when I run it locally without Condor.

To be more specific: if I run the following command locally (no Condor)
 /home/gilad.nahor/work> gcc -foo -bar -lalala project
it starts to compile and generates 100 files under the /home/gilad.nahor/work dir, and I can look at and examine the files even before the compilation has completed.
When running this under Condor: 1) the job doesn't run in the local directory from which I submitted it; 2) I don't get all the files created by the job unless I specify them in the transfer_output_files macro, and even when I do specify them, the job still has to finish or be vacated before the files are transferred back to the submit machine. The log files, however, do get synced while the job is running, since I used stream_error = true and stream_output = true.

My questions are:
1. Is it possible to put wildcards in transfer_output_files?
2. Can I stream the output files while the job is running?
3. Can I run the job in the directory from which I submitted it?


Hope this is clearer now...


Thanks
Sassy





On Thu, Jul 21, 2011 at 1:30 AM, David J. Herzfeld <herzfeldd@xxxxxxxxx> wrote:
Hi Sassy:

I think your question assumes that there is an underlying shared
filesystem (e.g. your home directory is mounted on both the schedd
machine and the startd machine). If this is the case, and
FILESYSTEM_DOMAIN in your config file is set correctly (the values
should be equivalent on both your submit and execute machines if they
share this filesystem), then your executable should run in the submit
directory on the execute machine, provided your submit file has an
appropriately defined should_transfer_files line.

If you want to ensure that your job only runs on machines where the
FILESYSTEM_DOMAIN values match, and thus your home directory is mounted,
make sure you specify
should_transfer_files = NO
in your submit file.

In this case, a directory will still get created in EXECUTE or
SLOT<N>_EXECUTE which your job can use (for temporary local storage,
for instance), but your executable should start running in the submit
directory.

I believe the default value of should_transfer_files is IF_NEEDED
(although I haven't looked this up in a while), which means that your
job will start in the EXECUTE or SLOT<N>_EXECUTE directory on machines
where the FILESYSTEM_DOMAINs don't match and will start in the submit
directory on machines where the FILESYSTEM_DOMAINs do match.
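
A quick way to verify this is to run condor_config_val on each machine
and compare the output, e.g.:

 condor_config_val FILESYSTEM_DOMAIN

If the two machines print different strings, Condor considers them to
be in different filesystem domains.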

Does this answer your question? Perhaps I am missing something crucial.

Cheers,
David

On Thu, 2011-07-21 at 01:11 +0300, Sassy Natan wrote:
> any idea?
>
> On Wed, Jul 20, 2011 at 3:59 PM, Sassy Natan <sassyn@xxxxxxxxx> wrote:
>
> > Well, initialdir doesn't make the process run in the specified directory,
> > but it does create the log files in it.
> >
> > What I'm interested in is sending a job to the queue and having it run in
> > my local home directory, where I was when I submitted the job, so it is as
> > if I ran the command locally without using Condor.
> >
> > So for example if I use gcc, the command under Condor will run in my ~
> > home, and not in the /var/lib/condor/execute/%jobid% folder.
> > I saw there are a remote_initialdir option and a _CONDOR_SCRATCH_DIR
> > variable, but I'm not sure these will help me.
> >
> > sassy
> >
> >
> >
> > On Wed, Jul 20, 2011 at 3:48 PM, Matthew Farrellee <matt@xxxxxxxxxx> wrote:
> >
> >> Check the condor_submit manual page for initialdir.
> >>
> >> Best,
> >>
> >>
> >> matt
> >>
> >>
> >> On 07/19/2011 05:22 PM, Sassy Natan wrote:
> >>
> >>> Thanks,
> >>>
> >>> But still, changing the path is easy. I would like it to be
> >>> a dynamic path, pointing to the working directory the user submitted the
> >>> job from, which in my case is always inside the user's home
> >>> directory.
> >>>
> >>> This /home is an NFS volume on local filers, running CephFS as their FS
> >>> (on top of an RBD volume).
> >>>
> >>> Using SLOT<N>_EXECUTE is also not wise: what if the same slot, say slot 4,
> >>> gets used over and over while no other slots are being used?
> >>>
> >>> Sassy
> >>>
> >>>
> >>> On Wed, Jul 20, 2011 at 12:09 AM, Erik Erlandson <eje@xxxxxxxxxx> wrote:
> >>>
> >>>    On Tue, 2011-07-19 at 23:57 +0300, Sassy Natan wrote:
> >>>
> >>>     > Is it possible to change the directory where the job's processes
> >>>     > truly run? AKA /var/lib/condor/execute?
> >>>
> >>>    The EXECUTE (or SLOT<N>_EXECUTE) configuration variable can control
> >>>    the directory jobs execute in:
> >>>    http://www.cs.wisc.edu/condor/manual/v7.6/3_3Configuration.html#15493
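> >>>
> >>>    For example, in the execute machine's local config (the paths here
> >>>    are made-up illustrations):
> >>>
> >>>        # move all job scratch directories onto a larger local disk
> >>>        EXECUTE = /scratch/condor/execute
> >>>
> >>>        # or override it for a single slot (slot 1 shown)
> >>>        SLOT1_EXECUTE = /scratch/condor/slot1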
> >>>
> >>
> >>
> >