
Re: [Condor-users] How to handle shared libraries with job submission



> I have a program which is made up of an executable and 3 libraries.
> What is the best way to handle libraries? I am running many jobs if
> that should matter.

If they are small and/or frequently updated, send them with each job.

The transfer_input_files submit command is what you need for this.
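
A minimal sketch of such a submit description, assuming a Linux pool without a shared filesystem. The file names (myprog, liba.so, libb.so, libc.so) are made up for illustration, and the LD_LIBRARY_PATH setting assumes the program loads its libraries dynamically from the job's working directory:

```
# Hypothetical file names; adjust to your own program and libraries.
universe                = vanilla
executable              = myprog
transfer_input_files    = liba.so, libb.so, libc.so
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
# Assumption: the libraries land next to the executable in the scratch directory.
environment             = "LD_LIBRARY_PATH=."
output                  = out.$(Process)
error                   = err.$(Process)
log                     = jobs.log
queue 100
```

Every queued job gets its own copy of the three libraries, so an updated build is picked up by the next submission without touching the execute nodes.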

> Should I copy them to every node in the grid? This would be kinda
> tricky to administer if those libraries are updated (as they will as
> the program is continuously under development), but maybe the fastest
> for the execution?
> 
> Could I have them on a NFS share?

That depends on the OS and your setup. On Windows, I think jobs running as
the default low-privilege Condor user don't usually get access to network
drives.

Don't worry too much about the cost of having Condor transfer small files
compared with reading them over NFS. From what I have heard, the performance
difference isn't significant, and transferring the whole files can even be
faster than accessing them over NFS.
 
> Can I just pass them on as "transfer_input_files" in the submit
> script? The libraries are about 1.4MB all together so it would do a
> lot of transferring.

Yes

1.4MB is small as I see it, even for 30s jobs. Think more carefully about
this when the libraries or input files are over 100MB.
 
> I am also looking for some way to send a collection of small
> jobs/executions as a single larger job to a node to enhance the
> scheduling vs execution time ratio. Each single computation is <30
> sec, but there can be many of them. Is there a way to batch up a
> collection of these executions and send this batch as a single job so
> the node will compute more than it is scheduling and transferring
> files? Maybe just use a batch script as executable? How does that
> work? Do I need to have the executable as an input file? Can the
> executable be on a NFS mount along with the libraries?

You could look at DAGMan, but a shell script (or a batch script on Windows)
could do the job nicely. The shell script becomes your executable, and your
real executable becomes another input file to be transferred, along with the
libraries and the real input files. The arguments become options to the shell
script telling it how many computations to run.
Note: you will not be able to use the standard universe any more; vanilla
becomes your universe.
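
A rough sketch of such a wrapper, under some assumptions of mine: the real program is called myprog, it takes one input file name as its argument and writes its result to standard output, and the inputs are named input.<n>. None of these names come from your setup, so adapt them as needed.

```shell
#!/bin/sh
# run_batch.sh (hypothetical name): run several short computations in one Condor job.
# Usage: run_batch.sh FIRST COUNT

# Run PROG once for each index FIRST .. FIRST+COUNT-1.
# Reads input.<i>, writes result.<i>; returns non-zero if any run failed.
run_batch() {
    prog=$1
    first=$2
    count=$3
    i=$first
    last=$((first + count))
    status=0
    while [ "$i" -lt "$last" ]; do
        "$prog" "input.$i" > "result.$i" || status=1
        i=$((i + 1))
    done
    return "$status"
}

# Only run when FIRST and COUNT are given on the command line.
if [ "$#" -ge 2 ]; then
    # Let the dynamic loader find the libraries transferred next to the program.
    LD_LIBRARY_PATH=.${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
    export LD_LIBRARY_PATH
    # A file sent via transfer_input_files may arrive without its execute bit.
    chmod +x ./myprog
    run_batch ./myprog "$1" "$2"
    exit $?
fi
```

In the submit description, the script would be the executable, myprog would be listed in transfer_input_files together with the libraries and the input.<n> files, and the arguments line would carry FIRST and COUNT. The result.<n> files are created in the job's working directory, so Condor should transfer them back when the job exits.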

Cheers

JK