
[Condor-users] job splitter joiner



Hi:
I have a small Condor cluster running that I use for bioinformatics tasks. The 
most common task in my work involves processing large files that contain 
many items to analyze. To use Condor for this I have to:
- split this large file in several chunks,
- write the condor job file,
- submit the job,
- and collect the results.
Since this is a repetitive process I've created a Python script that does 
all of that for me, for virtually every command-line utility that I use. It's a 
simple script and I've been using it for just a week. If you want to take a 
look at the code or use it, you can get it at:
http://bioinf.comav.upv.es/svn/psubprocess/trunk/src/
It's AGPL.

In fact there are a couple of scripts included and a little library where the 
meat is. run_with_condor.py takes a command and runs it as a Condor process; 
you just save the hassle of writing the job file. run_in_parallel.py takes a 
command and a set of files, creates a new set of subjobs, runs them, and 
returns the resulting output file. The output files should be identical to the 
output file you would have got running the command without Condor.
The code is new and will surely have a lot of bugs. I'm posting it mainly in 
case you can point me to similar utilities. If you know of a similar utility, 
let me know; I built my own because I wasn't aware of any alternative.
Or, who knows, maybe it could be of some use to somebody.
Best regards,

-- 
Jose M. Blanca Postigo
Instituto Universitario de Conservacion y
Mejora de la Agrodiversidad Valenciana (COMAV)
Universidad Politecnica de Valencia (UPV)
Edificio CPI (Ciudad Politecnica de la Innovacion), 8E
46022 Valencia (SPAIN)
Tlf.:+34-96-3877000 (ext 88473)