
Re: [HTCondor-users] Most of the time in Condor jobs gets wasted in I/o



On 04/24/2013 03:36 PM, Dr. Harinder Singh Bawa wrote:
> Hi,
> Idea of Transferring files to worker node also comes to my mind but I am
> not sure how to do this.

One way would be to run your job as a DAG and have a PRE script that
does 'for i in $(xargs -a file.list) ; do cp /scratch/$i /var/tmp ; done'
(the job then reads from /var/tmp): see
http://research.cs.wisc.edu/htcondor/manual/v7.8/2_10DAGMan_Applications.html
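As a rough sketch of what that could look like (file and node names here
are just placeholders, adjust to your setup):

    # job.dag -- hypothetical one-node DAG with a staging PRE script
    JOB  A  job.sub
    SCRIPT PRE A stage_in.sh file.list

    # stage_in.sh -- copy every file named in $1 from /scratch to /var/tmp
    #!/bin/sh
    for i in $(xargs -a "$1") ; do
        cp "/scratch/$i" /var/tmp/
    done

and then submit it with 'condor_submit_dag job.dag'.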

You could also use 'transfer_input_files', but the above is probably
simpler
(http://research.cs.wisc.edu/htcondor/manual/v7.8/2_5Submitting_Job.html#SECTION00354200000000000000)
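For completeness, a minimal submit-file sketch using
transfer_input_files (executable and file names are only placeholders):

    # job.sub -- hypothetical submit file
    universe                = vanilla
    executable              = analyze.sh
    transfer_input_files    = /scratch/input1.dat, /scratch/input2.dat
    should_transfer_files   = YES
    when_to_transfer_output = ON_EXIT
    output = job.out
    error  = job.err
    log    = job.log
    queue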

> One more thing I don't understand is that if I run one such command
> interactively on one such node, it finishes in 40 minutes. The same
> command under Condor doesn't finish in more than a day.

It only looks bad if you haven't seen it before. We tried to have BLAST
mmap the same set of 2GB database files over NFS. I think it only took a
couple of dozen nodes, nowhere near 120, to push overall I/O wait times
over 24 hours. It is surprising how bad NFS gets once the filesystem is
over ~75% full, too...

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
