[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Condor-users] afs vs nfs vs separate file systems

Miskell, Craig wrote:

In the style of IT everywhere: "it depends".

If your blast database for a given run is smaller than the available
main memory of the compute node, then you'll see minimal performance
difference.  This is because the linux kernel will use spare RAM as a
disk buffer cache.  Thus, after the first sequence, as each subsequent
sequence is blasted, the blast program "reads" the database, which the
kernel conveniently already has in RAM.  Speed is good ;-)  As
speculation, I'd say that a shared filesystem might be slightly faster,
as processing will be interleaved during the initial read, whereas if
you copy the db to local disk, then your compute node does nothing while
it's copying the file, then starts processing.  But that's probably a
minor consideration, assuming you're blasting any significant number of sequences.

If your blast database does *not* fit into main memory, then your
performance will drop either way, but it will be *much* better to have
the blast db on local disk.  This is because in this case, the kernel
uses memory as a disk buffer cache, but too much data is read, so "old"
data (the early parts of the database) are flushed and "new" data (later
parts of the database) are put into memory instead.  If you're using a
shared file system, this reading will translate into network traffic,
with all the associated latency, whereas a local copy will still be
slow, but will only be "disk speed" slow, not "network" slow.  ;-)
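The whole decision above comes down to comparing the database size against the node's RAM. A minimal sketch of that check (paths and the dummy db file are hypothetical stand-ins; on a real node you'd point it at your actual blast db):

```shell
# Stand-in for a real blast db: a 64 KB dummy file
DB=/tmp/example_blast.db
dd if=/dev/zero of="$DB" bs=1024 count=64 2>/dev/null

# Size of the db, and total RAM from the linux kernel
db_bytes=$(stat -c %s "$DB")
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
mem_bytes=$((mem_kb * 1024))

if [ "$db_bytes" -lt "$mem_bytes" ]; then
    echo "db fits in RAM: the buffer cache will hide repeated reads"
else
    echo "db does not fit: expect cache eviction; prefer a local copy over a shared fs"
fi
```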

FWIW, we use local copies (not using condor yet for this, but that's the
plan when we do), and we've also seen *significant* speedups by
splitting up our large databases into separate chunks, each of which can
fit into memory. By doing that, our CPU usage actually hits 100%, as
opposed to barely hitting 20% (this is on dual 2.8GHz Xeons).

Craig Miskell

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Michael Thon
Sent: Friday, 17 September 2004 4:42 a.m.
To: condor-users@xxxxxxxxxxx
Subject: [Condor-users] afs vs nfs vs separate file systems

Greetings - we have had a lot of success in setting up a small condor pool of 4 linux boxes (workstations) in our lab. Now we are planning to add more systems to the pool and to start actually using it for research, and we need to make some decisions about how to configure the pool. Specifically, I need to decide whether we will use a shared file system.

Some of the programs we run have large input and output files, sometimes > 500 MB (these are blast databases, for any of you biologists out there). Currently all files are being transferred by condor. From a network performance point of view, would it be better to put these files on a shared file system? Another option is to mirror the blast databases on all of the machines, but this could take a lot of disk space on the nodes and can cause problems if the synchronization gets out of whack.

If I use a shared file system, should I use afs or nfs? I have used nfs a little bit and I have no experience with afs. I want to keep our config as simple as possible, even if I have to sacrifice a little performance of the pool.
thanks for your comments
Condor-users mailing list



I agree with the answer; in our lab I use mpblast (multiplexed blast) to gain roughly a 10x factor. I use rsync to refresh the local blast db on the nodes, and all the intermediate files are kept locally. The primary fasta file (millions of sequences) is cut into chunks of 100,000 sequences and sent to the grid.
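The chunking step above can be sketched with awk, splitting a multi-sequence FASTA file into fixed-size chunks. This is a minimal illustration (2 sequences per chunk on a tiny made-up file; a real run would use 100,000 per chunk, and all file names here are hypothetical):

```shell
# A tiny made-up FASTA file standing in for the real query set
cat > /tmp/queries.fa <<'EOF'
>seq1
ACGT
>seq2
GGCC
>seq3
TTAA
EOF

# Remove stale chunks from any earlier run
rm -f /tmp/chunk_*.fa

# Start a new chunk file every `size` sequences (records beginning with ">")
awk -v size=2 '/^>/ { if (n % size == 0) chunk++; n++ }
               { print > ("/tmp/chunk_" chunk ".fa") }' /tmp/queries.fa

ls /tmp/chunk_*.fa
```

Each chunk is then a self-contained FASTA file that can be shipped to a node as an independent job.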

Each run starts with rsync (to be safe; it costs nearly nothing if the master blast db has not changed).
Only the pertinent results (a perl filter extracts a flat table from the verbose blast output) are _compressed_ and then copied to the main repository. The performance gain is very interesting and the whole grid scales better (the network/nfs traffic stays low).
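The filter-compress-copy step can be sketched as a small pipeline. The "blast output" below is simulated text and the grep is a stand-in for the perl filter (the real filter, output format, and repository path are all hypothetical here):

```shell
# Simulated verbose blast output; only the HIT lines are worth keeping
cat > /tmp/blast.out <<'EOF'
BLASTN 2.2.x
lots of verbose alignment text...
HIT: query1 subject9 98.5
HIT: query2 subject3 87.0
EOF

# Extract the flat result table, compress it, and copy it to the repository
grep '^HIT:' /tmp/blast.out | gzip > /tmp/results.gz
mkdir -p /tmp/repository && cp /tmp/results.gz /tmp/repository/
```

Only the small compressed table crosses the network, which is what keeps the nfs traffic low as the pool grows.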

To blast against the human genome, I found we need roughly 1 GB per node (4 GB for a dual XEON with hyperthreading).
The performance tests made under real-life load (3 days on 12 nodes) show that AMD Opterons are _far better_ than XEONs, with a very low variance around the mean execution time.
Hyperthreading seems to induce chaotic behaviour (perhaps blast is not well suited to the CPU's pipeline prediction?).

I have some informal PDF presentations (in the context of Sun Gridengine, but the content is not tied to that kind of grid); ask if you are interested (I'd rather not post them to the list ;-)

In particular, the comparison between AMD/INTEL cpus is of more general interest.



Dr Alain EMPAIN <alain.empain@xxxxxxxxx> <alain@xxxxxxxxxx>
Bioinformatics, Molecular Genetics, Fac. Med. Vet., University of Liège, Belgium
Bd de Colonster, B43 B-4000 Liège (Sart-Tilman)
WORK: +32 4 366 3821 FAX: +32 4 366 4122
HOME: rue des Martyrs,7 B- 4550 Nandrin +32 85 51 23 41 GSM: +32 497 70 17 64
"I worry about my child and the Internet all the time, even though she's
too young to have logged on yet. Here's what I worry about. I worry that
10 or 15 years from now, she will come to me and say 'Daddy, where were
you when they took freedom of the press away from the Internet?'" --Mike Godwin, Electronic Frontier Foundation