
Re: [Condor-users] tuning a file server

Hello David & Steffen,

Indeed, 3ware is a very stable solution for RAID servers; it has run smoothly for 3 years in my lab, but a few weeks ago I ran into problems with Condor / NFS on a 3ware RAID after tripling my nodes.

So I suspected not the RAID but the NFS configuration: the solution was simply to launch more NFS server daemons (default=4 on SuSE) and to tune the client side.

SERVER: /etc/sysconfig/nfs (SuSE)

# the kernel nfs-server supports multiple server threads

CLIENT: /etc/fstab  /home/grid      nfs \
        rw,hard,nointr,tcp,vers=3,rsize=32k,wsize=32k,bg \
        0 0

REM: I preferred bg to fg because I often work remotely and I do not want the boot process to hang if the NFS server is not responding.
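
For anyone who wants to check whether their server is short of nfsd threads before raising the count, here is a minimal Python sketch. It assumes the kernel exposes a 'th' line in /proc/net/rpc/nfsd whose first two numbers are the thread count and the number of times all threads were busy at once; as far as I know later kernels no longer fill this in, so treat the result as a hint only. The function name parse_nfsd_th is mine, not part of any tool.

```python
def parse_nfsd_th(stats_text):
    """Return (threads, times_all_busy) from the 'th' line, or None if absent.

    Assumed layout: 'th <threads> <fullcnt> <histogram buckets...>'.
    """
    for line in stats_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "th":
            return int(fields[1]), int(fields[2])
    return None


if __name__ == "__main__":
    try:
        with open("/proc/net/rpc/nfsd") as f:
            result = parse_nfsd_th(f.read())
    except OSError:
        result = None  # not an NFS server, or the file is not available
    if result is None:
        print("no 'th' line found")
    else:
        threads, all_busy = result
        print(f"{threads} nfsd threads, all busy {all_busy} times")
```

If the second number keeps growing under load, that would suggest the clients are regularly waiting for a free nfsd thread.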

Now all my nodes are 'blasting' steadily.

  Have a good day,


David McBride wrote:
Steffen Grunewald wrote:


Some weeks ago, I had to move our file systems (/home and some data) to another server, equipped with 3ware 9000 cards and using eight 400GB disks in RAID5 with hot spare (2.3TB net capacity). To manage the disk space, I had to switch to Linux kernel 2.6, and there are indications that this move caused some problems.

3ware tend to make excellent Linux-compatible RAID solutions, so it is unlikely to be a RAID-level problem, as your basic read benchmark suggested.

I'd suggest also running the 'bonnie++' benchmark on your filesystem to get a better idea of real-world performance (file creates/erasures, small loads, etc.)

The file server has to serve 360 VMs in 180 dual-CPU machines, and if there are lots of small I/O operations, the iowait percentage goes up to
more than 80%! I have never seen that before (the previous server was
almost identical, kernel 2.4, 1.4TB each).
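
To put a number on the iowait figure independently of top, one could sample the aggregate 'cpu' line of /proc/stat twice and compare. The Python sketch below assumes the 2.6-style field order (user, nice, system, idle, iowait, irq, softirq, steal); the function names are illustrative only.

```python
def cpu_fields(stat_text):
    """Return the counters of the aggregate 'cpu' line as a list of ints."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            return [int(x) for x in parts[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    # Only the first 8 counters are summed; later guest counters are
    # already included in 'user' and would otherwise be counted twice.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total <= 0 or len(deltas) < 5:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait
```

Reading /proc/stat, sleeping a few seconds, reading it again and passing both cpu_fields() results to iowait_percent() gives the average over that interval.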

Some important missing information:

* How (if at all) are the fileserver contents exported? NFS?
* What filesystem(s) are you using?

 > Now I suspect that the i/o strategies introduced with 2.6 kernels are
 > badly configured... are there any suggestions?

The 2.6 IO schedulers tend to be _better_ than the 2.4 one. The default is the 'anticipatory' scheduler, which tends to be excellent for most needs.
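
To see which scheduler a disk is actually using, 2.6 kernels list the choices in /sys/block/<dev>/queue/scheduler with the active one in square brackets. A minimal Python sketch for reading it follows; the device name 'sda' is only a placeholder and parse_scheduler is a name I made up.

```python
import re


def parse_scheduler(text):
    """Return (active, available) from a queue/scheduler line.

    Example input: 'noop [anticipatory] deadline cfq'
    """
    available = text.replace("[", "").replace("]", "").split()
    match = re.search(r"\[([^\]]+)\]", text)
    active = match.group(1) if match else None
    return active, available


if __name__ == "__main__":
    path = "/sys/block/sda/queue/scheduler"  # 'sda' is a placeholder device
    try:
        with open(path) as f:
            active, available = parse_scheduler(f.read())
        print(f"active: {active}, available: {', '.join(available)}")
    except OSError:
        print(f"cannot read {path}")
```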


--
------------------------------------------------------------
Dr Alain EMPAIN <alain.empain@xxxxxxxxx> <alain@xxxxxxxxxx>
Bioinformatics, Molecular Genetics, Fac. Med. Vet.,
University of LIEGEe, Belgium
Bd de Colonster, B43  B-4000 LIEGEe (Sart-Tilman)
WORK: +32 4 366 4159   FAX: +32 4 366 4122
HOME: rue des Martyrs,7  B- 4550 Nandrin  +32 85 51 2341
GSM:  +32 497 70 1764
-------------------------------------------------------------------------------
"I worry about my child and the Internet all the time, even though she's
too young to have logged on yet. Here's what I worry about. I worry that
10 or 15 years from now, she will come to me and say 'Daddy, where were
you when they took freedom of the press away from the Internet?'"
--Mike Godwin, Electronic Frontier Foundation
-------------------------------------------------------------------------------