PVFS Performance Enhancement


Configuring software RAID is fairly easy. There are some wonderful HOWTO articles on the web about configuring and maintaining a software RAID set, and Daniel Robbins has written two very good articles for IBM's developer website (see the Resources Sidebar). If you use software RAID, remember that CPU usage will increase because the node's processor does the RAID work. However, this cost is traded against the improved data throughput of RAID-0. Also, the speed of modern processors and the use of dual-processor motherboards usually minimize the impact for normal operations.
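To see why RAID-0 can improve throughput, it helps to look at how striping spreads data across disks. The following is a minimal sketch of round-robin striping, not the actual Linux software RAID implementation. The function names, the two-disk set, and the 64 KB chunk size are assumptions chosen for illustration.

```python
def raid0_locate(logical_offset, num_disks, chunk_size):
    """Map a logical byte offset to (disk index, byte offset on that disk).

    Assumes simple round-robin striping across equally sized disks.
    """
    chunk_index = logical_offset // chunk_size   # which chunk of the array
    disk = chunk_index % num_disks               # chunks rotate across disks
    stripe = chunk_index // num_disks            # row of chunks on each disk
    return disk, stripe * chunk_size + logical_offset % chunk_size


def disks_touched(offset, length, num_disks, chunk_size):
    """Return the sorted list of disks that a single request reads or writes."""
    first_chunk = offset // chunk_size
    last_chunk = (offset + length - 1) // chunk_size
    return sorted({c % num_disks for c in range(first_chunk, last_chunk + 1)})


if __name__ == "__main__":
    CHUNK = 64 * 1024   # hypothetical chunk size
    DISKS = 2           # hypothetical two-disk RAID-0 set

    # Consecutive chunks land on alternating disks.
    for offset in (0, CHUNK, 2 * CHUNK, 3 * CHUNK):
        print(offset, raid0_locate(offset, DISKS, CHUNK))

    # A request larger than one chunk is served by both disks at once.
    print(disks_touched(0, 2 * CHUNK, DISKS, CHUNK))
```

Because a large request touches every disk in the set, the disks can work in parallel. The CPU cost mentioned above comes from doing this bookkeeping, and the data movement that goes with it, on the node's own processor.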

The other option is to use a true hardware RAID card, which offloads the RAID work from the CPU and minimizes the impact on the node. However, in some cases the RAID card is actually slower than software RAID because the CPU in the node is much faster than the processor on the RAID card.


Underlying File System

PVFS is a virtual file system built on top of an existing file system. Thus, the speed of PVFS is affected by the speed of the underlying file system, because PVFS relies on it to write the data to the actual disk.

Nathan Poznick, a frequent contributor to PVFS, performed some tests on PVFS2 and the effect of the underlying file system. He tested ext2, ext3 (data=ordered), ext3 (data=writeback), ext3 (data=journal), jfs, xfs, reiserfs, and reiser4 (see the Resources Sidebar for a link to an explanation of the ext3 journaling modes). The tests were performed with a single server and a single client. The server was running SUSE Enterprise Server 9 with a 2.6.8.1-mm1 kernel and the client was running Red Hat 7.3 with a 2.4.18 kernel. He performed two tests: a file creation test and a data test. The file creation test used the Linux command touch to create 10,000 empty files in PVFS2. The data test used the Linux command dd with a block size of 16 MB to create a 4 GB file. A link to a plot of the results can be found in the Resources Sidebar.
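For readers who want to try something similar on their own file system, the two tests can be approximated in a few lines of Python. This is only a sketch of the same idea, not Nathan's actual test procedure. The function names are made up for illustration, and the demo uses much smaller sizes and a temporary directory so that it runs quickly anywhere. At the sizes described above, the data test would write 4 GB / 16 MB = 256 blocks.

```python
import os
import tempfile
import time


def time_create_empty_files(directory, count):
    """Create `count` empty files (similar to touch) and return elapsed seconds."""
    start = time.perf_counter()
    for i in range(count):
        with open(os.path.join(directory, f"file{i:05d}"), "w"):
            pass  # opening for write creates an empty file
    return time.perf_counter() - start


def time_block_write(path, block_size, total_size):
    """Write `total_size` bytes of zeros in `block_size` pieces (similar to dd)."""
    block = bytes(block_size)            # one block of zero bytes
    blocks = total_size // block_size
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())             # make sure the data reaches the disk
    return time.perf_counter() - start


if __name__ == "__main__":
    # Scaled-down demo: 1,000 files and 64 MB in 16 MB blocks.
    # To test a particular file system, replace the temporary directory
    # with a directory on that file system.
    with tempfile.TemporaryDirectory() as target:
        t_create = time_create_empty_files(target, 1000)
        t_write = time_block_write(os.path.join(target, "bigfile"),
                                   16 * 1024 * 1024, 64 * 1024 * 1024)
        print(f"file creation: {t_create:.3f} s")
        print(f"block write:   {t_write:.3f} s")
```

Timings from a sketch like this depend heavily on caching, mount options, and hardware, so they are best used to compare file systems on the same machine rather than as absolute numbers.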

For the touch test, ext2 was the fastest, followed closely by xfs. Reiser4 was the slowest in the test by a factor of 10. For ext3, the ordered and writeback journaling modes were about the same speed and far faster than the journal mode. For the dd test, jfs was the fastest, followed somewhat closely by ext2 and xfs. Again, reiser4 was the slowest by a factor of 20. For ext3, the writeback journaling mode was a bit faster than the ordered mode, but both were about one-third faster than the journal mode. Nathan admits that the tests are somewhat unscientific in that the file systems could have been tuned to provide better performance. Also, note that reiser4 was probably not a final version.

Even Nathan's simple tests show the effect of the underlying file system on the performance of PVFS.


Sidebar One: Links Mentioned in Column

PVFS1

PVFS2

Jens Mache Paper

Kent Milfield paper

Dell Study

Nathan Poznick Study

Ext3 Journaling Explanations

hdparm

First IBM Software RAID article

Second IBM Software RAID article

ROMIO


This article was originally published in ClusterWorld Magazine. It has been updated and formatted for the web. If you want to read more about HPC clusters and Linux, you may wish to visit Linux Magazine.

Dr. Jeff Layton hopes to someday have a 20 TB file system in his home computer. He lives in the Atlanta area and can sometimes be found lounging at the nearby Fry's, dreaming of hardware and drinking coffee (but never during working hours).

    Creative Commons License
    ©2005-2012 Copyright Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. The Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.