Cluster file systems are hot. What good are 1000 processors if you can't write to a file without clogging your network or server? Learn about the issues and experiences of parallel file systems with Distinguished Cluster Monkey Jeff Layton as your guide.

So Why Use Disks on Clusters?

Spinning platters is what matters

A convergence of several key technologies has begun to change some of the old assumptions about how to build clusters. These technologies are high-speed affordable interconnects, high-speed file systems and storage network protocols, MPI implementations with MPI-IO, and high performance processors. This nexus of technologies is allowing the next possible steps in cluster evolution to become mainstream - diskless nodes.

Read more: So Why Use Disks on Clusters?

Using the PIO Benchmark

Benchmarking Parallel File Systems

In the last column I spent quite a bit of time explaining the design of a new set of benchmarks for parallel file systems called PIObench. I discussed critical topics such as timing, memory access patterns, disk access patterns, and usage patterns. Due to constraints on how much I can write in a single column, I didn't talk about some other items such as the influence of disk caching. I won't talk about these in this column either. For those interested in things of this nature, take a look at Frank Shorter's PIObench thesis.
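The distinction between access patterns matters because the same amount of data read in a different order can take very different amounts of time. Below is a minimal sketch (not part of PIObench itself; the file size, block size, and stride are arbitrary choices for illustration) that times a sequential read order against a scattered one on an ordinary local file:

```python
import os
import tempfile
import time

def timed_read(path, offsets, block_size):
    """Read block_size bytes at each offset in order; return elapsed seconds."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        for off in offsets:
            f.seek(off)
            f.read(block_size)
    return time.perf_counter() - start

# Build a small test file: 64 blocks of 64 KiB each (4 MiB total).
block = 64 * 1024
nblocks = 64
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * (block * nblocks))
    path = tmp.name

sequential = [i * block for i in range(nblocks)]
# Same blocks, visited in a scattered order (7 is coprime to 64,
# so every block is still read exactly once).
strided = [((i * 7) % nblocks) * block for i in range(nblocks)]

t_seq = timed_read(path, sequential, block)
t_str = timed_read(path, strided, block)
os.unlink(path)
```

On a file this small the OS page cache will hide most of the difference; a real benchmark has to defeat caching (one of the topics covered in the thesis) and use file sizes well beyond memory.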

Read more: Using the PIO Benchmark

A Benchmark for Parallel File Systems

Our own benchmark, we are special you know

In a previous article I started to explore benchmarks for parallel file systems. In that article, we learned that benchmarks for serial file systems are not the best tools for measuring the performance of parallel file systems (big surprise). Five of the most common parallel file system benchmarks were also mentioned, but their usefulness was limited because each applies only to certain workloads and/or certain access patterns -- either memory access or storage access.

In this article we will take a look at a relatively new parallel file system benchmark suite that was designed to capture the behavior of several classes of parallel access signatures.

Read more: A Benchmark for Parallel File Systems

Benchmarking Parallel File Systems

In past columns, we've been talking about PVFS. We talked about how to configure it for performance, flexibility, and fault tolerance. If you are interested in performance, you need some way of measuring how the performance changes when you make changes. In this column, I'll talk about how one benchmarks parallel file systems. Of course, when I talk about benchmarks I don't mean comparing parallel file systems to one another. Rather, I mean the ability to determine the effects of changes on the performance of the parallel file system. This information gives you the ability to tune applications to maximize performance on a given parallel file system or to tune a parallel file system for a given set of codes.
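Measuring the effect of a change means fixing the workload and varying one parameter at a time. As a minimal sketch of the idea (plain local I/O standing in for a parallel file system; the sizes and the transfer-size parameter are illustrative assumptions, not anything specific to PVFS), here is a bandwidth measurement run at two different transfer sizes:

```python
import os
import tempfile
import time

def write_bandwidth(path, total_bytes, chunk_size):
    """Write total_bytes in chunk_size pieces, sync to disk, return MiB/s."""
    buf = b"\0" * chunk_size
    start = time.perf_counter()
    with open(path, "wb") as f:
        written = 0
        while written < total_bytes:
            f.write(buf)
            written += chunk_size
        f.flush()
        os.fsync(f.fileno())  # include the cost of getting data to disk
    elapsed = time.perf_counter() - start
    return (total_bytes / (1024 * 1024)) / elapsed

total = 8 * 1024 * 1024  # 8 MiB per run
path = os.path.join(tempfile.mkdtemp(), "bench.dat")
bw_small = write_bandwidth(path, total, 4 * 1024)     # 4 KiB transfers
bw_large = write_bandwidth(path, total, 1024 * 1024)  # 1 MiB transfers
os.unlink(path)
```

The point is the comparison, not the absolute numbers: rerunning the same fixed workload after changing one knob (transfer size here; stripe size or server count on a real parallel file system) tells you whether the change helped.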

Read more: Benchmarking Parallel File Systems

Resilient PVFS, Yes It Is Possible

How to sleep well when running PVFS

In the last article we looked at performance improvements for PVFS1 and PVFS2. In this installment, we'll examine improving the resilience or redundancy of PVFS as well as putting some flexibility into the configuration.

Redundancy, or resiliency, is the ability to tolerate errors or failures without the entire system failing. For PVFS (Parallel Virtual File System), this means the ability to tolerate individual failures without the whole file system becoming unavailable.

Read more: Resilient PVFS, Yes It Is Possible



©2005-2012 Copyright Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. The Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.