Features

Linux Cluster Urban Legends

Published on Wednesday, 20 July 2005 20:00
Written by Douglas Eadline

Just like there are "Urban Legends" that never seem to die, so it seems there are "Cluster Urban Legends" that persist even today. We have all seen or heard them: wacky things people say about HPC clusters. As a service to the HPC (High Performance Computing) community, and to the world at large, I have decided to prepare a list of my personal favorites. Hopefully, these legends (misconceptions) will eventually fade, but then again, this is the Internet age.

Enough introduction. Here comes the clue train.

1. Can you imagine a Beowulf cluster of these!

This comment/joke is seen often on Slashdot whenever there is some new processor, computer, video game, cluster, or whatever. My answer, of course, is "No, not really." Just connecting things does not make a usable cluster. At the beginning and end of the day, HPC is all about price-to-performance. Whatever you connect must make sense. Connecting a bunch of old PII systems to equal the performance of a single Opteron may make sense if you are building a rendering farm, but may be a bad idea if you want to calculate the weather for next week.

So what is a cluster? Let's define it as a collection of workers that communicate to produce a large amount of work. In computer terms, it is computer hardware connected with some form of communication medium (Gigabit Ethernet, for example).

So here is the first thing to remember:

Price-to-performance rules

So, yes, imagining a cluster of smart washing machines may make economic sense for someone, but for most HPC people it probably won't be too effective.

By the way, clustering is an old idea. Any time you have more than one thing working together to produce something, you are clustering. Ants do it quite well. Go ahead, say it: "Can you imagine ... of ant hills?"

2. This software will allow me to connect all the desktops in my company to create an unlimited supply of supercomputing power

This statement has appeared in the press quite a bit. It seems every time someone talks about clustering, this idea comes up. While it does have some merit, the "unlimited supply of computing power" is where the train comes off the track.

Local intranets vary quite a bit in terms of resources. Providing you can resolve the software issue, the economic justification (that price-to-performance thing) depends on what you want to do.

To help understand the dynamics, imagine playing football (American) using cell phones to communicate. Every time there is any communication on the field (play calling, huddle, snap count, referee whistle, time outs, etc.), everyone must stop and wait for the communication to finish by dialing a cell phone and calling everyone who needs that specific information. The game would be interesting in a weird kind of way, but it would also slow down, and the cost of advertising would quickly drop because, even though there is more time for commercials, the event just got very boring. The "price-to-performance" would be pretty low.

Now imagine a marathon race of ten runners, where each runner has a cell phone. Over the phone they are told to start. The delay between calls to individual runners may be on the order of minutes, but the race is on the order of hours. The small delay at the beginning will probably not influence the end result too much.

Are you getting the idea? The economics of using latent PC cycles depends on what you want to do. Things like SETI@home are like marathon races, where independent runners can be sent out and return at some point. Other problems may require more communication and thus are not economically suitable. For instance, it may take that big "LAN supercomputer" your company owns one week to compute tomorrow's weather.
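The marathon-vs-football trade-off can be sketched with a toy run-time model. Everything here, the function name, the latency, the message counts, is a hypothetical illustration rather than a measurement of any real cluster:

```python
# Toy model: ideal compute time split across nodes, plus a fixed
# delay per message exchanged. All numbers below are made up to
# show the shape of the trade-off, not to describe real hardware.

def parallel_time(serial_time, nodes, messages, latency):
    """Seconds to finish: divided compute work plus communication delay."""
    return serial_time / nodes + messages * latency

LATENCY = 0.05  # assumed seconds per message on a slow office LAN

# "Marathon" job: ten hours of independent work, a handful of messages.
marathon = parallel_time(serial_time=36000, nodes=10, messages=10, latency=LATENCY)

# "Football" job: the same work, but it must exchange a million messages.
football = parallel_time(serial_time=36000, nodes=10, messages=1_000_000, latency=LATENCY)

print(f"marathon-type: {marathon:.1f} s")  # near-ideal ten-fold speedup
print(f"football-type: {football:.1f} s")  # slower than one machine alone
```

With these assumed numbers, the marathon job finishes in about an hour, while the chatty football job takes longer than the original ten hours on a single machine, the "LAN supercomputer" actually lost money.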

Price-to-performance is determined by what you want to do with your cluster

Remember this statement as well.

3. The Beowulf/MPI/PVM software will turn ordinary PCs into a supercomputer that will then run your programs faster

After reading about unlimited computing power from HPC/Beowulf clusters, everyone gets pretty excited. Sorry to have to bring you down, but if your program is designed to run on one processor, it will run on one processor when loaded on your shiny new cluster. Linking with MPI (Message Passing Interface) or PVM (Parallel Virtual Machine) libraries does not make your program run on multiple processors. Bummer.
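To make the point concrete, here is a minimal sketch in plain Python (using the standard multiprocessing module rather than MPI or PVM) showing that parallelism comes from the program's design, not from the library it links against. The function and data names are made up for illustration:

```python
# The parallel version is not automatic: the programmer must decompose
# the problem into independent pieces and explicitly farm them out.
from multiprocessing import Pool

def simulate(region):
    """Stand-in for a chunk of real work on one piece of the problem."""
    return sum(i * i for i in range(region * 1000, (region + 1) * 1000))

def serial_run(regions):
    # A serial design: one processor does everything, cluster or not.
    return [simulate(r) for r in regions]

def parallel_run(regions, workers=4):
    # A parallel design: the work is split by region and distributed.
    with Pool(workers) as pool:
        return pool.map(simulate, regions)

if __name__ == "__main__":
    regions = list(range(8))
    # Same answer either way -- but only the second design can use
    # more than one processor, because the programmer made it so.
    assert serial_run(regions) == parallel_run(regions)
```

Swap the region list for a real domain decomposition and the Pool for message passing across nodes, and you have the actual job of parallel programming, which is why it can be hard.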

There is no magic Beowulf software either. Get over it. Beowulf is the name of the commodity computing project that was developed by Tom Sterling and Don Becker. You can eliminate a large portion of the folklore running around in your head if you read about the original project.

There is software from Scyld called Scyld Beowulf. This software does some amazing things, but it does not do any magic. There are other cluster distributions (ROCKS, OSCAR, WAREWULF), but again, if you are looking for magic, you have come to the wrong place.

The programming story is much larger than can be covered here. Just remember:

Programs must be designed to run on a cluster. Parallel programming can be hard.


4. Communication speed (throughput) between nodes is important

The joke about a station wagon full of tapes being the fastest way to send large amounts of data is quite true. Intuitively, you know something is missing, because we use wires and fiber instead of station wagons to send data. The missing piece is called latency.

Doug's Latency Experiment: Next time you are in the shower, which for some of you fellow geeks may be next week, adjust the water to where it is comfortable. Now turn the hot water off and start counting. Note the time from when you turn the faucet until you scream. That time is the latency of your shower water. There is also the flush-the-toilet version of this experiment, but that takes two people.

So ask yourself, "How many times an hour can I make myself scream?" For those following along from home, this can be considered a thought experiment. If there were less of a delay between screams (low latency), you could increase the number of messages (screams) you could send via the shower.

So that is latency. If you have an application that needs to send a lot of messages in a short time (like a football team using cell phones), then latency is important. On the other hand, if you only need to send a few messages, like our marathon runners, then latency is not that important.

Of course, how much data we can send in a certain amount of time is also important. This is called throughput. Throughput is a rate (bits/second); think of that station wagon full of tapes, OK, a Hummer for you youngsters, moving along the highway. Latency is how long it takes to load and unload the tapes.
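A back-of-the-envelope model ties the two together: the time to move a message is a fixed latency cost plus the bytes divided by the throughput. The numbers below are rough assumptions in the ballpark of Gigabit Ethernet, picked only to show which term dominates:

```python
# Message time = fixed startup cost (latency) + payload / rate (throughput).
# Both constants below are illustrative assumptions, not measurements.

def transfer_time(num_bytes, latency, bandwidth):
    """Seconds to move one message of num_bytes."""
    return latency + num_bytes / bandwidth

GIGE_LATENCY = 50e-6    # assume ~50 microseconds of startup per message
GIGE_BANDWIDTH = 125e6  # Gigabit Ethernet is roughly 125 MB/s

# Moving 100 MB as one big message: throughput dominates.
one_big = transfer_time(100e6, GIGE_LATENCY, GIGE_BANDWIDTH)

# The same 100 MB as a million 100-byte messages: latency dominates.
many_small = 1_000_000 * transfer_time(100, GIGE_LATENCY, GIGE_BANDWIDTH)

print(f"one big message:          {one_big:.2f} s")
print(f"a million small messages: {many_small:.2f} s")
```

Under these assumptions the single big message takes well under a second while the million tiny ones take nearly a minute, which is exactly why chatty (football-type) applications care so much about latency.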

5. Communication speed and latency are important

Now you think you are so smart. Throughput and latency are not the whole story. How much work a processor must do to move data is also important. If the crew that fills the Hummer with tapes is also responsible for creating the tapes, then the work must be shared and the overall performance may suffer.

6. Communication speed, latency, and processor overhead are important

Now you think you are really smart. Not so fast. Communication speed, latency, and processor overhead are all important, right? It depends.

You are now a cluster expert. The answer to 90% of all cluster questions is the following:

It all depends on the application

The right hardware and software design can be very dependent on the application. Some applications (marathon types) run well on almost any cluster, while others (football types) need very high performance parts.

"Why not just use faster parts so that every application works fine?" Cost is the issue here, Skippy. When designing a cluster there are certain constraints, and cost is usually one of them. The old bang-for-the-buck idea. Or, as I recall someone mentioning at some point, price-to-performance.

7. The Top 500 list is the ultimate measure of computer performance

Behold the Top 500 supercomputers. Got to get me one of those thingies. The Top 500 list is a measure of how fast a specific program runs on a computer or cluster. It is a single data point. So take out a sheet of paper and put a point in the middle. That point is the performance of the world's fastest computer running something called the "Linpack benchmark." Now take out another piece of paper and put a similar point in the middle. That point is the performance of the last computer on the Top 500 list running a program called BLAST. Hold the two pieces of paper next to each other. Now, which one is faster? Get the idea? In programming terminology, the scope of a cluster's rank in the Top 500 is limited to the Top 500. When running other programs, Your Mileage May Vary (YMMV).

Here is the last thing to remember.

HPC/Beowulf clusters are about building machines around problems

If you need a fast interconnect, then buy it. If you don't, then don't buy it and instead buy more processors. Maximize your performance and buy what you need to solve your problem. This process does require more engineering on the end user's part, however. Of course, Cluster Monkey is there to help.

In conclusion, you should now be able to totally discount all those cluster urban legends you see all the time. And, if you have already forgotten the things you were supposed to remember, here is one last sentence to sum it all up:

there is no free lunch with clusters -- just a reasonably priced buffet on which to feed your computing needs

Douglas Eadline is the swinging Head Monkey at ClusterMonkey.net.
