Why Linux on Clusters?

Ownership and Community

Somewhat similar to the vendor lock-in issue is the concept of ownership. Why is it that some people use Linux to solve big problems while others run it on old cash registers? I suspect the answer lies in the level of control provided by the software environment. In the absence of anyone saying "no you can't", there are many people saying "what if". Indeed, Linux in a sense has become the paint with which an artist can express whatever they want. Along with this expression of ideas comes ownership. It becomes "your" masterpiece and you control your destiny.

Another part of ownership is community. Since many hands have helped create the "practice and art of cluster computing", you can become a co-owner by simply helping a new person with a question on the Beowulf mailing list. The "community knowledge base" is immense and growing each day. If you experience a problem or have a question, rest assured that there is almost always someone only an email away who can help -- and who, incidentally, has suffered a similar fate. And, in an open environment, the quality of the answers is often higher.

Sidebar Two: Josip's Fix
In March of 1999, Josip Loncaric found something peculiar in the Linux TCP stack. It seemed that the kernel was introducing delays for small packets. The behavior had to do with how the kernel implemented delayed acknowledgments and TCP timeouts. While this behavior had minimal impact on pretty much every other corner of the market, the cluster community saw very poor performance for small packets.

The availability of source code allowed Josip to address the issue for two different kernels. The first patch resolved the issue for the 2.0.36 kernel and a second patch for the 2.2.12/13 kernel allowed the user to tune the short packet behavior.

Those cluster users who relied on the kernel TCP implementation were able to patch and rebuild kernels that worked better for small packets. There were no marketing decisions to be made, no release schedule, no NDA to sign, and no drawn-out decision process. Josip just fixed it.

You can read a report about the fix here.
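
The small-packet behavior described in this sidebar can be observed from user space with a simple ping-pong test. The sketch below is not Josip's patch, which changed the kernel itself. It is a minimal Python illustration, assuming a loopback connection on one machine, and the names `echo_server`, `recv_exact`, and `ping_pong` are made up for this example. It toggles the standard `TCP_NODELAY` socket option, which disables Nagle's algorithm. That is a related but different application-level knob, so treat the numbers as a rough way to watch small-message latency rather than a reproduction of the 1999 problem.

```python
import socket
import threading
import time


def echo_server(listener: socket.socket, nodelay: bool) -> None:
    """Accept one connection and echo bytes back until the peer closes."""
    conn, _ = listener.accept()
    with conn:
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, int(nodelay))
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, since TCP may deliver a message in pieces."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        buf.extend(chunk)
    return bytes(buf)


def ping_pong(message_size: int = 64, rounds: int = 1000,
              nodelay: bool = True) -> float:
    """Return the mean round-trip time in seconds for small TCP messages."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    server = threading.Thread(
        target=echo_server, args=(listener, nodelay), daemon=True
    )
    server.start()

    payload = b"x" * message_size
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, int(nodelay))
        start = time.perf_counter()
        for _ in range(rounds):
            client.sendall(payload)            # "ping"
            recv_exact(client, message_size)   # wait for the "pong"
        elapsed = time.perf_counter() - start

    server.join(timeout=5)
    listener.close()
    return elapsed / rounds


if __name__ == "__main__":
    for flag in (False, True):
        rtt = ping_pong(message_size=64, rounds=1000, nodelay=flag)
        print(f"TCP_NODELAY={flag}: mean round trip {rtt * 1e6:.1f} microseconds")
```

On a modern kernel over loopback the two settings will often look nearly identical. Running the same test between two cluster nodes, and varying `message_size`, gives a better picture of how a given TCP stack treats short messages.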

Challenges

In fairness, all is not rosy in Linux "clusterland." There are issues unique to this environment that will need to be addressed. Perhaps the most important issue is how ISVs (independent software vendors) are going to target a fast-moving and diverse software environment. It is not an easy problem. And, it is a problem that needs to be solved.

In addition, an ISV needs a hard line between their product and the cluster infrastructure. A mis-configured MPICH library should not be the problem of the ISV (although it often is).

Finding good professional support is also an issue. Clusters are diverse and thus make support from a single source difficult. New support models that leverage the openness of the cluster infrastructure may be the best way to proceed.

Beyond ISVs and support, another lurking challenge may be an entire kernel fork for HPC systems. A highly optimized HPC kernel may diverge so much from the original kernel that a new version is warranted.

It is not really a Linux thing

In a way, the success of Linux in the HPC world is as much about the openness as it is about the UNIX heritage. Clusters have been and will continue to be built with closed source systems. Simple-minded religious "Windows" vs. "Linux" arguments don't really solve anything. The real issue is that an open approach seems to be a much better way to address a small market with specific needs than a one-size-fits-all approach designed for mass markets. Both are valid models.

Indeed, HPC users now expect the "open plumbing" provided by Linux. The plumbing analogy is quite accurate. Many people live in a house where the plumbing just works. It comes in and goes out through the walls of the house. Most people have no intention of ever modifying the plumbing, but would surely like the option if the need arose. They might get a bit perturbed, however, if they wanted to hang a picture and the plumbing were secret: putting a nail in the wall would risk damaging the pipes. So, no pictures for you. The plumbing will work, but your home is just not as interesting as it could be.

Fundamentally, we all like options, the more the better. Linux on clusters, like Linux on most other things, is about maximizing choice. In the HPC world, the decision to use Linux may seem like the natural choice, but remember, "... you didn't come here to make the choice, you've already made it. You're here to try to understand why you made it. I thought you'd have figured that out by now." -- famous cluster architect.

Sidebar Three: Resources
Tom Sterling's Beowulf Breakthroughs

Myricom

Gamma Project

bproc

openMosix

BioBrew

Beowulf mailing list and Web Page

Book: How to Build a Beowulf, by Sterling, Salmon, Becker, Savarese, MIT Press, ISBN 0-262-69218-X

    Creative Commons License
    ©2005-2012 Copyright Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. The Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.