Summer of Code: Gentoo Cluster Project

Published on Friday, 15 August 2008 08:38
Written by Douglas Eadline
Or, How I Spent My Summer Vacation

Recently, I had a chance to talk with Eric Thibodeau about his Google Summer of Code cluster project. Eric and Donnie Berkholz have been working hard on using Gentoo, Seed Linux, and Catalyst to give everyday users easy access to a Beowulf clustering/HPC environment, i.e., a live Gentoo cluster CD/DVD. The following is Eric's status report.

Through this year's Google Summer of Code (2008) funding, Gentoo is laying down the groundwork for its first cluster-centric packages and LiveCD. We're concentrating our efforts on providing media (CD/DVD) from which one can boot a machine into the Master Node's role; the master then provisions the slave nodes with NFS-mounted images. This means: no HDDs required, no modification to that computer lab you promised the techs you wouldn't touch!

Much of the project's emphasis is on retaining the original Gentoo configuration approach so that the management of the cluster doesn't differ from a regular installation. This also means that we stay as close as possible to the upstream's code and implementation docs. In essence, the ebuilds (Gentoo packages) resulting from this project will be usable on any fresh Gentoo installation, taking care of the details of initially setting up DHCP, NFS, PXE, centralized authentication (LDAP), as well as pulling in some basic clustering packages and utilities such as MPI libraries, profiling and benchmarking tools (yes, we're looking into creating an ebuild for the Beowulf Performance Suite).
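To make the PXE and NFS pieces a little more concrete, here is a minimal Python sketch of the kind of boot entry a master node might hand to a diskless slave. It is not taken from the project's ebuilds: the `NodeImage` class, the `pxelinux_entry` function, the IP address, and the paths are all hypothetical. It relies only on the generic PXELINUX keywords (`DEFAULT`, `LABEL`, `KERNEL`, `APPEND`) and the standard Linux kernel NFS-root parameters (`root=/dev/nfs`, `nfsroot=`, `ip=dhcp`); the actual LiveCD may well use a different boot setup.

```python
from dataclasses import dataclass


@dataclass
class NodeImage:
    """Hypothetical description of one diskless node image."""
    label: str       # label used in the PXELINUX config
    kernel: str      # kernel path relative to the TFTP root
    nfs_server: str  # IP address of the master node (assumed)
    nfs_root: str    # directory the master exports over NFS (assumed)


def pxelinux_entry(image: NodeImage) -> str:
    """Render a PXELINUX config that boots a node with its root on NFS."""
    # Generic kernel NFS-root parameters; the project may use others.
    append = (
        f"root=/dev/nfs nfsroot={image.nfs_server}:{image.nfs_root} "
        "ip=dhcp ro"
    )
    lines = [
        f"DEFAULT {image.label}",
        f"LABEL {image.label}",
        f"  KERNEL {image.kernel}",
        f"  APPEND {append}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Illustrative values only.
    node = NodeImage(
        label="cluster-node",
        kernel="vmlinuz",
        nfs_server="10.0.0.1",
        nfs_root="/srv/nodes/root",
    )
    print(pxelinux_entry(node))
```

In a layout like this, the slave nodes keep no local state: the DHCP server points them at the boot loader, and everything else comes from the master's NFS export, which is what makes the "no HDDs required" promise possible.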

The current work environment is x86_64 with no specific optimizations, so the generated code will work on both AMD and Intel platforms (the master is not limited to booting nodes of its own architecture). But since this is Gentoo, and we love building and tweaking until our hardware sweats, the entire CD creation process can be reproduced and customized with a single command: catalyst. All files and instructions required to reproduce the current iteration of the LiveCD are available in the project's git repository (it is Linux-style alpha, meaning it really is still in development). This way, anyone can reproduce the process, fine-tune it, and even include personal code on the live medium.

Hopefully, the process will be simple enough that people can share their actual execution environments, making it easier to compare real code performance and the results of environment tweaks.

If you want to help, you can contact Eric at: kyron (you know what to put here)  neuralbs (and here) com.
