
The Value Cluster (Part 3): Packages and Testing

Install LAM

At this point, we hope we haven't worn you out. We have the VNFS tuned to where we want it, and we have installed and tested SGE for queuing and scheduling jobs. Since we are going to want to run parallel codes on the cluster, the next step is to install an MPI implementation. We chose LAM/MPI.

Grab a copy of LAM from the Warewulf download site and build the binary rpms using the command rpmbuild --rebuild lam-7.0.6-7.caos.src.rpm.

This process will create a number of rpms that should be installed on the master node as follows (look in /usr/src/redhat/RPMS/i386 for the files).

# rpm -i lam-7.0.6-7.caos.i386.rpm
# rpm -i lam-devel-7.0.6-7.caos.i386.rpm
# rpm -i lam-docs-7.0.6-7.caos.i386.rpm
# rpm -i lam-extras-7.0.6-7.caos.i386.rpm

The installation on the nodes is a bit simpler. As before, we will install directly to the kronos VNFS.

# rpm -i --root /vnfs/kronos/ lam-7.0.6-7.caos.i386.rpm

Again, since we changed the VNFS, we need to rebuild it and reboot the nodes with the new image (see above).

Testing LAM/MPI

Now that LAM is installed, let's test it to make sure it works correctly. Since it's always good to test as a user and not as root, make sure you have created a user account on the master node. If not, create one now. After you create the account, be sure to propagate (as root) the account information to all of the nodes using the wwnode.sync command.

Log in as the user and create a subdirectory for testing LAM. Download the lam-tests.tgz tar file from the Kronos download page (look in the src directory). After you have extracted the file, check lam-test/pi/README for instructions on how to compile and run both C and Fortran MPI programs. Also note that the Fortran programs must be linked statically because not all of the required libraries are in the VNFS.

For these tests, we're going to run the code outside of SGE, so we need to create a file, lamnodes, with a list of the nodes to be used for the run. One way to do this is to run the following command.

wwnode.list -r -q > lamnodes

However, this only gets you a list of the compute nodes. If you want to use the master node as part of the LAM network, edit the file to add kronos to the beginning of the file.
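
The edit described above can also be scripted. What follows is a minimal sketch and not part of the original procedure: make_lamnodes is a hypothetical helper name, and nodes.txt and ./my_mpi_program are placeholder file names. It assumes the node list has one host name per line.

```shell
#!/bin/sh
# make_lamnodes MASTER NODELIST OUTPUT  (hypothetical helper)
# Writes MASTER as the first line of OUTPUT, followed by every
# non-blank line of NODELIST whose first field is not MASTER.
make_lamnodes() {
    master=$1
    nodelist=$2
    output=$3
    {
        echo "$master"
        awk -v m="$master" 'NF && $1 != m' "$nodelist"
    } > "$output"
}

# Typical use, run as the test user (commented out so the sketch
# can be sourced without touching a running cluster):
#   wwnode.list -r -q > nodes.txt
#   make_lamnodes kronos nodes.txt lamnodes
#   lamboot -v lamnodes              # start the LAM daemons
#   mpirun -np 4 ./my_mpi_program    # placeholder program name
#   lamhalt                          # shut LAM down again
```

Note that NODELIST and OUTPUT must be different files, since the output file is truncated before the node list is read.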

Integrating LAM into SGE

In order to keep our ramdisk small, we used the excludes-aggressive file. The keyword here is "aggressive": a few libraries that LAM needs were excluded from our ramdisk, so we need to add them back. Open the /etc/warewulf/vnfs/excludes-aggressive file and add the following lines

+ usr/lib/libkrb4.so.*
+ usr/lib/libkrb5.so.*
+ usr/lib/libdes425.so.*
+ usr/lib/libk5crypto.so.*

after this line:

usr/lib/libpopt.[^a]*
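
If you would rather not make this edit by hand, the insertion can be sketched with awk. This is only an illustration under stated assumptions: add_lam_includes is a hypothetical helper name, and it assumes the libpopt line appears in the file exactly as shown, with no leading or trailing whitespace. If the line is not found, the file is printed unchanged.

```shell
#!/bin/sh
# add_lam_includes FILE  (hypothetical helper)
# Prints FILE to standard output with the four "+" include lines
# placed directly after the libpopt exclude line.
add_lam_includes() {
    awk '
        { print }
        $0 == "usr/lib/libpopt.[^a]*" {
            print "+ usr/lib/libkrb4.so.*"
            print "+ usr/lib/libkrb5.so.*"
            print "+ usr/lib/libdes425.so.*"
            print "+ usr/lib/libk5crypto.so.*"
        }
    ' "$1"
}

# Example, run as root (commented out; keeps a backup copy first):
#   f=/etc/warewulf/vnfs/excludes-aggressive
#   cp "$f" "$f.orig"
#   add_lam_includes "$f.orig" > "$f"
#   diff "$f.orig" "$f"     # should show only the four added lines
```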

While we are at it, let's get rid of the unused Kerberos files. Just add the line:

/usr/kerberos

to the end of the /etc/warewulf/vnfs/excludes-aggressive file. The one other thing we need to do is add rsh to the nodes. For some reason the VNFS did not include rsh, which LAM/MPI needs when it is run by SGE. (SGE may start LAM jobs on a subset of the worker nodes that does not include the master node, and LAM needs rsh to start its daemons on remote nodes. When we ran LAM from the master node, which has rsh installed, it was able to start daemons on any node.) You can grab the rpm from the Fedora Core site using the following command.

# wget -c http://download.fedora.redhat.com/pub/fedora/linux/core/2/i386/os/Fedora/RPMS/rsh-0.17-21.i386.rpm

Then, as before, install it in the kronos VNFS.

# rpm -i --root /vnfs/kronos/ rsh-0.17-21.i386.rpm

Of course, you need to rebuild the VNFS and reboot the nodes again. If you would like to use the master node as an SGE execution node, you will also need to install and configure the rsh-server rpm.
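
Before rebuilding the VNFS, a quick check that the rsh client actually landed in the image can save a reboot cycle. The sketch below is an assumption-laden illustration: vnfs_has_exec is a hypothetical helper name, and usr/bin/rsh is assumed to be where the rsh rpm places the client binary.

```shell
#!/bin/sh
# vnfs_has_exec VNFS_ROOT RELATIVE_PATH  (hypothetical helper)
# Succeeds if RELATIVE_PATH exists under VNFS_ROOT and is executable.
vnfs_has_exec() {
    [ -x "${1%/}/$2" ]
}

# Example (commented out; the path is an assumption about the rpm layout):
#   vnfs_has_exec /vnfs/kronos usr/bin/rsh || echo "rsh missing from VNFS"
#   rpm -q --root /vnfs/kronos rsh lam     # ask the VNFS rpm database
```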

Finally, the lam-tests.tgz (see above) has some SGE tests that can be used to test LAM and SGE integration. See the lam-test/sge/README file. Also, be sure to look at the SGE submission scripts for more information.

Beowulf Performance Suite

Now that we have the essential programs installed and running, we can give our cluster the once-over. Download the Beowulf Performance Suite bps-1.3-1.i386.rpm from the Kronos download page. The bps package needs two other packages (gnuplot and expect). Simply use yum to install these packages and then install the bps rpm.

# yum install gnuplot
# yum install expect
# rpm -i bps-1.3-1.i386.rpm

The bps package is a collection of tests for a cluster, ranging from memory bandwidth tests to the NAS Parallel Benchmark suite. You can get information on how to use bps from A Tool for Cluster Performance Tuning and Optimization. The results for Kronos are here. We will have more next time on how to run these and other programs. And yes, we will describe how to run HPL, the Top500 benchmark.

Sidebar Resources
Kronos Downloads Page
Warewulf
PDSH
LAM-MPI
Sun Grid Engine
Sun Grid Engine for Warewulf
Intel Gigabit Adapter Optimization

Douglas Eadline is the head Monkey at Clustermonkeys and was the editor of ClusterWorld Magazine. Jeffrey Layton is Doug's loyal sidekick and works for Linux Networx during the day and fights cluster crime at night.

    Creative Commons License
    ©2005-2012 Copyright Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. The Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.