Building Your First Cluster

Usage Examples

The following are simple examples of taskmaster/task usage. Don't worry that the parallel speedup seems modest. Next month we'll explore parallel speedup with taskmaster and show how even this very simple example, with the crudest possible task communication mechanism, can still yield excellent parallel scaling. In the meantime, you can play with the arguments on your own as "homework" and see if you can discover this yourself!

rgb@lilith|T:114>./taskmaster hostfile 1 10 1

Spawning host threads

Host lucifer thread running.

rand[0] = 0.840188
rand[1] = 0.394383
rand[2] = 0.783099
rand[3] = 0.798440
rand[4] = 0.911647
rand[5] = 0.197551
rand[6] = 0.335223
rand[7] = 0.768230
rand[8] = 0.277775
rand[9] = 0.553970

Results:
  nhosts   nrands    delay     time
     1        10        1       13

Note that one host takes thirteen seconds to do ten seconds' worth of work (10 random numbers at a delay of 1 second each)! Not too good!

rgb@lilith|T:121>./taskmaster hostfile 5 10 1

Spawning host threads

Host lucifer thread running.
Host caine thread running.
Host uriel thread running.
Host abel thread running.
Host archangel thread running.

rand[0] = 0.840188
rand[1] = 0.394383
rand[2] = 0.700976
rand[3] = 0.809676
rand[4] = 0.561380
rand[5] = 0.224983
rand[6] = 0.916458
rand[7] = 0.133982
rand[8] = 0.274746
rand[9] = 0.046468

Results:
  nhosts   nrands    delay     time
     5        10        1        5

Better! Five hosts now take 5 seconds to do 10 seconds' worth of work. However, that is a lot of computers for only a factor of two speedup! One more try:

rgb@lilith|T:122>./taskmaster hostfile 5 10 10

Spawning host threads

Host lucifer thread running.
Host caine thread running.
Host uriel thread running.
Host abel thread running.
Host archangel thread running.

rand[0] = 0.840188
rand[1] = 0.394383
rand[2] = 0.700976
rand[3] = 0.809676
rand[4] = 0.561380
rand[5] = 0.224983
rand[6] = 0.916458
rand[7] = 0.133982
rand[8] = 0.274746
rand[9] = 0.046468

Results:
  nhosts   nrands    delay     time
     5        10       10       30

Much better. Now five hosts take 30 seconds to do 100 seconds' worth of work, a speedup of a bit more than three. This might turn out to be worthwhile after all!
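
If you want to keep score while you experiment, the arithmetic behind these remarks is easy to script. The following is only a sketch, not part of taskmaster or task: it assumes that the "work" in a run is nrands times delay seconds (which is how the ten and 100 seconds quoted above come out), it measures speedup against that ideal serial time rather than against a timed single-host run, and the function name summarize is made up for illustration.

```python
# Speedup and efficiency for the three runs shown above.
# Assumption: the work in a run is nrands * delay seconds, i.e. the time
# an ideal single host would need with zero overhead.

runs = [
    # (nhosts, nrands, delay_s, wall_time_s), copied from the Results tables
    (1, 10, 1, 13),
    (5, 10, 1, 5),
    (5, 10, 10, 30),
]


def summarize(nhosts, nrands, delay, wall_time):
    """Return (work, speedup, efficiency) for one taskmaster run."""
    work = nrands * delay          # seconds of work if done one task at a time
    speedup = work / wall_time     # relative to the ideal serial time
    efficiency = speedup / nhosts  # fraction of the ideal nhosts-fold speedup
    return work, speedup, efficiency


for run in runs:
    work, speedup, efficiency = summarize(*run)
    print(f"{run[0]} hosts: {work:3d} s of work in {run[3]:2d} s "
          f"-> speedup {speedup:.2f}, efficiency {efficiency:.2f}")
```

With the numbers above, this gives a speedup of about 0.77 for the single host (overhead makes it slower than ideal), 2.00 for five hosts with a one-second delay, and 3.33 for five hosts with a ten-second delay. Dividing by the number of hosts gives efficiencies of roughly 0.77, 0.40, and 0.67, so the longer tasks clearly make better use of the five machines.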

Conclusion

Building a cluster (or discovering that your existing LAN is a cluster) turns out to be pretty easy. Using a fairly simple "master" script and a "worker" application, we can clearly do work in parallel and can already see a significant speedup (a factor of three using five hosts), which should be quite reproducible on just about any LAN. Between now and next month, you can play with taskmaster and task and see if you can discover settings that yield really good parallel speedup (where running on 5 hosts completes in close to 1/5 the time of one host).

In future columns we'll explore themes like Amdahl's Law and parallel speedup, parallel libraries, "the standard Linux cluster design", and more, using this basic cluster (and taskmaster/task) as a starting point. We'll also learn better ways of installing and managing a cluster, how (and when) to go beyond the simple NOW-style cluster and into a GRID or a Beowulf, and how to compare shelved towers, rackmounts, and different processor types and networks. Our goal is to build up enough experience that you are ready to be "handed off" to my brother columnists, whose dedicated columns will take you from neophyte to perfect master in the specific areas of clustering that benefit you most.

Hope to see you there.

(source code is on the next page)

    Creative Commons License
    ©2005-2012 Copyright Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. The Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.