Autonegotiation, Diskless, PVFS, Multicast Discussions

Past Postings from the Beowulf, Bioclusters, PVFS Users Lists

There are a huge number of mailing lists available on the Internet, and the Beowulf mailing list is a resource for the cluster community. This column summarizes and expands on issues that have been discussed on the Beowulf list, and also includes issues from other mailing lists that are useful and pertinent to clusters. In this column we visit the Bioclusters, Beowulf, and PVFS lists. We begin our survey in the summer of 2003.

Bioclusters: Discussion of autonegotiation between switches and NICs

A very interesting discussion on the Bioclusters mailing list started on August 8, 2003, when Victor Ruotti asked why his Apple Xserve systems would not connect at full duplex. The first suggestion was to force the systems to full duplex using ifconfig. Donald Becker came to the rescue and explained that forcing full duplex on both ends, the NIC (Network Interface Card) end and the switch port end, is a bad idea that can lead to administration headaches and nightmares. He said that autonegotiation is reliable and that any failure is likely due to flawed hardware or poorly configured network switches, and he gave a couple of links for people to read. He also pointed out that the transceivers on almost all NICs fall back to autosensing the link speed if autonegotiation fails, and explained how this affects the speed of the connection. He finished with a brief historical perspective on why forced manual configuration started and why it has stayed so prevalent in network configurations.


What we can take from this discussion is that you should always allow the NIC and switch to autonegotiate to avoid problems down the road. You can then check the negotiated speed and duplex with diagnostic tools, or test the connection with benchmarking tools. If the link does not come up at the speed and duplex you expect, the problem may be bad hardware or a faulty switch configuration.
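As a rough illustration of the "check your connection" step, the sketch below reads the speed and duplex that the Linux kernel reports for an interface. It assumes a Linux node that exposes these values under /sys/class/net; the interface name "eth0", the function names, and the expected speed of 100 Mb/s are placeholders chosen for the example and do not come from the discussion.

```python
from pathlib import Path


def read_link_state(iface, sysfs_root="/sys/class/net"):
    """Return (speed_mbps, duplex) as reported by the Linux kernel for iface.

    Either value is None when the kernel does not report it, for example
    when the link is down or the interface does not exist.
    """
    base = Path(sysfs_root) / iface

    def read(name):
        try:
            return (base / name).read_text().strip()
        except OSError:
            return None

    raw_speed = read("speed")
    duplex = read("duplex")
    try:
        speed = int(raw_speed) if raw_speed is not None else None
    except ValueError:
        speed = None
    # The kernel reports a negative value when the speed is unknown.
    if speed is not None and speed < 0:
        speed = None
    return speed, duplex


def looks_mismatched(speed, duplex, expected_speed):
    """Flag a link that came up slower than expected or not at full duplex."""
    if speed is None or duplex is None:
        return True
    return speed < expected_speed or duplex != "full"


if __name__ == "__main__":
    # "eth0" and 100 Mb/s are assumed values for this example.
    speed, duplex = read_link_state("eth0")
    print("eth0:", speed, "Mb/s,", duplex, "duplex")
    if looks_mismatched(speed, duplex, expected_speed=100):
        print("Link is not what was expected; check cabling and switch port settings.")
```

A link that shows half duplex, or a lower speed than both ends support, is the usual sign that one side was forced to a fixed setting while the other side was left to negotiate.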

There are a number of tools to help you diagnose your network connection. Donald Becker's excellent website can help you get started, and of course you can always post to a mailing list, such as the Beowulf mailing list, to get help. Before you post, please read Donald's website, go over the Beowulf mailing list archives, and use a search engine such as Google to look for an answer. Then post your questions to the mailing list.

Beowulf: Where are diskless compute nodes appropriate?

On July 15, 2003, Tod Hagan asked the Beowulf list, "where are diskless compute nodes inappropriate?" The idea of a diskless node is to do away with the hard disk in the node and use a root filesystem mounted over the network, usually via NFS (Network File System). Diskless nodes are used because they require less power, have fewer moving parts and therefore better reliability, and are cheaper than nodes with disks. Nicholas Henke responded that diskless nodes are not appropriate when accessing data locally is faster than accessing it via NFS or some other network filesystem. He also mentioned that diskful nodes (nodes with hard disks) are appropriate when the application uses swap for memory. As an example, he described his site, where they run BLAST (a bioinformatics application) on large data sets, which generates a large amount of disk I/O (Input/Output). Putting disks in the nodes and copying the data to them gives faster run times than running over NFS. Joe Landman echoed this with some specific numbers and examples, and pointed out that a central fileserver serving out a filesystem can be swamped if even a relatively small number of nodes perform I/O operations at the same time.
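The point about a swamped fileserver can be made concrete with some back-of-the-envelope arithmetic. The sketch below is a best-case estimate only: it assumes the server's single network link is the bottleneck and ignores protocol overhead and the server's own disks. The link speed and local disk rate used in the example are illustrative figures, not the numbers quoted on the list.

```python
def per_node_share_mb_s(server_link_mbit_s, active_nodes):
    """Best-case MB/s each node gets when active_nodes share one server link."""
    if active_nodes < 1:
        raise ValueError("active_nodes must be at least 1")
    return server_link_mbit_s / 8.0 / active_nodes


def nodes_until_slower_than_disk(server_link_mbit_s, local_disk_mb_s):
    """Smallest number of simultaneous readers for which each node's share
    of the server link drops below the local disk rate."""
    n = 1
    while per_node_share_mb_s(server_link_mbit_s, n) >= local_disk_mb_s:
        n += 1
    return n


# Illustrative figures: a 1000 Mbit/s server link and a 30 MB/s local disk.
print(per_node_share_mb_s(1000, 5))            # share per node with 5 readers
print(nodes_until_slower_than_disk(1000, 30))  # readers needed to fall below local disk
```

Under these assumed figures, five nodes reading at once are already enough for each to see less bandwidth than a local disk would give, which is consistent with the observation that only a small number of busy nodes is needed to saturate a central server.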

Bill Broadley continued with a cluster architecture example in which each node has a local hard drive that is used only for swap and scratch space, while the operating system (OS) is mounted via NFS. If a hard drive dies, the node can be rebooted without swap or scratch space, so it is still useful until it is repaired or replaced. Bill also pointed out that this can reduce the administrative cost of managing, patching, backing up, and troubleshooting the nodes, because the operating system and configuration are stored centrally on the main server for the cluster. You could extend this idea to a large number of nodes by using a small number of dedicated servers whose only purpose is to be an NFS server for the OS of neighboring nodes. Some of the larger clusters are known to use this type of approach.

Beowulf: Multicast or snowball copy?

On August 18, 2003, Rene Storm asked the Beowulf mailing list how to distribute large files efficiently throughout a cluster. Several people responded with recommendations. Mikael Fredriksson recommended BitTorrent. Felix Rauch recommended Dolly, a tool he wrote to clone hard drives. Felix also mentioned that an appropriate TCP chaining approach is often faster than a multicast approach, and gave a reference to two papers about copying files efficiently across clusters. Donald Becker echoed that using multicast can cause problems, particularly on larger clusters. He also mentioned that a geometrically cascading copy can work very well. Thomas Lange suggested rgang, a Python tool that uses a tree structure for copying files to nodes or executing commands on many nodes.
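To illustrate the idea of a geometrically cascading copy, the sketch below plans which node sends to which in each round. It is a minimal planning example under the assumption that every node that already holds the file can send it to exactly one other node per round. It does not copy any data, the function and host names are made up for the example, and it is not meant to describe how Dolly, rgang, or BitTorrent work internally.

```python
def cascade_schedule(source, targets):
    """Plan a geometrically cascading copy.

    Returns a list of rounds. Each round is a list of (sender, receiver)
    pairs that can run at the same time. In every round each node that
    already holds the file sends it to one node that does not, so the
    number of holders doubles from round to round.
    """
    holders = [source]
    waiting = list(targets)
    rounds = []
    while waiting:
        pairs = []
        # Iterate over a snapshot: nodes that receive the file in this
        # round only start sending in the next round.
        for sender in list(holders):
            if not waiting:
                break
            receiver = waiting.pop(0)
            pairs.append((sender, receiver))
            holders.append(receiver)
        rounds.append(pairs)
    return rounds


# Hypothetical host names: one head node and seven compute nodes.
plan = cascade_schedule("head", ["n%d" % i for i in range(1, 8)])
for number, pairs in enumerate(plan, start=1):
    print("round", number, pairs)
```

With seven target nodes the plan needs three rounds (1, 2, and 4 transfers), where copying from the head node to each node in turn would need seven. The number of rounds grows with the logarithm of the node count, which is why this approach scales well on larger clusters.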

The tools mentioned here, and others, can be used for copying files to nodes as part of a user's job or as part of administration where files need to be copied to the nodes for installation or updating. They also can be used as part of an administrator's toolkit to execute commands on all of the nodes in the cluster.



    Creative Commons License
    ©2005-2012 Copyright Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. The Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.