SC05 Jeff's Monkey Blog - Day 1

Published on Tuesday, 15 November 2005 19:00
Written by Jeff Layton
Jeff's Blog: Day 1

I'm sorry I haven't been writing as much as usual, but I'm fighting a good case of pneumonia as well as bronchitis (hint: don't kiss me, even if you really, really want to :) ).

However, this hasn't stopped your intrepid reporter from working the Linux Networx booth (they keep me gainfully employed and allow me to work on some really cool projects), attending the occasional late-night get-together, or cruising the vendors on the show floor.

Before I jump into Day 1 of SC05, I want to talk about a couple of things prior to the opening day. First, the Supercomputer show is very unusual because on the same exhibit floor you have vendors and their customers all displaying what they do in their booths. This is unlike any other show I've been to. It's a little disconcerting at first, but you get used to it.

Monkey Get Together

The first Monkey get-together went very well. A number of people showed up and all of the hats were given out. I got to see some old friends like Roger Smith, Joey, and Trey from the ERC at Mississippi State, Dan Stanzione of Cluster Monkey-famedom, Glen Otero - International Man of Mystery and Super Cluster Monkey (ladies, he's still single and still a body builder), and others. I also got to meet some new friends like Josip Loncaric, who was one of the early Beowulfers and was a great help in improving the TCP performance of the 2.2 and 2.4 kernels. He now works for Los Alamos on aspects of clusters and high performance computing. It was a real honor to meet him and to talk to him (a little hero worship going on there). I also spent some time talking to Dimitri Mavriplis, who is a professor at the University of Wyoming. He is one of the best CFD (Computational Fluid Dynamics) researchers in the world. It was great fun to talk CFD with him since that's one of my interests, as well as clusters (he uses clusters in his research). If you are looking for CFD codes for your clusters, Dr. Mavriplis is the man to talk to.

Day 1:

This day started out like any other in Seattle - trying to find a Starbucks. Oops. It seems like there is at least one per block all over Seattle. I lived here about 15 years ago and there weren't this many when I left. The lines at the Starbucks and the Subway in the Convention Center are not to be believed. I ended up at a place called Unconventional Pizza, which seems to be a Seattle version of the Soup Nazi. However, his pizza isn't as good as the Soup Nazi's soup.

There are a huge number of booths on the show floor and I've only gotten to a couple of them so far. Day 2 should be better. I did talk to the folks at Pathscale for a bit. They are doing some of the best work for clusters and HPC I've seen. Their compilers are among the best, and Greg Lindahl took some time to show me how you can use their compiler to search for the best set of compile flags for performance. What was even more interesting was that he said they love to hear what compiler flags people end up using for what codes. This helps them understand how to improve their compiler and how people write code (holy customer feedback, Batman!). I can also safely say that their Infinipath interconnect is hot sh**!!! Linux Networx is using it in a new system called LS/X. The benchmarks we've run confirm that the boasts Pathscale has been making are really true. Linux Networx will post some benchmarks in the near future, but I can promise you that the performance from Infinipath is amazing for just about any code you can throw at it (except the embarrassingly parallel ones, natch).

While I'm talking about Linux Networx, I thought I would throw in a shameless plug for the company I work for :). We introduced two new systems: the LS-1 and the LS/X. Our company is taking more of a systems approach to clusters. The idea is to make the clusters easier to use, easier to manage, easier to support, and easier to upgrade.

The LS-1 is designed for the small to medium range market with up to 128 nodes. The current system is Opteron only with dual CPU nodes. You can choose from a GigE network, a Myrinet 2G network, or an InfiniBand network (Infinipath is coming around 1Q of 2006). We also have a number of storage options, from simple NFS boxes to parallel file systems with great IO performance. We are also showing a technology demo of parallel visualization for the LS-1. We should have a viz product in 1Q of 2006. I can promise you that this product will be really neat and cost much less than the SGI viz equipment.

The LS/X is designed for the upper range of supercomputer performance. It uses a midplane architecture where the boards slide into an 8U sub-rack (I guess you can call them blades). We currently are shipping a 4-socket Opteron node with two built-in Infinipath NICs, two GigE NICs, and up to 64 GB of memory. We are doing some 8-socket boards for special situations, but they may or may not be generally available. However, we are showing an 8-socket node with 8 Opteron sockets, 4 Infinipath NICs, 4 GigE NICs, and up to 128 GB of memory. We can get up to 6 of the 4-socket nodes in an 8U sub-rack and up to 4 sub-racks in a normal rack, for a total of up to 96 sockets in a single rack. The nodes slide into a midplane to get their power (from a DC PDU in the bottom of the rack), communication, and expandability. The sub-racks have built-in Tier-1 switching for the Infinipath and GigE networks. The racks can also have Tier-2 switching in the bottom of the rack. These built-in switches greatly reduce the number of required cables. For a full rack you only need 17 cables!! A very high percentage of the parts of the nodes are field replaceable (you just pull them out and put in a new one). The racks are also designed to sit over vented tiles in a raised floor area to pull air up into the rack. This eliminates hot air recirculation. The performance of the LS/X is setting records on benchmarks, which should be posted on the website soon, if not now. It is very competitive with the IBM Blue Gene, Power 5, Cray XT3, and Cray XD1 on the HPC Challenge Benchmark. In some cases we have the best performance of any of these systems.
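For anyone who wants to check the rack density numbers, here is a quick back-of-the-envelope sketch in Python. It only multiplies out the figures quoted above for the 4-socket node; the function and variable names are made up for illustration, and the per-rack memory total is simple arithmetic on the "up to 64 GB" per-node figure, not a published spec.

```python
# Figures quoted in the post for the 4-socket LS/X node.
SOCKETS_PER_NODE = 4
NODES_PER_SUBRACK = 6
SUBRACKS_PER_RACK = 4
MAX_MEMORY_GB_PER_NODE = 64


def rack_totals(sockets_per_node=SOCKETS_PER_NODE,
                nodes_per_subrack=NODES_PER_SUBRACK,
                subracks_per_rack=SUBRACKS_PER_RACK,
                max_memory_gb_per_node=MAX_MEMORY_GB_PER_NODE):
    """Multiply out per-node figures to get per-rack maximums (hypothetical helper)."""
    nodes = nodes_per_subrack * subracks_per_rack
    return {
        "nodes": nodes,
        "sockets": nodes * sockets_per_node,
        "max_memory_gb": nodes * max_memory_gb_per_node,
    }


if __name__ == "__main__":
    for name, value in rack_totals().items():
        print(f"{name}: {value}")
```

With the quoted figures this works out to 24 nodes and 96 sockets per rack, which matches the total given above.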

The Intel booth is right next to the Linux Networx booth, so I did want to mention that an Intel person who watched our unveiling of the LS-1 and the LS/X on Monday night said that they thought our systems were the "...sexiest machines on the floor..." despite not having Intel chips in them.

OK, enough of the shameless plug :)

Tyan is working on a new personal cluster that has a small chassis with four dual-socket Opteron boards in it. They use the HE version of the Opteron to reduce power and cooling. The box is very, very nice. They say they hope to bring it to market in the near future. I hope so too!!

I think I'll stop here since I need to get to some meetings and fulfill my "booth duty" for the day.