The Move from Pixels to Petaflops
It is not often that someone does our job for us, and when it happens we like to take advantage of it. Earlier this month, Andrew Humber, Senior PR Manager, Tesla & CUDA Technologies at NVIDIA Corporation, sent us a nice recap of the Tesla milestones for the past year. As Andrew's summary shows, the GP-GPU revolution is moving fast. Take a look; you may be surprised by how much happened in a single year.
At-a-glance Summary of the Year
Launched the Tesla 10 Series, the second generation of the Tesla GPU Computing processor, adding double precision and doubling memory and performance for computational researchers.
Launched the Tesla Personal Supercomputer, the world's first desktop system to deliver cluster class performance at a conventional workstation price, opening up supercomputing to the masses.
The NVIDIA Tesla-powered TSUBAME supercomputer at the Tokyo Institute of Technology was ranked 29th in the world in the latest Top 500, making it the first GPU enabled cluster to enter the listing.
Wolfram Research, maker of Mathematica, the world's most powerful general computation software, announced that Mathematica 7 will be GPU accelerated through CUDA and will deliver performance improvements of up to 100X to more than 3 million users.
NVIDIA Tesla and CUDA technologies were recognized by HPCWire, and students and professionals cleaned up at the SC08 industry awards with work based on NVIDIA GPUs and the CUDA architecture.
NVIDIA's CUDA toolkit and SDK hit key milestones: 150K downloads, 150 applications published on CUDA Zone, more than 50 schools teaching the CUDA programming model, and more than 750 research papers published.
And here is the year... as it played out through 2008!
In June we launched the second generation of the Tesla GPU Computing processor, the Tesla 10 series. For many of the researchers and scientists using the previous generation Tesla, this was an important introduction. Not only did we add double precision support, a much-requested feature from several segments including finance, but we also doubled the onboard memory to 4GB, a huge boost to organizations in fields such as oil and gas that deal with large datasets. Perhaps most importantly, within just one year of the launch of the previous generation, we doubled the performance to 1 Teraflop per GPU, delivering seamless boosts in application performance for many developers.
Also in June, we were proud to recognize the University of Illinois at Urbana-Champaign as the world's first CUDA Center of Excellence, a program that recognizes schools that truly embrace the concept of parallel processing as the future of computing. CUDA Centers of Excellence integrate the C for CUDA programming environment into their curriculum and use GPUs and C for CUDA across several of their research facilities.
Illinois was closely followed by the University of Utah, which in July proudly announced its accreditation and disclosed that three of its distinguished research facilities were using GPUs in their work. One of these was the renowned Scientific Computing and Imaging Institute, headed up by Chris Johnson:
"Often before a great discovery there is the creation of a new tool or a tool that is used in a different way than before," said Chris Johnson, director of the Scientific Computing and Imaging (SCI) Institute at the University of Utah. "GPUs and the algorithms and software that they use are today's tools and with them we are entering a golden age, where scientific computing is going to truly change the way we do science and medicine."
Also in June, NVIDIA announced the release of its Folding@Home client, delivering speed increases of more than 140X over traditional CPUs. The announcement had a dramatic and immediate effect on the Folding@Home distributed computing application: by August, NVIDIA GPUs were delivering more than 1.25 petaflops, 42 percent of the entire processing power of the application, and more processing power than any other computing architecture had contributed in the entire history of the project.
Manifold, a company specializing in Geographical Information Systems (GIS), announced in August that they had won the 2008 Geospatial Leadership Award for their use of the massively parallel CUDA architecture to cut the processing time of highly complex GIS datasets from 20 minutes to 30 seconds, revolutionizing their industry. Dimitri Rotow, the product manager in charge of the application, said:
"It is no exaggeration to say that, at least for our industry, NVIDIA CUDA technology could be the most revolutionary development in computing since the invention of the microprocessor."
We love Dimitri!
Manifold is just one of dozens of companies that have been harnessing the massively parallel architecture of the GPU to transform their work. In September, SciComp announced that they had boosted the performance of their derivative pricing software, SciFinance, by up to 100X by using the CUDA architecture:
"The code takes full advantage of the GPU's parallel architecture, delivering an immediate 20-100X execution speed increase. Pricing models that used to run in minutes now complete in seconds, allowing financial institutions to test alternative models, increase scenario analysis, and better understand their potential risk exposure."
In a different field, SeismicCity, a trailblazing company specializing in Reverse Time Migration, an advanced technique for locating precious oil and gas reserves, announced a speedup of up to 20X in their processing from using NVIDIA Tesla GPUs based on the CUDA architecture. For SeismicCity, this was a game changer: they could now employ techniques that were previously impossible on CPUs.
"Transitioning to GPUs has given us a 10-20X performance boost, but more importantly, GPUs allow us to use computationally-intensive algorithms that we simply couldn't process with CPUs. This is a huge advancement which allows us to use RTM and other more accurate but data-intensive algorithms for larger datasets."
The Tesla business ended the year with a bang. At the SC08 convention in November, the team introduced the Tesla Personal Supercomputer, the first system to truly deliver on the promise of cluster class performance at the desktop, and at a price point that makes it accessible to the masses. Industry luminaries such as Jack Dongarra, author of LINPACK, and Burton Smith, former CTO of Cray and Technical Fellow at Microsoft, had fabulous things to say about the announcement:
"We've all heard 'desktop supercomputer' claims in the past, but this time it's for real," said Burton Smith, Microsoft Technical Fellow. "NVIDIA and its partners will be delivering outstanding performance and broad applicability to the mainstream marketplace. Heterogeneous computing, where GPUs work in tandem with CPUs, is what makes such a breakthrough possible."
"GPUs have evolved to the point where many real world applications are easily implemented on them and run significantly faster than on multi-core systems," said Prof. Jack Dongarra, director of the Innovative Computing Laboratory at the University of Tennessee and author of LINPACK. "Future computing architectures will be hybrid systems with parallel-core GPUs working in tandem with multi-core CPUs."
PC builders including Dell, Lenovo, Western Scientific, Boxx, Colfax, JRTI and 30 more global partners came out in support of this launch, which, according to Addison Snell at Tabor Research, was THE product news of the show.
We were also very proud to announce that NVIDIA Tesla technology made its entry into the Top 500 supercomputers in the world at #29, courtesy of the Tokyo Institute of Technology, which deployed 680 Tesla GPUs across its 77 Tflop TSUBAME supercomputer. We'd like to take this opportunity to congratulate them on this wonderful achievement. NVIDIA also got its own award when HPCWire gave us the coveted Readers' Choice award for NVIDIA Tesla and CUDA technologies.
And in other show news, Wolfram Research announced that their upcoming Mathematica 7 software package, which is in use by more than 3 million developers today, will deliver performance boosts of up to 100X to those using NVIDIA GPUs, thanks to its optimization for CUDA. Bull and HP announced they were now offering NVIDIA Tesla S1070 1U server products as integral parts of their HPC offerings, and Cray announced that NVIDIA Tesla GPU Computing technology is also available to power its CX1 Supercomputing system.
It was also a great show for those taking part in the show awards for papers, posters and challenges. Through the work of these students and professionals, GPUs and the CUDA architecture received high accolades:
The Best Student Paper Award went to Vasily Volkov and James W. Demmel of the University of California, Berkeley for "Benchmarking GPUs to Tune Dense Linear Algebra."
In the ACM Best Graduate Student Poster competition, first place went to Akila Gothandaraman of the University of Tennessee, Knoxville, for "Acceleration of Quantum Monte Carlo Applications on Emerging Computing Platforms."
David Dynerman of the University of Wisconsin-Madison secured second place for his poster entitled "CUSA and CUDE: GPU-Accelerated Methods for Estimating Solvent Accessible Surface Area and Desolvation." Both posters focused on advances made possible by the NVIDIA CUDA architecture. See David Dynerman talk about his work on YouTube.
The National Instruments LabVIEW application was named a finalist in the Supercomputing Analytics Challenge for its use of GPUs to develop real-time control for the European Extremely Large Telescope (E-ELT). See it on YouTube.
To close out the year, we saw the announcement of OpenCL, a new programming environment from the Khronos Group for writing applications for the GPU. NVIDIA is delighted to see the industry moving in this direction and we wholeheartedly support OpenCL and future standards such as DX11 Compute. There are exciting times ahead: developers now have more choices in how they write for GPUs and can really start to harness their massive computational muscle to bring an exciting new class of application to professionals and developers alike.
Happy Holidays and we'll see you all in 2009.
The NVIDIA Tesla Team