California's earthquake risk spurs supercomputing efforts

IBM's 10-petaflop system will be used to run quake simulations in 2011

NEW ORLEANS -- The rush to build more powerful supercomputers is part of a larger race to solve some of mankind's biggest problems and threats, and one person on the front line of that effort is Thomas Jordan, the director of the Southern California Earthquake Center.

"We are very concerned about the current state of the faults in Southern California," said Jordan, who described the San Andreas Fault as "locked and loaded and ready to roll" and one day unleash a sizable earthquake.

Jordan and his team have been running simulations of how an earthquake might affect Southern California on the Jaguar system at Oak Ridge National Laboratory. Until this week, Jaguar had been the world's fastest supercomputer, at 1.75 petaflops. But it has now been eclipsed by China's 2.5-petaflop Tianhe-1A, according to the semiannual Top500 ranking of the world's most powerful machines.

The point of building a supercomputer is to improve research capabilities. Supercomputers allow scientists to study, in simulated environments, everything from the effect of the BP oil spill on coastlines to the behavior of cells at the atomic level to the impact of an earthquake.

"Normally, we learn these things from just hard experience [but] we think we can learn a lot about what to do from simulation," said Jordan. "We can't predict the future; we can't predict earthquakes. But we can really begin to do some really detailed simulations, and that makes people think."

The idea is to use simulations to show how to prepare for an earthquake. With increasing computer power, simulations can be built that model how an earthquake, at different magnitudes, will hit neighborhoods, affect infrastructure and rock buildings; they could also show where and how an earthquake creates fire risks, among other things.

Jordan is preparing his applications to run on Blue Waters, a planned 10-petaflop system (meaning it would deliver 10 quadrillion calculations per second) that's due to be up and running next year at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign. That scale of computing will mean that an application with a runtime of 4,000 processor hours on Jaguar can be completed in just 770 hours on Blue Waters, said Jordan.

A processor hour is equivalent to one hour on one processing core.
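The before-and-after figures above are Jordan's; as a quick illustration of the bookkeeping behind a "processor hour," the short Python sketch below converts a job's processor-hour total into wall-clock time for a given core count. The allocation size and core counts in it are hypothetical, chosen only to make the arithmetic concrete.

```python
# A minimal sketch of processor-hour bookkeeping.
# The job budget and core counts are hypothetical, for illustration only.

def processor_hours(cores: int, wall_clock_hours: float) -> float:
    """One processor hour equals one hour of work on one processing core."""
    return cores * wall_clock_hours

def wall_clock_hours(total_processor_hours: float, cores: int) -> float:
    """Wall-clock time for a perfectly scalable job spread across `cores` cores."""
    return total_processor_hours / cores

if __name__ == "__main__":
    budget = 1_000_000  # hypothetical allocation, in processor hours
    for cores in (2_000, 20_000, 200_000):
        hours = wall_clock_hours(budget, cores)
        print(f"{cores:>7,} cores -> {hours:8.1f} wall-clock hours")
```

In practice no real code scales perfectly, which is part of why faster cores, memory and interconnects matter as much as raw core counts.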

In a presentation at the SC10 supercomputing conference here this week, Jordan ran a video showing a simulation of the energy from a magnitude 8 earthquake being distributed around Southern California. It wasn't a Hollywood production, with scenes of buildings shaking and roadways cracking, but an intensely graphical display of sharp yellow and red peaks and valleys moving swiftly over a map of the region, indicating where an earthquake might do the most damage. The product of mountains of data, the graphics on the map were just as alarming as any images Hollywood produces.

There is urgency to Jordan's need for more computing power. The Pacific Plate is moving northwest at a rate of about 16.4 feet every 100 years. "That means we've got to have big earthquakes every 100 years," said Jordan.

The 1906 San Francisco earthquake, a magnitude 7.8 event, involved about 16 feet of displacement. But in the area around the southern San Andreas Fault, the last big quake was in 1857, and farther south, there hasn't been a big one since 1680, said Jordan. Two years ago, the California Geological Survey and the earthquake center estimated that there's a 99% probability of a magnitude 6.7 or larger quake in California during the next 30 years.
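Jordan's point is essentially a slip-deficit calculation. As a rough sketch of that reasoning, the snippet below multiplies the stated plate rate by the years since each segment's last big quake and compares the result with the roughly 16 feet of displacement in 1906. The simplifying assumption that all of the plate motion accumulates as strain on the fault is mine, not the article's.

```python
# Back-of-the-envelope slip-deficit arithmetic behind Jordan's concern.
# Simplifying assumption (not from the article): all of the stated plate
# motion accumulates as elastic strain on the fault segment.

SLIP_RATE_FT_PER_YEAR = 16.4 / 100   # about 16.4 feet every 100 years
REFERENCE_SLIP_FT = 16.0             # roughly the 1906 San Francisco displacement

def accumulated_slip_ft(last_big_quake_year: int, now: int = 2010) -> float:
    return (now - last_big_quake_year) * SLIP_RATE_FT_PER_YEAR

for segment, year in (("southern San Andreas (last big quake 1857)", 1857),
                      ("farther south (last big quake ~1680)", 1680)):
    deficit = accumulated_slip_ft(year)
    print(f"{segment}: ~{deficit:.0f} ft accumulated, "
          f"about {deficit / REFERENCE_SLIP_FT:.1f}x the 1906 displacement")
```

By this crude measure, both segments have stored more than a 1906-sized quake's worth of motion, which is why the seismologists are nervous.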

It could be years before there's a big earthquake, said Jordan. But "I can tell you all the seismologists are very nervous."

Blue Waters, which is being built by IBM, will have the fastest cores "of its generation." It will also have the fastest memory infrastructure and fastest interconnect, "which means that data movement is much faster in the Blue Waters system than other systems of today -- and actually [faster than] many of the systems that will be in place next year and the following years," said William Kramer, the deputy director of the Blue Waters Project at the NCSA, which is overseeing development of the system.

"The combination of not just how fast the cores go, but how fast can you get data to them, is really the secret why Blue Waters will be such a step forward," said Kramer.

The system's fundamental building block is a 32-core SMP node -- four eight-core Power7 chips -- with shared memory and one operating system image, yielding about a teraflop of aggregate computing power, said Kramer. The interconnect runs at about 1.1TB per second, and the system was designed to reduce the number of hops data must take, because each hop adds time. In the worst case, data needs just five hops to travel from any one core to any other core in the system, said Kramer.

The machine will have more than 300,000 processing cores.
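To see how the building-block figures Kramer cites roll up to the system level, here is a small sketch; the node count in it is an assumption picked so the totals land near the article's "more than 300,000 cores" and roughly 10 petaflops, not an official configuration.

```python
# Rolling up the per-node figures into system-level totals.
# ASSUMED_NODES is hypothetical, chosen so the result is consistent with
# the article's ">300,000 cores" and ~10-petaflop figures.

CORES_PER_NODE = 32          # four eight-core Power7 chips per SMP node
NODE_PEAK_TFLOPS = 1.0       # "about a teraflop" of compute per node
ASSUMED_NODES = 10_000       # hypothetical node count, for illustration only

total_cores = ASSUMED_NODES * CORES_PER_NODE
peak_petaflops = ASSUMED_NODES * NODE_PEAK_TFLOPS / 1_000  # 1 PF = 1,000 TF

print(f"{total_cores:,} cores, ~{peak_petaflops:.0f} petaflops peak")
# Prints: 320,000 cores, ~10 petaflops peak
```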

The hardware will arrive next year, with scientific research expected to begin late in the year. The system takes up around 5,000 square feet of space -- twice that when you add in the supporting infrastructure and the advanced storage it needs. A new building was constructed this year to house Blue Waters and other systems, and it was designed with future needs in mind. It has the potential to support up to 100 megawatts of power, with Blue Waters itself using around 12MW. Blue Waters relies on water cooling, as well as the cooler Illinois air, to stay chilled, and it will be allowed to run in a hotter environment -- one with temperatures above 80 degrees.

The entire project, including the computer, is expected to cost more than $300 million.

The drive for more computing power is spurred by a number of factors. There is the need for speedy results, particularly in an emergency situation such as when healthcare researchers are modeling the spread of a virus like H1N1.

Blue Waters will be followed by two 20-petaflop systems in 2012, one at Oak Ridge National Laboratory and the other at Lawrence Livermore National Laboratory.

One reason scientists need large systems is their ability to run thousands or even millions of the same type of simulation. The approach is called "uncertainty quantification": scientists run simulations with different inputs and physics, such as those used in weather forecasting, and then develop a statistical analysis, producing a result with "higher fidelity," or better predictive capability, said Karl Schulz, associate director of high-performance computing at the Texas Advanced Computing Center at the University of Texas at Austin.
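As a concrete, if toy, illustration of the ensemble approach Schulz describes, the sketch below runs the same simulation many times with perturbed inputs and then summarizes the results statistically. The "simulation" here is an invented stand-in, not a real seismic or weather code.

```python
# A minimal sketch of uncertainty quantification: run one model many times
# with uncertain inputs, then make a statistical statement about the output.
# toy_simulation() is an invented stand-in, not a real physics code.

import random
import statistics

def toy_simulation(magnitude: float, soil_factor: float) -> float:
    """Hypothetical forward model: returns a peak-shaking number."""
    return 10 ** (magnitude - 6.0) * soil_factor

def run_ensemble(runs: int = 10_000) -> list:
    results = []
    for _ in range(runs):
        magnitude = random.gauss(8.0, 0.2)       # uncertain input 1
        soil_factor = random.uniform(0.8, 1.5)   # uncertain input 2
        results.append(toy_simulation(magnitude, soil_factor))
    return results

if __name__ == "__main__":
    shaking = run_ensemble()
    print(f"mean peak shaking: {statistics.mean(shaking):.1f}")
    print(f"95th percentile:   {statistics.quantiles(shaking, n=20)[-1]:.1f}")
```

Each member of the ensemble is independent of the others, which is exactly the kind of workload that can soak up hundreds of thousands of cores at once.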

There is also a strong desire to run "multi-physics" simulations, which bring together, for instance, fluid mechanics, structural dynamics and chemistry, said Schulz.

The need for systems that can handle these big problems is one of the reasons an effort to build an exascale computer is important, Schulz said. An exascale system would be 1,000 times more powerful than a petaflop system, and is the kind of system Jordan is anxious to use. The first one is expected to be ready sometime around 2018.

Patrick Thibodeau covers SaaS and enterprise applications, outsourcing, government IT policies, data centers and IT workforce issues for Computerworld. Follow Patrick on Twitter at @DCgov, or subscribe to Patrick's RSS feed. His e-mail address is pthibodeau@computerworld.com.
