Announced in Denver, Colorado, at the SC13 supercomputing conference, the IDC HPC Innovation Excellence Awards recognize noteworthy achievements using high-performance computing (HPC) technologies. OLCF partners taking home awards this year include GE Global Research, Ford Werke GmbH, an Oak Ridge National Laboratory team, and the Southern California Earthquake Center.
GE brought home the award for simulating hundreds of millions of water molecules freezing in slow motion. By looking at the freezing behavior of water molecules, GE can better design an array of products, such as wind turbines, to operate more efficiently in cold conditions. This year’s award marks the second win for GE Global Research, which took the award last year for modeling unsteady air flows in the blade rows of turbomachines on the OLCF’s Jaguar system.
For more information on GE’s research visit: https://www.olcf.ornl.gov/2013/10/25/titan-propels-ge-wind-turbine-research-into-new-territory/
For more information on last year’s award visit: https://www.olcf.ornl.gov/2012/07/07/olcf-partner-wins-major-industry-award-2/
Researchers at Ford won for simulating, for the first time, the complete underhood airflow and optimizing the underhood cooling package to reduce so-called cooling drag and increase fuel efficiency for Ford automobiles.
For more information on Ford’s research visit: https://www.olcf.ornl.gov/2013/08/14/ornls-supercomputer-gets-under-the-hood/
Researchers at Oak Ridge National Laboratory used the OLCF’s Titan system to perform the first simulations of organic solar cell active layers at the scale of working devices. The new understanding gained from Titan will aid in the rational design of cheap solar cells with higher efficiency.
For more information on this research visit: https://www.olcf.ornl.gov/2013/08/21/titan-sheds-light-on-unknowns-in-organic-photovoltaic-research/
The Southern California Earthquake Center (SCEC) received an IDC award for a simulation platform called CyberShake. With the help of code running on Titan, SCEC aims to overcome some of the computational barriers that prevent simulation of an earthquake’s higher frequencies. The platform allows the center to better assess a potential earthquake’s impact on a region. —by Austin Koenig
For more information about all of this year’s IDC HPC Innovation Excellence Awards visit: http://www.idc.com/getdoc.jsp?containerId=prUS24451613.
The Oak Ridge Leadership Computing Facility (OLCF) earned three HPCwire awards in high-performance computing (HPC) for collaborative industrial research projects conducted at Oak Ridge National Laboratory (ORNL).
HPCwire, a leading publication for supercomputing news, recognized OLCF supercomputer users Ford Motor Company and GE Global Research with two Editor’s Choice Awards and one Readers’ Choice Award.
The companies gained access to the OLCF’s supercomputers through a program at ORNL called Accelerating Competitiveness through Computational Excellence, or ACCEL. This industrial partnership program aims to help companies boost competitiveness through easy access to the lab’s world-class computational resources and expertise.
“ACCEL is helping companies like Ford and GE tackle competitively important problems whose solutions will help reduce the cost and time-to-market for new products,” said Suzy Tichenor, director of industrial partnerships. “The three HPCwire awards are a testament to the success of this collaborative program.”
The Editor’s Choice Award for the best use of HPC in manufacturing went to GE Global Research and the OLCF for research to better understand ice formation at the atomic level. Using OLCF’s Titan supercomputer, researchers for the first time simulated hundreds of millions of water molecules freezing in slow motion. New insights into how ice forms will help GE develop wind turbines that are better able to withstand debilitating ice accumulation in cold climates.
Ford Motor Company and the OLCF were recognized with an Editor’s Choice Award for the best use of HPC in automotive research and a Readers’ Choice Award for the best HPC collaboration between government and industry. The awards recognize research conducted on the OLCF’s Jaguar supercomputer, upgraded and renamed Titan in 2012, to optimize for the first time the underhood airflow in automobiles to reduce cooling drag and increase fuel efficiency. Jaguar used technology from DataDirect Networks for computer-aided engineering and computational fluid dynamics simulations.
“It’s always an honor to publicly recognize the organizations and individuals whose hard work, dedication, and efforts over the past year have contributed to scientific discoveries and new breakthroughs in emerging technologies,” said Tom Tabor, CEO of Tabor Communications Inc., publisher of HPCwire. “The awards represent recognition by the high-performance computing community to its own for significant contributions to the advancement of science and technology.”—by Jennifer Brouner
The U.S. Department of Energy’s Office of Science announced 59 projects that promise to accelerate scientific discovery and innovation and will share nearly 6 billion core hours on two of America’s fastest supercomputers dedicated to open science. Their work will advance knowledge in critical areas ranging from sustainable energy technologies to the environmental consequences of energy use.
The allocations come from the Innovative and Novel Computational Impact on Theory and Experiment, or INCITE, program. Through it, the world’s most advanced computational research projects from academia, government, and industry are given access to the Department of Energy’s (DOE’s) leadership computing facilities at Oak Ridge and Argonne national laboratories.
“The INCITE program addresses the largest, most computationally pressing projects in science and engineering,” said Michael Papka, director of the Argonne Leadership Computing Facility (ALCF). “These allocations enable state-of-the-art science in a wide range of domains.”
“The INCITE program—which is celebrating its 10-year anniversary—provides researchers with the opportunity to make scientific breakthroughs in fields that would not be probable or even possible without access to the most powerful available supercomputers,” said James Hack, director of the National Center for Computational Sciences, which houses the Oak Ridge Leadership Computing Facility (OLCF).
When INCITE made its first awards in 2004, three projects received an aggregate five million hours on DOE supercomputers. Today’s collective allocation of nearly 6 billion core hours represents a 1,000-fold growth in resources provided to researchers. The average award is more than 75 million core hours—with individual awards of up to several hundred million core hours—on systems capable of quadrillions of calculations each second.
The ALCF’s primary leadership computing resource is Mira, a 10-petaflops IBM Blue Gene/Q system with 49,152 compute nodes and a power-efficient architecture. The OLCF’s Titan supercomputer is a 27-petaflops Cray XK7 hybrid system employing both CPUs and energy-efficient, high-performance GPUs in its 18,688 compute nodes.
Despite continued upgrades and expansions, demand for leadership computing facilities surpasses availability, and DOE’s world-class facilities continue to attract new users. This year INCITE applications greatly exceeded awards.
“INCITE is one of the main programs that gives researchers access to some of the country’s leadership computing facilities,” said Julia White, INCITE manager at DOE’s Leadership Computing Facilities. “Large supercomputer awards like this also give researchers support from computer experts who design code and optimize it for the supercomputers, which helps ensure that the scientists who run simulations on DOE’s machines can take full advantage of their enormous processing power.”
Supercomputer simulations create a detailed picture of complex phenomena by relying on codes packed with math equations. For a complete list of 2014 INCITE awards, see http://www.doeleadershipcomputing.org/awards/2014INCITEFactSheets.pdf.
DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time.
The INCITE program promotes transformational advances in science and technology through large allocations of time on state-of-the-art supercomputers. For more information, see http://www.doeleadershipcomputing.org/incite-program/.
There is no doubt that the Oak Ridge Leadership Computing Facility’s (OLCF’s) Titan, the nation’s most powerful supercomputer, gets its kick from its 18,688 GPU accelerators. On Titan, GPUs operate in tandem with CPUs to power groundbreaking scientific simulations at breakneck speeds. Now the OLCF is working with Mentor Graphics, a leading electronic design automation company, to bring accelerated computing to a broader audience.
By integrating the OpenACC programming standard into the open-source GCC compiler suite, Mentor Graphics is providing the first open-source implementation of OpenACC 2.0. Because GCC is the default compiler on most Linux distributions and is readily available on other platforms, including Mac and Windows, this implementation will greatly expand access to the language and facilitate the development and testing of OpenACC applications on smaller systems, such as workstations and clusters.
OpenACC is an application programming interface that allows programmers to provide simple directives to the compiler, identifying which areas of code to offload to the accelerator in a readable and portable fashion. For non-OpenACC-aware compilers, the directives are simply comments, which are ignored. The directives let the compiler do the heavy lifting to map the computation onto an accelerator, while the programmer can focus on the overall structure of the application and the results of the computation instead of generating code specific to the accelerator.
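For illustration, a minimal OpenACC loop in C might look like the following; the pragma is the only accelerator-specific line, and a compiler without OpenACC support treats it as a comment and builds an ordinary CPU loop. (The `-fopenacc` switch noted in the comment is how GCC enables the directives.)

```c
#include <stdio.h>
#include <stdlib.h>

/* Minimal OpenACC example: y = a*x + y. With an OpenACC-aware compiler
 * (e.g., "gcc -fopenacc saxpy.c"), the loop is offloaded to an
 * accelerator; otherwise the pragma is ignored and the loop runs on the CPU. */
void saxpy(int n, float a, const float *x, float *y)
{
    /* Offload the loop; copy x to the device, copy y both ways. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

int main(void)
{
    enum { N = 1000000 };
    float *x = malloc(N * sizeof *x);
    float *y = malloc(N * sizeof *y);
    for (int i = 0; i < N; ++i) { x[i] = 1.0f; y[i] = 2.0f; }
    saxpy(N, 3.0f, x, y);
    printf("y[0] = %f\n", y[0]);   /* prints 5.000000 */
    free(x);
    free(y);
    return 0;
}
```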
As a member of the OpenACC Standards Group, the OLCF is lending Mentor Graphics its institutional expertise and user insight. The group is a nonprofit corporation funded by the tech companies that developed OpenACC, and members include many companies, universities, and national laboratories that develop or use accelerated computing.
“We’ve been involved with OpenACC since its inception because we want our users to have a standard, portable programming approach for GPUs,” said David Bernholdt, OLCF lead for programming environment and tools. “We’ll be working with Mentor Graphics to help ensure that the GCC implementation of OpenACC works well for OLCF users as well as the larger community.”
For the past year scientists and engineers have used OpenACC, as well as other approaches, on Titan to accelerate many types of applications, ranging from molecular dynamics codes to particle physics models. By offloading applications to the GPUs, users have routinely seen them run two to seven times faster than on equivalent CPU-only systems.
To prime 2013–2014 users for Titan’s new GPUs, the OLCF offered the Center for Accelerated Application Readiness (or CAAR) program. A handful of world-class applications and teams of researchers using them were selected to run early projects on Titan. OLCF staff worked closely with application experts to optimize each application’s performance on Titan’s GPUs. Mentor Graphics will be tapping into this experience as it develops OpenACC for the GCC compiler.
“The GCC compiler is used in several applications users run on Titan. Having OpenACC in GCC could not only expand research possibilities for OLCF users, but for those at other high-performance computing centers as well,” Bernholdt said.
As OpenACC in GCC becomes readily available, OLCF, Mentor Graphics, and other members of the OpenACC Standards Group predict the open-source version will continue to transform science and industry, just as its commercial predecessors are already doing.
A team of researchers simulating high-temperature superconductors has topped 15 petaflops—or 15 thousand trillion calculations a second—on Oak Ridge National Laboratory’s (ORNL’s) Titan supercomputer. More importantly, they did it with an algorithm that substantially overcomes two major roadblocks to realistic superconductor modeling.
For their achievement, the team from ETH Zurich in Switzerland and ORNL was named a finalist for the Gordon Bell Prize, awarded each year for “outstanding achievement in high-performance computing.”
Materials become superconducting when electrons within them form pairs—called Cooper pairs—allowing them to collect into a condensate. As a result, superconducting materials conduct electricity without resistance, and therefore without loss. This makes them immensely promising in energy applications such as power transmission. They are also especially powerful magnets, a property exploited in technologies such as maglev trains and MRI scanners.
The problem with these materials is that they are superconducting only when they are very, very cold. For instance, the earliest discovered superconductor, mercury, had a transition temperature of 4.2 Kelvin, which is below −450 degrees Fahrenheit and very close to absolute zero. Mercury and other early superconductors were cooled with liquid helium—a very expensive process. Later materials remained superconducting above liquid nitrogen’s boiling point of −321 degrees Fahrenheit, making their use less expensive.
The discovery, or creation, of superconductors that needn’t be cooled would revolutionize power transmission and the energy economy.
The Swiss and American team approaches the problem with an application called DCA++, with DCA standing for “dynamical cluster approximation.” DCA++ simulates a cluster of atoms using the Hubbard model—which describes the behavior of electrons in a solid. It does so with a quantum Monte Carlo technique, which involves repeated random sampling.
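DCA++’s quantum Monte Carlo machinery is far more elaborate, but the underlying idea of repeated random sampling can be sketched in a few lines of C. The classic toy example below estimates pi by drawing random points; the statistical error shrinks as the number of samples grows, just as a Monte Carlo estimate of a physical quantity sharpens with more samples.

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustrative Monte Carlo: estimate pi by sampling random points in
 * the unit square and counting the fraction that land inside the
 * quarter circle. DCA++'s quantum Monte Carlo is far more elaborate,
 * but it rests on the same repeated-random-sampling principle. */
int main(void)
{
    const long samples = 10000000;
    long hits = 0;
    srand(42);
    for (long i = 0; i < samples; ++i) {
        double x = (double)rand() / RAND_MAX;
        double y = (double)rand() / RAND_MAX;
        if (x * x + y * y <= 1.0)
            ++hits;
    }
    printf("pi is approximately %f\n", 4.0 * hits / samples);
    return 0;
}
```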
The application earned its development team the Gordon Bell Prize in 2008.
The new method, known as DCA+, was developed largely by Peter Staar at ETH Zurich. It scaled to the full 18,688-node Titan system and took full advantage of the system’s NVIDIA GPUs, reaching 15.4 petaflops.
In addition, it takes full advantage of the energy efficiency inherent in Titan’s hybrid architecture. Each node of Titan contains both a CPU and a GPU. Using this system, simulation of the team’s largest realistic clusters consumed 4,300 kilowatt-hours. The same simulation on a comparable CPU-only system, the Cray XE6, would have consumed nearly eight times as much energy, or 33,580 kilowatt-hours.
The DCA+ algorithm also took a bite out of two nagging problems common to dynamical cluster quantum Monte Carlo simulations: the fermionic sign problem and the cluster shape dependency.
The sign problem is a major complication in the quantum physics of many-particle systems when they are modeled with the Monte Carlo method.
Particles in quantum mechanics are described by a wave function. For electrons and other fermions, this function switches from positive to negative—or vice versa—when two particles are interchanged. If you then sum the many-particle states, the positive and negative values nearly cancel one another out, essentially destroying the accuracy of the simulation.
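A toy calculation, not the actual DCA++ measurement, shows why this cancellation is so damaging: when samples are +1 or −1 with nearly equal probability, the average sign is a tiny difference of two large counts, and the statistical noise is as large as the signal itself.

```c
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy illustration of the fermion sign problem: estimate the average
 * sign <s> of samples that are +1 or -1 with nearly equal probability.
 * The estimate is a tiny difference of two large counts, so the noise
 * is as large as the signal. */
int main(void)
{
    const long n = 1000000;
    const double p_plus = 0.5005;  /* true average sign = 0.001 */
    double sum = 0.0;
    srand(7);
    for (long i = 0; i < n; ++i)
        sum += ((double)rand() / RAND_MAX < p_plus) ? +1.0 : -1.0;
    double mean = sum / n;
    double err = 1.0 / sqrt((double)n);  /* ~0.001, same size as the signal */
    printf("estimated <s> = %g, statistical error ~ %g\n", mean, err);
    return 0;
}
```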
“This is a cluster method,” said team member Thomas Maier of ORNL. “If you could make the cluster size infinite, then you would get the exact solution. So the goal is to make it as large as possible.
“But there’s a problem when you deal with electrons, which are fermions. It’s the infamous fermion sign problem, and it really limits the cluster size we can go to and the lowest temperature we can go to with quantum Monte Carlo.”
The sign problem cannot be overcome simply by creating larger supercomputers, Maier noted, because computational demands grow exponentially with the number of atoms being simulated. In other words, as you go to realistically large systems, you get problems that overwhelm not only every existing supercomputer, but any system we’re likely to see in the foreseeable future.
According to team member Thomas Schulthess of ETH Zurich and ORNL, the DCA+ algorithm arrives at a solution nearly 2 billion times faster than its DCA predecessor. So while it doesn’t make the sign problem go away entirely, it does make room for much more useful simulations, specifically by allowing for more atoms at lower temperatures—a key requirement, since so far superconductivity happens only in very cold environments.
The other problem—cluster shape dependency—meant that when the researchers simulated an atom cluster, the answer they got varied widely depending on the shape of the cluster.
“Let’s say you have two 16-site clusters,” Maier explained, “one two-dimensional system with a four-by-four cluster and another 16-site cluster of a different shape. The results with the standard DCA method depended a lot on the cluster shape. That’s of course something you don’t really want.
“By improving the algorithm we succeeded in getting rid of this cluster shape dependence. Before you would get vastly different results for the superconducting transition temperature, but now you get pretty much the same.”
The reduced sign problem, combined with the power of Titan, also allows the group to simulate much larger systems. In the past, the group was limited to eight-atom cluster simulations if it wanted to get down to the transition temperature for realistic parameters. More recently it has been able to scale up to 28-atom systems.
As the team moves forward, Maier noted, it would like to simulate more complex and realistic systems. For instance, two of the most promising materials in high-temperature superconducting research, which contain copper and iron, hold their electrons in a number of different orbitals. Yet, so far the team has simulated only one of these orbitals.
“One direction we want to go into is to make the models more realistic by including more degrees of freedom, or orbitals. But before you do that you want to have a method that allows you to get an accurate answer for the simple model. Then you can move on to more complicated models.”
“The question is always, ‘Do you get to the interesting region where you get interesting physics before you hit the sign problem,’” Maier noted. “We were able to get there to some extent before we had this new method. But now we really have a significant improvement. Now we can really look at realistic parameters.”
The Gordon Bell Prize will be presented November 21 during the SC13 supercomputing conference in Denver. Besides Staar, Maier, and Schulthess, the DCA+ team includes Raffaele Solca and Gilles Fourestey of ETH Zurich and Michael Summers of ORNL.
Radiation measured from Earth can help scientists characterize far-away plasma dynamics
Outshining the black holes they surround, the bright, hot centers of galaxies known as active galactic nuclei can spew jets of plasma thousands of light-years long. These streams of plasma create an effect often seen in popular images—galaxies speared through the heart by intense light. Such jets are also associated with stars and other astronomical phenomena.
Although physicists are keen to learn the fluid-like mechanics taking place in jets of roiling protons and electrons, there is one problem: distance. Millions or billions of light-years stretch between scientists here on Earth and cosmic jets, and observing individual particles through a telescope is impossible—impossible, but no less desirable.
“Understanding these plasma jets can help explain what is happening to the matter in these objects—how it is accelerated to such high energies and other fundamental physics out of our reach,” said Michael Bussmann, leader of the Computational Radiation Physics group at Helmholtz-Zentrum Dresden-Rossendorf (HZDR) in Germany.
In pursuit of the impossible, a team from HZDR used Titan, the most powerful supercomputer in the United States, located at Oak Ridge National Laboratory, to simulate billions of particles in two passing jet streams. The code’s run on Titan, the Oak Ridge Leadership Computing Facility’s 27-petaflop, hybrid CPU/GPU Cray XK7 machine, earned the team a finalist nomination for the Association for Computing Machinery’s 2013 Gordon Bell Prize. The prize recognizes outstanding achievements in high-performance computing (HPC) applications, specifically codes that redefine what is possible in HPC. This year’s winner will be announced at the supercomputing conference SC13, held November 17–22.
An unlikely couple
By modeling a well-known property of plasma turbulence called the relativistic Kelvin-Helmholtz instability (KHI), which occurs where passing plasma jets collide, researchers were able to make out patterns of particle behavior—the inner workings of these far-away objects.
Then they used radiative signatures, one of the clues we can measure with the help of a telescope, to correlate plasma dynamics with radiation emitted during turbulence. If the jets simulated on Titan were flung far into space, particles disrupted by the KHI could not be observed from Earth, but the radiation they put off could.
Ultimately the KHI tells physicists about the properties of passing plasma jets through comparison: Is one plasma stream denser than the other? What are their velocities? In what directions are they traveling?
And understanding plasma jet dynamics could reveal information about their objects of origin, such as active galactic nuclei.
“Our scientific question was, ‘Can we correlate the radiative signature with individual particles?’ ” Bussmann said. “Is there a chance to really see what’s happening inside the plasma just by looking at the radiation? We are very limited in our tools to connect plasma dynamics to what we observe, and this is where simulation comes in.”
Indeed, the results from Titan show radiation can be a diagnostic for the plasma dynamics taking place far beyond our reach.
Like wind causing ripples on the surface of a lake, plasma turbulence has often been viewed as a hydrodynamics question because the local electromagnetic field in a plasma creates currents that resemble fluid flow. While similar, however, the two are not exactly the same thing.
Hydrodynamic, or fluid, simulations generate smooth patterns of movement by averaging velocity, density, and other parameters around a point. Until now, KHI simulations have largely been done with hydrodynamics calculations.
“The KHI is so fundamental—it occurs in all fluid systems—so it’s a very good test system to see if we can see it in the radiation patterns,” he explained.
But to simulate radiation emitting from the jets as well, the team had to perform kinetic simulations, which follow the path of individual particles and require many more unique calculations than hydrodynamics simulations.
“Calculations for kinetic KHI simulations are difficult because we have particles being accelerated by electromagnetic fields in plasmas, so they follow different paths,” Bussmann said. “And every time particles collide, there is a slight change in direction. They slow down or speed up, and then they radiate at a given frequency.”
Using a particle-in-cell code that computes the interaction between charged particles, the team modeled two streams of unmagnetized hydrogen plasma totaling 75 billion particles per simulation, including protons and electrons. Further calculations simulated the entire spectrum of radiation streaming from the jets in 481 directions.
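At the heart of any particle-in-cell code is the particle push: each charged particle is advanced through the electromagnetic fields gathered from a grid, one small time step at a time. The sketch below is a deliberately simplified, non-relativistic version of that step, not the team’s actual implementation, which uses a relativistic push and includes the magnetic-field rotation omitted here.

```c
typedef struct {
    double x[3];  /* position */
    double v[3];  /* velocity */
    double q, m;  /* charge and mass */
} Particle;

/* Simplified, non-relativistic particle push: accelerate in the local
 * electric field E, then drift to the new position. A real PIC code uses
 * a relativistic Boris push, includes the magnetic-field rotation, and
 * gathers E and B from the grid at each particle's location. */
void push(Particle *p, const double E[3], double dt)
{
    for (int d = 0; d < 3; ++d) {
        p->v[d] += (p->q / p->m) * E[d] * dt;
        p->x[d] += p->v[d] * dt;
    }
}
```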
When simulations of the jets’ charged particles began running, researchers saw chaos—at first.
“There is a lot of complex substructure, and it looks like particles are moving randomly. It looks chaotic, but as the simulation continues to run, we see structures emerge,” Bussmann said. “What we see at the interface between the two streams where turbulence occurs looks like mushrooms or whirlpools.”
These patterns are coming to light primarily because, on Titan, the KHI simulations achieved a resolution not previously obtained.
“To our knowledge, this simulation on Titan is 46 times larger, and the spatial resolution is 4.2 times higher than the largest kinetic KHI simulation to date,” Bussmann said.
Most of this scalability is due to Titan’s GPUs, or graphics processing units, which are incredibly fast at the repetitive calculations that make up many HPC applications. Both the plasma dynamics and the emitted radiation were computed on the GPUs.
“We believe that this task could not have been done without the accelerator architecture,” Bussmann said. “In order to compute the particle motion and radiation together, one needs a high bandwidth and computing power. We take the trajectory of each of the several billion particles and then use this data to calculate the radiation emitted in hundreds of directions for all relevant wavelengths.”
With the data from Titan, researchers can begin to apply the results to actual plasma jets. The spectrum of radiative signatures emanating from the simulated jets provides a measurement stick of sorts against which scientists can calibrate for observed objects.
“We know every spectrum and every direction of the radiation from the Titan simulations, and we can use this information to map the radiative signatures to different objects,” Bussmann said. “By extension, we can use it as an input to predict the dynamics for different plasma jets we observe from Earth.”
Turns out, maybe Earth isn’t such a bad place to study black holes and quasars after all, no matter the distance.—Katie Elyce Jones
The Oak Ridge Leadership Computing Facility (OLCF) delivered more than 374 million supercomputer core hours to 17 projects through the Department of Energy’s (DOE’s) Office of Advanced Scientific Computing Research (ASCR) Leadership Computing Challenge (ALCC) program—76 million hours more than expected.
Beginning in July 2012 and ending in October of this year, ASCR’s allocation program was designed to supply time on supercomputers for projects of special interest to DOE, with emphasis on high-risk, high-reward simulations. It was open to research scientists in industry, academia, and national laboratories.
In the past year these projects had access to the OLCF’s Titan supercomputer—a machine capable of 27,000 trillion calculations per second.
“With the addition of Titan to the OLCF, this was the first time that our ALCC projects could utilize GPUs,” said OLCF Director of Science Jack Wells. “This meant that our ALCC projects were often very aggressive in adopting our new technologies. For example, since Titan’s GPUs have been available to users, 42 percent of service units consumed by these ALCC projects have been consumed by jobs using the GPUs.”
By combining GPUs and CPUs in the same system, Titan tackled many projects that could not be attempted at any other facility.
Creating new material applications
In 2008 Thomas Maier was on the team that won the prestigious Gordon Bell Prize—an award that recognizes outstanding achievement in high-performance computing (HPC) applications—with a petaflop simulation of disorder effects on the transition temperature of high-temperature copper-based superconducting materials.
This year Maier and colleagues used more than 58 million Titan core hours to further understand and predict the behavior of these materials by conducting the first-ever systematic simulation studying the mechanisms that lead copper-based materials to superconductivity. By studying these mechanisms he hopes to discover new compounds that scientists can use to revolutionize many technologies, from efficient, green power transmission, power generation, and grid technology to medical applications and high-speed levitating trains.
Reducing carbon emissions
Finding a cost-effective way to reduce carbon dioxide emissions is extremely important to DOE. To aid the process the agency is sponsoring large-scale demonstration projects of carbon capture and sequestration (CCS).
Using more than 71 million Titan core hours, Allan Grosvenor of Ramgen Power Systems, a small research and development company located in the Seattle area, conducted research to develop shock wave-based compression systems to meet DOE goals of reducing the cost of CCS and enabling high-efficiency electricity generation.
The technology pioneered by Ramgen represents a dramatic breakthrough in turbomachinery and is funded in part by DOE’s National Energy Technology Laboratory.
With the massive scale of Ramgen’s computational fluid dynamics simulations run on Titan, the company is able to accelerate its technology development. By first testing prototypes computationally, Grosvenor is able to save Ramgen and DOE extensive time and money.
For an article giving more information on Ramgen’s shock wave compression project, visit https://www.olcf.ornl.gov/2012/08/14/ramgen-simulates-shock-waves-makes-shock-waves-across-energy-spectrum/.
Harnessing the power of the sun
The world’s oil resources are predicted to run out in fewer than 50 years if production keeps pace with the projected rise in energy consumption. In addition, concerns over the greenhouse effect are leading policy makers to emphasize carbon-free energy sources to avoid possible environmental disasters.
Enter fusion. Specifically, thermonuclear plasma in a donut-shaped magnetic confinement reactor known as a tokamak is a potentially important source of carbon-free energy. To test its feasibility for commercial power production, researchers are building a working fusion laboratory known as ITER in France.
In separate ALCC projects, C.S. Chang of Princeton Plasma Physics Laboratory and Zhihong Lin of the University of California’s Department of Physics and Astronomy used Titan’s HPC abilities to perform research for the ITER fusion reactor.
Using more than 35 million Titan core hours, Chang performed simulations to understand how tokamak plasma can generate intrinsic toroidal rotation—the donut-shaped rotation required to stabilize the plasma—without an external momentum source.
In experiments, an external source such as neutral beam injection has been observed to cause this kind of rotation. However, in ITER, which has plasmas of much greater mass and pressure, an external momentum input by neutral beam becomes negligible. That means a new toroidal rotation method must be developed to stabilize plasmas.
Using Titan’s GPUs
While Chang is running simulations to understand the methods required to sustain ITER plasmas, Lin is running simulations to build the predictive capability of energetic particle turbulence and transport in the plasma.
Turbulence and transport models are some of the hardest to simulate effectively, requiring complex codes that can run only on HPC systems like Titan.
With almost 31 million Titan core hours to tackle the complexities of this system, Lin extended the capability of the original gyrokinetic toroidal code, a three-dimensional code used to study microturbulence in a tokamak. Initial tests found speeds three times faster than the original, allowing for more critical research to be performed in less time—a feat that could only be achieved using the combined GPU and CPU power of Titan. —by Austin Koenig
For the complete list of OLCF ALCC projects, visit https://www.olcf.ornl.gov/leadership-science/project-archives/2012-alcc-projects/.
For the complete list of the 2012 ALCC projects with a description for each, visit http://science.energy.gov/~/media/ascr/pdf/facilities/2012ALCCFactsheets.pdf.
A simulation of the internal workings of cells has reached a sustained performance of 20,000 trillion calculations per second, or 20 petaflops, on the Titan supercomputer at Oak Ridge National Laboratory (ORNL).
The achievement makes the code—which promises to advance both biology and medicine—one of the fastest in the world. It also earned the development team from the Consiglio Nazionale delle Ricerche (CNR), or National Research Council, of Italy a finalist position for the coveted Gordon Bell Prize.
Scientists are using supercomputers to simulate the hundreds of thousands of proteins that move and interact within cells. The understanding they gain sheds light on the factors that drive the critical activities of cells, which are the most basic units of life.
“We are simulating the crowded protein solution that is representative of our cell compartments,” said Simone Melchionna from CNR’s Institute for Chemical and Physical Processes. “If we want to target a specific cell for treatment, we need to understand how that cell works. Proteins are the most crucial cellular agents, and their behavior affects all activities that take place within the human body.”
Proteins perform a vast number of functions in living organisms, from catalyzing reactions to replicating DNA to transporting molecules. In the human body, proteins are responsible for most physiological states, such as regulating heart rate and blood pressure and firing brain cells to allow movements such as blinking or swallowing.
Scientists have predicted the activity of isolated proteins, but nature isn’t an isolated system. The research team is interested in studying the cell interior, which is packed with proteins—an environment that can’t be studied in a laboratory using dilute protein solutions.
In living cells proteins interact with other proteins and with surrounding fluids by changing their shapes, movements, and behaviors, sometimes dramatically. These dynamic interactions complicate studies that examine biological systems.
The research team developed a code called MUPHY, for MUlti PHYsics simulator, to study the two-way interactions that exist between proteins and fluids within a cell. The application accounts for the physical forces that cause protein movement, as well as the forces involved in protein–protein and protein–fluid interactions.
Using this code, researchers can learn more about how proteins move in crowded conditions and travel between cellular compartments, or organelles. Knowing the general pattern for protein movement and interaction could help scientists understand specific diseases like Alzheimer’s, which results in a clumping of proteins in the brain.
To prove that these complex and realistic cellular simulations can be performed, the team used the Oak Ridge Leadership Computing Facility’s Titan, the United States’ fastest supercomputer for open scientific research. By combining CPUs and GPUs in a hybrid architecture, Titan has a peak performance of more than 27 petaflops.
“Our code runs on 18,000 of Titan’s 18,688 nodes,” said Massimo Bernaschi, the chief technology officer at CNR’s Institute for Applied Computing. “Our simulations run fastest on GPUs, and Titan currently has the highest number of available GPUs. We made the simulation run even faster by minimizing the communication between nodes, which allowed the GPUs to run at their full speed.”
The code achieved a peak performance of almost 27.5 petaflops by using a mixed-precision calculation.
The code is a computational platform for multiscale simulations of real-life biofluidic problems. It can also simulate other biological phenomena, such as blood flow through the coronary arteries. It is written in the Fortran 90 and CUDA C languages and employs MPI—or message passing interface—for communication.
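One common way to keep GPUs running at full speed while nodes exchange data, and a plausible reading of the communication-minimizing strategy Bernaschi describes, is to overlap boundary exchanges with interior computation using nonblocking MPI. The following is a generic sketch of that idiom, not MUPHY’s actual code; the helper functions are placeholders.

```c
#include <mpi.h>

static void compute_interior(void) { /* placeholder: launch GPU kernels on interior data */ }
static void compute_boundary(const double *halo, int n) { (void)halo; (void)n; /* placeholder */ }

/* Generic communication-hiding pattern: start nonblocking halo exchanges,
 * do interior work (e.g., GPU kernels) while messages are in flight,
 * then finish the boundary once the halo data has arrived. */
void step(const double *halo_send, double *halo_recv, int halo_n,
          int left, int right, MPI_Comm comm)
{
    MPI_Request req[2];
    MPI_Irecv(halo_recv, halo_n, MPI_DOUBLE, left, 0, comm, &req[0]);
    MPI_Isend((void *)halo_send, halo_n, MPI_DOUBLE, right, 0, comm, &req[1]);

    compute_interior();                   /* overlaps with communication */

    MPI_Waitall(2, req, MPI_STATUSES_IGNORE);
    compute_boundary(halo_recv, halo_n);  /* halo data is now available */
}
```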
In this run each node monitored the movement and behavior of a single protein, which minimized the memory requirement and helped provide a good load balance. The proteins simulated are representative of the typical size of normal-state (i.e., not diseased) proteins and are the most common ones found in yeast cells, which are less sophisticated than but similar to human cells.
The team includes Bernaschi, Melchionna, Mauro Bisson from CNR’s Institute for Applied Computing, and Massimiliano Fatica from the NVIDIA Corporation (United States). This code has been a finalist in two other Gordon Bell competitions, but this is the first time that the CNR team has competed in the peak performance section.
The Gordon Bell Prize, which is awarded annually by the Association for Computing Machinery (ACM), recognizes outstanding achievements in high-performance supercomputing. The prizes identify supercomputer codes that can perform on large systems like Titan with quick times to solution. Gordon Bell finalists will present their results in November at the SC13 Conference in Denver, where the winners will be named. —by Jennifer Brouner
The Oak Ridge Leadership Computing Facility’s (OLCF’s) HPC Operations storage team recently relocated the center’s High-Performance Storage System (HPSS) archive tape library to a centralized location with a more controlled environment, resulting in better overall availability and uptime for OLCF system users and better resiliency of the media.
Developed in 1997 and a winner of an R&D 100 Award that same year, the HPSS archive uses tape and disk storage components, servers, and HPSS software to provide long-term storage for the massive amounts of data created by users on OLCF systems.
To ensure against data loss, the team updates the archive as often as possible with the latest software and storage technologies. This, however, can be a daunting task. Not only do storage needs increase every year, but the rate of increase is accelerating.
For instance, in 2006 the amount of data stored in HPSS surpassed 1 petabyte for the first time. Reaching that first petabyte took 8 1/2 years. The second took under 2 years, and the third took only 6 months.
This year the team streamlined the day-to-day operations of the HPSS archive system by colocating six Oracle StorageTek SL8500 tape libraries and more than 40,000 media cartridges in a single centralized location.
Each tape library can hold 10,000 individual media cartridges, with each cartridge capable of storing from 1 to 8 terabytes of data. This sheer volume of information made the move from two locations to a central one extremely challenging because more than 30 petabytes are stored within the HPSS archive—roughly three times the size of the entire printed collection at the Library of Congress. Adding to this challenge, the team had to deal with the previous, very complex cabling plant and large array of fiber-channel and Ethernet switch gear.
After several months of preparation, though, the team was able to not only move the library itself, but also upgrade facilities and systems such as power, space, and cooling, as well as complete a new cabling plant and fiber/Ethernet network. The team also worked with Oak Ridge National Laboratory fire engineers and vendor representatives to design a fire-suppression system to meet fire code requirements and further protect the archive media and data from damage.
As a result of this work, the tape library infrastructure is better able to share the load of the HPSS archive’s requests for tape resources. The libraries are also in a more controlled environment that regulates temperature, humidity, and air quality, leading to better resiliency of the tape media.
Lastly, the upgraded cabling plant and infrastructure moved the HPSS archive off older, more-expensive-to-maintain hardware, which will save tens of thousands of dollars each year through reduced maintenance expenses.
“We are always trying to implement the newest storage technologies so that our researchers know their data is safe,” said HPC Operations’ Kevin Thach. “But sometimes the technology is just not there yet or is too expensive at its current stage, so we have to come up with our own ways to make the HPSS archive more reliable.” —by Austin Koenig
Advancements to instruments in observatories and satellites can stretch the eye of the observer billions of light-years away to the fringes of the observable universe. Images from sky surveys of galaxies, quasars, and other astronomical objects offer scientists clues about how the distribution of mass is influenced by dark energy, the repelling force guiding the accelerated expansion of the universe.
But all the telescopes at scientists’ disposal cannot begin to canvass the distribution of mass across the entire universe through time—an analysis that would help physicists corroborate observational data with their understanding of the fundamental processes that govern how the structure of the universe is formed.
To create a comprehensive sky catalog of the development of the universe to which scientists can compare instrumental observations, researchers are using the Department of Energy’s (DOE’s) most powerful computing systems, including the nation’s top-ranked machine, Titan, managed by the Oak Ridge Leadership Computing Facility (OLCF), to simulate the evolution of the universe as it expands across billions of years.
“Basically what the code does is follow the formation of structure in the universe,” said Salman Habib, project leader and high-energy physicist and computational scientist at Argonne National Laboratory (ANL). “And the idea is to get very-high-accuracy simulations of the universe so you can compare them to observations of the sky.”
To simulate cosmic structure, the ANL research team and its collaborators developed a modular, high-performance computing (HPC) code called HACC (for Hardware/Hybrid Accelerated Cosmology Code) designed for diverse HPC architectures. HACC requires petascale computing to “evolve” trillions of interacting particles and is the first large-scale cosmology code that can run on a hybrid CPU/GPU supercomputer, as well as on multicore or many-core architectures.
The code’s exceptional performance on the OLCF’s 27-petaflop, hybrid CPU/GPU Cray XK7 supercomputer, Titan, located at Oak Ridge National Laboratory, as well as on the IBM Blue Gene/Q machines Sequoia (at Lawrence Livermore National Laboratory) and Mira (at ANL), earned the project a finalist nomination for ACM’s Gordon Bell Prize. The award recognizes outstanding achievement in high-performance supercomputing applications, and this year’s winner will be announced at the SC13 supercomputing conference in November. Four of the six finalists, including HACC, ran on Titan.
Pushing the boundaries of what is possible in HPC today, the most computationally demanding calculations in HACC run at performance levels that translate to more than 25 petaflops on Titan, with an estimated sustained performance of 10 petaflops. Such extreme performance is needed to simultaneously capture the large scales and the low-level details that modern cosmology requires.
Tracing the cosmos billions of particles at a time
HACC simulations begin with an extremely dense and very uniform universe. As a run proceeds, the simulated universe expands in a series of thousands of time steps, while at the same time the initial uniform structure forms a complex cosmic web, developing a detailed clustering of mass across large distances.
“The code starts with an initial, smooth density field, and it tracks where the matter goes,” Habib said. “And within these clumps of matter, galaxies form and other things happen.”
But unlike many HPC simulations that are building virtual systems molecule by molecule or atom by atom, HACC’s trillions of points of mass are not exact representations of physical objects.
“We’re trying to track where the mass is in the universe, and of course, if you tried to do it by number of atoms, that would be hopeless because there are way too many of those,” Habib said.
The trillions of particles simulated are “tracer” particles, themselves representing conglomerations of mass like galaxies.
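Underneath the tracer-particle picture is a gravitational N-body calculation: every particle feels the pull of the others, and each time step kicks velocities and drifts positions. The direct, all-pairs version below is only a toy; as described later in the article, HACC avoids this n-squared sum by handling long-range forces on a grid and keeping only short-range pairs explicit.

```c
#include <math.h>
#include <stdlib.h>

typedef struct { double x[3], v[3], m; } Body;

/* Toy direct-sum gravity step for n tracer particles: compute every
 * pairwise force, then kick velocities and drift positions. Cost grows
 * as n*n, which is why HACC replaces the long-range part of this sum
 * with a grid method and keeps only short-range pairs explicit. */
void nbody_step(Body *b, int n, double G, double soft, double dt)
{
    double *acc = calloc((size_t)(3 * n), sizeof *acc);
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            if (j == i) continue;
            double r[3], r2 = soft * soft;        /* softening avoids 1/0 */
            for (int d = 0; d < 3; ++d) {
                r[d] = b[j].x[d] - b[i].x[d];
                r2 += r[d] * r[d];
            }
            double inv_r3 = 1.0 / (r2 * sqrt(r2));
            for (int d = 0; d < 3; ++d)
                acc[3 * i + d] += G * b[j].m * r[d] * inv_r3;
        }
    for (int i = 0; i < n; ++i)
        for (int d = 0; d < 3; ++d) {
            b[i].v[d] += acc[3 * i + d] * dt;     /* kick */
            b[i].x[d] += b[i].v[d] * dt;          /* drift */
        }
    free(acc);
}
```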
“Modern cosmology isn’t about looking at one object,” Habib said. “It’s about building statistics over billions of objects.”
The smallest clump of mass distinguishable in the code is 100 billion solar masses (roughly equivalent to the Milky Way galaxy), and the code resolves distances from kiloparsecs to gigaparsecs—from galaxy-scale distances to the entire swath of the observable universe.
“There’s very high resolution in this code,” Habib said. “And one of the reasons is that you may be comparing the results to a range of observational surveys with different parameters, so everywhere in the simulation you need a resolution on the order of a million to one.”
HACC has applications for a wide range of cosmological studies, especially in the production of predictive surveys for studies of dark matter, cosmic background radiation, and other indicators of dark energy at work.
“Many observations of the sky are ongoing or planned for the near future,” Habib said. “HACC is the modeling side of that.”
Habib’s team, for instance, aims to use HACC in conjunction with the Large Synoptic Survey Telescope (LSST) project. LSST will follow changes in the southern sky over a period of 10 years. By comparing HACC simulations with integrated time-lapse images from LSST, which will provide measurements of weak gravitational lensing (or the distortion of background galaxy images by foreground matter concentrations), researchers intend to probe the nature of dark energy.
Accelerating the universe on GPUs
One of the code’s most attractive features is its versatility. With limited and varied access to supercomputers, researchers benefit from a code that can be used on multiple architectures.
“We have modules that we can plug in and out for different architectures,” Habib said. “The beauty of HACC is we can run the exact same problem on different machines with different architectures using different algorithms and, by comparison, make sure the answer is accurate.”
And as the numbers prove, HACC is highly effective across architectures. In the test runs on Titan and Mira, the code evolved at least 1.1 trillion particles each time, and a larger run on Sequoia landed at 3.6 trillion particles.
“This is the largest cosmological benchmark ever performed,” said Katrin Heitmann, who is part of ANL’s HACC team.
To accommodate different architectures, HACC’s framework is divided into two levels: a more homogeneous grid level and a detailed, computationally intensive, particle-to-particle level.
The difference between running the code on Titan’s GPUs and running it on the all-CPU, many-core Blue Gene/Q systems like Sequoia and Mira lies in the short-range, particle-to-particle calculations. Hybrid architectures like Titan use an algorithm designed for accelerated GPU hardware, whereas the many-core Blue Gene machines use an algorithm developed for CPUs.
“The grid is responsible for four orders of magnitude of dynamic range, which are longer distances, while the particle interactions handle the critical two orders of magnitude at the shortest scales,” Habib said. “The bulk of the time-stepping computation takes place in the latter.”
As a user “zooms in” to smaller orders of magnitude, instances of mass in the field become denser, and increasingly accurate calculations are required. Calculating these particle-to-particle interactions is where Titan’s GPUs excelled, generating the largest cosmological simulation on GPUs to date.
“The advantage of the GPUs is they have this raw speed,” Habib said. And this raw speed is amplified by HACC’s mixed-precision code, meaning that some calculations carry roughly 7 significant decimal digits (single precision) and some roughly 16 (double precision).
For single-precision calculations, both CPUs and GPUs can roughly double in speed relative to double precision. GPUs, however, are doubling or tripling an already higher speed, which further reduces computation time.
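A standard mixed-precision idiom, offered here as a sketch of the idea rather than HACC’s actual kernels, is to do the fast arithmetic in single precision while accumulating the result in double precision so rounding errors do not pile up:

```c
/* Mixed-precision idiom: single-precision arithmetic for speed,
 * double-precision accumulation for accuracy. A sketch of the idea,
 * not HACC's actual force kernels. */
double dot_mixed(const float *x, const float *y, int n)
{
    double sum = 0.0;                 /* ~16 significant digits */
    for (int i = 0; i < n; ++i)
        sum += (double)(x[i] * y[i]); /* each product in float, ~7 digits */
    return sum;
}
```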
“We view the GPUs as a place where you get ‘computing for free,’” Habib said. “The particle interactions do well on the GPUs because they’re very computationally intensive.”
An additional benefit of HACC’s modular framework is its potential scalability as supercomputing architectures evolve. HACC is a highly parallel code, meaning that it can perform many calculations at once, and its capability is not limited to today’s petaflop machines. HACC is prepared to scale beyond petascale computing.
“We don’t see a problem at least into machines capable of hundreds of petaflops,” Habib said.
And that’s good news for researchers because resolving the history of the universe and the mystery of dark energy may take a lot of computational power and a little bit of time. —by Katie Elyce Jones