Goal is to help speed up difficult jobs
Over the past months, Titan users have had the difficult task of moving enormous amounts of data from the old file system, Widow, to the new and improved Atlas.
Fortunately they have had help in the form of the powerful Distributed File Copy Tool (dcp). According to Oak Ridge Leadership Computing Facility’s (OLCF’s) Blake Caldwell, dcp has helped OLCF staff move 350 terabytes of data so far on behalf of some users, while other users have independently used dcp to copy much more.
OLCF helped to develop dcp as a collaboration with Lawrence Livermore National Laboratory (LLNL), Los Alamos National Laboratory, and the company Data Direct Networks to create a suite of parallel file system tools designed for scalability and performance. Such tools (dcp included) speed up difficult jobs by distributing the workload across multiple processors.
LLNL was the primary developer of dcp, but OLCF has played an important role. Developer Dr. Feiyi Wang, for instance, worked on several of dcp’s features, most notably testing the system and improving its stability.
Though dcp is not the first parallel copy tool, it is unique. According to Wang, traditional multithreading applications can’t scale beyond a single symmetric multiprocessing (SMP) node. In response, dcp uses MPI tasks instead of multithreading. Caldwell says, “MPI allows us to do the transfer over multiple nodes, which allows us to task more cores to each copy.”
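The distribution idea is straightforward to picture. The sketch below, written in Python with mpi4py, is an illustrative toy rather than dcp itself: it broadcasts a file list to every MPI rank and lets each rank copy its share, the same basic pattern dcp applies across many nodes. The directory paths are hypothetical placeholders.

```python
from mpi4py import MPI
import os, shutil

# Illustrative sketch only: distribute a list of file copies across MPI ranks,
# the same basic idea dcp uses to spread a large copy over many nodes.
comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

src_root, dst_root = "/widow/project", "/atlas/project"   # hypothetical paths

if rank == 0:
    # Rank 0 walks the source tree once and builds the work list.
    files = [os.path.relpath(os.path.join(d, f), src_root)
             for d, _, names in os.walk(src_root) for f in names]
else:
    files = None
files = comm.bcast(files, root=0)          # every rank receives the full list

for i, rel in enumerate(files):
    if i % size != rank:                   # round-robin assignment of files to ranks
        continue
    dst = os.path.join(dst_root, rel)
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    shutil.copy2(os.path.join(src_root, rel), dst)

comm.Barrier()                             # wait until every rank finishes its share
if rank == 0:
    print(f"copied {len(files)} files using {size} MPI tasks")
```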
However, Wang emphasizes that “dcp is just one of the tools in a suite of tools.” The collaboration that developed dcp has many other parallel file system appliances in the works. OLCF is leading in designing and implementing a dtar tool, which will use parallelism to efficiently collate many files into one, and a dfind tool, which will use parallelism to find specific files in the masses of data on the computer.
In fact, dcp itself is still evolving; Wang hopes to soon add a resume feature, allowing users to recover from a failed transfer, and provide progress information during the copying process. —Timothy Metcalf
This visualization shows the turbulence front from the plasma edge being spread inward in multiscale interaction with the evolving background profile under the central heat source. Eventually, the whole volume becomes turbulent, with the spatial turbulence amplitude distribution being just enough to produce the outward heat transport to expel the centrally deposited heat to the edge. The edge turbulence source is continuously fed by the heat flux from the core. This is how the plasma profile, the heat source and the turbulence self-organize.
Credit: Dave Pugmire, ORNL.
Titan blazes trail for ITER reactor via DIII-D
Few problems have vexed physicists like fusion, the process by which stars fuel themselves and by which researchers on Earth hope to create the energy source of the future.
By heating the hydrogen isotopes tritium and deuterium to more than five times the temperature of the Sun’s surface, scientists create a reaction that could eventually produce electricity. Turns out, however, that confining the engine of a star to a manmade vessel and using it to produce energy is tricky business.
Big problems, such as this one, require big solutions. Luckily, few solutions are bigger than Titan, the Department of Energy’s flagship Cray XK7 supercomputer managed by the Oak Ridge Leadership Computing Facility.
Titan allows advanced scientific applications to reach unprecedented speeds, enabling scientific breakthroughs faster than ever with only a marginal increase in power consumption. This marriage of CPUs and number-crunching GPU accelerators enables Titan, located at Oak Ridge National Laboratory (ORNL), to reach a peak performance of 27 petaflops and claim the title of the world’s fastest computer dedicated solely to scientific research.
And fusion is at the head of the research pack. In fact, a team led by Princeton Plasma Physics Laboratory’s (PPPL’s) C.S. Chang increased the performance of its XGC1 fusion code fourfold on Titan’s GPUs and CPUs compared with the previous CPU-only version. The gain came after a 6-month performance engineering period during which the team tuned its code to best take advantage of Titan’s revolutionary hybrid architecture.
“In nature, there are two types of physics,” said Chang. The first is equilibrium, in which changes happen in a “closed” world toward a static state, making the calculations comparatively simple. “This science has been established for a couple hundred years,” he said. Unfortunately, plasma physics falls in the second category, in which a system has inputs and outputs that constantly drive the system to a nonequilibrium state, which Chang refers to as an “open” world.
Most magnetic fusion research is centered on a tokamak, a donut-shaped vessel that shows the most promise for magnetically confining the extremely hot and fragile plasma. Because the plasma is constantly coming into contact with the vessel wall and losing mass and energy, which in turn introduces neutral particles back into the plasma, equilibrium physics generally does not apply at the edge, and simulating that environment with conventional computational fluid dynamics is difficult.
Another major reason the simulations are so complex is their multiscale nature. The distance scales involved range from millimeters (what’s going on among the gyrating particles and turbulence eddies inside the plasma itself) to meters (looking at the entire vessel that contains the plasma). The time scales introduce even more complexity, as researchers want to see how the edge plasma evolves from microseconds in particle motions and turbulence fluctuations to milliseconds and seconds in its full evolution. Furthermore, these two scales are coupled. “The simulation scale has to be very large, but still has to include the small-scale details,” said Chang.
And few machines are as capable of delivering in that regard as is Titan. “The bigger the computer, the higher the fidelity,” he said, simply because researchers can incorporate more physics, and few problems require more physics than simulating a fusion plasma.
On the hunt for blobs
Studying the plasma edge is critical to understanding the plasma as a whole. “What happens at the edge is what determines the steady fusion performance at the core,” said Chang. But when it comes to studying the edge, “the effort hasn’t been very successful because of its complexity,” he added.
Chang’s team is shedding light on a long-known and little-understood phenomenon known as “blobby” turbulence in which formations of strong plasma density fluctuations or clumps flow together and move around large amounts of edge plasma, greatly affecting edge and core performance in the DIII-D tokamak at General Atomics in San Diego, CA. DIII-D-based simulations are considered a critical stepping-stone for the full-scale, first principles simulation of the ITER plasma edge. ITER is a tokamak reactor to be built in France to test the science feasibility of fusion energy.
The phenomenon was discovered more than 10 years ago, and is one of the “most important things in understanding edge physics,” said Chang, adding that people have tried to model it using fluids (i.e., equilibrium physics quantities). However, because the plasma inhabits an open world, it requires first-principles, ab-initio simulations. Now, for the first time, researchers have verified the existence and modeled the behavior of these blobs using a gyrokinetic code (or one that uses the most fundamental plasma kinetic equations, with analytic treatment of the fast gyrating particle motions) and the DIII-D geometry.
This same first-principles approach also revealed the divertor heat load footprint. The divertor will extract heat and helium ash from the plasma, acting as a vacuum system and ensuring that the plasma remains stable and the reaction ongoing.
These discoveries were made possible because the team’s XGC1 code exhibited highly efficient weak and strong scalability on Titan’s hybrid architecture up to the full size of the machine. Collaborating with Ed D’Azevedo, who is supported by the OLCF and by the DOE Scientific Discovery through Advanced Computing (SciDAC) project Center for Edge Physics Simulation (EPSi), along with Pat Worley (ORNL), Jianying Liand (PPPL), and Seung-Hoe Ku (PPPL), who are also supported by EPSi, the team optimized XGC1 for Titan’s GPUs at the maximum number of nodes, boosting performance fourfold over the previous CPU-only code. This performance increase has enormous implications for predicting fusion energy efficiency in ITER.
“We can now use both the CPUs and GPUs efficiently in full-scale production simulations of the tokamak plasma,” said Chang.
Furthermore, added Chang, Titan is beginning to allow the researchers to model physics, such as electron-scale turbulence, that was out of reach altogether as little as a year ago. Jaguar, Titan’s CPU-only predecessor, was fine for ion-scale edge turbulence because ions are both slower and heavier than electrons (for which the computing requirement is 60 times greater), but it fell seriously short when it came to calculating electron-scale turbulence. While Titan is still not quite powerful enough to model electrons as accurately as Chang would like, the team has developed a technique that allows them to simulate electron physics approximately 10 times faster than on Jaguar.
And they are just getting started. The researchers plan on eventually simulating the full volume plasma with electron-scale turbulence to understand how these newly modeled blobs affect the fusion core, because whatever happens at the edge determines conditions in the core. “We think this blob phenomenon will be a key to understanding the core,” said Chang, adding, “All of these are critical physics elements that must be understood to raise the confidence level of successful ITER operation. These phenomena have been observed experimentally for a long time, but have not been understood theoretically at a predictable confidence level.”
Given that the team can currently use all of Titan’s more than 18,000 nodes, a better understanding of fusion is certainly in the works. A better understanding of blobby turbulence and its effects on plasma performance is a significant step toward that goal, proving yet again that few tools are more critical than simulation if mankind is to use the engines of stars to meet its most pressing need: clean, abundant energy.
Project goal is to help inform users
As high-performance computing (HPC) capability breaks petaflop boundaries and pushes toward the exascale to enable new scientific discoveries, the amount of data generated by large-scale simulations is becoming more difficult to manage. At the Oak Ridge Leadership Computing Facility (OLCF), users on the world’s most powerful supercomputer for open science, Titan, are routinely producing tens or hundreds of terabytes of data, and many predict their needs will multiply significantly in the next 5 years.
The OLCF currently provides users with 32 petabytes of scratch storage and hosts up to 34 petabytes of archival storage for them. However, individual users often need to move terabytes of data to other facilities for analysis and long-term use.
“We’re adding data at an amazing rate,” said Suzanne Parete-Koon of the OLCF User Assistance and Outreach Group.
To address this expanding load of outgoing data, three OLCF groups worked together to identify data transfer solutions and improve transfer rates for users by evaluating three parallel-streaming tools that can be optimized on upgraded OLCF hardware. Parete-Koon and colleagues Hai Ah Nam and Jason Hill shared their results in the paper “The practical obstacles of data transfer: Why researchers still love scp,” which was published in conjunction with the Third International Workshop on Network-Aware Data Management held at the SC13 supercomputing conference in November 2013.
Leadership computing facilities have long anticipated an increasing demand for data management resources. In a 2013 survey, OLCF staff asked users to rank hardware features in order of priority for their future computing needs. Archival storage ranked fourth and wide area network bandwidth for data transfers ranked seventh.
The OLCF revisited its data transfer capabilities and dedicated two nodes for outgoing batch transfers last year. At the beginning of 2014, that capability increased to 10 nodes.
“We saw the need for this service and have worked throughout the year to deploy these nodes to support the scientific data movement that our users demand,” Hill said.
These dedicated batch nodes provide a scheduled data transfer source determined by user requests.
“By scheduling transfers, users can maximize their transfer and get more predictable performance,” Hill said.
Data is moved from the OLCF via the Energy Sciences Network (ESNet)—the Department of Energy’s speedier lane of the Internet that is capable of transferring 10 times more bits per second than the average network. But in fall 2013, even with ESNet’s high bandwidth and OLCF’s new dedicated transfer nodes, users still complained of painfully slow transfer rates.
Parete-Koon, Nam, and Hill discovered researchers were often underutilizing the resources by using single-stream transfer methods such as scp rather than multiple-stream methods that break data into multiple, simultaneous streams, resulting in higher transfer rates.
“Using scp when multiple-stream methods are available is akin to drivers using a single lane of a 10-lane highway,” Nam said.
Staff tested three multiple-stream tools compatible with and optimized for OLCF systems—bbcp, GridFTP, and Globus Online—for ease of use and reliable availability. Results showed that bbcp and GridFTP transferred data up to 15 times faster than scp in performance tests.
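For readers unfamiliar with multi-streaming, here is a conceptual Python sketch, not any of the tools above: it splits one large file into byte ranges and moves each range on its own worker, which is the idea behind striping a transfer across several simultaneous streams. Real tools such as bbcp and GridFTP do this over the network; the file names and stream count below are invented, and the "transfer" is a local copy so the example stays self-contained.

```python
import os
from concurrent.futures import ThreadPoolExecutor

SRC, DST, STREAMS = "big_input.dat", "big_output.dat", 8   # hypothetical names

def copy_range(offset, length, chunk=4 << 20):
    """Copy `length` bytes starting at `offset`: one stream's share of the file."""
    with open(SRC, "rb") as src, open(DST, "r+b") as dst:
        src.seek(offset)
        dst.seek(offset)
        remaining = length
        while remaining > 0:
            buf = src.read(min(chunk, remaining))
            dst.write(buf)
            remaining -= len(buf)

size = os.path.getsize(SRC)
part = size // STREAMS

# Pre-size the destination so every stream can write its own region safely.
with open(DST, "wb") as f:
    f.truncate(size)

with ThreadPoolExecutor(max_workers=STREAMS) as pool:
    for i in range(STREAMS):
        length = part if i < STREAMS - 1 else size - part * (STREAMS - 1)
        pool.submit(copy_range, i * part, length)   # each submission is one "stream"
```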
“Although these multi-streaming tools were available to users, they were being grossly underutilized,” Parete-Koon said. “We wondered why researchers were not taking advantage of these tools.”
Through the survey and conversations with users, the team identified why researchers were stuck on scp.
Users were willing to sit on standby during long stretches of data transfer not only because they were typically more familiar with scp, but also because some multiple-stream methods require a lengthy setup process that can take several days, creating a high barrier to entry.
“Because of a high level of cybersecurity at the OLCF and other computing facilities, transferring files requires additional steps that seem inconvenient and overly complicated to users,” Parete-Koon said. “For example, to use GridFTP, OLCF users need an Open Science Grid Certificate, which is like a passport that users ‘own’ and are responsible for maintaining.”
Despite the improved transfer capabilities of GridFTP, only 37 Open Science Grid Certificates—out of 157 projects with about 500 users—had been activated on OLCF systems as of November 2013.
“We tested how fast these tools worked and asked ourselves if each tool would be convenient enough so people would go through the extra setup work,” Parete-Koon said.
“The results in our paper show that better transfer speed is worth it.”
Open grid certificates also enable users to launch an automated workflow from a batch script.
“The ability to seamlessly integrate data transfer into the scientific workflow with a batch script is where the benefit of the grid certificate authentication becomes even more apparent,” Nam said. “The certificates allow users to have uninterrupted productivity even when they’re not sitting in front of a computer screen.”
Based on the performance test results for bbcp, GridFTP, and Globus Online, the OLCF is recommending users with large data transfers apply for the certificate and take advantage of the specialized data transfer nodes.
In an effort to help users navigate the complicated setup process, Parete-Koon extended the research from the paper to establish documentation and user support.
“Suzanne [Parete-Koon] presented this information during a user conference call to a packed audience, which shows that users are interested in this topic and are looking for solutions,” Nam said.
Parete-Koon shared a sample, automated workflow that uses three scripts: the first to schedule a data transfer node, the second to launch an application, and the third to transfer files to longer-term data storage or a remote site for analysis. She is now working on a specialized data transfer user’s guide.
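The article does not reproduce Parete-Koon's scripts, but the shape of such a workflow can be sketched. The Python fragment below assumes a PBS/Torque-style scheduler in which qsub prints a job ID and accepts afterok dependencies; the script names, dependency structure, and queue behavior are assumptions, not the actual OLCF workflow.

```python
import subprocess

def submit(script, after=None):
    """Submit a batch script, optionally holding it until another job succeeds."""
    cmd = ["qsub"]
    if after:
        cmd += ["-W", f"depend=afterok:{after}"]   # PBS/Torque-style job dependency
    cmd.append(script)
    return subprocess.check_output(cmd, text=True).strip()   # qsub prints the job ID

# Hypothetical three-step workflow mirroring the description above.
dtn_job  = submit("reserve_transfer_node.pbs")               # 1) schedule a data transfer node
app_job  = submit("launch_application.pbs")                  # 2) run the application that produces the data
xfer_job = submit("transfer_results.pbs", after=app_job)     # 3) move results to archive or a remote site

print("submitted jobs:", dtn_job, app_job, xfer_job)
```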
“After the conference call, one user thanked us because he was able to transfer 2 years of data—almost 19 terabytes—in five streams with an average rate of 1,290 megabits per second,” Nam said. “That is about five times faster than an optimized scp transfer.” —Katie Elyce Jones
ORNL researchers learn more about biomass recalcitrance
Computer simulations are revealing the biological barriers that prevent the conversion of biomass into energy.
A team led by Oak Ridge National Laboratory’s Jeremy Smith, the director of ORNL’s Center for Molecular Biophysics and a Governor’s Chair at the University of Tennessee, has uncovered information that could help others harvest energy from plant mass. The team’s conclusion—that less ordered cellulose fibers bind less lignin—was published in the August edition of Biomacromolecules.
The team used simulations on the Oak Ridge Leadership Computing Facility’s Jaguar supercomputer—a 2.3-petaflop machine that in 2012 morphed into Titan, which is a more than 27-petaflop supercomputer—along with neutron scattering to seek ways to make ethanol as cheap as gasoline at the pump.
“We are trying to figure out how to effectively break down plant materials like grass or wood chips cheaply enough to make biofuels economically viable,” said Loukas Petridis, a researcher in the Biosciences Division. “We are investigating the two main features that make biomass recalcitrant, or resistant to breakdown—the presence of lignin and the tightly ordered structure of cellulose.”
All plants contain a sticky molecule called lignin that intertwines with cellulose and hemicellulose in their cell walls. This substance is one of the major roadblocks preventing the cost-effective production of cellulosic ethanol.
Lignin is a plant cell’s first defense against man and beast. It provides strength to the stalks of plants so that these organisms can stand. But a plant’s best friend is a bioenergy researcher’s worst nightmare. During biofuel production—a process that converts plant mass into alcohol—lignin blocks enzymes from breaking down cellulose into the sugars necessary for fermentation.
Petridis works with ORNL’s Biofuels Science Focus Area, a multidisciplinary research group with experts in math, computer science, physics, chemistry, and biology who are working on the lignin problem. By studying lignin–cellulose and lignin–lignin interactions, the scientists have learned more about the physical processes that occur during biomass pretreatment, which is an expensive process that opens plant cell walls and helps enzymes break down plant mass.
“The process is very effective,” Petridis said. “Without it you wouldn’t be able to produce biofuels, but we want to improve it further so that the production of biofuels becomes cheaper and more efficient. In order to do this, we have to understand what is taking place during pretreatment on a molecular level.”
Neutron imaging and supercomputer simulation allow scientists to resolve the structure of lignin aggregates down to 1 angstrom, about 1 million times smaller than what the naked eye can see.
A previous simulation on Jaguar showed how different pretreatment temperatures change lignin’s structure, causing it to either aggregate or expand. Petridis’s study builds on this finding.
Using neutron beams at ORNL’s High Flux Isotope Reactor, researchers have discovered that cellulose fibers that are less organized, or noncrystalline, are easier for enzymes to break down. Simulations run on Jaguar helped explain this phenomenon.
“Jaguar has shown us that not only is noncrystalline cellulose more easily broken down, but it also associates less with lignin,” Petridis said. “This was the first simulation that has looked at the interaction of lignin with specific types of cellulose.”
But researchers often want to learn why a process is occurring on a more fundamental level. Fortunately computer models also reveal the reason behind lignin’s preference for crystalline cellulose.
“It is harder for lignin to associate with the noncrystalline cellulose because this cellulose interacts very strongly with water,” Petridis said. “To see this we needed Jaguar’s enormous supercomputing power to simulate 3 million lignin, cellulose, and water molecules.”
Smaller supercomputers can simulate a cellulose fiber and one lignin molecule, but they often miss many of the molecular interactions that ORNL supercomputers like Jaguar or Titan can capture.
Smith’s team was awarded 23 million processor hours on Jaguar in 2012 through the Innovative and Novel Computational Impact on Theory and Experiment, or INCITE, program.
Under the INCITE allocation, the team ran a classical molecular dynamics simulation with a code called GROMACS (for Groningen Machine for Chemical Simulations). The code monitored 3 million atoms and used 30,000 of Jaguar’s cores, which means that each core was responsible for 100 atoms. The application provides information on the interaction of water with the cellulose, the degree of lignin aggregation, the shapes of lignin molecules, diffusion constants, and more.
The researchers received another 78 million INCITE hours this year to run an adapted version of GROMACS that can take advantage of Titan’s speedy GPUs, allowing simulations roughly 10 times larger, and much faster, than those run on Jaguar.
The Titan supercomputer is capable of more than 27,000 trillion calculations per second, and is the fastest open-science supercomputer in America. OLCF’s hybrid machine includes CPUs and NVIDIA GPUs, which are computational accelerators originally found in gaming systems. The GPUs provide the processing power needed to simulate larger and more realistic systems and accelerate scientific discovery.
Current simulations track around 30 million atoms, which include crystalline and noncrystalline cellulose, lignin, enzymes and water molecules. The models also account for the long-range interactions that lignin molecules experience with other surrounding molecules. These interactions make the code especially difficult to scale because the processors have to exchange information very frequently to monitor the position of every atom every fraction of a second.
“If you compare our 3-million-atom simulation to our 30-million-atom simulation, you can guess that we’ll eventually be able to simulate a billion atoms, which is the size of a living cell,” said Smith. “Our simulations could even include the microbes that eat the biomass.”
The team is now studying how lignin behaves in different types of biomass, which will help the researchers identify the plant characteristics best suited for biofuel production.
“The scientific insights we gain help improve the biofuel production process,” Petridis said. “We give engineers hints about how the production process works, which we hope will allow them to design new pretreatment methods and engineer different types of biomass and enzymes that can harvest more energy from plant materials.”
As supercomputing power increases, Petridis’s team will be able to simulate longer biological processes in greater detail. The simulations they run are already high resolution, but they can simulate only processes that happen within milliseconds.
“We can simulate lignin aggregation because it takes place in a millisecond, but there are other interesting biological processes that take much longer than 1 millisecond,” Petridis said. “The future of supercomputing is being able to access time and length scales that are currently accessible only through neutron scattering experiments.” —Jennifer Brouner
The OLCF replaces Widow with the next generation Atlas file system
The Oak Ridge Leadership Computing Facility (OLCF) has introduced the next generation of data management with Atlas, a new center-wide file system that delivers increased capacity, greater performance, and a new directory structure for OLCF computational users.
Available to all OLCF users, Atlas provides a location to temporarily store large amounts of data produced on three OLCF systems: Titan, Eos, and Rhea. It replaces the OLCF’s Widow file system.
Atlas is organized by project, with work areas for users and projects located under a new project directory. Users on multiple projects now also have multiple work areas.
“Previously, a user’s project and their work areas were separate, making it extremely difficult to share data between the two,” said user support specialist Chris Fusion. “As a result of the new directory structure, users are no longer required to change permissions on their project directory to share data within the project, making it much easier to manage.”
The system itself is divided into two separate file systems, Atlas1 and Atlas2, together providing 32 petabytes of capacity and greater than 1 terabyte per second of aggregate performance—a huge step up from the previous file system.
The new file system also supports more than one metadata server, improving metadata operations, such as creating and listing files, and providing twice the aggregate performance across the system.
“The existence of multiple file systems increases our ability to keep at least one file system available at all times,” said Dustin Leverman, deployment lead for Atlas. “This helps ensure that our users are still able to manage their project data at any given time. Furthermore, the increased capacity and greater performance will allow users even greater depth in their research, with fewer restrictions on data, both in how fast it’s generated and in the overall amount.” —Austin Koenig
Since the Oak Ridge Leadership Computing Facility’s (OLCF’s) Titan supercomputer began accepting its full suite of users on May 31st, science has been picking up steam.
With its hybrid architecture featuring traditional CPUs alongside GPUs, Titan represents a revolutionary paradigm in high-performance computing’s quest to reach the exascale with only marginal increases in power consumption for the world’s leading systems.
But while peak-performance numbers approaching 30 petaflops are indeed jaw-dropping, Titan is only as powerful as the applications that use its unique architecture to solve some of our greatest scientific challenges.
“The real measure of a system like Titan is how it handles working scientific applications and critical scientific problems,” said Buddy Bland, project director at the OLCF. “The purpose of Titan’s incredible power is to advance science, and the system has already shown its abilities on a range of important applications.”
In an effort to ensure that users could make the most of Titan when it came online and get the most science out of their allocations, the OLCF launched the Center for Accelerated Application Readiness (CAAR), a collaboration among application developers; Titan’s manufacturer, Cray; GPU manufacturer NVIDIA; and the OLCF’s scientific computing experts. CAAR identified five applications that might be able to harness the potential of the GPUs quickly and realize significant improvements in performance, fidelity, or both.
CAAR applications included the combustion code S3D; LSMS, which studies magnetic systems; LAMMPS, a bioenergy and molecular dynamics application; Denovo, which investigates nuclear reactors; and CAM-SE, a code that explores climate change.
These codes will greatly benefit as they scale up to take advantage of Titan’s unprecedented computing power. For instance, the S3D code will move beyond modeling simple fuels to tackle complex, larger-molecule hydrocarbon fuels such as isooctane (a surrogate for gasoline) and biofuels such as ethanol and butanol, helping America to achieve greater energy efficiency through improved internal combustion engines. And the climate change application CAM-SE will be able to increase the simulation speed to between 1 and 5 years per computing day, compared to just three months per computing day on Jaguar, Titan’s Cray XT5 predecessor. This speed increase is needed to make ultra-high-resolution, full-chemistry simulations feasible over decades and centuries and will allow researchers to quantify uncertainties by running multiple simulations.
One code in particular is already taking off. LAMMPS, a molecular dynamics code that simulates the movement of atoms through time and is a powerful tool for research in biology, materials science, and nanotechnology, has seen more than a sevenfold speedup on Titan compared to its performance on the comparable CPU-only Titan system (before the GPUs were available).
WL-LSMS, another CAAR application, has also proven to be an excellent match for Titan’s hybrid architecture. It calculates the magnetic properties of promising materials, including their Curie temperatures, from basic laws of physics rather than from models that must incorporate approximations. WL-LSMS ran 3.8 times faster on the GPU-enabled Cray XK7 Titan than on its XE6 CPU-only predecessor on a problem that consumed 18,600 of Titan’s compute nodes, 88 short of the full machine. Equally impressive is the fact that even with this dramatic increase in performance, the GPU version of Titan also consumed 7.3 times less energy than the CPU-only incarnation. This combination of accelerated performance and reduced energy consumption is precisely what the addition of the GPUs was intended to accomplish.
All of the CAAR applications achieved significant speedups using Titan’s GPUs, paving the way for the rest of Titan’s users, who are now beginning to ramp up their individual codes to scales never dreamed of just a few years ago.
ORNL had the option of building an equivalent Cray system that did not incorporate GPUs, using those 18,688 sockets instead to hold additional 16-core CPUs. Essentially, whereas each Titan node contains one NVIDIA Kepler GPU plus an AMD 16-core Opteron CPU, the other option would have simply replaced the GPU with another Opteron CPU. The resulting system would have contained nearly 600,000 processing cores, but it would nevertheless have paled in comparison to Titan.
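The arithmetic behind the "nearly 600,000" figure is simple: with two 16-core Opterons in each of the 18,688 nodes, the hypothetical CPU-only machine would have held

\[ 18{,}688 \times 2 \times 16 = 598{,}016 \ \text{cores}. \]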
The benefits of incorporating the GPUs are becoming more apparent with each application development, validating the OLCF’s philosophy and giving computational scientists unprecedented firepower with which to tackle some of nature’s largest, most complex questions.
But as with any sea change, Titan’s success is about more than the addition of the GPUs. Titan’s hybrid architecture presented obstacles on multiple fronts, obstacles the OLCF overcame in time for Titan’s official unveiling in June of 2013.
“We’ve made several improvements that don’t concern the GPUs but do affect the bottom line,” said OLCF Director of Science Jack Wells.
Take the individual codes, for instance. Many had to be restructured so that they could efficiently offload chunks of work to the GPUs, said Wells. In the case of S3D, the combustion code, that meant taking an MPI-based code and incorporating OpenMP threading. “When we did that, many codes saw a twofold speedup just by improving the code on the old machine,” said Wells.
Beyond the codes, the OLCF also reinvigorated large portions of the entire computing ecosystem. For instance, the center upgraded its Spider file system, now known as Atlas, more than doubling the number of I/O routers between Titan and Atlas compared with the number between Jaguar and Spider. The OLCF also took a close look at where the routers sat in the network, spreading them out so that I/O and communication traffic would conflict as little as possible.
Together with the addition of the GPUs, these ecosystem improvements represent a paradigm shift in high-performance computing. While the challenges that remain are great, they are dwarfed by the promise already evident after Titan’s first few months of use.
Plants solved the solar energy challenge billions of years ago, with photosynthesis.
The sunlight-fueled process begins with two very plentiful molecules—carbon dioxide (CO2) and water (H2O). By rearranging electrons, the plants are able to combine hydrogen, carbon, and oxygen to produce carbohydrates, the sugars that store all the energy needed to fuel the plants and, by extension, our own bodies.
Photosynthesis is impressive not only because plants found a way to convert sunlight into chemical energy that they could use. It is also impressive because they found a way to economize.
While transferring an electron alone is relatively hard, moving an electron along with a proton is relatively easy. Plants take advantage of this principle by moving the particles simultaneously. As a result they are able to rearrange more electrons, and produce more energy-storing carbohydrates, using less of the solar energy that drives the process.
This trick is known as proton-coupled electron transfer, or PCET. It is present in a variety of processes, not just photosynthesis, but we don’t really know how or why it works. The researchers who figure it out will open the door to a range of new and vastly improved technologies, from more efficient photovoltaics to more effective catalysts.
“Many times in a catalyst or solar cell it costs a lot of energy to just move an electron or proton,” explained theoretical chemist Thomas Miller of Caltech. “Finding ways to reduce that cost involves moving electrons and protons in a coupled fashion. Understanding that at a detailed level is a central challenge in the basic science of energy.”
Miller and colleagues are exploring the PCET puzzle through simulation on the OLCF’s Titan supercomputer. Titan is allowing the group to simulate PCET at a level that was previously impossible. The group discusses its work in recent issues of The Journal of Physical Chemistry Letters and The Journal of Chemical Physics.
“PCET is interesting for two reasons,” Miller explained. “First, it’s used a lot in nature, so it is clearly a useful strategy. And from a quantum mechanical perspective, it’s a tricky problem to understand—we need new theoretical methods to tackle it.”
PCET simulations are very complex and computationally very expensive. They must incorporate the behavior of quantum mechanical particles—electrons and protons in the case of photosynthesis—with that of particles obeying the rules of classical physics—for example, the surrounding water molecules.
Quantum particles are breathtakingly fast and seemingly bizarre; for example, they regularly cross solid barriers in a process known as quantum tunneling. Classical particles, on the other hand, are much slower and commonsensical; tunneling and other quantum behaviors are not available to them.
“In PCET you have to consider both very quantum mechanical electrons and weakly quantum mechanical protons,” Miller explained, “and you have hundreds of thousands of classical solvent molecules. How do you take that array of different behaviors and put it onto a reliable footing so you can describe what’s going on?”
Miller and colleagues have addressed this problem with a technique known as ring polymer molecular dynamics, or RPMD. Put simply, it is able to take the quantum mechanical behavior of electrons and protons and map them onto a simulation of classical molecular dynamics. The technique builds on the path integral formulation of Nobel laureate Richard Feynman, who also taught at Caltech.
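For readers who want the underlying formula, a standard textbook form of the ring-polymer Hamiltonian (quoted from the general RPMD literature, not from the Caltech papers) represents each quantum particle of mass \(m\) by \(n\) classical beads joined into a ring by harmonic springs:

\[
H_n(\mathbf{p},\mathbf{q}) = \sum_{j=1}^{n}\left[\frac{p_j^{2}}{2m} + \frac{1}{2}\, m\,\omega_n^{2}\,(q_j - q_{j+1})^{2} + V(q_j)\right],
\qquad \omega_n = \frac{n}{\beta\hbar}, \quad q_{n+1} \equiv q_1 .
\]

Running ordinary classical dynamics under \(H_n\) is what lets the beads mimic quantum effects such as tunneling while remaining compatible with classical molecular dynamics engines.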
The team has implemented RPMD within existing codes such as GROMACS (for Groningen Machine for Chemical Simulations) and DL_POLY. A typical simulation can include more than 14,000 atoms, including the reacting molecules and solvent environment.
In addition, it is able to work around the time differences among very fast quantum mechanical processes, slower classical behaviors, and the much slower reactions involved in PCET.
“These reactions can occur once per second or so,” Miller explained. “They can be very slow. If we did a direct molecular dynamics simulation, we would never see anything happen.
“So what we use are rare-event sampling strategies to efficiently sample only the reactive events and to overcome that very long timescale between reactive events, which can last up to seconds or minutes.”
Not only is this effort opening the door to new and improved technologies, but it also helped Miller secure tenure more than a year ahead of schedule.
One obvious application for the new understanding coming out of this work is improved technologies to use energy coming from the sun. Miller is working with two Caltech colleagues, experimentalists Harry Gray and Jonas Peters, to create photosynthesis without having to rely on plants.
“One of the things that people are trying to understand is how to make a purely synthetic device that does the job of photosynthesis efficiently.”
In particular, he said, the team is working to develop the ability to efficiently split water molecules to extract hydrogen. Success in this area could have significant impact on efforts to produce clean energy, because while hydrogen itself is a carbon-neutral source of practically limitless energy, it is now produced primarily from fossil fuels, which are not.
“There are experimentalists here at Caltech that are very focused on achieving the aim of finding catalysts that will split water to produce hydrogen and oxygen. To guide that effort, we need to understand the basic reactions that are governing that process.”
A. R. Menzeleev, J. S. Kretchmer, T. F. Miller, III, H. B. Gray, and J. M. Mayer, “Long-range proton-coupled electron-transfer reactions of bis(imidazole) iron tetraphenylporphyrins linked to benzoates,” Journal of Physical Chemistry Letters 4 (2013): 519−523.
J. S. Kretchmer and T. F. Miller III, “Direct simulation of proton-coupled electron transfer across multiple regimes,” Journal of Chemical Physics 138 (2013): 134109.
Researchers conduct unprecedented study on GPUs of damaging, high-frequency shaking. Visualization by Amit Choursia
Researchers conduct unprecedented study on GPUs of damaging, high-frequency shaking
When the last massive earthquake shook the San Andreas Fault in 1906—causing fires that burned down most of San Francisco and leaving half the city’s population homeless—no one would hear about “plate tectonics” for another 50 years, and the Richter scale was still a generation away. Needless to say, by today’s standards, only primitive data survive to help engineers prepare southern California for an earthquake of similar magnitude.
“We haven’t had a really big rupture since the city of Los Angeles existed,” said Thomas Jordan, Southern California Earthquake Center (SCEC) director.
Scientists predict this is just the quiet before the storm for cities like San Francisco and Los Angeles, among other regions lining the San Andreas.
“We think the San Andreas Fault is locked and loaded, and we could face an earthquake of 7.5-magnitude or bigger in the future,” Jordan said. “But the data accumulated from smaller earthquakes in southern California over the course of the last century is insufficient to predict the shaking associated with such large events.”
To prepare California for the next “big one,” SCEC joint researchers—including computational scientist Yifeng Cui of the University of California, San Diego, and geophysicist Kim Olsen of San Diego State University—are using Titan, the world’s most powerful supercomputer for open science research, to simulate earthquakes at high frequencies, producing the more detailed predictions that structural engineers need.
Titan, which is managed by the Oak Ridge Leadership Computing Facility (OLCF) located at Oak Ridge National Laboratory (ORNL), is a 27-petaflop Cray XK7 machine with a hybrid CPU/GPU architecture. GPUs, or graphics processing units, are accelerators that can rapidly perform calculation-intensive work while CPUs carry out more complex commands. The computational power of Titan enables users to produce simulations—comprising millions of interacting molecules, atoms, galaxies, or other systems difficult to manipulate in the lab—that are often the largest and most complex of their kind.
The SCEC’s high-frequency earthquakes are no exception.
“It’s a pioneering study,” Olsen said, “because nobody has really managed to get to these higher frequencies using fully physics-based models.”
Many earthquake studies hinge largely on historical and observational data, which assumes that future earthquakes will behave as they did in the past (even if the rupture site, the geological features, or the built environment is different).
“For example, there have been lots of earthquakes in Japan, so we have all this data from Japan, but analyzing this data is a difficult task because scientists and engineers preparing for earthquakes in California have to ask ‘Is Japan the same as California?’ The answer is in some ways yes, and in some ways no,” Jordan said.
The physics-based model calculates wave propagations and ground motions radiating from the San Andreas Fault through a 3-D model approximating the Earth’s crust. Essentially, the simulations unleash the laws of physics on the region’s specific geological features to improve predictive accuracy.
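The core numerical idea, stripped of AWP-ODC's full 3-D anelastic physics, is a finite-difference update of the wave equation on a grid. The Python sketch below is a one-dimensional toy under assumed parameters (the wave speed profile and source are invented for illustration), not the SCEC code:

```python
import numpy as np

# Minimal 1-D scalar wave equation solver (illustrative only; AWP-ODC solves
# the full 3-D anelastic wave equations on a much larger staggered grid).
nx, nt = 1000, 2000          # grid points, time steps
dx = 20.0                    # grid spacing in meters (the article cites 20 m resolution)
c = np.full(nx, 3000.0)      # assumed wave speed profile in m/s (hypothetical value)
dt = 0.9 * dx / c.max()      # time step chosen to satisfy the CFL stability limit

u_prev = np.zeros(nx)        # displacement at time step n-1
u_curr = np.zeros(nx)        # displacement at time step n
u_curr[nx // 2] = 1.0        # impulsive "source" in the middle of the domain

for _ in range(nt):
    lap = np.zeros(nx)
    lap[1:-1] = u_curr[2:] - 2.0 * u_curr[1:-1] + u_curr[:-2]   # second spatial derivative
    u_next = 2.0 * u_curr - u_prev + (c * dt / dx) ** 2 * lap   # leapfrog time update
    u_prev, u_curr = u_curr, u_next                             # advance one step

print("peak displacement after propagation:", np.abs(u_curr).max())
```

Each update uses only neighboring grid values, which is the property that lets codes of this kind be decomposed across many processors and GPUs.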
Seismic wave frequency, which is measured in Hertz (cycles per second), is important to engineers who are designing buildings, bridges, and other infrastructure to withstand earthquake damage. Low-frequency waves, which cycle less than once per second (1 Hertz), are easier to model, and engineers have largely been able to build in preparation for the damage caused by this kind of shaking.
“Building structures are sensitive to different frequencies,” Olsen said. “It’s mostly the big structures like highway overpasses and high-rises that are sensitive to low-frequency shaking, but smaller structures like single-family homes are sensitive to higher frequencies, even up to 10 Hertz.”
But high-frequency waves (in the 2–10 Hertz range) are more difficult to simulate than low-frequency waves, and there has been little information to give engineers on shaking up to 10 Hertz.
“The engineers have hit a wall as they try to reduce their uncertainty about how to prevent structural damage,” Jordan said. “There are more concerns than just building damage there, too. If you have a lot of high-frequency shaking it can rip apart the pipes, electrical systems, and other infrastructure in hospitals, for example. Also, very rigid structures like nuclear power plants can be sensitive to higher frequencies.”
A better understanding of the effects of high-frequency waves on critical facilities could inform disaster response in addition to structural engineering.
High-frequency waves are computationally more daunting because they move much faster through the ground. And in the case of the SCEC’s simulations on Titan, the ground is extremely detailed: representing a chunk of terrain one-fifth the size of California (including a depth of 41 kilometers) at a spatial resolution of 20 meters. The ground models include detailed 3-D structural variations—both larger features such as sedimentary basins as well as small-scale variations on the order of tens of meters—through which seismic waves must travel.
Along the San Andreas, the Earth’s surface is a mix of hard bedrock and pockets of clay and silt sands.
“The Los Angeles region, for example, sits on a big sedimentary basin that was formed over millions of years as rock eroded out of mountains and rivers, giving rise to a complex layered structure,” Jordan said.
Soft ground like Los Angeles’s sedimentary basin amplifies incoming waves, causing these areas to shake more over a longer period of time than rocky ground, which means some areas further away from the rupture site could actually experience more infrastructure damage.
The entire simulation totaled 443 billion grid points. At every point, 28 variables—including different wave velocities, stress, and anelastic wave attenuation (how waves lose energy to heat as they move through the crust)—were calculated.
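A back-of-the-envelope estimate of what that implies for storage, assuming each variable is held in single precision (4 bytes; the article does not state the precision used):

\[ 443\times10^{9}\ \text{grid points} \times 28\ \text{variables} \times 4\ \text{bytes} \approx 5.0\times10^{13}\ \text{bytes} \approx 50\ \text{terabytes of simulation state}. \]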
“High-frequency ground motion modeling is a complex problem that requires a much larger scale of computation,” Jordan said. “With the capabilities that we have on Titan, we can approach those higher frequencies.”
Back in 2010, the SCEC team used the OLCF’s 1.75-petaflop Cray XT5 Jaguar supercomputer to simulate an 8-magnitude earthquake along the San Andreas Fault. Those simulations peaked at 2 Hertz. At the time the Jaguar simulations were conducted, doubling wave frequency would have required a 16-fold increase in computational power.
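The 16-fold figure follows from the usual scaling of explicit finite-difference wave solvers: doubling the resolved frequency roughly halves the required grid spacing in each of the three spatial dimensions and halves the time step, so

\[ \text{cost} \propto \left(\frac{1}{\Delta x}\right)^{3} \times \frac{1}{\Delta t} \propto f^{3} \times f = f^{4}, \qquad 2^{4} = 16 . \]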
But on Titan in 2013, the team was able to run simulations of a 7.2-magnitude earthquake up to their goal of 10 Hertz, which can better inform performance-based building design. By modifying the Anelastic Wave Propagation code developed by Olsen, Steven Day, and Cui, known as AWP-ODC, to run on GPUs rather than only on the CPUs for which it was originally designed, the team achieved a significant speedup. The simulations ran 5.2 times faster than they would have on a comparable CPU machine without GPU accelerators.
“We redesigned the code to exploit high performance and throughput,” Cui said. “We made some changes in the communications schema and reduced the communication required between the GPUs and CPUs, and that helped speed up the code.”
The SCEC team anticipates simulations on Titan will help improve its CyberShake platform, which is an ongoing sweep of millions of earthquake simulations that model many rupture sites across California.
“Our plan is to develop the GPU codes so the codes can be migrated to the CyberShake platform,” Jordan said. “Overcoming the computational barriers associated with high frequencies is one way Titan is preparing for this progression.”
Utilizing hybrid CPU/GPU machines in the future promises to substantially reduce the computational time required for each simulation, which would enable faster analyses and hazard assessments. And it is not only processor-hours that matter but real time as well. The 2010 San Andreas Fault simulations took 24 hours to run on Jaguar, but the higher frequency, higher resolution simulations took only five and a half hours on Titan.
And considering the “big one” could shake California anytime from the next few years to the next few decades, accelerating our understanding of the potential damage is crucial to SCEC researchers.
“We don’t really know what happens in California during these massive events, since we haven’t had one for more than 100 years,” Jordan said. “And simulation is the best technique we have for learning and preparing.” —Katie Elyce Jones
The OLCF and industrial users Ford Motor Company and GE Global Research received five awards at SC13, the 25th meeting of the leading annual supercomputing conference.
Taking place from November 17–22 in Denver, SC13 brought the world’s leading supercomputing centers together with high-performance computing (HPC) users, developers, and sponsors from academia, industry, and government laboratories around the world.
The conference allowed participants to share knowledge and plans through a wide variety of venues and events, including tutorials, workshops, panel discussions, invited talks, research poster sessions, technical paper presentations, and birds-of-a-feather sessions.
For their collaboration and innovation addressing challenging industrial research problems, HPCwire, a leading publication for supercomputing news, recognized the OLCF, Ford, and GE with two Editor’s Choice Awards and one Readers’ Choice Award.
The Editor’s Choice Award for the best use of HPC in manufacturing went to GE and the OLCF for research to better understand ice formation at the atomic level. Using the OLCF’s Titan supercomputer, researchers for the first time simulated hundreds of millions of water molecules freezing in slow motion. New insights into how ice forms will help GE develop wind turbines that are better able to withstand debilitating ice accumulation in cold climates.
Ford and the OLCF were recognized with an Editor’s Choice Award for the best use of HPC in automotive research and a Readers’ Choice Award for the best HPC collaboration between government and industry. Using the OLCF’s Jaguar and Titan systems, Ford optimized for the first time the underhood airflow in automobiles to reduce cooling drag and increase fuel efficiency.
Ford and GE also received the International Data Corporation’s HPC Innovation Excellence Award for this research done on the OLCF’s supercomputing systems. This year’s award marks the second win for GE, which was recognized last year for modeling the unsteady air flows in the blade rows of turbomachines on the OLCF’s Jaguar system.
Ford and GE gained access to the OLCF’s supercomputers through a program at ORNL called Accelerating Competitiveness through Computational Excellence, or ACCEL. This industrial partnership program aims to help companies boost competitiveness through easy access to the lab’s world-class computational resources and expertise.
OLCF staff member Jack Dongarra also won the 2013 Ken Kennedy Award.
Outside of the awards, the OLCF and ORNL once again continued their leadership role at the conference.
ORNL staff helped pave the way for this exciting event through leadership roles on the organizing committee. They were also immersed in the conference itself.
For example, birds-of-a-feather sessions, also known as BOFs, allowed participants to gather and discuss topics of common interest. ORNL chaired this portion of the SC agenda, and OLCF members led several of the BOF sessions: Director of Science Jack Wells moderated the session “High-Performance Communications for High-Performance Computing,” user assistance specialist Fernanda Foertter moderated “Women in HPC Around the World,” and INCITE manager Julia White moderated “INCITE and Leadership-Class Systems.”
ORNL staff also stayed busy in workshops and tutorials, both as moderators and presenters, and presented their research findings in papers and posters.
The OLCF also participated on the exhibit floor, staffing Exhibition Booth #1327 with other Department of Energy research laboratories.
Other areas of involvement included participation in the SC Job Fair to help recruit new staff to the OLCF; a website so that those who could not attend SC could follow the events; content and staffing for the DOE booth, a collaboration among the 17 labs; live-blogging of events by OLCF staff; and the construction and hosting of the backend infrastructure for DOE’s SC website.
For more information, visit https://www.olcf.ornl.gov/sc13/
Roisin Langan, an intern at ORNL, spent last summer developing climate models that are better able to predict the variability and extremes of precipitation. Her project titled “Stochastic Representation of Unresolved Processes in Climate Models” won the best abstract award at the Research Alliance in Math and Science (RAMS) banquet, a student poster session held at ORNL on August 8.
Research in this field could result in more accurate warning systems for extreme events, such as flooding, droughts, and heat waves. In the face of extreme weather, it could help stakeholders plan economic and humanitarian relief efforts.
“This experience helped me gain invaluable networking channels, experience and instruction in effective scientific communication,” said Langan, a recent graduate of the University of California, Santa Barbara.
Richard Archibald and Kate Evans of ORNL’s Climate Change Science Institute mentored Langan, who analyzed data generated on the OLCF’s Titan supercomputer. Langan is now an intern through ORNL’s Nuclear Engineering Science Laboratory Synthesis program and hopes to enter a graduate program in computational science in fall 2014. —Jennifer Brouner