Supercomputers monitor proteins responsible for cellular behavior

The simulation represents a subset of proteins immersed in a fluid. Image courtesy of the National Research Council of Italy.

A simulation of the internal workings of cells has reached a sustained performance of 20,000 trillion calculations per second, or 20 petaflops, on the Titan supercomputer at Oak Ridge National Laboratory (ORNL).

The achievement makes the code—which promises to advance both biology and medicine—one of the fastest in the world. It also earned the development team from the Consiglio Nazionale delle Ricerche (CNR), or National Research Council, of Italy a finalist position for the coveted Gordon Bell Prize.

Scientists are using supercomputers to simulate the hundreds of thousands of proteins that move and interact within cells. The understanding they gain sheds light on the factors that drive the critical activities of cells, which are the most basic units of life.

“We are simulating the crowded protein solution that is representative of our cell compartments,” said Simone Melchionna from CNR’s Institute for Chemical and Physical Processes. “If we want to target a specific cell for treatment, we need to understand how that cell works. Proteins are the most crucial cellular agents, and their behavior affects all activities that take place within the human body.”

Proteins perform a vast number of functions in living organisms, from catalyzing reactions to replicating DNA to transporting molecules. In the human body, proteins are responsible for most physiological processes, such as regulating heart rate and blood pressure and firing brain cells to allow movements such as blinking or swallowing.

Scientists have predicted the activity of isolated proteins, but nature isn’t an isolated system. The research team is interested in studying the cell interior, which is packed with proteins—an environment that can’t be studied in a laboratory using dilute protein solutions.

In living cells, proteins interact with other proteins and with surrounding fluids by changing their shapes, movements, and behaviors, sometimes dramatically. These dynamic interactions complicate studies that examine biological systems.

The research team developed a code called MUPHY, for MUlti PHYsics simulator, to study the two-way interactions that exist between proteins and fluids within a cell. The application accounts for the physical forces that cause protein movement, as well as the forces involved in protein–protein and protein–fluid interactions.

Using this code, researchers can learn more about how proteins move in crowded conditions and travel between cellular compartments, or organelles. Knowing the general pattern for protein movement and interaction could help scientists understand specific diseases like Alzheimer’s, which results in a clumping of proteins in the brain.

To prove that these complex and realistic cellular simulations can be performed, the team used the Oak Ridge Leadership Computing Facility’s Titan, the United States’ fastest supercomputer for open scientific research. By combining CPUs and GPUs in a hybrid architecture, Titan has a peak performance of more than 27 petaflops.

“Our code runs on 18,000 of Titan’s 18,688 nodes,” said Massimo Bernaschi, the chief technology officer at CNR’s Institute for Applied Computing. “Our simulations run fastest on GPUs, and Titan currently has the highest number of available GPUs. We made the simulation run even faster by minimizing the communication between nodes, which allowed the GPUs to run at their full speed.”

The code achieved a peak performance of almost 27.5 petaflops by using a mixed-precision calculation.
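The article does not say which quantities MUPHY keeps in single precision and which in double, so the following is only a minimal Python sketch of the general idea behind mixed-precision arithmetic, not the team's method. It assumes a hypothetical split in which inputs are stored in single precision while the running total is kept in double precision; the helper names (`to_f32`, `sum_single`, `sum_mixed`) are made up for illustration.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_single(values):
    """Store inputs AND the running total in (emulated) single precision."""
    total = 0.0
    for v in values:
        total = to_f32(total + to_f32(v))
    return total


def sum_mixed(values):
    """Hypothetical mixed scheme: single-precision inputs, double-precision total."""
    total = 0.0
    for v in values:
        total += to_f32(v)
    return total


# Illustrative workload: add 0.1 one hundred thousand times (ideal result 10000).
values = [0.1] * 100_000
reference = 10_000.0

err_single = abs(sum_single(values) - reference)
err_mixed = abs(sum_mixed(values) - reference)

print(f"single-precision accumulation error: {err_single:.6f}")
print(f"mixed-precision accumulation error:  {err_mixed:.6f}")
```

In this toy case the all-single-precision total drifts noticeably, while the mixed version stays close to the reference even though its inputs are still stored in single precision. That is the usual motivation for mixing precisions: cheaper arithmetic where rounding is harmless, higher precision where errors would accumulate.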

The code is a computational platform for multiscale simulations of real-life biofluidic problems. It can also simulate other biological phenomena, such as blood flow through the coronary arteries. It is written in the Fortran 90 and CUDA C languages and employs MPI—or message passing interface—for communication.

In this run, each node monitored the movement and behavior of a single protein, which minimized the memory requirement and helped provide a good load balance. The proteins simulated are representative of the typical size of normal-state (i.e., not diseased) proteins and are the most common ones found in yeast cells, which are similar to, though less sophisticated than, human cells.
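To see why one protein per node balances the load, consider a rough Python sketch. The article states only the one-to-one mapping; the round-robin assignment, the imbalance metric (busiest node divided by the average), and the 27,000-protein comparison case are assumptions added here for illustration.

```python
from collections import Counter


def assign_round_robin(n_proteins: int, n_nodes: int) -> list[int]:
    """Hypothetical mapping: protein p is owned by node p mod n_nodes."""
    return [p % n_nodes for p in range(n_proteins)]


def load_imbalance(owners: list[int], n_nodes: int) -> float:
    """Ratio of the busiest node's protein count to the average count.

    A value of 1.0 means every node carries the same load.
    """
    counts = Counter(owners)
    loads = [counts.get(node, 0) for node in range(n_nodes)]
    mean = sum(loads) / n_nodes
    return max(loads) / mean


N_NODES = 18_000  # node count quoted in the article

# One protein per node, as in the run described above.
balanced = load_imbalance(assign_round_robin(18_000, N_NODES), N_NODES)

# Hypothetical comparison: 27,000 proteins on the same nodes, so half of the
# nodes hold two proteins and the other half hold one.
uneven = load_imbalance(assign_round_robin(27_000, N_NODES), N_NODES)

print(f"one protein per node: imbalance = {balanced:.3f}")
print(f"27,000 proteins:      imbalance = {uneven:.3f}")
```

With one protein per node, no node waits on a busier neighbor. In the uneven case, the nodes holding a single protein would sit idle for part of each step while the others finish.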

The team includes Bernaschi, Melchionna, Mauro Bisson from CNR’s Institute for Applied Computing, and Massimiliano Fatica from the NVIDIA Corporation (United States). This code has been a finalist in two other Gordon Bell competitions, but this is the first time that the CNR team has competed in the peak performance section.

The Gordon Bell Prize, which is awarded annually by the Association for Computing Machinery (ACM), recognizes outstanding achievements in high-performance supercomputing. The prizes identify supercomputer codes that can perform on large systems like Titan with quick times to solution. Gordon Bell finalists will present their results in November at the SC13 Conference in Denver, where the winners will be named.

By Jennifer Brouner