This summer, renovations began on the main campus of the US Department of Energy’s (DOE’s) Oak Ridge National Laboratory (ORNL) in preparation for Summit, the lab’s next supercomputer for open science.
Summit is expected to serve as the flagship machine for the Oak Ridge Leadership Computing Facility (OLCF), a DOE Office of Science User Facility, for at least 5 years beginning in late 2017. However, the facility to house the 100-plus petaflop supercomputer is being designed to support new systems for the next two decades or more. That means ORNL staff members must take into account not just the infrastructure requirements of Summit but also the likely needs of Summit’s successors.
“Of the three major factors to take into account—power, space, and cooling—cooling is probably the hardest part,” said Jim Rogers, Director of Computing and Facilities for ORNL’s National Center for Computational Sciences. “You’re trying to build a cooling system that can adapt and be capable of supporting the needs of the next four or five systems.”
During a talk at the sixth European Workshop on HPC Center Infrastructure earlier this year in Stockholm, Sweden, Rogers shared the blueprint for the facility that will take the OLCF into the exascale era, a thousandfold increase over petascale computing.
The biggest planned change in infrastructure? A move from cold- to warm-water cooling, a shift that is projected to lower the OLCF’s cooling costs by more than half, saving nearly a million dollars per year in total operating costs.
Current OLCF machines, including the 27-petaflop Cray XK7 Titan, are cooled via chillers that lower the temperature of around 2,200 gallons of water per minute to 42 degrees Fahrenheit. Secondary loops exclusive to Titan’s cabinets carry heat away from the supercomputer to the chilled primary loop. The water then returns to the chillers to be cooled again, a cycle that has sustained OLCF systems since the early 2000s.
Although effective, cold-water cooling is expensive. About 23 percent of Titan’s $7.8 million annual operating budget goes toward cooling. An engineering team is currently evaluating the OLCF’s existing supercomputers to see if they could be cooled more efficiently. Any resulting changes, however, will be minor in comparison to the efficiency gains of the new Summit cooling system, which will have a target supply temperature nearly 30 degrees warmer than the existing system.
“In the new facility, we’re going to have a water supply temperature around 70 degrees Fahrenheit,” Rogers said. “The beauty of that is we’ll no longer need expensive chillers to cool the water. Instead of making up 23 percent of our operations costs, cooling Summit is going to drop to about 10 percent.”
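The percentages Rogers quotes can be checked with a little arithmetic. The sketch below assumes, purely for illustration, that Summit’s annual operating budget equals Titan’s $7.8 million; the article does not give Summit’s budget, so the dollar figures are indicative only.

```python
# Back-of-envelope check of the cooling-cost figures quoted in the article.
# ASSUMPTION (not from the article): Summit's annual operating budget is the
# same as Titan's, $7.8 million.

TITAN_ANNUAL_BUDGET_USD = 7_800_000   # from the article
COLD_WATER_COOLING_SHARE = 0.23       # from the article (Titan today)
WARM_WATER_COOLING_SHARE = 0.10       # from the article (Summit target)


def cooling_cost(budget_usd: float, share: float) -> float:
    """Annual cooling cost for a given budget and cooling share."""
    return budget_usd * share


cold_cost = cooling_cost(TITAN_ANNUAL_BUDGET_USD, COLD_WATER_COOLING_SHARE)
warm_cost = cooling_cost(TITAN_ANNUAL_BUDGET_USD, WARM_WATER_COOLING_SHARE)
savings = cold_cost - warm_cost
reduction = savings / cold_cost  # fraction of today's cooling cost avoided

print(f"Cold-water cooling: ${cold_cost:,.0f} per year")
print(f"Warm-water cooling: ${warm_cost:,.0f} per year")
print(f"Savings:            ${savings:,.0f} per year ({reduction:.0%})")
```

Under that assumption the saving comes to about $1 million per year, a reduction of roughly 57 percent, which is consistent with the projection that cooling costs will fall by more than half.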
Two things enable the increased efficiency: temperature-tolerant hardware and cooling technology that relies on heat’s natural tendency to flow from a hot medium to a cold one. For more than 70 percent of the year, high-efficiency cooling towers, where water is cooled through direct contact with air, will be able to handle the OLCF’s cooling needs. A single secondary loop will carry heat away from Summit through plate-and-frame heat exchangers, which use metal plates to transfer heat between two fluids. Similar technology is currently employed at the National Renewable Energy Laboratory in Golden, Colorado, and the NCAR-Wyoming Supercomputing Center in Cheyenne.
“We’re juxtaposing cold water against warm water, separated by a metal plate,” Rogers said. “The waste heat from Summit, perhaps 105 degrees Fahrenheit on average, transfers across the thin metal plate to the cooler water. Additionally, because Summit’s cold-water supply temperature is only about 70 degrees, it’s a whole lot easier to maintain that temperature in our environmental conditions.”
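To give a sense of scale for the temperatures Rogers mentions, the heat a water loop carries away can be estimated from the standard sensible-heat relation Q = m × c × ΔT. The sketch below is a rough estimate, not a description of the OLCF design: the 100-gallon-per-minute flow is a hypothetical value chosen for illustration (the article does not give Summit’s flow rate), and the water properties are approximate textbook constants.

```python
# Rough estimate of heat carried by a water loop: Q = m_dot * c_p * delta_T.
# The constants are approximate properties of water near room temperature.

LITERS_PER_US_GALLON = 3.785411784
WATER_DENSITY_KG_PER_L = 0.997        # approximate
WATER_SPECIFIC_HEAT_J_PER_KG_K = 4186  # approximate


def heat_removed_watts(flow_gpm: float, supply_f: float, return_f: float) -> float:
    """Heat carried away (watts) by water entering at supply_f and leaving at return_f."""
    # Mass flow in kg/s from US gallons per minute.
    mass_flow_kg_s = flow_gpm * LITERS_PER_US_GALLON * WATER_DENSITY_KG_PER_L / 60.0
    # A temperature difference in Fahrenheit converts to kelvin by a factor of 5/9.
    delta_t_k = (return_f - supply_f) * 5.0 / 9.0
    return mass_flow_kg_s * WATER_SPECIFIC_HEAT_J_PER_KG_K * delta_t_k


# Supply (70 F) and return (105 F) temperatures are from the article;
# the flow rate is a HYPOTHETICAL value used only to illustrate the formula.
HYPOTHETICAL_FLOW_GPM = 100.0
heat_w = heat_removed_watts(HYPOTHETICAL_FLOW_GPM, supply_f=70.0, return_f=105.0)
print(f"{heat_w / 1000:.0f} kW removed per {HYPOTHETICAL_FLOW_GPM:.0f} gpm")
```

With a 35-degree rise between supply and return, each 100 gallons per minute carries away roughly half a megawatt of heat in this estimate.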
On hot and humid days in East Tennessee and under heavy compute loads, Summit’s cooling system may need an extra chill. Rogers said the chilled water loop currently in place will trim off a few degrees in those instances. It’s a measure he estimates will be necessary only about 30 percent of the year, with much of that time requiring very little help from the existing chilled water plant.
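The split between tower-only cooling and chilled-water trim can be pictured as a simple decision rule. The sketch below is illustrative only and is not ORNL’s control logic. It assumes the towers can cool water to within a fixed “approach” of the outdoor wet-bulb temperature and that the heat exchanger adds a further small offset; both approach values and all names in the code are hypothetical.

```python
from dataclasses import dataclass

SUPPLY_SETPOINT_F = 70.0     # target supply temperature, from the article
TOWER_APPROACH_F = 7.0       # HYPOTHETICAL: tower outlet minus wet-bulb temperature
EXCHANGER_APPROACH_F = 3.0   # HYPOTHETICAL: loss across the plate heat exchanger


@dataclass
class CoolingPlan:
    mode: str               # "towers_only" or "towers_plus_chilled_trim"
    trim_degrees_f: float   # degrees the chilled-water loop must remove


def plan_cooling(wet_bulb_f: float) -> CoolingPlan:
    """Pick a cooling mode for a given outdoor wet-bulb temperature (Fahrenheit)."""
    # Coldest supply water the towers and exchanger can deliver on their own.
    achievable_supply_f = wet_bulb_f + TOWER_APPROACH_F + EXCHANGER_APPROACH_F
    if achievable_supply_f <= SUPPLY_SETPOINT_F:
        return CoolingPlan("towers_only", 0.0)
    # Otherwise the existing chilled-water loop trims off the difference.
    return CoolingPlan("towers_plus_chilled_trim", achievable_supply_f - SUPPLY_SETPOINT_F)


print(plan_cooling(55.0))  # a mild day
print(plan_cooling(65.0))  # a hot, humid day
```

In this toy model the chilled-water plant only makes up the gap between what the towers can deliver and the 70-degree setpoint, which mirrors Rogers’s point that the existing plant will usually need to trim off just a few degrees.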
OLCF staff will get a chance to test the planned cooling system in 2016, when IBM delivers a test cabinet similar to those planned for Summit.
“It will allow us to work out any issues we may encounter with the control systems,” Rogers said. “From an operations perspective, it is a critical step for mitigating risk with the larger system.”
Oak Ridge National Laboratory is supported by the US Department of Energy’s Office of Science. The single largest supporter of basic research in the physical sciences in the United States, the Office of Science is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.