Combination of speed and energy efficiency is real gain
When the Oak Ridge Leadership Computing Facility (OLCF) replaced its Jaguar supercomputer with Titan, not only did it expand its computing speed tenfold, it also saved on the electric bill.
Titan’s combination of speed and energy efficiency is a real gain, said Jim Rogers, the OLCF’s director of operations.
“It’s a totally different machine in the same physical cabinet,” Rogers explained. “It’s a powerful machine, and it’s a power-efficient machine.”
Jaguar’s run, in its configuration as a Cray XT5 supercomputer, came to an end at the close of 2011. In its last 2 years of production, Jaguar’s typical 5 million watts of instantaneous consumption translated into an average of 3.7 million kilowatt-hours used per month.
By comparison, Titan, which entered full production in June 2013, has demonstrated a typical instantaneous consumption of just under 5 million watts, or an average of 3.6 million kilowatt-hours per month.
In short, Titan uses roughly the same amount of electricity to run its computations as Jaguar did, but at up to tenfold the performance. The new machine gets its extra bang for the buck, Rogers said, from the configuration of its compute nodes.
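The link between the wattage and kilowatt-hour figures above can be checked with a little arithmetic. The sketch below is illustrative only: it assumes an average month of 730 hours (8,760 hours per year divided by 12) and treats "up to tenfold" as an upper bound on the performance ratio. The function and variable names are made up for this example.

```python
# Illustrative arithmetic only; the 730-hour month and the 10x ratio are assumptions.
HOURS_PER_MONTH = 24 * 365 / 12  # 730 hours in an average month


def monthly_kwh(avg_watts: float) -> float:
    """Energy used in an average month at a constant draw, in kilowatt-hours."""
    return avg_watts / 1000 * HOURS_PER_MONTH


def implied_avg_watts(kwh_per_month: float) -> float:
    """Average draw in watts implied by a monthly energy total."""
    return kwh_per_month * 1000 / HOURS_PER_MONTH


# Monthly averages reported in the article.
jaguar_kwh = 3.7e6
titan_kwh = 3.6e6

# A constant 5 million watts corresponds to 3.65 million kWh per month,
# which sits between the two reported averages.
five_mw_kwh = monthly_kwh(5e6)

jaguar_watts = implied_avg_watts(jaguar_kwh)  # slightly above 5 million watts
titan_watts = implied_avg_watts(titan_kwh)    # just under 5 million watts

# Upper bound on work done per kWh relative to Jaguar, taking "up to tenfold" at face value.
performance_ratio = 10.0
work_per_kwh_gain = performance_ratio * jaguar_kwh / titan_kwh

print(f"5 MW constant draw: {five_mw_kwh:,.0f} kWh per month")
print(f"Jaguar implied average draw: {jaguar_watts / 1e6:.2f} MW")
print(f"Titan implied average draw: {titan_watts / 1e6:.2f} MW")
print(f"Work per kWh vs. Jaguar (upper bound): {work_per_kwh_gain:.1f}x")
```

Under these assumptions the reported figures are mutually consistent: an average draw just under 5 million watts yields about 3.6 million kilowatt-hours per month, and one slightly above it yields about 3.7 million.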
Each of Jaguar’s Cray XT5 compute nodes featured two 12-core AMD Opteron processors, whereas each compute node in Titan pairs one 16-core AMD Opteron processor with one NVIDIA Kepler GPU. This hybrid architecture is not only computationally faster but also much more energy efficient.
“The NVIDIA Kepler GPU has very sophisticated power management features,” Rogers explained. “Each GPU can identify new work, schedule it, and change its power-state accordingly. The GPU is ready to go in an instant, ramping both processor frequency and power budget. However, when it’s idle, it almost powers itself down—consuming less than 20 watts. The beautiful thing is it can switch between those states seamlessly based on workload. The net effect is that we actually use less energy to get a lot more work done.”
In fact, Rogers’s team is tracking statistics to show just how efficient the GPUs are at performing their work. “GPU-enabled applications can typically reduce their run time for the same problem by more than 50 percent. We’ve seen some applications where the speedup is as much as seven times the nonaccelerated case.
“We’re operating more compute cores, with up to 10 times the capability, in the same or smaller power profile and the same footprint,” he said. “The hybrid architecture in Titan, with its very efficient GPUs, is the dominant driver for energy consumption and management.”
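Rogers's run-time figures can be restated as speedups, and the reverse. The sketch below converts between the two and estimates energy per job under a simplifying assumption that the article does not state: that the accelerated and nonaccelerated runs draw the same average power. All function names and the `power_ratio` parameter are hypothetical.

```python
# Illustrative sketch; the equal-power default is an assumption, not a measured value.

def speedup_from_reduction(reduction: float) -> float:
    """Speedup factor for a fractional run-time reduction (0.5 means 50 percent less time)."""
    return 1.0 / (1.0 - reduction)


def reduction_from_speedup(speedup: float) -> float:
    """Fractional run-time reduction for a given speedup factor."""
    return 1.0 - 1.0 / speedup


def relative_energy(speedup: float, power_ratio: float = 1.0) -> float:
    """Energy per job relative to the nonaccelerated run.

    power_ratio is accelerated average power divided by nonaccelerated average
    power; 1.0 (equal power) is assumed here purely for illustration.
    """
    return power_ratio / speedup


print(f"50% less run time = {speedup_from_reduction(0.5):.1f}x speedup")
print(f"7x speedup = {reduction_from_speedup(7.0):.1%} less run time")
print(f"Energy per job at 7x, equal power: {relative_energy(7.0):.1%} of the original")
```

On that equal-power assumption, halving the run time halves the energy needed to solve the same problem, which is the sense in which less energy gets more work done.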