Event stresses accelerating scientific applications
The room was abuzz with purring laptops, clicking keyboards, fickle footsteps, and the low rumbling voices of dozens of computing experts poring over millions of lines of code.
From October 27 to 31, scientific computing teams from around the world gathered in Knoxville to participate in the Oak Ridge Leadership Computing Facility’s (OLCF) inaugural Hackathon, an OpenACC event specifically aimed at scaling scientific applications to run on heterogeneous, high-performance computing (HPC) systems such as Titan at the US Department of Energy’s Oak Ridge National Laboratory (ORNL). The OLCF is a DOE Office of Science User Facility.
OLCF’s HPC User Assistance Specialist and training coordinator Fernanda Foertter planned the OLCF Hackathon in response to user feedback from code-scaling training sessions, which repeatedly raised the same three points: (1) users tried to port their code, but got stuck; (2) they lacked adequate “people power”; and (3) they ported the code but simply didn’t get the desired performance.
“Scientific applications are huge and very complicated. And usually they’re developed by lots of people in very far away locations,” Foertter said. “So I thought, ‘Why not get the scientific developers together with the GPU developers?’”
After a two-week open call, Foertter chose six teams representing a diversity of scientific disciplines, from climate to machine learning, to participate in the event. In addition, for each team she recruited two expert mentors from Cray, NVIDIA, The Portland Group, the Swiss National Supercomputing Centre, and the OLCF’s Scientific Computing and User Assistance and Outreach Groups.
By pairing the scientists with these experts, Foertter hoped the teams would be able to accomplish two things: (1) port their code to Titan’s GPUs and (2) do so without any significant performance penalties. Those objectives were, by all accounts, met with great success.
“It was awesome! Without our mentors it would have taken us several months to do what we were able to achieve in a week,” said research scientist Rangan Sukumar of the Computational Data Analytics Group at ORNL. “Having somebody that really knows the best practices showing us how the compilers come together, demonstrating how to prestage your dependencies and link libraries, and explaining aspects of how to write high-performance code with OpenACC was invaluable. It enabled a team of researchers—some very new to OpenACC—to leverage heterogeneous compute architectures and make the best use of hardware, software, and programming practices.”
Learning the lingo
An application that runs efficiently on one processor will not necessarily run well on many. To parallelize a code (that is, to ensure it will scale to systems with heterogeneous CPU/GPU architectures like Titan), developers have access to multiple tools.
“In HPC most applications are written in Fortran, C, or C++. To minimize the changes to the overall application structure, compilers can be given hints as to what parts of the code can be offloaded to the GPU,” Foertter said. “OpenACC is one kind of compiler directives that can be used to program GPUs.”
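To illustrate what such a compiler hint looks like, here is a minimal sketch in C. It is not taken from any Hackathon code: the function name saxpy and its parameters are hypothetical, chosen only for illustration. The single directive line asks an OpenACC-capable compiler to offload the loop to an accelerator, while the loop itself stays unchanged.

```c
#include <stddef.h>

/* Hypothetical example kernel (an assumption for illustration, not
 * code from the article): computes y[i] = a * x[i] + y[i]. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    /* The directive below is the "hint": it marks the loop as safe to
     * run in parallel on the accelerator. copyin(x) sends x to the
     * device; copy(y) sends y to the device and brings it back.
     * A compiler without OpenACC support ignores the pragma and runs
     * the same loop serially on the CPU. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Built with an ordinary C compiler, the function behaves like plain serial code. Building with OpenACC enabled (for example, the -fopenacc flag in GCC or the -acc flag in NVIDIA's compilers) lets the compiler generate accelerator code from the same source, which is why the directive approach keeps changes to the overall application structure small.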
OpenACC has been a vital tool for Titan and its users for two years. It has allowed researchers to attain a high level of parallelism and performance from their code, resulting in improved time to solution and higher-fidelity simulations.
With ORNL’s recent announcement of its next HPC system, Summit—arriving in 2017—events such as the Hackathon provide tremendous opportunities for users to make the most out of today’s systems, as well as ensure seamless transitions to future generations of hybrid supercomputers.
“My impression thus far is that this event has been hugely successful,” said OpenACC President and NVIDIA’s Strategic Alliances Manager Duncan Poole during the closing ceremony. “Hackathons allow us to find key codes that are ready to take on the challenge of porting to accelerators through a very quick, focused effort. They also allow us to understand the gaps in our current implementations. Getting all that feedback in such a focused way is incredibly valuable.”
“Doing something new, like moving to a new platform, can be frustrating, and it’s nice to have someone there to help you keep moving forward, especially if your friends and colleagues are there with you,” Foertter said. “We definitely plan to have more of these here in the future, and perhaps even with our colleagues in Europe.” —Jeremy Rumsey
Oak Ridge National Laboratory is supported by the US Department of Energy’s Office of Science. The single largest supporter of basic research in the physical sciences in the United States, the Office of Science is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.