OLCF Staff Participate in Merged Conferences for New Perspectives
Staff from the Computer Science Research Group travel to colocated conferences to facilitate development goals
Last month, staff at the Oak Ridge Leadership Computing Facility (OLCF) attended a group of events colocated at the Barcelona Supercomputing Center (BSC) in Barcelona, Spain. The events aimed to advance the development of two of the most popular high-performance computing (HPC) programming models, the Message Passing Interface (MPI) and OpenMP, and to bridge the two communities and their users.
The advent of new computing architectures brings the need for scalable, performance-portable programming. Coordinated development of these programming models can maximize their efficiency and performance so that they can exploit the parallelism available both across and within compute nodes.
“These two need to be developed together, because their interactions are becoming essential to capitalizing on the complex and powerful nodes that we are seeing today,” said Oscar Hernandez, tools developer in the Computer Science Research (CSR) Group at the OLCF. The two models are complementary: OpenMP focuses on parallelizing compute tasks inside a node (such as running programs on multiple cores at the same time, running the same task across multiple cores, and using GPU accelerators), whereas MPI typically focuses on communication across processes that run on the nodes in a system.
Both platforms, used to run scientific codes on the 27-petaflop Cray XK7 Titan and the 200-petaflop IBM AC922 Summit supercomputers at the OLCF, are key to successfully taking advantage of OLCF systems as well as other US Department of Energy (DOE) systems. The OLCF is a DOE Office of Science User Facility located at DOE’s Oak Ridge National Laboratory (ORNL).
This year vendors, developers, and application experts attended the colocated conferences to discuss successes, trends, and research related to these programming models. The events featured technical workshops and OpenMP and MPI standards committee meetings consisting of implementors, users, and vendors who are defining these community standards.
At the first event—the MPI Forum, held September 19–21 at BSC—R&D staff member Geoffroy Vallee represented ORNL and participated in standardization efforts, including potential extensions to MPI that would allow users to leverage modern hardware architectures. He also initiated the first interstandards discussion group by creating a working group that will directly interact with the OpenMP standardization committee.
At the second event—EuroMPI, held September 24–26 at BSC—Vallee presented a poster titled “Improving Support for MPI+OpenMP Applications,” which focused on coordinating MPI and OpenMP runtimes to optimize computation.
At the OpenMPCon sister conference, held September 24–25, Hernandez and Yun (Helen) He of the National Energy Research Scientific Computing Center gave a talk titled “Using MPI+OpenMP for Current and Future Architectures.” The team shared experiences and success stories from their respective computing centers, noting the scientific applications currently using MPI and OpenMP, the best practices for attaining code speedups, and the challenges the centers will need to solve to ensure portability across leadership-class systems.
Hernandez and He also helped present an “Introduction to ‘OpenMP Common Core’” tutorial that allowed them to get feedback on the most efficient ways to teach users about the new functionalities—such as tool portability—that will be available in OpenMP 5.0.
Because the conferences were held in the same location this year, both MPI and OpenMP experts had the opportunity to attend talks from each event and network with people outside of their own communities. This kind of collaboration will contribute to exascale computing efforts at ORNL, such as the Scaling OpenMP via Low Level Virtual Machine for Exascale Performance and Portability (SOLLVE) project and the Open MPI for Exascale, or OMPI-X, project led by CSR Group Leader David Bernholdt. The two projects aim to improve the platforms’ scalability and performance for future exascale architectures, such as the OLCF’s Frontier.
Another of the events at BSC was the International Workshop on OpenMP. University of Delaware (UDel) doctoral student Jose Manuel Monsalve Diaz presented a paper, “OpenMP 4.5 Validation and Verification Suite for Device Offload,” which he coauthored with OLCF CSR computer scientist Swaroop Pophale, Hernandez, Bernholdt, and others from UDel.
OpenMP 4.5 does not have a validation and verification (V&V) suite, so as part of the SOLLVE project a team at ORNL and a team led by UDel professor Sunita Chandrasekaran are working to create functionality tests that can be used to validate the correctness of a particular OpenMP implementation. As the teams develop these tests, they are running them on Summit for benchmarking.
“We have to figure out what’s not working with implementations on Summit and communicate with the vendors to resolve issues,” Pophale said. “Additionally, we have access to a number of platforms, so being able to collaborate with the OpenMP community through these events can help us understand how OpenMP is used and what cases the V&V needs to address.”
The current OpenMP functionality tests from the SOLLVE project are available here: https://bitbucket.org/crpl_cisc/sollve_vv/
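To give a sense of what a functionality test for device offload can look like, here is a minimal sketch in C. It is not taken from the SOLLVE suite: the function names, the `MAX_N` bound, and the pass/fail reporting are all hypothetical and chosen only for illustration. Each test runs a small computation inside an OpenMP 4.5 `target` region and then checks the result on the host against a value that is known in advance.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_N 4096 /* upper bound on problem size; arbitrary for this sketch */

/* Hypothetical test 1: offload a vector add and verify it on the host.
 * Returns the number of wrong elements (0 means the test passed). */
int test_target_vector_add(int n)
{
    static int a[MAX_N], b[MAX_N], c[MAX_N];
    int errors = 0;

    assert(n > 0 && n <= MAX_N);

    for (int i = 0; i < n; i++) {
        a[i] = i;
        b[i] = 2 * i;
        c[i] = -1; /* sentinel: stays -1 if the region never writes c */
    }

    /* Copy a and b to the device, compute there, copy c back to the host. */
    #pragma omp target teams distribute parallel for \
        map(to: a[0:n], b[0:n]) map(from: c[0:n])
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];

    /* Host-side check against the known answer i + 2*i = 3*i. */
    for (int i = 0; i < n; i++)
        if (c[i] != 3 * i)
            errors++;

    return errors;
}

/* Hypothetical test 2: offload a sum reduction over 0..n-1.
 * Returns 0 on success and 1 on a wrong result. */
int test_target_reduction(int n)
{
    long sum = 0;
    long expected;

    assert(n > 0 && n <= MAX_N);

    /* The scalar is mapped explicitly so the result returns to the host. */
    #pragma omp target teams distribute parallel for \
        map(tofrom: sum) reduction(+: sum)
    for (int i = 0; i < n; i++)
        sum += i;

    expected = (long)n * (n - 1) / 2; /* closed form for 0 + 1 + ... + (n-1) */
    return sum == expected ? 0 : 1;
}

/* Run both tests, print a line per test, and return the failure count. */
int run_offload_tests(void)
{
    int failures = 0;
    int r;

    r = test_target_vector_add(1024);
    printf("target vector add: %s\n", r == 0 ? "PASS" : "FAIL");
    failures += (r != 0);

    r = test_target_reduction(1024);
    printf("target reduction:  %s\n", r == 0 ? "PASS" : "FAIL");
    failures += (r != 0);

    return failures;
}
```

Calling `run_offload_tests()` from a `main` function and returning its result as the exit status would make this a standalone check. With GCC or Clang the directives are enabled by the `-fopenmp` flag. Without it, the pragmas are ignored and the loops run serially on the host, which is a convenient way to confirm that the expected values themselves are right before testing a particular implementation.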
ORNL is managed by UT-Battelle for the Department of Energy’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. DOE’s Office of Science is working to address some of the most pressing challenges of our time. For more information, please visit https://science.energy.gov.