The “Pioneering Frontier” series features stories profiling the many talented ORNL employees behind the construction and operation of the OLCF’s incoming exascale supercomputer, Frontier. The HPE Cray system is scheduled for delivery in 2021, with full user operations in 2022.
Reuben Budiardja’s thoughts are often among the stars. As a computational scientist in the Advanced Computing for the Nuclear, Particles, and Astrophysics group at the US Department of Energy’s (DOE’s) Oak Ridge National Laboratory (ORNL), he gets to pursue the mysteries of the universe while at work. It’s a dream job that combines two of his passions: high-performance computing and astrophysics.
“It’s exciting that we can do simulations to understand the structure of the universe, how galaxies form, and how supernovae are the origins of all the chemical elements in the universe,” Budiardja said. “It’s the largest of things yet it informs how life happened. The grand scale of questions about our cosmic origins, and the fact that we can use simulations to gain understanding of nature, is very intellectually satisfying.”
However, to explore those monumental questions of the cosmos and our existence in it, a lot of computational grunt work must be conducted on terra firma first. And that’s where Budiardja excels, especially in preparing codes to run on the upcoming Frontier exascale supercomputer, which is currently being installed at the Oak Ridge Leadership Computing Facility (OLCF), a DOE Office of Science user facility located at ORNL.
With its next-generation hardware, Frontier will be able to perform calculations up to 50 times faster than today’s top supercomputers, exceeding a quintillion, or 10¹⁸, calculations per second. But to fully use that power, scientific software must be optimized for Frontier’s new AMD EPYC CPUs and AMD Instinct™ GPUs long before the components are actually available, which means Budiardja has been keeping busy.
Budiardja was recently named lead for the compilers working group for Frontier, taking over the position from Oscar Hernandez, a former tools developer in ORNL’s Computer Science Research group. As a member of Frontier’s compilers team from its inception, Budiardja has been toiling to ensure that the new system’s programming environments can be effectively used by computational scientists as soon as Frontier is operational. Compilers act as a bridge between the programming languages used to create software and the instructions that the machine can understand and execute.
“We no longer write programs in what we call the low-level languages, such as assembly, because higher level programming languages help us more in terms of productivity, being able to write more physics, more science, and express ourselves more easily into what the computer needs to do,” Budiardja said. “It’s the compiler’s job to make the code into something the machines can efficiently execute.”
Consequently, the new compilers will be key to Frontier’s early success. Budiardja’s working group must act as the interface between Frontier’s users and its vendors, balancing users’ programming needs against the vendors’ technical capabilities as the compilers are developed.
“It is a big task, but I lean on a lot of my colleagues—there are a lot of smart people working on this project. A single person can’t know everything,” Budiardja said. “From our side, we have the best knowledge of what our users do on our machines. And our vendor partners have the best knowledge of what the hardware is going to look like and what they need to do to emit efficient instructions. So it’s really a collaborative effort in building this tool chain.”
As part of the large-scale effort to prepare specific codes to run on Frontier from “day one” of operations, Budiardja is also a scientific liaison in the Frontier Center for Accelerated Application Readiness program. Fittingly, he assists the team behind Cholla, a simulation code developed to help understand how galaxies form and evolve. The Cholla team is led by Evan Schneider of the University of Pittsburgh.
Budiardja’s career has been leading him to exactly this point. Originally from Indonesia, he became a graduate research assistant in computational astrophysics in the University of Tennessee (UT)/ORNL Theoretical Astrophysics group in 2002. Eight years later, he joined the Department of Physics and Astronomy at UT Knoxville as a postdoc, then became a computational astrophysicist with UT Knoxville’s National Institute for Computational Sciences in 2012. Finally, in 2017, he set his sights on his dream job.
“When I was a graduate student, I was a user at the OLCF, so I knew all about it. It was always kind of my dream job: ‘Wouldn’t that be cool—to be on site at the center and actually have access to the cutting-edge hardware?’” Budiardja said. “And to develop for future machines and help users port their applications—that has always been my aspiration, so in some ways, I’m living the dream.”
Part of that dream currently includes planning the rigorous acceptance tests that Frontier will undergo to ensure it’s operating as designed before being released to users. “Quite a bit of homework has to be done prior to actually doing the testing on the machine itself,” Budiardja said. That homework includes documenting what will be tested, designing the tests themselves, and preparing the codes that will be used for the tests. Furthermore, the OLCF’s testing infrastructure must be updated to match the Frontier hardware so that the tests can be conducted as soon as the machine is installed.
“It’s very challenging, but it’s also very exciting just knowing the kind of science you can do on Frontier that you know cannot be done anywhere else,” Budiardja said. “That’s why we’re building this machine.”
UT-Battelle LLC manages Oak Ridge National Laboratory for DOE’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. DOE’s Office of Science is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science.