How Tom Papatheodore and the system acceptance team are preparing Frontier for prime time
The “Pioneering Frontier” series features stories profiling the many talented ORNL employees behind the construction and operation of the OLCF’s incoming exascale supercomputer, Frontier. The HPE Cray system is scheduled for delivery in 2021, with full user operations in 2022.
When Tom Papatheodore logs on, the discoveries begin.
Papatheodore, a high-performance computing engineer in the System Acceptance and User Environment Group at the Oak Ridge Leadership Computing Facility (OLCF), test-drives and delivers training for the hardware and software components that will come together as Frontier, the nation’s first exascale supercomputing system.
The HPE Cray system promises computational speeds that top 1 quintillion calculations per second when it opens to full user availability in 2022. That’s more than 10^18, or a billion-billion, operations. Researchers around the world have reserved places in line to enter their codes and run simulations for every grand challenge, from nuclear fusion to modeling the Big Bang. Making those studies happen means Papatheodore and his group stay busy putting the processors through one digital dress rehearsal after another.
“We’re the first people—I or someone in our group—on Earth to use the system, and that’s by design,” he said. “We’re ‘User Zero.’ It’s our job to prepare this whole computing ecosystem so it’s available to use that first day. What we’re trying out will change science when it’s released, and that’s part of the excitement. Frontier’s going to make a revolutionary impact—not just in how fast calculations can be performed but in what’s possible to do at all.”
Papatheodore knows from experience what a difference that kind of power can make. He also helps lead training on Summit, the OLCF’s flagship 200-petaflop IBM AC922 supercomputer, and began his scientific career as a graduate student, and later a postdoc, simulating supernova explosions on Titan, the 27-petaflop predecessor to Summit.
“Frontier’s my third-generation OLCF system so far,” he said. “Whenever we release a new system, people ask why we need a new supercomputer. The answer is in what we gain. It’s like the difference for a photographer between working with a pixelated image and high-definition. As we put together these new machines, we keep getting closer to models that represent reality in all its detail, and we’re always able to make some new kind of scientific discovery that we couldn’t before.”
To help prepare Frontier and the user community for full operations, Papatheodore and the team work with users from the OLCF’s Center for Accelerated Application Readiness and the Exascale Computing Project, offering them a sneak peek at Frontier’s capabilities via Spock. Spock is a demo version of the system built with the previous generation of the hardware and software that will be used in Frontier.
“It’s basically trying to get as close to Frontier as possible with existing hardware and software,” he said. “These are experienced users from across the various scientific domains working with community codes that can support research across a range of applications. If we can get these codes ported and running, now we have code that can run on Frontier on day one to offer the full advantage of what the system can do. We only have so many test apps we can run, so the more use cases they can stress the system with early, the better.”
Papatheodore’s early astrophysical studies on Titan trained him to think like a supercomputing user. He says his current role offers the best of all worlds—one foot in the scientific explorations of users and the other behind the scenes helping make the machine work.
“We get to help users prepare their codes to perform ground-breaking science runs and also get to try out hardware that no one else has had access to before. How can that not be exciting? But when a problem is found, we can’t just Google the answer. We need to investigate it ourselves and work together with the vendors to find a solution. It’s through this process—running our own tests, helping early users with their applications, and working with our vendor partners—that we shake out a majority of the issues that exist on brand new systems on our way to releasing a new world-class supercomputer.”
In between training and testing, Papatheodore enjoys hiking and exploring the outdoors with his wife and their two daughters. Sometimes he plays teacher to the girls. Sometimes they teach him.
“When you’re a kid, the world still seems magical because there’s so much stuff out there to discover that you still don’t know about,” he said. “That’s one of the joys of science, too, and they help me remember it.”
UT-Battelle LLC manages Oak Ridge National Laboratory for DOE’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. DOE’s Office of Science is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science.