PI: Andreas Kronfeld,
Distinguished Scientist, Fermilab
In 2016, the Department of Energy’s Exascale Computing Project (ECP) set out to develop advanced software for the arrival of exascale-class supercomputers capable of a quintillion (10¹⁸) or more calculations per second. That leap meant rethinking, reinventing, and optimizing dozens of scientific applications and software tools to leverage exascale’s thousandfold increase in computing power. That time has arrived as the first DOE exascale computer, the Oak Ridge Leadership Computing Facility’s Frontier, opened to users around the world. Exascale’s New Frontier explores the applications and software technology for driving scientific discoveries in the exascale era.
The Science Challenge
One of the most challenging goals for researchers in the fields of nuclear and particle physics is to better understand the interactions between quarks and gluons — the building blocks of protons and neutrons, which make up atomic nuclei. Deciphering these fundamental nuclear interactions is key to a variety of scientific enquiries, from designing experiments to test and refine the Standard Model of Particle Physics to achieving a better understanding of dark matter and its interaction with protons and neutrons.
The theory of the strong nuclear force that forms the bonds between these particles is called quantum chromodynamics, or QCD. Making predictions based on QCD requires high-performance computing to solve its complicated mathematical equations. Computational physicists use an approach called lattice QCD, which defines quarks and gluons on a 4D space-time grid, thereby allowing researchers to run calculations on computers.
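To make the idea of a 4D space-time grid concrete, here is a minimal sketch in Python. It is an illustration only, not code from the LatticeQCD project: the function names and the tiny 4×4×4×8 grid are assumptions chosen for readability. It assumes periodic boundaries, one SU(3) gluon matrix (8 real parameters) on each of the 4 forward links per site, and a single quark flavor with 3 colors and 4 spin components per site; some quark actions, such as staggered ones, store fewer components.

```python
from itertools import product


def lattice_sites(extents):
    """Iterate over every site (x, y, z, t) of a 4D grid."""
    return product(*(range(n) for n in extents))


def neighbor(site, direction, extents):
    """Return the site one step forward in `direction`, wrapping periodically."""
    shifted = list(site)
    shifted[direction] = (shifted[direction] + 1) % extents[direction]
    return tuple(shifted)


def count_degrees_of_freedom(extents):
    """Count sites, links, and real numbers needed for gluon and quark fields.

    Assumptions (illustrative): 4 forward links per site, 8 real parameters
    per SU(3) link matrix, and one quark flavor with 3 colors x 4 spin
    components, each a complex number (2 reals).
    """
    n_sites = 1
    for n in extents:
        n_sites *= n
    n_links = 4 * n_sites
    gluon_reals = 8 * n_links
    quark_reals = 2 * 3 * 4 * n_sites
    return n_sites, n_links, gluon_reals, quark_reals


if __name__ == "__main__":
    extents = (4, 4, 4, 8)  # hypothetical, far smaller than production grids
    print(count_degrees_of_freedom(extents))
    print(neighbor((3, 0, 0, 7), 0, extents))  # wraps around in x
```

Because every count is proportional to the number of sites, doubling each of the four extents multiplies the storage and the work per sweep by 16, which is why finer or larger grids call for larger machines.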
Why Exascale?
By optimizing lattice QCD calculations to fully utilize the power of exascale supercomputers, the LatticeQCD project enables physicists to run their algorithms on a much larger scale — with some 10 billion degrees of freedom (positions, spins, momenta and other quantities) — and at much faster speeds. This capability will allow them to more realistically simulate the atomic nucleus and to obtain a deeper understanding of the fundamental organization of matter at the subatomic level.
“Whenever a new generation of computers emerges, like in the case of Frontier, they have more memory and more computing power, so we can increase the number of degrees of freedom in our calculation, and then we can run it longer,” said Andreas Kronfeld, principal investigator for the LatticeQCD project and a distinguished scientist at Fermilab. “And if you do all that, then you can make the result of the calculation more precise, and that is crucial to interpreting many experiments, particularly ones in particle physics.”
Frontier Success
The challenge problem that the ECP set for the LatticeQCD team was to carry out six computations representative of three of the quark actions commonly used by the worldwide lattice QCD community. Each of these fermion actions (the highly improved staggered quark action, the domain wall fermion action, and the Wilson-clover fermion action) has specific advantages for different problems in nuclear and high-energy physics. (Quarks are examples of fermions, which are subatomic particles with half-odd-integer spin; electrons are another example.)
The team attained its figures of merit — integral measures of scalable science performance set by the ECP — for each action. The figures of merit were determined via two benchmark components: the generation of gauge configurations (i.e., snapshots of the gluon fields) and a typical suite of measurements performed with each gauge configuration.
Furthermore, the team exceeded its stretch goal of a 50× speedup for its figures of merit on Frontier, achieving an average speedup of 70× across all the computations.
What’s Next?
The team is currently optimizing LatticeQCD to run on the upcoming Aurora supercomputer at Argonne National Laboratory. Beyond that ECP goal, they will continue the work of improving the code’s algorithms.
“All of the improvements in the software that happened during the ECP are going to make it easier to get new algorithms into production without too much rewriting of the code,” Kronfeld said. “We’re very well set up to get new algorithms into the exascale machines so that we can go far beyond the ECP factor of a 50× speedup. The sped-up calculations will yield physics results of unprecedented impact on experiments in particle physics and nuclear physics.”
Support for this research came from the ECP, a collaborative effort of the DOE Office of Science and the National Nuclear Security Administration, and from the DOE Office of Science’s Advanced Scientific Computing Research program. The OLCF is a DOE Office of Science user facility located at ORNL.
UT-Battelle LLC manages ORNL for DOE’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. The Office of Science is working to address some of the most pressing challenges of our time. For more information, please visit https://energy.gov/science.