Flatiron Institute and Center for Astrophysics researchers published data from the largest known simulations of dark matter, showing its possible distribution from the big bang to the present

A team led by researchers at the Flatiron Institute and the Center for Astrophysics | Harvard & Smithsonian has published the largest amount of data for a publication to date on the Constellation service at the Oak Ridge Leadership Computing Facility (OLCF). The data includes models, collectively dubbed the AbacusSummit suite, that show different distributions of matter in the universe at different resolutions and under different sets of cosmological parameters. Run by the team on the OLCF’s 200-petaflop Summit supercomputer, the suite comprises a whopping 2 petabytes of data, equivalent to more than 6 years of continuous 4K video. The models were, individually, among the largest ever run and, in total, the largest set ever produced.

“We simulated the gravitational evolution of dark matter from the big bang to the present day, which tells us where galaxies will end up,” Flatiron Research Fellow Lehman Garrison said. Dark matter is a hypothetical form of matter that is thought to account for most of the matter in the universe and more than a quarter of the universe’s energy and mass.

Garrison, along with graduate student Nina Maksimova and Professor Daniel Eisenstein, both of the Center for Astrophysics, ran more than 160 simulations containing nearly 60 trillion particles in total on Summit. They then whittled their data down from nearly 60 petabytes to 2 petabytes. They published only the crucial details using the OLCF’s Constellation service in association with their paper in Monthly Notices of the Royal Astronomical Society. A digital object identifier (DOI)–based science network, Constellation provides an open-access hub for the storage of valuable supercomputing data. This data can then be used by scientists around the world to look at how the models stack up against their own studies.

“This is larger than anything we have ever published,” said Ross Miller, systems integration programmer in the OLCF’s Technology Integration group. “It’s 20 times more data than the 100-terabyte DOIs that were previously the biggest datasets we had on Constellation.”

The AbacusSummit suite comprises hundreds of simulations of how gravity shaped the distribution of dark matter throughout the universe. Here, a snapshot of one of the simulations is shown at various zoom scales: 10 billion light-years across, 1.2 billion light-years across and 100 million light-years across. The simulation replicates the large-scale structures of our universe, such as the cosmic web and colossal clusters of galaxies.
Credit: The AbacusSummit Team; layout and design by Lucy Reading-Ikkanda

Miller worked with Mitch Griffith and the High Performance Storage System (HPSS) administrators to rearrange existing data in HPSS so that Garrison’s data could be moved into the correct location.

“HPSS handles multiple petabytes without a problem,” Miller said. “And we want people to know that we can handle data—and a lot of it.”

Garrison’s team is currently analyzing the AbacusSummit suite, which they hope can be compared with data from the Dark Energy Spectroscopic Instrument (DESI). DESI aims to measure dark energy’s effect on the expansion of the universe. Garrison is thankful that the OLCF has space for datasets as massive as the ones his team is producing because hosting them there gives other researchers the ability to replicate his studies.

“The fact that ORNL said they could host this and do it forever is what will make our analysis reproducible and persistent,” Garrison said.

Although individual scientists will use the team’s other portal at the National Energy Research Scientific Computing Center, large institutions and telescope collaborations will likely use Constellation to access the team’s data, Garrison said.

“For bulk transfers of the whole suite or large parts of it, Constellation really shines,” he said.

The compute time on Summit was supported by the Office of Advanced Scientific Computing Research (ASCR) as part of an ASCR Leadership Computing Challenge (ALCC) allocation. Lawrence Berkeley National Laboratory is the lead institution on the DESI project.

Related Publications: Garrison, Lehman H., Nina A. Maksimova, Daniel J. Eisenstein, Boryana Hadzhiyska, Sownak Bose, and Thomas P. Satterthwaite. “AbacusSummit: Cosmological N-body Halos, Light Cones, Particles, Merger Trees, Initial Conditions, and Power Spectra.” United States. https://doi.org/10.13139/OLCF/1811689. https://www.osti.gov/servlets/purl/1811689.

Maksimova, Nina A., Lehman H. Garrison, Daniel J. Eisenstein, Boryana Hadzhiyska, Sownak Bose, and Thomas P. Satterthwaite. “AbacusSummit: A Massive Set of High-Accuracy, High-Resolution N-Body Simulations.” Monthly Notices of the Royal Astronomical Society (2021): stab2484. https://doi.org/10.1093/mnras/stab2484.

The research was supported by DOE’s Office of Science. UT-Battelle LLC manages ORNL for DOE’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. DOE’s Office of Science is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science.