OLCF partners with Fermilab to integrate new data transfer technology

Researchers at US Department of Energy (DOE) national laboratories frequently transfer files between experimental and observational facilities, their home institutions, and computational facilities like the Oak Ridge Leadership Computing Facility (OLCF), a DOE Office of Science User Facility located at DOE’s Oak Ridge National Laboratory (ORNL).

Moving large files between two or more sites can be difficult and time-consuming, but the Adaptive Input/Output System (ADIOS), an open-source I/O framework built by OLCF staff, allows researchers to reduce time spent on the complex task of moving data on and off high-performance computing resources. Now the ADIOS developers have partnered with a team at DOE’s Fermi National Accelerator Laboratory (Fermilab) to integrate Fermilab’s Multicore-Aware Data Transfer Middleware (MDTM) into ADIOS. This integration will enable more efficient data movement to and from Titan, the OLCF’s Cray XK7 supercomputer.

MDTM is a high-performance data transfer tool that efficiently uses multicore hardware. Integrating MDTM into ADIOS will enable users with allocation hours on Titan to transfer data while it is still in memory on compute nodes. Researchers will no longer have to go through the storage system and wait to read their entire dataset into memory before they can access the information they need. The integration will also allow data to be prioritized by level of importance, so that the most important pieces are streamed first.

Wenji Wu, principal investigator for the MDTM project, said the new middleware is a powerful tool because data movement is now an essential function for scientific discovery, particularly within the big data environment. ADIOS and MDTM complement each other in that ADIOS supports efficient and intelligent data I/O, whereas MDTM provides reliable, high-performance data transfer.

The project brought ORNL and Fermilab together, a collaboration that MDTM software developer and networking researcher Liang Zhang said has long been sought. ADIOS developer and former ORNL staffer Gary Liu said ADIOS was the perfect candidate for the integration because of its widespread use. “ADIOS has been around for a long time and is known to be fast, especially on OLCF machines,” he said. “We have a number of applications using it already, and it provides fast, scalable performance. It works really well.”

Liu said preliminary testing showed a 15 percent performance gain for wide-area network (WAN) I/O, and the team expects even greater gains in the future.

Scott Klasky, leader of the ADIOS framework and group leader for ORNL’s Scientific Data Group, believes the integration will transform the way data is transferred at the OLCF. “We don’t just want to move and process raw data; we want to reduce, move, and understand information,” he said. “Our job is to be able to go from rich data to the knowledge that’s then saved, processed, and archived. And this knowledge needs to be able to be retrieved efficiently and queried to help productivity.”

Oak Ridge National Laboratory is supported by the US Department of Energy’s Office of Science. The single largest supporter of basic research in the physical sciences in the United States, the Office of Science is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.