Data Transfer Service Upgrades to Improve Efficiency and Performance
Changes to include more nodes and partitioning, faster network connections
Data transfer can be tricky—especially when it involves moving large datasets at a rate of dozens of gigabits per second to and from one of the world’s most powerful supercomputers.
Staff members at the Oak Ridge Leadership Computing Facility (OLCF), a US Department of Energy (DOE) Office of Science User Facility located at DOE’s Oak Ridge National Laboratory (ORNL), are upgrading the center’s data transfer service to increase performance and speed while minimizing the risk of system outages.
Upgrades to the OLCF’s data transfer nodes (DTNs), a set of highly specialized servers that move data to, from, and within the OLCF network, include a major hardware refresh and the addition of 44 nodes. The changes will increase the DTNs’ Ethernet capabilities from 10-gigabit to 40-gigabit connectivity, enabling faster transfers without any extra steps for the user. Additionally, the new DTNs will be partitioned, allowing users greater flexibility when carrying out data transfer tasks.
Jason Anderson, project colead and administrator for the OLCF’s High-Performance Storage System, said of the upgrade, “The DTN upgrade will provide our users more transfer service capacity and better performance. In the long term it will also help solidify file transfer best practices within the OLCF as well as to other DOE laboratories, universities, and computing facilities.”
The upgrade is partly a response to increased use of DTNs for Globus-enabled GridFTP data transfers, which are highly tuned, parallelized transfers originally used for moving large datasets to other facilities. More recently, users have adopted Globus services for transfers within the OLCF as well, a practice that places more demand on DTN capacity and performance.
Anderson said the staff will monitor DTN activity to establish best practices for users. “We’ll introduce this, watch how it’s adopted, talk closely with the OLCF user council about the feedback they’re getting from our users, look at what people are trying to do on the system, and then phase toward what the user community needs and support that as the best practice for data transfer at the OLCF,” he said.
Anderson also said users will benefit from the intuitive Globus interface—especially users who access the center remotely. “The Globus interface is a web-based, point-and-click, drag-and-drop, move-your-data kind of interface,” he said. “Globus will be wonderful for users who want to drag, drop, and get an email when it’s finished.”
Daniel Pelfrey, high-performance computing systems network administrator, has been working on the networking portion of the upgrades and plans to implement a 100-gigabit path to ORNL’s network, which will allow for more parallel transfers to and from the OLCF. Pelfrey noted that the team hopes the network upgrades, combined with the DTN work, will in the coming months allow individual users to move a petabyte per week between ORNL and other DOE labs while limiting the impact on other users’ data transfers.
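To put the petabyte-per-week goal in perspective, the short sketch below converts it to a sustained rate in gigabits per second and compares that figure with the link speeds mentioned in this article. It is a back-of-the-envelope illustration only: the function name `sustained_gbps` is hypothetical, a petabyte is assumed to be the decimal 10^15 bytes, and protocol overhead, storage limits, and competing traffic are ignored.

```python
# Back-of-the-envelope check, not an OLCF tool.
# Assumption: 1 petabyte = 10**15 bytes (decimal), transferred evenly over the week.

SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds


def sustained_gbps(petabytes: float, seconds: float = SECONDS_PER_WEEK) -> float:
    """Average rate in gigabits per second needed to move `petabytes` in `seconds`."""
    bits = petabytes * 1e15 * 8  # bytes -> bits
    return bits / seconds / 1e9  # bits per second -> gigabits per second


rate = sustained_gbps(1.0)
print(f"1 PB/week needs about {rate:.1f} Gb/s sustained")

# Compare with the link speeds named in the article.
for link_gbps in (10, 40, 100):
    print(f"{link_gbps}-gigabit link: {rate / link_gbps:.0%} of line rate")
```

Under these assumptions, a petabyte per week works out to roughly 13 gigabits per second around the clock. That is more than a single 10-gigabit connection can carry even in the ideal case, which is consistent with the move to 40-gigabit DTN connectivity and a 100-gigabit path.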
OLCF staff members are working toward having the DTN upgrades rolled out in the fall. They hope that once the upgrades are in place, user feedback will help them define best practices as well as indicate that users welcome a more efficient way of transferring data. “At the end of the day, my job is to enable science,” Anderson said. “I want to make sure the tools we provide are doing just that.”
Oak Ridge National Laboratory is supported by the US Department of Energy’s Office of Science. The single largest supporter of basic research in the physical sciences in the United States, the Office of Science is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.