Unique challenges lead to internal collaboration and a brand-new piece of equipment
One of the challenges of cutting-edge equipment at a cutting-edge laboratory is that when a problem arises, the answers are never straightforward.
As the 2021 delivery date for Frontier — ORNL’s next supercomputer — looms closer and closer, the Oak Ridge Leadership Computing Facility is undergoing some major remodeling, including the removal of about 18,000 square feet of equipment.
One of the first systems to go was Titan, the OLCF’s groundbreaking Cray XK7 supercomputer, which was decommissioned in August after an impressive seven years in service. Although removing Titan freed up about 4,300 square feet of floor space in the data center, it was only the beginning of the long and arduous task of clearing out the future home of Frontier.
More than a dozen disk storage cabinets also needed to be removed from the data center that previously housed Titan, and with each cabinet weighing in at about 3,500 pounds, the size of the systems alone made this a daunting task. Complicating the project further were the 860 storage disks contained in each cabinet. The disks are incredibly sensitive to movement: even the smallest shock, such as a cart’s wheels bumping over the tile floor, could damage one of them and cause a loss of data.
With each delicate cabinet valued at between $500,000 and $1 million, not including the invaluable data stored within, it was imperative that ORNL staff find a way to move the equipment as carefully as possible.
One method would be to remove each storage disk from the cabinets by hand, almost 13,000 in total, before moving the empty cabinets.
“If we chose to depopulate every drawer in order to move the cabinets, each disk would have to be returned specifically to the slot that it came out of,” says Paul Abston, data center manager. “Each disk has a home, and it has to go exactly back into that home. That would mean manually depopulating each drawer and placing the disks into a foam array to keep the order correct, and then once the cabinet is moved, you have to take the correct piece of foam for that enclosure and repopulate those.”
Not only would removing each storage disk by hand be a painstaking and time-consuming process, it would also put the sensitive technology at a high risk of getting damaged.
“Each time you handle the disks, since they’re delicate to shock, you take a chance on losing some of them in the process,” Paul says. “So, we decided it was better to move everything all together.”
With the traditional method of moving individual nodes out of the question, Paul and his team were entering uncharted territory.
“We needed to be resourceful, so I engaged our rigging supervisor, J.P. Biondo,” Paul says. “We discussed using air pads to float the cabinets, but we decided that the air pad technology wouldn’t work based on how many seams and thresholds were in the floors. Every time you go across one of those, you’re going to lose air pressure and your cabinet will set down. Moreover, you would have to be able to continually move your air supply tank and compressor with the cabinets.”
Without the air pads, the cabinets would have to be pushed on a dolly, but the unique challenges posed by the lab environment and the shock-sensitive technology meant that Paul and his team had to take an unusual approach to the cart’s design.
“We talked about pneumatic wheels that would absorb the impact when you crossed thresholds and things like that,” J.P. says. “The problem was that pneumatic tires would lift the cabinets too high, making them even more top-heavy than they already were. So I kept searching for impact-resistant wheels, and I eventually ran across these wheels that were made specifically for medical equipment.”
The medical-grade wheels seemed to be the perfect solution, but the team ran into yet another problem. These new wheels were much smaller than the pneumatic alternative and, as a result, could only support a fraction of the weight. The solution: upgrading the cart from four wheels to six.
Finally, in mid-September, the team had a dolly custom-designed for the problem at hand, constructed almost entirely out of parts already at ORNL.
“There was just nothing out there available that we could use for this project,” J.P. says. “And that’s why we went with our own design and fabricated everything ourselves.”
In contrast to many ORNL construction projects, the equipment move was designed, tested and carried out entirely by Lab staff. Using internal Lab resources instead of outside contractors meant that the workers involved in the project were intimately familiar with the equipment, the surrounding area and the safety culture at ORNL.
“It’s very important as a Laboratory to use the resources we have internally,” Paul says. “Whether you want to know about isotopes, magnetic fields, neutron science, nuclear medicine or computer science, there’s likely someone at the Lab that can be a resource for that. That’s the same way we need to look at the craft organizations. They are a resource here to support research. So, we had a unique problem, and we knew we had unique expertise here at the Lab to help us solve that problem.”
“It’s definitely preferable to try to solve problems internally with people who know what it’s like to work in a lab-type setting,” J.P. says. “When you start dealing with really sensitive equipment like supercomputers, it’s important to have a team that knows how to handle projects safely and carefully. Most people don’t deal with or don’t understand how valuable and unique the equipment here is, but I feel confident in my supervisors and our craft workers.”
After months of extensive testing, the Facilities and Operations team was left with a brand-new piece of equipment, custom designed for the task at hand.
“At the end of the day, we had a unique problem and we came together to find a unique solution,” Paul says. “We were able to draw on the expertise of a lot of different people to design, test, and finally implement this dolly, and that is something we did really well.”