The “Pioneering Frontier” series features stories profiling the many talented ORNL employees behind the construction and operation of the OLCF’s incoming exascale supercomputer, Frontier. The HPE Cray system is scheduled for delivery in 2021, with full user operations in 2022.
When the Oak Ridge Leadership Computing Facility’s (OLCF’s) next supercomputer, the HPE Cray EX Frontier, comes online, system users must be prepared to run their computational codes on day one. Capable of more than 1.5 exaflops, or 1.5 quintillion calculations per second, Frontier will be the first exascale system in the United States, with more computational power than any previous US system.
For one unique group of users, whose work on such systems often leads to breakthrough advancements in manufacturing the products we depend on every day, understanding how to best utilize Frontier is as crucial as the computing allocation itself.
These are the OLCF’s industrial users.
From the OLCF’s Jaguar system in 2008 to the Summit machine today to the Frontier system tomorrow, the OLCF industrial partnerships program, called Accelerating Competitiveness through Computational ExceLlence (ACCEL), attracts companies that have complex, competitively important problems and require leadership computing to make progress in solving them. Suzy Tichenor has directed the program since its inception in 2008 as part of the Computing and Computational Sciences Directorate (CCSD) at the US Department of Energy’s (DOE’s) Oak Ridge National Laboratory (ORNL).
Companies realize an array of benefits when they leverage high-performance computing (HPC) resources, such as the ability to scale their problems for greater accuracy, tackle new or more complex problems that cannot be addressed with their in-house systems or software, and even develop use cases to justify their own internal system upgrades.
“Often, results from the proof-of-principle problems that companies bring to the OLCF provide the return-on-investment justification they need to upgrade their internal systems,” Tichenor said. “Every industry user is gaining a ‘crystal ball’ look into leadership HPC systems and software as well as a head start in using them to accelerate their R&D, lower costs, and bring new products to market faster.”
Tichenor is highly experienced in helping companies understand and navigate the administrative and legal processes that all users must follow to gain access to OLCF supercomputers, but she hasn’t always worked in the DOE laboratory system. Prior to ORNL, she directed the HPC initiative at the Council on Competitiveness, a nonprofit public policy organization in Washington, DC, and before that she was in the Washington office of Cray Research (now HPE).
The initiative’s funding agencies—DOE, the Defense Advanced Research Projects Agency, and the National Science Foundation—were interested in understanding the relationships between HPC and economic competitiveness, as well as finding opportunities for public-private partnerships to expand HPC use across industry.
Under the guidance of an advisory committee composed of business executives and heads of computing from universities and national laboratories, the four-year initiative revealed that HPC was a “must-have” tool—at least for the companies that had invested in these systems. It also revealed that many firms had problems sitting on the shelf that they couldn’t address with their in-house systems. Companies felt they could benefit from time on DOE’s supercomputers just as universities did.
One of the initiative’s advisory committee members was Thomas Zacharia, who at the time was the Associate Laboratory Director of CCSD at ORNL. He took these industry comments to heart and saw a winning opportunity for the lab and for industry. In 2008, he invited Tichenor to join ORNL to help launch ACCEL.
Starting as an idea to make the lab’s supercomputing resources available to industry, ACCEL has matured over the last 12 years into a program that has helped many companies solve seemingly intractable problems. Since the ACCEL program began, 226 projects from 65 companies have run on OLCF’s systems, including on the OLCF’s current 200-petaflop IBM AC922 Summit machine. The center’s industry users include large companies, such as General Motors, GE Research, and Procter & Gamble, as well as smaller firms, such as Cascade Technologies, KatRisk, and Microsurgeonbot.
“This program is not just for big companies,” Tichenor explained. “Small companies also have problems that need leadership-scale systems.”
But getting an allocation isn’t easy. Companies first must learn how to write solid proposals that justify time on these renowned computers. And these proposals must then survive peer review. Competition is intense and global, and not every proposal receives an award.
“The companies that are successful in winning allocations are very appreciative,” Tichenor explained. “They realize it’s a unique opportunity that will really accelerate their abilities to reach their goals.”
Industry users can apply for allocations of time on the OLCF’s systems through the Director’s Discretionary (DD) program, the Advanced Scientific Computing Research Leadership Computing Challenge (ALCC), and the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program. From companies that are building the most powerful jet engines to ones that are developing new autonomous vehicles to those that are manufacturing market-leading consumer products, Tichenor works to build trust relationships to help them apply for computing time on the OLCF’s systems, including on Frontier.
But winning the award is just the first in a series of steps before logging on. Tichenor explained that the lab’s policies and processes don’t always align with industry practices, and the DOE user facility business model differs from a corporate business model.
“There are user agreements and publication guidelines and intellectual property and cybersecurity concerns that must be addressed,” Tichenor said. “Sometimes it’s like putting together a jigsaw puzzle. So, we have to build trust relationships with our industry users to get our respective puzzle pieces to fit together so that companies can operate successfully at the OLCF.”
The benefits are worth the effort not only for industry but also for the OLCF. The OLCF has a mission to deliver high-impact science and engineering results, and the industrial projects are contributing to that. Tichenor also said that as industry and the OLCF work together, they are strengthening the nation’s innovation infrastructure. Companies that apply for and earn time on the Frontier system will be the biggest contributors to this effort yet.
“If we’re not including industry, we’re missing some potential winners,” Tichenor said. “ACCEL was launched to help us attract the most cutting-edge industry projects to help meet that mission.”
Today, Tichenor spends her days guiding companies through the different steps they must complete prior to receiving access to the OLCF’s resources and linking them to expertise at the OLCF as well as to other facilities at the lab—including the Spallation Neutron Source, the Center for Nanophase Materials Sciences, and the National Transportation Research Center—for broader collaborative opportunities. And she is always scouting new companies that may be candidates for allocations at the OLCF.
In addition to her role in ACCEL, Tichenor is co-executive director of DOE’s Exascale Computing Project (ECP) Industry and Agency Council along with David Martin, manager of Industry Partnerships and Outreach at DOE’s Argonne National Laboratory. ECP was established to develop the nation’s first exascale-capable ecosystem to fulfill DOE’s scientific and national security missions. Frontier will be the OLCF’s contribution to this new ecosystem.
The ECP Industry and Agency Council provides essential two-way communication and information exchange between ECP and selected representatives from US industry and federal agencies engaged in the use and development of HPC software. Several OLCF industrial users are members, and they bring that experience to council discussions along with their business acumen for managing large-scale R&D efforts.
Tichenor said the most satisfying part of her job is helping the OLCF’s industrial users quickly and seamlessly steer through the necessary steps to start computing at the OLCF and launch collaborations with ORNL’s researchers.
“Being a part of our awesome team at the OLCF and working with industry through ACCEL and the ECP Industry and Agency Council is a privilege,” Tichenor said. “These companies are passionate about their work, and their results touch our lives in so many ways that we don’t realize. It’s very gratifying to help them gain access to our leadership systems and the expertise of our world-class researchers so they can meet their goals and dreams.”
The research was supported by DOE’s Office of Science. UT-Battelle LLC manages Oak Ridge National Laboratory for DOE’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. DOE’s Office of Science is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science.