Innovative model prepares students for the practical challenges of modern supercomputing.

A paper introducing a hands-on course that puts students at the helm of high-performance computing clusters was presented earlier this fall at the annual U.S. Research Software Engineer Association conference (USRSE), highlighting a growing need for practical skills to meet rising supercomputing demands.

Fernando Posada, who leads the National Center for Computational Sciences’ System Acceptance and User Environment Group at the U.S. Department of Energy’s Oak Ridge National Laboratory, introduced the course, High-Performance Computing Technologies (HPCT), in a paper titled “A Hands-On Curriculum for Training in HPC Cluster Deployment and Management.” The paper was co-authored by Richard Berger, a high-performance computing specialist at Los Alamos National Laboratory.

The paper outlines the design, methodology and results of the HPCT course, a training program focused on HPC cluster deployment and administration. The course is part of a master’s program in high-performance computing, taught annually at the International Centre for Theoretical Physics in Trieste, Italy. It provides students with practical experience in building, configuring and maintaining the computing clusters that support scientific research.

Fernando Posada leads the System Acceptance and User Environment Group in the National Center for Computational Sciences at Oak Ridge National Laboratory. Credit: Alonda Hines, ORNL, U.S. Dept. of Energy

Posada originally taught the course in person while at Temple University, but he moved it online during the COVID-19 pandemic. That transition led to the development of free instructional materials and the adoption of a flipped-classroom model, where students review content independently and apply what they have learned through exercises during class. The curriculum covers essential system management tasks, including connecting computers, installing and managing software, running simulations and monitoring system performance.

Open-access HPC training expands workforce development opportunities

All course materials are freely available to the public. The paper presents the course as a scalable and accessible model for HPC education, designed to support workforce development and expand the pipeline of skilled professionals in HPC and computational science.

“The HPCT course isn’t just about managing clusters,” Posada said. “It’s about building them from the ground up to drive scientific discovery.”

Posada adapted the course for undergraduate students who participated in ORNL’s Pathways to Computing Internship Program over the summer. The program will be offered annually as part of the Oak Ridge Computing Academy and will task participants with building, configuring and operating an HPC cluster. These training opportunities complement the Oak Ridge Leadership Computing Facility (OLCF), which is charged with helping researchers solve some of the world’s most challenging scientific problems with a combination of world-class HPC resources and expertise in scientific computing.

“Anyone can learn this. There’s no secret or no trick. While the students I work with in the master’s program are pursuing advanced degrees, we’ve also successfully introduced these concepts to undergraduate students in the lab — and they’ve been equally capable,” Posada said.

The paper also outlines a roadmap for institutions wanting to develop similar training programs and underscores the role of accessible educational resources in boosting workforce development in HPC and computational science. By exposing students to real-world challenges, the program aims to prepare a new generation of experts to meet the evolving needs of research and industry.

Attendees at USRSE25 had the opportunity to see how a well-designed curriculum can foster the skilled workforce needed to keep advanced computational infrastructure at the forefront of scientific progress.

“I’m confident that with the right support and resources, anyone can develop these skills,” Posada said. “This is not only a part of workforce development — it’s about empowering people and showing them that they can succeed if they choose to pursue it.”

The OLCF is a DOE Office of Science user facility located at ORNL.

UT-Battelle manages ORNL for DOE’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. DOE’s Office of Science is working to address some of the most pressing challenges of our time. For more information, visit energy.gov/science.

Angela Gosnell

Angela Gosnell is a science writer and communications specialist in the Oak Ridge Leadership Computing Facility. She specializes in digital communications and covers a wide range of science topics and research achievements in the lab's supercomputing facility.