Background

Scott joined ORNL in 2011. He is a Distinguished R&D Staff Member and currently serves as the Chief Technology Officer for the National Center for Computational Sciences and the Oak Ridge Leadership Computing Facility. Scott leads the System Architecture team within the Technology Integration group. His team of system architects and system programmers focuses on compute, memory, and interconnect technology trends to understand how to optimize existing systems such as Frontier and how to prepare for next-generation systems. Scott and his team are defining the requirements for OLCF-6, the system that will follow Frontier. His interests include resource heterogeneity within processors, within nodes, and across the system; processor architectures (e.g., CPU, GPU, FPGA, coarse-grained reconfigurable arrays); memory architectures and hierarchies; persistent memory; interconnects (e.g., PCIe, CXL, UCIe, InfiniBand, Ethernet); system scheduling; resilience; and system monitoring.

For the Oak Ridge Leadership Computing Facility’s (OLCF) Frontier project, Scott is the Technical Project Officer (TPO). He reprises this role for the OLCF-6 project.

Scott was closely involved in DOE’s Exascale Computing Project (ECP). He served as DOE’s Technical Representative for AMD’s FastForward-2 Node Architecture project and for AMD’s PathForward project. He also served as the lead for ECP’s HPCM and Slingshot Test and Evaluation effort.

Education

2002
University of Tennessee
Computer Science
Master of Science (M.S.)
1987
University of Tennessee
Business Administration
Bachelor of Science (B.S.)

R&D Activities Contributions

Exploiting Node-Local, Non-Volatile Memory (NVM) - Spectral is a transparently applied library for taking advantage of the Summit Burst Buffer architecture. Applications using per-process output simply write to the node-local burst…

Reliability and Resiliency - The project involves analysis of the reliability characteristics of Titan’s 299,008 CPU cores and 18,688 GPUs to understand trends in machine failure, MTBF, single bit errors,…

HPC Systems Scheduling Improvements - Resource selection can have profound impacts on the performance and reliability of applications running on the supercomputer. On Titan, there are ongoing efforts to…

DDSD: Data-Driven HPC System Design - As supercomputers grow more expensive, a more data-driven procurement strategy is needed to optimize the overall system performance while avoiding speculative resource provisioning at the…

Highlights