Advanced Computing Ecosystem (ACE)
The Advanced Computing Ecosystem (ACE) testbed is a unique OLCF capability that provides a centralized, sandboxed area for deploying heterogeneous computing and data resources and for evaluating diverse workloads across a broad spectrum of system architectures. ACE is designed to fuel the cycle of productization of new HPC technologies, as applicable to the OLCF and DOE missions. ACE is an open-access environment made up of production-capable HPC resources, and it allows researchers and HPC system architects to assess existing and emerging technologies more freely, without the limitations of a production environment. Topics of interest include:
- IRI workflows and patterns (e.g., time-sensitive, data-intensive)
- Emerging compute architectures and techniques for HPC (e.g., Arm, AI appliances, reconfigurable hardware)
- Emerging storage architectures and techniques for HPC (e.g., object storage)
- Emerging network architectures and techniques (e.g., DPUs)
- Cloudification of traditional HPC architectures (e.g., multi-tenancy, preemptible queues)
Supported Science Workflows
Gamma Ray Energy Tracking Array (GRETA)
SLAC
Compute Resources
Defiant
– 36 nodes: AMD EPYC CPU, 4 AMD MI100 GPUs, Slingshot 10 networking
– Former Frontier early access system
Wombat
– AArch64 testbed with multiple node configurations
– Fujitsu A64FX CPU, EDR IB (8 nodes)
– Ampere Computing Altra CPU, 2 NVIDIA Ampere GPUs, 2 BlueField-2 DPUs (8 nodes)
Holly
– Single Supermicro server with 8 NVIDIA H100 GPUs
GraphCore
– Dedicated AI appliance in a Bow Pod16 configuration
Storage Resources
Polis – Lustre
– ~1.6 PB
– Primarily spinning disk with some flash, connected to Defiant
VastData
– ~600 TB
– NFS-over-RDMA storage appliance
– Flash, connected to the IB fabric
DAOS
– 4 servers with ~30 TB flash each and 4 HDR IB connections