Project Description

Benchmarks of high-performance computing (HPC) systems are important for
several reasons. They allow us to compare the performance of different
systems against one another. They can help guide the design,
development, optimization, and evaluation of scientific applications
running on these systems. Finally, they provide important guidance to
policy-makers and funding agencies in deciding what kinds of machines
are suitable for a given set of compute applications.

The High Performance Conjugate Gradient (HPCG) benchmark has recently
been proposed as a new benchmark for the evaluation of HPC systems. One
of the central goals of HPCG is to provide a more faithful measure of
the real-world performance of an HPC system.

In its reference implementation, the HPCG benchmark is very challenging
for machines, such as Titan, that derive much of their compute power
from hardware accelerators (GPUs and Intel Xeon Phi). This is because
one of the most expensive computational kernels in HPCG is a triangular
solve. In a triangular solve, each unknown can be computed only after
the unknowns it depends on are available. These data dependencies
strongly limit the amount of parallelism available in this kernel, which
makes it difficult to accelerate well on GPUs.
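
The dependency structure described above can be illustrated with a minimal Python sketch. It is not the HPCG reference code and not how GELUS works internally; the function names (`forward_substitution`, `dependency_levels`) and the row-list matrix layout are assumptions made for illustration only. The first function solves a sparse lower triangular system row by row, and the second groups rows into levels: rows in the same level do not depend on each other, so the number of levels is a rough indicator of how much of the solve must run sequentially.

```python
def forward_substitution(rows, b):
    """Solve L x = b for a sparse lower triangular matrix L.

    rows[i] is a list of (column, value) pairs for row i (hypothetical
    layout). Each row must contain its diagonal entry and only
    columns j <= i.
    """
    x = [0.0] * len(b)
    for i, row in enumerate(rows):
        acc = b[i]
        diag = None
        for j, value in row:
            if j == i:
                diag = value
            else:
                # x[j] must already be known: this is the data dependency.
                acc -= value * x[j]
        x[i] = acc / diag
    return x


def dependency_levels(rows):
    """Assign each row the earliest level at which it can be solved.

    A row with no off-diagonal entries is in level 0; otherwise its level
    is one more than the highest level among the rows it depends on.
    """
    level = [0] * len(rows)
    for i, row in enumerate(rows):
        deps = [level[j] for j, _ in row if j < i]
        level[i] = 1 + max(deps) if deps else 0
    return level


# A chain: every row depends on the previous one.
chain = [[(0, 2.0)], [(0, 1.0), (1, 2.0)], [(1, 1.0), (2, 2.0)]]
chain_x = forward_substitution(chain, [2.0, 3.0, 4.0])
chain_levels = dependency_levels(chain)

# Rows 0 and 1 are independent; only row 2 has to wait.
fan_in = [[(0, 1.0)], [(1, 1.0)], [(0, 1.0), (1, 1.0), (2, 1.0)]]
fan_in_levels = dependency_levels(fan_in)

print(chain_x)        # [1.0, 1.0, 1.5]
print(chain_levels)   # [0, 1, 2]
print(fan_in_levels)  # [0, 0, 1]
```

In the chain example every row sits in its own level, so no two rows can be processed concurrently. In the second example rows 0 and 1 share a level and could be solved in parallel. How many rows fall into each level depends entirely on the sparsity pattern of the matrix.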

By using GELUS, a new triangular solver library specifically designed
for GPUs, we have been able to significantly accelerate the HPCG
benchmark in our preliminary studies on Titan. Depending on problem
size, our GPU implementation is between three and eight times faster
than a CPU implementation (comparing one GPU against all CPU cores in a
single node on Titan). Importantly, this was achieved by calling into an
independent, off-the-shelf solver library without tuning the internals
of the library to the specifics of the benchmark or of the machine.

Allocation History

Source                                 Hours      Start Date  End Date
OLCF DIRECTOR'S DISCRETIONARY PROGRAM  3,000,000  2014-06-20  2015-06-30