GPU Languages/Frameworks

For complete control over the GPU, Titan supports CUDA C, CUDA Fortran, and OpenCL. These languages and language extensions allow explicit control, but they are generally more cumbersome than directive-based approaches, and code written in them must be maintained to keep up with the latest performance guidelines. Substantial changes to code structure may be needed, and an in-depth knowledge of the underlying hardware is often necessary for best performance.


NVIDIA CUDA C

NVIDIA’s CUDA C is largely responsible for launching GPU computing to the forefront of HPC. With a few minimal additions to the C programming language, NVIDIA has allowed low-level control of the GPU without having to deal directly with a driver-level API.


To set up the CUDA environment, load the cudatoolkit module:

$ module load cudatoolkit

This module provides access to NVIDIA-supplied utilities such as the nvcc compiler, the CUDA visual profiler (computeprof), cuda-gdb, and cuda-memcheck. It also sets the environment variable CUDAROOT, which provides easy access to NVIDIA GPU libraries such as cuBLAS and cuFFT.

To compile we use the NVIDIA CUDA compiler, nvcc.

$ nvcc source.cu
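
As an illustration of what a source.cu file might contain, the following is a minimal vector-addition sketch. It is not the OLCF tutorial code; the names vecAdd, run_vec_add, and check are hypothetical and chosen only for this example. It shows the usual pattern of allocating device memory, copying inputs to the GPU, launching a kernel with one thread per element, and copying the result back.

```cuda
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Kernel: each thread adds one pair of elements.
__global__ void vecAdd(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {            // guard threads past the end of the arrays
        c[i] = a[i] + b[i];
    }
}

// Hypothetical helper: abort with a readable message if a runtime call fails.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Adds two n-element vectors on the GPU and returns the number of wrong results.
int run_vec_add(int n)
{
    size_t bytes = (size_t)n * sizeof(float);
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    if (!h_a || !h_b || !h_c) {
        fprintf(stderr, "host allocation failed\n");
        exit(EXIT_FAILURE);
    }
    for (int i = 0; i < n; ++i) {
        h_a[i] = (float)i;
        h_b[i] = 2.0f * i;
    }

    float *d_a = NULL, *d_b = NULL, *d_c = NULL;
    check(cudaMalloc((void **)&d_a, bytes), "cudaMalloc d_a");
    check(cudaMalloc((void **)&d_b, bytes), "cudaMalloc d_b");
    check(cudaMalloc((void **)&d_c, bytes), "cudaMalloc d_c");
    check(cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice), "copy a");
    check(cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice), "copy b");

    int threads = 256;                          // threads per block
    int blocks = (n + threads - 1) / threads;   // enough blocks to cover n
    vecAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);
    check(cudaGetLastError(), "kernel launch");
    // This blocking copy also waits for the kernel to finish.
    check(cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost), "copy c");

    int errors = 0;
    for (int i = 0; i < n; ++i) {
        if (fabsf(h_c[i] - 3.0f * i) > 1e-5f) ++errors;
    }

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return errors;
}

int main()
{
    int errors = run_vec_add(1 << 20);
    printf("%s\n", errors == 0 ? "PASSED" : "FAILED");
    return errors != 0;
}
```

The block size of 256 threads is an arbitrary but common starting point; the best value depends on the kernel and the GPU.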

For a full usage walkthrough please see the supplied tutorials.
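
The GPU libraries mentioned above can be called from the same kind of source file. The sketch below uses the cuBLAS routine cublasSaxpy to compute y = alpha*x + y on the GPU. The file name saxpy.cu and the helper name run_saxpy are hypothetical. It assumes the file is built with nvcc and the -lcublas flag (for example, nvcc saxpy.cu -lcublas), which normally works because nvcc searches its own toolkit's library directory; other compilers would likely need an explicit library path under CUDAROOT.

```cuda
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>
#include <cublas_v2.h>

// Hypothetical helper: computes y = alpha*x + y with cuBLAS for x = 1, y = 2
// and returns the number of elements that differ from alpha + 2.
int run_saxpy(int n, float alpha)
{
    size_t bytes = (size_t)n * sizeof(float);
    float *h_x = (float *)malloc(bytes);
    float *h_y = (float *)malloc(bytes);
    if (!h_x || !h_y) {
        fprintf(stderr, "host allocation failed\n");
        exit(EXIT_FAILURE);
    }
    for (int i = 0; i < n; ++i) {
        h_x[i] = 1.0f;
        h_y[i] = 2.0f;
    }

    float *d_x = NULL, *d_y = NULL;
    if (cudaMalloc((void **)&d_x, bytes) != cudaSuccess ||
        cudaMalloc((void **)&d_y, bytes) != cudaSuccess) {
        fprintf(stderr, "device allocation failed\n");
        exit(EXIT_FAILURE);
    }

    cublasHandle_t handle;
    if (cublasCreate(&handle) != CUBLAS_STATUS_SUCCESS) {
        fprintf(stderr, "cublasCreate failed\n");
        exit(EXIT_FAILURE);
    }

    // Copy the inputs to the GPU, run SAXPY there, and copy the result back.
    cublasStatus_t s1 = cublasSetVector(n, sizeof(float), h_x, 1, d_x, 1);
    cublasStatus_t s2 = cublasSetVector(n, sizeof(float), h_y, 1, d_y, 1);
    cublasStatus_t s3 = cublasSaxpy(handle, n, &alpha, d_x, 1, d_y, 1);
    cublasStatus_t s4 = cublasGetVector(n, sizeof(float), d_y, 1, h_y, 1);
    if (s1 != CUBLAS_STATUS_SUCCESS || s2 != CUBLAS_STATUS_SUCCESS ||
        s3 != CUBLAS_STATUS_SUCCESS || s4 != CUBLAS_STATUS_SUCCESS) {
        fprintf(stderr, "a cuBLAS call failed\n");
        exit(EXIT_FAILURE);
    }

    int errors = 0;
    for (int i = 0; i < n; ++i) {
        if (fabsf(h_y[i] - (alpha + 2.0f)) > 1e-5f) ++errors;
    }

    cublasDestroy(handle);
    cudaFree(d_x); cudaFree(d_y);
    free(h_x); free(h_y);
    return errors;
}

int main()
{
    int errors = run_saxpy(1 << 20, 3.0f);
    printf("%s\n", errors == 0 ? "PASSED" : "FAILED");
    return errors != 0;
}
```

No kernel is written by hand here: the library supplies the GPU code, and the program only manages data movement and the call itself.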


NVIDIA provides a comprehensive web portal for CUDA developer resources. Its developer documentation center contains the CUDA C programming guide, which thoroughly covers the CUDA architecture, from the underlying hardware to performance tuning, and is a must-read for those interested in CUDA programming. The portal's downloads page also offers whitepapers covering topics such as Fermi tuning and CUDA C best practices. The CUDA SDK is available for download as well and provides many samples that illustrate CUDA C programming techniques. For personalized assistance, NVIDIA has a very knowledgeable and active developer forum.


The OLCF provides two example-code tutorials, Vector Addition and Game of Life, that demonstrate CUDA C usage.

PGI CUDA Fortran

PGI’s CUDA Fortran provides a well-integrated Fortran interface for low-level GPU programming, doing for Fortran what NVIDIA did for C. PGI worked closely with NVIDIA to ensure that the Fortran interface provides nearly all of the low-level capabilities of the CUDA C framework.


Loading the PGI programming environment configures CUDA Fortran:

$ module load PrgEnv-pgi

To compile a file with the .cuf extension, use the ftn compiler wrapper, which invokes the PGI Fortran compiler when PrgEnv-pgi is loaded:

$ ftn source.cuf

For a full usage walkthrough please see the supplied tutorials.


PGI provides a comprehensive web portal for CUDA Fortran resources. The portal links to the PGI Fortran & C Accelerator Programming Model, which provides a comprehensive overview of the framework and is an excellent starting point. The portal also features a set of articles covering introductory material, device kernels, and memory management. If you run into trouble, PGI has a user forum where PGI staff regularly answer questions.


The OLCF provides two example-code tutorials, Vector Addition and Game of Life, that demonstrate CUDA Fortran usage.


OpenCL

The Khronos Group, a non-profit industry consortium, currently maintains the OpenCL (Open Computing Language) standard. The OpenCL standard provides a common low-level interface for heterogeneous computing. At its core, OpenCL is composed of a kernel language extension to C (similar to CUDA C) and a C API to control data management and code execution.


The cudatoolkit module must be loaded so that the OpenCL header files can be found, and a PGI or GNU programming environment must be enabled:

$ module load PrgEnv-pgi
$ module load cudatoolkit

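To build an OpenCL program, link against the OpenCL library, supplying its library path with -L if the linker does not find it by default. Placing the library after the source file avoids link-order problems with some linkers:

$ gcc source.c -lOpenCL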

Khronos provides a web portal for OpenCL. From there you can view the specification, browse the reference pages, and get individual help from the OpenCL forums. The developers page is also very useful and includes tutorials and example code to get you started.

In addition to the general Khronos-provided material, users will want to consult vendor-specific information for capability and optimization details. Of main interest to OLCF users will be the AMD and NVIDIA OpenCL developer zones.


The OLCF provides two example-code tutorials, Vector Addition and Game of Life, that demonstrate OpenCL usage.