Compiling and Node Types

Titan comprises three types of nodes:

  • Login nodes running traditional Linux
  • Service nodes running traditional Linux
  • Compute nodes running the Cray Node Linux (CNL) microkernel

The type of work you are performing will dictate the type of node for which you build your code.

Compiling for Compute Nodes (Cross Compilation)

Titan’s compute nodes carry out the vast majority of computation on the system. They run the CNL microkernel, which is markedly different from the OS running on the login and service nodes. Most code that runs on Titan is built to target the compute nodes.

All parallel codes should run on the compute nodes. Compute nodes are accessible only by invoking aprun within a batch job. To build codes for the compute nodes, you should use the Cray compiler wrappers:

titan-ext$ cc code.c
titan-ext$ CC code.cc
titan-ext$ ftn code.f90
Note: The OLCF highly recommends that the Cray-provided cc, CC, and ftn compiler wrappers be used when compiling and linking source code for use on Titan compute nodes.
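
Because compute nodes are reachable only through aprun inside a batch job, a minimal PBS job script might look like the following sketch (the [projid] allocation ID, node count, walltime, and executable name are placeholders to adapt):

#!/bin/bash
#PBS -A [projid]
#PBS -l nodes=2
#PBS -l walltime=00:10:00

# The script itself starts on a service node; only the aprun line
# below places work on the compute nodes.
cd $MEMBERWORK/[projid]
aprun -n 32 ./a.out   # 32 MPI ranks across 2 nodes (16 cores each)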
Support for Shared Object Libraries

On Titan, and on Cray machines in general, statically linked executables perform better and are easier to launch.

Warning: In general, the use of shared object libraries is strongly discouraged on Cray compute nodes. This excludes certain Cray-provided modules and libraries such as mpich2 and cudart. Any necessary dynamic linking of these libraries will be configured automatically by the Cray compiler wrappers.

If you must use shared object libraries, you will need to copy all necessary libraries to a Lustre scratch area ($MEMBERWORK or $PROJWORK) or the NFS /ccs/proj area and then update your LD_LIBRARY_PATH environment variable to include this directory. Due to metadata overhead, /ccs/proj is the suggested area for shared objects and Python modules. For example, the following command appends your project’s NFS area to LD_LIBRARY_PATH in bash:

$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/ccs/proj/[projid]
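
The export above covers only the search path. A fuller sketch of the copy-then-update workflow described above, reusing the [projid] placeholder and a hypothetical libfoo.so:

$ mkdir -p /ccs/proj/[projid]/lib
$ cp libfoo.so /ccs/proj/[projid]/lib/
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/ccs/proj/[projid]/lib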

Compiling with shared libraries is further complicated by the fact that Titan’s login nodes do not run the same operating system as the compute nodes, so many shared libraries present on the login nodes are not available on the compute nodes. This means an executable may appear to compile correctly on a login node but fail to start on a compute node because it cannot locate the shared libraries it needs.
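
A quick first check is the standard ldd utility, run against the executable on a login node; any dependency reported as "not found" will certainly be missing on the compute nodes, though ldd reflects the login node’s own search paths, so a clean report is not a guarantee:

titan-ext$ ldd ./a.out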

It may appear that this could be resolved by locating the shared libraries on the login node and copying them to Lustre or /ccs/proj for use on the compute nodes. This is inadvisable because these shared libraries were not compiled for the compute nodes, and may perform erratically. Also, referring directly to these libraries circumvents the module system, and may jeopardize your deployment environment in the event of system upgrades.

Performance is also a concern: each node in your job must search the directories in $LD_LIBRARY_PATH for every dynamic library it needs at load time, which can create a bottleneck at the Lustre Metadata Server. Lastly, calls to functions in dynamic libraries do not benefit from compiler optimizations that are available when using static libraries.

Compiling for Login or Service Nodes

When you log into Titan you are placed on a login node. When you submit a job for execution, your job script is initially launched on one of a small number of shared service nodes, and any task not launched through aprun runs on that service node. Because there are only a small number of login and service nodes and they are shared by all users, long-running or memory-intensive work should not be performed on them.

Warning: Long-running or memory-intensive codes should not be compiled for use on login or service nodes.

When using cc, CC, or ftn, your code will be built for the compute nodes by default. If you wish to build code for the Titan login or service nodes, you must do one of the following (a worked example of option 2 follows the list):

  1. Add the -target-cpu=mc8 flag to your cc, CC, or ftn command
  2. Use the craype-mc8 module: module swap craype-interlagos craype-mc8
  3. Call the underlying compilers directly (e.g. pgf90, ifort, gcc)
  4. Use the craype-network-none module to remove the network and MPI libraries: module swap craype-network-gemini craype-network-none
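
As a sketch of option 2, the module swap can bracket the build so that subsequent compilations still target the compute nodes (login_tool.c is a hypothetical source file):

titan-ext$ module swap craype-interlagos craype-mc8
titan-ext$ cc login_tool.c -o login_tool
titan-ext$ module swap craype-mc8 craype-interlagos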
XK7 Service/Compute Node Incompatibilities

On the Cray XK7 architecture, service nodes differ greatly from compute nodes. This difference may cause cross-compiling issues that did not exist on Cray XT5 and earlier systems.

For XK7, login and service nodes use AMD’s Istanbul-based processor, while compute nodes use the newer Interlagos-based processor. Interlagos-based processors include instructions not found on Istanbul-based processors, so executables compiled for the compute nodes will not run on the login or service nodes, typically crashing with an illegal instruction error. Conversely, codes compiled specifically for the login or service nodes will not run optimally on the compute nodes.

Warning: Executables compiled for the XK7 compute nodes will not run on the XK7 login or service nodes.
Optimization Target Warning

Because code is built on the login/service nodes but run on the compute nodes, a software package’s build process may inject optimization flags that incorrectly target the login/service nodes. Users are strongly urged to check makefiles for CPU-specific optimization flags (e.g., -tp, -hcpu, -march). Users should not need to set such flags; the Cray compiler wrappers automatically add the appropriate CPU-specific flags to the build. Choosing the incorrect processor optimization target can negatively impact code performance.
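
A quick way to spot such flags before building (the patterns below are illustrative, not exhaustive, and configure scripts may inject flags that never appear in a makefile):

$ grep -nE -e '-tp|-hcpu|-march' Makefile*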