Note: cuda-gdb and cuda-memcheck (available via module load cudatoolkit) can be run through aprun, but they do not support MPI-enabled applications. For MPI debugging, Arm DDT (formerly Allinea DDT) is recommended.


Arm DDT is an advanced debugging tool used for scalar, multi-threaded, and large-scale parallel applications.
In addition to traditional debugging features (setting breakpoints, stepping through code, examining variables), DDT also supports attaching to already-running processes and memory debugging. In-depth debugging information is beyond the scope of this guide and is covered in the Arm Forge User Guide.

Launching DDT

The first step in using DDT is to launch the DDT GUI. You can either launch it on the remote machine using X11 forwarding, or run a remote client on your local machine and connect it to the remote machine.

The remote client provides a native GUI (for Linux / OS X / Windows) that should be far more responsive than X11, but requires a little extra setup. It is also useful if you don’t have a preconfigured X11 server.

To get started using the remote client, follow the Forge Remote Client setup guide.

To use X11 forwarding, in a terminal do the following:

$ ssh -X user@<host>
$ module load forge
$ ddt &
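If the GUI fails to appear over X11, a common cause is that forwarding was never established. The following is a minimal sketch, not part of DDT itself, that checks the DISPLAY variable (set on the remote side by ssh -X when forwarding works) before launching. The function name x11_ready is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: report whether an X11 display is available.
# ssh -X sets DISPLAY on the remote side when forwarding succeeds.
x11_ready() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "X11 display available: ${DISPLAY}"
    else
        echo "DISPLAY is not set; reconnect with ssh -X or use the remote client" >&2
        return 1
    fi
}

# Usage (on the login node), assuming the module command is available:
#   x11_ready && { module load forge; ddt & }
```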

Running your job

Once you have launched a DDT GUI, you can initiate a debugging session from a batch job script using DDT’s “Reverse Connect” functionality. This connects the debug session launched from the batch script to the already-running GUI.

This is the most widely applicable method of launching, and allows re-use of any setup logic contained in existing batch scripts.

(This method can also be easily modified to launch DDT from an interactive batch session.)

  • Copy or modify an existing job script. (If you don’t have an existing job script, you may wish to read the section on letting DDT submit your job to the queue).
  • Include the following near the top of your job script:
    source $MODULESHOME/init/bash # If not already included, to make the module command available
    module load forge
  • Finally, prefix the aprun/mpirun command with ddt --connect. For example:
    $ aprun -n 1024 -N 8 ./myprogram
    becomes:
    $ ddt --connect aprun -n 1024 -N 8 ./myprogram
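One way to keep a single job script for both normal and debug runs is to make the ddt --connect prefix conditional. The following is a minimal sketch under stated assumptions, not a script provided by the center or by Arm. The DDT_DEBUG and DRY_RUN variables and the launch function are hypothetical names, and DRY_RUN only prints the command so the logic can be checked outside a batch job.

```shell
#!/bin/bash
# Hypothetical wrapper: prefix the launch command with "ddt --connect"
# when DDT_DEBUG=1; otherwise run the command unchanged.
launch() {
    if [ "${DDT_DEBUG:-0}" = "1" ]; then
        set -- ddt --connect "$@"
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"        # print the command instead of executing it
    else
        "$@"
    fi
}

# In the job script, after loading the DDT module:
#   launch aprun -n 1024 -N 8 ./myprogram
```

With DDT_DEBUG unset, the job runs exactly as before, so the same script can stay in use once debugging is finished.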

After submitting this script to the batch system (and waiting for it to be scheduled), a prompt will appear in the DDT GUI asking if you would like to debug this job.

DDT Reverse Connect Prompt

Once accepted, you can configure some final options before launching your program.

DDT Reverse Connect Run Dialog

Offline Debugging

In addition to debugging interactively, DDT also supports “Offline Debugging”. This can be particularly useful if your job takes a long time to schedule (and you’re not sure if you’ll be available when it runs).

DDT will execute your program under the debugger and write a plain text or HTML report for you to inspect at your convenience.

To run your program with DDT’s Offline Debugging, edit your existing job script and change the aprun command so that:

$ aprun -n 1024 -N 8 ./myprogram

would become:

$ ddt --offline=output.html aprun -n 1024 -N 8 ./myprogram
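When several offline runs share a directory, a fixed name such as output.html is overwritten on each run. As a sketch, the report name can include the batch job ID. PBS_JOBID is set by PBS/Torque inside a job; the report_name function and the ddt-report prefix are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: build a per-job report file name.
# Falls back to the shell's PID when not running under the batch system.
report_name() {
    local id="${PBS_JOBID:-$$}"
    echo "ddt-report-${id%%.*}.html"   # drop everything after the first "." in the job ID
}

# In the job script:
#   ddt --offline="$(report_name)" aprun -n 1024 -N 8 ./myprogram
```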

Replacing printf / debug statements with tracepoints

Adding a quick debug statement is often a tempting next step when trying to debug an issue, but repeated compile/run cycles can quickly become time consuming.

Rather than adding logging statements, you can add tracepoints inside DDT. Tracepoints have the following advantages over debug statements:

  • No source code modification – this means there’s no need to recompile, and no need to track down and remove logging statements after debugging.
  • Scalability – variables can be collected and summarized over thousands of processes without worrying about where/how to store the output, or how to sift through the data afterwards.
  • Variables are automatically compared across processes. Variables with differing values across processes are highlighted, and sparklines are included to give a quick graphical representation of the distribution of values.

For more information on tracepoints (including how to use them with interactive debugging), please see the Forge User Guide. (Section 6.14 Tracepoints covers tracepoints in general, while the command-line syntax can be found in section 15 DDT: Offline Debugging.)
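As an illustration of tracepoints in an offline run, the sketch below assembles (but does not execute) a ddt command line. It assumes the --trace-at=LOCATION,VAR1,VAR2 form described in the offline debugging section of the Forge User Guide; check the guide for your installed version, since option spelling may differ between releases. The source location solver.c:120 and the variables residual and iter are hypothetical, as is the offline_trace_cmd function.

```shell
#!/bin/bash
# Assemble (but do not run) an offline DDT command with one tracepoint.
# Arguments: report file, trace location, comma-separated variables,
# followed by the usual launch command.
offline_trace_cmd() {
    local report="$1" location="$2" vars="$3"
    shift 3
    # --trace-at syntax is an assumption; verify it against the Forge User Guide.
    echo "ddt --offline=${report} --trace-at=${location},${vars} $*"
}

# Hypothetical location and variable names:
offline_trace_cmd output.html solver.c:120 residual,iter aprun -n 1024 -N 8 ./myprogram
```

Printing the command first makes it easy to review the tracepoint specification before pasting it into a job script.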

Attaching to a running job

You can also use DDT to connect to an already-running job. To do so, you must be connected to the system on which the job is running. You do not need to be logged into the job’s head node (the node from which aprun/mpirun was launched), but DDT needs to know the head node. The process is fairly simple:

  1. Find your job’s head node:
    • On Titan and Eos, run qstat -f <jobid> | grep login_node_id. The node listed is the head node.
    • On other systems, run qstat -f <jobid> | grep exec_host. The _first_ node listed is the head node.
  2. Start DDT by running module load forge and then ddt.
  3. When DDT starts, select the option to “Attach to an already running program”.
  4. In that dialog box, make sure the appropriate MPI implementation is selected. If not, click the “Change MPI” button and select the proper one.
  5. If the job’s head node is not listed after the word “Hosts”, click on “Choose Hosts”.
    • Click “Add”.
    • Type the host name in the resulting dialog box and click “OK”.
    • To make things faster, uncheck any other hosts listed in the dialog box.
    • Click “OK” to return.
  6. Once DDT has finished scanning, your job should appear in the “Automatically-detected jobs” tab. Select it and click the “Attach” button.
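The head-node lookup in step 1 can be sketched as a small filter over qstat -f output. This is an illustrative helper, not a site-provided tool: the head_node function is hypothetical, and it assumes the usual "key = value" layout of qstat -f with exec_host entries of the form node/slot joined by "+".

```shell
#!/bin/bash
# Read "qstat -f" output on stdin and print the job's head node.
# Prefers login_node_id (Titan, Eos); otherwise takes the first exec_host entry.
head_node() {
    awk -F' = ' '
        $1 ~ /login_node_id$/ { login = $2 }
        $1 ~ /exec_host$/     { split($2, h, "[/+]"); first = h[1] }
        END {
            if (login != "")      print login
            else if (first != "") print first
            else                  exit 1   # neither field was found
        }'
}

# Usage:
#   qstat -f <jobid> | head_node
```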

Letting DDT submit your job to the queue

This method can be useful when using the Forge Remote Client or when your program doesn’t have a complex existing launch script.

  1. Run module load forge.
  2. Run ddt.
  3. When the GUI starts, click the “Run and debug a program” button.
  4. In the DDT “Run” dialog, ensure the “Submit to Queue” box is checked.
  5. Optionally select a queue template file (by clicking “Configure” by the “Submit to Queue” box).
    If your typical job scripts are basically only an aprun command, the default is fine.
    If your scripts are more complex, you’ll need to create your own template file. The default file can be a good start. If you need help creating one, contact user support or see the Forge User Guide.
  6. Click the “Parameters” button by the “Submit to Queue” box.
  7. In the resulting dialog box, select an appropriate walltime limit, account, and queue. Then click “OK”.
  8. Enter your executable in the “Application” box, enter any command line options your executable takes on the “Arguments” line, and select an appropriate number of processes and threads.
  9. Click “Submit”. Your job will be submitted to the queue, and your debug session will start once the job begins to run. While the job waits in the queue, DDT shows a dialog box with showq output.

Starting DDT from an interactive-batch job

Note: To tunnel a GUI from a batch job, the -X PBS option should be used to enable X11 forwarding.

Starting DDT from within an interactive job gives you the advantage of running
repeated debug sessions with different configurations while only waiting in the queue once.

  1. Start your interactive-batch job with qsub -I -X ... (-X enables X11 forwarding).
  2. Run module load forge.
  3. Start DDT with the command ddt.
  4. When the GUI starts, click the “Run and debug a program” button.
  5. In the DDT “Run” dialog, ensure the “Submit to Queue” box is not checked.
  6. Enter your executable, number of processors, etc.
  7. Click “Run” to run the program.

Memory Debugging (on Cray systems)

In order to use the memory debugging functionality of DDT on Titan, you need to link against the
DDT memory debugging library. (On non-Cray systems DDT can preload the shared library automatically
if your program uses dynamic linking).

In order to link the memory debugging library:

  1. module load forge (This determines the location of the library to link).
  2. module load ddt-memdebug (This tells the ftn/cc/CC compiler wrappers to link the library).
  3. Re-link your program (e.g. by deleting your binary and running make).

Once re-linked, run your program with DDT, ensuring you enable the “Memory Debugging” option in the run dialog.
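A build script can confirm that ddt-memdebug is loaded before re-linking. The sketch below relies on LOADEDMODULES, the colon-separated list maintained by the Environment Modules system. The module_loaded function is a hypothetical helper, and the version strings in the comment are illustrative only.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the named module (any version) is loaded.
# LOADEDMODULES looks like "PrgEnv-pgi/1.0:forge/1.0:ddt-memdebug" (illustrative values).
module_loaded() {
    case ":${LOADEDMODULES:-}:" in
        *":$1:"* | *":$1/"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage before re-linking:
#   module_loaded ddt-memdebug || module load ddt-memdebug
#   make
```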

Memory Debugging Caveats

  • The behavior of ddt-memdebug depends on the current programming environment. For this reason, you may encounter issues if you switch programming environments after ddt-memdebug has been loaded. To avoid this, please ensure that you unload ddt-memdebug before switching programming environments (you can then load it again).
  • The Fortran ALLOCATE statement cannot currently be wrapped when using the PGI compiler, so allocations will not be tracked or protected.
  • When using the Cray compiler, some libraries are compiled in such a way that DDT cannot collect a backtrace when memory is allocated. In this case, DDT can show only the location of the allocation rather than the full backtrace.

For additional information on memory debugging, see the Forge User Guide and/or the article on how to fix memory leaks with DDT Leak Reports.

Debugging scalar/non-MPI programs (Cray systems)

Launching a debug session on the Cray systems requires the program be linked with the Cray PMI library. (This happens automatically when linking with MPI.) In addition, DDT must be told not to run your program to the MPI_Init function (as it won’t be called).

If you are using the Cray compiler wrappers, you can load the ddt-non-mpi module (before linking your program) to include the PMI library.

The same module should also be loaded prior to running ddt (to tell DDT not to attempt to run to MPI_Init during initialization).

Finally, enable the “MPI” option in the DDT run dialog. This will ensure DDT launches your program with aprun.

Using the ddt-non-mpi module with the DDT Remote Client

When using the Forge Remote Client, we can’t load the ddt-non-mpi module into the client itself. Instead, we have three options:

  1. If using “Reverse Connect”, load the module before launching ddt --connect ...
  2. Load the ddt-non-mpi module inside your “queue template file” (configured via the “Options” dialog).
  3. Load the module using the “remote script” mechanism while configuring your remote host in the DDT Remote Client. For convenience, you may specify the following file as the remote script: /sw/sources/ddt/