Fixing Memory Leaks with Arm DDT Leak Reports
Memory leaks occur when memory is allocated but never freed. They are particularly problematic when the allocations are large or frequent: over time, leaks can degrade performance or, worse, cause the program to fail.
DDT’s memory debugging features allow analysis of allocated heap memory, both interactively, using the GUI, and non-interactively, using DDT’s “offline debugging” mode.
This guide shows how to generate a leak report, use it to pinpoint leaks, and eliminate them. Unlike conventional, interactive debugging, these reports can be created during a batch job, meaning you do not need to be present at the time your job is scheduled.
Source Code
The source code for this example can be downloaded here. The source code is contained within a git repository with tagged versions “initial”, “fix-1” and “fix-2”.
In addition, the “leak-reports” folder contains example leak reports for the different versions, and two queue submission files are included to launch the example program with and without DDT.
Linking with DDT’s Memory Debugging Library
The first step towards creating a memory leak report is to link your program with DDT’s memory debugging library. This will intercept calls to memory allocation and release functions (such as malloc and free) and record their location in your program.
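To make the idea of interception concrete, here is a minimal C++ sketch of a tracker that records each allocation together with its call site and forgets it again on release. It is an illustration only, not DDT's implementation: the names LeakTracker, AllocationRecord, allocate, release, leakedBytes and liveCount are hypothetical, and a real memory debugging library hooks malloc and free themselves rather than asking you to call a wrapper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical record of one live allocation: its size and where it was made.
struct AllocationRecord {
    std::size_t bytes;
    std::string site;  // e.g. "packet.cpp:91"
};

// Hypothetical tracker: a simplified stand-in for the bookkeeping a memory
// debugging library performs. This is NOT how DDT is implemented.
class LeakTracker {
public:
    // Allocate memory and remember the call site.
    void* allocate(std::size_t bytes, const std::string& site) {
        void* p = std::malloc(bytes);
        if (p != nullptr) {
            live_[p] = AllocationRecord{bytes, site};
        }
        return p;
    }

    // Release memory and forget its record.
    void release(void* p) {
        if (p == nullptr) return;
        // In this sketch, releasing an untracked pointer is a usage error.
        assert(live_.count(p) == 1);
        live_.erase(p);
        std::free(p);
    }

    // Anything still recorded when the program ends was never freed.
    std::size_t leakedBytes() const {
        std::size_t total = 0;
        for (const auto& entry : live_) total += entry.second.bytes;
        return total;
    }

    std::size_t liveCount() const { return live_.size(); }

private:
    std::map<void*, AllocationRecord> live_;
};
```

A leak report is, in essence, a dump of whatever is left in such a table at the end of the run, which is why every leaked block can be traced back to the place it was allocated.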
Note: Manual linking is required only for Cray systems, or when using static linking.
Linking with DDT’s memory debugging library can be automated by loading the ddt-memdebug module. After loading your usual compilation environment, load the following modules:
$ module load forge
$ module load ddt-memdebug
Then re-link your program. How this is done will vary depending on your build system, but it’s often sufficient to delete the application binary and have make regenerate it. For our example:
$ rm mandel
$ make
Launching with DDT
The next step is to launch the program with DDT. As of DDT 5.0, we can prefix our aprun command with the appropriate DDT command.
In our example, we can edit submit.qsub to first load the DDT module:
$ source $MODULESHOME/init/bash # Only required if used in a batch script
$ module load forge
and then modify our aprun command so that:
$ aprun -n 16 ./mandel
becomes:
$ ddt --offline=leak-report.html --mem-debug=fast --check-bounds=off aprun -n 4 ./mandel
The DDT arguments used are as follows:
--offline=leak-report.html: This tells DDT to run in non-interactive “offline” mode, and save the output to leak-report.html.
--mem-debug=fast: This enables the memory debugging options in DDT and uses the “fast” preset. (The “fast” preset runs the fewest memory checks, in order to reduce overhead.)
--check-bounds=off: This disables bounds checking (or “guard pages”) in DDT. While bounds checking can be useful when tracking down invalid memory accesses, disabling it reduces the runtime and memory overhead.
(The download bundle also contains a pre-modified version of submit.qsub named submit-ddt.qsub.)
Now we submit our batch job:
$ qsub -A <projectID> submit-ddt.qsub
Interpreting the Output
Once the job has finished, copy the output file (leak-report.html) to your local machine and open it with a web browser. (Alternatively, open leak-reports/initial.html from the source code download.)
Scrolling down to the leak report section, we should see something like this:
For scalability reasons, DDT will limit the report to the 8 ranks with the greatest memory leakage (this can be controlled with the --leak-report-top-ranks command line argument).
In the example shown we can see that rank 0 has leaked more memory than the others, and that most of the allocations were created by the Packet::allocate function. Clicking the bar chart item for rank 0 will display a table below showing details of the allocations:
This table shows allocations grouped by the backtrace taken when the allocation was made, along with source code snippets. This information can be used to identify code paths leading to the largest leaks.
In our example, the first row of the table represents a single, 16MB allocation, whereas the second row represents 92 smaller allocations, totaling 14.72MB. All of these allocations share the same allocation site (as noted by “#0 Packet::allocate() (packet.cpp:91)” at the top of each stack), but have taken different paths through the code to get there (as shown by different entries further down the stack).
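The grouping described above can be sketched in C++ as follows. This is only an illustration of the idea, not DDT's actual data model: the LeakedAllocation and LeakGroup types and the groupByBacktrace function are hypothetical. Each leaked allocation carries its full backtrace, and allocations are merged only when every frame matches, which is why two rows can share frame #0 yet remain separate.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical description of one leaked block: innermost frame first.
struct LeakedAllocation {
    std::vector<std::string> backtrace;
    std::size_t bytes;
};

// Hypothetical summary row: how many blocks share a backtrace, and their size.
struct LeakGroup {
    std::size_t count = 0;
    std::size_t totalBytes = 0;
};

// Merge allocations whose complete backtraces are identical.
std::map<std::vector<std::string>, LeakGroup>
groupByBacktrace(const std::vector<LeakedAllocation>& leaks) {
    std::map<std::vector<std::string>, LeakGroup> groups;
    for (const auto& leak : leaks) {
        LeakGroup& group = groups[leak.backtrace];
        group.count += 1;
        group.totalBytes += leak.bytes;
    }
    return groups;
}
```

For example, feeding in one block allocated via a made-up caller "callerA" and two blocks allocated via "callerB", all with "Packet::allocate" as frame #0, yields two groups: one row per distinct path, as in the table.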
Once we have the allocation site (the Packet::allocate member function), the next step is figuring out why this allocation isn’t freed. From the source code snippet, we can see that the allocation is assigned to the iterations variable.
Reading through the Packet class, we can determine that the iterations allocation is owned by the Packet class, and yet the Packet::~Packet destructor doesn’t contain code to free the allocation. The simple fix here is to add free(pointer); to the destructor. (See “git show fix-1” for more details.)
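The shape of this fix can be sketched with a simplified class. The code below is a hypothetical stand-in, not the example's real Packet: the element type, the count member, the size() accessor and the signature of allocate are all assumptions made for illustration. Only the idea is taken from the walkthrough: the class owns the buffer held in iterations, so its destructor must release it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical, simplified stand-in for the example's Packet class.
class Packet {
public:
    Packet() : iterations(nullptr), count(0) {}

    // Copying is disabled so that two Packets never free the same buffer.
    Packet(const Packet&) = delete;
    Packet& operator=(const Packet&) = delete;

    // The fix: the destructor releases the buffer this class owns.
    // Calling free on a null pointer is a no-op, so no check is needed.
    ~Packet() { std::free(iterations); }

    // Allocate (or re-allocate) the buffer owned by this Packet.
    bool allocate(std::size_t n) {
        std::free(iterations);  // avoid leaking a previously allocated buffer
        iterations = static_cast<int*>(std::malloc(n * sizeof(int)));
        count = (iterations != nullptr) ? n : 0;
        return iterations != nullptr;
    }

    std::size_t size() const { return count; }

private:
    int* iterations;    // owned by this Packet
    std::size_t count;  // number of elements in iterations
};
```

Without the destructor body, every Packet that went out of scope after a call to allocate would leave its buffer behind, which is the pattern the leak report exposed.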
After making the fix, running “make” will recompile the code.
Another Leak!
After fixing our leak and recompiling, the next step is to verify the leak is gone, by resubmitting our job and generating a new leak report.
Opening the newly-generated leak-report.html (or leak-reports/fix-1.html from the source code download) shows the following:
While we’ve fixed one of our leak sites, rank 0 is still leaking around 16MB of memory. Clicking the bar chart item will again show us the allocation details.
Here we see that the remaining leak was created from a single allocation (again, made in Packet::allocate).
As we’ve already fixed the leak in the Packet class, we should check that the Packet object itself is being correctly freed. We can do this by methodically working our way up the stack.
Following the backtrace, we see that Packet::allocate is called by Packet::stitch, which is in turn called by PacketFactory::stitch. The source code snippet for PacketFactory::stitch shows us that Packet::stitch is being called on the packet member object, so let’s verify that this is being freed.
With a little reading of the source code (e.g. packetfactory.h), we can see that packet is a plain object member of the PacketFactory class, so when a PacketFactory object is freed, packet should be freed too.
Let’s jump up another level to find the origin of the PacketFactory object itself. strategy1 (found in mandel.cpp) shows the PacketFactory is actually an instance of the derived class SimplePacketFactory, named factory.
Hopping up one final time to main, we can see that the factory object is passed to strategy1 as a dereferenced pointer (*factory), and is dynamically allocated a little further up the function.
The rest of the main function is relatively simple, and we can see that factory isn’t directly freed (or passed to any additional function calls where it could be freed). Now we’ve found the source of the leak, there are a few options to fix it:
We could rewrite the code to avoid the dynamic allocation entirely, or wrap the pointer in a C++ auto_ptr (or, in C++11, a unique_ptr), but the simplest solution here is to add “delete factory” once we have finished using the factory (i.e. after the switch statement). See “git show fix-2” for more details.
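For comparison, the unique_ptr alternative might look like the sketch below. The class bodies here are hypothetical stand-ins, not the example's real code: the make() method, the destroyed counter, the runWithUniquePtr function and the by-reference signature of strategy1 are assumptions made so the snippet is self-contained. The point it illustrates is that the smart pointer deletes the factory automatically when it goes out of scope, so no explicit delete is needed.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the example's base class.
class PacketFactory {
public:
    // A virtual destructor makes deletion through a base pointer well-defined.
    virtual ~PacketFactory() { ++destroyed; }
    virtual int make() const = 0;

    static int destroyed;  // counts destructions, for demonstration only
};
int PacketFactory::destroyed = 0;

// Hypothetical stand-in for the derived class used in the walkthrough.
class SimplePacketFactory : public PacketFactory {
public:
    int make() const override { return 1; }
};

// Assumed to take the factory by reference, matching the *factory call site.
int strategy1(const PacketFactory& factory) { return factory.make(); }

int runWithUniquePtr() {
    // The unique_ptr owns the dynamically allocated factory.
    std::unique_ptr<PacketFactory> factory(new SimplePacketFactory());
    return strategy1(*factory);
    // factory goes out of scope here and the object is deleted automatically.
}
```

The explicit “delete factory” is the smaller change to existing code, while the smart pointer also covers early returns and exceptions, where a manually placed delete could be skipped.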
After making our final change, let’s recompile and generate one last report to verify our leak has been fixed.
Opening leak-report.html (or leak-reports/fix-2.html from the source code download), we see:
The chart may initially look busier than our other reports, but the total leaked memory is now only 16.75 kB, and the functions responsible are from various system libraries outside of our control.
We’ve now successfully rid our program of the two memory leaks, and improved the correctness of our code. We can also be more confident that our program (at least for the current configuration) is leak-free.