CrayPat is a performance analysis tool for evaluating program execution on Cray systems. CrayPat consists of three major components:
- pat_build: used to instrument the program to be analyzed
- pat_report: a standalone text report generator that can be used to further explore the data generated by instrumented program execution
- Apprentice2: a graphical analysis tool that can be used, in addition to pat_report, to further explore and visualize the data generated by instrumented program execution
These components are described in greater detail in the pat_build, pat_report, and app2 man pages, respectively. You must have the perftools module loaded first to get these man pages. In addition, more detail about CrayPat usage and environment variables is provided in the intro_craypat man page.
Follow these 10 steps to perform a basic analysis of your program using the CrayPat and Apprentice2 tools. Since CrayPat is a performance analysis tool, not a debugging tool, start with a fully debugged and executable program. The program must be capable of running to a planned completion or an intentional termination before CrayPat can be used. Load the programming environment modules first. This ensures that the correct links and libraries are in place for your choice of compiler and target execution environment. For example, if you are working on a Cray XT series system using CNL on the compute nodes, enter the following command:
Step 1: Load CrayPat & Cray Apprentice2 module files
module load perftools
Then build your application. With the CrayPat module loaded, rebuild your program using the compiler options that preserve all .o files (and .a files, if any) created during compilation. CrayPat requires access to the object files (and archive files, if any). For example, if you are working with a Fortran program, enter commands similar to the following:
Step 2: Build application
ftn -c my_program.f
ftn -o my_program my_program.o
Or simply use your makefile
make clean
make
Using Automatic Program Analysis (APA). To use automatic program analysis, follow these steps. Use the pat_build command to insert APA code into your program. The instrumented copy is saved under a new name with the suffix +pat. Note that the original program remains unchanged.
Note: When building in your /tmp/work or /tmp/proj area, a copy of the build’s .o files will, by default, be placed in /ccs/home/$USER/.craypat. This may increase your home directory usage above quota. The PAT_LD_OBJECT_TMPDIR environment variable can be used to control the location of the .craypat directory. For example,
setenv PAT_LD_OBJECT_TMPDIR /tmp/work/$USER
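The setenv form above is csh/tcsh syntax. For users of bash or other Bourne-style shells, a minimal equivalent might look like the following sketch. It assumes the same /tmp/work/$USER scratch location used in this article, which may differ on your system.

```shell
# Bourne-style (bash/sh/ksh) equivalent of the csh setenv line.
# Assumption: /tmp/work/<user> is your scratch area; substitute your site's path.
export PAT_LD_OBJECT_TMPDIR="/tmp/work/${USER:-$(id -un)}"

# Confirm the setting before running pat_build.
echo "PAT_LD_OBJECT_TMPDIR=${PAT_LD_OBJECT_TMPDIR}"
```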
Step 3: Instrument the original program
pat_build -O apa my_program
This produces the instrumented executable my_program+pat.
Execute the program. During execution, the specified performance analysis data is collected and written to one or more data files, depending on the experiment being conducted. On a Cray XT series CNL system, programs are executed using the aprun command.
Step 4: Run the instrumented executable
aprun -n <numproc> my_program+pat
Or simply submit a PBS script
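As a rough illustration of the batch route, the sketch below writes a small PBS job script and shows how it might be submitted. The script name run_pat.pbs, the project ID PROJ123, the job name, the walltime, and the process count of 16 are all placeholders. The resource-request lines your site requires (node or core counts) are intentionally left as a comment because their syntax varies between systems.

```shell
# Write a hypothetical job script; every value below is a placeholder.
cat > run_pat.pbs <<'EOF'
#!/bin/bash
#PBS -N craypat_run
#PBS -A PROJ123
#PBS -l walltime=00:30:00
#PBS -j oe
# Add your site's node/core request here (syntax is system-specific).

# Run from the directory the job was submitted from.
cd "$PBS_O_WORKDIR"

# Launch the instrumented executable; 16 stands in for <numproc>.
aprun -n 16 ./my_program+pat
EOF

# Submit it (requires a PBS installation, so shown as a comment here):
# qsub run_pat.pbs
```

The same script could be reused for the second run in Step 8 by swapping in my_program+apa.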
This produces the data file my_program+pat+PID-nodesdt.xf, which contains basic asynchronously derived program profiling data. After program execution completes or terminates, use the pat_report command to create a .apa report.
Step 5: Use pat_report to process the data file
pat_report -T -o report1.txt my_program+pat+PID-nodesdt.xf
This produces three results:
- a sampling-based text report, written to report1.txt (the file named with -o)
- an .ap2 file (my_program+pat+PID-nodesdt.ap2), which contains both the report data and the associated mapping from addresses to functions and source line numbers
- an .apa file (my_program+pat+PID-nodesdt.apa), which contains the pat_build arguments recommended for further performance analysis
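Because the PID and node count embedded in the data file name are not known until the run finishes, it can be convenient to locate the file by pattern rather than typing it out. The helper below is a small sketch and not part of CrayPat: latest_xf is a hypothetical function name, and it assumes the data files sit in the current directory and have no newlines in their names.

```shell
# latest_xf PREFIX
# Print the most recently modified PREFIX+*.xf file in the current directory,
# or nothing if none exists. (Hypothetical helper, not a CrayPat command.)
latest_xf() {
    ls -t -- "$1"+*.xf 2>/dev/null | head -n 1
}

# Example use after Step 4 (commented out because it needs pat_report):
# pat_report -T -o report1.txt "$(latest_xf my_program+pat)"
```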
Once the .apa file is created, you can open it in your preferred text editor and check whether you want more or less of the program profiled. Lines beginning with # are ignored. Any pat_build option may be added to this file. For most users, the file created by pat_report will be sufficient.
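Before rebuilding, it may help to see only the lines of the .apa file that will take effect. The sketch below filters out comment lines with grep. The name apa_active_lines is a hypothetical helper, and treating lines with leading whitespace before the # as comments (and dropping blank lines) is an assumption made for readability.

```shell
# apa_active_lines FILE
# Print the lines of an .apa file that are neither blank nor commented out with '#'.
apa_active_lines() {
    grep -v -E '^[[:space:]]*(#|$)' -- "$1"
}

# Example (the file name follows the pattern used in this article):
# apa_active_lines my_program+pat+PID-nodesdt.apa
```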
Step 6: Reinstrument the program
Reinstrument the program, this time using the .apa file.
Most common values are:
After you have verified this file, rebuild your executable as follows.
Step 7: Rebuild the program
pat_build -O my_program+pat+PID-nodesdt.apa
It is not necessary to specify the program name, as this is specified in the .apa file. Invoking this command produces the new executable, my_program+apa, this time instrumented for enhanced tracing analysis.
Step 8: Run the new instrumented executable
aprun -n <numproc> my_program+apa
Or simply submit a PBS script
This produces the new data file my_program+apa+PID2-nodesdt.xf, which contains expanded information tracing the most significant functions in the program.
You can use this file as input to pat_report, for text reports, or Apprentice2, for graphical analysis. By default, your code will gather hardware counters from hwpc group 0. This can be overridden at runtime by setting the PAT_RT_PERFCTR environment variable (see man hwpc). To ignore hwpc data in your text reports, use the -H option to pat_report.
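As a sketch of the runtime override, the variable can be exported in the job script or interactive session before the aprun line. The group number 1 below is purely illustrative; the valid groups are those listed in the hwpc man page. The syntax shown is for Bourne-style shells (csh users would use setenv, as in the earlier example).

```shell
# Select a hardware counter group for the next instrumented run.
# Assumption: group 1 is only an example value; see `man hwpc` for valid groups.
export PAT_RT_PERFCTR=1

# The variable must be set in the environment from which aprun is launched, e.g.:
# aprun -n <numproc> my_program+apa
```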
Step 9: Use pat_report to process the new data file
pat_report -T -o report2.txt my_program+apa+PID2-nodesdt.xf
This produces two results:
- a tracing report, written to report2.txt (the file named with -o)
- an .ap2 file (my_program+apa+PID2-nodesdt.ap2), which contains both the report data and the associated mapping from addresses to functions and source line numbers
Step 10: View results in text and/or with Apprentice2