Running ParaView on Titan
Categories: Basic Programming
ParaView is an open-source, multi-platform data analysis and visualization application. ParaView users can quickly build visualizations to analyze their data using qualitative and quantitative techniques. The data exploration can be done interactively in 3D or programmatically using ParaView’s batch processing capabilities.
ParaView was developed to analyze extremely large datasets using distributed memory computing resources. The OLCF provides ParaView server installs on Titan to facilitate large-scale distributed visualizations. The ParaView server running on Titan may be used in a headless batch processing mode or to drive a ParaView GUI client running on your local machine.
A ParaView client instance is not available on Titan. Interactive mode requires a ParaView client installation on your local machine that matches the server version. Batch mode can also benefit from a local installation, which aids in script generation. Precompiled ParaView binaries for Windows, Macintosh, and Linux can be downloaded from Kitware.
As with other programs, OLCF has installed several different ParaView releases (4.1, 4.4 and 5.0 at the time of this writing). Also as with other programs, different configurations of each are available. The four versions of ParaView 4.4 for example include: statically linked with software rendering (aka “static”), dynamically linked with software rendering (aka “mesa”), dynamically linked with GPU accelerated rendering (aka “GL1”), and dynamically linked with VTK’s new OpenGL2 GPU accelerated rendering (aka “GL2”). All four of these were compiled with the GNU compiler toolchain. The instructions that follow describe how to use each one in interactive, batch and in-situ situations.
Batch execution mode is the most straightforward way to run ParaView data analysis jobs on Titan. Batch mode allows the user to run ParaView pipeline scripts written in Python through Titan’s batch queue, leveraging ParaView’s built-in distributed computing capabilities. Extensive knowledge of Python is not necessary in most cases, as the ParaView GUI can trace client GUI usage to generate Python scripts. Once a ParaView Python script has been generated, it can be launched across Titan’s compute nodes using the pvbatch command, as the batch script in Step 6 shows.
The following provides an example of using a ParaView installation on your local machine to generate a ParaView pipeline script which is then launched in parallel on Titan.
Step 1: Representative Data
The easiest way to create a ParaView pipeline script for batch processing is to let ParaView do the work for you with a pipeline trace. This method works best if you have access to a representative dataset that can be manipulated on a local ParaView client. An ideal representative dataset will have the same attributes as the production dataset but at a much lower resolution. The dataset to be processed on Titan may be much larger than the representative dataset but the ParaView pipeline will remain largely the same. The number of changes to the captured pipeline will depend on how closely your representative dataset matches your larger production dataset.
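One simple way to picture a representative dataset is as a strided subsample of the production data: same fields, same layout, fewer points. The sketch below is only an illustration of that idea on a nested-list grid; the function name `subsample` and the tiny 4 x 4 x 4 field are hypothetical, and real datasets would normally be reduced with whatever tools produced them.

```python
def subsample(grid, stride):
    """Keep every stride-th sample along each axis of a nested-list 3D grid."""
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    return [
        [row[::stride] for row in plane[::stride]]
        for plane in grid[::stride]
    ]


# Hypothetical 4 x 4 x 4 scalar field whose value encodes its (z, y, x) index.
full = [
    [[100 * z + 10 * y + x for x in range(4)] for y in range(4)]
    for z in range(4)
]

# Halve the resolution along every axis while keeping the same structure.
small = subsample(full, 2)
print(len(small), len(small[0]), len(small[0][0]))
```

The reduced grid keeps the same structure as the original, which is the property that lets a pipeline traced on it carry over to the full-resolution data.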
Step 2: Start Trace
ParaView provides the ability to record user interaction with the GUI to generate a Python ParaView script. To start a trace open ParaView on your local machine and select
Step 3: Create Pipeline
Once the trace has been started, import your representative data, apply ParaView filters, and position the view as you normally would. All actions will be recorded by the trace.
Step 4: Stop Trace
Once the representative data is in the form you are interested in, selecting Tools/Stop Trace will stop the recording and produce a Python script of all actions recorded since the trace began. With a few modifications this script can be used in batch mode on production datasets.
Step 5: Modify Trace
Once the trace script has been saved it can be modified as needed for production datasets. This will typically include changing the reader file path to your production dataset and adding a write command after the dataset has been rendered. In this example an image will be saved in the working directory.
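The two modifications described in this step can be sketched as a small text patch applied to the saved trace. This is an illustration under stated assumptions, not the exact output of any ParaView release: the trace excerpt, the file paths, and the helper name `patch_trace` are hypothetical, the render view is assumed to be named `renderView1`, and the appended `SaveScreenshot` call is the image-writing function in `paraview.simple` for recent releases (older releases may need a different call, so check what your own trace uses).

```python
def patch_trace(trace_text, old_path, new_path, image_name):
    """Return a copy of a traced script that reads new_path and saves an image."""
    if old_path not in trace_text:
        raise ValueError("reader path not found in trace: " + old_path)
    # Point the reader at the production dataset.
    patched = trace_text.replace(old_path, new_path)
    if not patched.endswith("\n"):
        patched += "\n"
    # Assumption: the trace names its render view 'renderView1'.
    write_cmd = "SaveScreenshot('{0}', renderView1)\n".format(image_name)
    return patched + write_cmd


# Hypothetical excerpt of a trace recorded against a small representative file.
example_trace = (
    "from paraview.simple import *\n"
    "reader = OpenDataFile('/home/user/small_sample.vtk')\n"
    "renderView1 = GetActiveViewOrCreate('RenderView')\n"
    "Show(reader, renderView1)\n"
    "Render()\n"
)

production_script = patch_trace(
    example_trace,
    "/home/user/small_sample.vtk",
    "/path/to/production_data.vtk",
    "production.png",
)
print(production_script)
```

Writing `production_script` to a file such as production.py gives a script that can be handed to pvbatch. For a one-off trace, making the same two changes by hand in a text editor works just as well.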
Step 6: Run on Titan
With the trace modifications completed, a PBS batch script can be created to launch the ParaView pipeline in parallel on Titan’s compute nodes. In this example, production.py is our modified trace script. The batch job will create the image
#!/bin/bash
#PBS -A ABC123
#PBS -j oe
#PBS -l walltime=0:20:00,nodes=2
source $MODULESHOME/init/bash
module load GPU-render
module load paraview/4.4.0_static
# or
# module load paraview/4.4.0_mesa
# module load paraview/4.4.0_gl1
# module load paraview/4.4.0_gl2
cd $MEMBERWORK/ABC123
aprun -n 2 -N 1 runinX.sh pvbatch production.py
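When the same job is submitted repeatedly with different node counts or ParaView variants, it can help to generate the batch script rather than edit it by hand. The sketch below rebuilds the script above from a few parameters. The function name `make_pbs_script` and its arguments are hypothetical; the module names, the runinX.sh wrapper, and the aprun layout are copied from the example above and are not checked against what is actually installed.

```python
def make_pbs_script(project, nodes, walltime, variant, script, ranks_per_node=1):
    """Build the text of a PBS job that runs a ParaView Python script with pvbatch."""
    variants = ("static", "mesa", "gl1", "gl2")
    if variant not in variants:
        raise ValueError("variant must be one of: " + ", ".join(variants))
    # aprun -n takes the total number of ranks, -N the number of ranks per node.
    total_ranks = nodes * ranks_per_node
    lines = [
        "#!/bin/bash",
        "#PBS -A {0}".format(project),
        "#PBS -j oe",
        "#PBS -l walltime={0},nodes={1}".format(walltime, nodes),
        "source $MODULESHOME/init/bash",
        "module load GPU-render",
        "module load paraview/4.4.0_{0}".format(variant),
        "cd $MEMBERWORK/{0}".format(project),
        "aprun -n {0} -N {1} runinX.sh pvbatch {2}".format(
            total_ranks, ranks_per_node, script
        ),
    ]
    return "\n".join(lines) + "\n"


# Reproduce the two-node example from this step.
job_text = make_pbs_script("ABC123", 2, "0:20:00", "static", "production.py")
print(job_text)
```

The generated text can be saved to a file and submitted with qsub like any other batch script.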
Although in a single-machine setup both the ParaView client and server run on the same host, this need not be the case. It is possible to run a local ParaView client to display and interact with your data while the ParaView server runs in a Titan batch job, allowing interactive analysis of very large data sets.
The following provides an example of launching the ParaView server on Titan and connecting to it from a locally running ParaView client. Although several methods may be used, the one described should work in most cases.
Step 1: Launch ParaView on your Desktop and fetch a connection script for Titan
Start ParaView and then select File/Connect to begin.
Next, select the connection to Titan for either Windows or Mac/Linux and hit the “Import Selected” button.
You may now quit and restart ParaView in order to save the connection setup in your preferences.
Step 2: Establish a connection to Titan
Once restarted, and henceforth, simply select Titan from the File->Connect dialog and click the “Connect” button.
A dialog box follows, in which you must enter your username and project allocation, the number of nodes to reserve, and a duration to reserve them for. You may also choose one of the four server variations described above.
When you click OK, a Windows command prompt or xterm pops up. In this window, enter your credentials at the OLCF login prompt.
When your job reaches the top of the queue, the RenderView1 view window will return. At this point you are connected to Titan and can open files that reside there and visualize them interactively.
For extreme-scale simulations, it is inefficient and often impractical to save all of the simulation’s output data to disk only to load it back later for post-processing. If ParaView is linked directly into the simulation, the simulation can use ParaView to reduce the data into much smaller pictures, plots and other derived results and save those to disk instead.
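A rough back-of-the-envelope comparison shows the size of the saving. The numbers below are hypothetical and chosen only for illustration: a 1024 x 1024 x 1024 grid with five double-precision fields per point, compared with one uncompressed 1920 x 1080 RGB image per time step.

```python
def raw_step_bytes(nx, ny, nz, fields, bytes_per_value=8):
    """Bytes needed to store every field at every grid point for one time step."""
    return nx * ny * nz * fields * bytes_per_value


def image_bytes(width, height, bytes_per_pixel=3):
    """Bytes in one uncompressed image (3 bytes per pixel for RGB)."""
    return width * height * bytes_per_pixel


raw = raw_step_bytes(1024, 1024, 1024, fields=5)
img = image_bytes(1920, 1080)
ratio = float(raw) / img

print("raw output per step: {0} GiB".format(raw // 1024**3))
print("one rendered image:  {0} bytes".format(img))
print("reduction factor:    about {0:.0f}x".format(ratio))
```

Under these assumptions a single time step of raw output is 40 GiB, while the rendered image is a few megabytes before compression, a reduction of several thousand times.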
When packaged for this particular use case ParaView’s back end data processing and rendering library is called Catalyst. Since version 4.4, the ParaView installations on Titan include the header files and other compile time resources necessary to link it into simulations.
The best place to get started doing so is by cloning the Catalyst examples source code repository.
git clone https://github.com/Kitware/ParaViewCatalystExampleCode.git
With source code in hand, you must choose which of the installed versions of ParaView to link your simulation code to. The libraries required for OpenGL-enabled builds require modules that are only available in an interactive job. Compilation instructions vary slightly for each version but should follow the boilerplate example below:
$ qsub -I -A ABC123 -l nodes=0,walltime=02:00:00
... wait for job to start ...
$ module switch PrgEnv-pgi PrgEnv-gnu
$ module load cmake3
$ module load cray-hdf5
# Not needed for static build
$ module load dynamic-link
# Not needed for MESA builds
$ export CXXFLAGS=-L/sw/xk7/X11/lib
$ module load paraview/4.4.0_gl2
# or
# module load paraview/4.4.0_static
# module load paraview/4.4.0_mesa
# module load paraview/4.4.0_gl1
$ export CC=$(which cc)
$ export CXX=$(which CC)
$ export FC=$(which ftn)
$ cd ParaViewCatalystExampleCode
$ mkdir build
$ cd build
$ cmake ..
$ make