GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.
It is primarily designed for biochemical molecules like proteins and lipids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.
Usage
Access to the binaries, libraries, and data files is provided through the gromacs module. This module sets environment variables that point to these locations and updates the required paths.
module load gromacs
qsub -V [PBS Script]
The gmx_mpi binary must run on the compute nodes via aprun:
aprun -n [number of cores] -N [cores per node] gmx_mpi [gmx options]
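As a rough aid for sizing a job, the relationship between the two aprun values can be sketched in shell. This assumes that -n is the total number of processes and -N the number placed on each node, as the placeholders above suggest; the function name nodes_needed is hypothetical and not part of GROMACS or aprun.

```shell
# nodes_needed TOTAL_RANKS RANKS_PER_NODE
# Prints how many nodes a launch with -n TOTAL_RANKS -N RANKS_PER_NODE
# would occupy, rounding up when the ranks do not divide evenly.
# (Hypothetical helper for illustration only.)
nodes_needed() {
    total_ranks=$1
    ranks_per_node=$2
    echo $(( (total_ranks + ranks_per_node - 1) / ranks_per_node ))
}

# Example: 8 ranks placed 2 per node
nodes_needed 8 2
```

The printed value is the node count the job would need to request in its PBS script.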
Running GROMACS on the GPU
To run GROMACS on the GPU and across multiple nodes, the -gpu_id option is needed. Titan nodes have a single GPU, so only ID 0 should be used.
Using a single process per node
To run with a single MPI rank per node, a single GPU ID is needed. For example, on a 2 node job:
aprun -n 2 -N 1 gmx_mpi -gpu_id 0
Using multiple processes per node
To run a GROMACS simulation using multiple MPI ranks per node, CUDA MPS mode must first be enabled by setting the CRAY_CUDA_MPS variable to 1:
export CRAY_CUDA_MPS=1 (bash syntax)
setenv CRAY_CUDA_MPS 1 (csh/tcsh syntax)
More details about CUDA MPS and managing GPU contexts can be found in the CUDA Proxy: Managing GPU context Tutorial. With that option enabled, the next step is to use the -gpu_id option to pass one ID for each process on a node that will share the GPU. For example, to run a job with 8 MPI ranks across 4 nodes, using 2 MPI ranks per node, the following command would be used:
aprun -n 8 -N 2 gmx_mpi -gpu_id 00
In the example above, two zeros are used because the 2 processes started on each node need to share a single GPU. In addition, it is also possible to launch GROMACS using OpenMP threads by including the -ntomp option for gmx and the -d option for aprun:
aprun -n 8 -N 2 -d 8 gmx_mpi -ntomp 8 -gpu_id 00
The GROMACS documentation site has several recommendations for running on the GPU, but in general, it is recommended to run more processes and fewer threads per node. More details can be found at: http://www.gromacs.org/Documentation/Acceleration_and_parallelization#Multiple_MPI_ranks_per_GPU
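The launch lines in this section follow one pattern, which can be sketched as a small shell helper. This is only an illustration under the assumptions stated in the text (one GPU per node with ID 0, so every rank on a node passes a 0); the function names gpu_id_string and launch_line are hypothetical, and the helper prints the command rather than running it.

```shell
# gpu_id_string RANKS_PER_NODE
# Prints one "0" per MPI rank on a node, since all ranks on the node
# share the single GPU with ID 0. (Hypothetical helper.)
gpu_id_string() {
    ranks_per_node=$1
    ids=""
    i=0
    while [ "$i" -lt "$ranks_per_node" ]; do
        ids="${ids}0"
        i=$((i + 1))
    done
    printf '%s\n' "$ids"
}

# launch_line NODES RANKS_PER_NODE THREADS_PER_RANK
# Prints (does not execute) an aprun line in the form used above:
# -n is the total rank count, -N the ranks per node, and -d / -ntomp
# the OpenMP threads per rank. (Hypothetical helper.)
launch_line() {
    nodes=$1
    ranks_per_node=$2
    threads=$3
    printf 'aprun -n %d -N %d -d %d gmx_mpi -ntomp %d -gpu_id %s\n' \
        "$((nodes * ranks_per_node))" "$ranks_per_node" \
        "$threads" "$threads" "$(gpu_id_string "$ranks_per_node")"
}

# Example: 4 nodes, 2 ranks per node, 8 threads per rank
launch_line 4 2 8
```

With these inputs the helper reproduces the OpenMP example above; CRAY_CUDA_MPS still has to be set to 1 beforehand whenever more than one rank per node is used.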