GROMACS¶
GROMACS (GROningen MAchine for Chemical Simulations) is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.
Policy¶
GROMACS is available to users at HPC2N under the condition that published work includes a citation of the program. GROMACS is Free Software, available under the GNU General Public License.
Citations
Published work must include a citation of the program.
Principal Papers:
- Berendsen, et al. (1995) Comp. Phys. Comm. 91: 43-56. (DOI)
- Lindahl, et al. (2001) J. Mol. Model. 7: 306-317. (DOI)
- van der Spoel, et al. (2005) J. Comput. Chem. 26: 1701-1718. (DOI)
- Hess, et al. (2008) J. Chem. Theory Comput. 4: 435-447. (DOI)
- Pronk, et al. (2013) Bioinformatics 29: 845-854. (DOI)
- Páll, et al. (2015) Proc. of EASC 2015, LNCS 8759: 3-27. (DOI, arXiv)
- Abraham, et al. (2015) SoftwareX 1-2: 19-25. (DOI)
More information can be found in the GROMACS documentation.
Overview¶
GROMACS was first developed in Herman Berendsen's group, Department of Biophysical Chemistry, University of Groningen. It is a team effort, with contributions from several current and former developers all over the world.
It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.
GROMACS supports all the usual algorithms you expect from a modern molecular dynamics implementation.
Usage at HPC2N¶
On HPC2N we have GROMACS available as a module.
Loading¶
To use the GROMACS module, add it to your environment. You can find versions with
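ml spider GROMACS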
and you can then find how to load a specific version (including prerequisites), with
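# For example, to see how to load one of the versions listed by the previous command:
ml spider GROMACS/2024.1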
Loading the module should set all the needed environmental variables as well as the path.
Note that while the case does not matter when you use “ml spider”, it is necessary to match the case when loading the modules.
Note
Some versions are compiled only for CPUs and others are also compiled for GPUs. Normally, the CUDA-aware versions will have “CUDA” in the version-name.
Some versions are compiled with PLUMED, etc. This is also marked in the version-name.
Example: loading GROMACS version 2024 for GPUs¶
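For example, to load a CUDA-enabled GROMACS together with its prerequisites (the module versions below are the ones used in the GPU submit scripts further down; check ml spider for what is currently installed):

ml GCC/13.2.0 OpenMPI/4.1.6
ml GROMACS/2024.3-CUDA-12.4.0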
Setup and running¶
There are some differences between how you run GROMACS 4.x and how you run newer versions; the focus in this documentation is on newer versions. You can find manuals for older versions here: https://manual.gromacs.org/documentation/ and here: ftp://ftp.gromacs.org/pub/manual/
When you have loaded GROMACS and its prerequisites, you can find the executables etc. under the directory pointed to by the environment variable $EBROOTGROMACS.
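For instance, to list the installed executables (EasyBuild modules typically place them in the bin subdirectory):

ls $EBROOTGROMACS/bin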
To get some information about the particular version of GROMACS, run
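# Print version and build information (use gmx_mpi instead of gmx for the MPI-enabled build)
gmx --version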
In particular, it should not give “command not found” if the module was correctly loaded.
To run GROMACS, you first need to prepare various input files: .gro or .pdb for the molecular structure, .top for the topology, and .mdp for the main simulation parameters.
Setup
This example is for GROMACS 2024, but there is little difference between newer versions.
The following steps are needed to do your setup (adapted from the Gromacs homepage’s Getting started guide, found under Documentation -> User guide for your specific version of GROMACS):
- The molecular topology file is generated by the program gmx pdb2gmx, which translates a .pdb structure file of any peptide or protein to a molecular topology file. This topology file contains a complete description of all the interactions in your peptide or protein.
- When gmx pdb2gmx is executed to generate a molecular topology, it also translates the structure file (.pdb file) to a GROMOS structure file (.gro file). The main difference between a .pdb file and a GROMOS file is their format, and that a .gro file can also hold velocities. However, if you do not need the velocities, you can also use a .pdb file in all programs.
- To generate a box of solvent molecules around the peptide, the program gmx editconf should be used to define a box of appropriate size around the molecule. gmx solvate then solvates the solute molecule (the peptide) into any solvent. The output of gmx solvate is a GROMOS structure file of the solvated peptide. gmx solvate also changes the molecular topology file (generated by gmx pdb2gmx) to add solvent to the topology.
- The Molecular Dynamics Parameter (.mdp) file contains all information about the Molecular Dynamics simulation itself, e.g. time step, number of steps, temperature, pressure, etc. The easiest way of handling such a file is by adapting a sample .mdp file; a sample .mdp file is available from the GROMACS homepage.
- The next step is to combine the molecular structure (.gro file), topology (.top file), MD parameters (.mdp file) and, optionally, the index file (.ndx) to generate a run input file (.tpr extension). This file contains all information needed to start a simulation with GROMACS. The program gmx grompp processes all input files and generates the run input .tpr file.
- Once the run input file is available, we can start the simulation. The program which starts the simulation is called gmx mdrun (or gmx_mpi mdrun for the MPI-enabled build). The only input file of gmx mdrun that you usually need in order to start a run is the run input file (.tpr file). The typical output files of gmx mdrun are the trajectory file (.trr file), a logfile (.log file), and perhaps a checkpoint file (.cpt file).
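Putting these steps together, a minimal command-line sketch could look like the following. The file names (protein.pdb, md.mdp, md_0) are hypothetical placeholders, and choices such as the box size, water model, and force field (gmx pdb2gmx will prompt for one) are assumptions that you need to adapt to your own system:

# Hypothetical starting structure: protein.pdb
gmx pdb2gmx -f protein.pdb -o protein.gro -p topol.top          # generate topology; prompts for force field and water model
gmx editconf -f protein.gro -o boxed.gro -c -d 1.0 -bt cubic    # define a cubic box with 1.0 nm solute-box distance
gmx solvate -cp boxed.gro -cs spc216.gro -o solvated.gro -p topol.top   # fill the box with water and update the topology
gmx grompp -f md.mdp -c solvated.gro -p topol.top -o md_0.tpr   # combine structure, topology and parameters into a run input file
gmx mdrun -deffnm md_0                                          # run the simulation (writes md_0.log, md_0.trr, ...)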
Submit file examples¶
Below are some examples of how to run GROMACS (gmx mdrun or gmx_mpi mdrun) jobs on Kebnekaise.
Always note the number of cores per node (this depends on the node type; see The different parts of the batch system for more information) and the amount of memory per node (see the same page).
Note
You must run from the parallel file system.
Also:
- When using gmx mdrun (and gmx_mpi mdrun) it is important to specify the -ntomp option.
- If you do not, gmx(_mpi) mdrun will try to use all the cores on the node by adding multiple OpenMP threads to each (MPI) task.
- If the batch job does not have the whole node allocated (using --exclusive, --ntasks-per-node=<all-cores-on-that-node-type>, or other means), this will result in overallocation of the cores, resulting in severely reduced performance.

So, all examples below explicitly use -ntomp to avoid that situation.
Generic submit file for single or multi node job without GPUs
#!/bin/bash
# Change to your actual project id number (of the form: hpc2nXXXX-YYY, SNICXXX-YY-ZZ, or NAISSXXXX-YY-ZZ)
#SBATCH -A hpc2nXXXX-YYY
# Asking for 30 hours walltime
#SBATCH -t 30:00:00
# Use 14 tasks
#SBATCH -n 14
# Use 2 threads per task
#SBATCH -c 2
# It is always best to do a ml purge before loading modules in a submit file
ml purge > /dev/null 2>&1
# Load the module for GROMACS and its prerequisites.
# This is for GROMACS/2024.1 with no GPU support
ml GCC/13.2.0 OpenMPI/4.1.6 GROMACS/2024.1
# Automatic selection of single or multi node based GROMACS
if [ $SLURM_JOB_NUM_NODES -gt 1 ]; then
GMX="gmx_mpi"
MPIRUN="mpirun"
ntmpi=""
else
GMX="gmx"
MPIRUN=""
ntmpi="-ntmpi $SLURM_NTASKS"
fi
# Automatic selection of ntomp argument based on "-c" argument to sbatch
if [ -n "$SLURM_CPUS_PER_TASK" ]; then
ntomp="$SLURM_CPUS_PER_TASK"
else
ntomp="1"
fi
# Make sure to set OMP_NUM_THREADS equal to the value used for ntomp
# to avoid complaints from GROMACS
export OMP_NUM_THREADS=$ntomp
$MPIRUN $GMX mdrun $ntmpi -ntomp $ntomp -deffnm md_0
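To submit the job, save the script to a file (the name run_gromacs.sh below is just a hypothetical example) and hand it to the batch system with:

sbatch run_gromacs.sh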
Generic submit file for single or multi node job using GPUs
#!/bin/bash
# Change to your actual project id number (of the form: hpc2nXXXX-YYY, SNICXXX-YY-ZZ, or NAISSXXXX-YY-ZZ)
#SBATCH -A hpc2nXXXX-YYY
# Name of the job
#SBATCH -J Gromacs-gpu-job
# Asking for one hour of walltime
#SBATCH -t 01:00:00
# Use 4 tasks
#SBATCH -n 4
# Use 7 threads per task
#SBATCH -c 7
# Choose the total number of cores so that
# n x c = <cores on the node you picked> (on a single node, or multiples of <cores on the node you picked> for multi node runs)
# The example above is for a Skylake node with a V100 GPU. See the
# section [Different parts of the batch system](../../documentation/batchsystem/resources/) for types of nodes and cores on them
# Asking for 2 V100 GPU cards per node
#SBATCH -C v100
#SBATCH --gpus-per-node=2
# It is always best to do a ml purge before loading modules in a submit file
ml purge > /dev/null 2>&1
# Load the module for GROMACS and its prerequisites.
# This is for GROMACS/2024.3 with GPU support
ml GCC/13.2.0 OpenMPI/4.1.6
ml GROMACS/2024.3-CUDA-12.4.0
# Automatic selection of single or multi node based GROMACS
if [ $SLURM_JOB_NUM_NODES -gt 1 ]; then
GMX="gmx_mpi"
MPIRUN="mpirun"
ntmpi=""
else
GMX="gmx"
MPIRUN=""
ntmpi="-ntmpi $SLURM_NTASKS"
fi
# Automatic selection of ntomp argument based on "-c" argument to sbatch
if [ -n "$SLURM_CPUS_PER_TASK" ]; then
ntomp="$SLURM_CPUS_PER_TASK"
else
ntomp="1"
fi
# Make sure to set OMP_NUM_THREADS equal to the value used for ntomp
# to avoid complaints from GROMACS
export OMP_NUM_THREADS=$ntomp
$MPIRUN $GMX mdrun $ntmpi -ntomp $ntomp -deffnm md_0
Generic submit file for single or multi node job using GPUs with offloads
Starting from version 2018, other tasks besides the short-range nonbonded interactions, such as the long-range PME part, can be offloaded to GPUs.
#!/bin/bash
# Change to your actual local/SNIC/NAISS id project number (of the form: hpc2nXXXX-YYY, SNICXXX-YY-ZZ, or NAISSXXXX-YY-ZZ)
#SBATCH -A hpc2nXXXX-YYY
#SBATCH -t 00:05:00
#SBATCH -N 1
# Use 4 tasks
#SBATCH -n 4
# Use 7 threads per task
#SBATCH -c 7
# Choose the total number of cores so that
# n x c = <cores on the node you picked> (on a single node, or
# multiples of <cores on the node you picked> for multi node runs)
# The example above is for a Skylake node with a V100 GPU. See the
# section "Different parts of the batch system" for types of nodes
# and cores on them
# Asking for 2 V100 GPU cards per node
#SBATCH -C v100
#SBATCH --gpus-per-node=2
# It is always best to do a ml purge before loading modules in a submit file
ml purge > /dev/null 2>&1
ml GCC/13.2.0 OpenMPI/4.1.6
ml GROMACS/2024.3-CUDA-12.4.0
# Automatic selection of single or multi node based GROMACS
if [ $SLURM_JOB_NUM_NODES -gt 1 ]; then
GMX="gmx_mpi"
MPIRUN="mpirun"
ntmpi=""
else
GMX="gmx"
MPIRUN=""
ntmpi="-ntmpi $SLURM_NTASKS"
fi
# Automatic selection of ntomp argument based on "-c" argument to sbatch
if [ -n "$SLURM_CPUS_PER_TASK" ]; then
ntomp="$SLURM_CPUS_PER_TASK"
else
ntomp="1"
fi
# Make sure to set OMP_NUM_THREADS equal to the value used for ntomp
# to avoid complaints from GROMACS
export OMP_NUM_THREADS=$ntomp
$MPIRUN $GMX mdrun -gputasks 0123 -nb gpu -pme gpu -npme 1 $ntmpi -ntomp $ntomp -dlb yes -v -deffnm script_name
In the above example, four tasks (3 related to the PP interactions and 1 related to the long-range PME part) will be distributed over the 4 GPU engines.
Notice that the PME long-range part should be handled by only one rank (-npme 1). More information about the tasks that can be offloaded can be found in the GROMACS documentation.
Note
We have more information about batchscripts in the section about batch systems.
Comparisons and benchmarks¶
A comparison of runs on the various types of GPU nodes on Kebnekaise is displayed below.
- The figure below shows the best performance of GROMACS on a single node, obtained by varying the input parameters (number of MPI processes and number of OpenMP threads).
- The benchmark case consisted of 158944 particles, using a 1 fs time step and a 1.2 nm cutoff for the real-space electrostatics calculations.
- Particle mesh Ewald was used to solve long-range electrostatic interactions.
We performed a benchmark of GROMACS on the different Nvidia GPUs that are available on Kebnekaise using the batch script job-gpu-gromacs.sh. The results can be seen in the following plot. The labels 1, 2, and 3 refer to the three different, common options for running GROMACS used in this batch job. A dashed red line at 25 ns/day is added for better visualization. The full example can be found here.
Additional info¶
Documentation is available on the Gromacs documentation page and the Gromacs Online Reference page.