mpi4py¶
MPI for Python (mpi4py) provides bindings of the Message Passing Interface (MPI) standard for the Python programming language, allowing any Python program to exploit multiple processors.
Policy¶
mpi4py is freely available to users at HPC2N.
Citations
If MPI for Python has been significant to a project that leads to an academic publication, please acknowledge that fact by citing the project.
- M. Rogowski, S. Aseeri, D. Keyes, and L. Dalcin, mpi4py.futures: MPI-Based Asynchronous Task Execution for Python, IEEE Transactions on Parallel and Distributed Systems, 34(2):611-622, 2023. https://doi.org/10.1109/TPDS.2022.3225481
- L. Dalcin and Y.-L. L. Fang, mpi4py: Status Update After 12 Years of Development, Computing in Science & Engineering, 23(4):47-54, 2021. https://doi.org/10.1109/MCSE.2021.3083216
- L. Dalcin, P. Kler, R. Paz, and A. Cosimo, Parallel Distributed Computing using Python, Advances in Water Resources, 34(9):1124-1139, 2011. https://doi.org/10.1016/j.advwatres.2011.04.013
- L. Dalcin, R. Paz, M. Storti, and J. D’Elia, MPI for Python: performance improvements and MPI-2 extensions, Journal of Parallel and Distributed Computing, 68(5):655-662, 2008. https://doi.org/10.1016/j.jpdc.2007.09.005
- L. Dalcin, R. Paz, and M. Storti, MPI for Python, Journal of Parallel and Distributed Computing, 65(9):1108-1115, 2005. https://doi.org/10.1016/j.jpdc.2005.03.010
Overview¶
The MPI for Python package.
The Message Passing Interface (MPI) is a standardized and portable message-passing system designed to function on a wide variety of parallel computers. The MPI standard defines the syntax and semantics of library routines and allows users to write portable programs in the main scientific programming languages (Fortran, C, or C++). Since its release, the MPI specification has become the leading standard for message-passing libraries for parallel computers.
MPI for Python provides MPI bindings for the Python programming language, allowing any Python program to exploit multiple processors. This package builds on the MPI specification and provides an object-oriented interface that closely follows the MPI-2 C++ bindings.
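As a minimal sketch of what this interface looks like, the following "hello world" queries the default communicator for the rank of the calling process and the total number of processes (the filename `hello_mpi.py` is just an example):

```python
from mpi4py import MPI

# The communicator containing all processes started by mpirun
comm = MPI.COMM_WORLD

# Rank (id) of this process and total number of processes
rank = comm.Get_rank()
size = comm.Get_size()

print(f"Hello from rank {rank} of {size}")
```

Started with, for instance, `mpirun -np 4 python hello_mpi.py`, each of the four processes prints its own rank.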
Usage at HPC2N¶
On HPC2N, newer versions of mpi4py (3.1.4 and above) are available as a separate module.
Note
Versions 3.1.3 and older are part of the SciPy-bundle modules and are loaded with them.
This page is about loading the separate mpi4py module.
Loading¶
To use the mpi4py module, add it to your environment. You can list the available versions, and then see how to load a specific version (including its prerequisites), with the commands below.
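A sketch of the usual Lmod commands for this, assuming an HPC2N-style module environment (the version `3.1.5` is used as an example):

```shell
# List available mpi4py versions
ml spider mpi4py

# Show how to load a specific version, including its prerequisite modules
ml spider mpi4py/3.1.5
```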
Running¶
You should run MPI jobs through the batch system.
Example
Here follows an example of how to run an MPI Python code through a batch job. In this example we load mpi4py 3.1.5 and its prerequisites.
You can find an example Python script using mpi4py below as well.
```bash
#!/bin/bash
#SBATCH -A hpc2nXXXX-YYY
#SBATCH -t 00:05:00        # wall time - change as you need
#SBATCH -n 4               # Change number of tasks as you want
#SBATCH -o output_%j.out   # output file
#SBATCH -e error_%j.err    # error messages

ml purge > /dev/null 2>&1  # Clean the module environment
module load GCC/13.2.0 OpenMPI/4.1.6
module load mpi4py/3.1.5

mpirun -np 4 python integration2d_mpi.py   # -np should match the number of tasks
```
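Assuming the batch script above is saved as, for example, `job_mpi4py.sh` (a hypothetical filename), it is submitted to the batch system with:

```shell
sbatch job_mpi4py.sh
```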
```python
from mpi4py import MPI
import math
from time import perf_counter

# MPI communicator
comm = MPI.COMM_WORLD
# MPI size of communicator
numprocs = comm.Get_size()
# MPI rank of each process
myrank = comm.Get_rank()

# grid size
n = 10000

def integration2d_mpi(n, numprocs, myrank):
    # interval size (same for X and Y)
    h = math.pi / float(n)
    # cumulative variable
    mysum = 0.0
    # workload for each process
    workload = n / numprocs
    begin = int(workload * myrank)
    end = int(workload * (myrank + 1))
    # regular integration in the X axis
    for i in range(begin, end):
        x = h * (i + 0.5)
        # regular integration in the Y axis
        for j in range(n):
            y = h * (j + 0.5)
            mysum += math.sin(x + y)
    partial_integrals = h**2 * mysum
    return partial_integrals

if __name__ == "__main__":
    starttime = perf_counter()
    p = integration2d_mpi(n, numprocs, myrank)
    # MPI reduction (the summed result is only available on the root rank)
    integral = comm.reduce(p, op=MPI.SUM, root=0)
    endtime = perf_counter()

    if myrank == 0:
        print("Integral value is %e, Error is %e" % (integral, abs(integral - 0.0)))
        print("Time spent: %.2f sec" % (endtime - starttime))
```
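The analytic value of this double integral is 0 (the positive and negative lobes of sin(x + y) over [0, π] × [0, π] cancel exactly), which is why the script reports `abs(integral - 0.0)` as the error. A serial sketch of the same midpoint rule, useful for checking the MPI result on a single core:

```python
import math

def integration2d_serial(n):
    # Midpoint rule for sin(x + y) on [0, pi] x [0, pi];
    # the exact value of the integral is 0
    h = math.pi / n
    total = 0.0
    for i in range(n):
        x = h * (i + 0.5)
        for j in range(n):
            y = h * (j + 0.5)
            total += math.sin(x + y)
    return h**2 * total

print(integration2d_serial(100))  # close to 0
```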
Additional info¶
More information can be found on: