The number of downloads has crossed 1.8 million!! The number of organizations using MVAPICH libraries has crossed 3,400 in 92 countries!! The MVAPICH team would like to express thanks to all these organizations and their users!!


MVAPICH Delivers Sub-minute (22 sec) Job Startup for 229,376 processes!! (Details)


MVAPICH@SC '24


MVAPICH@HotI '24


MVAPICH@PEARC '24



Welcome to the home page of the MVAPICH project, led by the Network-Based Computing Laboratory (NBCL) of The Ohio State University. The MVAPICH software, based on the MPI 4.1 standard, delivers the best performance, scalability, and fault tolerance for high-end computing systems and servers using InfiniBand, Omni-Path, Ethernet/iWARP, RoCE (v1/v2), Cray Slingshot, and Rockport Networks networking technologies. This software is being used by more than 3,425 organizations in 92 countries worldwide to extract the potential of these emerging networking technologies for modern systems. As of Nov '24, more than 1,841,000 downloads have taken place from this project's site. This software is also being distributed by many vendors as part of their software distributions.

The MVAPICH2 software family is ABI compatible with the version of MPICH it is based on. Please refer to our download page for more details.
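As a minimal illustration, the hello-world below is ordinary MPI code with no MVAPICH-specific calls; because of the ABI compatibility noted above, a binary built against the corresponding MPICH release should also run when the runtime linker is pointed at an MVAPICH2 installation's libraries (this sketch is not taken from the MVAPICH sources; file names and build commands here are assumptions).

    /* hello_mvapich.c: a minimal MPI program (illustrative sketch). */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);                 /* start the MPI runtime  */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank    */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of ranks  */
        printf("Hello from rank %d of %d\n", rank, size);
        MPI_Finalize();
        return 0;
    }

Assuming the mpicc and mpirun wrappers shipped with an MVAPICH2 install are on the PATH, this would typically be built with "mpicc hello_mvapich.c -o hello" and launched with "mpirun -np 4 ./hello"; installation paths and launcher options vary by site.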

The MVAPICH software is powering several supercomputers in the TOP500 list. Examples (from the November '24 ranking) include:

  • 15th, 10,649,600-core (Sunway TaihuLight) at National Supercomputing Center in Wuxi, China
  • 52nd, 448,448 cores (Frontera) at TACC
  • 72nd, 288,288 cores (Lassen) at LLNL

The MVAPICH group provides several software libraries as listed below.

High-Performance Parallel Programming Libraries

MVAPICH2: Support for InfiniBand, Omni-Path, Ethernet/iWARP, RoCE, Slingshot 10, and Rockport Networks
MVAPICH2-Azure: Optimized support for the Microsoft Azure platform with InfiniBand
MVAPICH-Plus: Advanced MPI with unified MVAPICH2-GDR and MVAPICH2-X features for HPC, DL, ML, Big Data, and Data Science applications
MVAPICH2-X: Advanced MPI features/support (UMR, ODP, DC, Core-Direct, SHArP, XPMEM), OSU INAM (InfiniBand Network Monitoring and Analysis), PGAS (OpenSHMEM, UPC, UPC++, and CAF), and MPI+PGAS programming models with a unified communication runtime
MVAPICH2-X-AWS: Advanced MPI features (SRD and XPMEM) with support for the Amazon Elastic Fabric Adapter (EFA)
MVAPICH2-GDR: Optimized MPI for clusters with NVIDIA and AMD GPUs, and for GPU-enabled Deep Learning applications
MVAPICH2-J: Java bindings for the MVAPICH2 family of MPI libraries
MVAPICH2-Virt: Hypervisor- and container-based (Docker and Singularity) HPC cloud with MPI and IB (SR-IOV)
MVAPICH2-EA: Energy-aware and high-performance MPI

Microbenchmarks

OMB: Microbenchmark suite to evaluate MPI and PGAS (OpenSHMEM, UPC, and UPC++) libraries for CPUs and GPUs
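For readers unfamiliar with what these microbenchmarks measure, the sketch below is a simplified ping-pong latency loop in the spirit of OMB's point-to-point tests; it is not the actual OMB source, and the message size and iteration count are arbitrary illustrative choices.

    /* pingpong.c: simplified latency measurement between ranks 0 and 1
     * (a sketch of what OMB-style point-to-point benchmarks do, not OMB itself).
     * Run with at least 2 ranks, e.g. mpirun -np 2 ./pingpong */
    #include <mpi.h>
    #include <stdio.h>

    #define MSG_SIZE 8        /* bytes per message (arbitrary)   */
    #define ITERS    10000    /* timed round trips (arbitrary)   */

    int main(int argc, char **argv)
    {
        char buf[MSG_SIZE] = {0};
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < ITERS; i++) {
            if (rank == 0) {
                MPI_Send(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0)
            printf("avg one-way latency: %.2f us\n",
                   (t1 - t0) * 1e6 / (2.0 * ITERS));
        MPI_Finalize();
        return 0;
    }

The real OMB binaries (for example, osu_latency and osu_bw) sweep message sizes, handle warm-up iterations, and cover GPU buffers and PGAS models, which this sketch omits.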

Tools

OSU INAM: Network monitoring, profiling, and analysis for clusters with MPI and scheduler integration
OEMT: Utility to measure the energy consumption of MPI applications

This project is supported by funding from the U.S. National Science Foundation, U.S. DOE Office of Science, U.S. Department of Defense, Ohio Board of Regents, Ohio Department of Development, arm, Cisco Systems, Cray, Intel, Linux Networx, Mellanox, Microsoft, NVIDIA, Pattern Computer, QLogic, ROCKPORT, and Sun Microsystems; and by equipment donations from Advanced Clustering, AMD, Appro, arm, Broadcom, Chelsio, Dell, Fulcrum, Fujitsu, Intel, Mellanox, Microway, NetEffect, Pattern Computer, QLogic, ROCKPORT, and Sun. Other technology partners include TotalView.

Announcements



(NEW) MVAPICH-Plus 4.0rc with support for unified MVAPICH2-GDR and MVAPICH2-X features is available. Features of this release include: support for various HPC fabrics (InfiniBand, Slingshot, Omni-Path, OPX, RoCE, and Ethernet/iWARP), various CPUs (x86, ARM, and OpenPOWER), and various GPUs (NVIDIA, AMD, and Intel), CMA and cooperative protocol support for intra-node pt2pt communication, CUDA-aware MPI (pt2pt and collective) support, IPC-based support for collectives on Intel GPU, on-the-fly compression support for collectives for NVIDIA and AMD GPUs, and on-the-fly compression support for point-to-point communication for NVIDIA GPUs. This new release series is targeted to provide optimized support for modern platforms (CPU, GPU, and interconnects) for HPC, Deep Learning, Machine Learning, Big Data, and Data Science applications. [more]
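The fragment below sketches what the CUDA-aware point-to-point support mentioned above lets an application do: pass a GPU (device) pointer directly to MPI_Send/MPI_Recv instead of staging data through host memory first. It assumes NVIDIA GPUs, the CUDA runtime API, and a CUDA-aware build of the library; the buffer size and peer ranks are illustrative and the code is not taken from the MVAPICH sources.

    /* cuda_aware_pt2pt.c: sending directly from GPU memory with a CUDA-aware MPI
     * (illustrative sketch; run with at least 2 ranks). */
    #include <mpi.h>
    #include <cuda_runtime.h>

    int main(int argc, char **argv)
    {
        const int n = 1 << 20;   /* 1M floats (illustrative) */
        int rank;
        float *d_buf;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        cudaMalloc((void **)&d_buf, n * sizeof(float));   /* device allocation */

        /* With a CUDA-aware MPI, the device pointer can be passed directly;
         * the library moves the data GPU-to-GPU or GPU-to-NIC internally. */
        if (rank == 0)
            MPI_Send(d_buf, n, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(d_buf, n, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        cudaFree(d_buf);
        MPI_Finalize();
        return 0;
    }

This would typically be built with the library's mpicc plus the CUDA include and library paths; with a non-CUDA-aware MPI build, the same transfer would require explicit cudaMemcpy staging through host buffers.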

(NEW) Join us for the upcoming tutorials at Supercomputing 2024: 1. High-Performance and Smart Networking Technologies for HPC and AI, 2. Principles and Practice of High-Performance Deep Learning/Machine Learning Training and Inference, and 3. Scalable Big Data Processing on High-Performance Computing Systems. More Details

(NEW) OMB 7.5 with new benchmarks supporting OpenSHMEM, partitioned point-to-point, neighborhood collectives, and Intel GPUs is available. [more]

MPI4Spark 0.3 (based on Apache Spark 3.3.0) with support for an MPI-based communication runtime on high-performance networks (InfiniBand, OPA, RoCE, and Slingshot) to accelerate Spark-based applications and support for the YARN cluster manager is available. [more]

The 12th Annual MVAPICH User Group (MUG) Conference was held successfully in a hybrid manner on August 19-21, 2024 with more than 220 attendees.

Spack and Docker versions of MVAPICH 3.0 GA are available. [more]

MVAPICH 3.0 GA with support for optimized intra-node shared-memory/kernel-based communication; unified CVAR interface; enhanced PVAR interface; CH4 channel; OFI support for HPE/Cray Slingshot 11, Cornelis Networks Omni-Path Express (OPX), and Intel PSM3; and UCX support for InfiniBand and RoCE is available. [more]
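The CVAR and PVAR interfaces mentioned above are exposed through the standard MPI_T tools interface; the sketch below enumerates the control variables (CVARs) that a given build reports, which is one way to see what is tunable without consulting the documentation. It uses only standard MPI_T calls and is not MVAPICH-specific; the names and counts it prints depend on the MPI build.

    /* list_cvars.c: enumerate control variables via the standard MPI_T interface
     * (generic sketch; typically run with a single process). */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided, ncvars;
        MPI_T_init_thread(MPI_THREAD_SINGLE, &provided);  /* init tools interface */
        MPI_Init(&argc, &argv);

        MPI_T_cvar_get_num(&ncvars);
        for (int i = 0; i < ncvars; i++) {
            char name[256], desc[1024];
            int name_len = sizeof(name), desc_len = sizeof(desc);
            int verbosity, bind, scope;
            MPI_Datatype dtype;
            MPI_T_enum enumtype;
            MPI_T_cvar_get_info(i, name, &name_len, &verbosity, &dtype,
                                &enumtype, desc, &desc_len, &bind, &scope);
            printf("%d: %s\n", i, name);   /* print each control variable's name */
        }

        MPI_Finalize();
        MPI_T_finalize();
        return 0;
    }

Performance variables (PVARs) are enumerated the same way through the corresponding MPI_T_pvar_* calls.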

ParaInfer-X v1.0 with MPI- and NCCL-based support for fast parallel inference of various large language models (GPT-J and LLaMA), persistent model inference stream, temporal fusion/in-flight batching of multiple requests, multi-GPU tensor parallelism, asynchronous memory reordering for evicting finished requests, and support for float32, float16, and bfloat16 for model inference is available. [more]

MPI4DL 0.6 with support for distributed and accelerated training framework for very high-resolution images that integrates Spatial Parallelism, Layer Parallelism, and Pipeline Parallelism is available. [more]

HiDL 1.0 (based on Horovod) with support for TensorFlow, PyTorch, Keras and MXNet, built on top of MVAPICH2-GDR and MVAPICH2-X, providing large-scale distributed deep learning support for clusters with NVIDIA and AMD GPUs is available. [more]

MPI4cuML 0.5 (based on cuML 22.02.00) with support for RAFT 22.02.00, C++ and Python APIs, built on top of mpi4py over the MVAPICH2-GDR library, and handles for Python cuML applications (KMeans, PCA, tSVD, RF, and LinearModels) to use the MVAPICH2-GDR backend is available. [more]

OSU INAM 1.0 (OSU InfiniBand Network Analysis and Monitoring Tool) with support for data logging progress bars on the UI for all charts, asynchronous calls for data loading, detailed debugging levels for the INAM daemon, and features in conjunction with MVAPICH2-X 2.3 is available. Click here for more details!

MVAPICH2-X-AWS 2.3.7 (based on MVAPICH2-X) with direct support for the Amazon EFA adapter, improved inter-node latency and bandwidth performance, initial support for AWS hpc6a/c6a instances, support and performance optimization for AWS c6g/c7g instances with Amazon Graviton 2/3 ARM processors, support for the rdma_read feature on p4d instance types, and support for currently available basic OS types on AWS EC2 (Amazon Linux 1/2, CentOS 6/7, and Ubuntu 18.04/20.04) is available. [more]

MVAPICH2-J 2.3.7 GA (based on MVAPICH2 2.3.7 GA) with support for Java bindings to the MVAPICH2 family of libraries, support for communicating data from basic Java data types as well as direct ByteBuffers from the Java New I/O (NIO) package is available. [more]

MVAPICH2-GDR 2.3.7 GA (based on MVAPICH2 2.3.7 GA) with support for on-the-fly compression of point-to-point GPU-GPU communication for NVIDIA GPUs; hybrid communication protocols using NCCL-based, CUDA-based, and IB verbs-based primitives for blocking and non-blocking collective operations; full support for NVIDIA DGX, NVIDIA DGX V-100, NVIDIA DGX A-100 systems, and AMD systems with Mi100 GPUs; support for Slingshot-10 interconnect; optimized support for HPC, deep learning, machine learning, and data science workloads and multiple bug fixes is available. [more]

MVAPICH2 2.3.7 GA with support for Cray Slingshot 10 and Rockport's switchless networks, enhanced support for blocking and non-blocking collective offload using Mellanox SHARP, and multiple bug fixes is available. [more]

Partnership and contribution to the NSF-Awarded $20M AI-Institute on Intelligent CyberInfrastructure (ICICLE). Details.

MPI4Dask 0.2 (based on Dask Distributed 2021.01.0) with support for MPI-based communication in Dask for a cluster of CPUs and GPUs, built on top of mpi4py over the MVAPICH2, MVAPICH2-X, and MVAPICH2-GDR library, starting execution of Dask programs using Dask-MPI, compliant with user-level Dask APIs and packages is available. [more]

MVAPICH2-X 2.3 GA with optimized support for large message MPI_Allreduce and MPI_Reduce, improved communication performance using DC transport, optimized point-to-point and collective communication support for AWS EFA adapter and SRD transport protocol, availability of multiple MPI_T PVARs and CVARs, support for hybrid MPI+OpenSHMEM; optimized communication performance for AMD (EPYC), ARM, Intel and OpenPOWER platforms, and support for INAM 0.9.6 is available. [more]