MVAPICH :: Home

The number of downloads has crossed 1.91 million!! The number of organizations using MVAPICH libraries has crossed 3,450 in 92 countries!! The MVAPICH team would like to express thanks to all these organizations and their users!!

MVAPICH Delivers Sub-minute (22 sec) Job Startup for 229,376 processes!! (Details)

MVAPICH@ISC-HPC

Welcome to the home page of the MVAPICH project, led by the Network-Based Computing Laboratory (NBCL) of The Ohio State University. The MVAPICH software, based on the MPI 4.1 standard, delivers the best performance, scalability and fault tolerance for high-end computing systems and servers using InfiniBand, Omni-Path, Ethernet/iWARP, RoCE (v1/v2), Cray Slingshot, and Rockport Networks networking technologies. This software is being used by more than 3,450 organizations in 92 countries worldwide to extract the potential of these emerging networking technologies for modern systems. As of Jul '25, more than 1,921,000 downloads have taken place from this project's site. This software is also being distributed by many vendors as part of their software distributions.

The MVAPICH software family is ABI compatible with the version of MPICH it is based on. Please refer to our download page for more details.

The MVAPICH software is powering several supercomputers in the TOP500 list. Examples (from the June '25 ranking) include:

21st, 10,649,600 cores (Sunway TaihuLight) at National Supercomputing Center in Wuxi, China
67th, 448,448 cores (Frontera) at TACC
88th, 288,288 cores (Lassen) at LLNL

The MVAPICH group provides several software libraries as listed below.

High-Performance Parallel Programming Libraries
MVAPICH	Support for InfiniBand, Slingshot, Omni-Path, Ethernet/iWARP, RoCE, and Rockport Networks
MVAPICH-Plus	Advanced MPI for GPU clusters with converged software stack for unified HPC, DL, ML, Big Data and Data Science applications. This new series replaces the older MVAPICH2-GDR and MVAPICH2-X stacks.
MVAPICH2-Azure	Optimized Support for Microsoft Azure Platform with InfiniBand
MVAPICH2-X	Advanced MPI features/support (UMR, ODP, DC, Core-Direct, SHArP, XPMEM), OSU INAM (InfiniBand Network Monitoring and Analysis), PGAS (OpenSHMEM, UPC, UPC++, and CAF), and MPI+PGAS programming models with unified communication runtime
MVAPICH2-X-AWS	Advanced MPI features (SRD and XPMEM) with support for Amazon Elastic Fabric Adapter (EFA)
MVAPICH2-GDR	Optimized MPI for clusters with NVIDIA GPUs, AMD GPUs, and for GPU-enabled Deep Learning Applications
MVAPICH2-J	Java bindings for the MVAPICH2 family of MPI libraries
MVAPICH2-Virt	Hypervisor and container based (Docker and Singularity) HPC cloud with MPI & IB (SR-IOV)
MVAPICH2-EA	Energy aware and High-performance MPI

Microbenchmarks
OMB	Microbenchmarks suite to evaluate MPI and PGAS (OpenSHMEM, UPC, and UPC++) libraries for CPUs and GPUs

Tools
OSU INAM	Network monitoring, profiling, and analysis for clusters with MPI and scheduler integration
OEMT	Utility to measure the energy consumption of MPI applications

This project is supported by funding from U.S. National Science Foundation, U.S. DOE Office of Science, U.S. Department of Defense, Ohio Board of Regents, Ohio Department of Development, arm, Cisco Systems, Cray, Intel, Linux Networx, Mellanox, Microsoft, NVIDIA, Pattern Computer, QLogic, ROCKPORT, and Sun Microsystems; and equipment donations from Advanced Clustering, AMD, Appro, arm, Broadcom, Chelsio, Dell, Fulcrum, Fujitsu, Intel, Mellanox, Microway, NetEffect, Pattern Computer, QLogic, ROCKPORT and Sun. Other technology partners include TotalView.

Announcements

(NEW) MUG '25 Preliminary Program is now available!! Click here for more details.

(NEW) OMB 7.5.1 with support for fine-grained iteration control, automatic power-of-2 calculations in message sizes, and several bug fixes is available. [more]

(NEW) HiDL 2.0 (a vendor-neutral stack) with support for PyTorch 2.0 and later versions is available. It is built on top of the MVAPICH-Plus MPI back-end and provides large-scale Distributed Data Parallel (DDP) training for clusters with NVIDIA and AMD GPUs, without the need for any vendor-supported collective communication library. [more]

MVAPICH-Plus 4.1rc with support for unified MVAPICH2-GDR and MVAPICH2-X features is available. Features of this release include: support for various HPC fabrics (InfiniBand, Slingshot, Omni-Path, OPX, RoCE, and Ethernet/iWARP), various CPUs (x86, ARM, and OpenPOWER), and various GPUs (NVIDIA, AMD, and Intel); adaptive dynamic collective tuning; optimized HIP kernel-based collective performance for AMD GPUs; optimized algorithms for GPU collectives; dynamic GPU initialization after MPI_Init; unified GPU memory models for AMD MI300A APUs; improved rndv protocol performance in point-to-point operations; improved MPI_T PVAR support; and multiple bug fixes. This new release series is targeted to provide optimized support for modern platforms (CPU, GPU, and interconnects) for HPC, Deep Learning, Machine Learning, Big Data, and Data Science applications. [more]

The 13th Annual MVAPICH User Group (MUG) Conference will be held on August 18-20, 2025 in Columbus, Ohio, USA. Click here for details.

Spack and Docker versions of MVAPICH 4.0 GA are available. [more]

MVAPICH 4.0 GA with support for UCX, Libfabric, and an enhanced OFI provider for IB systems, "mverbs;ofi_ucr"; support for major CPUs (x86-Intel, x86-AMD, and ARM), major interconnects (IB, Slingshot, OPX, Omni-Path, RoCE, and Ethernet/iWARP), and major GPUs (from NVIDIA, AMD, and Intel); optimized support for pt2pt inter-node and intra-node communication; CMA support for intra-node pt2pt CPU communication; and optimized algorithms for CPU-based collectives is available. [more]

MPI4Spark 0.3 (based on Apache Spark 3.3.0) with support for MPI-based communication runtime on high-performance networks (InfiniBand, OPA, RoCE, and Slingshot) to accelerate Spark-based applications and support for the YARN cluster manager is available. [more]

The 12th Annual MVAPICH User Group (MUG) Conference was held successfully in a hybrid manner on August 19-21, 2024 with more than 220 attendees. Slides and videos of the presentations are available from here.

ParaInfer-X v1.0 with MPI and NCCL-based support for fast parallel inference of various large language models (GPT-J and LLaMA), persistent model inference stream, temporal fusion/in-flight batching of multiple requests, multiple GPU tensor parallelism, asynchronous memory reordering for evicting finished requests, and support for float32, float16, and bfloat16 for model inference is available. [more]

MPI4DL 0.6 with support for distributed and accelerated training framework for very high-resolution images that integrates Spatial Parallelism, Layer Parallelism, and Pipeline Parallelism is available. [more]

MPI4cuML 0.5 (based on cuML 22.02.00) with support for RAFT 22.02.00, C++ and Python APIs, built on top of mpi4py over the MVAPICH2-GDR library, and handles for using the MVAPICH2-GDR backend in Python cuML applications (KMeans, PCA, tSVD, RF, and LinearModels) is available. [more]

OSU InfiniBand Network Analysis and Monitoring (INAM) Tool 1.0 with support for data logging progress bars on the UI for all charts, asynchronous calls for data loading, detailed debugging levels for the INAM daemon, and features in conjunction with MVAPICH2-X 2.3 is available. Click here for more details!

MVAPICH2-X-AWS 2.3.7 based on MVAPICH2-X, direct support for Amazon EFA adapter, improved inter-node latency and bandwidth performance, initial support for AWS hpc6a/c6a instances, support and performance optimization for AWS c6g/c7g with Amazon Graviton 2/3 ARM processors, support for rdma_read feature for p4d instance types, support for currently available basic OS types on AWS EC2 including: Amazon Linux 1/2, CentOS 6/7, Ubuntu 18.04/20.04 is available. [more]

MVAPICH2-J 2.3.7 GA (based on MVAPICH2 2.3.7 GA) with support for Java bindings to the MVAPICH2 family of libraries, support for communicating data from basic Java data types as well as direct ByteBuffers from the Java New I/O (NIO) package is available. [more]

MVAPICH2-GDR 2.3.7 GA (based on MVAPICH2 2.3.7 GA) with support for on-the-fly compression of point-to-point GPU-GPU communication for NVIDIA GPUs; hybrid communication protocols using NCCL-based, CUDA-based, and IB verbs-based primitives for blocking and non-blocking collective operations; full support for NVIDIA DGX, NVIDIA DGX V-100, NVIDIA DGX A-100 systems, and AMD systems with MI100 GPUs; support for Slingshot-10 interconnect; optimized support for HPC, deep learning, machine learning, and data science workloads; and multiple bug fixes is available. [more]

MVAPICH2 2.3.7 GA with support for Cray Slingshot 10, Rockport's switchless networks, enhanced support for blocking and non-blocking collective offload using Mellanox SHARP, and multiple bug fixes is available. [more]

Partnership and contribution to the NSF-Awarded $20M AI-Institute on Intelligent CyberInfrastructure (ICICLE). Details.

MPI4Dask 0.2 (based on Dask Distributed 2021.01.0) with support for MPI-based communication in Dask for a cluster of CPUs and GPUs, built on top of mpi4py over the MVAPICH2, MVAPICH2-X, and MVAPICH2-GDR libraries, starting execution of Dask programs using Dask-MPI, and compliance with user-level Dask APIs and packages is available. [more]

MVAPICH2-X 2.3 GA with optimized support for large message MPI_Allreduce and MPI_Reduce, improved communication performance using DC transport, optimized point-to-point and collective communication support for AWS EFA adapter and SRD transport protocol, availability of multiple MPI_T PVARs and CVARs, support for hybrid MPI+OpenSHMEM; optimized communication performance for AMD (EPYC), ARM, Intel and OpenPOWER platforms, and support for INAM 0.9.6 is available. [more]

MVAPICH: MPI over InfiniBand, Omni-Path, Ethernet/iWARP, RoCE, and Slingshot
