MVAPICH2-GDR 2.0b Features

Features for supporting GPU-GPU communication on clusters with NVIDIA GPUs.

MVAPICH2-GDR 2.0b derives from MVAPICH2 2.0b, an MPI-3 implementation based on the MPICH ADI3 layer. All features available with the OFA-IB-CH3 channel of MVAPICH2 2.0b are also available in this release. In addition, MVAPICH2-GDR 2.0b offers features that take advantage of GPUDirect RDMA technology for inter-node communication between NVIDIA GPUs on clusters with Mellanox InfiniBand adapters. The features supporting MPI communication from NVIDIA GPU device memory are listed below.
  • High-performance RDMA-based inter-node point-to-point communication (GPU-GPU, GPU-Host and Host-GPU) using GPUDirect RDMA and pipelining (see the first sketch after this list)
  • Multi-rail support for inter-node point-to-point GPU communication
  • High-performance intra-node point-to-point communication for nodes with multiple GPU adapters (GPU-GPU, GPU-Host and Host-GPU) using CUDA IPC and pipelining
  • Automatic communication channel selection for different GPU communication modes (DD, HH and HD) in different configurations (intra-IOH and inter-IOH)
  • Optimized and tuned support for collective communication from GPU buffers
  • Enhanced designs for Alltoall and Allgather collective communication from GPU device buffers
  • Efficient processing of vector and hindexed datatypes on GPU buffers
  • Dynamic CUDA initialization, allowing GPU device selection after MPI_Init (see the second sketch after this list)
  • Support for non-blocking streams in asynchronous CUDA transfers for better overlap
  • Efficient synchronization using CUDA Events for pipelined device data transfers
  • Support for running on heterogeneous clusters with GPU and non-GPU nodes
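As a minimal sketch of how an application uses the point-to-point support from GPU device memory, the code below passes device buffers allocated with cudaMalloc directly to MPI_Send/MPI_Recv; whether the transfer uses GPUDirect RDMA or a pipelined staging path is decided by the library, not the application. This assumes a CUDA-aware build such as MVAPICH2-GDR with GPU support enabled at run time; buffer names and sizes are illustrative.

/*
 * Sketch: inter-node point-to-point communication directly from GPU device
 * memory with a CUDA-aware MPI library. Rank 0 sends a device buffer to
 * rank 1; the library chooses GPUDirect RDMA or pipelining internally.
 * Error checking is omitted for brevity.
 */
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>

#define N (1 << 20)   /* number of doubles exchanged (illustrative) */

int main(int argc, char **argv)
{
    int rank;
    double *d_buf;    /* device buffer passed directly to MPI calls */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    cudaMalloc((void **)&d_buf, N * sizeof(double));

    if (rank == 0) {
        cudaMemset(d_buf, 0, N * sizeof(double));
        MPI_Send(d_buf, N, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(d_buf, N, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d doubles into device memory\n", N);
    }

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}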
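Because CUDA is initialized dynamically, an application may pick its GPU after MPI_Init. The sketch below derives a node-local rank with the standard MPI-3 MPI_Comm_split_type call and binds each rank on a node to a different device in round-robin fashion; the identifier names are illustrative, and MVAPICH2-GDR itself may offer additional mechanisms for device binding.

/*
 * Sketch: selecting a GPU after MPI_Init. The node-local rank is obtained
 * via MPI_Comm_split_type (MPI-3) and used to choose a device, so ranks
 * sharing a node bind to different GPUs.
 */
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Comm node_comm;
    int local_rank, num_devices;

    MPI_Init(&argc, &argv);

    /* Ranks on the same node end up in the same communicator. */
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node_comm);
    MPI_Comm_rank(node_comm, &local_rank);

    /* Round-robin the node-local ranks over the visible GPUs. */
    cudaGetDeviceCount(&num_devices);
    cudaSetDevice(local_rank % num_devices);

    /* ... allocate device buffers and communicate from them ... */

    MPI_Comm_free(&node_comm);
    MPI_Finalize();
    return 0;
}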
A complete list of features supported in MVAPICH2-GDR 2.0b (derived from MVAPICH2 2.0b) can be found here.