MVAPICH 3.0 Quick Start Guide
MVAPICH Team
Network-Based Computing Laboratory
Department of Computer Science and Engineering
The Ohio State University
http://mvapich.cse.ohio-state.edu
Copyright (c) 2001-2023
Network-Based Computing Laboratory,
headed by Dr. D. K. Panda.
All rights reserved.
Last revised: February 16, 2024
This Quick Start Guide contains the information necessary for MVAPICH users to download, install, and use MVAPICH 3.0. Please refer to our User Guide for a comprehensive list of all features and instructions on how to use them.
MVAPICH (pronounced as “em-vah-pich”) is open-source MPI software that exploits the novel features and mechanisms of high-performance networking technologies (InfiniBand, iWARP, RDMA over Converged Ethernet (RoCE v1 and v2), Slingshot 10, and Rockport Networks) to deliver the best performance and scalability to MPI applications. This Release Candidate of MVAPICH 3.0 adds support for the Cray Slingshot 11, Cornelis OPX, and Intel PSM3 interconnects through the OFI libfabric library, and for the UCX communication library.
Please note that, as this is a pre-release, performance may not be optimal. For the best performance on Mellanox InfiniBand, RoCE, iWARP, Slingshot 10 or lower, Rockport Networks, and Intel TrueScale or Omni-Path adapters with PSM2, please use MVAPICH2 2.3.7.
This software has been developed in the Network-Based Computing Laboratory (NBCL), headed by Prof. Dhabaleswar K. (DK) Panda, since 2001.
More details on the MVAPICH software, its list of users, mailing lists, sample performance numbers on a wide range of platforms and interconnects, the set of OSU benchmarks, and related publications can be obtained from our website.
The MVAPICH 3.0 source code package includes MPICH 3.4.3. All the required files are present in a single tarball.
Download the most recent distribution tarball from http://mvapich.cse.ohio-state.edu/download/mvapich/mv2/mvapich2-3.0.tar.gz
If you have either UCX or OFI (libfabric) installed on your system in a default PATH, you can use the default configuration to automatically detect the installed communication library…
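The exact commands are elided above; a minimal sketch of a default build, assuming the standard MPICH-style configure/make workflow that MVAPICH 3.0 inherits and a placeholder install prefix, might look like:

    $ ./configure --prefix=/path/to/mvapich-install
    $ make -j 4
    $ make install

After installation, add the resulting bin directory to your PATH so that the compiler wrappers and launchers are found.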
If you’re using a Mellanox InfiniBand, RoCE, iWARP, Slingshot 10, or Rockport Networks adapter, you can use the UCX library configuration…
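As a sketch, assuming the MPICH-style device selection option that MVAPICH 3.0 inherits (prefix path is a placeholder):

    $ ./configure --prefix=/path/to/mvapich-install --with-device=ch4:ucx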
If a UCX installation is available on a system PATH, that version will be used; otherwise, the included version will be built. To configure with a particular version of UCX, please use the following configuration…
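A hedged example, assuming the MPICH-style --with-ucx option and a placeholder UCX install location:

    $ ./configure --with-device=ch4:ucx --with-ucx=/path/to/ucx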
To force the included version to be built, you may use the following…
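For instance, assuming the "embedded" keyword accepted by MPICH-style builds:

    $ ./configure --with-device=ch4:ucx --with-ucx=embedded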
If you’re using an Intel TrueScale, Intel Omni-Path, or Intel Columbiaville adapter, you should use the OFI library configuration…
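A sketch of the corresponding configuration, again assuming MPICH-style device selection (prefix path is a placeholder):

    $ ./configure --prefix=/path/to/mvapich-install --with-device=ch4:ofi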
If an OFI installation is available on a system PATH, that version will be used; otherwise, the included version will be built. The included version of OFI is v1.15.1 and supports the PSM, PSM2, PSM3, and OPX providers. If you require a different libfabric version, please use the following…
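A hedged example, assuming the MPICH-style --with-libfabric option and a placeholder libfabric install location:

    $ ./configure --with-device=ch4:ofi --with-libfabric=/path/to/libfabric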
To force the included version to be built, you may use the following…
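For instance, assuming the "embedded" keyword is accepted here as well:

    $ ./configure --with-device=ch4:ofi --with-libfabric=embedded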
If you’re using a Cray Slingshot 11 adapter, you must use the OFI library configuration…
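A sketch, assuming the same MPICH-style option is pointed at the Cray-provided libfabric:

    $ ./configure --with-device=ch4:ofi --with-libfabric=[PATH]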
Where [PATH] points to the directory with the custom Cray libfabric library installed. This path should include both a lib and an include directory. Using a non-Cray version of libfabric, or the embedded version, is not supported and will lead to poor performance. To use the Cray Slingshot 11 interconnect, please set the CVAR 'MPIR_CVAR_OFI_USE_PROVIDER=cxi' to ensure that the Cray CXI provider is used by libfabric. Other providers are typically detected at runtime, but can also be explicitly set in a similar manner. On the libfabric side, a provider may also be forced by setting 'FI_PROVIDER=provname'.
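For example, the provider can be pinned in the job environment before launching; the cxi value comes from the text above, while the hostfile and program names are placeholders:

    $ export MPIR_CVAR_OFI_USE_PROVIDER=cxi
    $ mpiexec -n 4 -f hostfile ./app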
MVAPICH supports many other configure and runtime options that may be useful for advanced users. Please refer to our User Guide for complete details.
In this section we will demonstrate how to build and run a hello world program that uses MPI.
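As a sketch, assuming a C source file hello.c and the mpicc and mpiexec wrappers from a standard install (the process count and hostfile name are placeholders):

    $ mpicc -o hello hello.c
    $ mpiexec -n 4 -f hostfile ./hello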
Hostfile Format
The mpiexec hostfile format allows users to specify hostnames, one per line.
The following demonstrates the distribution of MPI ranks when using different hostfiles:
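The original examples are omitted here; a minimal sketch, using the ':#' per-host count syntax described in the note below and placeholder node names, might look like:

    $ cat hostfile
    node1:2
    node2:2
    $ mpiexec -n 4 -f hostfile ./hello
    # ranks 0 and 1 are placed on node1, ranks 2 and 3 on node2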
[IMPORTANT]
================================================================================
The -ppn option will create a block of N processes on each node in the hostfile. This is analogous to using the ':#' syntax in the hostfile. Using both of these capabilities to create a block ordering will be multiplicative; i.e., setting node1:2 in the hostfile and -ppn 2 on the command line will result in 4 processes being allocated to node1.
================================================================================
If you are using the SLURM resource manager, omitting a hostfile will result in mpiexec using the SLURM_JOB_NODELIST environment variable to determine the hosts. It will distribute processes across all active nodes in the job according to the value set by -ppn.
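For example, inside an existing SLURM allocation (node count, process counts, and program name are placeholders):

    $ salloc -N 2          # obtain a 2-node allocation
    $ mpiexec -n 8 -ppn 4 ./hello
    # with no hostfile, 8 processes run 4 per node across the 2 allocated nodes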
Pass an environment variable named FOO with the value BAR
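A hedged sketch, assuming the Hydra mpiexec launcher's -env option (process count and program name are placeholders):

    $ mpiexec -n 2 -env FOO BAR ./hello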
By default, MVAPICH is built with mpirun_rsh and the MPICH Hydra process manager. Hydra can be invoked using the mpiexec binary in a standard install.
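As an illustration, assuming a standard install places both launchers in the bin directory, the same job could be started either way (hostfile and program names are placeholders):

    $ mpiexec -n 4 -f hostfile ./hello
    $ mpirun_rsh -np 4 -hostfile hostfile ./hello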
To use SLURM’s srun as your launcher, please use the following configuration:
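A hedged sketch; treat the following MPICH-style PMI options as an assumption and confirm the exact flags against the User Guide:

    $ ./configure --with-pm=none --with-pmi=slurm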
Or, if you are on a Cray cluster:
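Again as an assumption in the same MPICH style:

    $ ./configure --with-pm=none --with-pmi=cray
    # assumed flags; see the User Guide for the exact values on your system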
If your SLURM or Cray libpmi.so, pmi.h, libpmi2.so, and pmi2.h files are in non-standard paths, you may need to add the appropriate lib and include directories to LD_LIBRARY_PATH, LIBRARY_PATH, and CPATH, respectively.
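For example, with a hypothetical PMI installation under /opt/pmi:

    $ export LIBRARY_PATH=/opt/pmi/lib:$LIBRARY_PATH
    $ export LD_LIBRARY_PATH=/opt/pmi/lib:$LD_LIBRARY_PATH
    $ export CPATH=/opt/pmi/include:$CPATH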
Please look at our User Guide for more complete details.
Please see the following for more information.