1. Overview

The MVAPICH2-X-AWS release is based on MVAPICH2-X and incorporates designs that take advantage of the Scalable Reliable Datagram (SRD) transport of the AWS Elastic Fabric Adapter (EFA). It also supports XPMEM for efficient intra-node communication. The latest version is MVAPICH2-X-AWS 2.3.7.

2. Features

  • Based on MVAPICH2-X

  • Design based on Amazon Elastic Fabric Adapter’s (EFA) Scalable Reliable Datagram (SRD) transport protocol

  • Delivers efficient inter-node latency and bandwidth performance

  • Support for XPMEM based intra-node communication

  • Optimized and tuned collectives (inter-node and intra-node)

  • Support for dynamic run-time detection of XPMEM module

  • Initial support for AWS hpc6a/c6a instances with 3rd generation AMD EPYC processors (new)

  • Support and performance optimizations for AWS c6g/c7g instances with AWS Graviton2/3 (aarch64) processors (new)

  • Targeted for AWS EC2 instances with EFA support

  • Currently supports the following OS types on AWS EC2: Amazon Linux 1/2, CentOS 7, Ubuntu 20.04/18.04

3. Launch AWS EFA Instance

Follow steps 1-3 on this webpage:

Launch an AWS EC2 instance with Elastic Fabric Adapter enabled. We recommend using the Amazon Linux 2 AMI.
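
For scripted launches, the same step can be approximated with the AWS CLI. The sketch below only prints a candidate command instead of running it. The function name efa_launch_cmd, all IDs, and the key name are hypothetical placeholders, and hpc6a.48xlarge is an assumed example of an EFA-capable instance type rather than a recommendation from this guide.

```shell
#!/bin/sh
# Sketch: print (do not execute) an AWS CLI command that launches one
# instance with an EFA network interface attached.
# Arguments: AMI ID, subnet ID, security group ID, key pair name, and an
# optional instance type. All values are placeholders you must supply.
efa_launch_cmd() {
    local ami="$1" subnet="$2" sg="$3" key="$4"
    local itype="${5:-hpc6a.48xlarge}"   # assumed example instance type
    printf 'aws ec2 run-instances --image-id %s --count 1 --instance-type %s --key-name %s --network-interfaces DeviceIndex=0,InterfaceType=efa,Groups=%s,SubnetId=%s\n' \
        "$ami" "$itype" "$key" "$sg" "$subnet"
}

# Example with hypothetical IDs; review the printed command before running it.
efa_launch_cmd ami-EXAMPLE subnet-EXAMPLE sg-EXAMPLE my-key
```

Printing the command first makes it easy to check the security group and subnet before anything is launched.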

4. Install MVAPICH2-X-AWS

Install MVAPICH2-X-AWS from the RPM package (make sure you have sudo access).

Download the MVAPICH2-X-AWS RPM with the following command (for Amazon Linux 1/2 and CentOS 7 on the x86 architecture):
$ wget http://mvapich.cse.ohio-state.edu/download/mvapich/mv2x/2.3/mvapich2-x-aws-mofed-gnu7.3.1-2.3x-1.amzn2.x86_64.rpm
(for Ubuntu)
$ wget http://mvapich.cse.ohio-state.edu/download/mvapich/mv2x/mvapich2-x-aws-mofed-gnu9.4.0_2.3.7x-2_amd64.deb
(for the Arm architecture)
$ wget http://mvapich.cse.ohio-state.edu/download/mvapich/mv2x/2.3/mvapich2-x-aws-mofed-gnu7.3.1-2.3x-1.amzn2.aarch64.rpm
You can install MVAPICH2-X to the default path, /opt/mvapich2-x:
$ rpm -Uvh --nodeps mvapich2-x-aws-mofed-gnu7.3.1-2.3x-1.amzn2.x86_64.rpm
Alternatively, you can install the library under a custom path by specifying a prefix:
$ rpm --prefix=/custom/install/prefix -Uvh --nodeps mvapich2-x-aws-mofed-gnu7.3.1-2.3x-1.amzn2.x86_64.rpm
If you do not have root permission, or are on a system that does not use RPMs, you can use rpm2cpio to extract the library:
$ rpm2cpio mvapich2-x-aws-mofed-gnu7.3.1-2.3x-1.amzn2.x86_64.rpm | cpio -id
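
When the same setup script has to run on both x86 and Arm instances, the download step can be keyed on the machine architecture. The sketch below maps an architecture string to the two RPM URLs listed above. The function name mv2x_rpm_url is hypothetical, and the Ubuntu .deb package is deliberately left out.

```shell
#!/bin/sh
# Sketch: return the RPM URL from this guide that matches an architecture
# string such as the output of `uname -m`.
mv2x_rpm_url() {
    local base="http://mvapich.cse.ohio-state.edu/download/mvapich/mv2x/2.3"
    case "$1" in
        x86_64|aarch64)
            echo "$base/mvapich2-x-aws-mofed-gnu7.3.1-2.3x-1.amzn2.$1.rpm" ;;
        *)
            echo "no RPM listed in this guide for architecture: $1" >&2
            return 1 ;;
    esac
}

# Typical use on an RPM-based instance (commented out: needs network and sudo):
#   url=$(mv2x_rpm_url "$(uname -m)") && wget "$url"
#   sudo rpm -Uvh --nodeps "$(basename "$url")"
```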

5. Install XPMEM

To run MVAPICH2-X-AWS with better intra-node performance, you may want to install and load XPMEM as well.

Download the XPMEM module source from the following GitHub repository:

$ git clone https://github.com/hpc/xpmem.git
Build and install XPMEM:
$ cd xpmem
$ ./autogen.sh
$ ./configure --prefix=/opt/xpmem
$ sudo make -j8 install
A Common Build Issue

The build is likely to fail on the latest kernel versions of Amazon Linux 2. Details and solutions are available at: https://github.com/hpc/xpmem/issues/40

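Load XPMEM (the kernel version in the module path below is an example; it should match the kernel running on your instance):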
$ sudo insmod /opt/xpmem/lib/modules/4.14.123-111.109.amzn2.x86_64/xpmem.ko
$ sudo chmod 666 /dev/xpmem
You can check whether XPMEM is loaded with the following command, which should produce output similar to this:
$ lsmod | grep xpmem
xpmem                  32569  0
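
Because the module path above embeds a specific kernel release, it can help to derive the path instead of typing it. The sketch below assumes the module is installed under <prefix>/lib/modules/<kernel release>/xpmem.ko, as in the example path. The function name xpmem_module_path is hypothetical.

```shell
#!/bin/sh
# Sketch: build the path to xpmem.ko for an install prefix and kernel release.
# Defaults: the /opt/xpmem prefix used above and the running kernel (uname -r).
xpmem_module_path() {
    local prefix="${1:-/opt/xpmem}"
    local kernel="${2:-$(uname -r)}"
    echo "$prefix/lib/modules/$kernel/xpmem.ko"
}

# Typical use (commented out: needs sudo and a built module):
#   sudo insmod "$(xpmem_module_path)"
#   sudo chmod 666 /dev/xpmem
```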

6. Create More Instances

Now you can install HPC applications. To scale out, either use AWS ParallelCluster to create a cluster with a head node and compute nodes, or create more instances from an image of the instance you just configured. For the latter, create an AMI from the configured instance and launch new instances from that AMI so that you do not need to reinstall everything.

Note that you need to repeat the step above to load XPMEM every time you launch a new instance or reboot an existing one.
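
Since the load step has to be repeated after every launch or reboot, a small check can make it safe to rerun. The sketch below tests lsmod-style output for a module named exactly xpmem. The function name xpmem_is_loaded and the XPMEM_KO variable are hypothetical, and wiring this into a boot-time mechanism is left to the reader.

```shell
#!/bin/sh
# Sketch: succeed if lsmod-style text on stdin lists a module whose name
# (first column) is exactly "xpmem"; fail otherwise.
xpmem_is_loaded() {
    awk '$1 == "xpmem" { found = 1 } END { exit !found }'
}

# Typical use (commented out: needs sudo); XPMEM_KO is a placeholder for the
# xpmem.ko path used in the load step above:
#   if ! lsmod | xpmem_is_loaded; then
#       sudo insmod "$XPMEM_KO"
#       sudo chmod 666 /dev/xpmem
#   fi
```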

7. Example: How to Run OSU Micro-benchmarks?

The OSU Micro-Benchmarks (OMB) are installed by default under the MVAPICH2-X install path, in the ./libexec directory.

Go to the MVAPICH2-X install path, for example /opt/mvapich2-x/gnu7.3.1/aws-ofed/intermediate/mpirun.

You may need to prepend the MVAPICH2-X library directory to LD_LIBRARY_PATH, like this:

$ export LD_LIBRARY_PATH=/opt/mvapich2-x/gnu7.3.1/aws-ofed/intermediate/mpirun/lib64/:$LD_LIBRARY_PATH

Run OMB with this command:

$ ./bin/mpirun_rsh -np 2 -hostfile ~/hostfile ./libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency
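
The command above expects ~/hostfile to exist. A minimal sketch for creating it follows, assuming the common mpirun_rsh layout of one hostname per line and that the listed nodes can reach each other over passwordless SSH. The function name write_hostfile and the hostnames node1 and node2 are hypothetical.

```shell
#!/bin/sh
# Sketch: write the given hostnames to a hostfile, one per line.
# Usage: write_hostfile <file> <host> [<host> ...]
write_hostfile() {
    if [ "$#" -lt 2 ]; then
        echo "usage: write_hostfile <file> <host> [<host> ...]" >&2
        return 1
    fi
    local out="$1"
    shift
    printf '%s\n' "$@" > "$out"
}

# Typical use (commented out: node1 and node2 are placeholder hostnames):
#   write_hostfile ~/hostfile node1 node2
#   ./bin/mpirun_rsh -np 2 -hostfile ~/hostfile ./libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency
```

With two distinct hosts and -np 2, osu_latency measures inter-node latency; listing the same host twice would measure the intra-node path instead.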