The MVAPICH2-X-AWS release is based on MVAPICH2-X and incorporates designs that take advantage of the Scalable Reliable Datagram (SRD) transport of AWS Elastic Fabric Adapter (EFA) technology. It also supports XPMEM for efficient intra-node communication. The latest version is MVAPICH2-X-AWS 2.3.
Based on MVAPICH2-X 2.3 GA
Design based on Amazon Elastic Fabric Adapter’s (EFA) Scalable Reliable Datagram (SRD) transport protocol
Delivers efficient inter-node latency and bandwidth performance
Support for XPMEM based intra-node communication
Optimized and Tuned collectives (inter-node and intra-node)
Support for dynamic run-time detection of XPMEM module
Targeted for AWS EC2 instances with EFA support
Support available (currently) for basic OS types on AWS EC2 including: Amazon Linux 1/2, CentOS 6/7, Ubuntu 16.04/18.04
3. Launch AWS EFA Instance
Follow steps 1-3 on this webpage:
Launch a c5n.18xlarge instance with Amazon Linux 2 AMI.
4. Install MVAPICH2-X
Install MVAPICH2-X from the RPM package (make sure you have sudo access):
$ wget https://mvapich.cse.ohio-state.edu/download/mvapich/mv2x/2.3/mofed4.6/mvapich2-x-aws-xpmem-mofed4.6-gnu7.3.1-2.3rc3XAWS-2.amzn2.x86_64.rpm
$ wget https://mvapich.cse.ohio-state.edu/download/mvapich/mv2x/2.3/mofed4.6/mvapich2-x-aws-xpmem-mofed4.6-gnu7.3.1_2.3rc3XAWS-2.amzn2_amd64.deb
$ rpm -Uvh --nodeps mvapich2-x-intermediate-aws-ofed-gnu7.3.1-2.3-2.amzn2.x86_64.rpm
$ rpm --prefix=/custom/install/prefix -Uvh --nodeps mvapich2-x-intermediate-aws-ofed-gnu7.3.1-2.3-2.amzn2.x86_64.rpm
$ rpm2cpio mvapich2-x-intermediate-aws-ofed-gnu7.3.1-2.3-2.amzn2.x86_64.rpm | cpio -id
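The three rpm commands above read as alternatives rather than a sequence: install to the package's default location, relocate the install with --prefix, or unpack the archive into the current directory with rpm2cpio and cpio (which does not register the package with rpm). Note also that the file name in those commands differs from the file name in the wget lines, so substitute the name of the file you actually downloaded. The following is a minimal sketch of that choice; the function name mv2x_install_cmd and the mode labels are hypothetical, and the function only prints the command instead of running it.

```shell
# Print (do not run) the install command for one of the three options
# shown in the text. Mode labels are hypothetical names for this sketch:
#   default  install at the package's default location
#   prefix   relocate the install under a custom prefix
#   extract  unpack into the current directory with rpm2cpio + cpio
mv2x_install_cmd() {
    pkg=$1
    mode=$2
    prefix=${3:-}
    case "$mode" in
        default)
            printf 'rpm -Uvh --nodeps %s\n' "$pkg"
            ;;
        prefix)
            # A prefix is mandatory for a relocated install.
            [ -n "$prefix" ] || { echo "prefix mode needs a directory" >&2; return 1; }
            printf 'rpm --prefix=%s -Uvh --nodeps %s\n' "$prefix" "$pkg"
            ;;
        extract)
            printf 'rpm2cpio %s | cpio -id\n' "$pkg"
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac
}

# Usage (pkg.rpm stands for the file you downloaded):
#   mv2x_install_cmd pkg.rpm prefix /custom/install/prefix
```

Printing the command first makes it easy to check the file name and prefix before running anything with elevated privileges.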
5. Install XPMEM
To run MVAPICH2-X-AWS with better intra-node performance, you may want to install and load XPMEM as well.
Download the XPMEM module from the following GitLab repository:
$ git clone https://gitlab.com/hjelmn/xpmem.git
$ cd xpmem
$ ./autogen.sh
$ ./configure --prefix=/opt/xpmem --with-default-prefix=/opt/xpmem --with-module=/opt/xpmem/share/modules/xpmem
$ sudo make -j8 install
$ sudo insmod /opt/xpmem/lib/modules/4.14.123-111.109.amzn2.x86_64/xpmem.ko
$ sudo chmod 666 /dev/xpmem
$ lsmod | grep xpmem
xpmem 32569 0
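The lsmod check above can be scripted. The sketch below is one way to do it, assuming the device node is /dev/xpmem as in the chmod command; the function names xpmem_listed and xpmem_device_ok are hypothetical. It matches the module name exactly, so a module whose name merely starts with "xpmem" does not count.

```shell
# Succeed if lsmod-style text on stdin lists a module named exactly "xpmem".
xpmem_listed() {
    awk '$1 == "xpmem" { found = 1 } END { exit (found ? 0 : 1) }'
}

# Succeed if the XPMEM device node exists as a character device and the
# current user can read and write it. Defaults to /dev/xpmem, the path
# used in the chmod command in the text.
xpmem_device_ok() {
    dev=${1:-/dev/xpmem}
    [ -c "$dev" ] && [ -r "$dev" ] && [ -w "$dev" ]
}

# Usage (not run automatically):
#   lsmod | xpmem_listed && xpmem_device_ok && echo "XPMEM ready"
```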
6. Create More Instances
Now you can install HPC applications. You can either use AWS ParallelCluster to create a cluster with a head node and compute nodes, or create more instances from an image of the instance you just configured. For the latter, create an AMI from the configured instance and launch new instances from that AMI, so that you do not need to reinstall everything.
Note that you need to repeat the step above to load XPMEM every time you launch a new instance or reboot an existing instance.
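Because the module has to be reloaded after every launch or reboot, it can help to wrap the insmod and chmod commands in a small function. The sketch below makes two assumptions: XPMEM was installed under /opt/xpmem as in the configure line, and the module was built for the kernel that is currently running (it uses uname -r instead of a hard-coded kernel version). The function names are hypothetical.

```shell
# Build the path to xpmem.ko for an install prefix and a kernel release.
# The kernel release defaults to the running kernel (uname -r).
xpmem_module_path() {
    prefix=$1
    kernel=${2:-$(uname -r)}
    printf '%s/lib/modules/%s/xpmem.ko\n' "$prefix" "$kernel"
}

# Load XPMEM if it is not loaded yet, then make /dev/xpmem accessible to
# all users. Requires sudo. The prefix defaults to /opt/xpmem (assumption:
# same prefix as in the configure step).
load_xpmem() {
    ko=$(xpmem_module_path "${1:-/opt/xpmem}")
    if lsmod | grep -q '^xpmem '; then
        echo "xpmem already loaded"
    else
        sudo insmod "$ko" || return 1
    fi
    sudo chmod 666 /dev/xpmem
}

# Usage (not run automatically):
#   load_xpmem
```

If the instance boots a different kernel than the one XPMEM was built against, the module file for that kernel will not exist and insmod will fail; in that case the module needs to be rebuilt first.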
7. Example: How to Run OSU Micro-benchmarks?
OMB is installed by default under the MVAPICH2-X install path; you can find it in the ./libexec directory.
Go to the MVAPICH2-X install path, such as /opt/mvapich2-x/gnu7.3.1/aws-ofed/intermediate/mpirun.
You may need to prepend the MVAPICH2-X library directory to LD_LIBRARY_PATH like this:
$ export LD_LIBRARY_PATH=/opt/mvapich2-x/gnu7.3.1/aws-ofed/intermediate/mpirun/lib64/:$LD_LIBRARY_PATH
Run OMB with this command:
$ ./bin/mpirun_rsh -np 2 -hostfile ~/hostfile ./libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency
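The command above expects ~/hostfile to exist, with one host name per line. The sketch below shows one way to prepare it, together with a helper that prepends a directory to LD_LIBRARY_PATH without leaving an empty trailing entry when the variable was previously unset (the dynamic loader treats an empty entry as the current directory). The function names and the host names node1 and node2 are placeholders, not values from this guide.

```shell
# Write one host name per line to the given file, replacing its contents.
write_hostfile() {
    out=$1
    shift
    : > "$out"
    for host in "$@"; do
        printf '%s\n' "$host" >> "$out"
    done
}

# Print "dir:current", or just "dir" when the current value is empty.
prepend_path() {
    dir=$1
    current=${2:-}
    if [ -n "$current" ]; then
        printf '%s:%s\n' "$dir" "$current"
    else
        printf '%s\n' "$dir"
    fi
}

# Usage (not run automatically; node1 and node2 are placeholder host names):
#   write_hostfile ~/hostfile node1 node2
#   export LD_LIBRARY_PATH=$(prepend_path /opt/mvapich2-x/gnu7.3.1/aws-ofed/intermediate/mpirun/lib64 "${LD_LIBRARY_PATH:-}")
```

With -np 2 and two distinct hosts in the file, osu_latency measures inter-node latency; listing the same host twice would measure intra-node latency instead.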