1. Overview

The MVAPICH2-EA 2.1 (Energy-Aware) release is based on MVAPICH2 2.1 and incorporates designs and algorithms that optimize the performance-energy trade-off. In other words, MVAPICH2-EA is a white-box approach that reduces energy consumption while providing the maximum performance. Further, MVAPICH2-EA provides the flexibility to achieve energy savings while ensuring that performance does not degrade beyond what the user permits. For more details on the degradation-tolerance concept, please refer to MV2_EAM_TOLERANCE within the Basic Usage section.

The OSU Energy Monitoring Tool (OEMT) is also provided as part of the MVAPICH2-EA package. OEMT allows users to measure the energy consumption of MPI applications. For more details on the OEMT tool, please refer to section OSU Energy Monitoring Tool (OEMT-0.8).

2. Installing the MVAPICH2-EA package

To install the MVAPICH2-EA package, download the tarball, extract it, and install the RPM with your preferred RPM tool.

$ curl -O http://mvapich.cse.ohio-state.edu/download/mvapich/ea/2.1/mvapich2-ea-2.1.tar.gz
$ tar xf mvapich2-ea-2.1.tar.gz
$ cd mvapich2-ea-2.1/
$ rpm -Uvh --nodeps mvapich2-ea-gnu-2.1-1.el6.x86_64.rpm

The RPMs contained in our packages are relocatable and can be installed using a prefix other than the default of /opt/mvapich2/ea/2.1/gnu used by the package in the previous example.

Install package specifying custom prefix
$ rpm --prefix /custom/install/prefix -Uvh --nodeps mvapich2-ea-gnu-2.1-1.el6.x86_64.rpm

If you do not have root permission, you can use rpm2cpio to extract the package.

Use rpm2cpio to extract the package
$ rpm2cpio mvapich2-ea-gnu-2.1-1.el6.x86_64.rpm | cpio -id

When using the rpm2cpio method, you will need to update the MPI compiler scripts, such as mpicc, so that they point to the path where you placed the package.
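
One way to update the compiler scripts is sketched below. It assumes that the scripts are plain-text shell scripts containing the default prefix /opt/mvapich2/ea/2.1/gnu as a literal string; check this on your copy before relying on it. The helper name relocate_wrappers and the list of script names are illustrative and are not part of the package.

```shell
#!/bin/sh
# Sketch: rewrite the packaged default prefix inside the MPI compiler scripts
# after an rpm2cpio extraction. Assumes the scripts contain the default
# prefix as a literal string (an assumption, not verified against the RPM).

DEFAULT_PREFIX=/opt/mvapich2/ea/2.1/gnu

# relocate_wrappers NEW_PREFIX [BIN_DIR]
# Hypothetical helper, not shipped with MVAPICH2-EA.
relocate_wrappers() {
    new_prefix=$1
    bin_dir=${2:-$new_prefix/bin}
    for name in mpicc mpicxx mpif77 mpif90; do
        script=$bin_dir/$name
        [ -f "$script" ] || continue          # skip scripts that are absent
        # Keep one pristine backup so that repeated runs stay idempotent.
        [ -f "$script.orig" ] || cp -p "$script" "$script.orig"
        # Always rewrite from the backup; "|" is the sed delimiter, so
        # NEW_PREFIX must not contain "|" or "&".
        sed "s|$DEFAULT_PREFIX|$new_prefix|g" "$script.orig" > "$script"
    done
}
```

After extracting the RPM in the current directory, a call such as relocate_wrappers "$PWD/opt/mvapich2/ea/2.1/gnu" would apply the rewrite; the original scripts remain available with an .orig suffix.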

Tip
If you are using a Debian-based system such as Ubuntu, you can convert the RPM to a deb package using a tool such as alien, or follow the rpm2cpio instructions above.

3. Tuning and Usage Parameters

Note that MVAPICH2-EA selects the optimal value for each of the following parameters.

3.1. Basic Usage

MVAPICH2-EA is used in the same way as the default MVAPICH2 library. In addition to the tunables available in MVAPICH2, MVAPICH2-EA responds to the following environment variables.

  • MV2_ENABLE_EAM

    • Default: 1 (Enabled)

    • Toggles support for energy-aware communication protocols

    • To disable: set to 0

  • MV2_EAM_TOLERANCE

    • Default: 5

    • Positive integer value

    • Tunes the amount of performance degradation permitted in terms of percentage for any given MPI data transfer call

    • For instance, if MV2_EAM_TOLERANCE=5, MVAPICH2-EA will try to save the maximum energy while allowing for at most 5% degradation in performance compared to native execution
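
As an illustration of the two variables above, the following sketch validates a tolerance value and prints it in NAME=VALUE form. The helper eam_env is hypothetical and is not provided by MVAPICH2-EA; it only enforces the "positive integer" rule stated above.

```shell
#!/bin/sh
# eam_env TOLERANCE
# Hypothetical helper: prints NAME=VALUE pairs for the energy-aware settings
# after checking that TOLERANCE is a positive integer.
eam_env() {
    case $1 in
        ''|*[!0-9]*)
            echo "tolerance must be a positive integer" >&2
            return 1 ;;
    esac
    [ "$1" -gt 0 ] || { echo "tolerance must be greater than 0" >&2; return 1; }
    printf 'MV2_ENABLE_EAM=1 MV2_EAM_TOLERANCE=%s\n' "$1"
}

eam_env 10    # prints: MV2_ENABLE_EAM=1 MV2_EAM_TOLERANCE=10
```

The printed pairs can be placed on the mpirun_rsh command line ahead of the executable, for example $MV2_PATH/bin/mpirun_rsh -n 2 hostA hostB $(eam_env 10) ./osu_latency, where the host names and benchmark are placeholders.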

4. OSU Energy Monitoring Tool (OEMT-0.8)

The OSU Energy Monitoring Tool (OEMT) consists of a utility library and a kernel module that allow users to measure the energy consumed by an MPI application. The library can either be linked directly into the application at compile time or used with prebuilt binaries via LD_PRELOAD. Note that this tool can be used with all MPI run-times on Intel CPU architectures from SandyBridge onward.

4.1. System Requirements

OEMT uses RAPL MSRs which are currently available only on Intel CPUs from the SandyBridge architecture onward. Thus OEMT requires:

  • Intel SandyBridge CPUs and newer.

This is the only requirement for both the library and the kernel module of the OEMT package.
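
A quick first check on a Linux node is sketched below. It only tests the CPU vendor string in /proc/cpuinfo, which is necessary but not sufficient: it does not confirm that the CPU is SandyBridge or newer. The helper name is_intel_cpu is illustrative.

```shell
#!/bin/sh
# is_intel_cpu [CPUINFO_FILE]
# Hypothetical helper: succeeds if the cpuinfo file reports an Intel CPU.
# It does NOT verify the CPU generation.
is_intel_cpu() {
    cpuinfo=${1:-/proc/cpuinfo}
    grep -q '^vendor_id[[:space:]]*:[[:space:]]*GenuineIntel' "$cpuinfo" 2>/dev/null
}

if is_intel_cpu; then
    echo "Intel CPU detected; confirm that the model is SandyBridge or newer"
else
    echo "No Intel CPU detected; OEMT is not expected to work on this node"
fi
```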

4.2. Workaround for root permission

OEMT uses the koemt kernel module so that users do not need to run applications with elevated permissions. Reading the MSR counters requires root permission in all kernels. To alleviate this restriction, we provide a safe kernel module that allows read-only access to the MSR energy counters.

4.3. Download and Install

Please download and install the koemt package, available on the MVAPICH2 website at http://mvapich.cse.ohio-state.edu/tools/oemt/. The tarball includes the sources, a Makefile, and the install scripts.
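
After installation, it can be useful to confirm that the module is present in the running kernel. The sketch below assumes that the module registers under the name koemt, which is inferred from the package name rather than stated by the install scripts. The helper koemt_loaded is illustrative.

```shell
#!/bin/sh
# koemt_loaded [MODULES_FILE]
# Hypothetical helper: succeeds if a module named "koemt" (assumed name)
# is listed in the kernel's module table.
koemt_loaded() {
    modules=${1:-/proc/modules}
    # Each line of /proc/modules starts with the module name and a space.
    grep -q '^koemt ' "$modules" 2>/dev/null
}

if koemt_loaded; then
    echo "koemt module is loaded"
else
    echo "koemt module not found; rerun the install scripts from the tarball"
fi
```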

4.4. Usage Parameters for OEMT-0.8

  • OEMT_ENABLE

    • Default: 1

    • Set to 0 to disable usage of the tool even if it has been linked/pre-loaded

  • OEMT_MSR_DOMAIN

    • Default: 1 (PKG energy consumption)

    • Accepted values - 0 : CPU (PP0); 1 : PKG; 2 : DRAM; 3 : ALL

  • OEMT_MSR_VERBOSITY

    • Default: 1

    • MSRs are currently available only at socket-level granularity. This parameter selects how many values are reported per node: 1 socket or 2 sockets.
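
The sketch below shows one way to assemble these parameters with basic validation. The helper oemt_env is hypothetical and not part of OEMT. The accepted domain values follow the list above; restricting OEMT_MSR_VERBOSITY to 1 or 2 is an assumption based on the "1 socket or 2 sockets" description.

```shell
#!/bin/sh
# oemt_env DOMAIN VERBOSITY
# Hypothetical helper: prints NAME=VALUE pairs for the OEMT parameters.
# DOMAIN: 0 = CPU (PP0), 1 = PKG, 2 = DRAM, 3 = ALL
# VERBOSITY: 1 or 2 (assumed to be the only meaningful values)
oemt_env() {
    case $1 in
        0|1|2|3) ;;
        *) echo "OEMT_MSR_DOMAIN must be 0, 1, 2, or 3" >&2; return 1 ;;
    esac
    case $2 in
        1|2) ;;
        *) echo "OEMT_MSR_VERBOSITY must be 1 or 2" >&2; return 1 ;;
    esac
    printf 'OEMT_ENABLE=1 OEMT_MSR_DOMAIN=%s OEMT_MSR_VERBOSITY=%s\n' "$1" "$2"
}

oemt_env 3 1    # prints: OEMT_ENABLE=1 OEMT_MSR_DOMAIN=3 OEMT_MSR_VERBOSITY=1
```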

4.5. Running applications

Using LD_PRELOAD is the simplest way to use OEMT. This is a generic solution that works even for pre-compiled applications whose source code is not available.

Due to the ABI incompatibility between the different MPI run-times, we provide two versions of the tool. Pre-load the appropriate library based on the MPI in use.

liboemt_1.so

Compatible with MPICH-derived run-times that are ABI-compatible with MPICH-3.1+. This includes:

  • MVAPICH2 2.0+

  • MPICH-3.1+

  • IntelMPI-5.0+

  • CrayMPI-7

  • IBM MPI-2.1

liboemt_2.so

Compatible with OpenMPI-derived run-times, which include:

  • OpenMPI

  • HPC-X
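
The mapping above can be captured in a small lookup, sketched here. The helper oemt_lib_for and the family labels it accepts (mvapich2, mpich, intelmpi, craympi, ibmmpi, openmpi, hpcx) are illustrative names chosen for this example; the helper does not check the minimum versions listed above.

```shell
#!/bin/sh
# oemt_lib_for MPI_FAMILY
# Hypothetical helper: prints the OEMT library to pre-load for a given
# MPI family label. Version requirements are NOT checked.
oemt_lib_for() {
    case $1 in
        mvapich2|mpich|intelmpi|craympi|ibmmpi) echo "liboemt_1.so" ;;
        openmpi|hpcx)                           echo "liboemt_2.so" ;;
        *) echo "unknown MPI family: $1" >&2; return 1 ;;
    esac
}

oemt_lib_for mvapich2    # prints: liboemt_1.so
oemt_lib_for openmpi     # prints: liboemt_2.so
```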

Example running OMB with LD_PRELOAD (using MVAPICH2-EA)
$ export MV2_PATH=/opt/mvapich2/ea/2.1/gnu
$ $MV2_PATH/bin/mpirun_rsh -n 2 hostA hostB OEMT_MSR_DOMAIN=3 \
        OEMT_MSR_VERBOSITY=1 LD_PRELOAD=$OEMT_PATH/lib/liboemt_1.so ./osu_latency
# OSU MPI Latency Test
# Size            Latency (us)
0                         1.77
1                         1.88
2                         1.88
4                         1.88
8                         1.90
16                        1.92
32                        1.93
64                        1.98
128                       2.17
256                       2.59
512                       2.81
1024                      3.12
2048                      3.87
4096                      4.64
8192                      6.38
16384                     9.51
32768                    12.38
65536                    17.56
131072                   28.13
262144                   49.24
524288                  117.81
1048576                 205.94
2097152                 391.44
4194304                 746.55
<0> (PP0) = 19.394769 Joules
<0> (PKG) = 36.992257 Joules
<0> (DRAM) = 2.630593 Joules
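
For post-processing, the energy lines can be separated from the benchmark output with a short filter. The sketch below assumes that the energy lines always have the form shown in the example above ("<tag> (DOMAIN) = VALUE Joules"); the helper name extract_energy is illustrative.

```shell
#!/bin/sh
# extract_energy
# Hypothetical helper: reads application output on stdin and prints one
# "DOMAIN JOULES" pair per OEMT energy line, ignoring everything else.
extract_energy() {
    awk '$3 == "=" && $5 == "Joules" {
        gsub(/[()]/, "", $2)    # strip the parentheses around the domain
        print $2, $4
    }'
}

# Demonstration using the energy lines from the example run above.
extract_energy <<'EOF'
# OSU MPI Latency Test
4194304                 746.55
<0> (PP0) = 19.394769 Joules
<0> (PKG) = 36.992257 Joules
<0> (DRAM) = 2.630593 Joules
EOF
# prints:
# PP0 19.394769
# PKG 36.992257
# DRAM 2.630593
```

The values are reported per domain and are deliberately not summed here: in RAPL, the PKG domain already covers the cores measured by PP0, so adding the two would double-count.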