Machine Specifications

| CPU Model | CPU Core Info | Memory | IB Card | IB Switch | OS | CUDA | GPU |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Intel Xeon CPU E5-2687W v3 | 2x10 @ 3.10GHz | 64GB | Mellanox ConnectX-4 (100Gbps) | Mellanox EDR IB Switch | RHEL 7.0 | CUDA 9.2 | NVIDIA Tesla V100-PCIE-16GB |

Inter-Node Performance numbers of MVAPICH2 on Intel Haswell Architecture with Mellanox ConnectX-4 (11/09/18)

| One Way Latency | Unidirectional Bandwidth | Bidirectional Bandwidth | Notes |
| --- | --- | --- | --- |
| 1.85 us | 9890.47 MBps | 17962.21 MBps | Inter-Node GPU to GPU (Device-to-Device) using a single HCA |

Machine Specifications

| CPU Model | CPU Core Info | Memory | IB Card | IB Switch | OS | CUDA | GPU |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Intel Xeon CPU E5-2650 v4 | 2x12 @ 2.2GHz | 64GB | Mellanox ConnectX-4 (100Gbps) | Mellanox EDR IB Switch | RHEL 7.0 | CUDA 9.0 | 4xNVIDIA Tesla P100-PCIE2-16GB |

Intra-Node Performance numbers of MVAPICH2 on Intel Broadwell Architecture with Mellanox ConnectX-4 (11/09/18)

| One Way Latency | Unidirectional Bandwidth | Bidirectional Bandwidth | Notes |
| --- | --- | --- | --- |
| 1.55 us | 13049.32 MBps | 21261.77 MBps | Intra-Node (Device-to-Device) |

Machine Specifications

| CPU Model | CPU Core Info | Memory | IB Card | IB Switch | OS | CUDA | GPU |
| --- | --- | --- | --- | --- | --- | --- | --- |
| POWER8NVL | 2x8x10 @ 2 GHz | 64GB | Dual Mellanox ConnectX-4 (2x100Gbps) | Mellanox EDR IB Switch | RHEL 7.0 | CUDA 9.2 | NVIDIA Tesla P100-SXM2-16GB |

Inter-Node Performance numbers of MVAPICH2 on OpenPOWER8 Architecture with Mellanox ConnectX-4 EDR (11/09/18)

| One Way Latency | Unidirectional Bandwidth | Bidirectional Bandwidth | Notes |
| --- | --- | --- | --- |
| 23.12 us | 11978.03 MBps | 21398.28 MBps | Inter-Node GPU to GPU (Device-to-Device) using a single HCA |

Intra-Node Performance numbers of MVAPICH2 on OpenPOWER8 Architecture with Mellanox ConnectX-4 EDR (11/09/18)

| One Way Latency | Unidirectional Bandwidth | Bidirectional Bandwidth | Notes |
| --- | --- | --- | --- |
| 16.06 us | 32964.01 MBps | 64372.87 MBps | Intra-Node (Device-to-Device) |
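
To relate the bandwidth figures above to the 100Gbps link rate listed for the IB cards, the following is a minimal Python sketch that converts each unidirectional figure to Gbit/s and computes the bidirectional-to-unidirectional ratio. It assumes that "MBps" means 10^6 bytes per second, which this page does not state. The row labels and the helper names `mbps_to_gbps` and `bidir_ratio` are hypothetical and used only for illustration. The intra-node rows do not cross the InfiniBand link, so for them the Gbit/s value is only a unit conversion.

```python
# Bandwidth figures copied from the results tables above.
# Labels are illustrative; values are (unidirectional, bidirectional) in MBps.
RESULTS = [
    ("Haswell inter-node", 9890.47, 17962.21),
    ("Broadwell intra-node", 13049.32, 21261.77),
    ("OpenPOWER8 inter-node", 11978.03, 21398.28),
    ("OpenPOWER8 intra-node", 32964.01, 64372.87),
]


def mbps_to_gbps(mbytes_per_s):
    """Convert megabytes per second to gigabits per second.

    Assumption: 1 MB = 10**6 bytes (not stated on this page).
    """
    return mbytes_per_s * 8 / 1000


def bidir_ratio(unidirectional, bidirectional):
    """Ratio of bidirectional to unidirectional bandwidth (2.0 is ideal)."""
    return bidirectional / unidirectional


for label, uni, bi in RESULTS:
    print(
        f"{label}: {mbps_to_gbps(uni):.1f} Gbit/s unidirectional, "
        f"bidirectional/unidirectional = {bidir_ratio(uni, bi):.2f}"
    )
```

A ratio close to 2.0 would mean the two directions do not slow each other down. The Gbit/s value for the inter-node rows can be compared with the link rate in the corresponding Machine Specifications table.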