Ohio State University

Hybrid MPI+OpenSHMEM Graph500 Benchmark | Performance | Network-Based Computing Laboratory

Hybrid MPI+OpenSHMEM Graph500 Benchmark (05/06/13)

  • Experimental Testbed: The testbed (TACC Stampede) consists of compute nodes with dual-socket, eight-core Intel Sandy Bridge Xeon processors operating at 2.70 GHz, with 32 GB of RAM. Each node is equipped with MT4099 FDR ConnectX HCAs (54 Gbps data rate) with PCIe Gen3 interfaces. The operating system used is CentOS release 6.3, with kernel version 2.6.32-279.el6 and OpenFabrics version
  • The Graph500 Benchmark represents the subclass of data-intensive, irregular applications that use graph-algorithm-based processing methods. The Concurrent Search kernel of the Graph500 benchmark performs a Breadth First Search (BFS) traversal of a given graph. The Graph500 problem size is specified by Scale and Edge Factor. Scale is the base-two logarithm of the number of vertices, and Edge Factor is the ratio of the graph's edge count to its vertex count. Thus Scale = N and Edge Factor = M indicates a graph with 2^N vertices and 2^N * M edges.
  • The Graph500 Benchmark Suite contains MPI-based reference implementations: MPI Simple, MPI CSR (Replicated Compressed Sparse Row), MPI CSC (Replicated Compressed Sparse Column), and MPI OneSided. The hybrid (MPI+OpenSHMEM) design of the Graph500 Concurrent Search benchmark, built on MVAPICH2-X, provides a significant performance improvement over the pure MPI-based implementations.
  • The bar graph depicts the performance improvement obtained for the Graph500 Concurrent Search benchmark by using the hybrid MPI+OpenSHMEM design with MVAPICH2-X, as compared to the pure MPI-based versions. The problem size for Graph500 is Scale = 29 and Edge Factor = 16.
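
To make the problem-size definition concrete, the following minimal Python sketch computes the vertex and edge counts for a given Scale and Edge Factor, and runs a small serial BFS on a toy graph. The function names graph500_size and bfs_parents and the toy adjacency list are hypothetical and chosen for illustration only. The BFS is a single-process illustration of the traversal; it is not the distributed MPI or MPI+OpenSHMEM implementation measured here.

```python
from collections import deque


def graph500_size(scale, edge_factor):
    """Return (vertices, edges) for a Graph500 problem of the given size.

    Hypothetical helper: vertices = 2^scale, edges = vertices * edge_factor.
    """
    vertices = 2 ** scale
    edges = vertices * edge_factor
    return vertices, edges


def bfs_parents(adjacency, root):
    """Serial breadth-first search (illustrative only, not the benchmark code).

    Returns a dict mapping each reached vertex to the vertex it was
    discovered from; the root is recorded as its own parent.
    """
    parents = {root: root}
    frontier = deque([root])
    while frontier:
        u = frontier.popleft()
        for v in adjacency.get(u, ()):
            if v not in parents:
                parents[v] = u
                frontier.append(v)
    return parents


if __name__ == "__main__":
    # Problem size used for the results on this page: Scale = 29, Edge Factor = 16.
    vertices, edges = graph500_size(29, 16)
    print(f"{vertices:,} vertices, {edges:,} edges")

    # Toy undirected graph (assumed data, for illustration).
    toy_graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
    print(bfs_parents(toy_graph, 0))
```

For Scale = 29 and Edge Factor = 16 this gives 2^29 = 536,870,912 vertices and 2^29 * 16 = 8,589,934,592 edges.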