Intra-node performance numbers of MVAPICH2 on Magny-Cours Architecture (05/06/13)

  • Experimental Testbed: Each node of our testbed has 24 AMD Opteron 6174 processor cores (two 12-core sockets) running at 2.2 GHz with 512 KB of L2 cache per core. Each node also has 32 GB of memory, x8 PCI Express Gen2 interfaces, and Mellanox ConnectX-2 QDR HCAs with PCI Express interfaces in a multi-rail configuration. The nodes are connected using a 36-port Mellanox QDR InfiniBand switch with QSFP ports. The operating system used was Red Hat Enterprise Linux Server release 5.5 (Tikanga).
  • MVAPICH2 currently delivers a one-way latency of 0.37 microseconds within a socket and 0.45 microseconds between sockets for 4-byte messages; unidirectional bandwidth of up to 6092.02 Million Bytes/sec within a socket and 5762.80 Million Bytes/sec between sockets; and bidirectional bandwidth of up to 11448.79 Million Bytes/sec within a socket and 11541.51 Million Bytes/sec between sockets on the above testbed. (1 Mega Byte = 1,048,576 Bytes; 1 Million Bytes = 1,000,000 Bytes)
  • Processes were mapped onto cores 1 and 2 to measure the intra-socket numbers and onto cores 1 and 12 to measure the inter-socket numbers.
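
The bandwidth figures above are reported in Million Bytes/sec (1,000,000 Bytes), not Mega Bytes/sec (1,048,576 Bytes). For readers comparing against results quoted in Mega Bytes/sec, the conversion can be sketched as follows. This is a minimal illustration only: the function name and the dictionary labels are hypothetical and are not part of MVAPICH2 or its benchmarks; the four input values are the ones quoted above.

```python
# Unit definitions as given on this page.
BYTES_PER_MILLION_BYTES = 1_000_000
BYTES_PER_MEGA_BYTE = 1_048_576


def million_to_mega_bytes_per_sec(rate_million_bytes):
    """Convert a rate in Million Bytes/sec to Mega Bytes/sec.

    Hypothetical helper for illustration; not an MVAPICH2 API.
    """
    return rate_million_bytes * BYTES_PER_MILLION_BYTES / BYTES_PER_MEGA_BYTE


# Bandwidths quoted above, in Million Bytes/sec (labels are illustrative).
reported = {
    "unidirectional, intra-socket": 6092.02,
    "unidirectional, inter-socket": 5762.80,
    "bidirectional, intra-socket": 11448.79,
    "bidirectional, inter-socket": 11541.51,
}

for label, rate in reported.items():
    converted = million_to_mega_bytes_per_sec(rate)
    print(f"{label}: {rate:.2f} Million Bytes/sec = "
          f"{converted:.2f} Mega Bytes/sec")
```

Because 1 Mega Byte is about 4.9% larger than 1 Million Bytes, each figure is correspondingly smaller when expressed in Mega Bytes/sec.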