MPI Intra-node Communication Performance on Opteron Platform (03/12/10)
- Experimental Testbed: Each node of our testbed has 16 AMD Opteron 1.95 GHz processors with 512 KB L2 cache. Each node also has 16 GB of memory and a PCI-Express bus. The nodes are equipped with MT25418 HCAs with PCI-Express interfaces. A 24-port Mellanox switch connects all the nodes. The operating system is Red Hat Enterprise Linux Server 5.
- On the above testbed, MVAPICH2 currently delivers one-way latency for 4-byte messages of 0.72 microseconds within a socket, 0.89 microseconds between sockets with 1 hop, and 1.11 microseconds between sockets with 2 hops; unidirectional bandwidth of up to 4037 Million Bytes/sec within a socket, 3705 Million Bytes/sec between sockets with 1 hop, and 3810 Million Bytes/sec between sockets with 2 hops; and bidirectional bandwidth of up to 8234 Million Bytes/sec within a socket, 8569 Million Bytes/sec between sockets with 1 hop, and 8165 Million Bytes/sec between sockets with 2 hops. (1 Mega Byte = 1,048,576 Bytes; 1 Million Bytes = 1,000,000 Bytes)
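The bandwidth figures above are in Million Bytes/sec, not Mega Bytes/sec, so they read about 4.6% lower when expressed in Mega Bytes/sec. The following is a minimal sketch of that conversion using the two definitions given in the note. The function name and dictionary layout are illustrative assumptions, not part of MVAPICH2 or any benchmark suite.

```python
# Unit definitions taken from the note above.
BYTES_PER_MILLION_BYTES = 1_000_000
BYTES_PER_MEGA_BYTE = 1_048_576  # 2**20


def million_bytes_to_mega_bytes(rate):
    """Convert a rate in Million Bytes/sec to Mega Bytes/sec.

    Hypothetical helper for illustration only.
    """
    return rate * BYTES_PER_MILLION_BYTES / BYTES_PER_MEGA_BYTE


# Unidirectional bandwidth figures reported above, in Million Bytes/sec.
unidirectional = {
    "within socket": 4037,
    "between sockets, 1 hop": 3705,
    "between sockets, 2 hops": 3810,
}

for placement, rate in unidirectional.items():
    converted = million_bytes_to_mega_bytes(rate)
    print(f"{placement}: {rate} Million Bytes/sec = {converted:.2f} Mega Bytes/sec")
```

The same helper applies unchanged to the bidirectional figures; only the input values differ.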

