Memory Usage for All-to-all Connectivity
- Experimental Testbed: 1100-node InfiniBand Linux cluster at Lawrence Livermore National Laboratory. Each compute node has four 2.4 GHz dual-core Opteron 8216 processors (8 cores in total), 16 GB of memory, and a Mellanox MT25208 DDR HCA. InfiniBand software support is provided through the OpenFabrics/Gen2 stack. Values for up to 4K processes are measured; values for 8K and 16K processes are simulated.
- MVAPICH-UD requires only around 40 MB per process even with all 16K processes connected to each other.
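To put the 40 MB figure in context, the sketch below works through the arithmetic for one node of the testbed. It rests on assumptions not stated in the slide: one MPI process per core (8 per node), 1 MB taken as 2^20 bytes, and "around 40 MB" treated as exactly 40 MB. The per-peer number is only an upper bound, obtained by attributing the whole footprint to per-peer state; the actual breakdown is not given here.

```python
MB = 1024 * 1024
GB = 1024 * MB

# Figures taken from the slide.
MEM_PER_PROCESS = 40 * MB   # ~40 MB per process at 16K processes (approximate)
NUM_PROCESSES = 16 * 1024   # 16K processes, fully connected
CORES_PER_NODE = 8          # four dual-core processors
NODE_MEMORY = 16 * GB       # 16 GB per node

# Assumption (not from the slide): one MPI process per core.
PROCS_PER_NODE = CORES_PER_NODE

# Total MPI memory footprint on one node.
node_footprint = MEM_PER_PROCESS * PROCS_PER_NODE
node_footprint_mb = node_footprint // MB

# Share of the node's memory consumed by that footprint.
node_fraction = node_footprint / NODE_MEMORY

# Upper bound on memory per connected peer: attribute the entire
# per-process footprint to per-peer state (an overestimate by design).
per_peer_upper_bound_bytes = MEM_PER_PROCESS // NUM_PROCESSES

if __name__ == "__main__":
    print(f"Per-node footprint: {node_footprint_mb} MB")
    print(f"Share of node memory: {node_fraction:.2%}")
    print(f"Per-peer upper bound: {per_peer_upper_bound_bytes} bytes")
```

Under these assumptions the footprint is 320 MB per node, a little under 2% of the 16 GB available, and no more than 2560 bytes per connected peer.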

