Designing Efficient Small Message Transfer Mechanism for Inter-node MPI Communication on InfiniBand GPU Clusters
R. Shi, S. Potluri, K. Hamidouche, M. Li, J. Perkins, D. Rossetti, D. Panda
IEEE International Conference on High Performance Computing (HiPC ’14),
Dec 2014.