Designing Efficient Small Message Transfer Mechanism for Inter-node MPI Communication on InfiniBand GPU Clusters, R. Shi, S. Potluri, K. Hamidouche, M. Li, J. Perkins, D. Rossetti, and D. Panda, IEEE International Conference on High Performance Computing (HiPC ’14), Dec 2014.