GPU-Aware Design, Implementation, and Evaluation of Non-blocking Collective Benchmarks. A. Awan, K. Hamidouche, A. Venkatesh, J. Perkins, H. Subramoni, D. Panda. EuroMPI 2015, Sep 2015.