MVAPICH/MVAPICH2 Project
Ohio State University



Publications | Network-Based Computing Laboratory

Upcoming Publications

The following papers have been presented at recent conferences or accepted for presentation at upcoming conferences.

For all publications related to the MVAPICH project and other related projects, please refer to the Publications link.

Int'l Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE13)

Int'l Conference on Supercomputing (SC13)

  • S. Potluri, D. Bureddy, K. Hamidouche, A. Venkatesh, K. Kandalla, H. Subramoni and D. K. Panda, MVAPICH-PRISM: A Proxy-based Communication Framework using InfiniBand and SCIF for Intel MIC Clusters, Int'l Conference on Supercomputing (SC '13), November 2013.

Int'l Workshop on OpenSHMEM (OpenSHMEM13)

  • J. Jose, J. Zhang, A. Venkatesh, S. Potluri and D. K. Panda, A Comprehensive Performance Evaluation of OpenSHMEM Libraries on InfiniBand Clusters, OpenSHMEM Workshop, Oct 2013.

Int'l Conference on Partitioned Global Address Space Programming (PGAS13)

  • M. Luo, M. Li, A. Venkatesh, X. Lu and D. K. Panda, UPC on MIC: Early Experiences with Native and Symmetric Modes, Int'l Conference on Partitioned Global Address Space Programming Models (PGAS '13), October 2013.
  • J. Jose, K. Kandalla, S. Potluri, J. Zhang and D. K. Panda, Optimizing Collective Communication in OpenSHMEM, Int'l Conference on Partitioned Global Address Space Programming Models (PGAS '13), October 2013.

Int'l Conference on Parallel Processing (ICPP 13)

  • K. C. Kandalla, H. Subramoni, K. Tomko, D. Pekurovsky and D. K. Panda, A Novel Functional Partitioning Approach to Design High-Performance MPI-3 Non-Blocking Alltoallv Collective on Multi-core Systems, Int'l Conference on Parallel Processing (ICPP '13), October 2013.
  • S. Potluri, K. Hamidouche, A. Venkatesh, D. Bureddy and D. K. Panda, Efficient Inter-node MPI Communication using GPUDirect RDMA for InfiniBand Clusters with NVIDIA GPUs, Int'l Conference on Parallel Processing (ICPP '13), October 2013.

IEEE Cluster (Cluster13)

  • R. Shi, S. Potluri, K. Hamidouche, X. Lu, K. Tomko and D. K. Panda, A Scalable and Portable Approach to Accelerate Hybrid HPL on Heterogeneous CPU-GPU Clusters, IEEE Cluster (Cluster '13), September 2013, Best Student Paper Award.
  • H. Subramoni, D. Bureddy, K. Kandalla, K. Schulz, B. Barth, J. Perkins, M. Arnold and D. K. Panda, Design of Network Topology Aware Scheduling Services for Large InfiniBand Clusters, IEEE Cluster (Cluster '13), September 2013.

EuroMPI (EuroMPI13)

  • M. Li, S. Potluri, K. Hamidouche, J. Jose and D. K. Panda, Efficient and Truly Passive MPI-3 RMA Using InfiniBand Atomics, EuroMPI 2013, September 2013.

Hot Interconnects (HotI13)

  • K. Kandalla, A. Venkatesh, K. Hamidouche, S. Potluri and D. K. Panda, Designing Optimized MPI Broadcast and Allreduce for Many Integrated Core (MIC) InfiniBand Clusters, Int'l Symposium on High-Performance Interconnects (HotI '13), August 2013.

Extreme Scaling Workshop (XSCALE13)

  • S. Potluri, K. Hamidouche, D. Bureddy and D. K. Panda, MVAPICH2-MIC: A High-Performance MPI Library for Xeon Phi Clusters with InfiniBand, Extreme Scaling Workshop, August 2013.
  • A. Venkatesh, K. Kandalla and D. K. Panda, Optimized MPI Gather collective for Many Integrated Core (MIC) InfiniBand Clusters, Extreme Scaling Workshop, August 2013.

Int'l Conference on High Performance and Distributed Computing (HPDC13)

  • R. Rajachandrasekar, A. Moody, K. Mohror and D. K. Panda, A 1PB/s File System to Checkpoint Three Million MPI Tasks, Int'l Conference on High Performance Distributed Computing (HPDC '13), June 2013.

Int'l Supercomputing Conference (ISC13)

  • J. Jose, S. Potluri, K. Tomko and D. K. Panda, Designing Scalable Graph500 Benchmark with Hybrid MPI+OpenSHMEM Programming Models, Int'l Supercomputing Conference (ISC '13), June 2013.

Int'l Parallel and Distributed Processing Symposium (IPDPS13)

  • S. Potluri, D. Bureddy, H. Wang, H. Subramoni and D. K. Panda, Extending OpenSHMEM for GPU Computing, Int'l Parallel and Distributed Processing Symposium (IPDPS '13), May 2013.

Int'l Workshop on High Performance Data Intensive Computing (HPDIC13)

  • M.-W. Rahman, N. S. Islam, X. Lu, J. Jose, H. Subramoni, H. Wang and D. K. Panda, High-Performance RDMA-based Design of Hadoop MapReduce over InfiniBand, Int'l Workshop on High Performance Data Intensive Computing (HPDIC), held in conjunction with Int'l Parallel and Distributed Processing Symposium (IPDPS '13), May 2013.

Int'l Symposium on Cluster, Cloud, and Grid Computing (CCGrid13)

  • S. Potluri, A. Venkatesh, D. Bureddy, K. Kandalla and D. K. Panda, Efficient Intra-node Communication on Intel-MIC Clusters, Int'l Symposium on Cluster, Cloud, and Grid Computing (CCGrid 2013), May 2013.
  • J. Jose, M. Li, X. Lu, K. Kandalla, M. Arnold and D. K. Panda, SR-IOV Support for Virtualization on InfiniBand Clusters: Early Experience, Int'l Symposium on Cluster, Cloud, and Grid Computing (CCGrid 2013), May 2013.

Int'l Conference on Supercomputing (SC12)

  • H. Subramoni, S. Potluri, K. Kandalla, B. Barth, J. Vienne, J. Keasler, K. Tomko, K. Schulz, A. Moody and D. K. Panda, Design of a Scalable InfiniBand Topology Service to Enable Network-Topology-Aware Placement of Processes, Supercomputing (SC12), November 2012. Best Paper and Best Student Paper Finalist.

Int'l Conference on Partitioned Global Address Space Programming Model (PGAS12)

  • M. Luo, H. Wang and D. K. Panda, Multi-Threaded UPC Runtime for GPU to GPU communication over InfiniBand, Int'l Conference on Partitioned Global Address Space Programming Models (PGAS '12), October 2012.
  • S. Potluri, K. Kandalla, D. Bureddy, M. Li and D. K. Panda, Efficient Intranode Designs for OpenSHMEM on Multicore Clusters, Int'l Conference on Partitioned Global Address Space Programming Models (PGAS '12), October 2012.

EuroMPI 2012

  • D. Bureddy, H. Wang, A. Venkatesh, S. Potluri and D. K. Panda, OMB-GPU: A Micro-benchmark suite for Evaluating MPI Libraries on GPU Clusters, EuroMPI 2012, September 2012.

IEEE Cluster (Cluster) 2012

  • R. Rajachandrasekar, J. Jaswani, H. Subramoni and D. K. Panda, Minimizing Network Contention in InfiniBand Clusters with a QoS-Aware Data-Staging Framework, IEEE Cluster (Cluster '12), September 2012.

Int'l Workshop on Parallel Algorithm and Parallel Software (IWPAPS) 2012

  • K. Kandalla, A. Buluç, H. Subramoni, K. Tomko, J. Vienne, L. Oliker and D. K. Panda, Can Network-Offload based Non-Blocking Neighborhood MPI Collectives Improve Communication Overheads of Irregular Graph Algorithms? Int'l Workshop on Parallel Algorithm and Parallel Software (IWPAPS12), held in conjunction with IEEE Cluster (Cluster '12), September 2012.

Int'l Conference on Parallel Processing (ICPP) 2012

  • J. Jose, K. Kandalla, M. Luo and D. K. Panda, Supporting Hybrid MPI and OpenSHMEM over InfiniBand: Design and Performance Evaluation, Int'l Conference on Parallel Processing (ICPP '12), Sept. 2012.

Int'l Workshop on Productivity and Performance (PROPER) 2012

  • H. Subramoni, J. Vienne and D. K. Panda, A Scalable InfiniBand Network-Topology-Aware Performance Analysis Tool for MPI, 5th Int'l Workshop on Productivity and Performance (PROPER 2012), in conjunction with EuroPar, Aug. 2012.

Int'l Symposium on High-Performance Interconnects (HotI), 2012

  • J. Vienne, J. Chen, M.-W. Rahman, N. Islam, H. Subramoni and D. K. Panda, Performance Analysis and Evaluation of InfiniBand FDR and 40GigE RoCE on HPC and Cloud Computing Systems, Int'l Symposium on High-Performance Interconnects (HotI 2012), August 2012.

Int'l Conference on Supercomputing (ICS), 2012

  • M. Luo, D. K. Panda, C. Iancu and K. Z. Ibrahim, Congestion Avoidance on Manycore High Performance Computing Systems, Int'l Conference on Supercomputing (ICS '12), June 2012.

Int'l Supercomputing Conference (ISC) 2012

  • M. Luo, H. Wang, J. Vienne and D. K. Panda, Redesigning MPI Shared Memory Communication for Large Multi-Core Architecture, Int'l Supercomputing Conference (ISC '12), June 2012.

Int'l Parallel and Distributed Processing Symposium (IPDPS) 2012

  • J. Huang, X. Ouyang, J. Jose, M. Wasi-ur-Rahman, H. Wang, M. Luo, H. Subramoni, C. Murthy and D. K. Panda, High-Performance Design of HBase with RDMA over InfiniBand, Int'l Parallel and Distributed Processing Symposium (IPDPS '12), May 2012.

  • K. Kandalla, U. Yang, J. Keasler, T. Kolev, A. Moody, H. Subramoni, K. Tomko, J. Vienne and D. K. Panda, Designing Non-blocking Allreduce with Collective Offload on InfiniBand Clusters: A Case Study with Conjugate Gradient Solvers, Int'l Parallel and Distributed Processing Symposium (IPDPS '12), May 2012.

Int'l Workshop on System Management Techniques, Processes, and Services (SMTPS) 2012

  • S. P. Raikar, H. Subramoni, K. Kandalla, J. Vienne and D. K. Panda, Designing Network Failover and Recovery in MPI for Multi-Rail InfiniBand Clusters, Int'l Workshop on System Management Techniques, Processes, and Services (SMTPS), in conjunction with Int'l Parallel and Distributed Processing Symposium (IPDPS '12), May 2012.

  • R. Rajachandrasekar, X. Besseron and D. K. Panda, Monitoring and Predicting Hardware Failures in HPC Clusters with FTB-IPMI, Int'l Workshop on System Management Techniques, Processes, and Services (SMTPS), in conjunction with Int'l Parallel and Distributed Processing Symposium (IPDPS '12), May 2012.

Int'l Workshop on Accelerators and Hybrid Exascale Systems (AsHES) 2012

  • S. Potluri, H. Wang, D. Bureddy, A. K. Singh, C. Rosales and D. K. Panda, Optimizing MPI Communication on Multi-GPU Systems using CUDA Inter-Process Communication, Int'l Workshop on Accelerators and Hybrid Exascale Systems (AsHES), in conjunction with Int'l Parallel and Distributed Processing Symposium (IPDPS '12), May 2012.

Int'l Conference on Cluster, Cloud and Grid Computing (CCGrid) 2012

  • J. Jose, H. Subramoni, K. Kandalla, M. Wasi-ur-Rahman, H. Wang, S. Narravula and D. K. Panda, Scalable Memcached design for InfiniBand Clusters using Hybrid Transports, Int'l Symposium on Cluster, Cloud, and Grid Computing (CCGrid 2012), May 2012.

TACC-Intel Highly Parallel Computing Symposium (TI-HPCS) 2012

  • S. Potluri, K. Tomko, D. Bureddy and D. K. Panda, Intra-MIC MPI Communication using MVAPICH2: Early Experience, TACC-Intel Highly-Parallel Computing Symposium, April 2012.

Int'l Symposium on Performance Analysis of Systems and Software (ISPASS) 2012

  • M. Wasi-ur-Rahman, J. Huang, J. Jose, X. Ouyang, H. Wang, N. S. Islam, H. Subramoni, C. Murthy and D. K. Panda, Understanding the Communication Characteristics in HBase: What are the Fundamental Bottlenecks? Int'l Symposium on Performance Analysis of Systems and Software (ISPASS '12).

Int'l Conference on High Performance Computing (HiPC) 2011

  • M. Luo, J. Jose, S. Sur and D. K. Panda, Multi-threaded UPC Runtime with Network Endpoints: Design Alternatives and Evaluation on Multi-core Architectures, Int'l Conference on High Performance Computing (HiPC '11), Dec. 2011.

Int'l Conference on Partitioned Global Address Space Programming Model (PGAS11)

  • J. Jose, S. Potluri, M. Luo, S. Sur and D. K. Panda, UPC Queues for Scalable Graph Traversals: Design and Evaluation on InfiniBand Clusters, Fifth Conference on Partitioned Global Address Space Programming Model (PGAS11), Oct. 2011.

Int'l Workshop on Interfaces and Architectures for Scientific Data Storage (IASDS) 2011

  • Can a Decentralized Metadata Service Layer benefit Parallel Filesystems? Workshop on Interfaces and Architectures for Scientific Data Storage (IASDS '11), held in conjunction with Cluster '11, Sept. 2011.

Int'l Workshop on Parallel Programming on Accelerator Clusters (PPAC) 2011

  • A. Singh, S. Potluri, H. Wang, K. Kandalla, S. Sur and D. K. Panda, MPI Alltoall Personalized Exchange on GPGPU Clusters: Design Alternatives and Benefits, Workshop on Parallel Programming on Accelerator Clusters (PPAC '11), held in conjunction with Cluster '11, Sept. 2011.

Int'l Conference on Cluster Computing (Cluster) 2011

  • H. Subramoni, K. Kandalla, J. Vienne, S. Sur, B. Barth, K. Tomko, R. McLay, K. Schulz and D. K. Panda, Design and Evaluation of Network Topology-/Speed-Aware Broadcast Algorithms for InfiniBand Clusters, IEEE Cluster '11, Sept. 2011.
  • H. Wang, S. Potluri, M. Luo, A. Singh, X. Ouyang, S. Sur and D. K. Panda, Optimized Non-contiguous MPI Datatype Communication for GPU Clusters: Design, Implementation and Evaluation with MVAPICH2, IEEE Cluster '11, Sept. 2011.

EuroMPI 2011

  • S. Potluri, H. Wang, V. Dhanraj, S. Sur and D. K. Panda, Optimizing MPI One Sided Communication on Multi-core InfiniBand Clusters using Shared Memory Backed Windows, EuroMPI '11, Sept. 2011.
  • S. Potluri, S. Sur, D. Bureddy and D. K. Panda, Design and Implementation of Key Proposed MPI-3 One-Sided Communication Semantics on InfiniBand, Poster/Short Paper, EuroMPI '11, Sept. 2011.

Int'l Conference on Parallel Processing (ICPP) 2011

  • J. Jose, H. Subramoni, M. Luo, M. Zhang, J. Huang, M. W. Rahman, N. S. Islam, X. Ouyang, S. Sur and D. K. Panda, Memcached Design on High Performance RDMA Capable Interconnects, Int'l Conference on Parallel Processing (ICPP '11), Sept. 2011.
  • X. Ouyang, R. Rajachandrasekar, X. Besseron, H. Wang, J. Huang and D. K. Panda, CRFS: A Lightweight User-Level Filesystem for Generic Checkpoint/Restart, Int'l Conference on Parallel Processing (ICPP '11), Sept. 2011.

Int'l Workshop on Productivity and Performance (PROPER) 2011

  • N. Dandapanthula, H. Subramoni, J. Vienne, K. Kandalla, S. Sur, D. K. Panda, and R. Brightwell, INAM - A Scalable InfiniBand Network Analysis and Monitoring Tool, 4th Int'l Workshop on Productivity and Performance (PROPER 2011), in conjunction with EuroPar, Aug. 2011.

Int'l Workshop on Resiliency in High Performance Computing in Clusters, Clouds, and Grids (Resilience) 2011

  • R. Rajachandrasekar, X. Ouyang, X. Besseron, V. Meshram and D. K. Panda, Can Checkpoint/Restart Mechanisms Benefit from Hierarchical Data Staging? Workshop on Resiliency in High Performance Computing in Clusters, Clouds, and Grids (Resilience '11), held in conjunction with EuroPar, Aug. 2011.

Hot Interconnects (HotI) 2011

  • K. Kandalla, H. Subramoni, J. Vienne, K. Tomko, S. Sur and D. K. Panda, Designing Non-blocking Broadcast with Collective Offload on InfiniBand Clusters: A Case Study with HPL, Hot Interconnects '11, Aug. 2011.

Int'l Supercomputing Conference (ISC) 2011

  • K. Kandalla, H. Subramoni, K. Tomko, D. Pekurovsky, S. Sur and D. K. Panda, High-Performance and Scalable Non-Blocking All-to-All with Collective Offload on InfiniBand Clusters: A Study with Parallel 3D FFT, Int'l Supercomputing Conference (ISC), June 2011.
  • H. Wang, S. Potluri, M. Luo, A. Singh, S. Sur and D. K. Panda, MVAPICH2-GPU: Optimized GPU to GPU Communication for InfiniBand Clusters, Int'l Supercomputing Conference (ISC), June 2011.

Cluster Computing and Grid (CCGrid) 2011

  • X. Ouyang, R. Rajachandrasekar, X. Besseron and D. K. Panda, High Performance Pipelined Process Migration with RDMA, Int'l Symposium on Cluster Computing and the Grid (CCGrid), May 2011.

Int'l Conference on Supercomputing (SC10)

  • Y. Cui, K. B. Olsen, T. H. Jordan, K. Lee, J. Zhou, P. Small, D. Roten, G. Ely, D. K. Panda, A. Chourasia, J. Levesque, S. M. Day and P. Maechling, Scalable Earthquake Simulation on Petascale Supercomputers, Supercomputing (SC10), November 2010. Gordon Bell Prize Finalist.

Int'l Conference on Partitioned Global Address Space Programming Model (PGAS10)

  • J. Jose, M. Luo, S. Sur and D. K. Panda, Unifying UPC and MPI Runtimes: Experience with MVAPICH, Fourth Conference on Partitioned Global Address Space Programming Model (PGAS10), Oct. 2010.