This page lists publications from the group related to designing High Performance MPI on InfiniBand. The group is also actively engaged in other research directions related to modern interconnects (PVFS and MPI-IO, Micro-Benchmark suite, Distributed Shared Memory, ARMCI, and Datacenter). Publications in these directions are listed under the corresponding links.

Journals (21)

1 S. Sur, S. Potluri, K. Kandalla, H. Subramoni, K. Tomko, and D. K. Panda, Co-Designing MPI Library and Applications for InfiniBand Clusters, IEEE Computer, November 2011
2 J. Liu, J. Wu, and D. K. Panda, High Performance RDMA-Based MPI Implementation over InfiniBand, Int'l Journal of Parallel Programming: Volume 32, June 2004
3 J. Liu, B. Chandrasekaran, W. Yu, J. Wu, D. Buntinas, S. Kini, P. Wyckoff, and D. K. Panda, Micro-Benchmark Performance Comparison of High-Speed Cluster Interconnects, IEEE Micro, January 2004
4 R. Sivaram, C. Stunkel, and D. K. Panda, HIPIQS: A High-Performance Switch Architecture using Input Queuing, IEEE Transactions on Parallel and Distributed Systems, Vol. 13, March 2002
5 M. Banikazemi, B. Abali, L. Herger, and D. K. Panda, Design Alternatives for Virtual Interface Architecture (VIA) and an Implementation on IBM Netfinity NT Cluster, Journal of Parallel and Distributed Computing, January 2001
6 H. Jin, P. Balaji, C. Yoo, J.-Y. Choi, and D. K. Panda, Exploiting NIC Architectural Support for Enhancing IP based Protocols on High Performance Networks, Journal of Parallel and Distributed Computing, January 2001
7 B. Abali, C. B. Stunkel, J. Herring, M. Banikazemi, D. K. Panda, C. Aykanat, and Y. Aydogan, Adaptive Routing on the New Switch Chip for IBM SP Systems, Journal of Parallel and Distributed Computing, January 2001
8 D. Dai, and D. K. Panda, Exploiting the Benefits of Multiple-Path Network in DSM Systems: Architectural Alternatives and Performance Evaluation, IEEE Transactions on Computers, Special Issue on Cache Memory, February 1999
9 R. Prakash, and D. K. Panda, Designing Communication Strategies for Heterogeneous Parallel Systems, Parallel Computing, December 1997
10 R. Kesavan, and D. K. Panda, Efficient Multicast on Irregular Switch-based Cut-Through Networks with Up-Down Routing, IEEE Transactions on Parallel and Distributed Systems, September 1996
11 R. Sivaram, R. Kesavan, D. K. Panda, and C. Stunkel, Architectural Support for Efficient Multicasting in Irregular Networks, IEEE Transactions on Parallel and Distributed Systems, September 1996
12 R. Sivaram, C. Stunkel, and D. K. Panda, Implementing Multidestination Worms in Switch-Based Parallel Systems: Architectural Alternatives and their Impact, IEEE Transactions on Parallel and Distributed Systems, September 1996
13 R. Kesavan, and D. K. Panda, Multiple Multicast with Minimized Node Contention on Wormhole k-ary n-cube Networks, IEEE Transactions on Parallel and Distributed Systems, September 1996
14 M. Banikazemi, R. K. Govindaraju, R. Blackmore, and D. K. Panda, MPI-LAPI: An Efficient Implementation of MPI for IBM RS/6000 SP Systems, IEEE Transactions on Parallel and Distributed Systems, September 1996
15 R. Sivaram, D. K. Panda, and C. B. Stunkel, Efficient Broadcast and Multicast on Multistage Interconnection Networks using Multiport Encoding, IEEE Transactions on Parallel and Distributed Systems, September 1996
16 D. Basak, and D. K. Panda, Designing Clustered Multiprocessor Systems under Packaging and Technological Advancements, IEEE Transactions on Parallel and Distributed Systems, September 1996
17 GPU-Aware MPI on RDMA-Enabled Cluster: Design, Implementation and Evaluation
18 P. Lai, P. Balaji, R. Thakur, and D. K. Panda, ProOnE: A General-Purpose Protocol Onload Engine for Multi- and Many-Core Architectures, Computer Science: Research and Development
19 A. Vishnu, M. Koop, A. Moody, A. Mamidala, S. Narravula, and D. K. Panda, Topology Agnostic Hot-Spot Avoidance with InfiniBand, Concurrency and Computation: Practice and Experience
20 J. Liu, A. Mamidala, A. Vishnu, and D. K. Panda, Performance Evaluation of InfiniBand with PCI Express, IEEE Micro, 2005
21 A. Wagner, D. Buntinas, R. Brightwell, and D. K. Panda, Application-Bypass Reduction for Large-Scale Clusters, Int'l Journal of High Performance Computing and Networking, Cluster 2003 Special Issue, In Press

Conferences & Workshops (299)

1 J. Zhang, X. Lu, J. Jose, M. Li, R. Shi, and D. K. Panda, High Performance MPI Library over SR-IOV Enabled InfiniBand Clusters, Int'l Conference on High Performance Computing (HiPC'14), December 2014
2 A. Venkatesh, H. Subramoni, K. Hamidouche, and D. K. Panda, A High Performance Broadcast Design with Hardware Multicast and GPUDirect RDMA for Streaming Applications on Infiniband Clusters, IEEE International Conference on High Performance Computing (HiPC’2014), December 2014
3 R. Shi, S. Potluri, K. Hamidouche, M. Li, J. Perkins, D. Rossetti, and D. K. Panda, Designing Efficient Small Message Transfer Mechanism for Inter-node MPI Communication on InfiniBand GPU Clusters, IEEE International Conference on High Performance Computing (HiPC’2014), December 2014
4 J. Jose, S. Potluri, H. Subramoni, X. Lu, K. Hamidouche, K. Schulz, H. Sundar, and D. K. Panda, Designing Scalable Out-of-core Sorting with Hybrid MPI+PGAS Programming Models, Int'l Conference on Partitioned Global Address Space Programming Models (PGAS '14), October 2014
5 S. Chakraborty, H. Subramoni, J. Perkins, A. Moody, M. Arnold, and D. K. Panda, PMI Extensions for Scalable MPI Startup, EuroMPI/ASIA 2014, September 2014
6 R. Rajachandrasekar, J. Perkins, K. Hamidouche, M. Arnold, and D. K. Panda, Understanding the Memory-Utilization of MPI Libraries: Challenges and Designs in Implementing the MPI_T Interface, EuroMPI/ASIA 2014, September 2014
7 H. Subramoni, K. Kandalla, J. Jose, K. Tomko, K. Schulz, D. Pekurovsky, and D. K. Panda, Designing Topology-Aware Communication Schedules for Alltoall Operations in Large InfiniBand Clusters, International Conference on Parallel Processing (ICPP’14), September 2014
8 R. Shi, X. Lu, S. Potluri, K. Hamidouche, J. Zhang, and D. K. Panda, HAND: A Hybrid Approach to Accelerate Non-contiguous Data Movement using MPI Datatypes on GPU Clusters, International Conference on Parallel Processing (ICPP’14), September 2014
9 M. Li, X. Lu, S. Potluri, K. Hamidouche, J. Jose, K. Tomko, and D. K. Panda, Scalable Graph500 Design with MPI-3 RMA, IEEE CLUSTER’14, September 2014
10 J. Jose, K. Hamidouche, X. Lu, S. Potluri, J. Zhang, K. Tomko, and D. K. Panda, High Performance OpenSHMEM for MIC Clusters: Extensions, Runtime Designs, and Application Co-Design, IEEE CLUSTER’14, September 2014
11 J. Zhang, X. Lu, J. Jose, R. Shi, and D. K. Panda, Can Inter-VM Shmem Benefit MPI Applications on SR-IOV based Virtualized InfiniBand Clusters?, Euro-Par 2014 Parallel Processing, August 2014
12 R. Rajachandrasekar, S. Potluri, A. Venkatesh, K. Hamidouche, M. Wasi-ur-Rahman, and D. K. Panda, MIC-Check: A Distributed Checkpointing Framework for the Intel Many Integrated Cores Architecture, Int'l Symposium on High Performance and Distributed Computing (HPDC), June 2014
13 H. Subramoni, K. Hamidouche, A. Venkatesh, S. Chakraborty, and D. K. Panda, Designing MPI Library with Dynamic Connected Transport (DCT) of InfiniBand: Early Experiences, IEEE International Supercomputing Conference (ISC ’14), June 2014
14 A. Venkatesh, S. Potluri, R. Rajachandrasekar, M. Luo, K. Hamidouche, and D. K. Panda, High Performance Alltoall and Allgather designs for InfiniBand MIC Clusters, International Parallel and Distributed Processing Symposium (IPDPS’14), May 2014
15 J. Jose, K. Hamidouche, J. Zhang, A. Venkatesh, and D. K. Panda, Optimizing Collective Communication in UPC, Int'l Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS '14), May 2014 [Slides]
16 J. Jose, J. Zhang, A. Venkatesh, S. Potluri, and D. K. Panda, A Comprehensive Performance Evaluation of OpenSHMEM Libraries on InfiniBand Clusters, OpenSHMEM Workshop, March 2014
17 M. Luo, X. Lu, K. Hamidouche, K. Kandalla, and D. K. Panda, Initial Study of Multi-Endpoint Runtime for MPI+OpenMP Hybrid Programming Model on Multi-Core Systems, International Symposium on Principles and Practice of Parallel Programming (PPoPP '14), February 2014
18 D. K. Panda, K. Tomko, K. Schulz, and A. Majumdar, The MVAPICH Project: Evolution and Sustainability of an Open Source Production Quality MPI Library for HPC, Int'l Workshop on Sustainable Software for Science: Practice and Experiences, November 2013
19 S. Potluri, D. Bureddy, K. Hamidouche, A. Venkatesh, K. Kandalla, H. Subramoni, and D. K. Panda, MVAPICH-PRISM: A Proxy-based Communication Framework using InfiniBand and SCIF for Intel MIC Clusters, Int'l Conference on Supercomputing (SC '13), November 2013
20 K. Kandalla, H. Subramoni, K. Tomko, D. Pekurovsky, and D. K. Panda, A Novel Functional Partitioning Approach to Design High-Performance MPI-3 Non-Blocking Alltoallv Collective on Multi-core Systems, Int'l Conference on Parallel Processing (ICPP '13), October 2013
21 M. Luo, M. Li, A. Venkatesh, X. Lu, and D. K. Panda, UPC on MIC: Early Experiences with Native and Symmetric Modes, Int'l Conference on Partitioned Global Address Space Programming Models (PGAS '13), October 2013
22 J. Jose, K. Kandalla, S. Potluri, J. Zhang, and D. K. Panda, Optimizing Collective Communication in OpenSHMEM, Int'l Conference on Partitioned Global Address Space Programming Models (PGAS '13), October 2013
23 S. Potluri, K. Hamidouche, A. Venkatesh, D. Bureddy, and D. K. Panda, Efficient Inter-node MPI Communication using GPUDirect RDMA for InfiniBand Clusters with NVIDIA GPUs, Int'l Conference on Parallel Processing (ICPP '13), October 2013
24 H. Subramoni, D. Bureddy, K. Kandalla, K. Schulz, B. Barth, J. Perkins, M. Arnold, and D. K. Panda, Design of Network Topology Aware Scheduling Services for Large InfiniBand Clusters, IEEE Cluster (Cluster '13), September 2013
25 R. Shi, S. Potluri, K. Hamidouche, X. Lu, K. Tomko, and D. K. Panda, A Scalable and Portable Approach to Accelerate Hybrid HPL on Heterogeneous CPU-GPU Clusters, IEEE Cluster (Cluster '13), September 2013
26 M. Li, S. Potluri, K. Hamidouche, J. Jose, and D. K. Panda, Efficient and Truly Passive MPI-3 RMA Using InfiniBand Atomics, EuroMPI 2013, September 2013 [Slides]
27 K. Kandalla, A. Venkatesh, K. Hamidouche, S. Potluri, and D. K. Panda, Designing Optimized MPI Broadcast and Allreduce for Many Integrated Core (MIC) InfiniBand Clusters, Int'l Symposium on High-Performance Interconnects (HotI '13), August 2013
28 S. Potluri, K. Hamidouche, D. Bureddy, and D. K. Panda, MVAPICH2-MIC: A High-Performance MPI Library for Xeon Phi Clusters with InfiniBand, Extreme Scaling Workshop, August 2013
29 A. Venkatesh, K. Kandalla, and D. K. Panda, Optimized MPI Gather collective for Many Integrated Core (MIC) InfiniBand Clusters, Extreme Scaling Workshop, August 2013
30 X. Lu, W. Rahman, N. Islam, and D. K. Panda, A Micro-Benchmark Suite for Evaluating Hadoop RPC on High-Performance Networks, Int'l Workshop on Big Data Benchmarking (WBDB '13), July 2013
31 R. Rajachandrasekar, A. Moody, K. Mohror, and D. K. Panda, A 1PB/s File System to Checkpoint Three Million MPI Tasks, Int'l Conference on High Performance Distributed Computing (HPDC '13), June 2013 [Slides]
32 J. Jose, S. Potluri, K. Tomko, and D. K. Panda, Designing Scalable Graph500 Benchmark with Hybrid MPI+OpenSHMEM Programming Models, Int'l Supercomputing Conference (ISC '13), June 2013
33 K. Hamidouche, S. Potluri, H. Subramoni, K. Kandalla, and D. K. Panda, MIC-RO: Enabling Efficient Remote Offload on Heterogeneous Many Integrated Core (MIC) Clusters with InfiniBand, Int'l Conference on Supercomputing (ICS '13), June 2013
34 W. Rahman, N. Islam, X. Lu, J. Jose, H. Subramoni, H. Wang, and D. K. Panda, High-Performance RDMA-based Design of Hadoop MapReduce over InfiniBand, Int'l Workshop on High Performance Data Intensive Computing (HPDIC), May 2013
35 S. Potluri, D. Bureddy, H. Wang, H. Subramoni, and D. K. Panda, Extending OpenSHMEM for GPU Computing, Int'l Parallel and Distributed Processing Symposium (IPDPS '13), May 2013 [Slides]
36 A. Venkatesh, K. Kandalla, and D. K. Panda, Evaluation of Energy Characteristics of MPI Communication Primitives with RAPL, Int'l Workshop on High Performance, May 2013
37 H. Subramoni, S. Potluri, K. Kandalla, B. Barth, J. Vienne, J. Keasler, K. Tomko, K. Schulz, A. Moody, and D. K. Panda, Design of a Scalable InfiniBand Topology Service to Enable Network-Topology-Aware Placement of Processes, Int'l Conference on Supercomputing (SC '12), November 2012
38 N. Islam, W. Rahman, J. Jose, R. Rajachandrasekar, H. Wang, H. Subramoni, C. Murthy, and D. K. Panda, High Performance RDMA-Based Design of HDFS over InfiniBand, Int'l Conference on Supercomputing (SC '12), November 2012 [Slides]
39 M. Luo, H. Wang, and D. K. Panda, Multi-Threaded UPC Runtime for GPU to GPU communication over InfiniBand, Int'l Conference on Partitioned Global Address Space Programming Models (PGAS '12), October 2012 [Slides]
40 J. Jose, K. Kandalla, M. Luo, and D. K. Panda, Supporting Hybrid MPI and OpenSHMEM over InfiniBand: Design and Performance Evaluation, Int'l Conference on Parallel Processing (ICPP '12), September 2012
41 D. Bureddy, H. Wang, A. Venkatesh, S. Potluri, and D. K. Panda, OMB-GPU: A Micro-benchmark suite for Evaluating MPI Libraries on GPU Clusters, EuroMPI 2012, September 2012
42 R. Rajachandrasekar, J. Jaswani, H. Subramoni, and D. K. Panda, Minimizing Network Contention in InfiniBand Clusters with a QoS-Aware Data-Staging Framework, IEEE Cluster (Cluster '12), September 2012
43 K. Kandalla, H. Subramoni, K. Tomko, J. Vienne, L. Oliker, and D. K. Panda, Can Network-Offload based Non-Blocking Neighborhood MPI Collectives Improve Communication Overheads of Irregular Graph Algorithms? Int'l Workshop on Parallel Algorithm and Parallel Software (IWPAPS12), held in conjunction with IEEE Cluster (Cluster '12), September 2012
44 H. Subramoni, J. Vienne, and D. K. Panda, A Scalable InfiniBand Network-Topology-Aware Performance Analysis Tool for MPI, Int'l Workshop on Productivity and Performance (Proper '12), August 2012
45 J. Vienne, J. Chen, W. Rahman, N. Islam, H. Subramoni, and D. K. Panda, Performance Analysis and Evaluation of InfiniBand FDR and 40GigE RoCE on HPC and Cloud Computing System, Int'l Symposium on High-Performance Interconnects (HotI 2012), August 2012
46 M. Luo, D. K. Panda, C. Iancu, and K. Z. Ibrahim, Congestion Avoidance on Manycore High Performance Computing Systems, Int'l Conference on Supercomputing (ICS '12), June 2012
47 M. Luo, H. Wang, J. Vienne, and D. K. Panda, Redesigning MPI Shared Memory Communication for Large Multi-Core Architecture, Int'l Supercomputing Conference (ISC '12), June 2012
48 K. Kandalla, U. Yang, J. Keasler, T. Kolev, A. Moody, H. Subramoni, K. Tomko, J. Vienne, and D. K. Panda, Designing Non-blocking Allreduce with Collective Offload on InfiniBand Clusters: A Case Study with Conjugate Gradient Solvers, Int'l Parallel and Distributed Processing Symposium (IPDPS '12), May 2012
49 S. P. Raikar, H. Subramoni, K. Kandalla, J. Vienne, and D. K. Panda, Designing Network Failover and Recovery in MPI for Multi-Rail InfiniBand Clusters, Int'l Workshop on System Management Techniques, May 2012
50 R. Rajachandrasekar, X. Besseron, and D. K. Panda, Monitoring and Predicting Hardware Failures in HPC Clusters with FTB-IPMI, Int'l Workshop on System Management Techniques, May 2012
51 S. Potluri, H. Wang, D. Bureddy, A. Singh, C. Rosales, and D. K. Panda, Optimizing MPI Communication on Multi-GPU Systems using CUDA Inter-Process Communication, Int'l Workshop on Accelerators and Hybrid Exascale Systems (AsHES), May 2012
52 S. Potluri, K. Tomko, D. Bureddy, and D. K. Panda, Intra-MIC MPI Communication using MVAPICH2: Early Experience, TACC-Intel Highly-Parallel Computing Symposium, April 2012 [Slides]
53 M. Luo, J. Jose, S. Sur, and D. K. Panda, Multi-threaded UPC Runtime with Network Endpoints: Design Alternatives and Evaluation on Multi-core Architectures, Int'l Conference on High Performance Computing (HiPC '11), December 2011
54 J. Jose, S. Potluri, M. Luo, S. Sur, and D. K. Panda, UPC Queues for Scalable Graph Traversals: Design and Evaluation on InfiniBand Clusters, Fifth Conference on Partitioned Global Address Space Programming Model (PGAS '11), October 2011 [Slides]
55 V. Meshram, X. Besseron, X. Ouyang, R. Rajachandrasekar, and D. K. Panda, Can a Decentralized Metadata Service Layer benefit Parallel Filesystems? Workshop on Interfaces and Architectures for Scientific Data Storage (IASDS '11), held in conjunction with Cluster '11, September 2011
56 A. Singh, S. Potluri, H. Wang, K. Kandalla, S. Sur, and D. K. Panda, MPI Alltoall Personalized Exchange on GPGPU Clusters: Design Alternatives and Benefits, Int'l Workshop on Parallel Programming on Accelerator Clusters (PPAC '11), September 2011
57 H. Subramoni, K. Kandalla, J. Vienne, S. Sur, B. Barth, K. Tomko, R. McLay, K. Schulz, and D. K. Panda, Design and Evaluation of Network Topology-/Speed- Aware Broadcast Algorithms for InfiniBand Clusters, IEEE Cluster '11, September 2011
58 H. Wang, S. Potluri, M. Luo, A. Singh, X. Ouyang, S. Sur, and D. K. Panda, Optimized Non-contiguous MPI Datatype Communication for GPU Clusters: Design Implementation and Evaluation with MVAPICH2, IEEE Cluster '11, September 2011
59 S. Potluri, H. Wang, V. Dhanraj, S. Sur, and D. K. Panda, Optimizing MPI One Sided Communication on Multi-core InfiniBand Clusters using Shared Memory Backed Windows, EuroMPI '11, September 2011
60 S. Potluri, S. Sur, D. Bureddy, and D. K. Panda, Design and Implementation of Key Proposed MPI-3 One-Sided Communication Semantics on InfiniBand, Poster/Short Paper, September 2011 [Slides]
61 X. Ouyang, R. Rajachandrasekar, X. Besseron, H. Wang, J. Huang, and D. K. Panda, CRFS: A Lightweight User-Level Filesystem for Generic Checkpoint/Restart, Int'l Conference on Parallel Processing (ICPP '11), September 2011 [Slides]
62 R. Rajachandrasekar, X. Ouyang, X. Besseron, V. Meshram, and D. K. Panda, Can Checkpoint/Restart Mechanisms Benefit from Hierarchical Data Staging? Workshop on Resiliency in High Performance Computing in Clusters, Clouds, August 2011
63 N. Dandapanthula, H. Subramoni, J. Vienne, K. Kandalla, S. Sur, D. K. Panda, and R. Brightwell, INAM - A Scalable InfiniBand Network Analysis and Monitoring Tool, 4th Int'l Workshop on Productivity and Performance (PROPER 2011), August 2011 [Slides]
64 K. Kandalla, H. Subramoni, J. Vienne, K. Tomko, S. Sur, and D. K. Panda, Designing Non-blocking Broadcast with Collective Offload on InfiniBand Clusters: A Case Study with HPL, Hot Interconnect '11, August 2011
65 H. Wang, S. Potluri, M. Luo, A. Singh, S. Sur, and D. K. Panda, MVAPICH2-GPU: Optimized GPU to GPU Communication for InfiniBand Clusters, Int'l Supercomputing Conference '11 (ISC'11), June 2011
66 X. Ouyang, R. Rajachandrasekar, X. Besseron, and D. K. Panda, High Performance Pipelined Process Migration with RDMA, Int'l Symposium on Cluster, May 2011 [Slides]
67 S. Potluri, A. Venkatesh, D. Bureddy, K. Kandalla, and D. K. Panda, Efficient Intra-node Communication on Intel-MIC Clusters, Int'l Symposium on Cluster, May 2011 [Slides]
68 J. Jose, M. Li, X. Lu, K. Kandalla, M. Arnold, and D. K. Panda, SR-IOV Support for Virtualization on InfiniBand Clusters: Early Experience, Int'l Symposium on Cluster, May 2011 [Slides]
69 X. Ouyang, D. Nellans, R. Wipfel, D. Flynn, and D. K. Panda, Beyond Block I/O: Rethinking Traditional Storage Primitives, 17th IEEE International Symposium on High Performance Computer Architecture (HPCA-17), February 2011 [Slides]
70 J. Jose, M. Luo, S. Sur, and D. K. Panda, Unifying UPC and MPI Runtimes: Experience with MVAPICH, Int'l Workshop on Partitioned Global Address Space (PGAS '10), October 2010 [Slides]
71 X. Ouyang, S. Marcarelli, R. Rajachandrasekar, and D. K. Panda, RDMA-Based Job Migration Framework for MPI over InfiniBand, Int'l Conference on Cluster Computing (Cluster '10), September 2010
72 H. Subramoni, P. Lai, S. Sur, and D. K. Panda, Improving Application Performance and Predictability using Multiple Virtual Lanes in Modern Multi-Core InfiniBand Clusters, Int'l Conference on Parallel Processing (ICPP '10), September 2010 [Slides]
73 K. Kandalla, E. Mancini, S. Sur, and D. K. Panda, Designing Power-Aware Collective Communication Algorithms for InfiniBand Clusters, Int'l Conference on Parallel Processing (ICPP '10), September 2010 [Slides]
74 M. Luo, S. Potluri, P. Lai, E. Mancini, H. Subramoni, K. Kandalla, S. Sur, and D. K. Panda, High Performance Design and Implementation of Nemesis Communication Layer for Two-sided and One-Sided MPI Semantics in MVAPICH2, Int'l Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2 '10), September 2010
75 S. Potluri, P. Lai, K. Tomko, S. Sur, Y. Cui, M. Tatineni, K. Schulz, W. Barth, A. Majumdar, and D. K. Panda, Quantifying Performance Benefits of Overlap using MPI-2 in a Seismic Modeling Application, 24th International Conference on Supercomputing (ICS), June 2010
76 P. Lai, S. Sur, and D. K. Panda, Designing Truly One-Sided MPI-2 RMA Intra-node Communication on Multi-core Systems, International Supercomputing Conference (ISC'10), June 2010 [Slides]
77 X. Ouyang, S. Marcarelli, and D. K. Panda, Enhancing Checkpoint Performance with Staging IO and SSD, IEEE International Workshop on Storage Network Architecture and Parallel I/Os (SNAPI), May 2010 [Slides]
78 K. Kandalla, H. Subramoni, A. Vishnu, and D. K. Panda, Designing Topology-Aware Collective Communication Algorithms for Large Scale InfiniBand Clusters: Case Studies with Scatter and Gather, Int'l Workshop on Communication Architecture for Clusters (CAC 10), April 2010
79 M. Koop, P. Shamis, I. Rabinovitz, and D. K. Panda, Designing High-Performance and Resilient Message Passing on InfiniBand, Int'l Workshop on Communication Architecture for Clusters (CAC 10), April 2010
80 X. Ouyang, K. Gopalakrishnan, and D. K. Panda, Fast Checkpointing by Write Aggregation with Dynamic Buffer and Interleaving on Multicore Architecture, Int'l Conference on High Performance Computing (HiPC '09), December 2009
81 R. Gupta, P. Beckman, H. Park, E. Lusk, P. Hargrove, A. Geist, D. K. Panda, A. Lumsdaine, and J. Dongarra, CIFTS: A Coordinated Infrastructure for Fault-Tolerant Systems, Int'l Conference on Parallel Processing (ICPP '09), September 2009
82 P. Lai, H. Subramoni, S. Narravula, A. Mamidala, and D. K. Panda, Designing Efficient FTP Mechanisms for High Performance Data-Transfer over InfiniBand, Int'l Conference on Parallel Processing (ICPP '09), September 2009 [Slides]
83 X. Ouyang, K. Gopalakrishnan, and D. K. Panda, Accelerating Checkpoint Operation by Node-Level Write Aggregation on Multicore Systems, Int'l Conference on Parallel Processing (ICPP '09), September 2009 [Slides]
84 T. Gangadharappa, M. Koop, and D. K. Panda, Designing and Evaluating MPI-2 Dynamic Process Management Support for InfiniBand, Int'l Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2 '09), September 2009
85 J. Sridhar, and D. K. Panda, Impact of Node Level Caching in MPI Job Launch Mechanisms, EuroPVM/MPI '09, September 2009 [Slides]
86 A. Vishnu, M. Krishnan, and D. K. Panda, An Efficient Hardware-Software Approach to Network Fault Tolerance with InfiniBand, Int'l Conference on Cluster Computing (Cluster '09), September 2009 [Slides]
87 M. Koop, M. Luo, and D. K. Panda, Reducing Network Contention with Mixed Workloads on Modern Multicore Clusters, Int'l Conference on Cluster Computing (Cluster '09), September 2009 [Slides]
88 G. Santhanaraman, T. Gangadharappa, S. Narravula, A. Mamidala, and D. K. Panda, Design Alternatives for Implementing Fence Synchronization in MPI-2 One-sided Communication on InfiniBand Clusters, Int'l Conference on Cluster Computing (Cluster '09), September 2009 [Slides]
89 H. Subramoni, P. Lai, M. Luo, and D. K. Panda, RDMA over Ethernet - A Preliminary Study, Int'l Workshop on High Performance Distributed Computing (HPI-DC '09), September 2009 [Slides]
90 P. Lai, P. Balaji, R. Thakur, and D. K. Panda, ProOnE: A General Purpose Protocol Onload Engine for Multi- and Many-Core Architectures, Int'l Supercomputing Conference (ISC), June 2009
91 K. Kandalla, H. Subramoni, K. Tomko, D. Pekurovsky, S. Sur, and D. K. Panda, High-Performance and Scalable Non-Blocking All-to-All with Collective Offload on InfiniBand Clusters: A Study with Parallel 3D FFT, Int'l Supercomputing Conference (ISC), June 2009
92 J. Sridhar, M. Koop, J. Perkins, and D. K. Panda, ScELA: Scalable and Extensible Launching Architecture for Clusters, Int'l Symposium on High Performance Computing (HiPC), December 2008 [Slides]
93 R. Noronha, X. Ouyang, and D. K. Panda, Designing High Performance pNFS With RDMA on InfiniBand, Int'l Symposium on High Performance Computing (HiPC), December 2008
94 P. Balaji, S. Bhagvat, R. Thakur, and D. K. Panda, Sockets Direct Protocol for Hybrid Network Stacks: A Case Study with iWARP over 10G Ethernet, Int'l Symposium on High Performance Computing (HiPC), December 2008 [Slides]
95 H. Subramoni, G. Marsh, S. Narravula, P. Lai, and D. K. Panda, Design and Evaluation of Benchmarks for Financial Applications using Advanced Message Queuing Protocol (AMQP) over InfiniBand, Workshop on High Performance Computational Finance (In conjunction with SC '08), November 2008
96 W. Huang, M. Koop, and D. K. Panda, Efficient One-Copy MPI Shared Memory Communication in Virtual Machines, IEEE Cluster 2008, September 2008 [Slides]
97 M. Koop, J. Sridhar, and D. K. Panda, Scalable MPI Design over InfiniBand using eXtended Reliable Connection, IEEE Cluster 2008, September 2008 [Slides]
98 R. Noronha, and D. K. Panda, IMCa: A High Performance Caching Frontend for GlusterFS on InfiniBand, Int'l Conference on Parallel Processing (ICPP '08), September 2008 [Slides]
99 S. Narravula, H. Subramoni, P. Lai, R. Noronha, and D. K. Panda, Performance of HPC middleware over InfiniBand WAN, Int'l Conference on Parallel Processing (ICPP '08), September 2008
100 L. Chai, P. Lai, H. Jin, and D. K. Panda, Designing An Efficient Kernel-level and User-level Hybrid Approach for MPI Intra-node Communication on Multi-core Systems, Int'l Conference on Parallel Processing (ICPP '08), September 2008 [Slides]
101 R. Kumar, A. Mamidala, M. Koop, G. Santhanaraman, and D. K. Panda, Lock-free Asynchronous Rendezvous Design for MPI Point-to-point Communication, EuroPVM/MPI '08, September 2008
102 M. Koop, R. Kumar, and D. K. Panda, Can Software Reliability Outperform Hardware Reliability on High Performance Interconnects? A Case Study with MPI over InfiniBand, 22nd ACM International Conference on Supercomputing (ICS '08), June 2008
103 A. Mamidala, R. Kumar, D. De, and D. K. Panda, MPI Collectives on modern Multicore clusters: Performance Optimizations and Communication Characteristics, Int'l Symposium on Cluster Computing and the Grid (CCGrid), May 2008
104 R. Kumar, A. Mamidala, and D. K. Panda, Scaling Alltoall Collective on Multi-core Systems, International Workshop on Communication Architecture for Clusters, April 2008 [Slides]
105 L. Chai, X. Ouyang, R. Noronha, and D. K. Panda, pNFS/PVFS2 over InfiniBand: Early Experiences, Petascale Data Storage Workshop, November 2007 [Slides]
106 R. Noronha, L. Chai, S. Shepler, and D. K. Panda, Enhancing the Performance of NFSv4 with RDMA, Int'l Workshop on Storage Network Architecture and Parallel I/Os (SNAPI'07), September 2007
107 G. Santhanaraman, S. Narravula, A. Mamidala, and D. K. Panda, MPI-2 One Sided Usage and Implementation for Read Modify Write operations: A case study with HPCC, EuroPVM/MPI 2007, September 2007
108 M. Koop, S. Sur, and D. K. Panda, Zero-Copy Protocol for MPI using InfiniBand Unreliable Datagram, IEEE Cluster 2007, September 2007
109 W. Huang, Q. Gao, J. Liu, and D. K. Panda, High Performance Virtual Machine Migration with RDMA over Modern Interconnects, IEEE Cluster 2007, September 2007
110 K. Vaidyanathan, L. Chai, W. Huang, and D. K. Panda, Efficient Asynchronous Memory Copy Operations on Multi-Core Systems and I/OAT, IEEE Cluster 2007, September 2007
111 Q. Gao, W. Huang, M. Koop, and D. K. Panda, Group-based Coordinated Checkpointing for MPI: A Case Study on InfiniBand, Int'l Conference on Parallel Processing (ICPP'07), September 2007 [Slides]
112 S. Narravula, A. Mamidala, A. Vishnu, G. Santhanaraman, and D. K. Panda, High Performance MPI over iWARP: Early Experiences, September 2007
113 R. Noronha, L. Chai, T. Talpey, and D. K. Panda, Designing NFS With RDMA For Security, Performance and Scalability, September 2007
114 S. Sur, M. Koop, L. Chai, and D. K. Panda, Performance Analysis and Evaluation of Mellanox ConnectX InfiniBand Architecture with Multi-Core Platforms, Int'l Symposium on Hot Interconnects (HotI), August 2007 [Slides]
115 M. Koop, W. Huang, K. Gopalakrishnan, and D. K. Panda, Performance Analysis and Evaluation of PCIe 2.0 and Quad-Data Rate InfiniBand, Int'l Symposium on Hot Interconnects (HotI), August 2007
116 H. Subramoni, K. Kandalla, S. Sur, and D. K. Panda, Design and Evaluation of Generalized Collective Communication Primitives with Overlap using ConnectX-2 Offload Engine, Int'l Symposium on Hot Interconnects (HotI), August 2007
117 H. Subramoni, M. Koop, and D. K. Panda, Designing Next Generation Clusters: Evaluation of InfiniBand DDR/QDR on Intel Computing Platforms, Int'l Symposium on Hot Interconnects (HotI), August 2007 [Slides]
118 M. Koop, S. Sur, Q. Gao, and D. K. Panda, High Performance MPI Design using Unreliable Datagram for Ultra-Scale InfiniBand Clusters, 21st Int'l ACM Conference on Supercomputing (ICS '07), June 2007
119 W. Huang, J. Liu, M. Koop, B. Abali, and D. K. Panda, Nomad: Migrating OS-bypass Networks in Virtual Machines, Third Int'l SIGPLAN/SIGOPS Conference on Virtual Execution Environments (VEE), June 2007
120 K. Vaidyanathan, and D. K. Panda, Benefits of I/O Acceleration Technology (I/OAT) in Clusters, International Symposium on Performance Analysis of Systems and Software (ISPASS), April 2007
121 R. Noronha, and D. K. Panda, Improving Scalability of OpenMP Applications on MultiCore Systems Using Large Page Support, International Workshop on Multithreaded Architectures and Applications (MTAAP), March 2007
122 A. Vishnu, B. Benton, and D. K. Panda, High Performance MPI on IBM 12x InfiniBand Architecture, International Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS), March 2007
123 A. Vishnu, A. Mamidala, S. Narravula, and D. K. Panda, Automatic Path Migration over InfiniBand: Early Experience, Third International Workshop on System Management Techniques, March 2007
124 K. Vaidyanathan, W. Huang, L. Chai, and D. K. Panda, Designing Efficient Asynchronous Memory Operations Using Hardware Copy Engine: A Case Study with I/OAT, International Workshop on Communication Architecture for Clusters (CAC), March 2007
125 A. Mamidala, S. Narravula, A. Vishnu, G. Santhanaraman, and D. K. Panda, Using Connection-Oriented and Connection-Less Transport on Performance and Scalability of Collective and One-sided operations: Trade-offs and Impact, International Symposium on Principles and Practice of Parallel Programming (PPoPP 2007), March 2007
126 K. Vaidyanathan, S. Narravula, and D. K. Panda, DDSS: A Low-Overhead Distributed Data Sharing Substrate for Cluster-Based Data-Centers over Modern Interconnects, Int'l Conference on High Performance Computing (HiPC), December 2006 [Slides]
127 A. Vishnu, P. Gupta, A. Mamidala, and D. K. Panda, A Software Based Approach for Providing Network Fault Tolerance in Clusters Using the uDAPL Interface: MPI Level Design and Performance Evaluation, SuperComputing (SC), November 2006
128 P. Balaji, W. Feng, S. Bhagvat, D. K. Panda, R. Thakur, and W. Gropp, Analyzing the Impact of Supporting Out-of-Order Communication on In-order Performance with iWARP, SuperComputing (SC), November 2006
129 W. Huang, M. Koop, Q. Gao, and D. K. Panda, Virtual Machine Aware Communication Libraries for High Performance Computing, SuperComputing (SC), November 2006
130 S. Sur, M. Koop, and D. K. Panda, High-Performance and Scalable MPI over InfiniBand with Reduced Memory Usage: An In-Depth Performance Analysis, SuperComputing (SC), November 2006
131 Q. Gao, F. Qin, and D. K. Panda, Finding Bugs in Large-Scale Parallel Programs by Detecting Anomaly in Data Movements, SuperComputing (SC), November 2006
132 Y. Cui, K. B. Olsen, T. H. Jordan, K. Lee, J. Zhou, P. Small, D. Roten, G. Ely, D. K. Panda, A. Chourasia, J. Levesque, S. M. Day, and P. Maechling, Scalable Earthquake Simulation on Petascale Supercomputers, SuperComputing (SC), November 2006
133 H. Jin, S. Narravula, K. Vaidyanathan, and D. K. Panda, NemC: A Network Emulator for Cluster-of-Clusters, Int'l Conf. on Computer Commn. and Networks, October 2006
134 L. Chai, A. Hartono, and D. K. Panda, Designing Efficient MPI Intra-node Communication Support for Modern Computer Architectures, Int'l Conference on Cluster Computing, September 2006
135 A. Mamidala, A. Vishnu, and D. K. Panda, Efficient Shared Memory and RDMA based design for MPI\_Allgather over InfiniBand, EuroPVM/MPI, September 2006
136 M. Koop, W. Huang, A. Vishnu, and D. K. Panda, Memory Scalability Evaluation of the Next-Generation Intel Bensley Platform with InfiniBand, Int'l Symposium on Hot Interconnect (HotI), August 2006 [Slides]
137 Q. Gao, W. Yu, W. Huang, and D. K. Panda, Application-Transparent Checkpoint/Restart for MPI Programs over InfiniBand, Int'l Conference on Parallel Processing (ICPP), August 2006 [Slides]
138 S. Liang, W. Yu, and D. K. Panda, High Performance Block I/O for Global File System (GFS) with InfiniBand RDMA, Int'l Conference on Parallel Processing (ICPP), August 2006
139 W. Huang, J. Liu, B. Abali, and D. K. Panda, A Case for High Performance Computing with Virtual Machines, Int'l Conference on Supercomputing (ICS), June 2006
140 J. Liu, W. Huang, B. Abali, and D. K. Panda, High Performance VMM-Bypass I/O in Virtual Machines, USENIX Annual Technical Conference, June 2006
141 H. Subramoni, P. Lai, R. Kettimuthu, and D. K. Panda, High Performance Data Transfer in Grid Environment Using GridFTP over InfiniBand, Int'l Symposium on Cluster Computing and the Grid (CCGrid), May 2006 [Slides]
142 K. Vaidyanathan, and S. Narravula, Optimized Distributed Data Sharing Substrate in Multi-Core Commodity Clusters: A Comprehensive Study with Applications, Int'l Symposium on Cluster Computing and the Grid (CCGrid), May 2006 [Slides]
143 E. Mancini, G. Marsh, and D. K. Panda, An MPI-Stream Hybrid Programming Model for Computational Clusters, Int'l Symposium on Cluster Computing and the Grid (CCGrid), May 2006 [Slides]
144 G. Santhanaraman, P. Balaji, K. Gopalakrishnan, R. Thakur, W. Gropp, and D. K. Panda, Natively Supporting True One-sided Communication in MPI on Multi-core Systems with InfiniBand, Int'l Symposium on Cluster Computing and the Grid (CCGrid), May 2006
145 S. Narravula, H. Jin, K. Vaidyanathan, and D. K. Panda, Designing Efficient Cooperative Caching Schemes for Multi-Tier Data-Centers over RDMA-enabled Networks, Int'l Symposium on Cluster Computing and the Grid (CCGrid), May 2006
146 M. Koop, T. Jones, and D. K. Panda, Reducing Connection Memory Requirements of MPI for InfiniBand Clusters: A Message Coalescing Approach, Int'l Symposium on Cluster Computing and the Grid (CCGrid), May 2006
147 S. Narravula, A. Mamidala, A. Vishnu, K. Vaidyanathan, and D. K. Panda, High Performance Distributed Lock Management Services using Network-based Remote Atomic Operations, Int'l Symposium on Cluster Computing and the Grid (CCGrid), May 2006
148 L. Chai, Q. Gao, and D. K. Panda, Understanding the Impact of Multi-Core Architecture in Cluster Computing: A Case Study with Intel Dual-Core System, Int'l Symposium on Cluster Computing and the Grid (CCGrid), May 2006
149 P. Lai, S. Narravula, K. Vaidyanathan, and D. K. Panda, Advanced RDMA-based Admission Control for Modern Data-Centers, Int'l Symposium on Cluster Computing and the Grid (CCGrid), May 2006 [Slides]
150 A. Vishnu, M. Koop, A. Moody, A. Mamidala, S. Narravula, and D. K. Panda, Hot-Spot Avoidance With Multi-Pathing Over InfiniBand: An MPI Perspective, Int'l Symposium on Cluster Computing and the Grid (CCGrid), May 2006
151 W. Huang, G. Santhanaraman, H. Jin, Q. Gao, and D. K. Panda, Design and Implementation of High Performance MVAPICH2: MPI2 over InfiniBand, Int'l Symposium on Cluster Computing and the Grid (CCGrid), May 2006
152 L. Chai, R. Noronha, and D. K. Panda, MPI over uDAPL: Can High Performance and Portability Exist Across Architectures? Int'l Symposium on Cluster Computing and the Grid (CCGrid), Singapore, May 2006
153 S. Sur, L. Chai, H. Jin, and D. K. Panda, Shared Receive Queue based Scalable MPI Design for InfiniBand Clusters, Int'l Parallel and Distributed Processing Symposium (IPDPS '06), April 2006
154 W. Yu, Q. Gao, and D. K. Panda, Adaptive Connection Management for Scalable MPI over InfiniBand, Int'l Parallel and Distributed Processing Symposium (IPDPS '06), April 2006 [Slides]
155 P. Balaji, S. Bhagvat, H. Jin, and D. K. Panda, Asynchronous Zero-Copy Communication for Synchronous Sockets Direct Protocol (SDP) over InfiniBand, Communication Architecture for Clusters (CAC) Workshop, April 2006
156 W. Yu, R. Noronha, S. Liang, and D. K. Panda, Benefits of High Speed Interconnects to Cluster File Systems: A Case Study with Lustre, Communication Architecture for Clusters (CAC) Workshop, April 2006
157 A. Mamidala, L. Chai, H. Jin, and D. K. Panda, Efficient SMP-Aware MPI-Level Broadcast over InfiniBand's Hardware Multicast, Communication Architecture for Clusters (CAC) Workshop, April 2006
158 S. Sur, L. Chai, H. Jin, and D. K. Panda, RDMA Read Based Rendezvous Protocol for MPI over InfiniBand: Design Alternatives and Benefits, International Symposium on Principles and Practice of Parallel Programming (PPoPP 2006), March 2006 [Slides]
159 V. Vishwanath, P. Balaji, W. Feng, J. Leigh, and D. K. Panda, A Case for UDP Offload Engines in LambdaGrids, International Workshop on Protocols for Fast Long-Distance Networks (PFLDnet 2006), February 2006
160 S. Sur, U. Bondhugula, A. Mamidala, H. Jin, and D. K. Panda, High Performance RDMA Based All-to-all Broadcast for InfiniBand Clusters, International Conference on High Performance Computing (HiPC 2005), December 2005
161 A. Vishnu, G. Santhanaraman, W. Huang, H. Jin, and D. K. Panda, Supporting MPI-2 One Sided Communication on Multi-Rail InfiniBand Clusters: Design Challenges and Performance Benefits, International Conference on High Performance Computing (HiPC 2005), December 2005
162 K. Vaidyanathan, H. Jin, and D. K. Panda, Exploiting RDMA operations for Providing Efficient Fine-Grained Resource Monitoring in Cluster-based Servers, Workshop on Remote Direct Memory Access (RDMA): Applications, September 2005
163 P. Balaji, H. Jin, K. Vaidyanathan, and D. K. Panda, Supporting iWARP Compatibility and Features for Regular Network Adapters, Workshop on Remote Direct Memory Access (RDMA): Applications, September 2005 [Slides]
164 S. Liang, R. Noronha, and D. K. Panda, Swapping to Remote Memory over InfiniBand: An Approach using a High Performance Network Block Device, IEEE Cluster Computing 2005, September 2005 [Slides]
165 P. Balaji, W. Feng, Q. Gao, R. Noronha, W. Yu, and D. K. Panda, Head-to-TOE Evaluation of High-Performance Sockets over Protocol Offload Engines, IEEE Cluster Computing 2005, September 2005 [Slides]
166 W. Yu, and D. K. Panda, Benefits of Quadrics Scatter/Gather to PVFS2 Noncontiguous I/O, International Workshop on Storage Network Architecture and Parallel I/Os (SNAPI) 2005, September 2005 [Slides]
167 S. Sur, A. Vishnu, H. Jin, W. Huang, and D. K. Panda, Can Memory-Less Network Adapters Benefit Next-Generation InfiniBand Systems?, Hot Interconnect 13 (HOTI 05), August 2005 [Slides]
168 W. Feng, P. Balaji, C. Baron, L. N. Bhuyan, and D. K. Panda, Performance Characterization of a 10-Gigabit Ethernet TOE, Hot Interconnect 13 (HOTI 05), August 2005 [Slides]
169 R. Noronha, and D. K. Panda, Performance Evaluation of MM5 on Clusters With Modern Interconnects: Scalability and Impact, Euro-Par, August 2005
170 H. Jin, S. Narravula, K. Vaidyanathan, P. Balaji, and D. K. Panda, Performance Evaluation of RDMA over IP: A Case Study with Ammasso Gigabit Ethernet NIC, HPI-DC Workshop, July 2005
171 W. Yu, S. Liang, and D. K. Panda, High Performance Support of Parallel Virtual File System (PVFS2) over Quadrics, Int'l Conference on Supercomputing (ICS '05), June 2005
172 H. Jin, S. Sur, L. Chai, and D. K. Panda, LiMIC: Support for High-Performance MPI Intra-Node Communication on Linux Cluster, International Conference on Parallel Processing (ICPP-05), June 2005 [Slides]
173 S. Narravula, P. Balaji, K. Vaidyanathan, H. Jin, and D. K. Panda, Architecture for Caching Responses with Multiple Dynamic Dependencies in Multi-Tier Data-Centers over InfiniBand, IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 05), May 2005
174 R. Noronha, and D. K. Panda, Can High Performance Software DSM Systems Designed With InfiniBand Features Benefit from PCI-Express?, DSM Workshop, May 2005
175 and D. K. Panda, Designing Multi-Level, Multi-Tier Data Center Architecture for Securing Distributed Infrastructure and Assets, April 2005
176 L. Chai, S. Sur, H. Jin, and D. K. Panda, Analysis of Design Considerations for Optimizing Multi-Channel MPI over InfiniBand, Workshop on Communication Architecture on Clusters (CAC '05), April 2005
177 W. Huang, G. Santhanaraman, H. Jin, and D. K. Panda, Scheduling of MPI-2 One Sided Operations over InfiniBand, Workshop on Communication Architecture on Clusters (CAC '05), April 2005 [Slides]
178 A. Vishnu, A. Mamidala, and H.- W, Performance Modeling of Subnet Management on Fat Tree InfiniBand Networks using OpenSM, Workshop on System Management Tools on Large Scale Parallel Systems, April 2005
179 W. Yu, T. S. Woodall, R. L. Graham, and D. K. Panda, Design and Implementation of Open MPI over Quadrics/Elan4, International Parallel and Distributed Processing Symposium (IPDPS 2005), April 2005 [Slides]
180 P. Balaji, S. Narravula, K. Vaidyanathan, H. Jin, and D. K. Panda, On the Provision of Prioritization and Soft QoS in Dynamically Reconfigurable Shared Data-Centers over InfiniBand, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 05), March 2005 [Slides]
181 K. Vaidyanathan, P. Balaji, H. Jin, and D. K. Panda, Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers over InfiniBand, Computer Architecture Evaluation using Commercial Workloads (CAECW-8) (in conjunction with HPCA), February 2005
182 W. Yu, J. Wu, and D. K. Panda, Scalable Startup of Parallel Programs over InfiniBand, Int'l Conference on High Performance Computing (HiPC '04), December 2004 [Slides]
183 J. Liu, A. Vishnu, and D. K. Panda, Building Multirail InfiniBand Clusters: MPI-Level Design and Performance Evaluation, SuperComputing 2004 Conference (SC 04), November 2004 [Slides]
184 W. Yu, D. Buntinas, and D. K. Panda, Scalable and High Performance NIC-Based Allgather over Myrinet/GM, Int'l Conference on Cluster Computing 2004, September 2004 [Slides]
185 A. Mamidala, J. Liu, and D. K. Panda, Efficient Barrier and Allreduce on IBA Clusters using Hardware Multicast and Adaptive Algorithms, Int'l Conference on Cluster Computing 2004, September 2004
186 A. Wagner, H. Jin, R. Riesen, and D. K. Panda, NIC-Based Offload of Dynamic User-Defined Modules for Myrinet Clusters, Int'l Conference on Cluster Computing 2004, September 2004
187 R. Noronha, and D. K. Panda, Reducing Diff Overhead in Software DSM Systems using RDMA Operations in InfiniBand, Int'l Workshop on Remote Direct Memory Access (RDMA): Applications, September 2004 [Slides]
188 P. Balaji, K. Vaidyanathan, S. Narravula, K. Savitha, H. Jin, and D. K. Panda, Exploiting Remote Memory Operations to Design Efficient Reconfiguration for Shared Data-Centers over InfiniBand, Int'l Workshop on Remote Direct Memory Access (RDMA): Applications, September 2004 [Slides]
189 P. Balaji, H. V. Shah, and D. K. Panda, Sockets vs RDMA Interface over 10-Gigabit Networks: An In-depth analysis of the Memory Traffic Bottleneck, Int'l Workshop on Remote Direct Memory Access (RDMA): Applications, September 2004 [Slides]
190 G. Santhanaraman, J. Wu, and D. K. Panda, Zero-Copy MPI Derived Datatype Communication over InfiniBand, EuroPVM/MPI 2004, September 2004 [Slides]
191 W. Jiang, J. Liu, H. Jin, D. K. Panda, D. Buntinas, R. Thakur, and W. Gropp, Efficient Implementation of MPI-2 Passive One-Sided Communication on InfiniBand Clusters, EuroPVM/MPI 2004, September 2004 [Slides]
192 J. Liu, A. Mamidala, A. Vishnu, and D. K. Panda, Performance Evaluation of InfiniBand with PCI Express, Hot Interconnect 12 (HOTI 04), August 2004
193 S. Sur, H. Jin, and D. K. Panda, Efficient and Scalable All-to-All Personalized Exchange for InfiniBand-based Clusters, Int'l Conference on Parallel Processing (ICPP '04), August 2004
194 J. Liu, W. Jiang, P. Wyckoff, D. K. Panda, D. Ashton, D. Buntinas, W. Gropp, and B. Toonen, Design and Implementation of MPICH2 over InfiniBand with RDMA Support, Int'l Parallel and Distributed Processing Symposium (IPDPS 04), April 2004 [Slides]
195 J. Liu, A. Mamidala, and D. K. Panda, Fast and Scalable MPI-Level Broadcast using InfiniBand's Hardware Multicast Support, Int'l Parallel and Distributed Processing Symposium (IPDPS 04), April 2004 [Slides]
196 J. Wu, P. Wyckoff, and D. K. Panda, High Performance Implementation of MPI Datatype Communication over InfiniBand, Int'l Parallel and Distributed Processing Symposium (IPDPS 04), April 2004
197 V. Tipparaju, G. Santhanaraman, J. Nieplocha, and D. K. Panda, Host-Assisted Zero-Copy Remote Memory Access Communication on InfiniBand, Int'l Parallel and Distributed Processing Symposium (IPDPS 04), April 2004
198 J. Liu, and D. K. Panda, Implementing Efficient and Scalable Flow Control Schemes in MPI over InfiniBand, Int'l Workshop on Communication Architecture for Clusters (CAC 04), April 2004 [Slides]
199 W. Yu, and D. K. Panda, Efficient and Scalable Barrier over Quadrics and Myrinet with a New NIC-Based Collective Message Passing Protocol, Int'l Workshop on Communication Architecture for Clusters (CAC 04), April 2004 [Slides]
200 W. Jiang, J. Liu, H. Jin, D. K. Panda, W. Gropp, and R. Thakur, High Performance MPI-2 One-Sided Communication over InfiniBand, Int'l Symposium on Cluster Computing and the Grid (CCGrid 04), April 2004 [Slides]
201 J. Wu, P. Wyckoff, D. K. Panda, and R. Ross, Unifier: Unifying Cache Management and Communication Buffer Management for PVFS over InfiniBand, Int'l Symposium on Cluster Computing and the Grid (CCGrid 04), April 2004
202 R. Noronha, and D. K. Panda, Designing High Performance DSM Systems using InfiniBand Features, Int'l Workshop on Distributed Shared Memory Systems, April 2004 [Slides]
203 P. Balaji, S. Narravula, K. Vaidyanathan, S. Krishnamoorthy, J. Wu, and D. K. Panda, Sockets Direct Protocol over InfiniBand in Clusters: Is it Beneficial? Int'l Symposium on Performance Analysis of Systems and Software (ISPASS 04), March 2004
204 A. Wagner, D. Buntinas, R. Brightwell, and D. K. Panda, Application-Bypass Reduction for Large-Scale Clusters, Cluster 2003 Conference, December 2003
205 J. Wu, P. Wyckoff, and D. K. Panda, Supporting Efficient Noncontiguous Access in PVFS over InfiniBand, Cluster 2003 Conference, December 2003
206 V. Tipparaju, M. Krishnan, J. Nieplocha, G. Santhanaraman, and D. K. Panda, Optimizing Mechanisms for Latency Tolerance in Remote Memory Access Communication, Cluster 2003 Conference, December 2003
207 J. Liu, B. Chandrasekaran, J. Wu, W. Jiang, S. Kini, W. Yu, D. Buntinas, P. Wyckoff, and D. K. Panda, Performance Comparison of MPI Implementations over InfiniBand, Myrinet and Quadrics, November 2003
208 A. Moody, J. Fernandez, F. Petrini, and D. K. Panda, Scalable NIC-based Reduction on Large-scale Clusters, SuperComputing (SC) Conference, November 2003
209 W. Yu, S. Sur, D. K. Panda, R. T. Aulwes, and R. Graham, High Performance Broadcast Support in LA-MPI over Quadrics, Los Alamos Computer Science Institute (LACSI) Symposium, October 2003 [Slides]
210 W. Yu, D. Buntinas, and D. K. Panda, High Performance and Reliable NIC-Based Multicast over Myrinet/GM-2, International Conference on Parallel Processing (ICPP 03), October 6-9, 2003 [Slides]
211 J. Wu, P. Wyckoff, and D. K. Panda, PVFS over InfiniBand: Design and Performance Evaluation, International Conference on Parallel Processing (ICPP 03), October 6-9, 2003
212 L. Chai, R. Noronha, P. Gupta, G. Brown, and D. K. Panda, Designing a Portable MPI-2 over Modern Interconnects using uDAPL Interface, Euro PVM/MPI Conference, September 2003
213 A. Mamidala, H. Jin, and D. K. Panda, Efficient Hardware Multicast Group Management for Multiple MPI Communicators over InfiniBand, Euro PVM/MPI Conference, September 2003 [Slides]
214 W. Huang, G. Santhanaraman, H. Jin, and D. K. Panda, Design Alternatives and Performance Trade-offs for Implementing MPI-2 over InfiniBand, Euro PVM/MPI Conference, September 2003 [Slides]
215 S. Kini, J. Liu, J. Wu, P. Wyckoff, and D. K. Panda, Fast and Scalable Barrier using RDMA and Multicast Mechanisms for InfiniBand-Based Clusters, Euro PVM/MPI Conference, September 2003
216 J. Wu, P. Wyckoff, and D. K. Panda, Demotion-Based Exclusive Caching through Demote Buffering: Design and Evaluations over Different Networks, Workshop on Storage Network Architecture and Parallel I/O (SNAPI), September 2003
217 B. Chandrasekaran, P. Wyckoff, and D. K. Panda, MIBA: A Micro-benchmark Suite for Evaluating InfiniBand Architecture Implementations, Performance TOOLS 2003, September 2003
218 J. Liu, B. Chandrasekaran, W. Yu, J. Wu, D. Buntinas, S. P. Kini, P. Wyckoff, and D. K. Panda, Micro-Benchmark Level Performance Comparison of High-Speed Cluster Interconnects, Hot Interconnects 10, August 2003
219 J. Liu, J. Wu, S. Kini, P. Wyckoff, and D. K. Panda, High Performance RDMA-Based MPI Implementation over InfiniBand, Int'l Conference on Supercomputing (ICS '03), June 2003
220 S. Senapathi, B. Chandrasekharan, D. Stredney, H.-W. Shen, and D. K. Panda, QoS-aware Middleware for Cluster-based Servers to Support Interactive and Resource-Adaptive Applications, High Performance Distributed Computing (HPDC-12), June 2003
221 P. Balaji, J. Wu, T. Kurc, U. Catalyurek, D. K. Panda, and J. Saltz, Impact of High Performance Sockets on Data Intensive Applications, High Performance Distributed Computing (HPDC-12), June 2003
222 D. Buntinas, D. K. Panda, and R. Brightwell, Application-Bypass Broadcast in MPICH over GM, Cluster Computing and Grid (CCGrid '03), May 2003
223 D. Buntinas, A. Saify, D. K. Panda, and J. Nieplocha, Optimizing Barrier and Lock Operations in ARMCI, Int'l Workshop on Communication Architecture for Clusters (CAC '03), April 2003
224 R. Gupta, P. Balaji, D. K. Panda, and J. Nieplocha, Efficient Collective Operations using Remote Memory Operations on VIA-Based Clusters, Int'l Parallel and Distributed Processing Symposium (IPDPS '03), April 2003
225 V. Tipparaju, J. Nieplocha, and D. K. Panda, Fast Collective Operations Using Shared and Remote Memory Access Protocols on Clusters, Int'l Parallel and Distributed Processing Symposium (IPDPS '03), April 2003. Best paper in the software track.
226 D. Buntinas, and D. K. Panda, NIC-Based Reduction in Myrinet Clusters: Is It Beneficial? SAN-02 Workshop (in conjunction with HPCA), February 2003
227 J. Liu, M. Banikazemi, B. Abali, and D. K. Panda, A Portable Client/Server Communication Middleware over SANs: Design and Performance Evaluation with InfiniBand, SAN-02 Workshop (in conjunction with HPCA), February 2003
228 S. Narravula, P. Balaji, K. Vaidyanathan, S. Krishnamoorthy, J. Wu, and D. K. Panda, Supporting Strong Coherency for Active Caches in Multi-Tier Data-Centers over InfiniBand, In SAN-03 Workshop (in conjunction with HPCA), February 2003 [Slides]
229 J. Liu, D. K. Panda, and M. Banikazemi, Evaluating the Impact of RDMA on Storage I/O over InfiniBand, In SAN-03 Workshop (in conjunction with HPCA), February 2003 [Slides]
230 J. Wu, J. Liu, P. Wyckoff, and D. K. Panda, Impact of On-Demand Connection Management in MPI over VIA, Cluster '02, September 2002
231 R. Gupta, V. Tipparaju, J. Nieplocha, and D. K. Panda, Efficient Barrier using Remote Memory Operations on VIA-Based Clusters, Cluster '02, September 2002
232 P. Balaji, P. Shivam, P. Wyckoff, and D. K. Panda, High Performance User-Level Sockets over Gigabit Ethernet, Cluster '02, September 2002
233 S. Senapathi, D. K. Panda, D. Stredney, and H.-W. Shen, A QoS Framework for Clusters to support Applications with Resource Adaptivity and Predictable Performance, Int'l Workshop on Quality of Service (IWQoS), May 2002
234 P. Shivam, P. Wyckoff, and D. K. Panda, Can User Level Protocols Take Advantage of Multi-CPU NICs?, Int'l Parallel and Distributed Processing Symposium (IPDPS '02), April 2002
235 J. Wu, and D. K. Panda, MPI/IO on DAFS Over VIA: Implementation and Performance Evaluation, Communication Architecture for Clusters (CAC'02) Workshop, April 2002
236 J. Nieplocha, V. Tipparaju, A. Saify, and D. K. Panda, Protocols and Strategies for Optimizing Remote Memory Operations on Clusters, Communication Architecture for Clusters (CAC'02) Workshop, held in conjunction with IPDPS '02, April 2002
237 D. Buntinas, D. K. Panda, and W. Gropp, NIC-Based Atomic Operations on Myrinet/GM, SAN-1 Workshop, February 2002
238 P. Shivam, P. Wyckoff, and D. K. Panda, EMP: Zero-copy OS-bypass NIC-driven Gigabit Ethernet Message Passing, Supercomputing '01, February 2002
239 M. Banikazemi, J. Liu, D. K. Panda, and P. Sadayappan, Implementing TreadMarks over VIA on Myrinet and Gigabit Ethernet: Challenges, Design Experience, September 2001
240 R. Noronha, and D. K. Panda, Implementing TreadMarks over GM on Myrinet: Challenges, Design Experience, September 2001 [Slides]
241 A. Gulati, D. K. Panda, P. Sadayappan, and P. Wyckoff, NIC-based Rate Control for Proportional Bandwidth Allocation in Myrinet Clusters, Int'l Conference on Parallel Processing, September 2001
242 D. Buntinas, D. K. Panda, and P. Sadayappan, Performance Benefits of NIC-Based Barrier on Myrinet/GM, Workshop on Communication Architecture for Clusters (CAC '01), April 2001
243 M. Banikazemi, and D. K. Panda, Can Scatter Communication Take Advantage of Multidestination Message Passing? Int'l Symposium on High Performance Computing (HiPC '00), December 2000
244 P. Holenarsipur, V. Yarmolenko, J. Duato, D. K. Panda, and P. Sadayappan, Characterization and Enhancement of Static Mapping Heuristics for Heterogeneous Systems, Int'l Symposium on High Performance Computing (HiPC '00), December 2000
245 V. Yarmolenko, J. Duato, D. K. Panda, and P. Sadayappan, Dynamic Mapping Heuristics in Heterogeneous Systems, Workshop on Network-Based Computing, August 2000
246 A. Paul, W.-C. Feng, D. K. Panda, and P. Sadayappan, Balancing Web Server Load for Adaptive Video Distribution, Workshop on Multimedia Computing, August 2000
247 M. Banikazemi, D. K. Panda, and P. Sadayappan, Implementing TreadMarks on Virtual Interface Architecture (VIA): Design Issues and Alternatives, Ninth Workshop on Scalable Shared Memory Multiprocessors, June 2000
248 M. Koop, J. Sridhar, and D. K. Panda, TupleQ: Fully-Asynchronous and Zero-Copy MPI over InfiniBand, Int'l Parallel and Distributed Processing Symposium (IPDPS), May 2000 [Slides]
249 M. Koop, T. Jones, and D. K. Panda, MVAPICH-Aptus: Scalable High-Performance Multi-Transport MPI over InfiniBand, Int'l Parallel and Distributed Processing Symposium (IPDPS), May 2000 [Slides]
250 G. Santhanaraman, S. Narravula, and D. K. Panda, Designing Passive Synchronization for MPI-2 One-Sided Communication to Maximize Overlap, Int'l Parallel and Distributed Processing Symposium (IPDPS), May 2000
251 M. Banikazemi, J. Liu, S. Kutlug, A. Ramakrishna, P. Sadayappan, H. Sah, and D. K. Panda, VIBe: A Micro-benchmark Suite for Evaluating Virtual Interface Architecture (VIA) Implementations, Int'l Parallel and Distributed Processing Symposium (IPDPS), May 2000
252 D. Buntinas, D. K. Panda, and P. Sadayappan, Fast NIC-Based Barrier over Myrinet/GM, Int'l Parallel and Distributed Processing Symposium (IPDPS), May 2000
253 M. Banikazemi, V. Moorthy, L. Herger, D. K. Panda, and B. Abali, Efficient Virtual Interface Architecture Support for the IBM SP Switch-Connected NT Clusters, Int'l Parallel and Distributed Processing Symposium (IPDPS), May 2000
254 A. Singhal, M. Banikazemi, P. Sadayappan, and D. K. Panda, Efficient Multicast Algorithms for Heterogeneous Switch-based Irregular Networks of Workstations, Int'l Parallel and Distributed Processing Symposium (IPDPS), May 2000
255 M. Banikazemi, C. B. Stunkel, D. K. Panda, and B. Abali, Adaptive Routing in RS/6000 SP-like Bidirectional Multistage Interconnection Networks, Int'l Parallel and Distributed Processing Symposium (IPDPS), May 2000
256 M. Banikazemi, B. Abali, and D. K. Panda, Comparison and Evaluation of Design Choices for Implementing the Virtual Interface Architecture (VIA), Fourth Int'l Workshop on Communication, January 2000
257 D. Buntinas, D. K. Panda, J. Duato, and P. Sadayappan, Broadcast/Multicast over Myrinet Using NIC-Assisted Multidestination Messages, Fourth Int'l Workshop on Communication and Architectural Support for Network-Based Parallel Computing (CANPC'00), January 2000
258 V. Moorthy, D. K. Panda, and P. Sadayappan, Fast Collective Communication Algorithms for Reflective Memory Network Clusters, Fourth Int'l Workshop on Communication and Architectural Support for Network-Based Parallel Computing (CANPC'00), January 2000
259 M. Banikazemi, R. Govindaraju, R. Blackmore, and D. K. Panda, Implementing Efficient MPI on LAPI for the IBM-SP: Experiences and Performance Evaluation, International Parallel Processing Symposium (IPPS'99), January 2000
260 V. Moorthy, M. Jacunski, M. Pillai, P. Ware, D. K. Panda, T. Page, P. Sadayappan, V. Nagarajan, and J. Daniel, Low Latency Message Passing on Workstation Clusters using SCRAMNet, International Parallel Processing Symposium (IPPS'99), January 2000
261 M. Jacunski, P. Sadayappan, and D. K. Panda, All-to-All Broadcast on Switch-Based Clusters of Workstations, International Parallel Processing Symposium (IPPS'99), April 1999
262 M. Banikazemi, S. Prabhu, J. Sampathkumar, D. K. Panda, and P. Sadayappan, Communication Modeling of Heterogeneous Networks of Workstations for Performance Characterization of Collective Operations, International Workshop on Heterogeneous Computing (HCW'99), January 2000
263 M. Jacunski, V. Moorthy, P. Ware, M. Pillai, D. K. Panda, and P. Sadayappan, Low Latency Message-Passing for Reflective Memory Networks, International Workshop on Communication, January 1999
264 R. Sivaram, R. Kesavan, D. K. Panda, and C. B. Stunkel, Where to Provide Support for Efficient Multicasting in Irregular Networks: Network Interface or Switch? International Conference on Parallel Processing, August 1998
265 A. Bala, D. Shah, W.-C. Feng, and D. K. Panda, Experiences with Software MPEG-2 Video Decompression on an SMP PC, ICPP Workshop, August 1998
266 R. Sivaram, C. Stunkel, and D. K. Panda, HIPIQS: A High-Performance Switch Architecture using Input Queuing, International Parallel Processing Symposium (IPPS '98), August 1998
267 A-H. Smai, D. K. Panda, and L-E. Thorelli, Prioritized Demand Multiplexing (PDM): A Low-Latency Virtual Channel Flow Control Framework for Prioritized Traffic, International Conference on High Performance Computing, December 1997
268 D. Dai, and D. K. Panda, How Much Does Network Contention Affect Distributed Shared Memory Performance? International Conference on Parallel Processing (ICPP'97), pp. 454-461, December 1997
269 R. Kesavan, and D. K. Panda, Optimal Multicast with Packetization and Network Interface Support, International Conference on Parallel Processing (ICPP'97), December 1997
270 R. Kesavan, and D. K. Panda, Multicasting on Switch-based Irregular Networks using Multi-drop Path-based Multidestination Worms, Parallel Computing, December 1997
271 R. Sivaram, D. K. Panda, and C. B. Stunkel, Multicasting in Irregular Networks with Cut-Through Switches using Tree-Based Multidestination Worms, Parallel Computing, December 1997
272 D. Dai, and D. K. Panda, How Can We Design Better Networks for DSM Systems? Parallel Computing, Routing, December 1997
273 C. B. Stunkel, R. Sivaram, and D. K. Panda, Implementing Multidestination Worms in Switch-Based Parallel Systems: Architectural Alternatives and their Impact, International Symposium on Computer Architecture (ISCA'97), June 1997
274 R. Sivaram, C. B. Stunkel, and D. K. Panda, A Reliable Hardware Barrier Synchronization Scheme, International Parallel Processing Symposium (IPPS'97), April 1997
275 R. Kesavan, and D. K. Panda, Minimizing Node Contention in Multiple Multicast on Wormhole k-ary n-cube Networks, International Conference on Parallel Processing, August 1996
276 M. Banikazemi, V. Moorthy, and D. K. Panda, Efficient Collective Communication on Heterogeneous Networks of Workstations, International Conference on Parallel Processing, August 1996
277 F. Silla, M. P. Malumbres, J. Duato, D. Dai, and D. K. Panda, Impact of Adaptivity on the Behavior of Networks of Workstations under Bursty Traffic, International Conference on Parallel Processing, August 1996
278 D. Basak, and D. K. Panda, Designing Processor-cluster Based Systems: Interplay Between Cluster Organizations and Collective Communication Algorithms, International Conference on Parallel Processing, August 1996
279 D. Dai, and D. K. Panda, Reducing Cache Invalidation Overheads in Wormhole DSMs using Multidestination Message Passing, International Conference on Parallel Processing, August 1996
280 N. S. Sundar, D. N. Jayasimha, D. K. Panda, and P. Sadayappan, Hybrid Algorithms for Complete Exchange in 2D Meshes, Proceedings of the International Conference on Supercomputing, May 1996
281 R. Kesavan, K. Bondalapati, and D. K. Panda, Multicast on Irregular Switch-based Networks with Wormhole Routing, Proceedings of the Third International Symposium on High Performance Computer Architecture (HPCA-3), February 1996
282 and D. K. Panda, Fast Barrier Synchronization in Wormhole k-ary n-cube Networks with Multidestination Worms, Proc. of the International Symposium on High Performance Computer Architecture, January 1995
283 D. K. Panda, and D. Basak, Issues in Designing Scalable Systems with k-ary n-cube cluster-c organization, Proc. of the International Workshop on Parallel Processing, December 1994
284 R. Prakash, and D. K. Panda, Architectural Issues in Designing Heterogeneous Parallel Systems with Passive Star-Coupled Optical Interconnection, Proc. of the International Symposium on Parallel Architectures, December 1994
285 D. Basak, and D. K. Panda, Designing Large Hierarchical Multiprocessor Systems under Processor, Interconnection, August 1994
286 D. K. Panda, and V. Dixit-Radiya, Message-Ordering for Wormhole-Routed Multiport Systems with Link Contention and Routing Adaptivity, Proc. of the Scalable High Performance Computing Conference, May 1994
287 N. S. Sundar, D. N. Jayasimha, D. K. Panda, and P. Sadayappan, Complete Exchange in 2D Meshes, Proc. of the Scalable High Performance Computing Conference, May 1994
288 D. K. Panda, S. Singal, and P. Prabhakaran, Multidestination Message Passing Mechanism Conforming to Base Wormhole Routing Scheme, Proc. of the Parallel Routing and Communication Workshop, May 1994
289 D. Basak, and D. K. Panda, Scalable Architecture with k-ary n-cube cluster-c Organizations, Proc. of the Symposium on Parallel and Distributed Processing, December 1993
290 V. Dixit-Radiya, and D. K. Panda, Task Assignment in Distributed-Memory Systems with Adaptive Wormhole Routing, Proc. of the Symposium on Parallel and Distributed Processing, December 1993
291 D. K. Panda, Optimal Phase Barrier Synchronization in k-ary n-cube Wormhole-routed Systems using Multirendezvous Primitives, Workshop on Fine-Grain Massively Parallel Coordination, May 1993
292 T. Mzaik, S. Chandra, J. M. Jagadeesh, and D. K. Panda, Analysis of Routing in Pyramid Architectures, Proc. of the IEEE National Aerospace and Electronics Conference (NAECON), May 1993
293 S. Balakrishnan, and D. K. Panda, Impact of Multiple Consumption Channels on Wormhole Routed k-ary n-cube Networks, Proc. of the International Parallel Processing Symposium, April 1993
294 D. Basak, D. K. Panda, and M. Banikazemi, Benefits of Processor Clustering in Designing Large Parallel Systems: When and How?, Proc. of the International Parallel Processing Symposium, April 1993
295 S. K. S. Gupta, and D. K. Panda, Barrier Synchronization in Distributed-Memory Multiprocessors using Rendezvous Primitives, Proc. of the International Parallel Processing Symposium, April 1993
296 D. K. Panda, Global Reduction in Wormhole k-ary n-cube Networks with Multidestination Exchange Worms, Proc. of the International Parallel Processing Symposium, April 1993
297 Y.-C. Tseng, and D. K. Panda, A Trip-based Multicasting Model for Wormhole-routed Networks with Virtual Channels, Proc. of the International Parallel Processing Symposium, April 1993
298 Y.-C. Tseng, S. K. S. Gupta, and D. K. Panda, An Efficient Scheme for Complete Exchange in 2D Tori, Proc. of the International Parallel Processing Symposium, April 1993
299 V. Dixit-Radiya, and D. K. Panda, Clustering and Intra-Processor Scheduling for Explicitly-Parallel Programs on Distributed-Memory Systems, Proc. of the International Parallel Processing Symposium, April 1993

Ph.D. Dissertations (25)

1 M. Luo, Designing Efficient MPI and UPC Runtime for Multicore Clusters with InfiniBand and Heterogeneous System, Jul 2013
2 K. Kandalla, High Performance Non-Blocking Collective Communication for Next Generation InfiniBand Clusters, Jul 2013
3 H. Subramoni, Topology-Aware MPI communication and Scheduling for High Performance Computing Systems, Jul 2013
4 X. Ouyang, Efficient Storage Middleware Design in InfiniBand Clusters for High-End Computing, Mar 2012
5 G. Santhanaraman, Designing Scalable And High Performance One Sided Communication Middleware For Modern Interconnects, Jun 2009
6 M. Koop, High-Performance Multi-Transport MPI Design For Ultra-Scale Infiniband Clusters, Jun 2009
7 L. Chai, High Performance And Scalable MPI Intra-Node Communication Middleware For Multi-Core Clusters, Mar 2009
8 W. Huang, High Performance Network I/O In Virtual Machines Over Modern Interconnects, Aug 2008
9 R. Noronha, Designing High-Performance and Scalable Clustered Network Attached Storage With InfiniBand, Aug 2008
10 S. Narravula, Designing High-Performance and Scalable Distributed Datacenter Services over Modern Interconnects, Aug 2008
11 A. Mamidala, Scalable and High Performance Collective Communication For Next Generation Multicore InfiniBand Clusters, May 2008
12 K. Vaidyanathan, High Performance and Scalable Soft Shared State for Next-Generation Datacenters, May 2008
13 A. Vishnu, High Performance and Network Fault Tolerant MPI with Multi-Pathing Over InfiniBand, Dec 2007
14 S. Sur, Scalable and High Performance MPI Design for Very Large InfiniBand Clusters, Aug 2007
15 P. Balaji, High Performance Communication Support for Sockets Based Applications over High-Speed Networks, Jun 2006
16 W. Yu, Enhancing MPI with Modern Networking Mechanisms in Cluster Interconnects, Jun 2006
17 J. Wu, Communication and Memory Management in Networked Storage Systems, Sep 2004
18 J. Liu, Designing High Performance and Scalable MPI over InfiniBand, Sep 2004
19 D. Buntinas, Improving Cluster Performance through the Use of Programmable Network Interfaces, Jun 2003
20 M. Banikazemi, Design and Implementation of High Performance Communication Subsystems for Clusters, Dec 2000
21 D. Dai, Designing Efficient Communication Subsystems for Distributed Shared Memory (DSM) Systems, Mar 1999
22 R. Kesavan, Communication Mechanisms and Algorithms for Supporting Scalable Collective Communication on Parallel Systems, Oct 1998
23 R. Sivaram, Architectural Support for Efficient Communication in Scalable Parallel Systems, Aug 1998
24 D. Basak, Designing High Performance Parallel Systems: A Processor-Cluster Based Approach, Jul 1996
25 V. Dixit-Radiya, Mapping on Wormhole-routed Distributed-Memory Systems: A Temporal Communication Graph-based Approach, Mar 1995

M.S. Theses (23)

1 Siddesh Pai Raikar, Network Fault-Resilient MPI for Multi-Rail InfiniBand Clusters, Dec 2011
2 N. Dandapanthula, InfiniBand Network Analysis and Monitoring using OpenSM, Aug 2011
3 V. Meshram, Distributed Metadata Management for Parallel Systems, Aug 2011
4 G. Marsh, Evaluation of High Performance Financial Messaging on Modern Multi-core Systems, Mar 2010
5 K. Gopalakrishnan, Enhancing Fault Tolerance in MPI for Modern InfiniBand Clusters, Aug 2009
6 T. Gangadharappa, Designing Support For MPI-2 Programming Interfaces On Modern Interconnects, Jun 2009
7 J. Sridhar, Scalable Job Startup And Inter-Node Communication In Multi-Core Infiniband Clusters, Jun 2009
8 R. Kumar, Enhancing MPI Point-to-Point and Collectives for Clusters with Onloaded/Offloaded InfiniBand Adapters, Aug 2008
9 S. Bhagvat, Designing and Enhancing the Sockets Direct Protocol (SDP) over iWARP and InfiniBand, Aug 2006
10 S. Krishnamoorthy, Dynamic Re-Configurability Support to Provide Soft QoS Guarantees in Cluster-Based Multi-Tier Data-Centers over InfiniBand, Jun 2004
11 W. Jiang, High Performance MPICH2 One-Sided Communication Implementation over InfiniBand, Jun 2004
12 A. Wagner, Static and Dynamic Processing Offload on Myrinet Clusters with Programmable NIC Support, Jun 2004
13 A. Moody, NIC-based Reduction on Large-Scale Quadrics Clusters, Dec 2003
14 B. Chandrasekharan, Micro-benchmark Level Performance Evaluation and Comparison of High Speed Cluster Interconnects, Sep 2003
15 S. Kini, Efficient Collective Communication using Multicast and RDMA Operations for InfiniBand-based Clusters, Jun 2003
16 S. Senapathi, QoS-Aware Middleware to Support Interactive and Resource Adaptive Applications on Myrinet Clusters, Sep 2002
17 P. Shivam, High Performance User Level Protocol on Gigabit Ethernet, Aug 2002
18 R. Gupta, Efficient Collective Communication using Remote Memory Operations on VIA-Based Clusters, Aug 2002
19 A. Saify, Optimizing Collective Communication Operations in ARMCI, Jul 2002
20 V. Tipparaju, Optimizing ARMCI Get/Put Operations on Myrinet/GM, Sep 2001
21 V. Kota, Designing Efficient Inter-Cluster Communication Layer for Distributed Computing, Jun 2001
22 A. Gulati, A Proportional Bandwidth Allocation Scheme for Myrinet Clusters, Jun 2001
23 S. Kutlug, Performance Evaluation and Analysis of User Level Networking Protocols in Clusters, Jun 2000