Upcoming Talks

The International Conference for High Performance Computing, Networking, Storage, and Analysis 2025 - St. Louis, Missouri
(Nov 16 - 21, 2025)

Time Location Event Speaker(s)

Sunday, November 16

8:30AM - 5:00PM B131

Principles and Practice of High-Performance Deep/Machine Learning Training and Inference

[Talk]

DK Panda
H. Subramoni
N. Alnaasan
J. Yao
C. Chen
L. Xu

Monday, November 17

8:30AM - 5:00PM B130

High-Performance and Smart Networking Technologies for HPC and AI

[Talk]

DK Panda
H. Subramoni
B. Michalowicz
T. Tran
S. Xu
7:15PM - 8:00PM Booth #414

MPI-driven Solutions towards High-Performance Deep Learning Training and Inference on Modern Clusters

[Talk]

N. Alnaasan
J. Yao
8:15PM - 9:00PM Booth #414

Intelligent Cyberinfrastructure for Next Generation AI - Activities at the NSF-AI Institute ICICLE

[Talk]

DK Panda

Tuesday, November 18

11:30AM - 12:00PM Booth #414

Designing and Benchmarking MPI Communication over CXL

[Talk]

T. Tran
12:15PM - 1:15PM B276

Unified Communication X (UCX) Community

[BoF]

DK Panda
12:30PM - 1:00PM Booth #414

X-ScaleAI: Scaling Your Challenging AI Applications with Efficiency and Ease

[Talk]

Soham Ghosh, X-ScaleSolutions
12:30PM - 1:00PM Booth #414

Experiences Running AI and HPC Applications using MVAPICH-Plus on the SDSC Cosmos System

[Talk]

Mahidhar Tatineni, SDSC
1:30PM - 2:00PM Booth #414

Using BlueField-3 DPUs to Offload Vector Operations in Krylov Subspace Methods

[Talk]

B. Michalowicz
2:30PM - 3:00PM Booth #414

ParaTools Pro for E4S(TM) - an HPC-AI ecosystem for science

[Tutorial]

Sameer Shende, ParaTools, Inc.
3:30PM - 4:00PM Booth #414

Understanding and Characterizing Communication Characteristics for Distributed Transformer Models

[Talk]

L. Xu
4:30PM - 5:00PM Booth #414

Performance Evaluation and Optimization of MVAPICH-Plus on JCAHPC Miyabi

[Talk]

Toshihiro Hanawa, Univ. of Tokyo, Japan
5:15PM - 7:00PM Second Floor Atrium

Designing GPU-Aware Collective Communication for Heterogeneous Clusters with Diverse GPUs and Interconnects

[Doctoral Showcase]

C. Chen
5:15PM - 6:45PM B263-264

Agriculture Empowered by Supercomputing

[BoF]

DK Panda
5:15PM - 6:45PM B130

OpenSHMEM: Version 1.7, Heterogeneous Devices, and Emerging Frontiers

[BoF]

B. Michalowicz
5:30PM - 6:00PM Booth #414

Training Ultra Long Context LLM with Fully Pipelined Distributed Transformer

[Talk]

J. Yao

Wednesday, November 19

10:30AM - 11:00AM Booth #414

Accelerating the 3D FFT using heFFTe and MVAPICH

[Talk]

Ahmad Abdelfattah, Univ. of Tennessee, Knoxville
11:30AM - 12:00PM Booth #414

High Performance Communication Frameworks for FPGAs

[Talk]

N. Contini
12:30PM - 1:00PM Booth #414

Thor Ultra 800G AI Ethernet NIC

[Talk]

Hemal Shah, Broadcom
1:15PM - 1:45PM Booth #414

Performance and Scalability of MVAPICH-Plus on Large-Scale Systems

[Talk]

B. Michalowicz
2:30PM - 3:00PM Booth #414

Accelerating AWP-ODC Seismic Simulations through MPI Optimizations

[Talk]

Scott Callaghan, USC
3:30PM - 4:00PM Booth #414

Design and Optimization of GPU-Aware MPI Allreduce Using Direct Sendrecv Communication

[Talk]

C. Chen
4:00PM - 4:30PM Booth #935

High-Performance and Scalable Middleware for HPC and AI

[Talk]

DK Panda
4:30PM - 5:00PM Booth #414

Accelerating HPC and AI Applications using Novel Products from X-ScaleSolutions

[Talk]

Soham Ghosh, X-ScaleSolutions
5:15PM - 6:45PM B125

MPICH: A High Performance Open-Source MPI Implementation

[BoF]

N. Shineman
5:30PM - 6:00PM Booth #414

MPI Communication Performance on AMD MI300A: Microbenchmarks and Applications

[Talk]

G. Kuncham

Thursday, November 20

10:30AM - 11:00AM Booth #414

Powering Performance: Inside TACC Vista and Its Communication Libraries

[Talk]

John Cazes and Amit Ruhela, TACC
11:30AM - 12:00PM Booth #414

HyperSack: Distributed Hyperparameter Optimization for Deep Learning using Resource-Aware Scheduling

[Talk]

N. Alnaasan
12:30PM - 1:00PM Booth #414

HPC and AI Workloads on C-DAC’s Trinetra Network Using MVAPICH4

[Talk]

Parikshit Godbole, C-DAC, India
1:30PM - 2:00PM Booth #414

Enhanced MPI Intra-node Communication Framework with Cooperative DMA-based Data Transfer

[Talk]

L. Xu
2:15PM - 2:45PM B275

A Streaming Collectives Interface Targeting Dataflow Acceleration and HPC Workloads

[Talk]

N. Contini
J. Queiser
B. Ramesh
H. Subramoni
DK Panda

Friday, November 21

10:30AM - 11:00AM 261-262-265-266

MPI Communication Performance on AMD MI300A: Microbenchmarks and Applications, IPDRM workshop

[Workshop]

G. Kuncham
S. Zhang
S. Mohammad
C. Chen
DK Panda

Past Talks

54th International Conference on Parallel Processing - San Diego, California
(Sep 08 - 11, 2025)

Time Location Event Speaker(s)

Wednesday, September 10

11:00AM - 11:30AM Macaw

Design and Optimization of GPU-Aware MPI Allreduce Using Direct Sendrecv Communication

[Talk]

C. Chen
J. Yao
H. Subramoni
DK Panda

IEEE International Conference on Cluster Computing 2025 - Edinburgh, Scotland
(Sep 02 - 05, 2025)

Time Location Event Speaker(s)

Tuesday, September 02

9:30AM - 5:30PM Pentland East

High-Performance and Smart Networking Technologies for HPC and AI

[Tutorial]

DK Panda
H. Subramoni
B. Michalowicz
12:00PM - 1:00PM LLMxHPC workshop

Distributed LLM Training and Inference on Modern HPC Clusters with Performance and Scalability

[Talk]

DK Panda

IEEE Hot Interconnects Symposium 2025 - Online
(Aug 20 - 22, 2025)

Time Location Event Speaker(s)

Thursday, August 21

11:00AM - 11:30AM Virtual Over Zoom

Characterizing Communication Patterns in Distributed Large Language Model Inference

[Technical Paper]

L. Xu

Friday, August 22

8:30AM - 12:00PM Virtual Over Zoom

Principles and Practice of Scalable and Distributed Deep Neural Networks Training and Inference

[Tutorial]

DK Panda
N. Alnaasan
1:00PM - 4:30PM Virtual Over Zoom

High-Performance and Smart Networking Technologies for HPC and AI

[Tutorial]

DK Panda
B. Michalowicz

ISC High Performance 2025 - Hamburg, Germany
(Jun 10 - 13, 2025)

Time Location Event Speaker(s)

Tuesday, June 10

All times are in CEST
3:00PM - 4:00PM Foyer D-G - 2nd floor

Design and Implementation of a GPU-Aware MPI Collective Library for Intel GPUs

[Poster Presentation]

C. Chen
G. Kuncham
N. Alnaasan
H. Subramoni
DK Panda
3:00PM - 4:00PM Foyer D-G - 2nd floor

Design and Implementation of MPI Collective Operations for Large Message Communication on AMD GPUs

[Poster Presentation]

C. Chen
L. Xu
O. Pearce
D. Boehme
H. Subramoni
DK Panda
3:00PM - 4:00PM Foyer D-G - 2nd floor

Use of BlueField-SmartNICs in Offloading One-Sided Communication Primitives

[Poster Presentation]

B. Michalowicz
K. Suresh
H. Subramoni
S. Poole
DK Panda

Wednesday, June 11

All times are in CEST
9:00AM - 10:00AM Hall Z - 3rd floor

Commodity Interconnect in Next Generation: Perspective on Hardware and Software

[Panel]

DK Panda
11:30AM - 12:30PM Hall E - 2nd floor

Agriculture Empowered by Supercomputing

[BoF]

DK Panda
1:00PM - 2:00PM Hall Z - 3rd floor

Enabling AI-Driven Digital Agriculture: Solutions from the NSF-AI ICICLE Institute

[Talk]

DK Panda

Friday, June 13

All times are in CEST
9:00AM - 1:00PM Hall Y4 - 2nd floor

Principles and Practice of Scalable and Distributed Deep Neural Networks Training and Inference

[Tutorial]

DK Panda
H. Subramoni
N. Alnaasan