

C++ Standard Parallelism for HPC Performance Portability
Monday, June 22, 2026 2:00 PM to 6:00 PM · 4 hr. (Europe/Berlin)
Hall X10 - 1st Floor
Tutorial
Compiler and Tools for Parallel Programming · Parallel Programming Languages · Parallel Numerical Algorithms
Information
In this tutorial, you’ll discover the portable parallelism and concurrency features of the new C++26 standard and learn to accelerate HPC applications on modern, heterogeneous GPU-based systems from all three main vendors (AMD, Intel, NVIDIA) without any non-standard extensions. We’ll show you how to parallelize classic HPC patterns such as multi-dimensional loops and reductions, and how to solve common problems such as overlapping MPI communication with GPU computation. The material is supplemented with numerous hands-on exercises and illustrative HPC mini-applications. All exercises run on cloud GPU instances directly in your web browser; no setup is required. The tutorial draws on practical techniques from our professional experience to show how the C++26 standard programming model applies to real-world HPC workloads, and explains the reasoning behind the design and implementation of the programming model itself. You'll also receive links to additional resources and a preview of upcoming C++ features.
Format
on-site
Targeted Audience
You should attend if you are a researcher, student, developer, or practitioner interested in developing portable HPC applications in C++ that run on heterogeneous systems.
Beginner Level
40%
Intermediate Level
60%
Prerequisites
The prerequisites for following the tutorial and hands-on exercises are beginner-level MPI and intermediate-level C++ knowledge. Experience with C++11 lambdas and C++98 standard library algorithms is helpful, but we cover both topics as part of the tutorial.
This tutorial is very hands-on, with multiple exercises built around HPC motifs that let attendees transfer the techniques and tools demonstrated to an actual HPC mini-application. Throughout the tutorial, we provide attendees with one-click access to heterogeneous systems via the web browser. Attendees then edit, compile, run, debug, and profile their solutions to the exercises online within their web browsers.
Attendees who prefer to run the notebooks on their own hardware or on any HPC system, from the command line or JupyterHub, can do so by following the steps in the GitHub repository that will contain all tutorial materials.
Attendees need a laptop and a web browser that supports JavaScript and WebSockets (Firefox, Chrome, Safari, or Edge). A stable internet connection is needed, but most portions of the tutorial do not require high bandwidth or low latency, so WiFi should be fine. The exercises that use profiling tools rely on a web-based remote desktop, which needs somewhat higher bandwidth and lower latency for the best experience.



