In this tutorial, you’ll discover the portable parallelism and concurrency features of the ISO C++26 standard and learn to accelerate HPC applications on modern, heterogeneous GPU-based systems from all three main vendors (AMD, Intel, NVIDIA) without any non-standard extensions. We’ll show you how to parallelize classic HPC patterns such as multi-dimensional loops and reductions, and how to solve common problems such as overlapping MPI communication with GPU computation. The material is supplemented with numerous hands-on exercises and illustrative HPC mini-applications. All exercises run on cloud GPU instances directly in your web browser; no setup is required. The tutorial synthesizes practical techniques from our professional experience to show how the C++26 standard programming model applies to real-world HPC workloads, and explains the reasoning behind the design and implementation of the programming model itself. You’ll also receive links to additional resources and a preview of upcoming C++ features.
Target Audience
This tutorial is aimed at researchers, students, developers, and practitioners who want to develop portable HPC applications in C++ that use all CPUs and GPUs available in heterogeneous systems, without duplicating code or relying on vendor-specific programming models.