

Reproducible Benchmarking Techniques for High-Performance Computing
Monday, June 22, 2026 2:00 PM to 6:00 PM · 4 hr. (Europe/Berlin)
Hall X8 - 1st Floor
Tutorial
Community Engagement · Development of HPC Skills · Education and Training · Performance and Resource Modeling · System and Performance Monitoring
Information
Benchmarking is integral to the practice of HPC, AI, and ML. A hierarchy of benchmarks lets machines be compared with one another, vendors respond to new procurements, and compute facilities evaluate offers when selecting new platforms. Within a developer community, benchmarks allow an application's performance to be evaluated and improved over time. Despite the importance of benchmarking, the process of performing a benchmark is challenging and requires many manual steps, such as researching the appropriate workloads, determining how to compile the software, acquiring execution steps, and extracting the correct metric for comparison. Collectively, these steps pose a high barrier for new users to perform a benchmark and an even higher barrier to executing these benchmarks across a wide set of compute facilities.
This tutorial will provide a detailed introduction to two tools for addressing the current challenges in HPC system benchmarking. Ramble [1, 2] is a Python experimentation framework for encoding reproducible run instructions of large sets of experiments. Benchpark [3, 4] is an infrastructure-as-code project combining a variety of open-source tools into a fully specified system for tracking benchmark performance across a variety of systems, across multiple HPC centers, and across arbitrary choices of benchmarks. One of the open-source tools leveraged by Benchpark is Spack, used for defining reproducible build instructions of applications. Ramble builds upon Spack for defining its run instructions. Benchpark and Ramble are gaining popularity in the HPC community for enabling collaborative continuous benchmarking. Attendees of this tutorial will leave with foundational skills in benchmarking systems, using these tools for building and testing.
Format
on-site
Targeted Audience
This tutorial targets a broad audience, including HPC application developers, performance engineers, performance tool developers, system administrators, and anyone interested in learning recent techniques for building and understanding reproducible benchmarking capabilities.
Beginner Level
60%
Intermediate Level
40%
Prerequisites
Attendees will need a laptop with an internet connection. We will provide attendees with an AWS system on which to do the hands-on portion of the tutorial.
Speakers

Gregory Becker
Computer Scientist, Lawrence Livermore National Laboratory
Olga Pearce
Computer Scientist, Lawrence Livermore National Laboratory
Lin Guo
HPC Software Engineer, Google
Jens Domke
Computer Scientist, RIKEN
Stephanie Brink
Computer Scientist, Lawrence Livermore National Laboratory
Robert Bird
SWE, Google