High Performance Data Center Digital Twins: ExaDigiT Community BoF

Tuesday, June 10, 2025 1:00 PM to 2:00 PM · 1 hr. (Europe/Berlin)
Hall E - 2nd floor
Birds of a Feather
Community Engagement · Data Center Infrastructure and Cooling · Digital Twins and ML · Visualization and Virtual Reality

Information

Digital twins are revolutionizing the supercomputing industry. We have spent the past couple of years developing a digital twin framework that can readily be used to create a digital representation of any liquid-cooled supercomputer. The framework models the power and cooling of the entire system, replays or reschedules workloads previously run on it, and includes an augmented and virtual reality component for interacting with the virtual supercomputer as well as a web-based dashboard for launching “what-if” experiments. ExaDigiT was presented in the main track at SC24 (doi:10.1109/SC41406.2024.00029) and currently has an open-source community of about 120 researchers from top supercomputing centers and HPC industry partners around the world. The project's main landing page, with links to the source code, a 2-minute video demo, and detailed papers on the various components, is available at https://exadigit.github.io.
In this BoF, we will first give an overview of the digital twin framework. We will then hear from several leading experts who are using digital twins to study HPC system optimization, energy efficiency, failure prediction, and related topics. The goal of the BoF is to introduce ExaDigiT to a broad community of attendees who are interested in developing HPC digital twins of their own systems. We aim to bring interested parties together to start a community discussion on the utility of ExaDigiT-like twins for HPC systems, gather novel use cases, share experiences, and network in an informal environment.
Organizers:
Format
On Site
Targeted Audience
Researchers and practitioners from HPC centers interested in system operations, telemetry, optimization, and digital twins for system modeling, energy efficiency, and workload optimization. The session is also relevant for those involved in planning, procuring, or designing future HPC systems.