

Energy-Aware Handling of HPC workloads: A Co-Scheduling Approach
Wednesday, June 24, 2026 1:00 PM to 1:20 PM · 20 min. (Europe/Berlin)
Hall Z - 3rd Floor
Invited Talk
Energy Efficiency and Sustainability · Resource Management and Scheduling · System and Performance Monitoring
Information
High-performance computing (HPC) is increasingly adopted across diverse application domains, enabling the efficient execution of large-scale, resource-intensive simulations. As computational density continues to grow, energy efficiency has become a critical concern for modern HPC systems. In addition to approaches such as CPU frequency scaling, scheduler-aware minimum resource allocation, and algorithmic optimizations, co-scheduling has emerged as a suitable strategy for improving energy efficiency in HPC systems.
Co-scheduling refers to the concurrent allocation of multiple tasks on a single node, taking their resource usage patterns and interactions into account rather than treating each job independently. This strategy leverages the complementary resource demands of different workload types, particularly compute-bound and memory-bound applications. Compute-bound workloads primarily stress the CPU cores and spend most of their energy on computation, while memory-bound workloads are constrained by memory bandwidth and spend more energy on data retrieval. Executing such workloads concurrently on the same node utilizes system resources more effectively and reduces idle capacity, thereby conserving energy. However, shared resources such as the L3 cache and memory bandwidth introduce bottlenecks that may degrade individual application performance.
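The pairing described above can be sketched in a few lines. The snippet below is a minimal, illustrative model (not the talk's actual setup): a synthetic compute-bound task and a synthetic memory-bound task run concurrently in separate processes, mimicking two jobs sharing one node instead of each occupying it alone. All function names are hypothetical placeholders for real benchmarks such as LINPACK and STREAM.

```python
# Illustrative co-scheduling sketch; names and workloads are
# placeholders, not the talk's actual benchmarks or cluster setup.
from multiprocessing import Process

def compute_bound(n=200_000):
    # Stresses the ALU/FPU: many arithmetic ops, little memory traffic.
    acc = 0.0
    for i in range(1, n):
        acc += (i * i) % 7
    return acc

def memory_bound(size=1_000_000):
    # Stresses memory bandwidth: streams over a large buffer
    # (loosely STREAM-like).
    data = list(range(size))
    total = 0
    for x in data:
        total += x
    return total

if __name__ == "__main__":
    # Co-schedule: both jobs share the node concurrently, so neither
    # leaves the other's preferred resource (cores vs. bandwidth) idle.
    procs = [Process(target=compute_bound), Process(target=memory_bound)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print("co-scheduled run finished:", all(p.exitcode == 0 for p in procs))
```

On a real cluster the same idea would be realized by the batch scheduler placing two jobs on disjoint core sets of one node (e.g. via CPU affinity) rather than by Python processes.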
This talk presents co-scheduling as a means of reducing the energy consumption of HPC workloads and demonstrates its applicability with experimental results from the HSUper cluster at HSU, Hamburg. The performance and energy consumption of standalone executions of compute-bound and memory-bound workloads are compared with their simultaneous execution on a single node. The study covers standard memory-bound and compute-bound benchmarks such as STREAM and LINPACK, and extends to real-world applications from HSUper users. The findings show that co-scheduling complementary workloads can achieve approximately 20% energy savings alongside a reduction in overall runtime of around 10 minutes.
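The comparison behind such a savings figure is straightforward: total energy of the two standalone runs versus the energy of one co-scheduled run. The sketch below uses hypothetical numbers chosen only to illustrate the arithmetic; they are not the talk's measurements.

```python
# Hypothetical energy figures for illustration only; the actual
# measurements are those reported in the talk.
def energy_savings(e_standalone_a, e_standalone_b, e_coscheduled):
    """Relative energy saved by co-scheduling two jobs on one node,
    versus running them back-to-back with the node to themselves."""
    separate = e_standalone_a + e_standalone_b
    return 1.0 - e_coscheduled / separate

# e.g. 600 kJ + 400 kJ separately vs. 800 kJ co-scheduled:
print(round(energy_savings(600, 400, 800), 2))  # → 0.2, i.e. 20% saved
```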
Building on these findings, ongoing research focuses on improving the L3 cache and memory utilization behaviour of the workloads, to further enhance performance and broaden the applicability of co-scheduling strategies for HPC users.
Format
on-demand · on-site
Intermediate Level
70%
Advanced Level
30%
