

Optimizing Workload in Heterogeneous HPC Workflows with Constraints
Tuesday, June 10, 2025 3:00 PM to Thursday, June 12, 2025 4:00 PM · 2 days 1 hr. (Europe/Berlin)
Foyer D-G - 2nd floor
Research Poster
Industrial Use Cases of HPC, ML and QC
Performance and Resource Modeling
Information
Poster is on display and will be presented at the poster pitch session.
This study addresses workload mapping and scheduling in heterogeneous high-performance computing (HPC) workflows, focusing on multi-objective optimization that accounts for Data Transfer Time (DTT). In diverse, resource-intensive HPC environments, efficient task allocation and scheduling are essential for maximizing system performance and minimizing overall completion time (makespan).
We present HPC system and workload models that integrate DTT as a core constraint in the optimization process. The research aims to improve resource utilization and reduce makespan. A key contribution is a digital twin framework that predicts optimal resource usage, enabling precise planning and execution of HPC workflows. The framework is adaptable: it accommodates evolving cost considerations and is designed to perform robustly across a variety of scenarios.
The poster presents empirical evaluations on task workflows with serial, parallel, and mixed dependencies, as well as on standardized task graph sets, both with and without DTT constraints. Results show that Integer Linear Programming (ILP) methods consistently find the best solutions, respecting all constraints and achieving the minimum makespan when a feasible solution exists. These results are compared against heuristic and meta-heuristic techniques, showing the superior accuracy of ILP approaches in constrained environments.
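To make the comparison between exact and heuristic mapping concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a hypothetical four-task graph on two heterogeneous nodes; the run times, DTT values, and all function names are invented for illustration. Exhaustive search over mappings stands in for an exact method on this tiny instance, and a greedy earliest-finish rule stands in for a heuristic.

```python
from itertools import product

# Hypothetical instance (not from the poster): EXEC[task][node] is the run
# time of a task on each of two heterogeneous nodes.
EXEC = {"A": (2, 3), "B": (3, 1), "C": (2, 2), "D": (2, 3)}
# DTT[(pred, succ)] is a delay paid only when pred and succ run on different nodes.
DTT = {("A", "B"): 2, ("A", "C"): 1, ("B", "D"): 2, ("C", "D"): 1}
ORDER = ["A", "B", "C", "D"]  # a fixed topological order of the task graph
N_NODES = 2


def ready_time(task, node, mapping, finish, use_dtt):
    """Earliest time at which all inputs of `task` are available on `node`."""
    ready = 0
    for (pred, succ), delay in DTT.items():
        if succ == task:
            transfer = delay if use_dtt and mapping[pred] != node else 0
            ready = max(ready, finish[pred] + transfer)
    return ready


def makespan(mapping, use_dtt=True):
    """Completion time of the last task for a given task-to-node mapping."""
    node_free = [0] * N_NODES
    finish = {}
    for task in ORDER:
        node = mapping[task]
        start = max(node_free[node],
                    ready_time(task, node, mapping, finish, use_dtt))
        finish[task] = start + EXEC[task][node]
        node_free[node] = finish[task]
    return max(finish.values())


def greedy_mapping(use_dtt=True):
    """Heuristic: place each task on the node where it would finish earliest."""
    mapping, finish, node_free = {}, {}, [0] * N_NODES
    for task in ORDER:
        def finish_on(node):
            start = max(node_free[node],
                        ready_time(task, node, mapping, finish, use_dtt))
            return start + EXEC[task][node]

        best = min(range(N_NODES), key=finish_on)
        finish[task] = finish_on(best)
        mapping[task] = best
        node_free[best] = finish[task]
    return mapping


def exhaustive_mapping(use_dtt=True):
    """Try every task-to-node assignment and keep the one with least makespan."""
    candidates = (dict(zip(ORDER, nodes))
                  for nodes in product(range(N_NODES), repeat=len(ORDER)))
    return min(candidates, key=lambda m: makespan(m, use_dtt))


for use_dtt in (False, True):
    greedy = makespan(greedy_mapping(use_dtt), use_dtt)
    exact = makespan(exhaustive_mapping(use_dtt), use_dtt)
    print(f"DTT={use_dtt}: greedy makespan={greedy}, exhaustive makespan={exact}")
```

The exhaustive result can never be worse than the greedy one, because the greedy mapping is among the candidates it evaluates. The sketch keeps the task order fixed and enumerates only mappings, so it is feasible only for very small graphs; an ILP formulation would typically encode the mapping, the ordering, and the DTT constraints together and pass them to a solver.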
Furthermore, the study explores practical solutions for real-world applications, advancing the state of the art in workload mapping and scheduling for heterogeneous HPC systems. Future research aims to integrate machine learning techniques for adaptive, online workload management that responds to dynamic system conditions.
This research is funded by the DECICE initiative and the EU Horizon Project (Grant No. 101092582), in collaboration with GWDG, NHR, and the University of Göttingen.
Contributors:
Format
On Demand
On Site


