

Online Deep Learning Training and Inference in HPC Programs with TorchFort Library
Monday, June 22, 2026 2:00 PM to 6:00 PM · 4 hr. (Europe/Berlin)
Hall X9 - 1st Floor
Tutorial
AI Applications powered by HPC Technologies · Engineering · HPC Simulations enhanced by Machine Learning · ML Systems and Frameworks · Physics
Information
Researchers are using numerical simulation data to train deep learning (DL) models for a wide variety of tasks. These models include surrogate models for efficient parameter space exploration, regression models for approximating numerics, generative models for super-resolution, and reinforcement learning (RL) models for control. However, as researchers run simulations at increasingly high resolutions, the resulting data volumes become difficult to harness for deep learning. For example, a single time snapshot from a high-resolution direct numerical simulation (DNS) in computational fluid dynamics (CFD) can be hundreds of GB. To circumvent this, we can adopt the online training approach, in which the DL training process runs concurrently with the simulation and the training data is read directly from memory without being stored to disk. Online training is also a natural framework for reinforcement learning applications, as they require interaction between the agent and the simulation environment.
Fortran and C/C++ HPC codes underpin the majority of scientific computing applications, whereas deep learning is dominated by Python. In this tutorial we will show how to use the TorchFort library to perform online DL training and inference with Fortran- and C++-based numerical simulation programs. The tutorial is structured as follows. First, we start with a lecture to introduce and motivate the online training approach, including the most commonly used model architectures, applications and training techniques. Second, we run two live demos that detail the TorchFort supervised and reinforcement learning functions for both the Fortran and C++ APIs. Finally, we will help participants implement an online training/inference pipeline in one of the prepared example applications, CaNS (Fortran, structured grid CFD) or PyFR (Python/C, unstructured grid CFD).
Format
on-site
Targeted Audience
Numerical simulation researchers and scientific AI researchers, in particular those who are interested in combining Fortran- and C++-based HPC codes with AI capabilities.
Beginner Level
50%
Intermediate Level
50%
Prerequisites
Participants should bring their own laptop. We will arrange a compute platform for the duration of the tutorial together with a containerised environment, including pre-built TorchFort-enabled applications that participants can modify.