

An Experimental Study on Network Offloading Using DPUs in Virtual Machines
Wednesday, June 24, 2026 3:45 PM to 5:15 PM · 1 hr. 30 min. (Europe/Berlin)
Foyer D-G - 2nd Floor
Research Poster
HPC in the Cloud and HPC Containers · HW and SW Design for Scalable Machine Learning · Networking and Interconnects
Information
Poster is on display and will be presented at the poster pitch session.
In recent HPC and cloud environments, virtualization has been widely adopted to achieve flexibility and scalability of computing resources; however, performance degradation caused by network virtualization remains a significant limitation. In particular, with the increasing demand for large-scale data processing and high-speed communication driven by the growth of HPC and AI workloads, network performance has become a critical factor in determining the overall performance of cloud infrastructures.
Recently, network offloading architectures based on Data Processing Units (DPUs) have gained attention as an effective approach to mitigating network performance bottlenecks. A DPU offloads key networking functions, including packet processing, policy enforcement, and virtual switching, from the host CPU to dedicated ARM cores and hardware acceleration engines, thereby significantly reducing CPU overhead and enabling more stable and predictable network performance.
In this paper, we build an OpenStack environment integrated with the NVIDIA BlueField-3 DPU and evaluate the effectiveness of offloading virtual machine network traffic. Using the NTTTCP benchmark, we compare network throughput and host CPU utilization between DPU mode and the conventional NIC mode, experimentally validating the performance benefits of DPU-based network offloading in HPC-oriented cloud infrastructures.
Contributors:
Format
on-demand · on-site