HPC-Enabled Identification of Host Proteins Associated with Strong SARS-CoV-2 Neutralising Antibody Responses

Wednesday, June 24, 2026 3:45 PM to 5:15 PM · 1 hr. 30 min. (Europe/Berlin)
Foyer D-G - 2nd Floor
Research Poster
Application Workflows for Discovery · Bioinformatics and Life Sciences · High-Performance Data Analytics

Information

Poster is on display and will be presented at the poster pitch session.
High-performance computing (HPC) is becoming an essential resource for health research, enabling the analysis and integration of large-scale, high-dimensional biological data. Such data are important for identifying biomarkers, immune correlates of protection, and the molecular mechanisms underlying disease. In African research settings, which focus on highly diverse cohorts with a substantial burden of endemic infectious diseases, scalable HPC-enabled bioinformatics workflows provide a critical pathway for overcoming analytical constraints and accelerating discovery.

Here, we investigated host proteins associated with SARS-CoV-2 neutralising antibody responses in a South African cohort during first infection. Seventy-one participants were enrolled, including individuals with moderate disease; 17 required supplemental oxygen, and none were critically ill. Participants were stratified into high and low neutralisers based on convalescent plasma neutralisation capacity, with anti-spike antibody levels measured in parallel. Using large-scale SomaScan® plasma proteomics generated early after diagnosis, we applied HPC-enabled bioinformatics workflows for processing, statistical modelling, and feature selection. The analyses identified host proteins associated with neutralisation capacity, spike-binding antibodies, and disease severity. Proteins linked to neutralisation showed strong concordance with spike-binding antibody responses (87%) but only partial overlap with disease severity (36%), highlighting the importance of separating protective immunity from clinical severity signals. Predictive modelling demonstrated that neutralisation status could be inferred from individual protein markers, with HSPA8 emerging as the strongest signal. HSPA8 is a molecular chaperone involved in viral protein interactions and antigen cross-presentation, which points to a biologically plausible pathway shaping humoral immunity. The analysis required HPC resources to support parallelised statistical modelling and validation across thousands of proteomic features, including bootstrapping, cross-validation, permutation testing, and protein-wise analyses across multiple biological time points and strata.
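To illustrate what a protein-wise permutation test of this kind can look like, the following is a minimal sketch and not the authors' pipeline. It assumes a difference in group means as the test statistic and a dictionary of per-participant abundance values; the function names, the data layout, and the protein labels in the usage note are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(high, low, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    Assumption: the statistic is |mean(high) - mean(low)|; the abstract
    does not state which statistic was actually used.
    """
    rng = random.Random(seed)
    observed = abs(mean(high) - mean(low))
    pooled = list(high) + list(low)
    n_high = len(high)
    hits = 0
    for _ in range(n_perm):
        # Shuffle group membership while keeping group sizes fixed.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_high]) - mean(pooled[n_high:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


def protein_wise_test(abundance, labels, n_perm=2000, seed=0):
    """Run the permutation test independently for every protein.

    abundance: {protein_name: [value per participant]} (hypothetical layout)
    labels:    list of "high" / "low", one entry per participant
    """
    results = {}
    for protein, values in abundance.items():
        high = [v for v, g in zip(values, labels) if g == "high"]
        low = [v for v, g in zip(values, labels) if g == "low"]
        results[protein] = permutation_p_value(high, low, n_perm, seed)
    return results
```

Called as `protein_wise_test({"PROT_A": [...], "PROT_B": [...]}, labels)`, this returns one p-value per protein. Because each protein is tested independently, the loop over proteins is the natural unit to distribute across cores or nodes; with thousands of features, the resulting p-values would also need a multiple-testing correction, which this sketch omits.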

HPC infrastructure enabled robust resampling while accommodating clinically heterogeneous subgroups. Workloads were distributed across compute nodes using a batch scheduler for concurrent execution; typical jobs used 4 to 16 CPU cores, with memory requirements ranging from approximately 8 to 32 GB. This study revealed a distinct host proteomic program underlying neutralising antibody capacity that is strongly aligned with spike-binding responses and largely independent of disease severity. It also demonstrated how HPC-driven bioinformatics enables and accelerates the extraction of biologically meaningful immune signatures from high-dimensional omics data. These findings provide new insight into host pathways influencing protective antibody responses and illustrate the importance of scalable computational infrastructure for advancing health research.
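One common way to spread protein-wise work across scheduler jobs is to give each job a contiguous slice of the feature list. The sketch below shows that partitioning step only, under the assumption that each job receives a task index and the total task count from the scheduler; the abstract does not name the scheduler or describe how work was actually divided, and `chunk_for_task` is a hypothetical helper.

```python
def chunk_for_task(items, task_id, n_tasks):
    """Return the contiguous slice of items assigned to one task.

    Slices differ in length by at most one element, and together they
    cover every item exactly once.
    """
    if n_tasks <= 0:
        raise ValueError("n_tasks must be positive")
    if not 0 <= task_id < n_tasks:
        raise ValueError("task_id must be in [0, n_tasks)")
    base, extra = divmod(len(items), n_tasks)
    # The first `extra` tasks each take one additional item.
    start = task_id * base + min(task_id, extra)
    stop = start + base + (1 if task_id < extra else 0)
    return items[start:stop]


# Example: ten placeholder features split across four tasks.
proteins = [f"protein_{i}" for i in range(10)]
for task_id in range(4):
    print(task_id, chunk_for_task(proteins, task_id, 4))
```

Each job would then run its resampling or permutation analysis on its own slice and write results to a separate file, so that jobs never need to communicate with one another.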

Keywords: High-performance computing; Bioinformatics; COVID-19; Plasma proteomics; Neutralising antibodies; Africa; Health data science.
Format
on-demand · on-site