Oxford Open Immunology

How different analysis methods affect patient groups in Long COVID based on 6,031 patients’ reports of 162 symptoms

Abstract

Essence

Long COVID symptom phenotypes depended strongly on the clustering algorithm, suggesting a continuous symptom landscape rather than stable subgroups.

Evidence

This patient-led international survey analysis of 6,031 adults with at least 90 days of illness applied three unsupervised machine-learning methods to 162 self-reported symptoms and tested concordance between methods, robustness to subsampling, and links to symptom burden, post-exertional malaise (PEM), age, and gender.

Caveat

The data were self-reported, and clustering a largely continuous symptom space may impose artificial boundaries rather than validated biological subtypes.

Key numbers

  • Cohort size: 6,031 participants in total.
  • Proportion of women: 78% of participants identified as women.
  • Median illness duration: 198 days of symptoms reported by participants.

Full Text

What this is

  • Long COVID affects millions of people, with diverse symptoms persisting after COVID-19 infection.
  • This research analyzes data from 6,031 patients to explore symptom clustering.
  • Three unsupervised machine learning methods were applied to assess symptom phenotypes.
  • Findings reveal low concordance between methods, challenging the stability of clusters.

Essence

  • Patient clustering in Long COVID is highly dependent on the algorithm used, leading to low reproducibility across different methods. Each method identified symptom clusters, but the overall symptom landscape appeared continuous rather than distinctly separable.

Key takeaways

  • Each clustering method produced clinically plausible symptom clusters, but cross-method agreement was low. For example, a high-symptom-burden group was identified by every method, yet the specific patients assigned to it varied substantially from one method to the next.
  • The analysis showed that higher symptom burden was associated with more severe post-exertional malaise (PEM) and a greater proportion of women. This aligns with existing literature suggesting women experience higher symptom frequencies.
  • The continuous nature of the symptom space indicates that clustering methods may impose artificial boundaries, complicating clinical interpretations and highlighting the need for robust phenotyping that considers the full range of symptoms.
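
To make "low cross-method agreement" concrete, here is a minimal sketch of how concordance between two clusterings of the same patients could be quantified. This summary does not state which concordance metric the study used; the adjusted Rand index and the Jaccard overlap below are assumptions chosen for illustration, and the function names and toy labels are hypothetical, not taken from the study.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two cluster labelings of the same patients.

    Values near 1 mean the two methods group patients the same way;
    values near 0 mean agreement is no better than chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same patients")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two patients: nothing to disagree about

    # Count patient pairs placed together by both methods, by A only, by B only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs  # agreement expected by chance
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (together_both - expected) / (maximum - expected)


def jaccard(group_a, group_b):
    """Overlap between two sets of patient IDs (e.g. each method's high-burden group)."""
    a, b = set(group_a), set(group_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Toy labels for eight hypothetical patients (not study data).
method_a = [0, 0, 0, 1, 1, 1, 2, 2]
method_b = [1, 1, 1, 0, 0, 0, 2, 2]  # same grouping as A, clusters merely renamed
method_c = [0, 1, 2, 0, 1, 2, 0, 1]  # a grouping that cuts across A

print(adjusted_rand_index(method_a, method_b))  # identical partitions score 1.0
print(adjusted_rand_index(method_a, method_c))  # unrelated partitions score near or below 0
print(jaccard({1, 2, 3, 4}, {3, 4, 5, 6}))      # two shared IDs out of six in the union
```

Both measures ignore how clusters are named, which matters here: two methods can each report a "high-symptom-burden" cluster and still place largely different patients in it.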

Caveats

  • The study's cohort is self-selected and may not represent the broader population, potentially biasing symptom prevalence and severity.
  • Limitations include the lack of systematic assessment of symptom severity and duration, which may obscure important distinctions in patient experiences.
  • The findings emphasize the need for caution in interpreting symptom clusters as distinct diagnostic entities due to their sensitivity to algorithm choice.

Definitions

  • Long COVID: A condition characterized by persistent symptoms following acute COVID-19 infection, affecting multiple organ systems.
  • Post-exertional malaise (PEM): A worsening of symptoms following physical or cognitive exertion, commonly reported in Long COVID and related conditions.
