Identifying Long COVID Symptoms from Medical Notes Using Combined Language Processing Methods
Abstract
Essence
A hybrid pipeline showed moderate-to-strong performance for extracting symptoms and assertion status from clinical notes.
Evidence
This multi-site model development and validation study used 160 intake progress notes from 11 RECOVER health systems for development and evaluation, plus 47,654 progress notes for a prevalence study. For assertion detection, it achieved F1 scores of 0.82 in internal validation and 0.76 in external validation.
Caveat
The abstract reports note-level NLP validation and prevalence processing, not direct evidence that the pipeline improves PASC diagnosis or patient outcomes.
Key numbers
0.82
Average F1 Score (Internal Validation)
Measured across all symptoms in the internal validation dataset.
2.448 ± 0.812 seconds
Average Processing Time per Note
Calculated across 11 health systems in the RECOVER initiative.
ρ > 0.83
Spearman Correlation for Positive Mentions
Based on symptom-mentioning patterns in the population-level prevalence study.
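How the key numbers are computed
For readers unfamiliar with the two summary statistics above, here is a minimal sketch of how an average F1 across symptoms and a Spearman rank correlation could be computed. The summary does not say whether the F1 average is macro or micro, or which quantities were rank-correlated. The unweighted (macro) average, the function names, and all counts below are assumptions for illustration, not the study's method or data.

```python
def f1_score(tp, fp, fn):
    """F1 from true-positive, false-positive, and false-negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(counts):
    """Unweighted mean of per-symptom F1 (assumed averaging scheme)."""
    scores = [f1_score(tp, fp, fn) for tp, fp, fn in counts.values()]
    return sum(scores) / len(scores)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical (tp, fp, fn) counts per symptom; not taken from the study.
symptom_counts = {"fatigue": (8, 2, 2), "cough": (6, 2, 2)}
avg_f1 = macro_f1(symptom_counts)

# Hypothetical positive-mention counts for five symptoms in two note sets.
mentions_a = [120, 80, 45, 30, 10]
mentions_b = [200, 150, 60, 70, 20]
rho = spearman(mentions_a, mentions_b)

print(f"average F1 = {avg_f1:.3f}, Spearman rho = {rho:.2f}")
```

A macro average weights every symptom equally, so rare symptoms influence the result as much as common ones. A micro average would instead pool the counts before computing F1.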