What this is
- This research investigates the shared molecular biomarkers and mechanisms between gastroesophageal reflux disease (GERD) and ischemic stroke (IS).
- Using integrated machine learning and bioinformatics, the study identifies common differentially expressed genes (DEGs) and enriched pathways.
- Key findings include the identification of 9 hub genes that may serve as diagnostic biomarkers and therapeutic targets.
Essence
- Nine hub genes were identified as common biomarkers linking GERD and ischemic stroke, showing high diagnostic accuracy. These genes are involved in inflammatory and vascular pathways, suggesting potential therapeutic targets.
Key takeaways
- Fifty-two upregulated and 57 downregulated differentially expressed genes (DEGs) were identified as common between GERD and IS. These genes are implicated in pathways such as IL-17 signaling and PI3K-Akt signaling.
- The study identified 9 hub genes (FAM46C, FUT4, ODC1, UQCRB, ID2, TSC22D1, IL17RB, AHR, and MGAT4B) with consistent dysregulation across both diseases, indicating shared pathogenic mechanisms.
- The combined area under the curve (AUC) values for the hub genes ranged from 0.9 to 1.0, demonstrating their potential as reliable diagnostic biomarkers for both GERD and IS.
Caveats
- The study relies on publicly available datasets, which may introduce sample heterogeneity and limit the generalizability of findings across different populations.
- The cross-sectional design does not allow for causal inferences between hub gene dysregulation and disease progression.
- High AUC values may indicate model overfitting, particularly due to small sample sizes and stringent feature selection processes.
Definitions
- differentially expressed genes (DEGs): Genes that show statistically significant differences in expression levels between different conditions or groups.
- area under the curve (AUC): A measure of the diagnostic performance of a test, with higher values indicating better discrimination between conditions.
1. Introduction
Gastroesophageal reflux disease (GERD) and stroke represent 2 prevalent yet pathophysiologically distinct conditions, with emerging evidence suggesting potential shared biological pathways. Observational studies have indicated that GERD may increase stroke risk (odds ratio: 1.22 for all-stroke, 1.19 for ischemic stroke),[1,2] while bidirectional Mendelian randomization analyses reveal a causal interplay, with stroke subtypes (e.g., large-artery stroke, odds ratio: 1.49) also exacerbating GERD susceptibility.[1,2] Despite these associations, the specific biomarkers and mechanistic links underlying this relationship remain poorly understood.
The pathogenesis of GERD involves multifactorial processes, including lower esophageal sphincter dysfunction, prolonged acid exposure, and inflammatory responses mediated by granulocytes or T-lymphocytes.[3–5] Notably, these mechanisms may intersect with stroke pathways through shared risk factors such as hypertension (mediated effect reported in GERD-stroke association),[1] obesity,[6] and systemic inflammation.[7] For instance, GERD-related esophageal mucosal damage triggers endoplasmic reticulum (ER) stress, which is implicated in both local inflammation and systemic vascular endothelial dysfunction.[7] Additionally, humoral markers of asymptomatic lung injury in GERD patients[8] suggest potential circulating biomarkers that could also reflect cerebrovascular injury.
Current research gaps include: limited identification of molecular biomarkers (e.g., calcitonin gene-related peptide upregulation in reflux hypersensitivity[9]) that may concurrently predict stroke risk; unclear mediation effects of cardiovascular risk factors (e.g., major depressive disorder shown to mediate GERD-stroke links[10]); and heterogeneity in GERD phenotypes (erosive vs nonerosive) and their differential associations with stroke subtypes.[1,11] Machine learning approaches offer a promising solution to integrate multi-omics data (genetic, proteomic, and clinical) from existing genome-wide association studies[2,12] and mechanistic studies,[5,7] enabling the identification of high-dimensional patterns that traditional statistical methods may overlook.
This study systematically integrated gene expression datasets from the Gene Expression Omnibus (GEO) database for GERD and ischemic stroke. Common differentially expressed genes (DEGs) were identified through comprehensive differential expression analysis. To prioritize clinically relevant biomarkers, we employed least absolute shrinkage and selection operator (LASSO) regression, support vector machine-recursive feature elimination, and random forest algorithms in a complementary analytical framework. This multi-method approach not only circumvents the limitations inherent to single-algorithm strategies but also enhances result reliability through rigorous cross-validation procedures. Figure 1 illustrates a schematic representation of the integrated workflow.

Flowchart of shared gene and pathway identification between ischemic stroke (IS) and gastroesophageal reflux disease (GERD).
2. Materials and methods
2.1. Data download and preprocessing
Transcriptome profiling datasets were retrieved from the GEO database (https://www.ncbi.nlm.nih.gov/geo/) using the R package GEOquery (v2.76.0; Bioconductor Core Team, Seattle). The study incorporated 4 independent cohorts: GSE58294 and GSE22255 for ischemic stroke (IS) analysis, and GSE26886 and GSE39491 for GERD investigation. Detailed clinical characteristics and sample information of the datasets are summarized in Table 1. For each dataset, the corresponding platform files were also retrieved to obtain the probe-to-gene mapping information. The raw CEL files were extracted, and background correction, normalization, and summarization were performed using the rma function from the affy package (Bioconductor Core Team, Seattle). All datasets were derived from microarray platforms and were uniformly processed using the same normalization pipeline to ensure comparability across datasets. Probe intensities were summarized to gene level by taking the median value of multiple probes corresponding to the same gene. Probes mapping to multiple genes were excluded from the analysis. After data cleaning, low-abundance genes were filtered out based on expression levels across samples. The final expression matrices were log2-transformed and normalized between arrays using the normalizeBetweenArrays function from the limma package (Walter and Eliza Hall Institute of Medical Research [WEHI], Melbourne, Victoria, Australia). To address batch effects between datasets, the ComBat function from the sva package was applied. For the IS datasets (GSE58294 and GSE22255) and the GERD datasets (GSE26886 and GSE39491), the expression matrices were merged, and batch correction was performed based on the dataset origin. Principal component analysis was conducted before and after batch correction to assess the effectiveness of the ComBat method.
| GEO accession | Platforms | Samples | Tissue |
|---|---|---|---|
| GSE58294 | GPL570 | 69 cardioembolic stroke samples and 23 controls | Blood |
| GSE22255 | GPL570 | 20 IS patients and 20 controls | Blood |
| GSE26886 | GPL570 | 20 GERD and 19 controls | Barrett esophagus/esophageal squamous epithelium |
| GSE39491 | GPL571 | 40 GERD and 40 controls | Barrett esophagus/normal mucosa from squamous esophagus |
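The probe-to-gene summarization rule described in section 2.1 (median across probes of the same gene, with multi-gene probes excluded) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the R pipeline used in the study; the function name `collapse_probes_to_genes`, the probe identifiers, and the expression values are hypothetical.

```python
from statistics import median

def collapse_probes_to_genes(probe_expr, probe_to_genes):
    """Summarize probe-level values to gene level by per-sample median.

    probe_expr: dict mapping probe ID -> list of values (one per sample)
    probe_to_genes: dict mapping probe ID -> list of gene symbols
    """
    by_gene = {}
    for probe, values in probe_expr.items():
        genes = probe_to_genes.get(probe, [])
        # Skip unannotated probes and probes that map to several genes
        if len(genes) != 1:
            continue
        by_gene.setdefault(genes[0], []).append(values)
    # For each gene, take the median across its probes, sample by sample
    return {
        gene: [median(sample_vals) for sample_vals in zip(*rows)]
        for gene, rows in by_gene.items()
    }

# Toy values for illustration only (not taken from the study)
probe_expr = {
    "p1": [5.0, 6.0],
    "p2": [7.0, 8.0],
    "p3": [9.0, 10.0],
    "p4": [1.0, 2.0],
}
probe_to_genes = {
    "p1": ["AHR"],
    "p2": ["AHR"],
    "p3": ["ID2"],
    "p4": ["AHR", "ID2"],  # ambiguous probe, excluded
}
gene_expr = collapse_probes_to_genes(probe_expr, probe_to_genes)
print(gene_expr)
```

Taking the median rather than the mean keeps a single aberrant probe from dominating the gene-level value, which is one plausible reason for the choice described above.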
2.2. Differential expression analysis
Differential expression analysis was performed using the limma package. For each dataset, a linear model was fitted to the normalized expression data, and empirical Bayes moderation of the standard errors was applied. Contrasts were defined to compare case and control groups. The topTable function was used to extract DEGs with a P-value < .05 and |log2(fold change)| > 0.2. Volcano plots were generated to visualize significant DEGs, and heatmaps were created for the top DEGs using the pheatmap package.
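The DEG thresholds stated above (P-value < .05 and |log2(fold change)| > 0.2) amount to a simple filter over a results table. The sketch below assumes the per-gene statistics have already been computed elsewhere (in the study, by limma); the function name `select_degs`, the record keys, and the gene names are hypothetical, and the text does not say whether the P-value is raw or adjusted, so the code simply uses whichever value is supplied.

```python
def select_degs(results, p_cutoff=0.05, lfc_cutoff=0.2):
    """Split genes into up- and downregulated DEGs.

    results: list of dicts with keys "gene", "log2fc", "p_value"
    Returns (upregulated, downregulated) lists of gene names.
    """
    up, down = [], []
    for row in results:
        significant = row["p_value"] < p_cutoff
        large_effect = abs(row["log2fc"]) > lfc_cutoff
        if significant and large_effect:
            (up if row["log2fc"] > 0 else down).append(row["gene"])
    return up, down

# Toy statistics for illustration only (not results from the study)
results = [
    {"gene": "GENE_A", "log2fc": 0.8, "p_value": 0.001},   # up
    {"gene": "GENE_B", "log2fc": -0.5, "p_value": 0.02},   # down
    {"gene": "GENE_C", "log2fc": 0.1, "p_value": 0.001},   # effect too small
    {"gene": "GENE_D", "log2fc": 1.2, "p_value": 0.30},    # not significant
]
up, down = select_degs(results)
print(up, down)
```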
2.3. Enrichment analysis
Gene ontology and Kyoto encyclopedia of genes and genomes (KEGG) pathway enrichment analyses were performed using the clusterProfiler package (Southern Medical University, Guangzhou, Guangdong, China). DEGs common to both IS and GERD were converted from gene symbols to Entrez IDs using the bitr function. For gene ontology enrichment, the entire human genome was used as the background. For KEGG enrichment, pathways with adjusted P-values < .05 were considered significant. The results were visualized using dot plots and bubble plots to highlight the most significantly enriched terms and pathways.
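Over-representation analyses of this kind are typically based on a hypergeometric test: given a background of N genes, of which K belong to a pathway, how likely is it that a list of n genes contains at least k pathway members by chance? The sketch below shows that calculation only. It is an assumption that this matches the exact test configuration used in the study, the function name `enrichment_p_value` is hypothetical, and multiple-testing adjustment is omitted.

```python
from math import comb

def enrichment_p_value(k, n, K, N):
    """Upper-tail hypergeometric probability P(X >= k).

    k: genes in the list that belong to the pathway
    n: size of the gene list
    K: pathway genes in the background
    N: size of the background
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probabilities of observing k, k+1, ..., upper pathway genes
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Toy numbers for illustration only (not from the study)
p = enrichment_p_value(k=2, n=4, K=3, N=10)
print(round(p, 4))
```

In practice the resulting P-values would be adjusted across all tested pathways before applying the < .05 threshold mentioned above.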
2.4. Machine learning for hub gene identification
Machine learning approaches were employed to identify hub genes from the DEGs. LASSO regression was implemented using the glmnet package with 10-fold cross-validation to determine the optimal regularization parameter (lambda.min), selecting features with nonzero coefficients at this threshold. Support vector machine with recursive feature elimination was implemented using the mRFE package with the e1071 backend, utilizing 10-fold cross-validation to identify the feature subset corresponding to minimum classification error. Random forest analysis was conducted using the randomForest package (Fortran Original by Leo Breiman & Adele Cutler; R port by Andy Liaw & Matthew Wiener) with 10 repetitions of 10-fold cross-validation to determine the optimal feature number based on minimal cross-validation error, assessing variable importance through mean decrease in accuracy. The final hub genes were defined as the intersection of genes selected by all 3 algorithms, ensuring robust and consensus feature selection across complementary machine learning paradigms.
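The consensus rule described in section 2.4 (keeping only genes selected by all 3 algorithms) is a set intersection. A minimal sketch, assuming each algorithm has already produced its list of selected genes; the function name `consensus_features` and the gene labels are hypothetical.

```python
def consensus_features(*selections):
    """Return the features present in every selection."""
    if not selections:
        return set()
    return set.intersection(*(set(s) for s in selections))

# Toy selections for illustration only (not results from the study)
lasso_genes = ["G1", "G2", "G3"]
svm_rfe_genes = ["G2", "G3", "G4"]
rf_genes = ["G2", "G3", "G5"]

hub_genes = consensus_features(lasso_genes, svm_rfe_genes, rf_genes)
print(sorted(hub_genes))
```

Requiring agreement across methods trades sensitivity for robustness: a gene picked by only one algorithm is dropped even if it is informative.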
2.5. Validation and receiver operating characteristic (ROC) analysis
The expression patterns of the identified hub genes were validated across all datasets. Boxplots were generated to compare gene expression levels between case and control groups, with statistical significance assessed using Wilcoxon rank-sum tests. The Wilcoxon rank-sum test was employed as the primary method for hypothesis testing because the normality assumption for parametric tests could not be guaranteed for all gene expression distributions across datasets. Subsequently, we evaluated the diagnostic performance of each individual hub gene and their combined predictive power using ROC curve analysis. For individual genes, ROC curves were constructed and the area under the curve (AUC) was calculated to quantify their discriminatory capacity. To harness the collective predictive power of all 9 hub genes, we developed a combined diagnostic model using multivariate logistic regression. The probability scores derived from this model were used to generate a composite ROC curve. The optimal cutoff threshold for the combined model was determined by maximizing the Youden index, balancing sensitivity and specificity.
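The AUC and Youden index calculations described in section 2.5 can be sketched in a few lines. This illustration assumes scores are already available (for example, a single gene's expression values or the probabilities from a fitted logistic model) and that higher scores indicate disease; the function names and the toy data are hypothetical and do not reproduce the study's models.

```python
def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    Ties count as half a win (equivalent to the Mann-Whitney U statistic).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def youden_cutoff(scores, labels):
    """Return (threshold, sensitivity, specificity) maximizing Youden's J.

    A sample is called positive when its score is >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1  # Youden index
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best[1], best[2], best[3]

# Toy scores for illustration only (1 = case, 0 = control)
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]
labels = [0, 0, 1, 1, 0, 1]

model_auc = auc(scores, labels)
threshold, sensitivity, specificity = youden_cutoff(scores, labels)
print(model_auc, threshold, sensitivity, specificity)
```

Note that an AUC computed on the same samples used to fit a model tends to be optimistic, which is relevant to the overfitting caveat listed in the summary.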
3. Results
3.1. Identification of IS-associated genes
Batch effects between the 2 IS cohorts (GSE22255 and GSE58294) were systematically corrected using ComBat (Fig. 2A and B). To identify IS-specific molecular signatures, differential expression analysis was performed on the GSE22255 dataset, which included transcriptomic profiles of peripheral blood mononuclear cells from 20 IS patients and 20 age- and sex-matched healthy controls. Limma R package-based analysis revealed 343 significantly upregulated genes and 335 downregulated genes in IS patients (Table S1, Supplemental Digital Content, https://links.lww.com/MD/Q706), with pronounced dysregulation observed for key genes including JUN, TNF, COX2, and RPL7 (Fig. 2C and D).

Identification of differentially expressed genes in IS. PCA results of the GSE22255 and GSE58294 cohorts before (A) and after (B) batch effect correction. (C) Volcano plot of differential expression analysis between IS and control groups in the GSE22255 cohort. (D) Top 20 upregulated and downregulated genes with the largest expression fold-changes in IS. IS = ischemic stroke.
3.2. Identification of DEGs in GERD
To further identify genes associated with GERD, we first performed batch effect correction on 2 GERD cohorts (GSE26886 and GSE39491) (Fig. 3A and B). Subsequently, differential expression analysis was conducted on the GSE26886 cohort, which included 20 GERD samples and 19 control samples. The results revealed that 2537 genes were significantly upregulated and 2796 genes were significantly downregulated in GERD patients compared with the control group (Fig. 3C, Table S2, Supplemental Digital Content, https://links.lww.com/MD/Q706). Among these, genes such as GOLM1, SULT1C2, and TOX2 exhibited the most pronounced upregulation, whereas SERPINB3, CRCT1, and CLCA4 displayed the most significant downregulation (Fig. 3D).

Identification of GERD-associated genes. PCA results of the GSE26886 and GSE39491 cohorts before (A) and after (B) batch effect correction. (C) Volcano plot of differential expression analysis between GERD and control groups in the GSE26886 cohort. (D) Top 20 upregulated and downregulated genes with the largest expression fold-changes in GERD. GERD = gastroesophageal reflux disease.
3.3. Shared DEGs and pathways between GERD and IS
The Venn diagram revealed 52 upregulated genes shared between IS and GERD (Fig. 4A). KEGG pathway enrichment analysis demonstrated that these overlapping genes were predominantly enriched in pathways such as the IL-17 signaling pathway, cancer-associated pathways, viral and bacterial infection-related pathways, and parathyroid hormone synthesis, secretion, and action (Fig. 4B). Additionally, 57 downregulated genes co-associated with both conditions were identified (Fig. 4C), which were significantly enriched in pathways including glycosphingolipid biosynthesis, steroid biosynthesis, ribosomes, glycosylphosphatidylinositol-anchor biosynthesis, homologous recombination, apelin signaling pathway, neutrophil extracellular trap (NET) formation, and PI3K-Akt signaling pathway (Fig. 4D).

Shared upregulated/downregulated genes and pathways between IS and GERD. (A) Venn diagram showing 52 overlapping upregulated genes and (B) KEGG pathway enrichment results. (C) Venn diagram showing 57 overlapping downregulated genes and (D) KEGG pathway enrichment results. GERD = gastroesophageal reflux disease. IS = ischemic stroke, KEGG = Kyoto encyclopedia of genes and genomes.
3.4. Identification of 9 hub genes via machine learning
To further identify hub genes from these shared DEGs, we applied 3 machine learning algorithms. For IS, the LASSO, support vector machine, and random forest algorithms yielded 15, 13, and 52 hub genes, respectively (Fig. 5A–C). In GERD, the same algorithms identified 2, 4, and 7 hub genes, respectively (Fig. 5D–F). Taking the union of hub genes across algorithms, we obtained 52 hub genes for IS (Fig. 5G) and 9 hub genes for GERD (Fig. 5H). Finally, the intersection of these gene sets across both diseases revealed 9 shared hub genes: FAM46C, FUT4, ODC1, UQCRB, ID2, TSC22D1, IL17RB, AHR, and MGAT4B (Fig. 5I).

Identification of hub genes in ischemic stroke (IS) and gastroesophageal reflux disease (GERD) using multiple machine learning algorithms. (A–C) Feature selection and model optimization for identifying hub genes in IS using 3 machine learning approaches: (A) LASSO regression with cross-validation; the optimal λ value was selected based on minimum misclassification error. (B) Support Vector Machine with Recursive Feature Elimination (SVM-RFE); the number of features corresponding to the lowest 10 × cross-validation error (13 features, error = 0.18) was selected. (C) Random forest algorithm; the cross-validation error stabilizes after selecting approximately 20 genes, indicating convergence. (D–F) Feature selection and model optimization for identifying hub genes in GERD: (D) LASSO regression; the optimal λ value was determined by minimal misclassification error. (E) SVM-RFE; the optimal feature set was identified at 4 features (cross-validation error = 0.0242). (F) Random forest; cross-validation error plateaus after ~10 genes, suggesting sufficient feature selection. (G) Venn diagram of hub genes across IS algorithms. (H) Venn diagram of hub genes across GERD algorithms. (I) Venn diagram of 9 shared hub genes between IS and GERD. LASSO = least absolute shrinkage and selection operator, SVM-RFE = support vector machine with recursive feature elimination.
3.5. Validation of expression levels of the 9 hub genes
To further validate the expression patterns of the 9 shared hub genes in IS and GERD, we compared their differential expression between disease groups and control groups across all cohorts. The results revealed distinct expression profiles across different datasets: In the GSE22255 cohort, the expression levels of UQCRB, TSC22D1, FUT4, IL17RB, and MGAT4B were significantly upregulated in IS patients compared with controls (Fig. 6A). In the GSE58294 cohort, AHR, FUT4, and MGAT4B showed significant upregulation in IS, whereas ID2 and TSC22D1 were significantly downregulated relative to controls (Fig. 6B). In the GSE26886 cohort, all 9 hub genes exhibited significantly higher expression in GERD patients than in controls (Fig. 6C). In the GSE39491 cohort, all hub genes except UQCRB displayed significant upregulation in GERD patients compared with controls (Fig. 6D).

Expression validation of 9 hub genes. (A–B) Expression differences of hub genes between IS patients and controls. (A) GSE22255 dataset; (B) GSE58294 dataset. (C–D) Expression differences of hub genes between GERD patients and controls. (C) GSE26886 dataset; (D) GSE39491 dataset. Group comparisons were performed using Wilcoxon rank-sum tests. ns, not significant; *P < .05; **P < .01; ***P < .001; ****P < .0001. GERD = gastroesophageal reflux disease, IS = ischemic stroke.
3.6. Diagnostic performance of hub genes in IS and GERD
Finally, we evaluated the diagnostic performance of the 9 hub genes in IS and GERD. In the GSE22255 cohort, all genes exhibited AUC values > 0.6 for IS diagnosis, with the combined model achieving an AUC of 0.9 (Fig. 7A). In the GSE58294 cohort, ID2 (AUC = 0.823) and MGAT4B (AUC = 0.856) demonstrated strong individual performance, while the combined model showed superior diagnostic capacity (AUC = 0.96) (Fig. 7B). For the GERD cohort (GSE26886), all genes displayed AUC values > 0.9, indicating excellent diagnostic performance, and the combined model reached an AUC of 1 (Fig. 7C). In another GERD cohort (GSE39491), except for UQCRB (AUC = 0.503), all other genes exhibited AUC values > 0.8, showing favorable performance, and the combined model further improved to an AUC of 0.92 (Fig. 7D).

Diagnostic performance evaluation of the 9 hub genes in ischemic stroke and GERD cohorts. Receiver operating characteristic (ROC) curves demonstrating the discriminatory power of individual hub genes and their combined model for disease diagnosis. (A) ROC analysis in the GSE22255 ischemic stroke cohort. (B) ROC analysis in the GSE58294 ischemic stroke cohort. (C) ROC analysis in the GSE26886 GERD cohort. (D) ROC analysis in the GSE39491 GERD cohort. In each panel, ROC curves for individual genes are displayed in distinct colors on the left, with the combined multivariate model represented by a red line on the right. The dashed diagonal line indicates the reference line of random classification (AUC = 0.5). Corresponding AUC values for each gene and the combined model are provided in the legend. AUC = area under the curve, GERD = gastroesophageal reflux disease, ROC = receiver operating characteristic.
4. Discussion
GERD and IS impose significant global health burdens due to their high prevalence and complex pathogenesis. Epidemiological evidence suggests a potential bidirectional association between these disorders, yet molecular mechanisms underlying their co-occurrence remain poorly understood. This study identifies 9 hub genes (FAM46C, FUT4, ODC1, UQCRB, ID2, TSC22D1, IL17RB, AHR, and MGAT4B) as shared molecular signatures between GERD and stroke through integrative bioinformatics analysis. These genes are implicated in critical pathways such as IL-17 signaling, glycosphingolipid biosynthesis, and PI3K-Akt signaling, suggesting potential convergence of inflammatory and immune dysregulation in both diseases.
Previous studies have independently reported dysregulated pathways in GERD and stroke. For instance, oxidative stress-related genes (e.g., HO-1, GSH) are consistently implicated in GERD pathogenesis,[13,14] while stroke studies highlight neuroinflammatory pathways (e.g., TNF-α, IL-6).[15] However, our study is the first to systematically identify overlapping molecular mechanisms using machine learning-based integration of multi-omics data.
Our study identified the IL-17 signaling pathway as a potential common pathogenic regulatory hub in both GERD and IS. Despite differences in their specific modes of action, IL-17-mediated processes in both diseases share a core proinflammatory response as a key characteristic. In GERD, IL-17 exacerbates clinical symptoms by activating acid-sensing receptors in esophageal squamous epithelial cells, thereby intensifying heartburn.[16] Additionally, IL-17 upregulates the expression of proinflammatory genes (e.g., IL-17 receptors) to propagate the inflammatory cascade.[17] Notably, therapeutic interventions targeting the IL-17 pathway (such as STW5) alleviate inflammation by downregulating receptor expression, while omeprazole, though not directly affecting IL-17 levels, inhibits receptor activity.[17] These findings underscore IL-17 signaling as a critical therapeutic target in GERD. In IS, IL-17 primarily drives neuroinflammation and blood–brain barrier (BBB) dysfunction. Studies demonstrate that IL-17A disrupts tight junction proteins (ZO-1, claudin-5, occludin), increasing BBB permeability. It further synergizes with proinflammatory cytokines (IL-6, TNF-α) to activate endothelial cell contraction and oxidative stress, promoting the infiltration of peripheral immune cells (neutrophils, Th17 cells) into the brain parenchyma and exacerbating neuronal damage.[18] Moreover, IL-17-driven Th17/Treg imbalance establishes a positive feedback loop with microglial M1 polarization, sustaining inflammatory amplification. This process is closely linked to poststroke cognitive impairments (e.g., vascular dementia).[18] Although IL-17-associated mechanisms exhibit tissue specificity (GERD primarily involves gastrointestinal inflammation, whereas IS focuses on cerebral microenvironmental dysregulation), their core pathology converges on IL-17-mediated chronic inflammation driven by Th17 cells. This shared feature suggests that IL-17 signaling may promote pathological damage across organs by enhancing systemic or local inflammatory microenvironments. For instance, IL-17-induced esophageal inflammation in GERD might indirectly disrupt cerebrovascular homeostasis via systemic inflammatory factors, while BBB breakdown in IS could facilitate inflammatory spread to the esophageal mucosal barrier. Furthermore, IL-17’s broad regulatory role in barrier function (e.g., intestinal and blood–brain barriers) may serve as a potential bridge for the comorbidity of these 2 diseases.
Previous investigations have highlighted that GERD significantly alters multiple metabolic pathways, including glycosphingolipid metabolism.[19] Notably, another study reported that pathways associated with extracellular matrix–receptor interaction, xenobiotic metabolism, and glycosphingolipid metabolism were markedly activated as early as 3 hours poststroke,[20] suggesting that glycosphingolipid metabolism may undergo dynamic changes during the acute phase of stroke. Dysregulation of glycosphingolipid metabolism has been linked to increased BBB permeability, inflammatory responses, and neuronal cell damage/repair processes. For instance, accumulation of specific glycosphingolipids can induce vascular wall inflammation, thereby promoting atherosclerosis (a key risk factor for stroke).[21]
Our study identified the PI3K-Akt signaling pathway as a critical regulator in both GERD and ischemic stroke. In a reflux esophagitis rat model, activation of the PI3K/Akt pathway was associated with oxidative stress and inflammatory injury; inhibition of this pathway alleviated inflammation and oxidative damage, thereby improving symptoms of reflux esophagitis.[22] In the early stages of IS, PI3K/Akt activation promotes neuronal survival and regeneration, mitigates brain tissue injury, and exerts neuroprotective effects. Specifically, Akt activation suppresses the expression of apoptosis-related proteins, reduces apoptotic cell death, and alleviates cerebral ischemia–reperfusion injury. Additionally, this pathway regulates poststroke angiogenesis and neural plasticity, which are critical for brain tissue repair and functional recovery. However, aberrant PI3K/Akt activation may also drive pathological outcomes, such as glial cell proliferation and excessive inflammation, exacerbating stroke-induced tissue damage.[23,24] Collectively, these findings suggest that precision-targeted strategies modulating the PI3K-Akt pathway may provide novel avenues for investigating the shared pathogenic mechanisms of GERD and stroke, as well as developing dual-purpose therapeutic interventions.
Our study revealed that NETs may play a critical role in both GERD and IS through shared inflammatory-thrombotic mechanisms. The excessive release of NETs could underlie the association between these 2 diseases: in GERD, chronic gastric acid reflux induces neutrophil activation and NETosis, with NET components (e.g., citrullinated histone H3, myeloperoxidase) directly damaging the esophageal mucosa and promoting local fibrosis.[25,26] In IS, NETs exacerbate cerebral microthrombosis by activating platelets and promoting fibrin deposition, while simultaneously releasing proinflammatory cytokines (e.g., IL-6, TNF-α) that disrupt the BBB and amplify neurological injury.[27] Notably, NETs may act as a bridge linking the 2 conditions: NET components in the esophageal microenvironment of GERD patients (e.g., circulating free DNA) could accelerate atherosclerosis via systemic inflammation,[28] whereas poststroke neuroinflammation might worsen GERD symptoms through vagal nerve reflex, forming a vicious cycle.[29] Therapeutic strategies targeting NETs (such as the PAD4 inhibitor Cl-amidine) have shown reduced mortality in animal models of ischemic stroke, highlighting the potential to explore multi-target interventions that simultaneously regulate NET-mediated inflammatory and thrombotic pathways.
Our integrated analysis positions ID2 as a pleiotropic regulator linking GERD and IS through shared pathways of cellular stress and immunomodulation. In IS, ID2 demonstrates a context-dependent role: while its expression is upregulated by hypoxia/ischemia, its silencing attenuates neuronal apoptosis and improves neurological outcomes, suggesting an involvement in stress-induced cell death pathways.[30,31] Concurrently, ID2 is a critical determinant of T-cell fate, where its expression level dictates the balance between effector and exhausted CD8+ tissue-resident memory T cells during chronic CNS inflammation.[32,33] In GERD, ID2 is essential for maintaining intestinal epithelial identity by repressing foregut transcription factors, and its deficiency leads to gastric metaplasia in the small intestine, a process akin to the mucosal changes in Barrett esophagus.[34] This confluence of roles (regulating neuronal survival, T-cell exhaustion, and epithelial cell identity) suggests that ID2 dysregulation may simultaneously exacerbate cerebrovascular injury in IS and impair mucosal integrity in GERD, potentially creating a vicious cycle of inflammation and tissue damage via the brain–gut axis.
UQCRB, a subunit of mitochondrial complex III, may contribute to both diseases through dysregulated reactive oxygen species (ROS) accumulation and impaired energy metabolism. In GERD, UQCRB dysfunction could exacerbate esophageal mucosal injury by promoting oxidative stress. UQCRB is a key regulator of mitochondrial reactive oxygen species (mROS) production,[35] and mROS-mediated oxidative stress is a central mechanism in GERD pathogenesis. In IS, UQCRB’s role is more complex and context-dependent. On one hand, UQCRB enhances angiogenesis through mROS-mediated HIF-1α signal transduction and VEGF expression,[36,37] a process with dual effects post-ischemia (potentially beneficial for revascularization but detrimental if excessive, contributing to cerebral edema). UQCRB also positively regulates VEGFR2 signaling in endothelial cells,[37] influencing cerebrovascular function. Notably, UQCRB is significantly downregulated in atherosclerosis,[38] a major risk factor for IS. UQCRB upregulates COX5A, enhancing mitochondrial membrane potential, boosting ATP production, reducing ROS levels, decreasing secretion of inflammatory cytokines (TNF-α, IL-1β, IL-6), and reducing apoptosis rates,[38] all mechanisms relevant to cerebrovascular protection.
AHR, a ligand-activated transcription factor and a key regulator of inflammatory responses, may bridge GERD and IS via modulation of proinflammatory cytokines and immune cell function. In GERD, AHR may be involved in the response to environmental factors or endogenous ligands. AHR is highly expressed in gastric cancer tissues,[39] and its signaling can be activated by various environmental pollutants, suggesting a potential role in esophageal inflammation and mucosal damage related to GERD. In IS, AHR’s mechanisms are more delineated. AHR plays a critical role in regulating neuroinflammation by influencing microglial polarization. AHR activation promotes a proinflammatory M1 phenotype, while its inhibition favors an anti-inflammatory M2 phenotype.[40] Furthermore, AHR interacts with the TLR4 signaling pathway, modulating inflammation in cerebral ischemia/reperfusion injury (CIRI).[41] Notably, AHR inhibitors like isorhapontigenin can alleviate CIRI by targeting this pathway.[41] Drugs like edaravone dexborneol ameliorate cognitive impairment by regulating the NF-κB pathway through AHR and promoting microglial M2 polarization.[40] AHR also influences platelet activation and thrombosis,[42] impacting IS pathology, and modulates ferroptosis in neuronal damage via the AHR-CYP1B1 axis.[43]
FUT4, a key enzyme synthesizing the CD15/sialyl Lewis X glycan epitope, emerges as a significant molecular bridge between GERD and IS, primarily through its roles in cell adhesion, inflammation, and barrier dysfunction. In GERD, FUT4-mediated fucosylation critically regulates the function of adhesion molecules, such as CD44, by modifying their glycostructures, which can alter cell–cell and cell–matrix interactions.[44] The observed upregulation of FUT4 may lead to aberrant fucosylation of membrane proteins on esophageal epithelial cells, potentially disrupting the assembly and function of tight junctions, thereby impairing mucosal cohesion and repair, and rendering the epithelium more vulnerable to acid and pepsin injury. Furthermore, FUT4 expression is regulated by and can activate the Wnt/β-catenin signaling pathway,[44,45] a pathway implicated in epithelial proliferation and inflammation. Dysregulated FUT4 might thus contribute to the aberrant inflammatory and proliferative responses characteristic of chronic GERD. In IS, FUT4’s role is multifaceted and predominantly pro-pathogenic. Firstly, its involvement in endothelial activation is critical. FUT4-mediated fucosylation of selectins and other adhesion molecules on endothelial cells and leukocytes is a well-established mechanism for promoting the firm adhesion and subsequent transmigration of inflammatory cells across the BBB. Secondly, beyond inflammation, FUT4 drives malignant phenotypes in various cells through the activation of the RAF-MEK-ERK and Wnt/β-catenin signaling pathways.[46,47] In neurons and glial cells, such constitutive signaling activation could promote excitotoxicity, suppress pro-survival pathways, and ultimately exacerbate ischemic cell death.
As the rate-limiting enzyme in polyamine biosynthesis, ODC1 is upregulated in macrophages in response to inflammatory stimuli and functions to attenuate the production of proinflammatory cytokines and inhibit ROS-induced apoptosis, suggesting a protective feedback mechanism.[48] Conversely, the loss of ODC1 promotes macrophage pyroptosis, a highly inflammatory form of cell death, exacerbating organ injury in septic models.[49] In GERD, dysregulated ODC1 could disrupt the delicate balance of mucosal repair and inflammatory responses in the esophageal epithelium. In IS, altered ODC1 activity may similarly influence microglial activation and neuronal survival post-ischemia, positioning it as a key modulator of the shared inflammatory microenvironment in both conditions.
TSC22D1 emerges as a pivotal transcriptional regulator with context-dependent roles in cell fate, potentially contributing to both GERD and IS through pathways involving cellular senescence and endothelial dysfunction. This gene encodes protein isoforms that can exert opposing effects on cell survival and proliferation, with one isoform inducing apoptosis and another suppressing it.[50] Furthermore, TSC22D1 has been identified as a critical effector in oncogene-induced senescence, a key tumor-suppressor mechanism.[51] More recently, TSC22D1 was shown to promote liver sinusoidal endothelial cell dysfunction and drive proinflammatory M1 macrophage polarization, thereby exacerbating tissue fibrosis.[52] In GERD, TSC22D1 dysregulation could therefore influence esophageal epithelial cell turnover, apoptosis, and the development of fibrotic strictures. In IS, its role in promoting endothelial dysfunction and a proinflammatory macrophage phenotype aligns with mechanisms of blood–brain barrier disruption and chronic neuroinflammation, highlighting its involvement in fundamental stress–response pathways common to both diseases.
IL17RB, a receptor for IL-17E (IL-25), is identified as a central node in type 2 inflammatory signaling, providing a direct mechanistic link between allergic/inflammatory pathways and the pathophysiology of GERD and IS. Its expression is strongly induced by the Th2 cytokine IL-4, creating a positive autocrine loop that sustains its own expression and amplifies inflammatory signaling, often through the NF-κB pathway.[53] IL17RB is significantly upregulated during natural allergen exposure in conditions like seasonal allergic rhinitis, underscoring its role in environmental antigen-driven inflammation.[54] In cancer contexts, IL17RB signaling promotes stemness and confers resistance to therapy.[55] Within the GERD–IS axis, IL17RB likely mediates the activation of immune–esophageal epithelial interactions, exacerbating reflux-related inflammation. Simultaneously, in IS, its engagement could potentiate IL-25-driven neuroinflammation and impair brain repair mechanisms, positioning IL17RB as a key amplifier of pathogenic immune responses across organs.
FAM46C, identified as a noncanonical poly(A) polymerase that regulates mRNA stability and translation,[56,57] may serve as a critical link between GERD and IS through its roles in maintaining cellular homeostasis. In GERD, the integrity of the esophageal mucosal barrier is paramount. FAM46C, by stabilizing mRNAs of specific target genes, could be essential for the continuous renewal and repair of the esophageal epithelium. Its downregulation, as suggested by our data, might lead to impaired stability of transcripts encoding epithelial junction proteins or cytoprotective factors, thereby rendering the mucosa more susceptible to acid-peptic injury.[58] Furthermore, FAM46C has been demonstrated to inhibit autophagy and promote the accumulation of protein aggregates, exacerbating ER stress.[59,60] Given that ER stress is a known mechanism in GERD pathogenesis,[7] loss of FAM46C function could amplify this stress response, leading to enhanced epithelial cell damage and impaired healing. This is supported by its established tumor-suppressor role in gastrointestinal cancers, where its loss promotes disease progression.[58,61] In IS, the neuroprotective potential of FAM46C may be twofold. First, its role in inhibiting apoptosis is highly relevant. Studies have shown that FAM46C overexpression can suppress apoptosis induced by various stressors.[57,62] In cerebral ischemia, the downregulation of FAM46C could therefore disrupt this protective function, permitting the widespread activation of apoptotic pathways in neuronal cells. Second, the function of FAM46C extends to regulating critical cellular processes such as inflammation and organelle homeostasis. It has been identified as an interferon-stimulated gene that modulates inflammatory pathways,[59] and its expression promotes mitochondrial and lysosomal components essential for cellular health.[56] In IS, where neuroinflammation and mitochondrial dysfunction are central to secondary injury, a deficiency in FAM46C could exacerbate these damaging processes.
Additionally, its interaction with and inhibition of Plk4 kinase,[63,64] a regulator of centrosome duplication and the actin cytoskeleton, suggests a potential role in maintaining cytoskeletal integrity of cerebrovascular endothelial cells or neurons under ischemic stress. The convergent dysregulation of FAM46C in both GERD and IS suggests it may act as a pleiotropic regulator at the interface of epithelial integrity, neuronal survival, and inflammatory control. Its potential role in modulating the stability of mRNAs central to both diseases positions FAM46C as a compelling candidate for further mechanistic investigation into the gut–brain axis connecting GERD and stroke.
Several limitations should be acknowledged. First, the reliance on publicly available datasets introduces heterogeneity in sample collection (e.g., peripheral blood vs. tissue biopsies) and population demographics. Crucially, we were unable to perform sex-stratified analyses to explore potential gender-specific molecular mechanisms, which may be important given the known epidemiological differences in both GERD and IS between men and women. Furthermore, the lack of granular clinical subtyping represents another constraint. For ischemic stroke, we could not differentiate between etiological subtypes (e.g., large-artery atherosclerosis, cardioembolism, small-vessel occlusion), nor could we distinguish between erosive and nonerosive reflux disease phenotypes within the GERD cohorts. This limits our understanding of whether the identified biomarkers and pathways are universally applicable or specific to certain disease subtypes. Second, the cross-sectional design precludes causal inference between hub gene dysregulation and disease progression. Third, the exceptional diagnostic performance (AUC approaching 1.0) of our hub genes, particularly within the discovery cohorts, must be interpreted with caution. While this indicates strong separability between case and control groups in the analyzed datasets, it raises the possibility of model overfitting, especially given the current absence of a large, completely independent validation cohort. The high AUC values may be influenced by the relatively small sample sizes and the stringent feature selection process that optimized performance on the available data. Fourth, functional validation of candidate genes in cellular/animal models was beyond the scope of this study. Future directions include prospective cohort studies with larger sample sizes to validate the temporal dynamics and generalizability of hub gene expression, alongside intervention trials targeting IL-17/PI3K-Akt axes. 
Crucially, validating these biomarkers in an independent, external cohort will be essential to confirm their true diagnostic utility and mitigate concerns regarding overfitting. Multi-omics integration (e.g., proteomics, metabolomics) and single-cell sequencing could elucidate cell-type-specific contributions to shared pathophysiology.
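The overfitting concern raised above can be illustrated with a minimal simulation. The sketch below is not the pipeline used in this study; it assumes pure-noise expression data in which no gene truly differs between cases and controls, and every name and parameter in it (`simulate_cohort`, `best_gene`, `run`, the cohort size of 10 per group, the 500 candidate genes) is hypothetical. It selects the single best-discriminating gene in a small discovery cohort and then scores the same gene, in the same direction, in an independently simulated cohort.

```python
import random


def auc(pos, neg):
    """Chance that a random case scores above a random control (ties count 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def simulate_cohort(rng, n_per_group, n_genes):
    """Pure-noise expression: by construction, no gene separates the groups."""
    def draw():
        return [[rng.gauss(0, 1) for _ in range(n_per_group)]
                for _ in range(n_genes)]
    return draw(), draw()  # cases[gene][sample], controls[gene][sample]


def best_gene(cases, controls):
    """Return (AUC, gene index, direction) of the top gene in this cohort."""
    best = (0.0, 0, 1)
    for g in range(len(cases)):
        a = auc(cases[g], controls[g])
        sign = 1 if a >= 0.5 else -1          # allow up- or downregulation
        best = max(best, (max(a, 1 - a), g, sign))
    return best


def run(trials=20, n_per_group=10, n_genes=500, seed=0):
    """Average discovery AUC vs. AUC of the same gene in an independent cohort."""
    rng = random.Random(seed)
    discovery, external = [], []
    for _ in range(trials):
        d_cases, d_ctrls = simulate_cohort(rng, n_per_group, n_genes)
        a, g, sign = best_gene(d_cases, d_ctrls)
        v_cases, v_ctrls = simulate_cohort(rng, n_per_group, n_genes)
        discovery.append(a)
        external.append(auc([sign * x for x in v_cases[g]],
                            [sign * x for x in v_ctrls[g]]))
    return sum(discovery) / trials, sum(external) / trials


if __name__ == "__main__":
    disc, ext = run()
    print(f"mean discovery AUC: {disc:.2f}, mean external AUC: {ext:.2f}")
```

Under these assumptions, the discovery-cohort AUC of the selected gene is typically high even though the data contain no signal, whereas its AUC in the independent cohort tends toward 0.5. This is the pattern that external validation is designed to expose.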
5. Conclusion
In conclusion, this study elucidates shared molecular mechanisms underlying GERD and IS through integrative machine learning and systems biology approaches. The identified hub genes and pathways provide a foundation for developing novel diagnostic biomarkers and therapeutic targets. Further experimental validation and mechanistic dissection are warranted to translate these findings into clinical practice, ultimately improving outcomes for patients with comorbid GERD and IS.
Author contributions
Conceptualization: Fang Huang, Jie Zhang.
Data curation: Fang Huang, Jie Zhang.
Formal analysis: Fang Huang, Jie Zhang.
Funding acquisition: Fang Huang, Jie Zhang.
Investigation: Fang Huang, Jie Zhang.
Methodology: Fang Huang, Jie Zhang.
Project administration: Fang Huang, Jie Zhang.
Resources: Fang Huang, Jie Zhang.
Software: Fang Huang, Jie Zhang.
Supervision: Fang Huang, Jie Zhang.
Validation: Fang Huang, Jie Zhang.
Visualization: Fang Huang, Jie Zhang.
Writing – original draft: Fang Huang, Jie Zhang.
Writing – review & editing: Fang Huang, Jie Zhang.