What this is
- This study investigates the and metabolome in preterm infants at high versus low risk for ().
- It involves a longitudinal analysis of 60 preterm infants, matched for clinical factors, to identify distinct microbial and metabolic signatures.
- Findings reveal early microbial differences that correlate with neurodevelopmental outcomes, suggesting potential biomarkers for risk prediction.
Essence
- Preterm infants at high risk for exhibit distinct and metabolome profiles from birth, which correlate with later neurobehavioral outcomes. These signatures may serve as early predictive biomarkers.
Key takeaways
- High-risk infants show a dysbiotic characterized by reduced beneficial microbes and increased pathobionts. This contrasts with low-risk infants, who maintain a healthier microbial composition.
- Functional analysis reveals that high-risk infants have enriched pathways associated with inflammation and neurodegeneration, while low-risk infants exhibit pathways supporting metabolic health and development.
- Metabolomic profiling indicates that high-risk infants have impaired amino acid metabolism and disrupted neuroactive signaling, linking gut health to neurodevelopmental outcomes.
Caveats
- The study's observational design limits causal inferences, as it cannot definitively prove that differences lead to neurodevelopmental outcomes.
- While the sample size is adequate for the matched design, validation in larger cohorts is necessary to confirm findings.
Definitions
- neurodevelopmental impairment (NDI): A range of disorders affecting brain development, leading to issues like cognitive deficits and learning disabilities.
- gut microbiome: The community of microorganisms residing in the gastrointestinal tract, influencing health and disease through metabolic and immune interactions.
Simplified
Introduction
Preterm birth, defined as delivery before 37 weeks of gestation, remains a leading global cause of childhood mortality and long-term neurodevelopmental impairment (NDI), encompassing cerebral palsy, cognitive deficits, learning disabilities, and behavioral disorders (Blencowe et al., 2012). The early postnatal period is now recognized as a critical window of parallel and interconnected development for both the brain and the gut microbiome (Hill et al., 2017; Thompson et al., 2015). Microbes colonize the human body during the first moments of life and coexist with the host, with the intestinal microbiota and their metabolites playing a crucial role in programming important bodily systems such as the immune and central nervous systems during these formative windows, with lifelong structural and functional implications (Cowan et al., 2020). Consequently, perturbations of the developing gut microbiota in early life can have a lasting impact on neurodevelopment and potentially lead to NDI later in life (Borre et al., 2014).
The gut microbiome of preterm infants is distinct and vulnerable. Shaped by clinical necessities such as antibiotic exposure, parenteral nutrition, and delayed enteral feeding, it is often characterized by low diversity, dominance of facultative anaerobes, and a paucity of beneficial symbionts (Gasparrini et al., 2019). A growing body of evidence links specific gut microbial features in preterm infants to later neurobehavioral outcomes (Masi and Stewart, 2019) This is grounded in the well-established concept of the microbiota-gut-brain axis, a bidirectional communication network highlighted by extensive preclinical and clinical research (Morais et al., 2021; Cryan et al., 2019). The early postnatal period represents a critical window where the parallel development of the brain and gut microbiota occurs, leading to the hypothesis of “nested sensitive periods” wherein their interaction shapes future neurological and behavioral outcomes (Carlson et al., 2018). The gut microbiome influences the central nervous system through multiple pathways, including the production of neuroactive metabolites (e.g., short-chain fatty acids, serotonin precursors), regulation of immune system maturation, and modulation of blood-brain barrier integrity (Silva et al., 2020).
Compared to term infants, the developing microbiome of preterm infants is often characterized by reduced diversity, delayed colonization by beneficial symbionts such as Bifidobacterium, and a predominance of facultative anaerobes and potential pathobionts like Enterobacteriaceae and Klebsiella (Gibson et al., 2016; Gasparrini et al., 2019). This aberrant microbial succession is likely driven by frequent exposures in the neonatal intensive care unit (NICU), including broad-spectrum antibiotics, varying nutritional sources, and a controlled environment (Yee et al., 2019; Bokulich et al., 2016). While pioneering studies have begun to associate specific gut microbial features with neurodevelopmental outcomes in preterm infants (Pammi et al., 2017; Masi and Stewart, 2019), most have been limited by small sample sizes, a focus on later-age outcomes, or a lack of integrative multi-omics data. Consequently, the specific longitudinal gut microbial and metabolic signatures that differentiate preterm infants at high risk for NDI early in life remain poorly defined.
Therefore, a critical gap persists. We lack a comprehensive, longitudinal understanding of how the functional potential and metabolic output of the preterm gut ecosystem differentiate infants on high- versus low-risk neurodevelopmental paths from the earliest moments of life. We hypothesize that preterm infants at high risk for NDI harbor distinct gut microbiome and metabolome profiles that are evident from birth (in meconium) and become increasingly pronounced by 3 months corrected age (CA), prior to the manifestation of overt clinical symptoms. To test this, we conducted a prospective, longitudinal, matched-cohort study employing integrated shotgun metagenomics and untargeted metabolomics. Our aims were to: (i) define the taxonomic and functional trajectories of the gut microbiome associated with neurodevelopmental risk; (ii) characterize the corresponding fecal metabolomic profiles; (iii) integrate multi-omics data to construct microbe-metabolite interaction networks; and (iv) directly correlate these gut ecosystem features with standardized neurobehavioral assessments at 3 months CA.
Methods
Study design and participant enrollment
This prospective longitudinal cohort study was conducted in the Level III/IV Neonatal Intensive Care Unit (NICU) at Longgang District Maternity & Child Healthcare Hospital of Shenzhen City. The primary objective was to compare gut microbiome characteristics between risk groups at birth and at 3 months CA. For sample size estimation, we selected alpha diversity as the primary outcome measure, as it represents a global index of microbial community maturity. Based on an anticipated moderate-to−large effect size (Cohen’s d = 0.75) for alpha diversity differences, informed by previous preterm infant microbiome study (Gibson et al., 2016), a sample size of 29 infants per group was calculated to achieve 80% power at a two-sided α level of 0.05. Our final cohort of 30 matched pairs (60 infants) exceeds this requirement, providing adequate statistical power to detect meaningful differences in microbial community characteristics between the risk groups, as well as to support taxonomic and functional analyses. Between February 1, 2024, and September 30, 2025, we screened 106 infants born at <37 weeks of gestational age for eligibility. Exclusion criteria were: (1) major congenital malformations (e.g., complex congenital heart disease, gastroschisis, neural tube defects); (2) known genetic or chromosomal syndromes; (3) congenital infections (TORCH); and (4) severe perinatal asphyxia (defined as Apgar score <3 at 5 minutes or umbilical cord pH <7.00). Written informed consent was obtained from the parents or legal guardians of all participants prior to enrollment. This study was approved by the Ethics Committee of Longgang District Maternity & Child Healthcare Hospital of Shenzhen City (Approval No: LGFYKYXMLL-2024-01). All procedures were performed in accordance with the Declaration of Helsinki.
Neurodevelopmental risk stratification protocol
The primary outcome for group stratification was neurodevelopmental risk status at 3 months CA, determined using a two-tiered, standardized assessment performed by certified assessors blinded to all clinical and microbiome data.
The 3-month assessment, while early, was chosen based on established literature demonstrating the predictive validity of these tools for later neurodevelopment (Hadders-Algra et al., 1997; Einspieler and Prechtl, 2005). We acknowledge that this is an early endpoint and have planned subsequent follow-up assessments at 12 and 24 months CA to confirm long-term outcomes.
Test of infant motor performance
The TIMP is a validated, norm-referenced assessment of postural and selective motor control for infants from 34 weeks post-conceptional age to 4 months post-term (Campbell et al., 2002, 1995). It evaluates both observed and elicited movements. The total raw score was converted to a percentile rank based on published normative data. A score below the 25th percentile was classified as indicative of impaired motor performance.
General movements assessment
The GMs assessment is a qualitative, video-based evaluation of spontaneous movement patterns that is a highly sensitive and specific predictor of later cerebral palsy (Seesahai et al., 2021). At the 3-month CA time point (the “fidgety movements” period), the presence of normal fidgety movements is the key marker of neurological integrity. Their absence is classified as “definitely abnormal” GMs.
Stratification and matching criteria
High-risk group
Infants presenting with both (a) a TIMP score <25th percentile and (b) definitely abnormal GMs (absent fidgety movements).
Low-risk group
Infants presenting with both (a) a TIMP score ≥25th percentile and (b) normal fidgety GMs.
Infants with discordant results (e.g., abnormal TIMP but normal GMs, or vice versa) were excluded from the primary comparative analysis to ensure a clear phenotypic contrast between risk groups.
To isolate the effect of neurodevelopmental risk from powerful confounders of the early gut microbiome, we employed a rigorous 1:1 matched case-control design after stratification. Each infant in the HR group was manually paired with an infant from the LR pool based on the following matching variables, in hierarchical order: 1) gestational age at birth (± 1 week); 2) Sex (exact match); 3) Birth weight (± 100 g); and 4) primary feeding mode at the time of the 3-month sample collection (exclusive human milk, exclusive formula, or mixed feeding). This process yielded the final, well-phenotyped analytical cohort of 30 matched pairs (n = 60 infants: 30 HR, 30 LR).
Sample collection, processing, and storage
Fecal samples were collected at two pre-defined time points: TP1 (Birth): The first passed meconium stool was collected within 24–72 hours after birth using sterile collection kits. TP2 (3 months CA): A fecal sample was collected at the routine 3-month CA follow-up visit. All samples were immediately placed on ice, transported to the laboratory within 2 hours, aliquoted into sterile cryovials, and stored at −80 °C until subsequent DNA and metabolite extraction. Stringent contamination control protocols were followed throughout.
Shotgun metagenomic sequencing and bioinformatics analysis
DNA extraction and library preparation
Microbial genomic DNA was extracted from approximately 200 mg of fecal homogenate using the QIAamp PowerFecal Pro DNA Kit (Qiagen), which includes mechanical bead-beating to ensure lysis of Gram-positive bacteria. DNA quality and concentration were assessed via fluorometry (Qubit) and gel electrophoresis. Sequencing libraries were prepared using the Illumina DNA Prep kit and sequenced on an Illumina NovaSeq 6000 platform (2 × 150 bp paired-end reads), generating a minimum of 10 million high-quality reads per sample.
Bioinformatic processing
Raw reads were processed through a standardized pipeline:
Untargeted metabolomic profiling by LC-MS
Metabolite extraction
Metabolites were extracted from approximately 50 mg of fecal sample using a cold methanol:water (80:20, v/v) solution containing internal standards for quality control. Samples were vortexed, sonicated, and centrifuged. The supernatant was dried under a gentle stream of nitrogen and reconstituted in an appropriate injection solvent.
LC-MS analysis
Metabolic profiling was performed on a Thermo Scientific Q Exactive HF-X hybrid quadrupole-Orbitrap mass spectrometer coupled to a Vanquish Horizon UHPLC system. Chromatographic separation was achieved on a SeQuant ZIC-pHILIC column (150 × 2.1 mm, 5 μm) under HILIC conditions. The mass spectrometer operated in both positive and negative electrospray ionization modes with full-scan MS (m/z 70–1050) at a resolution of 120, 000. Pooled quality control samples were analyzed throughout the sequence to monitor instrument stability.
Metabolomics data processing
Raw data files were converted to mzML format and processed using the XCMS package in R for peak picking, alignment, and retention time correction. Metabolites were annotated by matching accurate mass (± 5 ppm) and, when available, MS/MS fragmentation spectra against public databases (HMDB, METLIN, LipidMaps) and an in-house spectral library. Identification confidence was reported according to the Metabolomics Standards Initiative levels.
Data availability
The raw metagenomic sequencing data supporting the findings of this study have been deposited in the NCBI Sequence Read Archive (SRA) under BioProject accession number PRJNA1428929. The raw metabolomics data have been deposited in the MetaboLights database under accession number MTBLS13962. Both datasets will be made publicly available upon publication of this manuscript.
Statistical analysis
Clinical and demographic data
Continuous variables were summarized as mean ± standard deviation or median [interquartile range] based on distribution and compared between matched HR and LR groups using the paired Wilcoxon signed-rank test or the Mann-Whitney U test, as appropriate. Categorical variables were presented as counts (percentages) and compared using McNemar’s test or Fisher’s exact test. A two-sided p-value < 0.05 was considered statistically significant.
Microbiome and metabolome data analysis
The primary analysis for identifying features associated with neurodevelopmental risk was performed using MaAsLin2 (Multivariate Association with Linear Models 2). MaAsLin2 applies linear (or generalized linear) models to identify associations while adjusting for specified covariates. For all models, we adjusted for the following a priori selected confounders: delivery mode (vaginal/cesarean), birth weight (continuous), sex, gestational age (continuous), and a binary indicator for culture-proven neonatal infection. For metabolomics data, normalization was performed using the total ion current. Exploratory analysis for differentially abundant taxa was also conducted using Linear Discriminant Analysis Effect Size (LEfSe), with an LDA score > 2.0 and p < 0.01. To account for multiple testing, the Benjamini-Hochberg procedure was applied to all statistical tests (including MaAsLin2, LEfSe, and correlation analyses) to control the false discovery rate (FDR). An FDR-adjusted q-value < 0.10 was set as the threshold for significant discovery, with features meeting q < 0.05 highlighted as high-confidence findings.
Network and integration analysis
Spearman rank correlation coefficients were calculated between significantly differentially abundant microbial species and metabolites. Mantel tests were performed to assess the overall correlation between microbial (Bray–Curtis) and metabolic (Euclidean) distance matrices. For network visualization, we retained correlations with |r| ≥ 0.25, as this threshold was empirically determined to balance the inclusion of biologically plausible, moderate-strength associations while reducing visual noise from very weak correlations. All reported correlations in the network also met a nominal p-value < 0.05. Correlation networks were visualized using Cytoscape v3.9.1. For integration with neurodevelopmental scores, tripartite networks were constructed incorporating key microbes, metabolites, and scores from the Alberta Infant Motor Scale (AIMS) and Griffiths Mental Development Scales.
Results
Cohort characteristics and efficacy of the matched design
From the 106 infants screened, 100 completed the 3-month CA neurodevelopmental assessment. Application of the strict, dual-assessment risk stratification criteria identified 33 HR and 41 LR infants. After implementing the 1:1 matching protocol, 30 perfectly matched pairs were formed, constituting the final analytical cohort of 60 infants (Figure 1: Study Flowchart). The success of the matching procedure is demonstrated in Table 1. The HR and LR groups were statistically indistinguishable across all major perinatal and early clinical parameters that are known to significantly influence the early gut microbiome. Crucially, there were no significant differences in gestational age, birth weight, sex distribution, mode of delivery, exposure to intrapartum antibiotics, incidence of bronchopulmonary dysplasia, or feeding mode at the time of the 3-month sampling. This rigorous matching effectively minimizes the likelihood that subsequent multi-omics differences are driven by these extrinsic clinical factors, thereby strengthening the inference that observed disparities are more directly linked to the neurodevelopmental risk phenotype itself.

Study flowchart.
| Variable | Low-risk group | High-risk group | Pvalue- |
|---|---|---|---|
| GA at birth, weeks (median [SD]) | 33.63 ± 2.31 | 33.83 ± 2.41 | 0.744 |
| Female infants (%) | 15 (50) | 11 (36.67) | 0.297 |
| Birthweight, g (mean [SD]) | 2162 ± 469.49 | 2132 ± 578.57 | 0.826 |
| Vaginal delivery (%) | 14 (46.67) | 9 (30) | 0.184 |
| Labor antibiotics (%) | 2 (6.67) | 3 (10) | 0.64 |
| Bronchopulmonary dysplasia (%) | 12 (40) | 9 (30) | 0.417 |
| Breastmilk exposure (%) | 28 (93.33) | 25 (83.33) | 0.306 |
| Neonatal Jaundice (%) | 18 (60) | 19 (63.33) | 0.791 |
Postnatal maturation is the primary driver of overall microbial community structure, obscuring risk-related differences
We first examined the macro-scale development of the gut ecosystem. Alpha-diversity increased significantly from meconium (TP1) to 3 months CA (TP2) in both the low-risk (LR) and high-risk (HR) groups (Figures 2A, B, p < 0.001 for both, paired Wilcoxon test), indicating expected ecological succession after leaving the NICU environment. Crucially, no significant differences in species richness were observed between the HR and LR groups at either time point (Figures 2C, D). Beta-diversity analysis (Principal Coordinate Analysis, PCoA) showed that the largest source of variation separated samples by postnatal age (explaining 37.49% of variance, Figure 2E), not by risk group. Genus-level compositional profiles further demonstrated that all infants transitioned from a TP1 microbiota dominated by environmental genera (e.g., Stenotrophomonas) to a TP2 microbiota dominated by classic enteric facultative anaerobes (e.g., Escherichia, Klebsiella) (Figure 2F). These results indicate that the overall timetable and structural assembly of early gut microbial colonization are conserved between HR and LR infants, and neurodevelopmental risk is not associated with a gross failure or significant delay in this ecological succession.

Gut microbial community structure and development in preterm infants stratified by neurodevelopmental risk. FNTF0, low-risk group at meconium; FNTF2, low-risk group at 3 months CA; NTF0, high-risk group at meconium; NTF2, high-risk group at 3 months CA.Alpha-diversity (Observed Species) within the low-risk (LR, A) and high-risk (HR, B) groups from birth (meconium, TP1) to 3 months CA (TP2). Box plots show median and interquartile range; p-values from paired tests.Species richness comparison between risk groupsat birth andat 3 months CA. Box plots show median and IQR; ns, not significant (Mann–Whitney U test).Principal Coordinate Analysis (PCoA) of Bray–Curtis dissimilarity (family level). Each point represents a sample, colored by time point (TP1: meconium; TP2: 3 months CA) and shaped by risk group (circle: LR; triangle: HR). Axes show percent variance explained.Genus-level relative abundance stacked bar plots, grouped by risk status and time point. Only the top 15 most abundant genera across all samples are shown; remaining genera are collapsed into “Other”. (A, B) (C, D) (C) (D) (E) (F)
Divergent taxonomic trajectories reveal an early and persistent high-risk pathobiont signature
Beneath this conserved structural framework, we discovered profound differences in bacterial species abundance using covariate-adjusted multivariate association modeling (MaAsLin2) and LEfSe analysis (Figure 3). The longitudinal design allowed four key comparisons:
Longitudinal change in LR infants (FNTF0 vs. FNTF2): Development was characterized by an increase in a heterogeneous set of taxa, including environmental species and the commensal Lactobacillus crispatus (Figures 3A, F), suggesting a resilient and adaptable ecosystem.
Longitudinal change in HR infants (NTF0 vs. NTF2): In stark contrast, the HR trajectory was marked by a significant increase in species linked to environmental reservoirs and opportunistic infections (e.g., Vibrio harveyi, Pseudomonas aeruginosa), with a notable absence of canonical beneficial commensals like Bifidobacterium (Figures 3B, E).
Risk comparison at birth (FNTF0 vs. NTF0): The meconium of later HR infants was already enriched with Pseudomonas aeruginosa and Achromobacter xylosoxidans (Figures 3C, G), indicating very early microbial seeding associated with subsequent neurological vulnerability.
Risk comparison at 3 months (FNTF2 vs. NTF2): By this time point, the gut ecosystems had crystallized into a stark opposition. The LR gut was strongly associated with Akkermansia muciniphila, while the HR gut was dominated by Klebsiella variicola (Figures 3D, H). This establishes a clear “Akkermansia-Klebsiella axis” as a core taxonomic feature distinguishing neurodevelopmental risk.

Differential abundance of gut microbial taxa between neurodevelopmental risk groups across time.Differentially abundant taxa identified by LEfSe analysis. Bar length reflects the Linear Discriminant Analysis (LDA) score (log10). Comparisons shown are: longitudinal change within LR infants (: FNTF0 vs. FNTF2), longitudinal change within HR infants (: NTF0 vs. NTF2), risk comparison at birth (: FNTF0 vs. NTF0), and risk comparison at 3 months CA (: FNTF2 vs. NTF2).Forest plots showing effect sizes (MaAsLin2 coefficient) and significance for the top 20 species associated with the same four comparisons after covariate adjustment. Key species are highlighted:(enriched in LR) and(enriched in HR). (A–D) A B C D (E–H) Akkermansia muciniphila Klebsiella variicola
Functional metagenomics uncovers a fundamental bifurcation: healthy maturation vs. dysbiotic drift
KEGG pathway analysis revealed a profound bifurcation in the functional potential of the microbial communities (Figure 4).
LR Functional Trajectory (FNTF0 vs. FNTF2): Exhibited healthy maturation, with significant expansion in core biosynthetic and metabolic housekeeping functions (e.g., amino acid metabolism, vitamin biosynthesis) (Figure 4A).
Initial Functional Divergence at Birth (FNTF0 vs NTF0): LR meconium showed enrichment of host-derived eukaryotic pathways and ‘Human Diseases’ pathways (reflecting genes involved in protein folding and oxidative stress), whereas HR meconium was characterized by prokaryotic environmental information processing and metabolic pathways (Figure 4B).
Functional Divergence at 3 Months (FNTF2 vs. NTF2): HR microbiota was enriched for pathogenicity-related pathways (flagellar assembly, secretion systems, biofilm formation) and inflammation-associated ‘Human Diseases’ pathways, while LR microbiota showed enrichment of metabolic pathways and host-supportive functions (e.g., protein digestion, amino acid metabolism) (Figure 4C).
HR Functional Trajectory (NTF0 vs. NTF2): Followed an aberrant developmental pattern, shifting from host-dominated signals at birth toward enrichment of bacterial stress response and virulence pathways at 3 months (e.g., quorum sensing, biofilm formation, antimicrobial peptide resistance), with sustained enrichment of neurodegeneration-related pathways (Figure 4D).It is important to note that these ‘Human Diseases’ pathway annotations—particularly those related to neurodegenerative conditions—are based on orthology mapping and likely reflect the presence of microbial genes involved in core biological modules (e.g., oxidative stress, protein misfolding, inflammatory signaling) that are shared across multiple contexts, rather than implying any disease-specific processes in the infants.

Divergent functional trajectories of the gut microbiome revealed by KEGG pathway analysis. Top 30 enriched pathways (|ReporterScore| > 2, p < 0.05). Bar height: enrichment magnitude; colors: enriched groups. ReporterScore sign: directional (+/−) for pairwise comparisons, non−directional (absolute value) for multi−group comparisons.Low-risk infant development (FNTF0 vs. FNTF2).Birth comparison (FNTF0 vs. NTF0).Risk comparison at 3 months (FNTF2 vs. NTF2).High-risk infant development (NTF0 vs. NTF2). (A) (B) (C) (D)
Metabolomic profiling confirms a disrupted gut-brain metabolic state in high-risk infants
Untargeted fecal metabolomics reflected the net output of host and microbial metabolism (Figure 5).
Risk Comparison at Birth (FNTF0 vs. NTF0): Differences were minimal (Figure 5A).
Risk Comparison at 3 Months (FNTF2 vs. NTF2): The HR metabolome was defined by persistent amino acid metabolic impairment and aberrant neuroactive signaling, including dysregulated “D-Amino acid metabolism” (Figure 5B), providing direct evidence that the gut microbiome likely shapes these systemic metabolic disparities.
Longitudinal Change in HR infants (NTF0 vs. NTF2): Development was aberrant, showing concurrent disturbances: (1) impaired core nutrient metabolism (depletion of amino acid pathways) and (2) aberrant neuroactive pathway enrichment (e.g., “Tryptophan metabolism”) (Figure 5C).
Longitudinal Change in LR infants (FNTF0 vs. FNTF2): Development indicated coordinated host-microbe collaboration, with enrichment of pathways like “Protein digestion and absorption” and “Endocannabinoid signaling” (Figure 5D).

Distinct fecal metabolic profiles between risk groups revealed by KEGG pathway enrichment analysis (top 30 pathways by p-value). Bubble size reflects the number of enriched metabolites; bubble color denotes the enrichment p-value (darker, more significant).Birth comparison (FNTF0 vs. NTF0).Low-risk group development (FNTF0 vs. FNTF2).High-risk group development (NTF0 vs. NTF2).Risk comparison at 3 months (FNTF2 vs. NTF2). (A) (B) (C) (D)
Risk-specific assembly of integrated microbe-metabolite interaction networks reveals ecological disarray
To move beyond individual features and understand the system-level relationships, we analyzed how microbial species and metabolites co-vary to form ecological networks using Mantel tests and correlation analysis (Figure 6). This revealed markedly different states of ecological organization.

Divergent assembly of gut microbe-metabolite interaction networks between risk groups. Mantel test networks visualize significant Spearman correlations between differentially abundant microbial species and metabolites. Nodes: Yellow circles represent bacterial species, blue squares represent metabolites. Edges: Red lines indicate positive correlations (r > 0), blue lines indicate negative correlations (r < 0); edge thickness scales with the absolute correlation coefficient (|r|). Networks are shown for the birth comparison (: FNTF0 vs NTF0) and the 3-month CA comparison (: FNTF2 vs NTF2). The longitudinal development of these networks is shown for LR (: FNTF0 vs FNTF2) and HR (D: NTF0 vs NTF2) infants. Key interpretation note in results text: At 3 months CA, LR infants establish a dense, cooperative network centered onlinked to beneficial glycerophospholipids (e.g., phosphatidylcholines). In contrast, the HR network remains sparse, lacks a beneficial hub, and is centered on pathobionts likeandwith persistent negative correlations. A B C Akkermansia muciniphila Klebsiella Escherichia
At birth (FNTF0 vs. NTF0)
The networks were sparse and dominated by negative correlations (blue edges, Figure 6A), particularly in HR infants where lower levels of metabolites like decanoylcarnitine (involved in fatty acid metabolism) were associated with higher abundance of Pseudomonas aeruginosa. This indicates early, antagonistic host-pathogen or microbe-metabolite dynamics.
At 3 months, low-risk vs. high-risk (FNTF2 vs. NTF2)
A marked ecological shift had occurred (Figure 6B). The LR infant network was dense, complex, and dominated by positive, cooperative associations (red edges). Akkermansia muciniphila emerged as a central hub, strongly and positively linked to a cluster of beneficial glycerophospholipid metabolites (e.g., phosphatidylcholines PC(34:3), PC(36:4)). This formed a coherent and robust “symbiotic Akkermansia-lipid axis.” In stark contrast, the HR network remained sparse, lacked any central beneficial hub, and was characterized by persistent negative correlations. New, concerning pro-inflammatory links emerged, such as a positive correlation between Escherichia coli and androgen sulfates, which can have immunomodulatory effects.
Longitudinal trajectory - low-risk (FNTF0 vs. FNTF2)
LR infants successfully transitioned from the sparse, antagonistic birth network to the dense, cooperative 3-month network (Figure 6C). This maturation involved the positive integration of taxa like Bifidobacterium longum into associations with bile acid metabolites, indicating developing functional mutualism.
Longitudinal trajectory - high-risk (NTF0 vs. NTF2)
HR infants failed to make this critical ecological transition (Figure 6D). Their network remained structurally simple, fragile, and centered on Enterobacteriaceae (Klebsiella, Escherichia), maintaining inflammatory features and lacking cooperative structure.
In summary, LR infants assemble a robust, metabolically integrated, and cooperative gut ecosystem, resembling a resilient ecological community. HR infants remain in a state of ecological disarray, characterized by antagonistic interactions, a lack of keystone symbionts, and pro-inflammatory host-microbe-metabolite associations. The establishment of the Akkermansia-glycerophospholipid axis appears to be a hallmark of healthy gut ecosystem maturation.
Tripartite networks directly link gut ecosystem features to neurobehavioral performance
Finally, to directly bridge the gap between omics signatures and clinical phenotype, we constructed integrated tripartite correlation networks at 3 months CA (Figure 7). These networks incorporated (i) the key differential bacterial species, (ii) the key differential metabolites, and (iii) the neurobehavioral scores (AIMS total and Griffiths sub-domains).

Tripartite networks link gut microbes and metabolites to neurodevelopmental scores. Integrated correlation networks at 3 months CA incorporate key differential bacteria (yellow circles), metabolites (blue squares), and neurodevelopmental scores (green diamonds). Edges represent significant Spearman correlations (|r| ≥ 0.25, p < 0.05), with pink for positive and light blue for negative associations.Network centered on the Alberta Infant Motor Scale (AIMS) total score.Network incorporating Griffiths Mental Development Scales sub-domains (Language, Motor, Personal-Social, Hand-Eye Coordination, Performance). (A) (B)
Network centered on the AIMS motor score
The network organized with compelling clarity around two opposing bacterial hubs (Figure 7A). Akkermansia muciniphila demonstrated a strong positive correlation with the AIMS total score and with a cluster of six glycerophospholipid metabolites. Conversely, Klebsiella variicola was negatively correlated with the AIMS score and this same beneficial metabolite cluster. Klebsiella variicola was positively associated with a distinct set of metabolites, including triterpenoids like oleanolic acid, defining a coherent “dysbiotic cluster.”
Network with griffiths developmental scores
The pattern was remarkably consistent across multiple developmental domains (Figure 7B). In the Griffiths network (Figure 7B), multiple phospholipid metabolites—including PC (18:3/18:1) and PS (16:0/20:3)—clustered with developmental scores across all sub-domains, while metabolites such as desbutyl-lumefantrine and oleanolic acid were negatively associated with these scores. Notably, indole-3-lactic acid (labeled as Imazoleacetic acid) appeared in proximity to Hand-Eye Coordination, suggesting a potential neurosupportive role. In contrast, negative correlations were observed with metabolites like drug-derived desbutyl-lumefantrine, which was more abundant in the HR group.
These integrated analyses move beyond simple association to provide a direct, systems-level link between specific gut ecosystem states and tangible neurobehavioral outcomes. They nominate the Akkermansia muciniphila glycerophospholipid axis not just as a correlate, but as a compelling, multi-omics biomarker signature for healthy neurodevelopment.
Discussion
Our prospective, longitudinal, multi-omics study delineates a distinct developmental trajectory of the gut ecosystem in preterm infants who are at high risk for NDI. This trajectory is not defined by a mere delay in microbial colonization or reduced diversity (Yassour et al., 2016), but by an early and persistent functional divergence characterized by the depletion of key commensals, expansion of pathobionts, and a collective shift in microbial community metabolism towards pathways implicated in inflammation and neurotoxicity (Sharon et al., 2016; Valles-Colomer et al., 2019). Critically, signatures of this divergent trajectory are detectable in meconium and correlate with later neurobehavioral performance, suggesting their potential utility as early biomarkers for risk prediction (Grier et al., 2017). We emphasize that these findings demonstrate a strong association, not a proven causal relationship, between this gut ecosystem state and neurodevelopmental outcomes.
The most striking taxonomic finding is the observed dichotomy between Akkermansia muciniphila and Klebsiella variicola. Akkermansia muciniphila, considered a keystone species for gut health, is associated with enhanced intestinal barrier integrity (Derrien et al., 2017)and modulated host immunity (Ottman et al., 2017), and produces anti-inflammatory metabolites like short-chain fatty acids and amino acid derivatives (Dodd et al., 2017; Cani and de Vos, 2017). Akkermansia muciniphila has garnered attention as a “next-generation probiotic” due to its established roles in enhancing intestinal barrier integrity, modulating host immunity, and producing anti-inflammatory metabolites (Derrien et al., 2017; Ottman et al., 2017; Cani and de Vos, 2017). Recent strain-specific safety evaluations have provided important groundwork: a comprehensive assessment of Akkermansia muciniphila Akk11, isolated from healthy infant feces, demonstrated no adverse effects in toxicity studies, with no transferable antibiotic resistance genes and excellent gastrointestinal stress tolerance (Wang et al., 2025). These findings support the strain-specific safety profile of Akkermansia muciniphila and provide a rationale for further investigation, although causal relationships and optimal intervention parameters for preterm infants require rigorous evaluation in experimental models and preclinical studies before clinical translation can be considered.
Its consistent depletion in the HR group is correlated with poorer neurobehavioral scores. Conversely, the dominance of Klebsiella variicola, a member of a genus notorious for nosocomial infections in the NICU (Gorrie et al., 2017), in the HR group is associated with a pro-inflammatory ecosystem state (Martin and Bachman, 2018). This “Akkermansia-Klebsiella axis” represents a core ecological association distinguishing the groups, consistent with observations of Akkermansia muciniphila depletion in other inflammatory and metabolic conditions (Dao et al., 2016), though its causal role in neurodevelopment remains to be determined.
Beyond taxonomy, our functional metagenomic data provide a insight into the associated community phenotype. The enrichment of pathways related to bacterial virulence (e.g., LPS biosynthesis) and stress response in the HR microbiome indicates a community phenotype geared towards persistence and immune activation (Rooks and Garrett, 2016; Behnsen et al., 2013). LPS is a known trigger for systemic inflammation, a risk factor for white matter injury in preterm infants (Strunk et al., 2014). These pathways contain microbial genes involved in core biological processes—such as oxidative stress and inflammation—that have been associated with neuronal injury in the literature (Xiao, 2024; Wang et al., 2023). It is important to emphasize that this observation reflects microbial gene content related to these fundamental processes, rather than indicating any disease state in the infants. However, it raises the possibility that the functional profile of the dysbiotic HR microbiome overlaps with pathways implicated in neuroinflammation, a factor that may be relevant during this critical developmental window (Bokobza et al., 2019; Drobyshevsky et al., 2024).
The metabolomic findings corroborate and extend this picture of an associated dysfunctional gut-brain metabolic state. The impaired amino acid metabolism and aberrant enrichment of neuroactive pathways (e.g., tryptophan/serotonin) in HR infants suggest a state of metabolic imbalance. The tryptophan-kynurenine pathway, which can be influenced by gut microbes, is known to produce neuroactive metabolites that have been shown to impact brain development in preclinical models (Cervenka et al., 2017).The differential regulation of microbially derived D-amino acids provides compelling circumstantial evidence that the gut microbiome is a key contributor to this metabolic disparity (Sasabe et al., 2016).
The integrated correlation networks strengthen these associations by demonstrating system-level relationships. The identification of Akkermansia muciniphila and its co-varying glycerophospholipids as a cohesive “beneficial hub” positively linked to neurodevelopmental scores provides a tangible, multi-omics biomarker signature. Glycerophospholipids are essential for brain development (Hamilton et al., 2007), and their association with a gut symbiont suggests a potential microbiota-dependent pathway for supporting neurodevelopment (Schwarzer et al., 2016). Glycerophospholipids—including phosphatidylcholines, phosphatidylethanolamines, and phosphatidylserines—are fundamental structural components of neuronal membranes and play essential roles in brain development and function (Zhang et al., 2026). As the main lipid constituents of cellular membranes, they are critical for membrane fluidity, vesicle trafficking, and synaptic transmission (Lamari et al., 2025). During perinatal brain development, glycerophospholipids provide the building blocks for rapid synaptogenesis and myelination, with specific fatty acid compositions (particularly long-chain polyunsaturated fatty acids) being preferentially incorporated into developing neural tissues. Studies in animal models of perinatal brain injury have demonstrated that alterations in glycerophospholipid metabolism accompany excitotoxic damage, with decreased levels of specific phosphatidylcholine species observed following neonatal brain lesions (Hermans et al., 2024). Conversely, dietary interventions that increase phospholipid and unsaturated fatty acid levels have been associated with reduced brain damage and improved cognitive outcomes in hypoxic-ischemic encephalopathy models (Chen et al., 2023). The coherence of our multi-omics data—linking a keystone commensal with these neurosupportive lipids and better developmental scores—suggests a potential gut microbiota-dependent pathway for supporting early brain development, though the directionality and causality of these associations remain to be experimentally established. Conversely, the “Klebsiella variicola triterpenoid” hub associated with poorer outcomes defines a clear dysbiotic pattern. Triterpenoids can have complex bioactivities, and their microbial co-association warrants investigation into their role in inflammation or cellular stress (Dzubak et al., 2006). Oleanolic acid is a pentacyclic triterpenoid abundantly present in olive oil and various medicinal plants, with documented pharmacological activities including hepatoprotective, anti-inflammatory, and antioxidant effects. Its co-occurrence with Klebsiella variicola in the high-risk group is noteworthy and invites cautious interpretation. Experimental studies demonstrate its ability to modulate gut microbiota composition and intestinal epithelial immune gene expression (Xue et al., 2022), and protective effects in neonatal contexts have been reported (Fu et al., 2024). The apparent contrast—a molecule associated with protective properties in some settings co-occurring with a dysbiotic, pro-inflammatory ecosystem in our high-risk infants—highlights the critical importance of context in host-microbe-metabolite interactions. A metabolite’s biological effects may be profoundly influenced by the accompanying microbial community, host developmental stage, and physiological milieu (Rooks and Garrett, 2016; Xue et al., 2022). In preterm infants at high neurodevelopmental risk, the presence of oleanolic acid within a Klebsiella variicola-dominated ecosystem could reflect context-dependent functionality, altered metabolism by the dysbiotic community, or a host counter-regulatory response. This underscores the need for future studies to investigate metabolite functions within their ecological context and to assess whether the oleanolic acid-neurodevelopment relationship is modified by the surrounding microbial community.
The predictive potential of meconium profiles is a significant finding. The presence of specific bacteria and metabolites at birth in infants later classified as HR indicates that the seeding of the gut ecosystem is associated with future developmental vulnerability. Meconium reflects in utero and peripartum exposures (Gosalbes et al., 2013; Chu et al., 2017), suggesting that prenatal factors may set the stage for this divergent trajectory (Arboleya et al., 2015; Gensollen et al., 2016).
Our study has limitations that contextualize the findings. The sample size, though robust for this design, warrants validation in larger, independent cohorts (Knights et al., 2014).Most importantly, the observational, matched-cohort design allows us to control for major confounders and reveal strong associations, but it cannot definitively prove causation. Residual confounding or the possibility that an innate neurodevelopmental vulnerability shapes the gut microbiome (reverse causality) remain plausible. Therefore, our findings should be interpreted as identifying a high-risk biosignature that includes a specific gut ecosystem state.
Future research must address causality. Employing gnotobiotic animal models to test the direct influence of identified microbial communities on neurodevelopment is a critical next step (Walter et al., 2020). Subsequently, carefully designed clinical trials evaluating targeted interventions (e.g., probiotics or postbiotics aimed at promoting beneficial taxa like Akkermansia muciniphila (Plovier et al., 2017) will be necessary to determine if modulating this gut ecosystem trajectory can improve outcomes (Athalye-Jape and Patole, 2019). Our work defines the specific microbial and metabolic targets for such essential future studies.
In conclusion, this study identifies a gut ecosystem trajectory associated with high risk for NDI in preterm infants, with features detectable from the earliest days of life. The observed signatures, particularly the association between Akkermansia and glycerophospholipids, may represent candidate multi-omics biomarkers for early risk prediction, though this requires validation in independent cohorts. More broadly, these findings provide a rationale for further investigation into the relationship between the preterm gut microbiome and neurodevelopmental outcomes in future studies.
Conclusion
In summary, this prospective matched-cohort multi-omics study delineates a distinct developmental trajectory of the gut ecosystem in preterm infants classified as high-risk for neurodevelopmental impairment. This trajectory is characterized not by gross failure of microbial colonization, but by early commensal depletion (Akkermansia muciniphila), pathobiont expansion (Klebsiella variicola), a community-wide functional shift toward pro-inflammatory pathways, and failure to establish cooperative microbe-metabolite networks—features detectable from meconium and persisting through 3 months of age.
It is important to emphasize that these findings establish strong longitudinal associations between this gut ecosystem trajectory and neurodevelopmental risk, not causation. The detection of risk-associated features from the first days of life and their correlation with later neurobehavioral scores positions the gut microbiome as a promising early biomarker system. The observed multi-omics signatures, particularly the Akkermansia-glycerophospholipid module positively correlated with neurodevelopment, may represent candidate biomarkers for early risk stratification, though validation in independent cohorts is essential.
The primary translational value of this study lies in identifying a multi-omics signature associated with neurodevelopmental risk, which provides a theoretical foundation for future investigations—including experimental studies to establish causality and well-designed clinical trials to evaluate whether microbiome-targeted approaches could ultimately support neurodevelopment in this vulnerable population—thereby laying the groundwork for improving neurodevelopmental outcomes in preterm infants.