A Machine‐Learning Model of Chronological Age Based on Routine Blood Biomarkers in a Central European Population: A Potential Biological Age Marker

Jan 7, 2026 · Journal of Aging Research

Using routine blood tests and machine learning to estimate biological age in Central Europeans

AI simplified

Abstract

XGBoost achieved a mean absolute error (MAE) of 8.73 years when estimating chronological age from blood biomarkers.

  • Machine-learning models can use blood biomarkers to estimate a person's chronological age.
  • The analysis included over 26 million anonymized laboratory results from more than 3 million individuals.
  • Ten key blood biomarkers were identified as influential in age estimation, including alanine aminotransferase and creatinine.
  • The findings suggest a potential biological age marker, though validation against clinical outcomes is necessary.
  • Further studies are needed to explore associations with health-related outcomes and other biological age measures.

Key numbers

8.73 years
Mean Absolute Error (MAE)
Performance of the XGBoost model in predicting chronological age.
26,818,974
Total Test Results
Number of valid test results in the dataset for analysis.
3 million individuals
Population Size
Size of the Central European population contributing to the dataset.

Key figures

Figure 1
Distribution of test results by age group and gender in a population sample
Frames the age distribution of test data, highlighting more samples in middle-aged groups than in younger or older groups
  • Males are shown on the left (blue) and females on the right (red), with the number of test results in millions per age group
  • The highest concentration of test results appears in middle-aged groups (35–69 years) for both genders
  • A noticeable decline in test results is visible in the younger (under 20) and older (over 85) age categories
Figure 3
Number of blood test results per patient in females versus males
Frames the distribution of samples per patient and highlights slight differences between females and males
  • Cumulative distribution functions (CDFs) show the proportion of patients with up to a given number of results, stratified by gender with females in red and males in blue; the median, 75th, and 90th percentiles are marked by dashed horizontal lines
Figure 4
Number of blood test results per year grouped by gender from 2010 to 2020
Highlights a consistent higher volume of female blood test results compared to males over a decade
  • Bar chart showing the number of results each year from 2010 to 2020, with female counts consistently higher than male counts across all years
Figure 5
Creatinine levels across age groups in males, females, and the overall population
Highlights distinct age-related creatinine patterns with a notable late-life decline in males versus steady female increase
  • Line chart of average creatinine levels by age group, with a blue line for males, a red line for females, and a gray line for the population median
  • Creatinine rises sharply in early childhood, stabilizes in adolescence and early adulthood, then gradually increases from age 50
  • Female creatinine levels rise steadily throughout life, while male levels decline rapidly after age 90
Figure 6
Urea levels by age group and gender in a Central European population
Highlights a sharper urea increase in older males, spotlighting age- and gender-related biomarker changes
  • Line graph of average urea levels across age groups, with male values in blue, female values in red, and the population median in gray
  • Urea levels are stable through childhood and early adulthood, then gradually increase with age
  • After age 60, urea levels rise more sharply, especially in males
  • In individuals over 90, males show a noticeable peak in urea levels followed by a decrease, while females show a slight continued increase

Full Text

What this is

  • This research develops machine-learning models to estimate chronological age using routine blood biomarkers.
  • The study utilizes over 26 million anonymized laboratory results from more than 3 million individuals in Central Europe.
  • XGBoost outperformed other algorithms with a mean absolute error (MAE) of 8.73 years, indicating its effectiveness in predicting age.

Essence

  • Machine-learning models can estimate chronological age from blood biomarkers with an MAE of 8.73 years. XGBoost was the most effective algorithm tested.

Key takeaways

  • XGBoost achieved the best predictive performance with an MAE of 8.73 years, outperforming neural networks, random forests, and ridge regression.
  • The ten most influential biomarkers identified include alanine aminotransferase (ALT), creatinine, and glucose, spanning various physiological domains.
  • The study emphasizes that while the model shows promise, it requires validation against clinical outcomes to confirm its utility as a biological age marker.
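
The comparison between algorithms described above comes down to ranking candidate models by their MAE on the same set of known ages. The sketch below illustrates that ranking step only. The model names (`model_a`, `model_b`, `model_c`), the helper `rank_by_mae`, and all ages are made up for illustration; they are not the study's models, code, or results.

```python
from typing import Dict, List, Sequence, Tuple


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between predicted and actual ages."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def rank_by_mae(
    actual: Sequence[float],
    predictions_by_model: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Return (model name, MAE) pairs sorted from lowest to highest MAE."""
    scores = {
        name: mean_absolute_error(actual, preds)
        for name, preds in predictions_by_model.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical ages and predictions, for illustration only.
actual_ages = [25.0, 40.0, 55.0, 70.0]
predictions = {
    "model_a": [30.0, 38.0, 60.0, 66.0],
    "model_b": [35.0, 50.0, 45.0, 60.0],
    "model_c": [31.0, 46.0, 49.0, 64.0],
}

ranking = rank_by_mae(actual_ages, predictions)
best_name, best_mae = ranking[0]
print(best_name, best_mae)  # model_a 4.0
```

A lower MAE means predictions sit closer to chronological age on average, so the first entry in the ranking is the best-performing model under this metric.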

Caveats

  • The model has not been validated against clinical outcomes, limiting its current application as a biological age marker.
  • The dataset may introduce sampling bias, as individuals with chronic conditions contribute more samples than healthy individuals.
  • The model is not recommended for individuals under 20 years due to nonlinear developmental changes in youth.
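
Two of these caveats can be checked directly on a dataset before a model is applied: whether any records fall below the age of 20, and how unevenly samples are spread across patients. The following is a minimal sketch under assumed data structures. The `LabResult` record, its field names, and the sample records are hypothetical and do not reflect the study's actual data format or preprocessing.

```python
from collections import Counter
from typing import List, NamedTuple


class LabResult(NamedTuple):
    patient_id: str  # hypothetical anonymized identifier
    age_years: int   # age at the time of sampling


MIN_AGE_YEARS = 20  # the summary advises against use below this age


def adult_results(results: List[LabResult]) -> List[LabResult]:
    """Keep only results from individuals aged 20 or older."""
    return [r for r in results if r.age_years >= MIN_AGE_YEARS]


def results_per_patient(results: List[LabResult]) -> Counter:
    """Count how many results each patient contributes."""
    return Counter(r.patient_id for r in results)


# Made-up records for illustration only.
records = [
    LabResult("p1", 12),
    LabResult("p2", 47),
    LabResult("p2", 48),
    LabResult("p2", 48),
    LabResult("p3", 63),
]

adults = adult_results(records)
counts = results_per_patient(adults)
print(len(adults))            # 4
print(counts.most_common(1))  # [('p2', 3)]
```

Inspecting the per-patient counts shows how much a few frequently tested individuals dominate the data; how to correct for that imbalance is a separate decision that this sketch does not attempt.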

Definitions

  • Mean Absolute Error (MAE): A measure of prediction accuracy, indicating the average absolute difference between predicted and actual values.
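
As a concrete illustration of the definition, the sketch below computes MAE by averaging the absolute differences between predicted and chronological ages. The function name and the ages are illustrative assumptions, not values or code from the study.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |predicted - actual| over all paired values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one pair of values is required")
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


# Made-up ages for illustration only; these are not values from the study.
chronological_ages = [30.0, 45.0, 60.0, 75.0]
predicted_ages = [38.0, 40.0, 66.0, 70.0]

# Absolute errors are 8, 5, 6, and 5 years, so the average is 24 / 4.
print(mean_absolute_error(chronological_ages, predicted_ages))  # 6.0
```

Read this way, an MAE of 8.73 years means the model's age estimates were off by 8.73 years on average, counting overestimates and underestimates equally.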
