AI model identifies prediabetes risk with high accuracy


By combining the biology of oxidative stress with advanced machine learning, researchers show how a simple blood-based measurement of antioxidants can significantly improve prediabetes risk prediction and support earlier, more targeted prevention strategies.

Study: The artificial intelligence model as a tool to predict prediabetes. Image Credit: CI Photos/Shutterstock

In a recent study published in the journal Scientific Reports, researchers developed a neural network (PNN) model that combines a novel measure of total antioxidant status with traditional indicators to improve the prediction of prediabetes in Indian adults. The PNN outperformed support vector machine, k-nearest neighbors, and logistic regression models trained on the same dataset, achieving 98.3% accuracy. Waist circumference and antioxidant status showed the highest predictive power according to model-derived feature importance, with BMI also contributing significantly to classification performance.

Growing need for accurate detection of prediabetes

Prediabetes is a critical early stage, characterized by high blood sugar levels, that has not yet progressed to diabetes. Each year, approximately 5 to 10% of people with prediabetes develop diabetes, while a comparable proportion return to normal blood sugar levels. Because progression is not inevitable, early detection is essential to prevent type 2 diabetes and its associated long-term complications.

Traditional diagnostic approaches rely on blood tests and clinical assessment, but these methods can be time-consuming, expensive, and sometimes limited in their ability to predict individual risk. As data-driven tools advance, AI allows researchers to combine data from multiple sources and is emerging as a promising alternative for early disease detection.

AI-based prediction models offer multiple benefits, including greater diagnostic accuracy, individualized risk profiles, and earlier intervention. These advances could significantly reduce health care costs by preventing disease progression.

Integration of oxidative stress markers into AI models

Researchers developed an AI model specifically optimized for prediabetes prediction using real-world clinical data from Indian adults. Unlike previous studies, the researchers sought to identify not only the most accurate model, but also one that closely matches clinically relevant biomarkers, including indicators of oxidative stress that may reflect the underlying pathophysiology.

This pilot study included 199 adults aged 18 to 60 years, classified as either prediabetic (n = 100) or healthy controls (n = 99) based on glycated hemoglobin (HbA1c) levels. After an overnight fast, 6 mL of peripheral blood was collected. Biochemical testing included measurements of HbA1c, fasting blood glucose, and lipid profile using standardized enzymatic assays. High-density lipoprotein (HDL), low-density lipoprotein (LDL), and very low-density lipoprotein (VLDL) values were calculated.

A key addition to this dataset was the measurement of total antioxidant status, with antioxidant activity expressed as a percentage of recovery potential. Values in healthy individuals generally fall between 20% and 60%.

A total of 14 features, including demographic, clinical, biochemical, and oxidative stress markers, were used to train a neural network model with 14 input nodes, 10 hidden nodes, and one output node. The data were randomly divided into training, validation, and testing sets, followed by preprocessing steps such as normalization, outlier removal, and handling of missing values. The model's performance was compared with that of other AI models and logistic regression. Pearson correlation and descriptive statistics were used to examine relationships between variables and assess feature relevance before model training.
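The 14-10-1 layout and the random three-way split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the article does not report the split proportions, activation functions, or training procedure, so the 70/15/15 split, the tanh and sigmoid activations, and the random untrained weights below are all placeholders, and every function name is hypothetical.

```python
import math
import random

N_INPUT, N_HIDDEN = 14, 10  # layer sizes reported in the article


def split_indices(n, train=0.70, val=0.15, seed=0):
    """Randomly partition range(n); the 70/15/15 ratio is an assumption."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(n * train)
    n_val = int(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def fit_min_max(rows):
    """Learn per-feature minimum and range from the training rows only."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # avoid dividing by zero
    return lows, spans


def apply_min_max(rows, lows, spans):
    """Scale rows with previously fitted statistics."""
    return [[(v - lo) / s for v, lo, s in zip(row, lows, spans)] for row in rows]


def init_weights(seed=0):
    """Small random weights for a 14-10-1 network (untrained placeholder)."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUT)]
                for _ in range(N_HIDDEN)]
    b_hidden = [0.0] * N_HIDDEN
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
    b_out = 0.0
    return w_hidden, b_hidden, w_out, b_out


def forward(x, weights):
    """One forward pass: tanh hidden layer, sigmoid output (both assumed)."""
    w_hidden, b_hidden, w_out, b_out = weights
    hidden = [math.tanh(sum(w * v for w, v in zip(ws, x)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    z = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-z))  # read as a score between 0 and 1
```

Fitting the scaling statistics on the training rows and reusing them for the validation and test rows is a common way to keep information from the held-out sets out of training; whether the study did this is not stated.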

Key biomarkers distinguishing prediabetes profiles

Of the 14 variables measured, six showed significant differences between individuals with and without prediabetes: age, body mass index (BMI), waist circumference, antioxidant activity, oral glucose tolerance test (OGTT), and HbA1c levels. People with prediabetes had significantly lower antioxidant capacity, indicating higher oxidative stress, and showed higher values for key metabolic indicators such as HbA1c and glucose responses.

Boxplot analyses reinforced these group differences by revealing distinct distributions for HbA1c, OGTT, and lipid markers. Fasting blood glucose showed some distributional differences, although the group comparison did not reach statistical significance. Some parameters showed positively skewed distributions, suggesting a clustering of abnormal values in the prediabetic group. Correlation tests found moderate associations between BMI and waist circumference, as well as modest associations between anthropometric measures and fasting glucose, which together capture overlapping but not redundant aspects of metabolic risk.
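For readers unfamiliar with the statistic, the Pearson correlation used in such tests can be computed as in the minimal sketch below. The BMI and waist values are invented for illustration and are not study data, and the function name is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only (not study data).
bmi = [22.1, 24.8, 27.3, 29.0, 31.5, 26.2]
waist_cm = [78.0, 85.0, 88.0, 99.0, 101.0, 94.0]
r = pearson_r(bmi, waist_cm)
```

A coefficient near 1 would mean the two measures carry almost the same information, whereas a moderate value is consistent with the article's point that they overlap without being redundant.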

PNN model demonstrates superior predictive accuracy

The PNN model trained on these variables classified participants very accurately. It achieved 97.9% accuracy on the training set and 95.2% on the testing and validation sets. The overall accuracy across all datasets was 98.3%, with perfect precision and high recall and F1 scores. Compared with the other models, the PNN consistently performed best, achieving the highest area under the curve (AUC) and the strongest error minimization.
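The metrics named above all derive from the same four confusion-matrix counts. The sketch below shows the standard definitions on made-up labels (1 = prediabetic, 0 = control); the labels are not the study's predictions, and the function name is hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels: one prediabetic case is missed, no control is misflagged.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

In this toy case precision is perfect because no control is flagged as prediabetic, while recall is lower because one true case is missed. The same pattern is how a model can report perfect precision alongside recall that is high but not perfect.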

Implications for early risk stratification

This study successfully integrated total antioxidant status into an AI-based prediabetes prediction model for an Indian population, highlighting oxidative stress as an important and often overlooked risk marker with potential mechanistic relevance for disease development rather than a simple correlational feature.

Results confirm that waist circumference, BMI, glucose markers, and antioxidant capacity are among the most informative predictors, consistent with evidence from other populations. The PNN provided higher accuracy than traditional machine learning models and demonstrated strong potential as a rapid and inexpensive screening tool pending external validation in independent cohorts.

Strengths include the comprehensive set of biochemical and clinical features and the introduction of oxidative stress measurements, which add biological depth to risk assessment. However, the single-center design, modest sample size, and cross-sectional nature limit generalizability and the ability to track changes over time.

Overall, the PNN provides a robust framework for early detection and risk stratification in prediabetes. Future research should validate the model in larger multisite cohorts and explore integration with longitudinal clinical data for prospective clinical and public health applications while formally assessing feasibility and stability of real-world performance.

Journal reference:

  • Yesupatham, A., Das, R., Bharani, G., Shaikmeeran, M., Saraswathy, R. (2025). Artificial intelligence model as a prediction tool for prediabetes. Scientific Reports 15: 43421. DOI: 10.1038/s41598-025-23227-0
