International Journal of Science and Research Archive
International, Peer reviewed, Open access Journal ISSN Approved Journal No. 2582-8185




Leveraging machine learning for early detection of diabetes: A dataset-driven comparative study


Taaha Ansari 1, * and Vaishali M. Bagade 2

1 PG Scholar, Department of Electronics and Tele Communication Engineering, Alamuri Ratnamala Institute of Engineering and Technology, Shahapur, India.

2 Assistant Professor, Department of Electronics and Tele Communication Engineering, Alamuri Ratnamala Institute of Engineering and Technology, Shahapur, India.

Research Article

International Journal of Science and Research Archive, 2025, 16(03), 417–422

Article DOI: 10.30574/ijsra.2025.16.3.2538

DOI url: https://doi.org/10.30574/ijsra.2025.16.3.2538

Received on 13 July 2025; revised on 04 September 2025; accepted on 06 September 2025

Diabetes mellitus is one of the most prevalent chronic diseases worldwide, and early prediction is a crucial step in preventing severe health complications. This study compares machine learning algorithms for early diabetes prediction on two benchmark datasets: the Pima Indians Diabetes dataset and the Scikit-learn Diabetes dataset (converted to binary classification). Three models were evaluated: Logistic Regression, Support Vector Machine (SVM) with an RBF kernel, and XGBoost. The methodology comprised preprocessing, train-test splitting, feature scaling, and model evaluation using Accuracy, ROC-AUC, Precision, Recall, F1-score, and confusion matrices. On the Pima dataset, Logistic Regression, which served as the baseline, performed best, with 75.3% accuracy and a ROC-AUC of 0.815, demonstrating its strength on structured medical data. On the Scikit-learn dataset, XGBoost achieved the highest accuracy (76.4%), while SVM produced the best ROC-AUC (0.841), reflecting the SVM's ability to capture complex non-linear patterns. The findings highlight the importance of dataset-specific model selection and of combining linear and non-linear approaches in reliable healthcare decision-support systems.
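
The evaluation metrics named in the abstract can be illustrated with a minimal, standard-library Python sketch. The labels and scores below are invented toy values, not data or results from the study, and the function names and the 0.5 decision threshold are assumptions made purely for illustration. The sketch shows how Accuracy, Precision, Recall, and F1-score follow from a confusion matrix, and how ROC-AUC is computed from the ranking of scores rather than from thresholded predictions.

```python
from __future__ import annotations

from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) for binary labels coded as 0 (negative) and 1 (positive)."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Accuracy, precision, recall, and F1 derived from the confusion matrix."""
    tn, fp, fn, tp = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (illustrative values only, not from the study).
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.10, 0.30, 0.45, 0.60, 0.40, 0.55, 0.70, 0.90]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)

print(metrics)
print(f"ROC-AUC: {auc:.4f}")
```

Because ROC-AUC depends only on how the scores rank positives against negatives, while accuracy depends on a single threshold, one model can lead on accuracy while another leads on ROC-AUC, as reported for XGBoost and SVM on the Scikit-learn dataset.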

Diabetes Prediction; Machine Learning; Logistic Regression; Support Vector Machine; XGBoost; Medical Diagnosis

https://journalijsra.com/sites/default/files/fulltext_pdf/IJSRA-2025-2538.pdf


Taaha Ansari and Vaishali M. Bagade. Leveraging machine learning for early detection of diabetes: A dataset-driven comparative study. International Journal of Science and Research Archive, 2025, 16(03), 417–422. Article DOI: https://doi.org/10.30574/ijsra.2025.16.3.2538.

Copyright © 2025. The author(s) retain the copyright of this article. This article is published under the terms of the Creative Commons Attribution License 4.0.



Copyright © 2026 International Journal of Science and Research Archive - All rights reserved
