1 PG Scholar, Department of Electronics and Tele Communication Engineering, Alamuri Ratnamala Institute of Engineering and Technology, Shahapur, India.
2 Assistant Professor, Department of Electronics and Tele Communication Engineering, Alamuri Ratnamala Institute of Engineering and Technology, Shahapur, India.
International Journal of Science and Research Archive, 2025, 16(03), 417–422
Article DOI: 10.30574/ijsra.2025.16.3.2538
Received on 13 July 2025; revised on 04 September 2025; accepted on 06 September 2025
Diabetes mellitus is one of the most prevalent chronic diseases worldwide. This study investigates the application of machine learning algorithms for early diabetes prediction, a crucial step in preventing severe health complications. Two benchmark datasets, the Pima Indians Diabetes dataset and the Scikit-learn Diabetes dataset (converted to binary classification), were analyzed using Logistic Regression, Support Vector Machine (SVM with RBF kernel), and XGBoost. The methodology included preprocessing, train-test splitting, feature scaling, and model evaluation with metrics such as Accuracy, ROC-AUC, Precision, Recall, F1-score, and Confusion Matrices. Logistic Regression, serving as the baseline, performed best on the Pima dataset with 75.3% accuracy and a ROC-AUC of 0.815, demonstrating its strength in structured medical data. On the Scikit-learn dataset, XGBoost achieved the highest accuracy (76.4%), while SVM produced the best ROC-AUC (0.841), showcasing its ability to capture complex non-linear patterns. Findings highlight the importance of dataset-specific model selection and the integration of linear and non-linear approaches for reliable healthcare decision-support systems.
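The evaluation metrics named in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' code. The labels and scores below are hypothetical toy values, not data or results from the study, and the 0.5 decision threshold and the helper names (`confusion_counts`, `classification_metrics`, `roc_auc`) are assumptions made for illustration.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels where 1 is the positive class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 from the confusion matrix."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative sample")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical toy example (NOT results from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, while ROC-AUC is computed from the raw scores. This is why one model can lead on accuracy and another on ROC-AUC for the same dataset, as the abstract reports for XGBoost and SVM.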
Diabetes Prediction; Machine Learning; Logistic Regression; Support Vector Machine; XGBoost; Medical Diagnosis
Taaha Ansari and Vaishali M. Bagade. Leveraging machine learning for early detection of diabetes: A dataset-driven comparative study. International Journal of Science and Research Archive, 2025, 16(03), 417–422. Article DOI: https://doi.org/10.30574/ijsra.2025.16.3.2538.
Copyright © 2025 Author(s) retain the copyright of this article. This article is published under the terms of the Creative Commons Attribution License 4.0.







