Prediction of Diabetes Mellitus Using Logistic Regression and Decision Tree



Tain, Aik Siang (2022) Prediction of Diabetes Mellitus Using Logistic Regression and Decision Tree. Final Year Project (Bachelor), Tunku Abdul Rahman University College.



Diabetes mellitus is a disease that occurs when blood glucose (blood sugar) levels are too high or the body cannot regulate the amount of glucose in the bloodstream. The Pima Indians Diabetes dataset is used for the experiments. In this study, logistic regression analysis and the decision tree method are used to aid diabetes prediction and help address this critical medical problem. The main objective is to study the performance of the two algorithms and identify the better model for diabetes classification. A confusion matrix, together with the measures derived from it (accuracy, precision, recall and F1 score), is used to describe the performance of each classification model. Grid search cross-validation and parameter tuning are applied to further improve the accuracy of the models. In short, the decision tree classifier is selected as the final model because of its slightly higher average recall score; it achieves an average precision score of 61.02%, an average recall score of 58.67%, an average F1 score of 59.26% and an average accuracy score of 73.10%.
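The four measures named in the abstract can all be derived from the counts in a binary confusion matrix. The following is a minimal sketch of that arithmetic, not the project's own code: the names `ConfusionMatrix` and `classification_metrics` are hypothetical, and the example counts are invented for illustration and are not results from the study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    """Counts for a binary classifier (positive class = diabetic)."""
    tp: int  # diabetic, predicted diabetic
    fp: int  # non-diabetic, predicted diabetic
    fn: int  # diabetic, predicted non-diabetic
    tn: int  # non-diabetic, predicted non-diabetic


def classification_metrics(cm: ConfusionMatrix) -> dict:
    """Return accuracy, precision, recall and F1 as fractions in [0, 1]."""
    total = cm.tp + cm.fp + cm.fn + cm.tn
    predicted_pos = cm.tp + cm.fp
    actual_pos = cm.tp + cm.fn

    # Guard each ratio against a zero denominator.
    accuracy = (cm.tp + cm.tn) / total if total else 0.0
    precision = cm.tp / predicted_pos if predicted_pos else 0.0
    recall = cm.tp / actual_pos if actual_pos else 0.0

    # F1 is the harmonic mean of precision and recall.
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented counts for illustration only (not taken from the study).
example = ConfusionMatrix(tp=30, fp=20, fn=20, tn=80)
for name, value in classification_metrics(example).items():
    print(f"{name}: {value:.2%}")
```

Recall is the share of truly diabetic cases that the model flags, which is why a study of this kind may weight it heavily when choosing between models: a missed diabetic case (false negative) is usually the costlier error in a screening setting.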

Item Type: Final Year Project
Subjects: Science > Mathematics; Medicine > Internal medicine
Faculties: Faculty of Computing and Information Technology > Bachelor of Science (Honours) in Management Mathematics with Computing
Depositing User: Library Staff
Date Deposited: 17 Aug 2022 03:26
Last Modified: 17 Aug 2022 03:26