Assessment of Alternative Credit Scoring Using Machine Learning Algorithm: A Behavioural Scoring Thru the Usage of E-Wallet Usage, Employment History, and Financial Aid

Low, Yee Jing (2023) Assessment of Alternative Credit Scoring Using Machine Learning Algorithm: A Behavioural Scoring Thru the Usage of E-Wallet Usage, Employment History, and Financial Aid. Final Year Project (Bachelor), Tunku Abdul Rahman University of Management and Technology.


Abstract

Traditional credit scoring currently struggles to provide an accurate credit risk evaluation because the data it can draw on is limited. Alternative credit scoring uses alternative data to improve consumers' creditworthiness, which can increase borrowers' chances of loan approval in the future. This project uses alternative data, namely e-wallet usage, employment history, and financial aid, to indicate customers' loan repayment ability. The CRISP-DM process is carried out in Jupyter Notebook using Python. Six machine learning algorithms are used: K-Nearest Neighbors, Random Forest, Logistic Regression, Artificial Neural Network, Support Vector Machine, and XGBoost. The dataset was collected through an online questionnaire sent to respondents via social media and email. The project was tested with three dataset sizes (110 rows, 1,000 rows, and 10,000 rows) to see which size gives the best results. The finished project lets users predict a customer's loan repayment ability by entering values for the 11 questions provided. It can also improve consumers' creditworthiness by using alternative data. In conclusion, Random Forest is the most suitable model for this project because it produced the highest accuracy, 0.98, with the larger dataset providing better accuracy. The biggest limitation of this project is the difficulty of finding a suitable dataset that contains consumers' actual private data.
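
The dataset-size comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the project's actual code: the data is synthetic (not the questionnaire responses), a small from-scratch logistic regression stands in for the six algorithms the project compared, and the names `make_synthetic`, `train_logistic`, `accuracy`, and `evaluate_sizes` are hypothetical. Only the three dataset sizes and the count of 11 input features come from the abstract.

```python
import math
import random


def make_synthetic(n_rows, n_features=11, seed=0):
    """Generate synthetic rows with a binary 'repays loan' label (illustrative only)."""
    rng = random.Random(seed)
    true_weights = [rng.uniform(-1, 1) for _ in range(n_features)]
    X, y = [], []
    for _ in range(n_rows):
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        # Label depends on a noisy linear score of the features.
        score = sum(w * v for w, v in zip(true_weights, row)) + rng.gauss(0, 0.5)
        X.append(row)
        y.append(1 if score > 0 else 0)
    return X, y


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, epochs=20, lr=0.1):
    """Fit logistic regression with plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for row, label in zip(X, y):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b)
            err = p - label  # gradient of the log loss w.r.t. the linear score
            w = [wi - lr * err * xi for wi, xi in zip(w, row)]
            b -= lr * err
    return w, b


def accuracy(w, b, X, y):
    """Fraction of rows whose predicted class (threshold 0.5) matches the label."""
    correct = sum(
        (sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) >= 0.5) == bool(label)
        for row, label in zip(X, y)
    )
    return correct / len(y)


def evaluate_sizes(sizes=(110, 1000, 10000), test_fraction=0.2):
    """Train and test on each dataset size; return {size: held-out accuracy}."""
    results = {}
    for n in sizes:
        X, y = make_synthetic(n, seed=n)
        split = int(n * (1 - test_fraction))  # first part trains, remainder tests
        w, b = train_logistic(X[:split], y[:split])
        results[n] = accuracy(w, b, X[split:], y[split:])
    return results


if __name__ == "__main__":
    for n, acc in evaluate_sizes().items():
        print(f"{n:>6} rows: accuracy = {acc:.2f}")
```

Because the data here is synthetic, the accuracies this sketch prints say nothing about the project's reported 0.98; it only shows the shape of a per-size evaluation loop with a held-out test split.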

Item Type: Final Year Project
Subjects: Science > Computer Science
Faculties: Faculty of Computing and Information Technology > Bachelor of Computer Science (Honours) in Data Science
Depositing User: Library Staff
Date Deposited: 21 Aug 2023 06:53
Last Modified: 21 Aug 2023 06:53
URI: https://eprints.tarc.edu.my/id/eprint/26065