Credit Card Approval Prediction using Classification Algorithms

Authors: Naman Dalsania, Devang Punatar, Deep Kothari

DOI Link: https://doi.org/10.22214/ijraset.2022.47369

Abstract

Credit risk management in banks revolves around estimating the probability that a customer will default, that is, the customer's creditworthiness, and the cost incurred if default occurs. It is important to consider the key factors and anticipate the likelihood of consumer default under the given circumstances. This is where machine learning models come into play: they allow banks and large financial institutions to predict whether their customers will default on their loans. This project uses Python, in a Jupyter notebook, to build machine learning models with the highest possible accuracy. First, we load the dataset and inspect it. The dataset combines numerical and non-numerical features, with varying ranges of values and some missing entries. We pre-process the dataset so that the selected machine learning models can perform well. Once the data is clean, we perform exploratory data analysis to build intuition. Finally, we build a machine learning model that predicts whether an individual's credit card application will be approved, based on the factors listed in the application. The project also uses data analytics and machine learning to determine the most important parameters for credit card approval. We analysed three algorithms using performance measures including F1 score, precision, and recall. The Gradient Boosting Classifier achieved the highest accuracy, 90%, compared with the other two models we applied, the Support Vector Classifier and the AdaBoost classifier.

I. INTRODUCTION

Credit approval, such as for credit cards, is crucial to the modern economy. In today's interconnected world, the use of credit cards has become commonplace, even in developing nations such as India.

Credit approval remains a challenge for lenders, as it is difficult to forecast whether a consumer poses an acceptable credit risk and should be granted credit. This is especially true in emerging nations, where established rules and models from industrialized nations may not apply. Therefore, effective methods for automatic credit approval that can help bankers analyse consumer credit must be investigated.

Each bank receives tens of thousands of credit card applications each month. Banks have to manually review each of these applications, paying close attention to several factors, to determine whether the applicant should be granted a credit card. Because this activity is time-intensive and the likelihood of error grows with the number of applications, banks are seeking prediction-based algorithms that can perform this task efficiently and accurately.

In this paper, we predict whether an applicant will be approved for a credit card using machine learning algorithms. To begin with, we pre-processed the data and performed thorough exploratory data analysis (EDA) to better understand the factors that are crucial for training the model. In addition, we implemented ten machine learning algorithms on the pre-processed data to identify the model that provides the most accurate results given the precision-recall trade-off.

The paper is organized as follows. Section II describes the findings of our literature review. Section III explains the entire system in detail. We then present and analyse the results and compare the models. Finally, we conclude with the outcomes and observations.

II. LITERATURE REVIEW

A. Comparison of Different Supervised Machine Learning Classifiers to Predict Credit Card Approvals (IRJET)

This study compares various supervised machine learning models for predicting whether a credit card application will be approved, evaluating them on Precision, Recall, Time, Accuracy, and F1 Score. The aim was to identify the best classifier for automatically predicting credit card approval from the characteristics of credit card applications. The analysis also shows that each classifier has an edge on one or more metrics. To improve the performance of each model, the authors used hyperparameter optimization based on GridSearchCV to tune selected parameters. The UCI Machine Learning Repository dataset used in this work was unbalanced, and hence the F1 score was relied upon to evaluate the models. The classifiers used in this study are Logistic Regression, Random Forest, Decision Tree, XGBoost, Gradient Boost, Support Vector Machine (SVM), and Sequential Neural Network. Finally, based on F1 Score and AUC value, the research finds that the Random Forest classifier is the best model for predicting credit card approvals, with an F1 score of 86%. Although the research tries multiple machine learning models on the dataset, no attempt was made to balance the data for better results.
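
As a rough illustration of the tuning approach described in this study, the following sketch runs a GridSearchCV search that selects hyperparameters by F1 score rather than accuracy. The synthetic data, the choice of a random forest as the estimator, and the parameter grid are assumptions made for illustration; they are not taken from the reviewed paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, mildly imbalanced stand-in for an application dataset (assumption).
X, y = make_classification(
    n_samples=400, n_features=10, weights=[0.7, 0.3], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical search space; the reviewed study's actual grid is not given here.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="f1",  # rank candidate settings by F1 on the positive class
    cv=3,
)
search.fit(X_train, y_train)

best_params = search.best_params_
test_f1 = f1_score(y_test, search.predict(X_test))
print(best_params, round(test_f1, 3))
```

Scoring by F1 keeps the search from favouring settings that simply predict the majority class, which is the concern with unbalanced data.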

B. Predicting Credit Card Approval of Customers Through Customer Profiling using Machine Learning (IJEAT)

This study focuses on forecasting credit card approval for users using a limited number of algorithms. The data was collected from bank customers in two forms, primary data and secondary data, which were then combined into one dataset. The combined dataset was evaluated and used to train models that predict whether credit card applications from customers will be approved. Since only a small number of variables were used to determine the final decision, the training and testing accuracies of both the decision tree and k-nearest neighbour (k-NN) algorithms were roughly 99.7% and 99.6%, respectively. These accuracies would change as more data is used for training and testing and as the number of variables used for the final decision increases. However, this study does not test other classification algorithms that might show better results when more variables are taken into consideration.

C. Credit Card Approval Predictions Using Logistic Regression, Linear SVM and Naïve Bayes Classifier (IEEE)

This paper compares the prediction accuracy of Logistic Regression, Linear SVM and Naïve Bayes Classifier in the credit card approval process, with Balanced Accuracy as the performance criterion. The dataset contains two types of features, numerical and categorical, including debt, age, income, and education, among others. Credit applicants are split into "good credit" and "poor credit" categories according to the credit scoring algorithm. Based on the model implementation, Linear SVM showed the best prediction performance among the models, with a Balanced Accuracy of around 89%. However, the performance of each model would fluctuate slightly depending on the data processing, parameter tuning process and data features. One limitation of this paper is that further factors for assessing prediction performance, such as computational efficiency, reject inference and outlier handling, are not included.
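
Balanced Accuracy, the criterion used in this study, is the mean of the recall on each class. The sketch below contrasts it with plain accuracy; the confusion counts are hypothetical and chosen only to show how the two measures diverge when one class dominates.

```python
def accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of the recall on the positive class and on the negative class."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return (sensitivity + specificity) / 2


# Hypothetical counts: 90 "good credit" cases and 10 "poor credit" cases.
tp, fn, tn, fp = 85, 5, 4, 6

acc = accuracy(tp, fn, tn, fp)
bal = balanced_accuracy(tp, fn, tn, fp)
print(f"accuracy={acc:.3f} balanced_accuracy={bal:.3f}")
```

With these counts the classifier handles the majority class well but misclassifies most of the minority class, so balanced accuracy comes out well below plain accuracy.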

Here, the number of False Positives and False Negatives decreased. This model performed better than SVC.

Since the purpose of this problem is to minimise the risk of loan default for the financial institution, the criteria that should be employed depend on the present economic climate:

During a bull market (when the economy is expanding), people feel prosperous and are typically employed. Money is typically inexpensive, and the danger of default is low. Since the financial institution can manage the risk of default, it is not overly stringent when extending loans. The institution can accommodate a small number of undesirable customers so long as most applicants are desirable (that is, those who pay back their credit). In this situation, a high recall (sensitivity) is ideal.
During a bear market (when the economy is contracting), people lose their jobs and lose money in the stock market. Many individuals struggle to meet their financial obligations. Therefore, the financial institution tends to be more careful when extending credit or loans. It cannot afford to extend credit to customers who will be unable to repay it, and would rather turn away some good customers than accept any bad ones. In this situation, high precision is desired, because approving an applicant who later defaults is the costly error.

Since we are currently in the longest bull market (excluding the flash crash in March 2020), we use recall as our primary metric.
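
The trade-off between the two market regimes can be illustrated by moving the decision threshold of a classifier. The following is a minimal sketch in plain Python; the labels, scores, and thresholds are hypothetical and are not drawn from the dataset used in this paper.

```python
def precision_recall(y_true, scores, threshold):
    """Precision and recall for the positive class (1 = approve) at a threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels (1 = good applicant) and model scores for ten applicants.
y_true = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.55, 0.45, 0.40, 0.30, 0.10]

lenient = precision_recall(y_true, scores, threshold=0.35)  # bull-market policy
strict = precision_recall(y_true, scores, threshold=0.65)   # bear-market policy
print("lenient:", lenient)
print("strict:", strict)
```

In this toy data the lenient threshold approves every good applicant at the cost of two bad approvals, while the strict threshold approves no bad applicants but rejects two good ones.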

In conclusion, the gradient boosting classifier is the best-performing model according to the ROC curve and recall.
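
A comparison of this kind can be sketched with scikit-learn as follows. This is an illustrative outline only: the synthetic data, default hyperparameters, and train/test split are assumptions, so the numbers it prints will not match those reported in this paper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.metrics import recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the pre-processed application data (assumption).
X, y = make_classification(n_samples=500, n_features=12, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "AdaBoost": AdaBoostClassifier(random_state=42),
    "SVC": SVC(random_state=42),
    "GradientBoosting": GradientBoostingClassifier(random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Continuous scores are needed for the ROC curve; ravel guards against
    # a column-vector return shape.
    y_score = np.ravel(model.decision_function(X_test))
    results[name] = {
        "recall": recall_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_score),
    }

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Recall is computed for class 1 (approved), and the area under the ROC curve summarises each model's ranking quality across all thresholds.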

IV. RESULTS

We report the metrics only for class ‘1’, i.e., the case in which the credit card is approved.

Table I: Comparing Accuracies Obtained For Different Algorithms

Model	Accuracy	Precision	Recall	F1-Score
Adaboost Classifier	0.76	0.75	0.78	0.77
Support Vector Classifier	0.85	0.83	0.88	0.86
Gradient Boosting Algorithm	0.90	0.90	0.90	0.90
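
For reference, the F1-score is the harmonic mean of precision and recall. The short sketch below recomputes it from the two-decimal precision and recall values in Table I; because those inputs are already rounded, a recomputed value can differ from the reported F1-score by about 0.01.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from Table I.
table = {
    "Adaboost Classifier": (0.75, 0.78),
    "Support Vector Classifier": (0.83, 0.88),
    "Gradient Boosting Algorithm": (0.90, 0.90),
}

recomputed = {model: round(f1_from(p, r), 2) for model, (p, r) in table.items()}
for model, f1 in recomputed.items():
    print(f"{model}: F1 = {f1:.2f}")
```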

V. FUTURE SCOPE

To further improve our system, deep learning models could be used, as they may increase accuracy. Neural networks can discover hidden patterns and correlations in raw data, cluster and classify it, and continue to learn and improve over time. In the future, this credit card approval system could be optimized and deployed in an artificial intelligence environment. The system could also be automated by displaying the prediction result in a web or desktop application. Thus, this work has good future scope and can be enhanced by adding further features for better predictions.

Conclusion

In this paper, we applied several machine learning methods to predict whether a credit card will be approved for an individual. Several parameters were taken into consideration, as they make the model more effective and help institutions make better decisions to avoid fraud and losses. We applied a range of data pre-processing techniques, since thorough pre-processing contributes substantially to the performance of traditional machine learning models. During exploratory data analysis, we plotted many graphs and charts to study the dataset in depth and gain a better understanding of it. This helped us decide which models to apply so that they perform well on this dataset and correctly predict whether to approve a credit card. This prediction system can help banks by making their task easier and more efficient than the manual process that many banks currently use, and it is cost-effective.

References

[1] Siddhi Bansal, Tushar Punjabi, "Comparison of Different Supervised Machine Learning Classifiers to Predict Credit Card Approvals," IRJET, Volume: 08 Issue: 03, 2021.
[2] Arokiaraj Christian St Hubert, R. Vimalesh, M. Ranjith, S. Aravind Raj, "Predicting Credit Card Approval of Customers Through Customer Profiling using Machine Learning," IJEAT, Volume-9 Issue-6, April 2021.
[3] Yiran Zhao, University of Toronto, "Credit Card Approval Predictions using Logistic Regression, Linear SVM, and Naïve Bayes Classifier," 2022 International Conference on Machine Learning and Knowledge Engineering.
[4] A. A. Taha and S. J. Malebary, "An Intelligent Approach to Credit Card Fraud Detection Using an Optimized Light Gradient Boosting Machine," in IEEE Access, vol. 8, pp. 25579-25587, 2020, doi: 10.1109/ACCESS.2020.2971354.
[5] Niloy, NH. (2018). Naïve Bayesian Classifier and Classification Trees for the Predictive Accuracy of Probability of Default Credit Card Clients. American Journal of Data Mining and Knowledge Discovery. 3. 1. 10.11648/j.ajdmkd.20180301.11.
[6] Husejinovic, Admel & Kečo, Dino & Mašetić, Zerina. (2018). Application of Machine Learning Algorithms in Credit Card Default Payment Prediction. International Journal of Scientific Research. 7. 425. 10.15373/22778179#husejinovic.
[7] D. J. C. MacKay, "Comparison of Approximate Methods for Handling Hyperparameters," in Neural Computation, vol. 11, no. 5, pp. 1035-1068, 1 July 1999, doi: 10.1162/089976699300016331.
[8] Song, Jong-Woo. (2008). A Comparison of Classification Methods for Credit Card Approval Using R. Journal of the Korean society for quality management. 36.
[9] Narkhede, Sarang. “Understanding Confusion Matrix.” Medium, Towards Data Science, 29 Aug. 2019, towardsdatascience.com/understanding-confusionmatrix-a9ad42dcfd62.
[10] D. Prusti and S. K. Rath, "Web service-based credit card fraud detection by applying machine learning techniques," TENCON 2019 - 2019 IEEE Region 10 Conference (TENCON), Kochi, India, 2019, pp. 492-497, doi: 10.1109/TENCON.2019.8929372.
[11] Liu, R., 2018. Machine Learning Approaches to Predict Default of Credit Card Clients. Modern Economy, 09(11), pp.1828-1838.
[12] S. Xuan, G. Liu, Z. Li, L. Zheng, S. Wang and C. Jiang, "Random Forest for credit card fraud detection," 2018 IEEE 15th International Conference on Networking, Sensing and Control (ICNSC), Zhuhai, 2018, pp. 1-6, doi: 10.1109/ICNSC.2018.8361343.
[13] Shung, Koo Ping. “Accuracy, Precision, Recall or F1?” Medium, Towards Data Science, 10 Apr. 2020, towardsdatascience.com/accuracy-precision-recall-orf1-331fb37c5cb9?gi=8377df893c73.
[14] S. Khatri, A. Arora and A. P. Agrawal, "Supervised Machine Learning Algorithms for Credit Card Fraud Detection: A Comparison," 2020 10th International Conference on Cloud Computing, Data Science & Engineering (Confluence), Noida, India, 2020, pp. 680-683, doi: 10.1109/Confluence47617.2020.9057851.
[15] Brownlee, Jason. “Your First Deep Learning Project in Python with Keras Step-By-Step.” Machine Learning Mastery, 16 Apr. 2020, machinelearningmastery.com/tutorial-first-neuralnetwork-python-keras/.

Copyright

Copyright © 2022 Naman Dalsania, Devang Punatar, Deep Kothari. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET47369

Publish Date : 2022-11-08

ISSN : 2321-9653

Publisher Name : IJRASET