Credit default risk prediction using boosting algorithms

dc.contributor.advisor	Tan, Le Nhat
dc.contributor.author	Loan, Thai Do Phuong
dc.date.accessioned	2020-12-04T07:29:06Z
dc.date.available	2020-12-04T07:29:06Z
dc.date.issued	2019
dc.identifier.other	022004835
dc.identifier.uri	http://keep.hcmiu.edu.vn:8080/handle/123456789/3914
dc.description.abstract	Credit risk is one of the major financial challenges facing the banking system and financial institutions. This thesis proposes a machine-learning approach based on boosting algorithms to solve the default risk problem. Boosting is the general name for a family of ensemble models. There are many different boosting algorithms; later versions address the shortcomings of earlier ones and are designed to work with complicated and heterogeneous data, especially to tackle sparsity and large-scale data issues. This thesis mainly introduces AdaBoost and Gradient Boosted Decision Trees (GBDTs). While AdaBoost is the very first version of a boosting algorithm, GBDTs prove to be a clever algorithm with a lot of potential for further improvement. Any discussion of GBDTs must mention three powerful implementations: XGBoost, LightGBM and CatBoost. Applied to the Home Credit dataset to solve the credit default risk problem, XGBoost, LightGBM and CatBoost achieved AUC scores larger than 0.75 (AdaBoost was more modest with 0.71), while the current highest score is 0.8. At the end of this thesis, after improving CatBoost with some advanced techniques, our model reached an AUC score of 0.79. Beyond these results, understanding the foundations of these algorithms can help us continue to research and improve their performance. Key words: Default Risk, Machine Learning, Boosting Algorithms, AdaBoost, Gradient Boosting Machine, Gradient Boosted Decision Trees, XGBoost, LightGBM, CatBoost.	en_US
dc.language.iso	en_US	en_US
dc.publisher	International University - HCMC	en_US
dc.subject	Default risk; Machine Learning	en_US
dc.title	Credit default risk prediction using boosting algorithms	en_US
dc.type	Thesis	en_US

Files in this item

Name: 022004835 - Loan, Thai Do ...
Size: 2.816Mb
Format: PDF

This item appears in the following Collection(s)

Bachelor Thesis - Mathematics
