Methods for finding one-year probability of default in credit risk modeling

022004836 - Anh, Nguyen Duy.pdf (2.102Mb)

Date

2019

Author

Anh, Nguyen Duy

Abstract

Machine Learning is becoming one of the most important fields in our world. Thanks to the growth of technology, it is getting easier to collect data on individuals, objects, and phenomena, and this enormous volume of data enables scientists to predict outcomes from the corresponding variables. At the same time, the need for prediction is becoming more and more important in our daily lives. A case in point is a bank that needs to determine whether it should lend money to a customer: to reach a final conclusion, data scientists are asked to build models based on certain information about the customer. This information may include annual income, age, gender, marital status, number of children, loan repayment history, and many other types of data. Using these variables, scientists can advise the bank on whether the person is likely to pay back the loan. The prediction is not guaranteed to be correct, but we can believe that there is a high chance that the actual outcome will match it. This example demonstrates only a fraction of how Machine Learning can be used in real life, but it shows the potential of the field. A professor once told me that Machine Learning in Vietnam was like a "baby" that would definitely grow up in a few years. Moreover, he said that there was a high chance this "baby" could be a "genius". In other words, Machine Learning could grow significantly to become the backbone of Vietnam's industrial development. Because of this potential for growth, I aim to give an introduction to Machine Learning, including Logistic Regression, Decision Trees, Bagging, and Random Forest, and to show how to apply these methods in Credit Risk Modelling. Unlike previous work by other researchers, my goal is to clarify the idea behind each approach. In R, all of these methods are wrapped in compact functions, which can prevent students from understanding how the code works. To avoid blindly applying the functions, the thesis carries out every single step of each method and rechecks the results with R. Finally, a diagram is drawn to visualize the structure of the dissertation.

URI

http://keep.hcmiu.edu.vn:8080/handle/123456789/3916

Collections

Bachelor Thesis - Mathematics