Customer Classification Using K-Means Clustering Method

Date

2020

Author

Tran, Huynh Huu Phuoc

Abstract

Customers are the key to the success of any business, especially in the commercial sector. The cost of retaining an existing customer is considerably lower than that of acquiring a new one. Thus, to sustain growth, vendors create competitive advantages by making every effort to understand the shopping behavior of their buyers. This is easier said than done: it is difficult for merchants to know their customers without the help of analytical techniques. One of the most popular techniques is customer segmentation, which identifies groups of similar customers based on their characteristics. Depending on the nature of their business, vendors use their own criteria to classify buyers. However, there is a widely used classification standard called RFM segmentation. It scores customers on three aspects: Recency (how long has it been since the customer's last activity or transaction with the brand?), Frequency (how often has the customer transacted with the brand during a certain period?), and Monetary (how much has the customer spent with the brand during a specific period?). For each aspect, a customer receives a score from 5 (best) to 1 (worst). After computing the RFM scores, we apply K-means clustering, a well-known cluster analysis method, to group customers based on their scores. The expected final result is a dataset of customers, each labeled with the cluster it belongs to. Although RFM segmentation and K-means clustering are very helpful to vendors, they become difficult to apply to very large datasets. In practice, vendors in the commercial sector usually work with customer datasets of more than 50,000 rows; moreover, the RFM scores must be recalculated over the entire dataset whenever new data are added, which is very time-consuming when done manually.
To solve this problem, this research seeks a suitable prediction method that can accurately classify new customers based on an existing sample.
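
The pipeline described in the abstract (RFM scoring from 1 to 5, then K-means on the scores) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the function names, the toy transactions, the rank-based quintile scoring (ties broken by input order), the choice of k = 2, and the nearest-centroid rule for a new customer, which is only one simple candidate and not necessarily the prediction method the research settles on.

```python
import random
from collections import defaultdict
from datetime import date


def rfm_table(transactions, today):
    """Aggregate (customer, day, amount) rows into raw (R, F, M) values."""
    last, freq, money = {}, defaultdict(int), defaultdict(float)
    for cust, day, amount in transactions:
        last[cust] = max(last.get(cust, day), day)
        freq[cust] += 1
        money[cust] += amount
    # Recency is measured in days since the most recent purchase.
    return {c: ((today - last[c]).days, freq[c], money[c]) for c in last}


def quintile_scores(values, higher_is_better=True):
    """Rank customers and map ranks to scores 1..5 (5 = best, 1 = worst)."""
    ranked = sorted(values, key=values.get, reverse=not higher_is_better)
    n = len(ranked)
    return {c: 1 + (5 * i) // n for i, c in enumerate(ranked)}


def rfm_scores(transactions, today):
    """Return {customer: (R score, F score, M score)}."""
    raw = rfm_table(transactions, today)
    # A small recency (a recent purchase) is good, so lower is better for R.
    r = quintile_scores({c: v[0] for c, v in raw.items()}, higher_is_better=False)
    f = quintile_scores({c: v[1] for c, v in raw.items()})
    m = quintile_scores({c: v[2] for c, v in raw.items()})
    return {c: (r[c], f[c], m[c]) for c in raw}


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    dists = [sum((p - q) ** 2 for p, q in zip(point, c)) for c in centroids]
    return dists.index(min(dists))


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(centroids[j])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return centroids, [nearest(p, centroids) for p in points]


# Hypothetical toy transactions, invented for illustration only.
TRANSACTIONS = [
    ("c1", date(2020, 6, 28), 500.0), ("c1", date(2020, 6, 1), 300.0),
    ("c1", date(2020, 5, 3), 250.0),
    ("c2", date(2020, 6, 20), 90.0), ("c2", date(2020, 4, 11), 60.0),
    ("c3", date(2020, 1, 15), 20.0),
    ("c4", date(2020, 3, 2), 45.0),
    ("c5", date(2020, 6, 25), 700.0), ("c5", date(2020, 6, 10), 650.0),
    ("c5", date(2020, 5, 20), 400.0), ("c5", date(2020, 5, 1), 300.0),
]

scores = rfm_scores(TRANSACTIONS, today=date(2020, 6, 30))
customers = list(scores)
centroids, labels = kmeans([scores[c] for c in customers], k=2)
cluster_of = dict(zip(customers, labels))

# Assumed prediction rule: a new customer joins the closest existing centroid.
new_cluster = nearest((5, 5, 5), centroids)
```

Scoring by rank requires sorting the whole customer base, which mirrors the recomputation cost the abstract mentions. Assigning a new customer to the nearest existing centroid avoids re-running the clustering, but it still needs an RFM score for that customer, so it should be read as a starting point only.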

URI

http://keep.hcmiu.edu.vn:8080/handle/123456789/4580

Collections

Bachelor Thesis - Mathematics