Applying Latent Dirichlet Allocation in Spark to recommendation systems
Abstract
Since early times, books have been among the most valuable resources for many purposes, from history and science to self-improvement. Books were once hard to come by, because few people read them or had the privilege of a serious education. Today, publishers print many kinds of books in large numbers of copies, and readers can easily search for a book they want and find plenty of reviews across the Internet, which gives them far more options than before. Yet with so many books and so many reviews of each, finding the right kind of book, or one specific title, can be extremely difficult because of information overload. As a result, many customers miss the product they wished for and never purchase it.
Consequently, this research not only analyzes the issue but also demonstrates a solution: organizing the information about the books and their reviews with an effective clustering algorithm, specifically Latent Dirichlet Allocation (LDA) in Spark. Based on the resulting clusters, the system recommends the types of books that match the customer's choices or preferences. Such a recommendation system helps companies increase sales by offering customers book categories related to their choices, and it helps customers enjoy discovering new products, since they are shown mainly the items they want.
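To make the recommendation step concrete, the following is a minimal sketch and not the implementation used in this research. It assumes that Latent Dirichlet Allocation (LDA) has already assigned each book a topic mixture from its reviews (in Spark ML, for example, the fitted LDA model's `transform` output exposes such a vector in its `topicDistribution` column by default). The book titles, the three-topic vectors, the choice of cosine similarity, and the helper names `cosine` and `recommend` are all hypothetical and serve only as illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two topic-mixture vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(liked_title, book_topics, top_n=2):
    """Return the top_n books whose topic mixture is closest to the liked book.

    Assumption: book_topics maps each title to the topic distribution that an
    LDA model inferred from that book's reviews.
    """
    target = book_topics[liked_title]
    scored = [
        (cosine(target, vec), title)
        for title, vec in book_topics.items()
        if title != liked_title  # never recommend the book the customer already chose
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [title for _, title in scored[:top_n]]


# Hypothetical topic mixtures over three topics (illustrative values, not real data).
book_topics = {
    "Book A": [0.80, 0.15, 0.05],
    "Book B": [0.70, 0.20, 0.10],
    "Book C": [0.05, 0.10, 0.85],
    "Book D": [0.10, 0.80, 0.10],
}

print(recommend("Book A", book_topics))
```

With these made-up vectors, a customer who liked "Book A" is first offered "Book B", whose reviews load on the same dominant topic. Other similarity measures or a per-topic grouping could be substituted without changing the overall idea.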