Mining big data with WEKA, a case study: Mining book features for predicting book ratings
Abstract
At a time when those who have more data have a better chance of success, data undeniably plays an important role in every aspect of our lives. We use data to study, to do business, to forecast the weather, to diagnose people's health and so on. However, the volume of information that needs to be processed is now so large that traditional methods cannot handle it properly. Methods and tools suited to Big Data have therefore been developed, and WEKA is one of the tools used for this purpose.
This thesis presents some aspects of Big Data and data mining with WEKA, an application that provides many features for working with data, such as classification and clustering. My demo makes use of the classification features provided in WEKA.
The thesis also provides enough background on WEKA to show how a model is generated by a given classifier, how the model is evaluated on test data, and how it is later applied to a book recommender system whose dataset contains a large number of user reviews.