A Machine Learning-Based Approach to Predicting Cervical Cancer
Abstract
Cervical cancer remains one of the leading causes of cancer-related death among
women worldwide. Addressing it requires a comprehensive strategy that includes prevention,
early diagnosis, screening, and treatment programs. Technology can support such a
strategy over the long term. By identifying risk
patterns from medical records, machine learning-based predictive models have
demonstrated promise in predicting patient outcomes, leading to higher survival rates
through early detection. The right machine learning techniques must be used for accurate
cancer diagnosis. In this thesis, we apply the Logistic Regression
(LR), Random Forest (RF), Support Vector Machine (SVM), and Extreme Gradient
Boosting (XGBoost) algorithms, together with an ensemble method (Bagging), to cervical cancer
prediction. The dataset used in this study has missing values and is highly
imbalanced; the Synthetic Minority Over-sampling Technique (SMOTE) is used to address
the class imbalance, which improved the detection of cervical cancer in patients. In order to
reduce the complexity of the classifier and the amount of computational work needed,
feature selection is a crucial pre-processing step that is frequently used to determine the
most important input characteristics. Grid search and random search are then used to
tune hyperparameters and further improve the results. The proposed model is the ensemble method
(Bagging), which combines XGBoost with an LR meta-learner, because it produced the
best results. With an accuracy of 98.83% and an F1 score above 80%, the
results are encouraging. They suggest that
future work should refine these techniques to further improve
the accuracy of cervical cancer prediction.
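
To illustrate the oversampling idea behind SMOTE, the following is a minimal sketch in plain Python and not the implementation used in this thesis. It assumes that missing values have already been imputed, since the distance computation needs complete feature vectors. The function name smote, its parameters, and the toy data are hypothetical and chosen only for illustration. Each synthetic sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    minority: list of complete (already imputed) numeric feature vectors.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (Euclidean distance).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        # Interpolate at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 20 majority samples, 4 minority samples.
majority = [[float(i), float(i % 3)] for i in range(20)]
minority = [[1.0, 1.0], [1.5, 1.2], [2.0, 0.8], [1.2, 1.6]]

new_points = smote(minority, n_new=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In practice a maintained implementation, such as the SMOTE class in the imbalanced-learn library, would normally be preferred, and oversampling would be applied to the training split only so that synthetic samples do not leak into the evaluation data.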