Predictive models for equipment fault detection: Application in semiconductor industry
Abstract
Predictive Maintenance (PdM) in manufacturing using Machine Learning (ML)
techniques has received much attention from researchers and practitioners in
recent years, especially in capital-intensive industries such as semiconductor
manufacturing. Building a decision tool that detects problems in equipment or processes
in the semiconductor industry as promptly as feasible, in order to maintain high process
efficiency, is a crucial step in process control for cost reduction. However, the imbalanced
class distribution of the dataset makes it challenging to build a predictive model. In this study, an experiment
comparing several data balancing techniques for fault detection in predictive
maintenance is conducted. The purpose is to build a high-quality predictive model and
investigate how different approaches in the data preprocessing phase and the data
balancing phase affect classification performance. The procedure is examined on three
different cases corresponding to three different data preprocessing approaches with five
different machine learning algorithms. For the SECOM dataset, it is found that models
trained on data balanced with these techniques perform better, and that the dataset
yields strong results only when the data is balanced. More importantly,
the best model is identified as the one with the optimal performance in terms of Recall
score, together with the highest TPR and lowest FPR. Among the models compared, Logistic
Regression trained with SMOTE as the data balancing method and Mutual Information as the
data preprocessing approach provides the best classification performance on this dataset. This
study's contribution is to provide a clear overview of several predictive approaches
built with ML algorithms on this imbalanced dataset, with a focus on semiconductor
data applications. The research assesses the performance of some commonly used
techniques and provides an overview of machine learning and predictive maintenance.
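
As a minimal illustration of the evaluation metrics named above (a sketch, not the study's own code), the following Python snippet computes the true positive rate (TPR) and false positive rate (FPR) from binary labels. The function names and the toy labels are hypothetical and chosen only for this example. By the standard definitions, Recall and TPR are the same quantity, TP / (TP + FN), while FPR is FP / (FP + TN).

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN for a binary problem (hypothetical helper)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def tpr_fpr(y_true, y_pred, positive=1):
    """Return (TPR, FPR); TPR is identical to Recall for the positive class."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Toy, made-up labels: 2 faults (1) among 10 runs, mimicking class imbalance.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
print(tpr_fpr(y_true, y_pred))    # (0.5, 0.125)

# A model that never predicts a fault is 80% accurate here but has TPR 0.
print(tpr_fpr(y_true, [0] * 10))  # (0.0, 0.0)
```

The second call shows why Recall (TPR) is a more informative criterion than accuracy on an imbalanced dataset: a classifier that misses every fault can still look accurate.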