TY - JOUR
T1 - Performance Evaluation of Supervised Learning Model Based on Functional Data Analysis and Summary Statistics
AU - Ju, Yonghan
AU - Lee, Yung Seop
N1 - Publisher Copyright:
© 1988-2012 IEEE.
PY - 2024
Y1 - 2024
AB - The Fourth Industrial Revolution offers companies an opportunity to improve their competitiveness through data analytics. In particular, real-time analysis of data gathered from various sensors is of interest to manufacturing companies that aim to use the captured data to develop more robust monitoring systems. Trace data related to real-time processes have therefore attracted attention in various fields. However, exploiting large amounts of trace data requires high-performance smart infrastructure. To this end, this study proposes statistics that incorporate the characteristics of trace data based on functional data analysis (FDA) and applies them to supervised learning. The empirical test results indicate that the proposed model, which uses the functional principal components of FDA, exhibits a significantly lower misclassification rate than the summary statistics-based model. Moreover, the FDA-based supervised model is less complex, and its number of explanatory variables varies less with the sample size of the training data. When summary statistics were used as well, the FDA variables were likely to be selected as important variables in the least absolute shrinkage and selection operator (LASSO) model. The results of this study may assist various industries that aggregate trace data for anomaly detection and intelligent factory management.
KW - Cyber-physical systems
KW - functional data analysis
KW - manufacturing industry
KW - trace data
UR - http://www.scopus.com/inward/record.url?scp=85203637873&partnerID=8YFLogxK
U2 - 10.1109/TSM.2024.3452947
DO - 10.1109/TSM.2024.3452947
M3 - Article
AN - SCOPUS:85203637873
SN - 0894-6507
JO - IEEE Transactions on Semiconductor Manufacturing
JF - IEEE Transactions on Semiconductor Manufacturing
ER -