TY - JOUR
T1 - Application of automated machine learning and clustering algorithm for data-driven site characterization
T2 - Predicting the soil-rock interface
AU - Lim, Dongwoo
AU - Goo, Mijin
AU - Kim, Han Saem
AU - Ku, Taeseo
N1 - Publisher Copyright: © 2025 Techno-Press, Ltd.
PY - 2025/9/10
Y1 - 2025/9/10
AB - The development of underground spaces requires detailed insight into subsurface conditions, particularly the soil-rock interfaces, as this information is crucial for the effective design and safe construction of underground infrastructure. Traditional geotechnical site investigations rely mainly on direct drilling and sampling; however, these methods yield data only at specific investigation points, which limits how comprehensively they can capture ground conditions across an entire area. To address this limitation, various studies have aimed to predict unknown subsurface sections using existing borehole data. Conventional methods use geospatial interpolation, while machine learning has emerged as a strong alternative. The selection and proper tuning of an appropriate model are critical to achieving optimal performance. This study applies automated machine learning, focusing on predicting soil-rock interfaces in unsampled regions using borehole data. AutoGluon is used as the machine learning framework to automate data preprocessing, model selection, hyperparameter tuning, and model ensembling. For this study, approximately 20,000 boreholes from the Seoul metropolitan area were collected and employed. Additionally, various digital maps were used to extract input variables. To capture non-linearity among input variables, Uniform Manifold Approximation and Projection was employed to reduce the dimensionality of the dataset, while Hierarchical Density-Based Spatial Clustering of Applications with Noise was implemented as the clustering algorithm. When compared to a model tuned using Bayesian optimization, AutoGluon exhibited superior predictive performance and reduced errors. Furthermore, although the focus of this study is on predicting the soil-rock interface, the methodology can be extended to the prediction of other geotechnical parameters.
KW - automated ML
KW - clustering
KW - data-driven
KW - soil-rock interface
KW - spatial prediction
UR - https://www.scopus.com/pages/publications/105015997242
U2 - 10.12989/gae.2025.42.5.321
DO - 10.12989/gae.2025.42.5.321
M3 - Article
AN - SCOPUS:105015997242
SN - 2005-307X
VL - 42
SP - 321
EP - 332
JO - Geomechanics and Engineering
JF - Geomechanics and Engineering
IS - 5
ER -