Assessing the environmental determinants of micropollutant contamination in streams using explainable machine learning and network analysis

Min Jeong Ban, Dong Hoon Lee, Byung Tae Lee, Joo Hyon Kang

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Even at trace concentrations, micropollutants, including pesticides and pharmaceuticals, pose considerable ecological risks, and the increasing presence of synthetic chemical substances in aquatic systems has emerged as a growing concern. Moreover, few machine-learning (ML) approaches exist for analyzing environmental data, and the increasing complexity of ML models has made it difficult to understand predictor-outcome relationships, in particular the complex interactions among multiple variables. This study applies and integrates explainable ML techniques and network analysis to identify the sources of micropollutants in a large watershed and determine the factors affecting micropollutant levels. We assessed the performance of four ML algorithms (support vector machine, random forest, extreme gradient boosting (XGB), and autoencoder-XGB) in predicting micropollutant levels based on the spatial characteristics of the watershed. We applied the synthetic minority oversampling technique to address the data imbalance. The XGB model demonstrated superior predictive performance, particularly for high concentration levels, achieving an accuracy of 87%–99%. Shapley additive explanations (SHAP) analysis identified temperature and rainfall as significant factors. Moreover, agricultural activities contributed to pesticide pollution, whereas urban activities contributed to pharmaceutical contamination. The network analysis corroborated the SHAP findings and revealed event-specific contamination characteristics, including distinct discharge pathways during a dry summer event and shared pathways during a wet winter event. This approach enhances the understanding of contamination sources and pathways and thereby aids in developing control measures and making informed policy decisions to preserve water quality in mixed land-use areas.
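
The attribution idea behind the SHAP analysis mentioned in the abstract can be illustrated with a minimal sketch. It computes exact Shapley values by brute force for a hypothetical three-feature toy model; it is not the authors' pipeline, and the SHAP library itself uses faster approximations for tree models such as XGB. The function `shapley_values`, the model `toy_model`, the feature names, and all coefficients and input values are assumptions made up for illustration, not results from the study.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input.

    Features outside a coalition are replaced by their baseline value
    (one common, simplified way to represent a "missing" feature).
    """
    n = len(x)

    def value(coalition):
        # Build an input that uses x for coalition members, baseline otherwise.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical predictor with an interaction term; coefficients are invented.
    temperature, rainfall, cropland = z
    return 0.5 * temperature + 0.2 * rainfall + 0.1 * rainfall * cropland


x = [20.0, 10.0, 3.0]         # hypothetical sample to explain
baseline = [10.0, 0.0, 0.0]   # hypothetical reference input
phi = shapley_values(toy_model, x, baseline)

for name, contribution in zip(["temperature", "rainfall", "cropland"], phi):
    print(f"{name}: {contribution:.2f}")
```

The contributions sum to the difference between the prediction for `x` and the prediction for `baseline`, and the interaction term is split equally between the two features involved. The brute-force loop grows exponentially with the number of features, so it is only practical for a handful of predictors.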

Original language: English
Article number: 144041
Journal: Chemosphere
Volume: 370
State: Published - Feb 2025

Keywords

  • Network analysis
  • Pesticide
  • Pharmaceutical
  • SHAP
  • XGB
