Deep learning-based prediction of exceeding the criteria for river chlorophyll a concentrations using high-frequency data from a sensor network
Gunhyeong Lee1, Jihoon Shin1, Young Woo Kim1, Eun Jin Han2, Chung Seok Yu2, Taeho Kim3, and YoonKyung Cha1†
1School of Environmental Engineering, University of Seoul, Dongdaemun-gu, Seoul, 02504, Republic of Korea
2Water Environmental Research Department, National Institute of Environmental Research, Hwangyeong-ro 42, Seogu, Incheon, 22689, Republic of Korea
3Civil and Environmental Engineering Department, University of Michigan, United States
Corresponding Author:
YoonKyung Cha, Tel: +82-2-6490-2872, Email: ykcha@uos.ac.kr
Received: May 14, 2024; Accepted: January 24, 2025.
ABSTRACT
Sensor networks enable the collection of large, high-frequency water quality datasets that provide valuable information for managing eutrophication, such as chlorophyll a (Chl-a) concentrations. Deep learning models have been successfully applied to derive useful insights from large-scale environmental data. However, sensor data often contain missing values, which complicates the application of deep learning models. To address this, we employed the reverse time attention model with a decay mechanism (RETAIN-D) to conduct feature engineering, prediction, and interpretation simultaneously within a single model structure. Environmental, hydrological, and meteorological variables were used as input features to predict the exceedance of Chl-a criteria. Data were collected from 2018 to 2022 at four monitoring sites along the Geum River, South Korea. RETAIN-D demonstrated strong prediction performance on the test set (accuracy = 0.84–0.90, AUC = 0.69–0.91, F-measure = 0.89–0.90) across varying Chl-a criteria. Environmental variables were more important than hydrological and meteorological variables for predicting the exceedance of Chl-a criteria. The contribution of input features to the model prediction was generally higher at more recent time steps when the Chl-a criterion of the target site was applied. These results highlight the effectiveness of RETAIN-D in analyzing high-frequency time series data from sensor networks.
Keywords:
Chlorophyll a | Decay mechanism | Eutrophication | Explainable artificial intelligence | Reverse time attention mechanism | Sensor network