Environ Eng Res > Volume 30(6); 2025 > Article
Kim, Park, Singh, and Kim: AI based prediction of wastewater treatment plant effluent to supplement the minimal instream flow in the Han River

Abstract

Securing the minimum instream flow is crucial for utilizing rivers as sustainable water resources and maintaining a resilient ecosystem. To monitor its contribution to the minimum instream flow, the effluent discharged from the J wastewater treatment plant (WWTP) near Hangang Bridge on the Han River (Seoul, South Korea) was predicted using a nonlinear autoregressive exogenous (NARX) model and a support vector regression model with a radial basis function kernel (SVR-RBF). First, the discharge flow rate of J WWTP was predicted from the influent water quality parameters (i.e., BOD5, COD (or TOC), TN, and TP) and local meteorological data (i.e., humidity and precipitation). The models were then coupled with principal component analysis (PCA) to optimize the input parameters more accurately. Simulation without PCA indicated that SVR-RBF outperformed NARX, achieving RMSE = 1.73%, MAE = 1.23%, and SCC = 0.53. Coupling with PCA improved the prediction accuracy of both models, and SVR-RBF remained more accurate than NARX. It is concluded that SVR-RBF coupled with PCA is the most accurate of the tested approaches for predicting the WWTP discharge and its influence on the minimum instream flow.

1. Introduction

Minimum instream flow ensures that a certain level of aquatic habitat in the river environment is securely preserved [1]. It helps keep river systems resilient and sustainable so that they can supply domestic, industrial, and agricultural water and support power generation and transportation. WWTP effluent discharged into rivers contributes a portion of the minimum instream flow. However, urbanization and extreme precipitation patterns (e.g., flood and drought) resulting from climate change may upset the balance between water supply and demand, eventually reducing river flows below the minimum. For example, urbanization increases impervious surfaces, which can reduce groundwater recharge and consequently decrease river baseflow [2]. For this reason, rivers flowing through urban areas rely heavily on WWTP discharge to satisfy their instream flow requirements [3]. Therefore, it is imperative to predict and monitor WWTP discharge to maintain the minimum instream flow in nearby rivers.
Artificial intelligence (AI) can efficiently process and analyze relevant data to simulate WWTP discharge through machine learning and deep learning technologies. Li et al. [4] predicted WWTP discharge from domestic and industrial water consumption, greenbelt area, and annual precipitation using stochastic gradient boosting regression, support vector machine, and multivariate adaptive regression splines, and reported high prediction accuracy. Mansour-Bahmani et al. [5] estimated the WWTP discharge in Kerman City, Iran, using two models: a multi-layered perceptron neural network (MLPNN) and genetic programming (GP). The MLPNN consisted of first and second hidden layers with seven and five neurons, respectively, while the GP was composed of three genes, each with two time lag units. The GP was slightly less accurate than the MLPNN but was found to be more effective for practical, real-world use. This is because the MLPNN is a black-box model: its numerous interacting parameters make it impossible to explain the prediction process mathematically and explicitly. More generally, the calculations inside artificial neural networks are hard to trace because they rely on very complex algorithms for training and decision making. In this regard, the GP remains attractive because its three genes can be expressed as explicit equations.
Nonlinear autoregressive exogenous (NARX) and support vector regression (SVR) models are widely applied in hydrological modeling because they handle complex, nonlinear relationships and dynamic system behaviors better than conventional models [6]. Both are time series prediction models whose performance has been statistically compared in various studies. Guzman et al. [7] evaluated NARX and SVR-RBF for predicting groundwater levels in farm wells located in the highly productive agricultural regions of the southeastern USA, using several years of daily historical data. SVR-RBF showed better prediction performance in terms of mean squared error (MSE) for both the summer and winter seasons. Koschwitz et al. [8] evaluated the prediction performance for building heating and cooling loads using four models: single-layer NARX, two-layer NARX, SVR-RBF, and SVR-Polynomial. Single-layer NARX showed the highest prediction performance, exceeding SVR-RBF and SVR-Polynomial, although two-layer NARX and SVR-RBF also performed well. Notably, two-layer NARX performed worse than single-layer NARX in terms of MSE and mean absolute error (MAE). This was attributed to the increased depth of the network, since the additional hidden layer in NARX can degrade its performance.
Due to rapid urbanization and climate change, countermeasures for maintaining proper river flow are becoming more urgent. In large cities, river maintenance flows have become increasingly dependent on WWTP discharge. It was therefore necessary to predict how much the discharge from J WWTP (Seoul, South Korea) contributes to the minimum instream flow of the Han River near the Hangang Bridge (63.5 m3/s, recorded by the Ministry of Environment, South Korea in 2006), which can support more sustainable river management that at least maintains this minimum. To predict the daily discharge from J WWTP from the influent water quality parameters and local meteorological data, the NARX model and the SVR model with a radial basis function kernel (SVR-RBF) were implemented both alone and together with principal component analysis (PCA). Their prediction performance was then compared statistically using root mean square error (RMSE), mean absolute error (MAE), and Spearman’s rank correlation coefficient (SCC). All modeling was run in MATLAB version R2022b. The results are intended to serve as a decision-making tool for judging whether WWTP discharge can supplement the minimum instream flow, so that decision-makers can formulate future water security strategies.

2. Materials and Methods

2.1. Data Manipulation

2.1.1. Description of the dataset

J WWTP, located upstream on the Han River, discharges approximately 1.59 million tons of effluent per day near the Hangang Bridge, as shown in Fig. 1(a). The raw sewage is treated by three different processes: the conventional activated sludge (CAS) process, the anaerobic-anoxic-oxic (A2O) process, and the modified Ludzack-Ettinger (MLE) process, as shown in Fig. 1(b).
To predict discharge from the WWTP, daily influent water quality parameters and meteorological data obtained at the J WWTP from January 2016 to December 2021 were used as input variables in the model. The influent water quality parameters were five-day biochemical oxygen demand (BOD5), chemical oxygen demand (COD), total organic carbon (TOC), total nitrogen (TN), and total phosphorus (TP), provided by the Seoul Open Plaza [9]. Meteorological data (humidity and precipitation) were collected from the Korea Meteorological Administration [10]. Of the organic contaminant indicators, COD alone was used before January 2021 and TOC was used as the input parameter thereafter, because the Water Environment Conservation Act, reinforced in January 2021, replaced COD with TOC. In addition, the inflow and outflow rates were assumed to be equal on the premise that wastewater loss during treatment is negligible, so the influent flow rate of the WWTP obtained from Seoul Open Plaza was used to verify the model predictions.
Table 1 summarizes the correlation between the discharge flow rate and the water quality parameters and meteorological data over 5 years. The influent quality parameters showed a moderate negative correlation with the discharge, with correlation coefficients (R) of −0.32 to −0.47. This is attributed to dilution: the WWTP is served by a combined sewer, so rainwater flows into the process line during rainfall events and lowers the influent concentrations. For the same reason, humidity and precipitation correlated positively with the discharge, especially during rainfall [11].

2.1.2. Data preprocessing

The raw data may contain noise and missing values, which can lead to inaccurate prediction. To address this, data lying more than three standard deviations (3σ) from the mean were excluded from the simulation [12], and missing data were filled by linear interpolation [13]. Moreover, because the input parameters have different units, the data were standardized to a distribution with a mean of 0 and a standard deviation of 1 according to Eq. (1) [14].
(1)
$$x' = \frac{x - m}{\sigma}$$
where $x'$ is the standardized data, $x$ is the observed data, $m$ is the mean, and $\sigma$ is the standard deviation.
The data-driven NARX and SVR models require at least training and testing datasets, so these preprocessed data were divided into two groups: for training from 2016 to 2020 and testing from 2021 onwards.
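The preprocessing steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' MATLAB implementation: the function names are hypothetical, the population standard deviation is assumed, and outliers are assumed to be treated as missing values and then interpolated (the text does not say how excluded points were handled).

```python
from statistics import mean, pstdev


def drop_outliers(values, k=3.0):
    """Mark points more than k standard deviations from the mean as missing (None)."""
    present = [v for v in values if v is not None]
    m, s = mean(present), pstdev(present)
    return [None if v is None or (s > 0 and abs(v - m) > k * s) else v for v in values]


def interpolate_linear(values):
    """Fill None gaps by linear interpolation between the nearest known neighbours."""
    out = list(values)
    known = [i for i, v in enumerate(out) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (out[right] - out[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = out[left] + step * (i - left)
    return out  # leading or trailing gaps stay None in this sketch


def standardize(values):
    """Apply Eq. (1): subtract the mean and divide by the standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]
```

A typical call chain would be `standardize(interpolate_linear(drop_outliers(series)))`, applied separately to each input variable.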

2.2. Prediction Performance Evaluation

The prediction performance of the model developed in this study was evaluated by three different statistical measures: root mean square error (RMSE), mean absolute error (MAE), and Spearman’s rank correlation coefficient (SCC), as shown in Eqs. (2) to (4). RMSE reflects the accuracy for peak values, since it weights large errors more heavily, while MAE reflects the overall accuracy of the predicted data against the actual data [15]. SCC determines the correlation between predicted and actual data [16]. The smaller the RMSE and MAE values and the closer the SCC value is to 1, the better the predictive performance.
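(2)
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
(3)
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
(4)
$$\mathrm{SCC} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(z_i - \bar{z})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(z_i - \bar{z})^2}}$$
Here, $n$ is the number of samples, $y_i$ is the ith observed data, $\hat{y}_i$ is the ith predicted data, $x_i$ is the rank of the ith data in variable x, $\bar{x}$ is the mean rank of variable x, $z_i$ is the rank of the ith data in variable z, and $\bar{z}$ is the mean rank of variable z.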
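As a minimal sketch of Eqs. (2) to (4), the three measures can be computed as follows. The function names are illustrative, and ties are assumed to receive average ranks, which the text does not specify. The values returned are in the units of the input data; how the percentage values reported later were normalized is not covered here.

```python
import math


def rmse(observed, predicted):
    """Eq. (2): root mean square error."""
    n = len(observed)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Eq. (3): mean absolute error."""
    n = len(observed)
    return sum(abs(y - p) for y, p in zip(observed, predicted)) / n


def average_ranks(values):
    """Rank from 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def scc(observed, predicted):
    """Eq. (4): Pearson correlation computed on the ranks of the two series."""
    x, z = average_ranks(observed), average_ranks(predicted)
    xm, zm = sum(x) / len(x), sum(z) / len(z)
    num = sum((a - xm) * (b - zm) for a, b in zip(x, z))
    den = math.sqrt(sum((a - xm) ** 2 for a in x) * sum((b - zm) ** 2 for b in z))
    return num / den
```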

2.3. NARX Neural Network

The NARX model is a recurrent neural network (RNN), a class of artificial neural network (ANN). It is reported to be more effective than other ANN models because it converges faster to the optimal connection weights between neurons and inputs, and it is much better at capturing long-term dependencies than a conventional RNN [17]. In particular, the output time lag of NARX helps retain long-term dependencies that would otherwise be lost to the vanishing gradient problem, which arises as the number of neurons and layers increases [18, 19]. The NARX model is formulated in Eq. (5):
(5)
$$y(t+1) = f\left[y(t), y(t-1), \ldots, y(t-n_y), u(t), u(t-1), \ldots, u(t-n_u)\right]$$
where $f$ is the nonlinear function, $y$ is the output variable, $t$ is the time, $n_y$ is the output time lag, $u$ is the input variable, $n_u$ is the input time lag, and $y(t+1)$ is the prediction one day ahead. The hyperparameters optimized for the NARX model were the input time lag ($n_u$), the output time lag ($n_y$), the training algorithm, and the number of hidden neurons.
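To make the structure of Eq. (5) concrete, the sketch below builds the lagged regressor rows that a one-step-ahead NARX model would consume. It only prepares inputs and targets; the nonlinear function f (the neural network itself) is not implemented, and the function name and data layout are assumptions for illustration.

```python
def narx_regressors(y, u, n_y, n_u):
    """Build (features, targets) for one-step-ahead prediction as in Eq. (5).

    y: list of outputs, one per time step.
    u: list of exogenous input vectors, one list per time step.
    Each feature row is [y(t), ..., y(t-n_y), u(t), ..., u(t-n_u)] flattened,
    and the matching target is y(t+1).
    """
    start = max(n_y, n_u)  # first t for which all lagged values exist
    rows, targets = [], []
    for t in range(start, len(y) - 1):
        past_y = [y[t - k] for k in range(n_y + 1)]
        past_u = [value for k in range(n_u + 1) for value in u[t - k]]
        rows.append(past_y + past_u)
        targets.append(y[t + 1])
    return rows, targets
```

For example, with `n_y = 2` and `n_u = 1` on a six-step series, the first usable row is at t = 2 and three training pairs are produced.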

2.4. SVR-RBF Model

The support vector machine (SVM), a machine learning model originally used for binary classification, has been extended to regression as the SVR model [20, 21]. SVR starts from training data $\{(x_1, y_1), \ldots, (x_m, y_m)\}$, $x_i \in \mathbb{R}^n$, $y_i \in \mathbb{R}$, where $x_i$ is the vector of input variables (i.e., BOD5, COD/TOC, TN, TP, humidity, and precipitation), $y_i$ is the output variable (i.e., discharge), $n$ is the number of input variables, and $m$ is the number of data. It seeks a regression function f(x), as in Eq. (6), that deviates from the observed targets by at most ɛ for all training data. Errors smaller than ɛ are ignored, while errors larger than ɛ are not allowed.
(6)
$$f(x) = w \cdot x + b, \qquad w \in \mathbb{R}^n,\ b \in \mathbb{R}$$
where w is the weight vector and b is the bias term.
To make f(x) as flat as possible, the squared norm of the weight vector w must be minimized (Eq. (7)). Thus, the SVR can be formulated as a convex quadratic optimization problem.
To minimize:
(7)
$$\frac{1}{2}\lVert w \rVert^2$$
Subject to:
(8)
$$w \cdot x_i + b - y_i \le \varepsilon$$
(9)
$$y_i - w \cdot x_i - b \le \varepsilon$$
In practice, some degree of error must be allowed in Eqs. (7) to (9), so slack variables, denoted as ξi and ξi*, are introduced by analogy with the soft margin loss function. Consequently, the SVR with some errors allowed can be formulated as follows:
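To minimize:
(10)
$$\frac{1}{2}\lVert w \rVert^2 + C\sum_{i=1}^{m}\left(\xi_i + \xi_i^*\right)$$
Subject to:
(11)
$$w \cdot x_i + b - y_i \le \varepsilon + \xi_i^*$$
(12)
$$y_i - w \cdot x_i - b \le \varepsilon + \xi_i$$
(13)
$$\xi_i \ge 0,\quad \xi_i^* \ge 0$$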
The constant C (C > 0) sets the trade-off between the flatness of f(x) and the extent to which deviations larger than ɛ are tolerated. This corresponds to the ɛ-insensitive loss function $|\xi|_\varepsilon$, as shown in Eq. (14).
(14)
$$|\xi|_\varepsilon = 0, \quad \text{if } |\xi| \le \varepsilon$$
Otherwise, $|\xi|_\varepsilon$ is given by Eq. (15).
(15)
$$|\xi|_\varepsilon = |\xi| - \varepsilon$$
Points lying outside the band between −ɛ and +ɛ incur a cost: they are penalized linearly in proportion to their deviation beyond the band (Fig. S1). Points within the band between −ɛ and +ɛ incur no penalty, and their error is ignored.
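The ɛ-insensitive loss of Eqs. (14) and (15) and the soft-margin objective of Eq. (10) can be illustrated with a short sketch. The function names are hypothetical, and the objective is evaluated for a given w and b with each slack variable set to its smallest feasible value, which is an assumption made only so the objective can be computed without a solver.

```python
def epsilon_insensitive_loss(residual, epsilon):
    """Eqs. (14)-(15): zero inside the band, linear in the excess outside it."""
    return max(0.0, abs(residual) - epsilon)


def svr_primal_objective(w, b, xs, ys, C, epsilon):
    """Eq. (10) for fixed w and b, with slacks at their smallest feasible values."""
    flatness = 0.5 * sum(wi * wi for wi in w)
    penalty = 0.0
    for x, y in zip(xs, ys):
        prediction = sum(wi * xi for wi, xi in zip(w, x)) + b
        penalty += epsilon_insensitive_loss(y - prediction, epsilon)
    return flatness + C * penalty
```

With ɛ = 0.1, a residual of 0.05 costs nothing while a residual of 1.0 costs 0.9, which is then scaled by C.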
Once the objective function and constraints are specified, the SVR problem can be solved through its Lagrangian dual function, which is derived from the Lagrangian primal function as follows:
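To minimize
(16)
$$L_p = \frac{1}{2}\lVert w \rVert^2 + C\sum_{i=1}^{m}\left(\xi_i + \xi_i^*\right) - \sum_{i=1}^{m}\left(\eta_i \xi_i + \eta_i^* \xi_i^*\right) - \sum_{i=1}^{m}\alpha_i\left(\varepsilon + \xi_i - y_i + w \cdot x_i + b\right) - \sum_{i=1}^{m}\alpha_i^*\left(\varepsilon + \xi_i^* + y_i - w \cdot x_i - b\right)$$
where $L_p$ is the Lagrangian primal function, and $\eta_i$, $\eta_i^*$, $\alpha_i$, and $\alpha_i^*$ are Lagrange multipliers that must be non-negative (Eq. (17)).
(17)
$$\alpha_i,\ \alpha_i^*,\ \eta_i,\ \eta_i^* \ge 0$$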
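Following duality theory, the Lagrangian dual function is obtained by setting the partial derivatives of $L_p$ with respect to b, w, ξi, and ξi* to zero [22].
(18)
$$\frac{\partial L_p}{\partial b} = \sum_{i=1}^{m}\left(\alpha_i^* - \alpha_i\right) = 0$$
(19)
$$\frac{\partial L_p}{\partial w} = w - \sum_{i=1}^{m}\left(\alpha_i - \alpha_i^*\right)x_i = 0$$
(20)
$$\frac{\partial L_p}{\partial \xi_i} = C - \alpha_i - \eta_i = 0$$
(21)
$$\frac{\partial L_p}{\partial \xi_i^*} = C - \alpha_i^* - \eta_i^* = 0$$
By substituting Eqs. (18) to (21) into Eq. (16), the Lagrangian primal function is converted to the Lagrangian dual function, as in Eq. (22).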
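To maximize
(22)
$$L_D = -\frac{1}{2}\sum_{i,j=1}^{m}\left(\alpha_i - \alpha_i^*\right)\left(\alpha_j - \alpha_j^*\right)x_i \cdot x_j - \varepsilon\sum_{i=1}^{m}\left(\alpha_i + \alpha_i^*\right) + \sum_{i=1}^{m}y_i\left(\alpha_i - \alpha_i^*\right)$$
Subject to:
(23)
$$\sum_{i=1}^{m}\left(\alpha_i - \alpha_i^*\right) = 0,\quad \alpha_i,\ \alpha_i^* \in [0, C]$$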
In addition, Eq. (19) can be rewritten as follows:
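(24)
$$w = \sum_{i=1}^{m}\left(\alpha_i - \alpha_i^*\right)x_i,\quad \text{so}\quad f(x) = \sum_{i=1}^{m}\left(\alpha_i - \alpha_i^*\right)x_i \cdot x + b$$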
The SVR model with the error function is finally given by Eq. (24). It can be extended to a nonlinear SVR by using a kernel function with an appropriate mapping ($\varphi: \mathbb{R}^n \to \mathbb{R}^{n_h}$). Training data that are nonlinearly distributed can then be treated linearly in a high-dimensional feature space. This reduces the complexity of the algorithm and in turn enables high prediction performance [23, 24]. The Lagrangian dual of the nonlinear problem, parameterized with a kernel function, is as follows:
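To maximize
(25)
$$L_D = -\frac{1}{2}\sum_{i,j=1}^{m}\left(\alpha_i - \alpha_i^*\right)\left(\alpha_j - \alpha_j^*\right)K(x_i, x_j) - \varepsilon\sum_{i=1}^{m}\left(\alpha_i + \alpha_i^*\right) + \sum_{i=1}^{m}y_i\left(\alpha_i - \alpha_i^*\right)$$
Subject to:
(26)
$$\sum_{i=1}^{m}\left(\alpha_i - \alpha_i^*\right) = 0,\quad \alpha_i,\ \alpha_i^* \in [0, C]$$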
From these, Eq. (19) becomes:
(27)
$$w = \sum_{i=1}^{m}\left(\alpha_i - \alpha_i^*\right)\varphi(x_i),\quad \text{so}\quad f(x) = \sum_{i=1}^{m}\left(\alpha_i - \alpha_i^*\right)K(x_i, x) + b$$
Here, $K(x_i, x)$ is the kernel function, which is the inner product of $\varphi(x_i)$ and $\varphi(x)$ in the feature space. Various kernel functions can be used for nonlinear problems, each providing its own nonlinear mapping [25]. Table 2 shows various types of kernel function equations with their hyperparameters.
Among them, the radial basis function (RBF) kernel was used in this study. The SVR model with the RBF kernel requires two hyperparameters, C and γ [26]: C controls the trade-off between training errors and the margin, and γ controls the degree of nonlinearity of the RBF kernel function [27]. A two-step grid search method was used to optimize these two hyperparameters [28].
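As an illustration of how a trained SVR-RBF model produces a prediction, the sketch below evaluates f(x) as a kernel-weighted sum over support vectors. It assumes the common RBF form K(a, b) = exp(−γ‖a − b‖²), which may differ in notation from the form listed in Table 2, and it takes the dual coefficients (αi − αi*) and the bias b as given rather than solving the dual problem. All names are hypothetical.

```python
import math


def rbf_kernel(a, b, gamma):
    """Assumed RBF form: K(a, b) = exp(-gamma * ||a - b||^2)."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * squared_distance)


def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """f(x) = sum_i (alpha_i - alpha_i*) K(x_i, x) + b.

    dual_coefs[i] holds the difference alpha_i - alpha_i* for support vector i.
    """
    return bias + sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )
```

A larger γ makes the kernel decay faster with distance, so each support vector influences a narrower neighbourhood, which matches the description of γ as controlling the degree of nonlinearity.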

3. Preprocessing for Modeling

3.1. Single NARX Model

3.1.1. Determination of time lag

Cross-correlation analysis and average mutual information (AMI) are commonly used to determine the optimal time lag. Cross-correlation analysis measures the degree of dependency between two time series and thereby identifies the optimal time lag [29, 30], but it captures only linear dependency. In contrast, AMI captures both linear and nonlinear dependency [31]. Because the WWTP discharge data were nonlinear, the time lag of the NARX model was optimized using AMI, with the input time lag (nu) and output time lag (ny) set to be equal [32].
The AMI is the mutual information, I(x(t), x(t + τ)), between the original time series x(t) and the lagged time series x(t + τ) [33], as shown in Eq. (28):
(28)
$$I\left(x(t), x(t+\tau)\right) = \sum_{i,j} p_{ij}(\tau)\,\log\left(\frac{p_{ij}(\tau)}{p_i\,p_j}\right)$$
where $p_i$ is the probability that x(t) is in bin i of the histogram constructed from the data points in x, $p_j$ is the probability that x(t + τ) is in bin j of the histogram constructed from the data points in x, and $p_{ij}(\tau)$ is the probability that x(t) is in bin i and x(t + τ) is in bin j, where τ is the time lag.
From this, the optimal input time lag (nu) and output time lag (ny) were both set to 6, because the first local minimum of the AMI occurred at a lag of 6 (Fig. S2). This choice is expected to help the NARX model better predict the WWTP discharge flow rate.
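A histogram-based estimate of Eq. (28) and the first-local-minimum rule can be sketched as follows. This is an assumption-laden illustration: the number of bins, the equal-width binning, and the use of the natural logarithm are choices made here, not details given in the text, and the marginal probabilities are estimated from the lagged pairs rather than from the full series.

```python
import math
from collections import Counter


def average_mutual_information(series, lag, bins=16):
    """Histogram estimate of I(x(t), x(t+lag)) as in Eq. (28), in nats."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / bins or 1.0  # guard against a constant series

    def bin_of(v):
        return min(int((v - lo) / width), bins - 1)

    pairs = [(bin_of(a), bin_of(b)) for a, b in zip(series, series[lag:])]
    n = len(pairs)
    joint = Counter(pairs)
    p_i = Counter(i for i, _ in pairs)
    p_j = Counter(j for _, j in pairs)
    return sum(
        (c / n) * math.log((c / n) / ((p_i[i] / n) * (p_j[j] / n)))
        for (i, j), c in joint.items()
    )


def first_local_minimum(values):
    """values[0] is the AMI at lag 1; return the lag of the first local minimum."""
    for k in range(1, len(values) - 1):
        if values[k] < values[k - 1] and values[k] <= values[k + 1]:
            return k + 1
    return None
```

In use, one would evaluate `average_mutual_information(series, lag)` for lag = 1, 2, ... and pass the resulting list to `first_local_minimum` to obtain the candidate time lag.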

3.1.2. Training algorithms & hidden neurons

To identify the optimal training algorithm, nine training algorithms were compared: Levenberg-Marquardt (LM), Broyden-Fletcher-Goldfarb-Shanno quasi-Newton (BFGS), resilient backpropagation (RP), scaled conjugate gradient (SCG), conjugate gradient with Powell-Beale restarts (CGB), conjugate gradient with Fletcher-Reeves updates (CGF), conjugate gradient with Polak-Ribière updates (CGP), one-step secant (OSS), and gradient descent with momentum and adaptive learning rate (GDX) [34]. Each algorithm was run 10 times [35]. Fig. 2 compares the RMSE of the WWTP discharge flow rate predicted by the NARX model trained with each of the nine algorithms. SCG gave the lowest median RMSE of 114,280 m3/day (range 104,800 to 117,800 m3/day), whereas RP gave the highest median of 126,695 m3/day (range 111,010 to 251,060 m3/day). The median RMSEs of the nine algorithms were in the following ascending order: SCG < OSS < CGP < BFGS < CGB < CGF < LM < GDX < RP, indicating that the SCG algorithm has the best prediction performance and RP the worst [36]. SCG is a conjugate gradient method that uses step size scaling to avoid a time-consuming line search at each training iteration, which makes it more efficient than other second-order algorithms, consistent with previous studies [37–39].
Determining the optimal number of hidden neurons in a neural network is difficult because no general formula exists. Nevertheless, Yotov et al. [40] derived an optimal number of neurons based on the Jacobian matrix of the loss function for networks trained with three different algorithms (LM, SCG, and BFGS). In this study, the optimal number of hidden neurons was determined following Yotov et al. [40] together with a trial-and-error method. Accordingly, for a neural network with one output neuron and a single hidden layer, the number of hidden neurons is bounded by Eq. (29):
(29)
$$q \le \frac{m - 1}{n + 2}$$
where q is the number of hidden neurons, m is the number of training samples, and n is the number of input stimuli.
In the NARX model trained with the SCG algorithm, consisting of a single output neuron (i.e., WWTP discharge) and a single layer, m and n were 1825 and 6, respectively, so the number of hidden neurons (q) was bounded at 228 or less. Within this limit, the optimal number of hidden neurons was ultimately determined to be 5 by trial and error.
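The bound in Eq. (29) is simple arithmetic, and the following short check reproduces the value quoted above. The function name is illustrative only.

```python
def max_hidden_neurons(m, n):
    """Upper bound from Eq. (29): q <= (m - 1) / (n + 2), rounded down."""
    return (m - 1) // (n + 2)


print(max_hidden_neurons(1825, 6))  # 228
```

With m = 1825 training samples and n = 6 inputs, (1825 − 1) / (6 + 2) = 228, so the trial-and-error search only needs to consider values of q up to 228.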

3.2. PCA-NARX Model

The PCA-NARX model combines PCA with NARX to improve the prediction performance [41, 42]. PCA reduces high-dimensional input variables to low-dimensional ones without losing intrinsic information, which also helps alleviate multicollinearity [43]. It transforms the existing input variables into new, uncorrelated variables called principal components (PCs). PC1 captures the largest share of the variance in the input data, and PC2 the second largest. In other words, the input variables (X1, X2, …, Xn) are transformed into new input variables (U1, U2, …, Un) through PCA, as shown in Fig. 3.
PCA was applied to the preprocessed BOD5, COD/TOC, TN, TP, humidity, and precipitation data. Of the resulting PCs, U1 accounted for 58.45% of the total input variance, while U2, U3, U4, U5, and U6 accounted for 17.02%, 12.64%, 6.43%, 3.07%, and 2.39%, respectively. PCA-NARX was then run in six PCA modes, each using a different number of PCs [32]: PCA1 used one PC (U1), PCA2 two PCs (U1 and U2), PCA3 three PCs (U1, U2, and U3), and so on. The corresponding cumulative variance contribution rates were 58.45%, 75.47%, 88.11%, 94.54%, 97.61%, and 100%, respectively.
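The cumulative variance contribution rates follow directly from the individual contributions, as the short sketch below shows. Only the six percentages come from the text; the function name is illustrative.

```python
from itertools import accumulate


def cumulative_contribution(explained_pct):
    """Running total of the variance share (%) carried by PC1..PCk, to 2 decimals."""
    return [round(total, 2) for total in accumulate(explained_pct)]


explained = [58.45, 17.02, 12.64, 6.43, 3.07, 2.39]  # U1..U6, in %
cumulative = cumulative_contribution(explained)
```

The third entry of `cumulative` corresponds to PCA3, the mode using U1 to U3, which carries about 88% of the total input variance.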
Fig. 4 shows the RMSE of the WWTP discharge flow rate simulated by PCA-NARX for each PCA mode. PCA3 had the lowest RMSE of 107,930 m3/day, suggesting that it is the best fit for the PCA-NARX model under the given conditions; the RMSE was higher when either fewer or more PCs were used. To check the fit of each PCA mode, the coefficient of determination (R2) was compared between training and testing. It ranged narrowly from 0.68 to 0.71 and from 0.68 to 0.72, respectively, suggesting that none of the models was over- or underfitted. Finally, the hyperparameters for the PCA-NARX simulation were set as follows: input time lag = output time lag = 6, training algorithm = SCG, and number of hidden neurons = 4.

3.3. SVR-RBF Model

The hyperparameters C and γ of the SVR-RBF model were optimized using a two-step grid search method. Grid search finds the optimal hyperparameter pair by systematically exploring a predefined, evenly spaced hyperparameter space [44]. Because an exhaustive grid search is computationally expensive, the two-step grid search method has been widely used [45]. It first identifies the best region of the grid through a coarse grid search and then runs a finer grid search within that region to find the optimal hyperparameter pair [46].
Validation followed a rolling window technique rather than k-fold cross-validation. k-fold cross-validation evaluates how well a trained model performs on unseen data by dividing the dataset into multiple folds without regard to their order, so that each fold serves in turn for training or for validation. Repeating this over the folds helps optimize a hyperparameter pair while preventing overfitting during training [47]. However, k-fold cross-validation treats the data as independent and is therefore not suited to time series. In the rolling window technique, the model is trained on one window of data and validated on the window that immediately follows, with the ratio of training to validation data kept constant. As the window moves forward, the earliest training data drop out of the simulation [48].
Accordingly, the initial two years of data were used as the training set and the following year as the validation set, and this was repeated over four folds by moving the window forward. First, the coarse grid search was run on C and γ over 10^−2, 10^−1, …, 10^5; the optimum was C = 10^−1 and γ = 10, giving a minimum RMSE of 95,879 m3/day. Second, the finer grid search was run over C = 10^−2, 10^−1.8, …, 10^0 and γ = 10^0, 10^0.2, …, 10^2; the best performance was observed at C = 10^−0.8 and γ = 10, with a minimum RMSE of 94,792 m3/day.
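The combination of rolling-window validation and a coarse-then-fine grid search can be sketched as follows. This is a hedged outline, not the authors' MATLAB code: `score` is a hypothetical user-supplied function returning the validation error (for example, the mean RMSE over the rolling windows) for a given pair (C, γ), and refining within one decade on either side of the coarse optimum is an assumption chosen to match the grid ranges quoted above.

```python
import math


def rolling_windows(periods, train_len=2, val_len=1):
    """Yield (train, validation) slices; the oldest period drops out as the window advances."""
    for start in range(len(periods) - train_len - val_len + 1):
        train = periods[start:start + train_len]
        val = periods[start + train_len:start + train_len + val_len]
        yield train, val


def log_grid(lo_exp, hi_exp, step):
    """Powers of ten from 10**lo_exp to 10**hi_exp, spaced by `step` in the exponent."""
    count = int(round((hi_exp - lo_exp) / step)) + 1
    return [10 ** (lo_exp + i * step) for i in range(count)]


def grid_search(score, c_values, gamma_values):
    """Return (best_score, C, gamma) minimising score(C, gamma)."""
    return min((score(c, g), c, g) for c in c_values for g in gamma_values)


def two_step_grid_search(score, coarse=(-2, 5, 1.0), fine_step=0.2):
    """Coarse search over decades, then a finer search around the coarse optimum."""
    _, c0, g0 = grid_search(score, log_grid(*coarse), log_grid(*coarse))
    ec, eg = math.log10(c0), math.log10(g0)
    # Assumption: refine within one decade either side of the coarse optimum.
    return grid_search(score,
                       log_grid(ec - 1, ec + 1, fine_step),
                       log_grid(eg - 1, eg + 1, fine_step))
```

The two-step design evaluates 8 × 8 coarse pairs plus 11 × 11 fine pairs, far fewer than a single grid at 0.2-decade resolution over the full range would require.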

3.4. PCA-SVR-RBF Model

The PCA-SVR-RBF model used the input variables U1, U2, and U3 derived from PCA, as obtained in Section 3.2. The hyperparameters C and γ were optimized through a coarse grid search followed by a finer grid search, using the rolling window technique. As with SVR-RBF, the coarse grid search gave C = 10^−1 and γ = 10, but its minimum RMSE of 95,155 m3/day was slightly lower than that of SVR-RBF. The finer grid search showed the best performance at C = 10^−0.2 and γ = 10^0.4, with a minimum RMSE of 91,168 m3/day, which was likewise lower than that of SVR-RBF.

4. Results and Discussion

4.1. Comparison of Prediction Performance Among NARX, PCA-NARX, SVR-RBF, and PCA-SVR-RBF Models

The performance of the NARX, PCA-NARX, SVR-RBF, and PCA-SVR-RBF models was compared to evaluate how much J WWTP discharge could contribute to maintaining the minimum instream flow (63.5 m3/s at the Hangang Bridge) (Fig. 5). The ratio of WWTP discharge to the minimum instream flow, given as the average (minimum to maximum), was 22.90% (19.60% to 27.22%) for NARX; 22.35% (20.68% to 25.76%) for PCA-NARX; 22.09% (19.97% to 28.73%) for SVR-RBF; and 22.40% (20.12% to 30.66%) for PCA-SVR-RBF. This implies that the WWTP discharge contributes approximately 22.4% of the minimum instream flow on average. It also suggests that WWTP discharge can consistently guarantee a certain level of flow to the river [49], even when the water resource imbalance worsens. Furthermore, predicting the contributions of various WWTP discharges can help maintain a suitable minimum instream flow in the Han River basin, thereby supporting ongoing water resource management efforts.
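The contribution ratio is a unit conversion followed by a division, sketched below. The function and constant names are illustrative, and the example discharge value in the usage note is hypothetical rather than taken from the study's data.

```python
SECONDS_PER_DAY = 86_400
MIN_INSTREAM_FLOW_M3_S = 63.5  # at the Hangang Bridge, as stated in the text


def contribution_percent(discharge_m3_per_day, min_flow_m3_s=MIN_INSTREAM_FLOW_M3_S):
    """Share (%) of the minimum instream flow supplied by a daily WWTP discharge."""
    discharge_m3_s = discharge_m3_per_day / SECONDS_PER_DAY
    return 100.0 * discharge_m3_s / min_flow_m3_s
```

For instance, a hypothetical predicted discharge of 1,230,000 m3/day corresponds to about 14.2 m3/s, or roughly 22.4% of the 63.5 m3/s minimum instream flow.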
The prediction performance was evaluated by RMSE, MAE, and SCC, as shown in Table 3. As mentioned above, RMSE and MAE represent peak and overall prediction accuracy [15], with lower values indicating higher accuracy. Overall, the SVR-RBF model showed better prediction performance (RMSE = 1.73% and MAE = 1.23%) than the NARX model (RMSE = 2.08% and MAE = 1.53%). The higher accuracy of SVR-RBF can be attributed to its structural advantage in handling nonlinear relationships within the data. The RBF kernel effectively maps input features into a higher-dimensional space, enabling better capture of complex patterns and dependencies. Additionally, its training is a convex quadratic optimization problem with a single global minimum, allowing SVR-RBF to converge efficiently to the optimal solution. In contrast, the NARX model, which relies on an RNN structure, is susceptible to getting trapped in local minima because its loss function is non-convex. This can lead to suboptimal parameter estimation and thereby reduce prediction accuracy. Combining the models with PCA further improved simulation performance [50, 51] by filtering out less relevant variables. PCA transforms high-dimensional data into a lower-dimensional space while retaining the intrinsic information, which helps prevent overfitting and alleviates multicollinearity among the independent variables. In this study, the PCA-SVR-RBF model showed the lowest errors (RMSE = 1.70% and MAE = 1.22%), indicating the smallest deviation between predicted and actual values. For PCA-NARX, the prediction performance improved to RMSE = 2.00% and MAE = 1.28% compared with NARX. In addition, SCC, which measures the correlation between predicted and observed values as defined by Rannou et al. [16], ranked the models in descending order as follows: SVR-RBF (SCC = 0.53), PCA-SVR-RBF (0.51), PCA-NARX (0.45), and NARX (0.39).
These results confirm that SVR-RBF has a higher predictive capability than NARX owing to its superior optimization properties for the given data, and that PCA enhances model performance by refining input features.

5. Conclusions

This study provides a protocol for predicting the share of the minimum instream flow supplied by WWTP discharge using NARX, PCA-NARX, SVR-RBF, and PCA-SVR-RBF models. The hyperparameters of each model were optimized using the input variables (water quality parameters and meteorological data). The results demonstrated that the SVR-RBF model outperformed the NARX model and improved further when coupled with PCA. The models successfully predicted WWTP discharges, quantifying their contribution to the minimum instream flow. This approach can serve as a valuable water resource management tool to help maintain instream flow and sustain the riverine ecosystem. This study is expected to play a crucial role in addressing water resource conservation challenges arising from climate change and urbanization. In this regard, the model should be developed further to gain a more comprehensive understanding of how WWTP discharge contributes to resilient river environments and ecosystems, considering both the quantitative and qualitative impacts of WWTP effluent on receiving environments. Future research should explore model selection and hyperparameter optimization for various data types and properties, leading to further improvements in prediction accuracy.

Supplementary Information

Notes

Acknowledgments

This study was supported by INHA University Grant.

Conflict-of-Interest Statement

The authors declare that they have no conflict of interest.

Author Contributions

J.B.K (BSc) performed the conceptualization, methodology, investigation, formal analysis, and data curation and wrote the original draft. S.Y.P (Ph. D) performed validation and revision of the manuscript. V.S (Ph. D) performed a revision of the manuscript. C.G.K (Professor) performed conceptualization, supervision, and comprehensive revision of the manuscript.

References

1. Liu WC, Liu SY, Hsu MH, Kuo AY. Water quality modeling to determine minimum instream flow for fish survival in tidal rivers. J. Environ. Manage. 2005;76:293–308. https://doi.org/10.1016/j.jenvman.2005.02.005

2. Mukherjee A, Bhanja SN, Wada Y. Groundwater depletion causing reduction of baseflow triggering Ganges river summer drying. Sci. Rep. 2018;8:1–9. https://doi.org/10.1038/s41598-018-30246-7

3. Luthy RG, Sedlak DL, Plumlee MH, Austin D, Resh VH. Wastewater-effluent-dominated streams as ecosystem-management tools in a drier climate. Front. Ecol. Environ. 2015;13:477–485. https://doi.org/10.1890/150038

4. Li H, Wang H, Li G, Li S, Guo J. Study on urban wastewater discharge forecasting and influence factors analysis based on stochastic gradient regression. In : 3rd International Symposium on Intelligence Information Technology Application; 21–22 November 2009; NanChang. p. 483–486.


5. Mansour-Bahmani A, Haghiabi AH, Shamsi Z, Parsaie A. Predictive modeling the discharge of urban wastewater using artificial intelligent models (case study: Kerman city). Model. Earth Syst. Environ. 2021;7:1917–1925. https://doi.org/10.1007/s40808-020-00900-z

6. Shao Y, Zhao J, Xu J, Fu A, Li M. Application of Rainfall-Runoff Simulation Based on the NARX Dynamic Neural Network Model. Water (Switzerland). 2022;14:1–16. https://doi.org/10.3390/w14132082

7. Guzman SM, Paz JO, Tagert MLM, Mercer AE. Evaluation of Seasonally Classified Inputs for the Prediction of Daily Groundwater Levels: NARX Networks Vs Support Vector Machines. Environ. Model. Assess. 2019;24:223–234. https://doi.org/10.1007/s10666-018-9639-x

8. Koschwitz D, Frisch J, van Treeck C. Data-driven heating and cooling load predictions for non-residential buildings based on support vector machine regression and NARX Recurrent Neural Network: A comparative study on district scale. Energy. 2018;165:134–142. https://doi.org/10.1016/j.energy.2018.09.068

9. The sewage treatment volume of the Seoul Water Reuse Center [Internet]. Seoul Open Plaza. c2022. [cited 15 November 2022]. Available from: https://data.seoul.go.kr/dataList/OA-15561/S/1/datasetView.do


10. Automated Synoptic Observing System (ASOS) [Internet]. Korea Meteorological Administration; c2022. [cited 15 December 2022]. Available from: https://data.kma.go.kr/data/grnd/selectAsosRltmList.do?pgmNo=36


11. Moon T, Choi J, Kim S, Cha J, Yoom H, Kim C. Prediction of Influent Flow Rate and Influent Components using Artificial Neural Network (ANN). J. Korean Soc. Water Qual. 2008;24:91–98.


12. Mjalli FS, Al-Asheh S, Alfadala HE. Use of artificial neural network black-box modeling for the prediction of wastewater treatment plants performance. J. Environ. Manage. 2007;83:329–338. https://doi.org/10.1016/j.jenvman.2006.03.004

13. Junninen H, Niska H, Tuppurainen K, Ruuskanen J, Kolehmainen M. Methods for imputation of missing values in air quality data sets. Atmos. Environ. 2004;38:2895–2907. https://doi.org/10.1016/j.atmosenv.2004.02.026

14. Stajkowski S, Zeynoddin M, Farghaly H, Gharabaghi B, Bonakdari H. A methodology for forecasting dissolved oxygen in urban streams. Water (Switzerland). 2020;12:2568. https://doi.org/10.3390/W12092568

15. Kisi O, Cimen M. A wavelet-support vector machine conjunction model for monthly streamflow forecasting. J. Hydrol. 2011;399:132–140. https://doi.org/10.1016/j.jhydrol.2010.12.041

16. Rannou F, Poiraudeau S, Berezné A, et al. Assessing disability and quality of life in systemic sclerosis: Construct validities of the Cochin Hand Function Scale, Health Assessment Questionnaire (HAQ), systemic sclerosis HAQ, and medical outcomes study 36-item short form health survey. Arthritis Care Res. 2007;57:94–102. https://doi.org/10.1002/art.22468

17. Desouky MAA, Abdelkhalik O. Wave prediction using wave rider position measurements and NARX network in wave energy conversion. Appl. Ocean Res. 2019;82:10–21. https://doi.org/10.1016/j.apor.2018.10.016

18. Alghamdi H, Maduabuchi C, Albaker A, et al. A prediction model for the performance of solar photovoltaic-thermoelectric systems utilizing various semiconductors via optimal surrogate machine learning methods. Eng. Sci. Technol. Int. J. 2023;40:101363. https://doi.org/10.1016/j.jestch.2023.101363

19. Diaconescu E. The use of NARX neural networks to predict chaotic time series. WSEAS Trans. Comput. Res. 2008;3:182–191.


20. Smola AJ, Schölkopf B. A tutorial on support vector regression. Stat. Comput. 2004;14:199–222. https://doi.org/10.1023/B:STCO.0000035301.49549.88

21. Dhiman HS, Deb D, Guerrero JM. Hybrid machine intelligent SVR variants for wind forecasting and ramp events. Renew. Sustain. Energy Rev. 2019;108:369–379. https://doi.org/10.1016/j.rser.2019.04.002

22. Lu X, Miao F, Xie X, Li D, Xie Y. A new method for displacement prediction of “step-like” landslides based on VMD-FOA-SVR model. Environ. Earth Sci. 2021;80:1–12. https://doi.org/10.1007/s12665-021-09825-x

23. Cheng K, Lu Z, Wei Y, Shi Y, Zhou Y. Mixed kernel function support vector regression for global sensitivity analysis. Mech. Syst. Signal Process. 2017;96:201–214. https://doi.org/10.1016/j.ymssp.2017.04.014

24. Frénay B, Verleysen M. Parameter-insensitive kernel in extreme learning for non-linear support vector regression. Neurocomputing. 2011;74:2526–2531. https://doi.org/10.1016/j.neucom.2010.11.037

25. Ara A, Maia M, Louzada F, Macêdo S. Regression random machines: An ensemble support vector regression model with free kernel choice. Expert Syst. Appl. 2022;202:117107. https://doi.org/10.1016/j.eswa.2022.117107

26. Dash RK, Nguyen TN, Cengiz K, Sharma A. Fine-tuned support vector regression model for stock predictions. Neural Comput. Appl. 2023;35:23295–23309. https://doi.org/10.1007/s00521-021-05842-w

27. Al-Fugara A, Ahmadlou M, Al-Shabeeb AR, AlAyyash S, Al-Amoush H, Al-Adamat R. Spatial mapping of groundwater springs potentiality using grid search-based and genetic algorithm-based support vector regression. Geocarto Int. 2022;37:284–303. https://doi.org/10.1080/10106049.2020.1716396

28. Hsu C-W, Chang C-C, Lin C-J. A Practical Guide to Support Vector Classification. 2003;1–16.


29. Bowden GJ, Dandy GC, Maier HR. Input determination for neural network models in water resources applications. Part 1 - Background and methodology. J. Hydrol. 2005;301:75–92. https://doi.org/10.1016/j.jhydrol.2004.06.021

30. Zounemat-Kermani M. Hourly predictive Levenberg-Marquardt ANN and multi linear regression models for predicting of dew point temperature. Meteorol. Atmos. Phys. 2012;117:181–192. https://doi.org/10.1007/s00703-012-0192-x

31. Zounemat-Kermani M, Stephan D, Hinkelmann R. Multivariate NARX neural network in prediction gaseous emissions within the influent chamber of wastewater treatment plants. Atmos. Pollut. Res. 2019;10:1812–1822. https://doi.org/10.1016/j.apr.2019.07.013

32. Yang Y, Kim KR, Kou R, et al. Prediction of effluent quality in a wastewater treatment plant by dynamic neural network modeling. Process Saf. Environ. Prot. 2022;158:515–524. https://doi.org/10.1016/j.psep.2021.12.034

33. Wallot S, Mønster D. Calculation of Average Mutual Information (AMI) and false-nearest neighbors (FNN) for the estimation of embedding parameters of multidimensional time series in matlab. Front. Psychol. 2018;9:1–10. https://doi.org/10.3389/fpsyg.2018.01679

34. Beale MH, Hagan MT, Demuth HB. Neural Network Toolbox 7. User’s Guide. MathWorks 2. 2010;77–81. https://doi.org/10.1007/s005210070003

35. Liu W, Li J. Comparison of algorithms for seismic topology optimisation of lifeline networks. Struct. Infrastruct. Eng. 2014;10:1357–1368. https://doi.org/10.1080/15732479.2013.808234

36. You KW, Arumugasamy SK. Deep learning techniques for polycaprolactone molecular weight prediction via enzymatic polymerization process. J. Taiwan Inst. Chem. Eng. 2020;116:238–255. https://doi.org/10.1016/j.jtice.2020.11.003

37. Orozco J, García CAR. Detecting pathologies from infant cry applying scaled conjugate gradient neural networks. In : European Symposium on Artificial Neural Networks; 23–25 April 2003; Bruges. p. 349–354.


38. Arslan MH. An evaluation of effective design parameters on earthquake performance of RC buildings using neural networks. Eng. Struct. 2010;32:1888–1898. https://doi.org/10.1016/j.engstruct.2010.03.010

39. Selvamuthu D, Kumar V, Mishra A. Indian stock market prediction using artificial neural networks on tick data. Financ. Innov. 2019;5. https://doi.org/10.1186/s40854-019-0131-7

40. Yotov K, Hadzhikolev E, Hadzhikoleva S, Cheresharov S. Finding the Optimal Topology of an Approximating Neural Network. Mathematics. 2023;11:217. https://doi.org/10.3390/math11010217

41. Zhang B, Han Y, Yu B, Geng Z. Novel Nonlinear Autoregression with External Input Integrating PCA-WD and Its Application to a Dynamic Soft Sensor. Ind. Eng. Chem. Res. 2020;59:15697–15706. https://doi.org/10.1021/acs.iecr.0c02944

42. Ma Q, Liu S, Zhao X. PCA-NARX Time Series Prediction Model of Surface Settlement during Excavation of Deep Foundation Pit. IOP Conf. Ser. Earth Environ. Sci. 2020;560. https://doi.org/10.1088/1755-1315/560/1/012056

43. Zhang T, Fell F, Liu ZS, Preusker R, Fischer J, He MX. Evaluating the performance of artificial neural network techniques for pigment retrieval from ocean color in Case I waters. J. Geophys. Res. Ocean. 2003;108. https://doi.org/10.1029/2002jc001638

44. Liashchynskyi P, Liashchynskyi P. Grid Search, Random Search, Genetic Algorithm: A Big Comparison for NAS. 2019;1–11. https://doi.org/10.48550/arXiv.1912.06059


45. Haddad K, Rahman A. Regional flood frequency analysis: evaluation of regions in cluster space using support vector regression. Nat. Hazards. 2020;102:489–517. https://doi.org/10.1007/s11069-020-03935-8

46. Chen ST, Yu PS. Real-time probabilistic forecasting of flood stages. J. Hydrol. 2007;340:63–77. https://doi.org/10.1016/j.jhydrol.2007.04.008

47. Beniwal M, Singh A, Kumar N. Forecasting long-term stock prices of global indices: A forward-validating Genetic Algorithm optimization approach for Support Vector Regression. Appl. Soft Comput. 2023;145:110566. https://doi.org/10.1016/j.asoc.2023.110566

48. Vu HL, Ng KTW, Richter A, Li J, Hosseinipooya SA. Impacts of nested forward validation techniques on machine learning and regression waste disposal time series models. Ecol. Inform. 2022;72:101897. https://doi.org/10.1016/j.ecoinf.2022.101897

49. Moon J, Choi S, Kang SK, Lee D. Contribution Degree Analysis of Discharge from Sewage Treatment Plants at Streamflow in River. In : The Korea Water Resources Association Conference; 13–14 May 2010; Daejeon. p. 1370–1374.


50. Noori R, Karbassi AR, Moghaddamnia A, et al. Assessment of input variables determination on the SVM model performance using PCA, Gamma test, and forward selection techniques for monthly stream flow prediction. J. Hydrol. 2011;401:177–189. https://doi.org/10.1016/j.jhydrol.2011.02.021

51. Yehia AM, Mohamed HM. Chemometrics resolution and quantification power evaluation: Application on pharmaceutical quaternary mixture of Paracetamol, Guaifenesin, Phenylephrine and p-aminophenol. Spectrochim. Acta - Part A Mol. Biomol. Spectrosc. 2016;152:491–500. https://doi.org/10.1016/j.saa.2015.07.101

Fig. 1
(a) J WWTP and the Hangang Bridge located along the Han River in Seoul, South Korea, and (b) the schematic flow diagram of J WWTP treating sewage in three different ways.
/upload/thumbnails/eer-2025-028f1.gif
Fig. 2
Comparison of flow-rate RMSEs for the NARX model trained with nine different algorithms: LM, BFGS, RP, SCG, CGB, CGF, CGP, OSS, and GDX.
/upload/thumbnails/eer-2025-028f2.gif
Fig. 3
Schematic flow diagram of PCA-NARX model constitution.
/upload/thumbnails/eer-2025-028f3.gif
Fig. 4
Comparison of RMSEs for PCA-NARX model runs with six different PCAs (PCA1–PCA6).
/upload/thumbnails/eer-2025-028f4.gif
Fig. 5
Prediction of the WWTP discharge fed to the minimum instream flow near the Hangang Bridge on the Han River using the (a) NARX, (b) PCA-NARX, (c) SVR-RBF, and (d) PCA-SVR-RBF models.
/upload/thumbnails/eer-2025-028f5.gif
Table 1
The correlation between input variables and discharge.
Water quality parameters: BOD5, COD, TOC, TN, and TP. Meteorological factors: humidity and precipitation.

| | BOD5 (mg O2/L) | COD¹ (mg O2/L) | TOC² (mg C/L) | TN (mg N/L) | TP (mg P/L) | Humidity (%) | Precipitation (mm) | Discharge (m³/day) |
|---|---|---|---|---|---|---|---|---|
| Avg. | 500.99 | 268.16 | 256.14 | 128.64 | 12.80 | 60.00 | 2.06 | 1.25×10⁶ |
| Max. | 813.6 | 424.60 | 421.80 | 212.44 | 21.60 | 98.10 | 67.50 | 1.87×10⁶ |
| Min. | 154.5 | 91.00 | 104.80 | 42.26 | 3.72 | 17.90 | 0 | 0.97×10⁶ |
| R* | −0.45 | −0.46 | −0.32 | −0.47 | −0.43 | +0.52 | +0.17 | 1.00 |

¹ Chemical oxygen demand monitored from 2016 to 2020.

² Total organic carbon monitored in 2021.

* Correlation coefficient between input variables and discharge flow rate.

Table 2
Kernel functions and their hyperparameters.
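| Kernel | $K(x_i, x)$ | Hyperparameter |
|---|---|---|
| Linear kernel | $\gamma (x_i \cdot x)$ | $\gamma$ |
| Polynomial kernel | $(\gamma (x_i \cdot x) + r)^d$ | $\gamma$, $d$ |
| Radial basis function kernel | $\exp(-\gamma \lVert x_i - x \rVert^2)$ | $\gamma$ |
| Laplacian kernel | $\exp(-\gamma \lVert x_i - x \rVert)$ | $\gamma$ |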
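The four kernel expressions in Table 2 can be written out directly. The following is a minimal sketch in plain Python, not the implementation used in the study: the function names, the default hyperparameter values, and the use of the Euclidean norm in the Laplacian kernel are assumptions made for illustration (some libraries define the Laplacian kernel with the L1 norm instead).

```python
import math
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product x_i . x of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance ||x_i - x||^2."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def linear_kernel(xi: Vector, x: Vector, gamma: float = 1.0) -> float:
    # K = gamma * (x_i . x)
    return gamma * dot(xi, x)


def polynomial_kernel(xi: Vector, x: Vector, gamma: float = 1.0,
                      r: float = 0.0, d: int = 2) -> float:
    # K = (gamma * (x_i . x) + r) ** d
    return (gamma * dot(xi, x) + r) ** d


def rbf_kernel(xi: Vector, x: Vector, gamma: float = 1.0) -> float:
    # K = exp(-gamma * ||x_i - x||^2)
    return math.exp(-gamma * squared_distance(xi, x))


def laplacian_kernel(xi: Vector, x: Vector, gamma: float = 1.0) -> float:
    # K = exp(-gamma * ||x_i - x||), Euclidean norm assumed here
    return math.exp(-gamma * math.sqrt(squared_distance(xi, x)))
```

In the RBF and Laplacian kernels, a larger γ makes the similarity decay faster with distance, which is why γ is usually tuned together with the other SVR hyperparameters.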
Table 3
Comparison of the prediction performance of four models according to RMSE, MAE, and SCC.
| Metric | NARX | PCA-NARX | SVR-RBF | PCA-SVR-RBF |
|---|---|---|---|---|
| RMSE | 2.08% | 2.00% | 1.73% | 1.70% |
| MAE | 1.53% | 1.28% | 1.23% | 1.22% |
| SCC | 0.39 | 0.45 | 0.53 | 0.51 |
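For readers who want to reproduce metrics of the kind reported in Table 3, the sketch below computes RMSE, MAE, and a rank correlation in plain Python. It is illustrative only. This section does not expand the abbreviation SCC, so reading it as Spearman's rank correlation coefficient is an assumption, and the function names and the sample values in the usage block are hypothetical rather than data from the study.

```python
import math
from typing import List, Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error between observed and predicted values."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def average_ranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


if __name__ == "__main__":
    # Hypothetical values, not data from the study.
    observed = [10.0, 12.0, 11.0, 13.0]
    predicted = [11.0, 11.0, 12.0, 15.0]
    print(rmse(observed, predicted), mae(observed, predicted),
          spearman(observed, predicted))
```

RMSE penalizes large individual errors more heavily than MAE, so reporting both gives a sense of whether a model's error is dominated by a few poorly predicted days.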