Simultaneous modelling and forecasting of hourly dissolved ......Kisi et al. (2013) investigated the accuracy of three artiﬁcial intelligence techniques, namely MLPNN, ANFIS and

ORIGINAL ARTICLE

Simultaneous modelling and forecasting of hourly dissolvedoxygen concentration (DO) using radial basis function neuralnetwork (RBFNN) based approach: a case studyfrom the Klamath River, Oregon, USA

Salim Heddam1

Received: 11 July 2016 / Accepted: 12 July 2016 / Published online: 20 July 2016

� Springer International Publishing Switzerland 2016

Abstract In the present study, we developed and com-

pared two artificial intelligences technique (AI) for simul-

taneous modelling and forecasting hourly dissolved oxygen

(DO) in river ecosystem. The two techniques are: radial

basis function neural network (RBFNN) and multilayer

perceptron neural network (MLPNN). For the purpose of

the study, we choose two stations from the United States

Geological Survey: (USGS ID: 421015121471800) at Lost

River Diversion Channel nr Klamath River, Oregon, USA

(Latitude 42�1001500, Longitude 121�4701800 NAD83), witha total of 8703 data, and (USGS ID: 421401121480900) at

Upper Klamath Lake at Link River Dam, Oregon USA

(Latitude 42�1400100, Longitude 121�4800900 NAD83) witha total of 8552 data. The investigation is divided into two

distinguished phase. Firstly, using four water quality vari-

ables that are, water pH, temperature (TE), specific con-

ductance (SC), and sensor depth (SD); we compared five

models (M1 to M5) with different combination of input

variables. As a result of the first investigation we found that

generally RBFNN outperform MLPNN according to the

performances criteria calculated. In the second part of the

study, six Different models (FM1 to FM6) having the same

input data sets are developed for 1,12, 24,48,72 and 168 h

ahead (in advance) forecasting. The performance of the

RBFNN and MLPNN models in training, validation and

testing sets are compared with the observed data. Our

results reveal that the two models provided relatively

similar results and they successfully forecasting DO with a

high level of accuracy and the reliability of forecasting

decreases with increasing the step ahead.

Keywords Modelling � Forecasting � Dissolved oxygen �RBFNN � MLPNN � River

Introduction

One of the most important components of the aquatic life is

certainly Dissolved oxygen concentration (DO). The prin-

cipal sources of in-stream DO are: (1) diffusion from the

atmosphere at the stream surface exchange, (2) mixing of

the stream water at riffles, and (3) photosynthesis from in-

stream primary production (O’Driscoll et al. 2016). DO is

measured in milligrams per liter (mg/l). In the river

ecosystems, DO is produced and consumed continuously

and it is necessary to the fauna, flora and aquatic organ-

isms. A reduction of level of DO may cause long-term

adverse effects in the aquatic environment (Gonçalves and

Costa 2013), and a deficiency of DO is a sign of an

unhealthy river (Mondal et al. 2016). It is also reported that

DO is an important factor influencing the dynamics of

phytoplankton and zooplankton populations and a model

has been recently proposed and tested describing the role of

DO on the plankton dynamics (Dhar and Baghel 2016).

Misra and Chaturvedi (2016) reported that DO concentra-

tion is the most important factor affecting subsequent

survival of fish. Since then, some models using different

modelling approaches have been proposed for estimating

DO in rivers, streams and lakes ecosystems.

Abdul-Aziz et al. (2007a, b) developed an empirical

model to adjust discrete DO measurements to a common

time-reference value using an extended stochastic har-

monic analysis (ESHA) algorithm. The model was

& Salim [email protected]

1 Faculty of Science, Agronomy Department, Hydraulics

Division University, 20 Août 1955, Route EL Hadaik, BP 26,

Skikda, Algeria

123

Model. Earth Syst. Environ. (2016) 2:135

DOI 10.1007/s40808-016-0197-4

http://orcid.org/0000-0002-8055-8463http://crossmark.crossref.org/dialog/?doi=10.1007/s40808-016-0197-4&domain=pdfhttp://crossmark.crossref.org/dialog/?doi=10.1007/s40808-016-0197-4&domain=pdf

calibrated and validated for different stream sites across

Minnesota, USA, incorporating effects of different ecore-

gions and variable drainage areas. In order to validate the

model, the authors have used with independent data for

other sites in Minnesota. Considering DO as important

water quality indicator, Costa and Gonçalves (2011)

compared two approaches: the linear and the state-space

models, the two models have been associated to the clus-

tering technique. The authors have tried to identify and

classify homogeneous groups of water quality based on

similarities in the temporal dynamics of the DO concen-

tration. Subsequently, they have compared the two models

for predicting DO using data from River Ave basin, Por-

tugal. The authors have obtained root mean square errors

(RMSE) of 0.961 and 0.846 for the linear models and the

state space model, respectively. Prasad et al. (2011) used

MLR approach to develop a three dimensional model for

prediction of spatially explicit DO levels in Chesapeake

Bay, USA, by accounting for long-term variability of

nutrient concentrations: total dissolved nitrogen (TDN),

total dissolved phosphorus (TDP), water temperature (TE)

and salinity (SA) across the Bay. The model has been

applied at monthly time step and the step-wise regression

approach was used to select a starting point for the MLR

models relating DO concentrations to monthly water TE,

SA, TDN and TDP levels. Akkoyunlu et al. (2011)

examined the depth-dependent estimation of a lake’s DO

using two ANN methods: (1) the RBFNN and the MLPNN,

and (2) the multiple linear regression (MLR). The com-

parison results revealed that the ANN methods were

noticeably superior to those of MLR in modelling the DO.

Ay and Kisi (2012) developed and compared two dif-

ferent artificial neural network (ANN) techniques, the

multi-layer perceptron artificial neural network (MLPNN)

and the radial basis function neural network (RBFNN), for

modelling DO concentration. The ANN models were

developed using experimental data collected from the

upstream and downstream USGS stations on Foundation

Creek, Colorado, USA. Antanasijević et al. (2013) devel-

oped and compared three types of ANN namely, general-

ized regression neural network (GRNN), backpropagation

neural network (BPNN) and recurrent neural network

(RNN), for the prediction of DO concentration in the

Danube River, North Serbia. An innovative approach has

been proposed by Areerachakul et al. (2013) that combine

unsupervised and supervised artificial neural networks

(ANN) based approaches. Using thirteen (13) water quality

variables collected with a time step of 1 month, the authors

have applied in the first part, the standard multilayer per-

ceptron neural network (MLPNN). In the second part, they

have applied the MLPNN with a priori unsupervised

clustering methods: the K-mean and fuzzy c-mean algo-

rithms, the combined model is called k-MLPNN, and

applied to predict the DO in k clusters based on the 13

water quality. As a result of the investigation, the authors

have demonstrated that k-MLPNN had higher predictive

capability than a standard MLPNN model with coefficient

of correlation (CC) of 0.83 and 0.62 for the k-MLPNN and

MLPNN, respectively.

A model called real-value genetic algorithm support

vector regression (RGA-SVR) has been proposed by Liu

et al. (2013). The authors have applied the model for pre-

dicting water DO in in aquaculture river crab pond in

China, and demonstrated that RGA-SVR provided best

results in comparison to the standard support vector

regression (SVR) and MLPNN. The root mean square error

(RMSE) obtained in the testing phase was 0.0195, 0.051

and 0.283 for RGA-SVR, SVR and MLPNN, respectively.

Wavelet neural network (WNN), that uses morlet wavelet

as the wavelet transfer function in the hidden layer has

been proposed by Xu and Liu (2013). The authors have

applied the model for predicting DO in the Intensive

freshwater pearl breeding ponds in Duchang county,

Jiangxi province, China. Compared with prediction results

achieved by the MLPNN and the Elman neural network

(ELM), the low mean absolute percentage error (MAPE)

was obtained with WNN. Kisi et al. (2013) investigated the

accuracy of three artificial intelligence techniques, namely

MLPNN, ANFIS and gene expression programming (GEP)

in modelling daily DO in South Platte River at Englewood,

Colorado, USA. As a conclusion of the investigation, the

authors have demonstrated that GEP model performed

better than the MLPNN and ANFIS models in modelling

DO concentration. Liu et al. (2014) proposed at the first

time a particular model that combining both wavelet

analysis (WA) and least squares support vector regression

(LSSVR) with an optimal improved Cauchy particle swarm

optimization (CPSO) algorithm. The proposed hybrid

model called WA-CPSO-LSSVR has been applied to pre-

dict DO in river crab culture ponds, at the Yixing base of

intelligent aquaculture management systems in Jiangsu

pro-vince, China. For comparison, the authors have applied

for the same data set the standard LSSVR, and the flexible

structure radial basis function neural network (FS-

RBFNN). From the results obtained it can be concluded

that the estimation results were significantly different and

the WA-CPSO-LSSVR model provided good results in

comparison to the other two. The CC obtained in the

testing phase was 0.89, 0.92 and 0.96 for LSSVR, FS-

RBFNN and WA-CPSO-LSSVR, respectively.

Heddam (2014a) applied generalized regression neural

network (GRNN) based model for modelling hourly DO, at

Klamath River, Oregon, USA. Evrendilek and Karakaya

(2014) investigated the effects of discrete wavelet transforms

(DWT)with the orthogonal Symmlet and the semi orthogonal

Chui-Wang B-spline on predictive power of multiple non-

135 Page 2 of 18 Model. Earth Syst. Environ. (2016) 2:135

123

linear regression models (MNLR) models for diel, daytime

(diurnal) and nighttime (nocturnal) DO dynamics. Using

three artificial intelligence-based models, Emamgholizadeh

et al. (2014) have conducted an investigation for modelling

DO in Karoon River, Iran. The authors have selected nine (9)

water quality variables with a time step of 1 month. The

investigated models were: radial basis function neural net-

work (RBFNN), multilayer perceptron neural network

(MLPNN) and adaptive neuro-fuzzy inference system

(ANFIS) models. From the results reported by the authors

MLPNN, outperforms ANFIS and RBFNN for DO predic-

tion, producing a CC of 0.86, while the other two models

(ANFIS and RBFNN) have provided a CC equal to 0.83 and

0.75, respectively. Abdul-Aziz and Ishtiaq (2014) conducted

an investigation for predicting hourly DO time-series from

different streams representing four distinctUSEnvironmental

Protection Agency (US EPA) Level III Ecoregions of Min-

nesota, USA. The authors have developed a scaling-based

robust, empirical model for simulating the diurnal cycle of

stream DO from a single reference observation, and have

obtained a high CC rather than 0.94. Antanasijević et al.

(2014) applied GRNN model with the Monte Carlo Simula-

tion (MCS) technique for modelling DO, across multiple

sites; located on the Danube River, North Serbia.

Heddam (2014b) developed and compared two adaptive

neuro-fuzzy inference systems (ANFIS) for modeling

hourly DO, at Klamath River, Oregon, USA. In another

study, Heddam (2014c) applied an artificial intelligence

(AI) technique model called dynamic evolving neural-fuzzy

inference system (DENFIS) based on an evolving clustering

method (ECM), for modelling hourly DO in Klamath River,

Oregon, USA. In another study, Evrendilek and Karakaya

(2015) used median and linear regression models of satu-

rated dissolved oxygen (DOsat) after denoising by using

discrete wavelet transform (DWT) with Chui-Wang

B-spline and Coiflet wavelets decomposition. Alizadeh and

Kavianpour (2015) compared two artificial neural networks

models: standard MLPNN and wavelet-neural network

(WNN) for predicting DO concentration, using a variety of

water quality variables as input. Using data collected from

Hilo Bay on the east side of the Big Island, the authors have

developed the models at daily and hourly time step. For the

model at hourly time step, they have reported a high CC

rather than 0.98 and 0.97 in the validation and testing phase,

respectively using the WNN model, while for the MLPNN

the CC were 0.89 and 0.94 in the validation and testing

phase, respectively. For the model at daily time step the CC

for the WNN model was slightly less than in hourly time

step but still well above the MLPNN model. In another

study, Nemati et al. (2015) have compared three artificial

intelligence modelling techniques namely, ANFIS,

MLPNN and MLR for predicting DO in Tai Po River, New

Territories, Hong Kong. In order to investigate the

capability of these techniques for predicting DO, they have

used data at time step of 1 month and eight (8) water quality

variables were selected as input variables according to their

correlations with DO. According to the results reported,

MLPNN model outperforms ANFIS and MLR. The CC was

0.798, 0.645 and 0.681, for MLPNN, ANFIS and MLR,

respectively. Recently, An et al. (2015) used the nonlinear

grey Bernoulli model (NGBM (1, 1)) to simulate and

forecasting DO in the Guanting reservoir (inlet and outlet),

located at the upper reaches of the Yongding River in the

northwest of Beijing, China. Bayram et al. (2015) investi-

gated the applicability of teaching–learning based opti-

mization (TLBO) algorithm in modeling stream DO in

turkey. The authors have used four stream water quality

indicators, namely, water temperature (TE), pH, electrical

conductivity (EC), and hardness (WH). The TLBO method

is compared with those of the artificial bee colony algorithm

(ABC) and conventional regression analysis methods

(CRA). These methods are applied to four different

regression forms: quadratic, exponential, linear, and power.

Heddam (2016a) applied optimally pruned extreme learning

machine (OP-ELM) in forecasting DO several hours in

advance, at Klamath River, Oregon, USA.

Artificial intelligence (AI) techniques have been fre-

quently applied in environmental modelling. Some of these

applications include, among others, the following: predic-

tion of reservoir permeability from porosity measurements

(Handhal 2016); predictive modeling of discharge in

compound open channel (Parsaie et al. 2015); automatic

inversion tool for geoelectrical resistivity (Raj et al. 2015);

forecasting monthly groundwater level (Kasiviswanathan

et al. 2016); predicting the dispersion coefficient (D) in a

river ecosystem (Antonopoulos et al. 2015); modelling the

permeability losses in permeable reactive barriers (San-

tisukkasaem et al. 2015); estimating the reference evapo-

transpiration (ET0) (Adamala et al. 2015); calculating the

dynamic coefficient in porous media (Das et al. 2015);

predicting Indian monsoon rainfall (Azad et al. 2015), and

modeling of arsenic (III) removal (Mandal et al. 2015).

Although RBFNN has been applied for modelling DO

concentration, to the best of our knowledge, there have

been no studies done on the application of RBFNN for

forecasting DO in rivers; hence the present study aims to

investigate the capabilities of the RBFNN in comparison to

the standard MLPNN for simultaneous modelling and

forecasting of hourly DO concentration.

Methodology

Two models of artificial neural networks (ANN) are

developed and compared in this study: the Multilayer

Perceptron Neural Network (MLPNN) and the Radial Basis

Model. Earth Syst. Environ. (2016) 2:135 Page 3 of 18 135

123

Function Neural Network (RBFNN). A brief description of

these models is set forth hereafter.

Multilayer perceptron neural network (MLPNN)

Multilayer perceptron neural network (MLPNN) (Rumel-

hart et al.1986) is the most important type of ANN and its

structure can be represented as in Fig. 1. The MLPNN is a

feedforward network and has three layers: the input layer,

the hidden layer and the output layer. Each layer contains a

number of neurons. The processing ability of the network is

stored in the inter unit connection strengths (or weights) that

are obtained through a process of adaptation to a set of

training pattern (Haykin 1999). The number of neurons in

the input layer corresponds to the number of input variables;

the input layer only collects information. The hidden layer

is the important layer in the MLPNN model and contains

several neurons, and each neuron in this layer is connected

to the every neuron in the next and previous layer. Each

neuron in the hidden layer calculates the sum of the

weighted input and adds a bias value. The sum value

obtained on this application is passed through a non-linear

function known as the transfer function, which is usually a

sigmoid function, to the output layer. The third layer is the

output layer. There is only one neuron in this layer: the

desired DO. The MLPNN are capable of approximating any

function with a finite number of discontinuities (Hornik

et al. 1989) and considered as a universal approximators

(Hornik et al. 1989; Hornik 1991).

Let us denote k as the number of input variables, m as

the number of neurons in the hidden layer, the mathemat-

ical structure of the MLPNN from the input to the output

can be formulated as follow:

Aj ¼ B1 þXk

i¼1wij � xi; ð1Þ

where Aj is the weighted sum of the j hidden neuron, k is

the total number of inputs, wij denotes the weight charac-

terising the connection between the nth input to the mthhidden neuron, and B1 is the bias term of each hidden

neuron. The output of the mth hidden neuron is given by

� j ¼ f Aj� �

: ð2Þ

The activation function f adopted for the present study

was the sigmoid, represented by Eq. (3).

f Að Þ ¼ 11þ e�A : ð3Þ

The neural network output is then given by

Ok ¼ B2 þXm

j¼1wjk �� j; ð4Þ

where wjk denotes the weight characterising the connection

between the mth hidden neuron to the pth output neuron,

m the total number of hidden neurons) and B2 is the bias

term. The linear activation function is most commonly

applied to the output layer.

MLPNN is the most widely used neural network model,

and has been applied to solve many difficult problems in

environmental sciences. Some different applications are as

follows. Prediction of uniaxial compressive strength of tra-

vertine rocks (Barzegar et al. 2016); temperature variations

and generate missing temperature data in Iran (Salami and

Ehteshami 2016); river flow forecasting (Kasiviswanathan

and Sudheer 2016); prediction of water quality index in

groundwater systems (Sakizadeh 2016); runoff simulation

x1

x2

x3

xn

.

Input Layer (n neurons)

Hidden Layer(m neurons)

Output Layer(k neurons)

wjk

wijB2 =biasB1 =bias

Fig. 1 Architecture ofmultilayer perceptron neural

network (MLPNN)


123

(Javan et al. 2015); runoff and sediment yield modeling

(Sharma et al. 2015); modeling Secchi disk depth (SD) in

river (Heddam 2016b); and predicting phycocyanin (PC)

pigment concentration in river (Heddam 2016c).

Radial basis function neural network (RBFNN)

Proposed by Broomhead and Lowe (1988), radial-basis

functions neural networks (RBFNN) is a feed-forward net-

work and have three layers: the input layer, the hidden layer

and output layer. The RBFNN uses a linear transfer function

for the output neurons and a nonlinear Gaussian function for

the hidden neurons (Moody and Darken 1989). The RBFNN

neural network model has been proven to be a universal

function approximator (Park and Sandberg 1991). To the

mathematical point of view, the RBFNN structure shown in

Fig. 2 can be presented as follow (Lin and Wu 2011):

ui xð Þ ¼ x� lik k i ¼ 1; 2; . . .;N; ð5Þ

u (x) is the output of the jth hidden neuron, �k kdenotes theEuclidean distance, x is the p-dimensional input vector, liis the center (vector) of the ith hidden neuron, and u is theactivation function (Lin and Wu 2011). The RBFNN

Gaussian function can be written as:

ui xð Þ ¼ exp �x� lik k2

2 r2i

!i ¼ 1; 2;N; ð6Þ

where ri is the widths (or spread) of the hidden neuron.The output of the RBFNN model can be calculated as

follow

� i ¼XN

j¼1wij uj xð Þ þ B2 ð7Þ

wij represents a weighted connections between the radial

basis function neuron and output neuron; and N = number

of hidden-layer neurons. The constant term B2 in Eq. (7)

represents a bias. The output of the network is a linear

combination of the basis functions computed by the hidden

layer nodes and the supervised gradient-descent-based

method is used for the network training (Poggio and Girosi,

1990a, b).

In previous works, there have been reported some

important applications of RBFNN in different areas of

environmental science. Some of these applications are as

follows. Modelling coagulant dosage in water treatment

plant (Heddam et al. 2011); modelling daily ET0 (Ladlani

et al. 2012); sequestration of soil organic carbon (SOC) in

the agricultural surface soils and bottom sediments (Pal

et al. 2016); spatial variability of soil organic carbon

(Bhunia et al. 2016); predicting the side weir discharge

coefficient (Parsaie 2016); simulation of nitrate contami-

nation in groundwater (Ehteshami et al. 2016); ground-

water salinity prediction (Barzegar and Moghaddam 2016),

and predicting the longitudinal dispersion coefficient in

rivers (Parsaie and Haghiabi 2015).

Description of study area

The historical hourly dissolved oxygen concentration (DO)

and the four water quality variables data from (1 Jun 2014)

to (31 May 2015) were used in this study and are available

at the United States Geological Survey (USGS) website,

http://or.water.usgs.gov/cgi-bin/grapher/table_setup.pl?site_

id.Two stations are chosen: (USGS ID: 421015121471800)at

Lost River Diversion Channel nr Klamath River, Oregon

USA (Latitude 42�1001500, Longitude 121�4701800 NAD83),and (USGS ID: 421401121480900) at Upper Klamath Lake

at Link River Dam, Oregon USA (Latitude 42�1400100,Longitude 121�4800900 NAD83). Figure 3 shows the loca-tions of the stations in study area. For the two stations the

data set is divided into three sub-data sets: (i) a training set

(60 %), (ii) a validation set (20 %) and (iii) a test set

(20 %).

x1

x2

x3

xn

.

Input Layer (n neurons)

Hidden Layer(m neurons)

Output Layer(k neurons)

wij

wjk

B2 =bias

Fig. 2 Architecture of radialbasis function neural network

(RBFNN)


123

http://or.water.usgs.gov/cgi-bin/grapher/table_setup.pl%3fsite_id.Twohttp://or.water.usgs.gov/cgi-bin/grapher/table_setup.pl%3fsite_id.Two

Ranges of water quality data

The statistical parameters of the DO and water quality

variables data such as the mean, maximum, minimum,

standard deviation, and the coefficient of variation values

(i.e., Xmean, Xmax, Xmin, Sx, and Cv respectively) are given in

Table 2. Because the five variables described above had

different dimensions, and there was major difference

among values, it was considered to be necessary to stan-

dardize the primary data in order to enhance the training

speed and the precision of the models. Input data were

entered into the models after normalization. For this pur-

pose, Eq. (8) was utilized:

xni;k ¼xi;k �mkSdK

; ð8Þ

xni, k: is the normalized value of the variable k (input or

output) for each sample i,. xi,k the original value of the

variable k (input or output). mk and Sdk are the mean value

and standard deviation of the variable k (input or output).

Fig. 3 Map showing the study area [adopted from Sullivan et al. (2012, 2013a, b)]


123

All the input and output variables were normalized to have

zero mean and unit variance (Heddam et al. 2012, 2016;

Heddam 2014d, 2016b, c).

From the Table 1, it can be seen that pH and SC have a

direct relationship with the DO, with a CC equal to 0.16

and 0.49, respectively, while SD and TE water quality

variable have an inverse relationship with the DO with a

CC equal to -0.26 and -0.59, respectively, for the USGS

421015121471800 station. Always, in Table 1 for the

USGS 421401121480900 station, it can be seen that pH

and SD have a direct relationship with the DO, with a CC

equal to 0.09 and 0.25, respectively, while SC and TE

water quality variable have an inverse relationship with the

DO, with a CC equal to -0.25 and -0.63, respectively.

According to Table 2, for the USGS 421015121471800

station, DO concentrations ranged over three orders of

magnitude, with minimum and maximum values of 0.1 and

nearly 30 mg/L (30.50 mg/L). The mean of all observa-

tions was 8.20 mg/L. At the USGS 421401121480900, DO

concentrations ranged over three orders of magnitude, with

minimum and maximum values of 1.90 and nearly 16 mg/

L (15.80 mg/L). The mean of all observations was

9.58 mg/L. According to Table 2, temperature inversely

related to the concentration of DO in water; as temperature

increases, DO decrease. Conversely, a temperature decline

causes the oxygen concentration to increase.

Performance indices

Any developed models must be evaluated regarding their

performances. In the present study we computed three

performances indices in order to validate and compare the

models developed. The three indices are calculated

according to Legates and McCabe (1999) and Moriasi et al.

(2007): the coefficient of correlation (CC), the root mean

squared error (RMSE) and the mean absolute error (MAE).

CC ¼1N

POi �Omð Þ Pi � Pmð Þ

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1N

Pn

i¼1Oi �Omð Þ2

s ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1N

Pn

i¼1Pi � Pmð Þ2

s ; ð9Þ

RMSE ¼

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1

N

XN

i¼1Oi � Pið Þ2

vuut ; ð10Þ

MAE ¼ 1N

XN

i¼1Oi � Pij j; ð11Þ

where N is the number of data points, Oi is some measured

value and Pi is the corresponding model prediction. Om and

Pm are the average values of Oi and Pi.

Results and discussion

As stated above, the present study has two majors objec-

tives: one is modeling DO using water quality variables as

predictors and the second is the forecasting of DO at dif-

ferent hours in advance. In this section we present the

results obtained separately.

Modelling DO concentration

Modelling DO in the USGS 421015121471800 station

Five models were developed and compared. The five

models are the four-factor input vector model (TE, pH, SC

and SD), called M5; the three-factor input vector model

(TE, pH and SC), called M4; the three-factor input vector

model (TE, pH and SD), called M3; the two-factor input

vector model (TE and SC), called M2 and the two-factor

input vector model (TE and pH), called M1, respectively

(Table 3). A comparison of the performance of the RBFNN

model with that of the MLPNN model was carried out to

study their efficacy in modelling DO concentration. The

performances of the five (M1 to M5) developed models are

measured on the test set according to the three perfor-

mances indices and the results are reported in Table 4.

In all five RBFNN models developed, the key parameter

called spread (r) is the important parameter that must beoptimized during the training process. The spread param-

eter values providing the best testing performance of the

Table 1 Pearson correlation coefficients between and among physical water-quality parameters, and dissolved oxygen concentration

USGS421015121471800 USGS421401121480900

TE (�C) pH/ SC (lS/cm) SD (m) DO (mg/L) TE (�C) pH/ SC (lS/cm) SD (m) DO (mg/L)

TE (�C) 1.00 1.00pH 0.31 1.00 0.61 1.00

SC (lS/cm) -0.68 0.02 1.00 0.38 0.17 1.00

SD (m) 0.27 0.16 -0.07 1.00 -0.12 0.10 -0.66 1.00

DO (mg/L) -0.59 0.16 0.49 -0.26 1.00 -0.63 0.09 -0.25 0.25 1.00

�C degree Celsius, lS/cm microseimens per centimeter, m meter, mg/L milligrams per liter


123

RBNN were equal to 1. As is shown in Table 4, for the

RBFNN model, it is observed that the MAE, RMSE and

CC values vary in the range of 0.440–1.771 mg/L,

0.747–2.359 mg/L and 0.837–0.985, respectively, in the

training phase. In addition, in the validation phase, the

values of MAE, RMSE and CC, ranged from 0.521 to

1.759, 0.855 to 2.381 mg/L, and 0.838 to 0.981, respec-

tively. Finally, in the testing phase, the values of MAE,

RMSE and CC, ranged from 0.518 to 1.770, 0.884 to

2.388 mg/L, and 0.827 to 0.978, respectively. It may be

seen from Table 4, the CC values for all the six models are

reasonably good, being smallest (0.827) for M2 model and

greatest (0.985) for M5 model. The values of other model

performances such as RMSE, and MAE indicate that the

forecast performance of the RBFNN model is very good,

except the model M2 that is relatively acceptable, and the

RBFNN (M5) model performed better than the other

models in the training, validation, and testing phases. It is

important to state that the investigation showed that the

worst results were achieved using the model M2 that has

the TE and SC as inputs. The Scatterplots and comparison

of observed and calculated values of DO in the Training,

Validation and Testing phase, respectively, are shown in

Fig. 4 for the RBFNN M5 model, in the USGS

421015121471800 Station.

A multilayer perceptron neural network (MLPNN) as

shown in Fig. 2 has been developed for modelling DO using

the same input variables reported above. The proposed

MLPNN has three layers: an input layer with two to four

input variables under the (M1 to M5), a hidden layer with a

nonlinear sigmoid transfer function and a linear output layer

with only one neuron that correspond to the DO. The weights

and biases are the unique parameters that must be optimized

in the MLPNN model using a training algorithm. The opti-

mum number of neurons in the hidden layer is determined by

trial and error. We have varied the number of neurons from

one to twenty and we found that a model with thirteen

neurons at the hidden layer corresponds to the best model.

The parameters of MLPNN have been optimized using the

error back propagation algorithm which is an iterative

Learning algorithm. As seen from Table 4, the five MLPNN

models have shown significant variations based on the three

performance criteria. The lowest value of the RMSE in the

testing phase is 1.013 (inMLPNNM5) and the highest value

of the CC is 0.971 (in MLPNN M5). In addition, the lowest

value of MAE is 0.634 also (in MLPNN M5). From the

results of training, validation, and testing all the five models

developed in this study are evaluated all together, and the

M1, M3, M4, and M5 models are conspicuous. Among

these, the M4 and M5 models have quite low MAE and high

CC, and the M5 model is very successful on validation and

testing phase. All these five models were examined com-

paring their ability on modelling hourly DO concentration.

During training, the MLPNN (M5) performs slightly better

than the others. Also, in the validation and testing phases, the

MLPNN (M5) outperforms all other models in terms of

various performance criteria. The statistical indicators in the

Table 4 indicate that the calculated DO using RBFNN are

more accurate compared to MLPNN models (relatively low

values of MAE and RMSE, and high values of CC). In

conclusion, the M5 model is the best developed model for

modelling DO concentration, and RBFNN performs better

than MLPNN model. The Scatterplots and comparison of

observed and calculated values of DO in the Training,

Table 2 Hourly statisticalparameters of data set

Station Data Unit Xmean Xmax Xmin Sx Cv CC

USGS421015121471800 TE �C 12.73 26.90 0.70 6.84 0.54 -0.59pH / 8.18 10.20 7.10 0.70 0.09 0.16

SC lS/cm 233.54 518.00 116.00 136.53 0.58 0.49

SD m 1.03 1.16 0.77 0.06 0.06 -0.26

DO mg/l 8.20 30.50 0.10 4.31 0.53 1.00

USGS421401121480900 TE �C 12.39 26.20 0.20 7.06 0.57 -0.63pH / 8.23 10.30 7.10 0.74 0.09 0.09

SC lS/cm 120.08 146.00 107.00 8.06 0.07 -0.25

SD m 2.24 3.51 0.22 0.63 0.28 0.25

DO mg/l 9.58 15.80 1.90 2.11 0.22 1.00

Xmean mean, Xmax maximum, Xmin minimum, Sx standard deviation, Cv coefficient of variation, CC coef-

ficient de correlation with DO

Table 3 Combinations of input variables considered in developingmodels

Model Input structure Output

M1 TE and pH DO

M2 TE and SC DO

M3 TE, pH, and SD DO

M4 TE, pH, and SC DO

M5 TE, pH, SC and SD DO


123

Table 4 Performances of theRBFNN and MLPNN models in

different phases for USGS

421015121471800 station

Models Training Validation Testing

CC RMSE MAE CC RMSE MAE CC RMSE MAE

MLPNN

M1 0.951 1.334 0.851 0.954 1.321 0.848 0.956 1.255 0.827

M2 0.877 2.071 1.462 0.865 2.187 1.521 0.861 2.161 1.523

M3 0.960 1.205 0.772 0.963 1.186 0.764 0.966 1.109 0.723

M4 0.970 1.046 0.644 0.968 1.103 0.678 0.969 1.062 0.665

M5 0.970 1.042 0.647 0.972 1.021 0.657 0.971 1.013 0.634

RBFNN

M1 0.964 1.148 0.667 0.965 1.149 0.668 0.964 1.136 0.664

M2 0.837 2.359 1.771 0.838 2.381 1.759 0.827 2.388 1.770

M3 0.977 0.914 0.536 0.975 0.965 0.582 0.977 0.915 0.551

M4 0.977 0.915 0.524 0.972 1.038 0.616 0.973 0.996 0.588

M5 0.985 0.747 0.440 0.981 0.855 0.521 0.978 0.884 0.518

Fig. 4 Results with RBFNN model for USGS 421015121471800 station. Scatterplots and comparison of observed and calculated series of DO inthe: a training, b validation and c testing phase, respectively


123

Validation and Testing phases are shown in Fig. 5 for the

MLPNN M5 model, in the USGS 421015121471800

Station.

Modelling DO in the USGS 421401121480900 station

The accuracy and performance of RBFNN model for

modelling DO in the USGS 421401121480900 station are

evaluated and compared using RMSE, MAE, and CC sta-

tistical criterion. Table 5 shows all these criteria in the

training, validation and testing phases. The table shows

that, all models (M1 to M5) have a small RMSE value,

particularly M3, M4 and M5 models. According to Table 5

for all three RBFNN models (M3, M4 and M5), the per-

formance in the training phase was slightly better than the

performance for the validation and testing phases, with

only few improvements, with the exception of the model

M5 where the difference was statistically significant.

Nevertheless, the M5 model must be considered as the best

model developed. However, in accordance of the results

obtained in the previous station the M2 model that used SC

and TE as input, performed much poorer than that the

others in terms of RMSE, MAE, and CC. As seen from

Table 5, the five RBFNN models have shown significant

variations based on the three performance criteria. In the

training phase, the lowest value of the RMSE of the models

is 0.287 mg/L (in RBFNN M5) and the highest value of the

CC is 0.991 (in RBFNN M5). In addition, the lowest value

of MAE is 0.184 mg/L also (in RBFNN M5). Table 5

indicates that the RBFNN (M5) has the smallest MAE

(0.281 mg/L), RMSE (0.461 mg/L), and the highest CC

(0.978) in the validation phase; and in the testing phase the

RBFNN (M5) has the smallest MAE (0.312 mg/L), RMSE

(0.644 mg/L) and the highest CC (0.955). The Scatterplots

and comparison of observed and calculated values of DO in

the Training, Validation and Testing phase, respectively,

are shown in Fig. 6 for the RBFNN M5 model, in the

USGS 421401121480900 station.

Fig. 5 Results with MLPNN model for USGS 421015121471800 station. Scatterplots and comparison of observed and calculated series of DOin the: a training, b validation and c testing phase, respectively


123

Table 5 Performances of theRBFNN and MLPNN models in

different phases for USGS

421401121480900 station

Models Training Validation Testing


MLPNN

M1 0.941 0.706 0.474 0.946 0.693 0.473 0.934 0.761 0.509

M2 0.844 1.125 0.705 0.846 1.144 0.725 0.831 1.173 0.735

M3 0.964 0.559 0.367 0.965 0.562 0.383 0.954 0.641 0.409

M4 0.971 0.504 0.332 0.972 0.506 0.341 0.966 0.552 0.357

M5 0.976 0.453 0.300 0.976 0.469 0.320 0.972 0.497 0.324

RBFNN

M1 0.954 0.626 0.412 0.955 0.637 0.420 0.935 0.761 0.491

M2 0.825 1.184 0.748 0.828 1.201 0.785 0.810 1.236 0.774

M3 0.977 0.444 0.281 0.972 0.508 0.330 0.951 0.685 0.390

M4 0.977 0.444 0.267 0.975 0.481 0.303 0.951 0.670 0.370

M5 0.991 0.287 0.184 0.978 0.461 0.281 0.955 0.644 0.312

Fig. 6 Results with RBFNN model for USGS 421401121480900 station. Scatterplots and comparison of observed and calculated series of DO inthe: a training, b validation and c testing phase, respectively


123

As seen from Table 5, the five MLPNN models have

shown significant variations based on the three perfor-

mance criteria. The lowest value of the RMSE of fore-

casting models is 0.453 (mg/L) (in MLPNN M5) and the

highest value of the CC is 0.976 (in MLPNN M5). In

addition, the lowest value of MAE is 0.300 (mg/L) also (in

MLPNN M5). From the results of training, validation, and

testing all the five models developed in this study are

evaluated all together, and the M1, M3, M4, and M5

models are conspicuous. Among these, the M4 and M5

models have quite low MAE and high CC, and the M5

model is very successful on testing phase. All these models

were examined comparing their ability on modelling

hourly DO concentration. During training, the MLPNN

(M5) performs slightly better than the others. Also, in the

validation and testing phases, the MLPNN (M5) outper-

forms all other models in terms of various performance

criteria. In conclusion, the M5 model is the best developed

model for modelling DO concentration. The comparison

between the RBFNN and MLPNN clearly show the

differences between the two models which favour the

MLPNN in testing phase, while the RBFNN outperform

MLPNN in the training and validation phases, thereby

establishing the superiority of the RBFNN models. In the

training phase, the RBFNN M5 improved the MLPNN M5

of about 1.5 % regarding the CC value. In addition, in the

validation phase as seen in Tables 5, the RBFNN M5

improved the MLPNN M5 of about 0.2 % regarding the

CC value. In the testing phase, the MLPNN M5 improved

the RBFNN M5 of about 1.7 % regarding the CC value.

The Scatterplots and comparison of observed and calcu-

lated values of DO in the Training, Validation and Testing

phase, respectively, are shown in Fig. 7 for the MLPNN

M5 model, in the USGS 421401121480900 Station.

Forecasting DO concentration

Notwithstanding, the importance of the developed models

for estimating DO, it should be noted that they are linked to

the water quality variables, and the models cannot be done

Fig. 7 Results with MLPNN model for USGS 421401121480900 station. Scatterplots and comparison of observed and calculated series of DOin the: a training, b validation and c testing phase, respectively


123

and if they are done they can only be done with available,

timely and reliable water quality data (Heddam 2016a). It

would be interesting to investigate the capabilities of the

proposed models (RBFNN and MLPNN) for forecasting

DO at different level in advance using only the time series

of DO and without the need for water quality variables as

input to the models. Attempts to do this have not been

entirely investigated in the past, and the present study is

one of the first studies presented in the literature for fore-

casting DO time series. The statistics of the hourly DO

Time series used for developing the forecasting models are

shown in Table 6. We have developed six forecasting

models called FM1 to FM6 using the same input data and

have different output, the structure are shown in Table 7

below. It can be seen from Table 7; the six forecasting

models have the same input structure: the input variables

present the previously measured DO (t - 3, t - 2, t - 1

and t) and the output variable corresponds to the DO at

time t ? 1, t ? 12, t ? 24, t ? 48, t ? 72 and t ? 168,

where DO (t) corresponds to the DO at time t. Figure 8

shows how the input and output data sets are created.

According to Table 7, all the six developed models are

basically approximators of the general equation, where n is

the next time step:

DOðtþnÞ¼ f DOðtÞþDOðt�1ÞþDOðt�2ÞþDOðt�3Þ½ �:ð12Þ

Table 8 represents the MLPNN and RBFNN results of

DO forecasting in different phases, several hours in

advance, for USGS421401121480900 station. On

analyzing Table 8, it is apparent that the performance of

the RBFNN and MLPNN are good until 72 h in advance,

since the CC are rather than 0.92 in the validation phase. In

the testing phase the two models have a CC equals to 0.74.

From the Table 8 it can be observed that the model FM1

whose output is the DO at (t ? 1) performed better than the

others models in the training, validation, and testing pha-

ses. For the RBFNN model and according to Table 8, in the

training phase, the values of CC, RMSE and MAE, ranged

from 0.796 to 0.989, 0.322 to 1.356, and 0.220 to 0.948,

respectively. In addition, in the validation phase, the values

of CC, RMSE, and MAE ranged from 0.753 to 0.997, 0.089

to 0.807, and 0.065 to 0.649, respectively. Finally, in the

testing phase, the values of CC, RMSE, and MAE ranged

from 0.533 to 0.997, 0.071 to 0.704, and 0.054 to 0.606,

respectively. It may be seen from Table 8, the results

obtained using the MLPNN models are generally similar to

those obtained by the RBFNN models, and there were

some noticeable differences. In the training phase, MLPNN

models FM1, FM3, FM4 and FM5 are superior to the same

RBFNN models and the difference was more marked

amongst the CC coefficients. From the Table 8 it can be

revealed that RBFNN and MLPNN FM1 models with 1 h

ahead obtained the best statistics of CC (0.997 and 0.997),

RMSE (0.067 mg/L and 0.071 mg/L), and MAE

(0.052 mg/L and 0.054 mg/L) respectively, in the testing

phase. An important conclusion from the results obtained is

that that increasing the forecasting horizon from (1) to

(168) h ahead decreases the model accuracy. The CC

decreases from 0.99 to 0.53 in testing phase and the RMSE

Table 6 Hourly statisticalparameters of data set for

forecasting models

Station Period Unit Xmean Xmax Xmin Sx Cv

USGS421015121471800 Training mg/l 6.347 14.400 0.100 3.500 0.551

Validation mg/l 12.120 30.500 7.000 5.257 0.434

Testing mg/l 9.653 15.300 6.000 1.218 0.126

Whole period mg/l 8.163 30.500 0.100 4.328 0.530

USGS421401121480900 Training mg/l 8.883 15.800 1.900 2.205 0.248

Validation mg/l 11.709 14.600 10.100 1.170 0.100

Testing mg/l 11.709 14.600 10.100 1.170 0.100

Whole period mg/l 9.606 15.800 1.900 2.128 0.222

Xmean mean, Xmax maximum, Xmin minimum, Sx standard deviation, Cv coefficient of variation

Table 7 Combinations of inputvariables considered in

developing forecasting models

Model Input structure Output

FM1 DO (t - 3), DO (t - 2), DO (t - 1), DO (t) DO (t ? 1): ?1 h ahead







123

and MAE increases from (0.067 and 0.052) to (0.704 and

0.606) always in the testing phase.

Table 9 represents the MLPNN and RBFNN results of

DO forecasting in different phases, for USGS

421015121471800 station. In general, CC were high (0.79

to 0.99), RMSE and MAE are low (0.450 mg/L to

2.143 mg/L, 0.253 mg/L to 1.581 mg/L) in the training

phase. From the Table 9 it can be observed that the model

FM1 either for the MLPNN or RBFNN, performed better

than the others models in the training, validation, or testing

phases. Overall, in the testing phase, RBFNN have higher

CC value and lower RMSE and MAE values than those of

the MLPNN. According to Table 9, the FM6 model is good

in the training phase, but very poor with very low CC and

very high RMSE and MAE in the validation and testing

phases. The CC ranged from (0.111 to 0.224), RMSE from

(1.210 to 1.171 mg/L), and MAE from (0.934 to 0.895 mg/

L). An important conclusion from the results obtained is

that that increasing the forecasting horizon from (1) to

(168) hours ahead decreases the model accuracy. The CC

decreases from 0.97 to 0.111 in testing phase and the

RMSE and MAE increases from (0.287 and 0.192) to

(1.210 and 0.934) always in the testing phase. Finally, two

points have to be highlighted: first, we think it is very

important that we should investigate the capabilities of the

proposed models using a long-term data set rather than

1 year. The second and final point to be stated is that the

proposed models are a powerful tool for forecasting DO up

to 72 h (3 days) ahead with good accuracy.

Conclusion

In this study, two well know artificial intelligences tech-

niques namely, RBFNN and MLPNN were developed for

modeling and forecasting DO using water quality

x1 x2 x3 x4 x5 x16 x28 x52 x76 x172 xn

Model1 : DO (t+1)

Model 2 : DO (t+12)

Model 3 : DO (t+24)

Model 4 : DO (t+48)

Model 5 : DO (t+72)

Model1 : DO (t+168)

The hourly time step (t): the value of x correspond to the DO (mg/L) at one hour interval Fig. 8 Illustration of data inputand output format for the

MLPNN and RBFNN

forecasting models and the

multi hours in advances

forecasting scheme

Table 8 Performances of theMLPNN and RBFNN

forecasting models in different

phases several hours in advance

for USGS 421401121480900

station

Forecasting interval Training Validation Testing


MLPNN model

?1 h 0.997 0.159 0.107 0.997 0.097 0.073 0.997 0.067 0.052

?12 h 0.831 1.232 0.847 0.942 0.454 0.350 0.896 0.390 0.307

?24 h 0.960 0.621 0.415 0.980 0.238 0.182 0.939 0.286 0.223

?48 h 0.927 0.833 0.562 0.954 0.363 0.287 0.845 0.444 0.353

?72 h 0.900 0.974 0.674 0.920 0.472 0.375 0.741 0.562 0.463

?168 h 0.791 1.377 0.964 0.753 0.806 0.648 0.542 0.687 0.584

RBFNN model

?1 h 0.989 0.322 0.220 0.997 0.089 0.065 0.997 0.071 0.054

?12 h 0.838 1.207 0.828 0.926 0.453 0.339 0.901 0.370 0.293

?24 h 0.949 0.700 0.468 0.980 0.236 0.181 0.935 0.296 0.231

?48 h 0.913 0.907 0.610 0.952 0.368 0.297 0.841 0.449 0.355

?72 h 0.886 1.031 0.714 0.917 0.481 0.388 0.743 0.555 0.450

?168 h 0.796 1.356 0.948 0.753 0.807 0.649 0.533 0.704 0.606

?1 h 1 h in advance, ?12 h 12 h in advance, ?24 h 24 h in advance (1 day ahead), ?48 h 48 h in advance

(2 day ahead), ?72 h 72 h in advance (3 day ahead), ?168 h 168 h in advance (1 week ahead)


123

variables and antecedent values of DO, respectively.

Given a set of training data, the two models provide a

good and powerful tool for estimating DO. To demon-

strate the usefulness of the models, we used data obtained

from two particular stations operate by the USGS data

survey. In the modeling phase, we developed five models

with different combinations of input variables and we

select the model that has the best performance according

to three performances criteria: RMSE, MAE and CC. As a

result we have obtained a CC ranged from 0.82 to 0.99,

0.82 to 0.98 and 0.81 to 0.97, in the training, validation

and testing phase respectively and the best results are

obtained with the model that contains all candidate input

variables:water pH, TE, SC, and SD. Also, it is important

to note that using only two variables as inputs, which are

TE and pH, we have obtained a high CC approximately

0.96 in the testing phase that is very promising and

encouraging. In the forecasting phase, we compared six

models using the same input structure and we have

attempted, however, to forecast, as much as possible the

DO at different hours in advance, from 1 h in advance up

to 168 h (7 days) in advance. The results obtained are

very promising, and we demonstrated that until 72 h in

advance we have a high CC approximately 0.92 in the

validation phase and 0.74 in the testing phase. At 168 h

(7 days) in advance we obtained a low result with a CC

close to 0.54. The reasons for this low level of results can

be explained in a number of ways. Firstly, the length of

the data set is probably insufficient and a data base that

covers more than 1 year is necessary, this is in order to

include in the validation and testing phases the all four

seasons. Furthermore, it may also help if we applied data-

preprocessing techniques like wavelet multi-resolution

analysis, coupled with artificial intelligences techniques,

especially for forecasting longer hours in advance. In the

future, further research is necessary to improve the pre-

diction accuracy of the proposed models and it would be

interesting to evaluate the applied models for a long

period rather than 1 year.

Acknowledgments The author thanks the staffs of USGS web serverfor providing the data that makes this research possible.

References

Abdul-Aziz OI, Ishtiaq KS (2014) Robust empirical modelling of

dissolved oxygen in small rivers and streams: scaling by a single

reference observation. J Hydrol 511:648–657. doi:10.1016/j.

jhydrol.2014.02.022

Abdul-Aziz OI, Wilson BN, Gulliver JS (2007a) An extended

stochastic harmonic analysis algorithm: application for dissolved

oxygen. Water Resour Res 43:W08417. doi:10.1029/

2006WR005530

Abdul-Aziz OI, Wilson BN, Gulliver JS (2007b) Calibration and

validation of an empirical dissolved oxygen model. J Environ

Eng 133(7):698–710. doi:10.1061/(ASCE)0733-9372(2007)133:

7(698)

Adamala S, Raghuwanshi NS, Mishra A (2015) Generalized quadratic

synaptic neural networks for ET0 modeling. Environ Process

2:309–329. doi:10.1007/s40710-015-0066-6

Akkoyunlu A, Altun H, Cigizoglu H (2011) Depth-integrated

estimation of dissolved oxygen in a lake. ASCE J Environ

Eng. 137(10):961–967. doi:10.1061/(ASCE)EE.1943-7870.

0000376

Alizadeh MJ, Kavianpour MR (2015) Development of wavelet-ANN

models to predict water quality parameters in Hilo Bay, Pacific

Ocean. Mar Pollut Bull 98:171–178. doi:10.1016/j.marpolbul.

2015.06.052

Table 9 Performances of theMLPNN and RBFNN

forecasting models in different

phases several hours in advance

for USGS 421015121471800

station

Forecasting interval Training Validation Testing


MLPNN model

?1 h 0.992 0.450 0.253 0.996 0.478 0.248 0.943 0.446 0.208

?12 h 0.925 1.332 0.945 0.963 1.647 1.137 0.685 0.880 0.640

?24 h 0.957 1.045 0.728 0.970 1.360 0.869 0.835 0.615 0.447

?48 h 0.930 1.292 0.909 0.924 2.175 1.435 0.700 0.791 0.617

?72 h 0.913 1.434 1.007 0.850 2.889 2.002 0.576 0.940 0.733

?168 h 0.832 1.957 1.448 0.630 4.211 3.238 0.111 1.210 0.934

RBFNN model

?1 h 0.987 0.563 0.330 0.997 0.458 0.254 0.970 0.287 0.192

?12 h 0.905 1.491 1.061 0.974 1.319 0.840 0.712 0.837 0.638

?24 h 0.955 1.042 0.710 0.968 1.404 0.897 0.833 0.620 0.444

?48 h 0.924 1.344 0.922 0.917 2.203 1.429 0.683 0.826 0.647

?72 h 0.904 1.505 1.061 0.856 2.828 1.944 0.586 0.936 0.739

?168 h 0.794 2.143 1.581 0.594 4.334 3.309 0.224 1.171 0.895

?1 h 1 h in advance, ?12 h 12 h in advance, ?24 h 24 h in advance (1 day ahead), ?48 h 48 h in advance

(2 day ahead), ?72 h 72 h in advance (3 day ahead), ?168 h 168 h in advance (1 week ahead)


123

http://dx.doi.org/10.1016/j.jhydrol.2014.02.022http://dx.doi.org/10.1016/j.jhydrol.2014.02.022http://dx.doi.org/10.1029/2006WR005530http://dx.doi.org/10.1029/2006WR005530http://dx.doi.org/10.1061/(ASCE)0733-9372(2007)133:7(698)http://dx.doi.org/10.1061/(ASCE)0733-9372(2007)133:7(698)http://dx.doi.org/10.1007/s40710-015-0066-6http://dx.doi.org/10.1061/(ASCE)EE.1943-7870.0000376http://dx.doi.org/10.1061/(ASCE)EE.1943-7870.0000376http://dx.doi.org/10.1016/j.marpolbul.2015.06.052http://dx.doi.org/10.1016/j.marpolbul.2015.06.052

An Y, Zou Z, Zhao Y (2015) Forecasting of dissolved oxygen in the

Guanting reservoir using an optimized NGBM (1, 1) model.

J Environ Sci. 29:158–164. doi:10.1016/j.jes.2014.10.005

Antanasijević D, Pocajt V, Povrenović D, Perić-Grujić A, Ristić M

(2013) Modelling of dissolved oxygen content using artificial

neural networks: Danube River, North Serbia, case study.

Environ Sci Pollut Res 20:9006–9013. doi:10.1007/s11356-

013-1876-6

Antanasijević D, Pocajt V, Povrenović D, Perić-Grujić A, Ristić M

(2014) Modelling of dissolved oxygen in the Danube River using

artificial neural networks and Monte Carlo Simulation uncer-

tainty analysis. J Hydrol 519:1895–1907. doi:10.1016/j.jhydrol.

2014.10.009

Antonopoulos VZ, Georgiou PE, Antonopoulos ZV (2015) Dispersion

coefficient prediction using empirical models and ANNs.

Environ Process 2:379–394. doi:10.1007/s40710-015-0074-6

Areerachakul S, Sophatsathit P, Lursinsap C (2013) Integration of

unsupervised and supervised neural networks to predict dis-

solved oxygen concentration in canals. Ecol Model

261–262:1–7. doi:10.1016/j.ecolmodel.2013.04.002

Ay M, Kisi O (2012) Modeling of dissolved oxygen concentration

using different neural network techniques in Foundation Creek,

El Paso County, Colorado. ASCE J Environ Eng

138(6):654–662. doi:10.1061/(ASCE)EE.1943-7870.0000511

Azad S, Debnath S, Rajeevan M (2015) Analysing predictability in

indian monsoon rainfall: a data analytic approach. Environ

Process 2(1):717–727. doi:10.1007/s40710-015-0108-0

Barzegar R, Moghaddam AA (2016) Combining the advantages of

neural networks using the concept of committee machine in the

groundwater salinity prediction. Model Earth Syst Environ 2:26.

doi:10.1007/s40808-015-0072-8

Barzegar R, Sattarpour M, Nikudel MR, Moghaddam AA (2016)

Comparative evaluation of artificial intelligence models for

prediction of uniaxial compressive strength of travertine rocks,

case study: Azarshahr area, NW Iran. Model Earth Syst Environ

2:76. doi:10.1007/s40808-016-0132-8

Bayram A, Uzlu E, Kankal M et al (2015) Modeling stream dissolved

oxygen concentration using teaching-learning based optimiza-

tion algorithm. Environ Earth Sci 73:6565–6576. doi:10.1007/

s12665-014-3876-3

Bhunia GS, Shit PK, Maiti R (2016) Spatial variability of soil organic

carbon under different land use using radial basis function

(RBF). Model Earth Syst Environ. 2:17. doi:10.1007/s40808-

015-0070-x

Broomhead DS, Lowe D (1988) Multivariable functional interpola-

tion and adaptive networks. Complex Syst. 2:321–355

Costa M, Gonçalves AM (2011) Clustering and forecasting of

dissolved oxygen concentration on a river basin. Stoch Environ

Res Risk Assess 25:151–163. doi:10.1007/s00477-010-0429-5

Das DB, Thirakulchaya T, Deka L, Hanspal NS (2015) Artificial

neural network to determine dynamic effect in capillary pressure

relationship for two-phase flow in porous media with micro-

heterogeneities. Environ Process 2:1–18. doi:10.1007/s40710-

014-0045-3

Dhar J, Baghel RS (2016) Role of dissolved oxygen on the plankton

dynamics in spatiotemporal domain. Model Earth Syst Environ

2:6. doi:10.1007/s40808-015-0061-y

Ehteshami M, Farahani ND, Tavassoli S (2016) Simulation of nitrate

contamination in groundwater using artificial neural networks.

Model Earth Syst Environ 2:28. doi:10.1007/s40808-016-0080-3

Emamgholizadeh S, Kashi H, Marofpoor I, Zalaghi E (2014)

Prediction of water quality parameters of Karoon River (Iran)

by artificial intelligence-based models. Int J Environ Sci Technol

11:645–656. doi:10.1007/s13762-013-0378-x

Evrendilek F, Karakaya N (2014) Regression model-based predictions

of diel, diurnal and nocturnal dissolved oxygen dynamics after

wavelet denoising of noisy time series. Phys A 404:8–15. doi:10.

1016/j.physa.2014.02.062

Evrendilek F, Karakaya N (2015) Spatiotemporal modeling of

saturated dissolved oxygen through regressions after wavelet

denoising of remotely and proximally sensed data. Earth Sci Inf

8:247–254. doi:10.1007/s12145-014-0148-4

Gonçalves AM, Costa M (2013) Predicting seasonal and hydro-

meteorological impact in environmental variables modelling via

Kalman filtering. Stoch Environ Res Risk Assess 27:1021–1038.

doi:10.1007/s00477-012-0640-7

Handhal AM (2016) Prediction of reservoir permeability from

porosity measurements for the upper sandstone member of

Zubair Formation in Super-Giant South Rumila oil field,

southern Iraq, using M5P decision tress and adaptive neuro-

fuzzy inference system (ANFIS): a comparative study. Model

Earth Syst Environ 2:111. doi:10.1007/s40808-016-0179-6

Haykin S (1999) Neural networks a comprehensive foundation.

Prentice Hall, Upper Saddle River

Heddam S (2014a) Generalized regression neural network (GRNN)

based approach for modelling hourly dissolved oxygen concen-

tration in the Upper Klamath River, Oregon, USA. Environ

Technol 35(13):1650–1657. doi:10.1080/09593330.2013.878396

Heddam S (2014b) Modelling hourly dissolved oxygen concentration

(DO) using two different adaptive neuro-fuzzy inference systems

(ANFIS): a comparative study. Environ Monit Assess

186:597–619. doi:10.1007/s10661-013-3402-1

Heddam S (2014c) Modelling hourly dissolved oxygen concentration

(DO) using dynamic evolving neural-fuzzy inference system

(DENFIS) based approach: case study of Klamath River at miller

island boat ramp, Oregon, USA. Environ Sci Pollut Res

21:9212–9227. doi:10.1007/s11356-014-2842-7

Heddam S (2014d) Generalized regression neural network (GRNN)-

based approach for colored dissolved organic matter (CDOM)

retrieval: case study of Connecticut River at Middle Haddam

Station, USA. Environ Monit Assess 186:7837–7848. doi:10.

1007/s10661-014-3971-7

Heddam S (2016a) Use of optimally pruned extreme learning

machine (OP-ELM) in forecasting dissolved oxygen concentra-

tion (DO) several hours in advance: a case study from the

Klamath River. Environ Process, Oregon, USA. doi:10.1007/

s40710-016-0172-0

Heddam S (2016b) Secchi disk depth estimation from water quality

parameters: artificial neural network versus multiple linear

regression models? Environ Process. doi:10.1007/s40710-016-

0144-4

Heddam S (2016c) Multilayer perceptron neural network based

approach for modelling Phycocyanin pigment concentrations:

case study from lower Charles River buoy. Environ Sci Pollut

Res, USA. doi:10.1007/s11356-016-6905-9

Heddam S, Bermad A, Dechemi N (2011) Applications of radial basis

function and generalized regression neural networks for mod-

elling of coagulant dosage in a drinking water treatment: a

comparative study. ASCE J Environ Eng 137(12):1209–1214.

doi:10.1061/(ASCE)EE.1943-7870.0000435

Heddam S, Bermad A, Dechemi N (2012) ANFIS-based modelling

for coagulant dosage in drinking water treatment plant: a case

study. Environ Monit Assess 184:1953–1971. doi:10.1007/

s10661-011-2091-x

Heddam S, Lamda H, Filali S (2016) Predicting effluent biochemical

oxygen demand in a wastewater treatment plant using general-

ized regression neural network based approach: a comparative

study. Environ Process 3(1):153–165. doi:10.1007/s40710-016-

0129-3

Hornik K (1991) Approximation capabilities of multilayer feed

forward networks. Neural Netw 4(2):251–257. doi:10.1016/

0893-6080(91)90009-T


123

http://dx.doi.org/10.1016/j.jes.2014.10.005http://dx.doi.org/10.1007/s11356-013-1876-6http://dx.doi.org/10.1007/s11356-013-1876-6http://dx.doi.org/10.1016/j.jhydrol.2014.10.009http://dx.doi.org/10.1016/j.jhydrol.2014.10.009http://dx.doi.org/10.1007/s40710-015-0074-6http://dx.doi.org/10.1016/j.ecolmodel.2013.04.002http://dx.doi.org/10.1061/(ASCE)EE.1943-7870.0000511http://dx.doi.org/10.1007/s40710-015-0108-0http://dx.doi.org/10.1007/s40808-015-0072-8http://dx.doi.org/10.1007/s40808-016-0132-8http://dx.doi.org/10.1007/s12665-014-3876-3http://dx.doi.org/10.1007/s12665-014-3876-3http://dx.doi.org/10.1007/s40808-015-0070-xhttp://dx.doi.org/10.1007/s40808-015-0070-xhttp://dx.doi.org/10.1007/s00477-010-0429-5http://dx.doi.org/10.1007/s40710-014-0045-3http://dx.doi.org/10.1007/s40710-014-0045-3http://dx.doi.org/10.1007/s40808-015-0061-yhttp://dx.doi.org/10.1007/s40808-016-0080-3http://dx.doi.org/10.1007/s13762-013-0378-xhttp://dx.doi.org/10.1016/j.physa.2014.02.062http://dx.doi.org/10.1016/j.physa.2014.02.062http://dx.doi.org/10.1007/s12145-014-0148-4http://dx.doi.org/10.1007/s00477-012-0640-7http://dx.doi.org/10.1007/s40808-016-0179-6http://dx.doi.org/10.1080/09593330.2013.878396http://dx.doi.org/10.1007/s10661-013-3402-1http://dx.doi.org/10.1007/s11356-014-2842-7http://dx.doi.org/10.1007/s10661-014-3971-7http://dx.doi.org/10.1007/s10661-014-3971-7http://dx.doi.org/10.1007/s40710-016-0172-0http://dx.doi.org/10.1007/s40710-016-0172-0http://dx.doi.org/10.1007/s40710-016-0144-4http://dx.doi.org/10.1007/s40710-016-0144-4http://dx.doi.org/10.1007/s11356-016-6905-9http://dx.doi.org/10.1061/(ASCE)EE.1943-7870.0000435http://dx.doi.org/10.1007/s10661-011-2091-xhttp://dx.doi.org/10.1007/s10661-011-2091-xhttp://dx.doi.org/10.1007/s40710-016-0129-3http://dx.doi.org/10.1007/s40710-016-0129-3http://dx.doi.org/10.1016/0893-6080(91)90009-Thttp://dx.doi.org/10.1016/0893-6080(91)90009-T

Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward

networks are universal approximators. Neural Netw 2:359–366.

doi:10.1016/0893-6080(89)90020-8

Javan K, Lialestani MR, Nejadhossein M (2015) A comparison of

ANN and HSPF models for runoff simulation in Gharehsoo

River watershed, Iran. Modell Earth Syst Environ. 1:41. doi:10.

1007/s40808-015-0042-1

Kasiviswanathan KS, Sudheer KP (2016) Comparison of methods

used for quantifying prediction interval in artificial neural

network hydrologic models. Model Earth Syst Environ 2:22.

doi:10.1007/s40808-016-0079-9

Kasiviswanathan KS, Saravanan S, Balamurugan M, Saravanan K

(2016) Genetic programming based monthly groundwater level

forecast models with uncertainty quantification. Model Earth

Syst Environ 2:27. doi:10.1007/s40808-016-0083-0

Kisi O, Akbari N, Sanatipour M, Hashemi A, Teimourzadeh K, Shiri J

(2013) Modeling of dissolved oxygen in river water using

artificial intelligence techniques. J Environ Inform JEI

22(2):92–101. doi:10.3808/jei.201300248

Ladlani I, Houichi L, Djemili L, Heddam S, Belouz K (2012)

Modeling daily reference evapotranspiration (ET0) in the north

of Algeria using generalized regression neural networks (GRNN)

and radial basis function neural networks (RBFNN): a compar-

ative study. Meteorol Atmos Phys 118:163–178. doi:10.1007/

s00703-012-0205-9

Legates DR, McCabe GJ (1999) Evaluating the use of ‘‘goodness of

fit’’ measures in hydrologic and hydroclimatic model validation.

Water Resour Res 35:233–241. doi:10.1029/1998WR900018

Lin GF, Wu MC (2011) An RBF network with a two-step learning

algorithm for developing a reservoir inflow forecasting model.

J Hydrol 405:439–450. doi:10.1016/j.jhydrol.2011.05.042

Liu S, Tai H, Ding Q, Li D, Xu L, Wei Y (2013) A hybrid approach of

support vector regression with genetic algorithm optimization for

aquaculture water quality prediction. Math Comput Model

58:458–465. doi:10.1016/j.mcm.2011.11.021

Liu S, Xu L, Jiang Y, Li D, Chen Y, Li Z (2014) A hybrid WA-

CPSO-LSSVR model for dissolved oxygen content prediction in

crab culture. Eng Appl Artif Intell 29:114–124. doi:10.1016/j.

engappai.2013.09.019

Mandal S, Mahapatra SS, Adhikari S, Patel RK (2015) Modeling of

arsenic (III) removal by evolutionary genetic programming and

least square support vector machine models. Environ Process

2:145–172. doi:10.1007/s40710-014-0050-6

Misra OP, Chaturvedi D (2016) Fate of dissolved oxygen and survival

of fish population in aquatic ecosystem with nutrient loading: a

model. Model Earth Systems Environ 2:112. doi:10.1007/

s40808-016-0168-9

Mondal I, Bandyopadhyay J, Paul AK (2016) Water quality modeling

for seasonal fluctuation of Ichamati River, West Bengal, India.

Model Earth Syst Environ 2:113. doi:10.1007/s40808-016-0153-3

Moody J, Darken C (1989) Fast learning in networks of locally tuned

processing units. Neural Comput 1(2):281–294. doi:10.1162/

neco.1989.1.2.281

Moriasi DN, Arnold JG, Van Liew MW, Bingner RL, Harmel RD,

Veith TL (2007) Model evaluation guidelines for systematic

quantification of accuracy in watershed simulations. Trans

ASABE 50(3):885–900. doi:10.13031/2013.23153

Nemati S, Fazelifard MH, Terzi O, Ghorbani MA (2015) Estimation

of dissolved oxygen using data-driven techniques in the Tai Po

River, Hong Kong. Environ Earth Sci 74:4065–4073. doi:10.

1007/s12665-015-4450-3

O’Driscoll C, O’Connor M, Asam Z, Eyto E, Brown LE, Xiao L

(2016) Forest clear felling effects on dissolved oxygen and

metabolism in peatland streams. J Environ Manag 166:250–259.

doi:10.1016/j.jenvman.2015.10.031

Pal S, Manna S, Chattopadhyay B, Mukhopadhyay SK (2016) Carbon

sequestration and its relation with some soil properties of East

Kolkata Wetlands (a Ramsar Site): a spatio-temporal study using

radial basis functions. Model Earth Syst Environ 2:80. doi:10.

1007/s40808-016-0136-4

Park J, Sandberg IW (1991) Universal approximation using radial

basis function networks. Neural Comput 3(2):246–257. doi:10.

1162/neco.1991.3.2.246

Parsaie A (2016) Predictive modeling the side weir discharge

coefficient using neural network. Model Earth Syst Environ

2:63. doi:10.1007/s40808-016-0123-9

Parsaie A, Haghiabi AH (2015) Predicting the longitudinal dispersion

coefficient by radial basis function neural network. Model Earth

Syst Environ 1:34. doi:10.1007/s40808-015-0037-y

Parsaie A, Yonesi HA, Najafian S (2015) Predictive modeling of

discharge in compound open channel by support vector machine

technique. Model Earth Syst Environ 1:1. doi:10.1007/s40808-

015-0002-9

Poggio T, Girosi F (1990a) Regularization algorithms for learning

that are equivalent to multilayer networks. Sci New Ser

247(4945):978–982. doi:10.1126/science.247.4945.978

Poggio T, Girosi F (1990b) Networks for approximation and learning.

Proc IEEE 78:1481. doi:10.1109/5.58326

Prasad MB, Long W, Zhang X, Wood RJ, Murtugudde R (2011)

Predicting dissolved oxygen in the Chesapeake Bay: applicationsand implications. Aquat Sci 73:437–451. doi:10.1007/s00027-

011-0191-x

Raj AS, Oliver DH, Srinivas Y (2015) An automatic inversion tool for

geoelectrical resistivity data using supervised learning algorithm

of adaptive neuro fuzzy inference system (ANFIS). Model Earth


Rumelhart DE, Hinton GE, Williams RJ (1986) Learning internal

representations by error propagation. In: Rumelhart DE,

McClelland PDP, Research Group (eds) Parallel distributed

processing: explorations in the microstructure of cognition.

Foundations, vol I. MIT Press, Cambridge, pp 318–362

Sakizadeh M (2016) Artificial intelligence for the prediction of water

quality index in groundwater systems. Model Earth Syst Environ

2:8. doi:10.1007/s40808-015-0063-9

Salami ES, Ehteshami M (2016) Application of neural networks

modeling to environmentally global climate change at San

Joaquin Old River Station. Model Earth Syst Environ 2:38.

doi:10.1007/s40808-016-0094-x

Santisukkasaem U, Olawuyi F, Oye P, Das DB (2015) Artificial

neural network (ANN) For evaluating permeability decline in

permeable reactive barrier (PRB). Environ Process 2:291–307.

doi:10.1007/s40710-015-0076-4

Sharma N, Zakaullah Md, Tiwari H, Kumar D (2015) Runoff and

sediment yield modeling using ANN and support vector

machines: a case study from Nepal watershed. Model Earth


Sullivan AB, Rounds SA, Deas ML, Sogutlugil IE (2012) Dissolved

oxygen analysis, TMDL model comparison, and particulate

matter shunting-preliminary results from three model scenarios

for the Klamath River upstream of Keno Dam, Oregon: US

Geological Survey Open-File Report 2012-1101, 30 p. http://

pubs.usgs.gov/of/2012/1101/. Accessed 13 June 2016

Sullivan AB, Rounds SA, Asbill-Case JR, Deas ML (2013a)

Macrophyte and pH buffering updates to the Klamath River

water-quality model upstream of Keno Dam, Oregon: US

Geological Survey Scientific Investigations Report 2013-5016,


123

http://dx.doi.org/10.1016/0893-6080(89)90020-8http://dx.doi.org/10.1007/s40808-015-0042-1http://dx.doi.org/10.1007/s40808-015-0042-1http://dx.doi.org/10.1007/s40808-016-0079-9http://dx.doi.org/10.1007/s40808-016-0083-0http://dx.doi.org/10.3808/jei.201300248http://dx.doi.org/10.1007/s00703-012-0205-9http://dx.doi.org/10.1007/s00703-012-0205-9http://dx.doi.org/10.1029/1998WR900018http://dx.doi.org/10.1016/j.jhydrol.2011.05.042http://dx.doi.org/10.1016/j.mcm.2011.11.021http://dx.doi.org/10.1016/j.engappai.2013.09.019http://dx.doi.org/10.1016/j.engappai.2013.09.019http://dx.doi.org/10.1007/s40710-014-0050-6http://dx.doi.org/10.1007/s40808-016-0168-9http://dx.doi.org/10.1007/s40808-016-0168-9http://dx.doi.org/10.1007/s40808-016-0153-3http://dx.doi.org/10.1162/neco.1989.1.2.281http://dx.doi.org/10.1162/neco.1989.1.2.281http://dx.doi.org/10.13031/2013.23153http://dx.doi.org/10.1007/s12665-015-4450-3http://dx.doi.org/10.1007/s12665-015-4450-3http://dx.doi.org/10.1016/j.jenvman.2015.10.031http://dx.doi.org/10.1007/s40808-016-0136-4http://dx.doi.org/10.1007/s40808-016-0136-4http://dx.doi.org/10.1162/neco.1991.3.2.246http://dx.doi.org/10.1162/neco.1991.3.2.246http://dx.doi.org/10.1007/s40808-016-0123-9http://dx.doi.org/10.1007/s40808-015-0037-yhttp://dx.doi.org/10.1007/s40808-015-0002-9http://dx.doi.org/10.1007/s40808-015-0002-9http://dx.doi.org/10.1126/science.247.4945.978http://dx.doi.org/10.1109/5.58326http://dx.doi.org/10.1007/s00027-011-0191-xhttp://dx.doi.org/10.1007/s00027-011-0191-xhttp://dx.doi.org/10.1007/s40808-015-0006-5http://dx.doi.org/10.1007/s40808-015-0063-9http://dx.doi.org/10.1007/s40808-016-0094-xhttp://dx.doi.org/10.1007/s40710-015-0076-4http://dx.doi.org/10.1007/s40808-015-0027-0http://pubs.usgs.gov/of/2012/1101/http://pubs.usgs.gov/of/2012/1101/

52 p. http://pubs.usgs.gov/sir/2013/5016/. Accessed 13 June

2016

Sullivan AB, Sogutlugil IE, Rounds SA, Deas ML (2013b) Modeling

the water-quality effects of changes to the Klamath River

upstream of Keno Dam, Oregon: US Geological Survey

Scientific Investigations Report 2013-5135, 60 p. http://pubs.

usgs.gov/sir/2013/5135. Accessed 13 June 2016

Xu L, Liu S (2013) Study of short-term water quality prediction

model based on wavelet neural network. Math Comput Model

58:807–813. doi:10.1016/j.mcm.201


123

http://pubs.usgs.gov/sir/2013/5016/http://pubs.usgs.gov/sir/2013/5135http://pubs.usgs.gov/sir/2013/5135http://dx.doi.org/10.1016/j.mcm.201

Simultaneous modelling and forecasting of hourly dissolved oxygen concentration (DO) using radial basis function neural network (RBFNN) based approach: a case study from the Klamath River, Oregon, USAAbstractIntroductionMethodologyMultilayer perceptron neural network (MLPNN)Radial basis function neural network (RBFNN)Description of study areaRanges of water quality dataPerformance indices

Results and discussionModelling DO concentrationModelling DO in the USGS 421015121471800 stationModelling DO in the USGS 421401121480900 station

Forecasting DO concentration

ConclusionAcknowledgmentsReferences

Documents

Simultaneous modelling and forecasting of hourly dissolved ......Kisi et al. (2013) investigated the accuracy of three artiﬁcial intelligence techniques, namely MLPNN, ANFIS and