Epileptic seizure detection using dynamic wavelet network
Abdulhamit Subasi*
Department of Electrical and Electronics Engineering, Kahramanmaras Sutcu Imam University, 46601 Kahramanmaras, Turkey
Abstract
Epileptic seizures are manifestations of epilepsy. Careful analyses of the electroencephalograph (EEG) records can provide valuable
insight and improved understanding of the mechanisms causing epileptic disorders. The detection of epileptiform discharges in the EEG is an
important component in the diagnosis of epilepsy. Wavelet transform is particularly effective for representing various aspects of non-
stationary signals such as trends, discontinuities, and repeated patterns where other signal processing approaches fail or are not as effective.
Through wavelet decomposition of the EEG records, transient features are accurately captured and localized in both time and frequency
context. This paper deals with a novel method of analysis of EEG signals using discrete wavelet transform, and classification using ANN.
EEG signals were decomposed into the frequency sub-bands using wavelet transform. Then these sub-band frequencies were used as an input
to an ANN with two discrete outputs: normal and epileptic. In this study, FEBANN and DWN based classifiers were developed and compared
in relation to their accuracy in classification of EEG signals. The comparisons between the developed classifiers were primarily based on
analysis of the ROC curves as well as a number of scalar performance measures pertaining to the classification. Within the same group,
the DWN-based classifier was more accurate than its FEBANN-based counterpart.
© 2005 Elsevier Ltd. All rights reserved.
Keywords: Electroencephalogram (EEG); Epileptic seizure; Discrete wavelet transform (DWT); Feedforward error backpropagation artificial neural network
(FEBANN); Dynamic wavelet network (DWN)
1. Introduction
Epileptic seizures result from a temporary electrical
disturbance of the brain. Sometimes seizures may go
unnoticed, depending on their presentation, and sometimes
may be confused with other events, such as a stroke, which
can also cause falls or migraines. Approximately one in
every 100 persons will experience a seizure at some time in
their life (Iasemidis et al., 2003). Unfortunately, the
occurrence of an epileptic seizure seems unpredictable
and the underlying process is poorly understood. The
electroencephalogram (EEG) has been the signal most widely
used to assess brain activity clinically. EEG is a record of the
electrical potentials generated by the cerebral cortex nerve cells.
There are two types of EEG, depending on where on
the head the signal is taken: scalp or intracranial. For
scalp EEG, the focus of this research, small metal discs, also
known as electrodes, are placed on the scalp with good
mechanical and electrical contact. Intracranial EEG is
obtained from special electrodes implanted in the brain during
surgery. In order to detect the voltage of the brain neuron
current accurately, the electrodes must have low
impedance (<5 kΩ). The changes in the voltage difference
between electrodes are sensed and amplified before being
transmitted to a computer program that displays the tracing of
the voltage potential recordings. The recorded EEG
provides a continuous graphic exhibition of the spatial
distribution of the changing voltage fields over time.
0957-4174/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2005.04.007
* Tel.: +90 344 219 1253; fax: +90 344 219 1052.
E-mail address: [email protected].
Since routine clinical diagnosis requires analysis of EEG
signals, some automation and computer techniques have
been used for this purpose. In the early days of automatic EEG
processing, representations based on the Fourier transform
were most commonly applied. This approach is based
on earlier observations that the EEG spectrum contains
some characteristic waveforms that fall primarily within
four frequency bands: δ (<4 Hz), θ (4–8 Hz), α (8–13 Hz),
and β (13–30 Hz). Such methods have proved
beneficial for various EEG characterizations, but the fast
Fourier transform (FFT) suffers from large noise sensitivity.
Parametric power spectrum estimation methods, such as
autoregressive (AR) modelling, reduce the spectral loss problems and
give better frequency resolution. However, since EEG
signals are non-stationary, the parametric methods are not
suitable for frequency decomposition of these signals
(Subasi, 2005).
Expert Systems with Applications 29 (2005) 343–355
www.elsevier.com/locate/eswa
A powerful method was proposed in the late 1980s to
perform time-scale analysis of signals: the wavelet transform
(WT). This method provides a unified framework for
different techniques that have been developed for various
applications (Adeli, Zhou, & Dadmehr, 2003; Basar,
Schurmann, Demiralp, Basar-Eroglu, & Ademoglu, 2001;
Folkers, Mosch, Malina, & Hofmann, 2003; Geva, &
Kerem, 1998; Hazarika, Chen, Tsoi, & Sergejer, 1997;
Kalayci, & Ozdamar, 1995; Khan, & Gotman, 2003;
Patwardhan, Dhawan, & Relue, 2003; Petrosian, Prokhorov,
Homan, Dashei, & Wunsch, 2000; Quiroga, Sakowitz,
Basar, & Schurmann, 2001; Quiroga, & Schurmann, 1999;
Rosso, Blanco, & Rabinowicz, 2003; Rosso, Martin, &
Plastino, 2002; Samar, Bopardikar, Rao, & Swartz, 1999;
Soltani, Simard, & Boichu, 2004; Zhang, Kawabata, & Liu,
2001). It should also be emphasized that the WT is
appropriate for analysis of non-stationary signals, and this
represents a major advantage over spectral analysis. Hence
the WT is well suited to locating transient events. Such
transient events as spikes can occur during epileptic
seizures.
Wavelet is an effective time–frequency analysis tool for
analyzing transient signals. Its feature extraction and
representation properties can be used to analyze various
transient events in biological signals. Adeli et al. (2003)
gave an overview of the DWT developed for recognizing
and quantifying spikes, sharp waves and spike-waves. They
used wavelet transform to analyze and characterize epilepti-
form discharges in the form of 3-Hz spike and wave
complex in patients with absence seizure. Through wavelet
decomposition of the EEG records, transient features are
accurately captured and localized in both time and
frequency context. The capability of this mathematical
microscope to analyze different scales of neural rhythms is
shown to be a powerful tool for investigating small-scale
oscillations of the brain signals. A better understanding of
the dynamics of the human brain through EEG analysis can
be obtained through further analysis of such EEG records.
Numerous other techniques from the theory of signal
analysis have been used to obtain representations and
extract the features of interest for classification purposes.
Neural networks and statistical pattern recognition methods
have been applied to EEG analysis. Neural network
detection systems have been proposed by a number of
researchers (Gabor, Leach, & Dowla, 1996; Haselsteiner, &
Pfurtscheller, 2000; Kiymik, Akin, & Subasi, in
press; Peters, Pfurtscheller, & Flyvbjerg, 2001; Pradhan,
Sadasivan, & Arunodaya, 1996; Qu, & Gotman, 1997;
Robert, Gaudy, & Limoge, 2002; Sun, & Sclabassi, 2000;
Webber, Lesser, Richardson , & Wilson, 1996; Weng, &
Khorasani, 1996).
Kalayci and Ozdamar (1995) showed that an ANN
performs better, if the input and output data can be
processed to capture the characteristic features of the signal.
They used a wavelet representation for automated detection
of the EEG spikes. More recently, ANNs that apply
Bayesian methods have been shown to be more robust than
other techniques because they incorporate measures of
confidence in their output for the Levenberg-Marquardt
(LM) procedure (Vuckovic, Radivojevic, Chen, & Popovic,
2002). In addition, the standard MLP was improved by using
finite impulse response (FIR) filters instead of static weights
for temporal processing of data (Haselsteiner, &
Pfurtscheller, 2000). Petrosian et al. (2000) demonstrated
the ability of specifically designed and trained recurrent
neural networks (RNN), combined with wavelet pre-processing,
to predict the onset of epileptic seizures on both
scalp and intracranial recordings. Recently, Kiymik et al.
(2004) presented time–frequency analysis of EEG signals
for detecting the information on alertness and drowsiness
using spectral densities of DWT coefficients as an input to
ANN.
Some studies, such as those of Petrosian et al. (2000),
report seizure prediction after analyzing one channel of
electroencephalogram (EEG) from an intracranial depth
electrode in one patient. In these studies, which used univariate
techniques, no analysis of baseline data far removed from the
seizure was undertaken. A potential pitfall of conclusions
based upon such limited data is that quantitative changes
identified prior to seizure onset may not be specific to the
pre-seizure period, but may occur at other times as well,
unrelated to epileptic events. Validation of prediction
algorithms on long, continuous sets of clinical data,
representing all states of awareness, is an important part
of more recent seizure prediction studies. A number of
promising quantitative features derived from the EEG, each
with different theoretical bases, have demonstrated utility
for seizure prediction.
As compared to the conventional method of frequency
analysis using Fourier transform or short time Fourier
transform, wavelets enable analysis with a coarse to fine
multi-resolution perspective of the signal (Subasi, 2005). In
this work, DWT has been applied for the time–frequency
analysis of EEG signals and ANN for the classification
using wavelet coefficients. EEG signals were decomposed
into frequency sub-bands using discrete wavelet transform
(DWT). An ANN-based system was implemented to classify
the EEG signal into one of two categories: epileptic or normal.
The aim of this study was to develop a simple algorithm for
the detection of epileptic seizures which could also be
applied in real time.
This paper aims to compare the more advanced and
relatively recent neural network techniques, as mathematical
tools for developing classifiers for the detection of epileptic
seizure. In the neural network techniques, both the feedfor-
ward error backpropagation ANN (FEBANN), and the
dynamic wavelet neural network (DWN) (Becerikli, 2004;
Oysal, Yilmaz, & Koklukaya, 2005) will be used. The choice
of these two networks was based on the fact that the former is
the most popular type of ANNs and the latter is one of the
most powerful networks commonly used in solving
classification/discrimination problems. The accuracy of the
various classifiers will be assessed and cross-compared, and
advantages and limitations of each technique will be
discussed.
Table 1
Frequencies corresponding to different levels of decomposition for the
Daubechies 4 filter wavelet with a sampling frequency of 200 Hz

Decomposed signal    Frequency range (Hz)
D1                   50–100
D2                   25–50
D3                   12.5–25
D4                   6.25–12.5
D5                   3.125–6.25
A5                   0–3.125
2. Materials and method
2.1. Subjects and data recording
The EEG data used in our study was downloaded from
24-h EEG recorded from both epileptic patients and normal
subjects. The following bipolar EEG channels were selected
for analysis: F7-C3, F8-C4, T5-O1 and T6-O2. In order to
assess the performance of the classifier, we selected 500
EEG segments containing spike and wave complex,
artifacts, and background normal EEG. Twenty absence
seizures (petit mal) from 5 epileptic patients admitted for
video-EEG monitoring were analyzed. The subjects consisted
of 3 males and 2 females, age 28.87 ± 15.27 (mean ± SD; range 6–43), with a diagnosis of epilepsy and no other
accompanying disorders. Recordings were done under video
control to have an accurate determination of the different
stages of the seizure. The different stages of EEG signals
were determined by two physicians. EEG data were
acquired with Ag/AgCl disk electrodes placed using the
10–20 international electrode placement system. The
recordings were band-pass filtered (1–70 Hz). For this
study, 15-min, 4-channel recordings containing epileptiform
events (spikes, spike and waves) were digitized at 200
samples per second using 12-bit resolution. All EEG recordings were
taken during restful wakefulness, but some portions of
the EEG contained EMG artifacts.
2.2. Visual inspection and validation
Two neurologists with experience in the clinical analysis
of EEG signals separately inspected every recording
included in this study to score epileptic and normal signals.
Each event was filed on the computer memory and linked to
the tracing with its start and duration. These were then
revised by the two experts jointly to solve disagreements
and set up the training set for the program, consenting to
the choice of threshold for the epileptic seizure detection.
The agreement between the two experts was evaluated—for
the testing set—as the rate between the numbers of epileptic
seizures detected by both experts. A further step was then
performed with the aim of checking the disagreements and
setting up a ‘gold standard’ reference set (De Carli, Nobili,
Gelcich, & Ferrillo, 1999). When revising this unified event
set, the human experts, by mutual consent, marked each
state as epileptic or normal. They also reviewed each
recording entirely for epileptic seizures that had been
overlooked by all during the first pass and marked them as
definite or possible. This validated set provided the
reference evaluation to estimate the sensitivity and speci-
ficity of computer scorings. Nevertheless, a preliminary
analysis was carried out solely on events in the training set,
as each stage in these sets had a definite start and duration.
2.3. Analysis using discrete wavelet transforms
The Discrete Wavelet Transform (DWT) is a versatile
signal processing tool that finds many engineering and
scientific applications. One area in which the DWT has been
particularly successful is epileptic seizure detection,
because it captures transient features and localizes them
accurately in both time and frequency.
DWT analyzes the signal at different frequency bands,
with different resolutions by decomposing the signal into a
coarse approximation and detail information. DWT
employs two sets of functions called scaling functions and
wavelet functions, which are associated with low-pass and
high-pass filters, respectively. The decomposition of the
signal into the different frequency bands is simply obtained
by successive high-pass and low-pass filtering of the time
domain signal. Detailed derivations related to the wavelet
transform are given in the Appendix.
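The successive filtering described above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes the Haar wavelet (chosen because its two-tap filters are easy to verify by hand) rather than the Daubechies 4 filter used in the study, and the function names and the 32-sample test signal are hypothetical.

```python
import math

def haar_dwt_step(signal):
    """One analysis step: low-pass and high-pass filter, then keep every second sample."""
    s = 1.0 / math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal) - 1, 2)]
    approx = [s * (a + b) for a, b in pairs]  # low-pass branch (scaling function)
    detail = [s * (a - b) for a, b in pairs]  # high-pass branch (wavelet function)
    return approx, detail

def haar_wavedec(signal, levels):
    """Return [A_levels, D_levels, ..., D1] by repeatedly splitting the approximation."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        details.append(detail)
    return [approx] + details[::-1]

# Hypothetical test signal: a slow oscillation with one sharp transient at n = 20.
signal = [math.sin(0.3 * n) + (1.0 if n == 20 else 0.0) for n in range(32)]
coeffs = haar_wavedec(signal, levels=5)
```

Each level halves the number of samples, so a five-level decomposition of 32 samples yields coefficient vectors of length 1, 1, 2, 4, 8 and 16 (A5, D5, ..., D1).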
Selection of suitable wavelet and the number of levels of
decomposition is very important in analysis of signals using
DWT. The typical approach is to visually inspect the data first;
if the data are discontinuous, Haar or other sharp
wavelet functions are applied; otherwise a smoother wavelet
can be employed. Usually, tests are performed with different
types of wavelets and the one which gives maximum
efficiency is selected for the particular application.
The number of levels of decomposition is chosen based
on the dominant frequency components of the signal. The
levels are chosen such that those parts of the signal that
correlate well with the frequencies required for classifi-
cation of the signal are retained in the wavelet coefficients.
Since the EEG signals do not have any useful frequency
components above 30 Hz, the number of levels was chosen
to be 5. Thus the signal is decomposed into the details D1–
D5 and one final approximation, A5. The ranges of various
frequency bands are shown in Table 1.
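The band edges in Table 1 follow from halving the bandwidth at each level, starting from the Nyquist frequency (half the sampling rate). The short sketch below reproduces them under the usual ideal half-band filter assumption; the function name is hypothetical.

```python
def dwt_band_edges(fs, levels):
    """Nominal frequency range (Hz) of each DWT sub-band for sampling rate fs."""
    bands = {}
    for j in range(1, levels + 1):
        # Detail D_j covers the upper half of the band left over after level j - 1.
        bands[f"D{j}"] = (fs / 2 ** (j + 1), fs / 2 ** j)
    # The final approximation keeps everything below the lowest detail band.
    bands[f"A{levels}"] = (0.0, fs / 2 ** (levels + 1))
    return bands

bands = dwt_band_edges(fs=200.0, levels=5)
for name, (low, high) in bands.items():
    print(f"{name}: {low:g}-{high:g} Hz")
```

Real wavelet filters are not ideal, so neighbouring bands overlap somewhat; the values are nominal ranges.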
Fig. 1. EEG signal taken from unhealthy subject (epileptic patient). Panels: F8-C4, F7-C3, T6-O2, T5-O1; vertical axis: amplitude.
The proposed method was applied on a wide variety of
EEG data for both epileptic and normal signals. Four
channels of EEG (F7-C3, F8-C4, T5-O1 and T6-O2)
recorded from a patient with absence seizure epileptic
discharges are shown in Fig. 1, and a normal EEG signal is
shown in Fig. 2. Fig. 3 shows five different levels of
approximation (identified by A1–A5 and displayed in the
left column) and details (identified by D1–D5 and displayed
in the right column) of an epileptic EEG signal. Fig. 4 shows
the same five levels of approximation and details for a normal
EEG signal. These approximation and detail records are
reconstructed from the wavelet coefficients obtained with the
Daubechies 4 (DB4) wavelet filter. Approximation A4 is obtained by
superimposing details D5 on approximation A5. Approximation
A3 is obtained by superimposing details D4 on
approximation A4, and so on. Finally, the original signal is
obtained by superimposing details D1 on approximation A1.
Wavelet transform acts like a mathematical microscope,
zooming into small scales to reveal compactly spaced
events in time and zooming out into large scales to exhibit
the global waveform patterns (Adeli et al., 2003).
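The superposition rule described above (A4 = A5 + D5, and so on down to the original signal) can be checked numerically. The sketch below is an illustration under a stated assumption: it uses the Haar wavelet, for which the level-j approximation is simply the mean over blocks of 2^j samples, instead of the DB4 filter used in the study. The function names and test signal are hypothetical.

```python
def haar_approximation(signal, level):
    """Approximation A_level: mean of each block of 2**level samples, held over the block."""
    block = 2 ** level
    out = []
    for start in range(0, len(signal), block):
        chunk = signal[start:start + block]
        out.extend([sum(chunk) / len(chunk)] * len(chunk))
    return out

def haar_detail(signal, level):
    """Detail D_level = A_(level-1) - A_level, with A_0 being the signal itself."""
    finer = haar_approximation(signal, level - 1)
    coarser = haar_approximation(signal, level)
    return [f - c for f, c in zip(finer, coarser)]

# Hypothetical 64-sample signal.
signal = [float((7 * n) % 13) for n in range(64)]

# Start from A5 and superimpose D5, D4, ..., D1 in turn.
rebuilt = haar_approximation(signal, 5)
for level in range(5, 0, -1):
    rebuilt = [a + d for a, d in zip(rebuilt, haar_detail(signal, level))]
```

After the loop, `rebuilt` matches the original signal up to floating-point rounding.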
The extracted wavelet coefficients provide a compact
representation that shows the energy distribution of the EEG
signal in time and frequency. Table 1 presents frequencies
corresponding to different levels of decomposition for the
Daubechies order 4 wavelet with a sampling frequency of
200 Hz. It can be seen from Table 1 that the A5
decomposition lies approximately within the δ range (1–4 Hz),
the D5 decomposition within the θ range (4–8 Hz), the D4
decomposition within the α range (8–13 Hz), and the D3
decomposition within the β range (13–30 Hz). Lower level
decompositions corresponding to higher frequencies have
negligible magnitudes in a normal EEG.
2.4. Classification using artificial neural networks
Artificial neural networks (ANNs) are formed of cells
simulating the low level functions of biological neurons. In
an ANN, knowledge about the problem is distributed in the
neurons and in the connection weights of the links between neurons.
The neural network has to be trained to adjust the
connection weights and biases in order to produce the
desired mapping. At the training stage, the feature vectors
are applied as input to the network and the network adjusts
its variable parameters, the weights and biases, to capture
the relationship between the input patterns and outputs.
ANNs are particularly useful for complex pattern recognition
and classification tasks. The capability of learning
from examples, the ability to reproduce arbitrary non-linear
functions of input, and the highly parallel and regular
structure of ANNs make them especially suitable for pattern
classification tasks (Basheer, & Hajmeer, 2000; Fausett,
1994; Haselsteiner, & Pfurtscheller, 2000; Shimada, Shiina
& Saito, 2000; Sun, & Sclabassi, 2000). In this paper, two
neural networks relevant to the application being considered
(i.e., classification of epileptic/normal EEG data) will be
employed for designing classifiers, namely the FEBANN
and DWN.
Fig. 2. EEG signal taken from a healthy subject. Panels: F8-C4, F7-C3, T6-O2, T5-O1; vertical axis: amplitude; horizontal axis: number of samples.
For a network trained by error backpropagation, a number
of issues have to be addressed to ensure successful network
development (Basheer, & Hajmeer, 2000). Most important
among those issues are the network size (architecture) and
the number of training cycles. If training is insufficient, the
network will not learn the examples presented to it. In
contrast, excessive training of the network will
force it to memorize the training examples. This will result
in a network that is unable to generalize to cases from
outside the training database. Additionally, an oversized
ANN comprising a large number of units in the hidden
layers tends to learn the noise and over-fit the data rather
than uncover the overall underlying trend (similar to
over-parameterized polynomials). One practical approach to
avoid these problems is through cross validation in which
test examples (different from training examples) selected
randomly from the parent database are continuously used to
examine generalization of the network after each cycle
during training. The quality of network predictions for these
test examples, quantified using some error measure, can
serve as a criterion for stopping training or determining the
optimum network size. Unlike the error in training data, which
continues to decline with network size and number of
training cycles, the test set error reaches a minimum at the
optimum ANN size and/or number of training cycles. The
‘optimum’ network is considered to contain sufficient
knowledge about the phenomenon being modelled.
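The stopping criterion described above can be sketched as follows. This is an illustrative helper, not the procedure used in the study: the `patience` parameter (how many cycles to wait after the last improvement) and the error values are assumptions made for the example.

```python
def best_stopping_epoch(test_errors, patience=3):
    """Return (epoch, error) at the minimum test-set error, stopping the scan
    once the error has failed to improve for `patience` consecutive cycles."""
    best_epoch, best_error = 0, float("inf")
    for epoch, err in enumerate(test_errors):
        if err < best_error:
            best_epoch, best_error = epoch, err
        elif epoch - best_epoch >= patience:
            break  # test error keeps rising: the network is starting to memorize
    return best_epoch, best_error

# Hypothetical error curves: training error keeps falling, test error turns upward.
train_errors = [0.90, 0.60, 0.40, 0.30, 0.22, 0.17, 0.13, 0.10]
test_errors = [0.95, 0.70, 0.52, 0.45, 0.47, 0.50, 0.56, 0.63]
epoch, err = best_stopping_epoch(test_errors)
```

With these values the test error is lowest at cycle 3, even though the training error continues to decline afterwards.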
2.4.1. Selection of network parameters
To solve the pattern classification problem, an ANN employing
the back-propagation training algorithm was used. An effective
training algorithm and well-understood system behaviour
are the advantages of this type of neural network. Selection
of network input parameters and performance of the neural
network are important for epileptic seizure detection.
The classification scheme of 1-of-C coding has been used
for classifying the signal into one of the output categories.
For each type of EEG signal, a corresponding output class
is associated. The feature vector set x represents the ANN
inputs, and the corresponding class, once coded, constitutes
the ANN outputs. In order to make the neural network
training more efficient, the input feature vectors were
normalized so that they fall in the range [0, 1.0]. Since the
number of output classes is 2, an ANN with one output is
sufficient to produce a code for each class. The outputs are
represented by basis vectors:
[0.1] = normal
[0.9] = epileptic
Each dummy variable is given the value 0.1 except for
the one corresponding to the correct category, which is
given the value 0.9. Using target values of 0.1 and 0.9
instead of the common practice of 0 and 1 prevents the
outputs of the network from being directly interpretable as
posterior probabilities (Kandaswamy, Kumar, Ramanathan,
Jayaraman, & Malmurugan, 2004).
Fig. 3. Approximate and detailed coefficients of EEG signal taken from unhealthy subject (epileptic patient). Left column: original signal (s) and approximations A1–A5; right column: details D1–D5; horizontal axis: number of samples.
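The input normalization and target coding of Section 2.4.1 can be sketched as below. The paper does not state which normalization was used, so per-feature min-max scaling is an assumption here, as are the 0.5 decision threshold, the function names and the example feature values.

```python
def min_max_normalize(vectors):
    """Scale each feature column to the range [0, 1] (assumed min-max scaling)."""
    columns = list(zip(*vectors))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    normalized = []
    for row in vectors:
        normalized.append([
            (x - lo) / (hi - lo) if hi > lo else 0.0  # constant columns map to 0
            for x, lo, hi in zip(row, lows, highs)
        ])
    return normalized

# Target coding from the text: 0.1 for normal, 0.9 for epileptic.
TARGETS = {"normal": 0.1, "epileptic": 0.9}

def decode(output, threshold=0.5):
    """Map the single network output back to a class (threshold is an assumption)."""
    return "epileptic" if output >= threshold else "normal"

# Hypothetical feature vectors and labels.
features = [[2.0, 10.0], [4.0, 30.0], [6.0, 20.0]]
labels = ["normal", "epileptic", "normal"]
x = min_max_normalize(features)
y = [TARGETS[label] for label in labels]
```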
2.4.2. Cross validation
Cross validation (CV) (Basheer, & Hajmeer, 2000;
Haselsteiner, & Pfurtscheller, 2000) is often used for
comparing two or more learning ANN models to estimate
which model will perform the best on the problem at hand.
With n-fold CV, the available data are partitioned into n
disjoint subsets, the union of which is equal to the original
set. Each learning model is trained on n − 1 of the available
subsets, and then tested on the one subset which was not
used during training. This process is repeated n times, each
time using a different test set chosen from the n available
partitions of the training data, until all possible choices for
the test set have been exhausted. The n test set scores for
each learning model are then averaged, and the model with
the highest average test set score is chosen as the one most
likely to perform well on unseen data (Kandaswamy et al.,
2004).
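The n-fold procedure can be sketched with the standard library alone. The sample count of 500 mirrors the number of EEG segments mentioned in Section 2.1; the number of folds, the random seed and the function names are assumptions for the example, since the paper does not specify them.

```python
import random

def n_fold_indices(n_samples, n_folds, seed=0):
    """Partition sample indices into n disjoint folds and return (train, test) pairs."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    splits = []
    for k in range(n_folds):
        test = folds[k]
        # Train on the other n - 1 folds.
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((train, test))
    return splits

def cross_validate(score_fn, n_samples, n_folds):
    """Average the test-set score returned by score_fn(train, test) over all folds."""
    scores = [score_fn(train, test) for train, test in n_fold_indices(n_samples, n_folds)]
    return sum(scores) / len(scores)

splits = n_fold_indices(n_samples=500, n_folds=5)
```

In practice `score_fn` would train a classifier on the `train` indices and return its accuracy on the `test` indices; the model with the highest average score is then preferred.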
2.4.3. Measuring error
Given a random set of initial weights, the outputs of the
network will be very different from the desired classifications.
As the network is trained, the weights of the system
are continually adjusted to reduce the difference between
the output of the system and the desired response. The
difference is referred to as the error and can be measured in
several ways. The most common measurements are the sum
squared error (SSE) and the mean squared error (MSE). SSE is
the sum of the squares of the differences between each
output and the desired output, and MSE is the average of those
squares (Basheer, & Hajmeer, 2000;
Haselsteiner, & Pfurtscheller, 2000). In this study, SSE was
used for measuring the performance of the neural network.
Fig. 4. Approximate and detailed coefficients of EEG signal taken from a healthy subject. Left column: original signal (s) and approximations A1–A5; right column: details D1–D5; horizontal axis: number of samples.
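For reference, the two error measures can be written out using their standard definitions (SSE as the sum of squared differences, MSE as that sum divided by the number of outputs). The output and target values below are hypothetical.

```python
def sum_squared_error(outputs, targets):
    """SSE: sum of squared differences between network outputs and desired outputs."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets))

def mean_squared_error(outputs, targets):
    """MSE: SSE divided by the number of outputs."""
    return sum_squared_error(outputs, targets) / len(outputs)

# Hypothetical network outputs against 0.1 / 0.9 coded targets.
outputs = [0.2, 0.8, 0.1, 0.7]
targets = [0.1, 0.9, 0.1, 0.9]
sse = sum_squared_error(outputs, targets)
mse = mean_squared_error(outputs, targets)
```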
Fig. 5. Schematic diagram of a DWN with three wavelons (W1, W2, W3), input u and output y.
2.5. Dynamic wavelet network
Dynamic wavelet network (DWN) models have been
used in the meaning of a network. The DWN model we used
has unconstrained connectivity and has dynamic elements in
the wavelon (neuron of DWN) processing units. A
schematic diagram for the dynamic networks with three
neurons is shown in Fig. 5. Wi can be a wavelon in a DWN.
In general, there are L input signals which can be time-
varying, n dynamic units, n bias terms, and M output signals.
The units have dynamics associated with them and they
receive the input from themselves, the bias term, and from
all other units. The output of a unit yi is an activation
function h(xi) of a state variable xi associated with the unit.
The output of the overall network is a linear weighted sum
of the unit outputs. The bias term bi is added to the unit
inputs. pij is the input connection weight from the jth input
to the ith wavelon, wij is the interconnection weight from the
jth wavelon to the ith wavelon and qij is the output
connection weight from the jth wavelon to the ith output. Ti
is the dynamic constant of the ith wavelon and bi is the bias
(or polarization) term of the ith wavelon (Becerikli, 2004).
In DWNs, the input of each wavelet neuron (wavelon) passes
through a lag dynamic and is then transported to the output via
a wavelet activation function. Wavelets are usually explained as
basis functions which are compact (closed and bounded), orthogonal
(or orthonormal), and have time–frequency localization
properties. However, providing all of those properties at once is very
difficult. Basis functions are called ‘activation functions’ in
the ANN literature, and can be global or local in time.
Global basis functions are active for a wide range of
input values, and the receptive field of the basis function is
approximately constant far from the center (e.g., the logarithmic
sigmoid function). Local basis functions, by contrast, are only
active near the center; their value tends to zero far from the
center (Becerikli, 2004).
If the global basis function is used in a network, all
activation functions interact with each other and each node,
and they cover a wide input interval. This results in a large
number of parameters to adjust and necessitates a long
computation time. In addition, for wide input intervals,
much more extrapolation error occurs. The most important
disadvantage of orthonormal compact basis functions is that
they cannot be obtained in closed analytical form.
To remove all those disadvantages, the local basis
functions can be used. The local basis functions are only
active for certain inputs. In addition, the generalization
errors decrease (Becerikli, 2004). In this study, only the
local basis functions have been used. The most important
local function is Gaussian:
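\[ \phi(t) = \exp\left(-\frac{t^2}{2}\right), \quad t \in \mathbb{R} \tag{1} \]
where \(\phi \in L^2(\mathbb{R})\). For the more general case:
\[ \phi\left(\frac{t-b}{a}\right) = \exp\left(-\frac{1}{2}\left(\frac{t-b}{a}\right)^2\right), \quad t \in \mathbb{R} \tag{2} \]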
where b is the center or translation and a is the standard
deviation or dilation. However, the Gaussian function is not
local in frequency (Becerikli, 2004). Locality in
both time and frequency is a very important concept for the
representation of signals. Wavelet functions therefore have
a comprehensive role to play.
The locality in time and frequency can be explained as
follows:
• If a function is described in a bounded interval and has a
  very small value outside the boundary, then that function
  is local in time. The local function in time can be shifted
  by changing its centre.
• If the frequency spectrum of the local function in time is
  described in a bounded frequency interval and has a very
  small value outside the boundary, and can also be shifted
  by changing its dilation, then that function is local in
  frequency.
A deficiency of Gaussian-based ANNs is that they do not
have localization capabilities in frequency. Since the
Gaussian function is not local in frequency, it is very
difficult to use Gaussian-based functions in some appli-
cations (Sanner, & Slotine, 1992). To overcome these
problems, there is a very effective way to use wavelet
functions with time–frequency localization properties
(Cannon, & Slotine, 1995). In some studies, the first
derivative of the Gaussian function has been used (Mallat,
1987; Qussar, Rivals, Personnaz, & Dreyfus, 1998).
However, the locality properties of the second derivative
of the Gaussian function are clearer. A non-orthonormal
Mexican Hat basis function (second derivative of the
Gaussian function) can be easily written in the analytical
form and its Fourier transform can be found (Becerikli,
2004), thus:
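\[ \phi(t_i) = \left(1 - t_i^2\right) \exp\left(-\frac{t_i^2}{2}\right), \quad t_i \in \mathbb{R} \tag{3} \]
\[ \hat{\phi}(\omega) = \sqrt{2\pi}\, \omega^2 \exp\left(-\frac{\omega^2}{2}\right), \quad \omega \in \mathbb{R} \tag{4} \]
where \(\hat{\phi}\) denotes the Fourier transform of \(\phi\) and \(\omega\) is a real frequency. Eq. (3) can be
generalized as follows:
\[ \phi\left(\frac{t_i - b_i}{a_i}\right) = \left[1 - \left(\frac{t_i - b_i}{a_i}\right)^2\right] \exp\left(-\frac{1}{2}\left(\frac{t_i - b_i}{a_i}\right)^2\right) \tag{5} \]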
where bi and ai are the translation (center) and dilation
(standard deviation) parameters, respectively. Wavelet
functions have efficient time–frequency localization proper-
ties, as shown from the frequency spectrum (Mallat, 1987).
If the dilation parameter is changed, the support region
width of the wavelet function changes, but the number of
cycles does not change. That is, the peak number does not
change; however, when the dilation parameter decreases,
the peak point of the spectrum shifts to a higher frequency.
Therefore, all frequency spectrums can be obtained by
changing the dilation. In this study, Eq. (4) has been used as
a mother (main) wavelet (Becerikli, 2004). An N-dimen-
sional mother wavelet can be given in the separable
structure with the product rule as follows (Cannon, &
Slotine, 1995; Mallat, 1987; Qussar et al., 1998; Zhang, &
Benveniste, 1992; Zhang, Walter, & Lee, 1995):
$$\Phi_i(t) = \prod_{j=1}^{N} \phi_j\left(\frac{t_j - b_{ij}}{a_{ij}}\right) \tag{6}$$

where $t \in \mathbb{R}^N$ is the input and $N$ is the number of inputs. A
function $y = f(t)$ can be represented with wavelets obtained
from the mother wavelet (Cannon, & Slotine, 1995; Mallat,
1987; Qussar et al., 1998) as below:

$$y_i = h_i(t) = \sum_{j=1}^{N_w} s_{ij}\,\phi_j(t) + a_{i0} + \sum_{k=1}^{N} a_{ik}\,t_k \tag{7}$$

where $s_{ij}$ are the coefficients of the mother wavelets, $N_w$ is
the number of wavelets, $a_{i0}$ is a mean or bias term, and $a_{ik}$
are the linear term coefficients of this approach.
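As a minimal sketch of Eqs. (5)–(7), the Python code below evaluates the Mexican Hat wavelet, the separable product wavelet, and the wavelet expansion for one output. It assumes that the wavelet term in Eq. (7) is the product wavelet of Eq. (6); the function names and all numerical values are illustrative and are not taken from the paper.

```python
import math


def mexican_hat(z):
    """Mexican Hat mother wavelet of Eq. (3): (1 - z^2) * exp(-z^2 / 2)."""
    return (1.0 - z * z) * math.exp(-0.5 * z * z)


def product_wavelet(t, b, a):
    """Separable N-dimensional wavelet of Eq. (6).

    t, b and a are equal-length sequences holding the inputs, the
    translations and the dilations for one wavelet.
    """
    value = 1.0
    for t_j, b_j, a_j in zip(t, b, a):
        value *= mexican_hat((t_j - b_j) / a_j)  # Eq. (5) in each dimension
    return value


def wavelet_expansion(t, wavelets, s, a0, a_lin):
    """Output of Eq. (7) for a single output unit.

    wavelets: list of (b, a) pairs, one pair per wavelet
    s:        wavelet coefficients s_ij
    a0:       bias term a_i0
    a_lin:    linear coefficients a_ik, one per input
    """
    nonlinear = sum(s_j * product_wavelet(t, b, a)
                    for s_j, (b, a) in zip(s, wavelets))
    linear = sum(a_k * t_k for a_k, t_k in zip(a_lin, t))
    return nonlinear + a0 + linear


# Hypothetical two-input example with two wavelets (illustrative values only).
t = [0.5, -1.0]
wavelets = [([0.5, -1.0], [1.0, 2.0]),   # centered exactly on t
            ([0.0, 0.0], [1.0, 1.0])]
y = wavelet_expansion(t, wavelets, s=[2.0, 0.0], a0=0.1, a_lin=[1.0, 1.0])
```

Because the first wavelet is centered on the input, it contributes its full coefficient, so the result is the sum of that coefficient, the bias, and the linear term.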
A. Subasi / Expert Systems with Applications 29 (2005) 343–355 351
The wavelet function in this structure will be used in the
DWN given in Fig. 5. The structure used in (Becerikli,
2004; Becerikli, Konar, & Samad, 2003; Becerikli, Oysal, &
Konar, 2004; Oysal et al., 2005) has been adapted to this
network. The wavelets in Eqs. (6) and (7) will be used as the
activation functions in the network. Each activation
function has a single input/single output (SISO), and can
be re-expressed as:
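$$\Phi_i(t_i) = \phi_i\left(\frac{t_i - b_{ij}}{a_{ij}}\right) \tag{8}$$

$$y_i = h_i(t_i) = \sum_{j=1}^{N_w} s_{ij}\,\phi_i\left(\frac{t_i - b_{ij}}{a_{ij}}\right) + a_{i0} + a_{i1} t_i \tag{9}$$

$$\phi_i\left(\frac{t_i - b_{ij}}{a_{ij}}\right) = \left[1 - \left(\frac{t_i - b_{ij}}{a_{ij}}\right)^2\right]\exp\left[-\frac{1}{2}\left(\frac{t_i - b_{ij}}{a_{ij}}\right)^2\right] \tag{10}$$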
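The earlier remark that a smaller dilation parameter moves the spectral peak to a higher frequency can be checked from Eq. (4). The short derivation below is a sketch that assumes a positive dilation $a$ and the Fourier convention under which Eq. (4) holds; the symbol $\omega_{\text{peak}}$ is introduced here for illustration only.

```latex
\begin{aligned}
\phi_{a,b}(t) = \phi\!\left(\frac{t-b}{a}\right),\; a > 0
  \;&\Longrightarrow\;
  \left|\hat{\phi}_{a,b}(\omega)\right| = a\,\hat{\phi}(a\omega)
  = \sqrt{2\pi}\,a\,(a\omega)^2 \exp\!\left(-\frac{(a\omega)^2}{2}\right), \\
\frac{d}{d\omega}\left[\omega^2 e^{-a^2\omega^2/2}\right]
  = \omega\left(2 - a^2\omega^2\right) e^{-a^2\omega^2/2} = 0
  \;&\Longrightarrow\;
  \omega_{\text{peak}} = \frac{\sqrt{2}}{a} \quad (\omega > 0).
\end{aligned}
```

Under these assumptions, halving the dilation doubles the peak frequency, while the translation $b$ only changes the phase of the spectrum and leaves its magnitude unchanged.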
2.6. Evaluation of performance
The agreement between the diagnosis of the expert neurologists
and the diagnosis given at the output of the classifier was
calculated. The prediction success of the classifier may be
evaluated by examining the confusion matrix. To analyze the
output data obtained from the application, sensitivity (true
positive ratio) and specificity (true negative ratio) were
calculated from the confusion matrix. Here TP and FN denote the
numbers of cases diagnosed as epileptic by the expert
neurologists that the classifier labels epileptic and normal,
respectively, and TN and FP denote the numbers of cases
diagnosed as normal that the classifier labels normal and
epileptic, respectively. Sensitivity, also called the true
positive ratio, is the number of true positive decisions divided
by the total number of cases diagnosed as positive by the expert
neurologists:

$$\text{Sensitivity} = \text{TPR} = \frac{TP}{TP + FN} \times 100\% \tag{11}$$

Specificity, also called the true negative ratio, is the number
of true negative decisions divided by the total number of cases
diagnosed as negative by the expert neurologists:

$$\text{Specificity} = \text{TNR} = \frac{TN}{TN + FP} \times 100\% \tag{12}$$
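Eqs. (11) and (12) can be sketched in a few lines of Python. The label coding (1 for epileptic, 0 for normal), the function names, and the ten example labels are assumptions made for illustration; they are not data from the study.

```python
def confusion_counts(targets, predictions):
    """Count TP, FN, TN, FP for binary labels (1 = epileptic, 0 = normal)."""
    tp = fn = tn = fp = 0
    for target, predicted in zip(targets, predictions):
        if target == 1 and predicted == 1:
            tp += 1
        elif target == 1 and predicted == 0:
            fn += 1
        elif target == 0 and predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """True positive ratio in percent, Eq. (11)."""
    return 100.0 * tp / (tp + fn)


def specificity(tn, fp):
    """True negative ratio in percent, Eq. (12)."""
    return 100.0 * tn / (tn + fp)


# Hypothetical expert labels and classifier decisions.
targets = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
tp, fn, tn, fp = confusion_counts(targets, predictions)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
```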
Neural network analyses were compared to each other by
receiver operating characteristic (ROC) analysis. ROC
analysis is an appropriate means to display sensitivity and
specificity relationships when a predictive output for two
possibilities is continuous. In its tabular form, the ROC
analysis displays true and false positive and negative totals
and sensitivity and specificity for each listed cutoff value
between 0 and 1 (Subasi, 2005).
To assess the output classification graphically, the ROC
curve was calculated by analyzing the output data obtained
from the test. Furthermore, the performance of the model may
be measured by calculating the area under the ROC curve.
The ROC curve is a plot of the true positive rate (sensitivity)
against the false positive rate (1 − specificity) for each
possible cutoff. A cutoff value is then selected that classifies
the epileptic seizure cases correctly, with the input parameters
determined optimally for the model used.
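The construction of the ROC curve and the area under it can be sketched as follows. This is a plain trapezoidal computation written for illustration; the paper does not state how its ROC curves were computed, and the labels and scores below are hypothetical.

```python
def roc_points(targets, scores):
    """Return (false positive rate, true positive rate) pairs.

    Each distinct score is used as a cutoff, from strictest to loosest;
    a case is called positive when its score is at or above the cutoff.
    """
    positives = sum(targets)
    negatives = len(targets) - positives
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(targets, scores) if s >= cutoff and y == 1)
        fp = sum(1 for y, s in zip(targets, scores) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc_trapezoid(points):
    """Area under a piecewise-linear curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical network outputs (1 = epileptic, 0 = normal).
targets = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
points = roc_points(targets, scores)
auc = auc_trapezoid(points)
```

For these six cases, eight of the nine epileptic/normal pairs are ranked in the right order, which is the fraction the trapezoidal area recovers.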
3. Results and discussion
In this study, we used EEG signals of normal and
epileptic patients in order to perform comparison between
two neural network models. EEG recordings were divided
into sub-band frequencies such as α, β, δ and θ by using
DWT (Figs. 3 and 4). Then these wavelet sub-band
frequencies δ (1–4 Hz), θ (4–8 Hz), α (8–13 Hz) and β
(13–30 Hz) are applied to neural networks.
The classification efficiency, defined as the percentage
ratio of the number of EEG signals correctly classified to the
total number of EEG signals considered for classification,
also depends on the type of wavelet chosen for the
application. In the previous work on the application of WT
in EEG analysis (Subasi, 2005), the Daubechies wavelet of
order 2 (db2) was used and found to yield good results. To
investigate the effect of other wavelets on classification
efficiency, tests were also carried out using other wavelets.
Apart from db2, Symmlet of order 10 (sym10), Coiflet of
order 4 (coif4), Daubechies of order 4 (db4) and Daubechies
of order 8 (db8) were tried. The average efficiency was
obtained for each wavelet when EEG signals were classified
using various ANN structures. The Daubechies wavelets
offered better efficiency than the others, and db4 was
marginally better than db2 and db8. Hence the db4 wavelet
was chosen for this application.
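A multi-level decomposition with the db4 filter can be sketched in pure Python as shown below. The periodic boundary extension, the tap ordering, the 128 Hz sampling rate, and the 10 Hz test tone are all assumptions of this sketch; the paper does not specify its boundary handling, and the helper names are hypothetical.

```python
import math

# db4 scaling (low-pass) filter, 8 taps, rounded to 10 decimals.
DB4_LOW = [
    0.2303778133, 0.7148465706, 0.6308807679, -0.0279837694,
    -0.1870348117, 0.0308413818, 0.0328830117, -0.0105974018,
]
# Quadrature-mirror high-pass filter built from the low-pass taps.
DB4_HIGH = [(-1) ** k * DB4_LOW[len(DB4_LOW) - 1 - k]
            for k in range(len(DB4_LOW))]


def dwt_level(signal):
    """One decomposition level with periodic extension (even length assumed).

    Returns (approximation, detail), each half the input length.
    """
    n = len(signal)
    approx, detail = [], []
    for i in range(n // 2):
        window = [signal[(2 * i + k) % n] for k in range(len(DB4_LOW))]
        approx.append(sum(h * x for h, x in zip(DB4_LOW, window)))
        detail.append(sum(g * x for g, x in zip(DB4_HIGH, window)))
    return approx, detail


def wavedec(signal, levels):
    """Return [D1, D2, ..., D_levels, A_levels] for the given signal."""
    bands = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = dwt_level(approx)
        bands.append(detail)
    bands.append(approx)
    return bands


def detail_band_hz(level, fs):
    """Nominal frequency range (low, high) of detail level `level`."""
    return fs / 2 ** (level + 1), fs / 2 ** level


def energy(values):
    """Sum of squared samples."""
    return sum(v * v for v in values)


# Hypothetical test signal: a 10 Hz tone sampled at an assumed 128 Hz.
fs = 128.0
signal = [math.sin(2 * math.pi * 10.0 * n / fs) for n in range(256)]
bands = wavedec(signal, levels=4)
```

Because the filter pair is orthonormal, the total energy of the sub-bands equals the energy of the input up to rounding, which is a convenient sanity check before the coefficients are passed on as classifier inputs.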
3.1. Development of neural network model
The objective of the modelling phase in this application
was to develop classifiers that are able to identify any input
combination as belonging to either one of the two classes:
normal or epileptic. For developing neural network
classifiers, 300 examples were randomly taken from the
500 examples and used for training the neural networks, and
100 for the cross validation. The remaining 100 examples
were kept aside and used for testing the developed models.
The class distribution of the samples in the training and
validation data set is summarized in Table 2.
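The random partition described above (300 training, 100 cross-validation, and 100 test examples out of 500) can be sketched as follows. The fixed seed and the integer stand-ins for the EEG examples are assumptions made so that the sketch is reproducible; the paper does not describe how the random draw was implemented.

```python
import random


def split_examples(examples, n_train=300, n_validation=100, seed=0):
    """Shuffle once, then cut into training, validation and test subsets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]
    return train, validation, test


# Integer indices stand in for the 500 EEG examples.
examples = list(range(500))
train, validation, test = split_examples(examples)
```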
Table 2
Class distribution of the samples in the training, validation, and test data sets

Class       Training set   Validation set   Test set   Total
Epileptic   102            46               42         190
Normal      198            54               58         310
Total       300            100              100        500

Table 3
Comparison of neural network models for EEG signals

Classifier type   Correctly classified (%)   Specificity (%)   Sensitivity (%)   Area under ROC curve
FEBANN            91                         91.3              90.4              0.907
DWN               93                         93.1              92.8              0.921

The ANNs were designed with the wavelet sub-band
frequencies of the EEG signal, obtained using DWT, in the input layer;
the output layer consisted of one node representing
whether an epileptic seizure was detected or not. A target value of 0.1 was
used when the experimental investigation indicated a
normal case and 0.9 for an epileptic seizure. The preliminary
architecture of the network was examined using one and two
hidden layers with a variable number of hidden nodes in
each. It was found that one hidden layer is adequate for the
problem at hand. Thus the sought network will contain three
layers of nodes. The training procedure started with one
hidden node in the hidden layer, followed by training on the
training data, and then by testing on the validation data to
examine the network’s prediction performance on cases
never used in its development. Then, the same procedure
was run repeatedly each time the network was expanded by
adding one more node to the hidden layer, until the best
architecture and set of connection weights were obtained.
Using the modified error-backpropagation algorithm for
training, a training rate of 0.0001 and momentum coefficient
of 0.95 were found optimum for training the network with
various topologies. The selection of the optimal network
was based on monitoring the variation of error and some
accuracy parameters as the network was expanded in the
hidden layer size and for each training cycle. The sum of
squares of error representing the sum of square of deviations
of ANN solution (output) from the true (target) values for
both the training and test sets was used for selecting the
optimal network. Additionally, because the problem
involves classification into two classes, sensitivity and
specificity were used as performance measures. A computer
program that we have written for the training algorithm
based on backpropagation of error was used to develop the
FEBANNs.
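The weight update and the error measure used for model selection can be sketched as below. The paper refers to a modified error-backpropagation algorithm without giving its details, so the classical momentum form shown here is an assumption; only the learning rate of 0.0001 and the momentum coefficient of 0.95 come from the text, and the quadratic toy error is purely illustrative.

```python
def sum_squared_error(outputs, targets):
    """Sum of squared deviations of network outputs from target values."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets))


def momentum_step(weights, gradients, velocity,
                  learning_rate=0.0001, momentum=0.95):
    """One gradient-descent update with a momentum term.

    velocity <- momentum * velocity - learning_rate * gradient
    weights  <- weights + velocity
    """
    new_velocity = [momentum * v - learning_rate * g
                    for v, g in zip(velocity, gradients)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


# Toy check on the error E(w) = w^2, whose gradient is 2w.
weights, velocity = [1.0], [0.0]
for _ in range(1000):
    gradients = [2.0 * w for w in weights]
    weights, velocity = momentum_step(weights, gradients, velocity)
```

With a momentum coefficient this close to one, the accumulated step is much larger than the raw learning rate suggests, which is one reason such a small learning rate can still make progress.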
The classifier implemented for this work is a DWN with
one hidden layer and one output layer. An input vector is
applied to the input layer where all of the inputs are
distributed to each unit in the first hidden layer. All of the
units have weight vectors which are multiplied by these
input vectors. Each unit sums these inputs and produces a
value that is transformed by a wavelet activation function.
The output of the final layer is then computed by
multiplying the output vector from the hidden layer by the
weights into the final layer. More summations and
activations at these units then give the actual output of the
network. We used a network with a variable number of
hidden units and one output unit, as in the FEBANN. One output
unit is all that is needed because this is a
two-class problem.
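The forward pass described above can be sketched as a static network with Mexican Hat hidden units. The linear output unit, the 0.5 decision cutoff (the midpoint of the 0.1 and 0.9 targets), and every weight value are assumptions for illustration; the sketch also leaves out any dynamic elements of the DWN.

```python
import math


def mexican_hat(z):
    """Mexican Hat activation: (1 - z^2) * exp(-z^2 / 2)."""
    return (1.0 - z * z) * math.exp(-0.5 * z * z)


def dwn_forward(inputs, hidden_weights, translations, dilations,
                output_weights, output_bias=0.0):
    """Weighted sums -> wavelet activations -> weighted output sum."""
    hidden = []
    for w, b, a in zip(hidden_weights, translations, dilations):
        net = sum(w_k * x_k for w_k, x_k in zip(w, inputs))
        hidden.append(mexican_hat((net - b) / a))
    return sum(v * h for v, h in zip(output_weights, hidden)) + output_bias


def classify(output, cutoff=0.5):
    """Map the single network output to a class label (assumed cutoff)."""
    return "epileptic" if output >= cutoff else "normal"


# Hypothetical network with two inputs and two hidden units.
output = dwn_forward(
    inputs=[1.0, 2.0],
    hidden_weights=[[0.5, 0.25], [1.0, -0.5]],
    translations=[1.0, 0.0],
    dilations=[1.0, 2.0],
    output_weights=[0.5, 0.3],
    output_bias=0.1,
)
label = classify(output)
```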
The number of input nodes was equal to the total number
of wavelet coefficients of the four sub-bands (α, β, δ, θ), and the
output layer consisted of one node representing
whether an epileptic seizure was detected or not. The optimum number
of neurons in the hidden layer, the training algorithm, the
parameters of the training algorithm, and the activation
functions of the two layers were determined by repeated
simulation. According to the theory, the number of nodes in
the hidden layer of the network is equal to the number of wavelet
bases. If the number is too small, the DWN may not reflect the
complex functional relationship between input data and
output value. Conversely, a large number may create
such a complex network that it leads to a very large
output error caused by over-fitting of the training sample.
The optimum number of nodes in the hidden layer was 65. As a
result, the network in this paper is constructed as an error
backpropagation neural network using the Mexican Hat wavelet
basis function as the node activation function. The amount by
which the weights are adjusted on each step is parameterized
by learning rate constants. Separate learning rates were
allowed for the hidden layer and the output layer.
After trying a large number of different values, we found
that a learning rate of 0.0001 for both the hidden layer and the
output layer produced the best performance.
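The search for the hidden layer size by repeated simulation can be sketched as a simple loop over candidate sizes. The callable `validation_error` is a hypothetical stand-in for training a network of the given size and scoring it on the validation set; the toy error curve below is invented so the loop has something to find and is not data from the paper.

```python
def select_hidden_size(validation_error, max_hidden):
    """Grow the hidden layer one node at a time and keep the best size.

    validation_error: callable taking the number of hidden nodes and
                      returning the error on the validation set.
    """
    best_size, best_error = None, float("inf")
    for n_hidden in range(1, max_hidden + 1):
        error = validation_error(n_hidden)
        if error < best_error:
            best_size, best_error = n_hidden, error
    return best_size, best_error


def toy_validation_error(n_hidden):
    """Invented U-shaped curve: too few nodes underfit, too many overfit."""
    return (n_hidden - 3) ** 2 + 1.0


best_size, best_error = select_hidden_size(toy_validation_error, max_hidden=10)
```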
3.2. Experimental results
First, we used the sub-band frequencies of the EEG signals for
FEBANN and DWN classification. The procedure was
repeated on EEG recordings of all subjects (healthy subjects and
epileptic patients). Table 3 shows a summary of the
performance measures obtained using the sub-band frequencies of
EEG signals extracted with DWT. Table 3 shows that the
DWN-based classifier ranked first in terms of its correct
classification percentage of the EEG signals
(epileptic/normal data, 93%), while the FEBANN-based classifier
came second (91%). In terms of sensitivity, the FEBANN classified
the same data with a success rate of 90.4% and the DWN
with a success rate of 92.8%.
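As a consistency check, the reported percentages can be converted back into whole numbers of cases. The sketch below assumes that the rates in Table 3 were measured on the test set of Table 2 (42 epileptic and 58 normal cases); the resulting counts are inferred here and are not reported in the paper.

```python
def implied_counts(rate_percent, n_cases):
    """Nearest whole number of correct cases for a reported percentage."""
    return round(rate_percent / 100.0 * n_cases)


TEST_EPILEPTIC, TEST_NORMAL = 42, 58  # test-set sizes from Table 2

# (sensitivity %, specificity %) as reported in Table 3.
reported = {
    "FEBANN": (90.4, 91.3),
    "DWN": (92.8, 93.1),
}

misclassified = {}
for name, (sens, spec) in reported.items():
    tp = implied_counts(sens, TEST_EPILEPTIC)  # epileptic cases found
    tn = implied_counts(spec, TEST_NORMAL)     # normal cases found
    misclassified[name] = (TEST_EPILEPTIC - tp) + (TEST_NORMAL - tn)
```

Under this assumption the implied error counts agree with the overall correct classification rates of 91% and 93% in Table 3.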
The areas under the ROC curves for the two classifiers
(FEBANN and DWN) are also given in Table 3. To quantify the
performance characteristics of each classifier, the area under
the ROC curve (AUC) was computed for the validation data ROC
curve. As can be seen from Table 3, the
DWN-based classifier is the better classifier,
with AUC = 0.921 for the validation data ROC curves,
while the FEBANN-based classifier exhibited a slightly
lower performance (AUC = 0.907).
The testing performance of the DWN diagnostic system
is found to be satisfactory and we think that this system can
be used in clinical studies in the future after further development.
This application brings objectivity to the evaluation of EEG
signals, and its automated nature makes it easy to use in
clinical practice. Besides the feasibility of a real-time
implementation of the expert diagnosis system, diagnosis
may be made more accurately by increasing the variety and
the number of parameters. A ‘black box’ device that may be
developed as a result of this study may provide feedback to
the neurologists for classification of the EEG signals quickly
and accurately by examining the EEG signals with real-time
implementation.
4. Summary and conclusions
Diagnosing epilepsy is a difficult task requiring obser-
vation of the patient, an EEG, and gathering of additional
clinical information. An artificial neural network that
classifies subjects as having or not having an epileptic
seizure provides a valuable diagnostic decision support tool
for physicians treating potential epilepsy, since differing
etiologies of seizures result in different treatments.
Conventional method of classification of EEG signals
using mutually exclusive time and frequency domain
representations does not give efficient results. In this
work, a novel method of diagnostic classification of EEG
signals is proposed. The EEG signals were decomposed into
time–frequency representations using DWT. FEBANN and
DWN were implemented for the classification of EEG
signals using DWT sub-bands as inputs.
In this paper, two approaches to developing classifiers for
identifying epileptic seizures were discussed. One approach
is based on traditional neural network technology,
mainly using a feedforward neural network trained by the
error-backpropagation algorithm (FEBANN), and the other
is the dynamic wavelet network (DWN). Using the DWT of
EEG signals, two classifiers, namely FEBANN and DWN,
were constructed and cross-compared in terms of their
accuracy relative to the observed epileptic/normal patterns.
The comparisons were based on analysis of the receiver
operating characteristic (ROC) curves of the two classifiers
and two scalar performance measures derived from the
confusion matrices, namely specificity and sensitivity. The
DWN-based classifier identified the epileptic
and normal cases with specificity 93.1% and sensitivity 92.8%,
and the FEBANN-based classifier with specificity 91.3% and
sensitivity 90.4%. Out of the 100 epileptic/normal cases, the
FEBANN-based classifier misclassified 9 cases, while the
DWN-based classifier misclassified 7 cases.
Essentially, DWNs and FEBANNs require deciding on
the number of hidden layers, number of nodes in each
hidden layer, number of training iteration cycles, choice of
activation function, selection of the optimal learning rate
and momentum coefficient, as well as other parameters and
problems pertaining to convergence of the solution.
Advantages of DWNs over FEBANNs include their
robustness to noisy data (with outliers) which can severely
hamper many types of ANNs as well as most traditional
statistical methods. Finally, the fact that a DWN-based
classifier can be developed quickly makes such classifiers
efficient tools that can be easily re-trained, as additional data
become available, when implemented in the hardware of
EEG signal processing systems.
With specificity and sensitivity values both above 92%,
the wavelet neural network classification may be used as an
important diagnostic decision support mechanism to assist
physicians in the treatment of epileptic patients.
Appendix
Wavelet transform
The wavelet transform specifically permits discrimination
of non-stationary signals with different frequency
features (Daubechies, 1996). A signal is stationary if it does
not change much over time. The Fourier transform can be
applied to stationary signals. However, many signals, such as
the EEG, may contain non-stationary or transitory
characteristics. Thus it is not ideal to apply the Fourier
transform directly to such signals.
The wavelet transform decomposes a signal into a set of
basis functions called wavelets. These basis functions are
obtained by dilations, contractions and shifts of a unique
function called the wavelet prototype. Continuous wavelets are
functions generated from one single function $\psi$ by dilations
and translations (Cohen, & Kovacevic, 1996; Rioul, &
Vetterli, 1991):

$$\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\,\psi\left(\frac{t - b}{a}\right) \tag{A.1}$$
where $b$ is real valued and called the shift parameter. The
function set $\{\psi_{a,b}(t)\}$ is called a wavelet family. Since the
parameters $(a, b)$ are continuous valued, the transform is
called the continuous wavelet transform. The definition of
classical wavelets as dilates of one function means that high
frequency wavelets correspond to $a < 1$ or narrow width,
while low frequency wavelets have $a > 1$ or wider width. In
the wavelet transform, $f(t)$ is expressed as a linear
combination of scaling and wavelet functions. Both the scaling
functions and the wavelet functions are complete sets
(Rioul, & Vetterli, 1991). However, it is common to employ
both wavelet and scaling functions in the transform
representation. In general, the scale and shift parameters
of the discrete wavelet family are given by

$$a = a_0^{j}, \qquad b = k\,b_0\,a_0^{j} \tag{A.2}$$

where $j$ and $k$ are integers. The function family with
discretized parameters becomes

$$\psi_{j,k}(t) = a_0^{-j/2}\,\psi\left(a_0^{-j} t - k b_0\right) \tag{A.3}$$
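$\psi_{j,k}(t)$ is called the discrete wavelet transform (DWT) basis.
Although it is called DWT, the time variable of the
transform is still continuous. The DWT coefficients of a
continuous time function are similarly defined as

$$d_{j,k} = \langle f_w(t), \psi_{j,k}(t) \rangle = \frac{1}{a_0^{j/2}} \int f_w(t)\,\psi\left(a_0^{-j} t - k b_0\right) dt \tag{A.4}$$

When the DWT set $\{\psi_{j,k}(t)\}$ is complete, the wavelet
representation of a function $f_w(t)$ is expressed as

$$f_w(t) = \sum_{j} \sum_{k} \langle f_w(t), \psi_{j,k}(t) \rangle\, \psi_{j,k}(t) \tag{A.5}$$

In general, a function can be completely represented by
using $L$ finite resolutions of the wavelet and the scaling
function, with parameter values $a_0 = 2$ and $b_0 = 1$, as

$$f_w(t) = \sum_{k=-\infty}^{\infty} c_{L,k}\, 2^{-L/2}\, \phi\left(2^{-L} t - k\right) + \sum_{j=1}^{L} \sum_{k=-\infty}^{\infty} d_{j,k}\, 2^{-j/2}\, \psi\left(2^{-j} t - k\right) \tag{A.6}$$

where the scaling coefficients $c_{L,k}$ are similarly defined as

$$c_{L,k} = \langle f_w(t), \phi_{L,k}(t) \rangle = \int f_w(t)\, 2^{-L/2}\, \phi\left(2^{-L} t - k\right) dt \tag{A.7}$$

and

$$\phi_{L,k}(t) = 2^{-L/2}\, \phi\left(2^{-L} t - k\right) \tag{A.8}$$

$$\psi(t) = 2 \sum_{k} h_1(k)\, \phi(2t - k) \tag{A.9}$$

$$\phi(t) = 2 \sum_{k} h_0(k)\, \phi(2t - k) \tag{A.10}$$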
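The two-scale relations (A.9) and (A.10) can be verified numerically for the simplest case. The sketch below uses the Haar wavelet, not the db4 wavelet used in the study, because its scaling function has a closed form; the filter values follow from the factor-2 convention of these equations, and the function names are hypothetical.

```python
def haar_scaling(t):
    """Haar scaling function: 1 on [0, 1), 0 elsewhere."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0


def haar_wavelet(t):
    """Haar wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0


H0 = [0.5, 0.5]    # h0(k) of Eq. (A.10) for the Haar case
H1 = [0.5, -0.5]   # h1(k) of Eq. (A.9) for the Haar case


def two_scale(coefficients, t):
    """Right-hand side of Eqs. (A.9)/(A.10): 2 * sum_k h(k) * phi(2t - k)."""
    return 2.0 * sum(h * haar_scaling(2.0 * t - k)
                     for k, h in enumerate(coefficients))


# Dyadic sample points on [-0.5, 1.5), exact in floating point.
sample_points = [i / 16.0 for i in range(-8, 24)]
scaling_ok = all(two_scale(H0, t) == haar_scaling(t) for t in sample_points)
wavelet_ok = all(two_scale(H1, t) == haar_wavelet(t) for t in sample_points)
```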
References
Adeli, H., Zhou, Z., & Dadmehr, N. (2003). Analysis of EEG records in an
epileptic patient using wavelet transform. Journal of Neuroscience
Methods, 123, 69–87.
Basar, E., Schurmann, M., Demiralp, T., Basar-Eroglu, C., & Ademoglu, A.
(2001). Event-related oscillations are ‘real brain responses’—wavelet
analysis and new strategies. International Journal of Psychophysiology,
39, 91–127.
Basheer, I. A., & Hajmeer, M. (2000). Artificial neural networks:
Fundamentals, computing, design, and application. Journal of Micro-
biological Methods, 43, 3–31.
Becerikli, Y. (2004). On three intelligent systems: Dynamic neural, fuzzy
and wavelet networks for training trajectory. Neural Computing
Applications, 13(4), 339–351.
Becerikli, Y., Konar, A. F., & Samad, T. (2003). Intelligent optimal control
with dynamic neural networks. Neural Networks, 16(2), 251–259.
Becerikli, Y., Oysal, Y., & Konar, A. F. (2004). Trajectory priming with
dynamic fuzzy networks in nonlinear optimal control. IEEE Trans-
actions Neural Networks, 15(2), 383–394.
Cannon, M., & Slotine, J. J. E. (1995). Space-frequency localized basis
function networks for nonlinear system estimation and control.
Neurocomputing, 9(3), 293–342.
Cohen, A., & Kovacevic, J. (1996). Wavelets: the mathematical back-
ground. Proceedings of the IEEE, 84, 514–522.
Daubechies, I. (1996). Where do wavelets come from? A personal point of
view. Proceedings of the IEEE, 84, 510–513.
De Carli, F., Nobili, L., Gelcich, P., & Ferrillo, F. (1999). A method for the
automatic detection of arousals during sleep. Sleep, 22, 561–572.
Fausett, L. (1994). Fundamentals of neural networks architectures,
algorithms, and applications. Englewood Cliffs, NJ: Prentice Hall Inc.
Folkers, A., Mosch, F., Malina, T., & Hofmann, U. G. (2003). Realtime
bioelectrical data acquisition and processing from 128 channels
utilizing the wavelet-transformation. Neurocomputing, 52–54, 247–
254.
Gabor, A. J., Leach, R. R., & Dowla, F. U. (1996). Automated seizure
detection using a self-organizing neural network. Electroencephalo-
graphy and Clinical Neurophysiology, 99, 257–266.
Geva, A. B., & Kerem, D. H. (1998). Forecasting generalized epileptic
seizures from the EEG signal by wavelet analysis and dynamic
unsupervised fuzzy clustering. IEEE Transactions on Biomedical
Engineering, 45(10), 1205–1216.
Haselsteiner, E., & Pfurtscheller, G. (2000). Using time-dependent neural
networks for EEG classification. IEEE Transactions on Rehabilitation
Engineering, 8, 457–463.
Hazarika, N., Chen, J. Z., Tsoi, A. C., & Sergejew, A. (1997). Classification
of EEG signals using the wavelet transform. Signal Processing, 59(1),
61–72.
Iasemidis, L. D., Shiau, D. S., Chaovalitwongse, W., Sackellares, J. C.,
Pardalos, P. M., Principe, J. C., et al. (2003). Adaptive epileptic seizure
prediction system. IEEE Transactions on Biomedical Engineering,
50(5), 616–627.
Kalayci, T., & Ozdamar, O. (1995). Wavelet preprocessing for automated
neural network detection of EEG spikes. IEEE Engineering in Medicine
and Biology Magazine, Mar/Apr, 160–166.
Kandaswamy, A., Kumar, C. S., Ramanathan, R. P., Jayaraman, S., &
Malmurugan, N. (2004). Neural classification of lung sounds using
wavelet coefficients. Computers in Biology and Medicine, 34(6),
523–537.
Khan, Y. U., & Gotman, J. (2003). Wavelet based automatic seizure
detection in intracerebral electroencephalogram. Clinical Neurophy-
siology, 114, 898–908.
Kiymik, M. K., Akin, M., & Subasi, A. (2004). Automatic recognition of
alertness level by using wavelet transform and artificial neural network.
Journal of Neuroscience Methods, 139(2), 231–240.
Mallat, S. G. (1987). Multifrequency channel decompositions of images
and wavelet models. IEEE Transactions on ASSP, 37(12), 2091–2109.
Oysal, Y., Yilmaz, A. S., & Koklukaya, E. (2005). A dynamic wavelet network
based adaptive load frequency control in power systems. Electrical
Power and Energy Systems, 27, 21–29.
Patwardhan, S. V., Dhawan, A. P., & Relue, P. A. (2003). Classification of
melanoma using tree structured wavelet transforms. Computer Methods
and Programs in Biomedicine, 72, 223–239.
Peters, B. O., Pfurtscheller, G., & Flyvbjerg, H. (2001). Automatic
differentiation of multichannel EEG signals. IEEE Transactions on
Biomedical Engineering, 48, 111–116.
Petrosian, A., Prokhorov, D., Homan, R., Dashei, R., & Wunsch, D. (2000).
Recurrent neural network based prediction of epileptic seizures in intra-
and extracranial EEG. Neurocomputing, 30, 201–218.
Pradhan, N., Sadasivan, P. K., & Arunodaya, G. R. (1996). Detection of
seizure activity in EEG by an artificial neural network: A preliminary
study. Computers and Biomedical Research, 29, 303–313.
Qu, H., & Gotman, J. (1997). A patient-specific algorithm for the detection
of seizure onset in long-term EEG monitoring: Possible use as a
warning device. IEEE Transactions on Biomedical Engineering, 44,
115–122.
Quiroga, R. Q., Sakowitz, O. W., Basar, E., & Schurmann, M. (2001).
Wavelet transform in the analysis of the frequency composition of
evoked potentials. Brain Research Protocols, 8, 16–24.
Quiroga, R. Q., & Schurmann, M. (1999). Functions and sources of event-
related EEG alpha oscillations studied with the wavelet transform.
Clinical Neurophysiology, 110, 643–654.
Qussar, Y., Rivals, I., Personnaz, L., & Dreyfus, G. (1998). Training
wavelet networks for nonlinear dynamic input–output modeling.
Neurocomputing, 20, 173–188.
Rioul, O., & Vetterli, M. (1991). Wavelet and signal processing. IEEE
Signal Processing Magazine, 14–46.
Robert, C., Gaudy, J. F., & Limoge, A. (2002). Electroencephalogram
processing using neural networks. Clinical Neurophysiology, 113,
694–701.
Rosso, O. A., Blanco, S., & Rabinowicz, A. (2003). Wavelet analysis of
generalized tonic-clonic epileptic seizures. Signal Processing, 83,
1275–1289.
Rosso, O. A., Martin, M. T., & Plastino, A. (2002). Brain electrical activity
analysis using wavelet-based informational tools. Physica A, 313, 587–
608.
Samar, V. J., Bopardikar, A., Rao, R., & Swartz, K. (1999). Wavelet
analysis of neuroelectric waveforms: A conceptual tutorial. Brain and
Language, 66, 7–60.
Sanner, R., & Slotine, J. J. E. (1992). Gaussian networks for direct adaptive
control. IEEE Transactions Neural Network, 13(6), 837–863.
Shimada, T., Shiina, T., & Saito, Y. (2000). Detection of characteristic
waves of sleep EEG by neural network analysis. IEEE Transactions on
Biomedical Engineering, 47, 369–379.
Soltani, S., Simard, P., & Boichu, D. (2004). Estimation of the self-
similarity parameter using the wavelet transform. Signal Processing,
84, 117–123.
Subasi, A. (2005). Automatic recognition of alertness level from EEG by
using neural network and wavelet coefficients. Expert Systems with
Applications, 28, 701–711.
Sun, M., & Sclabassi, R. J. (2000). The forward EEG solutions can be
computed using artificial neural networks. IEEE Transactions on
Biomedical Engineering, 47, 1044–1050.
Vuckovic, A., Radivojevic, V. A., Chen, C. N., & Popovic, D. (2002).
Automatic recognition of alertness and drowsiness from EEG by an
artificial neural network. Medical Engineering and Physics, 24,
349–360.
Webber, W. R. S., Lesser, R. P., Richardson, R. T., & Wilson, K. (1996).
An approach to seizure detection using an artificial neural network
(ANN). Electroencephalography and Clinical Neurophysiology, 98,
250–272.
Weng, W., & Khorasani, K. (1996). An adaptive structure neural network
with application to EEG automatic seizure detection. Neural Networks,
9, 1223–1240.
Zhang, Q., & Benveniste, A. (1992). Wavelet networks. IEEE Transactions
Neural Networks, 3(6), 889–898.
Zhang, M., Kawabata, H., & Liu, Z. Q. (2001). Electroencephalogram
analysis using fast wavelet transform. Computers in Biology and
Medicine, 31, 429–440.
Zhang, J., Walter, G. G., & Lee, W. (1995). Wavelet neural networks for
function learning. IEEE Transactions Signal Processing, 43(6), 1485–
1497.