IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, VOL. 5, NO. 6, NOVEMBER 1997 529
A Neural-Network Approach to Fault Detection and Diagnosis in Industrial Processes
Yunosuke Maki and Kenneth A. Loparo, Senior Member, IEEE
Abstract: Using a multilayered feedforward neural-network approach, the detection and diagnosis of faults in industrial processes that require observing multiple data simultaneously are studied in this paper. The main feature of our approach is that the detection of the faults occurs during transient periods of operation of the process. A two-stage neural network is proposed as the basic structure of the detection system. The first stage of the network detects the dynamic trend of each measurement, and the second stage of the network detects and diagnoses the faults. The potential of this approach is demonstrated in simulation using a model of a continuously well-stirred tank reactor. The neural-network-based method successfully detects and diagnoses pretrained faults during transient periods and can also generalize properly. Finally, a comparison with a model-based method is presented.
Index Terms: Fault detection, fault diagnosis, neural networks.
I. INTRODUCTION
FAULT detection and diagnosis problems have been stud-
ied intensively in industries such as chemical processing
and utility power generation. Prompt detection and diagnosis
of faults is essential for the reliable, safe, and efficient opera-
tion of the plant and for maintaining quality of the products.
Faults may occur in the process, the sensors, the actuators,
and the instruments independently or simultaneously. For a
simple fault that can be detected by a single measurement, a conventional alarm circuit may be sufficient. However,
because it is usually very difficult in complex industrial
systems to directly measure process states that are good
indicators of faults, more elaborate and automatic measures
are necessary. Observing multiple data simultaneously, skilled
operators are often required to make tough decisions based on
their experience and empirical knowledge.
One of the common approaches is to use model-based
methods for detection and diagnosis. This requires modeling
of the process, filtering the measured data, and estimation of
the unknown state variables. The basic idea is to compare the
output of the model to the measurements from the process,
thereby generating a residual or error which is used to make a decision about the operating state of the system. A wide
variety of methods and applications have been studied and
are summarized by Himmelblau [1], Frank [2], and Gertler
[3]. The nonlinear filtering approach studied in [4] is in this
category. Himmelblau et al. [5], [6] demonstrated the application
of extended Kalman filtering (EKF) to fault detection
and diagnosis in chemical processes. Model-based approaches
can use either state space or input-output representations
of dynamic systems. As a consequence, the system model
must be known and accurate for these methods to be highly
effective. Uncertainty in the process model can easily degrade
the estimation output and cause either missed detections or
false alarms. The nonlinear filtering approach is usually more
robust than conventional linear filtering-based methods, but a
substantial modeling effort may be required.
Manuscript received August 22, 1995; revised February 5, 1997. Recommended by Associate Editor, E. O. King.
The authors are with the Case School of Engineering, Case Western Reserve University, Cleveland, OH 44106-7082 USA.
Publisher Item Identifier S 1063-6536(97)07770-1.
On the other hand, qualitative approaches that do not require
process models have elicited considerable research interest in the last ten years. Decision table-based methods, knowledge-
based expert systems [7] and artificial neural-network-based
methods are considered to be in this category. Neural-network-
based methods have received much attention because of their
fast and robust implementation, their performance in learning
arbitrary nonlinear mappings and their ability for pattern
recognition and association. The fault detection and diagnosis
problem can be interpreted as a pattern recognition task.
Neural networks are an appropriate tool for fault detection
and diagnosis in which measured data, not discernible at the
instant of sensing, is transformed into useful information for
decision-making.
The potential of this approach for chemical processes was initially proposed by Hoskins and Himmelblau [8] and
Venkatasubramanian and Chan [9]. Watanabe et al. [10]
demonstrated the use of a two-stage neural network to add
information about the severity of the fault. More detailed
analysis regarding the learning, recall and generalization
characteristics of the method was given by Venkatasubra-
manian et al. [11] and a large-scale application to a complex
chemical plant was demonstrated by Hoskins et al. [12].
However, these approaches are static in nature because the
neural networks are trained using only steady-state data. If
the steady-state operating conditions are changed, the network
must be retrained in order to work properly. Oftentimes, faster detection of the fault is required and it is necessary to
use transient data for this purpose. Dietz et al. [13] trained
the network by presenting dynamic data and Li et al. [14]
developed an approach using a moving time window. Ohga
and Seki [15] trained the network using a number of sets of
time series data.
The major motivation of this work is the use of artificial
neural networks, capable of operating during process tran-
sients, for fault detection and diagnosis of industrial processes.
The ultimate goal is to develop a general method that can
1063-6536/97$10.00 © 1997 IEEE
Fig. 1. Process flow of well-stirred tank (Luyben model).
be applied to a broad spectrum of industrial processes. The
main feature of the proposed method is the rapid and robust
detection of faults during transient periods of the process.
Maintenance of the neural network should also require little effort. This paper is organized as follows: First, a
model of a well-stirred tank is developed as a target process
for this study. Second, the neural-network-based method is
developed and implemented in a simulation environment.
Third, a simulation study using the plant model is performed
and the results of various tests are discussed. Finally, the
proposed neural-network-based approach is compared with a
conventional model-based method, the EKF.
II. PLANT MODEL
In order to illustrate the method proposed in this work, a
target plant model is developed. A continuously well-stirred
tank reactor (CSTR) is used for all case studies in this work. In this section, the plant model and its implementation are
described in detail and the faults that are considered in the
study are introduced.
A. Description of the Plant
Fig. 1 illustrates the jacketed CSTR in which an irreversible
and exothermic reaction A → B takes place. The reactor
is operated by three control loops that regulate the outlet
temperature, the inlet flow rate of the reactant, and the tank level. A
cooling jacket surrounds the reactor and the coolant is water in
this case. Negligible heat losses, constant densities and perfect
mixing inside the tank are assumed. Therefore the temperature
in the jacket is uniform and equal to the outlet coolant temperature.
The process variables are as follows.
concentration of A at the inlet and outlet,
respectively;
flow rate of the liquid at inlet and outlet,
respectively;
flow rate of coolant;
volume of the tank;
temperature of the inlet reactant;
temperature of the outlet coolant (water);
temperature of the tank;
control valve openings.
The parameters, assumed to be constant, are as follows.
frequency factor;
activation energy;
gas constant;
volumetric heat capacity;
H heat of reaction;
heat exchange area;
overall heat transfer coefficient;
area of the tank;
jacket volume;
heat capacity of water;
density of water;
temperature of the inlet coolant (water).
The equations describing the system are
Mass balance:
(1)
(2)
Energy balance:
H
(3)
(4)
The system is highly nonlinear. All equations and parameter values are taken from Luyben [16]. Moreover, this CSTR
model has been used in previous neural-network-based studies;
see [8] and [11].
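For orientation, the balances of a jacketed CSTR in the form commonly quoted from Luyben's textbook are sketched below. The notation is an assumption chosen to match the variable and parameter lists above ($F_0$, $F$ reactant flow in and out; $F_J$ coolant flow; $C_{A0}$, $C_A$ concentrations; $T_0$, $T$, $T_J$, $T_{J0}$ temperatures; $V$, $V_J$ volumes; $k_0$ frequency factor; $\rho C_p$ volumetric heat capacity; $\Delta H$ heat of reaction, negative for an exothermic reaction; $U$, $A_H$ heat transfer coefficient and area; $\rho_J$, $C_J$ density and heat capacity of water). It may differ in symbols and arrangement from (1)-(4).

```latex
\begin{align}
\frac{dV}{dt} &= F_0 - F \\
\frac{d(V C_A)}{dt} &= F_0 C_{A0} - F C_A - V k_0 e^{-E/(RT)} C_A \\
\rho C_p \frac{d(V T)}{dt} &= \rho C_p \,(F_0 T_0 - F T)
    - \Delta H \, V k_0 e^{-E/(RT)} C_A - U A_H (T - T_J) \\
\rho_J C_J V_J \frac{dT_J}{dt} &= \rho_J C_J F_J \,(T_{J0} - T_J) + U A_H (T - T_J)
\end{align}
```

The Arrhenius factor $k_0 e^{-E/(RT)}$ is the source of the strong nonlinearity noted above.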
B. Implementation of the Plant Model
For computer simulation, the plant model is implemented
using Simulink in MATLAB. The basic time unit is the hour.
The step size for Euler integration is denoted by $dt$ and is
usually 0.01 h.
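The fixed-step Euler integration mentioned above can be sketched in a few lines of Python. This is only an illustration, not the Simulink model used in the paper: the function names are hypothetical, the balances are the standard jacketed-CSTR form, and the parameter values are illustrative numbers in the style of the textbook example (units of ft, h, mol, Btu, and degrees Rankine), not values confirmed by the paper.

```python
import math

# Illustrative parameter values (assumed, not taken from the paper).
PARAMS = dict(k0=7.08e10, E=30000.0, R=1.99, rho_cp=37.5, dH=-30000.0,
              U=150.0, A_H=250.0, V_J=3.85, rho_J=62.3, C_J=1.0, T_J0=530.0)

def cstr_derivatives(state, inputs, p=PARAMS):
    """Time derivatives of (V, C_A, T, T_J) for inputs (F0, F, F_J, C_A0, T0)."""
    V, C_A, T, T_J = state
    F0, F, F_J, C_A0, T0 = inputs
    k = p["k0"] * math.exp(-p["E"] / (p["R"] * T))   # Arrhenius rate constant
    q = p["U"] * p["A_H"] * (T - T_J)                # heat removed by the jacket
    dV = F0 - F
    # Product rule: d(V*C_A)/dt = V*dC_A/dt + C_A*dV/dt, and likewise for V*T.
    dC_A = (F0 * C_A0 - F * C_A - V * k * C_A - C_A * dV) / V
    dT = (F0 * T0 - F * T
          - p["dH"] * V * k * C_A / p["rho_cp"]
          - q / p["rho_cp"] - T * dV) / V
    dT_J = (F_J * (p["T_J0"] - T_J) / p["V_J"]
            + q / (p["rho_J"] * p["C_J"] * p["V_J"]))
    return (dV, dC_A, dT, dT_J)

def euler_step(state, inputs, dt=0.01):
    """Advance the state by one explicit Euler step of size dt (hours)."""
    derivs = cstr_derivatives(state, inputs)
    return tuple(x + dt * dx for x, dx in zip(state, derivs))
```

Calling `euler_step` repeatedly with the controller outputs as `inputs` gives the simulated trajectory; a smaller `dt` trades speed for accuracy, which matters here because the jacket dynamics are much faster than the reactor dynamics.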
A block diagram of each control loop is shown in Fig. 2.
Three PI controllers are used to regulate the outlet temperature,
the inlet flow rate of the reactant, and the tank level. Equal percentage valves are used to control the flow
rate of the reactant, coolant and outlet liquid. To simplify the
simulation, upstream and downstream pressures are assumed
to be constant. A first-order lag is used to model the actuator
and sensing dynamics.
C. Faults Studied
Complex and frequently observable faults are selected for
this study as listed in Table I. Not all possible faults are
included. However, three different kinds of faults, affecting
the sensors, the actuators, and the process, are considered.
MAKI AND LOPARO: FAULT DETECTION AND DIAGNOSIS IN INDUSTRIAL PROCESSES 531
Fig. 2. Block diagram of the control loop.
TABLE I. LIST OF FAULTS STUDIED
III. NEURAL-NETWORK-BASED METHOD
A. Introduction
In recent years artificial neural networks have generated
considerable interest in the field of engineering as problem
solving tools. The fundamental element is a neuron which has
multiple inputs and a single output. Each input is multiplied by
a weight, the inputs are summed and this quantity is operated
on by the transfer function of the neuron to generate the output. The output is sometimes referred to as an activity level.
In this study, the multilayer feedforward neural network that
has one hidden layer is used. The bias unit, whose activity
level is fixed at one, is connected to all neurons in the hidden
and output layer to adjust the weighted sum input of each
neuron. The number of neurons in the input and output layer
is determined by each application, and the number of neurons
in the hidden layer must be adjusted during the learning phase
so that the network can be trained efficiently. The activity level
of the $j$th neuron is obtained as
$$a_j = f_j(x_j), \qquad x_j = \sum_i w_{ji}\, a_i + b_j \tag{5}$$
where
$a_j$ activity level (output) of the $j$th neuron;
$x_j$ input to the $j$th neuron;
$f_j$ the transfer function of the $j$th neuron;
$w_{ji}$ connection weight from the $i$th neuron to the $j$th neuron;
$a_i$ activity level of the $i$th neuron in the prior layer;
$b_j$ connection weight from the bias unit to the $j$th neuron.
The log-sigmoid function is used as the transfer function in
this study.
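As a minimal sketch of the forward computation just described, the following Python functions evaluate a one-hidden-layer network with log-sigmoid neurons. The function names and the list-of-lists weight layout are assumptions made for illustration; only the structure (weighted sum plus bias, then log-sigmoid) comes from the text.

```python
import math

def logsig(x):
    """Log-sigmoid transfer function, mapping any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """Activity levels of one layer: logsig(sum_i w_ji * a_i + b_j) for each neuron j."""
    return [logsig(sum(w * a for w, a in zip(w_row, inputs)) + b)
            for w_row, b in zip(weights, biases)]

def forward(inputs, w_hid, b_hid, w_out, b_out):
    """Return (hidden activities, output activities) for one input vector."""
    hidden = layer(inputs, w_hid, b_hid)
    return hidden, layer(hidden, w_out, b_out)
```

With all weights and biases at zero, every neuron outputs 0.5, the midpoint of the log-sigmoid range, which is a convenient sanity check.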
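The backpropagation algorithm [17] is used to train the network.
The connection weights such as $w_{ji}$ and $b_j$ are adjusted
so that the average squared error between the network output
and the desired output (target) for a given reference input is
minimized. Learning continues iteratively until the sum of the
squared error is below a certain goal. The incremental change
of weight from the $i$th neuron to the $j$th is computed by
$$\Delta w_{ji}(t+1) = \eta\, \delta_j\, a_i + \alpha\, \Delta w_{ji}(t) \tag{6}$$
$$\delta_j = (d_j - a_j)\, f_j'(x_j) \tag{7}$$
$$\delta_j = f_j'(x_j) \sum_k \delta_k\, w_{kj} \tag{8}$$
where
$\Delta w_{ji}(t)$ incremental change in the weight at time $t$;
$d_j$ desired output of the $j$th neuron in the output layer;
$\delta_j$ error term of the $j$th neuron, with $a_j$ its activity level, $x_j$ its input, and $f_j'$ the derivative of its transfer function;
$\eta$ learning rate (usually a constant);
$\alpha$ momentum (usually a constant).
Equation (7) holds for the $j$th neuron in the output layer, and (8) holds for the $j$th neuron in the hidden layer, where the sum runs over the neurons $k$ of the following layer. In (6),
$\eta$ and $\alpha$ are adjustable parameters. In order to accelerate the
learning, the following methods are applied in this study.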
learning, the following methods are applied in this study.
1) The second term on the right-hand side of (6) is added
to the original update term to improve the learning [17].
Momentum is the key parameter here and it is set at
0.95 in this study.
2) The adaptive learning rate [18], which attempts to keep
the learning rate as large as possible while maintaining
the stability of the learning process, is also used. This
has a significant effect on convergence of the weights.
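The two acceleration methods above can be sketched together in plain Python. This is a sketch under stated assumptions, not the authors' training code: the accept/reject rule and the factors `lr_inc`, `lr_dec`, and `max_ratio` are common defaults for adaptive-learning-rate backpropagation and are not given in the paper, and all function names are hypothetical. Only the momentum value of 0.95 comes from the text.

```python
import math
import random

def logsig(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_net(n_in, n_hid, n_out, seed=0):
    """Small random weights; the last entry of each row is the bias weight."""
    rng = random.Random(seed)
    return {"hid": [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hid)],
            "out": [[rng.uniform(-0.5, 0.5) for _ in range(n_hid + 1)] for _ in range(n_out)]}

def zeros_like(net):
    return {k: [[0.0] * len(row) for row in net[k]] for k in net}

def forward(net, x):
    """x is a list; the bias unit contributes a fixed activity of one."""
    h = [logsig(sum(w * a for w, a in zip(row, x + [1.0]))) for row in net["hid"]]
    y = [logsig(sum(w * a for w, a in zip(row, h + [1.0]))) for row in net["out"]]
    return h, y

def sse(net, data):
    """Sum of squared errors over all (input, target) pairs."""
    return sum((t - y) ** 2 for x, target in data
               for t, y in zip(target, forward(net, x)[1]))

def gradient_terms(net, data):
    """Sum over patterns of delta_j * a_i (the update direction before scaling)."""
    g = zeros_like(net)
    for x, target in data:
        h, y = forward(net, x)
        d_out = [(t - yj) * yj * (1.0 - yj) for t, yj in zip(target, y)]
        d_hid = [hj * (1.0 - hj) * sum(d * net["out"][k][j] for k, d in enumerate(d_out))
                 for j, hj in enumerate(h)]
        for j, d in enumerate(d_out):
            for i, a in enumerate(h + [1.0]):
                g["out"][j][i] += d * a
        for j, d in enumerate(d_hid):
            for i, a in enumerate(x + [1.0]):
                g["hid"][j][i] += d * a
    return g

def train(net, data, epochs=200, lr=0.5, momentum=0.95,
          lr_inc=1.05, lr_dec=0.7, max_ratio=1.04):
    """Batch backpropagation with a momentum term and an adaptive learning rate."""
    prev = zeros_like(net)
    err = sse(net, data)
    for _ in range(epochs):
        g = gradient_terms(net, data)
        step = {k: [[lr * gv + momentum * pv for gv, pv in zip(gr, pr)]
                    for gr, pr in zip(g[k], prev[k])] for k in net}
        trial = {k: [[w + s for w, s in zip(wr, sr)]
                     for wr, sr in zip(net[k], step[k])] for k in net}
        new_err = sse(trial, data)
        if new_err > max_ratio * err:
            # Step made things clearly worse: discard it, shrink the rate, drop momentum.
            lr *= lr_dec
            prev = zeros_like(net)
            continue
        if new_err < err:
            lr *= lr_inc  # error fell, so try a slightly larger rate next time
        net, prev, err = trial, step, new_err
    return net, err
```

Setting `max_ratio=1.0` makes the error non-increasing by construction, at the cost of rejecting more steps; values slightly above one let the momentum term carry the weights across small ridges in the error surface.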
Fig. 3. Schematic diagram of fault detection and diagnosis system.
Fig. 4. Three training patterns for primary neural network.
B. Design of the Neural-Network-Based Detection System
1) Basic Concept: The capability of neural-network-based
methods for fault detection has been established in previous
works. The particular goals of this study are as follows.
1) Preknown faults should be detectable by the neural-network-based
system. Unknown operating conditions
should not generate a false alarm. In other words, the
system is designed to detect faults that have occurred
in the past and to be robust to unmodeled operating
conditions.
2) The transient state of the fault can be detected dynamically.
No steady-state values of process variables are
required as parameters in the design of the detection
system.
3) Detection is expected to be fast, reliable, and robust to
noise.
4) The method should be applicable to various industrial
processes with little additional effort, and adjustments
to the parameters and network structure should be easy
to make.
2) Basic Structure: Fig. 3 depicts the basic structure of the
fault detection and diagnosis system developed in this work.
A two-stage neural-network system is proposed to improve flexibility and applicability to other industrial processes. The
first stage network is referred to as the primary neural network
and the second stage network is referred to as the secondary
neural network. Each primary neural network corresponds to
a channel of measured data and detects the trend of that
channel (increasing, decreasing, or steady), with numbers
that indicate the extent of the change. Therefore,
the primary neural network can be designed independently
of the secondary neural network. Furthermore, we do not
have to design more than two primary neural networks, even
for multiple measurements, because the same network can be
applied to different measurement channels. The primary neural
network eliminates the need for additional input neurons to
capture the dynamic aspects of the data; see Li et al. [14].
The moving time window technique as described in [14] is
used, and a delay unit eliminates the effect of plant fluctuations
and accommodates differences in the response time of
different measurement channels. A reset and restriction rule
is used to reduce the probability of false alarms. Details are
presented later.
3) Design of Primary Neural Network: As we mentioned
above, this network is designed to be used with the various
observations that are available. Observation histories are cat-
egorized into three types of behavior: increasing, decreasing,
and steady, and this network is trained to give this type of
trend information including the extent of change. We assume
that the measured data is normalized to the range [−1, 1]
before it is used as an input to the network.
We begin with data obtained by periodic sampling (21 sam-
ples) from a single measurement source. After normalization,
a vector of 21 elements is given to the network as an input.
A feedforward type network is used. The number of units in the input layer is 21 and the number of units in the output layer
is three. The activity level, defined to be between zero and one,
of each unit corresponds to the extent of increase, decrease,
and steadiness of the input, respectively. These activity levels
are referred to as the increase, decrease, and steady outputs in the following examples. The
number of hidden units is adjustable and, after some trial and
error during the learning phase, is set to 15. The superior
feature of the feedforward type network is that its output can
include information about both the direction and the extent of
change as mentioned above.
Training is performed by presenting the three target patterns
as given in Fig. 4. Fig. 5 shows examples of how well the network
can generalize. Two different sets of noisy data, marked by x, are presented to the network and the recognized output
values are given under each graph. The extent of increase
in case (a) is apparently greater because the value of
the increase output is larger. On the other hand, the extent of steadiness in
case (b) is greater because the value of the steady output is larger.
4) Design of Secondary Neural Network: The secondary
neural network receives the outputs from the primary neural
networks and produces information about the faults. A
conceptual diagram of a two-stage network system is depicted
in Fig. 6. This network must be designed and trained to satisfy
the particular requirements of each application problem. A
Fig. 5. Examples of given input and recognized output. (a) Input: typical increase. (b) Input: slight increase.
feedforward type network that can be trained using preknown
information is also appropriate for this case. Suppose that the
number of sensors used for detection is $n$ and the number of
faults to be detected is $m$. The secondary neural network then has $3n$ neurons in the input layer and ($m + 1$) neurons in the output
layer (one per fault plus one for the normal condition). The number of units in the hidden layer is adjustable
and 15 is chosen for this paper from experimental trial and
error. The transfer function chosen is of the log-sigmoid type.
This is a reasonable choice because the extent of each fault can be represented by a number between zero and one.
For the CSTR plant model, eight variables are assumed to be measurable.
In actual processes, it is very difficult to measure concentration
continuously. Hence, the concentrations of substance A at the inlet
and outlet are assumed not to be measurable. With three inputs per
measured variable, the number of input neurons of the secondary network is 3 × 8 = 24.
For the training of the network, target patterns must be
set beforehand. According to the faults defined in Table I,
Table II gives the 12 sets of target patterns used for this study.
Each column corresponds to one specific fault and each row
corresponds to a neuron in the input layer. The values in
each column of the table are used as a reference input to the
network for each of the faults to be learned. Any combination
of faults can be chosen and the number of output neurons
is so determined. As the target patterns, the value of the
corresponding output neuron is set to one and the value of
other outputs is set to zero. These targets for the network can
Fig. 6. Conceptual diagram of two-stage neural network.
Fig. 7. Moving time window that traces the dynamic data.
be determined empirically by carefully investigating the faults
that have occurred in the past, but fine adjustment may be
necessary in order that the network generalizes satisfactorily.
Details will be discussed later.
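How the two stages connect can be illustrated with a short Python sketch. The helper names and the fault labels used in the usage note are hypothetical; the ordering of each triple (increase, decrease, steady) follows the description of the primary network outputs, and the one-of-many target coding follows the text above.

```python
def secondary_input(primary_outputs):
    """Concatenate the (increase, decrease, steady) triples of all channels."""
    return [level for triple in primary_outputs for level in triple]

def one_hot_targets(labels):
    """Target vector per label: one for the matching output neuron, zero elsewhere."""
    return {label: [1.0 if i == j else 0.0 for j in range(len(labels))]
            for i, label in enumerate(labels)}
```

For eight measured channels, `secondary_input` yields a 24-element vector. For a network trained on two fault pairs plus normal operation, `one_hot_targets(["#1p", "#1n", "#2p", "#2n", "normal"])` gives five 5-element target vectors, one per output neuron.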
5) A Moving Time Window and Normalization: A moving time window is an indispensable technique to track dynamic
data and detect the transient state of faults. As shown in Fig. 7,
the window moves forward at each time increment $dt$. The
right side of each window corresponds to the current time,
the time span of the window is , and the number of
samples is . The window length is adjustable for
each application. For this study, only three different lengths
are used: 20, 50, and 100. Vertical window height must
be specified according to the range of each measurement and
the amount of change in the measurement signals that can be
caused by the faults. Table III gives the values of and for
each window used in this study. Using uniform samples
of a time series of data, the window calculates the average of the samples and rescales the vertical axis. The average value
is set equal to zero, the upper value is set to one, and the
lower value is set to −1 using the window height. Values that exceed one are
set equal to one and values below −1 are set equal to −1.
Finally, the output of the moving window is used as the input
to the primary neural network.
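The window operation described above can be sketched as follows. Several details are assumptions made for illustration: the window length is counted in sampling intervals, the window is resampled to 21 points by picking the nearest stored sample, and `height` is interpreted as the deviation from the window average that maps to ±1. The function name is hypothetical.

```python
def window_normalize(history, length, height, n_out=21):
    """Return n_out values in [-1, 1] describing the last `length` sampling intervals.

    history: list of measurements, most recent last.
    length:  window length in sampling intervals (e.g. 20, 50, or 100).
    height:  deviation from the window average that is mapped to +/-1 (assumed meaning).
    """
    if len(history) < length + 1:
        raise ValueError("not enough samples to fill the window")
    window = history[-(length + 1):]
    # Uniform resampling by nearest index (an assumption; interpolation would also work).
    indices = [round(k * length / (n_out - 1)) for k in range(n_out)]
    samples = [window[i] for i in indices]
    mean = sum(samples) / len(samples)
    # Center on the average, scale by the height, and clip to [-1, 1].
    return [max(-1.0, min(1.0, (s - mean) / height)) for s in samples]
```

Because only deviations from the window average are passed on, the output is unchanged if a constant offset is added to the measurement, which is why no steady-state values are needed as design parameters.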
In adjusting , there is a tradeoff between prompt detection
and disturbance rejection. By increasing , the network is
unlikely to be affected by plant disturbances. However, the
detection can become insensitive to the measurements and the
response of the detection system can also become sluggish.
TABLE II. TRAINING PATTERNS FOR THE SECONDARY NEURAL NETWORK
TABLE III. HEIGHTS OF EACH WINDOW (NOTE: dt = 0.01 h)
TABLE IV. TIME CONSTANTS OF DELAY UNITS
Normalization of the inputs to the primary network is
automatically conducted by this moving time window; each
output of the primary network is between zero and one
and consequently, normalization for the secondary network
is also accomplished within the primary network. Although
the parameters shown in Table III must be adjusted for every application, knowledge of the steady-state values for each
measurement is not necessary. Even though the steady-state
operating conditions of the plant are likely to change, the
frequency at which network parameters require readjustment
should be low.
6) Delay Unit, Reset, and Restriction Rules: Some supple-
mentary functions of the neural-network-based method are
briefly discussed in this section.
When multiple observations are necessary to detect a certain
fault, the response time for each observation may be different
because of the plant characteristics. In addition, some obser-
vations may be contaminated by noise, and the intensity of
this sensor noise may be different for each measurement. In
order to accommodate the different response characteristics
for multiple data or to remove the effects of noise, a first-
order lag (delay unit) is incorporated into the detection system.
The sensitivity of the detection system to changes in the parameters of the delay unit should be evaluated with the
width of the window fixed and training patterns specified. The
time constants chosen for this study are shown in Table IV.
Note that if the window length is chosen to be large enough, and
the time constant τ of the coolant flow rate delay unit is
either 0.01 or 0.10, the network does not yield a false alarm
from unmodeled disturbances considered in this study. For
particular applications, incorporating a pure dead-time delay
in the network could also be effective.
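A discrete first-order lag of the kind described above can be sketched as a small Python class. The backward-Euler discretization, the class name, and the default step of 0.01 h are assumptions; where exactly the lag sits in the signal path of Fig. 3 is not specified here.

```python
class FirstOrderLag:
    """Discrete first-order lag: tau * dy/dt + y = u, discretized by backward Euler."""

    def __init__(self, tau, dt=0.01, initial=0.0):
        # Backward Euler stays stable for any tau >= 0, including tau == dt.
        self.alpha = dt / (tau + dt)
        self.y = initial

    def update(self, u):
        """Feed one new sample u and return the filtered value."""
        self.y += self.alpha * (u - self.y)
        return self.y
```

A larger `tau` smooths noise and slows the channel down, so it can be used to align a fast-responding measurement with a slower one before both reach the secondary network.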
A key feature of the detection system developed in this work
is the detection of faults during transient operating conditions.
TABLE V. RESULTS OF TRAINING AND RECALL (MULTIPLE FAULTS)
Because the detection system does not include information on
normal steady-state values, it cannot determine if the plant is
in a normal steady state or in an abnormal steady state. If all
observations are steady during a fault condition, it is possible
that the detection system can misdiagnose the situation and
conclude that the plant is normal. Hence, it is necessary that
the detection system is manually reset (reinitialized) after a fault is detected and the operator concludes that the plant
has resumed normal steady-state operation. This is not going
to be a problem in practical implementations because once
the system detects a fault, the alarm is held until it is
manually reset.
The outputs of the network have values between zero and
one and a threshold value of 0.9 is used in this study to set
the alarms for the operator.
From Table II, we notice that most faults are enumerated
by pairs that have opposite direction. For example, if we want
to detect the fault #8p, it is natural to train the network by
the target patterns of faults #8p and #8n. Because many of
the process variables have second-order dynamic response characteristics, a false alarm of #8n is likely to occur after
the correct detection of fault #8p. However, the probability of
such an event is quite low in actual applications. Therefore,
the fault detection system is designed and implemented in such
a way that, once a fault is detected, a fault with the opposite
direction is not considered until after the
system is reset manually.
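The threshold, latching, and restriction rules described in this subsection can be sketched as a small Python class. The class and method names are hypothetical, and treating an output equal to the threshold as an alarm is an assumption; the threshold of 0.9, the manual reset, and the suppression of the opposite-direction fault come from the text.

```python
class AlarmLogic:
    """Latch alarms at a threshold and suppress the opposite fault until reset."""

    def __init__(self, labels, opposites, threshold=0.9):
        self.labels = labels          # one label per output neuron, e.g. "#8p"
        self.opposites = opposites    # e.g. {"#8p": "#8n", "#8n": "#8p"}
        self.threshold = threshold
        self.active = set()

    def update(self, outputs):
        """Process one output vector of the secondary network; return latched alarms."""
        for label, level in zip(self.labels, outputs):
            if label == "normal":
                continue
            # Restriction rule: ignore a fault whose opposite is already latched.
            blocked = self.opposites.get(label) in self.active
            if level >= self.threshold and not blocked:
                self.active.add(label)
        return sorted(self.active)

    def reset(self):
        """Manual reset by the operator once the plant is back to normal."""
        self.active.clear()
```

Alarms stay in `active` even after the network output drops again, which mirrors the requirement that the alarm be held until the operator resets it.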
IV. SIMULATION RESULTS
The proposed fault detection and diagnosis system is expected
to recall pretrained faults correctly. It should also generalize appropriately even from distorted or noisy input
data. From a different point of view, it is also desired that
the neural network can be trained to detect as many faults as possible. In this section, the capabilities and
limitations regarding these requirements are examined and
discussed using the CSTR simulation.
A. Recall to Trained Faults
Basically, if the training of the secondary neural network
is accomplished within the time allocated for training and if
the error goal is achieved, recall should not be a problem. The
error goal of the backpropagation algorithm is set at 1 10
for the following experiments.
For a single fault, it is necessary to train the network
by presenting the target pattern of the fault and the normal
condition. Learning a faulty pattern along with the pattern of
normal operation is important to achieving correct recall. If
the network is trained only using the target pattern of a single fault, then only a single neuron in the output layer is fired
for any input. Because most faults are considered in pairs
as mentioned previously, from a practical point of view, it
is recommended that the network be trained with the normal
pattern and at least one pair of faulty patterns.
For multiple faults, target patterns should include the faults
and the pattern for normal operation. Recall ability of the
network is investigated by presenting both the normal pattern
and one pair of faulty patterns. Then, by augmenting the
number of faults, training and recall capabilities are also
investigated. Results are summarized in Table V. For example,
suppose the network is trained using five sets of data that represent the faults #1p, #1n, #2p, #2n and normal operation.
Fig. 8 shows an example of recall by presenting the data of
fault #1p. A symptom of the fault started at time 2.00. The
first neuron that corresponds to the fault #1p is fired promptly
at time 2.20 detecting the change that occurred in and
. Note that the fifth neuron, which corresponds to the normal
operation also responds but no false alarm occurs. The alarm
comes before the plant stabilizes to the faulty steady condition
at about time 2.60. Therefore, detection by this method is faster
than the conventional method that uses steady-state data.
From the results shown in Table V, this fault detection and
diagnosis system can also be trained using multiple faults
within an allowable time period. However, as the number of faults increases, trapping in a local minimum in the error
surface is more likely to happen. This actually occurred when
we attempted to simultaneously train 12 patterns. Randomized
network weights and biases at the beginning of training can
help mitigate this problem.
As Table V shows, the correct recall rate decreases
as the number of trained patterns increases. One reason is that
there are faults that are similar to each other, such as faults #1n
and #2p. A network trained using many faults is
likely to give a false alarm by confusing such similar
patterns. Furthermore, right after the occurrence of fault #2p,
TABLE VI. GENERALIZATION RESULTS FROM TRAINED DATA WITH NOISE
Fig. 8. An example of correct recall.
#2n, #6p or #6n, the coolant flow rate fluctuates for a
while. This can also initiate a false alarm of fault #3p/#3n or #5p/#5n because the coolant flow rate is also the key observation for these
fault pairs. By updating the error goal to a smaller value,
the occurrence of such false alarms can be reduced, but not
eliminated entirely.
There is no distinct limit regarding the number of patterns
to be trained. But for this particular application, we found that it is better to train the network using fewer than ten patterns.
Of course, this will vary from application to application, and
this limitation must be discovered as a part of the learning
and training process.
B. Generalization from Untrained Faults, Input with Noise, and Faults with Different Severity
How does the secondary neural network respond to un-
trained faults? How does it work with noise-corrupted input
data or faults with different severity? These generalization
issues are examined next. Furthermore, this section contains
a discussion of the applicability of the proposed system to
real-world industrial processes.
First, the response of the detection system to an input that
represents an untrained fault is investigated. According to
the experiments performed with many pretrained networks,
untrained faults were diagnosed as the normal operating condition.
In the hyperplane generated by the input vectors, an
arbitrary input vector lies closest to the vector of normal
operation. Arbitrary input vectors can be considered to be
outputs of the primary network representing unknown faults.
Therefore, the network cannot generalize to this situation, but
this is consistent with our objectives for the design of the
detection system. As depicted in Fig. 6, the output neuron
for the normal condition should be referred to as normal or
untrained faults.
When the set point of the controller is changed, an unsteady
operating condition is generated in the plant, but this is still
normal operation. If this response is not similar to one of
the trained faults, the network diagnoses a normal operating
condition, similar to the case of an untrained fault. However,
for example, faults #1p and #1n are quite similar to the response
of the plant to a set point change of the inlet flow rate. It is
quite likely that a false alarm is generated during such set point
changes. As these set point changes are commonly initiated by
an operator, the detection system should be tentatively disabled
while the plant is in such a transient state.
Second, noisy inputs of pretrained faults are given to the
network. White noise with a normal distribution [$N(0, 1)$] is multiplied by a constant and added to each measurable
variable. The control actions of the three controllers are also
affected by the noise processes. The noise level is expressed as
a percentage that represents the ratio of the standard deviation
of the noise term to the amount of change caused by the fault.
Let us consider the same example as given in Fig. 8, in which
the change of the coolant flow rate is about 4 ft$^3$/h and the
change of the control valve opening is about 8.1% of total
stroke. If the standard deviation of the noise term of the coolant
flow rate is 0.55 ft$^3$/h, the noise level of the coolant flow rate
is about $0.55/4 \times 100 \approx 14$%. If the standard deviation of the noise
term of the control valve opening is 0.56%, the noise level of
the control valve opening is about $0.56/8.1 \times 100 \approx 6.9$%.
For simplicity, the noise level of temperatures and level
measurements are kept constant. Other noise levels are altered
as in Table VI, which also shows the results of this experiment.
It is obvious that the more noise that is added the more difficult
it is to detect faults. However, the network performance
demonstrates that it is adequately robust to noise. Fig. 9
illustrates an example where fault #1p is correctly generalized
from a faulty input with noise. Because the threshold level of
firing each neuron is set at 0.9, the system can detect this fault
at time . Further discussion of noise is given in the
next section.
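The noise-level definition used above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' simulation code: the function names are hypothetical, and the only numbers taken from the text are the fault-induced changes (about 4 ft³/h and 8.1% of stroke) and the noise standard deviations (0.55 ft³/h and 0.56%).

```python
import random


def noise_level_percent(noise_std, fault_change):
    """Noise level as defined in the text: standard deviation of the noise
    term divided by the change caused by the fault, as a percentage."""
    return 100.0 * noise_std / abs(fault_change)


def add_scaled_noise(signal, noise_std, rng):
    """Add N(0, 1) white noise, multiplied by a constant, to each sample."""
    return [x + noise_std * rng.gauss(0.0, 1.0) for x in signal]


# Values quoted in the example of Fig. 8.
coolant_level = noise_level_percent(0.55, 4.0)  # coolant flow rate
valve_level = noise_level_percent(0.56, 8.1)    # control valve opening
print(f"coolant flow rate: {coolant_level:.1f}%")
print(f"control valve opening: {valve_level:.1f}%")

# Hypothetical flat measurement trace, used only to show the noise injection.
rng = random.Random(0)
noisy_trace = add_scaled_noise([0.0] * 5, 0.55, rng)
```

Scaling a unit-variance sequence by a constant makes that constant the standard deviation of the added term, which is why it can be used directly in the ratio.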
give and
(14)
(15)
where
(16)
(17)
and are used to compute the state estimate and
error covariance at the next time step.
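For orientation, one predict/update cycle of a discrete-time EKF can be sketched for a scalar state as follows. This is the textbook form of the filter and is offered only as an illustration; it is not claimed to match the notation or the exact expressions of (14)-(17). The model functions `f` and `h`, their derivatives, and all numerical values are assumptions chosen for the demonstration.

```python
def ekf_step(x_est, p_est, y_meas, f, f_jac, h, h_jac, q, r):
    """One scalar EKF cycle: time update followed by measurement update."""
    # Time update: propagate the estimate and its error variance.
    x_pred = f(x_est)
    a = f_jac(x_est)
    p_pred = a * p_est * a + q
    # Measurement update: linearize h about the predicted state.
    c = h_jac(x_pred)
    gain = p_pred * c / (c * p_pred * c + r)
    x_new = x_pred + gain * (y_meas - h(x_pred))
    p_new = (1.0 - gain * c) * p_pred
    return x_new, p_new, gain


# Hypothetical model: linear dynamics with a nonlinear (quadratic) sensor.
def f(x):
    return 0.9 * x + 0.2


def f_jac(x):
    return 0.9


def h(x):
    return x * x


def h_jac(x):
    return 2.0 * x


x_true, x_est, p_est = 1.0, 0.5, 1.0
for _ in range(100):
    x_true = f(x_true)
    y = h(x_true)  # noise-free measurement, to keep the demo deterministic
    x_est, p_est, gain = ekf_step(x_est, p_est, y, f, f_jac, h, h_jac,
                                  q=1e-4, r=1e-2)
print(abs(x_est - x_true))
```

The updated estimate and variance returned by one call are the inputs to the next call, which is the recursion described in the sentence above.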
B. Application of EKF to the CSTR
Define the state vector
(18)
where it is assumed that and are measurable. It
follows that , , , ,
and . From the assumption, and
are measurable and the others are not. Usually and are
treated as inputs in such a state space representation. However,
in this application they are defined as variables because they
must be estimated.
For computational reasons, the system equations are mod-
ified to be dimensionless. The normalized state variables in
deviation form are defined as shown in (19) at the bottom of
the page, where (ft ), , ,
, (ft /h) and (ft /h). These
are the steady-state values. Hence, ,
, , ,
and . From here, the * is
omitted to simplify the notation and because they are defined
as deviation variables, the initial state is given as .
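The idea of a dimensionless deviation variable can be illustrated with a short sketch. Equation (19) did not survive in this copy, so the scaling below, (value - steady state) / steady state, is an assumption about its form, and the variable names and steady-state numbers are purely illustrative.

```python
def to_deviation(value, steady_state):
    """Dimensionless deviation from the steady-state value (assumed form)."""
    return (value - steady_state) / steady_state


def from_deviation(deviation, steady_state):
    """Inverse mapping back to engineering units."""
    return steady_state * (1.0 + deviation)


# Hypothetical steady-state operating point (illustrative values only).
steady = {"volume": 48.0, "temperature": 600.0, "coolant_flow": 50.0}

# At the steady state every normalized variable is zero, which is why the
# initial state of the normalized model can be taken as the zero vector.
initial_state = [to_deviation(v, v) for v in steady.values()]
print(initial_state)

# A 10% rise in coolant flow maps to a deviation of about 0.1.
dev = to_deviation(55.0, steady["coolant_flow"])
print(dev, from_deviation(dev, steady["coolant_flow"]))
```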
From (1)–(4), the system equations of the CSTR are written
as shown in (20) at the bottom of the page, where
(21)
represents the modeling uncertainty.
From the assumption on measurability, the output equation is
written below, where the observation matrix is denoted by
(22)
Here
(23)
and is the measurement noise vector.
Examining the parameters given in (21), we notice that there
are variables that require estimation. For example, the coolant
flow rate, , and the reactant concentration at the inlet, ,
can change during the operation of the process. In a fault mode,
parameters such as , , and can also vary. The reason for
(19)
(20)
the choice of the system state as given in (18) is discussed in
the next section.
C. Observability
The model (20) has unmeasured state variables and param-
eters that can change during different operating modes of the
process. If parameters such as , , and can be estimated,
it is very helpful for fault detection and diagnosis. Because
they are not directly measurable but are necessary for
the detection of certain faults, the state vector is augmented to
include these variables and parameters, which are then estimated by the
EKF. The augmented state must be observable; otherwise it
is meaningless to incorporate these variables and parameters
into the model. Various realizations, other than the system
representation given in the previous section, were considered.
Unfortunately, all of the other realizations that were tried
were unobservable.
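Why some augmented realizations fail can be seen on a small linear example. The sketch below applies the standard rank test on the observability matrix to two hypothetical discrete-time models; neither is the CSTR model (20), and for a nonlinear system the test would be applied to a linearization. In the first model one constant parameter is appended to the state; in the second, two parameters enter only through their sum, so the measurement cannot distinguish them.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def rank(matrix, tol=1e-9):
    """Numerical rank by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    rows, cols = len(m), len(m[0])
    r = 0
    for c in range(cols):
        pivot = max(range(r, rows), key=lambda i: abs(m[i][c]))
        if abs(m[pivot][c]) < tol:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            factor = m[i][c] / m[r][c]
            for j in range(c, cols):
                m[i][j] -= factor * m[r][j]
        r += 1
        if r == rows:
            break
    return r


def observability_matrix(a, c):
    """Stack C, CA, ..., CA^(n-1) for an n-state system."""
    block, stacked = c, []
    for _ in range(len(a)):
        stacked.extend(block)
        block = mat_mul(block, a)
    return stacked


# State x1 is measured; x2 is an unknown constant parameter.
a_one = [[0.9, 1.0], [0.0, 1.0]]
c_one = [[1.0, 0.0]]
# Two unknown constants x2 and x3 enter only as the sum x2 + x3.
a_two = [[0.9, 1.0, 1.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
c_two = [[1.0, 0.0, 0.0]]

rank_one = rank(observability_matrix(a_one, c_one))
rank_two = rank(observability_matrix(a_two, c_two))
print(rank_one == len(a_one))  # full rank: augmented state is observable
print(rank_two == len(a_two))  # rank deficient: not observable
```

A rank below the number of states means that at least one combination of the augmented variables leaves no trace in the measurements, so no filter tuning can recover it.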
D. EKF Tuning and Simulation for Fault Detection
The parameters of the EKF are , , and , which are
assumed to be diagonal matrices. Also, the initial state must
be specified. As described in (9) and (10), and can
be determined from the known noise covariances. After that,
however, further tuning by trial and error is always necessary
to achieve stable and accurate estimation. Each element of
is inversely proportional to the gain matrix . Hence, smaller
elements are chosen as long as the estimation process is stable.
As each element of increases, faster response is obtained
but the amplitude of fluctuation increases. Conversely, the
smaller each element of is, the slower the response and the
smaller the fluctuations. The elements of that correspond to
unmeasurable states need to be chosen so that the estimated
states track the true values. As long as is given as , and and are chosen appropriately, it is not necessary
to adjust . This delicate balancing of estimator parameters
and performance is similar to the situation we discussed
earlier regarding the influence of window height on network
performance for different fault severity scenarios.
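The tuning trade-off described above can be reproduced with a scalar Kalman filter. The sketch below assumes the standard roles of a process-noise variance `q` and a measurement-noise variance `r`; the matrix symbols are missing from this copy of the text, so this mapping is an assumption, and the random-walk model is hypothetical. It iterates the variance recursion to its steady state and reports the resulting gain.

```python
def steady_state_gain(q, r, iterations=500):
    """Steady-state Kalman gain for x(k+1) = x(k) + w, y(k) = x(k) + v,
    where w has variance q and v has variance r."""
    p = 1.0
    gain = 0.0
    for _ in range(iterations):
        p_pred = p + q                  # time update of the error variance
        gain = p_pred / (p_pred + r)    # Kalman gain
        p = (1.0 - gain) * p_pred       # measurement update
    return gain


# Larger q: larger gain, so faster response but more fluctuation.
for q in (0.01, 0.1, 1.0):
    print(f"q={q}: gain={steady_state_gain(q, 1.0):.3f}")

# Larger r: smaller gain, so slower response but smoother estimates.
for r in (0.1, 1.0, 10.0):
    print(f"r={r}: gain={steady_state_gain(0.01, r):.3f}")
```

Only the ratio of the two variances matters in this scalar case, which is one reason trial-and-error tuning usually fixes one of them and adjusts the other.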
Two faults, #3p and #5p, are selected for the case study
to compare the performance of the EKF with that
of the neural-network-based approach. For #3p, the unmeasurable
state, , rises due to the sticking of the control valve. For #5p,
the constant, , rises and consequently the unmeasurable state
increases. Hence, the estimates of or
by the EKF are used to detect these faults.
The square root of the covariance matrix is given as
The square root of the covariance matrix is given as
The covariance of the initial state estimation error is given
as
Fig. 11. Results of fault #5p.
Computer simulations for the two faults are performed under
the above conditions.
Fault #3p: For fault #3p, after adjustment and of
the EKF are
Results of estimated states are shown in Fig. 10. The change
of coolant flow rate affects the accuracy of the plant model. Nevertheless, the estimates of and track
the actual values very well. The fault is detected by trending
of the estimated states and .
Fault #5p: For fault #5p, and are adjusted to be
Results of the estimated states are shown in Fig. 11.
Despite the intensive effort to search for the optimal
and , does not follow the actual value. The
coolant flow rate and the inlet concentration change
due to the fault and affect the accuracy of the model. The
plant model with degraded accuracy hinders the estimation
and proper tracking. Hence, the fault is not detectable.
E. Comparison of the Two Methods
Following the above results, the EKF and the
neural-network-based method are compared for the two faults
#3p and #5p. Five different noise levels are
applied, and the performance at different S/N ratios is compared.
As the noise level of fault #3p increases, false alarms become
more likely and tuning of the parameters is necessary
for more than half the cases. For all cases of fault #5p, the
EKF does not work well even though extensive efforts were
made to tune the parameters. We conclude this section with
the following comments regarding the comparison.
1) A model-based approach such as EKF significantly
depends on the validity of the model. The performance
of a model-based system is easily degraded by unmod-
eled disturbances such as measurement or process noise
caused by perturbations or other malfunctions of the
plant.
2) Use of the EKF is limited by observability of the real-
ization, including unmeasurable states and parameters. If
the system is not observable, we must look for a reduced
set of states and/or parameters that are observable.
Reducing the state variables makes the model vulnerable
to uncertainty.
3) Parameters of the EKF need to be adjusted every time
the noise level changes. Parameters of the neural-
network-based approach are generally more robust in
this sense.
4) The advantage of the model-based method is that it
can estimate unmeasurable parameters, if they are
observable. On the other hand, the neural-network method
must rely on the measurable information and can best
correlate with preknown faults through training.
VI. CONCLUSIONS
A neural-network-based fault detection and diagnosis system
is developed and applied to a plant model of a CSTR.
We summarize and review the main results of this work,
evaluate the neural-network-based approach for fault detection
and discuss the applicability to general industrial processes.
A nonlinear CSTR was chosen as the study system. A
plant model was developed and implemented for computer
simulation. The following results were obtained from the case
studies conducted using the CSTR model.
1) A two-stage neural-network system, where each element
can be designed independently, has been proposed. Using
data preprocessed by the moving time window, the
primary network detects the transient state of each
measurement dynamically, and this architecture can be used
for many industrial process applications. A secondary
neural network was developed to detect and diagnose a
set of preknown faults according to the specification of
the application. The two-stage network approach yields
an efficient and simplified design procedure.
2) For the training of the neural networks, the backpropagation
algorithm was chosen. This approach is considered
to be better for training feedforward neural networks, as
opposed to Hopfield networks. Combined with momentum
and the adaptive learning rate method, training of
the neural networks was performed very efficiently.
3) The secondary neural network can be trained for multiple
faults as long as all the patterns differ from each
other. It can also recall the trained faults correctly.
However, the more patterns that are trained, the more
likely the training is to be trapped in a local minimum. Moreover,
similar patterns of faults or perturbations that occur after
a fault are likely to produce a false alarm.
4) The neural-network-based system can detect trained
faults promptly during the transient period. It is faster
than the method trained using steady-state data. It classifies
an untrained fault as the normal operating condition, as
desired. When a set point of a controller is changed during
normal operation of the plant, the detection system
diagnoses that the plant is normal unless the response of
the plant induced by the change in set point is similar to
that of the trained faults. Because an operator should be
aware of either a set point change or the occurrence of a
measurable disturbance, in these situations the detection
system should be turned off temporarily until the process
returns to a normal operating state.
5) The secondary neural network can generalize from faulty
data with noise if the amplitude of the noise is within
certain bounds. It can also generalize from faulty data
with different severity, unless the severity is much
smaller than the window height.
6) A conventional model-based approach, the EKF, was chosen
for comparison in this study. Its performance depends strictly on the
accuracy of the model and its applicability is restricted
by observability of the realization. Compared with the
EKF model-based approach, the neural-network-based
approach is more robust with respect to noise. Generally,
we do not have to change parameters of the
neural-network-based method for different faults and different
noise levels.
7) The secondary neural network must be designed using
the given specifications of the plant. However, tuning of
the parameters can be completed efficiently. This feature
implies that a wide scope of industrial process applications
can be addressed using the approach developed in
this work.
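The training scheme named in conclusion 2), gradient descent with momentum and an adaptive learning rate, can be sketched on a toy problem. This is one common variant of the heuristic and is not claimed to be the exact rule the authors used: a step that raises the error is rejected, the learning rate is reduced and the momentum is reset, while a step that lowers the error is accepted and the learning rate is increased. The quadratic error function and all factor values are hypothetical.

```python
def train(loss, grad, weights, lr=0.01, momentum=0.9,
          lr_inc=1.05, lr_dec=0.7, epochs=200):
    """Gradient descent with momentum and an adaptive learning rate."""
    velocity = [0.0] * len(weights)
    best = loss(weights)
    for _ in range(epochs):
        g = grad(weights)
        step = [momentum * v - lr * gi for v, gi in zip(velocity, g)]
        candidate = [w + s for w, s in zip(weights, step)]
        new_loss = loss(candidate)
        if new_loss > best:
            # Error went up: discard the step, slow down, forget momentum.
            lr *= lr_dec
            velocity = [0.0] * len(weights)
            continue
        if new_loss < best:
            lr *= lr_inc  # error went down: speed up
        weights, velocity, best = candidate, step, new_loss
    return weights, lr


# Hypothetical error surface with very different curvature in each weight.
def loss(w):
    return (w[0] - 1.0) ** 2 + 10.0 * (w[1] + 2.0) ** 2


def grad(w):
    return [2.0 * (w[0] - 1.0), 20.0 * (w[1] + 2.0)]


start = [0.0, 0.0]
final_weights, final_lr = train(loss, grad, start)
print(loss(start), loss(final_weights))
```

Because a step is kept only when it does not raise the error, the error in this sketch never increases from one accepted step to the next; variants that tolerate a small increase trade that guarantee for a better chance of escaping flat regions.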
REFERENCES
[1] D. M. Himmelblau, Fault Detection and Diagnosis in Chemical andPetrochemical Processes. New York: Elsevier, 1978.
[2] P. M. Frank, Fault diagnosis in dynamic systems using analyticaland knowledge-based redundancyA survey and some new results,
Automatica, vol. 26, no. 3, pp. 459474, 1990.[3] J. J. Gertler, Survey of model-based failure detection and isolation in
complex plants, IEEE Contr. Syst. Mag., vol. 8, pp. 311, 1988.[4] K. A. Loparo, M. R. Buchner, and K. S. Vasudeva, Leak detection in
an experimental heat exchanger process: A multiple model approach,IEEE Trans. Automat. Contr., vol. 36, 1991.
[5] S. Park and D. M. Himmelblau, Fault detection and diagnosis via
parameter estimation in lumped dynamic systems, Ind. Eng. Chem.Process Des. Dev., vol. 22, no. 3, pp. 482487, 1983.
[6] R. Li and J. H. Olsen, Fault detection and diagnosis in a closed-loopnonlinear distillation process: Application of extended Kalman filters,
Ind. Eng. Chem. Res., vol. 30, no. 5, pp. 898908, 1991.[7] S. K. Shum, J. F. Davis, W. F. Punch, and B. Chandrasekaran, An
expert system approach to malfunction diagnosis in chemical plants,Comput. Chem. Eng., vol. 12, no. 1, pp. 2736, 1988.
[8] J. C. Hopkins and D. M. Himmelblau, Artificial neural-network mod-els of knowledge representation in chemical engineering, ComputersChem. Eng., vol. 12, nos. 9/10, pp. 881890, 1988.
[9] V. Venkatasubramanian and K. Chan, A neural-network methodologyfor process fault diagnosis, AIChE J., vol. 35, no. 12, pp. 19932001,1989.
[10] K. Watanabe, I. Matsuura, M. Abe, and M. Kubota, Incipient faultdiagnosis of chemical processes via artificial neural networks, AIChE
J., vol. 35, no. 11, pp. 18031812, 1989.[11] V. Venkatasubramanian, R. Vaidyanathan, and Y. Yamamoto, Process
fault detection and diagnosis using neural networksI: Steady-stateprocesses, Computers Chem. Eng., vol. 14, no. 7, pp. 699712, 1990.
[12] J. C. Hoskins, K. M. Kaliyur, and D. M. Himmelblau, Fault diagnosisin complex chemical plants using artificial neural networks, AIChE J.,vol. 37, no. 1, pp. 137141, 1991.
[13] W. E. Dietz, E. L. Kiech, and M. Ali, Jet and rocket engine faultdiagnosis in real time, J. Neural-Network Computing, vol. 1, no. 5, pp.517, 1989.
[14] R. Li, J. H. Olson, and D. L. Chester, Dynamic fault detection anddiagnosis using neural networks, in Proc. 5th IEEE Symp. Intell. Contr.,1990, pp. 11691174.
[15] Y. Ohga and H. Seki, Abnormal event identification in nuclear powerplants using a neural network and knowledge processing, NuclearTechnol., vol. 101, pp. 159167, Feb. 1993.
[16] W. L. Luyben, Process Modeling, Simulation, and Control for ChemicalEngineers. New York: McGraw-Hill, 1990.
7/23/2019 A Neural-Network Approach to Fault Detection CSTR.pdf
13/13
MAKI AND LOPARO: FAULT DETECTION AND DIAGNOSIS IN INDUSTRIAL PROCESSES 541
[17] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Learning internalrepresentations by error propagation, Parallel Distributed Processing:
Explorations in the Microstructure of CognitionI: Foundations, D. E.Rummelhart and J. L. McClelland, Eds. Cambridge, MA: MIT Press,1986.
[18] T. P. Vogel, J. K. Mangis, A. K. Rigler, W. T. Zink, and D. L. Alkon,Accelerating the convergence of the backpropagation method, Biol.Cybern., vol. 59, pp. 257263, 1988.
[19] C. K. Cui and G. Chen, Kalman Filtering. New York: Springer-Verlag.[20] A. H. Jazwinski, Stochastic Processes and Filtering Theory. New
York: Academic, 1970.
Yunosuke Maki was born in Toyohashi, Japan, in 1958. He received the B.S. degree in mathematics and instrumentation from the University of Tokyo in 1981 and the M.S. degree in Systems and Control Engineering from Case Western Reserve University, Cleveland, OH, in 1994.
Since 1981, he has been working for the Kawasaki Steel Corporation in Japan. His research interests include the industrial applications of systems and control theory, especially to the steelmaking process.
Mr. Maki is currently a member of the Iron and Steel Institute in Japan.
Kenneth A. Loparo (S'75–M'77–SM'89) received the Ph.D. degree in systems and control engineering from Case Western Reserve University, Cleveland, OH, in 1977.
He was an Assistant Professor in the Mechanical Engineering Department at Cleveland State University, OH, from 1977 to 1979, where he received the Distinguished Faculty Award for contributions to teaching and research. From 1979 to the present time, he has been on the faculty of The Case School of Engineering, Case Western Reserve University, where he is currently Associate Dean of The Case School of Engineering and Professor of Systems and Control Engineering. He is also Professor of Mechanical and Aerospace Engineering and Professor of Mathematics. He served as Chair of the Department of Systems Engineering from 1990 to 1994 and as Associate Director of the Center for Automation and Intelligent Systems Research from 1985 to 1989. His research interests include stability and control of nonlinear and stochastic systems with applications to large-scale electric power systems; nonlinear filtering with applications to monitoring, fault detection, diagnosis and reconfigurable control; information theory aspects of stochastic and quantized systems with applications to adaptive and dual control; and the design of digital control systems.
At Case Western Reserve University he has received numerous awards including the Sigma Xi Research Award for contributions to stochastic control, the John S. Diekoff Award for Distinguished Graduate Teaching, the Tau Beta Pi Outstanding Engineering and Science Professor Award, the Undergraduate Teaching Excellence Award, and the Carl F. Wittke Award for Distinguished Undergraduate Teaching.