IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, VOL. 5, NO. 6, NOVEMBER 1997 529

A Neural-Network Approach to Fault Detection and Diagnosis in Industrial Processes

Yunosuke Maki and Kenneth A. Loparo, Senior Member, IEEE

Abstract: Using a multilayered feedforward neural-network approach, the detection and diagnosis of faults in industrial processes that require observing multiple data simultaneously are studied in this paper. The main feature of our approach is that the detection of the faults occurs during transient periods of operation of the process. A two-stage neural network is proposed as the basic structure of the detection system. The first stage of the network detects the dynamic trend of each measurement, and the second stage of the network detects and diagnoses the faults. The potential of this approach is demonstrated in simulation using a model of a continuously well-stirred tank reactor. The neural-network-based method successfully detects and diagnoses pretrained faults during transient periods and can also generalize properly. Finally, a comparison with a model-based method is presented.

Index Terms: Fault detection, fault diagnosis, neural networks.

I. INTRODUCTION

FAULT detection and diagnosis problems have been studied

intensively in industries such as chemical processing

and utility power generation. Prompt detection and diagnosis

of faults is essential for the reliable, safe, and efficient operation

of the plant and for maintaining quality of the products.

Faults may occur in the process, the sensors, the actuators,

and the instruments independently or simultaneously. For a

simple fault that can be detected by a single measurement, a conventional alarm circuit may be sufficient. However,

because it is usually very difficult in complex industrial

systems to directly measure process states that are good

indicators of faults, more elaborate and automatic measures

are necessary. Observing multiple data simultaneously, skilled

operators are often required to make tough decisions based on

their experience and empirical knowledge.

One of the common approaches is to use model-based

methods for detection and diagnosis. This requires modeling

of the process, filtering the measured data, and estimation of

the unknown state variables. The basic idea is to compare the

output of the model to the measurements from the process,

thereby generating a residual or error which is used to make a decision about the operating state of the system. A wide

variety of methods and applications have been studied and

are summarized by Himmelblau [1], Frank [2], and Gertler

[3]. The nonlinear filtering approach studied in [4] is in this

category. Himmelblau et al. [5], [6] demonstrated the application of extended Kalman filtering (EKF) to fault detection and diagnosis in chemical processes. Model-based approaches can use either state-space or input-output representations of dynamic systems. As a consequence, the system model must be known and accurate for these methods to be highly effective. Uncertainty in the process model can easily degrade the estimation output and cause either missed detections or false alarms. The nonlinear filtering approach is usually more robust than conventional linear filtering-based methods, but a substantial modeling effort may be required.

Manuscript received August 22, 1995; revised February 5, 1997. Recommended by Associate Editor, E. O. King.

The authors are with the Case School of Engineering, Case Western Reserve University, Cleveland, OH 44106-7082 USA.

Publisher Item Identifier S 1063-6536(97)07770-1.

On the other hand, qualitative approaches that do not require

process models have elicited considerable research interest in the last ten years. Decision table-based methods, knowledge-

based expert systems [7] and artificial neural-network-based

methods are considered to be in this category. Neural-network-

based methods have received much attention because of their

fast and robust implementation, their performance in learning

arbitrary nonlinear mappings and their ability for pattern

recognition and association. The fault detection and diagnosis

problem can be interpreted as a pattern recognition task.

Neural networks are an appropriate tool for fault detection

and diagnosis in which measured data, not discernible at the

instant of sensing, is transformed into useful information for

decision-making.

The potential of this approach for chemical processes was initially proposed by Hoskins and Himmelblau [8] and

Venkatasubramanian and Chan [9]. Watanabe et al. [10]

demonstrated the use of a two-stage neural network to add

information about the severity of the fault. More detailed

analysis regarding the learning, recall and generalization

characteristics of the method was given by Venkatasubramanian

et al. [11] and a large-scale application to a complex

chemical plant was demonstrated by Hoskins et al. [12].

However, these approaches are static in nature because the

neural networks are trained using only steady-state data. If

the steady-state operating conditions are changed, the network

must be retrained in order to work properly. Oftentimes, faster detection of the fault is required and it is necessary to

use transient data for this purpose. Dietz et al. [13] trained

the network by presenting dynamic data and Li et al. [14]

developed an approach using a moving time window. Ohga

and Seki [15] trained the network using a number of sets of

time series data.

The major motivation of this work is the use of artificial neural networks, capable of operating during process transients, for fault detection and diagnosis of industrial processes. The ultimate goal is to develop a general method that can be applied to a broad spectrum of industrial processes. The main feature of the proposed method is the rapid and robust detection of faults during transient periods of the process. Maintenance of the neural network should also be done with less effort. This paper is organized as follows: First, a model of a well-stirred tank is developed as a target process for this study. Second, the neural-network-based method is developed and implemented in a simulation environment. Third, a simulation study using the plant model is performed and the results of various tests are discussed. Finally, the proposed neural-network-based approach is compared with a conventional model-based method, the EKF.

1063-6536/97$10.00 © 1997 IEEE

Fig. 1. Process flow of well-stirred tank (Luyben model).

II. PLANT MODEL

In order to illustrate the method proposed in this work, a

target plant model is developed. A continuously well-stirred

tank reactor (CSTR) is used for all case studies in this work. In this section, the plant model and its implementation are

described in detail and the faults that are considered in the

study are introduced.

A. Description of the Plant

Fig. 1 illustrates the jacketed CSTR in which an irreversible

and exothermic reaction A → B takes place. The reactor

is operated by three control loops that regulate the outlet

temperature, the inlet flow rate of the reactant, and the tank level. A

cooling jacket surrounds the reactor and the coolant is water in

this case. Negligible heat losses, constant densities and perfect

mixing inside the tank are assumed. Therefore the temperature

in the jacket is uniform and equal to the coolant outlet temperature.

The process variables are as follows.

concentration of A at the inlet and outlet, respectively;

flow rate of the liquid at inlet and outlet, respectively;

flow rate of coolant;

volume of the tank;

temperature of the inlet reactant;

temperature of the outlet coolant (water);

temperature of the tank;

control valve openings.

The parameters, assumed to be constant, are as follows.

frequency factor;

activation energy;

gas constant;

volumetric heat capacity;

ΔH heat of reaction;

heat exchange area;

overall heat transfer coefficient;

area of the tank;

jacket volume;

heat capacity of water;

density of water;

temperature of the inlet coolant (water).

The equations describing the system are

Mass balance:

(1)

(2)

Energy balance:

H

(3)

(4)

The system is highly nonlinear. All equations and parameter values are taken from Luyben [16]. Moreover, this CSTR

model has been used in previous neural-network-based studies;

see [8] and [11].
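The bodies of (1)-(4) did not survive in this copy. As a hedged sketch only, the balances below follow the jacketed CSTR model as it is commonly written in Luyben's textbook. The symbol names are assumptions chosen to match the variable and parameter lists above: $C_{A0}$, $C_A$ for the inlet and outlet concentrations; $F_0$, $F$, $F_J$ for the flow rates; $T_0$, $T$, $T_J$, $T_{J0}$ for the temperatures; $\alpha$ for the frequency factor; and $\rho C_p$ for the volumetric heat capacity. The authors' exact notation may differ.

```latex
% Assumed reconstruction (standard Luyben form), not copied from the paper.
\begin{align}
\frac{dV}{dt} &= F_0 - F \\
\frac{d(V C_A)}{dt} &= F_0 C_{A0} - F C_A - V k C_A \\
\frac{d(V T)}{dt} &= F_0 T_0 - F T
  - \frac{\Delta H \, V k C_A}{\rho C_p}
  - \frac{U A_H (T - T_J)}{\rho C_p} \\
\frac{dT_J}{dt} &= \frac{F_J (T_{J0} - T_J)}{V_J}
  + \frac{U A_H (T - T_J)}{\rho_J C_J V_J}
\end{align}
% Arrhenius rate constant (assumed form): k = \alpha \exp(-E / (R T)),
% with \Delta H < 0 for the exothermic reaction.
```

The exponential dependence of $k$ on $T$ and the products of states such as $V k C_A$ are what make the system highly nonlinear.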

B. Implementation of the Plant Model

For computer simulation, the plant model is implemented

using Simulink in Matlab. The basic time unit is the hour.

The step size for Euler integration is denoted by dt and it is

usually 0.01 h.

A block diagram of each control loop is shown in Fig. 2.

Three PI controllers are used to regulate the outlet temperature,

the inlet flow rate of the reactant, and the tank level. Equal percentage valves are used to control the flow

rate of the reactant, coolant and outlet liquid. To simplify the

simulation, upstream and downstream pressures are assumed

to be constant. A first-order lag is used to model the actuator

and sensing dynamics.
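As a minimal illustration of fixed-step Euler integration with the 0.01 h step mentioned above, the sketch below integrates a tank volume balance, assumed here to be dV/dt = F0 − F for constant density. The function names and all numerical values (`F0`, `F`, `V0`) are hypothetical placeholders, not the parameters used in the paper.

```python
def euler_integrate(deriv, x0, dt, n_steps):
    """Integrate dx/dt = deriv(t, x) with the explicit (forward) Euler method."""
    t, x = 0.0, x0
    history = [(t, x)]
    for _ in range(n_steps):
        x = x + dt * deriv(t, x)  # one Euler step
        t = t + dt
        history.append((t, x))
    return history


# Hypothetical values for illustration only (not taken from the paper).
F0 = 40.0   # inlet flow rate, ft^3/h
F = 38.0    # outlet flow rate, ft^3/h
V0 = 48.0   # initial tank volume, ft^3
DT = 0.01   # step size in hours, as stated in the text


def tank_volume_rate(t, volume):
    """Assumed volume balance for a constant-density liquid: dV/dt = F0 - F."""
    return F0 - F


trajectory = euler_integrate(tank_volume_rate, V0, DT, 100)
final_time, final_volume = trajectory[-1]
```

With a constant net inflow, 100 steps of 0.01 h cover one hour of simulated time. The same loop structure applies when the derivative function returns the full set of CSTR balances.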

C. Faults Studied

Complex and frequently observable faults are selected for

this study as listed in Table I. Not all possible faults are

included. However, three different kinds of faults that affect

the sensor, actuator, and process are considered.



MAKI AND LOPARO: FAULT DETECTION AND DIAGNOSIS IN INDUSTRIAL PROCESSES 531

Fig. 2. Block diagram of the control loop.

TABLE I
LIST OF FAULTS STUDIED

III. NEURAL-NETWORK-BASED METHOD

A. Introduction

In recent years artificial neural networks have generated

considerable interest in the field of engineering as problem

solving tools. The fundamental element is a neuron which has

multiple inputs and a single output. Each input is multiplied by

a weight, the inputs are summed and this quantity is operated

on by the transfer function of the neuron to generate the output. The output is sometimes referred to as an activity level.

In this study, the multilayer feedforward neural network that

has one hidden layer is used. The bias unit, whose activity

level is fixed at one, is connected to all neurons in the hidden

and output layer to adjust the weighted sum input of each

neuron. The number of neurons in the input and output layer

is determined by each application, and the number of neurons

in the hidden layer must be adjusted during the learning phase

so that the network can be trained efficiently. The activity level

of the jth neuron is obtained as

$$a_j = f_j(n_j), \qquad n_j = \sum_i w_{ji}\, a_i + b_j \tag{5}$$

where

$a_j$ activity level (output) of the jth neuron;

$n_j$ input to the jth neuron;

$f_j$ the transfer function of the jth neuron;

$w_{ji}$ connection weight from the ith neuron to the jth neuron;

$a_i$ activity level of the ith neuron in the prior layer;

$b_j$ connection weight from the bias unit to the jth neuron.

The log-sigmoid function is used as the transfer function in

this study.
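The neuron computation described above (weighted sum of inputs plus a bias weight, passed through a log-sigmoid) can be sketched as follows. This is an illustrative sketch, not the authors' implementation. The function names are hypothetical, and the bias unit is modeled as a constant activity of one multiplied by a bias weight, as the text describes.

```python
import math


def logsig(x):
    """Log-sigmoid transfer function; output lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, bias_weights):
    """Compute the activity levels of one layer.

    weights[j][i] is the connection weight from neuron i of the prior layer
    to neuron j; bias_weights[j] is the weight from the bias unit (whose
    activity level is fixed at one) to neuron j.
    """
    outputs = []
    for w_row, b in zip(weights, bias_weights):
        net = sum(w * a for w, a in zip(w_row, inputs)) + b * 1.0
        outputs.append(logsig(net))
    return outputs


def network_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass through a network with one hidden layer."""
    hidden = layer_forward(x, w_hidden, b_hidden)
    return layer_forward(hidden, w_out, b_out)
```

Because every output is produced by the log-sigmoid, each activity level can be read directly as an extent between zero and one.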

The backpropagation algorithm [17] is used to train the network.

The connection weights, including those from the bias unit, are adjusted so that the average squared error between the network output

and the desired output (target) for a given reference input is

minimized. Learning continues iteratively until the sum of the

squared error is below a certain goal. The incremental change

of weight from the ith neuron to the jth is computed by

(6)

(7)

(8)

where

incremental change in the weight at time t;

desired output of the jth neuron in the output

layer;

learning rate (usually a constant);

momentum (usually a constant).

Equation (7) holds for the jth neuron in the output layer, and (8) holds for the jth neuron in the hidden layer. In (6),

the learning rate and the momentum are adjustable parameters. In order to accelerate the

learning, the following methods are applied in this study.

1) The second term on the right-hand side of (6) is added

to the original update term to improve the learning [17].

Momentum is the key parameter here and it is set at

0.95 in this study.

2) The adaptive learning rate [18], which attempts to keep

the learning rate as large as possible while maintaining

the stability of the learning process, is also used. This

has a significant effect on convergence of the weights.
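Equations (6)-(8) are not legible in this copy, so the sketch below assumes the standard generalized delta rule with momentum: the weight change is the learning rate times the error term of the receiving neuron times the activity of the sending neuron, plus the momentum times the previous weight change. The error terms use the log-sigmoid derivative a(1 − a). All function names are hypothetical, and the adaptive learning rate of [18] is not modeled.

```python
def output_delta(target, activity):
    """Error term for an output-layer neuron (assumed form of (7)).

    Uses the log-sigmoid derivative a * (1 - a).
    """
    return (target - activity) * activity * (1.0 - activity)


def hidden_delta(activity, downstream_deltas, downstream_weights):
    """Error term for a hidden-layer neuron (assumed form of (8)).

    downstream_weights[k] is the weight from this hidden neuron to
    output neuron k, whose error term is downstream_deltas[k].
    """
    back = sum(d * w for d, w in zip(downstream_deltas, downstream_weights))
    return activity * (1.0 - activity) * back


def weight_change(delta_j, activity_i, prev_change, eta, alpha=0.95):
    """Incremental weight change with momentum (assumed form of (6)).

    The second term is the momentum term; alpha = 0.95 is the value
    quoted in the text.
    """
    return eta * delta_j * activity_i + alpha * prev_change
```

For example, an output neuron with activity 0.5 and target 1 has an error term of 0.5 × 0.25 = 0.125. With a momentum as high as 0.95, most of each step is carried over from the previous step, which smooths the weight trajectory.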




Fig. 3. Schematic diagram of fault detection and diagnosis system.

Fig. 4. Three training patterns for primary neural network.

B. Design of the Neural-Network-Based Detection System

1) Basic Concept: The capability of neural-network-based

methods for fault detection has been established in previous

works. The particular goals of this study are as follows.

1) Preknown faults should be detectable by the neural-network-based system. Unknown operating conditions should not generate a false alarm. In other words, the system is designed to detect faults that have occurred in the past and to be robust to unmodeled operating conditions.

2) The transient state of the fault can be detected dynamically. No steady-state values of process variables are required as parameters in the design of the detection system.

3) Detection is expected to be fast, reliable, and robust to noise.

4) The method should be applicable to various industrial processes with little additional effort, and adjustments to the parameters and network structure should be easy to make.

2) Basic Structure: Fig. 3 depicts the basic structure of the

fault detection and diagnosis system developed in this work.

A two-stage neural-network system is proposed to improve flexibility and applicability to other industrial processes. The

first stage network is referred to as the primary neural network

and the second stage network is referred to as the secondary

neural network. Each primary neural network corresponds to

a channel of measured data and is used to classify the trend of that channel as increasing, decreasing, or steady,

with numbers that indicate the extent of such changes. Therefore,

the primary neural network can be designed independently

from the secondary neural network. Furthermore, we do not

have to design more than two primary neural networks, even

for multiple measurements, because the same network can be

applied to different measurement channels. The primary neural

network eliminates the need for additional input neurons to

capture the dynamic aspects of the data; see Li et al. [14].

The moving time window technique as described in [14] is

used and a delay unit eliminates the effect of plant fluctuations

and accommodates differences in the response time for

different measurement channels. A reset and restriction rule

is used to reduce the probability of false alarm. Details are

presented later.

3) Design of Primary Neural Network: As mentioned above, this network is designed to be used with the various observations that are available. Observation histories are categorized into three types of behavior: increasing, decreasing, and steady, and this network is trained to give this type of trend information including the extent of change. We assume that the measured data is normalized to the range [−1, 1] before it is used as an input to the network.

We begin with data obtained by periodic sampling (21 samples) from a single measurement source. After normalization, a vector of 21 elements is given to the network as an input. A feedforward type network is used. The number of units in the input layer is 21 and the number of units in the output layer is three. The activity level of each output unit, defined to be between zero and one, corresponds to the extent of increase, decrease, and steadiness of the input, respectively. These three activity levels are reported for each of the following examples. The number of hidden units is adjustable, and after some trial and error during the learning phase it is set to 15. The superior feature of the feedforward type network is that its output can include information about both the direction and the extent of change as mentioned above.

Training is performed by presenting the three target patterns given in Fig. 4. Fig. 5 shows examples of how well the network can generalize. Two different sets of noisy data, marked by x, are presented to the network and the recognized output values are given under each graph. The extent of increase in case (a) is greater because the value of the increase output is larger. On the other hand, the extent of steadiness in case (b) is greater because the value of the steady output is larger.
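The shapes of the three training patterns in Fig. 4 are not recoverable from this copy. As a sketch under that caveat, the code below assumes the simplest choice: a linear ramp from −1 to 1 for "increase", the reverse ramp for "decrease", and a flat line at zero for "steady". Each has 21 samples and is paired with a target of one for the matching output and zero for the others. The names `ramp` and `TRAINING_SET` are hypothetical.

```python
N_SAMPLES = 21  # number of input units of the primary network (from the text)


def ramp(start, stop, n=N_SAMPLES):
    """Return n uniformly spaced values from start to stop inclusive."""
    step = (stop - start) / (n - 1)
    return [start + k * step for k in range(n)]


# (input pattern, target output) pairs; output order is
# (increase, decrease, steady). The pattern shapes are assumptions.
TRAINING_SET = [
    (ramp(-1.0, 1.0), [1.0, 0.0, 0.0]),    # increasing trend
    (ramp(1.0, -1.0), [0.0, 1.0, 0.0]),    # decreasing trend
    ([0.0] * N_SAMPLES, [0.0, 0.0, 1.0]),  # steady trend
]
```

A network trained on such idealized patterns is then expected to return intermediate activity levels for noisy or partial trends, which is the generalization behavior illustrated in Fig. 5.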

4) Design of Secondary Neural Network: The secondary

neural network receives the outputs from the primary neural

networks and produces information about the faults. A

conceptual diagram of a two-stage network system is depicted

in Fig. 6. This network must be designed and trained to satisfy

the particular requirements of each application problem. A feedforward type network that can be trained using preknown information is also appropriate for this case. If the number of sensors used for detection is n and the number of faults to be detected is m, the secondary neural network has 3n neurons in the input layer and (m + 1) neurons in the output layer. The number of units in the hidden layer is adjustable, and 15 is chosen for this paper by experimental trial and error. The transfer function chosen is of the log-sigmoid type. This is a reasonable choice because the extent of each fault can be represented by a number between zero and one.

Fig. 5. Examples of given input and recognized output. (a) Input: typical increase. (b) Input: slight increase.

For the CSTR plant model, eight variables are assumed to be measurable.

In actual processes, it is very difficult to measure concentration continuously. Hence, the inlet and outlet concentrations of substance A are assumed not to be measurable. The number of input neurons of the secondary network is 3 × 8 = 24.

For the training of the network, target patterns must be

set beforehand. According to the faults defined in Table I,

Table II gives the 12 sets of target patterns used for this study.

Each column corresponds to one specific fault and each row

corresponds to a neuron in the input layer. The values in

each column of the table are used as a reference input to the

network for each of the faults to be learned. Any combination

of faults can be chosen and the number of output neurons

is so determined. As the target patterns, the value of the

corresponding output neuron is set to one and the value of

other outputs is set to zero.

Fig. 6. Conceptual diagram of two-stage neural network.

Fig. 7. Moving time window that traces the dynamic data.

These targets for the network can be determined empirically by carefully investigating the faults that have occurred in the past, but fine adjustment may be necessary in order that the network generalizes satisfactorily. Details will be discussed later.
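The bookkeeping for the secondary network can be sketched as follows: the input vector is the concatenation of the three outputs of each primary network (24 values for eight channels), and each training target sets one output neuron to one and the rest to zero, with the last neuron reserved for normal operation. The helper names and the label strings are hypothetical. The ordering of the output neurons is an assumption, chosen to match the later example in which the fifth neuron represents normal operation after training on four faults.

```python
def secondary_input(primary_outputs):
    """Concatenate per-channel (increase, decrease, steady) triples."""
    vector = []
    for triple in primary_outputs:
        if len(triple) != 3:
            raise ValueError("each primary network must supply three outputs")
        vector.extend(triple)
    return vector


def one_hot_targets(fault_labels):
    """Map each fault label, plus 'normal', to a one-hot target vector."""
    labels = list(fault_labels) + ["normal"]  # normal neuron placed last (assumed)
    size = len(labels)
    return {
        label: [1.0 if k == idx else 0.0 for k in range(size)]
        for idx, label in enumerate(labels)
    }


# Eight measurement channels, each summarized by one primary network.
example_input = secondary_input([[0.1, 0.2, 0.7]] * 8)
example_targets = one_hot_targets(["#1p", "#1n", "#2p", "#2n"])
```

Here `example_input` has 24 elements and each target vector has five, matching the 3n inputs and m + 1 outputs implied by the text for n = 8 and m = 4.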

5) A Moving Time Window and Normalization: A moving time window is an indispensable technique to track dynamic data and detect the transient state of faults. As shown in Fig. 7, the window moves forward at each time increment dt. The right side of each window corresponds to the current time, the time span of the window is N dt, and the number of samples is N + 1. The window length N is adjustable for each application. For this study, only three different lengths are used: 20, 50, and 100. The vertical window height H must be specified according to the range of each measurement and the amount of change in the measurement signals that can be caused by the faults. Table III gives the values of N and H for each window used in this study. Using uniformly spaced samples of a time series of data, the window calculates the average of the samples and rescales the vertical axis. The average value is set equal to zero, the upper value is set to one, and the lower value is set to −1 using H. Values that exceed one are set equal to one and values below −1 are set equal to −1. Finally, the output of the moving window is used as the input to the primary neural network.
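The window normalization described above can be sketched as follows: subtract the window average, scale by the window height, and clip to [−1, 1]. Two details are assumptions of this sketch rather than statements from the paper: the height is treated as the deviation from the average that maps to ±1, and longer windows are reduced to 21 points by picking uniformly spaced indices. The function names are hypothetical.

```python
def normalize_window(samples, height):
    """Center a window on its average, scale by height, clip to [-1, 1].

    Assumption: a deviation of +height from the average maps to +1.
    """
    mean = sum(samples) / len(samples)
    normalized = []
    for x in samples:
        scaled = (x - mean) / height
        normalized.append(max(-1.0, min(1.0, scaled)))
    return normalized


def subsample(window, n_out=21):
    """Pick n_out uniformly spaced samples from a longer window (assumed scheme)."""
    last = len(window) - 1
    return [window[round(k * last / (n_out - 1))] for k in range(n_out)]


# Example: a window of 101 samples reduced to the 21 inputs of the primary network.
raw = [100.0 + 0.05 * k for k in range(101)]
primary_input = normalize_window(subsample(raw), height=2.5)
```

Because only deviations from the window average are used, the steady-state value of the measurement never enters the network input, which is why no steady-state values are needed as design parameters.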

In adjusting the window length N, there is a tradeoff between prompt detection

and disturbance rejection. By increasing N, the network is

unlikely to be affected by plant disturbances. However, the

detection can become insensitive to the measurements and the

response of the detection system can also become sluggish.




TABLE II
TRAINING PATTERNS FOR THE SECONDARY NEURAL NETWORK

TABLE III
HEIGHTS OF EACH WINDOW (NOTE: dt = 0.01 h)

TABLE IV
TIME CONSTANTS OF DELAY UNITS

Normalization of the inputs to the primary network is

automatically conducted by this moving time window; each

output of the primary network is between zero and one

and consequently, normalization for the secondary network

is also accomplished within the primary network. Although

the parameters shown in Table III must be adjusted for every application, knowledge of the steady-state values for each

measurement is not necessary. Even though the steady-state

operating conditions of the plant are likely to change, the

frequency at which network parameters require readjustment

should be low.

6) Delay Unit, Reset, and Restriction Rules: Some supplementary

functions of the neural-network-based method are

briefly discussed in this section.

When multiple observations are necessary to detect a certain

fault, the response time for each observation may be different

because of the plant characteristics. In addition, some observations

may be contaminated by noise, and the intensity of

this sensor noise may be different for each measurement. In

order to accommodate the different response characteristics

for multiple data or to remove the effects of noise, a first-

order lag (delay unit) is incorporated into the detection system.

The sensitivity of the detection system to changes in the parameters of the delay unit should be evaluated with the

width of the window fixed and training patterns specified. The

time constants chosen for this study are shown in Table IV.

Note that if the window length is chosen to be large enough, and

the time constant tau of the coolant flow rate delay unit is

either 0.01 or 0.10, the network does not yield a false alarm

from unmodeled disturbances considered in this study. For

particular applications, incorporating a pure dead-time delay

in the network could also be effective.
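A first-order lag of the kind used for the delay unit can be sketched in discrete time as below. The forward Euler discretization and the function name are assumptions of this sketch. The two time constants in the usage lines, 0.01 and 0.10, are the values mentioned in the text, combined with the 0.01 h step size.

```python
def first_order_lag(signal, tau, dt=0.01):
    """Filter a sampled signal with the lag tau * dy/dt + y = u.

    Discretized with forward Euler (an assumption of this sketch):
    y[k+1] = y[k] + (dt / tau) * (u[k] - y[k]).
    The recursion is well behaved for dt <= tau.
    """
    y = signal[0]  # start at the first sample to avoid an initial transient
    filtered = []
    for u in signal:
        y = y + (dt / tau) * (u - y)
        filtered.append(y)
    return filtered


step_input = [0.0] + [1.0] * 300
slow = first_order_lag(step_input, tau=0.10)  # smoothed step response
fast = first_order_lag(step_input, tau=0.01)  # dt == tau: follows the input at once
```

A larger time constant suppresses noise and short fluctuations more strongly, at the cost of delaying the trend seen by the primary network.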

A key feature of the detection system developed in this work

is the detection of faults during transient operating conditions.


TABLE V
RESULTS OF TRAINING AND RECALL (MULTIPLE FAULTS)

Because the detection system does not include information on

normal steady-state values, it cannot determine if the plant is

in a normal steady state or in an abnormal steady state. If all

observations are steady during a fault condition, it is possible

that the detection system can misdiagnose the situation and

conclude that the plant is normal. Hence, it is necessary that

the detection system is manually reset (reinitialized) after a fault is detected and the operator concludes that the plant

has resumed normal steady-state operation. This is not going

to be a problem in practical implementations because once

the system detects a fault, the alarm will be kept until it is

manually reset.

The outputs of the network have values between zero and

one and a threshold value of 0.9 is used in this study to set

the alarms for the operator.

From Table II, we notice that most faults are enumerated in pairs that have opposite directions. For example, if we want to detect fault #8p, it is natural to train the network with the target patterns of faults #8p and #8n. Because many of the process variables have second-order dynamic response characteristics, a false alarm of #8n is likely to occur after the correct detection of fault #8p. However, the probability that the opposite fault actually occurs right after the first one is quite low in actual applications. Therefore, the fault detection system is designed and implemented in this work in such a way that if a fault is detected, a fault with opposite direction to the fault detected is not considered until after the system is reset manually.
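The threshold, latching, and restriction rules described above can be sketched as a small state machine. This is an illustrative sketch, not the authors' implementation. The class and function names are hypothetical, the label format ('#8p' and '#8n') follows the fault names in the text, and an output equal to the threshold is treated as firing, which is an assumption.

```python
THRESHOLD = 0.9  # alarm threshold stated in the text


def opposite(label):
    """Return the paired fault label, e.g. '#8p' -> '#8n'."""
    if label.endswith("p"):
        return label[:-1] + "n"
    if label.endswith("n"):
        return label[:-1] + "p"
    return None


class AlarmLatch:
    """Keeps alarms until a manual reset and suppresses opposite-direction faults."""

    def __init__(self, threshold=THRESHOLD):
        self.threshold = threshold
        self.active = set()

    def update(self, fault_outputs):
        """fault_outputs maps fault labels (normal neuron excluded) to activity levels."""
        for label, level in fault_outputs.items():
            if level < self.threshold:
                continue  # neuron has not fired
            if opposite(label) in self.active:
                continue  # restriction rule: ignore the opposite direction
            self.active.add(label)  # latch the alarm
        return sorted(self.active)

    def reset(self):
        """Manual reset by the operator clears all latched alarms."""
        self.active.clear()


latch = AlarmLatch()
first = latch.update({"#8p": 0.95, "#8n": 0.10})
second = latch.update({"#8p": 0.20, "#8n": 0.97})  # overshoot after #8p
```

In this usage, the second update leaves only '#8p' latched: the alarm persists although its output has dropped, and the '#8n' output caused by the second-order response is ignored until `reset()` is called.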

IV. SIMULATION RESULTS

The proposed fault detection and diagnosis system is expected

to recall pretrained faults correctly. Also, it should generalize appropriately even from distorted or noisy input

data. From a different point of view, it is also desired that

the neural network can be trained to detect multiple faults, as many as possible. In this section, the capabilities and

limitations regarding these requirements are examined and

discussed using the CSTR simulation.

A. Recall to Trained Faults

Basically, if the training of the secondary neural network

is accomplished within the time allocated for training and if

the error goal is achieved, recall should not be a problem. The

error goal of the backpropagation algorithm is set at 1 10

for the following experiments.

For a single fault, it is necessary to train the network

by presenting the target pattern of the fault and the normal

condition. Learning a faulty pattern along with the pattern of

normal operation is important to achieving correct recall. If

the network is trained only using the target pattern of a single fault, then only a single neuron in the output layer is fired

for any input. Because most faults are considered in pairs

as mentioned previously, from a practical point of view, it

is recommended that the network be trained with the normal

pattern and at least one pair of faulty patterns.

For multiple faults, target patterns should include the faults

and the pattern for normal operation. Recall ability of the

network is investigated by presenting both the normal pattern

and one pair of faulty patterns. Then, by augmenting the

number of faults, training and recall capabilities are also

investigated. Results are summarized in Table V. For example,

suppose the network is trained using five sets of data that represent the faults #1p, #1n, #2p, #2n and normal operation.

Fig. 8 shows an example of recall by presenting the data of

fault #1p. A symptom of the fault started at time 2.00. The

first neuron that corresponds to the fault #1p is fired promptly

at time 2.20, detecting the change that occurred in the coolant flow rate and

the control valve opening. Note that the fifth neuron, which corresponds to the normal

operation also responds but no false alarm occurs. The alarm

comes before the plant stabilizes to the faulty steady condition

at about time 2.60. Therefore, detection by this method is faster

than the conventional method that uses steady-state data.

From the results shown in Table V, this fault detection and

diagnosis system can also be trained using multiple faults

within an allowable time period. However, as the number of faults increases, trapping in a local minimum in the error

surface is more likely to happen. This actually occurred when

we attempted to simultaneously train 12 patterns. Randomized

network weights and biases at the beginning of training can

help mitigate this problem.

As Table V shows, the correct recall rate decreases as the number of trained patterns increases. One reason is that there are faults that are similar to each other, such as faults #1n and #2p. A network trained using many faults is likely to give a false alarm by confusing such similar patterns. Furthermore, right after the occurrence of fault #2p, #2n, #6p, or #6n, the coolant flow rate fluctuates for a while. This can also initiate a false alarm of fault #3p/#3n or #5p/#5n because the coolant flow rate is also the key observation for these fault pairs. By updating the error goal to a smaller value, the occurrence of such false alarms can be reduced, but not eliminated entirely.

TABLE VI
GENERALIZATION RESULTS FROM TRAINED DATA WITH NOISE

Fig. 8. An example of correct recall.

There is no distinct limit regarding the number of patterns

to be trained. But for this particular application, we found that it is better to train the network using fewer than ten patterns.

Of course, this will vary from application to application, and

this limitation must be discovered as a part of the learning

and training process.

B. Generalization from Untrained Faults, Input with Noise, and Faults with Different Severity

How does the secondary neural network respond to un-

trained faults? How does it work with noise-corrupted input

data or faults with different severity? These generalization

issues are examined next. Furthermore, this section contains

a discussion of the applicability of the proposed system to

real-world industrial processes.

First, the response of the detection system to an input that

represents an untrained fault is investigated. According to

the experiments performed with many pretrained networks,

untrained faults were diagnosed as the normal operating condition.

In the hyperspace spanned by the input vectors, an

arbitrary input vector lies closest to the vector of normal

operation. Arbitrary input vectors can be considered to be

outputs of the primary network representing unknown faults.

Therefore, the network cannot generalize to this situation, but

this is consistent with our objectives for the design of the

detection system. As depicted in Fig. 6, the output neuron

for the normal condition should be referred to as normal or

untrained faults.

When the set point of the controller is changed, an unsteady

operating condition is generated in the plant, but this is still

normal operation. If this response is not similar to one of

the trained faults, the network diagnoses a normal operating

condition, similar to the case of an untrained fault. However,

for example, fault #1p and #1n are quite similar to the response

of the plant to a set point change of the inlet flow rate. It is

quite likely that a false alarm is generated during such set point

changes. As these set point changes are commonly initiated by

an operator, the detection system should be temporarily disabled

while the plant is in such a transient state.

Second, noisy inputs of pretrained faults are given to the network. White noise with a normal distribution [N(0, 1)] is multiplied by a constant and added to each measurable variable. The control actions of the three controllers are also affected by the noise processes. The noise level is expressed as a percentage that represents the ratio of the standard deviation of the noise term to the amount of change caused by the fault. Let us consider the same example as given in Fig. 8, in which the change of the coolant flow rate is about 4 ft³/h and the change of the control valve opening is about 8.1% of total stroke. If the standard deviation of the noise term of the coolant flow rate is 0.55 ft³/h, the noise level of the coolant flow rate is 0.55/4 × 100 ≈ 13.8%. If the standard deviation of the noise term of the control valve opening is 0.56%, the noise level of the control valve opening is 0.56/8.1 × 100 ≈ 6.9%.

For simplicity, the noise levels of the temperature and level measurements are kept constant. Other noise levels are altered as in Table VI, which also shows the results of this experiment.

It is obvious that the more noise that is added the more difficult

it is to detect faults. However, the network performance

demonstrates that it is adequately robust to noise. Fig. 9

illustrates an example where fault #1p is correctly generalized

from a faulty input with noise. Because the threshold level of

firing each neuron is set at 0.9, the system can detect this fault

at time . Further discussion of noise is given in the

next section.
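The noise-level definition used in this experiment (standard deviation of the noise divided by the change caused by the fault, expressed as a percentage) and the noise injection itself can be sketched as follows. The function names are hypothetical, and the random generator seed is arbitrary. The numbers in the usage lines are the ones quoted in the text for the Fig. 8 example.

```python
import random


def noise_level_percent(noise_std, fault_change):
    """Ratio of the noise standard deviation to the fault-induced change, in percent."""
    return 100.0 * noise_std / abs(fault_change)


def add_white_noise(signal, noise_std, rng):
    """Add N(0, 1) white noise scaled by noise_std to every sample."""
    return [x + noise_std * rng.gauss(0.0, 1.0) for x in signal]


# Values quoted in the text for the Fig. 8 example.
coolant_level = noise_level_percent(0.55, 4.0)  # coolant flow rate, ft^3/h
valve_level = noise_level_percent(0.56, 8.1)    # valve opening, % of stroke

rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
noisy = add_white_noise([40.0] * 21, 0.55, rng)
```

Expressing the noise relative to the fault-induced change, rather than to the full measurement range, makes the levels comparable across channels with very different spans.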




give and

(14)

(15)

where

(16)

(17)

and are used to compute the state estimate and

error covariance at the next time step.

B. Application of EKF to the CSTR

Define the state vector

(18)

where it is assumed that and are measurable. It

follows that , , , ,

and . From the assumption, and

are measurable and the others are not. Usually and are

treated as inputs in such a state space representation. However,

in this application they are defined as variables because they

must be estimated.

For computational reasons, the system equations are modified

to be dimensionless. The normalized state variables in

deviation form are defined as shown in (19) at the bottom of

the page, where (ft ), , ,

, (ft /h) and (ft /h). These

are the steady-state values. Hence, ,

, , ,

and . From here, the * is

omitted to simplify the notation and because they are defined

as deviation variables, the initial state is given as .
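The effect of working with normalized deviation variables can be illustrated with a short sketch. It assumes the common scaling (x − x_ss)/x_ss, which may differ from the exact form of (19); the steady-state values below are hypothetical placeholders, not the values used in the paper.

```python
def to_normalized_deviation(x, x_ss):
    """Map physical variables to dimensionless deviations from steady state."""
    return [(xi - si) / si for xi, si in zip(x, x_ss)]


def from_normalized_deviation(z, x_ss):
    """Inverse map, from dimensionless deviations back to physical variables."""
    return [si * (1.0 + zi) for zi, si in zip(z, x_ss)]


# Hypothetical steady-state values, for illustration only.
x_ss = [48.0, 0.25, 600.0]

# At steady state every deviation variable is zero, so the initial state is the zero vector.
z0 = to_normalized_deviation(x_ss, x_ss)

# A perturbed operating point survives the round trip.
x = [50.0, 0.30, 610.0]
x_back = from_normalized_deviation(to_normalized_deviation(x, x_ss), x_ss)
```

With this scaling all state variables are of comparable magnitude, which is one reason such a normalization helps numerically.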

From (1)–(4), the system equations of the CSTR are written

as shown in (20) at the bottom of the page, where

(21)

represents the modeling uncertainty.

From the assumption on measurability, the output equation is

written below, where the observation matrix is denoted by

(22)

Here

(23)

and is the measurement noise vector.

Examining the parameters given in (21), we notice that there

are variables that require estimation. For example, the coolant

flow rate, , and the reactant concentration at the inlet, ,

can change during the operation of the process. In a fault mode,

parameters such as , , and can also vary. The reason for

(19)

(20)




the choice of the system state as given in (18) is discussed in

the next section.

C. Observability

The model (20) has unmeasured state variables and parameters

that can change during different operating modes of the

process. If parameters such as , , and can be estimated,

it is very helpful for fault detection and diagnosis. However, because they are not measurable directly and are necessary for

the detection of certain faults, the state vector is augmented to

include these variables and parameters to be estimated by the

EKF. The augmented state must be observable, otherwise it

is meaningless to incorporate these variables and parameters

into the model. Various realizations, other than the system

representation given in the previous section, were considered.

Unfortunately, all other realizations that were tried resulted in

an unobservable realization.
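Whether an augmented realization is observable can be screened with a rank test on a linearized model. The sketch below is an illustration under stated assumptions: it uses the standard linear observability matrix for a hypothetical two-state model (one measured state plus one augmented constant parameter), not the CSTR model (20), and for a nonlinear system the result is only local to the linearization point.

```python
def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def rank(m, tol=1e-9):
    """Matrix rank by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in m]
    n_rows, n_cols = len(m), len(m[0])
    r = 0
    for c in range(n_cols):
        pivot = max(range(r, n_rows), key=lambda i: abs(m[i][c]))
        if abs(m[pivot][c]) < tol:
            continue  # no usable pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, n_rows):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == n_rows:
            break
    return r


def observability_matrix(a, c):
    """Stack C, CA, ..., CA^(n-1) row-wise."""
    rows, block = [], [row[:] for row in c]
    for _ in range(len(a)):
        rows.extend(block)
        block = mat_mul(block, a)
    return rows


def is_observable(a, c):
    return rank(observability_matrix(a, c)) == len(a)


c_meas = [[1.0, 0.0]]                    # only the first state is measured
a_coupled = [[0.9, 1.0], [0.0, 1.0]]     # parameter drives the measured state
a_decoupled = [[0.9, 0.0], [0.0, 1.0]]   # parameter has no effect on the measurement

print(is_observable(a_coupled, c_meas), is_observable(a_decoupled, c_meas))
```

In the second case the augmented parameter never influences the measured output, so including it in the state vector would serve no purpose for estimation.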

D. EKF Tuning and Simulation for Fault Detection

The parameters of the EKF are , , and , which are

assumed to be diagonal matrices. Also, the initial state must

be specified. As described in (9) and (10), and can

be determined from the known noise covariances. After that,

however, further tuning by trial and error is always necessary

to achieve stable and accurate estimation. Each element of

is inversely proportional to the gain matrix . Hence, smaller

elements are chosen as long as the estimation process is stable.

As each element of increases, faster response is obtained

but the amplitude of fluctuation increases. Conversely, the

smaller each element of is, the slower the response and the

smaller the fluctuations. The elements of that correspond to

unmeasurable states need to be chosen so that the estimated

states track the true values. As long as is given as , and and are chosen appropriately, it is not necessary

to adjust . This delicate balancing of estimator parameters

and performance is similar to the situation we discussed

earlier regarding the influence of window height on network

performance for different fault severity scenarios.
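The trade-off between response speed and fluctuation can be illustrated with a scalar Kalman filter. This is a minimal sketch, assuming a random-walk state with process noise variance `q` and measurement noise variance `r`; these symbols and the chosen values are assumptions for illustration, not the tuning used for the CSTR.

```python
def steady_state_gain(q, r, iterations=500):
    """Iterate the scalar Riccati recursion for x[k+1] = x[k] + w, y[k] = x[k] + v."""
    p = 1.0
    gain = 0.0
    for _ in range(iterations):
        p_pred = p + q                   # time update of the error variance
        gain = p_pred / (p_pred + r)     # filter gain
        p = (1.0 - gain) * p_pred        # measurement update of the error variance
    return gain


# The gain grows with the process noise variance: faster response, noisier estimate.
gains_vs_q = [steady_state_gain(q, r=1.0) for q in (0.001, 0.01, 0.1, 1.0)]

# The gain shrinks as the measurement noise variance grows: slower, smoother estimate.
gain_small_r = steady_state_gain(0.1, r=1.0)
gain_large_r = steady_state_gain(0.1, r=10.0)
```

A gain near one makes the estimate follow each measurement closely, noise included, while a gain near zero relies mostly on the model prediction.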

Two faults, #3p and #5p, are selected for the case study

to compare the performance of the EKF to the performance

of the neural-network-based approach. For #3p, the unmeasurable

state, , rises due to the sticking of the control valve. For #5p,

the constant, , rises and consequently the unmeasurable state

increases. Hence, the estimates of or

by the EKF are used to detect these faults.

The square root of the covariance matrix is given as

The square root of the covariance matrix is given as

The covariance of the initial state estimation error is given

as

Fig. 11. Results of fault #5p.

Computer simulations for the two faults are performed under

the above conditions.

Fault #3p: For fault #3p, after adjustment and of

the EKF are

Results of estimated states are shown in Fig. 10. The change

of coolant flow rate affects the accuracy of the plant model. Nevertheless, the estimates of and track

the actual values very well. The fault is detected by trending

of the estimated states and .

Fault #5p: For fault #5p, and are adjusted to be

Results of the estimated states are shown in Fig. 11.

Despite the intensive effort to search for the optimal

and , does not follow the actual value. The

coolant flow rate and the inlet concentration change

due to the fault and affect the accuracy of the model. The

plant model with degraded accuracy hinders the estimation

and proper tracking. Hence, the fault is not detectable.

E. Comparison of the Two Methods

Following the above results, the EKF and the

neural-network-based method are compared for the two faults

#3p and #5p. Five different levels of noise are

applied, and the performance at different S/N ratios is compared.

As the noise level of fault #3p increases, false alarms are

likely to happen and tuning of the parameters is necessary

for more than half the cases. For all cases of fault #5p, the

EKF does not work well even though extensive efforts were

taken to tune the parameters. We conclude this section with the following comments regarding the comparison.

1) A model-based approach such as EKF significantly

depends on the validity of the model. The performance

of a model-based system is easily degraded by unmodeled

disturbances such as measurement or process noise

caused by perturbations or other malfunctions of the

plant.

2) Use of the EKF is limited by observability of the realization,

including unmeasurable states and parameters. If

the system is not observable, we must look for a reduced

set of states and/or parameters that are observable.




Reducing the state variables makes the model vulnerable

to uncertainty.

3) Parameters of the EKF need to be adjusted every time

the noise level changes. Parameters of the neural-

network-based approach are generally more robust in

this sense.

4) The advantage of the model-based method is that it

can estimate unmeasurable parameters, if they are observable.

On the other hand, the neural-network method

must rely on the measurable information and can best

correlate with preknown faults through training.

VI. CONCLUSIONS

A neural-network-based fault detection and diagnosis system

is developed and applied to a plant model of a CSTR.

We summarize and review the main results of this work,

evaluate the neural-network-based approach for fault detection,

and discuss the applicability to general industrial processes.

A nonlinear CSTR was chosen as the study system. A

plant model was developed and implemented for computer

simulation. The following results were obtained from the case

studies conducted using the CSTR model.

1) A two-stage neural-network system, where each element

can be designed independently, has been proposed. Using

data preprocessed by the moving time window, the

primary network detects the transient state of each measurement

dynamically, and this architecture can be used

for many industrial process applications. A secondary

neural network was developed to detect and diagnose a

set of preknown faults according to the specification of

the application. The two-stage network approach yields

an efficient and simplified design procedure.

2) For the training of the neural networks, the backpropagation

algorithm was chosen. This approach is considered

to be better for training feedforward neural networks, as

opposed to Hopfield networks. Combined with momentum

and the adaptive learning rate method, training of

the neural networks was performed very efficiently.

3) The secondary neural network can be trained for multi-

ple faults as long as all the patterns differ from each

other. It can also recall the trained faults correctly.

However, the more patterns that are trained, the more

likely it is to be trapped in a local minimum. Moreover,

similar patterns of faults or perturbations that occur after

a fault are likely to produce a false alarm.

4) The neural-network-based system can detect trained

faults promptly during the transient period. It is faster

than the method trained using steady-state data. It classifies

untrained faults as the normal operating condition, as

desired. When a set point of a controller is changed dur-

ing normal operation of the plant, the detection system

diagnoses that the plant is normal unless the response of

the plant induced by the change in set point is similar to

that of the trained faults. Because an operator should be

aware of either a set point change or the occurrence of a

measurable disturbance, in these situations the detection

system should be turned off temporarily until the process

returns to a normal operating state.

5) The secondary neural network can generalize from faulty

data with noise if the amplitude of the noise is within

certain bounds. It can also generalize from faulty data

with different severity, unless the severity is much

smaller than the window height.

6) A conventional model-based approach, the EKF, is chosen

for this study. Its performance strictly depends on the

accuracy of the model and its applicability is restricted

by observability of the realization. Compared with the

EKF model-based approach, the neural-network-based

approach is more robust with respect to noise. Generally,

we do not have to change the parameters of the neural-

network-based method for different faults and different

noise levels.

7) The secondary neural network must be designed using

the given specifications of the plant. However, tuning of

the parameters can be completed efficiently. This feature

implies that a wide scope of industrial process applications

can be addressed using the approach developed in this work.
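The training heuristics mentioned in item 2) can be sketched as two small update rules. This is an illustration only: the function names and the constants (`lr_inc`, `lr_dec`, `max_ratio`) are assumed values of the kind commonly used with adaptive learning rates, not the settings used in this work.

```python
def momentum_update(weights, grads, velocity, lr, momentum):
    """Gradient step with momentum; returns the new weights and velocity."""
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_weights = [w + dv for w, dv in zip(weights, new_velocity)]
    return new_weights, new_velocity


def adapt_learning_rate(lr, momentum, prev_err, new_err,
                        lr_inc=1.05, lr_dec=0.7, max_ratio=1.04):
    """Adaptive learning rate rule.

    Returns (next_lr, next_momentum, accept_step).
    """
    if new_err > max_ratio * prev_err:
        # Error grew too much: reject the step, shrink the rate, drop momentum.
        return lr * lr_dec, 0.0, False
    if new_err < prev_err:
        # Error decreased: accept the step and speed up.
        return lr * lr_inc, momentum, True
    # Small increase: accept the step but leave the rate unchanged.
    return lr, momentum, True


# One step on a hypothetical two-weight problem.
weights, velocity = momentum_update([0.0, 0.0], [-2.0, 40.0], [0.0, 0.0], lr=0.1, momentum=0.9)
good = adapt_learning_rate(0.1, 0.9, prev_err=41.0, new_err=40.64)
bad = adapt_learning_rate(0.1, 0.9, prev_err=41.0, new_err=50.0)
```

In a full training loop the two rules alternate: a candidate step is computed with `momentum_update`, the error is re-evaluated, and `adapt_learning_rate` decides whether the step is kept.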


Yunosuke Maki was born in Toyohashi, Japan, in 1958. He received the B.S. degree in mathematics and instrumentation from the University of Tokyo in 1981 and the M.S. degree in Systems and Control Engineering from Case Western Reserve University, Cleveland, OH, in 1994.

Since 1981, he has been working for the Kawasaki Steel Corporation in Japan. His research interests include the industrial applications of systems and control theory, especially to the steelmaking process.

Mr. Maki is currently a member of the Iron and Steel Institute in Japan.

Kenneth A. Loparo (S'75–M'77–SM'89) received the Ph.D. degree in systems and control engineering from Case Western Reserve University, Cleveland, OH, in 1977.

He was an Assistant Professor in the Mechanical Engineering Department at Cleveland State University, OH, from 1977 to 1979, where he received the Distinguished Faculty Award for contributions to teaching and research. From 1979 to the present time, he has been on the faculty of The Case School of Engineering, Case Western Reserve University, where he is currently Associate Dean of The Case School of Engineering and Professor of Systems and Control Engineering. He is also Professor of Mechanical and Aerospace Engineering and Professor of Mathematics. He served as Chair of the Department of Systems Engineering from 1990 to 1994 and as Associate Director of the Center for Automation and Intelligent Systems Research from 1985 to 1989. His research interests include stability and control of nonlinear and stochastic systems with applications to large-scale electric power systems; nonlinear filtering with applications to monitoring, fault detection, diagnosis and reconfigurable control; information theory aspects of stochastic and quantized systems with applications to adaptive and dual control; and the design of digital control systems.

At Case Western Reserve University he has received numerous awards including the Sigma Xi Research Award for contributions to stochastic control, the John S. Diekoff Award for Distinguished Graduate Teaching, the Tau Beta Pi Outstanding Engineering and Science Professor Award, the Undergraduate Teaching Excellence Award, and the Carl F. Wittke Award for Distinguished Undergraduate Teaching.
