
    IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, VOL. 5, NO. 6, NOVEMBER 1997 529

    A Neural-Network Approach to Fault Detection and Diagnosis in Industrial Processes

    Yunosuke Maki and Kenneth A. Loparo, Senior Member, IEEE

    Abstract: Using a multilayered feedforward neural-network approach, the detection and diagnosis of faults in industrial processes that require observing multiple data simultaneously are studied in this paper. The main feature of our approach is that the detection of the faults occurs during transient periods of operation of the process. A two-stage neural network is proposed as the basic structure of the detection system. The first stage of the network detects the dynamic trend of each measurement, and the second stage of the network detects and diagnoses the faults. The potential of this approach is demonstrated in simulation using a model of a continuously well-stirred tank reactor. The neural-network-based method successfully detects and diagnoses pretrained faults during transient periods and can also generalize properly. Finally, a comparison with a model-based method is presented.

    Index Terms: Fault detection, fault diagnosis, neural networks.

    I. INTRODUCTION

    FAULT detection and diagnosis problems have been stud-

    ied intensively in industries such as chemical processing

    and utility power generation. Prompt detection and diagnosis

    of faults is essential for the reliable, safe, and efficient opera-

    tion of the plant and for maintaining quality of the products.

    Faults may occur in the process, the sensors, the actuators,

    and the instruments independently or simultaneously. For a

    simple fault that can be detected by a single measurement, a conventional alarm circuit may be sufficient. However,

    because it is usually very difficult in complex industrial

    systems to directly measure process states that are good

    indicators of faults, more elaborate and automatic measures

    are necessary. Observing multiple data simultaneously, skilled

    operators are often required to make tough decisions based on

    their experience and empirical knowledge.

    One of the common approaches is to use model-based

    methods for detection and diagnosis. This requires modeling

    of the process, filtering the measured data, and estimation of

    the unknown state variables. The basic idea is to compare the output of the model to the measurements from the process, thereby generating a residual or error which is used to make a decision about the operating state of the system. A wide

    variety of methods and applications have been studied and

    are summarized by Himmelblau [1], Frank [2], and Gertler

    [3]. The nonlinear filtering approach studied in [4] is in this category.

    Manuscript received August 22, 1995; revised February 5, 1997. Recommended by Associate Editor, E. O. King.

    The authors are with the Case School of Engineering, Case Western Reserve University, Cleveland, OH 44106-7082 USA.

    Publisher Item Identifier S 1063-6536(97)07770-1.

    Himmelblau et al. [5], [6] demonstrated the application of extended Kalman filtering (EKF) to fault detection

    and diagnosis in chemical processes. Model-based approaches

    can use either state-space or input-output representations

    of dynamic systems. As a consequence, the system model

    must be known and accurate for these methods to be highly

    effective. Uncertainty in the process model can easily degrade

    the estimation output and cause either missed detections or

    false alarms. The nonlinear filtering approach is usually more

    robust than conventional linear filtering-based methods, but a

    substantial modeling effort may be required.
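The residual idea described above can be sketched in a few lines. This is a minimal illustration only, not the filtering schemes of [1]-[6]: the function names, the sample data, and the fixed threshold of 0.1 are all hypothetical choices made for the example.

```python
from typing import List, Sequence


def residuals(measured: Sequence[float], predicted: Sequence[float]) -> List[float]:
    """Residual = process measurement minus model output, sample by sample."""
    if len(measured) != len(predicted):
        raise ValueError("sequences must have the same length")
    return [m - p for m, p in zip(measured, predicted)]


def fault_flags(res: Sequence[float], threshold: float) -> List[bool]:
    """Flag every sample whose absolute residual exceeds the threshold."""
    return [abs(r) > threshold for r in res]


# Hypothetical data: the model predicts a constant output, the plant drifts away.
measured = [1.00, 1.02, 0.99, 1.40, 1.45]
predicted = [1.00, 1.00, 1.00, 1.00, 1.00]
flags = fault_flags(residuals(measured, predicted), threshold=0.1)
print(flags)
```

A model error shifts every residual, which is why an inaccurate model leads directly to missed detections or false alarms with a scheme of this kind.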

    On the other hand, qualitative approaches that do not require

    process models have elicited considerable research interest in the last ten years. Decision table-based methods, knowledge-

    based expert systems [7] and artificial neural-network-based

    methods are considered to be in this category. Neural-network-

    based methods have received much attention because of their

    fast and robust implementation, their performance in learning

    arbitrary nonlinear mappings and their ability for pattern

    recognition and association. The fault detection and diagnosis

    problem can be interpreted as a pattern recognition task.

    Neural networks are an appropriate tool for fault detection

    and diagnosis in which measured data, not discernible at the

    instant of sensing, is transformed into useful information for

    decision-making.

    The potential of this approach for chemical processes was initially proposed by Hoskins and Himmelblau [8] and

    Venkatasubramanian and Chan [9]. Watanabe et al. [10]

    demonstrated the use of a two-stage neural network to add

    information about the severity of the fault. More detailed

    analysis regarding the learning, recall and generalization

    characteristics of the method was given by Venkatasubra-

    manian et al. [11] and a large-scale application to a complex

    chemical plant was demonstrated by Hoskins et al. [12].

    However, these approaches are static in nature because the

    neural networks are trained using only steady-state data. If

    the steady-state operating conditions are changed, the network

    must be retrained in order to work properly. Oftentimes, faster detection of the fault is required and it is necessary to

    use transient data for this purpose. Dietz et al. [13] trained

    the network by presenting dynamic data and Li et al. [14]

    developed an approach using a moving time window. Ohga

    and Seki [15] trained the network using a number of sets of

    time series data.

    The major motivation of this work is the use of artificial

    neural networks, capable of operating during process tran-

    sients, for fault detection and diagnosis of industrial processes.

    1063-6536/97$10.00 © 1997 IEEE

    Fig. 1. Process flow of well-stirred tank (Luyben model).

    The ultimate goal is to develop a general method that can be applied to a broad spectrum of industrial processes. The

    main feature of the proposed method is the rapid and robust

    detection of faults during transient periods of the process.

    Maintenance of the neural network should also require little effort. This paper is organized as follows: First, a

    model of a well-stirred tank is developed as a target process

    for this study. Second, the neural-network-based method is

    developed and implemented in a simulation environment.

    Third, a simulation study using the plant model is performed

    and the results of various tests are discussed. Finally, the

    proposed neural-network-based approach is compared with a

    conventional model-based method, the EKF.

    II. PLANT MODEL

    In order to illustrate the method proposed in this work, a

    target plant model is developed. A continuously well-stirred

    tank reactor (CSTR) is used for all case studies in this work. In this section, the plant model and its implementation are

    described in detail and the faults that are considered in the

    study are introduced.

    A. Description of the Plant

    Fig. 1 illustrates the jacketed CSTR in which an irreversible

    and exothermic reaction A → B takes place. The reactor is operated by three control loops that regulate the outlet temperature, the inlet flow rate of the reactant, and the tank level. A

    cooling jacket surrounds the reactor and the coolant is water in

    this case. Negligible heat losses, constant densities and perfect

    mixing inside the tank are assumed. Therefore the temperature

    in the jacket is uniform and equal to the outlet temperature. The process variables are as follows.

    concentration of A at the inlet and outlet, respectively;

    flow rate of the liquid at inlet and outlet,

    respectively;

    flow rate of coolant;

    volume of the tank;

    temperature of the inlet reactant;

    temperature of the outlet coolant (water);

    temperature of the tank;

    control valve openings.

    The parameters, assumed to be constant, are as follows.

    frequency factor;

    activation energy;

    gas constant;

    volumetric heat capacity;

    ΔH heat of reaction;

    heat exchange area;

    overall heat transfer coefficient;

    area of the tank;

    jacket volume;

    heat capacity of water;

    density of water;

    temperature of the inlet coolant (water).

    The equations describing the system are

    Mass balance:

    (1)

    (2)

    Energy balance:

    H

    (3)

    (4)

    The system is highly nonlinear. All equations and parameter values are taken from Luyben [16]. Moreover, this CSTR

    model has been used in previous neural-network-based studies;

    see [8] and [11].
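For orientation, the mass and energy balances of a jacketed CSTR with variable holdup are commonly written in the textbook form below. This is a sketch under stated assumptions, not a transcription of (1)-(4): the symbols ($V$, $F_0$, $F$, $F_J$, $C_{A0}$, $C_A$, $T_0$, $T$, $T_J$, $T_{J0}$, $k_0$, $E$, $R$, $\rho C_p$, $\Delta H$, $U$, $A_H$, $V_J$, $\rho_J$, $C_J$) are chosen here to match the variable and parameter lists above, and the printed equations may differ in notation or arrangement. $\Delta H$ is taken as negative for an exothermic reaction.

```latex
\begin{align}
\frac{dV}{dt} &= F_0 - F \\
\frac{d(V C_A)}{dt} &= F_0 C_{A0} - F C_A - V k C_A,
  \qquad k = k_0 \, e^{-E/(RT)} \\
\frac{d(V T)}{dt} &= F_0 T_0 - F T
  - \frac{\Delta H \, V k C_A}{\rho C_p}
  - \frac{U A_H (T - T_J)}{\rho C_p} \\
\frac{dT_J}{dt} &= \frac{F_J (T_{J0} - T_J)}{V_J}
  + \frac{U A_H (T - T_J)}{\rho_J C_J V_J}
\end{align}
```

The Arrhenius term $e^{-E/(RT)}$ and the products of states such as $V k C_A$ are what make the system nonlinear.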

    B. Implementation of the Plant Model

    For computer simulation, the plant model is implemented

    using Simulink in Matlab. The basic time unit is the hour.

    The step size for Euler integration is denoted by dt and is usually 0.01 h.

    A block diagram of each control loop is shown in Fig. 2.

    Three PI controllers are used to regulate the outlet temperature, the inlet flow rate of the reactant, and the tank level. Equal percentage valves are used to control the flow

    rate of the reactant, coolant and outlet liquid. To simplify the

    simulation, upstream and downstream pressures are assumed

    to be constant. A first-order lag is used to model the actuator

    and sensing dynamics.
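A first-order lag integrated with the Euler method at a step of 0.01 h can be sketched as follows. The paper's implementation is in Simulink; this Python function, its name, and the time constant of 0.1 h used in the example are illustrative assumptions only.

```python
from typing import List, Sequence


def first_order_lag(inputs: Sequence[float], tau: float,
                    dt: float = 0.01, y0: float = 0.0) -> List[float]:
    """Forward-Euler integration of tau * dy/dt = u - y.

    inputs : input signal u sampled every dt hours
    tau    : time constant in hours (hypothetical value chosen by the caller)
    dt     : Euler step size in hours (0.01 h, as stated in the text)
    """
    if tau <= 0 or dt <= 0:
        raise ValueError("tau and dt must be positive")
    y = y0
    out = []
    for u in inputs:
        y = y + dt * (u - y) / tau  # one explicit Euler step
        out.append(y)
    return out


# Unit step response over 5 h with an assumed time constant of 0.1 h.
step = [1.0] * 500
response = first_order_lag(step, tau=0.1)
print(response[0], response[-1])
```

Explicit Euler is stable for this equation only when dt is small relative to tau (dt < 2 tau), which is one reason a fixed step as small as 0.01 h is a reasonable choice.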

    C. Faults Studied

    Complex and frequently observable faults are selected for

    this study as listed in Table I. Not all possible faults are included. However, three different kinds of faults, affecting the sensor, actuator, and process, are considered.


    Fig. 2. Block diagram of the control loop.

    TABLE I. LIST OF FAULTS STUDIED

    III. NEURAL-NETWORK-BASED METHOD

    A. Introduction

    In recent years artificial neural networks have generated

    considerable interest in the field of engineering as problem

    solving tools. The fundamental element is a neuron which has

    multiple inputs and a single output. Each input is multiplied by

    a weight, the inputs are summed and this quantity is operated

    on by the transfer function of the neuron to generate the output. The output is sometimes referred to as an activity level.

    In this study, the multilayer feedforward neural network that

    has one hidden layer is used. The bias unit, whose activity

    level is fixed at one, is connected to all neurons in the hidden

    and output layer to adjust the weighted sum input of each

    neuron. The number of neurons in the input and output layer

    is determined by each application, and the number of neurons

    in the hidden layer must be adjusted during the learning phase

    so that the network can be trained efficiently. The activity level

    of the $j$th neuron is obtained as

    $$a_j = f_j(n_j), \qquad n_j = \sum_i w_{ji}\, a_i + b_j \qquad (5)$$

    where

    $a_j$ activity level (output) of the $j$th neuron;

    $n_j$ input to the $j$th neuron;

    $f_j$ the transfer function of the $j$th neuron;

    $w_{ji}$ connection weight from the $i$th neuron to the $j$th neuron;

    $a_i$ activity level of the $i$th neuron in the prior layer;

    $b_j$ connection weight from the bias unit to the $j$th neuron.

    The log-sigmoid function is used as the transfer function in

    this study.
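The neuron computation described in this subsection (weighted sum of inputs, plus a bias weight applied to a bias unit fixed at one, passed through a log-sigmoid transfer function) can be sketched as below. The function and argument names are hypothetical and chosen only for this illustration.

```python
import math
from typing import List, Sequence


def logsig(n: float) -> float:
    """Log-sigmoid transfer function; maps any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))


def neuron_output(inputs: Sequence[float], weights: Sequence[float],
                  bias_weight: float) -> float:
    """Activity level of one neuron.

    The bias unit has a fixed activity level of one, so its contribution
    to the weighted sum is simply bias_weight * 1.0.
    """
    n = sum(w * a for w, a in zip(weights, inputs)) + bias_weight * 1.0
    return logsig(n)


def layer_output(inputs: Sequence[float], weight_rows: Sequence[Sequence[float]],
                 bias_weights: Sequence[float]) -> List[float]:
    """Activity levels of a whole layer; one weight row per neuron."""
    return [neuron_output(inputs, row, b) for row, b in zip(weight_rows, bias_weights)]


hidden = layer_output([1.0, -1.0], [[0.5, 0.5], [2.0, 0.0]], [0.0, -1.0])
print(hidden)
```

Because the log-sigmoid output always lies between zero and one, it can be read directly as the "extent" of a trend or fault, which is how the later sections use it.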

    The backpropagation algorithm [17] is used to train the network. The connection weights, including the bias weights, are adjusted so that the average squared error between the network output

    and the desired output (target) for a given reference input is

    minimized. Learning continues iteratively until the sum of the

    squared error is below a certain goal. The incremental change

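    of the weight from the $i$th neuron to the $j$th neuron is computed by

    $$\Delta w_{ji}(t) = \eta\, \delta_j\, a_i + \alpha\, \Delta w_{ji}(t-1) \qquad (6)$$

    $$\delta_j = (d_j - a_j)\, f_j'(n_j) \qquad (7)$$

    $$\delta_j = f_j'(n_j) \sum_k \delta_k\, w_{kj} \qquad (8)$$

    where

    $\Delta w_{ji}(t)$ incremental change in the weight at time $t$;

    $d_j$ desired output of the $j$th neuron in the output layer;

    $\eta$ learning rate (usually a constant);

    $\alpha$ momentum (usually a constant).

    Here $a_i$ is the activity level of the $i$th neuron, $n_j$ is the input to the $j$th neuron, $f_j'$ is the derivative of its transfer function, and the sum in (8) runs over the neurons $k$ of the following layer. Equation (7) holds for the $j$th neuron in the output layer, and (8) holds for the $j$th neuron in the hidden layer. In (6), $\eta$ and $\alpha$ are adjustable parameters. In order to accelerate the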

    learning, the following methods are applied in this study.

    1) The second term on the right-hand side of (6) is added

    to the original update term to improve the learning [17].

    Momentum is the key parameter here and it is set at

    0.95 in this study.

    2) The adaptive learning rate [18], which attempts to keep

    the learning rate as large as possible while maintaining

    the stability of the learning process, is also used. This

    has a significant effect on convergence of the weights.
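A compact sketch of backpropagation with a momentum term for a one-hidden-layer log-sigmoid network is given below. It is an illustration under assumptions, not the authors' code: the class name `TinyNet`, the weight initialization range, and the fixed learning rate of 0.1 are hypothetical, and the adaptive learning rate of [18] is not implemented. Only the momentum value of 0.95 is taken from the text.

```python
import math
import random


def logsig(n):
    return 1.0 / (1.0 + math.exp(-n))


class TinyNet:
    """One hidden layer, log-sigmoid units, gradient descent with momentum."""

    def __init__(self, n_in, n_hid, n_out, eta=0.1, alpha=0.95, seed=0):
        rng = random.Random(seed)
        # One row per neuron; the last entry of each row is the bias weight.
        self.w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                      for _ in range(n_hid)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hid + 1)]
                      for _ in range(n_out)]
        # Previous weight increments, needed for the momentum term.
        self.dw_hid = [[0.0] * (n_in + 1) for _ in range(n_hid)]
        self.dw_out = [[0.0] * (n_hid + 1) for _ in range(n_out)]
        self.eta, self.alpha = eta, alpha

    def forward(self, x):
        xb = list(x) + [1.0]  # append the bias unit (activity fixed at one)
        hid = [logsig(sum(w * a for w, a in zip(row, xb))) for row in self.w_hid]
        hb = hid + [1.0]
        out = [logsig(sum(w * a for w, a in zip(row, hb))) for row in self.w_out]
        return xb, hb, out

    def train_step(self, x, target):
        """One update for one pattern; returns the squared error before the update."""
        xb, hb, out = self.forward(x)
        # Output layer: error times the log-sigmoid derivative a * (1 - a).
        d_out = [(t - a) * a * (1.0 - a) for t, a in zip(target, out)]
        # Hidden layer: output deltas propagated back through the old weights.
        d_hid = [hb[j] * (1.0 - hb[j])
                 * sum(d_out[k] * self.w_out[k][j] for k in range(len(d_out)))
                 for j in range(len(self.w_hid))]
        self._update(self.w_out, self.dw_out, d_out, hb)
        self._update(self.w_hid, self.dw_hid, d_hid, xb)
        return sum((t - a) ** 2 for t, a in zip(target, out))

    def _update(self, weights, prev, deltas, acts):
        for j, d in enumerate(deltas):
            for i, a in enumerate(acts):
                # Gradient term plus momentum times the previous increment.
                dw = self.eta * d * a + self.alpha * prev[j][i]
                weights[j][i] += dw
                prev[j][i] = dw


net = TinyNet(n_in=2, n_hid=3, n_out=1, seed=1)
first_error = net.train_step([0.2, -0.4], [1.0])
print(first_error)
```

With the momentum as large as 0.95, each step is dominated by the accumulated previous increments, which speeds travel along consistent gradient directions but also explains why the text pairs it with a learning rate that is adapted to keep training stable.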


    Fig. 3. Schematic diagram of fault detection and diagnosis system.

    Fig. 4. Three training patterns for primary neural network.

    B. Design of the Neural-Network-Based Detection System

    1) Basic Concept: The capability of neural-network-based

    methods for fault detection has been established in previous

    works. The particular goals of this study are as follows.

    1) Preknown faults should be detectable by the neural-

    network-based system. Unknown operating conditions

    should not generate a false alarm. In other words, the

    system is designed to detect faults that have occurred

    in the past and to be robust to unmodeled operating

    conditions.

    2) The transient state of the fault can be detected dynamically. No steady-state values of process variables are

    required as parameters in the design of the detection

    system.

    3) Detection is expected to be fast, reliable, and robust to

    noise.

    4) The method should be applicable to various industrial processes with little additional effort, and adjustments to the parameters and network structure should be easy to make.

    2) Basic Structure: Fig. 3 depicts the basic structure of the

    fault detection and diagnosis system developed in this work.

    A two-stage neural-network system is proposed to improve flexibility and applicability to other industrial processes. The

    first stage network is referred to as the primary neural network

    and the second stage network is referred to as the secondary

    neural network. Each primary neural network corresponds to a channel of measured data and classifies its trend as increasing, decreasing, or steady, with output values that indicate the extent of the change. Therefore, the primary neural network can be designed independently of the secondary neural network. Furthermore, we do not

    have to design more than two primary neural networks, even

    for multiple measurements, because the same network can be

    applied to different measurement channels. The primary neural

    network eliminates the need for additional input neurons to

    capture the dynamic aspects of the data; see Li et al. [14].

    The moving time window technique as described in [14] is

    used and a delay unit eliminates the effect of plant fluctuations

    and accommodates differences in the response time for

    different measurement channels. A reset and restriction rule

    is used to reduce the probability of false alarm. Details are

    presented later.

    3) Design of Primary Neural Network: As we mentioned

    above, this network is designed to be used with the various

    observations that are available. Observation histories are cat-

    egorized into three types of behavior: increasing, decreasing,

    and steady, and this network is trained to give this type of

    trend information including the extent of change. We assume

    that the measured data is normalized to the range [−1, 1]

    before it is used as an input to the network.

    We begin with data obtained by periodic sampling (21 sam-

    ples) from a single measurement source. After normalization,

    a vector of 21 elements is given to the network as an input.

    A feedforward type network is used. The number of units in the input layer is 21 and the number of units in the output layer

    is three. The activity level, defined to be between zero and one,

    of each unit corresponds to the extent of increase, decrease,

    and steadiness of the input, respectively. These activity levels

    are referred to below as the increase, decrease, and steady outputs. The number of hidden units is adjustable; after some trial and error during the learning phase, it was set to 15. The superior

    feature of the feedforward type network is that its output can

    include information about both the direction and the extent of

    change as mentioned above.

    Training is performed by presenting the three target patterns

    as given in Fig. 4. Fig. 5 shows examples of how well the network can generalize. Two different sets of noisy data, marked by x, are presented to the network, and the recognized output values are given under each graph. The extent of increase in case (a) is greater because the value of the increase output is larger. On the other hand, the extent of steadiness in case (b) is greater because the value of the steady output is larger.

    4) Design of Secondary Neural Network: The secondary

    neural network receives the outputs from the primary neural

    networks and produces information about the faults. A

    conceptual diagram of a two-stage network system is depicted

    in Fig. 6. This network must be designed and trained to satisfy

    the particular requirements of each application problem.

    Fig. 5. Examples of given input and recognized output. (a) Input: typical increase. (b) Input: slight increase.

    A feedforward type network that can be trained using preknown

    information is also appropriate for this case. If the number of sensors used for detection is $n$ and the number of faults to be detected is $m$, the secondary neural network has $3n$ neurons in the input layer and $(m + 1)$ neurons in the output layer. The number of units in the hidden layer is adjustable

    and 15 is chosen for this paper from experimental trial and

    error. The transfer function chosen is of the log-sigmoid type.

    This is a reasonable choice because the extent of each fault can be represented by a number between zero and one.

    For the plant model of the CSTR, eight variables are assumed to be measurable.

    In actual processes, it is very difficult to measure concentration continuously. Hence, the inlet and outlet concentrations of substance A are assumed not to be measurable. The number of input neurons of the secondary network is therefore 3 × 8 = 24.

    For the training of the network, target patterns must be

    set beforehand. According to the faults defined in Table I,

    Table II gives the 12 sets of target patterns used for this study.

    Each column corresponds to one specific fault and each row

    corresponds to a neuron in the input layer. The values in

    each column of the table are used as a reference input to the

    network for each of the faults to be learned. Any combination

    of faults can be chosen and the number of output neurons

    is so determined. As the target patterns, the value of the

    corresponding output neuron is set to one and the value of

    other outputs is set to zero.

    Fig. 6. Conceptual diagram of two-stage neural network.

    Fig. 7. Moving time window that traces the dynamic data.

    These targets for the network can be determined empirically by carefully investigating the faults

    that have occurred in the past, but fine adjustment may be

    necessary in order that the network generalizes satisfactorily.

    Details will be discussed later.
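The sizing of the secondary network and its one-hot output targets can be sketched as follows. The helper names are hypothetical. The count of output neurons (one per trained fault plus one for normal operation) is inferred from the five-pattern example in Section IV, and the fault labels simply follow the #1p/#1n naming used in the text.

```python
from typing import Dict, List, Tuple


def secondary_layer_sizes(n_sensors: int, n_faults: int) -> Tuple[int, int]:
    """Input and output layer sizes of the secondary network.

    Each primary network contributes three outputs (increase, decrease,
    steady); the output layer is assumed to hold one neuron per fault
    plus one neuron for normal operation.
    """
    return 3 * n_sensors, n_faults + 1


def one_hot_targets(labels: List[str]) -> Dict[str, List[float]]:
    """Target output vector per pattern: one for its own neuron, zero elsewhere."""
    return {
        label: [1.0 if i == k else 0.0 for i in range(len(labels))]
        for k, label in enumerate(labels)
    }


labels = ["#1p", "#1n", "#2p", "#2n", "normal"]
sizes = secondary_layer_sizes(n_sensors=8, n_faults=4)
targets = one_hot_targets(labels)
print(sizes, targets["#1p"])
```

The reference inputs (the columns of Table II) are not reproduced here because the table contents are not available in this text.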

    5) A Moving Time Window and Normalization: A moving time window is an indispensable technique to track dynamic data and detect the transient state of faults. As shown in Fig. 7, the window moves forward at each time increment dt. The right side of each window corresponds to the current time, and each window has a fixed time span and a fixed number of samples. The window length is adjustable for each application. For this study, only three different lengths are used: 20, 50, and 100. The vertical window height must be specified according to the range of each measurement and the amount of change in the measurement signals that can be caused by the faults. Table III gives the length and height of each window used in this study. Using uniformly spaced samples of a time series of data, the window calculates the average of the samples and rescales the vertical axis. The average value is set equal to zero, the upper value is set to one, and the lower value is set to −1 using the window height. Values that exceed one are set equal to one and values below −1 are set equal to −1.

    Finally, the output of the moving window is used as the input

    to the primary neural network.
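The window normalization described above (subtract the window average, scale by the window height, clip to [−1, 1]) can be sketched as below. The function names are hypothetical, and `half_height` is an assumption: it is taken to be the deviation from the average that maps to ±1, since the exact definition of the height in Table III is not recoverable from this text.

```python
from typing import List, Sequence


def latest_window(series: Sequence[float], n_samples: int = 21) -> List[float]:
    """Most recent n_samples values; the last one is the current time."""
    if len(series) < n_samples:
        raise ValueError("not enough data to fill the window")
    return list(series[-n_samples:])


def normalize_window(samples: Sequence[float], half_height: float) -> List[float]:
    """Centre on the window average, scale, and clip to [-1, 1].

    half_height is a hypothetical parameter: the deviation from the
    average that is mapped to +1 or -1.
    """
    if not samples:
        raise ValueError("window is empty")
    if half_height <= 0:
        raise ValueError("half_height must be positive")
    mean = sum(samples) / len(samples)
    return [max(-1.0, min(1.0, (s - mean) / half_height)) for s in samples]


history = [float(k) for k in range(30)]       # hypothetical measurement record
window = latest_window(history)               # 21 newest samples
scaled = normalize_window(window, half_height=5.0)
print(scaled[0], scaled[10], scaled[-1])
```

Because only deviations from the window average are used, the absolute steady-state value of the measurement never enters the network, which is the property the normalization paragraph relies on.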

    In adjusting the window size, there is a tradeoff between prompt detection and disturbance rejection. With a larger window, the network is less likely to be affected by plant disturbances. However, the

    detection can become insensitive to the measurements and the

    response of the detection system can also become sluggish.


    TABLE II. TRAINING PATTERNS FOR THE SECONDARY NEURAL NETWORK

    TABLE III. HEIGHTS OF EACH WINDOW (NOTE: dt = 0.01 h)

    TABLE IV. TIME CONSTANTS OF DELAY UNITS

    Normalization of the inputs to the primary network is

    automatically conducted by this moving time window; each

    output of the primary network is between zero and one

    and consequently, normalization for the secondary network

    is also accomplished within the primary network. Although

    the parameters shown in Table III must be adjusted for every application, knowledge of the steady-state values for each

    measurement is not necessary. Even though the steady-state

    operating conditions of the plant are likely to change, the

    frequency at which network parameters require readjustment

    should be low.

    6) Delay Unit, Reset, and Restriction Rules: Some supple-

    mentary functions of the neural-network-based method are

    briefly discussed in this section.

    When multiple observations are necessary to detect a certain

    fault, the response time for each observation may be different

    because of the plant characteristics. In addition, some obser-

    vations may be contaminated by noise, and the intensity of

    this sensor noise may be different for each measurement. In

    order to accommodate the different response characteristics

    for multiple data or to remove the effects of noise, a first-

    order lag (delay unit) is incorporated into the detection system.

    The sensitivity of the detection system to changes in the parameters of the delay unit should be evaluated with the

    width of the window fixed and training patterns specified. The

    time constants chosen for this study are shown in Table IV.

    Note if the window length is chosen to be large enough, and

    the time constant tau of the coolant flow rate delay unit is

    either 0.01 or 0.10, the network does not yield a false alarm

    from unmodeled disturbances considered in this study. For

    particular applications, incorporating a pure dead-time delay

    in the network could also be effective.

    A key feature of the detection system developed in this work

    is the detection of faults during transient operating conditions.


    TABLE V. RESULTS OF TRAINING AND RECALL (MULTIPLE FAULTS)

    Because the detection system does not include information on

    normal steady-state values, it cannot determine if the plant is

    in a normal steady state or in an abnormal steady state. If all

    observations are steady during a fault condition, it is possible

    that the detection system can misdiagnose the situation and

    conclude that the plant is normal. Hence, it is necessary that

    the detection system be manually reset (reinitialized) after a fault is detected and the operator concludes that the plant has resumed normal steady-state operation. This is not a problem in practical implementations because once the system detects a fault, the alarm is held until it is manually reset.

    The outputs of the network have values between zero and

    one and a threshold value of 0.9 is used in this study to set

    the alarms for the operator.

    From Table II, we notice that most faults are enumerated

    by pairs that have opposite direction. For example, if we want

    to detect the fault #8p, it is natural to train the network by

    the target patterns of faults #8p and #8n. Because many of

    the process variables have second-order dynamic response characteristics, a false alarm of #8n is likely to occur after

    the correct detection of fault #8p. However, the probability that fault #8n actually occurs immediately after fault #8p is quite low in actual applications. Therefore, the fault detection system is designed and implemented in this work in such a way that, once a fault is detected, a fault with the opposite direction is not considered until after the system is reset manually.
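The alarm threshold, the manual reset, and the restriction on opposite-direction faults can be combined in a small latch, sketched below. The class and function names are hypothetical, the pairing rule assumes the "#8p"/"#8n" label convention used in the text, and treating the threshold as a strict "greater than" comparison is an assumption.

```python
from typing import Dict, Set

THRESHOLD = 0.9  # alarm threshold stated in the text


def opposite(fault: str) -> str:
    """Label of the opposite-direction fault, assuming '#8p' <-> '#8n' naming."""
    if fault.endswith("p"):
        return fault[:-1] + "n"
    if fault.endswith("n"):
        return fault[:-1] + "p"
    return fault


class AlarmLatch:
    """Holds alarms until a manual reset and blocks opposite-direction faults."""

    def __init__(self, threshold: float = THRESHOLD) -> None:
        self.threshold = threshold
        self.active: Set[str] = set()

    def update(self, outputs: Dict[str, float]) -> Set[str]:
        """outputs maps fault labels (fault neurons only) to activity levels."""
        for fault, level in outputs.items():
            if level <= self.threshold:
                continue  # below threshold: no new alarm
            if opposite(fault) in self.active:
                continue  # restriction rule: ignore the opposite direction
            self.active.add(fault)
        return set(self.active)

    def reset(self) -> None:
        """Manual reset by the operator once the plant is back to normal."""
        self.active.clear()


latch = AlarmLatch()
first = latch.update({"#8p": 0.95, "#8n": 0.10})
second = latch.update({"#8p": 0.20, "#8n": 0.97})  # overshoot after #8p
latch.reset()
third = latch.update({"#8p": 0.20, "#8n": 0.97})
print(first, second, third)
```

In the second call the network output for #8n exceeds the threshold, but the latch keeps only #8p because the opposite fault is already active; after the reset, the same output raises #8n.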

    IV. SIMULATION RESULTS

    The proposed fault detection and diagnosis system is expected to recall pretrained faults correctly. Also, it should generalize appropriately even from distorted or noisy input

    data. From a different point of view, it is also desired that

    the neural network can be trained to detect multiple faults, as many as possible. In this section, the capabilities and

    limitations regarding these requirements are examined and

    discussed using the CSTR simulation.

    A. Recall to Trained Faults

    Basically, if the training of the secondary neural network

    is accomplished within the time allocated for training and if

    the error goal is achieved, recall should not be a problem. The

    error goal of the backpropagation algorithm is set at 1 10

    for the following experiments.

    For a single fault, it is necessary to train the network

    by presenting the target pattern of the fault and the normal

    condition. Learning a faulty pattern along with the pattern of

    normal operation is important to achieving correct recall. If

    the network is trained only using the target pattern of a single fault, then only a single neuron in the output layer is fired

    for any input. Because most faults are considered in pairs

    as mentioned previously, from a practical point of view, it

    is recommended that the network be trained with the normal

    pattern and at least one pair of faulty patterns.

    For multiple faults, target patterns should include the faults

    and the pattern for normal operation. Recall ability of the

    network is investigated by presenting both the normal pattern

    and one pair of faulty patterns. Then, by augmenting the

    number of faults, training and recall capabilities are also

    investigated. Results are summarized in Table V. For example,

    suppose the network is trained using five sets of data that represent the faults #1p, #1n, #2p, #2n and normal operation.

    Fig. 8 shows an example of recall by presenting the data of

    fault #1p. A symptom of the fault started at time 2.00. The

    first neuron that corresponds to the fault #1p is fired promptly at time 2.20, detecting the change that occurred in the coolant flow rate and the control valve opening. Note that the fifth neuron, which corresponds to the normal

    operation also responds but no false alarm occurs. The alarm

    comes before the plant stabilizes to the faulty steady condition

    at about time 2.60. Therefore, detection by this method is faster

    than the conventional method that uses steady-state data.

    From the results shown in Table V, this fault detection and

    diagnosis system can also be trained using multiple faults

    within an allowable time period. However, as the number of faults increases, trapping in a local minimum in the error

    surface is more likely to happen. This actually occurred when

    we attempted to simultaneously train 12 patterns. Randomized

    network weights and biases at the beginning of training can

    help mitigate this problem.

    As Table V shows, the correct recall rate decreases

    as the number of trained patterns increases. One reason is that

    there are faults that are similar to each other, such as faults #1n

    and #2p, and so on. The network trained using many faults is

    likely to give a false alarm by confusing such similar patterns.

    TABLE VI. GENERALIZATION RESULTS FROM TRAINED DATA WITH NOISE

    Fig. 8. An example of correct recall.

    Furthermore, right after the occurrence of fault #2p, #2n, #6p, or #6n, the coolant flow rate fluctuates for a while. This can also initiate a false alarm of fault #3p/#3n or #5p/#5n because the coolant flow rate is also the key observation for these fault pairs. By updating the error goal to a smaller value,

    the occurrence of such false alarms can be reduced, but not

    eliminated entirely.

    There is no distinct limit regarding the number of patterns

    to be trained. But for this particular application, we found that it is better to train the network using fewer than ten patterns.

    Of course, this will vary from application to application, and

    this limitation must be discovered as a part of the learning

    and training process.

    B. Generalization from Untrained Faults, Input with Noise, and Faults with Different Severity

    How does the secondary neural network respond to un-

    trained faults? How does it work with noise-corrupted input

    data or faults with different severity? These generalization

    issues are examined next. Furthermore, this section contains

    a discussion of the applicability of the proposed system to

    real-world industrial processes.

    First, the response of the detection system to an input that

    represents an untrained fault is investigated. According to

    the experiments performed with many pretrained networks,

    untrained faults were diagnosed as the normal operating condition. In the hyperplane generated by the input vectors, an arbitrary input vector lies closest to the vector of normal

    operation. Arbitrary input vectors can be considered to be

    outputs of the primary network representing unknown faults.

    Therefore, the network cannot generalize to this situation, but

    this is consistent with our objectives for the design of the

    detection system. As depicted in Fig. 6, the output neuron

    for the normal condition should be referred to as normal or

    untrained faults.

    When the set point of the controller is changed, an unsteady

    operating condition is generated in the plant, but this is still

    normal operation. If this response is not similar to one of

    the trained faults, the network diagnoses a normal operating

    condition, similar to the case of an untrained fault. However,

    faults #1p and #1n, for example, are quite similar to the response

    of the plant to a set point change of the inlet flow rate, so a

    false alarm is quite likely to be generated during such set point

    changes. As these set point changes are commonly initiated by

    an operator, the detection system should be temporarily disabled

    while the plant is in such a transient state.

    Second, noisy inputs of pretrained faults are given to the

    network. White noise with a normal distribution [$N(0, 1)$]

    is multiplied by a constant and added to each measurable

    variable. The control actions of the three controllers are also

    affected by the noise processes. The noise level is expressed as

    a percentage that represents the ratio of the standard deviation

    of the noise term to the amount of change caused by the fault.

    Let us consider the same example as given in Fig. 8, in which

    the change of the coolant flow rate is about 4 ft$^3$/h and the

    change of the control valve opening is about 8.1% of total

    stroke. If the standard deviation of the noise term of the coolant

    flow rate is 0.55 ft$^3$/h, the noise level of the coolant flow rate

    is $(0.55/4) \times 100 \approx 13.8$%. If the standard deviation of the noise

    term of the control valve opening is 0.56%, the noise level of

    the control valve opening is $(0.56/8.1) \times 100 \approx 6.9$%.

    For simplicity, the noise levels of the temperature and level

    measurements are kept constant. Other noise levels are altered

    as in Table VI, which also shows the results of this experiment.
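
The noise-level definition used in this experiment can be illustrated numerically. The following is a minimal sketch and not the simulation code used in the study: the function names, the step-shaped test signal, and the random seed are assumptions introduced here, while the figures 0.55, 4, 0.56, and 8.1 are taken from the example in the text.

```python
import numpy as np


def noise_level_percent(noise_std, fault_change):
    """Noise level: std of the noise term / change caused by the fault, in percent."""
    return 100.0 * noise_std / abs(fault_change)


def add_white_noise(signal, noise_std, rng):
    """Add N(0, 1) white noise, multiplied by the constant noise_std, to a signal."""
    signal = np.asarray(signal, dtype=float)
    return signal + noise_std * rng.standard_normal(signal.shape)


# Figures quoted in the text for the example of Fig. 8.
coolant_level = noise_level_percent(0.55, 4.0)  # coolant flow rate
valve_level = noise_level_percent(0.56, 8.1)    # control valve opening

# Hypothetical clean measurement: a 4 ft^3/h step in the coolant flow rate.
rng = np.random.default_rng(0)
clean = np.concatenate([np.zeros(50), 4.0 * np.ones(50)])
noisy = add_white_noise(clean, 0.55, rng)

print(f"coolant flow noise level: {coolant_level:.2f}%")
print(f"valve opening noise level: {valve_level:.2f}%")
```

Expressing the noise relative to the fault-induced change, rather than to the full measurement range, makes the levels comparable across variables with different units.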

    It is obvious that the more noise that is added the more difficult

    it is to detect faults. However, the network performance

    demonstrates that it is adequately robust to noise. Fig. 9

    illustrates an example where fault #1p is correctly generalized

    from a faulty input with noise. Because the threshold level of

    firing each neuron is set at 0.9, the system can detect this fault

    at time . Further discussion of noise is given in the

    next section.
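
The firing rule described above (an output neuron signals its fault once its activation exceeds 0.9) can be sketched as follows. This is an illustrative assumption about how the decision might be coded, not the implementation used in the study; the function name, the label list, and the output trajectory are hypothetical, and only the 0.9 threshold comes from the text.

```python
import numpy as np

FIRING_THRESHOLD = 0.9  # threshold level quoted in the text


def first_detection(fault_outputs, labels, threshold=FIRING_THRESHOLD):
    """Return (time_index, label) for the first fault neuron that fires, else None.

    fault_outputs has one row per time step and one column per fault neuron.
    """
    fault_outputs = np.asarray(fault_outputs, dtype=float)
    fired_times = np.flatnonzero((fault_outputs > threshold).any(axis=1))
    if fired_times.size == 0:
        return None  # no fault neuron fired: normal or untrained fault
    t = int(fired_times[0])
    neuron = int(np.argmax(fault_outputs[t]))
    return t, labels[neuron]


# Hypothetical outputs of two fault neurons over five time steps.
labels = ["#1p", "#1n"]
outputs = [
    [0.02, 0.01],
    [0.30, 0.05],
    [0.75, 0.04],
    [0.93, 0.03],
    [0.98, 0.02],
]
detection = first_detection(outputs, labels)
print(detection)
```

A high threshold trades detection delay for fewer false alarms, which matters when the inputs are corrupted by noise.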


    [Equations (14)-(17) and their accompanying symbols did not survive extraction.]

    The quantities they give are used to compute the state estimate and

    error covariance at the next time step.
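
For orientation, one cycle of an EKF recursion can be sketched in code. The following is the textbook discrete-time predict/update step, offered as a minimal sketch; it is not claimed to match the notation or ordering of (14)-(17), and the toy model (`f`, `h`, their Jacobians, and the values of `Q` and `R`) is hypothetical.

```python
import numpy as np


def ekf_step(x, P, y, f, F_jac, h, H_jac, Q, R):
    """One textbook EKF cycle: time update followed by measurement update."""
    # Time update: propagate the estimate and its error covariance.
    x_pred = f(x)
    F = F_jac(x)
    P_pred = F @ P @ F.T + Q
    # Measurement update: correct the prediction with the new observation.
    H = H_jac(x_pred)
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (y - h(x_pred))
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new


# Hypothetical two-state model in which only the first state is measured.
def f(x):
    return np.array([x[0] + 0.1 * np.sin(x[1]), 0.95 * x[1]])


def F_jac(x):
    return np.array([[1.0, 0.1 * np.cos(x[1])], [0.0, 0.95]])


def h(x):
    return np.array([x[0]])


def H_jac(x):
    return np.array([[1.0, 0.0]])


Q = 0.01 * np.eye(2)   # process noise covariance (assumed)
R = np.array([[0.1]])  # measurement noise covariance (assumed)
x_new, P_new = ekf_step(np.zeros(2), np.eye(2), np.array([1.0]),
                        f, F_jac, h, H_jac, Q, R)
print(x_new)
```

The returned pair serves as the starting point for the next time step, which is the role the text assigns to the quantities computed by the recursion.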

    B. Application of EKF to the CSTR

    Define the state vector

    (18)

    where it is assumed that and are measurable. It

    follows that , , , ,

    and . From the assumption, and

    are measurable and the others are not. Usually and are

    treated as inputs in such a state space representation. However,

    in this application they are defined as variables because they

    must be estimated.

    For computational reasons, the system equations are mod-

    ified to be dimensionless. The normalized state variables in

    deviation form are defined as shown in (19) at the bottom of

    the page, where (ft$^3$), , ,

    , (ft$^3$/h) and (ft$^3$/h). These

    are the steady-state values. Hence, ,

    , , ,

    and . From here, the * is

    omitted to simplify the notation and because they are defined

    as deviation variables, the initial state is given as .

    From (1)-(4), the system equations of the CSTR are written

    as shown in (20) at the bottom of the page, where

    (21)

    represents the modeling uncertainty.

    From the assumption on measurability, the output equation is

    written below, where the observation matrix is denoted by

    (22)

    Here

    (23)

    and is the measurement noise vector.

    Examining the parameters given in (21), we notice that there

    are variables that require estimation. For example, the coolant

    flow rate, , and the reactant concentration at the inlet, ,

    can change during the operation of the process. In a fault mode,

    parameters such as , , and can also vary. The reason for

    the choice of the system state as given in (18) is discussed in

    the next section.

    (19)

    (20)

    C. Observability

    The model (20) has unmeasured state variables and param-

    eters that can change during different operating modes of the

    process. If parameters such as , , and can be estimated,

    it is very helpful for fault detection and diagnosis. However,

    because they are not measurable directly and are necessary for

    the detection of certain faults, the state vector is augmented to

    include these variables and parameters to be estimated by the

    EKF. The augmented state must be observable, otherwise it

    is meaningless to incorporate these variables and parameters

    into the model. Various realizations, other than the system

    representation given in the previous section, were considered.

    Unfortunately, every other realization that was tried turned out

    to be unobservable.
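
The effect of state augmentation on observability can be illustrated with a rank test. The sketch below uses a linear toy system, whereas the CSTR model is nonlinear and would have to be linearized or examined with nonlinear criteria; all matrices and names are hypothetical and are not taken from the model (20).

```python
import numpy as np


def observability_rank(A, C):
    """Rank of the observability matrix [C; CA; ...; CA^(n-1)]."""
    n = A.shape[0]
    rows = [C @ np.linalg.matrix_power(A, k) for k in range(n)]
    return int(np.linalg.matrix_rank(np.vstack(rows)))


# Two plant states plus ONE constant parameter appended as a third state.
A_one = np.array([[0.9, 0.1, 0.0],
                  [0.0, 0.8, 0.5],
                  [0.0, 0.0, 1.0]])
C_one = np.array([[1.0, 0.0, 0.0]])

# Same plant with TWO appended parameters that enter the dynamics identically.
A_two = np.array([[0.9, 0.1, 0.0, 0.0],
                  [0.0, 0.8, 0.5, 0.5],
                  [0.0, 0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0]])
C_two = np.array([[1.0, 0.0, 0.0, 0.0]])

rank_one = observability_rank(A_one, C_one)  # full rank: 3 of 3 states
rank_two = observability_rank(A_two, C_two)  # rank deficient: 3 of 4 states
print(rank_one, rank_two)
```

In the second case the two parameters cannot be distinguished from the single measurement, so appending both to the state vector yields an unobservable realization, which is the kind of failure described above.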

    D. EKF Tuning and Simulation for Fault Detection

    The parameters of the EKF are , , and , which are

    assumed to be diagonal matrices. Also, the initial state must

    be specified. As described in (9) and (10), and can

    be determined from the known noise covariances. After that,

    however, further tuning by trial and error is always necessary

    to achieve stable and accurate estimation. Each element of

    is inversely proportional to the gain matrix . Hence, smaller

    elements are chosen as long as the estimation process is stable.

    As each element of increases, faster response is obtained

    but the amplitude of fluctuation increases. Conversely, the

    smaller each element of is, the slower the response and the

    smaller the fluctuations. The elements of that correspond to

    unmeasurable states need to be chosen so that the estimated

    states track the true values. As long as is given as ,

    and and are chosen appropriately, it is not necessary

    to adjust . This delicate balancing of estimator parameters

    and performance is similar to the situation we discussed

    earlier regarding the influence of window height on network

    performance for different fault severity scenarios.
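
The trade-off described in this paragraph can be reproduced on a scalar Kalman filter. The sketch below assumes that the symbols lost from this copy correspond to the usual process-noise covariance (here `q`) and measurement-noise covariance (here `r`); the random-walk model and the function name are illustrative and unrelated to the CSTR.

```python
def steady_state_gain(q, r, iterations=500):
    """Iterate the scalar Riccati recursion for x[k+1] = x[k] + w, y[k] = x[k] + v.

    q is the process-noise variance and r is the measurement-noise variance.
    """
    p = 1.0  # initial error variance (assumed)
    k = 0.0
    for _ in range(iterations):
        p_pred = p + q              # time update of the error variance
        k = p_pred / (p_pred + r)   # Kalman gain
        p = (1.0 - k) * p_pred      # measurement update of the error variance
    return k


gain_small_q = steady_state_gain(q=0.01, r=1.0)  # slow response, smooth estimate
gain_equal = steady_state_gain(q=1.0, r=1.0)
gain_small_r = steady_state_gain(q=1.0, r=0.01)  # fast response, noisier estimate
print(gain_small_q, gain_equal, gain_small_r)
```

A larger gain weights each new measurement more heavily, so the estimate responds faster but fluctuates more, while a smaller gain gives a slower and smoother estimate.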

    Two faults, #3p and #5p, are selected for the case study

    to compare the performance of the EKF with that of the

    neural-network-based approach. For #3p, the unmeasurable

    state, , rises due to the sticking of the control valve. For #5p,

    the constant, , rises and consequently the unmeasurable state

    increases. Hence, the estimates of or

    by the EKF are used to detect these faults.

    The square root of the covariance matrix is given as

    The square root of the covariance matrix is given as

    The covariance of the initial state estimation error is given

    as

    Fig. 11. Results of fault #5p.

    Computer simulations for the two faults are performed under

    the above conditions.

    Fault #3p: For fault #3p, after adjustment and of

    the EKF are

    Results of estimated states are shown in Fig. 10. The change

    of coolant flow rate affects the accuracy of the plant model.

    Nevertheless, the estimates of and track

    the actual values very well. The fault is detected by trending

    of the estimated states and .

    Fault #5p: For fault #5p, and are adjusted to be

    Results of the estimated states are shown in Fig. 11.

    Despite the intensive effort to search for the optimal

    and , does not follow the actual value. The

    coolant flow rate and the inlet concentration change

    due to the fault and affect the accuracy of the model. The

    plant model with degraded accuracy hinders the estimation

    and proper tracking. Hence, the fault is not detectable.

    E. Comparison of the Two Methods

    Following the above results, the EKF and the

    neural-network-based method are compared for the two faults

    #3p and #5p. Five different levels of noise are applied,

    and the performance at different S/N ratios is compared.

    As the noise level of fault #3p increases, false alarms are

    likely to happen and tuning of the parameters is necessary

    for more than half the cases. For all cases of fault #5p, the

    EKF does not work well even though extensive efforts were

    made to tune the parameters. We conclude this section with

    the following comments regarding the comparison.

    1) A model-based approach such as EKF significantly

    depends on the validity of the model. The performance

    of a model-based system is easily degraded by unmod-

    eled disturbances such as measurement or process noise

    caused by perturbations or other malfunctions of the

    plant.

    2) Use of the EKF is limited by observability of the real-

    ization, including unmeasurable states and parameters. If

    the system is not observable, we must look for a reduced

    set of states and/or parameters that are observable.


    Reducing the state variables makes the model vulnerable

    to uncertainty.

    3) Parameters of the EKF need to be adjusted every time

    the noise level changes. Parameters of the neural-

    network-based approach are generally more robust in

    this sense.

    4) The advantage of the model-based method is that it

    can estimate unmeasurable parameters, if they are observable.

    On the other hand, the neural-network method

    must rely on the measurable information and can best

    correlate with preknown faults through training.

    VI. CONCLUSIONS

    A neural-network-based fault detection and diagnosis system

    is developed and applied to a plant model of a CSTR.

    We summarize and review the main results of this work,

    evaluate the neural-network-based approach for fault detection,

    and discuss the applicability to general industrial processes.

    A nonlinear CSTR was chosen as the study system. A

    plant model was developed and implemented for computer

    simulation. The following results were obtained from the case

    studies conducted using the CSTR model.

    1) A two-stage neural-network system, where each element

    can be designed independently, has been proposed. Us-

    ing data preprocessed by the moving time window, the

    primary network detects the transient state of each mea-

    surement dynamically, and this architecture can be used

    for many industrial process applications. A secondary

    neural network was developed to detect and diagnose a

    set of preknown faults according to the specification of

    the application. The two-stage network approach yields

    an efficient and simplified design procedure.

    2) For the training of the neural networks, the backpropagation

    algorithm was chosen. This approach is considered

    to be better for training feedforward neural networks, as

    opposed to Hopfield networks. Combined with momentum

    and the adaptive learning rate method, training of

    the neural networks was performed very efficiently.

    3) The secondary neural network can be trained for multi-

    ple faults as long as all the patterns differ from each

    other. It can also recall the trained faults correctly.

    However, the more patterns that are trained, the more

    likely it is to be trapped in a local minimum. Moreover,

    similar patterns of faults or perturbations that occur after

    a fault are likely to produce a false alarm.

    4) The neural-network-based system can detect trained

    faults promptly during the transient period. It is faster

    than the method trained using steady-state data. It detects

    an untrained fault as the normal operating condition, as

    desired. When a set point of a controller is changed dur-

    ing normal operation of the plant, the detection system

    diagnoses that the plant is normal unless the response of

    the plant induced by the change in set point is similar to

    that of the trained faults. Because an operator should be

    aware of either a set point change or the occurrence of a

    measurable disturbance, in these situations the detection

    system should be turned off temporarily until the process

    returns to a normal operating state.

    5) The secondary neural network can generalize from faulty

    data with noise if the amplitude of the noise is within

    certain bounds. Also it can generalize from faulty data

    with different severity, unless the severity is much

    smaller than the window height.

    6) A conventional model-based approach, the EKF, is cho-

    sen for this study. Its performance strictly depends on the

    accuracy of the model and its applicability is restricted

    by observability of the realization. Compared with the

    EKF model-based approach, the neural-network-based

    approach is more robust with respect to noise. Generally,

    we do not have to change parameters of the neural-

    network-based method for different faults and different

    noise levels.

    7) The secondary neural network must be designed using

    the given specifications of the plant. However, tuning of

    the parameters can be completed efficiently. This feature

    implies that a wide scope of industrial process applications

    can be addressed using the approach developed in

    this work.


    Yunosuke Maki was born in Toyohashi, Japan, in 1958. He received the B.S. degree in mathematics and instrumentation from the University of Tokyo in 1981 and the M.S. degree in Systems and Control Engineering from Case Western Reserve University, Cleveland, OH, in 1994.

    Since 1981, he has been working for the Kawasaki Steel Corporation in Japan. His research interests include the industrial applications of systems and control theory, especially to the steelmaking process.

    Mr. Maki is currently a member of the Iron and Steel Institute in Japan.

    Kenneth A. Loparo (S'75-M'77-SM'89) received the Ph.D. degree in systems and control engineering from Case Western Reserve University, Cleveland, OH, in 1977.

    He was an Assistant Professor in the Mechanical Engineering Department at Cleveland State University, OH, from 1977 to 1979, where he received the Distinguished Faculty Award for contributions to teaching and research. From 1979 to the present time, he has been on the faculty of The Case School of Engineering, Case Western Reserve University, where he is currently Associate Dean of The Case School of Engineering and Professor of Systems and Control Engineering. He is also Professor of Mechanical and Aerospace Engineering and Professor of Mathematics. He served as Chair of the Department of Systems Engineering from 1990 to 1994 and as Associate Director of the Center for Automation and Intelligent Systems Research from 1985 to 1989. His research interests include stability and control of nonlinear and stochastic systems with applications to large-scale electric power systems; nonlinear filtering with applications to monitoring, fault detection, diagnosis and reconfigurable control; information theory aspects of stochastic and quantized systems with applications to adaptive and dual control; and the design of digital control systems.

    At Case Western Reserve University he has received numerous awards including the Sigma Xi Research Award for contributions to stochastic control, the John S. Diekoff Award for Distinguished Graduate Teaching, the Tau Beta Pi Outstanding Engineering and Science Professor Award, the Undergraduate Teaching Excellence Award, and the Carl F. Wittke Award for Distinguished Undergraduate Teaching.