FBOLES—outline of a frequency-based on-line expert system approach for fault diagnoses in nuclear power plants



  • Reliability Engineering and System Safety 40 (1993) 165-172

    FBOLES: outline of a frequency-based on-line expert system approach for fault diagnoses in nuclear power plants

    Qin Zhang, Xuegao An, Jin Gu, Dazhi Xue & Shuren Xi Institute of Nuclear Energy Technology, Tsinghua University, Beijing 100084, People's Republic of China

    (Received 11 November 1991; accepted 7 September 1992)

    The basic concept of a novel frequency-based on-line expert system (FBOLES) approach for on-line fault diagnoses in nuclear power plants has been presented previously. This paper reports further progress, including how to express the uncertainties of expert opinion in knowledge-bases and propagate them in inferences, and how to deal with imperfect knowledge-bases. It also describes the main elements of FBOLES and outlines how the system works under complex conditions, including the dynamic behavior of signals during accidents, multiple failures, and on-line changes of plant operation mode. Its ability to diagnose root causes of accidents has been tested using the main steam and feedwater condensate systems of a full-scale, three-loop, 950 MW nuclear power plant simulated by the Tsinghua simulator. So far the results seem good.


    The basic concept of a novel frequency-based on-line expert system (FBOLES) approach for on-line fault diagnoses in process plants has been developed and presented in our previous papers [1,2]. This paper reports further progress, which includes (1) how to express the uncertainties of expert opinion in knowledge-bases and propagate them in inferences; (2) how to deal with imperfect knowledge-bases; (3) how to identify multiple failures consisting of initiating and non-initiating events; and (4) how to deal with the dynamic behavior of signals during accidents. The advantages of the FBOLES approach are as follows:

    (1) uncertainties, including spurious sensor signals and uncertain expert opinion in knowledge-bases, are included;

    (2) imperfect knowledge-bases are taken into consideration;

    (3) rapid diagnosis of the root causes of abnormalities is possible, so that operators may control accidents in time;

    (4) it is possible to add new information at any time;

    Reliability Engineering and System Safety 0951-8320/93/$06.00 © 1993 Elsevier Science Publishers Ltd, England.

    (5) on-line changes of the plant operation mode are easily taken into consideration, both (i) when the system is set up for different but normal plant operation modes and (ii) when the knowledge-bases must be updated on-line following abnormal changes of plant operation mode that may occur during accidents; and

    (6) the knowledge-bases are easily constructed and modified.

    Using the full-scale nuclear power plant simulator (a three-loop, 950 MW PWR) located on the campus of Tsinghua University, Beijing, we have initially tested this approach on its main steam and feedwater condensate systems (i.e. the secondary loop). The test results seem good.

    In Section 2 we will describe briefly our methodology with emphasis on recent progress. Section 3 discusses conflict signals and imperfect knowledge-bases. Section 4 outlines the main elements of FBOLES and describes how it works in complex cases. Section 5 reports some test results.



    Suppose $E = \bigcap_{i=1}^{n} e_{ik}$ is a set of evidence collected on-line and $H = \{h_1, h_2, \ldots, h_m\}$ is the set of exhaustive and exclusive plant states or hypotheses, where $k \in \{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots\}$ indicates the state of the process parameter monitored by sensor $i$ or the state of component $i$. It is noted that, for a given set of evidence $E$, most members of $H$ become impossible. Only a small part of $H$ is consistent with $E$. Denote the consistent part of $H$ as $H^*$. Thus $E = H^*$.

    Besides the evidence E, one also knows the time interval in which an abnormality occurs. Let (t, t + dt) denote the occurrence of an abnormality between t and t + dt. Since dt is small, we get

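    $$\Pr\{h_i \mid E \cap (t, t+dt)\} = \frac{\mathrm{Fr}\{h_i \mid t\}}{\sum_{h_j \in H^*} \mathrm{Fr}\{h_j \mid t\}} \quad \text{when } h_i \in H^* \tag{1}$$

    otherwise $\Pr\{h_i \mid E \cap (t, t+dt)\} = 0$, where $\mathrm{Fr}\{\cdot\}$ is a frequency function.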

    This is a frequency-based approach. It reveals that not considering (t,t+dt), which leads to the unavailability-based approach, may be inappropriate.
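    The normalisation in eqn (1) can be illustrated with a minimal Python sketch. The function name `posterior`, the hypothesis labels and the frequency values are hypothetical and chosen only for illustration; the paper does not specify an implementation.

```python
def posterior(freq, consistent):
    """Eqn (1): normalise occurrence frequencies over the consistent set H*.

    freq       -- dict mapping every hypothesis in H to Fr{h | t}
    consistent -- iterable of the hypotheses that form H*
    """
    h_star = set(consistent)
    total = sum(freq[h] for h in h_star)
    if total <= 0:
        raise ValueError("H* is empty or has zero total frequency")
    # Hypotheses outside H* are impossible given the evidence.
    return {h: (freq[h] / total if h in h_star else 0.0) for h in freq}


# Hypothetical frequencies (per year) for three plant states.
freq = {"pump_trip": 2e-2, "valve_stuck": 6e-2, "pipe_break": 1e-4}
post = posterior(freq, ["pump_trip", "valve_stuck"])
print(post)
```

    Because only ratios of frequencies enter the result, any factor common to all members of H* cancels out.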

    H* is associated with E and must be found on-line after E is received. To find H*, we have

    $$e_{ik} = SF_{ik} + R_i X_{ik} + R_i IK_{ik} = S_{ik} + R_i X_{ik} \tag{2}$$

    $$S_{ik} = SF_{ik} + R_i IK_{ik} \tag{3}$$

    $$H^* = E = \bigcap_{i=1}^{n} e_{ik} = \bigcap_{i=1}^{n} (S_{ik} + R_i X_{ik}) = \sum_{\alpha=1}^{2^{n}} \Big( \bigcap_{i \in I_\alpha} S_{ik} \bigcap_{j \in J_\alpha} R_j X_{jk} \Big) \tag{4}$$

    where '+' is the exclusive OR operator; $I_\alpha + J_\alpha = \{1, \ldots, n\}$; $R_i$ denotes that sensor $i$ is reliable; $X_{ik}$ denotes that what $e_{ik}$ indicates is true; $SF_{ik}$ denotes that sensor $i$ is malfunctioning (when $k \neq 0$) or failed to function (when $k = 0$); $IK_{ik}$ indicates the imperfectness of the mini knowledge-base that is intended to explain $X_{ik}$ but actually fails to do so; $S_{ik}$ indicates that signal $i$ should be ignored because either the sensor has failed or the corresponding mini knowledge-base is imperfect, so that the signal appears to conflict with others.

    Eqn (4) may appear to suffer from a combinatorial explosion of terms. However, in most cases all signals and the corresponding knowledge-bases are correct and only one term, i.e. $\bigcap_{i=1}^{n} R_i X_{ik}$, is needed. In rare cases, one of the signals is spurious or its corresponding mini knowledge-base is imperfect. In these cases, n additional terms are added. In very rare cases, more than one signal is spurious or has an imperfect mini knowledge-base. In these cases, much longer computation times will be needed because of the increase in the number of terms. In most cases, however, combinatorial explosion will not appear. Furthermore, some of the terms on the right side of eqn (4) may be impossible (they can never be true because they include exclusive events) or impractical (they rarely occur because they include more than one independent initiating event). Thus consistency checks are needed to remove impossible or impractical terms. On the other hand, some of the terms may include inclusive events. Absorption checking is thus needed to condense these terms.
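    The way the terms of eqn (4) can be generated in order of increasing rarity is sketched below in Python. The helper name `terms_by_order` and the cut-off `max_ignored` are assumptions made for this illustration and are not taken from the paper.

```python
from itertools import combinations


def terms_by_order(n, max_ignored):
    """Yield (I, J) index tuples for eqn (4), fewest ignored signals first.

    I holds the signals taken as S_ik (ignored) and J the signals taken as
    R_j X_jk (trusted); together they cover {1, ..., n}.
    """
    signals = range(1, n + 1)
    for r in range(max_ignored + 1):
        for ignored in combinations(signals, r):
            trusted = tuple(s for s in signals if s not in ignored)
            yield ignored, trusted


# With n = 4 signals: one term trusts everything, four more ignore one signal.
terms = list(terms_by_order(4, 1))
print(len(terms))
for ignored, trusted in terms:
    print("ignore", ignored, "trust", trusted)
```

    Stopping at a small `max_ignored` keeps the number of terms at 1 + n in the common cases, while the full expansion with `max_ignored = n` has 2^n terms.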

    For the further development of H*, we treat each element in the modified eqn (4) as the top event of a mini fault tree or success tree that serves as a mini knowledge-base. These trees are constructed in advance, and their minimal cut or path sets are found and stored in the computer as the knowledge-bases. By using these mini knowledge-bases, eqn (4) can be further expanded in terms of minimal cut sets. These minimal cut sets are then the detailed H*. Throughout the expansion, the recognition that two or more independent initiating events rarely occur simultaneously is used to reduce the amount of computation required (see Section 4.4).
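    A minimal Python sketch of this expansion follows: the minimal cut sets of several mini knowledge-bases are ANDed together, combinations with more than one initiating event are discarded, and absorption removes non-minimal sets. The function name `and_expand`, the event labels and the representation of cut sets as Python sets are assumptions made for illustration only.

```python
from itertools import product


def and_expand(cut_set_families, initiating, max_initiating=1):
    """AND several mini knowledge-bases, each given as a list of cut sets."""
    combined = set()
    for choice in product(*cut_set_families):
        merged = frozenset().union(*choice)
        # Impractical term: too many independent initiating events at once.
        if len(merged & initiating) <= max_initiating:
            combined.add(merged)
    # Absorption: drop any cut set that strictly contains another one.
    return {c for c in combined if not any(o < c for o in combined)}


# Hypothetical events: "A" and "B" are initiating, "c" and "d" are not.
kb1 = [{"A"}, {"c"}]
kb2 = [{"A"}, {"B", "d"}]
detailed = and_expand([kb1, kb2], initiating={"A", "B"})
print(detailed)
```

    In this example {A, B, d} is rejected because it needs two initiating events, and {A, c} is absorbed by {A}.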

    In practice, it is very difficult to ensure the completeness of the knowledge-bases. Fortunately, our approach can handle this by treating imperfect knowledge-bases in the same way as it treats spurious sensor signals. Because the signals indicating faults are redundant, ignoring a few of them usually will not affect the final diagnostic results. Of course, each ignored signal is flagged and a degrade factor (e.g. 0.01) is multiplied in when calculating the associated probabilities. The ignored signals may be analyzed later to determine the detailed reasons (sensor failures or imperfect knowledge-bases). Since the probabilities of knowledge-bases being imperfect are usually much larger than those of sensor failures, the degrade factors mainly reflect the degree of imperfectness of the knowledge-bases.

    In some cases, the imperfectness of two or more mini knowledge-bases under similar signals is due to a common cause. In these cases, only one degrade factor should be multiplied. For example, if the same steam flow rate signals from three identical lines are to be ignored because of imperfect knowledge-bases, only one degrade factor, instead of three, should be multiplied when calculating $\Pr\{h_i \mid t\}$.
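    The common-cause rule for degrade factors can be sketched in Python as follows. The function name `degrade_multiplier`, the signal names and the grouping dictionary are hypothetical; the value 0.01 is the example factor quoted in the text.

```python
def degrade_multiplier(ignored_signals, common_cause_group, factor=0.01):
    """Product of degrade factors for the ignored signals.

    Signals that share a common-cause group are charged one factor in total;
    a signal without a group forms its own group.
    """
    groups = {common_cause_group.get(s, s) for s in ignored_signals}
    return factor ** len(groups)


# Three identical steam lines share one (hypothetical) common-cause group.
groups = {
    "steam_flow_1": "steam_flow_lines",
    "steam_flow_2": "steam_flow_lines",
    "steam_flow_3": "steam_flow_lines",
}
three_lines = ["steam_flow_1", "steam_flow_2", "steam_flow_3"]
print(degrade_multiplier(three_lines, groups))
print(degrade_multiplier(three_lines + ["feedwater_temp"], groups))
```

    The first call applies a single factor for the three lines; the second applies two, because the additional signal is unrelated to the steam lines.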

    To reduce the imperfectness in the mini knowledge-bases, the minimal cut sets in all mini knowledge-bases should be analyzed one by one after they are generated from the mini trees. When doing this, confidence factors should be added to reflect the experts' confidence in the effect of a cut set on the top event of the mini tree. For example, one expert may have 100% confidence that cut set A will cause $X_{ik}$ to appear. He then assigns a confidence factor of 1 to this cut set. Compared with A, he may have only 50% confidence that cut set B will cause $X_{ik}$ to appear. He then assigns a confidence factor of 0.5 to this cut set. When calculating $\Pr\{h_j \mid t\}$, all the confidence factors that come from different mini knowledge-bases but are associated with the same $h_j$ should be multiplied together. The calculated value may be very small, but since eqn (1) performs a relative calculation, this does not affect the final result.
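    The point that small absolute values do not matter can be checked with a short Python sketch. The names `score` and `normalise`, the hypothesis labels and all numerical values are assumptions made for illustration.

```python
from math import prod


def score(base_freq, confidences):
    """Frequency of one hypothesis times all of its confidence factors."""
    return base_freq * prod(confidences)


def normalise(scores):
    """Relative calculation in the spirit of eqn (1)."""
    total = sum(scores.values())
    return {h: s / total for h, s in scores.items()}


# Two hypothetical hypotheses, each supported by two mini knowledge-bases.
scores = {
    "h1": score(1e-3, [1.0, 0.5]),
    "h2": score(1e-3, [0.5, 0.5]),
}
ranking = normalise(scores)
# Scaling every score by the same tiny factor leaves the ranking unchanged.
scaled = normalise({h: s * 1e-6 for h, s in scores.items()})
print(ranking, scaled)
```

    Only the ratios between hypotheses survive the normalisation, so a uniformly small product of confidence factors has no effect on the diagnosis.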

    The detailed H* may also be obtained using the logic flowgraph methodology (LFM) [3] by treating each term of eqn (4) as a subset of evidence. This may make it much easier to modify the knowledge-bases on-line in response to on-line changes in plant operation mode.


    3.1 Conflict signals

    What are conflict signals? Our understanding is as follows: assuming all knowledge-bases are perfect, if a number of these signals appear simultaneously (within the time interval of consideration), at least one must be spurious. In other words, there is no event that will cause these signals to appear simultaneously. When they do appear, they are called 'conflict signals' and it must be true that at least one of them is spurious and should be ignored.

    In our approach, when conflict signals exist, the diagnostic result must be a null set if we trust all of them. Thus when a null set is reached, there must be at least one spurious signal, given that our knowledge-bases are perfect. In this case, if we do not have other means to determine which signal is spurious, we have to ignore these signals one by one to see if any non-null set is reached. If not, there must be at least two spurious signals and we have to ignore two or more of them group by group until a non-null set is reached. When a signal is ignored and a non-null set is reached, this signal is likely to be spurious. If there is more than one candidate, we need to rank them according to their probabilities of associated hypotheses to see which is more likely. This result is reasonable because it just reflects the state of knowledge. To locate the spurious signal accurately, further information must be provided.
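    The search described above can be sketched in Python under a simplifying assumption: each signal is associated directly with the set of hypotheses that could explain it, so that trusting several signals amounts to intersecting these sets. The function name `diagnose`, the limit `max_ignored` and the example data are hypothetical.

```python
from itertools import combinations


def diagnose(signal_hypotheses, max_ignored=2):
    """Ignore as few signals as possible until a non-null set is reached.

    signal_hypotheses -- dict: signal name -> hypotheses that explain it
    Returns a list of (ignored_signals, consistent_hypotheses) candidates.
    """
    signals = sorted(signal_hypotheses)
    for r in range(max_ignored + 1):
        found = []
        for ignored in combinations(signals, r):
            trusted = [set(signal_hypotheses[s]) for s in signals if s not in ignored]
            consistent = set.intersection(*trusted) if trusted else set()
            if consistent:
                found.append((ignored, consistent))
        if found:
            return found  # candidates still need ranking by probability
    return []


evidence = {"s1": {"h1", "h2"}, "s2": {"h1"}, "s3": {"h3"}}
result = diagnose(evidence)
print(result)
```

    Here trusting all three signals gives a null set, and only ignoring `s3` yields a non-null diagnosis, so `s3` is the likely spurious signal. When several candidates are returned, they would be ranked using the probabilities of their associated hypotheses.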

    In contrast to conflict signals, it is possible that spurious signals do not lead to a conflict. In these cases, a non-null set is reached but it is a spurious diagnosis. Unless new information is provided, there is, as far as we know, no way to recognize these spurious signals. Fortunately, these cases are very rare.

    3.2 Imperfect knowledge-bases

    There are three types of imperfectness in our mini knowledge-bases: (1) a true hypothesis is not included in the knowledge-base; (2) a false hypothesis is included in the knowledge-base; (3) the values of confidence factors, degrade factors or failure rates (probabilities) of basic events are inaccurate. Type 3 can only affect the ranking of the hypotheses and is therefore not serious. Type 2 may provide false hypotheses together with the true hypotheses. This will cause the diagnosis to be less accurate, but it will not shield the true hypotheses, so it is not serious either. Furthermore, if the false hypotheses are not included in all mini knowledge-bases that are currently employed in the diagnosis, they will not affect the final results.

    The first type of imperfectness will cause a null set in the diagnosis. Its effect is the same as that of the conflict signals. So it is treated as if the associated signal is spurious. In fact, when a null set is reached, we usually do not know whether it is caused by the conflict signals or the imperfect knowledge-bases. But this does not matter because the action taken to ignore this signal or to ignore this mini knowledge-base is the same. Which is the true cause can be determined later.

    It must be emphasized that some imperfect knowledge-bases are due to a common cause and should be treated as one, i.e. ignored simultaneously.


    The outline of FBOLES is illustrated in Fig. 1. In the following text, we will describe each functional element shown in this figure and the relationships between them.

    4.1 Signal Adaptor

    The Signal Adaptor continuously monitors the concerned signals both shown and not shown in the control room. Usually, there is a process monitoring computer connected to the reactor, which keeps all the important plant signals at the sequential time points. Thus, most on-line signals of interest can be obtained from this computer.

    4.2 Signal Checker

    The Signal Checker continuously checks the plant parameter and component state signals obtained from the Signal Adaptor so that any abnormal signals can be identified immediately.

    The normal parameter values, with their normal random deviation ranges, corresponding to the on-line plant operation modes are established by operators through the Setup function. The necessary knowledge for this is provided in the Deep Plant Knowledge-Base I. The Signal Checker compares the signals received with their normal values. Once a signal is outside its normal range, it is identified as abnormal.

    Fig. 1. Outline of FBOLES. (Block diagram; recoverable labels: Signals Shown in Control Room, Signals Not Shown in Control Room, Signal Adaptor, Signal Checker, Signal Analyzer, Inference Engine, Knowledge Base Modifier, Knowledge Bases 1 to n, Deep Plant Knowledge Base I, Deep Plant Knowledge Base II, Decision.)
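    The range comparison performed by the Signal Checker can be sketched in Python as follows. The function name `check_signals`, the signal names, the numerical ranges and the use of -1/+1 to encode the direction of deviation are assumptions for illustration and are not taken from the paper.

```python
def check_signals(readings, normal_ranges):
    """Return the signals whose readings lie outside their normal range.

    The result maps each abnormal signal to -1 (below range) or +1 (above).
    """
    abnormal = {}
    for name, value in readings.items():
        low, high = normal_ranges[name]
        if value < low:
            abnormal[name] = -1
        elif value > high:
            abnormal[name] = +1
    return abnormal


# Hypothetical normal ranges for one plant operation mode.
normal_ranges = {
    "steam_flow": (450.0, 520.0),
    "feedwater_level": (45.0, 55.0),
    "condenser_pressure": (10.0, 14.0),
}
readings = {"steam_flow": 540.0, "feedwater_level": 48.0, "condenser_pressure": 9.5}
abnormal = check_signals(readings, normal_ranges)
print(abnormal)
```

    A change of plant operation mode would correspond to swapping in a different `normal_ranges` table.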

    It should be pointed out that because of the automatic control functions of the reactor, when an abnormality occurs, a plant parameter may first deviate from its normal value in one direction and then return to its normal value or even deviate in the opposite direction. In these cases, only the original signal deviation indicates the root causes and the later deviation or normalization does not. Also, the reactor trip, start-up of standby systems and controls taken by operators during accidents will greatly affect the signal values so that the root cause information is submerged. Thus the Signal Checker should only retain the maximum signal deviations in the first stage of abnormalities (before the reactor trip, start-up of standby systems or controls taken by...

