
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON SMART GRID 1

Reliability Evaluation of Phasor Measurement Unit Using Monte Carlo Dynamic Fault Tree Method

Peng Zhang and Ka Wing Chan, Member, IEEE

Abstract—Reliability evaluation of phasor measurement unit (PMU) is a primary key element in the reliability evaluation of wide-area monitoring system (WAMS). In this paper, a comprehensive reliability evaluation method based on Monte Carlo dynamic fault tree (MCDFT) analysis is proposed to conduct the reliability evaluation on PMU. The reliability model of PMU is constructed using dynamic fault tree modeling and analyzed using Monte Carlo simulation to evaluate the reliability indices of PMU. The validity and advantages of the proposed MCDFT reliability evaluation of PMU were verified with simulation and comparison studies. Importance analysis showed that basic components in the GPS receiver and CPU hardware modules have high impacts on the reliability of PMU. Sensitivity and redundancy design analysis are then applied to conclude that the redundancy design of GPS receiver and CPU hardware would be the best measure for improving the reliability of PMU. Finally, a self-adaptive wide-area damping control scheme is taken as an example for the application of PMU reliability.

Index Terms—Dynamic fault tree analysis, Monte Carlo simulation, phasor measurement unit (PMU), reliability evaluation, wide area monitoring system (WAMS).

I. INTRODUCTION

AS MODERN electric power systems have become more and more complex, there is an urgent need for a real-time wide area monitoring system (WAMS) [1]. Concerns about the reliability of WAMS have been one of the main factors contributing to the slow pace of synchro-phasor adoption for real-time applications [2]. It is now urgent and necessary to comprehensively and quantitatively evaluate the reliability of WAMS so as to ensure that its availability and reliability meet the requirements for real-time analysis and control [3]–[5]. As the phasor measurement unit (PMU) is the core component of WAMS [6], the evaluation of its reliability is a primary key in the reliability evaluation of WAMS. For instance, availability assessments were conducted in [7] for assessing the proposed applications of WAMS in power system monitoring and control, but only a typical availability of PMU was assumed.

Availability and other reliability indices of PMU can be derived either from the statistical data collected in operation or from a reliability model. As PMU is still an emerging device, little statistical data is available for reliability evaluation, and a better alternative would be to first construct a reliability model of PMU and then to evaluate the reliability indices of this model using a comprehensive reliability method.

A reliability model of PMU has been constructed in [8], [9] using the traditional Markov method. In the Markov method, the basic components of PMU are first represented as equivalent two-state models and then a Markov state transition diagram is used to depict the logic relationship between PMU and its basic components. However, the logic relationship between PMU and its basic components would be too complex to be depicted accurately in the Markov state transition diagram. Furthermore, in order to simplify the Markov state transition diagram, only a single fault pattern was considered while multiple fault patterns were ignored. Without an accurate reliability model of PMU, the accuracy of the reliability indices would be compromised.

In order to overcome these problems in the Markov method, dynamic fault tree (DFT) modeling [10]–[12], which can accurately depict the complex dynamic logic relationship between PMU and its basic components, is proposed here to construct an accurate reliability model of PMU in which multiple fault patterns are considered. This DFT reliability model can then be analyzed with various methods such as the network reduction method [13], minimal cut sets [14], and the Monte Carlo simulation approach [15]–[17]. Compared with the other two methods, the Monte Carlo simulation approach can evaluate the reliability indices of PMU more easily and accurately.

In this paper, a comprehensive Monte Carlo dynamic fault tree (MCDFT) reliability evaluation method is presented for quantitative evaluation of PMU reliability. The main advantage of the proposed approach is that an accurate reliability model of PMU can be constructed to allow multiple fault patterns in PMU to be considered in the reliability evaluation. Furthermore, more reliability indices of PMU such as importance indices, which cannot be easily obtained otherwise, can be deduced with the use of an MCDFT reliability evaluation method. Through the importance, sensitivity, and redundancy design analysis of basic components in PMU, the most critical components with high impact on the reliability of PMU can be identified and the most effective reliability improvement measures can be deduced.

The rest of this paper is organized as follows. The structure of PMU is detailed in Section II. The reliability model of PMU is constructed based on the DFT modeling method in Section III. In Section IV, the Monte Carlo simulation approach is proposed to analyze the DFT reliability model of PMU. In Section V, a numerical study is conducted to illustrate the proposed MCDFT reliability evaluation method. Section VI presents an example for the application of PMU reliability in power system stability analysis and control.

Manuscript received July 21, 2011; revised November 15, 2011; accepted December 11, 2011. This work was supported by the Hong Kong Polytechnic University under Project G-YG33. The work of P. Zhang was supported by his research studentship awarded by the Hong Kong Polytechnic University. Paper no. TSG-00259-2011.

The authors are with the Department of Electrical Engineering, Hong Kong Polytechnic University, Hong Kong SAR, China (e-mail: 07901413r@polyu.edu.hk).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TSG.2011.2180937

1949-3053/$31.00 © 2012 IEEE

Fig. 1. Functional structure of PMU [8].

II. STRUCTURE OF PHASOR MEASUREMENT UNIT

Fig. 1 shows the general structure of a PMU with seven function modules including various basic components [8]. Through the PT/CT module (M1), large analog three-phase voltage and current signals are transformed into small analog voltage and current signals which will then be passed to the anti-aliasing filtering module (M2) to eliminate any high frequency noise. The A/D converting module (M3), whose synchronous sampling pulse (SSP) is supplied by the GPS module (M4), is used to convert the scaled and filtered analog inputs to digital signals. In the CPU module (M5), the phasors of phase voltages and currents are computed from the digitized signals and stamped with coordinated universal time (UTC) supplied by the GPS module (M4), and then sent to the phasor data concentrator (PDC) and control center by the communication module (M6) through the communication network. The power of PMU is supplied by the power source module (M7).

Though PMUs designed and manufactured by different manufacturers may have different structures [18], their reliability can also be evaluated by applying the proposed reliability evaluation method, which is not limited to the general PMU structure in Fig. 1.

III. RELIABILITY MODELING OF PMU

In this paper, the DFT modeling method is adopted to construct an accurate reliability model of PMU. In this model, the complex dynamic logic relationship between PMU and its basic components is accurately depicted through dynamic logic gates including the sequence enforcing gate (SEQ), priority-AND gate (PAND), functional dependency gate (FDEP), cold spare gate (CSP), warm spare gate (WSP), and hot spare gate (HSP).

PMU is a complex device which can be divided into seven function modules based on their functional features. Each module, which could contain a small or large number of basic components, is first constructed as a sub-DFT reliability model, and then the complete DFT reliability model of PMU is constructed by combining these sub-DFT reliability models. This approach can significantly decrease the PMU modeling complexity and the computational effort required by the subsequent Monte Carlo simulation.

Fig. 2. Sub-DFT reliability model of data collection module.

A. Sub-DFT Reliability Model of Data Collection Module

The data collection module (M0) is used to transform large voltage and current signals to small signals, filter noise and random disturbances, and convert the analog input signals to digital ones. It consists of three function modules: the PT/CT module (M1), anti-aliasing filtering module (M2), and A/D converting module (M3), as shown in Fig. 1.

In each of the three function modules, there are two parallel circuit boards (one is active and the other operates as cold standby). The sub-DFT reliability model of this data collection module is shown in Fig. 2, in which the labels on the basic events denote the failure rate and repair rate of the circuit boards in modules M1, M2, and M3.

In the sub-DFT reliability model of the data collection module, the dynamic logic gate CSP is used to depict the mutual spare relationship between the two circuit boards, which are standby to each other, in modules M1, M2, and M3.
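The behavior of a cold spare gate can be illustrated with a minimal Python sketch. It assumes exponentially distributed board lifetimes and, to stay short, ignores repair of the failed active board while the spare is running; the function name and the failure rate below are hypothetical and are not taken from the paper.

```python
import random

def cold_spare_failure_time(rate_active, rate_spare, rng):
    """Time at which a two-board cold spare (CSP) gate fails, repair ignored."""
    # The active board runs first.
    t_active = rng.expovariate(rate_active)
    # A cold spare does not age while dormant, so its lifetime only
    # starts counting once the active board has failed.
    t_spare = rng.expovariate(rate_spare)
    return t_active + t_spare

rng = random.Random(1)
rate = 1e-3  # hypothetical failure rate per hour for both boards
samples = [cold_spare_failure_time(rate, rate, rng) for _ in range(20000)]
mean_gate_life = sum(samples) / len(samples)
print(mean_gate_life)  # should lie near 2 / rate, the sum of the two mean lifetimes
```

With identical boards the expected gate lifetime is twice that of a single board, which is why the redundant modules end up with very high availability.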

B. Sub-DFT Reliability Model of GPS Module

The construction of the sub-DFT reliability model of the GPS module is the key step in constructing the reliability model of PMU. Thus, it is necessary to properly analyze the operation mechanism of the GPS module shown in Fig. 3. In the GPS module, a crystal oscillator supplies the sampling clock pulses for the A/D converting module and tracks the pulses per second (PPS) supplied by the GPS receiver to correct the error between the PPS and the crystal oscillator frequency. GPS may lose its signal due to many factors such as poor placement of the GPS antenna, obstruction around the GPS antenna, interference, or a jammed signal at the receiver from various sources [19]. When the GPS signal is lost, the crystal oscillator takes over from the GPS receiver to supply the synchronous time signal (STS), and the GPS module enters backup clock operation mode.

Under backup clock operation mode, the time accumulation error in STS cannot be eliminated but would be less than 1 µs/h [20]. According to IEEE Std. C37.118-2005, the total vector error (TVE) needs to be less than 1%, which corresponds to a maximum time error of ±26 µs for a 60 Hz system. After the loss of the GPS signal, the GPS module will enter backup clock operation mode, and the synchronous phasor from PMU can maintain its synchronization accuracy for up to 26 h before the recovery of the GPS signal, depending on the other sources of error contributing to the TVE.

Fig. 3. Operation mechanism of GPS module [8].

Fig. 4. Sub-DFT reliability model of GPS module.

Fig. 4 shows the sub-DFT reliability model of the GPS module, in which the labels on the basic events denote the failure rates of the crystal oscillator, GPS receiver, and backup clock operation mode, the repair rates of the crystal oscillator and GPS receiver, and the probability of unsuccessful operation of the function switcher.

In the sub-DFT reliability model of the GPS module, two FDEP dynamic logic gates are used to depict the functional dependence relationship between crystal oscillator failure and synchronization failure and between function switcher failure and backup clock mode failure, and a PAND gate is used to depict the event sequence relationship between GPS receiver failure and GPS replacement failure.
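The link between the 1% TVE limit, the allowable time error, and the 26 h holdover figure can be checked with a short calculation. The sketch below assumes the TVE is caused by a pure phase (timing) error, and the 1 µs/h drift is an assumed value chosen to be consistent with the 26 h figure quoted in the text; the function names are illustrative only.

```python
import math

def max_time_error_us(tve_limit=0.01, f_nominal_hz=60.0):
    """Time error (microseconds) that by itself produces the given TVE."""
    # A pure phase error delta gives TVE = |exp(j*delta) - 1| = 2*sin(delta/2).
    delta_rad = 2.0 * math.asin(tve_limit / 2.0)
    return delta_rad / (2.0 * math.pi * f_nominal_hz) * 1e6

def holdover_hours(time_budget_us, drift_us_per_hour):
    """Hours until the accumulated clock drift uses up the time budget."""
    return time_budget_us / drift_us_per_hour

budget_us = max_time_error_us()
print(round(budget_us, 1))          # 26.5
print(holdover_hours(26.0, 1.0))    # 26.0
```

Because other error sources also consume part of the TVE budget, the 26 h value is an upper bound rather than a guaranteed holdover time.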

C. Sub-DFT Reliability Model of CPU Module

The CPU module consists of hardware and software. Hardware is the carrier of software to accomplish the designed PMU functions such as phasor calculation, frequency estimation, and so on. The reliability evaluation of software is totally different from that of hardware. Firstly, the failure of hardware is associated with physical failure, whereas the failure of software is mainly associated with the failure of its function because of code errors. Secondly, the reliability of hardware will gradually decrease due to aging, while the reliability of software will increase due to the continual optimization of its code.

Among all the reliability models of software, the logarithmic exponential model is commonly used to analyze the reliability of software [21]. The failure rate of software is given by

$$\lambda = \lambda_0 e^{-\theta \mu} \qquad (1)$$

where $\lambda_0$ is the initial failure rate, $\theta$ is the failure decay parameter, and $\mu$ is the number of failures found.

Fig. 5. Sub-DFT reliability model of CPU module.

Fig. 5 shows the sub-DFT reliability model of the CPU module, in which the labels on the basic events denote the failure rate and repair rate of the software and hardware in the CPU module.

In the sub-DFT reliability model of the CPU module, the basic logic gate OR is used to depict the relationship between software and hardware in the CPU module.
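As a small illustration of the software model, the sketch below assumes the logarithmic exponential model takes its usual form, in which the failure rate decays exponentially with the number of failures found. The parameter values are hypothetical and serve only to show the trend.

```python
import math

def software_failure_rate(lambda0, theta, failures_found):
    """Failure rate after a number of failures have been found and fixed.

    lambda0: initial failure rate; theta: failure decay parameter.
    Assumed form: lambda0 * exp(-theta * failures_found).
    """
    return lambda0 * math.exp(-theta * failures_found)

lambda0, theta = 1e-2, 0.05  # hypothetical values, not from the paper
rates = [software_failure_rate(lambda0, theta, n) for n in (0, 10, 50)]
print(rates)
```

The rate starts at the initial failure rate and falls monotonically as failures are found, which matches the statement that software reliability improves as the code is optimized.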

D. Sub-DFT Reliability Model of Communication Module andPower Source Module

In PMU, there are two parallel communication ports which are spares for each other. There is also a standby power source module to improve the reliability of the power supply.

The sub-DFT reliability models of the communication module and power source module are described in Fig. 6, in which the labels on the basic events denote the failure rate and repair rate of the network ports in the communication module and of the two alternate power source modules.

As in the sub-DFT reliability model of the data collection module, the dynamic logic gate CSP is used to depict the alternate relationship between the two communication ports in the communication module and between the two power source modules.

E. Complete DFT Reliability Model of PMU

It can be seen from Fig. 1 that the failure of any function module will result in PMU failure. From a reliability point of view, all the modules are connected in series. Therefore, the basic logic gate OR is used to combine all the sub-DFT reliability models associated with the seven function modules of PMU to obtain the complete DFT reliability model of PMU, as shown in Fig. 7.


Fig. 6. Sub-DFT reliability models of communication module and power source module.

Once the complete DFT reliability model of PMU is constructed, Monte Carlo simulation would be used to determine the reliability indices of PMU accurately.

IV. MONTE CARLO ANALYSIS OF RELIABILITY MODEL OF PMU

The Monte Carlo simulation approach is one of the most effective methods for DFT analysis. Here it is adopted for evaluating the reliability indices of PMU using the DFT reliability model outlined in Section III. It simulates the actual process and random behavior of PMU. The following are the procedures of the proposed MCDFT reliability analysis for PMU.

Step 1: According to the probability density functions (PDF) for time to failure and repair of all basic components in PMU, the direct sampling method is used to obtain the failure time and repair time of the basic components in PMU as a series of real experiments.

Step 2: Based on the logical relationship between the basic components, state-time diagrams of logic gates are used to depict the duration of operation and failure states of the top event. Through the state-time diagrams of various logic gates in the proposed reliability model, the failure time and repair time of PMU can be calculated.

Step 3: Calculate the convergence factor to determine whether the Monte Carlo simulation result is close enough to the real value or not. If the convergence factor satisfies the predefined threshold, step 4 will proceed; if not, the simulation result will be stored and the simulation will restart from step 1.

Step 4: Calculate the reliability indices of PMU.
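The steps above can be sketched in Python for the simplest case, a series (OR gate) system of repairable components. This is a minimal illustration under stated assumptions, not the authors' implementation: lifetimes and repair times are assumed exponential, the convergence test of step 3 is replaced by a fixed number of cycles, and the rates are hypothetical rather than taken from Table I.

```python
import math
import random

def sample_exponential(rate, rng):
    """Direct (inverse-transform) sampling of an exponential time: T = -ln(U) / rate."""
    # rng.random() lies in [0, 1), so 1 - U lies in (0, 1] and the log is defined.
    return -math.log(1.0 - rng.random()) / rate

def simulate_series_system(components, n_cycles, seed=0):
    """components: list of (failure_rate, repair_rate) pairs, both per hour."""
    rng = random.Random(seed)
    failure_times, repair_times = [], []
    for _ in range(n_cycles):
        # Step 1: sample a time to failure for every component.
        ttf = [sample_exponential(lam, rng) for lam, _ in components]
        # Step 2: OR gate, the top event occurs at the earliest component failure.
        first = min(range(len(components)), key=lambda i: ttf[i])
        failure_times.append(ttf[first])
        # The failed component is repaired; exponential lifetimes are memoryless,
        # so resampling every component in the next cycle is valid here.
        repair_times.append(sample_exponential(components[first][1], rng))
    return failure_times, repair_times

# Hypothetical rates, not taken from Table I.
components = [(1e-3, 0.1), (2e-3, 0.1)]
ttf, ttr = simulate_series_system(components, n_cycles=20000)
mtbf = sum(ttf) / len(ttf)  # Step 4: mean time between failures
mttr = sum(ttr) / len(ttr)  # Step 4: mean time to repair
print(mtbf, mttr)
```

For this toy system the estimates should approach the analytical values 1/(0.001 + 0.002) h and 1/0.1 h, which gives a quick sanity check before dynamic gates such as CSP or PAND are added.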

A. Convergence Assessment

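The Monte Carlo simulation approach is a computer simulation method based on probability theory. In solving for the reliability indices of PMU, the convergence of the Monte Carlo simulation results can be used to assess the precision of the simulation results. Here, the standard error is used to evaluate the convergence. The convergence factor is defined as

$$\frac{\sigma_E}{\bar{T}} < \varepsilon \qquad (2)$$

where

$$\sigma_E = \frac{\sigma}{\sqrt{N}} \qquad (3)$$

$$\sigma^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left( T_i - \bar{T} \right)^2 \qquad (4)$$

$$\bar{T} = \frac{1}{N} \sum_{i=1}^{N} T_i \qquad (5)$$

$N$ is the failure count of PMU in the simulation, $\varepsilon$ is the prespecified fraction, $T_i$ is the failure time of PMU in one simulation, $\bar{T}$ is the mean failure time of PMU, $\sigma^2$ is the variance of the failure time of PMU, and $\sigma_E$ is the standard error of the failure time of PMU.

Fig. 7. Complete DFT reliability model of PMU.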
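A convergence check of this kind can be sketched as follows. The sketch assumes the convergence factor is the standard error of the simulated failure times divided by their mean, which is then compared with the prespecified fraction; the function names are illustrative.

```python
import math
import statistics

def convergence_factor(failure_times):
    """Standard error of the mean failure time, relative to that mean."""
    n = len(failure_times)
    mean = statistics.fmean(failure_times)
    # statistics.stdev uses the unbiased (N - 1) sample variance.
    std_error = statistics.stdev(failure_times) / math.sqrt(n)
    return std_error / mean

def has_converged(failure_times, epsilon):
    """True once the convergence factor drops below the prespecified fraction."""
    return convergence_factor(failure_times) < epsilon

factor = convergence_factor([1.0, 2.0, 3.0, 4.0, 5.0])
print(factor)
```

Since the standard error shrinks as one over the square root of the sample count, halving the convergence factor requires roughly four times as many simulated failures.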

B. Reliability Indices of PMU

Reliability indices of PMU are those parameters which can be used to indicate the degree of reliability of PMU and to analyze the best reliability improvement measures for PMU. In PMU reliability evaluation, the reliability indices mainly include mean time between failures (MTBF), mean time to repair (MTTR), availability, unavailability, and component importance indices including basic cell importance and basic cell module importance.

MTBF is the mean working time between two failures of PMU. MTTR is the mean time for the repair of a faulted PMU. In the Monte Carlo simulation approach, MTBF and MTTR can be expressed as

$$\mathrm{MTBF} = \frac{1}{N} \sum_{i=1}^{N} T_i, \qquad \mathrm{MTTR} = \frac{1}{N} \sum_{i=1}^{N} R_i \qquad (6)$$

where $T_i$ and $R_i$ are the failure time and repair time of PMU, respectively, in the Monte Carlo simulation, and $N$ is the failure count of PMU in the simulation.

According to the MTBF and MTTR of PMU, the availability $A$ and unavailability $U$ of PMU can be described as

$$A = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}}, \qquad U = 1 - A = \frac{\mathrm{MTTR}}{\mathrm{MTBF} + \mathrm{MTTR}} \qquad (7)$$
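These indices follow directly from the simulated failure and repair times. The sketch below assumes MTBF and MTTR are sample means and that availability is the usual ratio MTBF / (MTBF + MTTR); the input numbers are made up for illustration.

```python
def reliability_indices(failure_times, repair_times):
    """Return (MTBF, MTTR, availability, unavailability) from simulated samples."""
    mtbf = sum(failure_times) / len(failure_times)
    mttr = sum(repair_times) / len(repair_times)
    availability = mtbf / (mtbf + mttr)
    return mtbf, mttr, availability, 1.0 - availability

# Made-up samples in hours, for illustration only.
mtbf, mttr, avail, unavail = reliability_indices([100.0, 200.0, 300.0], [1.0, 2.0, 3.0])
print(mtbf, mttr)  # 200.0 2.0
print(avail)
```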

Component importance indices are used to arrange the components in order of increasing or decreasing importance. Among all the existing component importance indices, the failure criticality importance index [22], [23] is best suited for the Monte Carlo simulation approach adopted in this paper, as no extra Monte Carlo simulation cycle would be needed. In the failure criticality importance indices, the basic cell importance $I_i^{BC}$ of component $i$ reflects the probability of a failure of PMU arising from the failure of component $i$, and can be defined as

$$I_i^{BC} = \frac{N_{s,i}}{N_i} \qquad (8)$$

where $N_{s,i}$ is the failure count of PMU caused by the failure of basic component $i$, and $N_i$ is the failure count of component $i$. The basic cell importance of component $i$ mainly depends on the logic location of component $i$ in the DFT-based reliability model of PMU.

Another key index in the failure criticality importance indices is the basic cell module importance $I_i^{BCM}$, which reflects the proportion of the failure count of PMU arising from the failure of component $i$ in all failure counts of PMU. $I_i^{BCM}$ can be expressed as

$$I_i^{BCM} = \frac{N_{s,i}}{N} \qquad (9)$$

where $N$ is the failure count of PMU. Components with high basic cell module importance would be the components which have high impact on the reliability of PMU. The basic cell module importance of component $i$ mainly depends on the failure rate of component $i$.

Fig. 8. Flow chart of reliability evaluation of PMU.
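Both importance indices are simple ratios of counts that the simulation already collects, which is why no extra simulation cycle is needed. The sketch below uses hypothetical component names and counts; they are not results from the paper.

```python
def importance_indices(component_failures, system_failures_caused, system_failures):
    """Compute basic cell and basic cell module importance per component.

    component_failures[c]: number of times component c failed.
    system_failures_caused[c]: number of system failures caused by c.
    system_failures: total number of system failures.
    """
    basic_cell = {
        c: system_failures_caused[c] / component_failures[c]
        for c in component_failures
    }
    basic_cell_module = {
        c: system_failures_caused[c] / system_failures
        for c in component_failures
    }
    return basic_cell, basic_cell_module

# Hypothetical counts, for illustration only.
failures = {"receiver": 1000, "oscillator": 50}
caused = {"receiver": 400, "oscillator": 50}
bc, bcm = importance_indices(failures, caused, system_failures=500)
print(bc)   # {'receiver': 0.4, 'oscillator': 1.0}
print(bcm)  # {'receiver': 0.8, 'oscillator': 0.1}
```

In this made-up example the receiver has the lower basic cell importance, because a backup path masks many of its failures, yet the higher basic cell module importance, because it fails far more often.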

C. Reliability Evaluation Flow Chart of PMU

The flow chart of reliability evaluation of PMU using the proposed MCDFT reliability evaluation method is shown in Fig. 8, whose symbols indicate the count of down states (failures) of PMU in the simulation, the preset convergence criterion of the Monte Carlo simulation, the array of failure times of PMU, and the array of repair times of PMU. Some equations are included to describe the failure and repair time of PMU and its function modules in the Monte Carlo simulation. For example, the equation for the communication module (M6) means that its failure time and repair time are evaluated through the Monte Carlo simulation of the CSP gate which is used to construct the sub-DFT reliability model of the communication module, and the equation for the GPS module (M4) means that its failure and repair time are determined by those of the crystal oscillator.

At the start of the simulation, the simulation parameters are first initialized. Secondly, the Monte Carlo method samples the failure and repair time of each basic component in PMU. Thirdly, the failure and repair times of the seven function modules in PMU are evaluated. Fourthly, one set of failure and repair times of PMU is evaluated and stored in the failure time and repair time arrays; the PMU failure count is then incremented and convergence analysis is conducted using these arrays to determine whether the reliability indices of PMU have reached the preset accuracy or not. If the Monte Carlo simulation result has reached the preset accuracy, the reliability indices of PMU can be calculated; otherwise, another round of Monte Carlo simulation will be conducted until the convergence criterion is satisfied.

TABLE I
RELIABILITY PARAMETERS OF BASIC COMPONENTS IN PMU [24], [25]

V. NUMERICAL STUDY

Table I lists the reliability parameters of the basic components in PMU adopted in this paper. They are either collected or derived from [24], [25].

A. Convergence Assessment

Fig. 9 plots the convergence factor against the Monte Carlo simulation count. It is clear that the convergence factor remains almost unchanged when the simulation count is over 200 000. The corresponding convergence factor of 0.25% was therefore selected as the termination condition for Monte Carlo simulations to ensure the convergence and accuracy of the reliability indices in evaluating the reliability of PMU in the following studies.

B. Reliability Indices of PMU

Table II lists the reliability indices evaluated with the proposed MCDFT reliability evaluation method. It can be seen that:

1) Availability of PMU reaches 0.9982, which means PMU is unavailable about 15.8 h/yr.

2) Among the seven function modules of PMU, the availability of the GPS module (M4) and CPU module (M5) has the highest impact on the availability of PMU, and the availability of the other modules is extremely high due to their redundancy design.

Fig. 9. Convergence factor against Monte Carlo simulation count.

TABLE II
RELIABILITY INDICES OF PMU AND SEVEN FUNCTION MODULES IN PMU

TABLE III
COMPARISON OF RELIABILITY INDICES OF PMU FROM DIFFERENT METHODS

Similar PMUs supplied by different manufacturers do have different reliability. The differences are mainly caused by the different reliability parameters of their basic components, especially the basic components in the GPS module (M4) and CPU module (M5). Here, two cases with different sets of reliability parameters of basic components in PMU are used to show how different the reliability indices could be in different makes of PMUs. In Case 1, the reliability parameters of basic components in PMU are the same as listed in Table I. In Case 2, the failure rate of components in the GPS module (M4) and CPU module (M5) is five times larger while the failure rate of components in the other modules remains unchanged. The validity and advantage of the proposed MCDFT method are further checked in Table III with a comparison of the reliability indices of PMU for the above two cases acquired with the Markov method [8] and the proposed MCDFT method.

In both Case 1 and Case 2, there are differences between the reliability indices of PMU acquired by the two methods. Those differences are mainly due to the fact that a more accurate reliability model of PMU is constructed and multiple fault patterns, rather than only a single fault pattern, were considered in the proposed MCDFT method.

In Case 1, because of the low failure rate of PMU components, there are only a few multiple fault patterns among all the failure patterns of PMU. The difference between the availability of PMU acquired by the two methods is small, i.e., about 0.8 h/yr of unavailable time. This shows that the results obtained with the proposed MCDFT reliability evaluation method are in line with the benchmark Markov method for PMU with a low failure rate.

In Case 2, there are a significant number of multiple fault patterns among all the failure patterns of PMU because of the high failure rate of PMU components. The difference between the availability of PMU acquired by the two methods is, as expected, large, which means the difference between the unavailable times is up to 12.3 h/yr. This indicates that the proposed MCDFT method is superior, as it can depict the complex logic relationship between PMU components in more detail and deal with multiple fault patterns.

TABLE IV
IMPORTANCE ANALYSIS ON FUNCTION MODULES IN PMU

TABLE V
IMPORTANCE ANALYSIS ON BASIC COMPONENTS IN GPS AND CPU MODULES
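The conversion between availability and unavailable hours per year used in this section is a one-line calculation. The sketch below assumes a year of 8760 h; the function names are illustrative.

```python
HOURS_PER_YEAR = 8760.0  # assumes a 365-day year

def unavailable_hours_per_year(availability):
    """Expected downtime per year for a given steady-state availability."""
    return (1.0 - availability) * HOURS_PER_YEAR

def availability_gap(hours_per_year):
    """Availability difference equivalent to a downtime difference in h/yr."""
    return hours_per_year / HOURS_PER_YEAR

downtime = unavailable_hours_per_year(0.9982)
print(round(downtime, 1))        # 15.8
print(availability_gap(0.8))     # availability difference behind the Case 1 gap
print(availability_gap(12.3))    # availability difference behind the Case 2 gap
```

The first result reproduces the 15.8 h/yr quoted for an availability of 0.9982; the other two express the 0.8 h/yr and 12.3 h/yr gaps between the Markov and MCDFT results as availability differences.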

C. Importance Analysis

As a complex system, PMU consists of seven function modules which in turn are composed of numerous basic components whose degrees of reliability have different impacts on the reliability of PMU. In order to find the most critical basic components which would have high impact on the reliability of PMU and its improvement, importance analysis of the basic components in PMU should be conducted.

Importance analysis is first conducted on the seven function modules, and then on the basic components in the function modules with high impact on the reliability of PMU. Table IV and Table V summarize the results.

It can be seen from Table IV that the basic cell importance of all function modules is 1 because of the series relationship between PMU and its seven function modules, and that the GPS (M4) and CPU (M5) modules, which possess high basic cell module importance, have high impact on the reliability of PMU. The same conclusion has also been reached from the analysis of Table II. The GPS and CPU modules are the two function modules whose basic components will be analyzed further.

Table V summarizes the importance analysis conducted on the basic components in the GPS and CPU modules. When the GPS receiver fails, the GPS module will enter backup clock operation mode and PMU will not fail immediately, so the basic cell importance of the GPS receiver is just 42.2057%, which is smaller than the basic cell importance of the other basic components in the GPS and CPU modules. Due to its high failure rate, the GPS receiver has the highest basic cell module importance, which indicates that the GPS receiver has the highest impact on the reliability of PMU.

Once the basic components which have high impact on PMU reliability have been identified, two measures could be applied to improve the reliability of PMU: 1) Increase the reliability of the basic components with high impact on the reliability of PMU. This measure can be achieved through either reducing the failure rate or raising the repair rate of the basic components. 2) Adopt redundancy design of the basic components with high impact on the reliability of PMU. Though both measures can improve the reliability of PMU, their effectiveness and economy are different.

D. Sensitivity Analysis

The effectiveness and economy of the first PMU reliability improvement measure mentioned in the previous section can be assessed with the help of sensitivity analysis. The changes in the availability of PMU with the failure rate or repair rate of the basic components in the GPS and CPU modules are plotted in Fig. 10.

It can be seen that the availability of PMU increases linearly as the failure rate of the basic components in the GPS and CPU modules decreases. However, the availability of PMU will not reach 1 even if the failure rate decreases to zero, which is itself technically infeasible. Also, the availability of PMU saturates once the repair rate is higher than a certain value.
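The two trends described for Fig. 10 can be reproduced qualitatively with a much simpler model than the DFT used in the paper. The sketch below assumes independent two-state repairable components in series, each with steady-state availability mu / (lambda + mu); all rates are hypothetical values chosen for illustration and are not the paper's data.

```python
def component_availability(failure_rate: float, repair_rate: float) -> float:
    """Steady-state availability mu / (lambda + mu) of a two-state repairable component."""
    return repair_rate / (failure_rate + repair_rate)


def series_availability(components) -> float:
    """Availability of independent components in series: product of their availabilities."""
    availability = 1.0
    for failure_rate, repair_rate in components:
        availability *= component_availability(failure_rate, repair_rate)
    return availability


# Hypothetical rates in 1/h (assumptions for illustration only).
OTHER_COMPONENTS = [(1e-5, 0.05), (2e-5, 0.05)]
BASE_FAILURE_RATE = 4e-5
BASE_REPAIR_RATE = 0.05


def sweep_failure_rate(rates):
    """System availability as the failure rate of one component is varied."""
    return [series_availability(OTHER_COMPONENTS + [(lam, BASE_REPAIR_RATE)]) for lam in rates]


def sweep_repair_rate(rates):
    """System availability as the repair rate of one component is varied."""
    return [series_availability(OTHER_COMPONENTS + [(BASE_FAILURE_RATE, mu)]) for mu in rates]


failure_rates = [4e-5, 3e-5, 2e-5, 1e-5, 0.0]
for lam, a in zip(failure_rates, sweep_failure_rate(failure_rates)):
    print(f"failure rate {lam:.0e} 1/h -> availability {a:.6f}")

repair_rates = [0.01, 0.1, 1.0, 10.0]
for mu, a in zip(repair_rates, sweep_repair_rate(repair_rates)):
    print(f"repair rate {mu:g} 1/h -> availability {a:.6f}")
```

In this simplified model, availability is approximately 1 - lambda/mu when lambda is much smaller than mu, which is why the dependence on failure rate looks linear. Even at a zero failure rate the system availability stays below 1 because the other series components still fail, and raising the repair rate yields diminishing gains.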

E. Redundancy Design Analysis

Cold standby components were added to the GPS and CPU modules to test the effectiveness of the second PMU reliability improvement measure. The change in the reliability of PMU with the redundancy design of basic components in the GPS and CPU modules is shown in Table VI. The availability of PMU improves significantly with redundancy added. The biggest improvement comes from redundancy of the GPS receiver, which has the highest basic cell module importance.

Compared with the improvement shown in the sensitivity analysis of the previous section for the first PMU reliability improvement measure, the second measure, with redundancy design on the GPS receiver and CPU hardware, is more practical and cost effective and yields a larger PMU reliability improvement.
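To illustrate why a cold standby unit can raise availability so markedly, the following sketch runs a small sequential Monte Carlo simulation of one component with and without a cold spare. It is not the paper's DFT model: it assumes exponential failure and repair times, a perfect and instantaneous switch-over, a spare that cannot fail while idle, and a single repair crew. The rates, horizon, and function name are hypothetical.

```python
import random


def simulate_unavailability(failure_rate, repair_rate, n_units, horizon, rng):
    """Estimate the fraction of time during which all n_units are failed.

    Only the active unit can fail (cold standby); failed units are repaired
    one at a time. With n_units=1 this is a plain two-state component.
    """
    clock = 0.0
    failed = 0        # number of units currently failed
    downtime = 0.0
    while True:
        fail = failure_rate if failed < n_units else 0.0
        repair = repair_rate if failed > 0 else 0.0
        total = fail + repair
        # Time to the next event, truncated at the simulation horizon.
        end = min(clock + rng.expovariate(total), horizon)
        if failed == n_units:
            downtime += end - clock
        clock = end
        if clock >= horizon:
            break
        # Choose which event occurred, in proportion to its rate.
        if rng.random() < fail / total:
            failed += 1
        else:
            failed -= 1
    return downtime / horizon


# Hypothetical rates in 1/h and horizon in h (assumptions for illustration only).
LAMBDA, MU, HORIZON = 0.01, 0.1, 1_000_000.0
rng = random.Random(2024)
single = simulate_unavailability(LAMBDA, MU, 1, HORIZON, rng)
standby = simulate_unavailability(LAMBDA, MU, 2, HORIZON, rng)
print(f"unavailability without spare:   {single:.4f}")
print(f"unavailability with cold spare: {standby:.4f}")
```

For this simplified chain the analytic unavailabilities are rho / (1 + rho) without a spare and rho^2 / (1 + rho + rho^2) with one, where rho = lambda / mu, so the estimates should lie near 0.091 and 0.009 for the rates assumed here.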


Fig. 10. Sensitivity analysis of the basic components in GPS and CPU modules.

TABLE VI. CHANGE OF RELIABILITY INDICES OF PMU WITH REDUNDANCY DESIGN OF BASIC COMPONENTS IN GPS AND CPU MODULES

VI. APPLICATION OF PMU RELIABILITY IN POWER SYSTEM STABILITY ANALYSIS AND CONTROL

Reliability of PMU must be considered in any WAMS-based stability analysis and control scheme whose reliability needs to be guaranteed prior to live operation. Here, the self-adaptive wide-area damping control (SAWADC) scheme proposed in [26] is taken as an example to demonstrate how the availability of PMU is utilized.

Fig. 11. Hierarchy of self-adaptive wide-area damping control scheme.

Fig. 11 shows the hierarchy of the SAWADC scheme applied to the New England 10-generator test system. Based on their geographical distribution, the generators are grouped into three clusters, and three phasor data concentrators (PDCs) are required to collect all the PMU data. Ten generator rotor speeds are first measured by ten PMUs and then transmitted via the corresponding communication links to the PDCs and to the control center, where the SAWADC scheme is operated.

In order to guarantee the availability of this SAWADC scheme, the ten PMUs, the thirteen communication links from PMUs to PDCs and from PDCs to the control center, the three PDCs, and the control center should all be available. From the perspective of reliability, these devices are in series, and the availability of this damping control scheme can be calculated as the product of their availabilities [7]. The availability of this SAWADC scheme in the IEEE 10-generator test system would be

$$A_{SAWADC} = A_{PMU}^{10} \cdot A_{LINK}^{13} \cdot A_{PDC}^{3} \cdot A_{CC} \qquad (10)$$

where $A_{PDC}$ and $A_{CC}$ are equal to 1, and the ten PMUs and thirteen communication links are each uniform, with the link availability $A_{LINK}$ taken from [9]. With the PMU availability, $A_{PMU}$, obtained in Case 1 shown in Table III, the availability of this SAWADC scheme, $A_{SAWADC}$, is 0.9695.
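The series-product calculation behind (10) can be sketched in a few lines. The function name and the example availabilities below are assumptions for illustration; they are not the values used in the paper, so the printed number is not expected to reproduce 0.9695.

```python
def scheme_availability(a_pmu, a_link, a_pdc=1.0, a_cc=1.0,
                        n_pmu=10, n_link=13, n_pdc=3):
    """Availability of a series arrangement of uniform PMUs, links, PDCs and a control center."""
    return (a_pmu ** n_pmu) * (a_link ** n_link) * (a_pdc ** n_pdc) * a_cc


# Placeholder availabilities (hypothetical, for illustration only).
example = scheme_availability(a_pmu=0.998, a_link=0.999)
print(f"scheme availability with placeholder inputs: {example:.4f}")
```

Because every device is in series, each added PMU or link multiplies in another factor below 1, so the scheme availability is always lower than that of its weakest device.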

VII. CONCLUSION

PMU is the core of WAMS, which is one of the key system components in the smart grid. Due to the large number of PMUs in modern WAMS, the reliability of PMU plays an important role in the reliability of WAMS. In this paper, a comprehensive MCDFT reliability evaluation method is proposed to evaluate the reliability of PMU. The DFT modeling method is used to construct an accurate reliability model of PMU, in which the complex dynamic relationship between the basic components and PMU can be accurately depicted with multiple fault patterns considered. The Monte Carlo simulation approach is used to analyze this DFT reliability model and to evaluate various reliability indices, including importance indices of basic components, which are very useful for the reliability improvement of PMU and cannot be easily evaluated by other reliability evaluation methods. In the numerical study, comparison of the reliability indices of PMU in two cases acquired by two reliability evaluation methods shows that the proposed MCDFT reliability evaluation method can produce more accurate reliability indices of PMU due to the more accurate reliability model, especially when the failure rates of PMU components are very high. Importance indices of components in PMU are analyzed to show that the GPS receiver and CPU hardware have high impact on the reliability of PMU. Sensitivity and redundancy design analyses were conducted, and it was found that the redundancy design of the GPS receiver and CPU hardware would be the better measure for improving the reliability of PMU. A simple example based on a self-adaptive wide-area damping control scheme has also been given to show the application of the proposed PMU reliability evaluation method.

REFERENCES

[1] J. Bertsch et al., “Wide-area protection and power system utilization,” Proc. IEEE, vol. 93, no. 5, pp. 997–1003, May 2005.

[2] WECC Synchro-Phasor Project Whitepaper, Version 3.0, 2009.

[3] Y. Wang, W. Y. Li, and J. P. Lu, “Reliability analysis of wide-area measurement system,” IEEE Trans. Power Del., vol. 25, no. 3, pp. 1483–1491, Jul. 2010.

[4] Z. H. Dai, Z. P. Wang, and Y. J. Jiao, “Reliability evaluation of the communication network in wide-area protection,” IEEE Trans. Power Del., unpublished.

[5] S. R. Samantaray, I. Kamwa, and G. Joos, “Ensemble decision trees for phasor measurement unit-based wide-area security assessment in the operations time frame,” IET Gener. Transm. Distrib., vol. 4, no. 12, pp. 1334–1348, Dec. 2010.

[6] A. G. Phadke and R. M. De Moraes, “The wide-world of wide-area measurement,” IEEE Power Energy Mag., vol. 6, no. 5, pp. 52–65, Aug. 2008.

[7] M. Zima et al., “Design aspects for wide-area monitoring and control systems,” Proc. IEEE, vol. 93, no. 5, pp. 980–996, May 2005.

[8] Y. Wang, W. Y. Li, and J. P. Lu, “Reliability analysis of phasor measurement unit using hierarchical Markov modeling,” Electr. Power Compon. Syst., vol. 37, pp. 517–532, 2009.

[9] F. Aminifar et al., “Reliability modeling of PMUs using fuzzy sets,” IEEE Trans. Power Del., vol. 25, no. 4, pp. 2384–2391, Oct. 2010.

[10] G. Merle et al., “Probabilistic algebraic analysis of fault trees with priority dynamic gates and repeated events,” IEEE Trans. Rel., vol. 59, no. 1, pp. 250–261, Mar. 2010.

[11] Y. Ren and J. B. Dugan, “Design of reliable systems using static & dynamic fault trees,” IEEE Trans. Rel., vol. 47, no. 3, pp. 234–244, Sep. 1998.

[12] J. Yuan et al., “Research on reliability modeling of complex system based on dynamic fault tree,” in Proc. Technol. Innov. Conf., Oct. 2009, pp. 1–5.

[13] J. A. Buzacott, “Network approaches to finding the reliability of repairable systems,” IEEE Trans. Rel., vol. 19, no. 4, pp. 140–146, Aug. 2009.

[14] R. N. Allan, R. Billinton, and M. F. De Oliveira, “An efficient algorithm for deducing the minimal cuts and reliability indices of a general network configuration,” IEEE Trans. Rel., vol. 25, no. 4, pp. 226–233, Aug. 2009.

[15] L. Goel, P. A. Viswanath, and P. Wang, “Monte Carlo simulation based reliability evaluation in a multi-bilateral contracts market,” IEE Proc. Gener., Transm., Distrib., vol. 151, no. 6, pp. 728–734, Dec. 2004.

[16] D. Lieber, A. Nemirovskii, and R. Y. Rubinstein, “A fast Monte Carlo method for evaluating reliability indexes,” IEEE Trans. Rel., vol. 48, no. 3, pp. 256–261, Aug. 2002.

[17] R. Billinton and L. Gan, “Monte Carlo simulation model for multiarea generation system reliability studies,” IEE Proc. Gener., Transm., Distrib., vol. 140, no. 6, pp. 532–538, Aug. 2002.

[18] A. G. Phadke and J. S. Thorp, Synchronized Phasor Measurements and Their Applications. New York: Springer, 2008.

[19] D. Itagaki, K. Ohashi, I. Shuto, and H. Ito, “Field experience and assessment of GPS signal receiving and distribution system for synchronizing power system protection, control and monitoring,” in Proc. IEEE Power India Conf., Jun. 2006.

[20] S. Zhong, J. W. Fu, and X. R. Wang, “Development of high quality backup clock for synchronized phasor measurement unit,” Autom. Electr. Power Syst., vol. 30, no. 1, pp. 68–72, Jan. 2006.

[21] J. D. Musa, A. Iannino, and K. Okumoto, Software Reliability: Measurement, Prediction, Application. New York: McGraw-Hill, 1987.

[22] W. Wang, J. Loman, and P. Vassiliou, “Reliability importance of components in a complex system,” in Proc. Annu. Symp. Rel. Maintainability, Aug. 2004, pp. 6–11.

[23] P. Hilber and L. Bertling, “A method for extracting reliability importance indices from reliability simulations of electrical networks,” in Proc. 15th Power Syst. Comput. Conf. (PSCC), Liege, Belgium, Aug. 2005.

[24] M. S. Ding, G. Wang, and X. H. Li, “Reliability analysis of digital relay,” in Proc. 8th IEE Int. Conf. Developments Power Syst. Protection, Apr. 2004, vol. 1, pp. 268–271.

[25] A. Antonopoulos, J. J. O’Reilly, and P. Lane, “A framework for the availability assessment of SDH transport networks,” in Proc. 2nd IEEE Symp. Comput. Commun., Jul. 1997, pp. 666–670.

[26] P. Zhang et al., “Self-adaptive wide-area damping control based on SSI and WAMS,” in Proc. Int. Conf. Electr. Eng., Jul. 2010, pp. 11–14.

Peng Zhang received the B.Eng. and M.Eng. degrees in electrical engineering from Shandong University, Jinan, China, in 2004 and 2007, respectively. He is currently working toward the Ph.D. degree at the Department of Electrical Engineering in The Hong Kong Polytechnic University.
His major research interests include wide-area monitoring system and its application in power system stability analysis and control.

Ka Wing Chan (M’98) received the B.Sc. (Hons) and Ph.D. degrees in electronic and electrical engineering from the University of Bath, Bath, U.K., in 1988 and 1992, respectively.
He is currently an Assistant Professor in the Department of Electrical Engineering of the Hong Kong Polytechnic University. His general research interests include power system stability, analysis, control, security and optimization, real-time simulation of power system transients, distributed and parallel processing, and artificial intelligence techniques.