13
This document is downloaded from DR‑NTU (https://dr.ntu.edu.sg) Nanyang Technological University, Singapore. Imperfect predictive maintenance model for multi‑state systems with multiple failure modes and element failure dependency Tan, Cher Ming; Raghavan, Nagarajan 2010 Tan, C. M., & Raghavan, N. (2010). Imperfect predictive maintenance model for multi‑state systems with multiple failure modes and element failure dependency. Prognostics and System Health Management Conference (pp. 1‑12) Macau. https://hdl.handle.net/10356/93527 https://doi.org/10.1109/PHM.2010.5414594 © 2010 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder. http://www.ieee.org/portal/site This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder. Downloaded on 31 May 2021 21:09:31 SGT

Imperfect predictive maintenance model for multi‑state systems with multiple failure ... · 2020. 3. 7. · reliability dependent elements and multiple failure modes. The system

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

  • This document is downloaded from DR‑NTU (https://dr.ntu.edu.sg)Nanyang Technological University, Singapore.

    Imperfect predictive maintenance model formulti‑state systems with multiple failure modesand element failure dependency

    Tan, Cher Ming; Raghavan, Nagarajan

    2010

    Tan, C. M., & Raghavan, N. (2010). Imperfect predictive maintenance model for multi‑statesystems with multiple failure modes and element failure dependency. Prognostics andSystem Health Management Conference (pp. 1‑12) Macau.

    https://hdl.handle.net/10356/93527

    https://doi.org/10.1109/PHM.2010.5414594

    © 2010 IEEE. Personal use of this material is permitted. However, permission toreprint/republish this material for advertising or promotional purposes or for creating newcollective works for resale or redistribution to servers or lists, or to reuse any copyrightedcomponent of this work in other works must be obtained from the IEEE. This material ispresented to ensure timely dissemination of scholarly and technical work. Copyright andall rights therein are retained by authors or by other copyright holders. All persons copyingthis information are expected to adhere to the terms and constraints invoked by eachauthor's copyright. In most cases, these works may not be reposted without the explicitpermission of the copyright holder. http://www.ieee.org/portal/site This material ispresented to ensure timely dissemination of scholarly and technical work. Copyright andall rights therein are retained by authors or by other copyright holders. All persons copyingthis information are expected to adhere to the terms and constraints invoked by eachauthor's copyright. In most cases, these works may not be reposted without the explicitpermission of the copyright holder.

    Downloaded on 31 May 2021 21:09:31 SGT

  • Imperfect Predictive Maintenance Model for Multi-State Systems withMultiple Failure Modes and Element Failure Dependency

    Cher Ming Tan I,2, *and Nagarajan Raghavarr''Division of Circuits & Systems, School ofElectrical & Electronic Engineering, Nanyang Technological University.

    2Singapore Institute of Manufacturing Technology, A*STAR, Singapore - 638075.3Division of Microelectronics, School ofElectrical & Electronic Engineering, Nanyang Technological University.

    I,3BlockS2, Nanyang Avenue, Singapore - 639798.*Email : [email protected], Ph : (+65) 6790 4567, Fax: (+65) 6792 0415

    Abstract - The objective of this study is to develop a practicalstatistical model for imperfect predictive maintenancebased scheduling of multi-state systems (MSS) withreliability dependent elements and multiple failure modes.The system is modeled using a Markov state diagram andreliability analysis is performed using the UniversalGenerating Function (UGF) technique. The model issimulated for a case study of a power generation -transmission system. The various factors influencing thepredictive maintenance (PdM) policy such as maintenancequality and user threshold demand are examined and theimpact of the variation of these factors on systemperformance is quantitatively studied. The model is foundto be useful in determining downtime schedules andestimating times to replacement of an MSS under the PdMpolicy. The maintenance schedules are devised based on a"system-perspective" where failure times are estimated byanalyzing the overall performance distribution of thesystem. Simulation results of the model reveal that a slightimprovement in the "maintenance quality" can postponethe system replacement time by manifold. The consistencyin the quality of maintenance work with minimal varianceis also identified as a very important factor that enhancesthe system's future operational and downtime eventpredictability. Moreover, the studies reveal that in order toreduce the frequency of maintenance actions, it isnecessary to lower the minimum user expectations from thesystem, ensuring at the same time that the system stillperforms its intended function effectively. The modelproposed can be utilized to implement a PdM program inthe industry with a few modifications to suit the individualindustry's needs.

    I. INTRODUCTION

    Maintenance has evolved from the age-old ad hoc corrective(or reactive) maintenance [1] (CM) to preventive maintenance(PM) [2] and then to the presently popular predictivemaintenance (PdM) [3, 4]. However, it is well recognized thatboth the CM and PM are ineffective. In the case of CM, the"completely failed" system is highly degraded, makingmaintenance very difficult, time-consuming and expensive.Also, CM is associated with large and unpredictabledowntimes resulting in low mean availability, increased delays,larger inventory storage requirements and increased forgoneproduction losses. As for PM, the fixed downtime intervalsimply more-than-necessary repair frequency during the initialperiods of the system operation that could increase theprobability of maintenance-induced failures. On the otherhand, as the system ages and enters into its wear-out period,PM results in less-than-necessary repair frequency, therebyincreasing the probability of unanticipated catastrophic failuresand making PM similar to CM.

    In PdM, which is also referred to as condition-based PM [5],the maintenance schedule and frequency match the age orhealth of the system at all times, making the schedule nearlyoptimum, prolonging the time to replacement (TTR) as aconsequence. The expected times to future failure of a systemare estimated during each operational period based on thevariation pattern of its physical properties (conditionmonitoring) that are indicative of its state of degradation usingimplanted sensors, and the downtime schedule for eachoperation cycle is determined based on the estimated futurefailure times. Past research studies show that the averagesystem reliability (and yield), availability and mean systemperformance are the highest for PdM and the incurredmaintenance operation costs are the lowest [6]. The spare partrequirements and delay times are also reduced due to reliablepredictions of future downtime events.

    However, there are currently two main obstacles to thepractical implementation of the PdM policy. Firstly, there is nosimple concrete statistical model that PdM can be based upon.The past models developed are theoretical in their approachwith idealistic assumptions and fitting parameters, renderingthem unfit for practical real-world implementation. Forexample, in [7], it was proposed that the system being repairedcould be restored to either the "as-good-as-new" condition orthe "as-bad-as-old" condition with complementaryprobabilities, failing to account for the possibility that thesystem's restoration could be somewhere in between these twopossible extreme cases. Although the virtual age modelproposed by Kijima et. al. [8] to account for the imperfectrestoration helped overcome the above-mentioned problem, thedetermination of the effective age parameter 'a' in theproposed model is not given, making its implementationvague.

    Secondly, the implementation of PdM requires advancedmonitoring technologies, real-time data acquisition systemswith sophisticated data storage and speed requirements andsignal processing and filtering techniques [9], making theimplementation of PdM complex and expensive. However,with the advances in sensor technologies today, this difficultyis gradually overcome [10].

    In this work, we will focus on the first obstacle which is todevelop a comprehensive and practical statistical model forPdM. Imperfect maintenance will be considered in this workfor practical applications. This imperfect maintenance is a termfrequently used to refer to maintenance activities in which thefuture reliability and degradation trend of the system dependson the skill and quality of the current and previous repairworks performed. In other words, imperfect maintenanceaccounts for the impact of maintenance quality on the futurereliability of repairable systems. In reality, maintenance

    978-1-4244-4758-9/10/$26.00 © 2010 IEEE MU3006 2010 Prognostics& SystemHealthManagement Conference(PHM2010 Macau)

    Authorized licensed use limited to: Nanyang Technological University. Downloaded on March 02,2010 at 03:30:58 EST from IEEE Xplore. Restrictions apply.

  • strategies for most systems can be categorized under imperfectmaintenance because repair / replacement is typicallyperformed only for some of the components in the system;while there are other components which are degraded but notto the extent of needing a repair / replacement. Therefore, froma "system perspective", repair does not rejuvenate a system toits original zero degradation state, calling for the need to useimperfect maintenance models.

    The structure of this paper is as follows. Section II gives abrief review on the various existing models for imperfectmaintenance. Section III introduces the methodology for themulti-state system PdM modeling and the description of thesystem case study. Section IV describes the various resultsfrom the model simulation and fmally, a short summary of thework done and results achieved is presented in Section V.

    II. IMPERFECT MAINTENANCE

    Various models have been proposed for imperfectmaintenance in the past from different perspectives asreviewed in complete detail in [11]. Basically, there are fourclasses ofmodels developed so far.

    The first class of models was based on a probabilisticapproach [12] - [14] where it was assumed that the systemundergoes "perfect renewal" to "as-good-as-new" conditionwith a constant probability of p and "minimal repair" to "as-bad-as-old" condition with a probability of (J-p). Furtherenhancement to this probabilistic approach was to consider theprobabilities as time-varying functions, p(t) and [J-p(t)}, toaccount for the change in these values with the aging system'sdegradation [15, 16]. Makis and Jardine [17] further accountfor the probability that the repair is unsuccessful and causes acatastrophic complete system failure and the p(t) function wasmodified to p(n ,t) to describe the probability accounting forthe number of previous failures, n, undergone by the systemprior to the current one.

    The second class of models was based on the improvementfactor method where the system was analyzed by looking atthe failure rate. Certain models were proposed to reflect thereduction in failure rate after repair [18, 19]. The degree ofimprovement in the failure rate was called improvement factorand "failure rate" was used as the threshold reliability index.

    The third class of models was based on the age of thesystem. The most popular model in this class is known as thevirtual age model proposed by Kijima et al, [8, 20]. The virtualage of the system after the nth repair (Vn) is expressed as:V

    n=V

    n-

    l+a- X; where Xn is nth failure time, Vn-J is the virtual

    age after (n-I)" repair and "a" is the virtual age parameter (0 :Sa s 1). However, the method of estimating the parameter "a" isnot mentioned in the literatures. Another age-based modelcalled the proportional age setback model was proposed in[21] which is very similar to the virtual age model, except thatthe effects of equipment working conditions and surveillanceeffectiveness on imperfect maintenance andcorresponding age reduction are accounted for in addition tothe maintenance work quality.

    The fourth class of models was based on the systemdegradation where the system is considered to suffer randomshocks at variable intervals of time causing it to undergoprogressive increments of damage [22, 23]. When a threshold

    cumulative damage level is reached, the system is interpretedto have failed. The effect of the imperfect maintenance actionsis described by the degree of reduction of the cumulativesystem damage after repair as compared to that before repair.Wang et al. [24] - [27] treated imperfect maintenance bymodeling the decrease in system lifetime with the increase inthe number of repairs. They also modeled the time betweenmaintenance actions using a quasi-renewal process [26].

    There are other models to account for imperfectmaintenance, and readers may refer to references [14] and [28]for detailed information. However, all the above-mentionedmodels are focused on a binary system where the systemoperates in only two discrete performance states, viz."functional" or "non-functional". Also, most of the modelsassume a "single unit" system with negligible maintenanceduration and the impact of maintenance quality on the systemreliability is not included. For practical applications, multi-state systems should be considered [29, 30], and the study ofthe impact of maintenance quality is especially useful as itdirectly impacts the time to replacement of the system. It iswith the motivation to enable PdM model to be applicable in apractical environment that this work is produced.

    In this work, we analyze the system from a multi-stateperspective for a generic complex n-component system withany system structure (series, parallel, k-out-of-n structuresetc... ). Imperfect PdM is modeled by considering the effect ofthe mean and variance in the quality of maintenance workseparately. The impact of the skill of maintenance work andthe spare part product quality on the future reliability of asystem is modeled by a parameter called the RestorationFactor (RF) which describes the percentage recovery in thesystem performance for the new operation cycle, aftermaintenance, relative to the previous operation cycle of thesystem. Being a quality index, RF is assumed to have a normaldistribution with its mean (PRF) and standard deviation (aRF)giving a clear indication of the skill and consistency ofmaintenance work performed respectively. The modeldeveloped here is widely applicable to most industrial systemsand its application to a dependent multi-state system (MSS)with multiple failure modes is illustrated in this work.

    Although an imperfect maintenance policy for multi-statesystems (MSS) has been proposed earlier in [31] based on theproportional age setback model [21], the maintenance durationis assumed to be negligible, and the user threshold demand isassumed to be constant. Furthermore, no method for thedetermination of the time to replacement (TTR) of the systemis discussed. For practical applications, the model proposed inthis work considers the fmite maintenance duration and itsvariability, and the system replacement time is determinedusing a simple approach.

    The novelty of the work lies in characterizing the systemperformance variation in dependent and multiple failure modeMSS for different maintenance work quality standardsrepresented by the restoration factor (RF) distribution anddifferent user threshold demands (W). In other words, thesystem's performance capability is being examined from theuser's perspective. The system is modeled using a MarkovState Diagram in this work, which is found to be a good choicefor modeling complex systems [32]. Readers may refer to [33]for the earlier study of PdM applied to the simplest case of an

    978-1-4244-4758-9/10/$26.00 e 2010 IEEE MU3006 2010 Prognostics& SystemHealthManagement Conference(PHM2010 Macau)

    Authorized licensed use limited to: Nanyang Technological University. Downloaded on March 02,2010 at 03:30:58 EST from IEEE Xplore. Restrictions apply.

  • Fig. 2. Markov State Diagramfor the 3-elementpower generatorsystem.

    For example, if n observations on n different identicalsystems are made from their historical maintenance records

    .........•

    ELEMENT 3

    )..13(3)

    r······· 'ELEMENT 2

    .......•

    ;......•

    The system in this study is modeled such that Element 2 isdependent on Element 1 and Element 3 is influenced by twofailure modes (FM). The symbol p in Table I denotes theconditional probability of occurrence of FM B given that FMA has occurred. In other words, FM A is assumed to be thefailure mode which will defmitely occur while FM B istriggered depending on FM A.

    Each performance rate in Table I corresponds to a discreteamount of power processing capacity that every element of thegenerator system can process. For example, Element 3 has 3discrete states of performance. It could be processing power atits maximum capacity (performance) of g31 = 10 MW orintermediate capacity of g32 = 5 MW or at g33 = 0 MWimplying complete non-functional failure. Elements 1, 2 and 3each have 3 discrete states of performance, thus making up atotal of 3 x 3 x 3 = 27 discrete system performance rates.

    Under some reasonably general conditions, the failures of acomplex system can be shown to follow the exponentialdistribution even though the individual components in thesystem may follow other failure distributions [32]. Thus, if thepower generator system in Fig 2 is assumed to be complex, itwill follow the exponential failure distribution pattern, therebyjustifying the use of a Markov State Diagram, in which thedegradation rates are all time-independent constants.

    During the operation of the system, the elements transit fromone state of performance to another in a period of time. Usingthe concept of hazard rate, the inter-state transitions can bedescribed using the degradation rate, which is expressed as thenumber of such state transitions per year (unit time). Thehypothetical values assumed for the degradation rates of eachelement are shown in Table I. The values of these parametersin Table I for various state transitions may be extracted frompast maintenance data records and condition monitoring dataof previously operated similar systems using the standardMaximum Likelihood Estimate (MLE) procedure for theexponential distribution [34].

    ELEMENT 1

    III. METHODOLOGY & SYSTEM CASE STUDY

    r---------------------------------------------,, ,,: ~~~!....... :. . . ,, . . ,: ~. ~ (3)~:, . . ,, : ':........... ~--+J. ..... ~ · ,: : : :: : ~(2): .:, : :'. . .: ....•,, ,~---------------------------------------------~

    independent MSS with no element failure dependency andsingle failure mode.

    In this work, the power generator system is analyzed usingthe Markov process [29, 30]. Each element has its own markovstate diagram with different states of performance andcorresponding degradation rates (conventionally known as"failure rate") as shown in Fig 2. The parameters gij representthe t" discrete degraded state of performance of the lh elementin the system. The symbol liJ(m) is the degradation rate ofelement m where its performance degrades from the lh state tothe jth state. The operational lifespan of the system may beclassified into different operation cycles. The J(h operationcycle is defined as the operating time interval between the(k_J)th and J(h maintenance actions. Referring to the MarkovState diagram shown in Fig 2, the numerical values for thevarious states of performance of each element and thedegradation rate (l) are given in Table I.

    Fig. 1. Power generatorsystemtopologyexaminedin this case study.

    The system examined in this work is a power generatorsystem which is a flow transmission MSS. The topology of thesystem consists of 3 generator elements as shown in Fig 1.Elements 1 and 2 are in parallel with each other and they arecollectively in series with element 3. The performance(degradation) index of the system is the power processingcapacity of each generator element expressed in the unit ofmegawatts (MW). In this study, element 2 is considered to bedependent on element 1 and element 3 degrades under theinfluence of two failure modes which may be dependent oneach other.

    There are two types of multi-state systems (MSS). One is thejlow transmission system [29] in which the performance(degradation) of the system is characterized and measured interms of its productivity or capacity. Typical examples are (a)hydraulic systems where performance is measured in terms ofvolume and massjlow rates (tons/min), (b) power systems withits power generating capacity and (c) continuous productionsystems with its rate ofproduction.

    The other type is the task processing system [30] in whichperformance is described in terms of processing speed orresponse time. Typical examples include server, control andother software systems where the performance index of interestis the speed of processing data and instructions, expressed inmega bits per second (mbps). The analysis of these two typesof MSS mentioned above are different owing to the differentproperties being examined in them and the different nature oftheir basic functions.

    978-1-4244-4758-9110/$26.00 e 2010 IEEE MU3006 2010 Prognostics& SystemHealth ManagementConference(PHM2010 Macau)

    Authorized licensed use limited to: Nanyang Technological University. Downloaded on March 02,2010 at 03:30:58 EST from IEEE Xplore. Restrictions apply.

  • and the duration (ti) for transition between consecutive statesof performance gi,j and gi,j+1 is measured, then the degradationrate ~, j+1 could be estimated using the exponential MLE as in(1) where t, is the time to degradation (TTD) of the lhobservation from statej to statej+1.

    As mentioned in Table I, the values of Al,2(3) and Al,3(3) areassumed to depend on the failure modes A and B and theconditional probability, p. If FM A causes FM B, thenelement 3 is under the influence of two failure modes and it ismost likely to undergo a catastrophic breakdown from state g31to state g33. The probability for such a catastrophic failure isthe conditional probability (P) of FM B occurring given thatFM A is present. In contrast, when FM B is not triggered byFM A with a corresponding probability of (1 - p), thenelement 3 degrades gradually from state g31 to g32 andeventually to g33, since it is under the influence of only onefailure mode. Based on the above proposition, the expressionsfor Al,}3) and Al,3(3) in terms of AA' AB and p may be expressedas in (2) and (3). The terms AA and AB refer to the degradationrates of element 3 with respect to failure modes A and Brespectively.

    Aj,j+l =

    AI,/ 3) = (1- p). AA

    AI ,3 (3) = P'(AA +AB )

    (1)

    (2)

    (3)

    change according to the particular system. Therefore, equations(2) and (3) are not standard expressions. Rather, they aremodels used to describe the nature of failure of the generatorsystem.

    There are two important factors which affect the predictivemaintenance (PdM) policy. They are the restoration factor (RF)and the user threshold demand (W). Since the impact of thesetwo factors on PdM will be investigated in this work, let usnow discuss these two factors in detail.

    A. PdM Model Parameters

    1) Restoration Factor (RF)

    To study the impact of the quality of a maintenance work onsystem performance quantitatively under the PdM policy, anew term called the Restoration Factor (RF) is introduced. Itrepresents the percentage recovery of the system's meanperformance in the Ith operation cycle (after the (k-1)thmaintenance action) relative to its mean performance duringthe previous (k-L)" operation cycle. The Ith maintenanceaction (cycle) refers to the downtime duration between thesuccessive Ith and (k+l)th operation cycles. The better themaintaining quality is, the higher the RF.

    Based on the defmition ofRF, the system mean performanceduring the Ith operation cycle, denoted by Gk(t) , may beexpressed in terms of the corresponding mean performance inthe (k-I )" operation cycle, Gk-1(t) using the restoration factor ofthe preceding (k_1)th maintenance cycle, RF[k-1] as follows:

    (4)

    TABLEINUMERICAL VALUES OF PERFORMANCE STATES AND TRANSITION

    DEGRADATION RATES IN THE MARKOV MODEL

    It is to be noted that the model above is based on theobserved system dynamics of the generator being examined.Different systems have different natures and modes of failureand in each case, the model in equations (2) and (3) will

    Element(#)

    2

    3

    PerformanceState (MW)

    gll = 6.0gI2 = 3.5g13 = 0.0

    g2I = 4.0g22 = 2.5g23 = 0.0

    g3I = 10.0g32 = 5.0g33 = 0.0

    DegradationRates (yr")

    Al 2(1) = 0.020A2:3(1) = 0.030

    Al 2(2) = 0.035~:3(2) = 0.045

    AA = 0.050AB = 0.080A2,3(3) = 0.040

    Remarks

    Element 1 has 3distinct states ofperformance. It hasonly one failure modeand it is independentof other elements.

    Element 2performancedistribution dependenton Element 1performance state.

    Element 3 has twofailure modes {A, B}where failure mode Aalways exist and theconditional probabilityof occurrence of FM Bgiven that FM A existsis p. The values ofA {3) and A (3) depend1,2 1,3on AA' AB andp.

    An RF value of 100% represents the system beingmaintained to an as-good-as-new (renewal process) condition.However, this is practically unachievable unless the system isreplaced (expensive) instead of being maintained (repaired)upon failure.

    The RF value is not a constant throughout the system lifecycle. It is a random variable that can vary during everymaintenance action because it is influenced by various factorssuch as the concentration and attentiveness of the maintenancepersonnel (state of mind), the ability to accurately locate thepoint of defect due to the complexity of the failure, availabilityof appropriate spare parts and maintenance tools etc. Also, thesame system might be maintained by different personnelduring different downtimes having different capabilities inperforming the same maintenance work. Moreover, at thesystem-level, not all the components undergo a repair /replacement during the maintenance task. This variability inthe quality of maintenance work and confmement of repair tocertain critical components of the system renders RF to beconsidered as a random variable.

    The RF parameter is modeled as a random variablefollowing a Normal Distribution as given by (5) in this work.As RF is always positive, it should be modeled by a statisticaldistribution with a positive valued random variable. However,the ensemble of the many RF values over a period of operationtime and a large number of equipment maintenance actionsjustify its representation by a normal distribution with largemean value and moderately low standard deviation, such thatthe probability of having negative value is negligible, as thetail of the normal p.d.f narrows down around RF = 0, making

    978-1-4244-4758-9110/$26.00 e 2010 IEEE MU3006 2010 Prognostics& SystemHealthManagement Conference(PHM2010 Macau)

    Authorized licensed use limited to: Nanyang Technological University. Downloaded on March 02,2010 at 03:30:58 EST from IEEE Xplore. Restrictions apply.

  • the probability, Pr(RF < 0) very small (rare event). In (5), PRFis the mean value of the RF distribution and aRF is thecorresponding standard deviation.

    As most industries are adopting the conventional CM policy,past maintenance history of downtime events for previouslyfailed systems could be used to estimate RF values and thenuse them to predict the parameters of the RF distribution. TheRF estimate for a particular Jth maintenance action, denoted byRFk, can be calculated using (6) where Tk+1 and T, representthe durations of the (k+l)th and Jth operation cyclesrespectively. Quantities ti;1 and tk refer to the absolute timeinstance of the (k+1)th and Jth failures respectively, relative tothe initial time of system operation (t =0); ik and ik-l are thedowntime durations for the Jth and (k-I)" repair actionsrespectively. Similar values of RF can be estimated for allpossible k values and for many such identical systems. Theobtained RF data set could then be used to compute PRF and aRFparameters for the RF distribution using the standard statisticalmean and variance expressions. Based on the estimated valuesofPRF and aRF from past maintenance records, the performancetrends for future operation cycles of new similar systems canbe modeled based on Eq. (4) where RF[k-l] is a randomnumber representing the expected maintenance quality for the(k-I)" maintenance action, which is generated from the normaldistribution with parameters PRF and aRF using the randomnumber generator function.

    (7)

    2) User Threshold Demand (W)

    In this study, the system's performance is analyzed from theuser's perspective. The user of the system sets a minimumexpectation from the system, called the user threshold demand,represented as W. During each operation cycle, as the system'smean performance G(t) drops below the user set demand of W,the system is "interpreted" to have ''failed'' from the user'sperspective even though the system might not have physicallyfailed in reality, i.e, the user's "dissatisfaction" is considered as"system failure" in this case. The higher the user thresholddemand (W), the sooner will be the time to failure (TTF) andreplacement (TTR) of the system as seen in Fig 3. The time tofailure for a system during the r operation cycle is estimatedby solving (7).

    B. Markov Chain Diagram

    Having described the parameters RF and W, we nowdevelop and analyze the Markov model for the system. Fromthe Markov State diagram for each element shown in Fig 2, theset of simultaneous differential equations describing the stateprobability expressions may be extracted. They are listed inequations (8) - (16) below. Every state is described by itscorresponding differential equation. In Fig 2, since eachelement has 3 states, there are a total of 3 + 3 + 3 = 9dependent differential equations to be solved simultaneously,with the following initial conditions: Pl1(O) = P2I(O) = P3I(O) =1, P12(O) = P13(O) = P22(O) = P23(O) = P32(O) = P33(O) = O. Theseinitial conditions correspond to the probability of the system'selements in their most efficient states of performance {glh g2hg31} being equal 100% at time t = O.

    (5)

    (6)

    This approach of computing the RF distribution parametersfrom past CM maintenance records is based on the assumptionthat the quality ofmaintenance in both CM and PdM policies issimilar because the nature of the maintenance work isessentially the same. Note that only hands-on field repair onthe system is termed as "maintenance" in this work. Minorcontrol signal adjustments to the automated system'sparameters is not considered as a "maintenance" activity as itdoes not involve or require any hands-on maintenancepersonnel skills.

    Time to Ftlil.re (lTF) - User De mtliN! d

    ............................

    . .···········r···············r···············

    t

    Fig. 3. Schematicillustrationof the relationship betweentime to failure (TTF)and user demand(W).

    PH"(t) =-~ 2(I) • PH(t) (8)

    "( ) ~ (I) A (I)P12 1 = + ,2 • PH (1)- 2,3 • P12 (1) (9)

    P13"(t) = +..1,2,/1) • Pl2(t) (10)

    P2:(t) =-~,2 (2) • P 21(t) (11)

    "( ) ~ (2) A (2)P22 1 = + ,2 • P21 (1)- 2,3 • P22 (1) (12)

    P2:(t) =+..1,2,/2) · P22(t) (13)

    P3:(t) =-(~,2 (3) + ~,3 (3) ). P31

    (t) (14)

    •( ) A (3) A (3)P32 1 = + 1,2 • P31(1)- 2,3 • P32(1) (15)

    "( ) A (3) ~ (3)P33 1 = + 2,3 • P32(1) + ,3 • P31(1) (16)

    The terms Pi,j(t) in the above equations denote the probabilitythat element i is in statej at any arbitrary time t ~ o.

    C. Universal Generating Function (UGF)

    The UGF methodology [30, 35] is an essential tool to obtainthe performance distribution of the overall system from theperformance distribution of the individual elements of thesystem. A performance distribution is a probability distributiontable listing the various states of performance of theelement/system and their corresponding time-varying state

    978-1-4244-4758-9/10/$26.00 e 2010 IEEE MU3006 2010 Prognostics& SystemHealthManagement Conference(PHM2010 Macau)

    Authorized licensed use limited to: Nanyang Technological University. Downloaded on March 02,2010 at 03:30:58 EST from IEEE Xplore. Restrictions apply.

  • probability expressions. The element performance distributionsfor all the 3 elements of the generator system are described inTable II based on the Markov analysis results in the previoussection.

    The system performance distribution needs to be obtainedfrom the individual element performance distributions in orderto characterize the system's reliability and performancevariation denoted by R(t) and G(t) respectively, which is ourfmal objective. This is made possible using the UniversalGenerating Function (UGF) represented as U(z), which is a z-transform based approach first proposed by Ushakov (1987)[36]. The UGF is an efficient tool for complex multi-statesystems (MSS) reliability assessment as it greatly reduces theproblem complexity and computational intensity bymodularizing a system into its components and analyzing eachcomponent of the system individually, thereby enabling acomplex problem to be broken into sub-problems each ofwhich can be solved separately, with ease.

    For the generator system in this work, the UGF methodreduces the total number of differential equations to only 3 + 3+ 3 = 9 in (8) - (16) as compared to using a single "overall-system" markov analysis which would have required amaximum of 3 x 3 x 3 = 27 differential equations,corresponding to the 27 discrete and distinct system states.

    As mentioned earlier, element 2 of the generator system isdependent on element 1. This dependency can be modeled insuch a way that the performance distribution of element 2depends on the performance state ofelement 1.

    TABLEIIPERFORMANCE DISTRIBUTION OF THE THREE ELEMENTS OF THE

    GENERATOR SYSTEM

    Equations (17) - (19) describe the conditional probabilitytheory used to determine the resultant state probabilityexpressions P21(t), P22(t) and P23(t) for dependent element 2.

    Using the UGF approach, the z-polynomial u-functions, u(z)for individual elements 1, 2 and 3 of the system may now beexpressed as follows,

    u l (z) = PH (t)zgll + P12 (t) zgI2

    + P13 (t) zg

    13 (20)

    u2(z) = P21 (t) zg21

    + P22 (t) zg22

    + P23 (t) zg23

    (21)

    u3(z) = P31 (t) zg31

    + P32 (t) zg32

    + P33 (t) zg33 (22)

    To formulate the system's overall performance distributionin terms of the individual element performances, a systemstructure function [30], qJ, is constructed. This qJ functiondepends on the system topology (series - parallel architecture)and also the type ofMSS being analyzed (flow transmission ortask processing). The system topology of the generator systemin Fig 2 consists of elements 1 and 2 in parallel to each otherand the parallel combination in tum in series with element 3.

    The net useful power output of the system (performance),Gs, will be the minimum of the total amount of powerprocessed by the parallel combination of elements {I ,2} givenby (G1 + G2) and the serially connected element 3 having apower processing capacity represented by the random variable,G3• Based on this configuration, the system structure function,qJ, for the flow transmission power generator system in Fig 2 isgiven by (23), where Gs denotes the overall systemperformance (output power) random variable.

    Representing the discrete random variable for performanceof elements 1 and 2 as G1 and G2, if we assume the followingdependency condition given in Table III, we have thefollowing equations:

    TABLEIIICONDITIONAL PERFORMANCE DISTRIBUTION FOR ELEMENT 2 OF THE

    GENERATOR SYSTEMc=3 b=3 a=3

    =LLLPla(t)P2b(t)P3c(t)*z

  • same system performance rate values, in which case, theassociated state probabilities are summed up.

    Representing {gsJ, gs2" ..., gs27} and (PsIt), Ps2(t), ..., Ps27(t)} asthe set of system performance states and their probabilities, asimplified form of (24) is:

    The performance variation of the system for a general k"operation cycle, Gk(t), may now be described in terms of GI(t)by (28) based on the earlier expressions in (4) and (27). Theestimated time to failure (TTF) for every operation cycle, k, isfound by solving (7) numerically, where Gk(t) is described by(28).

    (25)

    E. Modeling ofMaintenance Cycle

    The duration for different maintenance actions (downtimeduration) in any maintenance policy is always a variable due tomany factors. For example, the root cause of each failure couldbe different; the degree of the damages caused by the failurescan be different on different occasions too. Some of the failuresites might be externally accessible and maintenance could beperformed without dismantling the system, thus requiring lessrepair time; whereas some others could be situated deep insidethe system that requires the system to be opened out for failureanalysis and restoration work which could end up to be verytime-consuming. As a result of all these variations, themaintenance duration needs to be modeled by a randomvariable with a stochastic distribution. The Weibull andGamma distributions are commonly used for downtime orrepair distributions [37]. Here, we use the Weibull distributionfor downtime event modeling. The shape factor ~ is assumed tobe 1 to reflect the age-independent randomness in themaintenance durations. The value for the scale factor, 11 is setto 0.02 years according to maintenance duration records forpower generators, as revealed in [38].

    As the system continues to age, the degree of failure andextent of damage of the to-be-maintained system becomesmore pronounced and severe even under the PdM policy, dueto the effect of irreparable wear-and-tear effects. Thus, thelater stages of system failures are more difficult and time-consuming to maintain and restore as compared to the initialfailures. Therefore, it would be appropriate to model the scalefactor, 1], of the downtime distribution as an arbitraryincreasing function of the operation cycle, k, to represent theincreased downtime periods during subsequent repair actions,as the system's rate of degradation increases with time forimperfect maintenance.

    With the mathematical model developed, we can nowsimulate the model using Matlab and study the effect of thevarious factors described, on the system reliability andperformance characteristics for the MSS PdM policy. Due tothe unavailability of real industrial data, the numerical valuesof the Markov degradation rates in this power system casestudy are hypothetically assumed for the sake of illustration;hence, the magnitudes of the computation results may notreflect the reality. In other words, the results described in thefollowing sections serve only to show the practical usefulnessand applicability of the model.

    IV. RESULTS AND DISCUSSION

    A. Impact ofThreshold Demand (W)

    Figures 4(a) - 4(c) show the system mean performancecurves for three threshold demand (W) values of9 MW, 8 MWand 7 MW respectively at a given RF distribution withparameters PRF = 95% and (jRF = 0%. One can see from these

    27

    Us(z) = LPSi(t).zgsii=l

    The relationship between the state probabilities for eachelement is: PII(t) + PI2(t) +PI3(t) = 1; P2I(t) + P22(t) + P23(t) =1; P3I(t) +P32(t) +P33(t) = 1 tj t.

    27

    For the system, L Psi (t) = 1 tj t ·i=l

    From Table IV, the power capacity of gl = 10 MWcorresponds to the maximum performance rate of the ''fullyfunctional" system. On the other hand, the power capacity ofg27 = 0 MW represents the total failure event where the powersystem is "completely non-functional" i.e, not able to processany power at all. The power processing capacities of 8.5 MW,7.5 MW, 6 MW etc ... correspond to the intermediate degradedstates where the system is only ''partially functional andefficient" in its performance.

    TABLEIVSYSTEM PERFORMANCE DISTRIBUTION TABLE OBTAINED USING THE

    UGF APPROACH

    Systemperformance state 10 8.5 7.5 0

    - Gs/MW

    State probability -Psl(t) Ps2(t) Ps3(t) Ps27(t)Ps(t)

    D. System Reliability & Performance

    From the system performance distribution, the Reliability(Survival) Function of the system, RI(t) for the 1st operationcycle, can be defmed as the probability that the system'sperformance (Gs) is above the minimum user-set thresholddemand value, W. This is consistent with our earlier defmitionoifailure in Section IIIA.2 where the system is considered tohave failed from the user's perspective once its meanperformance, G(t), drops below the threshold user demand(W). Therefore, RI(t) is expressed as follows:

    RI(t) = no, ~ W) = {fpAt) Is, ~ W} (26)1=1

    where W is the minimum threshold demand settingrepresenting the minimum user expectation from the system.

    The mean performance of the system for the 1st operationcycle, GI(t), can be modeled from the system performancedistribution. Since Table IV is a probability distributionfunction (p.d.t) of a discrete statistical random variable of thesystem performance, Gs, the mean or expectation of Gs,denoted by E(Gs) , can therefore be expressed as follows:

    (27)

    In (27), GI(t) is the system mean performance for the l"operation cycle; the summation term is the usual statisticaldefmition for "expectation ofa random variable".

    k-l

    Gk(t) =G1(t)· TIRF[r]r=1

    (28)

    978-1-4244-4758-9/10/$26.00 e 2010 IEEE MU3006 2010 Prognostics & System Health ManagementConference(PHM2010Macau)

    Authorized licensed use limited to: Nanyang Technological University. Downloaded on March 02,2010 at 03:30:58 EST from IEEE Xplore. Restrictions apply.

  • w= 7M~ uRF =0%, P = 50%

    _.- IJ.RF =900k••••• IJ.RF=95°k--IJ.

    RF=97.5%

    10

    9.5£e"

    I

    G) 9uCft5E(; 8.5'f:G)

    Q.

    C 8ft5G)

    ::E

    ~ 7.5(I)

    >.en7

    6.50 5

    TABLE VITIME TO REPLACEMENT (TTR) VERSUS THRESHOLD DEMAND (W) TRENDS

    Table V shows the computed TTR for different thresholddemand values when the RF distribution is fixed at JlRF = 95%and (jRF = 0%. From Table V, it can be seen that if thethreshold demand W is increased from 7 MW to 8 MW, TTRdrops by approximately 57.6%. Similarly, further increase inthe threshold demand from 8 MW to 9 MW again causes thereplacement time to drop further by around 73.1%. Therefore,it is important for the user not to choose a very high W valueclose to the maximum performance capability of the system(10 MW in this case study). Instead, a moderate thresholddemand under which the system can still function effectivelyshould be chosen.

    B. Impact ofMean Restoration Factor (J.lRF)

    Fig 6 shows the typical variation of system meanperformance curves respectively for various f.lRF values of90%,95% and 97.5% respectively keeping the parameters (jRF = 0%,p = 50% and W = 7 MW fixed. The figure shows that thehigher the f.lRF, the higher the average performance at any pointin time, as expected.

    Large values of f.lRF coupled with low (jRF indicate very highquality ofmaintenance work and imply large restorations in thesystem performance during every maintenance. As a result, thesystem's initial performance during the start of every operationcycle is relatively high and it takes longer time for the system'sperformance to degrade to below the minimum thresholddemand (W) in that operation cycle. This implies extendedtimes to failure (TTF) and hence prolonged time toreplacement (TTR).

    Fig. 6. System mean performance variation for various mean restorationfactors of DRF = 90%, 95% and 97.5%, given W = 7MW, DRF = 0% and p =50%.

    8

    2

    18

    1.8

    16

    7

    1.6

    14

    6

    1.4

    5

    1.2

    4

    0.8

    3

    Index of operation/maintenance cycle = k

    0.6

    11m: =95%, D"RF =0%, P =500/0

    2

    4

    0.4

    2

    0.2

    10 .-.=-_r----r---_r---_r---_r---_r---_r---_r---_r--------.

    9.5

    I

    ~ 0cE10~ 9G)

    Q.

    Cft5G)

    ::E 0~ 10~-~-....,------r------r--_r__-_r__-_,_-___r"------....,~ 9en

    8

    7L--_-'--_---L-_--..L-_---'-__.&..--_-I..-_--'--_---L-_-----I

    o

    ~ 9

    10__----r---r---_,_--r-------r--_r----~-----,

    9

    9.5

    7

    IIUC..I§ 8.5

    ! B W_+_ ~---II 0 at- Maintenance

    per Ion Cycle # 1~ 75 Cyde # 1

    IllUSTRATING THE DETERMINATION OF TIME TO REPLACEMENT

    Sy!ll:em Opendion Time well'-

    6 8 10 12System Operation Time (years)

    Fig. 4. System mean performance variation for (a) W = 9MW, (b) W = 8 MWand (c) W = 7 MW.

    figures that the higher the threshold demand (W), the soonerthe time to failure (TTF) and the higher the mean frequency ofmaintenance actions to be performed. This is because setting ahigher threshold demand (W) implies that the system's meanperformance would degrade below the threshold level in ashorter span of time as illustrated earlier in Fig 3.

    Time to Replacement, represented as TTR, is defmed as theinstant when the degrading system's mean performance cannever be restored to above the minimum required threshold(W) anymore in spite of any further maintenance work. In suchan event, further repair work is not beneficial because theuser's minimum expectations can no longer be satisfied, andreplacement of the system is therefore the only alternativeoption. Fig 5 clearly illustrates the replacement criteria.

    Fig. 5. Determination of time to replacement (TIR) from system meanperformance curve. Symbol 'k' represents the j(h operation cycle.

    TABLE VTIME TO REPLACEMENT (TTR) VERSUS THRESHOLD DEMAND (W) TRENDS

    Threshold Demand W (MW) Time to Replacement (TTR) / yrs

    J.1-RF = 95%; (JRF = 0%; p = 50%.

    9.0 1.801

    Mean RF (flRF) Time to Replacement (TTR) / yr

    W = 7 MW; (JRF = 0%; p = 50%.

    97.5% 29.99

    95.0% 15.81

    92.5% 11.19

    90.0% 8.856

    85.0% 6.508

    8.0 6.704

    7.0 15.81Table VI shows that the TTR of the system increases largely

    by 36.1% as the f.lRF is increased from 85% to 90%. Further

    978-1-4244-4758-9/10/$26.00 e 2010 IEEE MU3006 2010 Pro gnostics & System Health Management Conference(PHM2010 Macau)

    Authorized licensed use limited to: Nanyang Technological University. Downloaded on March 02,2010 at 03:30:58 EST from IEEE Xplore. Restrictions apply.

  • Simulation A (SIMA) 0.984 0.879 0.998 0.964 0.782

    Simulation B (SIM B) 0.807 0.842 1.004

    increase in f.JRF from 90% to 95% prolongs the TTR valuefurther by as large as 78.5%. It is therefore necessary for anindustry to strive to improve its f.JRF to as much as possible. Thecost incurred in improving the f.JRF must be justified incomparison to the cost savings in maintenance and prolongedTTR achieved as a result of the improvement.

    TABLEVIIRESTORATION FACTOR FOR DIFFERENT MAINTENANCE ACTIONS OBTAINED

    USING RANDOM NUMBER GENERATION FOR TWO DIFFERENT SIMULATIONS, SIMA AND SIMB WHERE MEAN RESTORATION FACTOR = 85%.

    Maintenance # 2 3 4 5

    SIM B as shown in Table VII. Therefore, it is crucial tomaintain consistency in maintenance quality for a longer lifecycle of the equipment. Note that maintenance quality includesthe technical skill and competency of the maintenancepersonnel as well as the quality of spare parts used duringrepair.

    It is therefore clearly evident that keeping the maintenanceactions as consistent as possible is of paramount importance inorder to enhance the predictability of future downtimeschedules, facilitate efficient pre-planning of inventory stocks,reduce delay times and downtime durations, thereby increasingthe mean availability and production output of the system.

    D. Effect ofMultiple Failure Modes

    1614

    REPLACE

    P =00/0ITR =15.81 years

    4 6 8 10 12System Operation Time (years)

    J.1FF =950/0 GFF =00/0THRESHOLD DEMAND =W =8.0 MW

    2o

    7.8

    9.8 ,9.6 ~

    a: 9.4 \§ 9.2 \~ 9 ~.e 88 \ II... 1 I , n~ , I , I'e 8.6 t I 'I',J II 'It,

    Ill:: 8.4 , I II' ItE 'I 'I 'n! 8.: __l_lt\-__ _ _

    I~ REPLACEII P =1000/0\ TTR = 4.226 years

    Fig. 8. Impact of the presence of multiple failure modes on the systemperformancetrends.

    E. Determination ofMaintenance Schedule

    Based on the simulations of the proposed model, the timesto failure (TTF) data can be obtained from the numericalsolution of (7) and an estimate of the maintenance (downtime)

    When p = 0%, the element is under the influence of only FMA and therefore, it has a lower degradation rate as it graduallydegrades with a stepwise state transition from g31 ~ g32 ~ g33.Therefore, when p = 0%, the system has a prolonged lifespanand the maintenance intervals are more spread out. In contrast,when p = 100%, the FM A is certain to cause FM Bandtherefore the element which is now under the influence of boththe failure modes has a very high effective degradation rateand hence, it is most likely to undergo a catastrophicbreakdown directly from the best state of g31 to the worst stateof g33. As a result, the lifespan of the system is very short andthe maintenance frequency is very high. From Fig 8, while theTTR value for p = 100% is only 4.226 years, the correspondingvalue for p = 0% is 15.81 years which is about four timeslarger. This shows the significant effect of the presence ofmultiple failure modes.

    10..-----------r--------r--------r----.---------.---------r---.----------r-----o

    In the beginning of Section III, we mentioned that element 3of the power generator system is under the influence of twofailure modes where FM B is dependent on FM A with aconditional probability,p. Fig 8 shows the system performancevariation trends for the extreme values of p = 0% andp = 100% keeping the other parameters f.JRF = 95%, l7RF = 0%and W= 8 MW fixed.

    25

    J.1FF=850/0 oFF=100/0 P =500/0THRESHOLD DEMAND = W = 6.0 MW

    10 15 20

    System Operation Time (t) I years

    5.5 L..--__------L_--'"---_------'-__-------'----------'---- ----'--------'o

    9

    9.5

    Fig. 7. Impact of inconsistency in maintenance work quality on variation insystemperformancetrends.

    Observations from Fig 7 and Table VII reveal that if theinitial maintenance goes bad by chance, then futuremaintenance actions will not be effective in restoring thesystem to a satisfactory level of performance regardless of howgood these future maintenance actions are. A bad repair workduring the initial stages of system operation causes irreparabledamage to its reliability and this damage cannot becompensated for by trying to improve the quality of futurerepair works on the system. The ultimate effect is a drasticreduction in the time to replacement (TTR). In Fig 7, the RFvalue for the first maintenance in the case of SIM A and SIM Bare 98.4% and 80.7% respectively. Due to the low initial RFfor the case of SIM B, the TTR for SIM B is as low as 11.85years in comparison to the longer (almost double) TTR valueof23.99 years for SIM A despite the subsequent higher RF for

    C. Impact ofVariation in Restoration Factor (l7RF)

    Figure 7 illustrates the large deviation between a possibleoptimistic-case (SIM A) and pessimistic-case (SIM B)scenario of a system's performance variation pattern when thel7RF is as high as 10%. The curves were generated by keepingall the parameters fixed at f.JRF= 85%, l7RF= 10%, P = 50% andW = 6 MW. The random number generator in Matlab providestwo entirely different set of values of RF[k] from the same RFdistribution during the two separate simulations. The variousRF values at each maintenance cycle for SIM A and SIM Bareshown in Table VII.

    10r-----------r--------r---------r-------r-------r--------r

    ':t:i'(!8 8.5IiE 8.g8! 7.5i:E 7EI>- 6.5UJ

    978-1-4244-4758-9/10/$26.00 © 2010 IEEE MU3006 2010 Prognostics& SystemHealth ManagementConference(PHM2010Macau)

    Authorized licensed use limited to: Nanyang Technological University. Downloaded on March 02,2010 at 03:30:58 EST from IEEE Xplore. Restrictions apply.

  • It may be noticed that in the case of a dependent system,since element 2 degrades along with element 1 because of thedependency, the effective degradation rate of the system ishigher and hence, it is to be replaced in a shorter period oftime. In contrast, when the elements are independent of each

    A company's long term fmancial position hinges largely onits ability to reduce plant operational and maintenance costs,which currently accounts for as much as around 15 - 70% ofits overall production expenses [1]. The new PdM policyproposed in this study can defmitely lower the maintenancecost, and hence increase the company's profitability.

    ACKNOWLEDGEMENTS

    A statistical model for the PdMpolicy ofMSS based systemshas been developed by combining the Universal GeneratingFunction (UGF) and Markov Chain analysis theories.Restoration factor (RF), which is indicative of maintenancework quality and threshold demand (W), which represents theminimum user expectations are identified as important PdMparameters, and their impacts on the system performance,downtime schedule and replacement time were quantitativelyexamined.

    other, degradation of element 1 does not affect theperformance of element 2 and as a result, the effectivedegradation rate of the system is lower resulting in prolongedoperation times and longer time to replacement. In Fig 9, theTTR value for dependent system was 6.732 years while for theindependent system, it is 3% larger at 6.942 years.

    V. CONCLUSION

    REFERENCES

    [1] Bevilacqua, M. and Braglia, M, "The analytic hierarchyprocess applied to maintenance strategy selection",Reliability Engineering and System Safety, Vol. 70, Issue1, pp. 71 - 83, (2000).

    [2] Zhao, Y.X., "On preventive maintenance policy of acritical reliability level for system subject to

    Using the stochastic model for the restoration factor (RF),system performance variation for various PRF, (jRF and Wvalues were simulated and presented graphically. The resultsclearly indicate the significant impact of PRF, (jRF and W onsystem reliability. A highly skilled maintenance crew (high)JRF) can help improve the system reliability and maintainabilityto a large extent, thus saving costs and reducing wear and tearof the system and in tum prolonging its useful lifespan.Consistent performance of maintenance (low (jRF) is also veryessential for more accurate predictability of future downtimeschedules and times to system replacement (TTR) which in turnassist the management to precisely pre-plan the productionactivities so as to meet the timely customer market demands.

    Throughout this study, the model developed and the resultsshown were all based on the case study of a simple 3-elementpower generator system MSS. However, it is important to takenote that the exact same procedure described in this workcould be applied to any n-element MSS of any type (flowtransmission or task processing) with any arbitrary topology.The only feature to take note of is the system structurefunction, Gs =qJ(GI'G2 , ••• ,Gn ) which will vary for differentsystems depending on its MSS classification and its topology[30].

    The authors would like to thank the Office of Research,Nanyang Technological University (NTU), Singapore forfunding this research work and providing the necessarylogistical support.

    DEPENDENT SYSTEM

    INDEPENDENT SYSTEM

    IJ.FF= 95% GFF = OOk P = SOO/o

    THRESHOLD DEMAND =W =8.0 MW

    COMPARING DEPENDENT vs INDEPENDENT SYSTEMS

    DOWNTIME(lIAAlNl'ENANCE) SCHEDULE(allunUsin YEARS,J

    (i) (ii) (iii) (iv)

    L:'bwntin:leW=8IvIW W=1IvIW W=1IvIW W=8IvIW

    # ~u=95%~u= 95% ~u=90% ~u= 95%

    C'u=O% C'u=O% C'u=O% C'u=O%p=50% p=50% p=50% p= 100%

    REPLACE 6.704- 15.81 8.856 4.226

    1 2.419 3.982 3.982 1.556

    2 4.384 1.385 6.111 2.162

    3 5.129 10.21 8.400 3.601

    4 6.498 12.48 REPLACE 4.0955 REPLACE 14.16 REPLACE6 15.21

    1 REPLACE

    9.6

    9.8

    e- 9.4(J

    I 9.28Ii 9E~ 8.8~e 8.6II~ 8.4

    i1 8.2(J)

    REPLACE ~7.8 ITR =6.732 years023 456 7 8

    System Operation Time (years)

    Fig. 9. Comparing the performance variation of a system with dependent andindependent elements.

    schedules can be constructed. Table VIII shows a typicalexample of a maintenance schedule derived from the model forthe following four different cases:

    TABLE VIIIDETERMINATION OF MAINTENANCE SCHEDULE FROM PDM MODEL

    SIMULATION FOR FOUR DIFFERENT CASES.

    (i) PRF = 95%, (jRF = 0%, W =8 MW, P = 50%.(ii) PRF = 95%, (jRF = 0%, W =7 MW, P = 50%.(iii) PRF = 90%, (jRF = 0%, W =7 MW, P = 50%.(iv)PRF=95%, (jRF=O%, W=8MW,p= 100%.

    F. Comparison ofDependent & Independent Systems

    In this study, Element 2 is modeled to be dependent onElement 1. In order to quantitatively examine the impact of thisdependency on the system lifespan, two simulations areperformed. One of them models the dependency of element 2according to Table III and equations (17) - (19), using theconditional probability theory. The other simulation assumeselement 2 to be independent of element 1. In both these cases,the values of the degradation rates, A1,2 (2) and A2,3 (2) are thesame and the values of the other PdM parameters PRF = 95%,(jRF = 0%, W = 8 MW and p = 50% are all kept fixed. Fig 9illustrates the performance degradation trends for these twodifferent cases of dependent and independent systems.

    978-1-4244-4758-9/10/$26.00 e 2010 IEEE MU3006 2010 Pro gnostics & System Health Management Conference(PHM2010 Macau)

    Authorized licensed use limited to: Nanyang Technological University. Downloaded on March 02,2010 at 03:30:58 EST from IEEE Xplore. Restrictions apply.

  • degradation", Reliability Engineering and System Safety,Vol. 79, Issue 3, pp. 301-308, (2003).

    [3] Moya, C.C., "The control of the setting up of a predictivemaintenance program using a system of indicators",International Journal ofManagement Science, pp. 57-75,(2004).

    [4] L.Dieulle, C.Berenguer, A.Grall and M.Roussignol,"Continuous Time Predictive Maintenance Schedulingfor a Deteriorating System", Annual Reliability andMaintainability Symposium, pp.150-155, (2001).

    [5] L.Jaw, "Putting CBM and EHM in perspective",Scientific Monitoring Inc., Maintenance Technology.

    [6] Grall, A., Berenguer, C. and Dieulle, L., "A condition-based maintenance policy for stochastically deterioratingsystems", Reliability Engineering and System Safety,Vol. 76, Issue 2, pp. 167-180, (2002).

    [7] Nakagawa, T., "Optimum policies when preventivemaintenance is imperfect", IEEE Transactions onReliability, Vol. 28, No.4, pp. 331-332, (1979).

    [8] Kijima, M., "Some results for repairable systems withgeneral repair", Journal ofApplied Probability, Vol. 26,pp.89-102, (1989).

    [9] Wendai Wang and Daescu, D.D. "ReliabilityQuantification of Induction Motors - AcceleratedDegradation Testing Approach", Annual Reliability andMaintainability Symposium, pp. 325-331, (2002).

    [10] Kane, M.M., "Intelligent motors moving to the forefrontof predictive maintenance", 47th Annual Petroleum andChemical Industry Conference, pp. 217-223, (2000).

    [11] Hoang Pham and Hongzhou Wang, "Imperfectmaintenance", European Journal of OperationalResearch, Vol. 94, Issue 3, pp. 425-438, (1996).

    [12] Nakagawa, T., "Optimum policies when preventivemaintenance is imperfect", IEEE Transactions onReliability, Vol. 28, No.4, pp. 331-332, (1979).

    [13] Nakagawa, T., "Imperfect preventive maintenance",IEEE Transactions on Reliability, Vol. 28, No.5, pp.402., (1979).

    [14] Brown, M. and Poschan, F., "Imperfect repair", JournalofApplied Probability, Vol. 20, pp. 851-859, (1983).

    [15] Block, H.W., Borges, W.S. and Savits, T.H., "Agedependent minimal repair", Journal of AppliedProbability, Vol. 22, pp. 370-385, (1985).

    [16] Block, H.W., Borges, W.S. and Savits, T.H., "A generalage replacement model with minimal repair", NavalResearch Logistics, Vol. 35/5, pp. 365-372, (1988).

    [17] Makis, V. and Jardine, A.K.S, "Optimal replacementpolicy for a general model with imperfect repair",Journal of the Operational Research Society, Vol. 43,No.2, pp. 111-120, (1992).

    [18] Malik, M.A.K., "Reliable preventive maintenancepolicy", AIlE Transactions, Vol. 11/3, pp. 221-228,(1979).

    [19] Chan, J.K. and Shaw, L., "Modeling repairable systemswith failure rates that depend on age and maintenance",IEEE Transactions on Reliability, Vol. 42, pp. 566-570,(1993).

    [20] Kijima, M., Morimura, H. and Suzuki, Y., "Periodicalreplacement problem without assuming minimal repair",European Journal of Operational Research, Vol. 37/2,pp. 194-203, (1988).

    [21] Martorell, S., Sanchez. A. and Serdarell, V., "Age-dependent reliability model considering effects ofmaintenance and working conditions.", ReliabilityEngineering and System Safety, Vol. 64, Issue 1, pp. 19-31, (1999).

    [22] Kijima, M. and Nakagawa, T., "Accumulative damageshock model with imperfect preventive maintenance",Naval Research Logistics, Vol. 38, pp. 145-156, (1991).

    [23] Kijima, M. and Nakagawa, T., "Replacement policies ofa shock model with imperfect preventive maintenance",European Journal of Operational Research, Vol. 57,pp. 100-110, (1992).

    [24] Wang, H. and Pham, H., "Optimal maintenance policiesfor several imperfect maintenance models", InternationalJournal ofSystems Science, 1996.

    [25] Wang, H. and Pham, H., "Optimal age-dependentpreventive maintenance policies with imperfectmaintenance", International Journal of Reliability,Quality and Safety Engineering, 1996.

    [26] Wang, H. and Pham, H., "A quasi renewal process and itsapplication in the imperfect maintenance", InternationalJournal ofSystems Science, 1996.

    [27] Wang, H. and Pham, H., "Availability and optimalmaintenance of series system subject to imperfect repair",IE Working Paper, 1996, pp. 96-101, Rutgers University.

    [28] Wang, H., "A survey of maintenance policies ofdeteriorating systems", European Journal ofOperationalResearch, Vol. 139, Issue 3, pp. 469-489, (2002).

    [29] Levitin, G. and Lisnianski, A., "Multi-State SystemReliability: Assessment, Optimization and Applications":World Scientific Publishing Co. Pte Ltd, 2003,Chapters - 1, 4.

    [30] Levitin, G., "The Universal Generating Function forReliability Assessment and Optimization": SpringerSeries Publishing Co. Pte Ltd, 2004.

    [31] Levitin, G. and Lisnianski, A., "Optimization ofimperfect preventive maintenance for multi-statesystems", Reliability Engineering & System Safety, Vol.67, Issue 2, pp. 193-203, (2000).

    [32] Drenick, R.F., "The Failure Law of ComplexEquipment", J. Soc. Indust. Appl. Math, Vol. 8, No.4,pp. 680-690, (1960).

    [33] Raghavan, N. and Tan, C.M., "A framework to practicalpredictive maintenance modeling for multi-statesystems", Reliability Engineering & System Safety, Vol.93, pp.1138-1150, (2008).

    [34] Ebeling, C., "An Introduction to Reliability andMaintainability Engineering", Me Graw HillPublications, International Edition, 1997.

    [35] Levitin, G., Lisnianski, A., Beh-Haim, H. and Elmakis,D., "Redundancy optimization for series-parallel multi-state systems", IEEE Transactions on Reliability, Vol.47, No.2, pp. 165-172, (1998).

    978-1-4244-4758-9110/$26.00 © 2010 IEEE MU3006 2010 Prognostics& SystemHealthManagement Conference(PHM2010 Macau)

    Authorized licensed use limited to: Nanyang Technological University. Downloaded on March 02,2010 at 03:30:58 EST from IEEE Xplore. Restrictions apply.

  • [36] Ushakov, I., "Universal Generating Function", SovietJournal of Computing System Science, Vol. 24(5), pp.118-129, (1986).

    [37] Source: www.weibull.com : "Introduction to RepairableSystems - Downtime Distributions".

    [38] Endrenyi, J. et al, "The Present Status of MaintenanceStrategies and the impact of Maintenance on Reliability",IEEE Transactions on Power Systems, Vol. 16, No.4,pp. 638-646, (2001).

    BIOGRAPHY

    CHER MING TAN was born inSingapore in 1959. He received the B.Eng.degree (Hons.) in electrical engineeringfrom the National University of Singaporein 1984, and the M.A.Sc. and Ph.D.degrees in electrical engineering from theUniversity of Toronto, Toronto, ON,Canada, in 1988 and 1992, respectively.He joined Nanyang Technological

    University (NTU) as an academic staff in 1997, and he is now anAssociate Professor in the Division of Circuits & Systems at theSchool of Electrical and Electronic Engineering (EEE), NanyangTechnological University (NTU), Singapore. His current researchareas are reliability data analysis, electromigration reliabilityphysics and test methodology, physics of failure in novel lightingdevices and quality engineering such as QFD. He also works onsilicon-on-insulator structure fabrication technology and powersemiconductor device physics.

    Dr. Tan was the Chair of the IEEE Singapore Section in 2006. Heis also the course-coordinator of the Certified Reliability Engineerprogram in Singapore Quality Institute, and Committee member ofthe Strategy and Planning Committee of the Singapore QualityInstitute. He was elected to be an IEEE Distinguished Lecturer ofthe Electron Devices Society (EDS) on Reliability in 2007. He isalso the Faculty Associate of Institute of Microelectronics (IME)and Senior Scientist of Singapore Institute of ManufacturingTechnology (SIMTech). He was also elected to the ResearchBoard of Advisors of the American Biographical Institute andnominated to be the International Educator ofthe Year 2003 by theInternational Biographical Center, Cambridge, U.K. He is nowappointed as a Fellow ofthe Singapore Quality Institute (SQI).

    He is currently listed in Who's Who in Science and Engineering aswell as Who's Who in the World due to his achievements inscience and engineering.

    NAGARAJAN RAGHAVAN was bornin Bangalore, India in 1985. He receivedhis B.Eng (Hons.) from the School ofElectrical and Electronic Engineering(EEE), Nanyang Technological University(NTU), Singapore. He then completed aDual Master degree (M.Sc, M.Eng)majoring in Materials Science at theNational University of Singapore (NUS)

    and Massachusetts Institute of Technology (MIT), Boston underthe Singapore-MIT Alliance (SMA) program in the field ofAdvanced Materials for Micro & Nano Systems (AMM&NS).

    He was the recipient of the prestigious Nanyang Scholarship,NTU President Research Scholar and Singapore-MIT Alliance(SMA) Graduate Fellowship awards. He was one of the fiverecipients to be bestowed with the IEEE Reliability SocietyGraduate Scholarship award in 2008 for his researchaccomplishments in reliability and its application to the field ofmicro and nano-electronics. He is currently pursuing his PhD at theDivision of Microelectronics, EEE, NTU focusing on reliabilitymodeling and characterization of high-x dielectric materials innanodevices. His research interests include electromigration, high-K and low-x dielectric breakdown, wafer bonding, reliabilitystatistics and maintenance modeling.

    To date, he has authored / co-authored more than 18 internationalpeer-reviewed publications and serves on the review committee formany journals. He is currently a Student Member of IEEE and theMaterials Research Society (MRS), in the US.

    978-1-4244-4758-9/10/$26.00 e 2010 IEEE MU3006 2010 Prognostics& SystemHealthManagement Conference(PHM2010 Macau)

    Authorized licensed use limited to: Nanyang Technological University. Downloaded on March 02,2010 at 03:30:58 EST from IEEE Xplore. Restrictions apply.