
January, 2001 12.1-1 ME317 dfM: Product Definition S. Kmenta, and K. Ishii Failure Modes and Effects Analysis

12.1 Failure Modes and Effects Analysis

12.1.1 Introduction

What is product quality in the eyes of the customer? Ullman (1992) reports on a survey (Time, November 13, 1989) which asked customers, “What determines quality?” We adapted the results of this survey in Table 12-1.

Table 12-1: Results of a Customer Survey on Product Quality

| | Essential | Not essential | Not sure |
| --- | --- | --- | --- |
| Works as it should | 98 | 1 | 1 |
| Lasts a long time | 95 | 3 | 2 |
| Is easy to maintain | 93 | 6 | 1 |
| Looks attractive | 58 | 39 | 3 |
| Incorporates the latest technology | 57 | 39 | 4 |
| Has many features | 48 | 47 | 5 |

Ullman stresses the point that “quality is a composite of factors that are the responsibility of the design engineer.” The table emphasizes the key factors of quality: providing basic functionality, reliability, and ease of service. The following quotation characterizes the loss of goodwill associated with not meeting the customer’s expectations:

An unhappy customer is a terrorist. Satisfied customers tell eight to ten others. Dissatisfied customers tell 16 to 20 others. Twenty-five percent of the dissatisfied customers tell as many as 40 other people.

– Mark Foucher (Cadillac consumer relations)

Rising demand for quality during customer ownership has created the need to address reliability and serviceability during design. Probabilistic methods for reliability assessment have been a mainstay of engineering systems development for many years (Levinson, 1964). Reliability methods focus on predicting the availability and maintainability of complex systems based on mean time between failures (MTBF) and mean time to repair (MTTR) estimates for functional components or subsystems. Failure rate and distribution data are obtained through component life testing or generic failure rate distribution tables; this information is generally not available until the detailed design stage, when all the components and elements have been specified. While traditional reliability methods are useful, product development teams need tools to help address quality and reliability from the early stages of design.

12.1.2 What is Failure Modes and Effects Analysis?

Failure Modes and Effects Analysis (FMEA) is a method for addressing reliability during the early stages of design. FMEA helps improve design decisions and product quality during operation. In this section, we demonstrate a new way of organizing FMEA based on both the product’s function-structure relationship and the Voice of the Customer (VOC). FMEA is a product development (or process analysis) tool used to anticipate modes of failure and mitigate potential risk. It is a risk management tool used to eliminate, control, and reduce risk using the steps outlined in Table 12-2.


Table 12-2 Three phases of FMEA

| Task | Method |
| --- | --- |
| Identify what can go wrong | Determine failure dependencies: (Causes → Failure Modes → Effects) |
| Evaluate how likely is it to occur and what are the consequences | Assess Risk Priority Numbers (RPN): (failure occurrence × effects severity × detection difficulty = RPN) |
| Decide what can be done to reduce the risk | Optimize design improvements, trade-offs, test plans, manufacturing changes, etc. |

The FMEA team prioritizes expected failure modes together with upstream causes and downstream effects, producing a comprehensive ranked list of failure scenarios (causes, failure modes, and effects) in spreadsheet form. Not all problems are equal, so companies should give the most attention to the potentially severe and frequent failures. The document also includes plans for reducing risk through design or process improvement. FMEA is basically a tool to document prioritized problems and solutions. Many procedures partition the FMEA into several categories: System, Design, and Process.

• System FMEA analyzes a design in terms of overall performance specifications, functions, interfaces, inputs, and outputs.

• Design FMEA identifies flaws in the specification of the design.

• Process FMEA specifies how manufacturing variation and errors might affect manufacturing efficiency and product functionality.

Often, the System and Design FMEA are combined into a comprehensive analysis covering product functionality.

12.1.3 Why perform an FMEA?

The thought process behind FMEA is implicit in any design effort: teams and individuals continuously ask “what could go wrong?” and “how could problems be prevented?” FMEA is a way of documenting this thought process and making risks explicit. Some of the reasons for performing FMEA are:

• Improve Customer Satisfaction: help identify Customer Requirements (and how these might not be achieved), as well as improving reliability and safety in general.

• Provide Thoroughness: increase the likelihood that all potential failures and effects will be considered.

• Reduce Development Effort: reduce product development time and cost.

• Focus Test Planning: focus the development of test plans.

• Provide Documentation: make technical risks explicit to team members, suppliers, and management.

The method is used to focus limited resources on critical design trade-offs and decisions leading to improved reliability, quality, and safety.


12.1.4 FMEA Background and Documents

FMEA emerged in the 1960s as a formal methodology in the aerospace and defense industries. In 1974, the Navy developed MIL-STD-1629 outlining a procedure for FMEA. In the late 1970s, the automotive industry, driven by liability costs, began to incorporate FMEA into the management of its processes. Since then it has been adopted and standardized by many countries and industries (Table 12-3).

Table 12-3 A partial list of important FMEA publications

| Year | FMEA Document |
| --- | --- |
| 1964 | “Failure Effect Analysis” Transactions of the NY Academy of Sciences (J. S. Couthino) |
| 1974 | US Department of Defense (DoD) Mil-Std-1629 (ships) “Procedures for Performing a Failure Mode, Effects, and Criticality Analysis” |
| 1980 | US DoD Mil-Std-1629A |
| 1984 | US DoD Mil-Std-1629A/Notice 2 |
| 1985 | International Electrotechnical Commission (IEC) 812 “Analysis techniques for system reliability − procedure for failure mode and effects analysis” |
| 1988 | Ford published “Potential Failure Modes and Effects Analysis in Design and for Manufacturing and Assembly Processes Instruction Manual” |
| 1994 | SAE J1739 Surface Vehicle Recommended Practice |
| 1995 | “FMEA: from theory to execution” (D.H. Stamatis); “FMEA: predicting and preventing problems before they occur” (P. Palady) |
| 1996 | Verband der Automobilindustrie, Germany, VDA Heft 4 Teil 2; “The Basics of FMEA” (R.E. McDermott et al.) |
| 1998 | Proposal for a new FMECA standard for SAE (J. Bowles) |

The FMEA spreadsheet (Figure 12-1) has been adopted as the industrial standard for performing FMEA in many industries in the United States, Europe and Japan.

| Function or Requirement | Potential Failure Modes | Potential Causes of Failure | Occurrence | Local Effects | End Effects on Product, User, Other Systems | Severity | Detection Method / Current Controls | Detection | RPN | Actions Recommended to Reduce RPN | Responsibility and Target Completion Date |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |

Figure 12-1 FMEA Spreadsheet Column format similar to many FMEA standards

12.1.5 Some Criticisms of FMEA

FMEA is an established and prevalent design methodology, and yet it has also drawn some criticisms. Some documented criticisms of FMEA include:

• Performed too late, FMEA does not affect key product/process decisions (McKinney, 1991)

• FMEA does not capture key failures (Bednarz & Marriott, 1988)

• The analysis is not repeatable (Bell et al., 1992)

• FMEA is often an afterthought “checklist” exercise (Kara-Zaitri et al., 1991)

• Process is tedious (Ormsby et al., 1992)

• The Risk Priority Number gives a distorted measure of risk (Gilchrist, 1993; Harpster, 1999)

Applying FMEA too late

When FMEA is used too late, there is little or no opportunity to impact important design decisions. After detailed design is completed, or after releasing the design to manufacturing, development teams no longer have the flexibility to improve product quality through design changes. The leverage point for FMEA is during the early stages of design when engineers can make the key decisions which impact life-cycle cost and reliability (Figure 12-2).

[Figure: curves of funds committed and funds expended, as % of total (up to 100), across the phases Planning, Design, Development, Tooling, and Production]

Figure 12-2 Life-cycle costs for products are determined early in the development process

Life-cycle costs and reliability are locked in at the planning and early design stages. Often, engineers complete the FMEA after the design is complete in order to satisfy management or customer requests. Companies should view FMEA as a design tool rather than a “check-list” and a hindrance. In Figure 12-3 we compare the early and continuous application of FMEA with what often happens: performing the FMEA late or not at all.

[Figure: FMEA types (System, Design, Process, Service...) placed against the design process stages (functional design, layout design, detailed design, manufacturing), contrasting Ideal and Actual deployment]

Figure 12-3 Comparison of ideal and actual deployment of FMEA.

It is difficult for engineers to undertake a broad, system level FMEA before the detailed design stage, since they know little about the specific components. However, the insights from doing the analysis very early can have a significant impact on reliability and quality. Do the best you can with the information available. One can begin an FMEA as soon as the QFD requirements are identified. In addition, the early system level FMEA lays vital groundwork for later, more detailed FMEAs.

Not identifying key failures using FMEA

A crucial step in any risk management tool is anticipating what might go wrong. It is difficult to assess the probability and consequences for a failure mode that the development team has not identified – much less provide design solutions. Formulating an extensive list of failure modes is a crucial step for predicting the overall reliability of a system. Of course, anticipating every mode of failure is not possible, but development teams should attempt to uncover as many potential failures as possible at the earliest opportunity. The procedures for FMEA are only loosely defined in the literature, resulting in a subjective analysis dependent on the experience of the FMEA team. In fact, FMEAs completed by different engineers can give entirely different results.


Problems with timing and execution lead to one of the biggest problems with FMEA: not identifying critical failure modes that occur during use. Some failures are difficult to anticipate during the pre-manufacturing stages, including:

• interfaces with other systems

• assembly and service errors (including but not limited to manufacturing variation)

• unexpected operating conditions

• unanticipated customer use

In the next few sections, we attempt to systematize the procedure and relate failure modes directly to the customer requirements and product functions.

12.1.6 “Advanced” FMEA for ME317

In an attempt to address some of the shortcomings of traditional FMEA, we define a straightforward and effective way to overcome some problems. By examining product function-structure relationships and taking customer requirements into account, we can identify a more comprehensive set of failure modes early on. The next section gives an abbreviated explanation of the Advanced FMEA approach. The general procedure for Advanced FMEA for ME317 is outlined below.

Step 1: Define the Voice of the Customer and Function-Structure relationship

[Diagram: VOC’s E) F) G) H) linked to a function tree (H, J, K, L) that maps to structure elements (a, b, c, d)]

Step 2: Insert sub-function failures: identify causes with respect to structure; identify effects with respect to functions and VOC’s

[Diagram: the same map with the sub-function failure not(k) inserted and causes A) B) C) identified]

Step 3: Negate VOC’s and identify causes not found in function-structure analysis

[Diagram: not(VOCs) E) F) G) H) with causes Q) R) S), Q') R') S'), and Q'') R'') S'')]

Step 4: Compile failure scenarios and rank

[Diagram: failure scenarios linking causes (A, B, C, Q, R, S and their primed variants) through effects such as not(b) and not(k) to end effects such as not(F) and not(G), each with a risk priority of high, med., or low]

Step 5: Prioritize failures by “risk” and sort by causes

[Diagram: bar chart of risk (0 to 100) for causes 1 to 9]

Step 6: Make design decisions that will reduce the high risk failures

Understanding Critical Functions

Failure modes are both outright functional failures of the product and also “customer dissatisfaction modes.” While it is impossible to capture every failure mode, by analyzing the product functions employed to provide performance and customer satisfaction, we can systematically identify a fundamental set of failure modes. A function-based approach to FMEA encompasses not only component failures, but also failure modes related to interfaces with other systems, the operating environment, and interaction with the user. Functional failures include both outright failures (where the product or an aspect of the product ceases to function) and those failures that diminish performance.

To develop an effective FMEA we should describe product functions and indicate their relationship to components and physical elements. One such technique for analyzing product functions is the “Function-Structure” map shown below for a hair dryer (Figure 12-4). The structure map can include external elements that contribute to functions, for example: power sources, user interaction, or other energy, material, or flow sources or destinations. The function-based FMEA has some distinct advantages:

• Its scope is greater than the components alone; it focuses on what the product should do

• It can be started early before all the components and elements are specified

• It is an organized manner of analyzing a product or system


[Figure: function-structure map for a hair dryer.
Function side: dry hair; provide airflow; heat air; provide user interface; supply air; convert elec. to rotation; convert rotation to flow; convey flow; support flow generation; supply electricity; convert elec. to heat; control temperature; transfer heat to air; provide controls; protect user; provide handle; control flow; provide electricity.
Structure side: hair dryer; fan assy; heating assy; front housing; rear housing; fan blade; fan housing; motor; heating element; thermocouple; temperature switch; switch actuator; heat shield; springs; front case; front grid; screen; switch; power cord; ground wire.
External elements: ambient air; power source.]

Figure 12-4 Function-Structure Map for a Hair Dryer

Incorporating the Voice of the Customer into FMEA

Performing the QFD ahead of FMEA is useful, since QFD helps engineers understand how the Voice of the Customer manifests itself in the design. We contend that any deviation from what the customer expects is a product failure. By negating the VOC’s, you can identify the true customer dissatisfaction modes or “negative VOC’s.” The Voice of the Customer captures performance attributes as well as “soft” functions. Soft functions refer to requirements that may not be critical to the product’s operation but affect customer perception, for example: noise level, appearance, ease of use, color, etc. In order to identify a more comprehensive set of failure modes, we need to examine the customer requirements identified through methods such as Quality Function Deployment (QFD). The Customer Requirements from a hair dryer QFD are given in Table 12-4.


Table 12-4 QFD Customer Requirements for a Hair Dryer

| Customer Requirements | Weight |
| --- | --- |
| Dries Quickly | 9 |
| Operates Safely | 3 |
| Comfortable to Hold | 9 |
| Quiet | 3 |
| Reliable | 3 |
| Attractive | 3 |
| Cheap to operate | 3 |
| Portable | 1 |
| Operates Easily | 1 |

For example, the customer attribute “Quiet” may not appear in functional analysis, since it is not strictly a function. Using the “negative VOC” (e.g., not(Quiet)), we can organize potential failure modes around the customer requirements.
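As a minimal sketch of the negation step, the Python snippet below turns the customer requirements of Table 12-4 into “negative VOC” labels such as not(Quiet). The function name `negate_vocs` is hypothetical, and listing the heaviest-weighted requirements first is an assumption made for illustration; the notes do not prescribe an ordering.

```python
# Customer requirements and weights copied from Table 12-4.
customer_requirements = {
    "Dries Quickly": 9,
    "Operates Safely": 3,
    "Comfortable to Hold": 9,
    "Quiet": 3,
    "Reliable": 3,
    "Attractive": 3,
    "Cheap to operate": 3,
    "Portable": 1,
    "Operates Easily": 1,
}


def negate_vocs(requirements):
    """Return (negative VOC, weight) pairs, heaviest-weighted first.

    Sorting by weight is an illustrative assumption, not part of the method.
    """
    negated = [(f"not({name})", weight) for name, weight in requirements.items()]
    # sorted() is stable, so requirements with equal weight keep table order.
    return sorted(negated, key=lambda pair: pair[1], reverse=True)


negative_vocs = negate_vocs(customer_requirements)
for label, weight in negative_vocs:
    print(f"{label:<28} weight {weight}")
```

Each negative VOC can then serve as the entry in the “Function or Requirement” column of a VOC-based FMEA row.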

Combining Function-based and Negative-VOC-based Analysis

Functional analysis is a systematic way of finding failures in the basic function of the product (implicit VOC’s = expected basic functionality) as well as performance related failures. Function-based FMEA will identify many failures which result in violation of customer requirements. To complement this analysis, a Negative-VOC-based FMEA is a top down method which can capture additional performance related failures as well as “soft” functions (see Figure 12-5).

[Figure: Quality Function Deployment shown as four linked phases (I House of Quality, II Parts Development, III Process Planning, IV Production Planning) that carry Customer Attributes through Engineering Characteristics, Parts Characteristics, and Key Process Operations to Production Requirements, next to a Function-Structure Map (illustrated with an ice maker, with functions such as create cubes, harvest cubes, and deposit cubes in bucket). Labels: implicit VOC’s; “soft” functions; performance related functions & metrics.]

Figure 12-5 Relationship of VOC and functions contributing to customer satisfaction

12.1.7 Performing the FMEA Worksheet

The FMEA worksheet consists of a table listing potential failure modes, causes, and effects organized around functions or the Voice of the Customer (VOC) (see Table 12-5). The following section describes the elements of the spreadsheet used for identifying failure scenarios. The elements for rating and prioritizing risk – Occurrence, Severity, Detection, and the Risk Priority Number – will be discussed in the next section.


Table 12-5 ME317 FMEA template

| Function or Requirement | Potential Failure Modes | Potential Causes of Failure | Occurrence | Local Effects | End Effects on Product, User, Other Systems | Severity | Detection Method / Current Controls | Detection | RPN | Actions Recommended to Reduce RPN | Responsibility and Target Completion Date |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |

Function or Requirement is the specific behavior (e.g., heat air) or attribute (e.g., quiet) intended by the designers. Use the functional analysis to list major sub-functions or use customer requirements (VOCs) in this column. Discrepancies from the intended function or requirement are the result of a failure in the system. We recommend organizing the FMEA first by listing functions. After functional failures have been identified, organize the FMEA around “negative” customer requirements.

Potential Failure Mode describes the departure from the intended function or requirement. For function-based FMEA, interpret failure modes as a sub-function occurring improperly or not at all. For example, the failure mode not(provide airflow) could be an improper function (low airflow) or a function not occurring (no airflow). For a VOC-based FMEA, consider failure modes as a negation of the customer requirement, for example not(easy to use).

Potential Causes are the reasons why the desired requirement fails. Consider the needed conditions for each customer requirement. Also, ask yourself “what has to happen for the function to occur properly?” Then, list the possible causes of failure due to components, usage conditions (especially regarding human interaction), operating environment, and interfaces with other systems. Many causes may require detailed analysis to determine the root cause of the failure mode. You may need to identify root causes by using a more specialized FMEA, or by other techniques such as “Five Why’s” (ask why five times until a root cause is determined) or the Ishikawa Fishbone Root Cause Analysis (organize root causes around man, machine, material, method, measurement, environment).

Local Effects are the immediate effects of the failure mode within the product or system being analyzed. Local effects impact other sub-functions of the device but may not be immediately recognizable to the user. For example, a local effect for the function “heat air” might be “reduced current to the heater coil.” For small systems or for high level, simple functional models, the local effects column may be equivalent to the end effects.

End Effects on Product, User, Other Systems are the noticeable effects on performance, safety, and perceived quality. As a designer, this is the effect you are trying to prevent.

Method of Detection refers to the means to prevent the Local Effects. Methods of Detection can be inspections, tests, sampling, etc. Detection can take place in the design stages, through testing or design reviews, or in manufacturing, through inspections. Sometimes, there are no detection methods available. If no methods are possible, the detection method is “none.”
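The worksheet columns described above can be held in a small record type. The sketch below is one possible Python representation, not a prescribed format: the class name `FmeaRow` and its field names are hypothetical, and the “Actions Recommended” and “Responsibility” columns are left out for brevity. The example values come from the motor short-circuit scenario used in these notes (see Figure 12-6).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FmeaRow:
    """One failure scenario, i.e. a single row of the FMEA worksheet."""

    function_or_requirement: str    # intended behavior or attribute
    failure_mode: str               # departure from the intended function
    cause: str                      # why the function or requirement fails
    local_effect: str               # immediate effect inside the product
    end_effect: str                 # effect noticed on product, user, other systems
    detection_method: str = "none"  # "none" when no method is available
    # Ratings are assigned later, during risk ranking, so they start empty.
    occurrence: Optional[int] = None
    severity: Optional[int] = None
    detection: Optional[int] = None

    def is_rated(self) -> bool:
        """True once Occurrence, Severity, and Detection are all filled in."""
        return None not in (self.occurrence, self.severity, self.detection)


row = FmeaRow(
    function_or_requirement="convert electricity to rotation",
    failure_mode="not(convert electricity to rotation)",
    cause="motor fails outright due to a short circuit",
    local_effect="not(provide airflow)",
    end_effect="not(dry hair)",
)
print(row.detection_method, row.is_rated())
```

Keeping the three ratings optional mirrors the order of work in the notes: scenarios are identified first and rated afterwards.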

12.1.8 Identifying failures

Function-based approach: Figure 12-6 shows an example of a cause-and-effect chain of events derived from the function-structure map. There are many such “failure scenarios” or “event chains” possible in an FMEA and you should list each on a separate row with an associated risk ranking. It should be noted that element (structure) failures can have many causes, and each should be listed on a separate line on the FMEA. For example, the motor could fail outright due to a short circuit, or it could have a bearing failure which would degrade performance – clearly two distinct failure scenarios that should be separate line-items for the FMEA.

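[Figure: two failure scenarios traced through the functional levels.
Scenario 1: motor failure → not(convert electricity to rotation) → not(provide airflow) → not(dry hair)
Scenario 2: fan obstructed → not(convert rotation to flow) → not(provide airflow) → not(dry hair)
Levels, from cause to end effect: mapped elements (potential causes) → sub-function (failure mode) → major sub-function (local effect) → main function (end effect)]

Figure 12-6 Two “Failure scenarios,” their functional levels and relationship to FMEA. Note, each scenario should be its own line-item on the FMEA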

Negative-VOC based approach: by negating the VOC’s we can capture failures possibly missed by the function-based analysis. For example, not(comfortable to hold) could be caused by a vibrating hair dryer, caused by vibration of the motor, caused by eccentric motor rotation, caused by rotor/stator misalignment … which could, in turn, have more causes. These failures should be made explicit and ranked using the risk elements, which we will discuss in the next section.
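The function-based event chains of Figure 12-6 can be generated mechanically once the function tree and the cause-to-function mapping are written down. The following Python sketch assumes a simple parent lookup for the function hierarchy; the names `parent_function`, `causes`, and `failure_chain` are hypothetical, and only the part of the hair dryer map shown in Figure 12-6 is included.

```python
# Parent of each function in a partial hair dryer function tree (Figure 12-6).
parent_function = {
    "provide airflow": "dry hair",
    "convert electricity to rotation": "provide airflow",
    "convert rotation to flow": "provide airflow",
}

# Structure-side causes mapped to the sub-function they disable.
causes = {
    "motor failure": "convert electricity to rotation",
    "fan obstructed": "convert rotation to flow",
}


def failure_chain(cause, failed_function):
    """Walk up the function tree: cause -> failure mode -> ... -> end effect."""
    chain = [cause]
    function = failed_function
    while function is not None:
        chain.append(f"not({function})")
        # The main function has no parent, which ends the walk.
        function = parent_function.get(function)
    return chain


scenarios = [failure_chain(cause, function) for cause, function in causes.items()]
for scenario in scenarios:
    print(" -> ".join(scenario))
```

Each printed chain corresponds to one line-item on the FMEA; a second cause for the same element, such as a bearing failure in the motor, would be added as a separate entry and produce its own row.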

12.1.9 Ranking Risks

The Risk Priority Number (RPN) has evolved with the FMEA spreadsheet as an index used to prioritize failures in most FMEA standards. The RPN consists of three elements:

• Occurrence (O) – how likely is the cause to occur and result in the failure mode?

• Severity (S) – how serious are the end effects?

• Detection (D) – how likely is the failure to be detected before it reaches the customer?

These indices are usually rated on a 1-10 scale and are described in more detail below. The risk priority number is the product of these three items:

RPN = (Occurrence) × (Severity) × (Detection)

Probability of Occurrence (O) is defined as how frequently the specific failure cause is projected to occur and result in the “failure mode”. The literature prescribes that Occurrence is assigned to the cause (and failure mode) and has nothing to do with the probability of the end effects. We recommend associating the Occurrence rating with the entire failure scenario, since some causes could have many different effects. Occurrence should refer to the probability of cause → failure mode → effect/event. In mathematical terms:

Probability of failure = (Probability of cause) × (Probability of failure given the cause)

Occurrence scores are generated on the basis of an industry- or company-specific mapping from probabilities to a 1-10 scale (see Table 12-6).


Table 12-6 Example of Occurrence criteria [AIAG, 1995]

| Probability of Failure | Possible Failure Rates | Occurrence Ranking |
| --- | --- | --- |
| Very High: Failure is almost inevitable | ≥ 1 in 2 | 10 |
| | 1 in 3 | 9 |
| High: Repeated failures | 1 in 8 | 8 |
| | 1 in 20 | 7 |
| Moderate: Occasional failures | 1 in 80 | 6 |
| | 1 in 400 | 5 |
| | 1 in 2,000 | 4 |
| Low: Relatively few failures | 1 in 15,000 | 3 |
| | 1 in 150,000 | 2 |
| Remote: Failure is unlikely | 1 in 1,500,000 | 1 |
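To illustrate how a scenario probability might be turned into an Occurrence score, the Python sketch below multiplies the probability of the cause by the probability of failure given the cause and looks the result up in the failure rates of Table 12-6. The table lists discrete rates only, so the binning rule used here (take the highest ranking whose listed rate does not exceed the probability) is an assumption, as are the function names and the example probabilities.

```python
# (failure rate, Occurrence ranking) pairs from Table 12-6, highest rate first.
OCCURRENCE_TABLE = [
    (1 / 2, 10),
    (1 / 3, 9),
    (1 / 8, 8),
    (1 / 20, 7),
    (1 / 80, 6),
    (1 / 400, 5),
    (1 / 2_000, 4),
    (1 / 15_000, 3),
    (1 / 150_000, 2),
    (1 / 1_500_000, 1),
]


def scenario_probability(p_cause, p_failure_given_cause):
    """Probability of failure = P(cause) x P(failure given the cause)."""
    return p_cause * p_failure_given_cause


def occurrence_ranking(probability):
    """Map a probability to a 1-10 Occurrence ranking.

    Assumed binning rule: use the highest ranking whose listed failure rate
    is less than or equal to the probability; anything rarer is ranked 1.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    for rate, ranking in OCCURRENCE_TABLE:
        if probability >= rate:
            return ranking
    return 1


# Hypothetical numbers: the cause arises in 1 of 100 units and leads to
# the failure mode half of the time.
p = scenario_probability(0.01, 0.5)
print(p, occurrence_ranking(p))
```

A company-specific mapping would replace `OCCURRENCE_TABLE` while leaving the lookup unchanged.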

Severity of effect (S) is typically defined as an assessment of the seriousness of the potential “end effects,” and is assessed independently of the causes. We recommend assigning Severity to the entire failure scenario (causes, failure modes, and effects). Severity is estimated on a 1 to 10 scale (see Table 12-7).

Table 12-7 Severity of effect scores [AIAG, 1995]

| Effect | Severity of Effect | Severity Ranking |
| --- | --- | --- |
| Hazardous without warning | when a failure mode affects safe device operation without warning | 10 |
| Hazardous with warning | when a failure mode affects safe device operation with warning | 9 |
| Very high | device inoperable: loss of primary function | 8 |
| High | device operable: at a highly reduced level of performance | 7 |
| Moderate | device operable: at a reduced level of performance | 6 |
| Low | device operable: at a slightly reduced level of performance | 5 |
| Very low | device operable: defect noticed by most customers | 4 |
| Minor | device operable: defect noticed by average customers | 3 |
| Very minor | device operable: defect noticed by discriminating customers | 2 |
| None | almost no effect | 1 |

Detection (D), sometimes called Detectability, has no standard definition. There is some confusion surrounding this index, since different definitions exist for this term. If the team does not have a good understanding of this index, we recommend using a value of “1” for all fields and the team can fill it in later if time permits. Some criticisms of Detection are discussed later in the appendix section. The two most common interpretations are as follows:

Definition 1 The assessment of the ability of the “design controls” to identify a potential cause or design weakness before the component, subsystem or system is released for production. Detection scores are generated on the basis of likelihood of detection by the relevant company design review and testing procedures program (see Table 12-8).


Table 12-8 Detection criteria [Palady, 1995]

| Likelihood of Detection | Detection Ranking |
| --- | --- |
| No system implemented, no awareness of quality, definition of product quality inconsistent – based on individual discretion | 10 |
| Completely reactive to problems identified during manufacturing, no formal programs, some awareness of product quality | 9 |
| Completely dependent on operator self-inspection with periodic quality control inspection, no implementation of quality methods, no formal procedures | 8 |
| Partial implementation of quality methods, sampling inspection plans and random audits, 100 percent inspection | 7 |
| Initial stages of the quality program implemented, little automated process controls, partial alignment | 6 |
| Quality programs developed but not fully implemented, no formal design review, some automated process controls, dependent on adherence to standard operating procedures | 5 |
| Quality programs in place, awareness of reliability, no formal program, design reviews partially implemented, a mix of automated and human intervention process controls | 4 |
| Emphasis placed on design reviews, full quality program developed and implemented, currently training and partial implementation of reliability programs, automated process for the majority of operations | 3 |
| Effective design reviews, implementation of quality and reliability programs, highly automated process controls | 2 |
| Very effective design review system, mature quality control and reliability programs, state-of-the-art process controls | 1 |

Definition 2 What is the chance of the customer catching the problem before the problem results in catastrophic failure (Table 12-9).

Table 12-9 Detection criteria [Palady, 1995]

Likelihood of Detection | Detection Ranking
Almost impossible to detect | 10
Remote detection | 9
Very slight detection | 8
Slight detection | 7
Low detection | 6
Medium detection | 5
Moderate chance of detection | 4
High probability of detection | 3
Very high probability of detection | 2
Almost certain to detect | 1

Risk Priority Number

Larger RPNs indicate the need for corrective action or failure resolution. Give special attention to the effect when the severity rating is high, regardless of the RPN. You should construct a Pareto chart of RPNs vs. causes or failure modes to clearly summarize the FMEA. Note that each discrete failure scenario (i.e., mode, cause, and effect) should have its own associated Occurrence, Severity and Detection values, and therefore a distinct RPN. Again, we define the Risk Priority Number (RPN) as follows:

RPN = (Occurrence) × (Severity) × (Detection)

Actions Recommended to Reduce RPN is a list of corrective actions and failure resolutions. Recommendations could include, in order of priority:


1) Design solutions to eliminate the failure mode or reduce its likelihood, including functional redundancies and error proofing the assembly, installation and usage.

2) Actions to reduce the severity of the failure mode in terms of its impact on the user, performance, and other systems.

3) Developing means of detecting causes of failure modes during manufacturing, including inspection, testing, and error proofing.

4) Tests to provide more information to assess Likelihood and Severity.

5) Providing diagnostics to easily identify the failure mode or cause during manufacturing or operation.

6) Establishing periodic maintenance or check-ups to enhance availability and safety.

[Pareto chart: RPN (left axis, 0 to 500) and cumulative percentage (right axis, 0 to 100) plotted against causes A through E, with each cause appearing several times]

Figure 12-7 A Pareto Chart of RPN vs. Causes with many repeated causes

The output of the FMEA is a list of failure modes and actions recommended to mitigate these risks. Companies should use the FMEA to prioritize the activities listed above. Frequently, FMEA teams raise the question: what should we do about failure modes that appear frequently in the FMEA but might not have individually high RPNs? Since failure modes often have multiple causes, the same failure mode might show up repeatedly (see Figure 12-7). There are two recommendations to deal with this situation: 1) list the cumulative RPN (the sum of the individual RPNs) associated with each failure mode, or 2) list the cumulative RPN associated with each cause. This also serves the purpose of compacting the FMEA to its failure modes or causes. Plotting a Pareto chart of the cumulative RPNs can quickly highlight which failure modes and/or causes to address first. Figure 12-8 shows that cause “E” has the highest associated cumulative risk, which was not apparent from the standard Pareto chart.
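The cumulative RPN recommendation can be sketched in a few lines of Python. This is a minimal illustration, not part of the original worksheet: the rows, the cause labels, and the helper names `rpn`, `cumulative_rpn`, and `pareto` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical worksheet rows: (failure mode, cause, Occurrence, Severity, Detection).
rows = [
    ("no air flow", "A", 2, 8, 6),
    ("no air flow", "E", 2, 8, 3),
    ("reduced air flow", "E", 3, 6, 4),
    ("no heat", "E", 2, 8, 3),
    ("no heat", "B", 1, 8, 5),
]


def rpn(occurrence, severity, detection):
    """RPN = (Occurrence) x (Severity) x (Detection)."""
    return occurrence * severity * detection


def cumulative_rpn(rows, key_index):
    """Sum the RPNs of all rows sharing a failure mode (key_index=0) or a cause (key_index=1)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += rpn(*row[2:])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def pareto(ranked):
    """Attach the running cumulative percentage used on a Pareto chart."""
    grand_total = sum(value for _, value in ranked)
    running = 0
    chart = []
    for name, value in ranked:
        running += value
        chart.append((name, value, round(100 * running / grand_total, 1)))
    return chart


by_cause = cumulative_rpn(rows, key_index=1)
by_mode = cumulative_rpn(rows, key_index=0)
for name, total, cum_percent in pareto(by_cause):
    print(name, total, cum_percent)
```

In this made-up data the single highest RPN belongs to cause A, yet cause E carries the largest cumulative RPN because it appears in three rows. That is the kind of situation the cumulative chart is meant to expose.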


[Bar chart: cumulative RPN (0 to 3000) for causes E, A, B, C, D]

Figure 12-8 Cumulative RPN summed for each cause

Example FMEA Worksheet for Hair Dryer

The following page shows a detailed example of an FMEA performed on the basis of critical functions. Failure modes are the result of sub-functions not occurring, and are mapped to components, usage, and environment to determine the root cause (Table 12-10).


Table 12-10 Function based FMEA Worksheet for a Hair Dryer

Failure Modes & Effects Analysis
System Name: Blow Dryer
FMEA Number: KosDri-217A
Major Function: Dry Hair: Provide Flow, Heat Air, Provide User Interface
Page: 1 of 2
Prepared By: M.A. Nuhrd
Date: 12/11/97

The Detection Method/Current Controls, Actions Recommended to Reduce RPN, and Responsibility and Target Completion Date columns are blank in this example and are omitted below. A blank Function cell means the row continues the function above it.

Function or Requirement | Potential Failure Modes | Potential Causes of Failure | Occurrence | Local Effects | End Effects on Product, User, Other Systems | Severity | Detection | RPN

PROVIDE AIR FLOW
supply electricity to fan | no electricity to fan motor | broken fan switch | 2 | no air flow | hair not dried | 8 | 6 | 96
 | no electricity to fan motor | loose switch connection | 2 | no air flow | hair not dried | 8 | 5 | 80
 | no electricity to fan motor | short in power cord | 2 | no air flow | hair not dried | 8 | 3 | 48
 | no electricity to fan motor | short in power cord | 2 | | potential user injury | 10 | 3 | 60
 | no electricity to fan motor | no source power | 3 | no air flow | hair not dried | 8 | 1 | 24
 | low current to fan motor | low source power | 1 | reduced air flow | inefficient drying | 6 | 8 | 48
convert electric power to rotation | no rotation | motor failure | 2 | no air flow | hair not dried | 8 | 5 | 80
 | no rotation | obstruction impeding fan | 3 | no air flow | hair not dried | 8 | 1 | 24
 | no rotation | obstruction impeding fan | 3 | motor overheat | melt casing | 10 | 1 | 30
 | low rotation | rotor/stator misalignment | 1 | reduced air flow | hair not dried | 8 | 4 | 32
 | low rotation | rotor/stator misalignment | 1 | | noise generation | 5 | 1 | 5
 | low rotation | hair/foreign matter increasing friction | 4 | reduced air flow | inefficient drying | 6 | 1 | 24
convert rotation to flow | no fan rotation | loose or worn fan connection to rotor | 1 | no air flow | hair not dried | 8 | 2 | 16
convey flow | low incoming flow | blocked inlet | 2 | reduced air flow | inefficient drying | 6 | 1 | 12
 | low incoming flow | blocked inlet | 2 | reduced air flow | high temperatures generated | 10 | 1 | 20
 | low outgoing flow | blocked outlet | 1 | reduced outflow | inefficient drying | 6 | 1 | 6
 | low outgoing flow | blocked outlet | 1 | reduced outflow | high temperatures generated | 10 | 1 | 10
 | no flow control | broken fan switch | 1 | no flow | hair not dried | 8 | 5 | 40

HEAT AIR
supply electricity to heater | no electricity to heater | broken temperature switch | 1 | no heat generated | hair not dried | 8 | 6 | 48
 | no electricity to heater | short in power cord | 2 | no heat generated | hair not dried | 8 | 3 | 48
 | no electricity to heater | no source power | 1 | no heat generated | hair not dried | 8 | 1 | 8
 | low current to heater | low source power | 1 | insufficient heat generated | inefficient drying | 6 | 3 | 18
convert electric power to heat | no heat generation | short in heater element | 2 | no heat generated | hair not dried | 8 | 2 | 32
 | no heat generation | short in heater element | 2 | | user injury | 10 | 2 | 40
control temperature | temperature too high | thermocouple failure | 1 | excessive heat generated | discomfort to user | 9 | 5 | 45
 | temperature too high | thermocouple failure | 1 | excessive heat generated | burns user | 10 | 5 | 50
 | temperature too high | temperature switch stuck closed | 1 | excessive heat generated | discomfort to user | 9 | 4 | 36
 | temperature too high | temperature switch stuck closed | 1 | excessive heat generated | burns user | 10 | 4 | 40
 | temperature too low | thermocouple failure | 1 | insufficient heat generated | inefficient drying | 6 | 5 | 30
 | temperature too low | temperature switch stuck open | 1 | no heat generated | inefficient drying | 6 | 4 | 24
 | temperature too low | incoming air too cold | 2 | insufficient heat generated | inefficient drying | 6 | 1 | 12

PROVIDE USER INTERFACE
provide controls | no user control | broken actuator switch | 1 | no heat generated and no air flow | hair not dried | 8 | 3 | 24
 | no user control | actuator switch stuck or jammed | 2 | no heat generated and no air flow | hair not dried | 8 | 2 | 32
protect user | exposure to fan blade | damaged rear screen | 1 | allows foreign objects access to blade | injury to fingers | 10 | 1 | 10
 | exposure to heating element | broken front grid | 1 | exposed heating | burns user | 10 | 1 | 10


In the next table we show an example of an FMEA based on the “negative VOC.”

Table 12-11 VOC based FMEA Worksheet for a Hair Dryer

Failure Modes & Effects Analysis
System Name: Blow Dryer
FMEA Number: KosDri-217A
Major Function: Quiet, Attractive, Comfortable
Page: 2 of 2
Prepared By: M.A. Nuhrd
Date: 12/11/97

The Detection Method/Current Controls, Actions Recommended to Reduce RPN, and Responsibility and Target Completion Date columns are blank in this example and are omitted below. A blank Function or Failure Mode cell means the row continues the entry above it.

Function or Requirement | Potential Failure Modes | Potential Causes of Failure | Occurrence | Local Effects | End Effects on Product, User, Other Systems | Severity | Detection | RPN
Quiet | not(Quiet) | motor loose | 2 | vibration of motor | excessive noise | 3 | 2 | 12
 | | rotor/stator misalignment | 2 | eccentric rotation | excessive noise | 3 | 1 | 6
 | | loose casing | 3 | reduced noise insulation | excessive noise | 3 | 1 | 9
Attractive | not(attractive) | excessive UV exposure | 4 | faded color | degraded appearance | 2 | 4 | 32
 | | loose casing | 3 | gap between housing | degraded appearance | 2 | 1 | 6
Comfortable to hold | not(comfortable) | motor loose | 2 | vibration of motor | vibrates hand | 4 | 2 | 16
 | | rotor/stator misalignment | 2 | eccentric rotation | vibration on hand | 4 | 1 | 8
 | | loose casing | 3 | gap in handle | uncomfortable to hold | 5 | 1 | 15

12.1.9 The Important Role of FMEA in “Service as a Business”

There is a growing interest in industry in life-cycle design methods that support “service as a business,” i.e., selling product functions rather than hardware through lease arrangements or long-term service contracts. The keys to supporting “service as a business” are various maintenance technologies. Industries such as automotive, copiers and other office equipment, and power generation have historically invested heavily in such technologies. More recently, industries such as aircraft engines, medical equipment, and electronics manufacturing equipment have been following suit. Several types of technologies support this recent trend towards “service as a business.”

• Monitoring technology: Various forms of micro-sensors are now available for embedding into systems very inexpensively. The accuracy and reliability of such sensors are improving very rapidly, allowing applications not only in high-end systems but also in household appliances. In diagnostic functions, the artificial intelligence and expert systems that were popular in the 80’s have evolved into more specialized inference models suited to system diagnosis. Such models include systems based on Bayesian nets and belief nets. A recent focus of development is the prediction of future failures and maintenance needs, in addition to diagnostics after a failure has occurred. Some copiers even have self-recovery capability using micro actuators.

• Information management technology: The area attracting the most attention is remote monitoring and control using the internet. Assigning IP addresses is becoming common practice not only for high-end office equipment and power generating control systems, but also for medical equipment, aircraft engines, and even household appliances. The monitoring system constantly sends the pertinent information via the internet to high-speed computers that perform sophisticated diagnosis. The diagnostic system, in turn, provides maintenance instructions via the internet to on-board self-repair functions or service technicians, so that the equipment is maintained before its performance degrades. Past systems required a dedicated phone or cable connection to accomplish this type of on-line maintenance. The rapid advances in the internet extend on-line service to far wider applications at low cost.

• Design technology for life-cycle serviceability: The key to utilizing the emerging monitoring and information management technology is concurrent life-cycle engineering. Otherwise known as “design for serviceability,” this practice seeks to incorporate in upstream design not only the hardware design but also the development of sensors, diagnostic software, service logistics and processes, and overall serviceability. Stanford’s Manufacturing Modeling Laboratory (MML) has long focused on methods to support this goal: Advanced Failure Modes and Effects Analysis (AFMEA), and Design for Life-cycle Modularity. The goal is to address life-cycle management up front. Once the hardware design is complete, the opportunity to integrate the maintenance technology becomes limited.

While these technologies are enabling elements, the authors believe the main driver for concurrent and life-cycle design is the business trend that encourages service as a business. Jack Welch, CEO of GE, has declared that GE is a service enterprise, not a manufacturing company. He positions “service as a business” as one of GE’s three initiatives, along with globalization and use of the internet (e-commerce, e-engineering, etc.). Service operations are showing significant growth in most of GE’s businesses, including Aircraft Engines, which now earns more than 50% of its revenue through engine services. Medical equipment, power systems, and locomotives are all showing the same trend. Welch claims GE should not be selling hardware, but should provide customers with quality functionality (such as thrust or megawatts) with high availability. Thus, their business includes long-term service contracts or full-scope leases, making life-cycle service technology more and more important to GE.

Other industries, such as electronics manufacturing equipment, share the same challenge, since an outage of one machine could stop an entire production facility worth billions of dollars. Companies such as Applied Materials are seeking customer value through high availability as well as state-of-the-art process functions.

In the past, manufacturers saw maintenance and service as a negative proposition, i.e., something they did not want to do but had to do after the sale of the product. Recent trends in the US position the service business as a positive value proposition that differentiates a company’s product from its competitors. Furthermore, companies are turning service operations into an important revenue source. We believe this business model is making the US the leader in life-cycle design for serviceability, which supports the environment indirectly through extended product life and material usage.

The next and last section of this reader focuses on Stanford’s development of Scenario-based FMEA, a key method to support the identification of service modes and the concurrent design of the product, its monitoring systems, and service processes.


12.1.10 New Perspectives in the Risk Priority Number

FMEAs have been around for about three decades, and the Risk Priority Number (RPN) has evolved with the FMEA spreadsheet. Today, the RPN is still a popular measure of risk in the FMEA, but not everybody considers it a good rating. Some criticisms of the RPN include:

• RPN is not a consistent measure of risk (Harpster, 1998; Gilchrist, 1993)
• “Difficulty of detection” has many definitions and is not well understood (Palady, 1993)
• It takes a lot of time to rate the components of the RPN (O, S, D)
• The values of Occurrence, Severity and Detection are not consistent: their meaning depends on the FMEA reference manual used

In this section we introduce some alternatives to the RPN, specifically:

• A simplified version (high, medium, low for Severity and Occurrence only)
• A more accurate measure: Expected Cost (probability x cost)

Simplified Ranking of Risk

One of the criticisms of FMEAs is that they are tedious. A major contributor to the painstaking FMEA process is the Risk Priority Number and the rating of the associated indices (Severity, Occurrence, and Detection). We recommend using a simplified RPN when

• time is tight,
• it is very early in the design, or
• almost no information is available about failure rates and consequences

The simplified rating is based on the following:

• Rate probability of Occurrence High, Medium, or Low (9, 3, or 1)
• Rate Severity of failure High, Medium, or Low (9, 3, or 1)
• Multiply the two together and Pareto, or plot Occurrence vs. Severity (Figure 12-9)

[3 x 3 matrix: Occurrence (H, M, L) vs. Severity (H, M, L); the cells are labeled with priorities ranging from “highest” and “high” through “medium” to “as time permits”]

Figure 12-9 Occurrence vs. Severity
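The simplified rating can be sketched as follows. The 9/3/1 scores come from the list above; the failure names, their High/Medium/Low ratings, and the function name `simplified_risk` are hypothetical and only illustrate the mechanics.

```python
# Scores for High, Medium, Low as given in the simplified rating (9, 3, or 1).
LEVEL_SCORE = {"H": 9, "M": 3, "L": 1}


def simplified_risk(occurrence_level, severity_level):
    """Multiply the Occurrence score by the Severity score; Detection is not used."""
    return LEVEL_SCORE[occurrence_level] * LEVEL_SCORE[severity_level]


# Hypothetical failures: name -> (Occurrence level, Severity level).
failures = {
    "motor failure": ("M", "H"),
    "loose casing": ("H", "L"),
    "blocked inlet": ("L", "H"),
    "faded color": ("M", "L"),
}

# Highest simplified score first; this is the order a Pareto chart would show.
ranked = sorted(failures, key=lambda name: simplified_risk(*failures[name]), reverse=True)
for name in ranked:
    print(name, simplified_risk(*failures[name]))
```

With only three levels per index, ties are common (here “loose casing” and “blocked inlet” both score 9), which is one reason to plot Occurrence against Severity rather than rely on the product alone.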


Expected Cost: a More Accurate Measure of Risk

Problems with the RPN as a Measure

The RPN is the product of three ordinal scales. Ordinal scales are used for rank ordering, but the magnitude of an ordinal measure is not meaningful. Grades of beef, for example, are ordinal scales. An Occurrence rating of 8 is greater than a 4, but an 8 is not twice as likely as a 4. Severity and Detection are also ordinal scales: a Severity of 10 is not twice as bad as a 5. We could just as easily assign letters to the ten Severity/Occurrence/Detection categories, using A, B, C, D, … J instead of 1-10.

Occurrence, Severity, and Detection are ordinal scales: failures can be rank-ordered by any one of these criteria. The combination of these 3 elements through multiplication (or any other mathematical function) is not an “allowable transformation.” The magnitudes of the 3 RPN elements are not meaningful, yet we use the magnitude of the RPN to prioritize risk.
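A small Python sketch can make the ordinal-scale objection concrete. The two failures and the relabeling below are hypothetical. The point is only that an order-preserving relabeling of one index (which changes nothing about an ordinal scale) can reverse the ranking produced by multiplication.

```python
def rank_by_product(failures, occurrence_scale):
    """Order failure names by (relabeled Occurrence) x Severity, highest first."""
    return sorted(
        failures,
        key=lambda name: occurrence_scale(failures[name][0]) * failures[name][1],
        reverse=True,
    )


def identity(rank):
    """The usual 1-10 labels."""
    return rank


def squared(rank):
    """An order-preserving relabeling: 1, 4, 9, ..., 100 keeps every rank in the same order."""
    return rank ** 2


# Hypothetical failures: name -> (Occurrence rank, Severity rank).
failures = {"X": (8, 2), "Y": (4, 5)}

original_order = rank_by_product(failures, identity)   # products: X = 16, Y = 20
relabeled_order = rank_by_product(failures, squared)    # products: X = 128, Y = 80
print(original_order, relabeled_order)
```

Both labelings rank-order Occurrence identically, yet the product puts Y first under one and X first under the other. A priority order that depends on an arbitrary choice of labels is what “not an allowable transformation” means here.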

Discussion of “Occurrence”

Occurrence ratings are based on probabilities; however, the exact mapping depends on which FMEA procedure you are using. Occurrence ratings are only consistent if the FMEA Occurrence guidelines are known. Below are some sample Occurrence-probability relationships (Figure 12-10).

[Plot: Occurrence rating (0 to 10) against approximate probability (log scale, 1.E-11 to 1.E-01), with one curve each for AIAG, 1995; Palady, 1995; McDermott et al., 1996; and Humphries, 1994]

Figure 12-10 Different references use varying Occurrence-probability relationships

Occurrence is usually assigned to the failure mode and cause, but not to the conditional probability that the failure mode and cause will result in the end effect.

Discussion of “Detection”

Detection has two common interpretations:

1) A measure of how likely a failure is to escape the design and/or manufacturing process and reach the customer
2) A measure of how likely the customer is to “detect” a failure when it occurs


The first definition of Detection is basically trying to measure the organizational efficiency of a design review process and/or the quality control capability of a manufacturing line. This is extremely difficult to quantify, and it takes a lot of time to do so. Teams spend a lot of time arguing over the definition of Detection, and even more time trying to come up with a 1-10 number.

The second definition relates to how diagnosable the problem is. Consider a car that has run out of oil (the failure has occurred) and the oil warning light goes on. According to the second definition of Detection, if we think the driver will notice the warning light and get the car serviced, we give this a low Detection rating (scenario 1). If we think the warning light will go unnoticed, then we give the failure a high Detection rating (scenario 2). However, these are two different scenarios with different probabilities and consequences. They should be separate line items on the FMEA with individual Occurrence and Severity evaluations.

Both definitions of Detection are somewhat confusing and difficult to evaluate. It is our opinion that the intent of the Detection rating is to account for the probability that a failure mode results in the “end effect.” In other words, given a failure mode (and cause), what is the conditional probability that the end effect will happen? Unfortunately, this intent has been somewhat obscured by the definitions given in FMEA manuals. We should note that some standards ignore Detection completely (SAE FMECA) and use a probability-based criticality number to rate the probability of failures.

Comparing the RPN to Expected Cost as a Measure of Risk

The purpose of the FMEA is to prioritize failures in accordance with their risk. There are many definitions of risk, and we have listed a few below:

• The possibility of incurring damage (Hauptmanns and Werner, 1991)
• Exposure to chance of injury or loss (Granger and Henrion)
• Possibility of loss or injury (Webster)

Most definitions of risk contain elements of “chance” and of “consequences.” Probability and cost are frequently used to measure the elements of risk (Table 12-12).

Table 12-12 Using Probability and Cost to measure Risk

Question to be answered | Accepted Measure | Description
How likely is it to happen? | p (probability) | the chance the scenario will occur
If it does happen, what are the consequences? | c (cost) | the measure of the damage

The most common quantification of risk in many fields (e.g. Economics, Decision Theory, Probabilistic Risk Analysis, Insurance, etc.) is “expected cost”.

expected cost = probability x cost

As we discussed earlier, probability is related to Occurrence; likewise Severity could be related to cost (Figure 12-11).


[Plot: cost vs. Severity ranking, with separate curves for an appliance company, an automotive company, and an aerospace company]

Figure 12-11 Hypothetical cost-Severity relationships

The cost doesn’t necessarily have to be in monetary units; other “hazard” measures, such as levels of toxicity or fatalities, could be used. We can compare expected cost to RPN by using the Occurrence-probability relationship from the AIAG and a hypothetical cost-Severity relationship which is linear (Table 12-13).

Table 12-13 Occurrence-Probability and Severity-Cost

Occurrence | Probability
1 | 6.66667E-07
2 | 6.66667E-06
3 | 6.66667E-05
4 | 0.0005
5 | 0.0025
6 | 0.0125
7 | 0.05
8 | 0.125
9 | 0.3333
10 | 0.75

Severity | Cost ($)
1 | 50
2 | 100
3 | 150
4 | 200
5 | 250
6 | 300
7 | 350
8 | 400
9 | 450
10 | 500

For the time being, let’s ignore the Detection index (assume Detection = 1). We can calculate the Risk Priority number (O x S) and expected cost (p x c) for pairs of probability and costs.
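The calculation can be sketched in Python as below. The probability and cost values are transcribed from Table 12-13; the function names are hypothetical, and Detection is fixed at 1 as stated above.

```python
import math

# Values as read from Table 12-13: the AIAG Occurrence-probability relationship and
# a hypothetical linear Severity-cost relationship of $50 per Severity rank.
OCCURRENCE_PROBABILITY = {
    1: 6.66667e-7, 2: 6.66667e-6, 3: 6.66667e-5, 4: 0.0005, 5: 0.0025,
    6: 0.0125, 7: 0.05, 8: 0.125, 9: 0.3333, 10: 0.75,
}
SEVERITY_COST = {rank: 50 * rank for rank in range(1, 11)}


def build_pairs():
    """Return all 100 (Occurrence, Severity) pairs with RPN (O x S) and expected cost (p x c)."""
    pairs = []
    for occ, prob in OCCURRENCE_PROBABILITY.items():
        for sev, cost in SEVERITY_COST.items():
            pairs.append({"O": occ, "S": sev, "rpn": occ * sev, "expected_cost": prob * cost})
    return pairs


def rpns_for_expected_cost(pairs, target):
    """All RPN values whose pair has (numerically) the given expected cost."""
    return sorted(p["rpn"] for p in pairs if math.isclose(p["expected_cost"], target))


def expected_costs_for_rpn(pairs, target):
    """All expected costs whose pair has the given RPN."""
    return sorted(p["expected_cost"] for p in pairs if p["rpn"] == target)


pairs = build_pairs()
print(rpns_for_expected_cost(pairs, 6.25))   # same expected cost, different RPNs
costs = expected_costs_for_rpn(pairs, 10)
print(min(costs), max(costs))                # same RPN, very different expected costs
```

Plotting `rpn` against `expected_cost` for these 100 pairs gives a cloud of points rather than a single increasing curve, which is the behavior the comparison is meant to illustrate.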


[Scatter plot: RPN (0 to 100) against expected cost (log scale, 0.00001 to 1000)]

Figure 12-12 RPN vs. expected cost for 100 pairs of probability and cost

Although the RPN and expected cost values have a positive correlation, the mapping of RPN to expected cost is not 1-to-1. As shown in Figure 12-12, the points cover an area and not a monotonically increasing line. The results reveal some interesting facts:

1) Failures with the same expected cost could have very different RPN values. For this set of data, an expected cost of $6.25 could have an RPN ranging from 8 to 60. Two entirely different priorities would be given to failures with the same expected cost (Table 12-14).

Table 12-14

Probability | Cost | Expected cost | Occurrence Rank, O | Severity Rank, S | RPN* (O x S x D)
0.125 | $50 | $6.25 | 8 | 1 | 8
0.0125 | $500 | $6.25 | 6 | 10 | 60
*D = 1

2) Failures with the same RPN could have very different expected costs. For this set of data, an RPN of 10 could have expected costs ranging from $0.00033 to $37.50. Two failures with vastly different expected costs could have identical RPNs (Table 12-15).

Table 12-15

Probability | Cost | Expected cost | Occurrence Rank, O | Severity Rank, S | RPN* (O x S x D)
0.75 | $50 | $37.50 | 10 | 1 | 10
6.66 x 10^-7 | $500 | $0.00033 | 1 | 10 | 10
*D = 1

3) A failure mode can have a higher RPN and a lower expected cost than another. For example:

Table 12-16

Probability | Cost | Expected cost | Occurrence Rank, O | Severity Rank, S | RPN* (O x S x D)
0.75 | $50 | $37.50 | 10 | 1 | 10
6.66 x 10^-5 | $500 | $0.033 | 3 | 10 | 30
*D = 1


We see in Table 12-16 that while a failure scenario may be given higher priority using the RPN (say, 30 compared to 10), its expected cost could be orders of magnitude smaller ($0.033 compared to $37.5 in this example.) This could lead to “myopic” decisions about which failures to prioritize. For example, by randomly choosing pairs of probabilities and costs we can see that the RPN does not necessarily increase with expected cost (Figure 12-13).

[Chart: expected cost (log scale, 0.0001 to 100000000) and RPN (0 to 100) plotted for a set of potential failures]

Figure 12-13 Failures ordered by Expected cost do not match the ordering by RPN

[Chart: Log(Expected Cost) (0 to 18) for failures 1 to 65, sorted by descending expected cost]

Figure 12-14a Random failures sorted by Expected Cost

[Chart: Log(Expected Cost) (0 to 18) for failures 1 to 65, sorted by descending RPN]

Figure 12-14b Random failures sorted by RPN


Ordering failure “importance” using decreasing RPN could give an entirely different priority than using decreasing expected cost. While one failure mode may have a higher RPN than another, its expected cost could be much lower. As shown in Figure 12-14, the RPN gives lower priority to some failures which have a high expected cost. For a linear cost function, we have demonstrated that the Risk Priority Number is a distorted measure of risk with respect to expected cost. Including the Detection index only exaggerates the fundamental problems with the RPN. In the next section, we explore other combinations of Severity-cost and Occurrence-probability relationships.

Other Cost and Probability Functions

Prioritizing risks using the RPN will result in a different rank order of risks compared to expected cost. We have found the same “area” relationship (as opposed to a preferred 1-to-1 mapping) between RPN and expected cost using a wide variety of Severity-cost and Occurrence-probability relationships. We feel that for almost any Severity-cost and Occurrence-probability relationship, the ordering of failures will not match.

Why not use Expected Cost for FMEA?

Understandably, engineers are reluctant to use probabilities or cost values without substantiating data. There is a stigma attached to using probability and cost estimates: the fear of being taken to task over specific probability and cost data. Apparently, however, there is minimal apprehension about using 1-10 point estimates of probability and severity (cost). Probability and cost are better measures than their RPN counterparts, and one could argue that guessing with a meaningful measure is preferable to guessing with a less meaningful one. Both are estimates, but the set containing probability and cost contains more information (see Table 12-17 below).

Table 12-17 Using Expected Cost

disadvantages of using expected cost | advantages of using expected cost
failure costs are difficult to estimate | probability and cost are better measures of chance and consequences: both are ratio scales and multiplication is legitimate
failure probabilities are difficult to estimate | probability estimates are already used for estimating Occurrence
1-10 scales are familiar and “quick” | sub-system FMEA using probability can be more easily integrated into system FMEA
 | probability and cost measures are consistent and global: they are better for communicating FMEA data
 | expected cost ties FMEA to $$

Table 12-18 below shows the Kos-dri ranked using “expected cost.” Probabilities are used in the Occurrence columns, and costs are used for Severity. The ranking of risks is slightly different than using the traditional RPN. Expected cost offers some advantages.


Table 12-18 Kos-dri example using Expect Cost Failure Modes & Sysytem Name: Blow Dryer FMEA Number : KosDri-217AEffects Analysis Major Function: Dry Hair: Provide Flow, Heat Air, Provide User Interface Page : 1 of 2

Prepared By: M.A. Nuhrd Date 12/11/1997

Function/ Requirement Potential Failure Modes Potential Causes of

Failure

Prob

abili

ty

of

Occ

urre

nce

Local EffectsEnd Effects on Product, User, Other Systems Se

verit

y (C

ost)

Detection Method/ Current Controls D

etec

tion R

PN

Actions Recommended to

Reduce RPN

Responsibility and Target

Completion Date

PROVIDE AIR FLOW

supply electricity to fan no electricity to fan motor broken fan switch 0.001 no air flow hair not dried 30 1 0.03

no electricity to fan motor loose switch connection 0.001 no air flow hair not dried 30 1 0.03

no electricity to fan motor short in power cord 0.001 no air flow hair not dried 30 1 0.03

no electricity to fan motor short in power cord 0.001 potential user injury 10000 1 10

no electricity to fan motor no source power 0.01 no air flow hair not dried 30 1 0.3

low current to fan motor low source power 0.0001 reduced air flow inefficient drying 3 1 0.0003convert electric power to rotation no rotation motor failure 0.001 no air flow hair not dried 30 1 0.03

no rotation obstruction impeding fan 0.01 no air flow hair not dried 30 1 0.3

no rotation obstruction impeding fan 0.01 motor overheat melt casing 10000 1 100

low rotation rotor/stator misalignment 0.0001 reduced air flow hair not dried 30 1 0.003

low rotation rotor/stator misalignment 0.0001 noise generation 5 1 0.0005

low rotation hair/foreign matter increasing friction 0.1 reduced air flow inefficient drying 5 1 0.5

convert rotation to flow no fan rotation loose or worn fan connection to rotor 0.0001 no air flow hair not dried 30 1 0.003

convey flow low incoming flow blocked inlet 0.001 reduced air flow inefficient drying 5 1 0.005

low incoming flow blocked inlet 0.001 reduced air flow high temperatures generated 10000 1 10

low outgoing flow blocked outlet 0.0001 reduced outflow inefficient drying 5 1 0.0005

low outgoing flow blocked outlet 0.0001 reduced outflow high temperatures generated 10000 1 1

no flow control broken fan switch 0.0001 no flow hair not dried 30 1 0.003HEAT AIR 1supply electricity to heater no electricity to heater broken temperature

switch 0.0001 no heat generated hair not dried 30 1 0.003

no electricity to heater short in power cord 0.001 no heat generated hair not dried 30 1 0.03no electricity to heater no source power 0.0001 no heat generated hair not dried 30 1 0.003

low current to heater low source power 0.0001 insufficient heat generated inefficient drying 5 1 0.0005

convert electric power to heat no heat generation short in heater element 0.001 no heat generated hair not dried 30 1 0.03

no heat generation short in heater element 0.001 user injury 10000 1 10

control temperature temperature too high thermocouple failure 0.0001 excessive heat generated discomfort to user 50 1 0.005

temperature too high thermocouple failure 0.0001 excessive heat generated burns user 10000 1 1

temperature too high temperature switch stuck closed 0.0001 excessive heat

generated discomfort to user 50 1 0.005

temperature too high temperature switch stuck closed 0.0001 excessive heat

generated burns user 10000 1 1

temperature too low thermocouple failure 0.0001 insufficient heat generated inefficient drying 5 1 0.0005

temperature too low temperature switch stuck open 0.0001 no heat generated inefficient drying 5 1 0.0005

temperature too low incoming air too cold 0.001 insufficient heat generated inefficient drying 5 1 0.005

PROVIDE USER INTERFACE

provide controls no user control broken actuator switch 0.0001 no heat generated and no air flow hair not dried 30 1 0.003

no user control actuator switch stuck or jammed 0.001 no heat generated and no air flow hair not dried 30 1 0.03

protect user exposure to fan blade damaged rear screen 0.0001 allows foreign objects access to blade injury to fingers 10000 1 1

exposure to heating element broken front grid 0.0001 exposed heating burns user 10000 1 1
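In the rows above, the last value appears to equal the probability multiplied by the cost (the detection score is 1 throughout). The Python sketch below reproduces that arithmetic for a few rows copied from the table and ranks them by expected cost. The function name `expected_cost` and the tuple layout are our own illustrative choices, not part of the original worksheet.

```python
# Rows copied from the hair dryer table: (cause, end effect, probability, cost).
failure_modes = [
    ("blocked inlet", "inefficient drying", 0.001, 5),
    ("blocked inlet", "high temperatures generated", 0.001, 10000),
    ("short in heater element", "hair not dried", 0.001, 30),
    ("thermocouple failure", "burns user", 0.0001, 10000),
    ("damaged rear screen", "injury to fingers", 0.0001, 10000),
]


def expected_cost(probability, cost):
    """Point-estimate expected cost (assumed to be the table's last column)."""
    return probability * cost


# Highest expected cost first, so the riskiest cause/effect pairs lead the list.
ranked = sorted(
    failure_modes,
    key=lambda fm: expected_cost(fm[2], fm[3]),
    reverse=True,
)

for cause, effect, p, c in ranked:
    print(f"{expected_cost(p, c):8.3f}  {cause} -> {effect}")
```

Ranking this way shows that a rare but costly effect (a blocked inlet producing high temperatures) can outweigh a more frequent nuisance failure.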

Including Uncertainty

Risk can be assessed with probability and cost estimates that account for uncertainty. Analysts who hesitate to commit to a single point estimate for probability or cost can usually give a range of values with a “best estimate,” even without detailed information. The estimated probability and cost distributions can then be combined mathematically to determine the expected-cost distribution, or “risk profile.” See Table 12-19 below.
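As a minimal sketch of how such estimates might be combined, the Python below treats the 10th, 50th, and 90th percentile estimates as a three-point discrete distribution with weights 0.2, 0.6, and 0.2 (the weights listed in Table 12-19). It assumes that probability and cost are independent, which the text does not state. The names `risk_profile` and `cumulative` are hypothetical.

```python
from itertools import product

# Weights for the 10th/50th/90th percentile estimates (from Table 12-19).
WEIGHTS = (0.2, 0.6, 0.2)


def risk_profile(probs, costs, weights=WEIGHTS):
    """Return sorted (expected cost, weight) pairs for all 3 x 3 combinations.

    Assumes the probability and cost estimates are independent, so the weight
    of each combination is the product of the two percentile weights.
    """
    outcomes = [
        (p * c, wp * wc)
        for (p, wp), (c, wc) in product(zip(probs, weights), zip(costs, weights))
    ]
    return sorted(outcomes)


def cumulative(profile):
    """Turn (value, weight) pairs into (value, cumulative weight) pairs."""
    running = 0.0
    result = []
    for value, weight in profile:
        running += weight
        result.append((value, running))
    return result


# Example: broken fan switch, hair not dried (first data row of Table 12-19).
profile = risk_profile((0.0005, 0.001, 0.005), (15, 30, 50))
for value, cum in cumulative(profile):
    print(f"expected cost <= {value:.4f} with weight {cum:.2f}")
```

The mean of this profile equals the product of the weighted mean probability and the weighted mean cost; the spread of the nine outcomes shows how much the estimate could vary.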


Table 12-19: Using Uncertainty in Analysis

Function/Requirement | Potential Failure Modes | Potential Causes of Failure | 10%-ile probability | 50%-ile probability | 90%-ile probability | Local Effects | End Effects on Product, User, Other Systems | 10%-ile Cost | 50%-ile Cost | 90%-ile Cost | Detection Method/Current Controls | Detection | RPN

weight (probability) 0.2 0.6 0.2; weight (cost) 0.2 0.6 0.2

PROVIDE AIR FLOW

supply electricity to fan no electricity to fan motor broken fan switch 0.0005 0.001 0.005 no air flow hair not dried 15 30 50 1 0.053

no electricity to fan motor loose switch connection 0.0001 0.001 0.01 no air flow hair not dried 15 30 50 1 0.081

no electricity to fan motor short in power cord 0.00001 0.001 0.05 no air flow hair not dried 15 30 50 1 0.329

no electricity to fan motor short in power cord 0.0005 0.001 0.005 potential user injury 100 10000 20000 1 17.034

no electricity to fan motor no source power 0.008 0.01 0.03 no air flow hair not dried 15 30 50 1 0.422

low current to fan motor low source power 0.000001 0.0001 0.00003 reduced air flow inefficient drying 1 3 5 1 0.000

convert electric power to rotation no rotation motor failure 0.0001 0.001 0.004 no air flow hair not dried 15 30 50 1 0.044

no rotation obstruction impeding fan 0.00001 0.01 0.1 no air flow hair not dried 15 30 50 1 0.806

no rotation obstruction impeding fan 0.005 0.01 0.02 motor overheat melt casing 30 10000 100000 1 286.066

low rotation rotor/stator misalignment 0.0003 0.0001 0.0004 reduced air flow hair not dried 15 30 50 1 0.006

low rotation rotor/stator misalignment 0.00007 0.0001 0.0006 noise generation 1 5 15 1 0.001

low rotation hair/foreign matter increasing friction 0.05 0.1 0.15 reduced air flow inefficient drying 1 5 15 1 0.620
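The last column of Table 12-19 appears to be the weighted mean probability multiplied by the weighted mean cost, using the weights 0.2, 0.6, and 0.2 for the 10th, 50th, and 90th percentile estimates. The Python sketch below checks that reading against three rows of the table. The function names are illustrative, and the detection score (1 in every row) is assumed not to enter the calculation.

```python
# Weights for the 10th/50th/90th percentile estimates (from Table 12-19).
WEIGHTS = (0.2, 0.6, 0.2)


def weighted_mean(three_point, weights=WEIGHTS):
    """Weighted mean of a (10%-ile, 50%-ile, 90%-ile) estimate."""
    return sum(w * x for w, x in zip(weights, three_point))


def expected_cost(probs, costs):
    """Weighted mean probability times weighted mean cost."""
    return weighted_mean(probs) * weighted_mean(costs)


# (cause and effect, probabilities, costs, value tabulated in Table 12-19)
rows = [
    ("broken fan switch, hair not dried",
     (0.0005, 0.001, 0.005), (15, 30, 50), 0.053),
    ("short in power cord, potential user injury",
     (0.0005, 0.001, 0.005), (100, 10000, 20000), 17.034),
    ("obstruction impeding fan, melt casing",
     (0.005, 0.01, 0.02), (30, 10000, 100000), 286.066),
]

for label, probs, costs, tabulated in rows:
    computed = expected_cost(probs, costs)
    print(f"{label}: computed {computed:.3f}, tabulated {tabulated}")
```

For the first row, the weighted mean probability is 0.0017 and the weighted mean cost is 31, giving 0.0527, which rounds to the tabulated 0.053.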

References:

Aerospace Recommended Practice ARP926A, 1979.

Automotive Industry Action Group (AIAG) (1995), “Potential Failure Mode and Effects Analysis (FMEA): Reference Manual.”

Bednarz, S. and D. Marriott (1988), “Efficient Analysis for FMEA,” Proceedings of the 1988 IEEE Annual Reliability and Maintainability Symposium, pp. 416-421.

Bowles, J. (1998), “The New SAE FMECA Standard,” Proceedings of the 1998 IEEE Annual Reliability and Maintainability Symposium, pp. 48-53.

Coutinho, J.S. (1964), “Failure-Effect Analysis,” Transactions of the New York Academy of Sciences, pp. 564-585.

Ertas, A. and J. Jones (1993), The Engineering Design Process, John Wiley & Sons, Inc., New York.

Ford Motor Company (1988), “Potential Failure Modes and Effects Analysis in Design and for Manufacturing and Assembly Processes Instruction Manual,” Sept. 1988.

Gilchrist, W. (1993), “Modelling failure modes and effects analysis,” International Journal of Quality and Reliability Management, 10(5), 16-23.

Kara-Zaitri, C., A. Keller, I. Barody, and P. Fleming (1991), “An Improved FMEA Methodology,” Proceedings of the 1991 IEEE Annual Reliability and Maintainability Symposium, IEEE Press, pp. 248-252.

Levinson, C. (1964), "System Reliability Analysis," Mechanical Design and Systems Handbook, (H. Rothbart, ed.) McGraw-Hill, Inc., New York.

Makino, A., P. Barkan, L. Reynolds and E. Pfaff (1989), “A Design for Serviceability Expert System,” Proc. of the ASME Winter Annual Meeting 1989, San Francisco, CA, pp. 213-218.

McKinney, B. (1991), "FMECA, The Right Way," In Proceedings of the 1991 IEEE Annual Reliability and Maintainability Symposium, IEEE Press, pp. 253-259.


Ormsby, A., J. Hunt, M. Lee (1991), Towards an Automated FMEA Assistant, Applications of Artificial Intelligence in Engineering VI, eds. Rzevski, G. and Adey, R., Computational Mechanics Publications, Southampton, Boston, pp. 739-752.

Palady, P. (1995), Failure Modes and Effects Analysis: Predicting & Preventing Problems Before They Occur, PT Publications, West Palm Beach, FL.

Society of Automotive Engineers (SAE) (1994), “Potential Failure Mode and Effects Analysis in Design (Design FMEA) and Potential Failure Mode and Effects Analysis in Manufacturing and Assembly Processes (Process FMEA) Reference Manual”, Surface Vehicle Recommended Practice J1739, July 1994.

Stamatis, D.H., Failure Mode and Effect Analysis: FMEA from Theory to Execution, ASQC Quality Press, Milwaukee, WI, 1995.

Ullman, D., The Mechanical Design Process. McGraw-Hill, Inc., New York, 1992.

United States Department of Defense (USDoD) (1974/80/84), “Procedures for Performing A Failure Mode, Effects and Criticality Analysis”, US MIL-STD-1629(ships), November 1974, US MIL-STD-1629A, November 1980, US MIL-STD-1629A/Notice 2, November 1984.

Voetter, M. and T. Mashhour (1996) “Verbesserung der FMEA durch eine Systematische Erfassung von Kausal-zusammenhaengen”, VDI-Z 138, Heft 4, 1996.