
ADVANCED STATISTICAL METHODS FOR ENGINEERS


    Chapter Zero

Welcome to Advanced Statistical Methods for Engineers!

Ground rules – please…

• Use name tents

• Cell phones:

 – Turn off or use vibrate

 – Take phone calls outside

• Keep side conversations to a minimum

• Be prompt in returning from breaks

• Don’t do other work during class

• Let the instructor know if you need to leave for more than 30 minutes

• Listen with an open and active mind…

• If you have a question at any time, ask!

 – Other ground rules wanted by students?

 – Does the class agree to these ground rules?


Agenda (Days 1–4, 8:00 to 5:00)

• Ch 0: Welcome
• Ch 1: ANOVA and Equivalence Testing
• Ch 2: Measurement Systems Analysis
• Ch 3: Distribution Analysis
• Ch 4: Process Capability and Tolerance Intervals
• Ch 5: Regression and GLM (continued the next day)
• Ch 6: Logistic Regression
• Ch 7: Statistical Resources
• Online Evaluations
• Every day: breaks as needed, lunch on your own, end-of-day review

    Logistics

    • Starting Time: 8:00

    • Ending Time: Not later than 5:00

    • Lunch 12:00-1:00

    • Breaks every 90-120 minutes

    • Power Outlets

    • Rest Room Location

    • Food and drink locations (snacks, cafeteria, etc)


    You Need ...

 – Laptop with MINITAB and a working wireless Internet connection

     – Writing instruments

     – Access to data files


    Icebreaker (5 Minutes)

    My favorite statistician, living or dead, is . . .

    My favorite statistics joke is …

    In my journey through the world of statistics…

    (Extra Credit)

    One thing that has worked well for me is …

    One thing that has been a challenge for me is …


    Expectations

     – Tools, tools, tools…

    • Course may overlap with material from DRM or Lean Sigma

    • Tools may be familiar, but the intent is to present the tools with a focus on statistical thinking and decision-making.

    • Topics may be explored in greater mathematical depth than is offered in other curricula.

     – Benefits

    • A deep mathematical dive can actually help you better see the surface.

    • Awareness of mathematical assumptions is a critical first step for growing in your statistical knowledge, but advanced practitioners need to know:

     – Which assumptions are most critical?

     – When is it appropriate to break the rules?

     – What are the consequences of breaking the rules?

    • Statistical sophistication allows for flexibility and creativity in problem solving.

    Expectations – Experience Chart

    • Mark an X in the column that best describes your experience with each topic

     –  Your Expectations

    • Create a list at your table

    • Each table will report

    • Spokesperson: skip items already mentioned

     –  Time: 10 Minutes

    Topic (rate yourself: None / A Little / Comfortable / Proficient / I could teach it)

    • Equivalence Testing
    • Tolerance Intervals
    • ANOVA Signal Interpretation
    • Measurement Systems Analysis
    • Distribution Analysis
    • Process Capability
    • General Linear Models


    Your Feedback is Critical

    • September 17-20 represents the first wave of Advanced SME at MDT

    • Given that many of you already are leaders in the statistical or DRM worlds, your suggestions for course improvements are extremely important!

    • At the end of each day, we will engage in a brief feedback session.

    • At the end of the week, there will be an online survey for you to formally evaluate the course.

    • If you wish to provide more detailed feedback, please send an email to the instructor team: Leroy Mattson, Karen Hulting, Jeremy Strief, Tom Keenan, Grant Short, Dayna Cruz


    What questions do you have?


    Chapter 1: ANOVA and Equivalence Testing

    Topics

    • Quality Trainer Review

    • ANOVA

     – Assumptions

     – Using Minitab Assistant vs Stat Menu

     – Calculation Deep Dive

     – Sample Size

     – ANOVA Signals

    • Equivalence Testing


    Quality Trainer Review

    Comparing Grouped Data: Variables Data Response


     ANOVA: ASSUMPTIONS


    One-way ANOVA: Testing for the significance of one factor

    • The null hypothesis:
     – H0: μ1 = μ2 = … = μk
     – Meaning that the population (response) means are equal at each of the k levels of this factor, i.e. the factor is NOT significant.

    • The alternative hypothesis:
     – HA: at least two population means are unequal
     – Meaning that the factor IS significant

    • Perform the one-way ANOVA and reject the null hypothesis if the p-value is < alpha
     – Usually alpha = 0.05 (or 0.10 or 0.01)
     – A way to remember: “If p is low, the null must go.”
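    As a minimal sketch of that decision rule outside Minitab (Python with SciPy; the three lots of data below are made up for illustration):

```python
# Sketch: one-way ANOVA decision rule on made-up data from three lots.
from scipy import stats

lot1 = [3.2, 3.4, 3.1, 3.6, 3.3]
lot2 = [3.5, 3.7, 3.6, 3.8, 3.4]
lot3 = [3.3, 3.2, 3.4, 3.1, 3.5]

f_stat, p_value = stats.f_oneway(lot1, lot2, lot3)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")

alpha = 0.05
if p_value < alpha:
    print("Reject H0: at least two lot means differ (the factor IS significant).")
else:
    print("Fail to reject H0: no evidence that the lot means differ.")
```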


     ANOVA: General Process Steps

    • Select a model

    • Plan sample size using relevant data or guesses

    • (Optional) Simulate the data and try the analysis

    • Collect real data

    • Fit the model (perform ANOVA and get p value)

    • Examine the residuals

    • Transform the response or update the model, if

    necessary

    • State conclusion


    Typical Assumptions for ANOVA Factors

    • Factors (or “Inputs”)

     – Each factor can be set to two or more distinct

    levels

     – Factor levels can be measured adequately

     – Factor levels are “fixed” rather than “random”

     – For multiple factors, all combinations of all levels

    are represented (levels are “completely crossed”)


    Typical Assumptions for ANOVA Responses

    • Response data is “complete”, not censored

    • Some software requires “balanced” data – same

    sample size for each level of the input factor 

    • Assumptions on Residuals

     – Residual = Response – Fitted Value

     – Normally distributed

     – Equal variance (assumption relaxed in Minitab

     Assistant)

     – Independent (e.g. no time trend)


     ANOVA CALCULATIONS DEEP DIVE: STAT MENU & MINITAB ASSISTANT


     ANOVA Calculations

    • See www.khanacademy.org

     – ANOVA 1 – Calculating SST (7:39)

     – ANOVA 2 – Calculating SSW and SSB (13:20)

     – ANOVA 3 – Hypothesis Test and F Statistic (10:14)
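    The underlying decomposition is easy to reproduce by hand; a short sketch (Python/NumPy, small made-up three-group dataset) of the SST/SSW/SSB arithmetic covered in those videos:

```python
# Sketch: ANOVA sums of squares by hand (SST = SSB + SSW) on a made-up dataset.
import numpy as np

groups = [np.array([3.0, 2.0, 1.0]),
          np.array([5.0, 3.0, 4.0]),
          np.array([5.0, 6.0, 7.0])]

all_data = np.concatenate(groups)
grand_mean = all_data.mean()

sst = ((all_data - grand_mean) ** 2).sum()                         # total sum of squares
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)             # within-group (error) SS
ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)   # between-group SS

k, n = len(groups), len(all_data)
f_stat = (ssb / (k - 1)) / (ssw / (n - k))
print(f"SST = {sst:.1f}, SSB = {ssb:.1f}, SSW = {ssw:.1f}, F = {f_stat:.2f}")
```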


    Minitab Analysis of Khan Dataset


    Can arrange either Stacked or Unstacked


    Consider a PQ Dataset

    • Three runs of n=10 units produced and tensile

    tested

    • See Ch1DataFile.mtw

    • Columns TipTensile1, TipTensile2, TipTensile3


    Minitab Options

    • Could use – Stat -> ANOVA

     – -> One way

     – -> One way (Unstacked)

     – -> General Linear Model

     – Stat -> Regression -> General Regression

     – Minitab Assistant

    • Data arrangement

     – Stacked (one column for X, one column for Y)

     – Unstacked (Y values in columns for each X)


     ANOVA using Minitab Statistics Menu


    Stat Menu Outputs


    S, R2 and adjusted R2 are measures

    of how well the model fits the data.


    Judging model fit

    • S is measured in the units of the response variable and represents the standard distance data values fall from the fitted values

     – For a given study, the better the model predicts the response, the lower S is

    • R2 (R-Sq) describes the amount of variation in the observed response values that is explained by the predictor(s)

     – R2 always increases with additional predictors.

     – R2 is most useful when comparing models of the same size

    • Adjusted R2 is a modified R2 that has been adjusted for the number of terms in the model

     – R2 can be artificially high with unnecessary terms, while adjusted R2 may get smaller when terms are added to the model

     – Use adjusted R2 to compare models with different numbers of predictors
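    A minimal sketch of how S, R2, and adjusted R2 are computed from observed and fitted values (Python/NumPy; the data and term count below are made up):

```python
# Sketch: S, R-squared, and adjusted R-squared from observed vs. fitted values.
import numpy as np

y     = np.array([10.1, 11.8, 13.2, 14.9, 16.1, 18.2])   # observed response (made up)
y_hat = np.array([10.0, 12.0, 13.0, 15.0, 16.0, 18.0])   # fitted values from some model
p, n  = 2, len(y)                                         # model terms (incl. intercept), sample size

ss_error = np.sum((y - y_hat) ** 2)
ss_total = np.sum((y - y.mean()) ** 2)

s        = np.sqrt(ss_error / (n - p))                        # in the units of the response
r_sq     = 1 - ss_error / ss_total                            # always rises as terms are added
r_sq_adj = 1 - (ss_error / (n - p)) / (ss_total / (n - 1))    # penalized for model size

print(f"S = {s:.3f}, R-sq = {r_sq:.3f}, adjusted R-sq = {r_sq_adj:.3f}")
```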


    Comparisons Output


     ANOVA – Examining Residuals

    1) Test for Normality: the Normal Probability Plot should be a straight line.

    2) Test for Equal Variances: the plot of Residuals vs. Fitted Values should be evenly distributed around the 0 line.

    Using the Stacked arrangement, there would also be a 4th residual plot – Time Order. This is a test for independence – looking for a pattern over time.

    If residuals are strongly non-normal . . .

    Possible Causes:

    • Failure of Equal Variance Assumption

    • Outliers

    • Missing Important Factors in the Model

    • Data is from a Non-Normal Population

    What to do?

    • Check for Outliers

    • Check if Equal Variance is satisfied

    • Perform a Normality Test

    • If data is from a Non-Normal Population, consider using Non-Parametric Tests or transform the Response variable
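    A rough sketch of numerical companions to these residual checks (Python/SciPy; the residual arrays below are simulated placeholders for the residuals of whatever model you fit):

```python
# Sketch: normality and equal-variance checks on ANOVA residuals.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
groups = [rng.normal(0, 1, 10) for _ in range(3)]   # stand-in for residuals at each factor level
residuals = np.concatenate(groups)

w, p_normal = stats.shapiro(residuals)       # normality test on the pooled residuals
lev, p_equal = stats.levene(*groups)         # equal-variance test across levels

print(f"Shapiro-Wilk p = {p_normal:.3f} (small p suggests non-normal residuals)")
print(f"Levene p = {p_equal:.3f} (small p suggests unequal variances)")
```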


    If Residuals differ Group to Group

    Possible Causes:

    • Non-Constant Variance

    • Outliers

    • Missing Important Factors

    in the Model

    What to do?

    • Test for equal variance assumption using Stat >

     ANOVA > Test for Equal Variances

    • If the test indicates unequal variances, then consider transforming the response variable

    • Verify if the outlier is a data entry error 

    • Add the factor into the model

    If there is a time pattern in the data . . .

    What to do?

    • Prevent by Randomizing

    • A time effect may be present

    • Consider time series procedure


    Common Transformations

    • √y – Appropriate for Poisson-distributed data
    • log(y) – If the response is exponentially increasing, then this transformation is appropriate
    • 1/y – Appropriate when responses are close to zero
    • sin⁻¹(√y) – Called the Arcsine Square Root function. Appropriate when the response is a proportion between zero and one.

    Another useful tool is the Box-Cox Transformation.

    Box-Cox Procedure:
     Y(λ) = Y^λ, when λ ≠ 0
     Y(λ) = log_e(Y), when λ = 0

    Box-Cox Transformation in Minitab

    [Box-Cox plot of Data 1: StDev vs. lambda with 95% confidence interval. Estimate 0.03, Lower CL -0.30, Upper CL 0.38, Rounded Value 0.00.]

    Minitab > Stat > Control Charts > Box-Cox Transformation
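    For comparison, a quick sketch of the same lambda estimate outside Minitab (Python/SciPy; the skewed response below is simulated):

```python
# Sketch: maximum-likelihood Box-Cox lambda for a positive, right-skewed response.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
y = rng.lognormal(mean=1.0, sigma=0.5, size=50)   # simulated positive response

y_transformed, lam = stats.boxcox(y)              # returns transformed data and estimated lambda
print(f"Estimated lambda = {lam:.2f}")
# Lambda near 0 suggests a log transform, near 0.5 a square root, near 1 no transform.
```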


     ANOVA using Minitab Assistant


    http://www.minitab.com/support/documentation/Answers/Assistant%20White%20Papers/OneWayANOVA_MtbAsstMenuWhitePaper.pdf 

    Report Card


    Diagnostic Report


    Power Report


    Summary Report


     ANOVA - Exercise

    • Use Ch1DataFile.mtw

    • Test for differences between the group means

    using both Stat menu ANOVA and Minitab

     Assistant ANOVA . . . for these 3-lot PQ studies:

     – For TubeTensile1, TubeTensile2, TubeTensile3

     – For Diameter1, Diameter2, Diameter3

    • What are your conclusions?


     ANOVA – Alternate Exercise

     Analyze this data two ways: 1) Assistant and 2) Stat>ANOVA

    Note: Stat>ANOVA assumes equal variances (and so may need transformations), but Minitab Assistant ANOVA does not assume equal variances.

    An article in the IEEE Transactions on Components, Hybrids, and Manufacturing Technology (Vol. 15, No. 2, 1992, pp. 146-153) described an experiment in which the contact resistance of a brake-only relay was studied for three different materials (all were silver-based alloys).

    Alloy-Contact Resistance.MPJ

    Test at an alpha = 0.01 level

    Does the type of alloy affect mean contact resistance?

    Applied Statistics and Probability for Engineers, 4th Edition, Douglas C. Montgomery and George C. Runger

    General Regression can be used for ANOVA

    Use for multiple regression – more than one X

    General regression can handle: 1) all continuous input(s), 2) all

    categorical input(s), 3) a mixture of continuous and categorical

    inputs, and 4) a non-normal response (it allows for the Box-Cox

    transformation of the response).

    The response must be continuous or considered as continuous.


    General Regression: Example of ANOVA

    Background: The forces exerted by three different stylets in a lead are compared at 4 different position/advancement conditions (blocks). The data (force in grams) are given below:

    Condition   Stylet 1   Stylet 2   Stylet 3
    1           18.1       14.5       14.0
    2           20.0       16.1       16.3
    3           30.2       27.5       26.8
    4           42.5       39.4       38.7
    Mean        27.70      24.38      23.95

    Note: A blocked One-way ANOVA is a two-way ANOVA where one factor’s effect is to be “blocked out”. The randomization is done within each block.

    Perform an ANOVA analysis using Stat > Regression > General Regression and determine if:

    (1) there are significant differences between different stylets, and

    (2) the blocking factor employed was effective.

    Stylet.MTW

    Condition is the Block


    Blocked One-way ANOVA

    (1) Are there significant differences between different stylets?

    (2) Is the blocking factor employed effective?
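    A sketch of the same blocked analysis as a regression/GLM fit in Python with statsmodels (the data are the stylet forces from the table above; the course itself uses Minitab's General Regression):

```python
# Sketch: blocked one-way ANOVA fitted as a general linear model (regression).
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

data = pd.DataFrame({
    "force":     [18.1, 14.5, 14.0, 20.0, 16.1, 16.3,
                  30.2, 27.5, 26.8, 42.5, 39.4, 38.7],
    "stylet":    [1, 2, 3] * 4,            # factor of interest
    "condition": [1, 1, 1, 2, 2, 2,
                  3, 3, 3, 4, 4, 4],       # block
})

# Both factors are treated as categorical; Condition is the block.
model = ols("force ~ C(stylet) + C(condition)", data=data).fit()
print(sm.stats.anova_lm(model, typ=2))     # p-values answer (1) the stylet effect and (2) the block effect
```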

    SAMPLE SIZE FOR ANOVA


    Planning Sample Size in ANOVA

    • Fill in the number of levels for the factor 

    • Always fill in Standard Deviation (use conservative estimate)

    • Then fill in two of the three long boxes

    • Can specify several values, separated by spaces
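    For reference, a sketch of the same calculation in Python with statsmodels. Minitab's dialog asks for the maximum difference between means and a standard deviation; converting that to Cohen's f with the "two means at the extremes, the rest at the grand mean" convention is an assumption of this sketch, as are the numeric inputs:

```python
# Sketch: total sample size for a one-way ANOVA power target.
from statsmodels.stats.power import FTestAnovaPower

k_groups = 3        # number of factor levels
max_diff = 2.0      # practically important difference between extreme means (assumed)
sigma    = 1.5      # conservative standard deviation estimate (assumed)

# Cohen's f for the least-favorable layout: two means at the extremes, the rest at the center.
effect_f = max_diff / (sigma * (2 * k_groups) ** 0.5)

n_total = FTestAnovaPower().solve_power(effect_size=effect_f, alpha=0.05,
                                        power=0.80, k_groups=k_groups)
print(f"Total N = {n_total:.0f} (about {n_total / k_groups:.0f} per level)")
```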

    Sample Size for One-Way ANOVA Example


    Sample Size for One-Way ANOVA

    RESPONDING TO ANOVA SIGNALS


    Statistical vs. Practical Significance

    • Key idea in any hypothesis testing effort

     – If the test detects a difference (a “signal”), then what?

     – Don’t assume the signal is automatically bad news (if

    you’re hoping for consistency) or good news (if you’re

    hoping for a change)

    • For example, “ANOVA Failure” in PQ

     – Examine the size of the signal in the appropriate

    context . . . determine the “practical” significance of the

    difference

     – The appropriate response depends on an assessment

    of both statistical and practical significance


     ANOVA Signal in PQ

    • There was a realization that a significant p-value

    in the comparison of lot means should not

    necessarily mean the PQ fails

    • Analysis sometimes included to assess the

    “power” of the ANOVA and the practical

    significance of the difference in the means.

    • Eventually, Corporate Policy on Manufacturing Process Validation added the “ANOVA Failure Flow Chart”


    [Flow chart: 2008 Version of Corporate Guideline for Manufacturing Process Validation]

    [Flow chart: 2012 Version of CRDM ANOVA Signal Flow Chart]


    Pros and Cons

    • Pro

     – Provides a consistent way to address the question of practical significance

     – Relatively Simple

     – Effective – expect the approach to stand up to regulatory scrutiny

    • Con

     – Can be very prescriptive

     – Standards for Ppk are quite high: 95% confidence bound on Ppk > 1.33

     – Disincentive for larger sample size


    Current approaches

    • Corporate Guideline phased out

    • CV procedure still has essentially the same

     ANOVA Signal Flowchart

    • CRDM originally had a more prescriptive version

    • CRDM currently has a simplified version

    • Would also work to include a discussion of the sample size of the ANOVA and the practical significance of the difference

    • Discussion – other businesses?


    Example of ANOVA Signal Flow Chart

    • Recall the ANOVA exercise on Ch1DataFile.mtw

    for TubeTensile1, TubeTensile2, TubeTensile3


     ANOVA Signal Flow Chart Ppk Analysis


    First Stack the 3 lots using Data -> Stack -> Columns

    Then run

    Stat -> Quality Tools -> Capability Analysis -> Normal

     Add confidence interval for

    Ppk using Options button


    Next steps

    • Total sample size is 90, so use confidence bound

    • Lower 95% confidence bound on Ppk is 0.92

    • Must make 3 more runs

     – TubeTensile4, TubeTensile5, TubeTensile6

     – These must pass tolerance interval analysis (like

    the first three runs did)

     – All six runs pass tolerance interval analysis


    Conclusion


    Note: Ppk analysis of all six lots is not

    required. Included here FYI.


    Exercise: ANOVA Signal

    • Run ANOVA and assess practical significance for 

     – In Ch1DataFile.mtw, analyze

    • WireTensile1, WireTensile2, WireTensile3

    • Specification is 3 lb minimum

     – Use one of the ANOVA Signal Flowcharts

     – Then use another approach to determine the

    practical significance of the difference between the

    means

     – Conclusion?


     ANOVA: Summary And Recap

    • Review Quality Trainer 

    • Calculations Deep Dive into ANOVA

    • Analytically, ANOVA is a special case of

    Regression

    • Sample Size

    • ANOVA Signal Flow chart – some Medtronic

    divisions use one to standardize response to ANOVA Signal in PQ


    EQUIVALENCE TESTING


    Statistical Logic for Equivalence

    • The basic statistical logic is designed to disprove equality.

     – Null hypothesis: Two population parameters are equal, e.g. μ1 = μ2.

     – Alternative hypothesis: Two population parameters are not equal, e.g. μ1 ≠ μ2.

    • We need a different form of logic to affirmatively prove equivalence.

     – Null hypothesis: Two population parameters differ by Δ or more, e.g. |μ1 - μ2| ≥ Δ.

     – Alternative hypothesis: Two population parameters differ by less than Δ, e.g. |μ1 - μ2| < Δ.


    Equality vs. Equivalence

    Part of the confusion around the issue of

    equivalence is that the concepts of equality and

    equivalence may not be distinguished.

     – Equality: Two values/processes are

    mathematically identical.

     – Equivalence: The difference between two

    values/processes is sufficiently small that it can be

    deemed practically insignificant.


     Approach 1: Confidence Intervals

    • The idea is to demonstrate that the confidence interval for the difference of interest is fully contained within the range of practical significance [-Δ,Δ].

    Jones, BMJ 1996


     Approach 1: Confidence Intervals

    • Step 1: Define Practical Significance

     – Before collecting data, use scientific/engineering principles to decide what difference, Δ, is practically negligible.

    • Step 2: Estimate Sample Size for Experiment

     – Based on characterization data or other assumptions, estimate the sample size needed to produce a confidence interval fully contained within [-Δ,Δ]. (Stat


    Example of Approach 1

    Method
    Parameter: Mean
    Distribution: Normal
    Standard deviation: 3 (estimate)
    Confidence level: 95%
    Confidence interval: Two-sided

    Results
    Margin of Error: 2    Sample Size: 12

    We need n=12 from BOTH processes.
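    A sketch of the same margin-of-error sample size (Python/SciPy; the planning values σ = 3 and margin = 2 are taken from the example above):

```python
# Sketch: smallest n whose two-sided 95% t-interval half-width is at most the target margin.
from scipy import stats

sigma_est = 3.0    # planning estimate of the standard deviation
margin    = 2.0    # desired margin of error for the mean
conf      = 0.95

n = 2
while stats.t.ppf(1 - (1 - conf) / 2, df=n - 1) * sigma_est / n ** 0.5 > margin:
    n += 1
print(f"n = {n} per process")   # gives 12, matching the Minitab output above
```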

    Example Output

    • Conclusions:

    • The processes are statistically different (p=0.003), which is a statement about non-equality.

    • Despite being unequal, the processes are still equivalent. The 95% confidence interval for the difference in means is (0.671, 2.798), which is a strict subset of [-3, 3].

    Two-sample T for New vs Old

          N    Mean   StDev  SE Mean
    New  12  30.927   0.858     0.25
    Old  12   29.19    1.52     0.44

    Difference = mu (New) - mu (Old)
    Estimate for difference: 1.735
    95% CI for difference: (0.671, 2.798)
    T-Test of difference = 0 (vs not =): T-Value = 3.44  P-Value = 0.003  DF = 17
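    A sketch of the confidence-interval check itself (Python/SciPy), using the summary statistics printed above and Δ = 3 from this example; the Welch interval used here is one reasonable choice and matches the DF shown:

```python
# Sketch: Welch CI for the difference in means, compared against the equivalence range [-delta, delta].
from scipy import stats

n1, m1, s1 = 12, 30.927, 0.858    # New process (from the output above)
n2, m2, s2 = 12, 29.19, 1.52      # Old process
delta = 3.0                       # practically negligible difference, set before the study

se = (s1**2 / n1 + s2**2 / n2) ** 0.5
df = (s1**2 / n1 + s2**2 / n2) ** 2 / (
     (s1**2 / n1) ** 2 / (n1 - 1) + (s2**2 / n2) ** 2 / (n2 - 1))   # Welch-Satterthwaite df
t_crit = stats.t.ppf(0.975, df)

diff = m1 - m2
lo, hi = diff - t_crit * se, diff + t_crit * se
print(f"95% CI for difference: ({lo:.3f}, {hi:.3f}), df = {df:.0f}")
print("Equivalent" if (-delta < lo and hi < delta) else "Equivalence not demonstrated")
```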


     Approach 1: Summary

    • The confidence interval approach is the gold

    standard for clinical trials and other high scrutiny

    experiments requiring FDA approval.

    • It is mathematically equivalent to a p-value-driven

    approach called TOST (Two One-Sided T-tests).

    • The confidence interval approach is easier to

    understand than the original form of TOST.


    Post-hoc Problems

    • Rigorous application of approach 1 requires that

    the Δ value be established before collecting data.

    • What should we do when data have already been

    collected without defining the difference of

    interest or planning sample size?


     Approach 2: Retrospective Power Analysis

    • When data have already been collected without planning for rigorous “equivalence testing”, equivalence may be assessed by displaying an entire power curve.

    • Even if this approach does not set a-priori standards for equivalence,

     – it provides additional context for an insignificant p-value

     – it can help engineering experts to make decisions

    • Subjective judgment will be required to determine if the experiment was suitably powered to demonstrate equivalence.

    • A power curve is a useful supplement to a traditional analysis, but it does not match the rigor of Approach 1.


     Approach 2 Method

    • After collecting the means and standard deviation

    of the observed data, create a power curve

    through the Power and Sample Size platform in

    Minitab.

    • Display and interpret the Power Curve in your

    data analysis report.

    • You may honestly believe that your experiment was sufficiently powered (>80%) to detect meaningful differences, but the post-hoc nature of the analysis makes your argument weaker.


    Example


    • Consider again our old and new processes, which have distributions of N(30, 2²) and N(31, 1²), respectively.

    • Suppose we forgot to take Approach 1 and instead just collected 5 data points from each process.

    • We found a statistical difference when we collected 12 data points, but the p-value goes above 0.05 when collecting only 5:

    Two-sample T for New_5 vs Old_5

            N    Mean  StDev  SE Mean
    New_5   5  30.744  0.933     0.42
    Old_5   5   29.42   3.02      1.4

    Difference = mu (New_5) - mu (Old_5)
    Estimate for difference: 1.32
    95% CI for difference: (-2.61, 5.25)
    T-Test of difference = 0 (vs not =): T-Value = 0.93  P-Value = 0.403  DF = 4

    Power Curve Inputs

    • The observed sample size is n=5

    • Desired power levels are in the range of .8-.95

    • The pooled standard deviation is 2.24.
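    A sketch of the detectable-difference calculation behind such a power curve (Python/statsmodels; n = 5 per group and the pooled SD of 2.24 come from above):

```python
# Sketch: what mean difference could this experiment have detected at a given power?
from statsmodels.stats.power import TTestIndPower

n_per_group = 5
pooled_sd   = 2.24

for power in (0.80, 0.95):
    d = TTestIndPower().solve_power(effect_size=None, nobs1=n_per_group,
                                    alpha=0.05, power=power, ratio=1.0)
    print(f"Power {power:.0%}: detectable difference of about {d * pooled_sd:.1f}")
# Roughly 4.5 at 80% power and about 6 at 95% power, consistent with the slide that follows.
```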


    Power Curve Output

    • With 80% power, this experiment could have detected a difference of about 4.5.

    • With 95% power, this experiment could have detected a difference of about 6.

    • It is a subjective engineering judgment as to whether such values provide sufficient reassurance about the experimental results.


    Extensions and Challenges

    • Confidence intervals and power curves can be calculated for almost any type of statistical scenario:

     – Comparing 2 means

     – Comparing >2 means

     – Comparing standard deviations

     – Comparing reliability curves

    • However, the required sample size for proving equivalence of standard deviations is often much larger than the sample size for means.

    • Equivalence for means can reasonably be quantified in terms of arithmetic differences (e.g. |μ1 – μ2| < 5), but equivalence for standard deviations will be quantified in terms of multiplicative differences (e.g. ½ < σ1/σ2 < 2).


    Exercise – Lesion Depth

    • Consider the key requirement for a new ablation catheter: equivalent (or greater) maximum lesion depth, compared to the current design, where the difference of interest is 0.5 mm.

    • Previous data shows – Normal distribution model is adequate for Max Lesion Depth

     – Current Design has average max lesion depth of 2.3 mm

     – New Design has average max lesion depth of 2.2 mm

     – Largest pooled standard deviation of max lesion depth is 0.356.

    • Follow Approach 1 to plan sample size for the equivalence test

    • Assume test data as follows to complete the equivalence analysis – New: n=15, mean = 2.733, stdev = 0.342

     – Current: n=15, mean = 2.723, stdev = 0.386

    • State your conclusion


     Alternate Exercise: Equivalence Testing

    • Within your team, identify an example of

    equivalence testing in your own work.

    • Apply Approach 1, using actual or made-up

    characterization data for the planning step.

    • Use Minitab to simulate data collection.

     – Hint: Use Calc -> Random Data -> Normal . . .

    • Use Minitab to complete the Approach 1 data

    analysis.

    • State your conclusion from the data.


    EQUIVALENCE Take Away Messages

    • An insignificant p-value is not a rigorous method of proving equivalence.

    • Ideally, practical significance and sample size should be considered before the experiment begins.

    • Rigorously proving equivalence first demands carefully defining the threshold (∆) of practical significance.

    • The most rigorous way to prove equivalence is to demonstrate that a confidence interval is fully contained within [-∆, ∆].

    • An alternative (but less formal) approach is to retrospectively perform a power analysis.

    • Don’t feel like you need to remember all the Minitab steps; we hope you remember the concepts and call your neighborhood statistician for further support.


    Summary and Review

    • Quality Trainer Review

    • ANOVA

     – Assumptions

     – Using Minitab Assistant vs Stat Menu

     – Calculation Deep Dive

     – Sample Size

     – ANOVA Signals

    • Equivalence Testing


    Chapter 2: Measurement Systems Analysis

    Topics

    • Quality Trainer Review

    • Topics with Variables Data

     – Gage R&R Sample Size

     – Probability of Misclassification (Variables Data)

     – Helpful Hints

    • MSA for Destructive Tests

    • MSA for Attribute Tests


    Quality Trainer Review


    Value of Measurement Systems Analysis

    If your goal is . . . then MSA helps by . . .

    • Process Improvement – Reducing variability in Xs and Ys so that the “key” Xs may be discovered.
    • Capability Demonstration or Estimation – More accurate measurements of process performance
    • Sorting Out Bad Product – Reducing the Probability of Misclassification
    • Innovation – Reduced noise allows discovery of more subtle signals


    Recall . . . MSA Concepts

    •Bias – Mean (Delta – difference -- from reference)

    •Linearity – Mean (Bias vs Part or Operating Value)

    •Stability – Mean (Bias vs Time)

    •Repeatability – Standard Deviation

    •Reproducibility – Standard Deviation

    •Gage R&R – Standard Deviation

    …so linearity and stability should be plotted, while bias, repeatability and reproducibility are just single numbers.

    Gage Bias and Linearity

    • Bias is the difference between the average of repeated measurements and the “true value”

    • MSA tends to focus on Gage R&R (variability), but accuracy (= lack of bias) is equally important

     – Assumption that procedures for Calibration are in place – need to confirm

     – Assumption that procedures for Calibration are adequate – need to confirm

    • “Linearity” is a study of bias across the range of measured values

    • In Minitab, use Stat -> Quality Tools -> Gage Study -> Gage Linearity and Bias Study


    Gage Stability

    > Stat > Control Charts > Variables Charts for Subgroups > Xbar-R

    [Xbar-R Chart of Rep1, ..., Rep3, subgrouped by Day and time (8-Sep to 12-Sep, 5:00 and 11:00), from Snap Gauge.mtw:
     Xbar chart – in control (mean = 0.2497, UCL = 0.253458, LCL = 0.245942)
     R chart – in control (Rbar = 0.00367, UCL = 0.00946, LCL = 0)]

    The measurement system is stable over time, as evidenced by both the Xbar chart and the R chart being in control.

    GAGE R&R SAMPLE SIZE


    Gage R&R Sample Size

    • General recommendation:

     – 5 to 10 Parts (P)

     – 2 to 3 Operators (O)

     – 2 to 3 Repeats (R)

    • More rigorous methods

     – Specify minimum Degrees of Freedom for

    estimating Repeatability and Reproducibility

    standard deviations

     – Use confidence intervals for standard deviation estimates (option provided in Minitab 16)


    Degrees of Freedom Approach

    • Estimating Reproducibility Std Dev: O-1

     – Include as many operators as feasible

    • Estimating Repeatability Std Dev: P*O*(R-1)

     – With 30 df, the 90% confidence bound on the ratio of estimate to true value is (0.79, 1.21). Ref: on www.minitab.com search for “ID 2613” to access “Minitab Assistant White Papers.”


    CVG Test Method Validation


    PROBABILITY OF

    MISCLASSIFICATION


    [Diagram: a part distribution with LSL and USL, illustrating the probability of misclassifying a good unit as bad and the probability of misclassifying a bad unit as good.]

    Two Misclassification Probabilities

    • Probability of Misclassifying Bad Unit as Good

    • Probability of Misclassifying Good Unit as Bad


    MINITAB Simulated Estimation of Misclassification: Following Gage R&R Study

    Part mean = 30, Part Std Dev = 10, Part Upper Spec = 40

    No measurement system bias; Gage R&R Std Dev = 2.6

    1) Calc/Random Data/Normal (simulate true part measurements)

    2) Calc/Random Data/Normal (simulate gage variability)


    MINITAB Simulated Estimation of Misclassification (cont)

    3) Calc/calculator/ use the “+”

     Add 1) + 2) to simulate observed

    measurements

    4) Calc/calculator : assign a 1 for in

    spec for 1)

    Ex: (‘TrueMeasure’≤ 40)


    MINITAB Simulated Estimation of Misclassification (cont)

    5) Calc/calculator: assign a 1 for in spec for 2)

    Ex: (‘ObsMeasure’ ≤ 40)

    6) Stat/Table/Crosstabs to

    crosstabulate 4) and 5).


    MINITAB Simulated Estimation of Misclassification (cont)

    Estimated % of Truly Out of Spec called In Spec is 2.1%.

    The simulation sample size was 10000. A larger sample size would be better.
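    The same simulation is straightforward to reproduce outside Minitab; a sketch in Python (NumPy/pandas) using the inputs above (part mean 30, part SD 10, USL 40, gage R&R SD 2.6, no bias):

```python
# Sketch: simulate the probability of misclassification from gage R&R results.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 100_000                                   # larger than the 10,000 used in the slide

true_value = rng.normal(30, 10, n)            # 1) true part measurements
gage_error = rng.normal(0, 2.6, n)            # 2) gage variability, no bias
observed   = true_value + gage_error          # 3) observed measurements

truly_in_spec    = true_value <= 40           # 4) truth vs. the upper spec
observed_in_spec = observed <= 40             # 5) accept/reject decision from the measurement

# 6) cross-tabulate truth vs. decision as overall proportions
table = pd.crosstab(truly_in_spec, observed_in_spec, normalize=True,
                    rownames=["truly in spec"], colnames=["called in spec"])
print(table)
print(f"P(truly out of spec but called in spec) = {table.loc[False, True]:.3f}")  # about 0.02 here
```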


    MINITAB Misclassification


    MINITAB Misclassification

    Two problems:

    1) Only three decimals for probabilities (i.e. 0.000)

    2) Can’t enter historical: 1) process mean 2) part std.dev 3) gage std.dev

    (Note: (2) can now be done with a CSR work aid 13)


    Misclassification Using Minitab

    and Work Aid 13

    Load into the worksheet: the Part mean (30), the Part Sigma (10), and the Gage Sigma (2.6)

    CSRworkaid13 POM.mtw


    MINITAB Misclassification


    MINITAB Misclassification: Enlarging the label on the sample mean chart, we see the mean is 30.


    MINITAB Misclassification

    Examining the output we see the USL (40), the Part Sigma (10), and the Gage Sigma (2.6).

    Prob. of a truly bad part called good is .021


    Probability of Misclassification (POM) Tool

    • Originally written in R by Tarek Haddad to re-

    create functionality lost when Medstat was

    retired.

    • Jim Dawson collaborated with Tarek to continue

    development and turn it into an Excel tool.

    • A substantial Software Validation effort was

    undertaken by Nick Finstrom and Barry Christy,

    with the support of Pete Patel and the CVG Test

    Method Council. Validation work to be completed in early 2014.


    POM Tool

    • Replicates Medstat functionality

    • More resolution in results than Minitab

    • Graphics

    • Guardbanding

    • Normal, Lognormal and Weibull distributions of parts


    POM with Guardband


    Exercise

    • Run POM analysis

     – Using Minitab

    Simulation

     – Using Work Aid 13 and

    Minitab GRR

     – Using POM Tool


    HELPFUL HINTS


    Gage R&R Helpful Hints - Normality

    • Normality testing is not needed for Gage R&R analysis

     – Distribution of the raw data will depend strongly on the parts used in the study – there is no expectation or assumption that the raw data will follow any specific distribution

     – Repeated measurements on the same part by the same operator will likely follow a normal distribution

    • Like any ANOVA model, the residuals are assumed to follow a normal distribution – but the analysis is relatively “robust” to non-normality of the residuals

     – Probability of Misclassification does depend on the part or process distribution (each part measured once)


    Gage R&R Helpful Hints – One-Sided Specification

    • In the case of a one-sided specification, the Percent Tolerance metric depends on the part average

    • Minitab uses the overall average in the Gage R&R study as the estimate of the part average

    • If the parts used in the study are not representative of the expected part distribution . . .

     – The overall average will be a poor estimate of the process average

     – The percent tolerance result will be misleading

     – Best practice would be to calculate Percent Tolerance separately using a better estimate of the process average

     – Being “not representative” can be a good practice – for example, including parts that don’t meet the specification


    Corrective Actions for Failed Gage R&R

    • Repeatability problem

     – Could be due to part positional variation

    • Standardize by measuring same position on each part

    • Or make multiple measurements at random or systematic

    positions and use the average

     – If gage itself is too variable, may need to improve

    or replace

    • In the meantime, Repeatability variability can be filtered

    out by taking repeated, independent measurements and

    using the average. Note that this approach does not

    correct for Reproducibility issues.


    Corrective Actions for Failed Gage R&R

    • Reproducibility Problem

     – Look for assignable causes that explain the

    operator-to-operator differences

     – Understand any Operator*Part interactions – these

    may provide clues to differences in technique.

     – Possibly improve the measurement procedure

    and/or re-train the operators

     – Improve any visual aids or samples used in the

    measurement procedure


     Approaches to Robust Gage R&R

    Standard Gage R&R methods assume that other factors that affect

    measurements have been studied and controlled in the development

    of the test method.

    If these sources of variability still affect the measurements, then . . .

    The Expanded Gage R&R allows you to add additional factors.

    Besides operator & part, you could add fixture number, gage

    number or other factors. The Expanded GRR can also handle

    missing data.

    Reference: “Make Your Destructive, Dynamic, and Attribute

    Measurement System work for you” by William Mawby.

    This book includes the Analysis Of Covariance method that allows one to load varying environmental factors like temperature & humidity (covariates) into a GRR.

    The General Linear Model in Minitab (under the ANOVA branch)

    can be used to model covariates (also handles missing data).


    MSA FOR DESTRUCTIVE

    MEASUREMENTS


    Two Types of Destructive Measurements

    1. Truly destructive: Measurement destroys unit being measured

    Pull test

    Peel test

    Tensile test

    2. Non-replicable: Measurement process can change the unit

    or you are measuring a transient phenomenon

    Catapult distance

    Motor speed

    Heart rate

    Dimension of silicon part (can compress)

    Dimensions of heart tissue (can compress)

    Ref: Make Your Destructive, Dynamic, and Attribute measurement System

    Work for You. by. W. D. Mawby

    In neither case is it possible

    to take repeated measures,

    so gage R&R is not possible.


     Approaches to Destructive MSA

    • Develop a non-destructive measurement
     – Pro: Ideal solution
     – Con: Often difficult or impossible
    • Attempt to use identical parts as “repeat” measurements and apply usual requirements for GRR %Tolerance
     – Pro: Easy to apply usual Minitab calculations
     – Con: Rarely works because parts aren’t actually identical
    • Use a coupon test so that parts are more identical
     – Pro: Results better than above
     – Con: Coupons may not be representative – easier to measure than real parts
    • Focus on improving the measurement process using DMAIC
     – Pro: Proven methodology
     – Con: Cannot conclude whether the measurement system is adequate
    • Focus on Reproducibility
     – Pro: Not affected by part-to-part variability
     – Con: Might miss a Repeatability issue

    What about using “Nested” Gage R&R?

    • The “nested” Gage R&R analysis applies when one operator measures different parts than another operator.

     – For example, John measures parts 1, 2, 3, 4, 5 repeatedly and Jane measures parts 6, 7, 8, 9, 10 repeatedly.

     – A common application would be “Inter-laboratory Testing,” where operators at each location measure different parts repeatedly.

     – Can work for Destructive MSA if each homogeneous sample may be sub-sampled. Then operators can measure different samples repeatedly.

    • Analysis

     – The nested analysis does not include a term for Part * Operator interaction.

     – Note that Minitab Assistant doesn’t offer the Nested analysis

    • Unless sub-sampling of homogeneous material is possible, Nested does not solve the key problem of Destructive MSA – it’s impossible to repeat the measurement


    Destructive Gage R&R Example

    Tensile testing of tubing

    8 pieces of tubing

    Each tubing cut into 2 sub samples

     Assume variation between sub

    samples due to measurement error 

     Assume an upper specification of

    850 g

    TestingSupplierCoils.mtw

    Destructive Gage R&R using sub-samples


    Destructive Gage R&R using sub-samples


    Destructive Gage R&R using sub-samples

    Large result for

    % Tolerance

    Measurement system does

    not distinguish one part from

    another within the range of

    parts used in the study

    Nearly all measurement system variation is due to repeatability rather than operator (reproducibility) . . . or maybe sub-sample differences?


    Destructive Gage R&R using sub-samples

    Destructive Gage R&R using subsamples gave poor results

    Since repeatability accounts for most of the apparent measurement variation, it is likely that the parts were not very similar.

    In this project they used the DMAIC Process Knowledge method to improve the system without obtaining a formal measurement.

    Focus on Reproducibility

    • With destructive measurements, the Repeatability Standard Deviation always includes the part-to-part or subsample-to-subsample variation. In general, the repeatability standard deviation cannot be accurately estimated.

    • If one population of parts is randomly assigned to multiple operators, then the Reproducibility Standard Deviation is not affected by part-to-part variation.

    • The Reproducibility standard deviation can be estimated accurately even for destructive tests.


    Reproducibility

    • Stop

     – Trying to force (Repeatability + Part) Standard

    Deviation to be small enough to meet a requirement.

     – Trying to obtain or create “identical” parts.

    • Start

     – Estimate Reproducibility standard deviation and ensure

    that it is small enough. This standard deviation

    depends only on the differences between operator

    means.

     – Compare operator standard deviations. Identify cases where operators show substantially different variation across equivalent sets of parts.


    Example: CVG Test Method Validation for Destructive Tests

    • Obtain a population of 40 parts

     – Do not need to get identical or nearly identical

    parts

    • Randomly assign 10 parts to each of 4 operators

    • Calculate %Tolerance for Reproducibility

     – Compare to requirement of 25%

    • Calculate Std Dev Ratio

     – Compare to simulation-based critical values (for a typical study, the critical value is 3.10)


    Example Calculations

    • Data based on actual TMV studies – But altered to disguise

     – Detection Time A, Detection Time P


    Detection Time A


    Run One-Way ANOVA

    •   Reproducibility = sqrt((0.778-0.627)/10) = 0.123


    Calculate Results

    • % Tolerance (Reproducibility)
     = 100 * (6*0.123) / (2*(30-11.740))
     = 100 * (0.738 / 36.52)
     = 2.02%

    • Std Dev Ratio = 0.986 / 0.546 = 1.81

    • Result: Pass
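    A sketch of these calculations in Python. The mean squares (0.778 between operators, 0.627 within), the value 30 treated as the upper spec, and 11.740 treated as the grand mean are read off the slide; that interpretation of the one-sided %Tolerance formula is an assumption based on the arithmetic shown:

```python
# Sketch: reproducibility SD, %Tolerance, and SD ratio for a destructive test method.
import math

ms_operator = 0.778        # between-operator mean square (from the one-way ANOVA above)
ms_error    = 0.627        # within-operator mean square
n_per_op    = 10           # parts measured per operator
usl, grand_mean = 30.0, 11.740   # assumed one-sided spec limit and observed grand mean

sd_reproducibility = math.sqrt((ms_operator - ms_error) / n_per_op)        # = 0.123
pct_tolerance = 100 * (6 * sd_reproducibility) / (2 * (usl - grand_mean))  # = 2.02%

operator_sds = [0.986, 0.546]                      # largest and smallest operator SDs (from the slide)
sd_ratio = max(operator_sds) / min(operator_sds)   # = 1.81, compare to critical value 3.10

print(f"Reproducibility SD = {sd_reproducibility:.3f}")
print(f"%Tolerance (Reproducibility) = {pct_tolerance:.2f}%")
print(f"Std Dev Ratio = {sd_ratio:.2f}")
```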


    Detection Time P


    Calculations for Detection Time P

    •   Reproducibility = sqrt((11.225-0.976)/10) = 1.01

    • % Tolerance (Reproducibility)
     = 100 * (6*1.01) / (2*(30-14.798))
     = 100 * (6.06 / 30.40)
     = 19.9%

    • Std Dev Ratio = 1.113 / 0.846 = 1.32

    • Result: Pass


    Exercises

    • Open Destructive Exercises.mtw

    • For Bond Strength results:

     – Assume specification is Minimum 5 lb

     – Analysis

    • Individual Value Plot

    • % Tolerance for Reproducibility

    • Std Dev Ratio

    • Is this destructive measurement system adequate?

    • Repeat for Buckle Force results – Assume specification is Maximum 340 grams


    MSA FOR ATTRIBUTE MEASUREMENTS


     ATTRIBUTE GAGE R&R

    • Attribute data are usually the result of human judgment

     – Which category does this item belong in?

    • When categorizing items, you need a high degree of

    agreement on which way an item should be categorized

    • The best way to assess human judgment is to have all

    operators repeatedly categorize several known test units

    (Attribute Gage R&R)

     – Look for agreement

    • each person categorizes the same unit consistently

    • there is agreement between the operators on each unit

     – Use disagreements as opportunities to determine and eliminate problems


    SETTING UP AN ATTRIBUTE GAGE STUDY

    • Most important aspect of attribute Gage Study is

    selecting parts (representative defects)

    • Most challenging aspect is choosing parts for the

    study. Typically use . . . – 50% acceptable parts

     – 50% defective parts

    • Have operators repeatedly classify parts in random order without knowledge of which part they are classifying (blind study)


     Analysis of Attribute Gage R&R

    • Stat > Quality Tools > Attribute Agreement Analysis

     – Percent Agreement based on number of Parts

     – Kappa Statistics (range -1 to 1)

    • Minitab Assistant > Measurement Systems Analysis

     – More graphical output

     – Accuracy statistics based on number of Appraisals

     – No Kappa statistics


    Use Minitab Assistant-> Measurement Systems Analysis (MSA)


    Create Attribute Agreement worksheet


    • Choose Number of Appraisers = 3

    • Choose Number of Trials = 2

    • Choose Number of Test Items = 10

    • Items 1-5 are “Good”; Items 6-10 are “Bad”

    • Click “OK”

    • Copy column “Standards” and paste into “Results”

    • Fix column name back to “Results”

    • Find first trial of Item 1 and Item 2

     – Change result from “Good” to “Bad” to inject two errors into the simulated study

    • Save onto Desktop as “Attribute GRR”

    Create Result Data

     Attribute Agreement Analysis


    Summary Report

    Attribute Agreement Analysis for Results – Summary Report

    • The appraisals of the test items correctly matched the standard 96.7% of the time.
    • % Accuracy by Appraiser: two appraisers at 100.0%, one at 90.0%.
    • Misclassification Rates: Good rated Bad 6.7%; Bad rated Good 0.0%; Mixed ratings (same item rated both ways) 6.7%; Overall error rate 3.3%.
    • Consider the following when assessing how the measurement system can be improved:
     – Low accuracy rates: low rates for some appraisers may indicate a need for additional training for those appraisers. Low rates for all appraisers may indicate more systematic problems, such as poor operating definitions, poor training, or incorrect standards.
     – High misclassification rates: may indicate that either too many Good items are being rejected, or too many Bad items are being passed on to the consumer (or both).
     – High percentage of mixed ratings: may indicate items in the study were borderline cases between Good and Bad, and thus very difficult to assess.

    Is the overall % accuracy acceptable?

    The Attribute “c=0” result shows that no bad parts were misclassified as good. Overall, 96.7% of presentations were classified correctly.

    Attribute Agreement Analysis for Results – Accuracy Report

    [Accuracy Report graphs: % accuracy by Appraiser, by Standard (Good/Bad), by Trial, and by Appraiser and Standard. All graphs show 95% confidence intervals for accuracy rates; intervals that do not overlap are likely to be different. Illustrates the 95% / 90% result.]


    Kappa


    Kappa is a measure of rater agreement.

    Minitab:

    • Reports two Kappa statistics: Fleiss’ & Cohen’s

    • Defaults to Fleiss’ Kappa

    • Minitab will only calculate Cohen’s Kappa if you choose the option for Cohen’s Kappa, and if one of these two conditions is true:

     – A) Two appraisers perform a single trial on each sample

     – B) One appraiser performs two trials on each sample

    Kappa is meant for attribute data.

    Kappa ranges from -1 to 1.
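    For illustration, a sketch of a Cohen's kappa calculation in Python (scikit-learn); the two rating vectors are made up and stand in for condition B above (one appraiser, two trials on the same ten items):

```python
# Sketch: Cohen's kappa for two sets of attribute ratings of the same 10 items.
from sklearn.metrics import cohen_kappa_score

trial_1 = ["Good", "Good", "Good", "Good", "Good", "Bad", "Bad", "Bad", "Bad", "Bad"]
trial_2 = ["Bad",  "Good", "Good", "Good", "Good", "Bad", "Bad", "Bad", "Bad", "Bad"]

kappa = cohen_kappa_score(trial_1, trial_2)
print(f"Cohen's kappa = {kappa:.2f}")
# 1 = perfect agreement, 0 = agreement no better than chance, below 0 = worse than chance.
```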


    Kappa (Landis and Koch)

     According to AIAG (Auto industry), a general rule of thumb is:

     A Kappa value greater than 0.75 indicates good to excellent agreement.

    Kappa values less than 0.40 indicate poor agreement.

    This general rule of thumb may not apply for most Medtronic

    applications. Any disagreement on rejectable units would be of

    concern.


    Kappa calculations


    Kappa results


    Summary and Recap

    • Quality Trainer Review

    • Topics with Variables Data

     – Gage R&R Sample Size

     – Probability of Misclassification (Variables Data)

     – Helpful Hints

    • MSA for Destructive Tests

    • MSA for Attribute Tests


    BACKUP SLIDES


    Destructive Gage R&R - 2 Nested Designs

    • 2 Stage Nested Design Approach

    • Samples are parts that can be subdivided into homogeneous sub-samples.

    • Stage 1: 1 operator measures sub-samples (2-5) from parts (5-10).

    • Stage 2: 3 operators each measure the same location per part (5-10).

    [Diagram – Stage 1: one operator measures 2-5 sub-samples (locations) on each of 5-10 parts. Stage 2: three operators each measure 1 sub-sample per part (the same location) on each of the parts.]

    Destructive Gage R&R - 2 Stage Die Bond Example (cont.)

    • Project: Destructive 2 stage nested.mpj

    Pull testing of die bond.

    Parts are die. Sub-samples

    are 5 wire locations on the

    die. Spec = 7.5 grams

    minimum.

    Stage1: 1 operator pull

    tests all 5 wire locations on

    each of 10 die.

    Stage 2: Each of 3

    operators pull test 10 die at

    wire location 1.


    Destructive Gage R&R - 2 Stage Die Bond

    Example (cont.)

    Stage 1: Stat > ANOVA > Fully Nested

     ANOVA

    Nested ANOVA: Pull Strength versus Die

    Variance Components
    Source   Var Comp.   % of Total   StDev
    Die        0.088       15.50      0.296
    Error      0.479       84.50      0.692
    Total      0.567                  0.753

    (The Die component estimates σ²_part.)

    From worksheet: stage1


    Destructive Gage R&R - 2 Stage Die Bond Example (cont.)

    Stage 2: Stat > ANOVA > Fully Nested ANOVA

    Nested ANOVA: Pull Strength (Wire 1) versus Operator

    Variance Components
    Source     Var Comp.   % of Total   StDev
    Operator     0.053       11.08      0.231
    Error        0.428       88.92      0.654
    Total        0.481                  0.694

    (The Operator component estimates σ²_operator; the Error component estimates σ²_repeat + σ²_part.)

    From worksheet: stage2

  • 8/17/2019 Advanced Statistics Manual PDF

    81/258

    73 | MDT Confidential

Destructive Gage R&R - 2 Stage Die Bond Example (cont.)

Manual calculation of Gage Repeatability and Reproducibility:

σ²_repeat = σ²_(part + repeat) – σ²_part = 0.428 – 0.088 = 0.340

σ²_R&R = σ²_repeat + σ²_operator = 0.340 + 0.053 = 0.393

Compare the Gage R&R variance to the part variance if the parts are chosen to be representative of the production process. Since this is a one-sided spec (7.5 grams minimum), use misclassification to determine gage acceptance.
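The arithmetic above is easy to script. The sketch below (Python, with the variance components hard-coded from the Minitab output on these slides) combines the two stages; the %Contribution line at the end is an added illustration and is only meaningful if the parts are representative of production, as noted above.

    # Combine the two nested-ANOVA variance components from the die-bond example.
    var_part_stage1     = 0.088   # Die component from Stage 1 (sigma^2_part)
    var_error_stage2    = 0.428   # Error component from Stage 2 (sigma^2_part + sigma^2_repeat)
    var_operator_stage2 = 0.053   # Operator component from Stage 2 (sigma^2_operator)

    var_repeat = var_error_stage2 - var_part_stage1     # 0.340
    var_grr    = var_repeat + var_operator_stage2       # 0.393

    # Illustrative only: assumes the sampled parts represent production variation.
    total_var = var_grr + var_part_stage1
    print(f"Repeatability variance: {var_repeat:.3f}")
    print(f"Gage R&R variance:      {var_grr:.3f}")
    print(f"%Contribution of GRR:   {100 * var_grr / total_var:.1f}%")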

    74 | MDT Confidential

    Kappa – Call Center Example

Call Center workers were asked to categorize the types of calls they received. Worksheet: Callcat.mtw (MINITAB®)


    75 | MDT Confidential

    Kappa Attribute Analysis: Option Setting

    76 | MDT Confidential

Kappa: Within Appraiser Agreement


    77 | MDT Confidential

    Kappa: Each Appraiser vs Standard

    78 | MDT Confidential

    Kappa for Appraisers

What do we conclude from this analysis about the raters' performance?

    What would you do next?

    Can this method be applied to the banana data?


Distribution Analysis: The Art of Finding Useful Models

    Jeremy Strief, Ph.D.

    MECC Principal Statistician

    Objectives

    • Explain why distributional analysis is statistically

    complicated (and sometimes emotionally frustrating!)

    • Emphasize the importance of engineering theory and

    historical precedent.

    • Encourage the use of multiple graphical methods in

    addition to numerical tests.

    • Review common causes of Non-Normality.

    • Discuss Transformations and how they compare to

    fitting non-Normal distributions.

    Medtronic Confidential


    Recap from Quality Trainer 

    • Normal Distribution Basics

    • Capability Analysis (Normal)

    • Capability Analysis (Non-Normal)

    • Graphical tools

     – Boxplots

     – Histograms

     – Individual Value Plots

    | MDT Confidential3

Distribution Analysis: Motivation and Philosophy


    5 | MDT Confidential

Why Assess Distribution

• Statistical tools vary in sensitivity to and effect of distributional assumptions

• Some MDT procedures require distributional assessment for those statistical methods which are highly sensitive to distributional assumptions

Statistical Tool                     Distributional Sensitivity   Effect of Poor Distributional Fit
Capability Analysis                  High                         Incorrect PPM/Ppk
Tolerance Intervals                  High                         Incorrect bounds
Variables Lot Acceptance Sampling    High                         Altered rejection and acceptance rates
Individuals Chart for SPC            High                         Incorrect control limits
GLM / Regression / ANOVA             Med                          Approximate p-value
Xbar chart for SPC                   Med/Low                      Approximate p-value
Two-sample t-test                    Low                          Approximate p-value
Non-parametric methods               Low                          Approximate p-value

    6 | MDT Confidential

Not All Data Are Normal: Example

[Normal probability plot of Time: Mean 12.31, StDev 9.656, N 100, AD 5.738, P-Value < 0.005]

[Histogram of Time: right-skewed with a long upper tail]

Lead Time data usually have a long tail - a skewed distribution.


    Not All Data are Normal: Considerations

• Observed data need not follow any tractable mathematical model.

    • Some mathematical models may be useful, if

    imperfect, representations of the data.

    | MDT Confidential7

    Frustrations with Distributional Analysis

    • Larger sample sizes (n>100) cause the statistical

    tests to detect small departures from a theoretical

    model. Such departures may not be practically

    significant.

• Smaller sample sizes give the statistical tests little power to detect even substantial departures from a theoretical model, so a high p-value is weak evidence of good fit.


    The Underlying Statistical Hypotheses

    • The statistical hypothesis testing is ‘backward,’ in that the null

    hypothesis assumes that the particular distribution is a good fit.

     – H0: Distribution specified has a good fit

     – H1: Distribution specified has lack-of-fit

• Low p-values will disprove the fit of a distribution, so certain distributions can be ruled out as reasonable models.

    • Using the standard goodness-of-fit metrics, it is technically not

     possible to prove that a particular distribution is the “true model”

    for the data.

    • Instead of providing statistical “proof”, distribution analysis is

    geared toward assessing which statistical distributions are

    plausible models for the data at hand.

    | MDT Confidential9

    Philosophy of Distribution Analysis

    “All models are approximations. Essentially, all

    models are wrong, but some are useful. However,

    the approximate nature of the model must always

    be borne in mind.”

    --G.E.P. Box

    | MDT Confidential10


    N=15 Probability Plots

    Medtronic Confidential

    N=500 Examples

    Medtronic Confidential

    Only 12 out of 500 values were affected by the truncation or

    censoring.


    13 | MDT Confidential

    How to Determine Distribution

    1. Scientific/Engineering Knowledge

    2. Historical distribution analysis

    3. Distribution analysis

Priority order

Why is distribution analysis last?

• Sample size (50 to 100)

• Regardless of n, key Xs and shift and drift can mask the true distribution

Distribution applies to short-term data only

    Importance of Engineering Theory

    • The choice of distribution should be both statistically

    plausible and scientifically justified.

    • Engineering theory and historical precedents often

    suggest whether a distribution should be Normal,

    Lognormal, or Weibull.

    • If scientific theory does not lead to one single

    statistical model, at least consider 

     – Whether the distribution should be skewed or symmetric

     – Which distributions can be ruled out

    Medtronic Confidential


    Data Analysis Philosophy

• Information shouldn't be destroyed. Examples of information destruction are

 – Converting variables data to attribute data.

 – Heavy rounding with a bad measurement system.

 – Drifting measurement system.

• Check the quality and structure of the raw data.

 – Are there physically impossible values, wild outliers, missing values, too many ties?

 – Are the data paired or unpaired?

 – Was randomization employed?

 – How was the data generated?

    | MDT Confidential15

    Data Analysis Philosophy

• Plot the data AND do analytics.

 – PLOT histograms, run charts, scatter plots, etc. See what is going on. Do a probability plot for process data.

 – Use ANALYTICS to get quantitative about what you have seen. Examine the residual plots from analytical model fits.

• Analyses are performed on yesterday's data today to predict tomorrow's performance.

 – Data from an unstable process that is analyzed (ignoring the instability) may result in a conclusion that will not hold up tomorrow.

    | MDT Confidential16


Distribution Analysis: Review of Engineering Distributions

    Most Common Statistical Models for

    Engineering Applications

    • Weibull

    • Exponential (special case of Weibull)

    • Lognormal

    • Normal

    | MDT Confidential18


    Weibull

• A flexible model which can assume many different shapes, depending on the choice of parameters

• Scale parameter α or η

• Shape parameter β

• Arises from "weakest link" failures, or situations when the underlying process focuses on the minimum or maximum value of independent, positive random variables.

• Models stress-strength failures

    | MDT Confidential19

    Exponential

    • Special case of Weibull when β=1

    • Constant hazard rate, meaning that the probability of failure is not a

    function of the age of the device/material.

    • May occur when multiple failure modes are operating simultaneously

    • May be useful in modeling software failures resulting from external

    sources (e.g. cosmic radiation causes bit-flips at an extremely low,

    constant rate)

    | MDT Confidential20


    Lognormal

    • Models time-to-failure caused by several forces which combine

    multiplicatively.

    • Describes time to fracture from fatigue crack growth in metals.

    • Right skewed distribution, useful when data values take multiple

    orders of magnitude (e.g. 1.4, 14, 140).

    • Two parameters (μ,σ), each of which is traditionally expressed on

    the log scale.

    • So if X~Lognormal(μ,σ), then ln(X)~Normal(μ,σ)

    | MDT Confidential21
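The last bullet is easy to confirm by simulation. The sketch below (Python with numpy/scipy, which are not part of the Minitab-based course materials) draws lognormal data with arbitrary parameters and checks the Normality of its logarithm.

    # If X ~ Lognormal(mu, sigma), then ln(X) ~ Normal(mu, sigma): check by simulation.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    mu, sigma = 2.0, 0.5                                 # parameters on the log scale (arbitrary)
    x = rng.lognormal(mean=mu, sigma=sigma, size=200)

    log_x = np.log(x)
    print("mean of ln(X):", round(log_x.mean(), 3), " std of ln(X):", round(log_x.std(ddof=1), 3))

    # Anderson-Darling normality check on the log-transformed data
    result = stats.anderson(log_x, dist="norm")
    print("AD statistic:", round(result.statistic, 3))
    print("5% critical value:", result.critical_values[2])   # index 2 corresponds to the 5% level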

    Normal

    • Models time-to-failure caused by additive, independent forces

    • Commonly describes gage error, dimensional measurements from

    a supplier, and other symmetric, bell-shaped phenomena

    | MDT Confidential22


     Additional Models to Consider 

• Logistic

• Smallest Extreme Value (SEV)

    • Largest Extreme Value (LEV)

    | MDT Confidential23

    Some Relationships

    • SEV distribution = ln(Weibull distribution).

    • LEV distribution = ln(1/Weibull distribution).

    • Normal distribution = ln(Log-normal distribution).

    • All Weibull distributions can be rescaled and

    repowered to get another Weibull.

• The Weibull(100, 4) is very close to a Normal (mean = 90.64, s.d. = 25.43). This Normal is thicker in the tails than the Weibull(100, 4). Ref: 02SR013, "Algorithm for Computing Weibull Sample Size for Complete Data"

    | MDT Confidential24
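The Weibull(100, 4) figures quoted above can be reproduced quickly with scipy; this is only a convenience check, not taken from report 02SR013.

    # Mean and standard deviation of a Weibull with shape 4 and scale 100.
    from scipy import stats

    w = stats.weibull_min(4, scale=100)   # first argument is the shape parameter, scale = 100
    print(f"mean = {w.mean():.2f}, sd = {w.std():.2f}")   # approximately 90.64 and 25.43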


    25 | MDT Confidential

Review: Common Engineering Distributions

• Normal (the default): dimensions, measurement error

• Lognormal: lead time, time to fatigue-related failure

• Weibull: time to stress/strength-related failure, infant mortality, wearout

Distribution Analysis: Statistical Overview


    Statistical Approach to Distribution Analysis

    • Both graphical and numerical approaches areneeded

    • P-value is not definitive, given the “backward”

    nature of hypothesis testing

    • Visual assessment of the probability plot is

    crucial

    • Reasonably large sample sizes (~50) are

    needed. Consult your local procedures (e.g.

    DOC000550 within CRDM) for specific rules.

    | MDT Confidential27

Distribution Analysis: Graphical Methods


    Good Distribution Analysis Should

     Always Begin With Plots!

    • Probability plots

    • Histograms

    • Time plots

    Medtronic Confidential

    Probability Plot

    • A probability plot is a 2-dimensional plot with specialized (often

    logarithmic) axes, to facilitate comparison between observed

    data and a hypothesized distribution.

    • More specifically, a probability plot is a comparison between the

    observed and theoretical quantiles (i.e. percentiles) for a

    hypothesized distribution.

    | MDT Confidential30


    Probability Plot Interpretation

• If the distribution is a good fit to the data, the plotted points

    should fall approximately in a straight line.

    • When interpreting the probability plot, examine both the p-value

    and the visual fit.

     – At the tails of the distribution, look whether the points are falling on

    the conservative side of the fitted line.

     – Look for major deviations in the pattern of points from a straight

    line—kinks, ties, curves, jumps, etc. Do not worry if a few points

    fall outside the confidence bounds.

     – Fat Pencil Test: Can the observed data values be covered up by a

    “fat pencil”?

    | MDT Confidential31
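Minitab's probability plot is the standard tool in this course; for readers working outside Minitab, scipy's probplot produces a comparable Normal probability plot. A minimal sketch with simulated data (not a course dataset; requires matplotlib):

    # Normal probability plot (observed quantiles against theoretical Normal quantiles).
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import stats

    rng = np.random.default_rng(7)
    data = rng.normal(loc=10, scale=1.5, size=60)   # simulated measurements

    fig, ax = plt.subplots()
    stats.probplot(data, dist="norm", plot=ax)      # points near the line indicate a good fit
    ax.set_title("Normal probability plot (simulated data)")
    plt.show()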

    Probability Plot in Minitab

    | MDT Confidential32


    Probability Plot Examples

    Large N makes for

    obvious curvature:

    Medtronic Confidential

    Right skew and

    curvature:

    Probability Plot Examples

“Subtle Patterns” can be caused by randomness

    Medtronic Confidential

    Both datasets were

    sampled directly from a

    Normal distribution.


    Probability Plot Examples

• Distribution does not pass the Anderson-Darling test, but the lower tail of the distribution falls on the conservative side of the fitted line.

    • Distribution appears to have a lower limit of zero

    • It would be conservative to use the Normal model to estimate

    the lower tail behavior.

    | MDT Confidential35

    Histograms in Minitab

    The graph menu offers a histogram platform, but the graphical

    summary platform offers more information with fewer clicks.

    | MDT Confidential36


    Histograms

• More intuitive than probability plots, since the x-y axes are not transformed.

• Not informative with small sample sizes, where there are too few observations per bin to reveal the shape of the distribution.


    39 | MDT Confidential

    Why is Stability needed to Assess Distribution?

    Distribution Assessment Risks

    • Shift and Drift, and Variation in Key Xs

    masks distribution

    • Initial capability data always contains

    Shift and Drift

    • At Final Capability, process is stable

    and variation in Key Xs is removed

    Distribution Analysis Shift and Drift.mtw

    100 samples from Week 1

    25 samples from Week 2

    100 samples from Week 3

    Distribution applies to short term data only

    MINITAB®

    40 | MDT Confidential

    Initial Process Data often have Shift and Drift

[I Chart of Initial Capability Data: individual values vs. observation number, with center line X-bar = 19.93, UCL = 26.30, LCL = 13.55; a large number of points are flagged as out of control.]


    41 | MDT Confidential

Long Term Data May Not be Normal

[Normal probability plot of the Initial Capability Data (percent vs. value): the long-term data depart from the fitted Normal line.]


Distribution Analysis: Numerical Methods

    Numerical Methods

    • For all numerical methods:

 – A large (≥0.05) p-value implies there is no evidence against the hypothesized distribution.

 – A small (<0.05) p-value implies evidence of lack-of-fit for the hypothesized distribution.


    45 | MDT Confidential

    Most Common Normality Tests

    •  Anderson-Darl ing (AD) test

    • Ryan-Joiner test

    Note: The Ryan-Joiner test is essentially

    equivalent to the Shapiro-Wilk test.

     Anderson-Darling

    • Default approach in Minitab.

    • May be used to assess fit of Normal and non-

    Normal distributions.

    • Gives unreliable results when data are

    discretized/grouped, which is fairly common

    when measurement system resolution is poor.

    | MDT Confidential46


     Anderson-Darling in Minitab

    For assessing Normality:

    | MDT Confidential47

     Anderson-Darling in Minitab

    For any/all distributions:

    | MDT Confidential48


     Anderson-Darling Results

    Normal(10,1.5)

    | MDT Confidential49

    Normal(10,1.5)--Rounded

    Ryan-Joiner 

    • Useful for discretized, rounded, or clumpy data

    • Will not declare significant lack-of-fit simply due to poor

    measurement resolution

• Recommended minimum of 5 “groups” to have a meaningful p-value. Fewer groups may yield an overly optimistic (high) p-value.

    | MDT Confidential50

[Side-by-side probability plots of the rounded data: Anderson-Darling vs. Ryan-Joiner results.]
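Outside Minitab, the Anderson-Darling test is available in scipy, and the Shapiro-Wilk test can stand in for Ryan-Joiner (the earlier slide notes the two are essentially equivalent). The sketch below runs both on simulated, coarsely rounded Normal data so the results can be compared; the data and seed are arbitrary, not course data.

    # Compare Anderson-Darling and Shapiro-Wilk (a stand-in for Ryan-Joiner) on rounded Normal data.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    data = rng.normal(loc=10, scale=1.5, size=100)
    rounded = np.round(data)                         # mimic coarse measurement resolution

    ad = stats.anderson(rounded, dist="norm")
    print("AD statistic:", round(ad.statistic, 3), " 5% critical value:", ad.critical_values[2])

    w, p = stats.shapiro(rounded)                    # Shapiro-Wilk, essentially equivalent to Ryan-Joiner
    print("Shapiro-Wilk W:", round(w, 3), " p-value:", round(p, 3))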


    Ryan-Joiner in Minitab

    | MDT Confidential51

    Truncation

    • The Normal distribution may be used to model tail

    behavior if it provides a conservative estimate of

    those tails.

    • This situation arises when data are truncated, which

    is quantitatively captured as negative kurtosis.

    | MDT Confidential52


    Truncation

    • In principle, truncated data may be evaluatedgraphically or through a Skewness-Kurtosis (SK) test.

    • The SK test checks whether the tails of the Normal

    distribution are longer or shorter than the tails of your

    data.

    • MECC has created and validated an Excel

    spreadsheet (R134997) which executes the SK test.

    • In practice, consult your local procedures to ensure

    your analysis of truncated data is compliant.

    | MDT Confidential53

[Embedded Microsoft Excel worksheet]

     Avoiding Parametric Distributions Altogether 

• Chebyshev's inequality captures the tail behavior of any statistical distribution with a finite variance.

 – For any random variable X and constant k > 1,

   P( |X − μ| ≥ kσ ) ≤ 1/k²

• This inequality may be useful for skipping the issue of distributional fit altogether, especially if distributional fit is being assessed in order to compute a tolerance interval.

• Chebyshev's will only be helpful if the process capability is extremely high.

• Consult your own procedures for details, but CRDM procedures invoke the following version of Chebyshev:

 – If the nearest specification is at least 10 standard deviations away from the mean, it may be inferred by Chebyshev that at least 99% of the distribution will fall within specification.

    | MDT Confidential54
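The 10-standard-deviation rule quoted above is just the inequality evaluated at k = 10. A tiny sketch of the arithmetic (pure Python; the function name is ours, not from any procedure):

    # Chebyshev bound: P(|X - mu| >= k*sigma) <= 1/k^2 for any distribution with finite variance.
    def chebyshev_out_of_spec_bound(k: float) -> float:
        """Upper bound on the fraction of the distribution more than k standard deviations from the mean."""
        if k <= 1:
            raise ValueError("Chebyshev's inequality is only informative for k > 1")
        return 1.0 / k ** 2

    k = 10  # nearest specification is 10 standard deviations from the mean
    bound = chebyshev_out_of_spec_bound(k)
    print(f"At most {bound:.1%} of the distribution lies beyond {k} sigma,")
    print(f"so at least {1 - bound:.0%} falls within specification.")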


    Why Normality Tests Fail

    1. A shift occurred in the middle of the data

    2. Multiple sources or multiple failure modes withdifferent distributions

    3. Outliers

    4. Piled up data.

    5. Truncated data (sorted before you get it)

    6. The underlying distribution is not normal (skewed)

    7. Poor measurement resolution

8. Too much data (overpowered to detect non-normality)

9. Due to random chance – you expect the test to fail 5% of the time (i.e. 95% confidence) if the data were truly from a normal distribution.
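Cause 9 can be demonstrated by simulation: truly Normal data should fail a 5%-level normality test about 5% of the time. A minimal sketch (Python/scipy, not course material; the simulation settings are arbitrary):

    # Estimate how often truly Normal data "fail" a normality test at the 5% level.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(11)
    n_sims, n_obs, failures = 2000, 50, 0

    for _ in range(n_sims):
        sample = rng.normal(size=n_obs)
        _, p = stats.shapiro(sample)     # any 5%-level normality test behaves similarly
        if p < 0.05:
            failures += 1

    print(f"Rejection rate on truly Normal data: {failures / n_sims:.1%}  (expect about 5%)")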

Resolving Non-Normality

Cause                               Resolution options
1    Data shift                     Sublot; skewness/kurtosis test; attribute sampling
2    Multiple data sources          Sublot; skewness/kurtosis test; attribute sampling
3    Outliers                       Attribute sampling; outlier removal (may remove outliers only if they constitute typos or data collection errors)
4/5  Censored/truncated data        Skewness/kurtosis test; conservative fitting; attribute sampling
     (tails lost)
6    Distribution not normal        Non-normal analysis; transformation; attribute sampling
7    Poor measurement resolution    Ryan-Joiner; skewness/kurtosis test
8    Too much data                  Graphical evidence; random subsampling
9    Random chance                  Historical assessment


    When Multiple Distributions Fit

Prior engineering knowledge is particularly useful when multiple distributions yield p-values above 0.05:

 – Picking the distribution solely based on best p-value or best R² is rational when there is absolutely no history or scientific theory.

 – A better approach is to assemble a list of plausible (p>0.05) distributions and then make a final choice based upon history and science.

 – P-values will sometimes be below 0.05 simply as a result of chance (Type I error). It is not recommended to immediately change years of analysis based on one significant p-value. Investigate and monitor before changing distributions.

    | MDT Confidential57

     Avoid the daily special

     – Do NOT take the “distribution du jour” approach, in

    which multiple distributions are chosen for a single

    process. This reflects either:

    • An out-of-control process, which can’t be

    captured by a single distribution anyway.

    • The bad statistical practice of just defaulting to

    the distribution with the highest p-value.

    | MDT Confidential58


    59 | MDT Confidential

    Example: Capability for Non-Normal Data

    using Tribal Knowledge for Distribution

Problem Statement: Time (in days) to process (reject/accept) loan applications is too long, causing a loss of customer applications.

    Project Goal: Decrease potential customer loss from

    15% to 5%. Customer expectation is 20 days.

    Project Strategy: Path Y = Time

    Task: Determine capability for Y = Time

    LoanApplicationTime.MTW

     Assume lead time has a LogNormal Distribution

    MINITAB®

    60 | MDT Confidential

Verify Lognormal Distribution

Check if the Lognormal provides a good fit:

[Probability Plot of Time, Lognormal - 95% CI: Loc 2.269, Scale 0.6845, N 100, AD 0.432, P-Value 0.299]


    61 | MDT Confidential

    Capability for Non-Normal Data using LogNormal

[Process Capability of Time - Calculations Based on Lognormal Distribution Model]

Process Data:             Sample N 100; Location 2.26918; Scale 0.684493; LSL *; Target *; USL 20; Sample Mean 12.31

Overall Capability:       Z.Bench 1.06; Z.LSL *; Z.USL 0.47; Ppk 0.16

Observed Performance:     PPM < LSL *; PPM > USL 160000; PPM Total 160000

Exp. Overall Performance: PPM < LSL *; PPM > USL 144242; PPM Total 144242
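The "Exp. Overall Performance" figure comes from the fitted Lognormal tail probability beyond the USL. A short check with scipy, using the Location and Scale shown in the output above, reproduces it:

    # Expected PPM above USL = 20 under the fitted Lognormal (Location = 2.26918, Scale = 0.684493).
    import numpy as np
    from scipy import stats

    location, scale, usl = 2.26918, 0.684493, 20.0

    # scipy parameterization: s = sigma on the log scale, scale = exp(mu)
    fitted = stats.lognorm(s=scale, scale=np.exp(location))

    p_above_usl = fitted.sf(usl)                            # survival function = P(X > USL)
    print(f"Expected PPM > USL: {p_above_usl * 1e6:.0f}")   # approximately 144,000 PPM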

Distribution Analysis: Transformations


    Two Options

    • When a dataset is non-Normal, it is acceptable either to

 – Mathematically transform the data to achieve Normality

 – Fit a non-Normal distribution

• Transformation carries the practical advantage that many statistical methods are based upon Normality, so there will be more analytical tools available for the transformed dataset.

• Transformation carries the disadvantages of creating unnatural units (e.g. log-meters instead of meters) and altering potentially relevant structures of the data.

• Note: Please do NOT try transformations of data from an unstable process, or bimodal data (two bumps).

    | MDT Confidential63
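A minimal sketch of the first option, a log transformation followed by a Normality re-check, using simulated right-skewed data rather than a course dataset (Python/scipy; parameter values are arbitrary):

    # Transform right-skewed data with a natural log, then re-check Normality.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)
    raw = rng.lognormal(mean=2.3, sigma=0.7, size=100)   # simulated skewed data (e.g. lead times)

    for label, values in [("raw", raw), ("log-transformed", np.log(raw))]:
        _, p = stats.shapiro(values)
        print(f"{label:>16}: Shapiro-Wilk p-value = {p:.3f}")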

    Transformation Advice

    • If a transformation is chosen, it should be as

    simple as possible, and it should ideally have a

    physical interpretation.

    • A log transformation is particularly desirable,

    since it

     – Is monotonic

     – Is straightforward to interpret (it turns multiplicative

    effects into additive effects)