Business Statistics, 4e, by Ken Black. © 2003 John Wiley & Sons. 17-1 Business Statistics, 4e by Ken Black Chapter 17 Nonparametric Statistics


  • Slide 1
  • Business Statistics, 4e, by Ken Black
    Chapter 17: Nonparametric Statistics
  • Slide 2
  • Learning Objectives
    Recognize the advantages and disadvantages of nonparametric statistics.
    Understand how to use the runs test to test for randomness.
    Know when and how to use the Mann-Whitney U test, the Wilcoxon matched-pairs signed rank test, the Kruskal-Wallis test, and the Friedman test.
    Learn when and how to measure correlation using Spearman's rank correlation measurement.
  • Slide 3
  • Parametric vs Nonparametric Statistics
    Parametric statistics are statistical techniques based on assumptions about the population from which the sample data are collected.
      They assume that the data being analyzed are randomly selected from a normally distributed population.
      They require quantitative measurements that yield interval or ratio level data.
    Nonparametric statistics are based on fewer assumptions about the population and the parameters.
      They are sometimes called distribution-free statistics.
      A variety of nonparametric statistics are available for use with nominal or ordinal data.
  • Slide 4
  • Advantages of Nonparametric Techniques
    Sometimes there is no parametric alternative to the use of nonparametric statistics.
    Certain nonparametric tests can be used to analyze nominal data.
    Certain nonparametric tests can be used to analyze ordinal data.
    The computations on nonparametric statistics are usually less complicated than those for parametric statistics, particularly for small samples.
    Probability statements obtained from most nonparametric tests are exact probabilities.
  • Slide 5
  • Disadvantages of Nonparametric Statistics
    Nonparametric tests can be wasteful of data if parametric tests are available for use with the data.
    Nonparametric tests are usually not as widely available and well known as parametric tests.
    For large samples, the calculations for many nonparametric statistics can be tedious.
  • Slide 6
  • Runs Test
    Test for randomness: is the order or sequence of observations in a sample random or not?
    Each sample item possesses one of two possible characteristics.
    Run: a succession of observations that possess the same characteristic.
    Example with two runs: F, F, F, F, F, F, F, F, M, M, M, M, M, M, M
    Example with fifteen runs: F, M, F, M, F, M, F, M, F, M, F, M, F, M, F
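As a small illustration of the definition of a run, the sketch below counts maximal blocks of identical consecutive items in the two example sequences. The helper name `count_runs` is hypothetical and is not taken from the slides.

```python
from itertools import groupby


def count_runs(sequence):
    """Count maximal blocks of consecutive identical items (runs)."""
    # groupby yields one group per block of equal neighbours.
    return sum(1 for _ in groupby(sequence))


# The two example sequences from the slide.
two_runs = list("FFFFFFFFMMMMMMM")
fifteen_runs = list("FMFMFMFMFMFMFMF")

print(count_runs(two_runs))      # 2
print(count_runs(fifteen_runs))  # 15
```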
  • Slide 7
  • Runs Test: Sample Size Consideration
    Sample size: n
    Number of sample members possessing the first characteristic: n1
    Number of sample members possessing the second characteristic: n2
    n = n1 + n2
    If both n1 and n2 are ≤ 20, the small sample runs test is appropriate.
  • Slide 8
  • Runs Test: Small Sample Example
    H0: The observations in the sample are randomly generated.
    Ha: The observations in the sample are not randomly generated.
    α = .05, n1 = 18, n2 = 8
    If 7 ≤ R ≤ 17, do not reject H0. Otherwise, reject H0.
    Run:      1  2      3  4   5  6     7  8  9  10   11   12
    Sequence: D  CCCCC  D  CC  D  CCCC  D  C  D  CCC  DDD  CCC
    R = 12
    Since 7 ≤ R = 12 ≤ 17, do not reject H0.
  • Slide 9
  • Runs Test: Large Sample
    If either n1 or n2 is > 20, the sampling distribution of R is approximately normal.
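The formulas on this slide did not survive extraction. For reference, the usual textbook form of the normal approximation for the number of runs R is given below; it is supplied here as the standard result, not copied from the slide.

```latex
\mu_R = \frac{2 n_1 n_2}{n_1 + n_2} + 1, \qquad
\sigma_R = \sqrt{\frac{2 n_1 n_2 \,(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 \,(n_1 + n_2 - 1)}}, \qquad
Z = \frac{R - \mu_R}{\sigma_R}
```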
  • Slide 10
  • Runs Test: Large Sample Example
    H0: The observations in the sample are randomly generated.
    Ha: The observations in the sample are not randomly generated.
    α = .05, n1 = 40, n2 = 10
    If -1.96 ≤ Z ≤ 1.96, do not reject H0. Otherwise, reject H0.
    Run:      1    2  3        4  5   6   7       8  9     10  11
    Sequence: NNN  F  NNNNNNN  F  NN  FF  NNNNNN  F  NNNN  F   NNNNN
    Run:      12    13
    Sequence: FFFF  NNNNNNNNNNNN
    R = 13
  • Slide 11
  • Runs Test: Large Sample Example
    Since -1.96 ≤ Z = -1.81 ≤ 1.96, do not reject H0.
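The Z value in this example can be checked with a short sketch. It assumes the standard normal-approximation formulas for the mean and standard deviation of R (the slide's own formulas are not in the extracted text), and the function name `runs_test_z` is hypothetical.

```python
import math


def runs_test_z(r, n1, n2):
    """Z statistic for the large-sample runs test (standard approximation)."""
    mean_r = 2 * n1 * n2 / (n1 + n2) + 1
    var_r = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)) / (
        (n1 + n2) ** 2 * (n1 + n2 - 1)
    )
    return (r - mean_r) / math.sqrt(var_r)


# Values from the example: R = 13, n1 = 40, n2 = 10.
z = runs_test_z(r=13, n1=40, n2=10)
reject = abs(z) > 1.96  # two-tailed test at alpha = .05

print(round(z, 2))  # -1.81
print(reject)       # False
```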
  • Slide 12
  • Mann-Whitney U Test
    Nonparametric counterpart of the t test for independent samples.
    Does not require normally distributed populations.
    May be applied to ordinal data.
    Assumptions:
      Independent samples
      At least ordinal data
  • Slide 13
  • Mann-Whitney U Test: Sample Size Consideration
    Size of sample one: n1
    Size of sample two: n2
    If both n1 and n2 are ≤ 10, the small sample procedure is appropriate.
    If either n1 or n2 is greater than 10, the large sample procedure is appropriate.
  • Slide 14
  • Mann-Whitney U Test: Small Sample Example
    Health Service    Educational Service
    20.10             26.19
    19.80             23.88
    22.36             25.50
    18.75             21.64
    21.90             24.85
    22.96             25.30
    20.75             24.12
                      23.45
    H0: The health service population is identical to the educational service population on employee compensation.
    Ha: The health service population is not identical to the educational service population on employee compensation.
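The slide's own calculation is not present in the extracted text. As a rough sketch, the rank sums and U statistic for these two samples can be computed as follows, assuming one common convention (U1 = n1·n2 + n1(n1+1)/2 − W1, likewise for U2, with U the smaller of the two). The helper `rank_sums` is a hypothetical name.

```python
def rank_sums(sample1, sample2):
    """Rank the pooled data (average ranks for ties) and return (W1, W2)."""
    pooled = sorted(sample1 + sample2)
    rank_of = {}
    for value in set(pooled):
        # 1-based positions of this value in the sorted pooled data.
        positions = [i + 1 for i, v in enumerate(pooled) if v == value]
        rank_of[value] = sum(positions) / len(positions)
    w1 = sum(rank_of[v] for v in sample1)
    w2 = sum(rank_of[v] for v in sample2)
    return w1, w2


# Data from the slide.
health = [20.10, 19.80, 22.36, 18.75, 21.90, 22.96, 20.75]
education = [26.19, 23.88, 25.50, 21.64, 24.85, 25.30, 24.12, 23.45]

n1, n2 = len(health), len(education)
w1, w2 = rank_sums(health, education)

# Assumed convention for U; other texts define U slightly differently.
u1 = n1 * n2 + n1 * (n1 + 1) / 2 - w1
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - w2
u = min(u1, u2)

print(w1, w2)     # 31.0 89.0
print(u1, u2, u)  # 53.0 3.0 3.0
```

A useful sanity check is that U1 + U2 always equals n1·n2, here 56.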
  • Slide 15
  • Mann-Whitney U Test: Small Sample Example
    α = .05
    If the final p-value [...]
    If n > 15, T is approximately normally distributed, and a Z test is used.
    If n ≤ 15, a special small sample procedure is followed.
    The paired data are randomly selected.
    The underlying distributions are symmetrical.
  • Slide 25
  • Wilcoxon Matched-Pairs Signed Rank Test: Small Sample Example
    Family Pair    Pittsburgh    Oakland
    1              1,950         1,760
    2              1,840         1,870
    3              2,015         1,810
    4              1,580         1,660
    5              1,790         1,340
    6              1,925         1,765
    H0: Md = 0
    Ha: Md ≠ 0
    n = 6, α = 0.05
    If T observed ≤ 1, reject H0.
  • Slide 26
  • Wilcoxon Matched-Pairs Signed Rank Test: Small Sample Example
    Family Pair    Pittsburgh    Oakland    d      Rank
    1              1,950         1,760      190    +4
    2              1,840         1,870      -30    -1
    3              2,015         1,810      205    +5
    4              1,580         1,660      -80    -2
    5              1,790         1,340      450    +6
    6              1,925         1,765      160    +3
    T = minimum(T+, T-)
    T+ = 4 + 5 + 6 + 3 = 18
    T- = 1 + 2 = 3
    T = 3
    Since T = 3 > T crit = 1, do not reject H0.
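The ranking and the T statistic in this example can be reproduced with a minimal sketch. The function name `signed_rank_sums` is hypothetical, and the critical value of 1 is taken from the slide rather than computed.

```python
def signed_rank_sums(first, second):
    """Return (T_plus, T_minus) for paired samples, dropping zero differences."""
    diffs = [a - b for a, b in zip(first, second) if a != b]
    ordered = sorted(abs(d) for d in diffs)

    def avg_rank(x):
        # Average of the 1-based positions of x, which handles ties.
        positions = [i + 1 for i, v in enumerate(ordered) if v == x]
        return sum(positions) / len(positions)

    t_plus = sum(avg_rank(abs(d)) for d in diffs if d > 0)
    t_minus = sum(avg_rank(abs(d)) for d in diffs if d < 0)
    return t_plus, t_minus


# Data from the slide.
pittsburgh = [1950, 1840, 2015, 1580, 1790, 1925]
oakland = [1760, 1870, 1810, 1660, 1340, 1765]

t_plus, t_minus = signed_rank_sums(pittsburgh, oakland)
t = min(t_plus, t_minus)

T_CRIT = 1  # critical value quoted on the slide for n = 6, alpha = .05
reject = t <= T_CRIT

print(t_plus, t_minus, t)  # 18.0 3.0 3.0
print(reject)              # False
```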
  • Slide 27
  • Wilcoxon Matched-Pairs Signed Rank Test: Large Sample Formulas
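The formulas themselves are missing from the extracted slide. The standard large-sample approximation for the Wilcoxon statistic T, with n the number of nonzero paired differences, is usually written as below; this is the textbook result, not a transcription of the slide.

```latex
\mu_T = \frac{n(n+1)}{4}, \qquad
\sigma_T = \sqrt{\frac{n(n+1)(2n+1)}{24}}, \qquad
Z = \frac{T - \mu_T}{\sigma_T}
```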
  • Slide 28
  • Airline Cost Data for 17 Cities, 1979 and 1999
    City    1979    1999    d       Rank
    1       20.3    22.8    -2.5    -8
    2       19.5    12.7    6.8     17
    3       18.6    14.1    4.5     13
    4       20.9    16.1    4.8     15
    5       19.9    25.2    -5.3    -16
    6       18.6    20.2    -1.6    -4
    7       19.6    14.9    4.7     14
    8       23.2    21.3    1.9     6.5
    9       21.8    18.7    3.1     10
    10      20.3    20.9    -0.6    -1
    11      19.2    22.6    -3.4    -11.5
    12      19.5    16.9    2.6     9
    13      18.7    20.6    -1.9    -6.5
    14      17.7    18.5    -0.8    -2
    15      21.6    23.4    -1.8    -5
    16      22.4    21.3    1.1     3
    17      20.8    17.4    3.4     11.5
    H0: Md = 0
    Ha: Md ≠ 0
  • Slide 29
  • Airline Cost: T Calculation
  • Slide 30
  • Airline Cost: Conclusion
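The calculation and conclusion slides carry no text in this extraction. The sketch below shows how T and Z could be obtained from the d column of the airline cost table, assuming the standard large-sample formulas and a two-tailed test at α = .05. Both the formulas and the α level are assumptions, since neither appears in the extracted slides.

```python
import math

# Differences d for the 17 cities, as listed in the airline cost table.
d = [-2.5, 6.8, 4.5, 4.8, -5.3, -1.6, 4.7, 1.9, 3.1,
     -0.6, -3.4, 2.6, -1.9, -0.8, -1.8, 1.1, 3.4]
n = len(d)
ordered = sorted(abs(x) for x in d)


def avg_rank(x):
    """Average 1-based rank of x among the absolute differences (handles ties)."""
    positions = [i + 1 for i, v in enumerate(ordered) if v == x]
    return sum(positions) / len(positions)


t_plus = sum(avg_rank(abs(x)) for x in d if x > 0)
t_minus = sum(avg_rank(abs(x)) for x in d if x < 0)
t = min(t_plus, t_minus)

# Standard normal approximation for T (assumed, see lead-in).
mean_t = n * (n + 1) / 4
sd_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
z = (t - mean_t) / sd_t
reject = abs(z) > 1.96  # assumed two-tailed alpha = .05

print(t_plus, t_minus, t)  # 99.0 54.0 54.0
print(round(z, 2))         # -1.07
print(reject)              # False
```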
  • Slide 31
  • Kruskal-Wallis Test
    A nonparametric alternative to one-way analysis of variance.
    May be used to analyze ordinal data.
    No assumed population shape.
    Assumes that the C groups are independent.
    Assumes random selection of individual items.
  • Slide 32
  • Kruskal-Wallis K Statistic
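The statistic itself is missing from the extracted slide. In its usual form, with C groups, n_j items in group j, T_j the total of the ranks in group j, and n the total number of items, the K statistic is written as below and is compared with a chi-square distribution with C − 1 degrees of freedom. This is the standard definition, given here as a reference rather than as the slide's exact notation.

```latex
K = \frac{12}{n(n+1)} \left( \sum_{j=1}^{C} \frac{T_j^{2}}{n_j} \right) - 3(n+1),
\qquad K \;\sim\; \chi^2_{C-1} \ \text{(approximately)}
```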
  • Slide 33
  • Number of Patients per Day per Physician in Three Organizational Categories
    Two Partners    Three or More Partners    HMO
    13              24                        26
    15              16                        22
    20              19                        31
    18              22                        27
    23              25                        28
                    14                        33
                    17
    H0: The three populations are identical.
    Ha: At least one of the three populations is different.
  • Slide 34
  • Patients per Day Data: Kruskal-Wallis Preliminary Calculations
    n = n1 + n2 + n3 = 5 + 7 + 6 = 18
    Two Partners          Three or More Partners    HMO
    Patients    Rank      Patients    Rank          Patients    Rank
    13          1         24          12            26          14
    15          3         16          4             22          9.5
    20          8         19          7             31          17
    18          6         22          9.5           27          15
    23          11        25          13            28          16
                          14          2             33          18
                          17          5
    T1 = 29               T2 = 52.5                 T3 = 89.5
    n1 = 5                n2 = 7                    n3 = 6
  • Slide 35
  • Patients per Day Data: Kruskal-Wallis Calculations and Conclusion
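The calculation on this slide is not in the extracted text. A minimal sketch of it, using the rank totals and group sizes from the preliminary calculations and the standard K formula, might look like this. The α = .05 level is an assumption; 5.991 is the chi-square critical value for 2 degrees of freedom at that level.

```python
# Rank totals and group sizes from the preliminary calculations.
rank_totals = [29, 52.5, 89.5]
sizes = [5, 7, 6]
n = sum(sizes)

# Standard Kruskal-Wallis statistic (assumed form, see lead-in).
k = 12 / (n * (n + 1)) * sum(
    t ** 2 / m for t, m in zip(rank_totals, sizes)
) - 3 * (n + 1)

CHI2_CRIT = 5.991  # chi-square, df = C - 1 = 2, assumed alpha = .05
reject = k > CHI2_CRIT

print(round(k, 2))  # 9.56
print(reject)       # True
```

Under these assumptions K exceeds the critical value, so the hypothesis that the three populations are identical would be rejected.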
  • Slide 36
  • Friedman Test
    A nonparametric alternative to the randomized block design.
    Assumptions: The blocks are