

Normality Test

Kolmogorov-Smirnov Normality Test (D)

Equations taken from Zar, 1984

For this test, first calculate the observed cumulative relative frequency of each observation. In this case n = 17, so the value for the first observation is 1/17 = 0.0588, for the second observation 2/17 = 0.1176, etc. Next, calculate the z-score of each observation and determine its probability from the z table. For negative z-scores the table probability is the expected cumulative frequency; for positive z-scores it is 1 minus the table probability.

$D_i = |\mathrm{rel}\,F_i - \mathrm{rel}\,\hat{F}_i|$

$D'_i = |\mathrm{rel}\,F_{i-1} - \mathrm{rel}\,\hat{F}_i|$

$D_i$ is the absolute difference between the observed and expected cumulative frequencies of the same observation. $D'_i$ is the absolute difference between the observed frequency and the next expected frequency (taking rel $F_0$ = 0 for the first observation), so for this table |0.0588 − 0.1736| = 0.1148, |0.1176 − 0.2266| = 0.1090, etc. Find the largest value from either of these two columns (here it is 0.1983) and compare it to the critical value in the D table.

                 Population  Observed    Z-score  Z            Expected    Di      D'i
Village          Density     Cumulative           Probability  Cumulative
                             Frequency                         Frequency
Aranza           4.13        0.0588      -1.40    0.0808       0.0808      0.0220  0.0808
Corupo           4.53        0.1176      -0.94    0.1736       0.1736      0.0560  0.1148
San Lorenzo      4.69        0.1764      -0.75    0.2266       0.2266      0.0502  0.1090
Cheranatzicurin  4.76        0.2352      -0.67    0.2514       0.2514      0.0162  0.0750
Nahuatzen        4.77        0.2940      -0.66    0.2546       0.2546      0.0394  0.0194
Pomacuaran       4.96        0.3528      -0.44    0.3300       0.3300      0.0228  0.0360
Sevina           4.97        0.4116      -0.43    0.3336       0.3336      0.0780  0.0192
Arantepacua      5.00        0.4704      -0.39    0.3483       0.3483      0.1221  0.0633
Cocucho          5.04        0.5292      -0.35    0.3632       0.3632      0.1660  0.1072
Charapan         5.10        0.5880      -0.28    0.3897       0.3897      0.1983  0.1395
Comachuen        5.25        0.6468      -0.10    0.4602       0.4602      0.1866  0.1278
Pichataro        5.36        0.7056       0.02    0.4920       0.5080      0.1976  0.1388
Quinceo          5.94        0.7644       0.69    0.2451       0.7549      0.0095  0.0493
Nurio            6.06        0.8232       0.83    0.2033       0.7967      0.0265  0.0323
Turicuaro        6.19        0.8820       0.98    0.1635       0.8365      0.0455  0.0133
Urapicho         6.30        0.9408       1.11    0.1335       0.8665      0.0743  0.0155
Capacuaro        7.73        0.9996       2.76    0.0029       0.9971      0.0025  0.0563

Ho: Village population density is not significantly different than normal.
Ha: Village population density is significantly different than normal.
α = 0.05
n = 17, mean = 5.34, sd = 0.866

Dmax = 0.1983
DCritical = 0.207

Since 0.1983 < 0.207, accept Ho.

Village population density is not significantly different than normal (D = 0.1983, 0.10 > p > 0.05).
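The hand calculation above can be sketched in Python as a cross-check. This is a minimal illustration, not part of the original handout: the function names `normal_cdf` and `ks_normality_d` are made up for the example, and the z-table lookup is replaced by the standard normal cumulative probability computed with `math.erf`. Because the z-scores and frequencies are not rounded, the result should be close to, but not identical with, the tabulated 0.1983.

```python
import math
import statistics

# Population densities from the table above, in ascending order.
densities = [4.13, 4.53, 4.69, 4.76, 4.77, 4.96, 4.97, 5.00, 5.04,
             5.10, 5.25, 5.36, 5.94, 6.06, 6.19, 6.30, 7.73]

def normal_cdf(z):
    """Standard normal cumulative probability (stands in for the z table)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ks_normality_d(values):
    """Return the largest of all D_i and D'_i (hypothetical helper)."""
    xs = sorted(values)
    n = len(xs)
    mean = statistics.mean(xs)
    sd = statistics.stdev(xs)  # sample standard deviation (n - 1 denominator)
    d_max = 0.0
    for i, x in enumerate(xs, start=1):
        expected = normal_cdf((x - mean) / sd)    # expected cumulative frequency
        d_i = abs(i / n - expected)               # D_i: same-row difference
        d_prime_i = abs((i - 1) / n - expected)   # D'_i: previous observed frequency
        d_max = max(d_max, d_i, d_prime_i)
    return d_max

D_CRITICAL = 0.207  # critical value quoted in the text for n = 17, alpha = 0.05
d_max = ks_normality_d(densities)
print(f"Dmax = {d_max:.4f}")
print("accept Ho" if d_max < D_CRITICAL else "reject Ho")
```

The critical value is still taken from the D table; the sketch only automates the two difference columns and the search for their maximum.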


W/S Normality Test

Equations taken from Kanji, 1993

$q = \frac{w}{s}$

The W/S normality test is a fairly simple test that requires only the sample standard deviation and the data range. The test is based on the q statistic, which is the ‘studentized’ range, or the range expressed in standard deviation units. This test should not be confused with the Shapiro-Wilk test, which is a more powerful normality test. The test statistic is q, which is compared to the critical q values from the table; w is the data range and s is the standard deviation of the data. The critical q value is a range. If the calculated q value falls within this range Ho is accepted (the data are normal); Ho is rejected when the q value falls outside the critical range (the data are not normal).

Note: The W/S test does a poor job of rejecting Ho when the data are very skewed, when data are clumped in the tails, or when there are outliers. This test should only be used if the data are approximately symmetrical.

                 Population
Village          Density
Aranza           4.13
Corupo           4.53
San Lorenzo      4.69
Cheranatzicurin  4.76
Nahuatzen        4.77
Pomacuaran       4.96
Cocucho          5.04
Charapan         5.10
Pichataro        5.36
Comachuen        5.53
Sevina           5.75
Quinceo          5.94
Nurio            6.06
Turicuaro        6.19
Urapicho         6.30
Arantepacua      7.21
Capacuaro        7.73

Ho: Village population density is not significantly different than normal.
Ha: Village population density is significantly different than normal.
α = 0.05
s = 0.962
w = 3.6

$q = \frac{3.6}{0.962} = 3.74$

qCritical = 3.06 − 4.31

Since 3.06 < q = 3.74 < 4.31, accept Ho.

Village population density is not significantly different than normal (q = 3.74, p > 0.05).
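As a rough check of the arithmetic, the q statistic can be computed in a few lines of Python. This is only a sketch: the helper name `ws_statistic` is invented for the example, and the critical range is simply copied from the text rather than looked up programmatically.

```python
import statistics

# Population densities from the W/S table above.
densities = [4.13, 4.53, 4.69, 4.76, 4.77, 4.96, 5.04, 5.10, 5.36,
             5.53, 5.75, 5.94, 6.06, 6.19, 6.30, 7.21, 7.73]

def ws_statistic(values):
    """Return (q, w, s): studentized range, data range, sample standard deviation."""
    w = max(values) - min(values)   # data range
    s = statistics.stdev(values)    # sample standard deviation (n - 1 denominator)
    return w / s, w, s

Q_LOWER, Q_UPPER = 3.06, 4.31  # critical range quoted in the text
q, w, s = ws_statistic(densities)
print(f"w = {w:.2f}, s = {s:.3f}, q = {q:.2f}")
print("accept Ho" if Q_LOWER < q < Q_UPPER else "reject Ho")
```

Ho is accepted only when q lies strictly inside the critical range, which mirrors the two-sided comparison used in the hand calculation.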


D’Agostino’s D Normality Test

Equations taken from Zar, 1984

$D = \frac{T}{\sqrt{n^3\,SS}}$ where $T = \sum \left(i - \frac{n+1}{2}\right) X_i$

D’Agostino’s D test is a powerful test for departures from normality. Calculating D is relatively simple, but can become cumbersome without the use of a spreadsheet. First the data must be ordered from smallest to largest, so that rank i = 1 is the smallest observation. The squared deviation from the mean is then calculated for each observation and the sum of squared deviates (SS) determined. Then (n + 1)/2 is subtracted from the order or “rank” (i) of each observation, where n is the sample size, and the result is multiplied by the observation value (Xi). T is the sum of these products. Since the range of D is fairly small, it is best to carry out the calculations to at least 5 decimal places. The test statistic D is compared to the critical D values from the table, which form a range. If the calculated D value falls within this range Ho is accepted (the data are normal); Ho is rejected when the D value falls outside the critical range (the data are not normal).

                 Population       Squared Mean
Village          Density      i   Deviates
Aranza           4.13         1   1.46410
Corupo           4.53         2   0.65610
San Lorenzo      4.69         3   0.42250
Cheranatzicurin  4.76         4   0.33640
Nahuatzen        4.77         5   0.32490
Pomacuaran       4.96         6   0.14440
Sevina           4.97         7   0.13690
Arantepacua      5.00         8   0.11560
Cocucho          5.04         9   0.09000
Charapan         5.10        10   0.05760
Comachuen        5.25        11   0.00810
Pichataro        5.36        12   0.00040
Quinceo          5.94        13   0.36000
Nurio            6.06        14   0.51840
Turicuaro        6.19        15   0.72250
Urapicho         6.30        16   0.92160
Capacuaro        7.73        17   5.71210

Mean = 5.34, SS = 11.9916

Ho: Village population density is not significantly different than normal.
Ha: Village population density is significantly different than normal.
α = 0.05
n = 17

$\frac{n+1}{2} = \frac{17+1}{2} = 9$

$T = \sum (i - 9) X_i$

T = (1 − 9)4.13 + (2 − 9)4.53 + … + (17 − 9)7.73

T = 63.23

$D = \frac{63.23}{\sqrt{(17^3)(11.9916)}} = 0.26050$

DCritical = 0.2587, 0.2860

Since 0.2587 < D = 0.26050 < 0.2860, accept Ho.

Village population density is not significantly different than normal (D = 0.26050, p > 0.20).
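Since the text notes that this calculation is tedious without a spreadsheet, the same steps can be sketched in Python. The function name `dagostino_d` is hypothetical, and the critical values are copied from the text; the sketch only reproduces the arithmetic for T, SS, and D.

```python
import math

# Population densities from the D'Agostino table above.
densities = [4.13, 4.53, 4.69, 4.76, 4.77, 4.96, 4.97, 5.00, 5.04,
             5.10, 5.25, 5.36, 5.94, 6.06, 6.19, 6.30, 7.73]

def dagostino_d(values):
    """Return (D, T, SS) for D'Agostino's D, ranking from smallest to largest."""
    xs = sorted(values)
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)        # sum of squared deviates
    centre = (n + 1) / 2                         # (n + 1)/2, here 9
    t = sum((i - centre) * x for i, x in enumerate(xs, start=1))
    d = t / math.sqrt(n ** 3 * ss)
    return d, t, ss

D_LOWER, D_UPPER = 0.2587, 0.2860  # critical range quoted in the text
d, t, ss = dagostino_d(densities)
print(f"T = {t:.2f}, SS = {ss:.4f}, D = {d:.5f}")
print("accept Ho" if D_LOWER < d < D_UPPER else "reject Ho")
```

Sorting inside the function keeps the rank i tied to ascending order, which is what makes T positive for these data.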