TMA4255 Applied Statistics V2014 (23)
Signed-Rank test [16.2]
Wilcoxon Rank-sum test [16.3]
Anna Marie Holand
April 2, 2014, wiki.math.ntnu.no/tma4255/2014v/start
www.ntnu.no [email protected], TMA4255V2014
Outline of part 7
— Approximation of E and Var
  • First order Taylor approximation [p 133-135]
— Nonparametric tests:
  • One sample or two paired samples:
    — The sign test [16.1], for continuous distributions.
    — The (Wilcoxon) signed-rank test [16.2], for continuous symmetric distributions.
  • Two independent samples:
    — The Wilcoxon rank-sum test (Mann-Whitney) [16.3], for two continuous distributions of the same shape.
Shoshoni and golden ratio conjugate
— Data set of the height/length ratio for n = 8 rectangles found on leather items of the Shoshoni Indians:

0.693 0.662 0.690 0.606 0.570 0.749 0.672 0.628

— The golden ratio, $(1+\sqrt{5})/2 = 1.618$, is the longer segment divided by the shorter, while the reverse is called the golden ratio conjugate, 0.618.
— Do the ratios from the Shoshoni rectangles correspond with the golden ratio (conjugate)?
  H0: median of the rectangle ratios = 0.618 vs. H1: not so.
— The sign test gave a p-value of 0.29.
— But the sign test only used the sign of each observation relative to the hypothesized median. Can we do better?
The Sign Test [16.1]
— Use with one sample or two paired samples.
— Test for the median, or the mean in a symmetric distribution.
— Binomial test based on the number of positive (or negative) differences between the observations and the hypothesized median. Under H0 this number is Binomial(n, p = 0.5).
— The normal approximation to the binomial is used for large n. General rule of thumb: np ≥ 5 and n(1 − p) ≥ 5; here p = 0.5, so n ≥ 10.
— Values equal to the hypothesized median are deleted from the data set.
— Only the sign of each observation (relative to the hypothesized median), and not its magnitude (actual value), is used.
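
As an illustration, the exact binomial version of the sign test can be sketched in a few lines of Python. The function name `sign_test` and the two-sided convention (doubling the smaller tail, capped at 1) are assumptions made for this sketch. The data are the eight Shoshoni ratios with hypothesized median 0.618.

```python
from math import comb

def sign_test(data, median0):
    """Exact two-sided sign test of H0: median == median0 (sketch)."""
    # Values equal to the hypothesized median are deleted.
    diffs = [y - median0 for y in data if y != median0]
    n = len(diffs)
    n_pos = sum(d > 0 for d in diffs)
    k = min(n_pos, n - n_pos)
    # P(X <= k) for X ~ Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Two-sided p-value: double the smaller tail (assumed convention).
    p_value = min(1.0, 2 * tail)
    return n_pos, n, p_value

ratios = [0.693, 0.662, 0.690, 0.606, 0.570, 0.749, 0.672, 0.628]
n_pos, n, p_value = sign_test(ratios, 0.618)
print(n_pos, n, round(p_value, 2))  # 6 8 0.29
```

Six of the eight differences are positive, and the p-value 2 · P(X ≤ 2) = 74/256 rounds to the 0.29 quoted for the Shoshoni data.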
The Signed-Rank Test
Source: Statistics review 6: Nonparametric methods Elise Whitley and Jonathan Ball.
Shoshoni and golden ratio conjugate
yi      yi − 0.618   |yi − 0.618|   rank
0.628    0.010        0.010          1
0.606   -0.012        0.012          2
0.662    0.044        0.044          3
0.570   -0.048        0.048          4
0.672    0.054        0.054          5
0.690    0.072        0.072          6
0.693    0.075        0.075          7
0.749    0.131        0.131          8
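
The ranking in the table can be reproduced with a short Python sketch. The helper names `average_ranks` and `signed_rank` are hypothetical. Tied values are detected by exact equality, which is adequate for this data set (no ties) but can be fragile for floating-point data in general.

```python
def average_ranks(values):
    """Ranks 1..n of values; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def signed_rank(data, median0):
    """Return (W+, W-): rank sums of positive and negative differences."""
    diffs = [y - median0 for y in data if y != median0]  # zeros removed
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus

ratios = [0.693, 0.662, 0.690, 0.606, 0.570, 0.749, 0.672, 0.628]
w_plus, w_minus = signed_rank(ratios, 0.618)
w = min(w_plus, w_minus)
print(w_plus, w_minus, w)  # 30.0 6.0 6.0
```

The two negative differences carry ranks 2 and 4, so W− = 6, and W+ = 36 − 6 = 30.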
Critical values: W+, W− and W = min(W+, W−)

— We have one sample of n observations Yi (or the differences between paired samples).
— The null hypothesis tested is H0: µ̃ = µ̃0.
— We form the differences Yi − µ̃0 and rank their absolute values.
— W+ is the sum of the ranks of the positive differences.
— W− is the sum of the ranks of the negative differences.

Which statistic (W+, W− or W) to compare with the critical values in Table A16 is decided by the alternative hypothesis:

H1: µ̃ < µ̃0: Reject H0 when W+ ≤ critical value (one-sided)
H1: µ̃ > µ̃0: Reject H0 when W− ≤ critical value (one-sided)
H1: µ̃ ≠ µ̃0: Reject H0 when W ≤ critical value (two-sided)
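
Instead of a table lookup, the tail probability behind such critical values can be computed directly for small n. Under H0 each rank 1, ..., n is equally likely to belong to a positive or a negative difference, so W+ is distributed as the sum of a random subset of the ranks. The sketch below counts subsets by dynamic programming. The function name `signed_rank_tail` is hypothetical, and the code assumes no ties and no zero differences. It is applied to the Shoshoni data, where n = 8 and the negative differences have ranks 2 and 4, so W = 6.

```python
def signed_rank_tail(n, w):
    """P(W+ <= w) under H0 for n nonzero, untied differences (sketch)."""
    max_sum = n * (n + 1) // 2
    # counts[s] = number of subsets of {1, ..., n} whose ranks sum to s
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        # Go downwards so each rank is used at most once per subset.
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    return sum(counts[: int(w) + 1]) / 2 ** n

p_one_sided = signed_rank_tail(8, 6)
p_two_sided = min(1.0, 2 * p_one_sided)
print(p_one_sided, p_two_sided)  # 0.0546875 0.109375
```

With a two-sided p-value of about 0.11, H0 would not be rejected at the 5% level. This is still a smaller p-value than the 0.29 from the sign test on the same data.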
The Signed-Rank Test: questions
— Q: What about zeros?
  • Remove, as for the sign test.
— Q: What about ties?
  • If two observations have the same absolute value, and these two values should have been assigned ranks 3 and 4 (say), then both observations are assigned rank 3.5.
— Q: What if n is large (n ≥ 15)?
  • Instead of the tables, use the normal approximation to calculate critical values and tail probabilities:

$$Z = \frac{W - E(W)}{\sqrt{\mathrm{Var}(W)}}$$

where E(W) = n(n + 1)/4 and Var(W) = n(n + 1)(2n + 1)/24.
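
A minimal sketch of the normal approximation, using only the Python standard library. The function names and the values n = 20, W = 50 are hypothetical and chosen only to illustrate the arithmetic. No continuity correction or tie correction is applied.

```python
from math import erf, sqrt

def signed_rank_z(w, n):
    """Standardize W using E(W) = n(n+1)/4 and Var(W) = n(n+1)(2n+1)/24."""
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24
    return (w - mean) / sqrt(var)

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n, w = 20, 50                # hypothetical sample size and statistic
z = signed_rank_z(w, n)      # E(W) = 105, Var(W) = 717.5, z is about -2.053
p_lower = normal_cdf(z)      # approximate P(W <= 50) under H0
print(round(z, 3), round(p_lower, 3))
```

For a two-sided alternative with W = min(W+, W−), the lower-tail probability would be doubled.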
The Rank-Sum Test
Source: Statistics review 6: Nonparametric methods Elise Whitley and Jonathan Ball.
Critical values: U1, U2 and U = min(U1, U2)

— Sample 1 has n1 observations, rank sum W1 and adjusted rank sum $U_1 = W_1 - \frac{n_1(n_1+1)}{2}$.
— Sample 2 has n2 observations, rank sum W2 and adjusted rank sum $U_2 = W_2 - \frac{n_2(n_2+1)}{2}$.
— Here n1 ≤ n2.

The null hypothesis about the medians µ̃ is H0: µ̃1 = µ̃2.
Which statistic (U1, U2 or U) to compare with the critical values in Table A17 is decided by the alternative hypothesis:

H1: µ̃1 < µ̃2: Reject H0 when U1 ≤ critical value (one-sided)
H1: µ̃1 > µ̃2: Reject H0 when U2 ≤ critical value (one-sided)
H1: µ̃1 ≠ µ̃2: Reject H0 when U ≤ critical value (two-sided)
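
A useful consistency check on the two adjusted rank sums follows from the definitions, assuming the ranks 1, ..., n1 + n2 are assigned across the pooled sample (with average ranks for ties, which leaves the total unchanged):

```latex
\begin{align*}
W_1 + W_2 &= \frac{(n_1+n_2)(n_1+n_2+1)}{2}, \\
U_1 + U_2 &= W_1 + W_2 - \frac{n_1(n_1+1)}{2} - \frac{n_2(n_2+1)}{2} \\
          &= \frac{(n_1+n_2)(n_1+n_2+1) - n_1(n_1+1) - n_2(n_2+1)}{2}
           = n_1 n_2 .
\end{align*}
```

So only one of U1 and U2 has to be computed from the ranks; the other is n1 n2 minus it, and both lie between 0 and n1 n2.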
Efficiency of the Wilcoxon Rank-Sum test

— When the data are normal with equal variances, the rank-sum test is 95% as efficient as the pooled t-test for large samples.
— 95% efficient = the t-test needs 95% of the sample size of the rank-sum test to achieve the same power.
— The rank-sum test will always be at least 86% as efficient as the pooled t-test, and may be more efficient if the underlying distributions are very non-normal, especially with heavy tails.
— Power calculations for the rank-sum test are in general difficult, since we need to specify the shapes of the two distributions.
Taken from Devore.
Balance
— Is it harder to maintain your balance while you are concentrating?
— Nine elderly and eight young people stood barefoot on a "force platform" and were asked to maintain a stable upright position and to react as quickly as possible to an unpredictable noise by pressing a hand-held button.
— The noise came randomly and the subject concentrated on reacting as quickly as possible. The platform automatically measured how much each subject swayed, in millimeters, in both the forward/backward and the side-to-side directions.

http://lib.stat.cmu.edu/DASL/Stories/MaintainingBalance.html

Rank   Sway   Group
 1.5    14    young
 1.5    14    young
 3      15    young
 4.5    17    young
 4.5    17    young
 6.5    19    elderly
 6.5    19    elderly
 8      20    elderly
 9.5    21    elderly
 9.5    21    young
11      22    young
12      24    elderly
13.5    25    elderly
13.5    25    young
15      29    elderly
16      30    elderly
17      50    elderly
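
The rank sums and adjusted rank sums for the balance data can be computed with the following Python sketch. The helper names are hypothetical. The young group is taken as sample 1 because it is the smaller one (n1 = 8 ≤ n2 = 9), and tied sway values share the average rank, as in the table.

```python
def average_ranks(values):
    """Ranks 1..n of values; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_statistics(sample1, sample2):
    """Return (W1, W2, U1, U2) from ranks in the pooled sample."""
    pooled = list(sample1) + list(sample2)
    ranks = average_ranks(pooled)
    n1, n2 = len(sample1), len(sample2)
    w1 = sum(ranks[:n1])
    w2 = sum(ranks[n1:])
    u1 = w1 - n1 * (n1 + 1) / 2
    u2 = w2 - n2 * (n2 + 1) / 2
    return w1, w2, u1, u2

# Sway values (mm) read off the table above.
young = [14, 14, 15, 17, 17, 21, 22, 25]
elderly = [19, 19, 20, 21, 24, 25, 29, 30, 50]

w1, w2, u1, u2 = rank_sum_statistics(young, elderly)
print(w1, w2, u1, u2, min(u1, u2))  # 49.0 104.0 13.0 59.0 13.0
```

Here U1 + U2 = 72 = n1 · n2, and U = min(U1, U2) = 13 is the value to compare with the tabulated critical value for a two-sided test.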
Advantages of nonparametric tests
— Nonparametric methods require no or very limited assumptions to be made about the format of the data, and they may therefore be preferable when the assumptions required for parametric methods are not valid.
— Nonparametric methods can be useful for dealing with unexpected, outlying observations that might be problematic with a parametric approach.
— Nonparametric methods are intuitive and are simple to carry out by hand, for small samples at least.
— Nonparametric methods are often useful in the analysis of ordered categorical data in which assignation of scores to individual categories may be inappropriate.
Source: Statistics review 6: Nonparametric methods Elise Whitley and Jonathan Ball.
Disadvantages of nonparametric tests
— Nonparametric methods may lack power as compared with more traditional approaches. This is a particular concern if the sample size is small or if the assumptions for the corresponding parametric method (e.g. Normality of the data) hold.
— Nonparametric methods are geared toward hypothesis testing rather than estimation of effects. It is often possible to obtain nonparametric estimates and associated confidence intervals, but this is not generally straightforward.
— Tied values can be problematic when these are common, and adjustments to the test statistic may be necessary.
Source: Statistics review 6: Nonparametric methods Elise Whitley and Jonathan Ball.