PSY 1950 Post-hoc and Planned Comparisons October 6, 2008




Preamble
• Presentations
• Tutoring
• Problem 1e:
  – If you decide to reject the null hypothesis, you know the probability that you are making the wrong decision
• Visual depiction of F-ratio


Subpopulations
• Cournot (1843): “...it is clear that nothing limits... the number of features according to which one can distribute [natural events or social facts] into several groups or distinct categories.”
• e.g., the chance of a male birth:
  – Legitimate vs. illegitimate
  – Birth order
  – Parent age
  – Parent profession
  – Parent health
  – Parent religion
• “… usually these attempts through which the experimenter passed don’t leave any traces; the public will only know the result that has been found worth pointing out; and as a consequence, someone unfamiliar with the attempts which have led to this result completely lacks a clear rule for deciding whether the result can or cannot be attributed to chance.”


Large Surveys and Observational Studies

• Abundant data
• Limited a priori hypotheses
• e.g., Genome Superstruct Project (GSP)
  – Genetic testing
  – Cognitive testing
  – Structural brain imaging
  – Functional brain imaging


ANOVA
• One-way ANOVA
  – k(k-1)/2 possible pairwise comparisons
  – e.g., with 5 levels, 10 possible comparisons
• Factorial ANOVA
  – The issue above, plus
  – Multiple possible main effects/interactions
  – e.g., with a 2 x 2 x 2, 7 possible effects
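
The counts on this slide can be checked with a short sketch. The function names are illustrative, not from the lecture, and the effect count assumes every main effect and interaction of a full factorial design is tested:

```python
def pairwise_comparisons(k: int) -> int:
    """Number of distinct pairs among k group means: k(k-1)/2."""
    return k * (k - 1) // 2


def factorial_effects(n_factors: int) -> int:
    """Main effects plus interactions in a full factorial design.

    Every non-empty subset of the factors is one testable effect,
    so the count is 2**n_factors - 1.
    """
    return 2 ** n_factors - 1


print(pairwise_comparisons(5))  # 10 pairwise comparisons for 5 levels
print(factorial_effects(3))     # 7 effects for a 2 x 2 x 2 design
```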


Families
• Set of hypotheses = family
• Type I error rate for a set of hypotheses = familywise error rate
  – e.g., across pairwise comparisons in one-way ANOVA
    • If no mean differences exist, what is the chance of finding a significant one?
  – e.g., across main effects/interactions in factorial ANOVA
    • If no main effects or interactions exist for a particular ANOVA, what is the chance of finding a significant one?
  – e.g., whole experiment with multiple ANOVAs
    • If no effects exist for the entire experiment, what is the chance of finding a significant one?


Family Size
• “If these inferences are unrelated in terms of their content or intended use (although they may be statistically dependent), then they should be treated separately and not jointly”
  – Hochberg and Tamhane (1987)
• e.g., suicide rates for 50 states, with 1225 possible pairwise comparisons
  – From a federal perspective, how big is the family?
  – How about from a state perspective?


Familywise α
• If a family consists of two independent comparisons with α = .05, AND if both corresponding null hypotheses are true:
  – The probability of NOT making a Type I error on either test is: .95 x .95 = .9025
  – The probability of making one or more Type I errors is: 1 - .9025 = .0975
• If a family consists of c independent comparisons with α = .05, AND if all corresponding null hypotheses are true:
  – The probability of NOT making a Type I error on any test is: $(1 - .05)^c$
  – The probability of making one or more Type I errors is: $1 - (1 - .05)^c$
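
The formula above is easy to tabulate. This is a minimal sketch that assumes, as the slide does, that the c tests are independent and every null hypothesis is true; `familywise_error` is an illustrative name:

```python
def familywise_error(alpha: float, c: int) -> float:
    """P(at least one Type I error) across c independent tests,
    each run at per-comparison level alpha, when all nulls are true."""
    return 1 - (1 - alpha) ** c


# How quickly the familywise rate grows with the number of comparisons.
for c in (1, 2, 5, 10, 20):
    print(c, round(familywise_error(0.05, c), 4))
```

With c = 2 this gives the .0975 worked out on the slide.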


A Priori vs. Post-hoc Comparisons

• A priori comparisons
  – Chosen before data collection
  – Limited, deliberate comparisons
• Post hoc (a posteriori) comparisons
  – Conducted after data collection
  – Exhaustive, exploratory comparisons


Significance of Overall F
• Prerequisite for some tests (e.g., Fisher’s LSD)
• Efficient test of overall null hypothesis
• Need $MS_{\text{within}}$ for many tests


A Priori Comparisons
• Single-stage tests
  – Multiple t-tests
  – Linear contrasts
  – Bonferroni t (Dunn’s test)
  – Dunn-Sidak test
• Multistage tests
  – Bonferroni/Holm


Multiple t-tests
• Replace $s^2_{\text{pooled}}$ with $MS_{\text{within}}$
• Use $df_{\text{within}}$
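
A rough sketch of what this substitution looks like in practice. It assumes the usual one-way ANOVA definitions, $MS_{\text{within}} = SS_{\text{within}} / (N - k)$ and $t = (M_i - M_j) / \sqrt{MS_{\text{within}}(1/n_i + 1/n_j)}$; the function names and the toy data are made up for illustration:

```python
from math import sqrt
from statistics import mean


def ms_within(groups):
    """Return (MS_within, df_within) pooled over ALL groups."""
    ss = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df = sum(len(g) for g in groups) - len(groups)  # N - k
    return ss / df, df


def pairwise_t(groups, i, j):
    """t for groups i vs j, using MS_within in place of the
    two-group pooled variance, with df_within degrees of freedom."""
    msw, df = ms_within(groups)
    gi, gj = groups[i], groups[j]
    t = (mean(gi) - mean(gj)) / sqrt(msw * (1 / len(gi) + 1 / len(gj)))
    return t, df


# Hypothetical data: three groups of three scores each.
data = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
t, df = pairwise_t(data, 0, 2)
print(round(t, 3), df)  # -4.899 6
```

The error term and degrees of freedom come from all three groups even though only two means are compared, which is the point of the substitution.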


Linear Contrasts
• Compare more than one mean with another mean
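
The slide's formulas did not survive extraction, so the sketch below falls back on the standard textbook form, which is an assumption here: a contrast $\psi = \sum_j a_j M_j$ with weights summing to zero, tested with $t = \psi / \sqrt{MS_{\text{within}} \sum_j a_j^2 / n_j}$ on $df_{\text{within}}$. Names and data are illustrative:

```python
from math import sqrt
from statistics import mean


def linear_contrast(groups, weights):
    """Return (psi, t, df) for a linear contrast on group means."""
    if abs(sum(weights)) > 1e-9:
        raise ValueError("contrast weights must sum to zero")
    # Pooled error term from all groups, as in the one-way ANOVA.
    ss = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df = sum(len(g) for g in groups) - len(groups)
    msw = ss / df
    psi = sum(a * mean(g) for a, g in zip(weights, groups))
    se = sqrt(msw * sum(a ** 2 / len(g) for a, g in zip(weights, groups)))
    return psi, psi / se, df


# Hypothetical data: average of groups 1 and 2 versus group 3.
data = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
psi, t, df = linear_contrast(data, [0.5, 0.5, -1.0])
print(round(psi, 2), round(t, 3), df)  # -3.5 -4.95 6
```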


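Bonferroni t (Dunn’s Test)
• If c independent tests are performed:
  – $\alpha_{\text{corrected}} = \alpha / c$
  – $p_{\text{corrected}} = p \times c$
• Imprecise math
  – e.g., for p = .05 with c = 21, $p_{\text{corrected}} = 1.05$, which exceeds 1
  – Exact value under independence: $p_{\text{corrected}} = 1 - (1 - .05)^c$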

Bonferroni, C. E. (1936). Teoria statistica delle classi e calcolo delle probabilità. Pubblicazioni del R Istituto Superiore di Scienze Economiche e Commerciali di Firenze, 8, 3-62.

Perneger, T. V. (1998). What's wrong with Bonferroni adjustments. BMJ, 316, 1236-1238.


Dunn-Sidak Test
• Identical to Bonferroni, except uses correct math
• Less conservative than Bonferroni
  – e.g., for p = .05 with c = 10:
    • $p_{\text{Bonferroni}} = .50$
    • $p_{\text{Sidak}} = .40$
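
The two corrections can be compared side by side. This is a minimal sketch with illustrative function names. Capping the Bonferroni value at 1 is a common convention rather than something stated on the slides:

```python
def bonferroni_p(p: float, c: int) -> float:
    """Bonferroni-adjusted p: p * c, capped at 1 (assumed convention)."""
    return min(1.0, p * c)


def sidak_p(p: float, c: int) -> float:
    """Sidak-adjusted p: 1 - (1 - p)**c, exact for independent tests."""
    return 1 - (1 - p) ** c


for c in (10, 21):
    print(c, round(bonferroni_p(0.05, c), 3), round(sidak_p(0.05, c), 3))
```

For c = 10 this reproduces the .50 versus .40 comparison. For c = 21 the uncapped Bonferroni product would be 1.05, while the Sidak value stays below 1.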


Multistage Bonferroni (e.g., Holm)

• Calculate t for all c contrasts of interest

• Order results based on |t|: |t1| > |t2| > |t3|

• Apply different Bonferroni corrections for α or p based on position in the above sequence, stopping when t is nonsignificant
  – For t1, c1 = 3; if p1 > .05/3, then…
  – For t2, c2 = 2; if p2 > .05/2, then…
  – For t3, c3 = 1; use α = .05/1
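
A sketch of the step-down logic, assuming the standard Holm procedure. It sorts by p-value rather than |t|, which gives the same order when all contrasts share the same degrees of freedom. The function name and example p-values are hypothetical:

```python
def holm(p_values, alpha=0.05):
    """Holm step-down test; returns reject decisions in input order.

    The smallest p is tested at alpha/c, the next at alpha/(c-1),
    and so on; testing stops at the first nonsignificant result.
    """
    c = len(p_values)
    order = sorted(range(c), key=lambda i: p_values[i])
    reject = [False] * c
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (c - rank):
            reject[idx] = True
        else:
            break  # every remaining (larger) p is retained as well
    return reject


print(holm([0.020, 0.010, 0.040]))  # [True, True, True]
print(holm([0.030, 0.010, 0.040]))  # [False, True, False]
```

In the first example a single-stage Bonferroni correction (every test at .05/3) would reject only the .010 result, which shows why the multistage version is more powerful.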


Post-hoc Comparisons
• Fisher’s LSD
• Tukey’s test
• Newman-Keuls test
• The Ryan procedure (REGWQ)
• Scheffé’s test
• Dunnett’s test


Fisher’s LSD Test
• LSD = least significant difference
• Two-stage process:
  – Conduct ANOVA
    • If F is nonsignificant, stop
    • If F is significant…
  – Make pairwise comparisons using multiple t-tests based on $MS_{\text{within}}$
• Ensures familywise α = .05 for complete null
• Ensures familywise α = .05 for partial null when c = 3


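Studentized Range Statistic (q)

• If $M_l$ and $M_s$ represent the largest and smallest means and r is the number of means in the set:
  $q_r = \dfrac{M_l - M_s}{\sqrt{MS_{\text{within}} / n}}$
  where n is the number of observations per group (equal group sizes assumed)

• Order means from smallest to largest

• Determine r, calculate q, look up p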
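
A minimal sketch of the calculation step, assuming the standard definition $q = (M_l - M_s) / \sqrt{MS_{\text{within}} / n}$ with equal group sizes. Looking up p requires a studentized range table or a statistics library and is left out. The function name and numbers are illustrative:

```python
from math import sqrt


def studentized_range_q(m_largest, m_smallest, ms_within, n):
    """q for the largest vs smallest mean in a set (equal n per group)."""
    return (m_largest - m_smallest) / sqrt(ms_within / n)


# Hypothetical values: means of 6 and 2, MS_within = 1, n = 3 per group.
q = studentized_range_q(6.0, 2.0, ms_within=1.0, n=3)
print(round(q, 3))  # 6.928
```

For two groups of equal size, q equals the corresponding t multiplied by the square root of 2.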


Tukey’s HSD Test
• Determines minimum difference between treatment means that is necessary for significance
• HSD = honestly significant difference
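
The minimum difference can be sketched as follows, assuming the usual form $HSD = q_{\text{crit}} \sqrt{MS_{\text{within}} / n}$ with equal group sizes. The critical value must come from a studentized range table for the chosen α, number of means, and $df_{\text{within}}$; the 4.0 below is a placeholder, not a tabled value:

```python
from math import sqrt


def tukey_hsd_threshold(q_crit, ms_within, n):
    """Smallest mean difference that reaches significance under Tukey's HSD."""
    return q_crit * sqrt(ms_within / n)


# Placeholder critical value for illustration only; look up the real one.
hsd = tukey_hsd_threshold(q_crit=4.0, ms_within=1.0, n=3)
means = {"A": 2.0, "B": 3.0, "C": 6.0}  # hypothetical group means

for a, b in (("A", "B"), ("A", "C"), ("B", "C")):
    diff = abs(means[a] - means[b])
    print(a, b, diff, diff > hsd)
```

Every pair is compared against the same threshold, which is what keeps the familywise rate controlled across all pairwise comparisons.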


Scheffé
• Not for post-hoc pairwise comparisons
• Not for a priori comparisons
• Howell: “I can’t imagine when I would ever use it, but I have to include it here because it is such a standard test”


Newman-Keuls (S-N-K) Test
• Readjusts r based on which means are being tested
• Doesn’t control familywise α at .05


Comparing Different Procedures


Which Test?
• One contrast
  – Simple: t-test
  – Complex: linear contrast
• Several contrasts
  – A priori: Multistage Bonferroni (e.g., Holm)
  – Post-hoc: Fisher’s LSD
• Many contrasts
  – Ryan REGWQ or Tukey
• Find critical values for different tests
  – With a control: Dunnett
  – Planned: Bonferroni
  – Not planned: Scheffé


Imaging Data
• 200,000 tests on 200,000 voxels
• 10,000 false positives expected when α = .05 (200,000 x .05)
• Bonferroni?
  – No: it is overly conservative here, because neighboring voxels are not independent
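
The scale of the problem can be worked out directly. This sketch assumes every voxel is null and is tested at the same α; the function names are illustrative:

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of Type I errors when every null is true."""
    return n_tests * alpha


def bonferroni_alpha(n_tests: int, alpha: float) -> float:
    """Per-test threshold a Bonferroni correction would impose."""
    return alpha / n_tests


n_voxels = 200_000
print(round(expected_false_positives(n_voxels, 0.05)))  # 10000
print(bonferroni_alpha(n_voxels, 0.05))  # about 2.5e-07 per voxel
```

A per-voxel threshold this small is far stricter than needed when neighboring voxels are correlated, since the effective number of independent tests is much smaller than the voxel count.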