
On-line resources


Page 1: On-line resources

On-line resources

• http://wise.cgu.edu/powermod/index.asp
• http://wise.cgu.edu/regression_applet.asp
• http://wise.cgu.edu/hypomod/appinstruct.asp
• http://psych.hanover.edu/JavaTest/NeuroAnim/stats/StatDec.html
• http://psych.hanover.edu/JavaTest/NeuroAnim/stats/t.html
• http://psych.hanover.edu/JavaTest/NeuroAnim/stats/CLT.html
• Note demo page

Page 2: On-line resources

Effect sizes

• For a small effect size, .01, the change in success rate is from 46% to 54%.
• For a medium effect size, .06, the change in success rate is from 38% to 62%.
• For a large effect size, .16, the change in success rate is from 30% to 70%.

Effect size | R-squared | r | Cohen’s D
Large | .15 | .39 | .80
Medium | .06 | .24 | .50
Small | .01 | .10 | .20
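The success rates on this slide resemble a binomial effect size display (BESD), in which a correlation r is read as a change in success rate from .50 - r/2 to .50 + r/2. That reading is an assumption, since the slide does not name its method. A minimal sketch under that assumption (the function name `besd` is illustrative):

```python
def besd(r):
    """Return the (lower, upper) success rates implied by correlation r.

    Assumes the binomial effect size display: rates are .50 -/+ r/2.
    """
    return 0.5 - r / 2, 0.5 + r / 2


# r values taken from the benchmark table on this slide.
for label, r in [("small", 0.10), ("medium", 0.24), ("large", 0.39)]:
    low, high = besd(r)
    print(f"{label}: r = {r:.2f} -> {low:.1%} vs. {high:.1%}")
```

Under this assumption the medium and large rows come out close to the slide's figures, while r = .10 gives 45% vs. 55%, slightly wider than the 46% to 54% quoted above, which may reflect rounding or a slightly smaller r.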

Page 3: On-line resources

But what does .10 really mean?

Predictor | Outcome | R2 | r
Vietnam veteran status | Alcohol abuse | .00 | .03
Testosterone | Juvenile delinquency | .01 | .10
AZT | Death | .05 | .33
Psychotherapy | Improvement | .10 | .32

Page 4: On-line resources

Is psychotherapy effective? (after Shapiro & Shapiro, 1983)

Therapy target | Number of studies | Cohen’s D | r | R2
Anxiety & depression | 30 | .67 | .31 | 9.6%
Phobias | 76 | .88 | .54 | 29%
Physical and habit problems | 106 | .85 | .52 | 27%
Social and sexual problems | 76 | .75 | .43 | 18%
Performance anxieties | 126 | .71 | .37 | 14%

Page 5: On-line resources

Calculating Cohen’s D

Effect size = difference between predicted mean and mean of known population, divided by population standard deviation (assumes that you know population and sample size)

(imagine one population receives treatment, the other does not)

$d = (\mu_1 - \mu_2) / \sigma$

$\mu_1$ = mean of population 1 (hypothesized mean for the population that is subjected to the experimental manipulation)
$\mu_2$ = mean of population 2 (which is also the mean of the comparison distribution)
$\sigma$ = standard deviation of population 2 (assumed to be the standard deviation of both populations)
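As a worked illustration of the formula, here is a short sketch. The function name and the numbers (a predicted treated mean of 208, a known population mean of 200, and a standard deviation of 16) are hypothetical and not taken from the slides.

```python
def cohens_d(mean_treated, mean_comparison, sd_comparison):
    """Cohen's d: difference between the two population means divided by
    the comparison population's standard deviation."""
    if sd_comparison <= 0:
        raise ValueError("standard deviation must be positive")
    return (mean_treated - mean_comparison) / sd_comparison


# Hypothetical example values, chosen only to illustrate the arithmetic.
d = cohens_d(mean_treated=208, mean_comparison=200, sd_comparison=16)
print(f"d = {d:.2f}")
```

With these made-up numbers the difference of 8 points is half a standard deviation, so d = .50, a medium effect by the benchmarks given earlier.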

Page 6: On-line resources

One other way to think about D

• D = .20, overlap 85%: height distributions of 15 vs. 16 year old girls

• D = .50, overlap 67%: height distributions of 14 vs. 18 year old girls

• D = .80, overlap 53%: height distributions of 13 vs. 18 year old girls
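The overlap percentages above are consistent with one minus Cohen's U1 (the proportion of non-overlap between two normal distributions with equal standard deviations). That the slide used this particular measure is an assumption; the sketch below simply shows that it reproduces the three figures.

```python
from statistics import NormalDist


def overlap(d):
    """Overlap of two equal-variance normal distributions separated by d
    standard deviations, computed as 1 - Cohen's U1."""
    phi = NormalDist().cdf(abs(d) / 2)
    u1 = (2 * phi - 1) / phi  # Cohen's U1: proportion of non-overlap
    return 1 - u1


for d in (0.20, 0.50, 0.80):
    print(f"D = {d:.2f}: overlap = {overlap(d):.0%}")
```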

Page 7: On-line resources

Effect sizes are interchangeable
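One way to see the interchangeability is to convert between the measures directly. The sketch below uses the common conversion between Cohen's d and r for two groups of equal size; that equal-size assumption, and the function names, are mine rather than the slides'.

```python
import math


def d_to_r(d):
    """Convert Cohen's d to a correlation r (assumes two equal-sized groups)."""
    return d / math.sqrt(d * d + 4)


def r_to_d(r):
    """Convert a correlation r back to Cohen's d (same assumption)."""
    return 2 * r / math.sqrt(1 - r * r)


for d in (0.20, 0.50, 0.80):
    r = d_to_r(d)
    print(f"D = {d:.2f} -> r = {r:.2f}, R-squared = {r * r:.2f}")
```

The small and medium rows land on the benchmark table's values (r = .10 and .24); the large row comes out near .37, close to but not exactly the table's .39.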

Page 8: On-line resources

• http://www.amstat.org/publications/jse/v10n3/aberson/power_applet.html

Page 9: On-line resources

Statistical significance vs. effect size

• p < .05
• r = .10
– For N = 100,000, p < .05
– For N = 10, p > .05
– Large sample, closer to population, less chance of sampling error
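The same r = .10 can be checked at both sample sizes. This sketch uses the Fisher z approximation with a normal reference distribution, which is a simplification (an exact test would use the t distribution); the function name is illustrative.

```python
import math
from statistics import NormalDist


def p_value_for_r(r, n):
    """Approximate two-tailed p-value for a correlation r from n pairs,
    using the Fisher z transformation (normal approximation)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


for n in (10, 100_000):
    p = p_value_for_r(0.10, n)
    verdict = "p < .05" if p < 0.05 else "p > .05"
    print(f"r = .10, N = {n}: {verdict}")
```

The effect size is identical in both cases; only the sample size changes the verdict.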

Page 10: On-line resources

Brief digression

• Research hypotheses and statistical hypotheses

• Is psychoanalysis effective?
– Null?
– Alternate?
– Handout

• Why test the null?

Page 11: On-line resources

Statistical significance and decision levels (Z scores, t values and F values)

Sampling distributions for the null hypothesis:

http://statsdirect.com/help/distributions/pf.htm

Page 12: On-line resources

One way to think about it…

Page 13: On-line resources

Two ways to guess wrong

Truth for population | Do not reject null hypothesis | Reject null hypothesis
Null is true | Correct! | Type 1 error
Null is not true | Type 2 error | Correct!

Type 1 error: think something is there and there is nothing
Type 2 error: think nothing is there and there is
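Both kinds of wrong guess can be estimated by simulation. The sketch below repeatedly runs a two-tailed one-sample z-test, first when the null is true (population mean 0) and then when it is false (population mean 0.5). The sample size, effect, number of trials, and seed are all hypothetical choices for illustration.

```python
import math
import random
from statistics import NormalDist, fmean


def rejection_rate(true_mean, n=25, sigma=1.0, alpha=0.05, trials=4000, seed=0):
    """Fraction of simulated studies that reject the null hypothesis
    (mean = 0) with a two-tailed z-test when sigma is known."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = fmean(sample) / (sigma / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


# Null is true: every rejection is a type 1 error.
type1_rate = rejection_rate(true_mean=0.0)
# Null is not true: every failure to reject is a type 2 error.
type2_rate = 1 - rejection_rate(true_mean=0.5)
print(f"type 1 error rate: {type1_rate:.3f}")
print(f"type 2 error rate: {type2_rate:.3f}")
```

The type 1 rate should hover near the chosen alpha of .05, while the type 2 rate depends on the effect size and sample size.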

Page 14: On-line resources

An example

Decision | Null hypothesis is false | Null hypothesis is true
Reject null hypothesis | Merit pay works and we know it. | We decided merit pay worked, but it doesn’t.
Do not reject null hypothesis | We decided merit pay does not work, but it does. | Merit pay does not work and we know it.

Page 15: On-line resources

An example

Imagine the following research looking at the effects, if any, of the drug AZT on HIV-positive patients. In other words, does a group of AIDS patients given AZT live longer than another group given a placebo? If we conduct the experiment correctly (everything is held constant or randomly distributed except for the independent measure) and we do find a difference between the two groups, there are only two reasonable explanations available to us:

From Dave Schultz:

Null hypothesis is false

Null hypothesis is true

Reject null hypothesis

Do not reject null hypothesis

Page 16: On-line resources

Power -> .10 .20 .30 .40 .50 .60 .70 .80 .90

Effect size |

.01 21 53 83 113 144 179 219 271 354

.06 5 10 14 19 24 30 36 44 57

.15 3 5 6 8 10 12 14 17 22

If you think that the effect is small (.01), medium (.06), or large (.15), and you want to find a statistically significant difference defined as p<.05, this table shows you how many participants you need for different levels of “sensitivity” or power.

Statistical power is how “sensitive” a study is to detecting various associations (magnification metaphor)

Page 17: On-line resources

Power -> .10 .20 .30 .40 .50 .60 .70 .80 .90

Effect size |

.01 70 116 156 194 232 274 323 385 478

.06 13 20 26 32 38 45 53 62 77

.15 6 8 11 13 15 18 20 24 29

If you think that the effect is small (.01), medium (.06), or large (.15), and you want to find a statistically significant difference defined as p<.01, this table shows you how many participants you need for different levels of “sensitivity” or power.

Page 18: On-line resources

What determines power?

1. Number of subjects
2. Effect size
3. Alpha level

Power = probability that your experiment will yield a statistically significant result if your research hypothesis is true
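To see how the three determinants move power, here is a minimal sketch for a two-tailed one-sample z-test with a known standard deviation. That test is my choice for illustration; the slides do not say which test their power figures assume, so the numbers will not necessarily match the tables in this deck.

```python
import math
from statistics import NormalDist


def z_test_power(d, n, alpha=0.05):
    """Power of a two-tailed one-sample z-test for a standardized
    effect size d (Cohen's d) with n subjects."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    shift = abs(d) * math.sqrt(n)  # how far the true mean sits from the null, in SE units
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


print(f"baseline (d=.50, n=25, alpha=.05): {z_test_power(0.50, 25):.2f}")
print(f"more subjects (n=50):              {z_test_power(0.50, 50):.2f}")
print(f"larger effect (d=.80):             {z_test_power(0.80, 25):.2f}")
print(f"more lenient alpha (.10):          {z_test_power(0.50, 25, alpha=0.10):.2f}")
```

Each change from the baseline raises power, which is the pattern the list above describes.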

Page 19: On-line resources

How to increase power?

1. Increase region of rejection to p<.10
2. Increase sample size
3. Increase treatment effects
4. Decrease within group variability

Page 20: On-line resources

Study feature | Practical way of raising power | Disadvantages
Predicted difference | Increase intensity of experimental procedures | May not be practical, or may distort study’s meaning
Standard deviation | Use a less diverse population | May not be available, decreases generalizability
Standard deviation | Use standardized, controlled circumstances of testing or more precise measurement | Not always practical
Sample size | Use a larger sample size | Not practical, can be costly
Significance level | Use a more lenient level of significance | Raises alpha, the probability of type 1 error
One tailed vs. two tailed test | Use a one-tailed test | May not be appropriate to logic of study

Page 21: On-line resources

What is adequate power?
.50 (most current research)
.80 (recommended)

How do you know how much power you have? Guess work

Two ways to use power:
1. Post hoc to establish what you could find
2. Determine how many participants you need
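The second use, working out how many participants are needed, can be sketched for the simplest case: a two-tailed one-sample z-test with a known standard deviation. This test and the function name are assumptions made for illustration, so the results are not expected to match the sample-size tables shown earlier, whose underlying test is not stated.

```python
import math
from statistics import NormalDist


def required_n(d, power=0.80, alpha=0.05):
    """Approximate number of subjects needed for a two-tailed one-sample
    z-test to reach the requested power for standardized effect d."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = norm.inv_cdf(power)          # quantile matching the desired power
    return math.ceil(((z_alpha + z_power) / d) ** 2)


for d in (0.20, 0.50, 0.80):
    print(f"D = {d:.2f}: about {required_n(d)} subjects for power .80")
```

Smaller effects require far more participants for the same power, which is the general pattern in the tables as well.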

Page 22: On-line resources

Outcome statistically significant | Sample size | Conclusion
Yes | Small | Important results
Yes | Large | Might or might not have practical importance
No | Small | Inconclusive
No | Large | Research hypothesis probably false

Page 23: On-line resources

Statistical power (for p <.05)

Power for r = .10, r = .30, r = .50 (two-tailed and one-tailed)

Power = 1 - type 2 error
Power = 1 - beta
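The slide's own power values did not survive, so the sketch below only illustrates the relationship for the three correlations it names. It uses the Fisher z approximation and a hypothetical sample size of 50; both are assumptions, and the output should not be read as the slide's original figures.

```python
import math
from statistics import NormalDist


def correlation_power(r, n, alpha=0.05, two_tailed=True):
    """Approximate power to detect a population correlation r with n pairs,
    using the Fisher z transformation (normal approximation)."""
    norm = NormalDist()
    shift = math.atanh(abs(r)) * math.sqrt(n - 3)
    if two_tailed:
        z_crit = norm.inv_cdf(1 - alpha / 2)
        return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)
    z_crit = norm.inv_cdf(1 - alpha)
    return norm.cdf(shift - z_crit)


n = 50  # hypothetical sample size, not from the slides
for r in (0.10, 0.30, 0.50):
    two = correlation_power(r, n, two_tailed=True)
    one = correlation_power(r, n, two_tailed=False)
    # beta, the type 2 error rate, is 1 - power
    print(f"r = {r:.2f}: two-tailed power = {two:.2f} (beta = {1 - two:.2f}), "
          f"one-tailed power = {one:.2f} (beta = {1 - one:.2f})")
```

For a given r and sample size, the one-tailed test has more power than the two-tailed test, provided the effect is in the predicted direction.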