On-line resources

• http://wise.cgu.edu/powermod/index.asp
• http://wise.cgu.edu/regression_applet.asp
• http://wise.cgu.edu/hypomod/appinstruct.asp
• http://psych.hanover.edu/JavaTest/NeuroAnim/stats/StatDec.html
• http://psych.hanover.edu/JavaTest/NeuroAnim/stats/t.html
• http://psych.hanover.edu/JavaTest/NeuroAnim/stats/CLT.html

• Note demo page

http://wise.cgu.edu/powermod/index.asp

http://wise.cgu.edu/regression_applet.asp

Effect sizes

• For a small effect size (.01), the change in success rate is from 46% to 54%.
• For a medium effect size (.06), the change in success rate is from 38% to 62%.
• For a large effect size (.16), the change in success rate is from 30% to 70%.

Effect size | R-squared | r | Cohen’s D
Large | .15 | .39 | .80
Medium | .06 | .24 | .50
Small | .01 | .10 | .20

But what does .10 really mean?

Predictor | Outcome | R2 | r
Vietnam veteran status | Alcohol abuse | .00 | .03
Testosterone | Juvenile delinquency | .01 | .10
AZT | Death | .05 | .33
Psychotherapy | Improvement | .10 | .32

Is psychotherapy effective? (after Shapiro & Shapiro, 1983)

Therapy target | Number of studies | Cohen’s D | r | R2
Anxiety & depression | 30 | .67 | .31 | 9.6%
Phobias | 76 | .88 | .54 | 29%
Physical and habit problems | 106 | .85 | .52 | 27%
Social and sexual problems | 76 | .75 | .43 | 18%
Performance anxieties | 126 | .71 | .37 | 14%

Calculating Cohen’s D

Effect size = difference between the predicted mean and the mean of the known population, divided by the population standard deviation (assumes that you know population and sample size)

(imagine one population receives treatment, the other does not)

$d = (\mu_1 - \mu_2) / \sigma$

$\mu_1$ = mean of population 1 (hypothesized mean for the population that is subjected to the experimental manipulation)
$\mu_2$ = mean of population 2 (which is also the mean of the comparison distribution)
$\sigma$ = standard deviation of population 2 (assumed to be the standard deviation of both populations)
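The definition above can be written as a one-line function. This is only a minimal sketch: the function name and the numbers in the example are hypothetical, chosen for illustration, and it assumes the population standard deviation is known.

```python
def cohens_d(mean_treated: float, mean_comparison: float, sd_population: float) -> float:
    """Difference between two population means, in standard deviation units."""
    if sd_population <= 0:
        raise ValueError("standard deviation must be positive")
    return (mean_treated - mean_comparison) / sd_population


# Hypothetical numbers, for illustration only: the treated population is
# predicted to average 105, the comparison population averages 100, and the
# standard deviation of both is assumed to be 10.
d = cohens_d(mean_treated=105.0, mean_comparison=100.0, sd_population=10.0)
print(f"d = {d:.2f}")
```

With these made-up values the predicted difference is half a standard deviation, which is the conventional "medium" effect.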

One other way to think about D

• D =.20, overlap 85%, 15 vs. 16 year old girls distribution of heights

• D=.50, overlap 67%, 14 vs. 18 year old girls distribution of heights

• D=.80, overlap 53%, 13 vs. 18 year old girls distribution of heights
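The overlap percentages above appear to match one minus Cohen's U1 (the share of the two distributions that does not overlap). The sketch below rests on that assumption, and on two normal populations with equal standard deviations; the function name is hypothetical.

```python
from statistics import NormalDist


def percent_overlap(d: float) -> float:
    """Percent overlap of two equal-variance normal distributions, as 1 - Cohen's U1."""
    u2 = NormalDist().cdf(abs(d) / 2)   # share of one group below the midpoint cutoff
    u1 = (2 * u2 - 1) / u2              # Cohen's U1: proportion of non-overlap
    return 100 * (1 - u1)


for d in (0.20, 0.50, 0.80):
    print(f"D = {d:.2f}: overlap is about {percent_overlap(d):.0f}%")
```

When D is 0 the two distributions coincide and the function returns 100.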

Effect sizes are interchangeable

• http://www.amstat.org/publications/jse/v10n3/aberson/power_applet.html
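One common way to move between D and r is sketched below. It assumes two groups of equal size; tabulated values in a given source may rest on other conventions or rounding, so they need not agree exactly. The function names are hypothetical.

```python
import math


def d_to_r(d: float) -> float:
    """Convert Cohen's d to a correlation r, assuming two equal-sized groups."""
    return d / math.sqrt(d * d + 4)


def r_to_d(r: float) -> float:
    """Convert a correlation r back to Cohen's d under the same assumption."""
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return 2 * r / math.sqrt(1 - r * r)


for d in (0.20, 0.50):
    r = d_to_r(d)
    print(f"D = {d:.2f}  r = {r:.2f}  R-squared = {r * r:.2f}")
```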

Statistical significance vs. effect size

• p < .05
• r = .10
– For 100,000, p < .05
– For 10, p > .05
– Large sample, closer to population, less chance of sampling error
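To see how the same r = .10 can fall on either side of p = .05, the sketch below approximates a two-tailed p value with the Fisher z transformation. This is a large-sample approximation (an exact test would use the t distribution), and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def approx_p_value(r: float, n: int) -> float:
    """Approximate two-tailed p value for a correlation r observed in n cases."""
    if n <= 3:
        raise ValueError("the Fisher z approximation needs n > 3")
    z = math.atanh(r) * math.sqrt(n - 3)   # r in standard-error units
    return 2 * (1 - NormalDist().cdf(abs(z)))


for n in (10, 100_000):
    p = approx_p_value(0.10, n)
    verdict = "p < .05" if p < 0.05 else "p > .05"
    print(f"n = {n}: {verdict}")
```

The effect size is identical in both runs; only the sample size changes the verdict.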

Brief digression

• Research hypotheses and statistical hypotheses

• Is psychoanalysis effective?
– Null?
– Alternate?
– Handout

• Why test the null?

Statistical significance and decision levels (Z scores, t values and F values)

Sampling distributions for the null hypothesis:

http://statsdirect.com/help/distributions/pf.htm

One way to think about it…

Two ways to guess wrong

Truth for population | Do not reject null hypothesis | Reject null hypothesis
Null is true | Correct! | Type 1 error
Null is not true | Type 2 error | Correct!

Type 1 error: think something is there and there is nothing
Type 2 error: think nothing is there and there is something
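Both kinds of wrong guess can be counted in a small simulation. The sketch below is an illustration under stated assumptions, not an analysis of real data: it draws samples from a normal population with a known standard deviation, runs a two-tailed z test at alpha = .05, and all names and parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean


def rejection_rate(true_mean: float, null_mean: float = 0.0, sd: float = 1.0,
                   n: int = 25, alpha: float = 0.05,
                   trials: int = 4000, seed: int = 1) -> float:
    """Fraction of simulated studies in which a two-tailed z test rejects the null."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sd / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample_mean = fmean(rng.gauss(true_mean, sd) for _ in range(n))
        if abs(sample_mean - null_mean) / standard_error > z_crit:
            rejections += 1
    return rejections / trials


# Null is true: every rejection is a type 1 error, so the rate should sit near alpha.
type1_rate = rejection_rate(true_mean=0.0)
# Null is not true (assumed true mean of 0.5): every non-rejection is a type 2 error.
type2_rate = 1 - rejection_rate(true_mean=0.5)
print(f"type 1 error rate: {type1_rate:.3f}")
print(f"type 2 error rate: {type2_rate:.3f}")
```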

An example

Decision | Null hypothesis is false | Null hypothesis is true
Reject null hypothesis | Merit pay works and we know it. | We decided merit pay worked, but it doesn’t.
Do not reject null hypothesis | We decided merit pay does not work but it does. | Merit pay does not work and we know it.

An example

Imagine the following research looking at the effects, if any, of the drug AZT on HIV-positive patients. In other words, does a group of AIDS patients given AZT live longer than another group given a placebo? If we conduct the experiment correctly (everything is held constant or randomly distributed except for the independent measure) and we do find a difference between the two groups, there are only two reasonable explanations available to us:

From Dave Schultz:

Decision | Null hypothesis is false | Null hypothesis is true
Reject null hypothesis | |
Do not reject null hypothesis | |

Effect size | Power -> .10 | .20 | .30 | .40 | .50 | .60 | .70 | .80 | .90
.01 | 21 | 53 | 83 | 113 | 144 | 179 | 219 | 271 | 354
.06 | 5 | 10 | 14 | 19 | 24 | 30 | 36 | 44 | 57
.15 | 3 | 5 | 6 | 8 | 10 | 12 | 14 | 17 | 22

If you think that the effect is small (.01), medium (.06), or large (.15), and you want to find a statistically significant difference defined as p<.05, this table shows you how many participants you need for different levels of “sensitivity” or power.

Statistical power is how “sensitive” a study is in detecting various associations (magnification metaphor)

Effect size | Power -> .10 | .20 | .30 | .40 | .50 | .60 | .70 | .80 | .90
.01 | 70 | 116 | 156 | 194 | 232 | 274 | 323 | 385 | 478
.06 | 13 | 20 | 26 | 32 | 38 | 45 | 53 | 62 | 77
.15 | 6 | 8 | 11 | 13 | 15 | 18 | 20 | 24 | 29

If you think that the effect is small (.01), medium (.06), or large (.15), and you want to find a statistically significant difference defined as p<.01, this table shows you how many participants you need for different levels of “sensitivity” or power.

What determines power?

1. Number of subjects
2. Effect size
3. Alpha level

Power = probability that your experiment will produce a statistically significant result (reject the null hypothesis) when your research hypothesis is true
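The three determinants can be explored with a rough power calculation. The sketch below assumes the effect is a correlation r and uses the Fisher z normal approximation, ignoring the negligible chance of rejecting in the wrong direction. It is one of several possible approximations, so its figures need not match tables built on a different test. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def approx_power(r: float, n: int, alpha: float = 0.05, two_tailed: bool = True) -> float:
    """Approximate power to detect a population correlation r with n subjects."""
    if n <= 3:
        raise ValueError("the Fisher z approximation needs n > 3")
    tail_alpha = alpha / 2 if two_tailed else alpha
    z_crit = NormalDist().inv_cdf(1 - tail_alpha)   # cutoff for the rejection region
    shift = math.atanh(r) * math.sqrt(n - 3)        # how far the true effect sits from zero
    return 1 - NormalDist().cdf(z_crit - shift)


print(f"baseline (r=.30, n=50):  {approx_power(0.30, 50):.2f}")
print(f"more subjects (n=100):   {approx_power(0.30, 100):.2f}")
print(f"larger effect (r=.50):   {approx_power(0.50, 50):.2f}")
print(f"stricter alpha (.01):    {approx_power(0.30, 50, alpha=0.01):.2f}")
```

Each line changes one determinant relative to the baseline, so the direction of its influence on power is visible in isolation.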

How to increase power?

1. Increase region of rejection to p<.10
2. Increase sample size
3. Increase treatment effects
4. Decrease within group variability

Study feature | Practical way of raising power | Disadvantages
Predicted difference | Increase intensity of experimental procedures | May not be practical or may distort study’s meaning
Standard deviation | Use a less diverse population | May not be available, decreases generalizability
Standard deviation | Use standardized, controlled circumstances of testing or more precise measurement | Not always practical
Sample size | Use a larger sample size | Not practical, can be costly
Significance level | Use a more lenient level of significance | Raises alpha, the probability of type 1 error
One tailed vs. two tailed test | Use a one-tailed test | May not be appropriate to logic of study

What is adequate power?
.50 (most current research)
.80 (recommended)

How do you know how much power you have? Guesswork

Two ways to use power:
1. Post hoc to establish what you could find
2. Determine how many participants you need
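The second use, finding how many participants are needed, can be sketched by solving the power calculation for n. The example assumes the expected effect is a correlation r and relies on the Fisher z normal approximation; other tests and other tables rest on different assumptions and will give different counts. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def required_n(r: float, power: float = 0.80, alpha: float = 0.05,
               two_tailed: bool = True) -> int:
    """Approximate number of participants needed to detect a correlation r."""
    if not 0 < abs(r) < 1:
        raise ValueError("r must be nonzero and strictly between -1 and 1")
    tail_alpha = alpha / 2 if two_tailed else alpha
    z_alpha = NormalDist().inv_cdf(1 - tail_alpha)   # cutoff for significance
    z_power = NormalDist().inv_cdf(power)            # cutoff for the desired power
    n = ((z_alpha + z_power) / math.atanh(abs(r))) ** 2 + 3
    return math.ceil(n)


for r in (0.10, 0.30):
    print(f"r = {r:.2f}: about {required_n(r)} participants for power .80")
```

Smaller expected effects push the required sample size up sharply, which is why the guess about effect size matters so much at the planning stage.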

Outcome statistically significant | Sample size | Conclusion
Yes | Small | Important results
Yes | Large | Might or might not have practical importance
No | Small | Inconclusive
No | Large | Research H. probably false

Statistical power (for p < .05)

[Figure: power for r = .10, r = .30, and r = .50, two-tailed and one-tailed]

Power = 1 - probability of a type 2 error
Power = 1 - beta
