PSY2004 Research Methods / PSY2005 Applied Research Methods
Week Eleven
Stephen Nunn

Page 1:

PSY2004 Research Methods / PSY2005 Applied Research Methods

Week Eleven

Stephen Nunn

Page 2:
Page 3:

• What it is

• Why it is important

Page 4:

sensitivity of a statistical test

Page 5:

why stats?

variability in the data

lots of different, random sources of variability

Page 6:

we’re trying to see if changes in the

Independent Variable
• type of non-word
• treatment type

affect scores on the

Dependent Variable
• reaction time
• no. of days drugs taken

Page 7:

lots of other things affect the DV
• individual differences
• time of day
• mood
• level of attention
• etc.

lots of random, unsystematic sources of variation, unrelated to the IV

‘noise’

Page 8:
Page 9:

sometimes the effects due to the IV are big, strong

easy to see through the noise

but what if the effect you’re looking for is small, weak?

Page 10:
Page 11:

your ‘equipment’ (eyes, statistical test)

needs to be sensitive enough to spot it

otherwise you’ll miss it

Page 12:

sensitivity of a statistical test

Page 13:

the ability, or probability, of detecting an effect

[when there is one]
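
In symbols (standard notation, not from the slides): power = P(reject H0 | H0 is false) = 1 − β, where β is the probability of a Type II error.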

Page 14:

sounds like a good thing

[but is often ignored]

Page 15:

Reviews of meta-analyses* suggest most social science effect sizes are medium at best, mostly small (Ellis, 2010)

*meta-analyses combine the results from several studies addressing the same hypotheses

Page 16:

Estimates of power in published Psychological research (e.g., Clark-Carter, 1997, looking at the British Journal of Psychology)

mean power for medium effects = 0.6

mean power for small effects = 0.2

NB recommended level of power = 0.8

Page 17:

What does power = 0.2 mean?

[when there is an effect to detect]

you only have a 20% chance of detecting it [i.e., getting a statistically significant result]

Page 18:

The ‘noise’ will tend to swamp the effect of your IV.

Repeated running of the same study would only give a significant result 20% of the time
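
A minimal simulation of this idea (illustrative numbers, not from the lecture: a true effect of d = 0.5 with only n = 10 per group gives roughly 20% power):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    runs, hits = 2000, 0
    for _ in range(runs):
        a = rng.normal(0.0, 1.0, 10)   # control group
        b = rng.normal(0.5, 1.0, 10)   # real effect, but a small sample
        if stats.ttest_ind(a, b).pvalue < 0.05:
            hits += 1
    print(hits / runs)   # around 0.18: most repeats miss the real effect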

Page 19:

Or,

you have an 80% probability of making a Type II error

[failing to reject the null hypothesis when it is false]
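
(in symbols: the Type II error rate β = 1 − power = 1 − 0.2 = 0.8)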

Page 20:

what affects power?

anything that changes the effect-to-‘noise’ ratio

Page 21:

effect size

all other things being equal, you will have greater power with a bigger effect and less power with a smaller effect
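
To illustrate with Python's statsmodels (an assumed tool, not from the lecture; the d values are Cohen's conventional benchmarks):

    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()
    for d in (0.2, 0.5, 0.8):   # small / medium / large effect sizes
        print(d, round(analysis.power(effect_size=d, nobs1=30, alpha=0.05), 2))
    # with n = 30 per group: d = 0.2 -> ~0.12, d = 0.5 -> ~0.47, d = 0.8 -> ~0.86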

Page 22:

design

all other things being equal, repeated measures designs are more powerful than independent groups designs

because they allow you to remove the ‘noise’ in the data due to individual differences
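
A sketch of why (assumed values, not from the lecture): with correlation r between conditions, the paired effect size is dz = d / √(2(1 − r)), so shared individual differences are removed from the ‘noise’.

    import numpy as np
    from statsmodels.stats.power import TTestIndPower, TTestPower

    d, n, r = 0.5, 30, 0.5                       # assumed effect, cell size, correlation
    print(TTestIndPower().power(effect_size=d, nobs1=n, alpha=0.05))   # ~0.47, independent groups
    dz = d / np.sqrt(2 * (1 - r))                # effect size against within-person 'noise'
    print(TTestPower().power(effect_size=dz, nobs=n, alpha=0.05))      # ~0.75, repeated measures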

Page 23:

cell size

all other things being equal, simpler designs with fewer levels of your IV will increase power
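
For example, a sketch with an assumed medium effect (Cohen's f = 0.25) and a fixed total N of 90 spread over more or fewer levels:

    from statsmodels.stats.power import FTestAnovaPower

    anova = FTestAnovaPower()
    for k in (2, 3, 6):   # more levels of the IV, same total N
        print(k, round(anova.power(effect_size=0.25, nobs=90, alpha=0.05, k_groups=k), 2))
    # power falls as the same participants are spread across more cells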

Page 24:

alpha [criterion for rejecting H0]

stricter (smaller) alphas DECREASE power

e.g., post-hoc Type 1 error rate corrections such as Bonferroni are achieved at the expense of power
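
A sketch of the trade-off (assumed values: d = 0.5, n = 30 per group, Bonferroni over three planned comparisons):

    from statsmodels.stats.power import TTestIndPower

    t = TTestIndPower()
    print(t.power(effect_size=0.5, nobs1=30, alpha=0.05))       # ~0.47 at alpha = .05
    print(t.power(effect_size=0.5, nobs1=30, alpha=0.05 / 3))   # ~0.30: stricter alpha, less power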

Page 25:

measures, samples

unreliable measures

heterogeneous samples

→ increase the ‘noise’

→ decrease power

Page 26:

sample size

a larger N gives you more power

[from the Central Limit Theorem: increasing N reduces the variability of the sample means, which reduces the ‘noise’]
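
In symbols: the standard error of the mean is SE = σ / √N, so quadrupling N halves the ‘noise’ the effect has to be seen through.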

Page 27:

but does this matter?

Page 28:

for the individual researcher:

power = 0.2 = highly likely to waste time and other resources

Page 29:

for ‘science’:

should we not worry more about Type 1 errors?

[rejecting H0 when it is true]

Page 30:

maybe, but:

common (but mistaken) tendency to interpret non-significant results as evidence for no difference

i.e., a non-significant result due to low power isn’t just a waste of resources; it can also be misread as evidence that there is no effect

Page 31:

maybe, but:

a strong publication bias in Psychology means Type 1 errors and Power are intertwined

i.e., only significant results tend to get published

Page 32:

This bias means that if all H0 were true

then all published studies would be Type 1 errors

i.e., keeping the Type 1 error rate at 5% for individual studies, or for research as a whole, doesn’t keep the error rate in the literature at that level, due to the publication bias

Page 33:

Low power across the discipline increases the proportion of published studies that are Type 1 errors

i.e., general low power reduces the proportion of studies with false H0s that reach significance and are therefore published (due to the publication bias). The ratio of Type 1 errors to correct rejections of H0 is therefore increased (Ellis, 2010)

Page 34:

Two hypothetical literatures, one built from studies run at 80% power and one at 40% power:

With 80% power:
H0 true (no effect): Type 1 errors (5%) | correct failures to reject H0 (95%)
H0 false (effect): correct rejections of H0 (80%) | Type 2 errors (20%)
published: the significant results, i.e., the Type 1 errors and the correct rejections
ratio of Type 1 errors to correct rejections = 5:80 (6.25%)

With 40% power:
H0 true (no effect): Type 1 errors (5%) | correct failures to reject H0 (95%)
H0 false (effect): correct rejections of H0 (40%) | Type 2 errors (60%)
published: the significant results, i.e., the Type 1 errors and the correct rejections
ratio of Type 1 errors to correct rejections = 5:40 (12.5%)
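
The arithmetic behind those ratios, as a minimal Python sketch (the 50/50 split between true and false H0s is an assumption of the illustration):

    alpha = 0.05
    for power in (0.8, 0.4):
        type1 = 0.5 * alpha     # significant results from the true-H0 half
        correct = 0.5 * power   # significant results from the false-H0 half
        print(power, round(type1 / correct, 4))
    # 0.8 -> 0.0625 (5:80), 0.4 -> 0.125 (5:40)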

Page 35:

NB the publication bias also makes it harder to reveal Type 1 errors in the literature

i.e., non-significant failures to replicate a published study (which would reveal it as a possible Type 1 error) are less likely to be published, due to the publication bias against non-significant findings.

Page 36:

maybe, but:

small sample (i.e., low power) studies tend to over-estimate the size of effects and are more likely to be Type 1 errors (Ellis, 2010)

i.e., studies with small N are more likely to give misleading results (but not always)
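
A small simulation of the over-estimation point (assumed values: true d = 0.3 and n = 15 per group):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    sig_d = []
    for _ in range(5000):
        a = rng.normal(0.0, 1.0, 15)             # control
        b = rng.normal(0.3, 1.0, 15)             # true effect d = 0.3
        if stats.ttest_ind(a, b).pvalue < 0.05:  # keep only the 'publishable' runs
            sp = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
            sig_d.append((b.mean() - a.mean()) / sp)
    print(np.mean(sig_d))   # well above the true 0.3: significant small-N studies exaggerate d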

Page 37:

low power is BAD for individual researchers and BAD for Psychology as a discipline

Page 38:

what you should do:

• make sure your own study (e.g., FYP) has sufficient power
• use something like G*Power to calculate your N for power (1 − β) = 0.8 (see the sketch below)
• simplify your designs; only include what’s necessary
• an extra IV or condition either reduces power or raises the N you need
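
If you want to check a G*Power result, the same calculation can be sketched with statsmodels (assumed example values: independent groups, expected d = 0.5):

    from statsmodels.stats.power import TTestIndPower

    n = TTestIndPower().solve_power(effect_size=0.5, power=0.8, alpha=0.05)
    print(n)   # ~64 participants per group for d = 0.5 at 80% power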

Page 39:

what you should do:

• pay more attention to published studies that have greater power (e.g., larger samples)

• distrust results from small samples

• look for meta-analyses