Introduction to Statistical Considerations in Experimental Research Dr. Richy Hetherington and Dr. Kim Pearce


Introduction to Statistical Considerations in

Experimental Research

Dr. Richy Hetherington

and Dr. Kim Pearce


Introductions


Today’s Session

• Run a live experiment

• Discussion of considerations when setting up experiments

• Analyse the results of our experiments with thoughts on what to look out for when analysing data

• The best help for you


Solving Problems

Open the sheet and see what you make of the problems


Simpson’s Paradox


“Are people good intuitive statisticians? …

…expert colleagues, like us, greatly exaggerated the likelihood that the original result of an experiment would be successfully replicated even with a small sample. They also gave very poor advice to a fictitious graduate student about the number of observations she needed to collect. Even statisticians were not good intuitive statisticians.”


The Experiment

Does perception of time relate to organisation and timeliness?


Personality Types

• http://bigthink.com/ideafeed/different-personalities-experience-time-differently


An Experiment

“The action of trying anything, or putting it to proof; a test, trial”

Oxford English Dictionary

My Life as a Turkey

Book: Illumination in the Flatwoods, Joe Hutto


Take Home Messages

• Leave no stone unturned (use all possible sources of information)

• Training to help (workshops throughout the year):

- Non-Medline Library Databases

- Robust Search Methodologies for Literature Review

- Systematic Review

- Alerting Services

- Medline

• Think about what is coming next


Planning Your Experiments

Take home message: Don’t believe everything you read

Academic Integrity and Plagiarism & Introduction to Critical Appraisal (online)

Use non-rigorous experiments but be prepared to repeat them with rigour

Take home message: Get as much help as is available in setting up your experiments (shy bairns get nowt!)


Make every result count

Take home message: Set up your experiments so all eventualities are interesting

Results can be meaningful and interesting without being statistically significant

Reporting non-significant findings also saves others from needlessly repeating the experiment


Subject Selection and Randomisation

• Make sure the sample you take is representative of what you are testing

• Samples should be drawn at random to avoid bias

e.g. are you a representative sample of the population?

What would I do if everyone had turned up early?
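A minimal sketch of random assignment, assuming 20 subjects identified by made-up IDs and two groups. The function name `randomly_assign` and the seed are illustrative choices, not part of the session materials; the point is only that chance, rather than the experimenter or the order of arrival, decides who goes where.

```python
import random

def randomly_assign(subjects, n_groups=2, seed=None):
    """Shuffle the subjects, then deal them into n_groups in turn."""
    rng = random.Random(seed)      # seeded so the allocation can be reproduced
    shuffled = list(subjects)      # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    return [shuffled[i::n_groups] for i in range(n_groups)]

subjects = [f"S{i:02d}" for i in range(1, 21)]   # hypothetical subject IDs
treatment, control = randomly_assign(subjects, n_groups=2, seed=1)
print("Treatment:", treatment)
print("Control:  ", control)
```

Recording the seed alongside the allocation makes it possible to show later how the groups were formed.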


Replication

• Combining datasets from separate experiments is difficult

• Datasets can be treated as replicates if all other variables are the same or are weighted appropriately

• Analysis of replicates indicates the amount of variation in a result


Controls

• Controls should give you internal validity

• Take as much care with controls as with samples

• Each experiment requires its own control


Why have small sample sizes?

• Non-human primates: often n=1

• Very rare conditions: the population is small

• Animal experimentation


Get a statistician’s help now

• “To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.” Dr. R. A. Fisher, ca. 1938


Type A (early birds) versus Type B (laid back)


A (Early) vs B (Laid back)

The independent 2 sample t-test (Parametric Test)

• Subjects (units) are usually randomly assigned to two groups. One of the groups undergoes experimental manipulation (e.g. has a treatment applied), the other group is the control.

• In many examples, however, two groups are compared where membership is ‘fixed’ e.g. males vs females, left vs right handed, early vs laid back etc.

• We are testing if the two population means are equal.

• The 2 sample t-test statistic makes use of

1. the difference between the (average) value of the A and B groups,

2. the (pooled) standard deviation, and

3. the size of the A and B groups.

(We do not have to have equal numbers in our groups)

• We compare the value of the statistic to a statistical distribution. The significance of the statistic is obtained and is expressed by a ‘p value’.

• When the p value is < 0.05 we say that the result is statistically significant, i.e. in this case, there is evidence that the mean of the A group differs from the mean of the B group (in the population).
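The three ingredients listed above (difference in means, pooled standard deviation, group sizes) can be combined as in the sketch below. The data values are hypothetical and the function name `pooled_t` is illustrative. The p value itself needs the t distribution, which a statistics package would supply, so this sketch stops at the test statistic and its degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Two-sample t statistic with a pooled variance (equal variances assumed)."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    se = sqrt(pooled_var * (1 / n_a + 1 / n_b))   # standard error of the difference
    return (mean(a) - mean(b)) / se, df

# Hypothetical measurements; the groups need not be the same size.
group_a = [52, 58, 61, 55, 60, 57]
group_b = [63, 66, 59, 70, 64]
t_stat, df = pooled_t(group_a, group_b)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

The statistic is then compared with a t distribution on `df` degrees of freedom to obtain the p value.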


Result using the data available

• Do groups A and B differ?

• p value=?

• Let’s look at the data on a plot.


The Boxplot


Boxplot for the data today


Using smaller samples

• 5 people from A and 5 people from B were randomly chosen and the 2 sample t-test was again carried out.

• Is there evidence that the A group is different to the B group (in the population)?

• As the group size is small, there is a reduced chance of detecting a difference between the A and B groups when we conduct the test, even if a difference truly exists.


What is the power of these tests?

• We would like our test to have high power, i.e. a high probability that the test will detect a difference when one truly exists.

• The power of the test is influenced by different things including sample size.

• The lower sample size of our 2nd test (using 5 people from groups A and B) means that the test’s power has been reduced.


Power of Our tests

Test 1 (large sample sizes). Power =

Test 2 (5 from A, 5 from B). Power=


What influences the power of a test?

1. As variation in the sample increases, power decreases.

2. As the difference we care about decreases, power decreases.

3. As sample size decreases, power decreases.
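The three statements above can be checked numerically. The sketch below uses a normal approximation to the power of a two-sided, two-sample comparison with equal group sizes; it is not the exact t-based calculation a statistics package would give, and it is rough for very small groups. The function name `approx_power` and all input values are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power to detect a true mean difference `delta`.

    Normal approximation, two-sided test, equal group sizes, common sd.
    The negligible chance of rejecting in the wrong direction is ignored.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    signal = abs(delta) / (sd * sqrt(2 / n_per_group))
    return z.cdf(signal - z_crit)

baseline = approx_power(delta=10, sd=10, n_per_group=20)
print("baseline            :", round(baseline, 3))
print("more variation      :", round(approx_power(10, 20, 20), 3))
print("smaller difference  :", round(approx_power(5, 10, 20), 3))
print("smaller sample size :", round(approx_power(10, 10, 5), 3))
```

Each of the last three lines changes one input relative to the baseline, and each gives a lower power.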


Prospective Power Analysis (used before collecting data)

• Finding a sample size to detect an effect size we care about at a specific power.

• Usually need to specify:

- Alpha level

- Variance (from literature or pilot data)

- Statistical power

- Effect size we care about*

*Effect size could be, for example, the difference between the means
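As a rough illustration of how the four inputs above turn into a sample size, the sketch below uses the standard normal-approximation formula for comparing two means with equal group sizes. The function name `n_per_group` and the example values (difference of 5, standard deviation of 10) are assumptions for illustration; a t-based calculation in a statistics package will usually give a slightly larger answer.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sd, power=0.8, alpha=0.05):
    """Approximate subjects needed per group to detect a mean difference `delta`.

    n = 2 * ((z_(1 - alpha/2) + z_power) * sd / delta)^2, rounded up.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided alpha level
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)

# Hypothetical planning values: sd taken from pilot data, delta = difference we care about.
print(n_per_group(delta=5, sd=10))
print(n_per_group(delta=5, sd=10, power=0.9))
```

Halving the difference we care about roughly quadruples the required sample size, which is why the choice of effect size matters so much at the planning stage.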


Retrospective Power Analysis (after the test has been done on collected data): controversial!

• Finding the power of the test that you have performed to detect “an effect size”.

• Usually need to specify:

- Alpha level

- Variance (from data)

- Sample size

- Effect size


Retrospective Power Analysis

• You could calculate power based on the effect size you observe in your data: not recommended.

Power calculated in this way is related to the p value of the test and both are dependent on the observed effect size.

- Non significant test tends to have low power;

- Significant test tends to have high power.


Retrospective Power Analysis

• Calculate power based on the effect size you care about. Less controversial. For example, say we get a non-significant test: we can work out the power that the test has to detect an effect size that we care about.

• If the test has low power to detect this effect size, you can do something about it (e.g. collect more data) to increase the power, then continue to evaluate the same problem.

• If the test has high power to detect this effect size, you may conclude that there is no meaningful difference (effect) and refrain from collecting additional data.

• It is suggested that you also report a 95% confidence interval for power (as the variance is estimated from sample data).

• Which effect size should I choose? Look at a range of effect sizes.

• Can also use ‘reverse power analysis’: determine the effect size detectable with a certain power. The question could be ‘what effect size am I able to detect with my data at power 0.8?’
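A minimal sketch of the ‘reverse’ calculation, again under a normal approximation with two equal-sized groups. The function name `detectable_difference` and the inputs are illustrative; with groups as small as 5 the approximation is optimistic, so the true detectable difference is somewhat larger than the value printed.

```python
from math import sqrt
from statistics import NormalDist

def detectable_difference(sd, n_per_group, power=0.8, alpha=0.05):
    """Smallest true mean difference detectable with the given power.

    Normal approximation, two-sided test, equal group sizes, common sd.
    """
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * sd * sqrt(2 / n_per_group)

# Expressed in units of the standard deviation (sd=1) for two hypothetical group sizes.
for n in (5, 50):
    print(n, "per group:", round(detectable_difference(sd=1, n_per_group=n), 2), "sd")
```

With only 5 subjects per group, the difference has to be well over one standard deviation before the test has a good chance of detecting it.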


Retrospective Power Analysis

• Calculate confidence intervals about the effect size calculated from your data: recommended. For example, if dealing with differences between means, we can be 95% confident that the true difference between the means (in the population) lies within this interval. If zero is contained within the 95% confidence interval, there is no evidence (at the 5% level) to suggest that there is a difference between the means.

• We ask ourselves : does the ‘difference we care about’ lie in this interval?

• Confidence intervals ‘quantify our uncertainty’.
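For the two-group comparison used in this session, one common form of the 95% confidence interval for the difference between means is sketched below. It assumes equal variances in the two groups (the same assumption as the pooled t-test); the symbols $\bar{x}_A$, $s_A$, $n_A$ (and likewise for B) are notation introduced here for each group's sample mean, standard deviation and size.

```latex
\[
(\bar{x}_A - \bar{x}_B) \;\pm\; t_{0.975,\; n_A + n_B - 2}\; s_p \sqrt{\frac{1}{n_A} + \frac{1}{n_B}},
\qquad
s_p^2 = \frac{(n_A - 1)\,s_A^2 + (n_B - 1)\,s_B^2}{n_A + n_B - 2}
\]
```

Smaller groups widen the interval twice over: through the square-root term and through the larger t multiplier.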


What is the confidence interval for our study?


Let’s see how I did the power calculation!


Retrospective Power Analysis

• References

• Hoenig, J.M. and Heisey, D.M. (2001). The abuse of power: The pervasive fallacy of power calculations for data analysis. The American Statistician, 55, 19-24.

• Thomas, L. (1997). Retrospective power analysis. Conservation Biology, 11, 276-280.

• Lenth, R.V. (2001). Some practical guidelines for effective sample size determination. The American Statistician, 55(3), 187-193.


Independent Samples

Group 1 Group 2


More than 2 independent groups: 1-way Analysis of Variance (ANOVA)

• We are testing if population means are equal when there are 3+ groups.

• The design analysed by a 1-way ANOVA is also called a ‘completely randomised’ design.

• Subjects are regarded as being homogeneous ‘units’; even so, the subjects are assigned to the experimental groups at random to reduce the risk of any (unknown) variation influencing the experiment.


More than 2 groups: 1-way Analysis of Variance (ANOVA)

• Each group comprises different subjects.

• A measurement is recorded for each subject (in the set-up below, say, “test score”).

• Although not necessary, it is usually a good idea to have the same number of subjects in each treatment group.

Hypothetical experimental set-up. Say treatment 1 is learning method 1 and treatment 2 is learning method 2:

Control    Treatment 1    Treatment 2
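For a three-group set-up like the one above, the 1-way ANOVA compares the variation between group means with the variation within groups. The sketch below computes the F statistic from made-up test scores; the function name `one_way_f` and the data are illustrative, and the p value (from the F distribution) is left to a statistics package.

```python
from statistics import mean

def one_way_f(groups):
    """F statistic for a 1-way ANOVA: between-group vs within-group variation."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n_total = len(groups), len(all_values)
    # Between groups: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: how far each subject sits from their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical test scores, 5 different subjects per group.
control = [12, 15, 11, 14, 13]
method_1 = [16, 18, 15, 17, 19]
method_2 = [14, 13, 16, 15, 12]
f_stat, df_b, df_w = one_way_f([control, method_1, method_2])
print(f"F = {f_stat:.2f} on ({df_b}, {df_w}) degrees of freedom")
```

A large F suggests that the group means differ by more than the subject-to-subject noise within groups would explain.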


Adding a 2nd Factor: 2-Way ANOVA

                        Drug
Alertness    Placebo      Drug A       Drug B
Fresh        10 people    10 people    10 people
Tired        10 people    10 people    10 people

• In a 2-way ANOVA we have 2 factors. Experiments such as this with two or more crossed factors are called factorial experiments.

• There are n replicates per treatment combination (here 10 replicates). There are 10 different people per treatment combination.

• The subjects (units) are considered homogeneous above & these units are randomly assigned to the 6 experimental conditions (combinations)

• Here the 2 factors are ‘alertness’ and ‘drug’ type. By testing, we can establish if there are differences between (i) levels of alertness and (ii) levels of drug, and (iii) if there is an alertness x drug interaction.


2-Way ANOVA: What is meant by an interaction?

• There is a significant interaction.

• The lines on the plot are non-parallel.

• The difference in (mean) driving performance between fresh and tired subjects depends on which treatment (drug) they have received.

• If an interaction is significant you must be careful interpreting the main effects. Here, the effect of being fresh or tired is dependent on which level of drug you are considering.
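One way to see what "non-parallel lines" means is to compare the fresh-versus-tired gap within each drug. The cell means below are invented for illustration and are not the values from the plot; the sketch only describes the pattern, and whether an interaction is significant still has to come from the 2-way ANOVA.

```python
# Hypothetical mean driving-performance scores per treatment combination.
cell_means = {
    ("fresh", "placebo"): 70, ("tired", "placebo"): 55,
    ("fresh", "drug A"): 72,  ("tired", "drug A"): 68,
    ("fresh", "drug B"): 60,  ("tired", "drug B"): 40,
}

def fresh_minus_tired(drug):
    """Gap in mean performance between fresh and tired subjects for one drug."""
    return cell_means[("fresh", drug)] - cell_means[("tired", drug)]

gaps = {drug: fresh_minus_tired(drug) for drug in ("placebo", "drug A", "drug B")}
# Parallel lines on the interaction plot would mean the same gap for every drug.
parallel = len(set(gaps.values())) == 1
print(gaps, "parallel lines:", parallel)
```

When the gaps differ like this, a single "effect of tiredness" averaged over drugs hides the fact that the effect depends on the drug.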


1-way ANOVA - revisited

•What do you think are its disadvantages?


1-way ANOVA - revisited

• What are its disadvantages?

1. We may get differences between treatment groups occurring not just because the treatments are having different effects, but also because the groups of people tested are different (due to IQ levels, age, experience etc.), i.e. there is a lot of noise which can cloud the result

2. It uses a lot of subjects


Paired Samples

Group 1 Group 2


Repeated Measures

• Each subject has a measure taken at each level of the treatment factor.

• In the example below, ‘learning method’ is the factor. It is called a ‘within-subjects’ factor.

Note this is a simple example! There are many other more complex designs.

Learning Method 1 2 3


Repeated Measures

• Disadvantages:

• Practice effect: say you had to learn 3 similar lists. The first list was learned under a control condition, then the second under method A, then the third under method B. An improvement under method A, for example, may be a practice effect: the more lists one learns, the better one gets at learning lists.

• Carry-over effect: recall of items in a list is prone to interference from items in previous lists.

• Order effect (dependent on sequence of conditions): if we moved from method A to the control condition, it would be almost impossible for the subject to cease to use method A on demand.


Repeated Measures Counterbalancing

• Remedy by “counterbalancing”: the order of presentation of the levels making up the repeated measures factor is varied from subject to subject. It is hoped that carry-over effects and order effects will balance out.

• Counterbalancing makes little sense in some situations, e.g. it would make little sense to have the control condition coming last in the above example.
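A minimal sketch of full counterbalancing, assuming three conditions and 12 subjects (both made up for illustration; the function name `counterbalanced_orders` is also hypothetical). Every possible order of the conditions is used equally often. As the second bullet above warns, some of these orders may not be sensible in practice, in which case they would need to be excluded.

```python
from itertools import permutations

def counterbalanced_orders(conditions, n_subjects):
    """Cycle through every possible order so each is used equally often."""
    orders = list(permutations(conditions))
    return [orders[i % len(orders)] for i in range(n_subjects)]

# 3 conditions give 6 possible orders; 12 subjects means each order is used twice.
schedule = counterbalanced_orders(["control", "method A", "method B"], 12)
for subject, order in enumerate(schedule, start=1):
    print(f"Subject {subject:2d}: {' -> '.join(order)}")
```

Which subject receives which order should itself be decided at random, and the number of subjects should ideally be a multiple of the number of orders.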


Repeated Measures

• Instead of the effects of different treatments being studied for a set of subjects, we may look at the effect of something over time.

• For example:

• does IQ change when we compare a set of subjects at age 12, age 13, age 14 and age 15?

• A set of subjects learns a list of 50 words and is given 3 trials; the number of words recalled correctly per trial is recorded. We can test if the subjects learn as a function of practice.


When do you need a 1-2-1 statistical session?

• When:

1. You do not know what sample size is required to get a reliable result

2. You need to check that your proposed design is appropriate for a statistical test

3. You have some idea of how to analyse your data but you need to double check and/or get further advice on appropriate methods

4. You need some suitable study references


Statistics 1-2-1 Sessions

• The statistics 1-2-1 sessions are only 1 hour long

• They are NOT:

1. Meant as a means of regular intensive statistical tuition

2. Provided to solve a list of all of your statistical problems

3. Provided to have a statistician do your analysis for you

4. Provided to correct your results

5. A means to have a statistician interpret results and write your conclusions

PLEASE send a detailed description of your query at least 2 days before the session.

PLEASE avoid bringing queries/papers to the session which have not previously been seen by the statistician.


Statistics: The Way Forward

• Think ahead! What are the potential problems? Drop out? Missing values?

• Use your supervisor

• Read some statistics books that feature the types of tests you need (manuals written to accompany statistical packages are good)

• There are some good worked examples on YouTube (e.g. how2stats)

• Don’t gather your data, THEN try to fit a statistical test to a messy data set: you are going to run into problems, e.g. missing values, unequal replicates etc. It could make your analysis much more difficult than it should have been, and you may have to learn advanced techniques.


• Please don’t leave the statistics until the last minute.

• The analysis can be VERY time consuming and the writing of associated conclusions has to be spot on!


Analysis Software

• There are many statistics packages available.

• Minitab & SPSS are the most widely used & among the most straightforward to learn (Minitab has a good help facility)

• The ISS (computing service) provides support to users.

• Other packages (e.g. SAS) may be used in various schools.

• Excel is not recommended as a piece of analysis software.


So what is right for you?

• Refresher in stats

- ISRU very basic stats (45 minutes)

- ISRU basic stats (3 hours) clinical / pure science

• Overview of Stats packages

• SPSS beginners and Advanced

• Getting started with SAS

• MATLAB

• Introduction to Applied Health Research Methods

• One to one stats is useful for anyone at the right time

• Maths aid by appointment (ncl.ac.uk/students/mathsaid/support/book.htm)

• Applied Statistics (ICM students)


Important messages reminder

• Statistical Support is available for your needs

• Get advice at the right time

• Keep it simple

• Don’t underestimate what information is relevant

• Set up your tests to get noteworthy results

• p < 0.05 is not everything