30
Statistical experimental design principles for biological studies Alexander (Alec) Zwart 10 July 2014 CSIRO DIGITAL PRODUCTIVITY AND SERVICES FLAGSHIP

Alec Zwart - Statistical experiment design principles for biological studies

Embed Size (px)

DESCRIPTION

“To consult the statistician after an experiment is finished is often merely to ask them to conduct a post mortem examination. They can perhaps say what the experiment died of.” ‐ Sir Ronald Aylmer Fisher, the father of modern statistics. Statistical experimental design, accompanied by the appropriate statistical analyses, plays a crucial role in producing valid and precise inferences, and avoiding ‘design disasters’ in empirical science. I will quickly refresh some of the basic elements of experimental design in general, and discuss some key issues and examples that arise in the areas of genetics/genomics and high throughput data. First presented at the 2014 Winter School in Mathematical and Computational Biology http://bioinformatics.org.au/ws14/program/

Citation preview

Page 1: Alec Zwart - Statistical experiment design principles for biological studies

Statistical experimental design principles for biological studiesAlexander (Alec) Zwart10 July 2014

CSIRO DIGITAL PRODUCTIVITY AND SERVICES FLAGSHIP

Page 2: Alec Zwart - Statistical experiment design principles for biological studies

An apology, and a caveat

•Apology: I am not a genetics/genomics expert!•But I DO know a bit about design…

• Caveat: Experimental design is a science! (as is statistical analysis…)•Every experiment is different•This talk is necessarily simplistic, and veryincomplete…

• Intended as a reminder of some concepts, some bits of advice, and a plea…

Presentation title  |  Presenter name2 |

Page 3: Alec Zwart - Statistical experiment design principles for biological studies

An experiment – recap:

• Apply treatments (experimental conditions) or treatment combinations to subjects (or experimental units).

• Carefully control/remove other influences as much as possible.

• Observe one or more responses (observed or measured results)

• A way to infer cause => effect.

• Genotype/Species/Variety is a treatment?  Yes...

Presentation title  |  Presenter name3 |

Page 4: Alec Zwart - Statistical experiment design principles for biological studies

What is statistical experimental design concerned with?• Everything!  The entire process leading to the final dataset…

• Recognising sources of variation (systematic, random) that influence your results and hence controlling for them

•How to ensure that we obtain data that provides ‘honest’ and accurate answers to our questions

Presentation title  |  Presenter name4 |

Page 5: Alec Zwart - Statistical experiment design principles for biological studies

Key point:

•Good statistics cannot rescue a bad dataset!

•And a bad dataset may not be able to answer your questions (honestly, that is)

•Do the experiment right => get a good dataset

• Consider design from the very beginning of planning the experiment

Presentation title  |  Presenter name5 |

Page 6: Alec Zwart - Statistical experiment design principles for biological studies

Fundamentals

Randomisation

Blocking

Replication

Presentation title  |  Presenter name6 |

Page 7: Alec Zwart - Statistical experiment design principles for biological studies

Randomisation

•Allocate treatments to subjects randomly(ideally using a proper random number generator, not your own personal judgement)

•Also randomise:•Spatial layout•Order of processing

•Avoid systematic patterns (But: see Blocking)

•Protects against unforeseen biases

ED for Biology |  Alexander Zwart7 |

Page 8: Alec Zwart - Statistical experiment design principles for biological studies

Blocking

• Identify groupings that may effect your results•Choice of operator/technician/machine•Cage/Table/shelf/assay plate/batch

•Try to assign (ideally) one of each treatment to each group! (Randomised Block Design or RBD)•(Example coming later…)

ED for Biology |  Alexander Zwart8 |

Page 9: Alec Zwart - Statistical experiment design principles for biological studies

Replication

• Sample size•G*Power (http://www.gpower.hhu.de/en.html)•Russ Lenth’s applets (http://homepage.stat.uiowa.edu/~rlenth/Power/)

•Simulation studies in the literature? (Google scholar)

• Biological/internal/technical replicates•Don’t confuse these – Internal/technical replicates are not a substitute for biological replication.

• (more later…)

ED for Biology |  Alexander Zwart9 |

Page 10: Alec Zwart - Statistical experiment design principles for biological studies

12 mice, three treatments (Colour)

Presentation title  |  Presenter name10 |

Completely randomised designBalanced (equal info on each treatment,

treatments compared with equal accuracy)

Page 11: Alec Zwart - Statistical experiment design principles for biological studies

But, the mice must be housed…

Presentation title  |  Presenter name11 |

Here, treatments are confounded with cages

Cannot separate Cage effects from Trt effects!

Page 12: Alec Zwart - Statistical experiment design principles for biological studies

Cages as Blocks

Presentation title  |  Presenter name12 |

Page 13: Alec Zwart - Statistical experiment design principles for biological studies

Grr

Presentation title  |  Presenter name13 |

Animals (especially) can pose a problem…

Grrr!

Grrr!GrrEeeek!

Page 14: Alec Zwart - Statistical experiment design principles for biological studies

Cages as Experimental Units / Subjects…

Presentation title  |  Presenter name14 |

…& Cage = Mouse

Page 15: Alec Zwart - Statistical experiment design principles for biological studies

(Aside‐ env. trend across columns of cages?

Presentation title  |  Presenter name15 |

=> Block by Columns!

Page 16: Alec Zwart - Statistical experiment design principles for biological studies

Not enough cages? A compromise:

But, Cages are the Experimental unitsOnly two replicates of each Trt!

Page 17: Alec Zwart - Statistical experiment design principles for biological studies

More on replication:

• The replication used determines the population(s) being sampled...

...hence…

• The replication used determines the conclusions that may be validly drawn from a statistical test

ED for Biology |  Alexander Zwart17 |

Page 18: Alec Zwart - Statistical experiment design principles for biological studies

Comparison between populations of plants

ED for Biology |  Alexander Zwart18 |

Mutant

2.1 1.4 1.9 1.6 4.3 4.44.13.9

Wild Type

Page 19: Alec Zwart - Statistical experiment design principles for biological studies

Comparisons between two plants

ED for Biology |  Alexander Zwart19 |

Mutant

4.3 4.44.13.9

Wild Type

2.1 1.4 1.9 1.6

Page 20: Alec Zwart - Statistical experiment design principles for biological studies

Recording

•Details of treatments and associated randomisation, blocking, types of replication should always be recorded

• Check what information is required for the analysis, or talk to the statistician about what is required.

• This is important meta data when distributing your data…

ED for Biology |  Alexander Zwart20 |

Page 21: Alec Zwart - Statistical experiment design principles for biological studies

More on Blocking…

• ‘One of each treatment per block’ is not always feasible – blocks may be too small• ‘Incomplete blocks’

• Think about:•Balance – treatments occur together in the same blocks the same number of times.

•Linkage – common treatments in different blocks provide links between blocks – ensure you have plenty of linkage.

ED for Biology |  Alexander Zwart21 |

Page 22: Alec Zwart - Statistical experiment design principles for biological studies

Balanced Incomplete Block Design (BIBD)

- Trts occur together the same number of times- Plenty of treatment-linkages between blocks

Page 23: Alec Zwart - Statistical experiment design principles for biological studies

Notes on incomplete blocks

• BIBDs are only availble for certain combinations of parameters.

•However, the general principles are still relevant• ‘Unbalanced incomplete blocks’

•Note that teasing apart the treatment effects from block effects in unbalanced situations is dependent on the capabilities of the analysis method•Check on said capabilities, consider simpler experiment if necessary…

ED for Biology |  Alexander Zwart23 |

Page 24: Alec Zwart - Statistical experiment design principles for biological studies

Treatment structure

• Sir Ronald Fisher => factorial treatment structure

(3x4 factorial)

• 5 = Number of replicates of each treatment

ED for Biology |  Alexander Zwart24 |

Page 25: Alec Zwart - Statistical experiment design principles for biological studies

Factorial treatment structure

• Separate main effects and interactions• Can include more factors (in theory)•Unbalanced numbers of treatments usually OK:

ED for Biology |  Alexander Zwart25 |

Page 26: Alec Zwart - Statistical experiment design principles for biological studies

But beware missing treatment combinations

• The more that entire treatments are missing from the factorial, the harder it is to tease apart main effects and interactions and the more confounded different treatment effects become.

ED for Biology |  Alexander Zwart26 |

Page 27: Alec Zwart - Statistical experiment design principles for biological studies

Don’t get too ambitious!

• Two‐way factorials are much easier to interpret than three‐way, four way… etc.

• The higher the factorial structure, the more likely the structure will be incomplete • (missing values, treatment combos which don’t make sense, treatment combos you aren’t interested in and don’t include).

• Statisticians and their tools can’t work miracles!• Simpler is better…• Talk to a statistician!

ED for Biology |  Alexander Zwart27 |

Page 28: Alec Zwart - Statistical experiment design principles for biological studies

Plea:

• If you don’t already know how to design and conduct good experiments…

…then…

•…involve a statistician in the early stages of planning your experiment!

•And note – the statistician may need time…

Presentation title  |  Presenter name28 |

Page 29: Alec Zwart - Statistical experiment design principles for biological studies

Sir Ronald Fisher, ‘the father of modern statistics’…

“To consult the statistician after an experiment is finished is often merely to ask them to conduct a post mortem examination. 

They can perhaps say what the experiment died of.”

(R. Fisher)

Presentation title  |  Presenter name29 |

Page 30: Alec Zwart - Statistical experiment design principles for biological studies

CSIRO DIGITAL PRODUCTIVITY AND SERVICES FLAGSHIP

Thank you