1. Introduction 2. Course Information and Schedule 3. Study Design 4. Looking at Data Today’s Topics Introduction to the Practice of Statistics Ch. 1, 2.5, 3.2 MBP1010 – Jan. 5, 2010


1. Introduction

2. Course Information and Schedule

3. Study Design

4. Looking at Data

Today’s Topics

Introduction to the Practice of Statistics, Ch. 1, 2.5, 3.2

MBP1010 – Jan. 5, 2010

Meaning from Data

(1) How can we describe and draw meaning from a collection of data?

(2) How can we infer information about the whole population when we know data from only some of the population (a sample)?


- science of understanding data and making decisions in the face of variability and uncertainty

- statistics is NOT a field of mathematics

Statistical Thinking

- humans are good at recognizing patterns, and there is a real danger of over-interpreting patterns that are merely due to the play of chance (false leads)

- role of statistics: to reject chance as an explanation so that we can have reasonable assurance that the patterns seen are worthy of interpretation


Statistical Thinking

- explore data prior to analysis

- think about context and design

- reasoning behind standard statistical methods

Interpretation/Conclusions

Course Overview

1. Looking at data

2. Concepts of statistical inference and hypothesis testing

3. Specific statistical tests
- 1 and 2 sample tests for continuous and categorical data
- correlation, regression and ANOVA

4. Other Topics - e.g. survival analysis, logistic regression

5. Bioinformatics


Course Information and Schedule

Tutorials: Thursdays 2 to 3:30 pm OCI 7-605

R Tutorials: Thurs Jan 7 and 14 (Part 1 and Part 2)

Lectures: Tuesdays 1 to 3 pm 620 University, 7-709

Study Design

Can what we eat influence our risk of cancer?

The case of dietary fat and breast cancer

Posted on website: New York Times article, "Searching for clarity: A primer on medical studies"


What should we do next?

Observational Studies

An observational study observes individuals and measures variables of interest but does not attempt to influence the responses.

Observational Studies

Case/control and cohort studies are common in cancer research (epidemiology)
- outcome is binary: cancer / no cancer

Observational studies often examine factors associated with continuous outcome variables

- e.g. association of body weight or diet with hormone levels

- calcium intake and blood pressure

Case Control Study

[Figure: schematic of individuals marked X or 0, compared on exposure (e.g. diet)]

Cohort Study

[Figure: schematic of a cohort of individuals, most marked 0 and a few marked X; labels: Exposure (e.g. diet), Cancer (yes/no)]


Relative Risk

• Compare risk of disease in those with highest versus lowest intake

RR = 1.0: no association

RR = 1.4: 1.4 times the risk (40% higher risk)

RR = 0.8: 20% lower risk
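As a minimal sketch of the arithmetic behind these readings, the snippet below computes a relative risk from cohort counts. The counts and the function name `relative_risk` are hypothetical, chosen only to reproduce RR = 1.4 and RR = 0.8; they do not come from any study in these slides.

```python
def relative_risk(cases_high, total_high, cases_low, total_low):
    """Risk in the highest-intake group divided by risk in the lowest-intake group."""
    risk_high = cases_high / total_high
    risk_low = cases_low / total_low
    return risk_high / risk_low


# Hypothetical counts: 70 cases per 1000 women (highest intake)
# versus 50 cases per 1000 women (lowest intake).
rr = relative_risk(70, 1000, 50, 1000)
percent_change = (rr - 1) * 100
print(f"RR = {rr:.1f} ({percent_change:+.0f}% risk relative to lowest intake)")
```

With 40 cases per 1000 in the highest-intake group instead, the same function gives RR = 0.8, read as a 20% lower risk.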

a. Total Fat

[Figure: forest plot of the odds ratio or relative risk for each study; horizontal axis from 0 to 15, with a break between 6 and 13]

Case Control: Challier (1998), DeStefani (1998), Ewertz (1990), Franceschi (1996), Graham (1982), Graham (1991), Hirohata (1985), Hirohata (1987) (Caucasian), Hirohata (1987) (Japanese), Ingram (1991), Katsouyanni (1988), Katsouyanni (1994), Landa (1994), Lee (1991), Levi (1993), Mannisto (1999), Martin-Moreno (1994), Miller (1978), Núñez (1996), Potischman (1998), Pryor (1989), Richardson (1991), Rohan (1988), Shun-Zhang (1990), Toniolo (1989), Trichopoulou (1995), van't Veer (1990, 1991), Wakai (2000), Witte (1997), Yuan (1995), Zaridze (1991)

Case Control Summary

Cohort: Gaard (1995), Graham (1992), Holmes (1999), Howe (1991), Jones (1987), Knekt (1990), Kushi (1992), Thiébaut (2001), Toniolo (1994), van den Brandt (1993), Velie (2000), Wolk (1998)

Cohort Summary

All Studies Summary

Bingham (2003), Cho (2003)


Interpretation

Suppose we find that women who eat a low-fat diet tend to have a lower risk of breast cancer.

Can we conclude that the fat in the diet is responsible for the lower risk of breast cancer?

No. Other factors may be responsible for the association with dietary fat (confounding).

Problem of Confounding

Suppose A is associated with B:

This may be because:
• A causes B
• B causes A
• X is associated with both A and B

X need not be a cause of either A or B
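The third possibility can be illustrated with a small simulation. This is only a sketch under stated assumptions: the variables are simulated standard normal noise, A and B never influence each other, and X is made to drive both (one simple way for X to be associated with both; as noted above, X need not be causal in general). The helper `pearson_r` and the sample size are illustrative choices.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 5000

# Assumption: X influences both A and B; A and B have no direct link.
x = [rng.gauss(0, 1) for _ in range(n)]
a = [xi + rng.gauss(0, 1) for xi in x]
b = [xi + rng.gauss(0, 1) for xi in x]

r_ab = pearson_r(a, b)
print(f"correlation between A and B: {r_ab:.2f}")
```

A and B come out clearly correlated (about 0.5 in expectation under these assumptions) even though neither causes the other.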

Problem of Confounding

In our dietary fat example:

- women who eat more dietary fat may differ from those who eat less fat (e.g. weight, exercise, other dietary factors)

- these factors may influence the risk of breast cancer

Trying to control for confounding

- measure potential confounders, e.g. measure weight and physical activity

- “control” for possible confounders in analysis

- but…what about confounding with variables we don’t know exist or can’t measure?

Observational Studies

An observational study observes individuals and measures variables of interest but does not attempt to influence the responses.

Association between an explanatory variable and a response variable, even if it is very strong, is not good evidence of a cause and effect link between the variables

Correlation is not causation


Randomized Experiments

- impose treatment and observe response

- subjects/animals randomly assigned to treatments and control

- randomization should result in groups that are similar with respect to any possible confounding variables

- difference in outcome must be due to treatment (OR the play of chance in random assignment)
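Random assignment itself is simple to carry out. The sketch below shuffles a list of subjects and splits it into two equal groups; the subject labels, the function name `randomize` and the 50:50 split are illustrative assumptions, not a description of any particular study.

```python
import random


def randomize(subjects, seed=None):
    """Randomly split subjects into treatment and control groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical subject labels
subjects = [f"subject_{i:02d}" for i in range(1, 21)]
groups = randomize(subjects, seed=1)
print(len(groups["treatment"]), "treated,", len(groups["control"]), "controls")
```

Because chance alone decides who lands in which group, neither the investigator nor the subject can steer a potential confounder toward one arm.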


Basic principles of experimental design

1. Formulate question/goal in advance

2. Comparison/control

3. Replication

4. Randomization

5. Stratification (or blocking)

6. Factorial experiments


Replication

Randomized Design

Dietary fat and mammary tumors in Sprague-Dawley rats (n=30 per diet group)

Jackson et al. Nutr. Cancer, 1998

Stratification

• Suppose that some measurements will be made in males and females
AND
• You anticipate a difference in responses between males and females

– Randomize within males and females separately
- any systematic difference by sex is removed
- this is sometimes called “blocking”.

– Take account of the difference between males and females in analysis:
- helps control variability
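Randomizing within males and females separately can be sketched as follows. The subjects, the `stratum_of` callback and the even split within each stratum are hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def stratified_randomize(subjects, stratum_of, seed=None):
    """Randomize to treatment/control separately within each stratum (block)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for subject in subjects:
        strata[stratum_of(subject)].append(subject)

    assignment = {}
    for members in strata.values():
        rng.shuffle(members)  # shuffle within the stratum only
        half = len(members) // 2
        for subject in members[:half]:
            assignment[subject] = "treatment"
        for subject in members[half:]:
            assignment[subject] = "control"
    return assignment


# Hypothetical subjects: (sex, id) pairs, six males and six females
subjects = [("M", i) for i in range(6)] + [("F", i) for i in range(6)]
assignment = stratified_randomize(subjects, stratum_of=lambda s: s[0], seed=1)

for sex in ("M", "F"):
    treated = sum(1 for s, arm in assignment.items() if s[0] == sex and arm == "treatment")
    print(sex, "treated:", treated)
```

Each sex contributes the same number of subjects to each arm, so sex cannot differ systematically between treatment and control.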

Randomization and stratification

• If you can (and want to), fix a variable.
– e.g., study only men or women or a single strain of animal

• If you don’t fix a variable, stratify on it.
– e.g., randomize treatment within men and women

• If you can neither fix nor stratify a variable, randomize to treatment.

Factorial Experiment

Dietary fat and fiber and mammary tumors in Sprague-Dawley rats (n=30)
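A factorial design crosses every level of one factor with every level of the other. The sketch below lays out a 2 x 2 fat-by-fiber design and randomizes animals to the four diets. The factor levels, the animal labels and the assumption of 30 animals per diet are hypothetical; the slide does not give the actual design details.

```python
import random
from itertools import product

# Hypothetical factor levels (not taken from the study)
fat_levels = ["low fat", "high fat"]
fiber_levels = ["low fiber", "high fiber"]

# Every combination of fat level and fiber level is one diet
diets = list(product(fat_levels, fiber_levels))


def assign_factorial(animals, diets, seed=None):
    """Shuffle the animals, then deal them out so each diet gets an equal share."""
    rng = random.Random(seed)
    shuffled = list(animals)
    rng.shuffle(shuffled)
    return {diet: shuffled[i::len(diets)] for i, diet in enumerate(diets)}


# Assumption: 30 animals per diet, so 120 animals in total
animals = [f"rat_{i:03d}" for i in range(120)]
design = assign_factorial(animals, diets, seed=1)

for diet, group in design.items():
    print(diet, len(group))
```

Crossing the factors lets the same animals inform the effect of fat, the effect of fiber, and whether the two interact.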

Randomized Clinical Trials in Humans - Dietary Fat and Breast Cancer

• Diet and Breast Cancer Prevention Study
• 4793 high risk women followed for 7-17 years
(not yet published)

• Women’s Health Initiative (US)
• 48,835 postmenopausal women followed for 8-12 years
• reported in 2006

Eligible Subjects Identified (> 50% density)

Prerandomization Assessment

Intervention (n=2,343) and Control (n=2,350)

Annual Visits
• demo/anthro data
• diet records
• non fasting serum

Follow up until Dec 2005 (7-17 years per subject): breast cancer incidence


Women’s Health Initiative

- Postmenopausal women (50-79 years of age)

- n=48,835; follow-up 8-12 years

- randomized 40:60 intervention and control

- group dietary counselling

- follow up for breast cancer

Kaplan-Meier Estimates of the Cumulative Hazard for Invasive Breast Cancer

[Figure not reproduced. Source: Prentice, R. L. et al. JAMA 2006;295:629-642.]

Randomized Clinical Trials in Humans

Practical Issues:

- long (particularly for cancer outcomes!)

- expensive

- limited in “treatment” options

Randomized Clinical Trials in Humans

Other issues:

- highly selected subjects
- selection criteria and motivation

- subject/investigator blinding

- subjects drop out

- compliance?

Main Points

- primary interest is causal relationships between variables

- observational studies show associations only

- randomized studies best for causation but are not without challenges

- totality of evidence important


What’s in the dataset?

What are the observations (individuals)?
E.g. people, animals, cells, countries

How many observations are in the dataset?

How many observations should there be?

Are the observations independent?
- repeated in an individual?

What’s in the dataset?

What are the variables?

What is their exact definition?

How were they measured?

What are the units of measurement?

What type of variables?

Main Types of Variables

Categorical:
- include nominal and dichotomous variables
- qualitative difference between values
- e.g. sex (male/female), smoker/non smoker

Continuous:
- quantitative
- equal distance between each value
- e.g. blood pressure, age, dietary fat

Ordinal variables can be ordered but they do not have specific numeric values, e.g. scales, ratings


Continuous Variables

Stem and Leaf Plots

- displays distribution of small/moderate amounts of data
- includes the actual numerical values

Example data: Blood pressure data in 21 patients

107 110 123 129 112 111 107 112 136 102
123 109 112 102 98 114 119 112 110 117 130

 9 : 8
10 : 22779
11 : 0012222479
12 : 339
13 : 06

Stem (all but last digit)

Leaf (last digit)

Stem and Leaf Plot

Blood Pressure Data:

107 110 123 129 112 111 107 112 136 102
123 109 112 102 98 114 119 112 110 117 130

Stem

 9 :
10 :
11 :
12 :
13 :

Add leaves

 9 : 8
10 : 7 7 2 9 2
11 : 0 2 1 2 2 4 9 2 0 7
12 : 3 9 3
13 : 6 0

Order leaves

 9 : 8
10 : 2 2 7 7 9
11 : 0 0 1 2 2 2 2 4 7 9
12 : 3 3 9
13 : 0 6
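The same construction can be sketched in a few lines of Python, using the 21 blood pressure values from the slide. The function name `stem_and_leaf` is an illustrative choice, and the sketch assumes non-negative whole numbers with the last digit as the leaf.

```python
from collections import defaultdict

# Blood pressure data in 21 patients (from the slide)
bp = [107, 110, 123, 129, 112, 111, 107, 112, 136, 102,
      123, 109, 112, 102, 98, 114, 119, 112, 110, 117, 130]


def stem_and_leaf(values):
    """Group values by stem (all but the last digit); leaves are the last digits, in order."""
    plot = defaultdict(list)
    for value in sorted(values):  # sorting first means the leaves come out ordered
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)


plot = stem_and_leaf(bp)
for stem in range(min(plot), max(plot) + 1):
    leaves = "".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem:>2} : {leaves}")
```

Looping over every stem from the smallest to the largest keeps empty stems visible, which preserves the shape of the distribution.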

Frequency Histograms

- like a stem plot but leaves (individual data points) are not distinguished
- usually plotted horizontally

How to make a histogram?

1. Divide data into classes of equal width.
2. Count the number in each class.
3. Plot bars with heights proportional to number or percent of data points in each interval.
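The three steps for making a histogram can be sketched as below, using the blood pressure data from the earlier slides. The function name `histogram_counts`, the starting point of 90 and the text bars are illustrative assumptions; the sketch expects whole-number values no smaller than `start`.

```python
# Blood pressure data in 21 patients (from the slides)
bp = [107, 110, 123, 129, 112, 111, 107, 112, 136, 102,
      123, 109, 112, 102, 98, 114, 119, 112, 110, 117, 130]


def histogram_counts(values, start, width):
    """Count how many values fall in each class [start + i*width, start + (i+1)*width)."""
    n_classes = (max(values) - start) // width + 1
    counts = [0] * n_classes
    for value in values:
        counts[(value - start) // width] += 1
    return counts


# Steps 1 and 2: classes of width 10 starting at 90, then count
counts = histogram_counts(bp, start=90, width=10)

# Step 3: bars with length proportional to the count
for i, count in enumerate(counts):
    low = 90 + i * 10
    print(f"{low:>3}-{low + 9:<3} | {'#' * count}")
```

With width 10 the classes match the stems of the stem and leaf plot. Calling `histogram_counts(bp, start=90, width=5)` splits each class in two and gives a more detailed but more ragged picture of the same data.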


Similarity of Histogram and Stem Leaf Plot

Blood Pressure Data: n= 21 measurements

 9 : 8
10 : 2 2 7 7 9
11 : 0 0 1 2 2 2 2 4 7 9
12 : 3 3 9
13 : 0 6


Effect of Using Different Intervals

Blood Pressure Data: n= 21 measurements


Describing Distributions with Numbers

Blood Pressure Data: n= 21 measurements

98 102 102 107 107 109 110 110 111 112 112
112 112 114 117 119 123 123 129 130 136

mean = 2395/21 = 114

median = observation 11 = 112
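These two summaries can be checked with Python's standard `statistics` module. This is a small sketch using the 21 blood pressure values from the slide; the variable names are illustrative.

```python
import statistics

# Blood pressure data, n = 21 (from the slide)
bp = [98, 102, 102, 107, 107, 109, 110, 110, 111, 112, 112,
      112, 112, 114, 117, 119, 123, 123, 129, 130, 136]

total = sum(bp)                    # numerator of the mean
mean_bp = statistics.mean(bp)      # total / n
median_bp = statistics.median(bp)  # middle (11th) of the 21 ordered values

print(f"sum = {total}, n = {len(bp)}")
print(f"mean = {mean_bp:.0f}")
print(f"median = {median_bp}")
```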

Mean versus Median - skewed data

2 8 15 3 29 5 8 1 20 17 6 5 31 44 10 12 23 62

Stem Plot

0: 12355688
1: 0257
2: 039
3: 1
4: 4
5:
6: 2

Mean = 16.7

Median = 11

Remove highest observation (62): mean = 14.1, median = 10
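The sensitivity of the mean to a single extreme value can be reproduced directly. This is a minimal sketch using the 18 values from the slide and Python's standard `statistics` module; the variable names are illustrative.

```python
import statistics

# Skewed data from the slide (n = 18)
data = [2, 8, 15, 3, 29, 5, 8, 1, 20, 17, 6, 5, 31, 44, 10, 12, 23, 62]

# Same data with the highest observation (62) removed
trimmed = [value for value in data if value != max(data)]

for label, values in (("all data", data), ("without 62", trimmed)):
    mean = statistics.mean(values)
    median = statistics.median(values)
    print(f"{label}: mean = {mean:.1f}, median = {median}")
```

Dropping one observation moves the mean by about 2.6 units but the median by only 1, which is why the median is the more resistant summary for skewed data.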

BP data; n = 10

100 102 104 105 106 112 114 115 116 125

Min  Q1   Median  Q3   Max
100  104  109     115  125
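One way to arrive at these five numbers is to take the quartiles as the medians of the lower and upper halves of the ordered data. The sketch below assumes that convention (statistical software uses several slightly different quartile definitions, so other tools may give slightly different Q1 and Q3); the function name `five_number_summary` is illustrative.

```python
import statistics

# BP data, n = 10 (from the slide)
bp10 = [100, 102, 104, 105, 106, 112, 114, 115, 116, 125]


def five_number_summary(values):
    """Min, Q1, median, Q3, max, with quartiles taken as medians of the two halves."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    # For odd n the overall median is left out of both halves
    upper = xs[half:] if n % 2 == 0 else xs[half + 1:]
    return (xs[0], statistics.median(lower), statistics.median(xs),
            statistics.median(upper), xs[-1])


summary = five_number_summary(bp10)
print("Min, Q1, Median, Q3, Max:", summary)
```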

[Figure: labelled boxplot. The box runs from the 25% quantile to the 75% quantile (the IQR) with a line at the median; the whiskers reach at most 1.5 x IQR beyond each end of the box. Everything above or below is considered an outlier.]


Measures of Spread

- range of data set: largest - smallest value

- interquartile range (IQR): 3rd minus 1st quartile

- sample variance and standard deviation
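The three measures of spread can be computed as in the sketch below, which uses the 21 blood pressure values from the earlier slides. The quartiles are taken as medians of the lower and upper halves, which is one of several conventions and is an assumption here; `statistics.variance` and `statistics.stdev` use the sample (n - 1) denominator.

```python
import statistics

# Blood pressure data, n = 21 (from the earlier slides)
bp = [98, 102, 102, 107, 107, 109, 110, 110, 111, 112, 112,
      112, 112, 114, 117, 119, 123, 123, 129, 130, 136]

xs = sorted(bp)
n = len(xs)
half = n // 2

# Range: largest minus smallest value
data_range = xs[-1] - xs[0]

# IQR: 3rd minus 1st quartile (quartiles as medians of the two halves;
# for odd n the overall median is left out of both halves)
q1 = statistics.median(xs[:half])
q3 = statistics.median(xs[half:] if n % 2 == 0 else xs[half + 1:])
iqr = q3 - q1

# Sample variance and standard deviation (divide by n - 1)
variance = statistics.variance(xs)
std_dev = statistics.stdev(xs)

print(f"range = {data_range}, IQR = {iqr}")
print(f"variance = {variance:.1f}, standard deviation = {std_dev:.1f}")
```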


Deviation from the Mean


Extreme Observations or Outliers

- rule of thumb 1.5 x IQR for potential outliers

- observations that stand apart from the overall pattern (not just extreme values)

- do not automatically delete outliers

- try to explain them

- an error in measurement or in recording data
- an unusual occurrence

- describe outliers, what you do with them and what their effect is
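The 1.5 x IQR rule of thumb can be sketched as a small function that flags, but does not delete, potential outliers. The example reuses the skewed data set from the mean versus median slide; the function name `iqr_outliers` and the median-of-halves quartile convention are assumptions made for illustration.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    q1 = statistics.median(xs[:half])
    q3 = statistics.median(xs[half:] if n % 2 == 0 else xs[half + 1:])
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    return [x for x in xs if x < low_fence or x > high_fence]


# Skewed data from the mean versus median slide
data = [2, 8, 15, 3, 29, 5, 8, 1, 20, 17, 6, 5, 31, 44, 10, 12, 23, 62]
flagged = iqr_outliers(data)
print("potential outliers:", flagged)
```

The function only reports candidates; deciding what to do with them still requires checking the measurement, the data entry and the biological plausibility.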

Energy expenditure in 29 women measured by doubly labelled water (MJ per day).

Stem Leaf          #  Boxplot
  18 9             1     0
  16
  14
  12 0258          4     |
  10 244579        6  +-----+
   8 1122447839   10  *--+--*
   6 5886689       7  +-----+
   4 6             1     |
     ----+----+----+----+

1.5 x 3.5 (IQR) = 5.25

75th percentile (11.46) + 5.25 = 16.71

19.9 MJ


What did we do about the outlier?

- checked recording/calculations/data entry

- unusual occurrence?

- biologically plausible?

- re-measured laboratory samples

- analysis with and without outlier

- described all of the above in the paper


Data Relationships

Schematic Plots

[Figure: side-by-side schematic (box) plots of % dietary fat, vertical axis 5 to 45, for GROUP 1 and GROUP 2 (Intervention, Control)]

Dietary fat intake in the intervention and control groups (n=150 intervention and 187 control)


Dot Plot


How to Display Data Badly

H Wainer (1984) How to display data badly. American Statistician 38(2):137-147 - posted at website

- use of Microsoft Excel and PowerPoint has resulted in remarkable advances in the field (of poor data display)

General principles

The aim of good data graphics:

Display data accurately and clearly.

Some rules for displaying data badly:

– Display as little information as possible.
– Obscure what you do show (with chart junk).
– Use pseudo-3d and color gratuitously.
– Make a pie chart (preferably in color and 3d).
– Use a poorly chosen scale.


Pay attention to scale!

Same data, different scale

Displaying data well

• Be accurate and clear.

• Let the data speak.
– Show as much information as possible, taking care not to obscure the message.

• Science not sales.
– Avoid unnecessary frills, esp. gratuitous 3d.

• In tables, every digit should be meaningful.

Further reading – Data Display

• ER Tufte, http://www.edwardtufte.com/tufte/
– (1983) The visual display of quantitative information.
– (1990) Envisioning information.
– (1997) Visual explanations.

• WS Cleveland (1993) Visualizing data. Hobart Press.

• WS Cleveland (1994) The elements of graphing data. CRC Press.