DEPARTMENT OF STATISTICS Stats Questions We Are Often Asked

When can I use r and R²?

When can I make a ‘causal-type’ claim?

Why should I be careful with a media reported margin of error?

When can I say a confidence interval gives support to a claim?

r – little r – what is it?

r is the correlation coefficient between y and x

r measures the strength of a linear relationship

r is a multiple of the slope
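
As a rough illustration of the last point, the sketch below computes r for a small made-up data set and checks that it equals the least-squares slope multiplied by s_x / s_y. The data values and the helper name `pearson_r` are assumptions for illustration, not part of the original slides.

```python
from math import sqrt

def pearson_r(x, y):
    """Correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, not taken from the slides.
x = [1, 2, 3, 4, 5, 6]
y = [1.1, 2.3, 2.8, 4.2, 4.9, 6.1]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((a - mx) ** 2 for a in x)
syy = sum((b - my) ** 2 for b in y)
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))

slope = sxy / sxx               # least-squares slope of y on x
sd_ratio = sqrt(sxx / syy)      # s_x / s_y (the n - 1 divisors cancel)
r = pearson_r(x, y)

print(f"r = {r:.4f}, slope * s_x / s_y = {slope * sd_ratio:.4f}")
```

Because s_x / s_y is always positive, r and the slope always share the same sign.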

r – when can it be used?

Only use r if the scatter plot is linear

Don’t use r if the scatter plot is non-linear!

[Figure: scatter plot of y versus x, labelled r = 0.99]
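
To see why a large r does not by itself justify a straight line, here is a minimal sketch using made-up data that lie exactly on a curve. The data (y = x² for x = 1 to 10) are an assumption for illustration and are not the data behind the slide's plot.

```python
from math import sqrt

# Made-up curved data: y = x^2 for x = 1, ..., 10 (an assumption for illustration).
x = list(range(1, 11))
y = [v ** 2 for v in x]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((a - mx) ** 2 for a in x)
syy = sum((b - my) ** 2 for b in y)
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))

r = sxy / sqrt(sxx * syy)       # about 0.97 despite the curvature

# Residuals from the least-squares line reveal the curve that r hides.
slope = sxy / sxx
intercept = my - slope * mx
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]

print(f"r = {r:.3f}")
print("residuals:", [round(e, 1) for e in residuals])
```

The residuals are positive at both ends and negative in the middle, which is the systematic pattern a scatter plot would show at a glance.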

r – what does it tell you?

How close the points in the scatter plot come to lying on the line

[Figure: two scatter plots of y versus x, one labelled r = 0.99 and the other r = 0.57]

R² – big R² – what is it?

R² is the coefficient of determination

Measures how close the points in the scatter plot come to lying on the fitted line or curve

[Figure: two scatter plots of y versus x]

R² – big R² – when can it be used?

When the scatter plot of y versus x is linear or non-linear

[Figure: two scatter plots of y versus x]

R² – what does it tell you?

[Figure: scatter plot of y versus x, with dotplots of the y’s and of the fitted values ŷ]

Dotplot of the y’s: shows the variation in the y’s

Dotplot of the ŷ’s: shows the variation in the ŷ’s

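[Figure: scatter plot of y versus x, with dotplots of the y’s and of the fitted values ŷ]

Variation in the ŷ’s: this amount of variation can be explained by the model.

We see some additional variation in the y’s. The excess is not explained by the model.

$$R^2 = \frac{\text{Variation in fitted values}}{\text{Variation in } y \text{ values}} = \frac{\text{Variation in } \hat{y}\text{'s}}{\text{Variation in } y\text{'s}}$$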
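
The ratio above can be sketched numerically. The example below fits a least-squares line to a small made-up data set and computes R² as the variation in the fitted values divided by the variation in the y values. The data and the helper name `variation` are assumptions for illustration, and "variation" is taken here to mean the sum of squared deviations from the mean.

```python
# Hypothetical data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((a - mx) ** 2 for a in x)
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))

# Least-squares line and its fitted values (the y-hats).
slope = sxy / sxx
intercept = my - slope * mx
fitted = [intercept + slope * a for a in x]

def variation(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

r_squared = variation(fitted) / variation(y)

# Equivalent form for a least-squares fit with an intercept: 1 - SSE / SST.
sse = sum((b - f) ** 2 for b, f in zip(y, fitted))
r_squared_alt = 1 - sse / variation(y)

print(f"R^2 = {r_squared:.2f} ({r_squared:.0%} of the variation in y)")
```

The two forms agree for a least-squares fit that includes an intercept. For other fitting methods they can differ.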

When expressed as a percentage, R² is the percentage of the variation in Y that our regression model can explain

R² near 100%: model fits well

R² near 0%: model doesn’t fit well

[Figure: scatter plot of y versus x, labelled R² = 90%]

90% of the variation in Y is explained by our regression model.

R² – pearls of wisdom!

R² and r² have the same value ONLY when using a linear model

DON’T use R² to pick your model

Use your eyes!
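
A small sketch of the first pearl, using made-up data that lie exactly on the curve y = x². The data, the helper name `variation`, and the choice of a quadratic as the non-linear model are all assumptions for illustration.

```python
from math import sqrt

def variation(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

# Made-up data lying exactly on the curve y = x^2 (an assumption for illustration).
x = [-2, -1, 0, 1, 2]
y = [4, 1, 0, 1, 4]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
r = sxy / sqrt(variation(x) * variation(y))

# Linear model: least-squares line, R^2 = variation in fitted / variation in y.
slope = sxy / variation(x)
fitted_line = [my + slope * (a - mx) for a in x]
r2_linear = variation(fitted_line) / variation(y)

# Quadratic model: the curve y-hat = x^2 passes through every point, so it is
# the least-squares quadratic for these data.
fitted_curve = [a ** 2 for a in x]
r2_quadratic = variation(fitted_curve) / variation(y)

print(f"r^2 = {r ** 2:.2f}, linear R^2 = {r2_linear:.2f}, quadratic R^2 = {r2_quadratic:.2f}")
```

For the linear model R² and r² coincide, while the quadratic model's R² is unrelated to r.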

R² and Excel & Graphics Calculators

Damaged for life by too much TV

NZ Herald (04/10/2005)

[Figure: scatter plot of Health Score versus TV watching, r = -0.93]

Causal relationship?

Causal relationships

Two general types of studies: experiments and observational studies

In an experiment, the experimenter determines which experimental units receive which treatments.

In an observational study, we simply compare units that happen to have received different levels of the factor of interest.

Only well designed and carefully executed experiments can reliably demonstrate causation.

An observational study is often useful for identifying possible causes of effects, but it cannot reliably establish causation.

Causal relationships - Summary

In observational studies, strong relationships are not necessarily causal relationships.

Correlation does not imply causation.

Be aware of the possibility of lurking variables.
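
The summary points can be illustrated with a simulation. In the sketch below a lurking variable drives both x and y, and x has no direct effect on y, yet the two are strongly correlated. The simulated data, the noise levels, and the helper names `pearson_r` and `residuals` are assumptions for illustration and have no connection to the TV study.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def residuals(y, z):
    """Residuals from the least-squares line of y on z."""
    n = len(z)
    mz, my = sum(z) / n, sum(y) / n
    szy = sum((a - mz) * (b - my) for a, b in zip(z, y))
    szz = sum((a - mz) ** 2 for a in z)
    slope = szy / szz
    return [b - (my + slope * (a - mz)) for a, b in zip(z, y)]

rng = random.Random(0)          # fixed seed so the run is repeatable
n = 2000

# Simulated observational data: the lurking variable drives both x and y;
# x has no direct effect on y at all.
lurking = [rng.gauss(0, 1) for _ in range(n)]
x = [z + 0.3 * rng.gauss(0, 1) for z in lurking]
y = [-z + 0.3 * rng.gauss(0, 1) for z in lurking]

r_xy = pearson_r(x, y)
# Correlation left over once the lurking variable is accounted for.
r_adjusted = pearson_r(residuals(x, lurking), residuals(y, lurking))

print(f"r(x, y) = {r_xy:.2f}; after adjusting for the lurking variable: {r_adjusted:.2f}")
```

In a real observational study the lurking variable is often unmeasured, so this kind of adjustment is not available, which is why a strong r alone cannot establish causation.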

Margin of Error

Sunday Star Times (n = 540): National 44%, Labour 37.2%, NZ First 4.7%; margin of error: 4.4%

Herald on Sunday (n = 400): Labour 42%, National 38.5%, NZ First 5.5%; margin of error: 4.9%

Confidence interval: estimate ± margin of error
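
As a minimal sketch of "estimate ± margin of error", the example below applies the reported 4.9% margin to the two largest parties in the Herald on Sunday poll. The function name `confidence_interval` is an assumption for illustration.

```python
def confidence_interval(estimate, margin):
    """Return (lower, upper) for estimate +/- margin, both in percentage points."""
    return estimate - margin, estimate + margin

# Herald on Sunday figures quoted above (n = 400, margin of error 4.9%).
labour = confidence_interval(42.0, 4.9)
national = confidence_interval(38.5, 4.9)

print(f"Labour:   {labour[0]:.1f}% to {labour[1]:.1f}%")
print(f"National: {national[0]:.1f}% to {national[1]:.1f}%")

# The two intervals overlap when each lower limit is below the other's upper limit.
overlap = labour[0] <= national[1] and national[0] <= labour[1]
print("Intervals overlap:", overlap)
```

The two intervals overlap, which is one reason to be careful about reading a lead into poll figures like these.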

Survey errors: sampling error and nonsampling errors

Sampling error:

caused by the act of sampling

has potential to be bigger in smaller samples

can determine how large it can be – margin of error

unavoidable (price of sampling)

Nonsampling errors:

e.g., nonresponse bias, behavioural, . . .

can be much larger than sampling errors

impossible to correct for after completion of survey

impossible to determine how badly they affect results

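Approx. 95% confidence interval for p:

$$
\begin{aligned}
\hat{p} \pm 1.96\sqrt{\frac{p(1-p)}{n}}
&\approx \hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \\
&\approx \hat{p} \pm 2\sqrt{\frac{0.5 \times 0.5}{n}} \\
&= \hat{p} \pm \frac{1}{\sqrt{n}}
\end{aligned}
$$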

Margin of error (single proportion)

$$\text{Margin of error} \approx \frac{1}{\sqrt{n}} \quad \text{if } 0.3 \le p \le 0.7 \text{ or } 0.3 \le \hat{p} \le 0.7$$
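
The formulas can be checked against the two polls' sample sizes. The sketch below is illustrative only: the function names `margin_of_error` and `quick_margin` are assumptions, and it treats each poll as a simple random sample, which the slides do not state.

```python
from math import sqrt

def margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a single sample proportion."""
    return z * sqrt(p_hat * (1 - p_hat) / n)

def quick_margin(n):
    """Rule-of-thumb margin 1 / sqrt(n), i.e. 2 * sqrt(0.5 * 0.5 / n)."""
    return 1 / sqrt(n)

for n in (400, 540):                        # sample sizes of the two polls
    worst_case = margin_of_error(0.5, n)    # largest value, reached at p-hat = 0.5
    print(f"n = {n}: 1.96 formula {worst_case:.1%}, 1/sqrt(n) {quick_margin(n):.1%}")

# The margin shrinks for proportions far from 0.5, e.g. a party polling 5.5%.
small_party = margin_of_error(0.055, 400)
print(f"p-hat = 5.5%, n = 400: {small_party:.1%}")
```

For n = 400 the 1.96 formula at p̂ = 0.5 reproduces the reported 4.9%. A single reported margin is the worst-case value, so it overstates the sampling error for small parties and says nothing about nonsampling errors.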

Bank Dissatisfaction Scores – 95% CIs

C – A: 0.5 to 20.7

A – W: –9.8 to 6.6

With 95% confidence, the mean dissatisfaction score for Canterbury customers is somewhere between 0.5 and 20.7 larger than the mean dissatisfaction score for Auckland customers.

With 95% confidence, the mean dissatisfaction score for Auckland customers is somewhere between 9.8 less than and 6.6 greater than the mean dissatisfaction score for Wellington customers.

Does the confidence interval for A – W (–9.8 to 6.6) support the proposition that there is a difference between the two population means?

Supports A – W ≠ 0?

No, it doesn’t support the proposition.

Since 0 is in the confidence interval, 0 is a believable value for the difference. There could be no difference between the two means.

A – W = 0 (no diff)

A – W ≠ 0 (a diff)

Does the confidence interval for A – W (–9.8 to 6.6) support the proposition that there is NO difference between the two population means?

Supports A – W = 0?

No, it doesn’t support the proposition.

Since there are non-zero numbers in the interval, A – W could be non-zero; there could be a difference.

A – W = 0 (no diff)

A – W ≠ 0 (a diff)

Does the confidence interval for C – A (0.5 to 20.7) support the proposition that there is a difference between the two population means?

Supports C – A ≠ 0?

Yes, it does support the proposition.

Since zero is not in the interval, it is not believable that the difference is zero. No difference between the means is not believable.

C – A = 0 (no diff)

C – A ≠ 0 (a diff)

Does the confidence interval for C – A (0.5 to 20.7) support the proposition that there is NO difference between the two population means?

Supports C – A = 0?

No, it doesn’t support the proposition.

In fact, it provides evidence against it. 0 is not in the interval. No difference between the means is not believable.

C – A = 0 (no diff)

C – A ≠ 0 (a diff)
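
The reasoning in these four questions can be sketched as a small decision rule. The function name `assess` and the dictionary keys are assumptions for illustration; the two intervals are the 95% CIs quoted in the slides.

```python
def assess(lower, upper):
    """Summarise what a CI for a difference says about 'difference' and 'no difference'."""
    if lower <= 0 <= upper:
        # 0 is believable, but so are non-zero values: neither claim is supported.
        return {"supports_difference": False, "supports_no_difference": False}
    # 0 is not a believable value: the interval gives evidence of a difference.
    return {"supports_difference": True, "supports_no_difference": False}

intervals = {"C - A": (0.5, 20.7), "A - W": (-9.8, 6.6)}    # 95% CIs from the slides
for name, (low, high) in intervals.items():
    print(name, assess(low, high))
```

Neither branch ever sets `supports_no_difference` to true: an interval of non-zero width always contains non-zero values, so it cannot by itself support a claim of exactly no difference.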