Last Time • T Distribution – Confidence Intervals – Hypothesis tests • Relationships Between Variables – Scatterplots (visualization) • Aspects of Relations – Form – Direction – Strength

Last Time

• T Distribution

– Confidence Intervals

– Hypothesis tests

• Relationships Between Variables

– Scatterplots (visualization)

• Aspects of Relations

– Form

– Direction

– Strength

Reading In Textbook

Approximate Reading for Today’s Material:

Pages 101-105, 447-465, 511-516

Approximate Reading for Next Class:

Pages 110-135, 560-574

Scatterplot E.g.

Class Example 16:

How does HW score predict Final Exam?

$x_i$ = HW, $y_i$ = Final Exam

i. In top half of HW scores: Better HW ⇒ Better Final

Important Aspects of Relations

I. Form of Relationship

II. Direction of Relationship

III. Strength of Relationship

I. Form of Relationship

• Linear: Data approximately follow a line

Previous Class Scores Example:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg16.xls

Final vs. High values of HW is “best”

• Nonlinear: Data follow a different pattern

Nice Example: Bralower’s Fossil Data

http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg17.xls

Bralower’s Fossil Data

http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg17.xls

From T. Bralower, formerly of Geological Sci.

Studies Global Climate, millions of years ago

II. Direction of Relationship

• Positive Association

X bigger ⇒ Y bigger

• Negative Association

X bigger ⇒ Y smaller

Note: Concept doesn’t always apply:

Bralower’s Fossil Data

III. Strength of Relationship

Idea: How close are points to lying on a line?

Revisit Class Scores Example:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg16.xls

Comparing Scatterplots

Additional Useful Visual Tool:

• Overlaying multiple data sets

• Allows comparison

• Use different colors or symbols

• Easy in Excel (colors are automatic)

Comparing Scatterplots HW

HW:

2.21, 2.25

III. Strength of Relationship

Idea: How close are points to lying on a line?

Now get quantitative

Section 2.2: Correlation

Main Idea: Quantify Strength of Relationship

Context:

– A numerical summary

– In spirit of mean and standard deviation

– But now applies to pairs of variables

Section 2.2: Correlation

Main Idea: Quantify Strength of Relationship

Specific Goals:

– Near 1: for positive relationship & nearly linear

– > 0: for positive relationship (slopes up)

– = 0: for no relationship

– < 0: for negative relationship (slopes down)

– Near -1: for negative relationship & nearly linear

Correlation - Approach

Numerical Approach:

for $(x_i, y_i)$ symmetric around $(0, 0)$,

$\sum_{i=1}^{n} x_i y_i$ has similar properties

Worked out Example:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg18-new.xls

Correlation – Graphical View

Plots (a) & (b): illustrating $\sum_{i=1}^{n} x_i y_i$:

• > 0 for positive relationship

• < 0 for negative relationship

Correlation – Graphical View

Plots (a) & (b): illustrating $\sum_{i=1}^{n} x_i y_i$:

• Bigger for data closer to line

Correlation – Graphical View

But not all goals are satisfied

Correlation – Graphical View

Problem 1: Not between -1 & 1

Correlation – Graphical View

Problem 2: Feels “Scale”, see plot (c)

(just a 1/10 vertical rescaling)

($\sum_{i=1}^{n} x_i y_i$ feels factor of 1/10)

Correlation – Graphical View

Problem 3: Feels “Shift” even more, see (d)

(even gets sign wrong!)

• Data trend upwards

• But $\sum_{i=1}^{n} x_i y_i$ < 0
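The three problems with the raw sum of products can be checked numerically. What follows is a minimal Python sketch, not the course spreadsheet: the data values and the helper name `cross_sum` are made up for illustration, and the particular shift applied is an assumption rather than the one behind plot (d).

```python
def cross_sum(xs, ys):
    """Raw sum of products: the sum over i of x_i * y_i."""
    return sum(x * y for x, y in zip(xs, ys))


# Hypothetical data, roughly symmetric around (0, 0) and trending upward.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-1.9, -1.1, 0.2, 0.8, 2.0]

# Problem 1: the raw sum is not confined to the interval [-1, 1].
base = cross_sum(xs, ys)

# Problem 2: shrinking y by a factor of 10 shrinks the sum by the same factor.
rescaled = cross_sum(xs, [y / 10 for y in ys])

# Problem 3: moving the point cloud away from the origin can flip the sign,
# even though the data still trend upward.
shifted = cross_sum([x - 10 for x in xs], [y + 10 for y in ys])

print(base, rescaled, shifted)
```

With these illustrative numbers the shifted sum is negative although the trend has not changed, which is the behavior that motivates standardizing.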

Correlation - Approach

Solution to above problems:

Standardize!

Define Correlation

$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right) \left(\frac{y_i - \bar{y}}{s_y}\right)$$
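The standardizing idea can be written out directly. This is a minimal sketch, assuming that $s_x$ and $s_y$ are the usual sample standard deviations (divisor $n-1$) and that the standardized products are averaged with the same divisor; the function name `correlation` and the test data are illustrative only.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Correlation r as the (n - 1)-averaged product of standardized values."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (divisor n - 1)
    total = sum(
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    )
    return total / (n - 1)


# Points exactly on an upward line, then exactly on a downward line.
r_up = correlation([1, 2, 3, 4], [2, 4, 6, 8])
r_down = correlation([1, 2, 3, 4], [8, 6, 4, 2])
print(r_up, r_down)
```

Subtracting the means removes the effect of shift, and dividing by the standard deviations removes the effect of scale.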

Correlation - Example

Revisit above example:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg18-new.xls

• r is always the same, and ≈ 1, for (a), (c), (d)

• r < 0, and not so close to -1, for (b)
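That r does not change under rescaling or shifting can be illustrated with a short Python sketch. The data here are hypothetical stand-ins, not the values in the spreadsheet, and `correlation` is an illustrative helper that assumes sample standard deviations with divisor n - 1.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Average product of standardized x and y values (divisor n - 1)."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum(
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    )
    return total / (len(xs) - 1)


# Hypothetical upward-trending data.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-1.9, -1.1, 0.2, 0.8, 2.0]

r_original = correlation(xs, ys)
r_rescaled = correlation(xs, [y / 10 for y in ys])          # vertical rescaling
r_shifted = correlation([x - 10 for x in xs],
                        [y + 10 for y in ys])               # shift of the cloud

print(r_original, r_rescaled, r_shifted)
```

All three values agree up to floating-point rounding, in contrast to the raw sum of products.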

Correlation - Example

Revisit Class Scores Example:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg16.xls

Final Exam vs. HW

Correlation = r = 0.73

Strongest Dependence

Correlation - Example

Revisit Class Scores Example:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg16.xls

MT1 vs. HW

Correlation = r = 0.65

Weaker Dependence

Correlation - Example

Revisit Class Scores Example:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg16.xls

MT2 vs. MT1

Correlation = r = 0.57

Weakest Dependence

Correlation - Example

Revisit Class Scores Example:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg16.xls

• r is always > 0

• r is biggest for Final vs. HW

(visually strongest relationship)

• r is smallest for MT2 vs. MT1

(visually weakest relationship)

Correlation – Computation

From Class Scores Example:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg16.xls

Use Excel function: CORREL

• Range of Xs

• Range of Ys

• Output is correlation, r

Correlation - Example

Fun Example from Publisher’s Website:

http://courses.bfwpub.com/ips6e.php

Choose

• Statistical Applets

• Correlation and Regression

Gives feeling for how correlation is affected by changing data.

Correlation - Example

Correlation and Regression Applet

I clicked to put down 2 points

Applet computed correlation, r

Correlation - Example

Correlation and Regression Applet

Applet computed correlation, r

r = -1, since points on line trending down

Correlation - Example

Correlation and Regression Applet

Try several points close to some line

r ≈ -1, since points near line trending down

Correlation - Example

Correlation and Regression Applet

Add more points with goal of r ≈ -0.95

Correlation - Example

Correlation and Regression Applet

Now add a single outlier

Major impact on r:

-0.95 → -0.35

Correlation - Example

Correlation and Regression Applet

Just 2 more outliers

Leads to r > 0

(Outliers have major impact)

Correlation - Example

Correlation and Regression Applet

Weakness of correlation, r:

Measures linear dependence

Correlation - Example

Correlation and Regression Applet

Weakness of correlation, r:

Can have r ≈ 0, yet strong non-linear dependence
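A small numerical case makes this weakness concrete. In the sketch below, y is completely determined by x through a parabola, yet r comes out as zero. The data and the helper `correlation` are illustrative assumptions (sample standard deviations, divisor n - 1), not output from the applet.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Average product of standardized x and y values (divisor n - 1)."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum(
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    )
    return total / (len(xs) - 1)


# x values symmetric around 0; y depends on x exactly, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = correlation(xs, ys)
print(r)
```

The positive products on the right half of the parabola cancel the negative products on the left half, so r sees no linear trend.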

Correlation - HW

HW:

2.31

2.33

2.39a

Correlation - Outliers

Caution:

Outliers can strongly affect correlation, r

HW:

2.39b

2.45
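The caution about outliers can be reproduced without the applet. The following Python sketch uses invented points, so the resulting values of r will not match the applet screenshots; the helper `correlation` assumes sample standard deviations with divisor n - 1.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Average product of standardized x and y values (divisor n - 1)."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum(
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    )
    return total / (len(xs) - 1)


# Hypothetical points lying close to a downward-sloping line.
xs = [1, 2, 3, 4, 5, 6]
ys = [6.1, 4.9, 4.2, 2.8, 2.1, 0.9]
r_clean = correlation(xs, ys)

# Add one hypothetical point far from the rest of the cloud.
r_with_outlier = correlation(xs + [10], ys + [10.0])

print(r_clean, r_with_outlier)
```

A single added point is enough to move r a long way from its original value, which is why a scatterplot should be checked before r is interpreted.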

Research Corner

Relationship between more than 2 variables?

Each data point is $(x_1, x_2, \ldots, x_d)$

Called a “d-tuple”

Research Corner

Relationship between more than 2 variables?

Each data point is $(x_1, x_2, \ldots, x_d)$

E.g.: d = 3 (ordered triple)

(height, weight, age)

(HW, MT1, Final)

Research Corner

Visualization?

What is “scatterplot”?

How can we “see” structure in data?

Research Corner

Visualization?

Explore d = 3 (3d)

So can visualize “point cloud”

Page 104: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

Research Corner

Toy Example, modeling “gene expression”

Multivariate View (sequence of figures):

• Highlight one
• Value of variable 1
• Value of variable 2
• Value of variable 3
• 1-d projection, X-axis
• X – Projection, 1-d View
• 1-d projection, Y-axis
• Y – Projection, 1-d View
• 1-d projection, Z-axis
• Z – Projection, 1-d View
• 2-d Projection XY-plane
• XY – projection, 2-d view
• 2-d Projection XZ-plane
• XZ – projection, 2-d view
• 2-d Projection YZ-plane
• YZ – projection, 2-d view
• All 3 planes

Now collect these views on a single page:

• 1-d projections on diagonal
• 2-d views off diagonal
• switch off color (usual view)

(Useful summary of structure in data)
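
The projections named in these views can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the point cloud below is made up (it is not the toy “gene expression” data), and the helper names `project_1d`, `project_2d` and `view_grid` are hypothetical. The sketch computes the coordinates each panel would show; it does not draw anything.

```python
# Hypothetical 3-d point cloud: each data point is an ordered triple (x, y, z).
points = [(1.0, 2.0, 0.5), (2.0, 1.5, 1.0), (3.0, 3.5, 1.5), (4.0, 3.0, 2.5)]
AXES = {"X": 0, "Y": 1, "Z": 2}


def project_1d(cloud, axis):
    """1-d projection: keep only the coordinate along one axis."""
    k = AXES[axis]
    return [p[k] for p in cloud]


def project_2d(cloud, plane):
    """2-d projection: keep the two coordinates named by the plane, e.g. "XY"."""
    i, j = AXES[plane[0]], AXES[plane[1]]
    return [(p[i], p[j]) for p in cloud]


def view_grid(cloud):
    """Collect all views in a 3 x 3 grid keyed by (row axis, column axis):
    1-d projections on the diagonal, 2-d projections off the diagonal."""
    names = ["X", "Y", "Z"]
    grid = {}
    for r in names:
        for c in names:
            if r == c:
                grid[(r, c)] = project_1d(cloud, r)
            else:
                # column axis is plotted horizontally, row axis vertically
                grid[(r, c)] = project_2d(cloud, c + r)
    return grid
```

Each off-diagonal cell holds the pairs that one 2-d scatterplot would display, so the grid has the same layout as the single-page summary described above.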

Page 127: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

2 Sample Inference

Main Idea:

• Previously studied single populations

• Modeled as:
– Measurement Error: $\bar{X} \sim N\left(\mu, \sigma/\sqrt{n}\right)$
– Counts: $X \sim Bi(n, p)$, $\hat{p} \sim N\left(p, \sqrt{p(1-p)/n}\right)$

• Did Inference:
– Confidence Intervals
– Hypothesis Tests

Page 135: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

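2 Sample Inference

Main Idea:

• Extend to two populations

• Modeled as:
– Measurement Error: $\bar{X}_1 \sim N\left(\mu_1, \sigma_1/\sqrt{n_1}\right)$, $\bar{X}_2 \sim N\left(\mu_2, \sigma_2/\sqrt{n_2}\right)$
– Counts: $X_1 \sim Bi(n_1, p_1)$, $X_2 \sim Bi(n_2, p_2)$

• Will Develop Inference:
– Confidence Intervals
– Hypothesis Tests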

Page 136: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

2 Sample Inference

Location in Text:

• Measurement Error: $\bar{X}_1 \sim N\left(\mu_1, \sigma_1/\sqrt{n_1}\right)$, $\bar{X}_2 \sim N\left(\mu_2, \sigma_2/\sqrt{n_2}\right)$
– Sec. 7.1
– Sec. 7.2

• Counts: $X_1 \sim Bi(n_1, p_1)$, $X_2 \sim Bi(n_2, p_2)$
– Sec. 8.2

Page 139: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

2 Sample Measurement Error

Easy Case: Paired Differences

Have Treatment 1: $X_1, X_2, \ldots, X_n$

Treatment 2: $Y_1, Y_2, \ldots, Y_n$

Important: Measurements Connected,

e.g. made on same objects

Page 145: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

2 Sample Measurement Error

Easy Case: Paired Differences

Have Treatment 1: $X_1, X_2, \ldots, X_n$

Treatment 2: $Y_1, Y_2, \ldots, Y_n$

Approach: Apply 1 sample methods to:

$D_i = X_i - Y_i, \quad i = 1, \ldots, n$
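
As a rough illustration of the paired-differences approach, the sketch below reduces two paired samples to their differences and then applies the usual 1-sample formulas. The data values are invented for the example (they are not from any class spreadsheet), and the function name `paired_t_statistic` is hypothetical.

```python
import math
import statistics

# Hypothetical paired measurements: x[i] and y[i] are made on the same object i.
x = [45.0, 47.0, 39.0, 52.0, 44.0, 50.0]
y = [38.0, 43.0, 36.0, 44.0, 41.0, 42.0]


def paired_t_statistic(x, y):
    """Reduce paired samples to differences, then use 1-sample methods."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    d = [xi - yi for xi, yi in zip(x, y)]   # D_i = X_i - Y_i
    n = len(d)
    d_bar = statistics.mean(d)              # sample average difference
    s_d = statistics.stdev(d)               # sample SD of the differences (n - 1 divisor)
    t = d_bar / (s_d / math.sqrt(n))        # 1-sample t statistic for H0: mu_D = 0
    return d_bar, s_d, t, n - 1             # last entry: degrees of freedom


d_bar, s_d, t, df = paired_t_statistic(x, y)
```

The pairing matters: the standard deviation is taken over the differences, not over the two samples separately.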

Page 147: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

2 Paired Samples

E.g. Old Textbook 7.32:

Researchers studying Vitamin C in a product were concerned about loss of Vitamin C during shipment and storage. They marked a collection of bags at the factory, and measured the Vitamin C. 5 months later, in Haiti, they found the same bags, and again measured the Vitamin C.

Page 150: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

2 Paired Samples

E.g. Old Textbook 7.32:

The data are the two Vitamin C measurements, made for each bag.

Available in Class Example 15:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg15.xls

• Factory: Cells B38:B64
• Haiti: Cells C38:C64

Page 155: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

2 Paired Samples

E.g. Old Textbook 7.32:

The data are the two Vitamin C measurements, made for each bag.

a. Set up hypotheses to examine the question of interest.

b. Perform the significance test, and summarize the result.

c. Find 95% CIs for the factory mean, and the Haiti mean, and the mean change.

Page 159: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

2 Paired Samples

E.g. Old Textbook 7.32:

a. Sample average difference: $\bar{D} = -5.33$

Computed as: $D_i = X_i - Y_i, \quad i = 1, \ldots, n$

Then average

Page 163: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

2 Paired Samples

E.g. Old Textbook 7.32:

a. Sample average difference: $\bar{D} = -5.33$

Some evidence factory is bigger,

is it strong evidence???

Let $\mu_D$ = Difference: Haiti – Factory

$H_0: \mu_D = 0$

$H_A: \mu_D < 0$

(1-sided, from “idea of loss”)

Page 166: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

2 Paired Samples

E.g. Old Textbook 7.32:

b. P-value $= P\{\bar{D} = -5.33 \text{ or m.c.} \mid \mu_D = 0\}$

$= P\{\bar{D} \le -5.33 \mid \mu_D = 0\}$

$= P\left\{ \frac{\bar{D}}{s_D/\sqrt{n}} \le \frac{-5.33}{s_D/\sqrt{n}} \right\}$

$= P\left\{ t_{n-1} \le \frac{-5.33}{s_D/\sqrt{n}} \right\}$

Page 170: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

2 Paired Samples

E.g. Old Textbook 7.32:

b. P-value $= P\left\{ t_{n-1} \le \frac{-5.33}{s_D/\sqrt{n}} \right\}$

But recall how TDIST works (1 – tail: upper probability only), so use symmetry.

So compute with:

P-value $= P\left\{ t_{n-1} \ge \frac{5.33}{s_D/\sqrt{n}} \right\}$
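
The symmetry step can be checked numerically. The sketch below is an assumption-laden stand-in for a 1-tail t routine, not Excel's implementation: the function names are hypothetical, and the upper-tail probability is obtained by Simpson's-rule integration of the t density using only the Python standard library.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def t_upper_tail(a, df, steps=2000):
    """P{t_df >= a} for a >= 0: 0.5 minus the Simpson's-rule integral
    of the density over [0, a] (steps must be even)."""
    if a < 0:
        raise ValueError("upper-tail routine expects a non-negative argument")
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 - total * h / 3


def lower_tail_p_value(t_stat, df):
    """P{t_df <= t_stat}. For a negative statistic, symmetry turns the
    lower tail into an upper tail: P{t <= -a} = P{t >= a}."""
    if t_stat <= 0:
        return t_upper_tail(-t_stat, df)
    return 1.0 - t_upper_tail(t_stat, df)
```

With a negative observed statistic, `lower_tail_p_value(t_stat, n - 1)` simply calls the upper-tail routine at the positive value, which mirrors the sign flip in the formula.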

Page 174: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

2 Paired Samples

E.g. Old Textbook 7.32:

b. Excel Computation:

Class Example 15, Part 3:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg15.xls

• Standard deviation of differences, $s_D$
• P-value = $1.87 \times 10^{-5}$

Interpretation: very strong evidence

(either yes-no or gray level)

Page 180: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

2 Paired Samples

Variations:

1. EXCEL function TTEST is useful here

(same answer as above)

Notes:

a. Type is paired (discuss others later)

b. Get same answer from swapping Array 1 and Array 2

c. This is something Excel does well

Page 187: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

2 Paired Samples

Variations:

2. Can also use:

Data → Data Analysis → T-test Paired

to give detailed results

e.g. d.f. = 26

P-value same

(others we haven’t learned yet)

Page 192: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

2 Paired Samples

E.g. Old Textbook 7.32:

c. Confidence Intervals

See Class Example 15, Part 3c:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg15.xls

Margin of error: $m = \mathrm{TINV}(0.05,\, n-1) \cdot \frac{s}{\sqrt{n}}$

(same as above, but NORMINV → TINV)

So CI has endpoints: $\bar{X} \pm m$
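
A minimal sketch of the margin-of-error computation follows, assuming the two-tailed critical value is the number t* with P{|t| > t*} = 0.05. The helper `t_two_tailed_inverse` is a hypothetical stand-in for that role (found here by bisection on a numerically integrated t density), not Excel's own algorithm, and `t_confidence_interval` is likewise an illustrative name.

```python
import math
import statistics


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def t_upper_tail(a, df, steps=2000):
    """P{t_df >= a} for a >= 0, via Simpson's rule on [0, a]."""
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 - total * h / 3


def t_two_tailed_inverse(prob, df):
    """Find t* > 0 with P{|t_df| > t*} = prob, by bisection."""
    lo, hi = 0.0, 1.0
    while 2 * t_upper_tail(hi, df) > prob:   # widen until the root is bracketed
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if 2 * t_upper_tail(mid, df) > prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_confidence_interval(data, level=0.95):
    """Endpoints mean -/+ m, with m = t* times s / sqrt(n)."""
    n = len(data)
    t_star = t_two_tailed_inverse(1 - level, n - 1)
    m = t_star * statistics.stdev(data) / math.sqrt(n)
    center = statistics.mean(data)
    return center - m, center + m
```

For the mean change, the same function would be applied to the list of differences; for the factory and Haiti means, to each column on its own.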

Page 195: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

Paired Sampling CIs & Tests

HW:

7.33, 7.35, 7.39

Interpret P-values: (i) yes-no (ii) gray-level

(suggestion: use TTEST, for hypo tests)

Page 196: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

And now for something completely different…

Does the statement, “We've always done it like that” ring any bells?

The US standard railroad gauge (distance between the rails) is 4 feet, 8.5 inches.

That's an exceedingly odd number.

Why was that gauge used?

Because that's the way they built them in England, and English expatriates built the US railroads.

Why did the English build them like that?

Because the first rail lines were built by the same people who built the pre-railroad tramways, and that's the gauge they used.

Why did "they" use that gauge then?

Because the people who built the tramways used the same jigs and tools that they used for building wagons, which used that wheel spacing.

Okay! Why did the wagons have that particular odd wheel spacing?

Well, if they tried to use any other spacing, the wagon wheels would break on some of the old, long distance roads in England, because that's the spacing of the wheel ruts.

So who built those old rutted roads?

Imperial Rome built the first long distance roads in Europe (and England) for their legions. The roads have been used ever since.

And the ruts in the roads?

Roman war chariots formed the initial ruts, which everyone else had to match for fear of destroying their wagon wheels.

Since the chariots were made for Imperial Rome, they were all alike in the matter of wheel spacing.

The United States standard railroad gauge of 4 feet, 8.5 inches is derived from the original specifications for an Imperial Roman war chariot.

And bureaucracies live forever.

So the next time you are handed a specification and wonder what horse's ass came up with it, you may be exactly right, because the Imperial Roman army chariots were made just wide enough to accommodate the back ends of two war horses!

When you see a Space Shuttle sitting on its launch pad, there are two big booster rockets attached to the sides of the main fuel tank.

These are solid rocket boosters, or SRBs.

The SRBs are made by Thiokol at their factory in Utah.

The engineers who designed the SRBs would have preferred to make them a bit fatter, but the SRBs had to be shipped by train from the factory to the launch site.

The railroad line from the factory happens to run through a tunnel in the mountains.

The SRBs had to fit through that tunnel. The tunnel is slightly wider than the railroad track, and the railroad track, as you now know, is about as wide as two horses' behinds.

So, a major Space Shuttle design feature of what is arguably the world's most advanced transportation system was determined over two thousand years ago by the width of a horse's ass.

- And –

you thought being a HORSE'S ASS wasn't important!

Page 209: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

Carolina Course Evaluation

• Please give me your opinions

Most highly sought:

Written suggestions for improvement

• Please fill out with # 2 pencil (black pen?)

• Return to student volunteer

• Will turn in independently from me

• Dept/Course/Section: STOR 155 001

• Instructor: J. S. Marron

Page 210: Last Time T Distribution –Confidence Intervals –Hypothesis tests Relationships Between Variables –Scatterplots (visualization) Aspects of Relations –Form

STOR 155 001, Course ID: 30211211

28. Over the course of the semester, how frequently did you review the audio/screen recordings?

(S/D) Never. I didn't know that they were available.

(D) Never. I decided not to.

(N) Seldom

(A) Sometimes

(S/A) Often

29. Did you review the recordings before taking a test or exam? (S/D) Yes / (S/A) No

30. Did you review the recordings after you missed class? (S/D) Yes / (S/A) No

31. Did you review the recordings when you didn't understand something from class? (S/D) Yes / (S/A) No

32. The recordings were helpful for me as a study aid. (S/D D N A S/A)

33. I was less likely to attend class because I knew I would have access to the lecture materials online. (S/D D N A S/A)