Copyright ©2011 Nelson Education Limited
Linear Regression and Correlation
CHAPTER 12
Example
The table shows the math achievement test scores for a random sample of n = 10 college freshmen, along with their final calculus grades.
Student 1 2 3 4 5 6 7 8 9 10
Math test, x 39 43 21 64 57 47 28 75 34 52
Calculus grade, y 65 78 52 82 92 89 73 98 56 75
The ANOVA Table
Source      df     SS        MS           F
Regression  1      SSR       SSR/(1)      MSR/MSE
Error       n - 2  SSE       SSE/(n - 2)
Total       n - 1  Total SS
The Calculus Example:
Source      df  SS         MS         F
Regression  1   1449.9741  1449.9741  19.14
Error       8   606.0259   75.7532
Total       9   2056.0000
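The entries in this table can be reproduced from the test-score data. The following is a minimal sketch in plain Python, assuming the usual computing formulas Total SS = Syy and SSR = Sxy²/Sxx; the function name `anova_simple_regression` is made up for illustration.

```python
# Calculus example data (x = math test score, y = final calculus grade).
x = [39, 43, 21, 64, 57, 47, 28, 75, 34, 52]
y = [65, 78, 52, 82, 92, 89, 73, 98, 56, 75]


def anova_simple_regression(x, y):
    """Return the ANOVA quantities for the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sums of squares and cross-products about the means.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    total_ss = s_yy              # total variation in y, df = n - 1
    ssr = s_xy ** 2 / s_xx       # variation explained by the line, df = 1
    sse = total_ss - ssr         # unexplained variation, df = n - 2

    msr = ssr / 1
    mse = sse / (n - 2)
    return {
        "SSR": ssr,
        "SSE": sse,
        "Total SS": total_ss,
        "MSR": msr,
        "MSE": mse,
        "F": msr / mse,
    }


table = anova_simple_regression(x, y)
for name, value in table.items():
    print(f"{name:>8}: {value:.4f}")
```

The printed values should agree with the table above up to rounding.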
Minitab Output
Regression Analysis: y versus x
The regression equation is
y = 40.8 + 0.766 x

Predictor  Coef    SE Coef  T     P
Constant   40.784  8.507    4.79  0.001
x          0.7656  0.1750   4.38  0.002

S = 8.70363   R-Sq = 70.5%   R-Sq(adj) = 66.8%

Analysis of Variance
Source          DF  SS      MS      F      P
Regression      1   1450.0  1450.0  19.14  0.002
Residual Error  8   606.0   75.8
Total           9   2056.0

Annotations on the output:
• Least squares regression line: y = 40.8 + 0.766 x
• Regression coefficients, a and b: the Coef column (a = 40.784, b = 0.7656)
• To test H0: β = 0: the T and P values in the x row
• $S = \sqrt{\text{MSE}}$
• $F = t^2$
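The relationships S = √MSE and F = t² can be checked numerically against the printed output. This is a sketch in plain Python, assuming the standard error of the slope is S/√Sxx; the variable names are illustrative only.

```python
import math

# Calculus example data (x = math test score, y = final calculus grade).
x = [39, 43, 21, 64, 57, 47, 28, 75, 34, 52]
y = [65, 78, 52, 82, 92, 89, 73, 98, 56, 75]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares coefficients: slope b and intercept a.
b = s_xy / s_xx
a = y_bar - b * x_bar

# Error variance estimate and its square root (Minitab's "S").
ssr = s_xy ** 2 / s_xx
mse = (s_yy - ssr) / (n - 2)
s = math.sqrt(mse)

# t statistic for H0: beta = 0, and the ANOVA F statistic.
se_b = s / math.sqrt(s_xx)
t = b / se_b
f = ssr / mse

print(f"a = {a:.3f}, b = {b:.4f}")
print(f"S = {s:.5f}, SE(b) = {se_b:.4f}")
print(f"t = {t:.2f}, t^2 = {t ** 2:.2f}, F = {f:.2f}")
```

Note that squaring the rounded printed value 4.38 gives about 19.18, not 19.14; the identity holds for the unrounded t.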
Measuring the Strength of the Relationship
• If the independent variable x is useful in predicting y, you will want to know how well the model fits.
• The strength of the relationship between x and y can be measured using:
Correlation coefficient: $r = \dfrac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$
Coefficient of determination: $r^2 = \dfrac{S_{xy}^2}{S_{xx} S_{yy}} = \dfrac{\text{SSR}}{\text{Total SS}}$
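As a worked illustration, both measures can be computed for the calculus example. This is a minimal sketch in plain Python; the helper name `correlation` is an assumption made for the example.

```python
import math

# Calculus example data (x = math test score, y = final calculus grade).
x = [39, 43, 21, 64, 57, 47, 28, 75, 34, 52]
y = [65, 78, 52, 82, 92, 89, 73, 98, 56, 75]


def correlation(x, y):
    """Return r = Sxy / sqrt(Sxx * Syy) for paired samples x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / math.sqrt(s_xx * s_yy)


r = correlation(x, y)
r_squared = r ** 2
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
```

The value of r² should match the R-Sq of 70.5% reported in the Minitab output.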
Measuring the Strength of the Relationship
• Since Total SS = SSR + SSE, $r^2$ measures
  - the proportion of the total variation in the responses that can be explained by using the independent variable x in the model.
  - the percent reduction in the total variation by using the regression equation rather than just using the sample mean y-bar to estimate y.
$r^2 = \dfrac{\text{SSR}}{\text{Total SS}}$
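The "percent reduction" reading can be made concrete by comparing squared prediction errors directly. This sketch in plain Python uses the calculus example data and assumes the least-squares line is fitted by the usual formulas b = Sxy/Sxx and a = y-bar − b·x-bar; the variable names are illustrative.

```python
# Calculus example data (x = math test score, y = final calculus grade).
x = [39, 43, 21, 64, 57, 47, 28, 75, 34, 52]
y = [65, 78, 52, 82, 92, 89, 73, 98, 56, 75]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b = s_xy / s_xx
a = y_bar - b * x_bar

# Squared error when every y is estimated by the sample mean y-bar.
total_ss = sum((yi - y_bar) ** 2 for yi in y)

# Squared error when each y is estimated by the fitted line a + b*x.
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Regression sum of squares and the two equivalent forms of r^2.
ssr = total_ss - sse
r2_explained = ssr / total_ss
r2_reduction = 1 - sse / total_ss

print(f"Total SS = {total_ss:.1f}, SSE = {sse:.1f}")
print(f"reduction in squared error = {100 * r2_reduction:.1f}%")
```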
Interpreting a Significant Regression
• Even if you do not reject the null hypothesis that the slope of the line equals 0, it does not necessarily mean that y and x are unrelated.
• Type II error: falsely declaring that the slope is 0 and that x and y are unrelated.
• It may happen that y and x are perfectly related in a nonlinear way.
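A small constructed case shows how a perfect nonlinear relationship can produce a fitted slope of exactly 0. The data here are invented for illustration (y = x² on x-values symmetric about 0) and do not come from the slides.

```python
# Hypothetical data: y is completely determined by x, but not linearly.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares slope; Sxy = 0 here, so the slope and r are both 0.
b = s_xy / s_xx

print(f"Sxy = {s_xy}, slope b = {b}")
```

A scatterplot of such data would show the curve immediately, which is one reason to plot the data before relying on the test of the slope.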
Some Cautions
• You may have fit the wrong model.
• Extrapolation: predicting values of y outside the range of the fitted data.
• Causality: do not conclude that x causes y. There may be an unknown variable at work!
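One way to guard against extrapolation in practice is to flag any prediction whose x lies outside the observed range. The sketch below assumes the fitted line from the calculus example (a = 40.784, b = 0.7656) and the observed test scores, which run from 21 to 75; the function `predict` is a hypothetical helper, not part of any library.

```python
# Fitted coefficients reported for the calculus example.
A = 40.784
B = 0.7656

# Observed range of the math test scores used to fit the line.
X_MIN = 21
X_MAX = 75


def predict(x_new):
    """Return (predicted y, True if x_new is outside the fitted range)."""
    y_hat = A + B * x_new
    is_extrapolation = x_new < X_MIN or x_new > X_MAX
    return y_hat, is_extrapolation


for score in (50, 100):
    y_hat, flagged = predict(score)
    note = "extrapolation, treat with caution" if flagged else "within fitted range"
    print(f"x = {score}: predicted y = {y_hat:.1f} ({note})")
```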
Example
The table shows the heights (in cm) and weights (in kg) of n = 10 randomly selected college football players.
Player 1 2 3 4 5 6 7 8 9 10
Height, x 185.4 180.3 190.5 182.9 182.9 190.5 170.2 175.3 180.3 175.3
Weight, y 84.1 79.5 90.9 95.5 86.4 88.6 68.2 77.3 81.8 79.5
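The strength of the linear relationship in this example can be measured with the correlation coefficient. The sketch below is plain Python and takes the values near 180 as heights in cm and the values near 80 as weights in kg, which is an assumption about which row is which; because r is symmetric in x and y, the result does not depend on that choice.

```python
import math

# Football player data; the row assignment is an assumption (see above).
height_cm = [185.4, 180.3, 190.5, 182.9, 182.9, 190.5, 170.2, 175.3, 180.3, 175.3]
weight_kg = [84.1, 79.5, 90.9, 95.5, 86.4, 88.6, 68.2, 77.3, 81.8, 79.5]

n = len(height_cm)
x_bar = sum(height_cm) / n
y_bar = sum(weight_kg) / n
s_xx = sum((x - x_bar) ** 2 for x in height_cm)
s_yy = sum((y - y_bar) ** 2 for y in weight_kg)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(height_cm, weight_kg))

# Correlation coefficient and coefficient of determination.
r = s_xy / math.sqrt(s_xx * s_yy)
print(f"r = {r:.3f}, r^2 = {r ** 2:.3f}")
```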
Football Players
[Figure: Scatterplot of Weight vs Height (x-axis: Height, y-axis: Weight)]