Chapter 2: Simple Regression
Learning Objectives
• Explain what a simple regression model is
• Fit a simple regression model using the least-squares criterion
• Compute R² to measure how well the model fits the data
• Interpret the results from a simple regression model
How We Estimate This Model (Find the Best Line)
Figure 1.5. It sure looks like API decreases with FLE.
Simple regression model (note: no Greek letters!)
$$Y_i = b_0 + b_1 X_i + e_i$$
• $Y_i$: outcome variable
• $X_i$: RHS variable
• $b_0 + b_1 X_i$: prediction part
• $e_i$: error, how much we miss by if we use $X_i$ to predict $Y_i$
The Least-squares Criterion: Minimize the Sum of Squared Errors (Why?)
Figure 2.1. The regression line minimizes the sum of squared errors
Sum of Squared Errors (SSE):
$$SSE = \sum_{i=1}^{N} e_i^2 = \sum_{i=1}^{N} (Y_i - b_0 - b_1 X_i)^2$$
Goal: Find the b0 and b1 that minimize SSE:
$$\min_{b_0, b_1} SSE = \sum_{i=1}^{N} (Y_i - b_0 - b_1 X_i)^2$$
Predicted value:
$$\hat{Y}_i = b_0 + b_1 X_i$$
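As a rough illustration of the criterion, the sketch below computes SSE for two candidate lines. The data and the function name `sse` are made up for illustration and do not come from the chapter.

```python
def sse(b0, b1, xs, ys):
    """Sum of squared errors for the line Y-hat = b0 + b1 * X."""
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))


# Made-up illustrative data (an assumption, not the chapter's data set).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

sse_flat = sse(4.0, 0.0, xs, ys)  # horizontal line at the mean of Y
sse_line = sse(2.2, 0.6, xs, ys)  # least-squares line for this toy data

print(sse_flat, sse_line)
```

The sloped line misses by less in total than the flat line, which is what the least-squares criterion rewards.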
Derive the Normal Equations
$$\min_{b_0, b_1} SSE = \sum_{i=1}^{N} (Y_i - b_0 - b_1 X_i)^2$$
$$b_0: \quad 2 \sum_{i=1}^{N} (Y_i - b_0 - b_1 X_i)(-1) = 0$$
$$b_1: \quad 2 \sum_{i=1}^{N} (Y_i - b_0 - b_1 X_i)(-X_i) = 0$$
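One way to see what the normal equations say is to evaluate the two partial derivatives of SSE numerically. The sketch below does this on made-up data; `sse_gradient` is a hypothetical helper name, and the values 2.2 and 0.6 are the least-squares coefficients for this toy data only.

```python
def sse_gradient(b0, b1, xs, ys):
    """Partial derivatives of SSE with respect to b0 and b1."""
    residuals = [y - b0 - b1 * x for x, y in zip(xs, ys)]
    d_b0 = 2 * sum(e * (-1) for e in residuals)
    d_b1 = 2 * sum(e * (-x) for e, x in zip(residuals, xs))
    return d_b0, d_b1


# Made-up illustrative data (an assumption, not the chapter's data set).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

grad_at_origin = sse_gradient(0.0, 0.0, xs, ys)  # far from the minimum
grad_at_ols = sse_gradient(2.2, 0.6, xs, ys)     # at the least-squares line

print(grad_at_origin)
print(grad_at_ols)
```

Both derivatives are zero (up to floating-point rounding) at the least-squares coefficients and nonzero elsewhere.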
Solve the Normal Equations for b0
$$\sum_{i=1}^{N} (Y_i - b_0 - b_1 X_i) = 0$$
$$\sum_{i=1}^{N} Y_i - \sum_{i=1}^{N} b_0 - b_1 \sum_{i=1}^{N} X_i = 0$$
$$\sum_{i=1}^{N} Y_i - N b_0 - b_1 \sum_{i=1}^{N} X_i = 0$$
$$b_0 = \bar{Y} - b_1 \bar{X}$$
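The result b0 = Ȳ − b1·X̄ can be checked numerically: for any slope, choosing the intercept this way makes the errors sum to zero, which is the first normal equation. The sketch below assumes made-up data and hypothetical helper names.

```python
def intercept_for_slope(b1, xs, ys):
    """Intercept implied by the first normal equation: b0 = Ybar - b1 * Xbar."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    return y_bar - b1 * x_bar


def residual_sum(b0, b1, xs, ys):
    """Sum of the errors e_i = Y_i - b0 - b1 * X_i."""
    return sum(y - b0 - b1 * x for x, y in zip(xs, ys))


# Made-up illustrative data (an assumption, not the chapter's data set).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

for slope in (0.0, 0.6, 1.5):
    b0 = intercept_for_slope(slope, xs, ys)
    print(slope, b0, residual_sum(b0, slope, xs, ys))
```

The errors sum to zero for every slope tried, so the first normal equation alone pins down b0 only once b1 is known; the second equation is needed to find b1.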
Solve the Normal Equations for b1 (Substituting for b0)
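$$\sum_{i=1}^{N} \Big[ Y_i - \underbrace{(\bar{Y} - b_1 \bar{X})}_{b_0} - b_1 X_i \Big] X_i = 0$$
$$\sum_{i=1}^{N} \Big[ \underbrace{(Y_i - \bar{Y})}_{y_i} - b_1 \underbrace{(X_i - \bar{X})}_{x_i} \Big] X_i = 0$$
$$\sum_{i=1}^{N} \big[ X_i y_i - b_1 X_i x_i \big] = 0$$
$$b_1 = \frac{\sum_{i=1}^{N} X_i y_i}{\sum_{i=1}^{N} X_i x_i}$$
Where small-x and small-y are the deviations of Xi and Yi from their means: $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$.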
Two equivalent ways to write the OLS formula for b1
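$$\begin{aligned}
b_1 = \frac{\sum_{i=1}^{N} x_i y_i}{\sum_{i=1}^{N} x_i^2}
&= \frac{\sum_{i=1}^{N} (X_i - \bar{X}) y_i}{\sum_{i=1}^{N} (X_i - \bar{X}) x_i}
= \frac{\sum_{i=1}^{N} X_i y_i - \sum_{i=1}^{N} \bar{X} y_i}{\sum_{i=1}^{N} X_i x_i - \sum_{i=1}^{N} \bar{X} x_i} \\
&= \frac{\sum_{i=1}^{N} X_i y_i - \bar{X} \sum_{i=1}^{N} y_i}{\sum_{i=1}^{N} X_i x_i - \bar{X} \sum_{i=1}^{N} x_i}
= \frac{\sum_{i=1}^{N} X_i y_i - \bar{X} \cdot 0}{\sum_{i=1}^{N} X_i x_i - \bar{X} \cdot 0}
= \frac{\sum_{i=1}^{N} X_i y_i}{\sum_{i=1}^{N} X_i x_i}
\end{aligned}$$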
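The equivalence of the two formulas rests on the deviations summing to zero. A small numerical check, on made-up data and with hypothetical function names, might look like this:

```python
def slope_deviation_form(xs, ys):
    """b1 = sum(x_i * y_i) / sum(x_i ** 2), using deviations from the means."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def slope_mixed_form(xs, ys):
    """b1 = sum(X_i * y_i) / sum(X_i * x_i), raw X times deviations."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    num = sum(x * (y - y_bar) for x, y in zip(xs, ys))
    den = sum(x * (x - x_bar) for x in xs)
    return num / den


# Made-up illustrative data (an assumption, not the chapter's data set).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

print(slope_deviation_form(xs, ys), slope_mixed_form(xs, ys))
```

Both forms give the same slope for this data, as the algebra predicts.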
OLS Estimator of
$$Y_i = b_0 + b_1 X_i + e_i$$
$$b_1 = \frac{\sum_{i=1}^{N} x_i y_i}{\sum_{i=1}^{N} x_i^2}$$
$$b_0 = \bar{Y} - b_1 \bar{X}$$
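Put together, the two OLS formulas amount to a few lines of code. The following is a minimal sketch in plain Python; the function name `fit_simple_regression` and the data are assumptions made for illustration.

```python
def fit_simple_regression(xs, ys):
    """Return (b0, b1) from the least-squares formulas for one RHS variable."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_xx = sum((x - x_bar) ** 2 for x in xs)
    if sum_xx == 0:
        raise ValueError("X has no variation, so the slope is undefined")
    b1 = sum_xy / sum_xx     # slope: sum(x*y) / sum(x^2) in deviations
    b0 = y_bar - b1 * x_bar  # intercept: Ybar - b1 * Xbar
    return b0, b1


# Made-up illustrative data (an assumption, not the chapter's data set).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_simple_regression(xs, ys)
print(b0, b1)
```

Computing b1 first and then b0 mirrors the order of the derivation, since the intercept formula depends on the slope.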
Difference between Sample and Population Regression
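Sample regression:
$$Y_i = b_0 + b_1 X_i + e_i$$
Population regression:
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
Sometimes econometricians use hats “^” to indicate things they’ve actually estimated:
$$Y_i = \hat{b}_0 + \hat{b}_1 X_i + \hat{e}_i$$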
How Good Is Your Fit? Predicting with the Mean
$$TSS = \sum_{i=1}^{N} y_i^2$$
Take Selby Lane Elementary (X,Y) = (80,730).
The deviation from the mean is 730 – 835.8 = –105.8.
Without a regression model, we would only have the mean to work with, and it would not be a very good predictor of Y.
TSS adds up the squared misses we would make if we predicted every Y with the sample mean (it is the numerator of the sample variance).
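The Selby Lane numbers from the slide can be turned into one term of this sum. The sketch below uses the slide's values (Y = 730, mean 835.8) for that single school; the helper `total_sum_of_squares` and the toy list are assumptions added for illustration, since the full data set is not reproduced here.

```python
def total_sum_of_squares(ys):
    """Sum of squared deviations of Y from its sample mean."""
    y_bar = sum(ys) / len(ys)
    return sum((y - y_bar) ** 2 for y in ys)


# Values taken from the slide: Selby Lane's Y and the sample mean of Y.
selby_deviation = 730 - 835.8
selby_squared = selby_deviation ** 2  # Selby Lane's contribution to the sum

# Made-up illustrative data (an assumption, not the chapter's data set).
toy_tss = total_sum_of_squares([2, 4, 5, 4, 5])

print(round(selby_deviation, 1), round(selby_squared, 2), toy_tss)
```

Squaring makes every miss count as positive, so large deviations like Selby Lane's dominate the total.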
How Good Is Your Fit? The R-squared
$$R^2 = 1 - \frac{SSE}{TSS} = 1 - \frac{\sum_{i=1}^{N} e_i^2}{\sum_{i=1}^{N} y_i^2}$$
• $SSE = \sum_{i=1}^{N} e_i^2$: variation in Y not explained by regression model
• $TSS = \sum_{i=1}^{N} y_i^2$: variation in Y not explained by mean (numerator in sample variance)
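As a sketch of how R² = 1 − SSE/TSS might be computed end to end, the function below fits the line with the least-squares formulas and then compares the two sums. The name `r_squared` and the data are assumptions made for illustration.

```python
def r_squared(xs, ys):
    """R-squared = 1 - SSE / TSS for a simple regression of ys on xs."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    b0 = y_bar - b1 * x_bar
    sse = sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))  # misses of the line
    tss = sum((y - y_bar) ** 2 for y in ys)                    # misses of the mean
    return 1 - sse / tss


# Made-up illustrative data (an assumption, not the chapter's data set).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r2 = r_squared(xs, ys)
print(round(r2, 3))
```

For this toy data SSE is 2.4 and TSS is 6, so the line accounts for 60 percent of the variation that the mean alone leaves unexplained.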
Interpreting Regression Coefficients
• Suppose two individuals have X values that differ by one unit.
• The predicted difference in their Y values is b1
$$Y_i = b_0 + b_1 X_i + e_i$$
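The interpretation can be illustrated with two predictions whose X values differ by one unit. The coefficients below are hypothetical, chosen only for illustration, and `predict` is an assumed helper name.

```python
def predict(b0, b1, x):
    """Prediction part of the model: Y-hat = b0 + b1 * X."""
    return b0 + b1 * x


# Hypothetical coefficients (assumptions, not estimates from the chapter).
b0, b1 = 2.2, 0.6

x_first, x_second = 3, 4  # two individuals whose X values differ by one unit
difference = predict(b0, b1, x_second) - predict(b0, b1, x_first)

print(round(difference, 3))
```

The difference in predictions equals b1 regardless of the starting X. The actual Y values of two individuals can differ by more or less than this, because their errors e_i are not part of the prediction.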
What We Learned
• How to solve the least-squares problem to fit a simple regression model.
• How to apply the least-squares formula using a spreadsheet.
• R² is a useful characteristic of your model, but maximizing R² is often not the objective of your analysis.