Chapter 2 · 2020. 3. 28. · Chapter 2 Simple Regression. Learning Objectives • Explain what a simple regression model is • Fit a simple regression model using the least-squares

Chapter 2: Simple Regression

Learning Objectives

• Explain what a simple regression model is

• Fit a simple regression model using the least-squares criterion

• Compute R² to measure how well the model fits the data

• Interpret the results from a simple regression model

How We Estimate This Model (Find the Best Line)

Figure 1.5. It sure looks like API decreases with FLE.

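Simple regression model (note: no Greek letters!)

$$Y_i = b_0 + b_1 X_i + e_i$$

• $Y_i$: the outcome variable

• $X_i$: the right-hand-side (RHS) variable

• $b_0 + b_1 X_i$: the prediction part

• $e_i$: the error, that is, how much we miss by if we use $X_i$ to predict $Y_i$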

The Least-squares Criterion: Minimize the Sum of Squared Errors (Why?)

Figure 2.1. The regression line minimizes the sum of squared errors

Sum of Squared Errors (SSE):

$$SSE = \sum_{i=1}^{N} e_i^2 = \sum_{i=1}^{N} (Y_i - b_0 - b_1 X_i)^2$$

Predicted value:

$$\hat{Y}_i = b_0 + b_1 X_i$$

Goal: Find the $b_0$ and $b_1$ that minimize SSE:

$$\min_{b_0,\, b_1} SSE = \sum_{i=1}^{N} (Y_i - b_0 - b_1 X_i)^2$$
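As a rough illustration of the criterion, the sketch below evaluates SSE for a few candidate lines. The data are hypothetical toy numbers (not the chapter's school data), and the helper name `sse` is an assumption introduced only for this example.

```python
# Hypothetical toy data, chosen only for easy arithmetic.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]


def sse(b0, b1, xs, ys):
    """Sum of squared errors for the candidate line Y-hat = b0 + b1 * X."""
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))


# A flat line at 4.0, which is the mean of Y for this toy data.
sse_flat = sse(4.0, 0.0, X, Y)

# The line with b0 = 2.2 and b1 = 0.6, the least-squares values for this toy data.
sse_line = sse(2.2, 0.6, X, Y)

# A nearby line that is not the least-squares solution.
sse_other = sse(2.0, 0.7, X, Y)

print(sse_flat, sse_line, sse_other)
```

For these numbers the flat line misses by more (SSE of 6) than either sloped line, and the least-squares line has the smallest SSE of the three (about 2.4 versus about 2.55).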

Derive the Normal Equations

$$\min_{b_0,\, b_1} SSE = \sum_{i=1}^{N} (Y_i - b_0 - b_1 X_i)^2$$

$$b_0: \quad 2 \sum_{i=1}^{N} (Y_i - b_0 - b_1 X_i)(-1) = 0$$

$$b_1: \quad 2 \sum_{i=1}^{N} (Y_i - b_0 - b_1 X_i)(-X_i) = 0$$

Solve the Normal Equations for b0

$$\sum_{i=1}^{N} (Y_i - b_0 - b_1 X_i) = 0$$

$$\sum_{i=1}^{N} (Y_i) - \sum_{i=1}^{N} (b_0) - \sum_{i=1}^{N} (b_1 X_i) = 0$$

$$\sum_{i=1}^{N} Y_i - N b_0 - b_1 \sum_{i=1}^{N} X_i = 0$$

$$b_0 = \bar{Y} - b_1 \bar{X}$$

Solve the Normal Equations for b1 (Substituting for b0)

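$$\sum_{i=1}^{N} \big(Y_i - \underbrace{(\bar{Y} - b_1 \bar{X})}_{b_0} - b_1 X_i\big) X_i = 0$$

$$\sum_{i=1}^{N} \big[\underbrace{(Y_i - \bar{Y})}_{y_i} - b_1 \underbrace{(X_i - \bar{X})}_{x_i}\big] X_i = 0$$

$$\sum_{i=1}^{N} \big[X_i (Y_i - \bar{Y}) - b_1 X_i (X_i - \bar{X})\big] = 0$$

$$b_1 = \frac{\sum_{i=1}^{N} X_i y_i}{\sum_{i=1}^{N} X_i x_i}$$

where lowercase $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$ are the deviations of $X_i$ and $Y_i$ from their means.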

Two equivalent ways to write the OLS formula for b1

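$$
\begin{aligned}
b_1 = \frac{\sum_{i=1}^{N} x_i y_i}{\sum_{i=1}^{N} x_i^2}
&= \frac{\sum_{i=1}^{N} (X_i - \bar{X})\, y_i}{\sum_{i=1}^{N} (X_i - \bar{X})\, x_i}
= \frac{\sum_{i=1}^{N} X_i y_i - \sum_{i=1}^{N} \bar{X} y_i}{\sum_{i=1}^{N} X_i x_i - \sum_{i=1}^{N} \bar{X} x_i} \\
&= \frac{\sum_{i=1}^{N} X_i y_i - \bar{X} \sum_{i=1}^{N} y_i}{\sum_{i=1}^{N} X_i x_i - \bar{X} \sum_{i=1}^{N} x_i}
= \frac{\sum_{i=1}^{N} X_i y_i - \bar{X} \cdot 0}{\sum_{i=1}^{N} X_i x_i - \bar{X} \cdot 0}
= \frac{\sum_{i=1}^{N} X_i y_i}{\sum_{i=1}^{N} X_i x_i}
\end{aligned}
$$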
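The equivalence rests on the fact that deviations from a mean sum to zero. A quick numerical check, using hypothetical toy data rather than anything from the chapter, might look like this:

```python
# Hypothetical toy data, chosen only for easy arithmetic.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Deviations from the means (the lowercase x_i and y_i).
x = [Xi - x_bar for Xi in X]
y = [Yi - y_bar for Yi in Y]

# Form 1: deviations in both the numerator and the denominator.
b1_dev = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)

# Form 2: raw X_i multiplied by the deviations.
b1_raw = sum(Xi * yi for Xi, yi in zip(X, y)) / sum(Xi * xi for Xi, xi in zip(X, x))

print(sum(x), sum(y))    # both deviation sums are zero (up to rounding)
print(b1_dev, b1_raw)    # the two forms give the same slope
```

Both forms give a slope of about 0.6 for these numbers.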

OLS Estimator of $Y_i = b_0 + b_1 X_i + e_i$

$$b_0 = \bar{Y} - b_1 \bar{X}$$

$$b_1 = \frac{\sum_{i=1}^{N} x_i y_i}{\sum_{i=1}^{N} x_i^2}$$
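The two formulas translate almost directly into code. The following is a minimal sketch in plain Python; the function name `fit_simple_regression` and the toy data are assumptions made for illustration. It also checks that the fitted residuals satisfy both normal equations.

```python
def fit_simple_regression(xs, ys):
    """Return (b0, b1) from the least-squares formulas for one RHS variable."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of products and squares of the deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("X has no variation, so b1 is undefined")
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical toy data, chosen only for easy arithmetic.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = fit_simple_regression(X, Y)
residuals = [y - b0 - b1 * x for x, y in zip(X, Y)]

# The normal equations say both of these sums should be zero (up to rounding).
sum_e = sum(residuals)
sum_ex = sum(e * x for e, x in zip(residuals, X))

print(b0, b1, sum_e, sum_ex)
```

For this toy data the intercept is about 2.2 and the slope about 0.6. The same arithmetic can be laid out column by column in a spreadsheet.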

Difference between Sample and Population Regression

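Sample regression:

$$Y_i = b_0 + b_1 X_i + e_i$$

Population regression:

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$

Sometimes econometricians use hats “^” to indicate things they’ve actually estimated:

$$Y_i = \hat{b}_0 + \hat{b}_1 X_i + \hat{e}_i$$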
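One way to see the distinction is to simulate it. In the sketch below the population coefficients, the error distribution, and the sample size are all made-up assumptions. The point is only that the sample estimates $b_0$ and $b_1$ land near, but not exactly on, the population values $\beta_0$ and $\beta_1$.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Assumed population parameters (made up for illustration).
beta0, beta1 = 5.0, 2.0

# Draw one sample of N observations from the assumed population model.
N = 200
X = [random.uniform(0, 10) for _ in range(N)]
Y = [beta0 + beta1 * xi + random.gauss(0, 1) for xi in X]

# Least-squares estimates computed from the sample.
x_bar = sum(X) / N
y_bar = sum(Y) / N
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y)) / sum(
    (xi - x_bar) ** 2 for xi in X
)
b0 = y_bar - b1 * x_bar

print("population:", beta0, beta1)
print("sample:    ", b0, b1)
```

A different seed, or a different sample, would give slightly different values of `b0` and `b1`, while `beta0` and `beta1` stay fixed.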

How Good Is Your Fit? Predicting with the Mean

$$TSS = \sum_{i=1}^{N} y_i^2$$

Take Selby Lane Elementary (X,Y) = (80,730).

The deviation from the mean is 730 – 835.8 = –105.8.

Without a regression model, we would only have the mean to work with, and it would not be a very good predictor of Y.

TSS measures how much we miss by if we predict with the sample mean (it is the numerator in the sample variance of Y).

How Good Is Your Fit? The R-squared

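$$R^2 = 1 - \frac{SSE}{TSS} = 1 - \frac{\sum_{i=1}^{N} e_i^2}{\sum_{i=1}^{N} y_i^2}$$

• SSE $= \sum_{i=1}^{N} e_i^2$: variation in Y not explained by the regression model

• TSS $= \sum_{i=1}^{N} y_i^2$: variation in Y not explained by the mean (the numerator in the sample variance)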
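A short sketch of the computation, again on hypothetical toy data (not the school dataset), could look like this. The final lines reuse the Selby Lane numbers from the text to show how a single observation contributes to TSS.

```python
# Hypothetical toy data, chosen only for easy arithmetic.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Least-squares coefficients from the deviation formulas.
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum(
    (x - x_bar) ** 2 for x in X
)
b0 = y_bar - b1 * x_bar

# SSE: squared misses when predicting with the regression line.
sse = sum((y - b0 - b1 * x) ** 2 for x, y in zip(X, Y))

# TSS: squared misses when predicting with the sample mean of Y.
tss = sum((y - y_bar) ** 2 for y in Y)

r_squared = 1 - sse / tss
print(sse, tss, r_squared)

# Selby Lane from the text: Y = 730 and the sample mean is 835.8.
selby_deviation = 730 - 835.8             # about -105.8
selby_tss_term = selby_deviation ** 2     # this school's contribution to TSS
print(selby_deviation, selby_tss_term)
```

For the toy data SSE is about 2.4 and TSS is 6, so R² is about 0.6: the line removes roughly 60% of the squared misses left by the mean.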

Interpreting Regression Coefficients

• Suppose two individuals have X values that differ by one unit.

• The predicted difference in their Y values is b1

$$Y_i = b_0 + b_1 X_i + e_i$$
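To make the interpretation concrete, the sketch below uses hypothetical coefficients (not estimates from the chapter's data) and compares predictions for X values that differ by one unit and by ten units. The helper name `predict` is an assumption for this example.

```python
# Hypothetical coefficients, for illustration only.
b0, b1 = 2.2, 0.6


def predict(x):
    """Prediction part of the model: b0 + b1 * x (the error term is left out)."""
    return b0 + b1 * x


# Two individuals whose X values differ by one unit.
diff_one_unit = predict(4.0) - predict(3.0)

# Two individuals whose X values differ by ten units.
diff_ten_units = predict(13.0) - predict(3.0)

print(diff_one_unit, diff_ten_units)
```

The one-unit difference equals `b1` and the ten-unit difference equals `10 * b1` (up to floating-point rounding). The intercept `b0` cancels out of both comparisons.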

What We Learned

• How to solve the least-squares problem to fit a simple regression model.

• How to apply the least-squares formula using a spreadsheet.

• R² is a useful characteristic of your model, but maximizing R² is often not the objective of your analysis.
