of 33 /33
Simple Regression Model 1 The Simple Regression Model

# The Simple Regression Model - Universidade Nova de Lisboadocentes.fe.unl.pt/~azevedoj/Web Page_files/Teaching_files... · Simple Regression Model 4 Further assumptions ... Now take

Embed Size (px)

### Text of The Simple Regression Model - Universidade Nova de Lisboadocentes.fe.unl.pt/~azevedoj/Web... Simple Regression Model 1

The Simple Regression Model Simple Regression Model 2

Simple regression model: Objectives

Given the model:

- where y is earnings and x years of education

- Or y is sales and x is spending in advertising

- Or y is the market value of an apartment and x is its area

Our objectives for the moment will be:

- Estimate the unknown parameters β0 andβ1

- Assess “how much”x explains y Simple Regression Model 3

The Simple Linear Regression (SLR) Model with Cross-Sectional Data

• y is the dependent variable (or explained variable, or response variable, or theregressand)

• x is the independent variable (or explanatory variable, or control variable, orthe regressor)

• u is the error term or disturbance (y is allowed to vary for the same level ofx)

• This is a Population equation. It islinear in the (unknown) parameters

Assumption SLR.1(Linearity in parameters) Simple Regression Model 4

Further assumptionsAssumption SLR.2(Random Sampling)

We have a random sample from the population such that:

Assumption SLR.3(Variation in the Regressor)

xi, i=1,2,...,n don’t have all the same value Simple Regression Model 5

Further assumptions

ueducationEarnings ++= 10 ββ

This is not a restrictive assumption, since we can always use β0 so that SLR.4´holds

Assumption SLR.4(Zero Conditional Mean)

Assumption SLR.4(́Zero Mean)

Not really necessary given that SLR.4 implies SLR.4´

Crucial assumption! Remember previous examples:

uPolicemanCrimeRate ++= 10 ββIn these cases SLR.4 most likely fails. Simple Regression Model 6

Implication of Zero Conditional Mean

..

x1 x2

E(y|x) = β0 + β1x

y

f(y)

implies

For anyx, distribution ofy is centered at E(y|x) Simple Regression Model 7

.

..

.

y4

y1

y2

y3

x1 x2 x3 x4

}

}

{

{

u1

u2

u3

u4

x

y

Population regression line, sample data pointsand the associated error terms

E(y|x) = β0 + β1x Simple Regression Model 8

Estimation of unknown parameters by method of moments

• Want to estimate the population parameters from a sample

• Let {(xi,yi): i=1, …,n} denote a random sample of size n from the population

• yi = β0 + β1xi + ui

implies

implies

since

i)

ii) Simple Regression Model 9

Estimation of unknown parameters by method of moments (intuition)

Clever idea is to pick so that sample moments match the population moments

These are two equations in the two unknowns

The sample counterparts are: Simple Regression Model 10

Estimation of unknown parameters by method of moments (solution)

Need sample variation of the x variable.Otherwise the slope parameter is notwell defined!

implies:

Plug-in in

to obtain:

this simplifies to: Simple Regression Model 11

Slope estimator

• The slope estimate is the sample covariance between x and ydivided by the sample variance of x

• If x and y are positively correlated, the slope will be positive

• If x and y are negatively correlated, the slope will be negative

• Need x to vary in our sample

It is anestimator if we treat (xi,yi) as randomvariables. It is an estimate once we substitute(xi,yi) by the actual observations. Simple Regression Model 12

Example and interpretation

• wageis net monthly wage andeducmeasures the number ofyears of schooling completed

• So, we estimate a positive relation between these twovariables; an additional year of schooling leads to an estimatedexpected increase of 52.513 euros in the wage

• Again, must be cautious about the interpretation of theseestimates

n=11064, Data from the Employment Survey (INE), 2003 Simple Regression Model 13

Definitions

Fitted value

Residual Simple Regression Model 14

.

..

.

y4

y1

y2

y3

x1 x2 x3 x4

}

}

{

{

û1

û2

û3

û4

x

y

Sample regression line, sample data pointsand the associated estimatederror terms

(called residuals); fitted values are on the line

xy 10ˆˆˆ ββ += Simple Regression Model 15

Alternate approach to derivationOLS – Ordinary Least Squares

• Can fit a line so as to minimise the sum of squared residuals

Equivalent to the equations we had before: same solution! Simple Regression Model 16

Review – OLS estimators

OLS obtained by minimising:

Fitted value:

Residual:

Given a random sample:

with respect to

Alternatively, can exploit the fact that: Simple Regression Model 17

Today

- Assess “how much”x explains y

- Assess “how close” our estimators

are from the true (population) values Simple Regression Model 18

Looking for a Goodness of fit measure: Decomposition of the Total Sum of

Squares (SST) We know:

SSE, Explained Sum ofSquares

This implies: , to be used later

Also: Now define the Total Sum of SquaresSST Simple Regression Model 19

Goodness-of-Fit

• How well does our sample regression line fit the data? If SSRis very small relative to SST(or SSEvery large relative to SST), then a lot of the variation in the dependent variable y is “explained” by the model

• Can compute the fraction of the total sum of squares (SST) that is explained by the model, denote this as the R-squared of the regression

• R2 = SSE/SST= 1 –SSR/SST

• The lower SSR, the higher will be R2

• OLS maximises R2 (minimisesSSR). We have always 0≤ R2 ≤1 Simple Regression Model 20

Example (continued)

• A lot of variation in the dependent variable (wage) is left unexplained by the model

• This isnot related to the fact thateducis most probably correlated withu

• Can have low R2 even if all the assumptions hold true

n=11064, Data from the Employment Survey (INE), 2003 Simple Regression Model 21

Assumptions again...Assumption SLR.1(Linearity in parameters)

Assumption SLR.2(Random Sampling from the population)

We have a random sample

Assumption SLR.3(Variation in the Regressor)

xi, i=1,2,...,n don’t have all the same value

Assumption SLR.4(Zero Conditional Mean)

This implies Simple Regression Model 22

Properties of OLS EstimatorsThought experiment:

• Get a sample of sizen from the population many times (infinite times)• Compute the OLS estimates• Is the average of these estimates equal to the true parameter values?

• Under the previous assumption (SLR.1 to SLR.4) it turnsout that theanswer is positive

It will be the case that:

These are estimators if we treat (xi,yi) as random variables Simple Regression Model 23

Unbiasedness of OLS (Proof)

∑ = −=n

i ix xxSST1

2)(

Let us rewrite our estimator as:

Put:

Now, note that: ( ) ( ) ( )2111

and 0 ∑∑∑ ===−=−=−

n

i ii

n

i i

n

i i xxxxxxx Simple Regression Model 24

Unbiasedness of OLS (cont)

Now take expectationsConditional on the values of the x’s, that is, treat thex’s asconstants. To avoid heavy notation, I don´t make this explicit.

Now, it´s really easy to prove unbiasedness of

Try it yourself!

And if the conditional expectation is a constantthe unconditinal expectation is also that constant... Simple Regression Model 25

Unbiasedness Summary

• The OLS estimators ofβ1 andβ0 are unbiased

• However, in a given sample we may be “close” or “far” from the true parameter, unbiasedness is a minimal requirement

• Unbiasedness depends on the 4 assumptions:if any assumption fails, then OLS is not necessarily unbiased Simple Regression Model 26

Variance of the OLS Estimators

• We know already that the sampling distribution of our estimators is centered around the true parameter

• Is the distribution concentrated around the true parameter?

Assumption SLR.5(Homoskedasticity) Simple Regression Model 27

Variance of OLS (cont.)

• Var(u|x) = E(u2|x)-[E(u|x)]2

• E(u|x) = 0, soσ2 = E(u2|x) = E(u2) = Var(u)

• So,σ2 is also the unconditional variance, called the variance of

the error

• σ, the square root of the error variance is called the standard deviation of the error

• Can write: E(y|x)=β0 + β1x and Var(y|x) = σ2

Assumption SLR.5(Homoskedasticity) Simple Regression Model 28

..

x1 x2

Homoskedastic Case

E(y|x) = β0 + β1x

y

f(y|x) Simple Regression Model 29

.

xx1 x2

yf(y|x)

Heteroskedastic Case

x3

..

E(y|x) = β0 + β1x Simple Regression Model 30

Variance of OLS (cont.)

( )( ) ( )

1 1

2 22

2 222 2 2

ˆ 1 ( )

1 1( ) ( )

1 1( )

i ix

i i i ix x

i xx x x

Var Var x x uSST

Var x x u x x Var uSST SST

x x SSTSST SST SST

β β

σσ σ

= + − = = − = − = − = =

∑∑ ∑∑

This is a conditional variance!

∑ = −=n

i ix xxSST1

2)(

)()()(

)()()( 2

XVarYVarXYVar

XVarbbXVarbXaVar

+=+

==+

Note that:

- a andb are constants andX a Random Variable

- If X andYare independent Simple Regression Model 31

Variance of OLS Summary

• The larger the error variance, σ2, the larger the variance of the slope estimator

• The larger the variability in the xi, the smaller the variance of the slope estimate

• SSTx tends to increase with the sample size n, so variance tends to decrease with the sample size

• Can also show:

( )xSST

Var2

1̂σβ = ∑ = −=

n

i ix xxSST1

2)( Simple Regression Model 32

Estimating the Error Variance• In practice, the error variance σ2 is unknown, must estimate

it...

• Can show that this is an unbiased estimator of the error variance

• The so-called standar error of the regression is given by: Simple Regression Model 33

Estimated standard error (se) of the parameters

And similarly for ##### Econometrics Multiple Regression Analysis: Heteroskedasticitydocentes.fe.unl.pt/~azevedoj/Web Page_files/Teaching_files/8... · MLR.5 Variance LM Statistic Testing Heteroskedasticity
Documents ##### 2. Korrelation, Linear Regression und multiple · PDF file2. Korrelation, Linear Regression und multiple Regression 2. Korrelation, lineare Regression und multiple Regression 2.1 Korrelation
Documents ##### REGRESSION 12.1 Simple Linear Regression Model 12.2 ...sman/courses/6739/SimpleLinearRegression.pdf · Goldsman — ISyE 6739 Linear Regression REGRESSION 12.1 Simple Linear Regression
Documents ##### Classification and Regression. Classification and regression What is classification? What is regression? Issues regarding classification and regression
Documents ##### Global Environment - Universidade Nova de Lisboadocentes.fe.unl.pt/~frafra/Site/GBE/Class4GE2014_Fall.pdf · the natural resources and new enterprises of any quarter ... Francesco
Documents ##### 1 Curve-Fitting Interpolation. 2 Curve Fitting Regression Linear Regression Polynomial Regression Multiple Linear Regression Non-linear Regression Interpolation
Documents ##### Econometrics Regression Analysis with Time Series Data ...docentes.fe.unl.pt/~azevedoj/Web Page_files/Teaching_files/11_TS_Examples.pdf · Examples Econometrics Regression Analysis
Documents ##### 1 Curve-Fitting Spline Interpolation. 2 Curve Fitting Regression Linear Regression Polynomial Regression Multiple Linear Regression Non-linear Regression
Documents ##### Penalised regression - Ridge, LASSO and elastic net regression · 8 Ridge regression LASSO regression Extensions Department of Mathematical Sciences Ridge regression For the least
Documents ##### Global Environment - Universidade Nova de Lisboadocentes.fe.unl.pt/~frafra/Site/GBE/Class6GE2014_Fall.pdfFrancesco Franco Global Environment 35/47 Monetary Policy Transmission Looking
Documents ##### LINEAR REGRESSION: Evaluating Regression Models Overview Assumptions for Linear Regression Evaluating a Regression Model
Documents ##### Econometrics Regression Analysis with Time Series Datadocentes.fe.unl.pt/~azevedoj/Web Page_files/Teaching_files/9_TS... · Regression Analysis with Time Series Data ... Ok with Rainfall,
Documents ##### Long Run Francesco Franco - Universidade Nova de Lisboadocentes.fe.unl.pt/~frafra/Site/GBE/ClassgrowthGE2014_Fall.pdf · Global Environment Long Run Francesco Franco Nova SBE November
Documents ##### 2. Korrelation, Linear Regression und multiple Regression · 2. Korrelation, Linear Regression und multiple Regression 2. Korrelation, lineare Regression und multiple Regression 2.1
Documents ##### Multiple linear regression: estimation and model fitting · simple linear regression and multiple regression Multiple Simple regression regression Solar 0.05 0.13 Wind -3.32 -5.73
Documents ##### International Macro Finance - Universidade NOVA de Lisboadocentes.fe.unl.pt/~frafra/Site/IMF2012/Class10IMF2013.pdf · 2013-03-11 · Speculative Attacks Igeneration:exogenouspolicy
Documents ##### Logistic Regression - cs.wellesley.educs.wellesley.edu/~cs305/lectures/6_Logistic_Regression.pdfLogistic Regression Logistic regression is used for classification, not regression!
Documents Documents ##### 1 Curve-Fitting Polynomial Interpolation. 2 Curve Fitting Regression Linear Regression Polynomial Regression Multiple Linear Regression Non-linear Regression
Documents ##### Macroeconomic Theory - Universidade Nova de Lisboadocentes.fe.unl.pt/~frafra/Site/MT/Class8MT2017.pdf · Macroeconomic Theory Francesco Franco Nova SBE March 8, 2017 Francesco Franco
Documents ##### Regression I: Poisson‘sche Regression · T. Kießling: Fortgeschrittene Fehlerrechnung - Regression I: Poisson‘sche Regression 22.05.2019 Vorlesung 04-1 Regression I: Poisson‘sche
Documents  