
1 Regression Method. Slide 2 Chapter Topics Multiple regression Autocorrelation


Slide 1

Regression Method

Slide 2

Chapter Topics
• Multiple regression
• Autocorrelation

Slide 3

Regression Methods
• To forecast an outcome (response variable, dependent variable) of a study based on a certain number of factors (explanatory variables, regressors).
• The outcome has to be quantitative, but the factors can be either quantitative or categorical.
• Simple regression deals with situations with one explanatory variable, whereas multiple regression tackles cases with more than one regressor.

Slide 4

Simple Linear Regression
– Collect data

Population (unknown relationship):
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$

Random sample:
$$Y = b_0 + b_1 X + e$$

Slide 5

Multiple Regression
• Two or more explanatory variables
• Multiple linear regression model
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + \varepsilon$$
where $\varepsilon$ is the error term and $\varepsilon \sim N(0, \sigma^2)$
• Multiple linear regression equation
$$E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p$$
• Estimated multiple linear regression equation
$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_p X_p$$

Slide 6

Multiple Regression
• Least squares criterion
$$\min \sum_{i=1}^{n} e_i^2 = \min \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$$
• The formulae for the regression coefficients $b_0, b_1, b_2, \dots, b_p$ involve the use of matrix algebra. We will rely on computer software packages to perform the calculations.
• $b_i$ represents an estimate of the change in Y corresponding to a one-unit change in $X_i$ when all other independent variables are held constant.
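For reference, the matrix algebra mentioned above reduces to the standard normal-equations solution. The notation here is an assumption introduced for illustration and does not come from the slides: $\mathbf{X}$ is the $n \times (p+1)$ design matrix whose first column is all ones, $\mathbf{Y}$ is the vector of observed responses, and $\mathbf{b}$ collects $b_0, \dots, b_p$. The formula requires $\mathbf{X}^{\top}\mathbf{X}$ to be invertible.

```latex
\mathbf{b} = (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{Y},
\qquad
\hat{\mathbf{Y}} = \mathbf{X}\mathbf{b},
\qquad
\mathbf{e} = \mathbf{Y} - \hat{\mathbf{Y}}
```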

Slide 7

Multiple Regression
• $R^2 = SSR/SST = 1 - SSE/SST$
• Adjusted $R^2$ ($R_a^2$)
$$R_a^2 = 1 - \frac{SSE/(n-p-1)}{SST/(n-1)} = 1 - (1 - R^2)\,\frac{n-1}{n-p-1}$$
where n is the number of observations and p is the number of independent variables
• The adjusted $R^2$ compensates for the number of independent variables in the model. It may rise or fall when a variable is added.
• It will fall if the increase in $R^2$ due to the inclusion of additional variables is not enough to offset the reduction in the degrees of freedom.
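A minimal sketch of the adjusted $R^2$ formula follows. The function name `adjusted_r2` and the numbers in the illustration are hypothetical; they are chosen only to show a case where adding a regressor raises $R^2$ slightly while the adjusted value falls.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than parameters (n > p + 1)")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Hypothetical illustration: a third regressor lifts R^2 from 0.900 to 0.902,
# which is not enough to offset the lost degree of freedom.
before = adjusted_r2(0.900, n=15, p=2)
after = adjusted_r2(0.902, n=15, p=3)
print(before, after, after < before)
```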

Slide 8

Test for Significance
• Test for individual significance: t test
– Hypothesis
$$H_0: \beta_i = 0 \qquad H_a: \beta_i \neq 0$$
– Test statistic
$$t = \frac{b_i}{s_{b_i}}$$
– Decision rule: reject the null hypothesis at the $\alpha$ level of significance if
  • $|t| > t_{\alpha/2;\, n-p-1}$, or
  • p-value < $\alpha$

Slide 9

Test for Significance
• Testing for overall significance: F test
– Test whether the multiple regression model as a whole is useful to explain Y, i.e., at least one X variable in the regression model is useful to explain Y.
– Hypothesis
$H_0$: all slope coefficients are equal to zero (i.e. $\beta_1 = \beta_2 = \dots = \beta_p = 0$)
$H_a$: not all slope coefficients are equal to zero

Slide 10

Test for Significance
• Testing for overall significance: F test
– Test statistic
$$F = \frac{MSR}{MSE} = \frac{SSR/p}{SSE/(n-p-1)} = \frac{\sum_i (\hat{Y}_i - \bar{Y})^2 / p}{\sum_i (Y_i - \hat{Y}_i)^2 / (n-p-1)}$$
– Decision rule: reject the null hypothesis if
  • $F > F_\alpha$, where $F_\alpha$ is based on an F distribution with p degrees of freedom in the numerator and n – p – 1 degrees of freedom in the denominator, or
  • p-value < $\alpha$
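The F statistic above is a ratio of two mean squares, which the following sketch computes from the sums of squares. The function name `f_statistic` and the input values are hypothetical and chosen for easy arithmetic; the critical value $F_\alpha$ still has to be read from an F table or statistical software.

```python
def f_statistic(ssr: float, sse: float, n: int, p: int) -> float:
    """Overall F statistic: (SSR / p) / (SSE / (n - p - 1))."""
    msr = ssr / p                # mean square due to regression
    mse = sse / (n - p - 1)      # mean square error
    return msr / mse


# Hypothetical sums of squares: MSR = 80 / 2 = 40 and MSE = 20 / 10 = 2.
f_value = f_statistic(ssr=80.0, sse=20.0, n=13, p=2)
print(f_value)
```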

Slide 11

Example: District Sales
• Use both target population and per capita discretionary income to forecast district sales.

District   Sales (gross of jars;   Target population   Per capita discretionary
   i       1 gross = 12 dozen)     ('000 persons)      income ($)
           Y_i                     X_1i                X_2i
   1       162                     274                 2450
   2       120                     180                 3254
   3       223                     375                 3802
   4       131                     205                 2838
   5        67                      86                 2347
   6       169                     265                 3782
   7        81                      98                 3008
   8       192                     330                 2450
   9       116                     195                 2137
  10        55                      53                 2560
  11       252                     430                 4020
  12       232                     372                 4427
  13       144                     236                 2660
  14       103                     157                 2088
  15       212                     370                 2605

Slide 12

Example: District Sales
• Excel output [figure not available]

Slide 13

Example: District Sales
• Multiple regression model
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$$
where Y = district sales, $X_1$ = target population, $X_2$ = per capita discretionary income
• Multiple regression equation
Using the assumption $E(\varepsilon) = 0$, we obtain
$$E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2$$

Slide 14

Example: District Sales
• Estimated regression equation
$b_0, b_1, b_2$ are the least squares estimates of $\beta_0, \beta_1, \beta_2$. Thus
$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$$
• For this example,
$$\hat{Y} = 3.4526 + 0.4960 X_1 + 0.0092 X_2$$
– Predicted sales are expected to increase by 0.496 gross when the target population increases by one thousand, holding per capita discretionary income constant.
– Predicted sales are expected to increase by 0.0092 gross when per capita discretionary income increases by one dollar, holding population constant.
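The estimated equation can be checked against the district data with a short least-squares sketch. It uses only the Python standard library and solves the normal equations by Gaussian elimination; the helper names `solve_linear_system` and `ols` are assumptions made for this illustration and are not the software used to produce the slides. The printed coefficients should come out close to the values quoted above.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Least squares coefficients [b0, b1, ..., bp] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 gives the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# District data from the example (Y, X1, X2).
sales = [162, 120, 223, 131, 67, 169, 81, 192, 116, 55, 252, 232, 144, 103, 212]
population = [274, 180, 375, 205, 86, 265, 98, 330, 195, 53, 430, 372, 236, 157, 370]
income = [2450, 3254, 3802, 2838, 2347, 3782, 3008, 2450,
          2137, 2560, 4020, 4427, 2660, 2088, 2605]

b0, b1, b2 = ols(list(zip(population, income)), sales)
print(round(b0, 4), round(b1, 4), round(b2, 4))
```

Solving the normal equations directly is adequate for a small, well-behaved data set like this one; statistical packages typically use more numerically stable decompositions.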

Slide 15

Example: District Sales
• t test for significance of individual parameters
– Hypothesis
$$H_0: \beta_i = 0 \qquad H_a: \beta_i \neq 0$$
– Decision rule
For $\alpha = .05$ and d.f. = 15 – 2 – 1 = 12, $t_{.025} = 2.179$. Reject $H_0$ if $|t| > 2.179$.
– Test statistic
$$t = \frac{b_1}{s_{b_1}} = \frac{0.49600}{0.00605} = 81.92 \qquad t = \frac{b_2}{s_{b_2}} = \frac{0.00920}{0.000968} = 9.50$$
– Conclusions
Reject $H_0: \beta_1 = 0$. Reject $H_0: \beta_2 = 0$.

Slide 16

Example: District Sales
• To test whether sales are related to population and per capita discretionary income
– Hypothesis
$H_0: \beta_1 = \beta_2 = 0$
$H_a$: not both $\beta_1$ and $\beta_2$ equal to zero
– Decision rule
For $\alpha = .05$ and d.f. = 2, 12: $F_{.05} = 3.89$. Reject $H_0$ if F > 3.89.
– Test statistic
F = MSR/MSE = 26922/4.74 = 5679.47
– Conclusion
Reject $H_0$; sales are related to population and per capita discretionary income.

Slide 17

Example: District Sales
• $R^2$ = 99.89% means that 99.89% of the total variation of sales can be explained by its linear relation with population and per capita discretionary income.
• $R_a^2$ = 99.88%. Both $R^2$ and $R_a^2$ mean the model fits the data very well.

Slide 18

Regression Diagnostics
• Model assumptions about the error term $\varepsilon$
– The error $\varepsilon$ is a random variable with mean of zero, i.e., $E(\varepsilon) = 0$
– The variance of $\varepsilon$, denoted by $\sigma^2$, is the same for all values of the independent variable(s), i.e., $Var(\varepsilon) = \sigma^2$
– The values of $\varepsilon$ are independent.
– The error $\varepsilon$ is a normally distributed random variable.

Slide 19

Regression Diagnostics
• Residual analysis: validating model assumptions
• Calculate the residuals and check the following.
– Are the errors normally distributed?
  • Normal probability plot
– Is the error variance constant?
  • Plot of residuals against $\hat{y}$
– Are the errors uncorrelated (time series data)?
  • Plot of residuals against time periods
– Are there observations that are inaccurately recorded or do not belong to the target population?
  • Double check the accuracy of outliers and influential observations.

Slide 20

Autocorrelation
• Autocorrelation is present if the disturbance terms are correlated. Three issues need to be addressed.
– How does autocorrelation arise?
– How to detect autocorrelation?
– Alternative estimation strategies under autocorrelation

Slide 21

Causes of Autocorrelation
1. Omitting relevant regressors
Suppose the true model is
$$Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \varepsilon_t$$
But the model is mis-specified as
$$Y_t = \beta_0 + \beta_1 X_{1t} + \nu_t$$
That is,
$$\nu_t = \beta_2 X_{2t} + \varepsilon_t$$
If $X_{2t}$ is correlated with $X_{2,t-1}$, $\nu_t$ is also correlated with $\nu_{t-1}$. This is particularly serious if $X_{2t}$ represents a lagged dependent variable.

Slide 22

Causes of Autocorrelation
2. Specification errors in the functional form
Suppose the true model is
$$Y_t = \beta_0 + \beta_1 X_t + \beta_2 X_t^2 + \varepsilon_t$$
But the model is mis-specified as
$$Y_t = \beta_0 + \beta_1 X_t + \nu_t$$
$\nu_t$ would tend to be positive for X < A and X > B, and negative for A < X < B, where A and B denote the values of X at which the fitted straight line crosses the true curve. Residuals of the same sign therefore cluster together.

Slide 23

Causes of Autocorrelation
3. Measurement errors in the variables
Suppose $Y_t = Y_t^* + \nu_t$
where Y is the observed value, $Y^*$ is the true value and $\nu$ is the measurement error. Hence, the true model is
$$Y_t^* = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \dots + \beta_p X_{pt} + \varepsilon_t$$
and the observed model is
$$Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \dots + \beta_p X_{pt} + \underbrace{(\varepsilon_t + \nu_t)}_{u_t}$$
Given a "common" measurement method, it is likely that measurement errors in periods t and t-1 are correlated.

Slide 24

Causes of Autocorrelation
4. Pattern of business cycle
Time-series data relating to business and economics often exhibit a business-cycle pattern. Sluggishness during a recession persists over a certain period, and prosperity likewise continues for some time once it sets in. Successive observations therefore tend to be correlated.

Slide 25

Testing for First Order Autocorrelation
• First-order autocorrelation
– The error term in time period t is related to the error term in time period t–1 by the equation $\varepsilon_t = \rho\,\varepsilon_{t-1} + a_t$, where $a_t \sim N(0, \sigma_a^2)$.
– Use the Durbin-Watson test to test for the existence of first-order autocorrelation

Slide 26

Testing for First Order Autocorrelation
• Durbin-Watson test
– For positive autocorrelation
$H_0$: The error terms are not autocorrelated ($\rho = 0$)
$H_a$: The error terms are positively autocorrelated ($\rho > 0$)
– For negative autocorrelation
$H_0$: The error terms are not autocorrelated ($\rho = 0$)
$H_a$: The error terms are negatively autocorrelated ($\rho < 0$)
– For positive or negative autocorrelation
$H_0$: The error terms are not autocorrelated ($\rho = 0$)
$H_a$: The error terms are positively or negatively autocorrelated ($\rho \neq 0$)
– Test statistic
$$DW = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$$

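Slide 27

Testing for First Order Autocorrelation

$$
\begin{aligned}
DW &= \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}
 = \frac{\sum_{t=2}^{n} e_t^2 + \sum_{t=2}^{n} e_{t-1}^2 - 2\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=1}^{n} e_t^2} \\
&= \frac{\left(\sum_{t=1}^{n} e_t^2 - e_1^2\right) + \left(\sum_{t=1}^{n} e_t^2 - e_n^2\right) - 2\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=1}^{n} e_t^2} \\
&= \frac{2\sum_{t=1}^{n} e_t^2 - 2\sum_{t=2}^{n} e_t e_{t-1} - e_1^2 - e_n^2}{\sum_{t=1}^{n} e_t^2} \\
&= 2 - 2r - \frac{e_1^2 + e_n^2}{\sum_{t=1}^{n} e_t^2} \approx 2(1 - r),
\end{aligned}
$$

where r is the sample autocorrelation coefficient expressed as

$$r = \frac{\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=1}^{n} e_t^2}$$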

Slide 28

Testing for First Order Autocorrelation
• In "large samples", $DW \approx 2(1 - r)$
– If the disturbances are uncorrelated, then r = 0 and $DW \approx 2$
– If negative first-order autocorrelation exists, then r < 0 and DW > 2
– If positive first-order autocorrelation exists, then r > 0 and DW < 2
• Exact critical values of the Durbin-Watson test cannot be calculated. Instead, Durbin and Watson established upper ($d_U$) and lower ($d_L$) bounds for the critical values. They are for testing first-order autocorrelation only.

Slide 29

Testing for First Order Autocorrelation
• Test for positive autocorrelation
$H_0: \rho = 0$
$H_a: \rho > 0$
• Decision rules
– If $DW < d_{L,\alpha}$, we reject $H_0$.
– If $DW > d_{U,\alpha}$, we do not reject $H_0$.
– If $d_{L,\alpha} \le DW \le d_{U,\alpha}$, the test is inconclusive.

Slide 30

Example: Company Sales
• The Blaisdell Company wished to predict its sales (Y) by using industry sales (X) as a predictor variable.

Year   Quarter    t      X        Y
1977      1       1    127.3    20.96
          2       2    130      21.4
          3       3    132.7    21.96
          4       4    129.4    21.52
1978      1       5    135      22.39
          2       6    137.1    22.76
          3       7    141.2    23.48
          4       8    142.8    23.66
1979      1       9    145.5    24.1
          2      10    145.3    24.01
          3      11    148.3    24.54
          4      12    146.4    24.3
1980      1      13    150.2    25
          2      14    153.1    25.64
          3      15    157.3    26.36
          4      16    160.7    26.98
1981      1      17    164.2    27.52
          2      18    165.6    27.78
          3      19    168.7    28.24
          4      20    171.7    28.78

Slide 31

Example: Company Sales
• From the scatter plot, a linear regression model is appropriate

[Scatter plot of Y against X; X axis from 130 to 170, Y axis from 20 to 29]

Slide 32

Example: Company Sales
• SAS output [figure not available]

Slide 33

Example: Company Sales
• Estimated regression equation
$$\hat{Y} = -1.45475 + 0.17628 X$$
• The market research analyst was concerned with the possibility of positively correlated errors. Using the Durbin-Watson test:
$H_0: \rho = 0$
$H_a: \rho > 0$

Slide 34

Example: Company Sales

 t     e_t        e_{t-1}    e_t - e_{t-1}   (e_t - e_{t-1})^2   e_t^2
 1    -0.02605                                                   0.000679
 2    -0.06202   -0.02605    -0.03597        0.001294            0.003846
 3     0.02202   -0.06202     0.08404        0.007063            0.000485
 4     0.16375    0.02202     0.14173        0.020087            0.026814
 5     0.04657    0.16375    -0.11718        0.013731            0.002169
 6     0.04638    0.04657    -0.00019        3.61E-08            0.002151
 7     0.04362    0.04638    -0.00276        7.62E-06            0.001903
 8    -0.05844    0.04362    -0.10206        0.010416            0.003415
 9    -0.0944    -0.05844    -0.03596        0.001293            0.008911
10    -0.14914   -0.0944     -0.05474        0.002996            0.022243
11    -0.14799   -0.14914     0.00115        1.32E-06            0.021901
12    -0.05305   -0.14799     0.09494        0.009014            0.002814
13    -0.02293   -0.05305     0.03012        0.000907            0.000526
14     0.10585   -0.02293     0.12878        0.016584            0.011204
15     0.08546    0.10585    -0.02039        0.000416            0.007303
16     0.1061     0.08546     0.02064        0.000426            0.011257
17     0.02911    0.1061     -0.07699        0.005927            0.000847
18     0.04232    0.02911     0.01321        0.000175            0.001791
19    -0.04416    0.04232    -0.08648        0.007479            0.00195
20    -0.03301   -0.04416     0.01115        0.000124            0.00109
sum                                          0.097942            0.1333

$$DW = \frac{\sum_{t=2}^{20} (e_t - e_{t-1})^2}{\sum_{t=1}^{20} e_t^2} = \frac{0.09794}{0.13330} = 0.735$$

Suppose $\alpha$ = 0.01. For n = 20 (n denotes the number of observations) and k' = 1 (k' denotes the number of independent variables), $d_L$ = 0.95 and $d_U$ = 1.15.

Since $DW < d_L$, we conclude that the error terms are positively autocorrelated.
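The Durbin-Watson arithmetic for this example can be reproduced with a few lines of Python. This is a sketch using only the standard library; the function names `durbin_watson` and `lag1_autocorrelation` are assumptions made for this illustration. The residuals are the $e_t$ values tabulated for the company sales example, and the second function uses the sample autocorrelation r defined earlier so that the large-sample approximation $2(1 - r)$ can be compared with the exact statistic.

```python
def durbin_watson(residuals):
    """DW = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def lag1_autocorrelation(residuals):
    """r = sum_{t=2..n} e_t e_{t-1} / sum_{t=1..n} e_t^2."""
    num = sum(residuals[t] * residuals[t - 1] for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


# Residuals e_1, ..., e_20 from the company sales example.
residuals = [
    -0.02605, -0.06202, 0.02202, 0.16375, 0.04657,
    0.04638, 0.04362, -0.05844, -0.0944, -0.14914,
    -0.14799, -0.05305, -0.02293, 0.10585, 0.08546,
    0.1061, 0.02911, 0.04232, -0.04416, -0.03301,
]

dw = durbin_watson(residuals)
r = lag1_autocorrelation(residuals)
print(round(dw, 3), round(2 * (1 - r), 3))
```

With only 20 observations the approximation $2(1 - r)$ differs from DW by the end-point term $(e_1^2 + e_n^2)/\sum e_t^2$, which is why the two printed values are close but not identical.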

Slide 35

Testing for First Order Autocorrelation
• Remark
– In order to use the Durbin-Watson table, there must be an intercept term in the model.

Slide 36

Testing for First Order Autocorrelation
• Test for negative autocorrelation
$H_0: \rho = 0$
$H_a: \rho < 0$
• Decision rules
– If $4 - DW < d_{L,\alpha}$, we reject $H_0$.
– If $4 - DW > d_{U,\alpha}$, we do not reject $H_0$.
– If $d_{L,\alpha} \le 4 - DW \le d_{U,\alpha}$, the test is inconclusive.

Slide 37

Testing for First Order Autocorrelation
• Test for positive or negative autocorrelation
$H_0: \rho = 0$
$H_a: \rho \neq 0$
• Decision rules
– If $DW < d_{L,\alpha/2}$ or $4 - DW < d_{L,\alpha/2}$, we reject $H_0$.
– If $DW > d_{U,\alpha/2}$ and $4 - DW > d_{U,\alpha/2}$, we do not reject $H_0$.
– If $d_{L,\alpha/2} \le DW \le d_{U,\alpha/2}$ or $d_{L,\alpha/2} \le 4 - DW \le d_{U,\alpha/2}$, the test is inconclusive.
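The three sets of decision rules (positive, negative, and two-sided alternatives) can be collected in one small helper. This is a sketch under stated assumptions: the function name `durbin_watson_decision` and the `alternative` labels are hypothetical, and the caller is responsible for passing bounds read from a Durbin-Watson table at the right level ($\alpha$ for the one-sided tests, $\alpha/2$ for the two-sided test).

```python
def durbin_watson_decision(dw, d_lower, d_upper, alternative="positive"):
    """Apply the Durbin-Watson bounds test.

    d_lower and d_upper are the tabulated bounds d_L and d_U for the chosen
    significance level, sample size and number of regressors.
    """
    if alternative == "positive":
        stats = [dw]
    elif alternative == "negative":
        stats = [4.0 - dw]
    elif alternative == "two-sided":
        stats = [dw, 4.0 - dw]
    else:
        raise ValueError("alternative must be 'positive', 'negative' or 'two-sided'")

    if any(s < d_lower for s in stats):
        return "reject H0"
    if all(s > d_upper for s in stats):
        return "do not reject H0"
    return "inconclusive"


# Company sales example: DW = 0.735 with d_L = 0.95 and d_U = 1.15.
print(durbin_watson_decision(0.735, 0.95, 1.15, "positive"))
```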

Slide 38

Testing for First Order Autocorrelation
• Remarks
– The validity of the Durbin-Watson test depends on the assumption that the population of all possible residuals at any time t has a normal distribution.
– Positive autocorrelation is found in practice more commonly than negative autocorrelation.
– First-order autocorrelation is not the only type of autocorrelation.

Slide 39

Solutions to Autocorrelation (1)

1. Re-examine the model. The typical causes of autocorrelation are omitted regressors or wrong functional forms.
2. Use an alternative estimation strategy. Several approaches are commonly used. The approach considered here is the two-step Cochrane-Orcutt procedure.

Consider the following model with AR(1) disturbances:
$$Y_t = \beta_1 + \beta_2 X_t + \varepsilon_t, \qquad (1)$$
with
$$\varepsilon_t = \rho\,\varepsilon_{t-1} + u_t.$$

Slide 40

Solutions to Autocorrelation (2)

Since equation (1) holds true for all observations, in terms of the (t-1)th observation, we have
$$Y_{t-1} = \beta_1 + \beta_2 X_{t-1} + \varepsilon_{t-1}, \qquad (2)$$
where
$$\varepsilon_{t-1} = \rho\,\varepsilon_{t-2} + u_{t-1}.$$
Now, multiplying (2) by $\rho$, we obtain
$$\rho Y_{t-1} = \rho\beta_1 + \rho\beta_2 X_{t-1} + \rho\,\varepsilon_{t-1}. \qquad (3)$$
Subtracting (3) from (1), we get
$$(Y_t - \rho Y_{t-1}) = \beta_1 (1 - \rho) + \beta_2 (X_t - \rho X_{t-1}) + (\varepsilon_t - \rho\,\varepsilon_{t-1})$$
That is,
$$Y_t^* = \beta_1^* + \beta_2 X_t^* + u_t \qquad (4)$$
where $Y_t^* = Y_t - \rho Y_{t-1}$, $X_t^* = X_t - \rho X_{t-1}$ and $\beta_1^* = \beta_1 (1 - \rho)$.

Note that the $u_t$'s are uncorrelated. However, $\rho$ is unknown and needs to be estimated.

Slide 41

Two-step Cochrane-Orcutt

1. Estimate equation (1) by the least squares method and obtain the resulting residuals $e_t$. Regress $e_t = \rho\, e_{t-1} + u_t$ and obtain
$$r = \frac{\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=2}^{n} e_{t-1}^2}$$
2. Substitute r for $\rho$ in equation (4) and obtain OLS estimates of the coefficients based on equation (4).
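The two steps above can be sketched in plain Python for the single-regressor model (1). This is a minimal illustration rather than a reference implementation: the function names are assumptions made here, the toy data are hypothetical, and the intercept of the original model is recovered as $b_1^* / (1 - r)$ from the transformed regression.

```python
def ols_simple(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def estimate_rho(residuals):
    """Regress e_t on e_{t-1} without an intercept (step 1)."""
    num = sum(residuals[t] * residuals[t - 1] for t in range(1, len(residuals)))
    den = sum(residuals[t - 1] ** 2 for t in range(1, len(residuals)))
    return num / den


def quasi_difference(series, r):
    """Form z_t - r * z_{t-1} for t = 2, ..., n (one observation is lost)."""
    return [series[t] - r * series[t - 1] for t in range(1, len(series))]


def cochrane_orcutt_two_step(x, y):
    """Two-step Cochrane-Orcutt estimates (intercept, slope, r)."""
    b1, b2 = ols_simple(x, y)                                   # OLS on model (1)
    residuals = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]
    r = estimate_rho(residuals)                                 # step 1
    b1_star, b2_co = ols_simple(quasi_difference(x, r),         # step 2: OLS on (4)
                                quasi_difference(y, r))
    return b1_star / (1.0 - r), b2_co, r


# Hypothetical toy data, used only to show the calls.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [6.0, 8.0, 10.0, 13.0, 17.0, 21.0]
intercept, slope, r = cochrane_orcutt_two_step(x, y)
print(intercept, slope, r)
```

Repeating steps 1 and 2 with residuals recomputed from the new coefficient estimates, until r stabilises, gives the iterative variant of the procedure.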

Slide 42

The following table gives the annual U.S. personal consumption expenditure (C) in billions of 1978 dollars from 1976 to 1990 inclusive:

[Table not available]

Slide 43

An OLS linear trend model has been fitted to the above data, and it gives the following residuals:

[Table not available]

Slide 44

To test for positive first-order autocorrelation in the error and hence estimate a model for this error process, consider

$H_0: \rho = 0$
$H_a: \rho > 0$

Using the Durbin-Watson test,

$$DW = \frac{\sum_{t=2}^{15} (e_t - e_{t-1})^2}{\sum_{t=1}^{15} e_t^2} = \frac{627.7213}{1514.1004} = 0.4146$$

Slide 45

When k' = 1 and n = 15,
$d_L$ = 1.08, $d_U$ = 1.36

Since DW = 0.4146 < $d_L$, we reject $H_0$.

By regressing $e_t$ on $e_{t-1}$, we obtain r = 0.79

Hence the error process is
$$e_t = 0.79\, e_{t-1} + u_t$$

Re-estimate the trend model for consumption using the two-step Cochrane-Orcutt procedure.

Slide 46

Using the transformed model
$$C_t - r C_{t-1} = \beta_1 (1 - r) + \beta_2 [t - r(t - 1)] + u_t$$
with t = 1 indicating year 1976, sequentially until t = 15 representing year 1990, the transformed data are tabulated in the following table.

[Table not available]

Slide 47

Applying OLS to the transformed data yields
$$\hat{C}_t^* = 41.415 + 18.688\, t^*$$
or
$$\hat{C}_t = 41.415 + 0.79\, C_{t-1} + 18.688\, t^*$$
That is,
$$\hat{\beta}_2 = 18.688, \qquad \hat{\beta}_1 = \frac{41.415}{1 - 0.79} = 197.21$$
are parameter estimates of the original model.

Slide 48

Note that
1. Because lagged values of Y and X had to be formed, we are left with n-1 observations only.
2. The estimate r is obtained from OLS estimation that assumes a standard linear regression model satisfying all classical assumptions. It may not be an efficient estimator of $\rho$. This leads to the iterative Cochrane-Orcutt estimator.

Slide 49

Chapter Summary
• Simple linear regression
• Multiple regression
• Regression on dummy variables
• Autocorrelation
– Durbin-Watson test
– Two-step Cochrane-Orcutt procedure