Returning to Consumption


More on Consumption

• We return to the consumption problem to illustrate the issue of heteroscedasticity

• It turns out that OLS may NOT give us the best estimate of the MPC

• The reason is that one of the assumptions of the Gauss-Markov (GM) theorem is probably violated in the consumption model

• The data is probably not homoscedastic

• $\mathrm{Var}(u_i \mid x_i) \neq \sigma^2$


Homoscedastic

[Figure: E(y|x) plotted against income x, with x = 600, x = 900 and x = 1200 marked on the horizontal axis.]


Heteroscedastic

[Figure: E(cons|income), with consumption on the vertical axis and income on the horizontal axis; the plot is annotated with poor people and rich people.]


Characteristics of Heteroscedasticity

• Systematic pattern exists in variance of residuals:

• $\mathrm{Var}(u_i \mid x_i) = \sigma_i^2 = \sigma^2 f(x_i)$

• $\sigma_i^2 = f(\alpha_1 + \alpha_2 Z_2 + \dots + \alpha_k Z_k)$

• variance has different values for different observations or groups of observations

• Intuition: if the random part comes from a roll of dice, then homoscedasticity is rolling the same dice for every observation and heteroscedasticity is rolling different dice

• Evident in cross-section data or time series
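The dice intuition can be made concrete with a small sketch. It is only an illustration, written in Python rather than the Stata used elsewhere in these slides; the helper `die_variance` and the choice of die sizes are assumptions made for the example, not part of the original material.

```python
def die_variance(sides):
    """Variance of one roll of a fair die with the given number of sides."""
    faces = range(1, sides + 1)
    mean = sum(faces) / sides
    return sum((face - mean) ** 2 for face in faces) / sides


# Five observations; the die used for each observation's random part.
same_dice = [6, 6, 6, 6, 6]          # homoscedastic: one die for everyone
different_dice = [4, 6, 8, 12, 20]   # heteroscedastic: die changes with i

homo_variances = [die_variance(s) for s in same_dice]
hetero_variances = [die_variance(s) for s in different_dice]

print(homo_variances)    # every entry is the same number
print(hetero_variances)  # entries grow with the size of the die
```

With the same die the variance of the random part is identical for every observation; with different dice it changes from one observation to the next, which is the pattern $\sigma_i^2$ describes.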


Consequences

• OLS is unbiased

• OLS is consistent

• OLS is no longer efficient

• Variance formula used previously is incorrect

– significance tests, confidence intervals etc. based on it cannot be used

• Aside: a corrected formula can be used

– Stata: regress y x, robust

– We don’t bother with this because we can do better with an alternative estimator
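For the simple regression $y_i = \beta_1 + \beta_2 x_i + u_i$, the corrected formula referred to in the aside is usually written in White's heteroscedasticity-robust form. The display below is a sketch of the standard textbook expressions, not a formula taken from these slides; $\hat{u}_i$ denotes the OLS residuals and $S_{xx} = \sum_i (x_i - \bar{x})^2$ is notation introduced here.

```latex
\[
\widehat{\mathrm{Var}}_{\text{usual}}(\hat{\beta}_2) = \frac{s^2}{S_{xx}},
\qquad s^2 = \frac{\sum_i \hat{u}_i^2}{n - 2}
\]
\[
\widehat{\mathrm{Var}}_{\text{robust}}(\hat{\beta}_2)
  = \frac{\sum_i (x_i - \bar{x})^2 \, \hat{u}_i^2}{S_{xx}^2}
\]
```

The usual formula assumes a single $\sigma^2$ for every observation, while the robust version lets each squared residual stand in for its own $\sigma_i^2$. Statistical packages often multiply the robust expression by a small-sample factor such as $n/(n-2)$.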


Testing for Heteroscedasticity

• Plot of residuals

• Sort the residuals by an explanatory variable and plot them against that variable; look for a pattern. Do this for each explanatory variable

• Not a formal test but can give an idea of what's going on

• Can use it to reject the idea of heteroscedasticity (het)


An example of Het

[Figure: scatter plot of the residuals $e_i$ against $X_i$.]


Consumption Example

[Figure: residuals (vertical axis, ticks from -1500 to 1000) plotted against monthly consumption (horizontal axis, ticks from -1000 to 3000).]


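Goldfeld-Quandt test

• Used when $\sigma_i^2 = \sigma^2 f(x_i)$, i.e. the variance is related to one variable only

1. State the hypotheses

– Test H0: $\sigma_i^2 = \sigma^2$ against H1: $\sigma_i^2 \neq \sigma^2$

– Note: the null is homoscedasticity

2. Sort the observations in ascending order of $x_i$

3. Omit the middle 20% of observations (c of them): (n-c) observations remain

4. Estimate the original model separately for the two samples

– first (n-c)/2 obs (keep RSS1)

– last (n-c)/2 obs (keep RSS2)

5. Compute: g = RSS2/RSS1 (if the variance is suspected to fall with $x_i$, use RSS1/RSS2 so that the larger RSS is on top)

6. If g > Fc(df,df), reject the null hypothesis of homoscedasticity at the chosen significance level, where df is the residual degrees of freedom of each subsample regression

• Test can be carried out for each $x_i$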
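The steps above can be sketched in code. This is a minimal Python illustration using only the standard library, not the procedure as run in Stata; the function names `ols_rss` and `goldfeld_quandt`, the `omit_share` parameter and the simulated data are all assumptions of the sketch. The F critical value is not computed here and would still be looked up in tables.

```python
import random


def ols_rss(x, y):
    """Fit y = a + b*x by OLS and return the residual sum of squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))


def goldfeld_quandt(x, y, omit_share=0.2):
    """Return g = RSS2/RSS1 and the residual df of each subsample regression."""
    pairs = sorted(zip(x, y))                 # step 2: sort by x
    n = len(pairs)
    c = int(round(omit_share * n))            # step 3: drop the middle c obs
    m = (n - c) // 2                          # size of each outer subsample
    low, high = pairs[:m], pairs[-m:]
    rss1 = ols_rss([p[0] for p in low], [p[1] for p in low])     # step 4
    rss2 = ols_rss([p[0] for p in high], [p[1] for p in high])
    return rss2 / rss1, m - 2                 # step 5 (two coefficients fitted)


# Simulated data (an assumption of this sketch): error s.d. proportional to x.
random.seed(0)
x = [float(i) for i in range(1, 201)]
y = [10 + 0.75 * xi + random.gauss(0, xi) for xi in x]

g, df = goldfeld_quandt(x, y)
print(f"g = {g:.2f} on ({df}, {df}) degrees of freedom")
```

Because both subsamples have the same residual degrees of freedom, the ratio of the two RSS values equals the ratio of the two residual variances.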


Intuition of GQ

• If het does exist then we can split sample into a low variance and high variance bit

• Run the regression separately for the two samples

• Calculate the ratio of variances of the residual (remember $s^2 = \mathrm{RSS}/\mathrm{df}$)

• If this ratio is 1 then they are equal and the data is homoscedastic

• So reject null of homoscedasticity if bigger than 1

• How much bigger? Bigger than F critical value


Consumption Example

• Test it for nmwage

1. State the hypotheses

– Test H0: $\sigma_i^2 = \sigma^2$ against H1: $\sigma_i^2 \neq \sigma^2$

– Note: the null is homoscedasticity

2. Sort the observations in ascending order of nmwage

– Stata command: sort nmwage

3. Omit the middle 20% of observations: (n-c) observations remain

– Two samples: 1..550 and 781..1330

4. Estimate the original model separately for the two samples

5. Compute g as the ratio of the two RSS values with the larger on top; here that is RSS1 (first sample) over RSS2 (last sample)

6. g = 92733079.9 / 91033129.2 = 1.018674

7. If g > Fc(df,df), reject the null hypothesis of homoscedasticity at the chosen significance level

– 5% sig level: F(548,548) ≈ 1.15 (each subsample regression has 548 residual degrees of freedom)

• So cannot reject the null at 5% significance level

• Test can be carried out for each $x_i$
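The arithmetic of this example can be checked with a few lines of Python. The numbers are the residual sums of squares and degrees of freedom reported in the two Stata regressions for this example; the variable names are made up for the sketch, and the critical value is simply the 1.15 quoted above rather than something computed here.

```python
# Residual sums of squares and residual df reported by the two regressions.
rss_first, df_first = 92733079.9, 548   # observations 1..550
rss_last, df_last = 91033129.2, 548     # observations 781..1330

# Ratio of residual variances s^2 = RSS/df, larger variance on top.
s2_first = rss_first / df_first
s2_last = rss_last / df_last
g = max(s2_first, s2_last) / min(s2_first, s2_last)

f_critical = 1.15  # 5% critical value quoted on the slide
print(f"g = {g:.6f}")
print("reject H0" if g > f_critical else "cannot reject H0")
```

Since the two subsamples have equal degrees of freedom, the variance ratio is the same as the ratio of the RSS values.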


. sort nmwage

. regress cons nmwage if _n<=550

      Source |       SS       df       MS              Number of obs =     550
-------------+------------------------------           F(  1,   548) =   49.12
       Model |   8312140.6     1   8312140.6           Prob > F      =  0.0000
    Residual |  92733079.9   548  169220.949           R-squared     =  0.0823
-------------+------------------------------           Adj R-squared =  0.0806
       Total |   101045220   549  184053.225           Root MSE      =  411.36

------------------------------------------------------------------------------
        cons |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      nmwage |   .7304001   .1042153     7.01   0.000     .5256897    .9351104
       _cons |   67.60298   49.48125     1.37   0.172    -29.59315    164.7991
------------------------------------------------------------------------------

. regress cons nmwage if _n>780

      Source |       SS       df       MS              Number of obs =     550
-------------+------------------------------           F(  1,   548) =  118.87
       Model |  19747100.1     1  19747100.1           Prob > F      =  0.0000
    Residual |  91033129.2   548  166118.849           R-squared     =  0.1783
-------------+------------------------------           Adj R-squared =  0.1768
       Total |   110780229   549  201785.481           Root MSE      =  407.58

------------------------------------------------------------------------------
        cons |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      nmwage |   .7270654   .0666855    10.90   0.000      .596075    .8580558
       _cons |   94.26419   75.29303     1.25   0.211    -53.63408    242.1625
------------------------------------------------------------------------------


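White’s Test

• More general test that allows more than one variable to influence the variance of the residuals

1. Estimate the model $y_i = \beta_1 + \beta_2 x_{2i} + \beta_3 x_{3i} + u_i$

2. Run the auxiliary regression of the squared residuals on the X variables, their squares and their cross products:

$e_i^2 = \alpha_1 + \alpha_2 x_{2i} + \alpha_3 x_{3i} + \alpha_4 x_{2i}^2 + \alpha_5 x_{3i}^2 + \alpha_6 x_{2i} x_{3i} + v_i$

3. Null hypothesis is homoscedastic errors, i.e. $\alpha_2 = \alpha_3 = \alpha_4 = \alpha_5 = \alpha_6 = 0$, so that the expected value of $e_i^2$ is constant

4. Calculate $nR^2 \sim \chi^2_{df}$, where $R^2$ comes from the auxiliary regression and df is the number of slope coefficients in it (5 here)

5. If $nR^2$ exceeds the $\chi^2_{df}$ critical value, reject the null hypothesis

• Comment: why not an F-test?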


Consumption Example

• Test it for nmwage

1. State the hypotheses

– Test H0: $\sigma_i^2 = \sigma^2$ against H1: $\sigma_i^2 \neq \sigma^2$

– Note: the null is homoscedasticity

2. Estimate the model and generate the squared residuals

3. Regress the squared residuals on all of the variables that may cause heteroscedasticity

4. Form the test statistic: $nR^2 = 0.266$

5. Find the critical value: chi-squared, df = 2, alpha = 0.05, which is 5.99

6. We cannot reject the null at the 5% significance level
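The test statistic and critical value in this example can be reproduced with a short Python check. It uses the R-squared exactly as printed (rounded to four decimals) in the auxiliary regression output, so it is a sketch of the slide's arithmetic rather than a full implementation of the test; the closed forms used for the critical value and p-value hold only for 2 degrees of freedom.

```python
import math

n = 1330            # observations in the auxiliary regression
r_squared = 0.0002  # R-squared as printed (rounded) in the Stata output
df = 2              # regressors in the auxiliary regression: nmwage, nmwage2

stat = n * r_squared

# For df = 2 only, the chi-squared survival function is exp(-x/2),
# so the critical value and p-value have closed forms.
alpha = 0.05
critical = -2 * math.log(alpha)
p_value = math.exp(-stat / 2)

print(f"nR2 = {stat:.3f}, critical value = {critical:.2f}, p = {p_value:.3f}")
print("reject H0" if stat > critical else "cannot reject H0")
```

For other degrees of freedom the critical value would come from chi-squared tables or a statistics package.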


. predict u, residual

. gen u2=u^2

. gen nmwage2=nmwage^2

. regress u2 nmwage nmwage2

      Source |       SS       df       MS              Number of obs =    1330
-------------+------------------------------           F(  2,  1327) =    0.11
       Model |  1.1824e+10     2  5.9121e+09           Prob > F      =  0.8921
    Residual |  6.8688e+13  1327  5.1762e+10           R-squared     =  0.0002
-------------+------------------------------           Adj R-squared = -0.0013
       Total |  6.8700e+13  1329  5.1693e+10           Root MSE      = 2.3e+05

------------------------------------------------------------------------------
          u2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      nmwage |  -13.03642   58.98887    -0.22   0.825     -128.758    102.6852
     nmwage2 |   .0109503   .0325921     0.34   0.737    -.0529874    .0748881
       _cons |   163843.1   24661.71     6.64   0.000     115462.9    212223.3
------------------------------------------------------------------------------


Efficient Estimation

• If we find heteroscedasticity we know that OLS will be inefficient

• Remember why this might be a problem (see over)

• Can we do better?

• Yes. There is an efficient estimator called Generalised Least Squares (GLS)

• Two steps

1. Remove the heteroscedasticity from the data

2. Do OLS on the transformed data


Prob of error is lower for efficient estimator at any sample size

[Figure: two estimators compared at the same sample size, with the TRUE value marked.]


The GLS Procedure

• Assume that $\sigma_i^2$ is known

• Basic model: $Y_i = \beta_1 + \beta_2 X_i + u_i$, $E(u_i^2) = \sigma_i^2$ (not constant)

• Create new data with each observation divided by its heteroscedastic standard deviation $\sigma_i$
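Written out, dividing every term of the basic model by $\sigma_i$ gives the transformed regression below. The starred names $Y_i^*$, $X_{1i}^*$ and $X_{2i}^*$ are notation introduced here for the sketch; $u_i^* = u_i/\sigma_i$ is the transformed error.

```latex
\[
\frac{Y_i}{\sigma_i}
  = \beta_1 \frac{1}{\sigma_i} + \beta_2 \frac{X_i}{\sigma_i} + \frac{u_i}{\sigma_i}
\quad\Longleftrightarrow\quad
Y_i^* = \beta_1 X_{1i}^* + \beta_2 X_{2i}^* + u_i^*,
\]
\[
Y_i^* = \frac{Y_i}{\sigma_i}, \qquad
X_{1i}^* = \frac{1}{\sigma_i}, \qquad
X_{2i}^* = \frac{X_i}{\sigma_i}, \qquad
u_i^* = \frac{u_i}{\sigma_i}.
\]
```

The coefficients $\beta_1$ and $\beta_2$ are the same as in the basic model; only the variables they multiply have changed.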


The GLS Procedure

• Then run the regression on the transformed data

• The slope estimates are the BLUE of the coefficients of the original model

• Note the intercept term is slightly different (it has now become the coefficient on a variable)
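A minimal Python sketch of this procedure follows, assuming the standard deviations $\sigma_i$ are known. The function name `gls_known_sigma` and the simulated data (true intercept 2, true slope 0.75, error standard deviation equal to half of x) are assumptions made for illustration and are not taken from the consumption data.

```python
import random


def gls_known_sigma(x, y, sigma):
    """OLS of y/sigma on 1/sigma and x/sigma, with no constant term."""
    z1 = [1.0 / s for s in sigma]                 # transformed "intercept" variable
    z2 = [xi / s for xi, s in zip(x, sigma)]      # transformed regressor
    ys = [yi / s for yi, s in zip(y, sigma)]      # transformed dependent variable

    # Normal equations for two regressors without a constant.
    a11 = sum(v * v for v in z1)
    a12 = sum(u * v for u, v in zip(z1, z2))
    a22 = sum(v * v for v in z2)
    b1 = sum(u * v for u, v in zip(z1, ys))
    b2 = sum(u * v for u, v in zip(z2, ys))

    det = a11 * a22 - a12 * a12
    beta1 = (a22 * b1 - a12 * b2) / det           # coefficient on 1/sigma
    beta2 = (a11 * b2 - a12 * b1) / det           # coefficient on x/sigma
    return beta1, beta2


# Simulated data (assumed for illustration): s.d. of the error is 0.5 * x.
random.seed(1)
x = [random.uniform(1, 10) for _ in range(2000)]
sigma = [0.5 * xi for xi in x]
y = [2.0 + 0.75 * xi + random.gauss(0, s) for xi, s in zip(x, sigma)]

beta1, beta2 = gls_known_sigma(x, y, sigma)
print(f"intercept = {beta1:.3f}, slope = {beta2:.3f}")
```

The original intercept is recovered as the coefficient on the variable 1/sigma, which is why the transformed regression is run without a constant.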


How it Works

• GLS eliminates heteroscedasticity

• To see this note that, with $u_i^* = u_i/\sigma_i$ the transformed error,

– $\mathrm{var}(u_i^*) = E(u_i^{*2}) = E\left[(u_i/\sigma_i)^2\right] = \frac{1}{\sigma_i^2} E(u_i^2) = \frac{1}{\sigma_i^2}\,\sigma_i^2 = 1$

– var of transformed error term is homoskedastic: it is constant

• NB This model does not have a constant now: it has two explanatory variables: $1/\sigma_i$ and $X_i/\sigma_i$

• Cannot apply GLS if the exact type of hetero is unknown. So do FGLS (Feasible GLS) and replace $\sigma_i$ with an estimate $\hat{\sigma}_i$

– From White’s test


The Consumption Example

• Transform the data to eliminate the heteroscedasticity

• Use the estimate of $\sigma_i^2$ from White’s test

• Stata commands

– predict white

– generate c=cons/sqrt(white)

– generate y=nmwage/sqrt(white)

• The GLS estimate of the MPC is given by the regression on the transformed data


. predict white
(option xb assumed; fitted values)

. gen sigma=white^0.5

. gen c=cons/sigma

. gen y=nmwage/sigma

. regress c y

      Source |       SS       df       MS              Number of obs =    1330
-------------+------------------------------           F(  1,  1328) =  583.66
       Model |  584.542068     1  584.542068           Prob > F      =  0.0000
    Residual |  1330.00395  1328    1.001509           R-squared     =  0.3053
-------------+------------------------------           Adj R-squared =  0.3048
       Total |  1914.54602  1329  1.44059144           Root MSE      =  1.0008

------------------------------------------------------------------------------
           c |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           y |   .7552474   .0312614    24.16   0.000     .6939202    .8165746
       _cons |   .1572505   .0652215     2.41   0.016     .0293021     .285199
------------------------------------------------------------------------------


Conclusion

• The example didn’t appear to have heteroscedasticity.

• When het does exist the difference between GLS and OLS can be substantial

• Both are unbiased and consistent

• GLS is preferable because it is efficient, so there is a lower probability of substantial error