The Classical Two-Variable Regression Model II


  • 1/33

    EC114 Introduction to Quantitative Economics
    16. The Classical Two-Variable Regression Model II

    Marcus Chambers

    Department of Economics, University of Essex

    21/23 February 2012


  • 2/33

    Outline

    1 Introduction

    2 Inference in the CLRM

    3 Summary of Computational Procedure

    4 Non-Linear Models

    5 Prediction

    Reference: R. L. Thomas, Using Statistics in Economics, McGraw-Hill, 2005, sections 12.3–12.5.


  • Introduction 3/33

    We have seen that the Classical Linear Regression Model (CLRM) consists of a (population) regression equation,

    $$Y_i = \alpha + \beta X_i + \epsilon_i, \quad i = 1, \ldots, n,$$

    and a set of assumptions concerning $X$ and $\epsilon$. The assumptions are:

    IA (non-random X): $X$ is non-stochastic (non-random);

    IB (fixed X): the values of $X$ are fixed in repeated samples;

    IIA (zero mean): $E(\epsilon_i) = 0$ for all $i$;

    IIB (constant variance): $V(\epsilon_i) = \sigma^2 = \text{constant}$ for all $i$;

    IIC (zero covariance): $\mathrm{Cov}(\epsilon_i, \epsilon_j) = 0$ for all $i \neq j$;

    IID (normality): each $\epsilon_i$ is normally distributed.


  • Introduction 4/33

    Different combinations of assumptions yield different properties for the ordinary least squares (OLS) estimators, $a$ and $b$, of $\alpha$ and $\beta$:

    Property       Assumptions
    Linearity      IA, IB
    Unbiasedness   IA, IB, IIA
    BLUness        IA, IB, IIA, IIB, IIC
    Efficiency     IA, IB, IIA, IIB, IIC, IID
    Normality      IA, IB, IIA, IIB, IIC, IID

    It is the last of these properties that will form the basis for inference (hypothesis testing) concerning the population parameters $\alpha$ and $\beta$.


  • Inference in the CLRM 5/33

    Under Assumptions IA, IB, IIA, IIB, IIC, and IID, the OLS estimators are BLUE (best linear unbiased estimators) as well as normally distributed.

    We have seen that the sampling distributions of $a$ and $b$ are given by

    $$a \sim N(\alpha, \sigma_a^2), \qquad b \sim N(\beta, \sigma_b^2),$$

    where

    $$V(a) = \sigma_a^2 = \frac{\sigma^2 \sum X_i^2}{n \sum x_i^2}, \qquad V(b) = \sigma_b^2 = \frac{\sigma^2}{\sum x_i^2}.$$

    These distributions provide a basis for making inferences about $\alpha$ and $\beta$.


  • Inference in the CLRM 6/33

    Standardising, we obtain

    $$\frac{a - \alpha}{\sigma_a} \sim N(0, 1), \qquad \frac{b - \beta}{\sigma_b} \sim N(0, 1),$$

    suggesting that the $N(0, 1)$ distribution can be used for inference.

    However, the problem is that we don't know $\sigma^2$ and, hence, we can't compute $\sigma_a^2$ and $\sigma_b^2$. We therefore need to estimate $\sigma^2$.

    An unbiased estimator of $\sigma^2$ is

    $$s^2 = \frac{\sum e_i^2}{n - 2},$$

    i.e. $E(s^2) = \sigma^2$.
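    As a minimal sketch (assuming a NumPy array of OLS residuals; the function name is illustrative, not from the lecture), the estimator divides the residual sum of squares by $n - 2$:

```python
import numpy as np

def s_squared(residuals):
    """Unbiased estimator of the disturbance variance sigma^2 in the
    two-variable CLRM: s^2 = sum(e_i^2) / (n - 2)."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(e**2) / (e.size - 2)
```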


  • Inference in the CLRM 7/33

    Note that the denominator of $s^2$ involves $n - 2$ and not $n - 1$. This is because, in constructing the $e_i$, we have had to estimate two parameters ($\alpha$ and $\beta$) and have therefore used up two degrees of freedom.

    The estimated variances of $a$ and $b$ are then

    $$s_a^2 = \frac{s^2 \sum X_i^2}{n \sum x_i^2}, \qquad s_b^2 = \frac{s^2}{\sum x_i^2}.$$

    The estimated standard errors are given by the square roots of $s_a^2$ and $s_b^2$ and are denoted $s_a$ and $s_b$ respectively.


  • Inference in the CLRM 8/33

    The corresponding standardised versions of $a$ and $b$ are

    $$\frac{a - \alpha}{s_a} \sim t_{n-2}, \qquad \frac{b - \beta}{s_b} \sim t_{n-2}.$$

    These distributions are Student's $t$ because we have had to estimate $\sigma^2$ using $s^2$. The distributions have $n - 2$ degrees of freedom because we have lost two degrees of freedom through estimating $\alpha$ and $\beta$.

    The standardised variables above are used to construct confidence intervals and to test hypotheses concerning $\alpha$ and $\beta$ using the $t_{n-2}$ distribution.


  • Inference in the CLRM 9/33

    95% confidence intervals (CIs) for $\alpha$ and $\beta$ can be constructed as follows:

    for $\alpha$, the 95% CI is $a \pm t_{0.025}\, s_a$;
    for $\beta$, the 95% CI is $b \pm t_{0.025}\, s_b$;

    where $t_{0.025}$ is the value from the $t_{n-2}$ distribution that puts 2.5% of the distribution into each tail.

    The interpretation is that we are 95% confident that $\alpha$ lies in the interval

    $$[a - t_{0.025}\, s_a, \; a + t_{0.025}\, s_a],$$

    while we are 95% confident that $\beta$ lies in the interval

    $$[b - t_{0.025}\, s_b, \; b + t_{0.025}\, s_b].$$
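    As an illustrative sketch (assuming SciPy is available; the function and argument names are my own, not the lecture's), a CI of this form can be computed as:

```python
from scipy import stats

def t_confidence_interval(estimate, std_error, n, level=0.95):
    """Two-sided CI for a CLRM coefficient, using t with n - 2 df."""
    t_crit = stats.t.ppf(1 - (1 - level) / 2, df=n - 2)  # e.g. t_0.025 for 95%
    return estimate - t_crit * std_error, estimate + t_crit * std_error
```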


  • Inference in the CLRM 10/33

    Example. In Lecture 12 we looked at the demand for money using data for 30 countries in 1985, obtaining

    $$\hat{Y} = 0.0212 + 0.17485\, X,$$

    where $Y$ is money stock and $X$ is GDP.

    Let's work out the standard errors of $a$ and $b$ and, hence, 95% confidence intervals. We shall need the following sample statistics:

    $$\sum y_i^2 = 26.403, \quad \sum x_i^2 = 666.86, \quad \sum x_i y_i = 116.60, \quad \sum X_i^2 = 1274.66.$$


  • Inference in the CLRM 11/33

    We first need to compute $s^2$; for this we need

    $$\sum e_i^2 = \sum y_i^2 - b \sum x_i y_i = 26.403 - (0.17485 \times 116.60) = 6.0155.$$

    It follows that

    $$s^2 = \frac{\sum e_i^2}{n - 2} = \frac{6.0155}{28} = 0.2148.$$

    We then obtain

    $$s_a^2 = \frac{s^2 \sum X_i^2}{n \sum x_i^2} = \frac{0.2148 \times 1274.66}{30 \times 666.86} = 0.0136,$$

    $$s_b^2 = \frac{s^2}{\sum x_i^2} = \frac{0.2148}{666.86} = 0.0003221.$$


  • Inference in the CLRM 12/33

    The resulting standard errors are

    $$s_a = \sqrt{0.0136} = 0.1170, \qquad s_b = \sqrt{0.0003221} = 0.01795.$$

    We can use these to form the confidence intervals; the value $t_{0.025}$ for the $t_{28}$ distribution is 2.048. Hence for $\alpha$ the 95% CI is

    $$a \pm t_{0.025}\, s_a = 0.0212 \pm (2.048 \times 0.1170) = 0.0212 \pm 0.2396,$$

    so the interval is $[-0.2184, 0.2608]$, while for $\beta$

    $$b \pm t_{0.025}\, s_b = 0.17485 \pm (2.048 \times 0.01795) = 0.17485 \pm 0.03676,$$

    yielding the interval $[0.1381, 0.2116]$.
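    These calculations can be reproduced directly from the reported sample statistics; the following is a rough sketch (values taken from the slides, so rounding may differ slightly in the last digit):

```python
import numpy as np
from scipy import stats

n = 30
sum_y2, sum_x2, sum_xy, sum_X2 = 26.403, 666.86, 116.60, 1274.66
a, b = 0.0212, 0.17485

sum_e2 = sum_y2 - b * sum_xy                 # 6.0155
s2 = sum_e2 / (n - 2)                        # 0.2148
s_a = np.sqrt(s2 * sum_X2 / (n * sum_x2))    # approx 0.1170
s_b = np.sqrt(s2 / sum_x2)                   # approx 0.01795
t025 = stats.t.ppf(0.975, df=n - 2)          # approx 2.048

ci_alpha = (a - t025 * s_a, a + t025 * s_a)  # approx [-0.218, 0.261]
ci_beta = (b - t025 * s_b, b + t025 * s_b)   # approx [0.138, 0.212]
```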


  • Inference in the CLRM 13/33

    Turning to hypothesis testing, a common test is that of significance, i.e. a test of whether a population regression parameter is zero or not.

    For example, in the model

    $$Y_i = \alpha + \beta X_i + \epsilon_i, \quad i = 1, \ldots, n,$$

    it is common to test

    $$H_0: \beta = 0 \quad \text{against} \quad H_A: \beta \neq 0.$$

    Why is this of interest? Note that:

    Under $H_0$: $Y_i = \alpha + \epsilon_i$;

    Under $H_A$: $Y_i = \alpha + \beta X_i + \epsilon_i$.


  • Inference in the CLRM 14/33

    Hence, under $H_0$, $X$ does not determine $Y$, so this is known as a test of significance. Put another way, it is a test of whether $X$ is a significant determinant of $Y$.

    We can also test $H_0: \alpha = 0$ against $H_A: \alpha \neq 0$, in which case:

    Under $H_0$: $Y_i = \beta X_i + \epsilon_i$;

    Under $H_A$: $Y_i = \alpha + \beta X_i + \epsilon_i$.

    These null hypotheses concerning $\alpha$ and $\beta$ can be tested separately using t-tests.


  • Inference in the CLRM 15/33

    When the null hypothesis is $H_0: \beta = 0$, we have

    $$TS = \frac{b}{s_b} \sim t_{n-2} \quad \text{under } H_0.$$

    When the null hypothesis is $H_0: \alpha = 0$, we have

    $$TS = \frac{a}{s_a} \sim t_{n-2} \quad \text{under } H_0.$$

    Let $t_{0.025}$ denote the critical value from the $t_{n-2}$ distribution for a 5% level two-tail test. In either case the decision rule is:

    If $|TS| > t_{0.025}$, reject $H_0$ in favour of $H_A$;
    if $|TS| < t_{0.025}$, do not reject $H_0$ (reserve judgment).
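    A hedged sketch of this decision rule in code (illustrative names; SciPy is assumed for the critical value):

```python
from scipy import stats

def significance_test(estimate, std_error, n, alpha=0.05):
    """Two-tail t-test of H0: coefficient = 0 in the two-variable CLRM."""
    ts = estimate / std_error
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    return ts, t_crit, abs(ts) > t_crit  # last element: reject H0?
```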


  • Inference in the CLRM 16/33

    Example (continued). The demand-for-money example suggested a proportional relationship between $Y$ (money stock) and $X$ (GDP) of the form $Y = kX$.

    We shall first test $H_0: \alpha = 0$ against $H_A: \alpha \neq 0$ at the 5% level of significance; the 5% critical value from the $t_{28}$ distribution is 2.048. The test statistic is

    $$TS = \frac{a}{s_a} = \frac{0.0212}{0.1170} = 0.1812.$$

    As $|TS| = 0.1812 < 2.048$ we do not reject $H_0: \alpha = 0$, i.e. there is insufficient evidence to reject $H_0$ in favour of $H_A$. This supports the theory of a proportional relationship (we cannot reject the hypothesis that the intercept is zero).


  • Inference in the CLRM 17/33

    Turning to $H_0: \beta = 0$ against $H_A: \beta \neq 0$, the test statistic is

    $$TS = \frac{b}{s_b} = \frac{0.17485}{0.01795} = 9.7409.$$

    Clearly $|TS| = 9.7409 > 2.048$ and so we reject $H_0: \beta = 0$ in favour of $H_A: \beta \neq 0$, i.e. there is evidence to suggest that $X$ (GDP) is a significant determinant of $Y$ (money stock).

    This result supports our earlier analysis based on the sample correlation coefficient and the coefficient of determination, which suggested a positive correlation between these variables.


  • Inference in the CLRM 18/33

    From economic theory we would expect $\beta > 0$, and so we might want to test $H_0: \beta = 0$ against $H_A: \beta > 0$.

    We have the same test statistic under the null, but the critical value changes because this is an upper one-tail test. At the 5% level we want 5% of the $t_{n-2}$ distribution in the upper tail rather than the 2.5% that we have for a two-tail test.

    We find that now $t_{0.05} = 1.701$, and so the conclusion of our test is unchanged: we reject $H_0: \beta = 0$, but this time in favour of $H_A: \beta > 0$ rather than just $\beta \neq 0$.
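    To see where the two critical values come from, here is a small sketch (assuming SciPy; not part of the lecture):

```python
from scipy import stats

df = 28
two_tail_5pct = stats.t.ppf(0.975, df)  # approx 2.048: 2.5% in each tail
one_tail_5pct = stats.t.ppf(0.95, df)   # approx 1.701: 5% in the upper tail
```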


  • Inference in the CLRM 19/33

    When reporting regression results it is usual to include the standard errors of the estimates in parentheses below the estimates themselves. For the money-demand example we have

    $$\hat{Y} = \underset{(0.1170)}{0.0212} + \underset{(0.0180)}{0.1749}\, X,$$

    where figures in parentheses denote standard errors.

    It is important to state what the numbers in parentheses actually are, because sometimes the t-ratios (for testing whether the coefficient is zero) are reported instead. For example:

    $$\hat{Y} = \underset{(0.1812)}{0.0212} + \underset{(9.7409)}{0.1749}\, X,$$

    where figures in parentheses denote t-ratios.


  • Inference in the CLRM 20/33

    The advantage of reporting standard errors is that they can be used to test other hypotheses, e.g. $H_0: \beta = 1$, without having to derive the standard error from the t-ratio.

    The advantage of reporting t-ratios is that you can test immediately for significance by comparing the t-ratio with the appropriate critical value. The disadvantage of reporting t-ratios is that to test other hypotheses you need to derive the standard error.

    For example, if we want the standard error from the t-ratio, we obtain it as follows:

    $$TS = \frac{b}{s_b} \;\Longrightarrow\; s_b = \frac{b}{TS}.$$

    (In the money-demand example, $s_b = 0.17485 / 9.7409 = 0.01795$.)


  • Summary of Computational Procedure 21/33

    We can summarise the computations required for OLS estimation of the two-variable regression model as follows:

    Step 1. Compute the quantities

    $$\sum X_i, \quad \sum Y_i, \quad \sum X_i^2, \quad \sum Y_i^2, \quad \sum X_i Y_i.$$

    Step 2. Compute similar quantities for the deviations from sample means:

    $$\sum x_i^2 = \sum (X_i - \bar{X})^2 = \sum X_i^2 - \frac{(\sum X_i)^2}{n},$$

    $$\sum y_i^2 = \sum (Y_i - \bar{Y})^2 = \sum Y_i^2 - \frac{(\sum Y_i)^2}{n},$$

    $$\sum x_i y_i = \sum (X_i - \bar{X})(Y_i - \bar{Y}) = \sum X_i Y_i - \frac{\sum X_i \sum Y_i}{n}.$$


  • Summary of Computational Procedure 22/33

    Step 3. Compute the OLS estimates:

    $$b = \frac{\sum x_i y_i}{\sum x_i^2}, \qquad a = \frac{\sum Y_i}{n} - b\, \frac{\sum X_i}{n}.$$

    Step 4. Compute the sum of squared residuals and $s^2$:

    $$\sum e_i^2 = \sum y_i^2 - b \sum x_i y_i, \qquad s^2 = \frac{\sum e_i^2}{n - 2}.$$

    Step 5. Compute the estimated variances of $a$ and $b$:

    $$s_a^2 = \frac{s^2 \sum X_i^2}{n \sum x_i^2}, \qquad s_b^2 = \frac{s^2}{\sum x_i^2}.$$

    Step 6. Compute $R^2$:

    $$R^2 = \frac{b^2 \sum x_i^2}{\sum y_i^2} \qquad \text{or} \qquad R^2 = 1 - \frac{\sum e_i^2}{\sum y_i^2}.$$
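    As a minimal end-to-end sketch of Steps 1–6 (assuming NumPy arrays X and Y; the function name and return structure are illustrative, not part of the lecture):

```python
import numpy as np

def ols_two_variable(X, Y):
    """Steps 1-6 of the computational procedure for the two-variable CLRM."""
    X, Y = np.asarray(X, dtype=float), np.asarray(Y, dtype=float)
    n = X.size
    # Steps 1-2: raw sums and deviations from sample means
    sum_x2 = np.sum(X**2) - np.sum(X)**2 / n
    sum_y2 = np.sum(Y**2) - np.sum(Y)**2 / n
    sum_xy = np.sum(X * Y) - np.sum(X) * np.sum(Y) / n
    # Step 3: OLS estimates
    b = sum_xy / sum_x2
    a = np.sum(Y) / n - b * np.sum(X) / n
    # Step 4: residual sum of squares and s^2
    sum_e2 = sum_y2 - b * sum_xy
    s2 = sum_e2 / (n - 2)
    # Step 5: estimated variances of a and b
    s2_a = s2 * np.sum(X**2) / (n * sum_x2)
    s2_b = s2 / sum_x2
    # Step 6: coefficient of determination
    r2 = 1 - sum_e2 / sum_y2
    return {"a": a, "b": b, "s2": s2,
            "se_a": np.sqrt(s2_a), "se_b": np.sqrt(s2_b), "R2": r2}
```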


  • Non-Linear Models 23/33

    The above inferential procedures apply equally well to non-linear models that have been linearised in an appropriate way.

    For example, in Lecture 13 we used data on the price ($X$) of, and demand ($Y$) for, carrots, obtaining

    $$\widehat{\ln(Y)} = 6.73 - 0.744 \ln(X).$$

    The estimated price elasticity of demand is $-0.744$, but suppose we want to test the hypothesis that it is equal to $-1$ against the alternative that it is greater than $-1$. How do we go about doing this?


  • Non-Linear Models 24/33

    Writing the population regression as

    $$\ln(Y) = \alpha + \beta \ln(X) + \epsilon,$$

    we therefore wish to test

    $$H_0: \beta = -1 \quad \text{against} \quad H_A: \beta > -1.$$

    Denoting the OLS estimate of $\beta$ by $b$, we use the statistic

    $$TS = \frac{b - (-1)}{s_b} = \frac{b + 1}{s_b} \sim t_{n-2} \quad \text{under } H_0.$$

    Here, $n = 30$ and the upper one-tail 5% critical value from the $t_{28}$ distribution is 1.701.


  • Non-Linear Models 25/33

    The decision rule is:

    If $TS > 1.701$, reject $H_0$ in favour of $H_A$;

    if $TS < 1.701$, do not reject $H_0$ (reserve judgment).

    We find that $s_b = 0.0896$ and so

    $$TS = \frac{-0.744 + 1}{0.0896} = 2.857.$$

    Clearly $TS = 2.857 > 1.701$ and so we reject $H_0: \beta = -1$ in favour of $H_A: \beta > -1$, i.e. there is evidence to suggest that the price elasticity of demand for carrots is greater than $-1$ (so that the demand for carrots is price inelastic).
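    A short sketch of this one-tail test (the numbers are taken from the slides; SciPy is assumed for the critical value):

```python
from scipy import stats

b, s_b, n = -0.744, 0.0896, 30
beta_0 = -1.0                          # value of beta under H0
ts = (b - beta_0) / s_b                # (b + 1)/s_b, approx 2.857
t_crit = stats.t.ppf(0.95, df=n - 2)   # upper one-tail 5% value, approx 1.701
reject_h0 = ts > t_crit                # True: elasticity greater than -1
```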


  • Non-Linear Models 26/33

    Inference for non-linear models therefore proceeds in the same way as for linear models, the only difference being that the linearised model is written in terms of transformed variables.

    For example, if $Y = AX^{\beta}$ then $\ln(Y) = \ln(A) + \beta \ln(X)$, and so the regression equation is

    $$\ln(Y) = \alpha + \beta \ln(X) + \epsilon \qquad \text{or} \qquad Y^* = \alpha + \beta X^* + \epsilon,$$

    where $Y^* = \ln(Y)$ and $X^* = \ln(X)$.

    It is important that all computations are carried out using the transformed variables, $Y^*$ and $X^*$, that appear in the linearised model.


  • Prediction 27/33

    Suppose we estimate a linear regression between two variables, $Y$ and $X$, using a sample of $n$ observations. The population regression is

    $$Y_i = \alpha + \beta X_i + \epsilon_i, \quad i = 1, \ldots, n,$$

    while our sample regression is

    $$\hat{Y}_i = a + b X_i, \quad i = 1, \ldots, n.$$

    Now suppose a new observation on $X$ becomes available, but not the corresponding value of $Y$. Can we use our sample regression to predict the corresponding value of $Y$? If so, what are the properties of the prediction?


  • Prediction 28/33

    For simplicity, denote the new value of $X$ by $X_0$. Our predicted value of $Y_0$ is obtained by plugging $X_0$ into the sample regression:

    $$\hat{Y}_0 = a + b X_0.$$

    However, the true but unknown value of $Y_0$ will be determined (by assumption) by the population regression:

    $$Y_0 = \alpha + \beta X_0 + \epsilon_0.$$

    The difference between $\hat{Y}_0$ and $Y_0$ is called the prediction error or forecast error, and is denoted $f$.


  • Prediction 29/33

    The value of $f$ is given by

    $$f = \hat{Y}_0 - Y_0 = a + b X_0 - (\alpha + \beta X_0 + \epsilon_0) = (a - \alpha) + (b - \beta) X_0 - \epsilon_0.$$

    It therefore depends on three things:
    (i) the estimation errors, $a - \alpha$ and $b - \beta$;
    (ii) the regressor, $X_0$; and
    (iii) the disturbance, $\epsilon_0$.

    Only the second of these (the value of $X_0$) is known, although the distributional properties of the remainder are known (under Classical assumptions).


  • Prediction 30/33

    Under the full set of Classical assumptions it can be shown that $f \sim N(0, \sigma_f^2)$, where

    $$\sigma_f^2 = \sigma^2 \left[ 1 + \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum x_i^2} \right]$$

    and $\bar{X}$ and $\sum x_i^2$ are obtained from the original $n$ observations.

    The variance can be estimated using

    $$s_f^2 = s^2 \left[ 1 + \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum x_i^2} \right],$$

    where $s^2$ is the usual estimator of $\sigma^2$.


  • Prediction 31/33

    Note that

    $$\frac{f}{\sigma_f} = \frac{\hat{Y}_0 - Y_0}{\sigma_f} \sim N(0, 1) \qquad \text{and} \qquad \frac{f}{s_f} = \frac{\hat{Y}_0 - Y_0}{s_f} \sim t_{n-2},$$

    the latter forming a basis for inference.

    For example, a 95% confidence interval for $f$ is of the form

    $$0 \pm t_{0.025}\, s_f \qquad \text{or} \qquad [-t_{0.025}\, s_f, \; t_{0.025}\, s_f],$$

    while for $Y_0$ it is of the form

    $$\hat{Y}_0 \pm t_{0.025}\, s_f \qquad \text{or} \qquad [\hat{Y}_0 - t_{0.025}\, s_f, \; \hat{Y}_0 + t_{0.025}\, s_f].$$

    The confidence intervals get wider as $X_0$ gets further from $\bar{X}$ (the term $(X_0 - \bar{X})^2$ enters $s_f^2$), as depicted in the next diagram:


  • Prediction 32/33

    It is apparent from the diagram that the accuracy of the prediction is greatest for values of $X_0$ closest to $\bar{X}$, because here the width of the CI is smallest.
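    A minimal sketch of the prediction interval for $Y_0$ (assuming the quantities $a$, $b$, $s^2$, $\bar{X}$ and $\sum x_i^2$ have already been computed from the original sample; the function name and signature are illustrative):

```python
import numpy as np
from scipy import stats

def prediction_interval(x0, a, b, s2, n, x_bar, sum_x2, level=0.95):
    """Point prediction and CI for Y0 at a new regressor value x0."""
    y0_hat = a + b * x0
    s_f = np.sqrt(s2 * (1 + 1 / n + (x0 - x_bar) ** 2 / sum_x2))
    t_crit = stats.t.ppf(1 - (1 - level) / 2, df=n - 2)
    return y0_hat, (y0_hat - t_crit * s_f, y0_hat + t_crit * s_f)
```

    The interval returned here widens as x0 moves away from x_bar, matching the fan shape described above.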


  • Summary 33/33

    Summary

    Inference in the CLRM
    Non-linear models and prediction

    Next week: Multiple linear regression

