Aspects of Bayesian Inference

  • 8/19/2019 Aspects of Bayesian Inference

    M320

    Cross Section and Panel Data Econometrics

    Topic 2: Generalized Method of Moments

    Part III: Testing

    Dr. Melvyn Weeks

    Faculty of Economics and Clare College, University of Cambridge

    Outline

    1   Motivation

    2   The General Validity of Moment Restrictions

    3   The Anatomy of 2SLS
          The Bias of 2SLS
          2SLS Bias and Weak Instruments
          The Wald Estimator
          The BJB Critique

    4   Tests of Exogeneity and Weak Instruments
          Overidentification Tests
          A Test of a Subset of Orthogonality Conditions
          A Test of Weak Identification

    5   GMM Distance Tests of Endogeneity and Exogeneity

    6   Endogeneity and the Linear Model
          Isolating the effect of Education on wages from Ability

    7   The Costs of GMM


    Readings:

    Angrist, J. and A. Krueger (1991). “Does Compulsory School Attendance Affect Schooling and Earnings?”, Quarterly Journal of Economics 106, 979-1014.

    Bekker, P. (1994). “Alternative Approximations to the Distribution of Instrumental Variables Estimators”, Econometrica 62, 657-681.

    Bound, J., D. Jaeger and R. Baker (1995). “Problems with Instrumental Variables Estimation When the Correlation Between the Instruments and the Endogenous Explanatory Variable is Weak”, Journal of the American Statistical Association 90, 443-450.

    Cameron, A.C. and P.K. Trivedi (2005). Microeconometrics: Methods and Applications, Cambridge University Press [Chapter 21].

    Baum, C., M. Schaffer and S. Stillman (2003). Instrumental Variables and GMM: Estimation and Testing. Working Paper No. 545, Boston College.

    Angrist, J. and J. Pischke (2009). Mostly Harmless Econometrics. Princeton University Press.


    Motivation

    Remark

    “... An important cost of performing IV estimation when x and ε are uncorrelated: the asymptotic variance of the IV estimator is always larger, and sometimes much larger, than the asymptotic variance of the OLS estimator.”

    Wooldridge (2006), Introductory Econometrics: A Modern Approach, 3rd ed. New York: Thomson Learning, p. 490.

    The General Validity of Moment Restrictions

    IV, or more general GMM, estimators trade off consistency against an inevitable loss of efficiency.

    The loss of efficiency may be a price worth paying if the OLS estimator is biased and inconsistent.

    This motivates a testing strategy:

    - may all or some of the included endogenous regressors be treated as exogenous?

    - may all or some of the included instruments be treated as exogenous?

    These are all tests of orthogonality. Non-orthogonality may arise between:

    - regressors and errors

    - instruments and errors

    And then some of the instruments may be weak!


    Road Map

    Remark

    We consider tests for instrument relevance, instrument exogeneity and instrument weak exogeneity.

    For instrument exogeneity we consider a class of overidentification tests that may be used with 2SLS and GMM estimators.

    In considering tests of weak instruments our point of departure is the bias of 2SLS. Although this bias is well known, it is only since the early 1990s that empirical researchers have realised that it can be significant when instruments are weak.

    We will see that in finite samples IV estimates are biased in the same direction as OLS, with the bias dependent on the fit of the first-stage regression.

    We examine the infamous Angrist and Krueger (1991) study on returns to education and operationalise all these tests with some real data.

    The Anatomy of 2SLS

    IV is consistent but biased in small samples.

    Locating good exogenous instruments is not easy.

    Basing a good instrument on features such as weak exogeneity may make an instrument weak.

    In the presence of weak instruments the bias can be substantial.

    If the fit of the first stage regression is low, then the sampling distribution of GMM/IV statistics is non-normal.

    In such cases asymptotic theory will provide a poor guide to the finite sample distribution of the IV estimator, even if the sample is very large.


    Let $M$ denote the number of instruments and $K$ the number of parameters.

    $M = K$, or low over-identification: conventional inference is not so misleading, unless the degree of endogeneity is very high.

    $M > K$, high over-identification:

    - may be the case in panel settings with internal instruments.

    - arises when instruments are interacted with other exogenous variables to improve precision (see Angrist and Krueger (1991)).

    In the presence of many weak instruments, standard estimation can be biased and conventional methods for inference can be misleading.

    TSLS can exhibit very poor properties.

    The Angrist and Krueger Study.

    $y_i = \alpha + \beta e_i + \varepsilon_i \qquad (1)$

    $y_i$ is log (yearly) earnings, $e_i$ is years of education.

    Endogeneity: pre-schooling ability affects both $e_i$ and earnings.

    AK exploit variation in schooling levels that arises from the differential impact of compulsory schooling laws.

    School districts typically require a student to have turned six by January 1st of the year the student enters school.

    Individuals are required to stay in school until they turn sixteen.

    ⇒ an individual born in the 1st quarter will have lower required minimum schooling levels than individuals born in the last quarter.


    [Figure: Schooling Laws]

    This variation may also differ across states.

    The Anatomy of 2SLS: The Bias of 2SLS

    Remark

    Below we examine the nomenclature of the 2SLS estimator.

    We show that 2SLS is biased in finite samples.

    We first do this for a single instrument. The bias is magnified by the problem of weak instruments.


    Revision

    1 Causal relationship of interest:

    $y_i = \alpha + \beta e_i + \varepsilon_i \qquad (2)$

    2 First stage regression:

    $e_i = \rho z_i + \zeta_{1i} \qquad (3)$

    3 Second stage regression:

    $y_i = \alpha + \beta \hat{e}_i + \zeta_{2i}$

    4 Reduced form:

    $y_i = \alpha + \tau z_i + \varepsilon_{ri} \qquad (4)$

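    The reduced form (4) is obtained by substituting (3) into (2):

    $$\begin{aligned}
    y_i &= \alpha + \beta e_i + \varepsilon_i \\
        &= \alpha + \beta (z_i \rho + \zeta_{1i}) + \varepsilon_i \\
        &= \alpha + \beta z_i \rho + (\beta \zeta_{1i} + \varepsilon_i) \\
        &= \alpha + \tau z_i + \varepsilon_{ri}
    \end{aligned}$$

    so that $\tau = \beta\rho$ and $\varepsilon_{ri} = \beta\zeta_{1i} + \varepsilon_i$. $z_i \rho$ denotes the population fitted values from (3).

    The second stage regression is

    $$\begin{aligned}
    y_i &= \alpha + \beta \hat{e}_i + \zeta_{2i} \qquad (5) \\
        &= \alpha + \beta \hat{e}_i + (\varepsilon_i + \beta (e_i - \hat{e}_i))
    \end{aligned}$$

    where $\hat{e}_i = \hat{\rho} z_i$ denotes the first stage fitted values. The estimator of $\beta$ in (5) is the IV estimator.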
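    The algebra above implies that, with a single instrument, the second-stage estimate of β equals the ratio of the reduced-form slope to the first-stage slope. The following is a minimal numerical sketch of that identity, not part of the original slides: the data are simulated, the parameter values (β = 0.1, ρ = 0.5) are made up, and the helper names are hypothetical.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance (divisor n; the divisor cancels in every ratio below)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def slope(y, x):
    """OLS slope from a regression of y on x with an intercept."""
    return cov(y, x) / cov(x, x)

random.seed(0)
n, alpha, beta, rho = 5000, 1.0, 0.1, 0.5        # illustrative values only
z = [random.gauss(0, 1) for _ in range(n)]
zeta1 = [random.gauss(0, 1) for _ in range(n)]
# eps is correlated with zeta1, so e is endogenous in equation (2)
eps = [0.8 * v + 0.6 * random.gauss(0, 1) for v in zeta1]
e = [rho * zi + vi for zi, vi in zip(z, zeta1)]
y = [alpha + beta * ei + ui for ei, ui in zip(e, eps)]

rho_hat = slope(e, z)                  # first stage (3)
tau_hat = slope(y, z)                  # reduced form (4)
e_hat = [rho_hat * zi for zi in z]     # first-stage fitted values
beta_second_stage = slope(y, e_hat)    # second stage (5)
beta_ratio = tau_hat / rho_hat         # reduced form divided by first stage
beta_ols = slope(y, e)                 # biased, since Cov(e, eps) > 0

print(beta_ols, beta_second_stage, beta_ratio)
```

    In this design OLS converges to β + 0.8/1.25 = 0.74, whereas the two IV calculations coincide and are centred near the true β.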


    And Why is 2SLS Biased?

    Remark

    Bias in the 2SLS estimator follows because the first stage is estimated. Consider

    $e = \rho z + \zeta_1$

    If $\rho$ were known then the fitted values would be

    $\hat{e} = \rho z.$

    These fitted values are uncorrelated with the second stage errors.

    For unknown $\rho$, $\hat{e} = \hat{\rho} z$:

    $$\begin{aligned}
    \hat{e} &= z \hat{\rho} \\
            &= P_z e \\
            &= z\rho + P_z \zeta_1
    \end{aligned}$$

    where $P_z = z(z'z)^{-1}z'$. For endogenous $e$, $P_z \zeta_1$ is correlated with $\varepsilon$, resulting in a biased 2SLS estimator.

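    The 2SLS bias with many instruments.

    With multiple instruments the first stage is:

    $e = Z\rho + \xi_1$

    where $\xi_1$ is the first-stage error (written $\zeta_1$ in the single-instrument case).

    OLS estimates are biased because $\varepsilon_i$ is correlated with $\xi_{1i}$.

    Instruments $Z_i$ are uncorrelated with $\xi_{1i}$ by construction and uncorrelated with $\varepsilon_i$ by assumption.

    The 2SLS estimator is:

    $\hat{\beta}_{2SLS} = (e'P_Z e)^{-1} e'P_Z y = \beta + (e'P_Z e)^{-1} e'P_Z \varepsilon$

    $P_Z = Z(Z'Z)^{-1}Z'$ is the projection matrix that produces fitted values from a regression of $e$ on $Z$.

    Substitute the first stage for $e$ in $e'P_Z\varepsilon$, and use $Z'P_Z = Z'$, to get

    $$\begin{aligned}
    \hat{\beta}_{2SLS} - \beta &= (e'P_Z e)^{-1}(\rho'Z' + \xi_1')P_Z \varepsilon \\
                               &= (e'P_Z e)^{-1}\rho'Z'\varepsilon + (e'P_Z e)^{-1}\xi_1'P_Z \varepsilon
    \end{aligned}$$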


    The Anatomy of 2SLS: 2SLS Bias and Weak Instruments

    $Z_i$ are uncorrelated with $\varepsilon_i$ and $\xi_{1i}$.

    Therefore $E[\rho'Z'\varepsilon] = 0$, and we have

    $$\begin{aligned}
    E[\hat{\beta}_{2SLS} - \beta] &\approx (E[e'P_Z e])^{-1}E[\rho'Z'\varepsilon] + (E[e'P_Z e])^{-1}E[\xi_1'P_Z \varepsilon] \\
                                  &= (E[e'P_Z e])^{-1}E[\xi_1'P_Z \varepsilon].
    \end{aligned}$$

    Substitute in the first stage again.

    $E[\hat{\beta}_{2SLS} - \beta] \approx (E[(\rho'Z' + \xi_1')P_Z(Z\rho + \xi_1)])^{-1}E[\xi_1'P_Z \varepsilon].$

    Note that $E[\rho'Z'\xi_1] = 0$, so we get no cross-terms:

    $E[\hat{\beta}_{2SLS} - \beta] \approx [E(\rho'Z'Z\rho) + E(\xi_1'P_Z \xi_1)]^{-1}E(\xi_1'P_Z \varepsilon).$

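    If we continue this proof, it can be shown that the bias of 2SLS is approximately:

    $b_{2sls} = E[\hat{\beta}_{2sls} - \beta] \approx \dfrac{\sigma_{\varepsilon\xi_1}}{\sigma^2_{\xi_1}} \, \dfrac{1}{F + 1} \qquad (6)$

    where $\xi_1$ is the first-stage error and $F$ is the population analogue of the F-statistic for the joint significance of the instruments in the first-stage regression.

    As $\rho \to 0$, $\sigma^2_{\xi_1} \to \sigma^2_e$, and $b_{2sls} \to b_{ols} = \dfrac{\sigma_{\varepsilon e}}{\sigma^2_e}$.

    If $\rho = 0$, all variation in the first stage is through $\xi_1$, so the variation in $e$ is the same as the variation in $\xi_1$.

    If $\rho \approx 0$, $F$ is small; as $F \to 0$, $b_{2sls} \to \dfrac{\sigma_{\varepsilon\xi_1}}{\sigma^2_{\xi_1}}$.

    2SLS is biased towards OLS with weak instruments.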
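    To get a feel for the size of the factor 1/(F + 1) in equation (6), the approximation can be tabulated for a few values of F. This is only a sketch: the second moments (0.8 and 1.0) are made-up numbers, the function name is hypothetical, and the values 2.428, 4.747 and 13.486 are the sample first-stage F statistics reported later in Table 2, used here in place of the population F purely for illustration.

```python
def approx_2sls_bias(sigma_eps_xi, sigma2_xi, F):
    """Approximate 2SLS bias from equation (6): (sigma_eps_xi / sigma2_xi) / (F + 1)."""
    if F < 0:
        raise ValueError("F must be non-negative")
    return (sigma_eps_xi / sigma2_xi) / (F + 1.0)

# Illustrative (made-up) second moments: Cov(eps, xi_1) = 0.8, Var(xi_1) = 1.0
sigma_eps_xi, sigma2_xi = 0.8, 1.0
limit_bias = approx_2sls_bias(sigma_eps_xi, sigma2_xi, 0.0)   # the F -> 0 limit

for F in (0.0, 1.0, 2.428, 4.747, 13.486, 100.0):
    b = approx_2sls_bias(sigma_eps_xi, sigma2_xi, F)
    print(f"F = {F:7.3f}   bias = {b:.4f}   share of F->0 bias = {b / limit_bias:.3f}")
```

    With F = 9 the approximate bias is one tenth of its F → 0 limit, which is one way to read rule-of-thumb thresholds of about 10 for the first-stage F.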


    Remark

    When instruments are weak the finite (and large) sample bias of 2SLS is accentuated.

    When instruments are weak 2SLS standard errors are biased downwards.

    [Recall: when instruments are valid, adding instruments can reduce the variance of the two-stage least squares estimator.]

    Weak instruments: confidence intervals can be misleading, in that their mid-point is biased and their width is too narrow.

    The Anatomy of 2SLS: The Wald Estimator

    The Wald (IV) Estimator

    Back to the Angrist and Krueger study.

    Table 1, Panel A, shows average years of education and average log earnings for individuals born in the first and fourth quarter, using the 1990 census:

    $\hat{\beta}_{IV} = \dfrac{\bar{y}_4 - \bar{y}_1}{\bar{e}_4 - \bar{e}_1} = 0.0893 \quad (0.0105)$

    $\bar{y}_t$ and $\bar{e}_t$ are average log earnings and average years of education for individuals born in the $t$th quarter. $N = 162{,}487$.

    The Wald estimator is the ratio of the reduced-form coefficient to the first-stage coefficient.
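    That the Wald ratio of differences in means coincides with the simple IV estimator Cov(y, z)/Cov(e, z) when the instrument is binary can be checked numerically. The sketch below uses simulated data with made-up coefficients (it does not use the Angrist and Krueger data), and the function names are hypothetical.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def wald_estimate(y, e, z):
    """Difference in mean outcomes over difference in mean schooling, by z in {0, 1}."""
    y1 = mean([yi for yi, zi in zip(y, z) if zi == 1])
    y0 = mean([yi for yi, zi in zip(y, z) if zi == 0])
    e1 = mean([ei for ei, zi in zip(e, z) if zi == 1])
    e0 = mean([ei for ei, zi in zip(e, z) if zi == 0])
    return (y1 - y0) / (e1 - e0)

def iv_estimate(y, e, z):
    """Simple IV estimator with one instrument: Cov(y, z) / Cov(e, z)."""
    return cov(y, z) / cov(e, z)

random.seed(1)
n = 4000
z = [random.randint(0, 1) for _ in range(n)]        # e.g. 1 = born in the 4th quarter
ability = [random.gauss(0, 1) for _ in range(n)]    # unobserved confounder
e = [12.0 + 0.5 * zi + ai + random.gauss(0, 1) for zi, ai in zip(z, ability)]
y = [5.0 + 0.09 * ei + 0.3 * ai + random.gauss(0, 0.5) for ei, ai in zip(e, ability)]

wald = wald_estimate(y, e, z)
iv = iv_estimate(y, e, z)
print(wald, iv)
```

    The two numbers agree up to floating-point error because, for binary z, both Cov(y, z) and Cov(e, z) are the corresponding difference in group means scaled by the same factor.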


    A Generalisation: Multiple Instruments

    Consider the following model

    $$\begin{aligned}
    y_i &= x_i'\beta + \varepsilon_i \\
        &= w_i'\beta_1 + e_i \beta_2 + \varepsilon_i
    \end{aligned}$$

    $x_i = (w_i', e_i)'$ is a 501-dimensional vector of included covariates.

    $w_i$ is a $500 \times 1$ vector of state (50) times year (10) of birth dummies.

    AK then interact the binary instrument $Q_i$ with the state and year of birth dummies.

    $Q_i = 1$ if born in the 4th quarter, $Q_i = 0$ if born in the 1st quarter.

    Sections A and B of Table 1 provide inference on education effects with single and multiple instruments.

    The 2SLS estimator is given by

    $\hat{\beta}_{2SLS} = (X'Z(Z'Z)^{-1}Z'X)^{-1}(X'Z(Z'Z)^{-1}Z'Y), \qquad \hat{\beta}_{2,2SLS} = 0.073 \quad (0.008)$

    with dimensions for $Y$, $X$ and $Z$:

    $N \times 1$, $N \times (K + M)$ and $N \times (L + M)$ in general;

    $N \times 1$, $N \times 501$ and $N \times 1000$ here.

    On this slide $K$, $M$ and $L$ denote the numbers of endogenous variables, included instruments and excluded instruments.

    Table 1: Multiple Instruments

    A: Summary Statistics, Subset of AK Data

    Variable             1st Quarter    4th Quarter    Difference
    Years of Education   12.688         12.840         0.151
    Log Earnings         5.892          5.905          0.014
    Ratio                                              0.089

    B: Real and Random QOB Estimates

                   Single Instrument    500 Instruments
                                        TSLS              LIML
    Real QOB       0.089 (0.011)        0.073 (0.008)     0.095 (0.017)
    Random QOB     −1.958 (18.116)      0.059 (0.085)     −0.330 (0.1001)

    The Anatomy of 2SLS: The BJB Critique

    Bound, Jaeger and Baker (1995) show that randomly generated instruments, designed to match the data of Angrist and Krueger (1991), yield results remarkably similar to those based on the actual instruments.

    Results are reported in section B of Table 1.

    In the case of a large number of instruments, the results appear to imply that a random Z can be used to infer the returns to education.
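    The mechanism behind this result can be reproduced in a small simulation. The sketch below is not the Bound, Jaeger and Baker exercise: it uses simulated data, made-up parameter values and hypothetical function names. The instruments are dummies for 200 randomly assigned groups, so they carry no information about schooling; projecting e on a full set of group dummies gives group means of e, which keeps the 2SLS computation short.

```python
import random

def mean(v):
    return sum(v) / len(v)

def slope(y, x):
    """OLS slope of y on x with an intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def tsls_group_dummies(y, e, group):
    """2SLS when the instruments are a full set of group dummies.

    The first-stage fitted values are the group means of e, so the 2SLS
    estimate is the OLS slope of y on those group means.
    """
    sums, counts = {}, {}
    for ei, g in zip(e, group):
        sums[g] = sums.get(g, 0.0) + ei
        counts[g] = counts.get(g, 0) + 1
    e_hat = [sums[g] / counts[g] for g in group]
    return slope(y, e_hat)

random.seed(2)
n, n_groups, beta = 4000, 200, 0.1                       # illustrative values only
xi = [random.gauss(0, 1) for _ in range(n)]
eps = [0.8 * v + 0.6 * random.gauss(0, 1) for v in xi]   # Cov(eps, xi) = 0.8
e = xi[:]                                                # the instruments play no role in e
y = [beta * ei + ui for ei, ui in zip(e, eps)]
random_group = [random.randrange(n_groups) for _ in range(n)]   # pure-noise instrument

beta_ols = slope(y, e)
beta_2sls_random = tsls_group_dummies(y, e, random_group)
print(beta_ols, beta_2sls_random)
```

    In this design the population F is zero, so by (6) the 2SLS estimate built from random instruments is centred on the OLS probability limit (0.9 here) rather than on the true β = 0.1, and it looks deceptively precise because there are many instruments.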


    Exam Question

    Consider the following linear regression

    $$\begin{aligned}
    y_i &= \rho S_i + \eta_i \\
    S_i &= \pi z_i + \varepsilon_i
    \end{aligned}$$

    $y_i$ and $S_i$ denote, respectively, wages and a measure of schooling for the $i$th individual. $\eta_i$ and $\varepsilon_i$ are stochastic errors, and $\mathrm{Cov}(S_i, \eta_i) \neq 0$. $z_i$ is a scalar instrument where $\mathrm{Cov}(z_i, \eta_i) = 0$. $\rho$ and $\pi$ are unknown parameters.

    a) In the case where $z_i$ is a binary instrument, show that the IV estimator for $\rho$ is

    $\hat{\rho} = \dfrac{E(y_i \mid z_i = 1) - E(y_i \mid z_i = 0)}{E(S_i \mid z_i = 1) - E(S_i \mid z_i = 0)}.$

    b) Table 2 presents estimates of the effects of education on the logarithm of men’s weekly earnings. Standard errors are given in parentheses.

    Column 1 presents results for the OLS estimator using different age controls. Given the presumed endogeneity of education, columns 2-4 present parameter estimates for a number of specifications based upon the use of instrumental variables (IV), with differing degrees of identification.

    The F statistic ($F_E$) for the test of the joint statistical significance of the excluded instruments is reported.

    Provide a succinct summary of the main results.

    Table 2: Estimated Effect of Completed Years of Education on Men’s Log Weekly Earnings

                                            (1)       (2)       (3)       (4)
                                            OLS       IV        IV        IV

    Coefficient                             .063      .142      .081      .083
                                            (.000)    (.033)    (.016)    (.009)

    F_E                                               13.486    4.747     2.428

    Age control variables (two columns marked in each row)
      Age, Age^2: x x
      9 year of birth dummies: x x

    Excluded instruments
    Quarter of birth                                  x         x         x
    Quarter of birth × year of birth                            x         x
    Quarter of birth × state of birth                                     x
    Number of excluded instruments                    3         30        180

    Calculated from the 5% Public-Use Sample of the 1980 US Census for men born 1930-1938. Sample size is 329,509. All specifications include Race (1 = black), SMSA (1 = central city), Married (1 = married, living with spouse), and 8 Regional dummies as control variables. Column 4 also includes 50 state dummies as controls.

    SMSA denotes Standard Metropolitan Statistical Area.

    Tests of Exogeneity and Weak Instruments

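    The optimal GMM estimator, $\hat{\theta}_{GMM}$, based upon moment conditions $\psi(w_i, \theta)$, minimises

    $Q_C(\theta) = \left[\frac{1}{N}\sum_{i=1}^{N} \psi(w_i, \theta)\right]' C_N \left[\frac{1}{N}\sum_{i=1}^{N} \psi(w_i, \theta)\right] \qquad (7)$

    $C_N = \hat{S}^{-1}$

    $S = \mathrm{Var}\left(N^{-1}\sum_{i} \psi(w_i, \theta)\right)$

    A general model specification test can be used where $M > K$.

    This is a test of the population moment conditions:

    $H_0 : E[\psi(w, \theta_0)] = 0 \qquad (8)$

    - test the closeness of $N^{-1}\sum_i \hat{\psi}_i$ to 0, where $\hat{\psi}_i = \psi(w_i, \hat{\theta})$.

    Just-identified models: $N^{-1}\sum_i \hat{\psi}_i = 0$ is imposed and the test is not possible.

    Over-identified models: cannot set all $N^{-1}\sum_i \hat{\psi}_{ij} = 0$, $j = 1, \ldots, M$.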


    Remark

    Under $H_0 : E[\psi(w, \theta_0)] = 0$, the value of $Q_C(\theta)$ evaluated at the efficient GMM estimator satisfies $Q_C(\hat{\theta}_{GMM}) \overset{a}{\sim} \chi^2(M - K)$.

    Using this statistic, and for $M > K$, we can evaluate whether all (or a subset) of the moment conditions (orthogonality conditions, OC) are valid.

    These are tests of overidentifying restrictions.

    There are $M - K$ of them.

    Tests of Exogeneity and Weak Instruments: Overidentification Tests

    Remark

    The overidentification test is a test of whether all the moment conditions are satisfied.

    If $Q_C(\hat{\theta})$ is large, either the moment conditions or the other model assumptions (or both) are likely to be false.

    Only if we are confident about the other assumptions can we interpret a large value of $Q_C(\cdot)$ as evidence of the endogeneity of some of the instruments in $z_i$.

    Remark

    Hansen’s J statistic is consistent in the presence of heteroskedasticity; Sargan’s statistic is not.


    Example

    Take the linear model as an example. Given the null hypothesis

    $H_0 : E[\psi(w, \theta_0)] = 0 \qquad (9)$

    then

    $Q_C(\hat{\theta}) = \left[\frac{1}{N} Z'(y - X\hat{\theta})\right]' C_N \left[\frac{1}{N} Z'(y - X\hat{\theta})\right] \overset{a}{\sim} \chi^2(M - K)$

    We will reject the overidentifying restrictions if the p-value

    $\Pr(\chi^2(M - K) \geq Q_C(\hat{\theta})) \qquad (10)$

    is below the chosen significance level for the test.
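    As a concrete instance of the p-value rule in (10), the sketch below computes Sargan's statistic for the simplest overidentified linear model: one endogenous regressor, two instruments and no intercept, so M − K = 1. It assumes homoskedastic errors and mean-zero simulated data; the parameter values and function names are made up for illustration. Setting gamma to a non-zero value places the second instrument directly in the outcome equation, which makes it invalid.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sargan_two_instruments(y, e, z1, z2):
    """2SLS with one endogenous regressor e and instruments (z1, z2), no intercept.

    Returns (beta_hat, J, p_value), where J = n * u'P_Z u / u'u is Sargan's
    statistic and the p-value uses the chi-squared(1) distribution (M - K = 1).
    """
    n = len(y)
    a11, a12, a22 = dot(z1, z1), dot(z1, z2), dot(z2, z2)
    det = a11 * a22 - a12 * a12

    def project(v):
        # Fitted values from regressing v on (z1, z2), i.e. P_Z v
        b1, b2 = dot(z1, v), dot(z2, v)
        c1 = (a22 * b1 - a12 * b2) / det
        c2 = (a11 * b2 - a12 * b1) / det
        return [c1 * p + c2 * q for p, q in zip(z1, z2)]

    e_hat = project(e)
    beta_hat = dot(e_hat, y) / dot(e_hat, e)
    resid = [yi - beta_hat * ei for yi, ei in zip(y, e)]
    fitted_resid = project(resid)
    J = n * dot(fitted_resid, fitted_resid) / dot(resid, resid)
    p_value = math.erfc(math.sqrt(J / 2.0))   # P(chi2(1) >= J)
    return beta_hat, J, p_value

def simulate(gamma, seed, n=2000, beta=0.1):
    """gamma != 0 puts z2 directly in the outcome equation, so z2 is invalid."""
    rng = random.Random(seed)
    z1 = [rng.gauss(0, 1) for _ in range(n)]
    z2 = [rng.gauss(0, 1) for _ in range(n)]
    xi = [rng.gauss(0, 1) for _ in range(n)]
    eps = [0.5 * v + rng.gauss(0, 1) for v in xi]
    e = [a + b + v for a, b, v in zip(z1, z2, xi)]
    y = [beta * ei + gamma * b + u for ei, b, u in zip(e, z2, eps)]
    return sargan_two_instruments(y, e, z1, z2)

_, J_valid, p_valid = simulate(gamma=0.0, seed=3)
_, J_invalid, p_invalid = simulate(gamma=0.5, seed=3)
print(J_valid, p_valid, J_invalid, p_invalid)
```

    With the invalid instrument the statistic is far in the tail of the chi-squared(1) distribution and the restrictions are rejected. Note that the test cannot say which of the two instruments is at fault.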

    Tests of Exogeneity and Weak Instruments: A Test of a Subset of Orthogonality Conditions

    Consider the case when the number of zero restrictions is large ($M - K$ is large).

    Hansen/Sargan tests are omnibus-type tests, and are therefore likely to have low power.

    We can divide the $M$ instruments into two groups, and test the validity of a subset of the instruments (say $M_B$).

    Vernacular: Difference-in-Sargan or C-tests.

    Let $J_A$ denote the efficient GMM test statistic using the $M_A = M - M_B$ orthogonality conditions that are not being tested, and $J$ the statistic using all $M$ conditions.

    Proposition

    If the weighting matrix $C_N$ is chosen optimally for both $J$ statistics, and the restricted and unrestricted equations are well-specified, then

    $C \equiv J - J_A \overset{a}{\sim} \chi^2(M - M_A)$


    Example: dividing the $M$ instruments into two groups:

    $z_{i1}$: $M_A$ variables that are known to satisfy the orthogonality conditions

    $z_{i2}$: $M - M_A = M_B$ variables that are suspected of being endogenous

    $z_i = (z_{i1}, z_{i2})$

    Wish to test: $E(z_{i2}\varepsilon_i) = 0$

    Remark

    The basic notion is to compare two J statistics from 2 separate GMM estimators of the same coefficient vector $\theta$:

    1 using instruments in $z_{i1}$

    2 using instruments in $z_{i2}$ in addition to $z_{i1}$

    If the inclusion of $z_{i2}$ substantially increases the J statistic, then we have reason to suspect the predeterminedness of $z_{i2}$.

    Testable restrictions: if $M_A > K$.
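    Once the two J statistics are in hand, the C statistic and its p-value are a one-line calculation. The sketch below is an illustration with hypothetical J values, not output from any data set. The helper chi2_sf is written out so that the block needs only the standard library; it relies on the recurrence for the regularized upper incomplete gamma function stated in its docstring. The function and argument names are assumptions made for this example.

```python
import math

def chi2_sf(x, df):
    """P(chi-squared(df) >= x) for integer df >= 1.

    Uses Q(a + 1, t) = Q(a, t) + t**a * exp(-t) / Gamma(a + 1) for the
    regularized upper incomplete gamma function, starting from
    Q(1/2, t) = erfc(sqrt(t)) (odd df) or Q(1, t) = exp(-t) (even df).
    """
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    t = x / 2.0
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(t))
    else:
        a, q = 1.0, math.exp(-t)
    while a < df / 2.0:
        q += t ** a * math.exp(-t) / math.gamma(a + 1.0)
        a += 1.0
    return q

def c_test(J, J_A, M, M_A):
    """C = J - J_A, compared with chi-squared(M - M_A)."""
    C = J - J_A
    # In finite samples C can be negative; it is truncated at zero for the p-value.
    return C, chi2_sf(max(C, 0.0), M - M_A)

# Hypothetical J statistics, for illustration only
C, p = c_test(J=9.2, J_A=2.1, M=6, M_A=4)
print(C, p)
```

    Here C is about 7.1 with 2 degrees of freedom, so the p-value is exp(−3.55), roughly 0.029, and the two suspect orthogonality conditions would be rejected at the 5% level.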

    Tests of Exogeneity and Weak Instruments: A Test of Weak Identification

    Remark

    One obvious informal test of weak instruments is to simply correlate the set of instruments with the endogenous variable(s).

    This yields pairwise correlation statistics.

    The joint correlation across a set of multiple instruments can be represented by the $R^2$ statistic from the first stage regression.

    A test of identification: a measure of the joint significance of a set of instruments $z$ is the first-stage F statistic.

    A test of weak identification: a rule of thumb suggested by Staiger and Stock (1997) is that an F statistic of less than 10 may indicate weak instruments.


    Stock and Yogo (2005) introduce a formal test of weak identification.

    The null hypothesis: instruments are weak

    The alternative hypothesis: instruments are not weak

    One variant of the test, which presumes overidentification, is derived from the notion that in the presence of weak instruments the estimation bias of the IV estimator can be large, and can potentially exceed that of the OLS estimator.

    The test: choose $b$, the largest bias of the 2SLS estimator relative to OLS that is acceptable. Critical values vary with $b$, the number of endogenous variables ($m$) and the number of exclusion restrictions ($K$).

    Example

    For $b = 0.05$, $m = 1$ and $K = 3$, the critical value of the test is 13.91. Reject the null of weak instruments if the calculated F statistic exceeds 13.91.
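    The first-stage F statistic that these rules refer to is easy to compute by hand when there is a single instrument. The sketch below uses simulated data with made-up first-stage coefficients and hypothetical function names. The default threshold of 10 is the Staiger and Stock rule of thumb; a Stock and Yogo critical value such as 13.91 can be passed instead, bearing in mind that 13.91 applies to three excluded instruments rather than to the one-instrument design simulated here.

```python
import random

def first_stage_f(e, z):
    """F statistic for H0: rho = 0 in e = c + rho * z + error (one instrument)."""
    n = len(e)
    mz, me = sum(z) / n, sum(e) / n
    szz = sum((a - mz) ** 2 for a in z)
    sze = sum((a - mz) * (b - me) for a, b in zip(z, e))
    rho_hat = sze / szz
    rss = sum((b - me - rho_hat * (a - mz)) ** 2 for a, b in zip(z, e))
    s2 = rss / (n - 2)                       # residual variance
    return rho_hat ** 2 * szz / s2           # equals the squared t statistic

def looks_weak(f_stat, threshold=10.0):
    """Rule-of-thumb check; pass a Stock-Yogo critical value to override the 10."""
    return f_stat < threshold

random.seed(4)
n = 1000
z = [random.gauss(0, 1) for _ in range(n)]
e_strong = [0.50 * a + random.gauss(0, 1) for a in z]    # made-up rho = 0.50
e_weak = [0.02 * a + random.gauss(0, 1) for a in z]      # made-up rho = 0.02

f_strong, f_weak = first_stage_f(e_strong, z), first_stage_f(e_weak, z)
print(f_strong, looks_weak(f_strong), f_weak, looks_weak(f_weak))
```

    An F statistic of 13 would pass the rule of thumb but not the 13.91 threshold: looks_weak(13.0) is False while looks_weak(13.0, threshold=13.91) is True.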


    GMM Distance Tests of Endogeneity and Exogeneity

    A C-test of the exogeneity of one or more regressors or instruments.

    Stata: the orthog option takes as its argument the list of exogenous variables ($z_{i2}$ above) whose exogeneity is called into question.

    - if the exogenous variables being tested are instruments, the efficient GMM estimator that does not use the corresponding orthogonality conditions simply drops the instruments.

    This is illustrated below: the second regression is the estimation implied by the orthog option in the first.

    ivreg2 y x1 x2 (x3 = z1 z2 z3 z4), orthog(z2)

    ivreg2 y x1 x2 (x3 = z1 z3 z4)


    Durbin-Wu-Hausman tests: estimate a model using OLS and IV and compare the estimated coefficient vectors.

    $H_0$: the OLS estimator is consistent and efficient

    Test statistic:

    $H = n(\hat{\beta}^c - \hat{\beta}^e)' G^{-} (\hat{\beta}^c - \hat{\beta}^e) \qquad (11)$

    $\hat{\beta}^c$: estimator that is consistent (c) under $H_0$ and $H_a$ (IV).

    $\hat{\beta}^e$: estimator that is efficient (e) under $H_0$ and inconsistent if the null is not true (OLS).

    $G = \mathrm{Var}(\hat{\beta}^c) - \mathrm{Var}(\hat{\beta}^e)$, where $\mathrm{Var}(\hat{\beta})$ is a consistent estimate of the asymptotic variance of $\hat{\beta}$; $G^{-}$ denotes a generalized inverse.

    $H \sim \chi^2_{k_e}$; $k_e$ denotes the number of regressors tested for endogeneity.
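    For a single coefficient the statistic in (11) reduces to a squared difference divided by a difference of variances. The sketch below applies it to the rounded estimates in columns (1) and (2) of Table 2, purely as an arithmetic illustration: the OLS standard error is reported as .000 after rounding, the other regressors are ignored, and squared standard errors are used in place of n times the asymptotic variances, so the numbers should not be read as a replication of any published test. The function name is hypothetical.

```python
import math

def hausman_scalar(b_c, se_c, b_e, se_e):
    """Hausman statistic for one coefficient.

    b_c, se_c: estimate and standard error, consistent under H0 and Ha (IV).
    b_e, se_e: estimate and standard error, efficient under H0 (OLS).
    Returns (H, p_value), with H compared to chi-squared(1).
    """
    var_diff = se_c ** 2 - se_e ** 2
    if var_diff <= 0:
        raise ValueError("variance difference is not positive; the statistic is undefined")
    H = (b_c - b_e) ** 2 / var_diff
    return H, math.erfc(math.sqrt(H / 2.0))   # P(chi2(1) >= H)

# Rounded values from Table 2: OLS .063 (.000), IV .142 (.033)
H, p = hausman_scalar(b_c=0.142, se_c=0.033, b_e=0.063, se_e=0.000)
print(round(H, 2), round(p, 3))
```

    The guard against a non-positive variance difference matters in practice: in finite samples the estimated difference need not be positive, which is one reason (11) is written with a generalized inverse.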


    Remark

    For a discussion of 

    - the Durbin-Wu-Hausman test of the endogeneity of regressors 

    - the generalisation of this test to subsets of regressors, and 

    - how the Hausman form of this test may be interpreted as a GMM test 

    see Baum, C., M. Schaffer, and S. Stillman (2003)


    Endogeneity and the Linear Model

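    We begin with the linear regression model

    $y_i = x_i'\beta + \varepsilon_i \qquad (12)$

    Stacking the $N$ observations, $y$ is $N \times 1$ and $X$ is $N \times K$.

    $Z$ is an $N \times r$ matrix of instruments, with $i$th row $z_i'$.

    $z_i$ satisfies the moment conditions

    $E[z_i \varepsilon_i] = 0 \qquad (13)$

    The associated GMM estimator minimises

    $Q_N(\beta) = \left[\sum_{i=1}^{N} z_i \varepsilon_i\right]' C_N \left[\sum_{i=1}^{N} z_i \varepsilon_i\right] \qquad (14)$

    where $\varepsilon_i = y_i - x_i'\beta$ and $C_N$ is the $r \times r$ weighting matrix.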
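    Because (14) is quadratic in β, its minimiser has a closed form. A short derivation follows, writing y, X and Z for the stacked N × 1, N × K and N × r data matrices, and assuming that Z'X has full column rank and that C_N is symmetric and positive definite:

```latex
\begin{aligned}
Q_N(\beta) &= (y - X\beta)' Z \, C_N \, Z'(y - X\beta), \\
\frac{\partial Q_N}{\partial \beta} &= -2\, X'Z \, C_N \, Z'(y - X\beta) = 0, \\
\hat{\beta}_{GMM} &= (X'Z \, C_N \, Z'X)^{-1} X'Z \, C_N \, Z'y .
\end{aligned}
```

    Setting $C_N = (Z'Z)^{-1}$ gives $(X'Z(Z'Z)^{-1}Z'X)^{-1}X'Z(Z'Z)^{-1}Z'y$, which is the 2SLS estimator.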


    An extended IV estimation Stata package is now available. The components of the package:

    1 ivreg2 extends Stata’s ivreg, including facilities for GMM (generalized method of moments), tests of exogeneity conditions, and weak identification.

    The small option requests that small-sample statistics (F and t) be reported instead of large-sample statistics (chi-squared and z-statistics). Large-sample statistics are the default.

    Remark

    ivreg2 provides extensions to Stata’s official ivreg. ivreg2 shares the same command syntax as official ivreg and supports (almost) all of its options.

    Differences between ivreg2 and ivreg: optional two-step feasible GMM estimation (gmm option); automatic output of the Hansen-Sargan statistic for overidentifying restrictions.


    An instrumental variables (IV) estimation package (cont.)

    1 ivreg2 (cont.): orthog(varlist) requests that a C-statistic be calculated as a test of the exogeneity of a subset of instruments.

    With gmm, robust and/or cluster, Hansen’s J statistic is reported.

    With cluster, this statistic allows observations to be correlated within groups.

    2 ivhettest: heteroskedasticity tests for IV estimation.

    3 overid: overidentification tests for IV estimation.


    IVREG2

    The gmm option implements the two-step efficient generalized method of moments (GMM) estimator.

    Efficiency gains of this estimator relative to the IV/2SLS estimator derive from the use of the optimal weighting matrix.

    The efficient GMM estimator is robust to the presence of heteroskedasticity of unknown form.

    If the cluster option is chosen, the efficient GMM estimator uses a cluster-robust optimal weighting matrix and the estimator is also robust to arbitrary intra-cluster correlation.

    Under the assumption of conditional homoskedasticity, the efficient GMM estimator becomes the traditional 2SLS estimator.


    Endogeneity and the Linear Model: Isolating the effect of Education on wages from Ability

    Example: Isolating the effect of Education on wages from Ability

    National Longitudinal Survey of Young Men: cohorts 1969 and 1980 (see Griliches (1976)).

    - RNS = dummy for residency in southern states

    - MRT = dummy for marital status (1 if married)

    - SMSA = dummy for residency in metropolitan areas

    - MED = mother’s education (years)

    - KWW = score on Knowledge of Work test

    - IQ = IQ score

    - AGE = age of individual

    - S = completed years of schooling

    - EXPR = experience (years)

    - TENURE = tenure in years

    - LW = log wage

    IQ: error-ridden and endogenous; S may also be endogenous.


    Is Schooling Endogenous in the Wage Equation?

    $LW_i = \beta_1 + \beta_2 S_i + \beta_3 EXPR_i + \beta_4 IQ_i + \varepsilon_i \qquad (15)$

    1 Efficient two-step GMM estimation of $\beta$ with $z_i = (1, EXPR_i, AGE_i, MED_i, S_i)$ as instruments, where

    $z_{i1} = (1, EXPR_i, AGE_i, MED_i)$

    $z_{i2} = (S_i)$

    2 Compute $J$ and the $5 \times 5$ matrix $\hat{S}$.

    3 Extract the $4 \times 4$ submatrix $\hat{S}_1$ (from $\hat{S}$) corresponding to $z_{i1}$.

    4 Estimate the same coefficient $\beta$ by GMM with the reduced set of instruments.

    5 The difference in the $J$ statistics from the two GMM estimations is $\overset{a}{\sim} \chi^2(1)$.

    Why is age predetermined here?


    Below we introduce Stata syntax for estimation and testing in the context of 2SLS and GMM estimators.

    The files used during the Stata session reproduce Table 3.3 of Hayashi (2000), Econometrics.


    * Reproduce line 5 of table *, Hayashi (2000)

    ** standard IV/2SLS estimator with asymptotic standard errors

    ** - assumption - conditionally homoscedastic/independent errors

    ivreg2 lw expr tenure rns smsa _I* (iq s = med kww mrt age)

    ** IV/2SLS estimator: robust to heteroscedasticity

    ** - robust option: Sargan statistic changes to Hansen J

    ivreg2 lw expr tenure rns smsa _I* (iq s = med kww mrt age), robust

    ** IV/2SLS estimator: robust to heteroscedasticity

    ** - small option: report small-sample statistics (F and t)

    ivreg2 lw expr tenure rns smsa _I* (iq s = med kww mrt age) , robust small

    * Reproduce line 5 of table *, Hayashi (2000)

    ** 2 step GMM estimator: robust to heteroscedasticity

    ivreg2 lw expr tenure rns smsa _I* (iq s = med kww mrt age) , gmm2s robust


    ** Testing Exogeneity of Regressors or Instruments

    ** test H0: rns is a proper instrument

    ** i) using orthog option ("Diff-Sargan" C statistic)

    ivreg2 lw expr tenure rns smsa _I* (iq s = med kww mrt age), orthog(rns)

    ** ii) using endog option

    ivreg2 lw expr tenure rns smsa _I* (iq s = med kww mrt age), endog(rns)

    * cluster option: must reduce number of instruments below number of clusters

    * in order to calculate the variance-covariance matrix

    * (although standard ivreg does not impose that constraint)

    ivreg2 lw expr tenure rns smsa _I* (iq s = med kww mrt age), robust cluster(age)

    ivreg lw expr tenure rns smsa (iq s = med kww mrt age) , robust cluster(age)

    ivreg2 lw expr tenure rns smsa (iq s = med kww mrt age) , small robust cluster(age)

    * first and ffirst options

    ivreg2 lw expr tenure rns smsa _I* (iq s = med kww mrt age) , ffirst

    ivreg2 lw expr tenure rns smsa _I* (iq s = med kww mrt age) , first


    The Costs of GMM

    Remark

    IV estimates of the standard errors are inconsistent in the presence of heteroscedasticity, thereby compromising valid inference.

    Standard diagnostic tests for endogeneity and overidentification are also invalid.

    The usual approach when facing heteroscedasticity of unknown form is to use GMM.

    Efficient GMM is consistent in the presence of arbitrary heteroscedasticity, but at a cost of poor finite sample performance.

    If heteroscedasticity is not present, then standard IV may be preferable.

    The cost of GMM

    If heteroscedasticity is present it is possible to improve on 2SLS using GMM.

    As Wooldridge (2001) notes, there is still a dearth of applications, particularly cross-section, that utilise GMM.

    The reason for this is that one can always run 2SLS using standard errors that are robust to heteroscedasticity.

    Moreover, the optimal weight matrix which is required for efficient GMM is a function of fourth-order moments.

    Reasonable estimates of fourth-order moments require a large sample size.

    Consequence: poor small sample properties mean Wald tests tend to over-reject the null.