

Journal of Econometrics 148 (2009) 114–123


Simulation based selection of competing structural econometric models

Tong Li *

Department of Economics, Vanderbilt University, VU Station B #351819, Nashville, TN 37235-1819, United States

Article info

Article history: Available online 11 October 2008

Keywords: Model selection; Non-nested structural models; Simulated mean squared error of predictions

Abstract

This paper proposes a formal model selection test for choosing between two competing structural econometric models. The procedure is based on a novel lack-of-fit criterion, namely, the simulated mean squared error of predictions (SMSEP), taking into account the complexity of structural econometric models. It is asymptotically valid for any fixed number of simulations, and allows for any estimator which is √n-asymptotically normal or is n^α-consistent for α > 1/2. The test is bi-directional and applicable to non-nested models which are both possibly misspecified. The asymptotic distribution of the test statistic is derived. The proposed test is general, regardless of whether the optimization criteria for estimation of competing models are the same as the SMSEP criterion used for model selection. A Monte Carlo study demonstrates good power and size properties of the test. An empirical application using timber auction data from Oregon illustrates the usefulness and generality of the proposed testing procedure.

© 2008 Elsevier B.V. All rights reserved.

1. Introduction

Model selection is an important component of statistical inference. It involves comparing competing models based on some appropriately defined goodness-of-fit or selection criterion. For competing models that can be estimated by (conditional) maximum likelihood estimation (MLE), there is a vast literature on model selection procedures, such as the Akaike (1973, 1974) information criterion (AIC), the Cox (1961) test and the Vuong (1989) likelihood ratio test, to name only a few. Another important development is the use of the encompassing principle in testing non-nested models assuming that one of them is correctly specified. See, e.g., Mizon and Richard (1986) and Wooldridge (1990), among others. For a comprehensive review of the literature, see Gourieroux and Monfort (1994) and Pesaran and Weeks (2001). In light of the development of new estimation methods in econometrics such as the generalized method of moments (GMM) and empirical likelihood estimation methods, which offer robust alternatives to the conventional MLE, recent work in model selection has attempted to develop procedures that can be used for models estimated by methods other than MLE. For example, see Smith (1992) for extensions of the Cox test and the encompassing test to non-nested regression models that are both estimated by instrumental variables, Rivers and Vuong (2002) for the extension of Vuong’s (1989) test to dynamic models, Kitamura (2002) for using empirical likelihood ratio-type statistics for testing non-nested conditional models, and Chen et al. (2003) for likelihood ratio tests between parametric and (unconditional) moment condition models.

I am grateful to Co-Editor Takeshi Amemiya, an associate editor, and three referees for their constructive comments that greatly improved the paper. I also thank D. Andrews, F. Diebold, Y. Kitamura, H. Pesaran, and seminar participants at Indiana University, University of Cambridge, University of Pennsylvania, and Yale University for helpful comments and discussions, and Bingyu Zhang for excellent assistance in Monte Carlo experiments conducted in this paper.

* Tel.: +1 615 322 3582; fax: +1 615 343 8495. E-mail address: [email protected].

0304-4076/$ – see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.jeconom.2008.10.001

These model selection tests have been found useful in some structural microeconometric models, which have been developed in the last two decades and applied in such fields of modern economics as labor and industrial organization.¹ For example, Vuong’s (1989) likelihood ratio test has been used to select between structural models both of which are estimated by MLE. See, e.g., Gasmi et al. (1992) for testing collusive behavior, Wolak (1994) for testing asymmetric information, and Li (2005) for testing binding reservation prices in first-price auctions, to name only a few. Also, Chen et al. (2003) develop a test to distinguish between a parametric model which can be estimated by MLE and an unconditional moment model which can be estimated by the empirical likelihood method, and then apply their procedure to choose between a sequential search model and a non-sequential model. Despite these interesting applications of the aforementioned model selection tests, there are many other situations in which these model selection tests may not be applicable.² Such a gap can be mainly attributed to the complexity associated with the nature of structural econometric models. Model selection criteria are formulated in such ways that they are calculated using sample information and compared between competing models. Most structural econometric models, however, are constructed based on economic theory which defines maps between the latent variable of interest and/or its distribution and the observables. For instance, in structural auction models, it is assumed that the observed bids are Nash–Bayesian equilibrium strategies which are strictly increasing functions of bidders’ private valuations, whereas identifying and estimating the private values distribution is one of the main objectives of the structural approach. The presence of latent variables and the complex relationship between the latent and observed variables defined by structural models make the formulation of a well-defined model selection criterion more involved. Moreover, in many cases, structural econometric models are constructed through moment conditions, meaning that they are estimated not by MLE but by GMM or the method of simulated moments (MSM). Therefore, to accommodate these specific features arising from the nature of structural models and the estimation methods, new model selection tests need to be developed.

¹ Heckman (2001) gives an insightful discussion on the development of structural microeconometric models and the issues of their identification and inference.

² For instance, Laffont et al. (1995) develop a simulated nonlinear least squares estimator to estimate a structural model of first-price auctions. They encounter a problem of determining between 11 and 18 potential bidders. This problem, significant from an economic viewpoint as having 11 bidders could imply the existence of a large trader and hence asymmetric bidding, calls for a formal test of non-nested models, as the structural models with different numbers of potential bidders are non-nested. While this issue was not further pursued in Laffont et al. (1995) (see footnote 21 in Laffont et al. (1995)), and cannot be addressed using the existing model selection methods, it can be resolved using our proposed procedure, as illustrated in the empirical application.

Developing model selection procedures suitable for distinguishing between competing structural models is especially relevant in using the structural approach to analyze economic data and make policy evaluations. In the structural approach, policy analysis and the resulting recommendations are based on a structural model that is closely derived from economic theory, assuming that the involved economic agents are in the environment described by the theory and behave according to the theory. As a result, it is pivotal to validate the structural model under consideration. For example, when analyzing auction data using the structural approach, an econometrician faces choices among different paradigms such as a private value model or a common value model. Even within a chosen paradigm, the econometrician may also need to determine an appropriate parametric functional form for the latent distribution. Furthermore, the researcher sometimes needs to choose between different equilibria if multiple equilibria exist, as is the case for models of two-stage dynamic games which yield a large number of Bayesian perfect equilibria (Laffont and Maskin, 1990).

The goal of this paper is thus to propose a new model selection test for discriminating between competing structural econometric models. Our test is based on a comparison of the predictability of competing structural models. In the time series literature, there has been a rich set of papers since Diebold and Mariano (1995) and West (1996) using predictability for model evaluation. More recently, a general model selection framework based on predictability is developed in Rivers and Vuong (2002). Our test falls within this framework, and uses a similar MSEP criterion. On the other hand, given that structural econometric models usually contain some latent variables that are unobserved, we propose to simulate these latent variables in order to make the predictions on the equilibrium outcomes. Also, since simulation is used, when formulating the sample analog to those population quantities, we need to correct for the asymptotic bias term caused by the simulation, and hence propose a simulated MSEP (SMSEP) as a consistent sample analog to the population predictability criterion.³ As a result, while those using simulation based prediction for model evaluations in the time series framework usually require that the number of simulations tend to infinity, ours works for any fixed number of simulations. Moreover, our model selection test allows for any estimators that are √n asymptotically normally distributed, or are n-consistent, as can arise in some structural microeconometric models such as auction models and job search models (Donald and Paarsch, 1993, 1996, 2002; Hong, 1998; Chernozhukov and Hong, 2004; Hirano and Porter, 2003). Lastly, in a similar spirit to that of Vuong (1989) and Rivers and Vuong (2002), the test is bi-directional and applicable to non-nested structural models which are both possibly misspecified. This adds a considerable advantage to the proposed test because in real applications, structural econometric models are best considered an approximation to, but not an exact model of, the true data generating process. Nevertheless, with two possibly misspecified models, our model selection procedure enables one to tell which one is closer to the truth.

While some empirical work has used predictions from structural models to validate a particular choice of the model, because of the lack of a formal test, it has been based on an ad-hoc comparison of the closeness between the predictions and the observed outcomes. The statistical significance of such closeness is not assessed. In contrast, our testing procedure provides a formal framework in which the statistical significance of the difference in predictability of competing structural models can be assessed. The asymptotic distribution of the test statistic is derived. The proposed test is general regardless of whether the optimization criteria for estimation of competing models are the same as the SMSEP criterion used for model selection. We conduct Monte Carlo experiments to study size and power properties of the test. An empirical application using timber auction data from Oregon is used to illustrate the usefulness and generality of the proposed testing procedure.

It is worth noting that most of the recent work in model selection tests has been based on comparing the Kullback–Leibler Information Criterion (KLIC) between two competing models. See, e.g., Kitamura (2000, 2002), and Chen et al. (2003). Our approach is different, as it is based on the simulated mean squared errors of predictions, a lack-of-fit criterion. This is motivated by the fact that many structural econometric models are estimated by GMM or MSM rather than MLE, thus the KLIC cannot be used as a model selection criterion.⁴ Our model selection criterion, on the other hand, can be used for any estimation methods that yield estimators with root-n asymptotic normality, or with n^α-consistency for α > 1/2, and hence has an appealing generality.

This paper is organized as follows. Section 2 describes the general model selection framework for structural econometric models using the SMSEP criterion. The hypotheses for model selection are formulated. The asymptotic properties of the proposed test statistic are established. Section 3 is devoted to Monte Carlo experiments investigating finite sample properties of the tests, and Section 4 considers an empirical application of the proposed test to structural auction models. Section 5 concludes.

2. An SMSEP criterion and the resulting model selection test

Two models $M_1$ and $M_2$ are estimated using data $\{y_i, x_i\}$, $i = 1, \ldots, n$, where $y$ is a dependent variable and $x$ is a $1 \times K$ vector of covariates. Both $M_j$, $j = 1, 2$, are structural models in the sense that for model $M_j$ there is a $p$-dimensional vector of latent variables $v_j \in V_j \subset \mathbb{R}^p$ with the (conditional) probability density function (pdf) $f_j(\cdot|x, \theta_j)$ and the (conditional) cumulative distribution function (cdf) $F_j(\cdot|x, \theta_j)$, where $\theta_j$ is in $\Theta_j$, a compact subset of $\mathbb{R}^{K_j}$, such that the observed dependent variable $y$ and the latent variables $v_j$ are related through the structural model by $y = H_j(v_j, f_j(\cdot|x, \theta_j)) \equiv H_j(v_j, x, \theta_j)$.⁵ As a result, the function $H_j(\cdot, x, \theta_j)$ maps $v_j$ to the equilibrium outcome $y$ under model $M_j$. For instance, in structural auction models where bidders are assumed to bid optimally according to the Nash–Bayesian equilibrium strategies, the observed bids can be considered as an increasing function of bidders’ private valuations. See, e.g., Laffont (1997) for a review of empirical auction models. Note that in addition to $\theta_j$, the parameters that appear in the (conditional) pdf of the latent variables, it is also possible to include in model $M_j$ some parameters that are not associated with the latent variable density, provided that they can be identified and estimated as well. An example of this case is bidders’ risk aversion parameter in auction models. Our model selection procedure can be readily adapted to this case, in which we can have $y = H_j(v_j, f_j(\cdot|x, \theta_j), \gamma_j) \equiv H_j(v_j, x, \theta_j, \gamma_j)$, where $\gamma_j$ is the parameter vector that is not associated with the latent variable density. Thus, for ease of exposition, we will focus on the case where each model $M_j$ contains only $\theta_j$. We have the following random sampling assumption.

³ The bias correction we use here in constructing the SMSEP adopts the one introduced by Laffont et al. (1995) for simulated nonlinear least squares estimation.

⁴ On the other hand, if the structural models considered here are estimated using empirical likelihood or other KLIC based methods, then one can apply the recent model selection tests such as Kitamura (2002).

Assumption 1. $\{y_i, x_i\}$, $i = 1, \ldots, n$, are independently and identically distributed with finite first and second population moments.

Note that we make the random sampling assumption for the sake of exposition. Our proposed selection procedure can be readily extended to (weakly) dependent data whose data generating process satisfies mixing conditions, such as those given in Gallant and White (1988).

Let $\hat\theta_j$ be an estimator of $\theta_j$ using the observations $\{y_i, x_i\}$. The estimator $\hat\theta_j$ can be obtained from any estimation method with $\sqrt{n}$ asymptotic normality. Specifically, we have the following assumptions.

Assumption 2. For $j = 1, 2$, there is a unique $\theta_j^*$ in the interior of $\Theta_j$ such that $\hat\theta_j$ converges to $\theta_j^*$ in probability as $n \to \infty$.

Assumption 3. For $j = 1, 2$, there exist $K_j \times 1$ random vectors $U_{j,i}$, $i = 1, \ldots, n$, with mean zero and bounded second absolute moments such that

$$\sqrt{n}(\hat\theta_j - \theta_j^*) = -\frac{1}{\sqrt{n}} A_j \sum_{i=1}^{n} U_{j,i} + o_P(1) \tag{1}$$

where $A_j$ are bounded nonstochastic symmetric $K_j \times K_j$ matrices.
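As an illustration of representation (1) that is not taken from the paper, consider an M-estimator $\hat\theta_j$ that minimizes a smooth sample criterion $n^{-1}\sum_{i} q_j(y_i, x_i, \theta_j)$. The symbols $q_j$, $s_{j,i}$ and $H^{q}_j$ below are hypothetical names introduced only for this sketch. Under standard regularity conditions of the kind surveyed in Newey and McFadden (1994), a mean-value expansion of the first-order condition around $\theta_j^*$ gives:

```latex
\sqrt{n}\,(\hat\theta_j - \theta_j^*)
  = -\frac{1}{\sqrt{n}}\,\bigl(H^{q}_j\bigr)^{-1} \sum_{i=1}^{n} s_{j,i} + o_P(1),
\qquad
s_{j,i} = \frac{\partial q_j(y_i, x_i, \theta_j^*)}{\partial \theta_j},
\quad
H^{q}_j = E\!\left[\frac{\partial^2 q_j(y, x, \theta_j^*)}{\partial \theta_j\,\partial \theta_j'}\right].
```

In this case one would read off $A_j = (H^{q}_j)^{-1}$, which is symmetric and nonstochastic, and $U_{j,i} = s_{j,i}$, which has mean zero because $\theta_j^*$ satisfies the population first-order condition even when the model is misspecified.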

Assumption 2 assumes the (weak) convergence of $\hat\theta_j$ to a unique value $\theta_j^*$ in the interior of $\Theta_j$. Since we allow both $M_j$, $j = 1, 2$, to be misspecified, $\theta_j^*$, $j = 1, 2$, are called pseudo-true values as in Gallant and White (1988). Assumption 3 gives an asymptotic linear representation for $\hat\theta_j$ that is satisfied by most econometric estimators possessing root-n asymptotic normality (see, e.g., Newey and McFadden (1994)). Later this assumption will be changed to accommodate the possibility that one or both estimators are $n^{\alpha}$-consistent for $\alpha > 1/2$.

⁵ It is clear from the set-up here that both $M_j$, $j = 1, 2$, are allowed to be conditional structural models with $x$ being the variables that are used for controlling for heterogeneity, as is accounted for by most of the structural models in microeconometric applications. It is also worth noting that while we assume the vector of exogenous variables $x$ is the same for both competing models and the dimensionality of the latent variables $v_j$ is the same for both models for ease of exposition, these assumptions are unnecessary and can be relaxed readily. When the number of parameters varies across the competing models, since the proposed test statistic is $\sqrt{n}$-asymptotically normal, it is possible to use some penalty terms as in Vuong (1989).

To formulate a set of hypotheses that are properly defined in the framework of structural econometric models, we define the quantity

$$Q_j(\theta_j^*) = E_{y,x}\bigl[(y - E_{M_j}[y|x, \theta_j^*])^2\bigr]$$

where $E_{y,x}$ denotes the expectation taken with respect to the true but unknown joint distribution of $y$ and $x$, and $E_{M_j}$ denotes that the expectation is taken with respect to model $M_j$, which may be misspecified. Thus $y - E_{M_j}[y|x, \theta_j^*]$ represents the prediction error from the conditional model $M_j$. $Q_j(\theta_j^*)$ is well-defined and finite because of Assumption 1.⁶

⁶ Throughout the paper, we assume that the endogenous variable is a scalar for ease of exposition. Our testing procedure can be extended to the multivariate case by defining $Q_j(\theta_j^*) = E_{y,x}\|y - E_{M_j}[y|x, \theta_j^*]\|^2$, where $y$ is a vector of endogenous variables.

Note that $Q_j(\theta_j^*)$ can be viewed as the (asymptotic) lack-of-fit from model $M_j$. Then within the classical hypothesis testing framework as adopted in Vuong (1989), Rivers and Vuong (2002), Kitamura (2000, 2002), and Chen et al. (2003), we can specify the following set of null and alternative hypotheses

$$H_0 : Q_1(\theta_1^*) = Q_2(\theta_2^*),$$

meaning that $M_1$ and $M_2$ are asymptotically equivalent, against

$$H_1 : Q_1(\theta_1^*) < Q_2(\theta_2^*),$$

meaning that $M_1$ is asymptotically better than $M_2$ in the sense that the former has a smaller (asymptotic) lack-of-fit than the latter, or

$$H_2 : Q_1(\theta_1^*) > Q_2(\theta_2^*),$$

meaning that $M_2$ is asymptotically better than $M_1$.

From the formulation of the null and alternative hypotheses above, it is clear that our model selection is based on a comparison of the asymptotic lack-of-fit, or predictability, of the two (conditional) structural models under consideration. In essence, under the null, both structural models have the same asymptotic predictability, while under $H_1$, model 1 has a better asymptotic predictability than model 2, and under $H_2$, model 2 is better than model 1 with respect to the asymptotic predictability.

To test $H_0$ against $H_1$ or $H_2$, we need to estimate $Q_j(\theta_j^*)$ using the observations $\{y_i, x_i\}$, $i = 1, \ldots, n$. Let $g_j(\cdot|x, \theta_j)$ denote the pdf for $y$ under model $M_j$. Then $Q_j(\theta_j^*)$ could be estimated consistently by

$$\tilde Q_j(\hat\theta_j) = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \int H_j(v_j, x_i, \hat\theta_j) f_j(v_j|x_i, \hat\theta_j)\,dv_j\right)^2 \tag{2}$$

$$= \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \int y\, g_j(y|x_i, \hat\theta_j)\,dy\right)^2. \tag{3}$$

However, (2) can be difficult to compute because the functional form of $H_j(\cdot, x, \theta_j)$ can be complicated and/or because a multivariate $v_j$ leads to a multivariate integral, both of which make evaluating the integral computationally burdensome. Similarly, (3) can be computationally intractable because the functional form of $g_j(\cdot|x, \theta_j)$, the pdf for the observed equilibrium outcome $y$ under model $M_j$, can be hard to obtain as the result of the structural model leading to $y = H_j(v_j, x_i, \theta_j)$, and so is the integral in (3). To address this issue, note that we can always rewrite $E_{M_j}[y|x_i, \theta_j^*]$ as

$$E_{M_j}[y|x_i, \theta_j^*] = E_{u_j}[\Psi_j(u_j, x_i, \theta_j^*)] \tag{4}$$

where $u_j$ is a $p$-dimensional vector of random variables whose distribution is known and thus does not depend on the underlying parameters. For example, in general, $\Psi_j(u_j, x_i, \theta_j^*)$ can be taken as $H_j(u_j, x_i, \theta_j^*) f_j(u_j|x_i, \theta_j^*)/\varphi_i(u_j)$, where $\varphi_i(\cdot)$ is a known $p$-dimensional density whose support includes that of $f_j(\cdot|x_i, \theta_j^*)$. In effect, this is the importance sampling technique that has been widely used in simulation based inference. Therefore, (4) can always be obtained through the importance sampling technique. In some specific cases, $\Psi_j(u_j, x_i, \theta_j^*)$ can be obtained through methods other than importance sampling. For instance, if $v_j$ is univariate, then $\Psi_j(u_j, x_i, \theta_j^*)$ can be taken as $H_j(F_j^{-1}(u_j|x_i, \theta_j^*), x_i, \theta_j^*)$, where $u_j$ is a random variable from Uniform$(0, 1)$.⁷

In any case, (4) implies that $E_{M_j}[y|x_i, \theta_j^*]$ can be approximated by $Y_{j,i}(\theta_j^*) \equiv \sum_{s_j=1}^{S_j} y_{j,i}^{(s_j)}(\theta_j^*)/S_j$, where $y_{j,i}^{(s_j)}(\theta_j^*) \equiv \Psi_j(u_{j,i}^{(s_j)}, x_i, \theta_j^*)$ and $u_{j,i}^{(s_j)}$, $s_j = 1, \ldots, S_j$, are independent draws from a known distribution, such as from $\varphi_i(\cdot)$ in general when the importance sampling method is used, or from Uniform$(0, 1)$ in the special case when $v_j$ is univariate as discussed in the preceding paragraph. This is because $y_{j,i}^{(s_j)}(\theta_j^*) \equiv \Psi_j(u_{j,i}^{(s_j)}, x_i, \theta_j^*)$ is an unbiased simulator of $E_{M_j}[y|x_i, \theta_j^*]$. Noting that $\theta_j^*$ are unknown, but can be consistently estimated by $\hat\theta_j$, we could replace the integral in (2) by $Y_{j,i}(\hat\theta_j) \equiv \sum_{s_j=1}^{S_j} y_{j,i}^{(s_j)}(\hat\theta_j)/S_j$, where $y_{j,i}^{(s_j)}(\hat\theta_j) = \Psi_j(u_{j,i}^{(s_j)}, x_i, \hat\theta_j)$. Because of its nonlinearity, however, the following quantity

$$\frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - Y_{j,i}(\hat\theta_j)\bigr)^2$$

does not converge to $Q_j(\theta_j^*)$ for any fixed number $S_j$ of simulations, as the asymptotic bias caused by the simulations does not vanish. To correct for the asymptotic bias caused by the simulations, we define

$$\hat Q_j(\hat\theta_j) = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - Y_{j,i}(\hat\theta_j)\bigr)^2 - \frac{1}{n}\sum_{i=1}^{n}\frac{1}{S_j(S_j - 1)}\sum_{s_j=1}^{S_j}\bigl(y_{j,i}^{(s_j)}(\hat\theta_j) - Y_{j,i}(\hat\theta_j)\bigr)^2. \tag{5}$$
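A short calculation indicates why the second term in (5) removes the simulation bias. It is a sketch under the assumption that the draws $u_{j,i}^{(s_j)}$ are i.i.d. across $s_j$ and independent of $y_i$ given $x_i$. The shorthand $m_i$ and $v_i$ for the mean and variance of $\Psi_j(u_j, x_i, \theta_j)$ over $u_j$ is introduced here for illustration and does not appear in the paper.

```latex
\begin{align*}
E\Bigl[\bigl(y_i - Y_{j,i}(\theta_j)\bigr)^2 \,\Big|\, y_i, x_i\Bigr]
  &= (y_i - m_i)^2 + \frac{v_i}{S_j}, \\
E\Biggl[\frac{1}{S_j(S_j-1)} \sum_{s_j=1}^{S_j}
  \bigl(y^{(s_j)}_{j,i}(\theta_j) - Y_{j,i}(\theta_j)\bigr)^2 \,\Bigg|\, x_i\Biggr]
  &= \frac{v_i}{S_j}.
\end{align*}
```

The first line follows because $Y_{j,i}(\theta_j)$ is an average of $S_j$ unbiased draws with variance $v_i/S_j$, and the second because the sample variance with divisor $S_j - 1$ is unbiased for $v_i$. Subtracting the second line from the first leaves $(y_i - m_i)^2$ for any fixed $S_j \ge 2$, which is the summand of the infeasible criterion (2).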

Moreover, we make the following assumptions.

Assumption 4. For $j = 1, 2$, $\Psi_j(u_j, x_i, \theta_j)$ are continuous at $\theta_j^*$ with probability one.

Assumption 5. For $j = 1, 2$, there are neighborhoods $N_j$ of $\theta_j^*$ such that $E[\sup_{\theta_j \in N_j} |\hat Q_j(\theta_j)|] < \infty$.

Note that Assumption 4 is satisfied in both cases discussed above if $f_j(\cdot|x_i, \theta_j)$ and $H_j(\cdot, x_i, \theta_j)$ are both continuous at $\theta_j^*$. Assumption 5 is satisfied if $\Psi_j(u_j, x_i, \theta_j)$ has finite first and second moments in a neighborhood $N_j$ of $\theta_j^*$, the covariance between $y_i$ and $\Psi_j(u_j, x_i, \theta_j)$ is finite in $N_j$, and Assumption 1 is met. Then we have the following result regarding the relationship between $\hat Q_j(\hat\theta_j)$ and its population counterpart $Q_j(\theta_j^*)$.

⁷ In some cases, depending on the form of the conditional expectation $E_{M_j}[y|x, \theta_j^*]$, there are other alternative ways of defining $\Psi_j(u_j, x_i, \theta_j^*)$, as illustrated in Section 3.

Proposition 1. Suppose Assumptions 1, 2, 4 and 5 hold. For any fixed $S_j$, as $n \to \infty$, $\hat Q_j(\hat\theta_j)$ converges to $Q_j(\theta_j^*)$ in probability.
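The content of Proposition 1 can be checked numerically with a small sketch of criterion (5). The code below is illustrative only: the function names, the data generating process (outcomes and simulated predictions both drawn as `scale` times a Uniform(0, 1) variable, so the model is correctly specified), and the choice of two simulations per observation are assumptions made so that the population lack-of-fit is known to equal `scale**2 / 12`. The parameter is held fixed rather than estimated.

```python
import random


def smsep(y, sims):
    # Bias-corrected criterion (5): y[i] is the observed outcome and
    # sims[i] holds the S >= 2 simulated predictions for observation i.
    total = 0.0
    for y_i, draws in zip(y, sims):
        s = len(draws)
        y_bar = sum(draws) / s
        correction = sum((d - y_bar) ** 2 for d in draws) / (s * (s - 1))
        total += (y_i - y_bar) ** 2 - correction
    return total / len(y)


def naive_msep(y, sims):
    # Uncorrected criterion: mean squared distance to the simulated average.
    total = 0.0
    for y_i, draws in zip(y, sims):
        y_bar = sum(draws) / len(draws)
        total += (y_i - y_bar) ** 2
    return total / len(y)


rng = random.Random(1)
n, n_sims, scale = 50_000, 2, 1.5
y = [scale * rng.random() for _ in range(n)]
sims = [[scale * rng.random() for _ in range(n_sims)] for _ in range(n)]
population_q = scale ** 2 / 12.0  # Var(y), the lack-of-fit of the true model
corrected = smsep(y, sims)
naive = naive_msep(y, sims)
print(population_q, corrected, naive)
```

With only two draws per observation the naive criterion overstates the lack-of-fit by roughly the simulation variance divided by the number of draws, while the corrected criterion stays close to the population value as the sample size grows.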

As justified in Proposition 1, for $j = 1, 2$ and any fixed $S_j$, $\hat Q_j(\hat\theta_j)$ consistently estimates $Q_j(\theta_j^*)$. Furthermore, $S_1$ does not necessarily equal $S_2$. As a result, we propose to use $\hat Q_j(\hat\theta_j)$ in practice to estimate $Q_j(\theta_j^*)$ and hence to test $H_0$ against $H_1$ or $H_2$. $\hat Q_j(\hat\theta_j)$ is thus the SMSEP we propose for model $M_j$. It can be viewed as an in-sample SMSEP as it is calculated from the same sample that is used in the estimation. Alternatively, we can consider an out-of-sample SMSEP in the sense that the original data set is split into two parts: one part is used for estimation of the competing models, and the other part is used for calculating $\hat Q_j(\hat\theta_j)$ and hence for the model selection test. Since within the framework considered here the asymptotic properties of the tests based on the in-sample and out-of-sample SMSEP are the same, we will focus on the in-sample test based on (5) for ease of exposition. It is worth noting that using $\hat Q_j(\hat\theta_j)$ has a computational advantage as it can be readily obtained from the sample information with the help of simulations. Besides, as given in Proposition 1, it converges to the population lack-of-fit criterion as the sample size approaches infinity for any fixed number of simulations. This feature makes it a basis for constructing our test statistic below. Note that bias corrections similar to (5) were first used in Laffont et al. (1995) and subsequently in Li and Vuong (1997) in constructing objective functions to be minimized that produce simulated nonlinear least squares estimators which are consistent for a fixed number of simulations in estimating structural auction models. A novelty of this paper is to use (5) for a different purpose, that is, to use it as a consistent sample analog to the population lack-of-fit criterion in constructing a general test statistic for choosing between rival structural econometric models, not limited to auction models, as long as the structural models under consideration allow one to generate predictions from simulations.

In order to propose our test statistic, we define $T_n \equiv \sqrt{n}\,(\hat Q_1(\hat\theta_1) - \hat Q_2(\hat\theta_2))$. We also make the following assumptions.

Assumption 6. For $j = 1, 2$, $\Psi_j(u_j, x_i, \theta_j)$ are twice continuously differentiable in a neighborhood of $\theta_j^*$.

Assumption 7. For $j = 1, 2$, there are neighborhoods $L_j$ of $\theta_j^*$ such that $E[\sup_{\theta_j \in L_j} |\partial \hat Q_j(\cdot)/\partial\theta_j|] < \infty$.

Note that Assumption 6 is satisfied in both cases discussed above when $f_j(\cdot|x_i, \theta_j)$ and $H_j(\cdot, x_i, \theta_j)$ are both twice continuously differentiable in a neighborhood of $\theta_j^*$. Assumption 7 is made to ensure the stochastic equicontinuity (e.g. Andrews (1994)) of $\partial \hat Q_j(\cdot)/\partial\theta_j$ at $\theta_j^*$, and can be satisfied if $\partial\Psi_j(u_j, x_i, \cdot)/\partial\theta_j$ has finite first and second moments in a neighborhood $L_j$ of $\theta_j^*$, the covariance between $y_i$ and $\partial\Psi_j(u_j, x_i, \cdot)/\partial\theta_j$ is finite in $L_j$, and Assumption 1 is met. The next theorem establishes asymptotic properties of $T_n$ under our specified hypotheses $H_0$, $H_1$ and $H_2$.

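Theorem 1. Suppose Assumptions 1–3 and 5–7 hold.

(i) Under $H_0$, $T_n \Rightarrow N(0, \sigma^2)$, where

$$\sigma^2 = \operatorname*{plim}_{n\to\infty}\bigl[(1, -B_{1,n}, B_{2,n})\, V_n\, (1, -B_{1,n}, B_{2,n})'\bigr],$$

$$V_n = \operatorname{Var}\begin{bmatrix} C_n \\ U_{1,n} \\ U_{2,n} \end{bmatrix},$$

$$C_i = \bigl(y_i - Y_{1,i}(\theta_1^*)\bigr)^2 - \frac{1}{S_1(S_1 - 1)}\sum_{s_1=1}^{S_1}\bigl(y_{1,i}^{(s_1)}(\theta_1^*) - Y_{1,i}(\theta_1^*)\bigr)^2 - \bigl(y_i - Y_{2,i}(\theta_2^*)\bigr)^2 + \frac{1}{S_2(S_2 - 1)}\sum_{s_2=1}^{S_2}\bigl(y_{2,i}^{(s_2)}(\theta_2^*) - Y_{2,i}(\theta_2^*)\bigr)^2,$$

$$B_{j,n} = \left.\frac{\partial \hat Q_j}{\partial\theta_j'}\right|_{\theta_j^*} A_j,$$

and $A_j$ and $U_{j,i}$ are defined in (1) of Assumption 3.

(ii) Under $H_1$, $T_n \to_p -\infty$.

(iii) Under $H_2$, $T_n \to_p \infty$.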

Theorem 1 is valid in a general sense: while the SMSEP criteria Q̂j(θ̂j), j = 1, 2, are used in constructing Tn, the estimation methods used to obtain θ̂j, j = 1, 2, can be any that result in estimators with √n asymptotic normality. These include estimators commonly used in practice, such as the GMM estimators, the MSM estimators surveyed in Gourieroux and Monfort (1996), as well as some semiparametric estimators surveyed in Powell (1994). Moreover, the criteria that are optimized in estimation can be different from the SMSEP used as our model selection criterion. As a consequence of this generality, the asymptotic variances of θ̂j, j = 1, 2, in general contribute to the asymptotic variance σ² of Tn, as reflected in the presence of the Aj and Uj,i terms in σ². On the other hand, in some applications one or both θ̂j, j = 1, 2, can be obtained by minimizing the same SMSEP criterion defined in (5) that is used as our model selection criterion. In these situations, the expression for σ² simplifies, as indicated in the following corollary.

Corollary 1. Assume Assumptions 1–3 and 5–7.

(i) If θ̂1 is obtained by minimizing (5), then under H0, Tn ⇒ N(0, σ²), where

\[
\sigma^2 = \operatorname{plim}_{n\to\infty}\left[(1, 0_{K_1}, B_{2,n})\, V_n\, (1, 0_{K_1}, B_{2,n})'\right],
\]

where 0_{K1} is a 1 × K1 row vector of zeros, and B2,n and Vn are defined in Theorem 1.

(ii) If θ̂2 is obtained by minimizing (5), then under H0, Tn ⇒ N(0, σ²), where

\[
\sigma^2 = \operatorname{plim}_{n\to\infty}\left[(1, -B_{1,n}, 0_{K_2})\, V_n\, (1, -B_{1,n}, 0_{K_2})'\right],
\]

where 0_{K2} is a 1 × K2 row vector of zeros, and B1,n and Vn are defined in Theorem 1.

(iii) If both θ̂1 and θ̂2 are obtained by minimizing (5), then under H0, Tn ⇒ N(0, σ²), where

\[
\sigma^2 = \operatorname{Var}(C_i),
\]

where Ci is defined in Theorem 1.

Corollary 1 gives simplified expressions for σ² when one or both θ̂j, j = 1, 2, are obtained from minimizing (5), the SMSEP criterion. This can occur when one or both structural models are specified using first moment conditions, and one or both estimators are simulated nonlinear least squares estimators resulting from minimizing (5). Related examples are Laffont et al. (1995) and Li and Vuong (1997). Most interestingly, if both estimators are obtained from minimizing (5), then σ² is the same as if θ∗j, j = 1, 2, were known. As a result, σ² does not depend on the asymptotic variances of θ̂j, j = 1, 2, meaning that the sampling variability attributed to the estimation of θ∗j is (asymptotically) irrelevant in using Tn to test H0. This result is analogous to Theorem 3 in Rivers and Vuong (2002).

As can be seen from Theorem 1 and Corollary 1, in order to propose a test statistic that is operational, one needs a consistent estimator for σ², the asymptotic variance of Tn. Provided that one can find such a consistent estimator, say σ̂², we have the following result.

Corollary 2. Assume Assumptions 1–3 and 5–7. Let T̂n = Tn/σ̂.

(i) Under H0, T̂n ⇒ N(0, 1).

(ii) Under H1, T̂n →p −∞.

(iii) Under H2, T̂n →p ∞.
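As an illustration of how Corollary 2 turns into a two-sided decision rule, the following Python sketch studentizes Tn and compares it with standard normal critical values. It is only a sketch under a stated assumption: σ̂ is taken to be the sample standard deviation of the Ci, which is appropriate only in case (iii) of Corollary 1; in the general case of Theorem 1 a different estimator of σ is needed. The function name `select_model` and its return labels are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def select_model(c, alpha=0.05):
    """Two-sided model selection test based on the differences C_i.

    c     : list of per-observation differences C_i (model 1 minus model 2)
    alpha : asymptotic significance level
    Returns the studentized statistic and one of "H0", "H1", "H2".
    """
    n = len(c)
    t_n = n ** 0.5 * mean(c)
    # Assumption: sigma is estimated by the sample s.d. of C_i, which is
    # valid only when estimation noise is asymptotically irrelevant.
    sigma_hat = stdev(c)
    t_hat = t_n / sigma_hat
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value Z_{alpha/2}
    if t_hat < -z:
        return t_hat, "H1"  # model 1 has the smaller lack of fit
    if t_hat > z:
        return t_hat, "H2"  # model 2 has the smaller lack of fit
    return t_hat, "H0"      # the two models are not distinguished
```

A large positive statistic therefore favors model 2 and a large negative one favors model 1, while values within ±Z_{α/2} leave the two models undistinguished at level α.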

As stated in Corollary 2, our test statistic T̂n has a convenient asymptotic property: under H0, it is asymptotically standard normal. Therefore, given a consistent estimate σ̂ for σ, our model selection procedure involves computing T̂n and then comparing it with critical values from a standard normal distribution. Specifically, let α denote the specified asymptotic significance level of the test and Zα/2 ≡ Φ−1(1 − α/2), where Φ−1(·) denotes the inverse of the standard normal cumulative distribution function. If |T̂n| ≤ Zα/2, then we accept H0. Otherwise, if T̂n < −Zα/2, we reject H0 in favor of H1; if T̂n > Zα/2, we reject H0 in favor of H2. Furthermore, a consistent estimator of σ² can be obtained by replacing the quantities in Theorem 1 or Corollary 1 by their sample analogs, in a similar spirit to Rivers and Vuong (2002, Section 2). In particular, in case (iii) of Corollary 1, when both estimators are obtained from minimizing (5) and their sampling variation is asymptotically irrelevant to σ², σ² can be straightforwardly estimated from the sample variation of Ci, i = 1, . . . , n.

So far we have maintained Assumption 3, which assumes root-n asymptotic normality for the estimators obtained under the competing models Mj, j = 1, 2. Maintaining this assumption simplifies the presentation and discussion. While most of the estimators used in estimating structural models satisfy this assumption, another class of estimators, relevant to some structural models where the support of the dependent variable also depends on the structural parameters, can be n-consistent, a rate faster than root-n. These estimators include those based on likelihood (Donald and Paarsch, 1993, 1996; Hong, 1998; Chernozhukov and Hong, 2004; Hirano and Porter, 2003), and those based on extreme order statistics (Donald and Paarsch, 2002). It is worth noting that when one or both competing models are estimated by these rate-n consistent estimators, Theorem 1 not only remains valid, but also simplifies in a way similar to that in Corollary 1. The next corollary gives the corresponding results.

Corollary 3. Assume Assumptions 1, 2 and 5–7.

(i) If θ̂1 is n^α-consistent, where α > 1/2, but θ̂2 is root-n consistent and satisfies Assumption 3, then under H0, Tn ⇒ N(0, σ²), where

\[
\sigma^2 = \operatorname{plim}_{n\to\infty}\left[(1, B_{2,n})\, W_{2,n}\, (1, B_{2,n})'\right],
\qquad
W_{2,n} = \operatorname{Var}\begin{bmatrix} C_n \\ U_{2,n} \end{bmatrix},
\]

and B2,n is defined in Theorem 1.

(ii) If θ̂2 is n^α-consistent, where α > 1/2, but θ̂1 is root-n consistent and satisfies Assumption 3, then under H0, Tn ⇒ N(0, σ²), where

\[
\sigma^2 = \operatorname{plim}_{n\to\infty}\left[(1, -B_{1,n})\, W_{1,n}\, (1, -B_{1,n})'\right],
\qquad
W_{1,n} = \operatorname{Var}\begin{bmatrix} C_n \\ U_{1,n} \end{bmatrix},
\]

and B1,n is defined in Theorem 1.

(iii) If both θ̂1 and θ̂2 are n^α-consistent, where α > 1/2, then under H0, Tn ⇒ N(0, σ²), where

\[
\sigma^2 = \operatorname{Var}[C_i],
\]

where Ci is defined in Theorem 1.


As reflected in Theorem 1 and Corollary 3, our proposed model selection procedure can be used when the competing models are estimated by estimators that are either n^α-consistent for α > 1/2 or root-n asymptotically normal. Thus, it has generality and wide applicability. Moreover, when both models are estimated by n^α-consistent estimators with α > 1/2, including n-consistent estimators, Corollary 3 indicates that the sampling variability attributed to the estimation of θ∗j, j = 1, 2, does not affect σ² asymptotically. The calculation of σ² thus simplifies greatly, in the same way as when both estimators are obtained from minimizing (5), though the reasons are different.⁸

A critical issue in the non-nested model selection literature (Vuong, 1989; West, 1996; Rivers and Vuong, 2002) is that the asymptotic variance of the test statistic can be zero, in which case the resulting test statistics are invalid. In particular, this is the case in Rivers and Vuong (2002) when the competing models are nested. Therefore, the test statistics in Rivers and Vuong (2002) are only valid for essentially non-nested models. While our null and alternative hypotheses are formulated in a similar manner to those in Rivers and Vuong, our test statistics differ from those in Rivers and Vuong (2002), as they directly construct the sample analog of the MSEP while ours use the SMSEP with the help of simulations. It would be interesting to analyze whether or not the asymptotic variance of our test statistic can become zero. This question is answered by the following proposition.

Proposition 2. The asymptotic variance σ² of Tn given in Theorem 1 and Corollaries 1 and 3 is always positive.

Proposition 2 is important as it indicates that our test statistic is always valid. This makes our test distinctive from other tests. Such a distinctive feature of our test can be attributed to the SMSEP we use in constructing the test statistics, and in particular to the simulation, which creates additional randomness.

3. Monte Carlo experiments

In this section, we study the finite sample performance of the proposed model selection test. In particular, we investigate the size and power properties of our test in a finite sample setting. Though limited, the Monte Carlo experiments offer insight into how the test can be useful in choosing between competing structural models, and how it performs given a data set of moderate size.⁹ As an illustration, we consider using our proposed test in selecting between competing structural auction models.

8 I am grateful to an associate editor for suggesting that the results in Corollary 3 in a previous version of the paper regarding n-consistent estimators also apply to n^α-consistent estimators for α > 1/2.

9 It is worth noting that the Monte Carlo experiments conducted in the paper contribute to the literature on non-nested model selection, as to the best of my knowledge there are few Monte Carlo studies in the literature investigating the size and power properties of the non-nested tests that have been developed.

We consider auctions within an independent private value paradigm. To be more specific, we consider auctions held as Dutch auctions, in which only the winning bids, which are the highest bids, are observed by the econometrician; the data are hence more limited than in first-price sealed-bid auctions with all bids observed. Of course, first-price sealed-bid auctions with only winning bids observed can also be treated as Dutch auctions. In our design, the number of bidders is fixed across all auctions and is chosen as N = 10. The sample size is L = 200, meaning that we consider 200 auctions, which is about the usual sample size for auctions. At the ℓ-th auction, ℓ = 1, . . . , 200, the bidders draw their private values from an exponential distribution with the density

\[
f(v \mid x_{1\ell}, x_{2\ell}) = \frac{1}{\exp(\theta_0 + \theta_1 x_{1\ell} + \theta_2 x_{2\ell})} \exp\left[-\frac{1}{\exp(\theta_0 + \theta_1 x_{1\ell} + \theta_2 x_{2\ell})}\, v\right], \tag{6}
\]

where x1ℓ and x2ℓ, ℓ = 1, . . . , 200, are independent draws from a Uniform distribution on (0, 2). They are used to control for observed heterogeneity of the auctioned objects, and θ′ = (θ0, θ1, θ2) = (0.5, 1, 1). At the ℓ-th auction, there is a public reserve price set at p0ℓ = √2 exp(θ0 + θ1x1ℓ + θ2x2ℓ), meaning that the reserve price is more than the mean of the private value distribution, and thus is effectively binding. It is worth noting that, following Riley and Samuelson (1981) among others, at the ℓ-th auction the Nash–Bayesian equilibrium strategy for the winning bidder can be written as

\[
b_{w,\ell}(v_{w,\ell}) = v_{w,\ell} - \frac{1}{F^{N-1}(v_{w,\ell} \mid x_\ell)} \int_{p_{0\ell}}^{v_{w,\ell}} F^{N-1}(u \mid x_{1\ell}, x_{2\ell})\, du,
\]

where vw,ℓ is the private value of the winning bidder at the ℓ-th auction, bw,ℓ(vw,ℓ) is the corresponding equilibrium bid, and F(·|x1ℓ, x2ℓ) is the cumulative distribution associated with the density f(·|x1ℓ, x2ℓ). The experiments are conducted through 500 replications.

To investigate the size properties of the test, we consider the following scenario. Suppose we need to specify the conditional density of private values. We would use the exponential to model the density, but need to determine which covariate, x1ℓ or x2ℓ, to condition on. Specifically, we face two alternatives for the choice of the conditional density, namely, fs,j(·|xjℓ), j = 1, 2, where

\[
f_{s,j}(v \mid x_{j\ell}) = \frac{1}{\exp(\delta_{j0} + \delta_{j1} x_{j\ell})} \exp\left[-\frac{1}{\exp(\delta_{j0} + \delta_{j1} x_{j\ell})}\, v\right].
\]
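The data-generating design described above (exponential private values, a binding reserve price, and the equilibrium bid of the winner) can be sketched in Python as follows. This is an illustrative sketch rather than the code used in the paper: the helper names are hypothetical, the integral in the bid function is approximated by Simpson's rule (a choice made here), and an unsold auction is represented by `None`.

```python
import math
import random


def exp_cdf(v, mu):
    """CDF of an exponential distribution with mean mu."""
    return 1.0 - math.exp(-v / mu) if v > 0 else 0.0


def winning_bid(v, p0, mu, n_bidders, steps=200):
    """Equilibrium bid b(v) = v - int_{p0}^{v} F(u)^(N-1) du / F(v)^(N-1).

    Requires v >= p0 > 0; `steps` must be even (composite Simpson's rule).
    """
    if v < p0:
        raise ValueError("a bidder with value below the reserve price does not bid")
    k = n_bidders - 1
    h = (v - p0) / steps
    total = exp_cdf(p0, mu) ** k + exp_cdf(v, mu) ** k
    for i in range(1, steps):
        weight = 4 if i % 2 else 2  # odd nodes get weight 4, even nodes 2
        total += weight * exp_cdf(p0 + i * h, mu) ** k
    integral = total * h / 3.0
    return v - integral / exp_cdf(v, mu) ** k


def simulate_auction(x1, x2, rng, theta=(0.5, 1.0, 1.0), n_bidders=10):
    """Simulate one auction; returns (reserve price, winning bid or None)."""
    mu = math.exp(theta[0] + theta[1] * x1 + theta[2] * x2)  # mean value
    p0 = math.sqrt(2.0) * mu                                 # reserve price
    v_max = max(rng.expovariate(1.0 / mu) for _ in range(n_bidders))
    if v_max < p0:
        return p0, None  # no bidder meets the reserve price: unsold
    return p0, winning_bid(v_max, p0, mu, n_bidders)
```

A full replication would draw x1ℓ and x2ℓ from Uniform(0, 2) for ℓ = 1, . . . , 200, for example with `rng.uniform(0.0, 2.0)`, and call `simulate_auction` once per auction.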

Note that both models Mj, j = 1, 2, with fs,j(·|xjℓ), j = 1, 2, respectively, are misspecified, as the true density is given by (6). Suppose that we estimate the resulting structural models under specifications fs,j(·|xjℓ), j = 1, 2, using the winning bids from the sold auctions, that is, we use those highest bids that are above the reserve prices. We can now estimate model Mj, j = 1, 2, by minimizing

\[
\sum_{\ell=1}^{L} \left[b_{w,\ell} - m_{s,j}(x_{j\ell}, \delta_j)\right]^2 I(b_{w,\ell} > p_{0\ell})
\]

to obtain nonlinear least squares estimates δ̂j, where ms,j(xjℓ, δj) ≡ E[bw,ℓ | bw,ℓ > p0ℓ, xjℓ], and I(bw,ℓ > p0ℓ) = 1 if bw,ℓ > p0ℓ and 0 otherwise. Note that because, by design, x1ℓ and x2ℓ are independently drawn from the same Uniform distribution on (0, 2) and θ1 = θ2, the estimates δ̂j, j = 1, 2, converge in probability to the same limit, say, δ∗1 = δ∗2 = δ∗. In order to select between the two models, suppose we define bh,ℓ = bw,ℓ if the auction is sold, in which case bw,ℓ > p0ℓ, and bh,ℓ = p0ℓ otherwise, and we define

\[
Q_{s,j}(\delta_j^*) = E_{b_h, x_j}\left[\left(b_h - E_{M_j}[b_h \mid x_j, \delta_j^*]\right)^2\right].
\]

Note that in our case, because of the revenue equivalence theorem (Riley and Samuelson, 1981; Laffont et al., 1995), we have

\[
\begin{aligned}
E_{M_j}[b_h \mid x_j, \delta_j^*]
&= \int_{p_0}^{\infty} v\, f_{s,j}^{(N-1:N)}(v \mid x_j, \delta_j^*)\, dv + \int_{0}^{p_0} p_0\, f_{s,j}^{(N-1:N)}(v \mid x_j, \delta_j^*)\, dv \\
&= \int_{p_0}^{\infty} v\, N(N-1)\, F_{s,j}^{N-2}(v \mid x_j, \delta_j^*) \left(1 - F_{s,j}(v \mid x_j, \delta_j^*)\right) f_{s,j}(v \mid x_j, \delta_j^*)\, dv \\
&\quad + p_0\, F_{s,j}^{(N-1:N)}(p_0 \mid x_j, \delta_j^*),
\end{aligned}
\tag{7}
\]


Table 1
Empirical sizes of the tests.

            Nominal size = 0.05    Nominal size = 0.10
Method 1    0.012                  0.046
Method 2    0.054                  0.086

where $f_{s,j}^{(N-1:N)}(v \mid x_j, \delta_j^*)$ and $F_{s,j}^{(N-1:N)}(v \mid x_j, \delta_j^*)$ are the density and distribution of the second-highest private value among N bidders, respectively. By the change of variables uj = [Fs,j(v|xj, δ∗j) − Fs,j(p0|xj, δ∗j)]/[1 − Fs,j(p0|xj, δ∗j)], we obtain from (7) that EMj[bh|xj, δ∗j] = Euj[Ψj(uj, xi, θ∗j)], where

\[
\begin{aligned}
\Psi_j(u_j, x_i, \theta_j^*) ={}& N(N-1)\left(1 - F_{s,j}(p_0 \mid x_j, \delta_j^*)\right) \\
&\times F_{s,j}^{-1}\left(u_j\left(1 - F_{s,j}(p_0 \mid x_j, \delta_j^*)\right) + F_{s,j}(p_0 \mid x_j, \delta_j^*)\right) \\
&\times \left[u_j\left(1 - F_{s,j}(p_0 \mid x_j, \delta_j^*)\right) + F_{s,j}(p_0 \mid x_j, \delta_j^*)\right]^{N-2} \\
&\times \left[1 - u_j\left(1 - F_{s,j}(p_0 \mid x_j, \delta_j^*)\right) - F_{s,j}(p_0 \mid x_j, \delta_j^*)\right] \\
&+ p_0\, F_{s,j}^{(N-1:N)}(p_0 \mid x_j, \delta_j^*),
\end{aligned}
\tag{8}
\]

where uj is from Uniform(0, 1). It is now straightforward to verify that the assumptions on Ψj(uj, xi, θ∗j) in Section 2 are satisfied.

Note that because δ∗1 = δ∗2 = δ∗ as a result of our design, Qs,1(δ∗1) =

Qs,2(δ∗2). Thus H0 is satisfied by our design, which allows us to study the size properties of our test. We then follow (5) to get the SMSEP Q̂s,j(δ̂j) and set the number of simulations in both cases as Sj = 10, j = 1, 2. We refer to this way of estimating and testing as method 1. Note that method 1 uses different objective functions in estimation and testing, thus corresponding to the general case of Theorem 1. We also go further to investigate the case where the estimation minimizes the same SMSEP criterion. In this case, we estimate model Mj, j = 1, 2, by minimizing Q̂s,j(δj) using bh,ℓ across all auctions including the unsold ones. It can be verified by an argument similar to the preceding one that H0 is also satisfied by our design.¹⁰ This method of using the same SMSEP in both estimation and testing is referred to as method 2, which corresponds to (iii) in Corollary 1.

Table 1 reports the empirical sizes using method 1 and method 2. When the nominal size is 0.05, the empirical sizes are 0.012 and 0.054 for method 1 and method 2, respectively. When the nominal size is 0.10, the empirical sizes from method 1 and method 2 are 0.046 and 0.086, respectively. The empirical sizes from method 1 are smaller than the nominal sizes, meaning that the test is conservative under method 1. On the other hand, the empirical sizes from method 2 are close to the nominal sizes. These results demonstrate that our tests have good size properties for an auction data set of 200 observations.

We also conduct experiments to study power properties of the

tests. Suppose that model M1 is correctly specified in that the conditional density of private values is correctly specified as

\[
f_{p,1}(v \mid x_{1\ell}, x_{2\ell}) = \frac{1}{\exp(\eta_0 + \eta_1 x_{1\ell} + \eta_2 x_{2\ell})} \exp\left[-\frac{1}{\exp(\eta_0 + \eta_1 x_{1\ell} + \eta_2 x_{2\ell})}\, v\right],
\]

but model M2 is misspecified in that the conditional density of private values is specified the same way as fs,2(·|x2ℓ), that is, fp,2(v|x1ℓ, x2ℓ) = fs,2(·|x2ℓ). H1 is true under this design, which allows us to study the power of the tests. We also use method 1 and method 2 described above to obtain empirical powers in various cases. Table 2 reports the results. As one can see from Table 2, our tests yield desirable power properties, as the powers are all equal to one at the 0.05 and 0.10 nominal sizes whether method 1 or method 2 is used.

Table 2
Empirical powers of the tests.

            Nominal size = 0.05    Nominal size = 0.10
Method 1    1                      1
Method 2    1                      1

10 Note that in this case, the estimation method is indeed simulated nonlinear least squares. The simulated nonlinear least squares method we use is a bit different from the one in Laffont et al. (1995), as they use the importance sampling method in simulations while ours uses (8) in simulations. In effect, we propose a new way of conducting simulated nonlinear least squares estimation in first-price auctions.
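The simulator in (8) can be made concrete for the exponential specification used in this section. The Python sketch below is illustrative only: the function names are hypothetical, the covariate index is collapsed into a single mean parameter `mu`, and the closed form used for the distribution of the second-highest value, N F^{N−1} − (N − 1) F^N, is the standard order-statistic formula rather than something stated in the text.

```python
import math


def exp_cdf(v, mu):
    """CDF of an exponential distribution with mean mu."""
    return 1.0 - math.exp(-v / mu)


def exp_quantile(w, mu):
    """Inverse CDF of an exponential distribution with mean mu (0 <= w < 1)."""
    return -mu * math.log(1.0 - w)


def second_highest_cdf(p, mu, n):
    """CDF at p of the second-highest of n i.i.d. exponential values."""
    f = exp_cdf(p, mu)
    return n * f ** (n - 1) - (n - 1) * f ** n


def psi(u, mu, p0, n):
    """Simulator of equation (8) for one uniform draw u in [0, 1)."""
    f0 = exp_cdf(p0, mu)
    w = u * (1.0 - f0) + f0  # maps u into the region above the reserve price
    sold_part = (n * (n - 1) * (1.0 - f0)
                 * exp_quantile(w, mu) * w ** (n - 2) * (1.0 - w))
    return sold_part + p0 * second_highest_cdf(p0, mu, n)


def predicted_bid(mu, p0, n, draws):
    """Average of psi over the supplied draws, approximating E[b_h]."""
    return sum(psi(u, mu, p0, n) for u in draws) / len(draws)
```

In the Monte Carlo design, `draws` would hold the Sj = 10 uniform random numbers per auction; passing a fine deterministic grid instead gives a numerical check of the change of variables, since for N = 2 and mean 1 the expectation has the closed form p0 + exp(−2 p0)/2.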

4. An empirical application

To illustrate the usefulness and feasibility of our proposed model selection procedure, we present an application using a structural first-price auction model to analyze the timber sale auctions in Oregon organized by the Oregon Department of Forestry (ODF). This data set has been analyzed in Li (forthcoming), which estimates a structural model within an independent private value (IPV) paradigm. A particular feature of the timber auctions in Oregon, as noted in Li (forthcoming), is the presence of publicly announced reserve prices. As is well known, a structural auction model derived from game theory implies that bidders' bids depend on the number of potential bidders. Specifically, within the IPV paradigm, as discussed in Section 3, the symmetric Nash–Bayesian equilibrium strategy bm for the m-th bidder with a private value vm above the reserve price p is given by

\[
b_m = v_m - \frac{1}{(F(v_m))^{N-1}} \int_{p}^{v_m} F^{N-1}(x)\, dx, \tag{9}
\]

where N is the number of potential bidders and F(·) is the private value distribution.

As in Li (forthcoming), we consider 108 lots with different species grades and in different regions. Table 3 gives summary statistics on the data such as the appraised volumes measured in thousand board feet (MBF), the reserve prices, the regional dummies indicating where the lots are located, the bids per MBF and the log grades. For more details on these variables, see Li (forthcoming). Also, following Li (forthcoming), we assume that the private value density at the ℓ-th lot is specified as

\[
f_\ell(v_{m\ell} \mid z_\ell) = \frac{1}{\exp(\gamma_\ell)} \exp\left[-\frac{1}{\exp(\gamma_\ell)}\, v_{m\ell}\right],
\]

where vmℓ is the private value for the m-th bidder at the ℓ-th auction, γℓ = γ0 + γ1 gradeℓ + γ2 region1ℓ + γ3 region2&3ℓ, and zℓ denotes the heterogeneity vector consisting of the variables ‘‘grade’’, ‘‘region1’’ and ‘‘region2&3’’, where ‘‘grade’’ is the log grade measuring quality, and ‘‘region1’’ and ‘‘region2&3’’ are both regional dummies.

As indicated by (9), to conduct the structural analysis, one

needs to know the number of potential bidders. With the timber auction data in our case, however, we only observe the number of actual bidders, which is not the same as the number of potential bidders because bidders whose valuations are below the reserve prices will not submit bids. In essence, when reserve prices are binding, the number of potential bidders, if assumed to be constant across auctions, can be regarded as a structural parameter that cannot be identified from the bidding model but must be identified from elsewhere.¹¹ To resolve the issue of not observing the number of potential bidders, Li (forthcoming) assumes that the number of potential bidders is 10, which is the maximum number of actual bidders in the data set. To illustrate the application of the proposed model selection procedure, we consider an alternative assumption about the number of potential bidders, namely N = 50. This assumption comes from the fact that there are in total 50 different bidders in the data.¹² Note that the resulting structural models from (9) with different numbers of bidders, N = 10 and N = 50, are non-nested. Assuming that N = 10, Li (forthcoming) estimates the structural model under the specification (9) for the private value distribution using an estimation method based on the indirect inference principle originally suggested by Smith (1993), Gourieroux et al. (1993), and Gallant and Tauchen (1996). Specifically, the procedure proposed in Li (forthcoming) consists of two steps. The first step is to obtain the OLS estimates from a regression of bids on a constant and zℓ. The second step is to simulate the structural model to get ‘‘bids’’ and use the simulated bids to conduct the regression again and to get the OLS estimates. Then the estimates for structural parameters are obtained from minimizing a distance between the OLS estimates in the first step and in the second step. Now assuming N = 50, we re-estimate the structural model using the same method. Table 4 reports the estimation results.¹³ It is interesting to note that comparing only the estimates from these two models with N = 10 and N = 50, respectively, does not allow us to distinguish between them, as the two sets of estimates are similar in both magnitudes and significance levels.

Thus, to determine the number of potential bidders that better describes the bidding process, we apply our model selection procedure based on the highest bids as described in the Monte Carlo section and obtain a test statistic of T̂n = 6.17, with H1 formulated as the model with N = 50 being preferred and H2 as the model with N = 10 being preferred.¹⁴ As a result, at the 5% significance level, N = 10 is preferred to N = 50. Also, note that it is possible that neither N = 10 nor N = 50 is a correct description of the true number of potential bidders. For instance, we maintain the assumption that the number of bidders is constant across the auctions. In reality, however, the number of potential bidders may be different from 10, and may even vary across auctions. Nevertheless, since our model selection test allows both models to be misspecified, we can conclude from the test that N = 10 is a better approximation than N = 50 for the number of potential bidders. In other words, the competition effect is better measured by N = 10 than N = 50.¹⁵ This application demonstrates the usefulness of our model selection procedure in selecting between competing structural econometric models.¹⁶

Table 3
Summary statistics of the timber sale data.

Variable                   Number of observations   Mean      S.D.      Min      Max
Bid                        451                      331.62    136.70    119.67   2578.3
Winning bid                108                      382.5     231.13    157.86   2578.3
Reserve price              108                      273.19    77.32     118.32   463.96
Volume                     108                      3165.22   2894.31   256.74   20211
Grade                      108                      2.1653    0.3837    1.2727   3.0199
Region1                    108                      0.8448    0.3625    0        1
Regions2&3                 108                      0.1397    0.3471    0        1
Number of submitted bids   108                      4.1759    2.0178    1        10

Table 4
Estimates for structural parameters in timber sale auctions.

N     Parameter        γ0       γ1       γ2       γ3
50    Estimate         4.6932   0.2576   0.2250   −0.0678
      Standard error   0.3247   0.1082   0.2032   0.2352
10    Estimate         4.9042   0.2642   0.2070   −0.0865
      Standard error   0.3017   0.0962   0.1927   0.2203

11 Determining the number of potential bidders in auctions with reserve prices is indeed a common problem facing empirical economists when analyzing auction data. See also the discussion in footnote 1.

12 My discussion with the expert at ODF also confirms that there are about 50 firms that could be potentially interested in timber auctions.

13 For completeness and comparison, we also include the results reported in Li (forthcoming) for the case of N = 10.

14 We obtain this statistic by setting S1 = S2 = 10. Also, σ̂ is obtained here through nonparametric bootstrap with the number of bootstrap replications equal to 800, as the inference in Li (forthcoming) is made through bootstrap because of the complexity involved with the so-called ‘‘binding function’’ if one wishes to use the asymptotic variance formula.

5. Conclusion

This paper develops a general framework for testing between competing non-nested structural econometric models. Our method allows for any estimators that are either root-n asymptotically normally distributed or consistent at a rate faster than root-n, and can be used for distinguishing between two models that are both possibly misspecified. The statistical significance of the difference between two models under consideration is assessed through a simulation based lack-of-fit criterion, taking into account the complex nature of structural econometric models. As such, our approach provides a new model selection method for choosing between competing structural models.

Our Monte Carlo studies demonstrate the good size and power properties of the test. We apply our testing procedure to determine the number of potential bidders in the timber auctions in Oregon. Such an application illustrates the usefulness and generality of our test, and also demonstrates the importance of developing model selection tests in structural econometric models.

This paper has focused on the selection of fully parametric structural models. While the results in the paper do not apply to nonparametric structural models, it is possible to extend them to the semiparametric framework provided that the conditional expectation of the dependent variable can be simulated with the structural parameter estimates. Also, as previously mentioned, this paper is motivated by the need to develop a general model selection test for structural models when the estimation methods used do not allow one to use the existing procedures. On the other hand, because of its generality, our proposed method can also be applied to the cases in which the existing model selection tests work as well. For instance, when two competing structural models are estimated by the MLE, we can use the Vuong (1989) likelihood ratio test as well as our test for model selection. Thus, it would be interesting in this case to compare the asymptotic properties of both tests as well as their finite sample performances. These are left for future research.

15 In fact, we have further conducted the same model selection test between the model with N = 10 and a model with N that is larger than 10 and less than 50. The model with N = 10 is always preferred. For example, the test statistic for selecting between the model with N = 11 (M1) and the model with N = 10 (M2) is 3.51, clearly rejecting model 1 in favor of model 2. This finding is indeed reasonable, as it reconfirms that under the assumption that the number of potential bidders is constant, the maximum number of actual bidders across auctions, which is 10 in our case, is an n-consistent estimate for the number of potential bidders.

16 Note that the use of indirect inference type estimators in estimating our structural models and the complex feature of the structural model itself make it difficult to apply the existing model selection procedures here. This application demonstrates the generality of our proposed selection procedure.

Appendix

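Proof of Proposition 1. Under Assumptions 1, 2, 5 and 6, it can be verified that the assumptions in Lemma 4.3 in Newey and McFadden (1994) are satisfied. Invoking this lemma leads to

\[
\hat{Q}_j(\hat{\theta}_j) - \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} E_{y,x,y_j^{(s_j)}}\left[q_{i,j,S_j}(\theta_j^*)\right] \to_p 0,
\]

where

\[
q_{i,j,S_j}(\theta_j) = (y_i - Y_{j,i}(\theta_j))^2 - \frac{1}{S_j(S_j-1)} \sum_{s_j=1}^{S_j} \left(y_{j,i}^{(s_j)}(\theta_j) - Y_{j,i}(\theta_j)\right)^2.
\]

On the other hand,

\[
\begin{aligned}
E_{y,x,y_j^{(s_j)}}\left[q_{i,j,S_j}(\theta_j^*)\right]
&= E_{y,x,y_j^{(s_j)}}\left[(y_i - Y_{j,i}(\theta_j^*))^2\right]
 - E_{y,x,y_j^{(s_j)}}\left[\frac{1}{S_j(S_j-1)} \sum_{s_j=1}^{S_j} \left(y_{j,i}^{(s_j)}(\theta_j^*) - Y_{j,i}(\theta_j^*)\right)^2\right] \\
&= E_{y,x,y_j^{(s_j)}}\left[(y_i - E_{M_j}(y \mid x_i, \theta_j^*))^2\right]
 + E_{x,y_j^{(s_j)}}\left[Y_{j,i}(\theta_j^*) - E_{M_j}(y \mid x_i, \theta_j^*)\right]^2
 - \frac{1}{S_j} E_x \operatorname{Var}_{M_j} y_{j,i}^{(s_j)}(\theta_j^*) \\
&= E_{y,x}\left[(y_i - E_{M_j}(y \mid x_i, \theta_j^*))^2\right],
\end{aligned}
\]

where Var_{Mj}(·) denotes the conditional variance given x under model Mj, and the second equality follows from the unbiased estimation of Var_{Mj} Yj,i(θ∗j) and from the conditional independence of yi and the simulations $y_{j,i}^{(s_j)}$ given xi, which leads to

\[
E_{y,x,y_j^{(s_j)}}\left[(y_i - E_{M_j}[Y_{j,i} \mid x_i, \theta_j^*])(Y_{j,i} - E_{M_j}[Y_{j,i} \mid x_i, \theta_j^*])\right] = 0.
\]

As a result,

\[
\hat{Q}_j(\hat{\theta}_j) \to_p Q_j(\theta_j^*). \tag{A.1}
\]

Then Proposition 1 follows. □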
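The key step in the proof, that the correction term removes exactly the simulation variance for any fixed Sj ≥ 2, can be checked by exact enumeration in a toy setting. The Python sketch below is an illustration under simplifying assumptions that are not in the paper: there are no covariates, and both the observed outcome and the model's simulated outcomes are uniform on small finite supports. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def q_term(y, sims):
    """Bias-corrected squared prediction error, in exact arithmetic."""
    s = len(sims)
    y_bar = Fraction(sum(sims), s)
    correction = sum((d - y_bar) ** 2 for d in sims) / (s * (s - 1))
    return (y - y_bar) ** 2 - correction


def expected_q(y_support, model_support, s):
    """Exact E[q] when y and the s model draws are independent and each
    is uniform on its (finite) support."""
    total = Fraction(0)
    count = 0
    for y in y_support:
        for sims in product(model_support, repeat=s):
            total += q_term(y, sims)
            count += 1
    return total / count


def target(y_support, model_support):
    """E[(y - E_M[y])^2], the population lack-of-fit criterion."""
    m = Fraction(sum(model_support), len(model_support))
    return sum((y - m) ** 2 for y in y_support) / len(y_support)
```

With y uniform on {0, 2} and model draws uniform on {0, 1}, the criterion equals 5/4, and `expected_q` returns exactly 5/4 for S = 2 as well as for S = 3, in line with the claim that consistency holds for any fixed number of simulations.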

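Proof of Theorem 1. A Taylor expansion of Q̂j(θ̂j) around θ∗j yields

\[
\hat{Q}_j(\hat{\theta}_j) = \hat{Q}_j(\theta_j^*) + \left.\frac{\partial \hat{Q}_j}{\partial \theta_j'}\right|_{\bar{\theta}_j} (\hat{\theta}_j - \theta_j^*),
\]

where θ̄j is a value between θ̂j and θ∗j, j = 1, 2. We then have

\[
\begin{aligned}
\sqrt{n}\,\hat{Q}_j(\hat{\theta}_j)
&= \sqrt{n}\,\hat{Q}_j(\theta_j^*) + \left.\frac{\partial \hat{Q}_j}{\partial \theta_j'}\right|_{\theta_j^*} \sqrt{n}(\hat{\theta}_j - \theta_j^*)
 + \left(\left.\frac{\partial \hat{Q}_j}{\partial \theta_j'}\right|_{\bar{\theta}_j} - \left.\frac{\partial \hat{Q}_j}{\partial \theta_j'}\right|_{\theta_j^*}\right) \sqrt{n}(\hat{\theta}_j - \theta_j^*) \\
&= \sqrt{n}\,\hat{Q}_j(\theta_j^*) + \left.\frac{\partial \hat{Q}_j}{\partial \theta_j'}\right|_{\theta_j^*} \sqrt{n}(\hat{\theta}_j - \theta_j^*) + o_P(1),
\end{aligned}
\tag{A.2}
\]

where the second equality follows from the facts that θ̄j − θ∗j → 0 in probability because θ̂j − θ∗j → 0 in probability as assumed in Assumption 2, that √n(θ̂j − θ∗j) = OP(1) from Assumption 3, and that ∂Q̂j(θ̄j)/∂θ′j − ∂Q̂j(θ∗j)/∂θ′j → 0 in probability, which is a result of Assumptions 6 and 7. It then follows that

\[
\begin{aligned}
&\sqrt{n}\left\{\hat{Q}_1(\hat{\theta}_1) - \hat{Q}_2(\hat{\theta}_2) - (Q_1(\theta_1^*) - Q_2(\theta_2^*))\right\} \\
&\quad= \sqrt{n}\left\{\hat{Q}_1(\theta_1^*) - \hat{Q}_2(\theta_2^*) - (Q_1(\theta_1^*) - Q_2(\theta_2^*))\right\}
 + \left.\frac{\partial \hat{Q}_1}{\partial \theta_1'}\right|_{\theta_1^*} \sqrt{n}(\hat{\theta}_1 - \theta_1^*)
 - \left.\frac{\partial \hat{Q}_2}{\partial \theta_2'}\right|_{\theta_2^*} \sqrt{n}(\hat{\theta}_2 - \theta_2^*) + o_P(1) \\
&\quad= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left(C_i - \left.\frac{\partial \hat{Q}_1}{\partial \theta_1'}\right|_{\theta_1^*} A_1 U_{1,i} + \left.\frac{\partial \hat{Q}_2}{\partial \theta_2'}\right|_{\theta_2^*} A_2 U_{2,i}\right)
 - \sqrt{n}\,(Q_1(\theta_1^*) - Q_2(\theta_2^*)) + o_P(1),
\end{aligned}
\tag{A.3}
\]

where the second equality follows from Assumption 3 and the definition of Qj(θ∗j), j = 1, 2. Then (i), (ii) and (iii) follow from (A.3) and application of the central limit theorem after some algebra. □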

Proof of Corollary 1. If θ̂j is obtained by minimizing (5), then ∂Q̂j(θ̂j)/∂θ′j = 0 by the first-order condition of the minimization problem. On the other hand, noting that θ∗j is the probability limit of θ̂j, limn→∞ ∂Q̂j(θ∗j)/∂θ′j = 0. As a result, limn→∞ Bj,n = 0. Then (i), (ii), (iii) follow directly. □

Proof of Corollary 2. The result directly follows from Theorem 1 and the assumption that σ̂² is a consistent estimator for σ². □

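Proof of Corollary 3. If θ̂j is n^α-consistent, then √n(θ̂j − θ∗j) = oP(1). As a result, (A.2) becomes

\[
\sqrt{n}\,\hat{Q}_j(\hat{\theta}_j) = \sqrt{n}\,\hat{Q}_j(\theta_j^*) + o_P(1). \tag{A.4}
\]

(i) Now if θ̂1 is n^α-consistent, but θ̂2 is root-n consistent and satisfies Assumption 3, then (A.4) holds for θ̂1 while (A.2) holds for θ̂2. It then follows that (A.3) becomes

\[
\begin{aligned}
&\sqrt{n}\left\{\hat{Q}_1(\hat{\theta}_1) - \hat{Q}_2(\hat{\theta}_2) - (Q_1(\theta_1^*) - Q_2(\theta_2^*))\right\} \\
&\quad= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left(C_i + \left.\frac{\partial \hat{Q}_2}{\partial \theta_2'}\right|_{\theta_2^*} A_2 U_{2,i}\right) - \sqrt{n}\,(Q_1(\theta_1^*) - Q_2(\theta_2^*)) + o_P(1).
\end{aligned}
\tag{A.5}
\]

Then the result follows from (A.5) after some algebra.

(ii) Now if θ̂2 is n^α-consistent, but θ̂1 is root-n consistent and satisfies Assumption 3, then (A.4) holds for θ̂2 while (A.2) holds for θ̂1. The result follows from an argument similar to that of (i).

(iii) Now if both θ̂1 and θ̂2 are n^α-consistent, then (A.4) holds for both θ̂1 and θ̂2. As a result, (A.3) becomes

\[
\sqrt{n}\left\{\hat{Q}_1(\hat{\theta}_1) - \hat{Q}_2(\hat{\theta}_2) - (Q_1(\theta_1^*) - Q_2(\theta_2^*))\right\}
= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} C_i - \sqrt{n}\,(Q_1(\theta_1^*) - Q_2(\theta_2^*)) + o_P(1). \tag{A.6}
\]

Then the result follows from (A.6) after some algebra. □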

To prove Proposition 2, we first need the following lemma.

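Lemma 1. Define σ0² ≡ Var[Ci], where Ci is defined in Theorem 1. Then σ0² is always positive.

Proof of Lemma 1. Using the decomposition of variance, we have

\[
\begin{aligned}
\operatorname{Var}[C_i] &= \operatorname{Var}_{x_i,u_1,u_2}\left[E[C_i \mid x_i, u_1, u_2]\right] + E_{x_i,u_1,u_2}\left[\operatorname{Var}[C_i \mid x_i, u_1, u_2]\right] \\
&\ge E_{x_i,u_1,u_2}\left[\operatorname{Var}[C_i \mid x_i, u_1, u_2]\right].
\end{aligned}
\tag{A.7}
\]

Furthermore, by the definition of Ci, and because $y_{j,i}^{(s_j)}(\theta_j^*)$ and Yj,i(θ∗j) are only functions of xi and uj, we have

\[
\begin{aligned}
\operatorname{Var}[C_i \mid x_i, u_1, u_2]
&= \operatorname{Var}\left[(y_i - Y_{1,i}(\theta_1^*))^2 - (y_i - Y_{2,i}(\theta_2^*))^2 \mid x_i, u_1, u_2\right] \\
&= \operatorname{Var}\left[2 y_i (Y_{2,i}(\theta_2^*) - Y_{1,i}(\theta_1^*)) \mid x_i, u_1, u_2\right] \\
&= 4 (Y_{2,i}(\theta_2^*) - Y_{1,i}(\theta_1^*))^2 \operatorname{Var}[y_i \mid x_i, u_1, u_2] \\
&= 4 (Y_{2,i}(\theta_2^*) - Y_{1,i}(\theta_1^*))^2 \operatorname{Var}[y_i \mid x_i],
\end{aligned}
\tag{A.8}
\]

where the second equality holds because the difference of squares equals 2yi(Y2,i(θ∗2) − Y1,i(θ∗1)) plus terms that are fixed given (xi, u1, u2), and the last equality follows from the fact that, conditional on xi, yi and uj are independent. Thus, (A.7) and (A.8) lead to

\[
\operatorname{Var}[C_i] \ge 4 E_{x_i,u_1,u_2}\left[(Y_{2,i}(\theta_2^*) - Y_{1,i}(\theta_1^*))^2 \operatorname{Var}[y_i \mid x_i]\right] > 0,
\]

unless Y2,i(θ∗2) − Y1,i(θ∗1) = 0 almost everywhere, which cannot be true. Then the desired result follows. □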

Proof of Proposition 2. From (A.3) we have under H0, Tn = Σ_{i=1}^{n} (Ci − B1U1,i + B2U2,i)/√n + oP(1), where B1 = plim_{n→∞} B1,n and B2 = plim_{n→∞} B2,n. Lemma 1 has shown that σ > 0 when B1 = B2 = 0. Now suppose at least one of B1 and B2 is not zero. It then suffices to show that Var(Ci − B1U1,i + B2U2,i) is positive because of the random sampling assumption (Assumption 1). Suppose otherwise that Var(Ci − B1U1,i + B2U2,i) = 0, which implies that Ci − B1U1,i + B2U2,i is a constant almost everywhere, or equivalently that Ci is a linear combination of U1,i and U2,i almost everywhere. This cannot be true. As a result we reach a contradiction. □

References

Akaike, H., 1973. Information theory and an extension of the likelihood ratio principle. In: Petrov, B.N., Csaki, F. (Eds.), Proceedings of the Second International Symposium of Information Theory. Akademiai Kiado, Budapest, pp. 257–281.

Akaike, H., 1974. A new look at the statistical model identification. IEEE Transactions on Automatic Control AC-19, 716–723.

Andrews, D.W.K., 1994. Asymptotics for semiparametric econometric models via stochastic equicontinuity. Econometrica 62, 43–72.

Chen, X., Hong, H., Shum, M., 2003. Likelihood ratio tests between parametric and moment condition models. Working Paper. Princeton University.

Chernozhukov, V., Hong, H., 2004. Likelihood estimation and inference in a class of nonregular econometric models. Econometrica 72, 1445–1480.

Cox, D.R., 1961. Tests of separate families of hypotheses. In: Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1. pp. 105–123.

Diebold, F.X., Mariano, R.S., 1995. Comparing predictive accuracy. Journal of Business and Economic Statistics 13, 253–263.

Donald, S., Paarsch, H., 1993. Piecewise pseudo maximum likelihood estimation in empirical models of auctions. International Economic Review 34, 121–148.

Donald, S., Paarsch, H., 1996. Identification, estimation, and testing in parametric empirical models of auctions within the independent private value paradigm. Econometric Theory 12, 512–567.

Donald, S., Paarsch, H., 2002. Superconsistent estimation and inference in structural econometric models using extreme order statistics. Journal of Econometrics 109 (2), 305–340.

Gallant, A.R., Tauchen, G.E., 1996. Which moments to match? Econometric Theory 12, 657–681.

Gallant, A.R., White, H., 1988. A Unified Theory of Estimation and Inference for Nonlinear Dynamic Models. Basil Blackwell, New York.

Gasmi, F., Laffont, J.J., Vuong, Q., 1992. Econometric analysis of collusive behavior in a soft drink industry. Journal of Economics and Management Strategy 1, 277–311.

Gourieroux, C., Monfort, A., 1994. Testing non-nested hypotheses. In: Engle, R.F., McFadden, D. (Eds.), Handbook of Econometrics, vol. 4. North Holland, Amsterdam, pp. 2585–2640.

Gourieroux, C., Monfort, A., 1996. Simulation-Based Econometric Methods. Oxford University Press, Oxford.

Gourieroux, C., Monfort, A., Renault, E., 1993. Indirect inference. Journal of Applied Econometrics 8, S85–S118.

Heckman, J., 2001. Micro data, heterogeneity, and the evaluation of public policy:Nobel lecture. Journal of Political Economy 109, 673–748.

Hirano, K., Porter, J., 2003. Asymptotic efficiency in parametric structural modelswith parameter-dependent support. Econometrica 71, 1307–1338.

Hong, H., 1998. Nonregular maximum likelihood estimation in auction, job searchand production frontier models, Mimeo, Princeton University.

Kitamura, Y., 2000. Comparing misspecified dynamic econometric models usingnonparametric likelihood. Working Paper. University of Wisconsin.

Kitamura, Y., 2002. Econometric comparison of conditional models. Working Paper.University of Pennsylvania.

Laffont, J.J., 1997. Game theory and empirical economics: The case of auction data.European Economic Review 41, 1–35.

Laffont, J.J., Maskin, E., 1990. The efficient market hypothesis and insider trading onthe stock market. Journal of Political Economy 98, 70–93.

Laffont, J.J., Ossard, H., Vuong, Q., 1995. Econometrics of first-price auctions.Econometrica 63, 953–980.

Li, T., 2005. Econometrics of first-price auctions with entry and binding reservationprices. Journal of Econometrics 126, 173–200.

Li, T., 2005. Indirect Inference in Structural Econometric Models. Journal ofEconometrics (forthcoming).

Li, T., Vuong, Q., 1997. Using all bids in parametric estimation of first-price auctions.Economics Letters 55, 321–325.

Mizon, G.E., Richard, J.F., 1986. The encompassing principle and its application totesting non-nested hypotheses. Econometrica 54, 657–678.

Newey, W.K., McFadden, D., 1994. Large sample estimation and hypothesis testing.In: Engle, R.F., McFadden, D. (Eds.), Handbook of Econometrics, vol. 4. NorthHolland, Amsterdam, pp. 2111–2245.

Pesaran, M.H., Weeks, M., 2001. Non-nested hypothesis testing: An overview.In: Baltagi, B. (Ed.), A Companion to Theoretical Econometrics. Blackwell,Oxford.

Powell, J.L., 1994. Estimation of semiparametric models. In: Engle, R.F., McFad-den, D. (Eds.), Handbook of Econometrics, vol. 4. North Holland, Amsterdam,pp. 2443–2521.

Riley, J., Samuelson, W., 1981. Optimal auctions. American Economic Review 71,381–392.

Rivers, D., Vuong, Q., 2002. Model selection tests for nonlinear dynamic models.Econometrics Journal 5, 1–39.

Smith, A., 1993. Estimating nonlinear time series using simulated vectorautoregressions. Journal of Applied Econometrics 8, S63–S84.

Smith, R., 1992. Non-nested tests for competing models estimated by generalizedmethod of moments. Econometrica 60, 973–980.

Vuong, Q., 1989. Likelihood ratio tests for model selection and non-nestedhypotheses. Econometrica 57, 307–333.

West, K.D., 1996. Asymptotic inference about predictive ability. Econometrica 64,1067–1084.

Wolak, F., 1994. An econometric analysis of the asymmetric information regulatorutility interaction. Annales d’Economie et de Statistique 34, 13–69.

Wooldridge, J.M., 1990. An encompassing approach to conditional mean tests withapplications to testing non nested hypotheses. Journal of Econometrics 45,331–350.