Dynamic panel data methods for cross-section panels with an application on a winter tourism demand model

Franz Eigner

UK Panel data econometrics, Prof. Kunst, SS09

June 21st, 2009

Abstract

This paper deals with theoretical aspects of dynamic panel data analysis and employs them in a simple application. The theoretical part covers the importance of dynamic modelling and the bias of the LSDV estimator, and then focuses on the description of consistent estimators. These are the Anderson/Hsiao (1981) estimator, the Difference-GMM of Arellano and Bond (1991), the System-GMM of Blundell and Bond (1998) and the bias corrected LSDV estimator suggested by Bruno (2005). An application follows, which is taken from Eigner, Toeglhofer, Prettenthaler (2009) and which models winter tourism demand for 185 ski destinations in Austria, based on the number of overnight stays from 1973 to 2006. By using income and relative purchasing power of the tourists together with snow coverage as determinants of tourism demand, both economic and climatologic aspects are combined in a single framework, based on an autoregressive distributed lag model. The study especially emphasizes the importance of climatologic variables in explaining winter tourism demand in Austria.


Contents

1 Preliminary considerations
  1.1 Bias of the LSDV estimator
  1.2 Advantages of dynamic modelling
2 Consistent estimators
  2.1 First Difference IV (Anderson/Hsiao, 1981)
  2.2 GMM estimators
    2.2.1 Difference GMM (Arellano and Bond, 1991)
    2.2.2 System-GMM (Blundell/Bond, 1998)
    2.2.3 Validity of the instruments
  2.3 Bias corrected LSDV estimator
    2.3.1 Bruno (2005)
  2.4 Summary of the models
3 Application - Winter tourism demand model
  3.1 Data description
  3.2 Commands in statistical software packages
  3.3 Estimation table of the winter tourism demand model for Austrian ski destinations from 1973 to 2006
  3.4 Interpretation of the results
  3.5 Final considerations concerning the application


1 Preliminary considerations

Dynamic modelling means that the static specification of the linear fixed effects model

$$y_{it} = \beta x_{it} + \mu_i + \varepsilon_{it}, \qquad u_{it} = \mu_i + \varepsilon_{it}$$

is extended by autoregressive terms (lagged dependent variables), which allow feedback from current or past shocks to current values of the dependent variable $y_{it}$. The simplest case is an AR(1) specification:

$$y_{it} = \rho y_{i,t-1} + \beta x_{it} + \mu_i + \varepsilon_{it}, \qquad u_{it} = \mu_i + \varepsilon_{it}$$

Dynamic modelling is appropriate in the presence of

• temporal autocorrelation in the residuals $\varepsilon_{it}$, or

• high persistence in the dependent variable $y_{it}$.

Unfortunately, as will be shown below, including an autoregressive term makes the LSDV estimator biased and inconsistent. One is therefore interested in an estimator that is consistent for $N \to \infty$ with $T$ fixed (cross-section panel). When the only aim is to cope with temporal autocorrelation in the data, the Parks method could be used. It is also a dynamic approach: it assumes an autoregressive specification of the error term and is estimated by an FGLS procedure. Alternatively, one could remove the autocorrelation with a well-known procedure such as the Prais-Winsten transformation, which is implemented as a feature of the panel-corrected standard errors (PCSE) model in Stata (StataCorp, 2007).

1.1 Bias of the LSDV estimator

The strict exogeneity assumption of the LSDV estimator,

$$E(\varepsilon_{it} \mid x_i, \mu_i) = 0, \qquad t = 1, \dots, T; \; i = 1, \dots, N,$$

is violated as soon as $y_{i,t-1}$ is included as a regressor. Only a weaker form of exogeneity still holds, namely sequential exogeneity. It means that the variables $x_{it}$ and $y_{i,t-1}$ are predetermined, i.e. uncorrelated only with current and future error disturbances, not with past ones. The LSDV model can be estimated by OLS on the within-transformed variables, for which

$$\tilde y_{i,t-1} = y_{i,t-1} - \frac{1}{T-1}\,(y_{i1} + \dots + y_{i,T-1}) \quad\text{and}\quad \tilde\varepsilon_{it} = \varepsilon_{it} - \frac{1}{T-1}\,(\varepsilon_{i2} + \dots + \varepsilon_{iT}).$$

One can show that the correlation between $\tilde y_{i,t-1}$ and $\tilde\varepsilon_{it}$ is negative, because $\mathrm{cor}\left(y_{i,t-1}, -\frac{1}{T-1}\varepsilon_{i,t-1}\right) < 0$ and $\mathrm{cor}\left(-\frac{1}{T-1}y_{it}, \varepsilon_{it}\right) < 0$. Therefore the LSDV estimator is biased and inconsistent in dynamic models (for $N \to \infty$ and fixed $T$). This bias was first derived by Nickell (1981). His bias approximation reads:

$$\rho^* = \operatorname*{plim}_{N\to\infty}\,(\hat\rho_{lsdv} - \rho) = \frac{-\sigma_\varepsilon^2\, h(\rho, T)}{(1 - \rho_{xy_{-1}}^2)\,\sigma_{y_{-1}}^2}$$

$$\beta^* = -\zeta\rho^*, \quad\text{where}\quad \zeta = \frac{\sigma_{xy_{-1}}}{\sigma_x^2}, \qquad h(\rho, T) = \frac{(T-1) - T\rho + \rho^T}{T(T-1)(1-\rho)^2} \quad\text{and}\quad \rho_{xy_{-1}} = \frac{\sigma_{xy_{-1}}}{\sigma_x\,\sigma_{y_{-1}}}$$

Note: the variables denoted as $x$ and $y$ are within-transformed.

Because $h(\rho, T)$ is always positive (when $\rho > 0$), the LSDV estimate is downward biased. The bias of the autoregressive coefficient is especially severe when

1. the autoregressive coefficient $\rho$ is high,

2. the number of time periods $T$ is low,

3. the ratio $\sigma_\varepsilon^2 / \sigma_{y_{-1}}^2$ is high.

Notice that the bias of the coefficients of the exogenous variables is proportional to the bias of the autoregressive coefficient, where $\zeta$ is simply the regression coefficient of $y_{i,t-1}$ on $x_{it}$.
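The size of this bias can be checked numerically. The following Python sketch is an added illustration under simplifying assumptions, not part of the original study: it simulates a pure AR(1) panel without exogenous regressors, with standard normal $\mu_i$ and $\varepsilon_{it}$, and compares the simulated LSDV bias with Nickell's (1981) large-$N$ expression for that special case. Here `T` counts the periods that enter the regression ($y_{i0}$ serves as the initial value); all function names and parameter values are hypothetical.

```python
import random


def h(rho, T):
    """h(rho, T) as defined in the text; positive for 0 < rho < 1."""
    return ((T - 1) - T * rho + rho ** T) / (T * (T - 1) * (1 - rho) ** 2)


def nickell_bias(rho, T):
    """Large-N bias of LSDV in a pure AR(1) panel without regressors (Nickell, 1981)."""
    a = 1 - (1 - rho ** T) / (T * (1 - rho))
    return -(1 + rho) / (T - 1) * a / (1 - 2 * rho * a / ((1 - rho) * (T - 1)))


def simulate_lsdv(rho, T, N, seed=0, burn_in=50):
    """Within (LSDV) estimate of rho from a simulated panel."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(N):
        mu = rng.gauss(0, 1)
        y = mu / (1 - rho)                  # start at the individual mean
        for _ in range(burn_in):            # let the process reach stationarity
            y = rho * y + mu + rng.gauss(0, 1)
        path = [y]                          # y_i0
        for _ in range(T):
            y = rho * y + mu + rng.gauss(0, 1)
            path.append(y)
        lagged, current = path[:-1], path[1:]
        mean_lag = sum(lagged) / T
        mean_cur = sum(current) / T
        # within transformation: subtract individual means
        num += sum((a - mean_lag) * (b - mean_cur) for a, b in zip(lagged, current))
        den += sum((a - mean_lag) ** 2 for a in lagged)
    return num / den


rho, T, N = 0.5, 6, 2000
bias_sim = simulate_lsdv(rho, T, N) - rho
print(f"h(rho, T)      = {h(rho, T):.4f}")
print(f"simulated bias = {bias_sim:.3f}")
print(f"Nickell's plim = {nickell_bias(rho, T):.3f}")
```

With these settings both bias figures should come out clearly negative and close to each other, in line with the downward bias described above.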

1.2 Advantages of dynamic modelling

Dynamic modelling nevertheless has several advantages. It not only takes (temporal) autocorrelation in the residuals into account, it also reduces the risk of spurious regression, which may lead to wrong inference and inconsistent estimation in static models. Especially in the context of tourism demand models, static models may lead to an overestimation of the effects of the exogenous variables. [1] Furthermore, the coefficient of the lagged dependent variable itself may be of interest. In the tourism demand context, it indicates habit formation due to word-of-mouth effects.

[1] Spurious regression was also present in the application described in this paper. With a static specification of the winter tourism demand model, a negative coefficient for snow cover is obtained, probably resulting from an upward trend in overnight stays combined with a downward trend in the snow cover variable. In the dynamic specification, the more intuitive positive coefficient is obtained, meaning that higher snow cover leads to higher winter tourism demand.

2 Consistent estimators

This section deals with consistent estimators for dynamic panel data models.

2.1 First Difference IV (Anderson/Hsiao, 1981)

A specific case of a GMM estimator is the First-Difference IV estimator. The individual fixed effects $\mu_i$ are eliminated by differencing instead of within-transforming.

$$y_{it} = \rho y_{i,t-1} + x_{it}'\beta + \mu_i + \varepsilon_{it}$$

$$\Delta y_{it} = \rho\,\Delta y_{i,t-1} + \Delta x_{it}'\beta + \Delta\varepsilon_{it}$$

In matrix notation:

$$Fy = Fy_{-1}\rho + FX\beta + F\varepsilon$$

where $F = I_N \otimes F_T$ and

$$F_T = \begin{bmatrix} -1 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & -1 & 1 \end{bmatrix}$$

However, $\Delta y_{i,t-1}$ is correlated with the error term $\Delta\varepsilon_{it}$, since both contain $\varepsilon_{i,t-1}$. Consistent estimation is still possible by using the IV method with $y_{i,t-2}$ as instrument for $\Delta y_{i,t-1}$, because [2]

$$E(y_{i,t-2}\,\Delta\varepsilon_{it}) = 0$$

Even when all moments are employed, that is, when all valid lagged internal variables are included as instruments, estimation is inefficient, because not all information (e.g. $\Delta\varepsilon_{it} \sim MA(1)$) is used. Only in the case of homoskedastic residuals can it be shown that the Anderson/Hsiao (1981) estimator is the most efficient GMM estimator.
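As an added illustration of the instrumenting step, the following Python sketch contrasts OLS on the differenced equation with the simple IV estimator that uses $y_{i,t-2}$ as instrument. It assumes a pure AR(1) panel without exogenous regressors and standard normal $\mu_i$ and $\varepsilon_{it}$; the function names and the simulation design are hypothetical and only serve to make the moment condition concrete.

```python
import random


def simulate_unit(rho, T, rng, burn_in=50):
    """Return y_i1..y_iT generated by y_it = rho*y_i,t-1 + mu_i + eps_it."""
    mu = rng.gauss(0, 1)
    y = mu / (1 - rho)
    path = []
    for s in range(burn_in + T):
        y = rho * y + mu + rng.gauss(0, 1)
        if s >= burn_in:
            path.append(y)
    return path


def first_difference_ols(panel):
    """OLS of dy_it on dy_i,t-1: inconsistent, both sides contain eps_i,t-1."""
    num = den = 0.0
    for y in panel:
        for t in range(2, len(y)):
            d_lag = y[t - 1] - y[t - 2]
            num += d_lag * (y[t] - y[t - 1])
            den += d_lag ** 2
    return num / den


def anderson_hsiao(panel):
    """Simple IV estimator with the level y_i,t-2 as instrument for dy_i,t-1."""
    num = den = 0.0
    for y in panel:
        for t in range(2, len(y)):
            z = y[t - 2]                      # instrument
            num += z * (y[t] - y[t - 1])      # z * dy_it
            den += z * (y[t - 1] - y[t - 2])  # z * dy_i,t-1
    return num / den


rng = random.Random(0)
true_rho = 0.5
panel = [simulate_unit(true_rho, 8, rng) for _ in range(5000)]
print(f"true rho             = {true_rho}")
print(f"first-difference OLS = {first_difference_ols(panel):.3f}")
print(f"Anderson/Hsiao IV    = {anderson_hsiao(panel):.3f}")
```

In this design the first-difference OLS estimate should be far below the true value, while the IV estimate should lie in its neighbourhood, although with a noticeable sampling error.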

[2] Valid under the assumption that the errors do not depend on time, i.e. that they are serially uncorrelated.

2.2 GMM estimators

2.2.1 Difference GMM (Arellano and Bond, 1991)

Efficient estimates are obtained within a GMM framework. The following moments are exploited:

$$E[y_{i,t-s}\,\Delta\varepsilon_{it}] = 0 \quad\text{and}\quad E[X_{i,t-s}\,\Delta\varepsilon_{it}] = 0 \qquad\text{for } s \ge 2;\; t = 3, \dots, T$$

$$X_i = \begin{bmatrix} y_{i2} - y_{i1} & x_{i3}' - x_{i2}' \\ y_{i3} - y_{i2} & x_{i4}' - x_{i3}' \\ \vdots & \vdots \\ y_{i,T-1} - y_{i,T-2} & x_{iT}' - x_{i,T-1}' \end{bmatrix}$$

$$Z_i = \begin{bmatrix} [y_{i1}, x_{i1}', x_{i2}'] & 0 & \cdots & 0 \\ 0 & \ddots & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & [y_{i1}, \dots, y_{i,T-2}, x_{i1}', \dots, x_{i,T-1}'] \end{bmatrix}$$

$$X = (y_{-1}, X), \qquad Z = (Z_1', Z_2', \dots, Z_N')', \qquad \gamma' = (\rho, \beta')$$

The idea of the GMM framework is the following: $L$ instruments imply a set of $L$ moments, for instance $g_i(\beta) = y_{i,t-s}\,\Delta\varepsilon_{it}$, where exogeneity holds when $E(g_i(\beta)) = 0$. Each of the $L$ moment equations corresponds to a sample moment $\bar g(\beta) = \frac{1}{n}\sum_{i=1}^{n} g_i(\beta)$. The estimator is typically obtained by solving $\bar g(\beta) = 0$. Here this is not possible because the model is overidentified: there are more equations than unknowns or, equivalently, more excluded instruments than endogenous variables. As a consequence the equations cannot be solved exactly. Instead, the criterion $J_N$ is minimized:

$$J_N(\beta) = \bar g(\beta)'\, W_N\, \bar g(\beta)$$

or in our notation:

$$J_N = \left(\frac{1}{N}\sum_{i=1}^{N} \Delta\varepsilon_i' Z_i\right) W_N \left(\frac{1}{N}\sum_{i=1}^{N} Z_i' \Delta\varepsilon_i\right)$$

where $W_N$ is an estimated weighting matrix. The weighting matrix is the weak point of the Anderson/Hsiao estimator, which uses the identity matrix as weighting matrix and therefore assumes conditional homoskedasticity. Such a matrix, however, overweights moments with higher variances and moments which are correlated with each other. Optimal weighting, and therefore efficient estimation, is achieved by using the inverse of the moment covariance matrix:

$$W_N = Var(Z'\Delta\varepsilon)^{-1} = (Z'\Omega Z)^{-1}$$

This procedure is similar to GLS, but where GLS minimizes the weighted sum of the second moments of the residuals, GMM minimizes the weighted sum of the covariance structure of the moments. Unless $\Omega$ is known, efficient GMM is not feasible. A two-step procedure is therefore used. First, $\Omega$ is replaced with some simple $G$ (here: assuming $\varepsilon_{it}$ i.i.d., which is equivalent to conditional homoskedasticity):

$$W_{1N} = \left(\sum_{i=1}^{N} Z_i' G_T Z_i\right)^{-1} = (Z'GZ)^{-1}$$

where $G = (I_N \otimes G_T')$ and

$$G_T = F_T F_T' = \begin{bmatrix} 2 & -1 & 0 & 0 \\ -1 & 2 & \ddots & 0 \\ 0 & \ddots & \ddots & -1 \\ 0 & 0 & -1 & 2 \end{bmatrix}$$

which delivers (consistent) first-step estimates. Their residuals $\Delta\hat\varepsilon_{1i}$ are used for the two-step estimation of $W$:

$$W = \left(\sum_{i=1}^{N} Z_i'\,\Delta\hat\varepsilon_{1i}\,\Delta\hat\varepsilon_{1i}'\,Z_i\right)^{-1}$$

Efficient estimates for the Diff-GMM are then obtained with:

$$\hat\gamma_{EGMM} = (X'ZWZ'X)^{-1}\,X'ZWZ'y$$
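The two-step procedure above can be sketched in a few lines of Python with NumPy. This is an added, simplified illustration rather than a reimplementation of any Stata command: it assumes a balanced pure AR(1) panel without exogenous regressors, so that $X_i$ reduces to the column of lagged differences and $Z_i$ contains only lagged levels of $y$. The function names (`simulate_panel`, `instrument_matrix`, `difference_gmm`) and the simulation design are hypothetical.

```python
import numpy as np


def simulate_panel(n, t, rho, rng, burn_in=50):
    """Simulate y_it = rho*y_i,t-1 + mu_i + eps_it; returns an (n, t) array."""
    mu = rng.normal(size=n)
    y = mu / (1.0 - rho)
    out = np.empty((n, t))
    for s in range(burn_in + t):
        y = rho * y + mu + rng.normal(size=n)
        if s >= burn_in:
            out[:, s - burn_in] = y
    return out


def instrument_matrix(y_i):
    """Block-diagonal Z_i: the row for period t holds y_i1, ..., y_i,t-2."""
    t = len(y_i)
    z = np.zeros((t - 2, (t - 2) * (t - 1) // 2))
    col = 0
    for row in range(t - 2):          # row 0 is the equation for t = 3
        width = row + 1
        z[row, col:col + width] = y_i[:width]
        col += width
    return z


def difference_gmm(y):
    """One-step and two-step Difference-GMM estimates of rho."""
    n, t = y.shape
    dy = np.diff(y, axis=1)
    dep, lag = dy[:, 1:], dy[:, :-1]  # dy_it and dy_i,t-1 for t = 3..T
    zs = [instrument_matrix(y[i]) for i in range(n)]
    m = t - 2
    g = 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)   # G_T = F_T F_T'
    zx = sum(z.T @ lag[i] for i, z in enumerate(zs))        # Z'X
    zy = sum(z.T @ dep[i] for i, z in enumerate(zs))        # Z'y
    # Step 1: weighting matrix (Z'GZ)^{-1}
    w1 = np.linalg.pinv(sum(z.T @ g @ z for z in zs))
    rho1 = (zx @ w1 @ zy) / (zx @ w1 @ zx)
    # Step 2: weighting matrix built from the first-step residuals
    s = np.zeros_like(w1)
    for i, z in enumerate(zs):
        moment = z.T @ (dep[i] - rho1 * lag[i])
        s += np.outer(moment, moment)
    w2 = np.linalg.pinv(s)
    rho2 = (zx @ w2 @ zy) / (zx @ w2 @ zx)
    return rho1, rho2


rng = np.random.default_rng(0)
y = simulate_panel(n=2000, t=7, rho=0.5, rng=rng)
rho_one_step, rho_two_step = difference_gmm(y)
print(f"one-step: {rho_one_step:.3f}, two-step: {rho_two_step:.3f}")
```

A pseudo-inverse is used for the weighting matrices only as a numerical safeguard; with more cross-section units than instruments, as here, an ordinary inverse would normally do as well.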

One can show that under homoskedasticity the one-step estimates are asymptotically equivalent to the two-step estimates. However, when $y_{it}$ is highly persistent, the instruments are weak (Blundell and Bond, 1998; Kitazawa, 2001), because the differences of a persistent time series (near a unit root) are close to innovations and therefore difficult to instrument. This is known as the weak instrument problem.
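A small calculation makes the weak instrument problem tangible. The Python sketch below is an added illustration that rests on assumptions not stated in the text: a stationary pure AR(1) panel $y_{it} = \rho y_{i,t-1} + \mu_i + \varepsilon_{it}$ with mutually independent $\mu_i$ and $\varepsilon_{it}$. Under these assumptions the covariance between the instrument $y_{i,t-2}$ and the instrumented variable $\Delta y_{i,t-1}$ is $-\sigma_\varepsilon^2/(1+\rho)$, while the variance of the level grows without bound as $\rho \to 1$, so their correlation vanishes. The function name is hypothetical.

```python
import math


def first_stage_corr(rho, sigma_eps=1.0, sigma_mu=1.0):
    """Correlation between y_i,t-2 and dy_i,t-1 in a stationary AR(1) panel."""
    cov = -sigma_eps ** 2 / (1 + rho)
    var_level = sigma_mu ** 2 / (1 - rho) ** 2 + sigma_eps ** 2 / (1 - rho ** 2)
    var_diff = 2 * sigma_eps ** 2 / (1 + rho)
    return cov / math.sqrt(var_level * var_diff)


for rho in (0.2, 0.5, 0.8, 0.95):
    print(f"rho = {rho:4.2f}  corr(y_t-2, dy_t-1) = {first_stage_corr(rho):+.3f}")
```

The correlation shrinks towards zero as $\rho$ approaches one, which is the situation the System-GMM estimator is designed for.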

The estimation of the variance matrix is also discussed briefly.

• one-step procedure

Replacing $\Omega$ with a sandwich-type proxy $\hat\Omega_{\hat\beta_1}$ delivers consistent and robust variances:

$$\widehat{Var}[\hat\beta_1] = (X'ZW_1Z'X)^{-1}\,X'ZW_1Z'\,\hat\Omega_{\hat\beta_1}\,ZW_1Z'X\,(X'ZW_1Z'X)^{-1}$$

• two-step procedure

Using the optimal weighting matrix $W = (Z'\hat\Omega Z)^{-1}$, the above formula reduces to

$$\widehat{Var}[\hat\beta_2] = \left(X'Z\,(Z'\hat\Omega_{\hat\beta_1}Z)^{-1}\,Z'X\right)^{-1}$$

However, $\widehat{Var}[\hat\beta_2]$ can be heavily downward biased. For many years econometricians therefore reported the t-statistics of the one-step procedure when estimating two-step GMMs. Nowadays, Windmeijer's (2005) correction is mostly used for two-step t-statistics, which have been shown to perform very well in Monte Carlo simulations.

2.2.2 System-GMM (Blundell/Bond, 1998)

System-GMM was developed to tackle the weak instrument problem. It builds a system of two equations, the "level equation" and the "difference equation". Endogenous variables in the level equation are instrumented by lagged differences. In summary, »where Arellano-Bond instruments differences [. . . ] with levels, Blundell-Bond instruments levels with differences. [...] For random walk–like variables, past changes may indeed be more predictive of current levels than past levels are of current changes« (Roodman, 2006). The following additional moments are exploited:

$$E[\Delta y_{i,t-1}(\mu_i + \varepsilon_{it})] = 0 \quad\text{and}\quad E[\Delta X_{i,t-1}(\mu_i + \varepsilon_{it})] = 0, \qquad\text{for } t = 3, \dots, T$$

In matrix notation:

$$X_i = \begin{bmatrix} y_{i2} - y_{i1} & x_{i3}' - x_{i2}' \\ y_{i3} - y_{i2} & x_{i4}' - x_{i3}' \\ \vdots & \vdots \\ y_{i,T-1} - y_{i,T-2} & x_{iT}' - x_{i,T-1}' \\ y_{i2} & x_{i3}' \\ \vdots & \vdots \\ y_{i,T-1} & x_{iT}' \end{bmatrix}, \qquad Z_i = \begin{bmatrix} Z_{Di} & 0 \\ 0 & Z_{Li} \end{bmatrix}$$

$$Z_{Li} = \begin{bmatrix} [\Delta y_{i2}, \Delta x_{i2}', \Delta x_{i3}'] & 0 & \cdots & 0 \\ 0 & \ddots & \cdots & 0 \\ \vdots & \vdots & \ddots & 0 \\ 0 & 0 & \cdots & [\Delta y_{i2}, \dots, \Delta y_{i,T-2}, \Delta x_{i2}', \dots, \Delta x_{iT}'] \end{bmatrix}$$

2.2.3 Validity of the instruments

Performance depends strongly on the validity of the instruments. A valid instrumental variable $z$ requires:

1. $E[\varepsilon \mid z] = 0$ (exogeneity)

2. $cov(z, x) \ne 0$ (relevance)

These assumptions can be tested with procedures such as the overidentifying restrictions test, which is also used in the context of IV estimation.

• Overidentifying restrictions test (Sargan/Hansen test)
The assumption of exogeneity of all instruments is tested using the Hansen statistic $J(\hat\beta_{EGMM}) \sim \chi^2_{L-K}$ or the Sargan statistic, which, as opposed to Hansen, assumes conditional homoskedasticity. The main problem with the Hansen statistic is that it is seriously weakened by the use of a large number of instruments. If the number of instruments exceeds the number of (cross-section) observations, it delivers a p-value of one, which is meaningless. The Sargan statistic is known to be relatively robust to the use of a large number of instruments, but it may be inconsistent under heteroskedasticity, or at least tests based on the Sargan statistic have little power. One can also test subsets of instruments, which is especially helpful in analyzing the new instruments employed for the System-GMM. This can be done with the Difference-in-Sargan statistic: $DS = S_u - S_r \sim \chi^2$

• Arellano and Bond autocorrelation test
It examines the differenced residuals to detect autocorrelation of the residuals in levels, which would indicate invalid (not exogenous) instruments.

It is difficult to determine how many instruments should be used in GMM estimation. Typically one employs more instruments to increase the efficiency of the estimation. However, it has been shown that more instruments also increase the finite sample bias (Bun/Kiviet, 2002). Thus there is a trade-off between small sample bias and efficiency. Hahn suggests using a restricted subset of instruments, with the number of instruments at least below the number of observations.
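How quickly the instrument count grows can be illustrated with a short Python sketch. It is an added example that counts only the lagged-level (GMM-style) instruments of a single variable in the differenced equations $t = 3, \dots, T$, assuming a balanced panel; it therefore does not reproduce the total instrument counts of an actual estimation, which also include standard instruments and, for System-GMM, the level equation. The function name and its parameters are hypothetical.

```python
def n_lag_instruments(T, min_lag=2, max_lag=None):
    """Count lagged-level instruments over the differenced equations t = 3..T."""
    total = 0
    for t in range(3, T + 1):
        # y_i1 is the earliest observation, so the deepest usable lag is t - 1
        deepest = t - 1 if max_lag is None else min(max_lag, t - 1)
        total += max(0, deepest - min_lag + 1)
    return total


T, N = 34, 185
print(n_lag_instruments(T))                        # all lags from 2 on: 528
print(n_lag_instruments(T, min_lag=4, max_lag=7))  # lags 4 to 7 only: 114
print(n_lag_instruments(T) > N)                    # more instruments than groups: True
```

Using all available lags, the count grows quadratically in $T$ and exceeds $N = 185$ for $T = 34$, whereas a restricted lag window keeps it below the number of groups.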

2.3 Bias corrected LSDV estimator

Consistent estimation is obtained by additive bias correction: one estimates the LSDV coefficients and corrects them by subtracting their (estimated) biases. The by now relatively large number of bias corrected LSDV estimators can be distinguished according to their use of external information. There are bias corrected LSDV estimators which

• use a preliminary consistent estimator:
  Kiviet (1995)
  Hansen (2001)
  Hahn and Kuersteiner (2002) - not for short T
  Bruno (2005) - for unbalanced panels and short T

• do not need a preliminary consistent estimator:
  Bun/Carree (2005)

2.3.1 Bruno (2005)

This paper briefly explains the procedure suggested by Bruno (2005). Bias approximations are available with an increasing level of accuracy:

$$B_1 = c_1(T^{-1}), \qquad B_2 = B_1 + c_2(N^{-1}T^{-1}) \qquad\text{and}\qquad B_3 = B_2 + c_3(N^{-1}T^{-2})$$

where $c_1$, $c_2$ and $c_3$ depend, among other things, on $\sigma_\varepsilon^2$ and $\gamma$. Because these true values are unknown, the bias approximations are not feasible as they stand: $\sigma_\varepsilon^2$ and $\gamma$ have to be obtained from a consistent estimator (Anderson/Hsiao, Arellano/Bond, Blundell/Bond). The corrected estimators are then

$$LSDVC_i = LSDV - \hat B_i, \qquad i = 1, 2 \text{ and } 3$$
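The logic of the additive correction can be sketched in Python. The example below is an added, deliberately simplified stand-in: instead of Bruno's approximations $B_1$ to $B_3$ it plugs a preliminary consistent estimate into Nickell's (1981) large-$N$ bias expression for a pure AR(1) panel without regressors. The function names and the input numbers are hypothetical and only show how the correction step works.

```python
def nickell_bias(rho, T):
    """Large-N LSDV bias in a pure AR(1) panel with T regression periods."""
    a = 1 - (1 - rho ** T) / (T * (1 - rho))
    return -(1 + rho) / (T - 1) * a / (1 - 2 * rho * a / ((1 - rho) * (T - 1)))


def bias_corrected_lsdv(rho_lsdv, rho_init, T):
    """LSDVC = LSDV - estimated bias, with the bias evaluated at rho_init."""
    return rho_lsdv - nickell_bias(rho_init, T)


# Hypothetical inputs: a biased LSDV estimate and a preliminary consistent
# estimate (for example from Anderson/Hsiao) for a panel with T = 6.
rho_lsdv, rho_init, T = 0.224, 0.5, 6
print(f"estimated bias      = {nickell_bias(rho_init, T):.3f}")
print(f"corrected estimate  = {bias_corrected_lsdv(rho_lsdv, rho_init, T):.3f}")
```

The quality of the correction depends on the preliminary estimate: an imprecise initial value feeds directly into the estimated bias.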

2.4 Summary of the models

Model                   Transformation   Regressors                                                    Consistency
LSDV/FE                 Within           $y_{i,t-1}, x_{it}$                                           no
Bias corrected LSDV     Within           $y_{i,t-1}, x_{it}$                                           yes
First-difference IV     $\Delta$         $\Delta y_{i,t-1}, \Delta x_{it}$                             yes
First-difference GMM    $\Delta$         $\Delta y_{i,t-1}, \Delta x_{it}$                             yes
System-GMM              $\Delta$         $\Delta y_{i,t-1}, \Delta x_{it}, y_{i,t-1}, x_{it}$          yes

Table 1: Summary of the models

There is a vast literature dealing with the performance of these estimators under a variety of conditions. According to Monte Carlo simulations, GMM estimators are more adequate for a large N dimension and are more accurate in the case of heteroskedasticity, whereas the bias corrected LSDV estimator in general performs better for small data sets.

3 Application - Winter tourism demand model

The estimators described above are applied in this chapter to a winter tourism demand model for 185 Austrian ski destinations. The study is taken from Eigner et al. (2009).

3.1 Data description

The analyzed panel data set contains the number of overnight stays of tourists from the 15 most important countries (including Austria) in 185 [3] different skiing destinations in Austria in the winter season for the time period from 1973 to 2006. The data can be described as typical cross-section panel data, with a moderately large number of cross-section units (N=185), each observed for a small number of time periods (T=34).

Variable   Description
Nights     number of overnight stays in winter season
Snow       snow cover
GDP        income variable
Beds       infrastructure variable
PP         relative purchasing power

Table 2: Description of the variables

The number of overnight stays measures winter tourism demand, which is assumed to depend on relative purchasing power (PP) [4] and income (GDP) [5] of the tourists, on infrastructure, measured by the number of beds (BEDS) [6], and on the climate variable snow (SNOW) [7]. To account for unobserved time effects, time dummies are included in the demand function.

[3] Austrian ski resort database. JOANNEUM Research (2008)
[4] OECD (2008)
[5] OECD (2008)
[6] Statistik Austria (2008)
[7] ZAMG (2009)

3.2 Commands in statistical software packages

In STATA, GMMs can be estimated with the official commands xtabond (Arellano/Bond) and xtdpdsys (Blundell/Bond), where the latter is only available in STATA 10. More testing procedures are available with xtabond2 (Roodman, 2006). The bias corrected LSDV estimator is available under the command xtlsdvc (Bruno, 2005). The cross-section dependence test by Pesaran (2004) can be run using xtcsd.

In R (R Development Core Team, 2008), a specific package named plm (Croissant/Millo, 2008) is available, containing the function pgmm for GMMs. Currently, this function fails when dealing with large instrument matrices. Furthermore, it is not able to exactly reproduce the estimation results of the STATA functions.

Current versions of other statistical software packages like SAS, Eviews and LIMDEP are also able to calculate GMMs for dynamic panel models. Unfortunately, none of these programs (nor R and STATA) currently offers bias corrected LSDV estimators as official commands. Luckily, Bruno (2005) not only wrote up his considerations in a paper but also implemented his estimator as an unofficial command for STATA.

3.3 Estimation table of the winter tourism demand model for Austrian ski destinations from 1973 to 2006

                    pool       fe         fe_tw      fe_tw_bc   diffgmm2   sysgmm2    sysgmm_v   sysgmm_g
L.NIGHTS            0.716***   0.609***   0.596***   0.637***   0.475***   0.634***   0.600***   0.632***
                    (11.97)    (10.54)    (10.62)    (67.01)    (6.35)     (11.63)    (4.64)     (11.78)
L2.NIGHTS           0.215***   0.174***   0.187***   0.161***   0.166***   0.163***   0.268***   0.179***
                    (3.92)     (3.82)     (4.36)     (16.72)    (5.44)     (3.93)     (3.69)     (3.78)
SNOW / 100          0.067***   0.076***   0.070***   0.071***   0.071***   0.097***   0.153***   0.095***
                    (5.70)     (6.49)     (4.04)     (4.44)     (3.65)     (4.41)     (2.70)     (5.35)
log(BEDS)           0.086***   0.113***   0.132***   0.119***   0.202***   0.202***   0.134      0.222***
                    (7.40)     (5.53)     (5.80)     (10.34)    (4.41)     (6.19)     (1.38)     (7.87)
log(GDP)            0.039      0.013      0.407***   0.436***   0.977      0.700      0.361      0.663***
                    (0.73)     (1.42)     (3.29)     (5.85)     (1.45)     (1.49)     (0.97)     (2.72)
log(PP)             -0.041***  -0.035***  -0.030**   -0.028**   -0.024     -0.010     -0.056     0.010
                    (-3.35)    (-2.76)    (-2.31)    (-2.36)    (-0.38)    (-0.16)    (-1.17)    (0.30)

R2_within                      0.776      0.785
corr(x_i,mu_i)                 0.954      0.924
sigma_u                        0.193      0.174
sigma_e                        0.147      0.145
rho                            0.632      0.592
Pesaran AR                     74.3       2.8
Pesaran p_value                0.000      0.005
t-statistics        Robust     Robust     Robust                Corrected  Corrected  Corrected  Corrected
F                   15133.0    1129.5     355.7                 102.7      37285.5    274410.0   211497.0
diff AR(2)                                                      0.621      0.901      0.345      0.926
Sargan test                                                     0.000      0.000      0.000      0.000
Hansen test                                                     1.000      1.000      0.202      1.000
Diff_Sarg IV                                                               1.000      0.752      1.000
Diff_Sarg GMM                                                              1.000      0.510      1.000
No. of instruments                                              562        594        147        893
No. of groups                  185        185        185        185        185        185        185
No. of observations 5920       5920       5920       5920       5735       5920       5920       5920

* p<0.1, ** p<0.05, *** p<0.01

Table 3: Estimation table of the winter tourism demand model for all estimators

DIFF_GMM
  First difference equation, IV:   Diff.(SNOW log_PP log_GDP log_BEDS) time_dummies
  First difference equation, GMM:  Lag(2-.).log_NIGHTS
  Level equation, IV:              -
  Level equation, GMM:             -

SYS_GMM
  First difference equation, IV:   Diff.(SNOW log_PP log_GDP log_BEDS) time_dummies
  First difference equation, GMM:  Lag(2-.).log_NIGHTS
  Level equation, IV:              log_GDP SNOW log_PP
  Level equation, GMM:             Diff.Lag.log_NIGHTS

SYS_GMM_valid
  First difference equation, IV:   -
  First difference equation, GMM:  Lag(4-7).log_NIGHTS
  Level equation, IV:              log_GDP SNOW log_PP
  Level equation, GMM:             Diff.(Lag(3).log_NIGHTS)

SYS_GMM_gdp
  First difference equation, IV:   Diff.(log_BEDS SNOW log_PP) time dummies
  First difference equation, GMM:  Lag(2-.).log_NIGHTS log_GDP
  Level equation, IV:
  Level equation, GMM:             Diff.Lag.(log_NIGHTS log_GDP)

Table 4: Instrument list for Table 3

«The OLS coefficient of the lagged dependent variable is expected to suffer from an upward bias due to its ignorance of individual specific effects (Hsiao 1986), whereas the within estimator of the fixed effects model is expected to be downward biased (Nickell 1981). [...] According to Blundell and Bond (1995), a plausible parameter estimate should therefore lie between the within and the OLS estimate.» (Eigner et al., 2009). This initial plausibility condition is fulfilled in our estimations. Controlling for time effects in the fixed effects model results in a large increase in the coefficient for GDP, making it theoretically more reasonable, while leaving all other coefficients practically unchanged.

The bias corrected estimator produces only minor changes to the two-way fixed effects model, which therefore seems to suffer only from minor biases. The difference GMM suffers from a substantial downward bias of the sum of the coefficients of the lagged dependent variables. This is consistent with the finite sample bias found in Blundell and Bond (1998) for highly persistent series, which can be even greater than the bias of the within estimator, especially in the case of weak instruments. Indeed this seems to be the case here. The Sargan test indicates for both models (Diff-GMM and System-GMM) that the instruments are invalid, even though this test statistic is quite restrictive and has to be interpreted with care on account of its lack of robustness to heteroskedasticity.

The problem of a large number of instrumental variables, which weaken the Hansen test and could lead to finite sample bias, is tackled in the SYS_GMM_v model. This model aims at passing the overidentifying restrictions test. It needs fewer instrumental variables because it leaves out the time dummies as standard instruments and uses only lags 4 to 7 of the dependent variable as GMM instruments. The estimates are broadly in line with expectations. However, the model specification of SYS_GMM_v is quite arbitrary and the coefficients for BEDS and GDP are not significant, which is implausible and hints at specification problems.

Theoretical considerations led to the construction of SYS_GMM_g. Until now GDP has been treated as strictly exogenous in our tourism demand model, which is not necessarily plausible. As a higher GDP is expected to increase the number of overnight stays, increasing tourism earnings in turn lead to an increase of the Austrian GDP, which has a relatively large weight (9%) in the GDP variable. Although GMM estimators are in general assumed to be relatively robust to endogeneity problems, one could treat GDP not as a standard but as a GMM style instrumental variable. Estimating SYS_GMM_g with all available moments leads to results which are similar to the bias-corrected two-way fixed effects model, with reliable and significant coefficients for SNOW and GDP. The main weakness of this model is the use of a vast number of instrumental variables, which may increase the finite sample bias. It breaks the rule that the number of groups should be larger than the number of instrumental variables. As a consequence of the weakened Hansen test, no clear statement can be made about the validity of the instruments. One has to accept that the overidentifying restrictions test of Sargan is strongly rejected, indicating invalid instruments, although this test is itself weakened by heteroskedasticity. However, because the coefficient of the lagged dependent variable is only slightly above that of the bias corrected FE model, the finite sample bias seems to be small, and therefore the gain in efficiency is preferred here.

3.4 Interpretation of the results

According to the estimates of the last model, SYS_GMM_g, the short-term income elasticity amounts to 0.55, which would indicate inelastic demand for Austrian skiing destinations. In the long term, however, the elasticity increases to 3.01 and is thus strongly elastic, as is common for luxury goods. The estimates of the bias corrected FE model reveal a short-term income elasticity of 0.44 and a long-term income elasticity of 2.18. The snow estimates imply that 10 additional days with a snow depth of more than 1 cm increase overnight stays by about 0.7 to 1 percent. At an aggregate level for the winter season, 79-82 percent of total overnight stays in Austria can be attributed to habit persistence and/or word-of-mouth effects, indicating a very good reputation of Austrian skiing destinations.
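The step from short-term to long-term elasticities can be retraced with the usual long-run multiplier of an autoregressive distributed lag model, $\beta / (1 - \rho_1 - \rho_2)$. The Python sketch below is an added illustration that assumes this formula was used; it takes the fe_tw_bc coefficients from Table 3 as inputs, and the function name is hypothetical.

```python
def long_run_effect(beta, ar_coefs):
    """Long-run multiplier beta / (1 - sum of autoregressive coefficients)."""
    persistence = sum(ar_coefs)
    if persistence >= 1:
        raise ValueError("long-run effect is undefined for persistence >= 1")
    return beta / (1 - persistence)


# Bias corrected two-way FE model (column fe_tw_bc of Table 3)
ar_coefs = [0.637, 0.161]      # L.NIGHTS and L2.NIGHTS
beta_gdp = 0.436               # log(GDP), short-term income elasticity

print(round(sum(ar_coefs), 3))                        # 0.798
print(round(long_run_effect(beta_gdp, ar_coefs), 2))  # 2.16
```

With the short-term coefficient rounded to 0.44, the same calculation gives about 2.18, the long-term value quoted for the bias corrected FE model; the sum of the two lag coefficients, about 0.80, corresponds to the lower end of the 79-82 percent persistence range.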

3.5 Final considerations concerning the application

The biases in the estimates seem to follow the theory. However, cross-section dependence in the data may still worsen the estimates. In detail, cross-section dependence may lead to a loss in efficiency (Phillips and Sul, 2003) and may even lead to biased and inconsistent estimates. The regression model should therefore be extended to account for cross-section dependence, that is, for spatial patterns in the data. This can be done in spatial interaction models with spatial weighting matrices.


References

Blundell, Richard & Bond, Stephen (2000): GMM Estimation with persistent panel data: an application to production functions. Econometric Reviews, Taylor and Francis Journals, vol. 19(3), 321-340.

Bond, Stephen (2002): Dynamic panel data models: a guide to microdata methods and practice. CeMMAP working papers CWP09/02, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.

Bruno, G.S.F. (2005): Estimation and inference in dynamic unbalanced panel data models with a small number of individuals. CESPRI WP n. 165, University Bocconi-CESPRI, Milan.

Croissant, Yves & Millo, Giovanni (2008): Panel Data Econometrics in R: The plm Package. Journal of Statistical Software 27(2). URL: http://www.jstatsoft.org/v27/i02/.

Eigner, F., Toeglhofer, C., & Prettenthaler, F. (2009): Tourism demand in Austrian ski destinations. A dynamic panel data approach (in preparation).

Kitazawa, Y. (2001): Exponential regression of dynamic panel data models. Economics Letters 73, 7-13.

Kiviet, J. F. (1995): On Bias, Inconsistency and Efficiency of Various Estimators in Dynamic Panel Data Models. Journal of Econometrics 68, 53-78.

Nickell, Stephen J. (1981): Biases in Dynamic Models with Fixed Effects. Econometrica, Econometric Society, vol. 49(6), 1417-26, November.

Pesaran, M. H. (2004): General diagnostic tests for cross section dependence in panels. Cambridge Working Papers in Economics, 0435, University of Cambridge.

R Development Core Team (2008): R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. URL: http://www.R-project.org

Roodman, D. (2006): How to Do xtabond2: An Introduction to "Difference" and "System" GMM in Stata. Working Paper 103. Center for Global Development, Washington.

StataCorp. (2007): Stata Statistical Software: Release 10. College Station, TX: StataCorp LP.

Windmeijer, F. (2005): A Finite Sample Correction for the Variance of Linear Efficient Two-Step GMM Estimators. Journal of Econometrics, 126, 25-51.