Source: personal.strath.ac.uk/gary.koop/ec468_topic4_slides.pdf · 16/2/2011

The Linear Regression Model with Panel Data

February 16, 2011

() Topic 4 February 16, 2011 1 / 29

Summary

Readings: Chapter 7 of textbook.

I will cover the pooled and individual effects models.

A very popular version of the individual effects model, the stochastic frontier model, will also be discussed.

Computational tools: Gibbs sampler with data augmentation.


Notation

$y_{it}$ and $\varepsilon_{it}$ denote the $t$-th observations (for $t = 1, \dots, T$) of the dependent variable and error, respectively, for the $i$-th individual, for $i = 1, \dots, N$.

$y_i$ and $\varepsilon_i$ denote vectors of $T$ observations on the dependent variable and error, respectively, for the $i$-th individual.

Sometimes it is important to distinguish between the intercept and slope coefficients.

Hence, define $X_i$ to be a $T \times k$ matrix containing $T$ observations on each of $k$ explanatory variables (including the intercept) for the $i$-th individual. $\tilde{X}_i$ is the $T \times (k-1)$ matrix equal to $X_i$ with the intercept removed.

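If we stack observations for all $N$ individuals together, we obtain the $TN$-vectors:

\[
y = \begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix}
\quad \text{and} \quad
\varepsilon = \begin{bmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_N \end{bmatrix}
\]

Similarly, stacking observations on all explanatory variables together yields the $TN \times k$ matrix:

\[
X = \begin{bmatrix} X_1 \\ \vdots \\ X_N \end{bmatrix}
\]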
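The stacking can be sketched in a few lines of NumPy. This is only an illustration: the panel dimensions and the randomly generated blocks are hypothetical, and the names `y_blocks` and `X_blocks` are not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, k = 3, 4, 2  # hypothetical panel dimensions

# One T-vector y_i and one T x k matrix X_i (first column = intercept) per individual.
y_blocks = [rng.normal(size=T) for _ in range(N)]
X_blocks = [np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
            for _ in range(N)]

y = np.concatenate(y_blocks)  # TN-vector
X = np.vstack(X_blocks)       # TN x k matrix

print(y.shape, X.shape)       # (12,) (12, 2)
```

Rows `i*T` to `(i+1)*T - 1` of `X` hold the block for individual `i` (counting from zero), which is the ordering the stacked notation assumes.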

The Pooled Model

Assume the same regression relationship holds for every individual:

\[
y_i = X_i \beta + \varepsilon_i,
\]

for $i = 1, \dots, N$, where $\beta$ is the $k$-vector of regression coefficients, including the intercept.

This is just a linear regression model of the sort discussed in previous lectures.

No new issues arise.

Individual Effects Models

Model is of the form:

\[
y_{it} = \alpha_i + \beta x_{it} + \varepsilon_{it}
\]

Different intercept for every individual ("individual effect")

Same slope for every individual

The Likelihood Function

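Likelihood function based on the regression equation:

\[
y_i = \alpha_i \iota_T + \tilde{X}_i \tilde{\beta} + \varepsilon_i
\]

Properties of the multivariate Normal imply a likelihood function of the form:

\[
p(y \mid \alpha, \tilde{\beta}, h) = \prod_{i=1}^{N} \frac{h^{T/2}}{(2\pi)^{T/2}}
\exp\left[ -\frac{h}{2} \left( y_i - \alpha_i \iota_T - \tilde{X}_i \tilde{\beta} \right)' \left( y_i - \alpha_i \iota_T - \tilde{X}_i \tilde{\beta} \right) \right]
\]

where $\alpha = (\alpha_1, \dots, \alpha_N)'$ and $\iota_T$ is a $T$-vector of ones.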
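As a rough sketch, the log of this likelihood can be evaluated for a balanced panel as follows. The function name and the array layout (`y` as an N x T array, `X_tilde` as an N x T x (k-1) array) are assumptions made for illustration.

```python
import numpy as np

def log_likelihood(y, X_tilde, alpha, beta_tilde, h):
    """Log of p(y | alpha, beta_tilde, h) for a balanced panel.

    y: (N, T); X_tilde: (N, T, k-1); alpha: (N,); beta_tilde: (k-1,);
    h: error precision (inverse of the error variance).
    """
    N, T = y.shape
    # Residuals y_it - alpha_i - x_it' beta_tilde, one row per individual.
    resid = y - alpha[:, None] - X_tilde @ beta_tilde
    return 0.5 * N * T * (np.log(h) - np.log(2 * np.pi)) - 0.5 * h * np.sum(resid ** 2)
```

Working on the log scale avoids the underflow that the product over $N$ individuals would otherwise cause for panels of realistic size.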

The Prior
A Non-hierarchical Prior

Any sort of prior can be used, including a noninformative one.

Here we consider two types of priors which are computationally simple and commonly used.

Individual effects model can be written as:

\[
y = X^* \beta^* + \varepsilon
\]

where $X^*$ is a $TN \times (N + k - 1)$ matrix given by

\[
X^* = \begin{bmatrix}
\iota_T & 0_T & \cdots & 0_T & \tilde{X}_1 \\
0_T & \iota_T & \cdots & 0_T & \tilde{X}_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0_T & 0_T & \cdots & \iota_T & \tilde{X}_N
\end{bmatrix}
\]

and

\[
\beta^* = \begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_N \\ \tilde{\beta} \end{bmatrix}
\]
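One way to build $X^*$ numerically is with a Kronecker product for the block of individual dummies. The sketch below assumes a balanced panel; the function name `build_x_star` and the small example data are hypothetical.

```python
import numpy as np

def build_x_star(X_tilde_blocks):
    """Return the TN x (N + k - 1) matrix [I_N kron iota_T, stacked X_tilde_i]."""
    N = len(X_tilde_blocks)
    T = X_tilde_blocks[0].shape[0]
    dummies = np.kron(np.eye(N), np.ones((T, 1)))  # one dummy column per individual
    return np.hstack([dummies, np.vstack(X_tilde_blocks)])

# Hypothetical example: N = 3 individuals, T = 2 periods, one slope regressor.
X_tilde_blocks = [np.array([[1.0], [2.0]]),
                  np.array([[3.0], [4.0]]),
                  np.array([[5.0], [6.0]])]
X_star = build_x_star(X_tilde_blocks)
beta_star = np.array([10.0, 20.0, 30.0, 0.5])  # (alpha_1, alpha_2, alpha_3, beta_tilde)

fitted = X_star @ beta_star  # alpha_i + 0.5 * x_it for every (i, t)
```

For large $N$ the dummy block is mostly zeros, so a sparse representation may be preferable in practice.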

Individual effects model can be written as a regression model (with individual dummy variables).

Use independent Normal-Gamma prior (but could also use natural conjugate prior):

\[
\beta^* \sim N\left( \underline{\beta}^*, \underline{V} \right)
\]

\[
h \sim G\left( \underline{s}^{-2}, \underline{\nu} \right)
\]

A Hierarchical Prior

Hierarchical priors are popular in many cases with high-dimensional parameter spaces (such as the individual effects model).

Consider a prior:

\[
\alpha_i \sim N(\mu_\alpha, V_\alpha)
\]

with $\alpha_i$ and $\alpha_j$ being independent of one another for $i \neq j$.

Hierarchical structure of the prior arises if we treat $\mu_\alpha$ and $V_\alpha$ as unknown parameters which require their own prior.

We assume $\mu_\alpha$ and $V_\alpha$ to be independent of one another with:

\[
\mu_\alpha \sim N\left( \underline{\mu}_\alpha, \underline{\sigma}_\alpha^2 \right)
\]

and

\[
V_\alpha^{-1} \sim G\left( \underline{V}_\alpha^{-1}, \underline{\nu}_\alpha \right)
\]

Hierarchical prior assumes all intercepts are drawn from the same distribution.

This extra structure (if consistent with patterns in the data) allows for more accurate estimation.

For the remaining parameters, we assume a non-hierarchical prior of the independent Normal-Gamma variety:

\[
\tilde{\beta} \sim N\left( \underline{\beta}, \underline{V}_\beta \right)
\]

and

\[
h \sim G\left( \underline{s}^{-2}, \underline{\nu} \right)
\]

This model is analogous to the frequentist random effects model.

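Bayesian Computation
Posterior Inference under the Hierarchical Prior

Under the non-hierarchical prior, we have a linear regression model with independent Normal-Gamma prior. Hence, posterior inference can be carried out using methods in Chapter 4.

A Gibbs sampler can be used.

The relevant posterior distributions for $\tilde{\beta}$ and $h$, conditional on $\alpha$, together with the remaining full conditionals, are

\[
\begin{aligned}
\tilde{\beta} \mid y, h, \alpha, \mu_\alpha, V_\alpha &\sim N\left( \overline{\beta}, \overline{V}_\beta \right) \\
h \mid y, \tilde{\beta}, \alpha, \mu_\alpha, V_\alpha &\sim G\left( \overline{s}^{-2}, \overline{\nu} \right) \\
\alpha_i \mid y, \tilde{\beta}, h, \mu_\alpha, V_\alpha &\sim N\left( \overline{\alpha}_i, \overline{V}_i \right) \\
\mu_\alpha \mid y, \tilde{\beta}, h, \alpha, V_\alpha &\sim N\left( \overline{\mu}_\alpha, \overline{\sigma}_\alpha^2 \right) \\
V_\alpha^{-1} \mid y, \tilde{\beta}, h, \alpha, \mu_\alpha &\sim G\left( \overline{V}_\alpha^{-1}, \overline{\nu}_\alpha \right)
\end{aligned}
\]

where formulae for the arguments of these densities are given in the textbook, pages 152-154.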
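A minimal sketch of such a Gibbs sampler is given below for the single-regressor model $y_{it} = \alpha_i + \beta x_{it} + \varepsilon_{it}$. It is not the textbook's code. The conditional moments are the standard conjugate results derived under the assumptions stated in the comments. In particular, the Gamma priors are written in shape/rate form rather than in the textbook's notation, and every hyperparameter value and name (`gibbs_individual_effects`, `b0`, `V0`, and so on) is an illustrative assumption.

```python
import numpy as np

def gibbs_individual_effects(y, x, n_draws=1000, burn=200, seed=0):
    """Gibbs sampler for y_it = alpha_i + beta * x_it + eps_it, eps_it ~ N(0, 1/h).

    y, x: (N, T) arrays. Hierarchical prior: alpha_i ~ N(mu_a, V_a).
    All hyperparameters below are illustrative, relatively diffuse choices.
    """
    rng = np.random.default_rng(seed)
    N, T = y.shape
    b0, V0 = 0.0, 100.0      # beta ~ N(b0, V0)
    a_h, r_h = 0.01, 0.01    # h ~ Gamma(shape a_h, rate r_h)
    m0, s0 = 0.0, 100.0      # mu_a ~ N(m0, s0), with s0 a variance
    a_v, r_v = 0.01, 0.01    # 1 / V_a ~ Gamma(shape a_v, rate r_v)

    alpha, beta, h, mu_a, V_a = np.zeros(N), 0.0, 1.0, 0.0, 1.0
    keep = {"beta": [], "h": [], "alpha": []}
    for it in range(n_draws + burn):
        # beta | rest: Normal
        V1 = 1.0 / (1.0 / V0 + h * np.sum(x ** 2))
        b1 = V1 * (b0 / V0 + h * np.sum(x * (y - alpha[:, None])))
        beta = rng.normal(b1, np.sqrt(V1))
        # h | rest: Gamma (NumPy uses shape and scale = 1 / rate)
        resid = y - alpha[:, None] - beta * x
        h = rng.gamma(a_h + 0.5 * N * T, 1.0 / (r_h + 0.5 * np.sum(resid ** 2)))
        # alpha_i | rest: independent Normals
        V_i = 1.0 / (T * h + 1.0 / V_a)
        a_i = V_i * (h * np.sum(y - beta * x, axis=1) + mu_a / V_a)
        alpha = rng.normal(a_i, np.sqrt(V_i))
        # mu_a | rest: Normal
        s1 = 1.0 / (1.0 / s0 + N / V_a)
        mu_a = rng.normal(s1 * (m0 / s0 + alpha.sum() / V_a), np.sqrt(s1))
        # 1 / V_a | rest: Gamma
        V_a = 1.0 / rng.gamma(a_v + 0.5 * N,
                              1.0 / (r_v + 0.5 * np.sum((alpha - mu_a) ** 2)))
        if it >= burn:
            keep["beta"].append(beta)
            keep["h"].append(h)
            keep["alpha"].append(alpha.copy())
    return {name: np.array(vals) for name, vals in keep.items()}
```

Calling `gibbs_individual_effects(y, x)` on N x T arrays returns retained draws. For example, `out["beta"].mean()` approximates the posterior mean of the slope.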

Derivations above are simple extensions of those for the Normal linear regression model.

Gibbs sampler requires only random number generation from Normal and Gamma distributions.

Note: the random coefficients model is given by:

\[
y_i = X_i \beta_i + \varepsilon_i
\]

where $\beta_i$ varies across individuals.

Discussed in textbook, pages 155-157. (Simple extension of individual effects model so I will not discuss it here).

Efficiency Analysis and the Stochastic Frontier Model

To motivate the model, let output of firm $i$ at time $t$, $Y_{it}$, be produced using a vector of inputs, $X^*_{it}$.

Firms have access to a common best-practice technology for turning inputs into output:

\[
Y_{it} = f(X^*_{it}; \beta).
\]

Production frontier measures the maximum amount of output that can be obtained from a given level of inputs.

Deviation of actual from maximum feasible output is a measure of inefficiency.

Formally:

\[
Y_{it} = f(X^*_{it}; \beta)\, \tau_i
\]

where $0 < \tau_i \le 1$ is a measure of firm-specific efficiency and $\tau_i = 1$ indicates firm $i$ is fully efficient.

Example: $\tau_i = 0.75$ means that firm $i$ is producing only 75% of the output it could have if it were operating according to best-practice technology.

In this specification, we have assumed each firm has a particular efficiency level which is constant over time. This assumption can be relaxed.

Adding a random error to the model, $\zeta_{it}$, to capture measurement (or specification) error:

\[
Y_{it} = f(X^*_{it}; \beta)\, \tau_i\, \zeta_{it}
\]

Common for $f()$ to be log-linear (e.g. Cobb-Douglas or translog):

\[
y_{it} = X_{it}\beta + \varepsilon_{it} - z_i
\]

where $y_{it} = \ln(Y_{it})$, $\varepsilon_{it} = \ln(\zeta_{it})$, $z_i = -\ln(\tau_i)$ and $X_{it}$ is the counterpart of $X^*_{it}$ with the inputs transformed to logarithms.

Stack into matrices:

\[
y_i = X_i \beta + \varepsilon_i - z_i \iota_T
\]

$z_i$ is referred to as inefficiency.

Since $0 < \tau_i \le 1$, it is a non-negative random variable.

$X_{it}$ is assumed to contain an intercept and $\beta_1$ is its coefficient.

Note that this model is of the form of an individual effects model: $\beta_1 - z_i$ plays the same role that $\alpha_i$ did for individual effects models.
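As a worked illustration of the log-linear form, take a Cobb-Douglas frontier with two inputs. The two-input case and the input labels $X^*_{2,it}$ and $X^*_{3,it}$ are assumptions chosen for illustration, not part of the slides.

```latex
\begin{aligned}
Y_{it} &= e^{\beta_1} \left( X^*_{2,it} \right)^{\beta_2} \left( X^*_{3,it} \right)^{\beta_3} \tau_i \, \zeta_{it} \\
\ln Y_{it} &= \beta_1 + \beta_2 \ln X^*_{2,it} + \beta_3 \ln X^*_{3,it} + \ln \tau_i + \ln \zeta_{it} \\
y_{it} &= X_{it}\beta - z_i + \varepsilon_{it},
\qquad X_{it} = \left( 1, \; \ln X^*_{2,it}, \; \ln X^*_{3,it} \right)
\end{aligned}
```

The last line uses $z_i = -\ln(\tau_i)$ and $\varepsilon_{it} = \ln(\zeta_{it})$, so the efficiency term enters the log model with a minus sign.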

Bayesian Inference in the Stochastic Frontier Model

Very similar to individual effects model, so we will only sketch out details.

The important new issue here is the inefficiency term, $z_i$, so focus on that.

Hierarchical prior for inefficiencies:

Since $z_i > 0$, cannot use Normal hierarchical prior.

Common choices include the truncated-Normal and members of the family of Gamma distributions.

Here we will use the exponential distribution (which is Gamma with two degrees of freedom):

\[
z_i \sim G(\mu_z, 2)
\]

$\mu_z > 0$ requires a prior. We use:

\[
\mu_z^{-1} \sim G\left( \underline{\mu}_z^{-1}, \underline{\nu}_z \right)
\]
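The sketch below checks the exponential/Gamma equivalence numerically. It assumes that $G(\mu, \nu)$ denotes a Gamma distribution with mean $\mu$ and $\nu$ degrees of freedom, which corresponds to shape $\nu/2$ and scale $2\mu/\nu$ in NumPy's parameterization. This reading is consistent with the statement that two degrees of freedom gives the exponential, but the helper name and the value of `mu_z` are hypothetical.

```python
import numpy as np

def draw_gamma_mean_dof(mean, dof, rng, size=None):
    """Draw from a Gamma written as G(mean, dof): shape = dof / 2, scale = 2 * mean / dof."""
    return rng.gamma(shape=dof / 2.0, scale=2.0 * mean / dof, size=size)

rng = np.random.default_rng(0)
mu_z = 0.5  # hypothetical mean inefficiency

z_gamma = draw_gamma_mean_dof(mu_z, 2.0, rng, size=200_000)  # G(mu_z, 2)
z_expon = rng.exponential(scale=mu_z, size=200_000)          # exponential with mean mu_z

# Both samples should have mean close to mu_z and variance close to mu_z ** 2.
print(z_gamma.mean(), z_expon.mean())
print(z_gamma.var(), z_expon.var())
```

With `dof = 2` the shape is 1 and the scale is `mean`, which is exactly the exponential case.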

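Now set up a Gibbs sampler.

Derive full conditional posterior distributions similarly to the random effects model:

\[
\begin{aligned}
\beta \mid y, h, z, \mu_z &\sim N\left( \overline{\beta}, \overline{V} \right) \\
h \mid y, \beta, z, \mu_z &\sim G\left( \overline{s}^{-2}, \overline{\nu} \right) \\
p(z_i \mid y_i, X_i, \beta, h, \mu_z) &\propto f_N\left( z_i \mid \overline{X}_i \beta - \overline{y}_i - (T h \mu_z)^{-1}, (Th)^{-1} \right) 1(z_i \ge 0) \\
\mu_z^{-1} \mid y, \beta, h, z &\sim G\left( \overline{\mu}_z, \overline{\nu}_z \right)
\end{aligned}
\]

In the conditional for $z_i$, $\overline{y}_i$ and $\overline{X}_i$ denote the averages over $t$ of $y_{it}$ and $X_{it}$ for firm $i$.

Formulae for the arguments of the densities are given in the book.

Gibbs sampler involves drawing from Normal, truncated Normal and Gamma distributions, all straightforward to do.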
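The truncated Normal step for the inefficiencies might be sketched as follows. Simple rejection sampling is used here as a stand-in for whatever truncated Normal generator is preferred. The function name and array layout are assumptions, and rejection can become slow when the untruncated mean lies far below zero.

```python
import numpy as np

def draw_inefficiencies(y, X, beta, h, mu_z, rng):
    """Draw each z_i from its truncated-Normal full conditional by rejection.

    y: (N, T); X: (N, T, k); beta: (k,); h: error precision;
    mu_z: mean of the exponential prior on z_i.
    """
    N, T = y.shape
    # Mean and standard deviation of the untruncated Normal in the conditional for z_i.
    mean = (X @ beta - y).mean(axis=1) - 1.0 / (T * h * mu_z)
    sd = np.sqrt(1.0 / (T * h))
    z = rng.normal(mean, sd)
    negative = z < 0
    while negative.any():  # redraw only the entries that violate z_i >= 0
        z[negative] = rng.normal(mean[negative], sd)
        negative = z < 0
    return z
```

Within a full sampler this function would be called once per iteration, between the draws of $\beta$, $h$ and $\mu_z^{-1}$.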

Empirical Illustration: Efficiency Analysis with Stochastic Frontier Models

To illustrate Bayesian inference in the stochastic frontier model, artificial data was generated from:

\[
y_{it} = 1.0 + 0.75 x_{2,it} + 0.25 x_{3,it} - z_i + \varepsilon_{it}
\]

for $i = 1, \dots, 100$ and $t = 1, \dots, 5$.

$\varepsilon_{it} \sim N(0, 0.04)$, $z_i \sim G(-\ln[0.85], 2)$, $x_{2,it} \sim U(0, 1)$ and $x_{3,it} \sim U(0, 1)$.

Note: inefficiency distribution is selected to imply median of efficiency distribution is 0.85.

Priors are relatively noninformative (see textbook).

Posterior results based on Gibbs sampler.
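A data set of this form could be simulated along the following lines. The seed is arbitrary, so the draws will not reproduce the data behind the reported results. The sketch also assumes that $N(0, 0.04)$ states a variance and that $G(-\ln[0.85], 2)$ is an exponential distribution with mean $-\ln(0.85)$.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed
N, T = 100, 5

x2 = rng.uniform(0.0, 1.0, size=(N, T))
x3 = rng.uniform(0.0, 1.0, size=(N, T))
z = rng.exponential(scale=-np.log(0.85), size=N)   # exponential with mean -ln(0.85)
eps = rng.normal(0.0, np.sqrt(0.04), size=(N, T))  # variance 0.04, i.e. sd 0.2

y = 1.0 + 0.75 * x2 + 0.25 * x3 - z[:, None] + eps
tau = np.exp(-z)                                   # true firm efficiencies
```

Under these assumptions the true error precision is $h = 1/0.04 = 25$.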

Table 7.3 contains posterior means and standard deviations for parameters.

With stochastic frontier models, interest often centers on firm-specific efficiencies, $\tau_i$ for $i = 1, \dots, N$.

Since $\tau_i = \exp(-z_i)$, and Gibbs sampler yields draws of $z_i$, we can simply transform them and average to obtain $E(\tau_i \mid y)$.

There are $N = 100$ efficiencies; we select firms which have the minimum, median and maximum values for $E(\tau_i \mid y)$.

These are labelled $\tau_{min}$, $\tau_{med}$ and $\tau_{max}$ in Table 7.3.

The histogram in Figure 7.5 plots the posterior means of the efficiencies of all 100 firms, and might be presented to give a rough idea of how efficiencies are distributed across firms.
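The transformation and averaging step can be sketched as below. The array `z_draws` stands in for retained Gibbs output and is filled with artificial numbers purely so the example runs. The function name and the choice of the upper middle firm as the "median" when $N$ is even are assumptions.

```python
import numpy as np

def efficiency_summary(z_draws):
    """z_draws: (S, N) array of retained draws of z_i.

    Returns the Monte Carlo estimates of E(tau_i | y) and the indices of the
    firms with the minimum, median and maximum estimate.
    """
    tau_draws = np.exp(-z_draws)       # transform every draw
    tau_mean = tau_draws.mean(axis=0)  # average over draws, firm by firm
    order = np.argsort(tau_mean)
    picks = {"min": order[0], "med": order[len(order) // 2], "max": order[-1]}
    return tau_mean, picks

# Stand-in draws for illustration only (not real sampler output).
rng = np.random.default_rng(0)
scales = rng.uniform(0.05, 0.5, size=100)
z_draws = rng.exponential(scale=scales, size=(1000, 100))

tau_mean, picks = efficiency_summary(z_draws)
```

A histogram of `tau_mean` would then give the kind of cross-firm picture described for Figure 7.5.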

Table 7.3: Posterior Results for Artificial Data Set from Stochastic Frontier Model

Parameter       Mean     Standard Deviation
$\beta_1$       0.98     0.03
$\beta_2$       0.74     0.03
$\beta_3$       0.27     0.03
$h$             26.69    1.86
$\mu_z$         0.15     0.02
$\tau_{min}$    0.56     0.05
$\tau_{med}$    0.89     0.06
$\tau_{max}$    0.97     0.03

An important issue in efficiency analysis is whether point estimates can be treated as a reliable guide to the ranking of firms.

Important policy recommendations may hang on a finding that firm A is less efficient than firm B.

Simply relying on point estimates which indicate that firm A is less efficient than firm B may lead to inappropriate policy advice.

But Gibbs sampler output can be used in a straightforward manner to shed light on this issue.

For instance, $p(\tau_A < \tau_B \mid y)$ is the probability firm A is less efficient than firm B.
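Given retained draws, such a probability is just the share of draws in which the inequality holds. The sketch below uses a tiny hand-made array in place of real sampler output. The function name and the column indices are hypothetical.

```python
import numpy as np

def prob_less_efficient(z_draws, a, b):
    """Estimate p(tau_a < tau_b | y) as the share of draws in which tau_a < tau_b."""
    tau = np.exp(-z_draws)
    return np.mean(tau[:, a] < tau[:, b])

# Four stand-in draws of (z_A, z_B); firm A is column 0, firm B is column 1.
z_draws = np.array([[0.5, 0.1],
                    [0.3, 0.2],
                    [0.1, 0.4],
                    [0.6, 0.2]])

print(prob_less_efficient(z_draws, 0, 1))  # 0.75
```

Because $\tau_i = \exp(-z_i)$ is decreasing in $z_i$, firm A is less efficient in exactly those draws where $z_A > z_B$, which is three of the four rows here.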

We find $p(\tau_{max} > \tau_{med} \mid y) = 0.89$, $p(\tau_{max} > \tau_{min} \mid y) = 1.00$ and $p(\tau_{med} > \tau_{min} \mid y) = 1.00$.

Thus, we can conclude that firms which are ranked far apart in terms of their efficiency estimates do truly differ in efficiency.

However, it is likely that, e.g., a researcher would be very uncertain about saying the 12th ranked firm is more efficient than the 13th ranked.

Figure 7.6 plots posteriors for $\tau_{min}$, $\tau_{med}$ and $\tau_{max}$.

Extensions/Applications

Panel data topics are popular right now in the econometrics literature.

Panel data models introduced in this chapter are useful for modeling heterogeneity of various sorts.

This is a crucial issue in many fields.

E.g. marketing has consumer heterogeneity.

In labour economics, individuals may vary in many ways that cannot be directly observed by the econometrician (e.g. they may differ in their returns to schooling, their value of leisure, their productivity, etc.).

Dynamic panel data models are very hot these days (i.e. T is large enough that you have to start worrying about time series and unit root issues).
