
Vector Autoregression

Jamie Monogan

University of Georgia

February 27, 2018


Objectives

By the end of these meetings, participants should be able to:

Explain the relationship between VAR and Granger causality.

Weigh the advantages and disadvantages of the VAR approach.

Specify and estimate a VAR model with OLS.

Interpret a VAR model in terms of causal tests and impulse response analysis.


Granger and VAR

The direct Granger test is a bivariate case of vector autoregression.

That is, it considers a vector (length 2) of all possible endogenous variables in a system of 2 variables, and it controls for expected autoregression in y (and x) by introducing lags of the dependent variables on the right-hand side.

The VAR setup is an extension of the idea to a system of k variables.

We still consider each of the k variables from the system as a function of their own lags and lags of the other k − 1 variables.

We still do causal and exogeneity testing with F-ratios.
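As an illustration, here is a minimal R sketch of the direct Granger test: regress y on its own lags with and without lags of x, and compare the two fits with an F-test. The data-generating process, series names, and lag length of 2 are all made up for the example.

## Direct Granger test as one equation of a bivariate VAR (a sketch;
## the simulated data below are hypothetical).
set.seed(42)
n <- 200
x <- as.numeric(arima.sim(list(ar = 0.5), n = n))
y <- as.numeric(stats::filter(0.3 * c(0, head(x, -1)) + rnorm(n), 0.4,
                              method = "recursive"))  # y depends on lagged x

Z <- embed(cbind(y, x), 3)              # column blocks: time t, t-1, t-2
d <- data.frame(y    = Z[, 1],
                y.l1 = Z[, 3], x.l1 = Z[, 4],
                y.l2 = Z[, 5], x.l2 = Z[, 6])

unrestricted <- lm(y ~ y.l1 + y.l2 + x.l1 + x.l2, data = d)
restricted   <- lm(y ~ y.l1 + y.l2, data = d)
anova(restricted, unrestricted)         # F-ratio: do lags of x Granger-cause y?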


The Attitude of VAR Modeling

Equally good theory published side by side in the best journals is strikingly contradictory. But all of it is developed and tested according to accepted econometric practice.

Maybe we have to step back from the presumption that it is possible to specify the interrelationships of a group of variables and say, instead: what really matters, the question above all other questions, is causal ordering.

We have no confidence in the evidence we bring to bear on this issue because it is so deeply embedded in assumptions about multivariate relationships that the data are not allowed to speak.

Let us instead assume that we really know nothing about the structure of a set of variables (which may even be unknowable) and proceed modestly to a means to let the data tell us.


The Controversy over VAR Modeling

VAR: Structural models, while desirable, are impossible for complex systems. "Reduced form" models telling us about causal ordering are the most that we can hope for. In fact, in a world of endogenous relationships, structural models may be horribly misspecified.

Anti-VAR: Structure—which is to say, theory—is all that matters, and VAR cannot produce structural estimates.


Problems of VAR Analysis

VAR models require the estimation of very large numbers of parameters (because they impose no theoretical restrictions).

They are therefore radically inefficient.

As in the Granger case, individual VAR coefficients should never be interpreted. (We estimate with OLS, so this point is well known.)

But VAR is useful as a theory-light way to let the data speak to questions of causal ordering, as well as subsidiary questions like required lag length.


The VAR Model

$$y_t = c + \sum_{s=1}^{L} B_s y_{t-s} + u_t$$

In effect, the model is written entirely in terms of "y" because several dependent variables are presumed to be jointly endogenous. (You do have the option of specifying some exogenous predictors as well.)

This is a system of equations, where there are as many equations asvariables, k.

$y_t$ is understood to be a vector (hence the name "vector autoregression") containing the value of each of the variables at time t. $B_s$ is a matrix of regression parameters, k by k. (Note: this notation differs from how "B" has been used in this course.)


Vector Notation

Note: Brandt & Williams use row-vector notation; Enders and I use column-vector notation.

Consider the case of a three-variable system.

We therefore wish to estimate the following model:

$$\begin{bmatrix} y_{1t} \\ y_{2t} \\ y_{3t} \end{bmatrix} =
\begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix} +
\sum_{s=1}^{L}
\begin{bmatrix}
\beta_{11}^{(s)} & \beta_{12}^{(s)} & \beta_{13}^{(s)} \\
\beta_{21}^{(s)} & \beta_{22}^{(s)} & \beta_{23}^{(s)} \\
\beta_{31}^{(s)} & \beta_{32}^{(s)} & \beta_{33}^{(s)}
\end{bmatrix}
\begin{bmatrix} y_{1,t-s} \\ y_{2,t-s} \\ y_{3,t-s} \end{bmatrix} +
\begin{bmatrix} u_{1t} \\ u_{2t} \\ u_{3t} \end{bmatrix}$$

Notice that this specification does not impose any disturbance covariances across equations, so we can estimate the system equation by equation with OLS.
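As a sketch of this in R, using the vars package and its bundled Canada data as a stand-in for a real application (the variable choices here are arbitrary):

library(vars)
data(Canada)                          # example data shipped with vars
dataSet <- Canada[, c("e", "prod", "rw")]
var.model <- VAR(dataSet, p = 2, type = "const")
summary(var.model)                    # one OLS summary per equation
var.model$varresult$e                 # each equation is an ordinary lm fit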


Determining Lag Length

$$E(u_t u_t') = \Sigma$$

To test restrictions such as lag specification: consider an unrestricted model, UR, with s lags and a restricted model, a proper subset, with s − 1 lags. Then:

$$\ell = (T - c)(\log|\Sigma_R| - \log|\Sigma_{UR}|)$$

ℓ is a likelihood ratio test statistic, distributed as $\chi^2$,

where $\log|\Sigma_R|$ is the log of the determinant of the error covariance of the restricted model and c = ks + 1 is a degrees-of-freedom correction, based on the number of variables times the number of lags in the unrestricted model.

You also may want to evaluate your specification by plotting predicted values against the true series.
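A minimal R sketch of this likelihood ratio test, assuming the hypothetical dataSet from the earlier example and comparing s = 2 against s − 1 = 1 lags. Note that both models must be fit over the same effective sample for the determinants to be comparable, hence the trimmed data for the restricted fit.

k  <- ncol(dataSet)
ur <- VAR(dataSet, p = 2, type = "const")        # unrestricted: s = 2
r  <- VAR(dataSet[-1, ], p = 1, type = "const")  # restricted, aligned sample
logdet <- function(m) {                          # log|Sigma| from a fitted VAR
  e <- resid(m)
  log(det(crossprod(e) / nrow(e)))
}
T.eff <- nrow(resid(ur))
c.dof <- k * 2 + 1                               # c = ks + 1
l.stat <- (T.eff - c.dof) * (logdet(r) - logdet(ur))
pchisq(l.stat, df = k^2, lower.tail = FALSE)     # k^2 coefficients are restricted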


The Moving Average Representation

Instead of a Vector Autoregression (VAR) representation of our model, how about a Vector Moving Average (VMA) representation?

Recall: A low order AR process ≈ MA(∞)

We use a low-order AR specification (remember, column notation):

$$y_t = c + B_1 y_{t-1} + B_2 y_{t-2} + e_t$$

$$y_t = c + (B_1 L + B_2 L^2) y_t + e_t$$

Hence, we can redefine our VAR model in terms of an MA(∞):

$$y_t = d + (I + C_1 L + C_2 L^2 + \cdots) e_t$$

where $(I + C_1 L + C_2 L^2 + \cdots) = (I - B_1 L - B_2 L^2 - \cdots - B_p L^p)^{-1}$.


Interpretation from the VMA model

The individual moving average coefficients are defined as:

$$\begin{aligned}
C_1 &= B_1 \\
C_2 &= B_1 C_1 + B_2 \\
C_3 &= B_1 C_2 + B_2 C_1 + B_3 \\
&\;\;\vdots \\
C_\ell &= B_1 C_{\ell-1} + B_2 C_{\ell-2} + \cdots + B_p C_{\ell-p}
\end{aligned}$$

Now we can estimate an empirical impulse response function from each of the innovation series to each of the variables in the system.

If we then shock the innovations, a process called "innovation accounting," we can observe the multivariate causal flow.
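A minimal R sketch of this recursion, using hypothetical VAR(2) coefficient matrices B1 and B2. (For a fitted model, the vars package's Phi() function returns these same matrices.)

B <- list(matrix(c(0.5, 0.2, 0.1, 0.4), 2, 2),  # hypothetical B1
          matrix(c(0.2, 0.0, 0.0, 0.1), 2, 2))  # hypothetical B2
p <- length(B)
k <- nrow(B[[1]])
vma.coefs <- function(B, n.ahead) {
  C <- vector("list", n.ahead)
  for (l in seq_len(n.ahead)) {                 # C_l = sum_s B_s C_{l-s}, C_0 = I
    C[[l]] <- matrix(0, k, k)
    for (s in seq_len(min(l, p))) {
      C.prev <- if (l == s) diag(k) else C[[l - s]]
      C[[l]] <- C[[l]] + B[[s]] %*% C.prev
    }
  }
  C
}
vma.coefs(B, 4)   # C1..C4: responses at each horizon to a one-unit shock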


Impulse Response Analysis

With impulse response analysis we ask: "For a one-unit pulse shock to variable A, what are the expected dynamic consequences in the system?"

The right-hand side of a VMA model consists of disturbances of various terms multiplied by coefficients.

Each MA coefficient gives us the expected impact on the LHS variable of a shock at a particular lag of a particular variable.

Simply plotting the coefficients will graphically display what we expect to happen over time.


Causal Testing

The clearest technique is to do F-tests on separate equations. (I would advise this.)

Language: null hypothesis of exogeneity or noncausality.

The "vars" package in R does a multi-equation F-test that asks if all coefficients for one variable are zero in the equations for all other variables. (Comparable to an F-test of coefficient equality in a seemingly unrelated regression.)

The advantage of the multi-equation F is that it evaluates whether a variable has any consequence in the whole system.

The disadvantage is that it does not speak to which other variables it causes.
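For instance, with the hypothetical var.model fit earlier, the multi-equation test in the vars package looks like this. (Separate-equation F-tests can be built the same way as in the direct Granger sketch above, with anova() on restricted and unrestricted fits of one equation.)

library(vars)
causality(var.model, cause = "e")
## $Granger -- H0: 'e' does not Granger-cause the remaining variables
## $Instant -- H0: no instantaneous causality between 'e' and the rest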


Testing for Serial Correlation

Visual Inspections

Residual plots
ACF / PACF
(Both are produced by default in R.)

Portmanteau tests (multivariate)

Breusch-Godfrey
Box-Ljung

$$Q_h = T \sum_{j=1}^{h} \operatorname{tr}\left(\Gamma_j'\, \Gamma_0^{-1}\, \Gamma_j\, \Gamma_0^{-1}\right)$$

where $Q_h \sim \chi^2(k^2(h - p))$, k is the number of endogenous variables, h is the number of lags for which autocorrelation is considered, and p is the order of the VAR model (i.e., the number of lags of each variable on the right-hand side). $\Gamma_j$ is the covariance matrix of residuals at time t with those at time t − j. We also might apply a small-sample correction.
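In R, the vars package implements these tests; a sketch, again assuming the fitted var.model from earlier:

library(vars)
serial.test(var.model, lags.pt = 12, type = "PT.asymptotic")  # portmanteau Q_h
serial.test(var.model, lags.pt = 12, type = "PT.adjusted")    # small-sample correction
serial.test(var.model, lags.bg = 5, type = "BG")              # Breusch-Godfrey LM test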


More on Lag Length

For a VAR(p) model with T observations, k variables, and |Σ| the determinant of the error covariance matrix:

$$AIC(p) = T \log|\Sigma| + 2(k^2 p + k)$$

$$BIC(p) = T \log|\Sigma| + \log(T)(k^2 p + k)$$

$$HQ(p) = T \log|\Sigma| + 2 \log(\log(T))(k^2 p + k)$$

These fit indices can be calculated for any model with a log-likelihood function.

They allow for a probabilistic view of model selection.
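The vars package computes these criteria across candidate lag lengths; a sketch with the hypothetical dataSet from earlier:

library(vars)
VARselect(dataSet, lag.max = 8, type = "const")
## $selection: the lag order picked by AIC, HQ, SC (BIC), and FPE
## $criteria:  the value of each criterion at each candidate lag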


Software

Stata

var varlist, lags(#) exog(list2)

R

"vars" library:

library(vars)
var.model <- VAR(dataSet, p = #, type = "const", exogen = vectorName)
plot(var.model)
causality(var.model, cause = "variable.name")
var.model.irf <- irf(var.model, impulse = "variable.name",
                     response = c("var1", "var2", "var3"))
plot(var.model.irf)


Homework

For Next Time

Reading:

Monogan. 2011. "Panel Data Analysis." In International Encyclopedia of Political Science.

Beck and Katz. 1995. "What to Do (and Not to Do) with Time-Series Cross-Section Data." American Political Science Review 89:634-647.

Answer question #4 from page 186 of Political Analysis Using R.

Suppose you estimated the following VAR model (the constants are zero). Write down the VMA model through two lags:

$$\begin{bmatrix} y_{1t} \\ y_{2t} \end{bmatrix} =
\begin{bmatrix} .5 & .6 \\ .7 & .8 \end{bmatrix}
\begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} +
\begin{bmatrix} .1 & .2 \\ .3 & .4 \end{bmatrix}
\begin{bmatrix} y_{1,t-2} \\ y_{2,t-2} \end{bmatrix} +
\begin{bmatrix} e_{1t} \\ e_{2t} \end{bmatrix}$$



Additional Material


Bayesian Vector Autoregression

Basic Introduction to Bayesian Methods

Likelihood-based inference assumes that population parameters are fixed, and the data are randomly observed given the parameters.

From the data, we estimate which parameters are most likely to have produced the data we have observed.

Bayesian inference assumes that the data are fixed, and the population parameters are random.

Consider: much of classical inference relies on repeated-sample theory. Could you repeat a sample of what you are studying?
If you are Gregor Mendel, then yes: you can find more pea pods.
If you are studying the time series of Bush's approval rating, then no.

So if the parameters are random, what is their distribution?

For θ the parameters and D the data, Bayes’ law tells us:

$$p(\theta|D) = \frac{p(\theta)\, p(D|\theta)}{p(D)}$$


Priors, Likelihoods, & Posteriors

Using this rule, we can define our posterior distribution with the prior distribution and likelihood function:

$$\pi(\theta|D) = \frac{p(\theta) L(\theta|D)}{\int_\Theta p(\theta) L(\theta|D)\, d\theta} \propto p(\theta) L(\theta|D)$$

A few approaches to priors (non-exhaustive list):

Flat
Conjugate
Elicited

π(θ|D) may be complex and hard to marginalize (with respect to each parameter).

Hence, we turn to MCMC to numerically give us the distribution of our posterior.

Short-term memory: useful because the chain will wander around values with the highest density.


Bayesian Vector Autoregression: Advantages

Imposes structure through priors: coefficients diminishing to zero.

Greater ability to account for unit roots.

Easier and more accurate assessment of uncertainty.

Fairer assumptions about data.

The Sims-Zha Model

$$q(A) \propto L(Y|A)\, \pi(a_0)\, \phi(a_+, \Psi)$$


Sims-Zha Priors

How do we define Ψ?

Diagonal elements define variance of VAR parameters:

$$\psi_{\ell,j,i} = \left(\frac{\lambda_0 \lambda_1}{\sigma_j\, \ell^{\lambda_3}}\right)^2$$

Since ℓ represents the lag length, a larger lag implies a smaller variance. (Parameters are converging to zero.)

λ0 speaks to overall parameter variance, λ1 to the standard deviation one lag out, and λ3 to the rate of decay.

Additional hyperparameters:

λ2 = 1; were it any different, a variable's own lags would carry different relative weight.
The variance of the constant is defined as (λ0λ4)².
The variance of exogenous variables is defined as (λ0λ5)².
µ5 & µ6 (not addressed in the reading): techniques for allowing cointegration.
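A minimal sketch of how these hyperparameters translate into prior variances, with made-up values for the λ's and for the residual scales σ_j (in practice the σ_j typically come from univariate AR fits):

lambda0 <- 1; lambda1 <- 0.2; lambda3 <- 1     # hypothetical hyperparameters
sigma <- c(0.9, 1.4, 0.6)                      # hypothetical sigma_j, j = 1, 2, 3
L <- 4                                         # lag length
psi <- outer(seq_len(L), sigma,
             function(l, s) (lambda0 * lambda1 / (s * l^lambda3))^2)
round(psi, 5)   # row l, column j: the prior variance shrinks as the lag grows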


Uncertainty in Impulse Response Analysis

Typical form for a confidence band: $c_{ij}(t) \pm \delta_{ij}(t)$

We can easily get an estimate of $c_{ij}(t)$ from a frequentist perspective; δ is less straightforward.

With Bayesian methods, we can sample from the posteriors of ourestimates.

Gaussian approximation: $c_{ij}(t) \pm z_\alpha \sigma_{ij}(t)$ (σ from the posterior of c)
Pointwise quantiles: $[c_{ij,\alpha/2}(t),\, c_{ij,1-\alpha/2}(t)]$
Likelihood-based eigenvector: $[c_{ij} + \gamma_{k,\text{low}},\, c_{ij} + \gamma_{k,\text{high}}]$ (accounts for serial correlation)
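A minimal sketch of the pointwise-quantile approach, using a made-up matrix of posterior impulse-response draws (rows are draws, columns are horizons) in place of real MCMC output:

set.seed(1)
h <- 0:20
draws <- matrix(rnorm(1000 * length(h),
                      mean = rep(0.8^h, each = 1000), sd = 0.1),
                nrow = 1000)                       # fake posterior IRF draws
band <- apply(draws, 2, quantile, probs = c(0.05, 0.5, 0.95))
matplot(h, t(band), type = "l", lty = c(2, 1, 2), col = 1,
        xlab = "horizon", ylab = "response")       # median with a 90% band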
