27
Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2 Massimiliano Marcellino 3 Norges Bank, 3 October 2017 1 Queen Mary, University of London 2 Federal Reserve Bank of Cleveland 3 Bocconi University and CEPR Carriero, Clark, Marcellino () Large VARs September 2017 1 / 22

Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

  • Upload
    others

  • View
    10

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

Large Vector Autoregressions with stochastic volatilityand non-conjugate priors

Andrea Carriero 1 Todd E. Clark 2 Massimiliano Marcellino 3

Norges Bank, 3 October 2017

1Queen Mary, University of London2Federal Reserve Bank of Cleveland3Bocconi University and CEPRCarriero, Clark, Marcellino () Large VARs September 2017 1 / 22

Page 2: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

Introduction

Introduction - two ingredients

Two main ingredients are key for the specification of a good Vector Autoregressivemodel (VAR) for forecasting and structural analysis of macroeconomic data:

A large cross section. Banbura, Giannone, and Reichlin (2010), Carriero, Clark,and Marcellino (2015), Giannone, Lenza, and Primiceri (2015) and Koop (2013)

Time variation in the volatilities. Clark (2011), Clark and Ravazzolo (2015),Cogley and Sargent (2005), D’Agostino, Gambetti and Giannone (2013), andPrimiceri (2005)

There are no papers which jointly allow for both time variation and large datasets

Carriero, Clark, Marcellino () Large VARs September 2017 2 / 22

Page 3: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

Introduction

Introduction - heteroskedasticity

The reason lies in the structure of the likelihood function

Homoskedastic VARs are SUR models with the same set of regressors in eachequation −→ Kronecker structure in the likelihood−→ OLS equation by equation

Equation-specific stochastic volatility breaks this symmetry because each equationis driven by a different volatility

The system would need to be vectorised, and the conditional posterior involves

manipulation of a matrix of dimension pN2 (N=number of variables, p=number oflags)

The computational complexity is therefore N23= N6

Carriero, Clark, Marcellino () Large VARs September 2017 3 / 22

Page 4: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

Introduction

Introduction - asymmetric priors

In a Bayesian framework, simmetry is not only needed in the likelihood, but also in

the prior

Kronecker structure in the likelihood+Kronecker structure in the prior= Kronecker

structure in the posterior

For example, the VAR estimated by Banbura, Giannone, and Reichlin (2010) is aVAR with 130 variables, but in order to make this estimation possible one needs to

assume:

(i) Homoskedasticity of the disturbances

(ii) A specific structure for the prior

Without either (i) or (ii) the system would need to be vectorised prior to estimation

Carriero, Clark, Marcellino () Large VARs September 2017 4 / 22

Page 5: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

Introduction

The problem

Consider the VAR of a N-dimensional vector yt :

yt = Π(L)yt−1 + vt ; vt ∼ iid N(0,Σt ) (1)

Define Xt = [1, y ′t−1, ..., y′t−p ]

′ and Π = [Π0 |Π1 |...|Πp ]

In general we have the posterior vec(Π)|Σ, y ∼ N(vec(µΠ),ΩΠ) with posteriorprecision:

Ω−1Π = Ω−1ΠPrior

+T

∑t=1

(Σ−1t ⊗ XtX ′t )Likelihood

(2)

The precision matrix Ω−1Π is of size N(Np + 1). Its manipulation requires(pN2)3 = O(N6) elementary operations

For N very large modern computers (laptops/desktops) can’t even store such amatrix in RAM (e.g. N = 125 needs 330 GB of RAM).

Carriero, Clark, Marcellino () Large VARs September 2017 5 / 22

Page 6: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

Introduction

The usual solution

In general we have the posterior vec(Π)|Σ, y ∼ N(vec(µΠ),ΩΠ) with

Ω−1Π = Ω−1ΠPrior

+T

∑t=1

(Σ−1t ⊗ XtX ′t )Likelihood

(2)

Now assume that

(i) Σt = Σ (homoskedasticity)

(ii) ΩΠ = Σ⊗Ω0 (conjugate prior)

Ω−1Π = Ω−1Π︸︷︷︸Σ−1⊗Ω−10

+T

∑t=1(Σ−1t︸︷︷︸

Σ−1

⊗ XtX ′t ) = Σ−1 ⊗(

Ω−10 +T

∑t=1

XtX ′t

), (3)

and the two terms can be manipulated separately, reducing complexity by O(N3)

Classical homoskedastic VARs can be estimated equation by equation.

Carriero, Clark, Marcellino () Large VARs September 2017 6 / 22

Page 7: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

Introduction

Problems with the the usual solution

The Natural-conjugate homoskedastic approach allows to use large datasets, but it hasimportant limitations:

It imposes homoskedasticity, against the overwhelming evidence in macroeconomicand financial data

The prior structure Σ⊗Ω0 is restrictive (Rothemberg (1963), Sims and Zha

(1998))

It prevents any asymmetry in the prior across equations, because the

coeffi cients of each equation feature the same prior variance Ω0 (up to a scale

factor given by the elements of Σ).It has the unappealing consequence that prior beliefs must be correlated across

equations, with a correlation structure proportional to that of the shocks (as

described by Σ).

Carriero, Clark, Marcellino () Large VARs September 2017 7 / 22

Page 8: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

Introduction

A new algorithm

In this paper we propose a new algorithm that makes possible to use:

A heteroskedastic model

The more general and less restrictive independent Normal - Inverse Wishart

(and Normal-diffuse) prior

Our procedure is based on a simple factorization of the likelihood, which allows todraw the VAR coeffi cients equation by equation

This reduces the computational complexity from N6 to N4.

Our new algorithm is very simple and can be easily inserted in any pre-existingalgorithm for estimation of BVAR models.

Carriero, Clark, Marcellino () Large VARs September 2017 8 / 22

Page 9: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

Introduction

The Model

Consider the following VAR model for a N-dimensional yt with stochastic volatility:

yt = Π0 +Π(L)yt−1 + vt ; (1)

vt = A−1Λ0.5t εt , εt ∼ iid N(0, IN ) (2)

where Λt is a diagonal matrix with generic j-th element hj ,t and A−1 is a lowertriangular matrix with ones on its main diagonal.

The bottleneck is drawing vec(Π)|A,ΛT , yT ∼ N(vec(µΠ),ΩΠ); To obtain a draw

one needs to i) invert

Ω−1Π = Ω−1ΠPrior

+T

∑t=1

(Σ−1t ⊗ XtX ′t )Likelihood

(3)

ii) compute its Cholesky factor and iii) multiply the Cholesky factor by a random

vector

Each of the above operations is of complexity N6

Carriero, Clark, Marcellino () Large VARs September 2017 9 / 22

Page 10: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

Introduction

An algorithm for large VARs

Consider again the decomposition vt = A−1Λ0.5t εt :v1,tv2,t...vN ,t

=

1 0 ... 0a∗2,1 1 ...... 1 0a∗N ,1 ... a∗N ,N−1 1

h0.51,t 0 ... 00 h0.52,t ...... ... 00 ... 0 h0.5N ,t

ε1,tε2,t...

εN ,t

,where a∗j ,i denotes the generic element of the matrix A

−1 which is available underknowledge of A.

Carriero, Clark, Marcellino () Large VARs September 2017 10 / 22

Page 11: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

Introduction

An algorithm for large VARs

The VAR can be written as:

y1,t = π(0)1 +

N

∑i=1

p

∑l=1

π(i )1,l yi ,t−l + h

0.51,t ε1,t

y2,t = π(0)2 +

N

∑i=1

p

∑l=1

π(i )2,l yi ,t−l + a

∗2,1h

0.51,t ε1,t + h

0.52,t ε2,t

...

yN ,t = π(0)N +

N

∑i=1

p

∑l=1

π(i )N ,l yi ,t−l + a

∗N ,1h

0.51,t ε1,t + · · ·+ a∗N ,N−1h0.5N−1,t εN−1,t + h0.5N ,t εN ,t ,

with the generic equation for variable j :

yj ,t − (a∗j ,1h0.51,t ε1,t + ...+ a∗j ,,j−1h

0.5j−1,t εj−1,t )︸ ︷︷ ︸

y ∗j ,t

= π(0)j +

N

∑i=1

p

∑l=1

π(i )j ,l yi ,t−l + hj ,t εj ,t . (4)

When drawing the coeffi cients of equation j the term y ∗j ,t is known, since it is given bythe difference between the dependent variable of that equation and the realized residualsof all the previous j − 1 equations. Hence (4) is a standard generalized linear regressionmodel with i.i.d. Gaussian disturbances.

Carriero, Clark, Marcellino () Large VARs September 2017 11 / 22

Page 12: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

Introduction

An algorithm for large VARs

The full conditional posterior distribution of the conditional mean coeffi cients can befactorized as:

p(Π|A,ΛT , y ) = p(π(N )|π(N−1),π(N−2), . . . ,π(1),A,ΛT , y )

×p(π(N−1)|π(N−2), . . . ,π(1),A,ΛT , y )...

×p(π(1)|A,ΛT , y ),and one can draw the coeffi cients in Π in separate blocks:

Πj|Π1:j−1,A,ΛT , y ∼ N(µΠj |1:j−1 ,ΩΠj |1:j−1 )

with

µΠj |1:j−1 = ΩΠj |1:j−1

T

∑t=1

Xj ,th−1j ,t y

∗′j ,t +Ω−1

Πj |1:j−1µΠj |1:j−1

Ω−1Πj |1:j−1 = Ω−1Πj |1:j−1 +

T

∑t=1

Xj ,th−1j ,t X

′j ,t ,

where µΠj |1:j−1 and ΩΠj |1:j−1 are moments of Πj|Π1:j−1 ∼ N(µ

Πj |1:j−1 ,ΩΠj |1:j−1 )

Carriero, Clark, Marcellino () Large VARs September 2017 12 / 22

Page 13: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

Introduction

An algorithm for large VARs

The conditional posterior of Π obtained is the same as the one from thesystem-wide algorithm

The algorithm will produce draws numerically identical to those of thesystem-wide sampler

This is true regardelss of the ordering, which is irrelevant to the conditionalposterior of Π

The total computational complexity of this estimation algorithm is O(N4), with a

gain of N2.

Uses equations with at most Np + 1 regressors, and the correlation across

equations typical of SUR models is implicitly accounted for by the factorization

The dimension of the posterior variance matrix Ω−1Πj is (Np + 1), which

means that its manipulation only involves operations of order O(N3).

Carriero, Clark, Marcellino () Large VARs September 2017 13 / 22

Page 14: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

A numerical comparison of the estimation methods

Computational complexity and speed of simulation

time for producing 10 draws as a function of N

Size of the cross­section (N)0 1 2 3 4 5 6 7 8 9 10

Sec

onds

0

2

4

6

8

10

12

14

16

18time for producing 10 draws as function of N

System wide agorithm

Triangular algorithm

Size of the cross­section (N)0 1 2 3 4 5 6 7 8 9 10

Num

ber o

f ele

men

tary

ope

ratio

ns

0

10

20

30

40

50

60

70

80

90

100Theoretical and Actual difference in computational complexity

Actual difference

Theoretical difference

Carriero, Clark, Marcellino () Large VARs September 2017 14 / 22

Page 15: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

A numerical comparison of the estimation methods

Computational complexity and speed of simulation

time for producing 10 draws as a function of N - log scale

Size of the cross­section (N)0 5 10 15 20 25 30 35 40

Sec

onds

10 ­1

10 0

10 1

10 2

10 3

10 4 time for producing 10 draws as function of N

System wide algorithm

Triangular algorithm

Size of the cross­section (N)0 5 10 15 20 25 30 35 40

Num

ber o

f ele

men

tary

ope

ratio

ns

10 ­1

10 0

10 1

10 2

10 3

10 4 Theoretical and Actual difference in computational complexity

Actual differenceTheoretical dif ference

Carriero, Clark, Marcellino () Large VARs September 2017 15 / 22

Page 16: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

A numerical comparison of the estimation methods

Convergence and mixing

Regardless of the power of the computers used to perform the simulation thetriangular algorithm will always produce many more draws than the traditional

system-wide algorithm in a given unit of time.

This has important consequences in terms of producing draws with good mixingand convergence properties.

The triangular algorithm can produce draws many times closer to i.i.d. samplingin the same amount of time.

These computational and storage gains increase quadratically with the system size

Carriero, Clark, Marcellino () Large VARs September 2017 16 / 22

Page 17: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

A numerical comparison of the estimation methods

Convergence and mixing

Ineffi ciency factors= distance from i.i.d sampling: ideally should be around 1.

­5 0 5 10 15 20 250

0.1

0.2

0.3

0.4

0.5

0.6

0.7Conditinal mean parameters, system­wide algorithm

­0 .5 0 0.5 1 1.5 2 2.5 3 3.50

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8Conditinal mean parameters, triangular algorithm

­10 0 10 20 30 40 50 60 700

0.02

0.04

0.06

0.08

0.1

0.12

0.14

0.16

0.18

0.2Covariances, system­wide algorithm

0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 20

0.5

1

1.5

2

2.5

3Covariances, triangular algorithm

2 4 6 8 10 12 14 16 18 200

0.02

0.04

0.06

0.08

0.1

0.12

0.14

0.16

0.18

0.2Volatility factors (averaged across time), system­wide algorithm

0.7 0.75 0.8 0.85 0.9 0.95 1 1.050

2

4

6

8

10

12

14

16

18

20Volatility factors (averaged across time), triangular algorithm

System­wide algorithm, 5000 draws.

­20 0 20 40 60 80 100 120 140 160 1800

0.005

0.01

0.015

0.02

0.025

0.03Volatility innovation variance, system­wide algorithm

Triangular algorithm results are based on 1305000 draws with skip­sampling of 261, producing an effective sample of 5000 draws.

­0 .5 0 0.5 1 1.5 2 2.5 3 3.50

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6Volatility innovation variance, triangular algorithm

Carriero, Clark, Marcellino () Large VARs September 2017 17 / 22

Page 18: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

A large structural VAR with drifting volatilities

Empirical applications

As an illustration we estimate a VAR with stochastic volatilities, using 13 lags and a

cross-section of 125 variables from FRED-MD

For a model of this size the system-wide algorithm would have a covariance matrix

of the coeffi cients of dimension 203250, which would require about 330 GB of RAM(2032502 × 8/109).

Our estimation algorithm can produce 5000 draws in just above 7 hours on a 3.5

GHz Intel Core i7.

We find that:

The variance of the shocks was clearly unstable over time

There is a factor structure in the volatilities

The combined use of both time variation in volatilities and a large data-set

improves point and density forecasts, more that what these two ingredients do

if used separately.

Carriero, Clark, Marcellino () Large VARs September 2017 18 / 22

Page 19: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2
Page 20: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2
Page 21: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 105 110 115 120 125­0.4

­0.2

0

0.2

0.4

Var explained 73.2837%

0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 105 110 115 120 125­0.1

0

0.1

0.2

0.3

0.4

0.5

Var explained 19.0428%

0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 105 110 115 120 125­0.4

­0.2

0

0.2

0.4

0.6

Var explained 2.6287%

0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 105 110 115 120 125­0.4

­0.2

0

0.2

0.4

0.6

Var explained 1.6826%

0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 105 110 115 120 125­0.3­0.2­0.1

00.10.20.30.4

Var explained 0.52826%

FFRNB reserves

CES1021000001

PCEPIHoursRPI

interest rates, exchange rates, andfinancial indicators

Monetaryaggregates

Real variables PricesSurveys

Figure 11: Principal components loadings of the variance-covariance of the volatilities (matrix ).

PCA of the variance matrix of the shocks to volatilities

Page 22: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

heteroskedast ic2 2.5 3 3.5

hom

osch

edas

tic

2

2.5

3

3.5

RPI­ SCORE

heteroskedast ic3.7 3.8 3.9

hom

osch

edas

tic3.65

3.7

3.75

3.8

3.85

3.9

3.95

DPCERA3M086SBEA­ SCORE

heteroskedast ic3 3.1 3.2

hom

osch

edas

tic

2.95

3

3.05

3.1

3.15

3.2

3.25

CMRMTSPLx­ SCORE

heteroskedast ic3.3 3.4 3.5

hom

osch

edas

tic

3.3

3.35

3.4

3.45

3.5

3.55

INDPRO­ SCORE

heteroskedast ic­2.5 ­2 ­1.5 ­1

hom

osch

edas

tic

­2.5

­2

­1.5

­1

CUMFNS­ SCORE

heteroskedast ic­1 ­0.5 0

hom

osch

edas

tic

­1.2­1

­0.8­0.6­0.4­0.2

00.20.4

UNRATE­ SCORE

heteroskedast ic4.6 4.7 4.8 4.9 5

hom

osch

edas

tic

4.6

4.7

4.8

4.9

5

PAYEMS­ SCORE

heteroskedast ic­1 ­0.8 ­0.6 ­0.4 ­0.2

hom

osch

edas

tic

­1

­0.8

­0.6

­0.4

­0.2

CES0600000007­ SCORE

heteroskedast ic4.2 4.3 4.4 4.5

hom

osch

edas

tic

4.2

4.25

4.3

4.35

4.4

4.45

4.5

CES0600000008­ SCORE

heteroskedast ic3.4 3.5 3.6 3.7 3.8

hom

osch

edas

tic

3.43.45

3.53.55

3.63.65

3.73.75

3.8

PPIFGS­ SCORE

heteroskedast ic2 2.05 2.1

hom

osch

edas

tic

1.96

1.98

2

2.02

2.04

2.06

2.08

2.1

PPICMM­ SCORE

heteroskedast ic4.5 4.6 4.7 4.8 4.9

hom

osch

edas

tic

4.54.554.6

4.654.7

4.754.8

4.854.9

PCEPI­ SCORE

heteroskedast ic2.5 3 3.5 4 4.5

hom

osch

edas

tic

2.5

3

3.5

4

4.5

FEDFUNDS­ SCORE

heteroskedast ic0 0.5 1

hom

osch

edas

tic

­0.2

0

0.2

0.4

0.6

0.8

1

HOUST­ SCORE

heteroskedast ic1.75 1.8 1.85 1.9

hom

osch

edas

tic

1.75

1.8

1.85

1.9

S&P 500­ SCORE

heteroskedast ic2.25 2.3 2.35 2.4

hom

osch

edas

tic

2.25

2.3

2.35

2.4

EXUSUKx­ SCORE

heteroskedast ic­1.5 ­1 ­0.5 0

hom

osch

edas

tic

­1.5

­1

­0.5

0

T1YFFM­ SCORE

heteroskedast ic­2 ­1.5 ­1 ­0.5

hom

osch

edas

tic

­2

­1.5

­1

­0.5

T10YFFM­ SCORE

heteroskedast ic­2 ­1.5 ­1 ­0.5

hom

osch

edas

tic

­2

­1.5

­1

­0.5

BAAFFM­ SCORE

heteroskedast ic­3.6 ­3.4 ­3.2 ­3 ­2.8

hom

osch

edas

tic

­3.6

­3.4

­3.2

­3

­2.8

NAPMNOI­ SCORE

Figure 17: Comparison of density forecast accuracy. Each panel describes a different variable. The

x axis reports the (log) density score obtained using the BVAR with stochastic volatility (het-

eroschedastic), the y axis reports the (log) density score obtained using the homoschedastic BVAR.

Each point corresponds to a different forecast horizon from 1 to 12 step-ahead.

Score comparison: homoskedastic model (y axis) vs heteroskedastic model (x axis)

Page 23: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

heteroskedastic 10­35.7 5.75 5.8 5.85

hom

osch

edas

tic

10­3

5.7

5.75

5.8

5.85

RPI­ RMSFE

heteroskedastic 10 ­35 5.1 5.2

hom

osch

edas

tic

10 ­3

5

5.05

5.1

5.15

5.2

DPCERA3M086SBEA­ RMSFE

heteroskedastic 10 ­39.4 9.6 9.8

hom

osch

edas

tic

10­3

9.3

9.4

9.5

9.6

9.7

9.8

CMRMTSPLx­ RMSFE

heteroskedastic 10 ­36.4 6.6 6.8 7 7.2 7.4

hom

osch

edas

tic

10 ­3

6.4

6.6

6.8

7

7.2

7.4

INDPRO­ RMSFE

heteroskedastic1 2 3

hom

osch

edas

tic

1

1.5

2

2.5

3

CUMFNS­ RMSFE

heteroskedastic0.2 0.4 0.6 0.8

hom

osch

edas

tic

0.2

0.3

0.4

0.5

0.6

0.7

0.8

UNRATE­ RMSFE

heteroskedastic 10 ­31.5 1.6 1.7 1.8 1.9

hom

osch

edas

tic

10­3

1.5

1.6

1.7

1.8

1.9

PAYEMS­ RMSFE

heteroskedastic0.3 0.4 0.5

hom

osch

edas

tic

0.3

0.35

0.4

0.45

0.5

0.55

CES0600000007­ RMSFE

heteroskedastic 10­32.6 2.7 2.8 2.9

hom

osch

edas

tic

10 ­3

2.62.652.7

2.752.8

2.852.9

2.95

CES0600000008­ RMSFE

heteroskedastic 10 ­35.6 5.8 6 6.2

hom

osch

edas

tic

10 ­3

5.65.75.85.9

66.16.26.3

PPIFGS­ RMSFE

heteroskedastic0.03 0.031 0.032

hom

osch

edas

tic

0.03

0.0305

0.031

0.0315

0.032

0.0325

PPICMM­ RMSFE

heteroskedastic 10 ­32 2.2 2.4

hom

osch

edas

tic

10 ­3

1.9

2

2.1

2.2

2.3

2.4

2.5

PCEPI­ RMSFE

heteroskedastic0.01 0.015 0.02

hom

osch

edas

tic

0.01

0.015

0.02

FEDFUNDS­ RMSFE

heteroskedastic0.1 0.15 0.2 0.25

hom

osch

edas

tic

0.1

0.15

0.2

0.25

HOUST­ RMSFE

heteroskedastic0.03680.0370.03720.03740.0376

hom

osch

edas

tic

0.0368

0.037

0.0372

0.0374

0.0376

S&P 500­ RMSFE

heteroskedastic0.023 0.0235 0.024

hom

osch

edas

tic

0.0228

0.023

0.0232

0.0234

0.0236

0.0238

0.024

EXUSUKx­ RMSFE

heteroskedastic0.5 0.6 0.7 0.8 0.9

hom

osch

edas

tic

0.5

0.6

0.7

0.8

0.9

T1YFFM­ RMSFE

heteroskedastic0.6 0.8 1 1.2 1.4 1.6

hom

osch

edas

tic

0.6

0.8

1

1.2

1.4

1.6

T10YFFM­ RMSFE

heteroskedastic0.6 0.8 1 1.2 1.4 1.6 1.8

hom

osch

edas

tic

0.6

0.8

1

1.2

1.4

1.6

1.8

BAAFFM­ RMSFE

heteroskedastic4 5 6 7

hom

osch

edas

tic

44.5

55.5

66.5

77.5

NAPMNOI­ RMSFE

Figure 16: Comparison of point forecast accuracy. Each panel describes a different variable. The x

axis reports the RMSFE obtained using the BVAR with stochastic volatility (heteroschedastic), the

y axis reports the RMSFE obtained using the homoschedastic BVAR. Each point corresponds to a

different forecast horizon from 1 to 12 step-ahead (in most cases, a higher RMSFE corresponds to a

longer forecast horizon).

RMSFE comparison: homoskedastic model (y axis) vs heteroskedastic model (x axis)

Page 24: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

Conclusions

Conclusions

The assumptions of conjugacy and homoskedasticity in a VARs are hardly

defendable, but a more general specification is only manageable with a smallcross-section.

We have proposed a new estimation method VARs with non-conjugate priors anddrifting volatilities which can be applied with large models

The method is based on a straightforward triangularization of the system, and it is

very simple to implement.

Indeed, if a researcher already has algorithms to produce draws from a VAR with anindependent N-IW prior and stochastic volatility, only a single needs to be slightlymodified with a few lines of code.

Given its simplicity and the advantages in terms of speed, mixing, and convergence,we argue that the proposed algorithm should be preferred in empirical applications,especially those involving large datasets.

Carriero, Clark, Marcellino () Large VARs September 2017 21 / 22

Page 25: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

Conclusions

Prior dependence

We assumed that the prior variance was diagonal. This can be relaxed.

With a prior dependent across equations, the general form of the posterior can be

obtained using the triangularization also on the joint prior distribution, and is:

Πj|Π1:j−1,A,ΛT , y ∼ N(µΠj |1:j−1 ,ΩΠj |1:j−1 )

with

µΠj |1:j−1 = ΩΠj |1:j−1

T

∑t=1

Xj ,th−1j ,t y

∗′j ,t +Ω−1

Πj |1:j−1µΠj |1:j−1

Ω−1Πj |1:j−1 = Ω−1Πj |1:j−1 +

T

∑t=1

Xj ,th−1j ,t X

′j ,t ,

where µΠj |1:j−1 and ΩΠj |1:j−1 are moments of

Πj|Π1:j−1 ∼ N(µΠj |1:j−1 ,ΩΠj |1:j−1 ), i.e. the conditional priors implied by the

joint prior specification.

The moments of Πj|Π1:j−1 can be found recursively from the joint prior

Carriero, Clark, Marcellino () Large VARs September 2017 22 / 22

Page 26: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

Forecasting

Model size, stochastic volatility, and forecasting

Pseudo out of sample exercise performed recursively, starting with the estimationsample 1960:3 to 1970:2 and ending with 1960:3 to 2014:5.

We consider four models.

1 A small homoskedastic VAR including the growth rate of industrial production

(∆ ln IP), the inflation rate based on consumption expenditures (∆ lnPECEPI )and the effective Federal Funds Rate (FFR).

2 A large (20 variables) homoskedastic VAR along the lines of Carriero, Clark,

and Marcellino (2015), Giannone, Lenza, and Primiceri (2015), and Koop

(2013).3 A small VAR with time variation in volatilities along the lines of Clark (2011),

Cogley and Sargent (2005) and Primiceri (2005).4 The fourth model includes both time variation in the volatilities and a large

(20 variables) information set.

Carriero, Clark, Marcellino () Large VARs September 2017 19 / 22

Page 27: Large Vector Autoregressions with stochastic volatility and …...Large Vector Autoregressions with stochastic volatility and non-conjugate priors Andrea Carriero 1 Todd E. Clark 2

Forecasting

Forecasting

Direct effects:

The use of a larger dataset improves point forecasts via a better specification

of the conditional means.

The inclusion of time variation in volatilities improves density forecasts via a

better modelling of error variances,

Interactions:

A better point forecast improves the density forecast as well, by centering the

predictive density around a more reliable mean

Time varying volatilities improve the point forecasts at longer horizons -

because the heteroskedastic model will provide more effi cient estimates

(through a GLS argument) and a therefore a better characterization of the

predictive densities

Carriero, Clark, Marcellino () Large VARs September 2017 20 / 22