Federal Reserve Bank of Minneapolis
Research Department
Revised June 2007
Lecture Notes: Quantitative Methods
Ellen R. McGrattan
Table of Contents
1. Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1. Solving Nonlinear Systems of Equations . . . . . . . . . . . . . . . 1
1.1.1. The Bisection Method . . . . . . . . . . . . . . . . . . . . 1
1.1.2. The Newton-Raphson Method . . . . . . . . . . . . . . . . 2
1.1.3. The Secant Method . . . . . . . . . . . . . . . . . . . . . 3
1.2. Numerical Differentiation . . . . . . . . . . . . . . . . . . . . . 3
1.3. Numerical integration . . . . . . . . . . . . . . . . . . . . . . . 3
1.3.1. Choosing Quadrature Weights . . . . . . . . . . . . . . . . 4
1.3.2. Trapezoidal Rule . . . . . . . . . . . . . . . . . . . . . . 5
1.3.3. Simpson’s Rule . . . . . . . . . . . . . . . . . . . . . . . 5
1.3.4. Gauss-Legendre Quadrature . . . . . . . . . . . . . . . . . 6
2. Dynamic Programming . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.1. Discrete-time Dynamic Programming . . . . . . . . . . . . . . . . 8
2.2. Continuous-time Dynamic Programming . . . . . . . . . . . . . . . 10
3. Computing Equilibria in Near-Linear Economies . . . . . . . . . . . . . . 13
3.1. Linearizing and Log-linearizing . . . . . . . . . . . . . . . . . . . 13
3.2. Mapping the Problem to a Standard LQ Problem . . . . . . . . . . . 14
3.3. A Variant on Vaughan’s Method . . . . . . . . . . . . . . . . . . 19
4. Computing Equilibria in Nonlinear Economies . . . . . . . . . . . . . . . 23
4.1. The Method of Parameterized Expectations . . . . . . . . . . . . . 23
4.2. Weighted Residual Methods . . . . . . . . . . . . . . . . . . . . 25
4.2.1. The General Procedure . . . . . . . . . . . . . . . . . . . 26
4.2.2. Applied to the Deterministic Growth Model . . . . . . . . . . 29
4.2.3. Applied to the Stochastic Growth Model . . . . . . . . . . . . 37
5. Maximum Likelihood Estimation . . . . . . . . . . . . . . . . . . . . . 42
5.1. Vector autoregressive representation . . . . . . . . . . . . . . . . . 44
5.2. The Likelihood Function . . . . . . . . . . . . . . . . . . . . . . 44
6. A Prototype Real Business Cycle Model . . . . . . . . . . . . . . . . . . 47
6.1. A Version of the Model with AR(1) Technology . . . . . . . . . . . . 47
6.1.1. Maximization problems . . . . . . . . . . . . . . . . . . . 47
6.1.2. First-order conditions . . . . . . . . . . . . . . . . . . . . 48
6.1.3. Log-linear computation . . . . . . . . . . . . . . . . . . . 49
6.2. A Version of the Model with Random Walk Technology . . . . . . . . 51
6.2.1. Maximization problems . . . . . . . . . . . . . . . . . . . 51
6.2.2. First-order conditions . . . . . . . . . . . . . . . . . . . . 51
6.2.3. Log-linear computation . . . . . . . . . . . . . . . . . . . 51
6.3. MLE Estimation . . . . . . . . . . . . . . . . . . . . . . . . . 52
6.3.1. State-space form in the general case . . . . . . . . . . . . . . 52
6.3.2. Log-likelihood function . . . . . . . . . . . . . . . . . . . . 52
6.3.3. MLE in the Benchmark Case . . . . . . . . . . . . . . . . . 53
6.3.4. MLE in the Random Walk Case . . . . . . . . . . . . . . . . 54
6.4. Simulating Data from the Models . . . . . . . . . . . . . . . . . . 56
7. A Prototype Sticky Price Model . . . . . . . . . . . . . . . . . . . . . 57
7.1. Model Economy . . . . . . . . . . . . . . . . . . . . . . . . . 57
7.2. Computing an Equilibrium . . . . . . . . . . . . . . . . . . . . . 60
8. Business Cycle Accounting . . . . . . . . . . . . . . . . . . . . . . . 66
8.1. The Prototype Model with Time-Varying Wedges . . . . . . . . . . . 66
8.2. Mapping Frictions to Wedges . . . . . . . . . . . . . . . . . . . . 68
8.2.1. Efficiency Wedges . . . . . . . . . . . . . . . . . . . . . . 68
8.2.2. Labor Wedges . . . . . . . . . . . . . . . . . . . . . . . 72
8.3. The Accounting Procedure . . . . . . . . . . . . . . . . . . . . . 76
8.3.1. The Accounting Procedure at a Conceptual Level . . . . . . . . 76
8.3.2. A Markovian Implementation . . . . . . . . . . . . . . . . . 77
9. Structural VARs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
9.1. A Version of the RBC Model . . . . . . . . . . . . . . . . . . . 80
9.1.1. Maximization problems . . . . . . . . . . . . . . . . . . . 80
9.1.2. First-order conditions . . . . . . . . . . . . . . . . . . . . 81
9.1.3. Log-linear computation . . . . . . . . . . . . . . . . . . . 82
9.2. VARs and the 2-Shock Version of the Model . . . . . . . . . . . . . 85
9.2.1. The Decision Functions . . . . . . . . . . . . . . . . . . . 85
9.2.2. The Model’s Moving Average . . . . . . . . . . . . . . . . . 88
9.2.3. Special Property of the D’s . . . . . . . . . . . . . . . . . . 89
9.2.4. VAR Coefficients . . . . . . . . . . . . . . . . . . . . . . 89
9.2.5. Proposition 1: Model has infinite-order VAR . . . . . . . . . . 90
9.2.6. Blanchard-Quah Identification . . . . . . . . . . . . . . . . 92
9.2.6.1. Sign convention on A0(1, 1) . . . . . . . . . . . . . . . 94
9.2.6.2. Sign convention on A(1, 1) . . . . . . . . . . . . . . . 94
9.2.6.3. Full solution . . . . . . . . . . . . . . . . . . . . . 94
9.2.6.4. Cholesky decomposition . . . . . . . . . . . . . . . . 95
9.2.7. Proposition 2: OLS Results . . . . . . . . . . . . . . . . . . 95
9.2.8. The Propositions for Two Special Cases . . . . . . . . . . . . 98
9.2.8.1. Proposition 3a: No capital in the model . . . . . . . . . 98
9.2.8.2. Proposition 3b: Only one shock . . . . . . . . . . . . . 99
9.3. VARs and 3-Shock Versions of the Model . . . . . . . . . . . . . . 102
9.3.1. Adding an Investment Tax Shock . . . . . . . . . . . . . . . 102
9.3.1.1. The Model’s Moving Average . . . . . . . . . . . . . . 106
9.3.1.2. Special Property of the D’s . . . . . . . . . . . . . . . 107
9.3.1.3. Proposition 4: Model has infinite-order VAR . . . . . . . 108
9.3.1.4. A Way to Make M Singular . . . . . . . . . . . . . . . 111
Chapter 1.
Background
1.1. Solving Nonlinear Systems of Equations
1.1.1. The Bisection Method
The problem is: find x on [a, b] such that f(x) = 0, where f : IR → IR is a continuous
function and f(a) and f(b) have opposite signs. See, for example, Figure 1. The idea
of the bisection method is to bracket a root. Once the root is bracketed, simply divide
the interval in half, figure out which of the two halves brackets the root, and repeat the
process. (On Figure 1, the halfway point is marked. Notice the function lies above the
x-axis at that point. Therefore, at the second iteration, this will be the new point a.)
Figure 1. [Plot of f(x) against x, with bracketing points a and b.]
One problem with this method is that it can be slow; the error bound is
|x_m − x*| ≤ (b − a)/2^m

where m is the index for the mth iteration and x* is the solution. Suppose b − a is 1 and one wants the error to be less than or equal to 10^{−6}. Then, we need to iterate 20 times (i.e., m = 6/log_10 2 ≈ 19.9).
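A minimal Python sketch of the bisection loop (the function name, test equation, and tolerance are our own illustrative choices, not part of the notes):

```python
import math

def bisect(f, a, b, tol=1e-10, max_iter=200):
    """Find a root of f on [a, b], assuming f(a) and f(b) have opposite signs."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        m = 0.5 * (a + b)          # halve the bracket
        fm = f(m)
        if fm == 0.0 or 0.5 * (b - a) < tol:
            return m
        if fa * fm < 0:            # root lies in [a, m]
            b, fb = m, fm
        else:                      # root lies in [m, b]
            a, fa = m, fm
    return 0.5 * (a + b)

# Example: the root of f(x) = x^2 - 2 on [1, 2] is sqrt(2)
root = bisect(lambda x: x * x - 2.0, 1.0, 2.0)
```

Each pass shrinks the bracket by half, which is exactly the error bound above.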
1.1.2. The Newton-Raphson Method
The problem is: find x such that f(x) = 0, where f : IRn → IRn is a continuous function.
The idea behind this method is to apply a Taylor expansion,

f(x) = f(x̄) + f′(x̄)(x − x̄) + higher order terms,

where the expansion is taken around a point x̄ that approximates the solution of f(x) = 0. Evaluating at the solution x* and assuming that x̄ is sufficiently close to x*, the higher order terms are small and

x* ≈ x̄ − f(x̄)/f′(x̄)
in the case that n = 1. This leads to the following updating scheme:
x_{k+1} = x_k − f(x_k)/f′(x_k),

starting at k = 0 with initial guess x_0. For n > 1, the updating scheme is:

x_{k+1} = x_k − J(x_k)^{−1} f(x_k)
where the i, j element of J(x) is the derivative of the ith element of f with respect to
the jth element of x. Figure 2 displays one step of the updating scheme. Point x0 is the
starting point. Draw a tangent of f at that point and trace it to the x-axis. The crossing
point is the new guess, x1. Repeat the procedure until the fixed point is found.
Figure 2. [Plot of f(x) against x, showing the starting point x_0 and the new guess x_1.]
When x_0 is sufficiently far from a solution, one might have problems. (Try, for example, finding the root of f(x) = log(x) with a starting point of x_0 = 5. Try again with x_0 = 2.) On the other hand, if the method converges, it achieves quadratic convergence, that is, |x_{k+1} − x*| ≤ c|x_k − x*|^2.
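Here is a short Python sketch of the scheme for n = 1, reproducing the log(x) experiment suggested above (names and tolerances are our own choices):

```python
import math

def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton-Raphson iteration x_{k+1} = x_k - f(x_k)/f'(x_k)."""
    x = x0
    for _ in range(max_iter):
        x_new = x - f(x) / fprime(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("Newton iteration did not converge")

f = math.log                      # root at x* = 1
fprime = lambda x: 1.0 / x

root = newton(f, fprime, 2.0)     # converges to 1
# Starting instead at x0 = 5 fails: the first step is 5 - 5*log(5) < 0,
# outside the domain of log, illustrating the sensitivity to x0.
```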
1.1.3. The Secant Method
The secant method uses the same updating scheme as Newton-Raphson but numerical
derivatives are used for J . (See next section.)
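A sketch of the secant update for n = 1, where the derivative is replaced by the slope through the two most recent iterates (the test function is our own illustration):

```python
def secant(f, x0, x1, tol=1e-12, max_iter=100):
    """Secant iteration: Newton's scheme with f' replaced by a numerical
    derivative, the slope through the two most recent iterates."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        slope = (f1 - f0) / (x1 - x0)   # numerical derivative
        x2 = x1 - f1 / slope
        if abs(x2 - x1) < tol:
            return x2
        x0, f0, x1, f1 = x1, f1, x2, f(x2)
    raise RuntimeError("secant iteration did not converge")

# Root of x^3 - x - 2 (near 1.52), starting from the bracket [1, 2]
root = secant(lambda x: x**3 - x - 2.0, 1.0, 2.0)
```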
1.2. Numerical Differentiation
The problem is: find df(x)/dx where f : IR → IR is differentiable. In this case Taylor
expansions can be taken around a point x
f(x + h) = f(x) + f′(x)h + (1/2)f′′(x)h^2 + higher order terms
f(x − h) = f(x) − f′(x)h + (1/2)f′′(x)h^2 + higher order terms
to get approximations
f′(x) ≈ [f(x + h) − f(x)]/h
f′(x) ≈ [f(x) − f(x − h)]/h
f′(x) ≈ [f(x + h) − f(x − h)]/(2h)
for the first derivative and
f′′(x) ≈ [f(x + h) − 2f(x) + f(x − h)]/h^2

for the second derivative. These are not the only possible approximations. But these
are the approximations most typically used in applications. In practice, h cannot be too
small. If x is a vector and f vector-valued, then differentiation can be done element by
element.
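The formulas are one-liners in code; the sketch below (our own illustration) checks them on f = exp, whose first and second derivatives both equal e at x = 1. Note the trade-off mentioned above: shrinking h too far lets roundoff dominate.

```python
import math

def central_diff(f, x, h):
    """Central difference: f'(x) ~ [f(x+h) - f(x-h)] / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def second_diff(f, x, h):
    """Second derivative: f''(x) ~ [f(x+h) - 2 f(x) + f(x-h)] / h^2."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

# Both derivatives of exp at x = 1 equal e = 2.71828...
d1 = central_diff(math.exp, 1.0, 1e-5)
d2 = second_diff(math.exp, 1.0, 1e-4)
```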
1.3. Numerical integration
The problem is: find

∫_a^b f(x) dx.
Before solving the problem, it helps to start with some preliminary theorems and definitions used later. Let x_0, x_1, . . . , x_n be n + 1 distinct points in the interval [a, b] and let f be a function with n + 1 continuous derivatives on [a, b] (denoted simply by f ∈ C^{n+1}[a, b]).
Theorem: There exists a unique polynomial P_n of degree at most n such that

f(x_k) = P_n(x_k), k = 0, . . . , n

and

P_n(x) = f(x_0)L_{n,0}(x) + f(x_1)L_{n,1}(x) + · · · + f(x_n)L_{n,n}(x) (1.3.1)

where

L_{n,k}(x) = ∏_{i=0, i≠k}^{n} (x − x_i)/(x_k − x_i).
Pn is called the nth Lagrange interpolating polynomial.
Theorem: For each x in [a, b] there exists a point y(x) in (a, b) such that

f(x) = P_n(x) + [f^{(n+1)}(y(x))/(n + 1)!] (x − x_0)(x − x_1) · · · (x − x_n),
with Pn(x) given by (1.3.1).
1.3.1. Choosing Quadrature Weights
Approximate the integral with a weighted sum, that is,

∫_a^b f(x) dx ≈ Σ_{k=0}^{n} ω_k f(x_k).
Here, we can use the Lagrange interpolating polynomial P_n:

∫_a^b f(x) dx = ∫_a^b P_n(x) dx + err(f)
             = ∫_a^b Σ_{k=0}^{n} f(x_k) L_{n,k}(x) dx + err(f)
             = Σ_{k=0}^{n} [ ∫_a^b L_{n,k}(x) dx ] f(x_k) + err(f)
             = Σ_{k=0}^{n} ω_k f(x_k) + err(f)
             ≈ Σ_{k=0}^{n} ω_k f(x_k)
where the weights ωk are given by
ω_k = ∫_a^b L_{n,k}(x) dx

for k = 0, . . . , n and the error err(f) is

err(f) = ∫_a^b [ f^{(n+1)}(y(x)) / (n + 1)! ] ∏_{k=0}^{n} (x − x_k) dx.
The err(f) formula can be used to compute bounds on errors.
1.3.2. Trapezoidal Rule
For the Trapezoidal rule, use the first Lagrange polynomial with x0 = a and x1 = b. In
this case, the weights are
ω_0 = ∫_{x_0}^{x_1} L_{1,0}(x) dx = ∫_{x_0}^{x_1} (x − x_1)/(x_0 − x_1) dx = (1/2)(x_1 − x_0) = (1/2)(b − a)

ω_1 = ∫_{x_0}^{x_1} L_{1,1}(x) dx = ∫_{x_0}^{x_1} (x − x_0)/(x_1 − x_0) dx = (1/2)(x_1 − x_0) = (1/2)(b − a)
and the approximate integral is
∫_a^b f(x) dx ≈ (h/2)[ f(a) + f(b) ]
where h = b− a.
1.3.3. Simpson’s Rule
For Simpson’s rule, use the second Lagrange polynomial and equally spaced nodes: x0 = a,
x1 = (a+ b)/2, and x2 = b. In this case, the weights are
ω_0 = ∫_{x_0}^{x_2} L_{2,0}(x) dx = ∫_{x_0}^{x_2} [(x − x_1)(x − x_2)] / [(x_0 − x_1)(x_0 − x_2)] dx = (1/6)(b − a)

ω_1 = ∫_{x_0}^{x_2} L_{2,1}(x) dx = ∫_{x_0}^{x_2} [(x − x_0)(x − x_2)] / [(x_1 − x_0)(x_1 − x_2)] dx = (2/3)(b − a)

ω_2 = ∫_{x_0}^{x_2} L_{2,2}(x) dx = ∫_{x_0}^{x_2} [(x − x_0)(x − x_1)] / [(x_2 − x_0)(x_2 − x_1)] dx = (1/6)(b − a)
and the approximate integral is
∫_a^b f(x) dx ≈ (h/3)[ f(a) + 4f((a + b)/2) + f(b) ]
where h = (b− a)/2.
Simpson’s rule and the Trapezoidal rule are special cases of Newton-Cotes formulas that use equally spaced nodes x_k = x_0 + kh with x_0 = a, x_n = b, and h = (b − a)/n.
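The two rules are easy to compare in code; the sketch below (our own illustration) integrates x^3 on [0, 1], for which Simpson's rule happens to be exact while the trapezoidal rule is not:

```python
def trapezoid(f, a, b):
    """Trapezoidal rule: (h/2) [f(a) + f(b)] with h = b - a."""
    h = b - a
    return 0.5 * h * (f(a) + f(b))

def simpson(f, a, b):
    """Simpson's rule: (h/3) [f(a) + 4 f((a+b)/2) + f(b)] with h = (b-a)/2."""
    h = 0.5 * (b - a)
    return (h / 3.0) * (f(a) + 4.0 * f(0.5 * (a + b)) + f(b))

# Integrate x^3 on [0, 1]; the exact answer is 1/4.
t = trapezoid(lambda x: x**3, 0.0, 1.0)   # 0.5
s = simpson(lambda x: x**3, 0.0, 1.0)     # 0.25
```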
1.3.4. Gauss-Legendre Quadrature
The formulas above are weighted sums of the functional values at a set of equally spaced
points. Applying Gauss-Legendre quadrature involves choosing both the weights and the
abscissas – and doubling the degree of precision. For this, the following theorem is useful.
Theorem: If P is any polynomial of degree less than or equal to 2n − 1, then

∫_{−1}^{1} P(x) dx = Σ_{k=1}^{n} ω_k P(x_k)

where

ω_k = ∫_{−1}^{1} ∏_{j=1, j≠k}^{n} (x − x_j)/(x_k − x_j) dx (1.3.2)
and x1, x2, . . ., xn are the zeros of the nth Legendre polynomial.
Legendre polynomials can be found recursively as follows:
(i + 1) p_{i+1}(x) = (2i + 1) x p_i(x) − i p_{i−1}(x)
starting with p0(x) ≡ 1 and p1(x) ≡ x. This class of polynomials is orthogonal on [−1, 1]
with respect to weighting function w(x) = 1, which means that
∫_{−1}^{1} p_j(x) p_k(x) w(x) dx  { = 0 if j ≠ k;  > 0 if j = k }.
Orthogonal polynomials have the property that they can be used as basis functions to represent any polynomial (assuming the degree of the polynomial is no greater than the highest order in the orthogonal set). Gaussian quadrature is highly accurate if the
function f being integrated is well approximated by a polynomial.
Applying a simple transformation from the domain [a, b] to the domain [−1, 1] yields the following Gauss-Legendre approximation

∫_a^b f(x) dx = (1/2)(b − a) ∫_{−1}^{1} f([(b − a)z + b + a]/2) dz
            ≈ (1/2)(b − a) Σ_{k=1}^{n} ω_k f([(b − a)z_k + b + a]/2)
where the weights ωk, k = 1, . . . , n are given by (1.3.2) and the abscissas zk, k = 1, . . . , n
are the zeros of the nth order Legendre polynomial.
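The sketch below (our own illustration) evaluates Legendre polynomials by the recursion above and applies the two-point rule, whose abscissas ±1/√3 are the zeros of p_2(x) = (3x^2 − 1)/2 and whose weights both equal 1; by the theorem, the rule is exact for cubics:

```python
import math

def legendre(i, x):
    """Evaluate p_i(x) via the three-term recursion
    (i+1) p_{i+1} = (2i+1) x p_i - i p_{i-1}, with p_0 = 1, p_1 = x."""
    p_prev, p = 1.0, x
    if i == 0:
        return p_prev
    for j in range(1, i):
        p_prev, p = p, ((2 * j + 1) * x * p - j * p_prev) / (j + 1)
    return p

# Two-point Gauss-Legendre: nodes are the zeros of p_2, weights both 1.
nodes = [-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0)]
weights = [1.0, 1.0]

def gauss2(f, a, b):
    """Transformed rule: 0.5 (b-a) sum_k w_k f(((b-a) z_k + b + a)/2)."""
    return 0.5 * (b - a) * sum(w * f(0.5 * ((b - a) * z + b + a))
                               for z, w in zip(nodes, weights))

val = gauss2(lambda x: x**3, 0.0, 1.0)   # exact answer: 1/4
```

For larger n the nodes must be found numerically (e.g., by applying Newton's method to p_n), or taken from standard tables.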
Chapter 2.
Dynamic Programming
In this chapter, we cover discrete-time and continuous-time dynamic programming.
2.1. Discrete-time Dynamic Programming
The problem is: find {u_t}_{t=0}^{T} that solves

max Σ_{t=0}^{T} β^t r(x_t, u_t) + V_0(x_{T+1})

subject to x_{t+1} = g(x_t, u_t) (2.1.1)
with the initial value for x0 and the value function V0 known. Because the objective
function has terms that are separable in time, this problem can be restated recursively
with a sequence of Bellman equations,
V_{j+1}(x_{T−j}) = max_{u_{T−j}} r(x_{T−j}, u_{T−j}) + βV_j(x_{T−j+1}) (2.1.2)
where the maximization is subject to (2.1.1) with V0(x) and its derivative known. The
solution is found by solving a sequence of simple maximization problems, taking as given
the value function at the last step. With j = 0 in (2.1.2), we have
V_1(x_T) = max_{u_T} r(x_T, u_T) + βV_0(g(x_T, u_T)) (2.1.3)

and therefore u_T satisfies

∂r(x_T, u_T)/∂u_T + β [∂g(x_T, u_T)/∂u_T] V_0′(x_{T+1}) = 0. (2.1.4)
Finding uT to satisfy (2.1.4) is the standard problem described earlier, namely to solve
a nonlinear equation or system of equations if uT is a vector. If we solve this problem
for each value of xT , we can trace out the optimal decision function, call it uT = h0(xT ).
Substituting this solution into (2.1.3),
V1 (xT ) = r (xT , h0 (xT )) + βV0 (g (xT , h0 (xT ))) .
On the next step, we’ll need the derivative of V1(x) which (if it is differentiable) is
given by
V_1′(x_T) = [ ∂r(x_T, h_0(x_T))/∂u_T + β (∂g(x_T, h_0(x_T))/∂u_T) V_0′(x_{T+1}) ] h_0′(x_T)
          + ∂r(x_T, h_0(x_T))/∂x_T + β (∂g(x_T, h_0(x_T))/∂x_T) V_0′(x_{T+1})
          = ∂r(x_T, h_0(x_T))/∂x_T + β (∂g(x_T, h_0(x_T))/∂x_T) V_0′(x_{T+1}) (2.1.5)
where the second equality follows from the fact that (2.1.4) holds at a maximum.
With j = 1 in (2.1.2), we have
V_2(x_{T−1}) = max_{u_{T−1}} r(x_{T−1}, u_{T−1}) + βV_1(g(x_{T−1}, u_{T−1})) (2.1.6)

and therefore u_{T−1} satisfies

∂r(x_{T−1}, u_{T−1})/∂u_{T−1} + β [∂g(x_{T−1}, u_{T−1})/∂u_{T−1}] V_1′(x_T) = 0. (2.1.7)
Notice that this expression depends on the derivative in (2.1.5). Solving (2.1.7) for u_{T−1} is the same exercise as for u_T. If we solve this problem for each value of x_{T−1}, we can trace out the optimal decision function u_{T−1} = h_1(x_{T−1}). In fact, the same exercise is conducted for each j = 0, . . . , T and yields optimal decision functions for all u_t, t = 0, . . . , T.
If T = ∞ and V_0(x) = 0, then the time-independent solution u_t = h(x_t) is found by taking j to ∞. Under certain conditions on the objective function and constraints, this limit is the solution to

V(x) = max_u r(x, u) + βV(x′)

where x′ = g(x, u) and V = lim_{j→∞} V_j. The limiting value function V is equal to the objective function at an optimum:

V(x_0) = max_{{u_t}_{t=0}^{∞}} Σ_{t=0}^{∞} β^t r(x_t, u_t).
As a practical matter, solving the problem by solving a sequence of subproblems works for
both the finite time and the infinite time cases.
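The backward recursion (2.1.2) translates directly into code. The following Python sketch applies value function iteration on a capital grid to an illustrative deterministic growth model with r = log(k^α − k′) and full depreciation (our own choice of example, not the one used later in these notes); with log utility this model has the known closed-form policy k′ = αβk^α, which the grid solution should approximate.

```python
import math

alpha, beta = 0.3, 0.95                              # illustrative parameters
grid = [0.02 + 0.38 * i / 99 for i in range(100)]    # capital grid

V = [0.0] * len(grid)                                # start from V_0 = 0
for _ in range(300):                                 # Bellman iterations
    V_new, policy = [], []
    for k in grid:
        best, best_kp = -1e18, grid[0]
        for j, kp in enumerate(grid):
            c = k**alpha - kp                        # implied consumption
            if c <= 0.0:
                break                                # grid is increasing
            val = math.log(c) + beta * V[j]
            if val > best:
                best, best_kp = val, kp
        V_new.append(best)
        policy.append(best_kp)
    V = V_new

# Compare with the known analytic policy k' = alpha * beta * k^alpha.
err = max(abs(policy[i] - alpha * beta * grid[i]**alpha)
          for i in range(len(grid)))
```

The maximization at each grid point is the "simple maximization problem" of the text, here solved by brute-force search rather than by a root finder.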
A variation of the problem allows for stochastic shocks. The problem is: find {u_t}_{t=0}^{∞} to solve

max E_0 Σ_{t=0}^{∞} β^t r(x_t, u_t)

subject to x_{t+1} = g(x_t, u_t, ε_{t+1}) (2.1.8)

where {ε_t} is a sequence of independently and identically distributed random variables. In this case, the Bellman equation is

V(x) = max_u r(x, u) + β ∫ V(g(x, u, ε)) dF(ε)
     ≡ max_u r(x, u) + βE[ V(g(x, u, ε)) | x ] (2.1.9)
where F(ε) is the cumulative distribution function for ε. In this case, the first-order condition for (2.1.9) is

∂r(x, u)/∂u + βE[ (∂g(x, u, ε)/∂u) V′(g(x, u, ε)) | x ] = 0.
In practice, some method of numerical integration is required to solve for u = h(x). In the stochastic case, the solution depends on the parameters of F(ε).
2.2. Continuous-time Dynamic Programming
The problem is: derive the standard Bellman’s equation for continuous time problems with
the state evolving according to the following differential equation:
dx = µ (t, x, u)dt+ σ (t, x, u)dz. (2.2.1)
Here, dz is the increment of a stochastic process z which is a Wiener process (also called a
Brownian motion) and µ and σ are known functions of time t, the state x, and a decision
variable(s) u to be described later. A stochastic process {z(t), t ≥ 0} is called a Wiener process if (i) z(0) = 0; (ii) z(t) has stationary independent increments; and (iii) for every t > 0, z(t) is normally distributed with mean 0 and variance ct, where c is some positive constant.
Consider the stochastic optimal control problem:
V(t_0, x_0) = max_u E[ ∫_{t_0}^{T} r(t, x, u) dt + g(x(T), T) ]
subject to (2.2.1) and x(t0) = x0. First, note that V (T, x(T )) = g(x(T ), T ). Next, break
up the integral as follows
V(t_0, x_0) = max_u E( ∫_{t_0}^{t_0+∆t} r(t, x, u) dt + ∫_{t_0+∆t}^{T} r(t, x, u) dt + V(T, x(T)) )

          = max_{u, t_0≤t≤t_0+∆t} E( ∫_{t_0}^{t_0+∆t} r(t, x, u) dt + max_{u, t_0+∆t≤t≤T} E( ∫_{t_0+∆t}^{T} r(t, x, u) dt + V(T, x(T)) ) )

          = max_{u, t_0≤t≤t_0+∆t} E( ∫_{t_0}^{t_0+∆t} r(t, x, u) dt + V(t_0 + ∆t, x_0 + ∆x) )
where x(t_0 + ∆t) = x_0 + ∆x. If V is twice continuously differentiable, then
V(t_0, x_0) ≃ max_u E( r(t_0, x_0, u)∆t + V(t_0, x_0) + V_t(t_0, x_0)∆t + V_x(t_0, x_0)∆x + (1/2)V_xx(t_0, x_0)(∆x)^2 + h.o.t. ) (2.2.2)
where h.o.t. stands for higher order terms. Recall that the following holds (approximately)
∆x = µ∆t + σ∆z
(∆x)^2 = µ^2(∆t)^2 + σ^2(∆z)^2 + 2µσ∆t∆z = σ^2∆t + h.o.t. (2.2.3)
where the arguments of µ and σ have been dropped for convenience. The result in (2.2.3)
follows from the fact that increments of z, e.g., z(t_j) − z(t_{j−1}), are independently distributed with mean zero and variances proportional to increments of t, e.g., t_j − t_{j−1}. Thus, the term (dz)^2 = dt is first order while the other terms are all higher order.
Using this approximation in (2.2.2), we have the following for the value function
V(t, x) ≃ max_u E( r(t, x, u)∆t + V(t, x) + V_t(t, x)∆t + V_x(t, x)µ(t, x, u)∆t + V_x(t, x)σ(t, x, u)∆z + (1/2)V_xx(t, x)σ(t, x, u)^2∆t + h.o.t. )
Take expectations (which drops the ∆z term), subtract V (t, x) from both sides, divide
through by ∆t, and then take ∆t to zero to get:
−V_t(t, x) ≃ max_u ( r(t, x, u) + µ(t, x, u)V_x(t, x) + (1/2)σ(t, x, u)^2 V_xx(t, x) )
which is the standard Bellman’s equation for continuous time problems.
Consider a variation of the problem with discounting:
V(t_0, x_0) = max_u E( ∫_{t_0}^{T} e^{−ρt} r(t, x, u) dt + g(x(T), T) )
In this case, the steps above lead to
−V_t(t, x) + ρV(t, x) = max_u ( r(t, x, u) + µ(t, x, u)V_x(t, x) + (1/2)σ(t, x, u)^2 V_xx(t, x) ) (2.2.4)
An example is the standard stochastic growth model. Households choose consumption
to maximize expected lifetime utility
max_c E ∫_0^∞ e^{−ρt} u(c) dt
subject to
dk = (f (k) − c) dt+ σ (k) dz
where u(c) = c^ω/ω and ω < 1. For this model, equation (2.2.4) is given by
−V_t(t, k) + ρV(t, k) = max_c ( u(c) + (f(k) − c)V_k(t, k) + (1/2)σ(k)^2 V_kk(t, k) ).
The first-order condition for the maximization is u′(c) = V_k. Substituting this back in yields the following differential equation:

−V_t(t, k) + ρV(t, k) = ((1 − ω)/ω) V_k(t, k)^{ω/(ω−1)} + f(k)V_k(t, k) + (1/2)σ(k)^2 V_kk(t, k).
This differential equation can be solved using finite difference methods outlined in Candler
(1999).
Chapter 3.
Computing Equilibria in Near-Linear Economies
In this chapter, we solve economic decision problems that are inherently nonlinear
assuming that the solutions of these problems are well-approximated by linear or log-linear
functions. Most business cycle models fall in this category.
3.1. Linearizing and Log-linearizing
We will sometimes need to do a first-order Taylor expansion of a function f(x) around a point x̄, that is,

f(x) = f(x̄) + f′(x̄)(x − x̄) + higher order terms.

We will also need to do the expansion after writing the variables in logs:

f(x) = f(e^{log x}) = g(log x) = g(log x̄) + g′(log x̄)(log x − log x̄) + higher order terms.
Consider the following example based on the Euler equation of a very simple growth model:

u′(c_t) = βu′(c_{t+1})(f′(k_{t+1}) + 1 − δ) (3.1.1)

with u(c) = c^{1−σ}/(1 − σ) and f(k) = Ak^θ. If we linearize the difference between the left and right hand sides of (3.1.1), we get
βc_{t+1}^{−σ}[ θk_{t+1}^{θ−1} + 1 − δ ] − c_t^{−σ}
  ≈ βc_ss^{−σ}[ θk_ss^{θ−1} + 1 − δ ] − c_ss^{−σ}
    − σβc_ss^{−σ−1}[ θk_ss^{θ−1} + 1 − δ ](c_{t+1} − c_ss) + σc_ss^{−σ−1}(c_t − c_ss)
    + (θ − 1)βc_ss^{−σ}θk_ss^{θ−2}(k_{t+1} − k_ss)
  = −σc_ss^{−σ−1}(c_{t+1} − c_t) + (θ − 1)βc_ss^{−σ}θk_ss^{θ−2}(k_{t+1} − k_ss) (3.1.2)
where the subscript ‘ss’ stands for steady state value.
Next, consider log-linearizing the equation. For convenience, we use a hat over the variable to denote the natural logarithm, that is, ĉ_t = log c_t. Then, the residual can be
approximated as:
βc_{t+1}^{−σ}[ θk_{t+1}^{θ−1} + 1 − δ ] − c_t^{−σ}
  = βθ e^{−σĉ_{t+1}} e^{(θ−1)k̂_{t+1}} + β(1 − δ)e^{−σĉ_{t+1}} − e^{−σĉ_t}
  ≈ βθ c_ss^{−σ} k_ss^{θ−1} ( 1 − σ(ĉ_{t+1} − ĉ_ss) + (θ − 1)(k̂_{t+1} − k̂_ss) )
    + β(1 − δ)c_ss^{−σ} ( 1 − σ(ĉ_{t+1} − ĉ_ss) )
    − c_ss^{−σ} ( 1 − σ(ĉ_t − ĉ_ss) )
  = constant + c_ss^{−σ} [ (1 − β(1 − δ))(θ − 1)k̂_{t+1} − σ(ĉ_{t+1} − ĉ_t) ] (3.1.3)
The last equation uses the fact that the residual is equal to zero in the steady state.
How would we check this algebra using the computer? One way to do it is to take
approximate numerical derivatives as described above.
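For instance, the Python sketch below (parameter values are our own) differentiates the Euler-equation residual numerically at the steady state and compares the result with the analytic coefficient on c_{t+1} implied by (3.1.2), which simplifies to −σc_ss^{−σ−1} because β(θk_ss^{θ−1} + 1 − δ) = 1 in the steady state:

```python
beta, sigma, theta, delta = 0.95, 2.0, 0.33, 0.1   # illustrative values

# Steady state: beta * (theta * k^(theta-1) + 1 - delta) = 1
k_ss = ((1.0 / beta - 1.0 + delta) / theta) ** (1.0 / (theta - 1.0))
c_ss = k_ss**theta - delta * k_ss

def residual(c_t, c_t1, k_t1):
    """Euler residual: beta c'^-sigma (theta k'^(theta-1) + 1 - delta) - c^-sigma."""
    return (beta * c_t1**(-sigma) * (theta * k_t1**(theta - 1.0) + 1.0 - delta)
            - c_t**(-sigma))

# Central-difference derivative with respect to c_{t+1} at the steady state ...
h = 1e-6
num = (residual(c_ss, c_ss + h, k_ss) - residual(c_ss, c_ss - h, k_ss)) / (2 * h)
# ... should match the analytic coefficient from (3.1.2)
ana = -sigma * c_ss**(-sigma - 1.0)
```

The same check can be applied term by term to the log-linearization (3.1.3).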
3.2. Mapping the Problem to a Standard LQ Problem
The original problem is:

max_{{u_t}_{t=0}^{∞}} E[ Σ_{t=0}^{∞} β^t r(X_t, u_t) | X_0 ]

subject to X_{t+1} = g(X_t, u_t, ε_{t+1}),

X_0 given.
Instead of solving this, we solve the following related problem:

max_{{u_t}_{t=0}^{∞}} E_0 Σ_{t=0}^{∞} β^t ( X_t′QX_t + u_t′Ru_t + 2X_t′Wu_t )

subject to X_{t+1} = AX_t + Bu_t + Cε_{t+1},

X_0 given (3.2.1)

where

r(X_t, u_t) ≃ X_t′QX_t + u_t′Ru_t + 2X_t′Wu_t
g(X_t, u_t, ε_{t+1}) ≃ AX_t + Bu_t + Cε_{t+1}, (3.2.2)
with Q and R symmetric. That is, we solve a problem with a quadratic objective function
and linear constraints. Note that implicit in our formulation of (3.2.1) are the assumptions
that Xt is contained in the agents’ information sets at time t and that the agents know
the objective function and transition functions for all variables.
To obtain the functions in (3.2.2), we take a second-order Taylor expansion of r and a first-order Taylor expansion of g around the steady state of the system. Thus, when evaluated at the stationary point, the original and approximated functions have the same value.
To find the steady state of the system, we first set the disturbance term ε_t to its unconditional mean. Without loss of generality, assume the mean is zero. We then find the first-order conditions of the resulting nonstochastic version of the model:
max_{{u_t}_{t=0}^{∞}} Σ_{t=0}^{∞} β^t r(X_t, u_t)

subject to X_{t+1} = g(X_t, u_t, 0) (3.2.3)

and X_0 given. Formulating the Lagrangian:

L = Σ_{t=0}^{∞} β^t [ r(X_t, u_t) − λ_{t+1}′( X_{t+1} − g(X_t, u_t, 0) ) ] (3.2.4)
and taking derivatives with respect to ut and Xt+1, we obtain the following first-order
conditions
∂r(X_t, u_t)/∂u_t + [ ∂g(X_t, u_t, 0)/∂u_t ]′ λ_{t+1} = 0

β ∂r(X_{t+1}, u_{t+1})/∂X_{t+1} − λ_{t+1} + β [ ∂g(X_{t+1}, u_{t+1}, 0)/∂X_{t+1} ]′ λ_{t+2} = 0 (3.2.5)
for t ≥ 0, where λt is a sequence of Lagrange multipliers. Eliminating time subscripts
from (3.2.5) and the constraint in (3.2.3), we then get the following set of nonlinear equa-
tions:
∂r(X, u)/∂u + [ ∂g(X, u, 0)/∂u ]′ λ = 0

β ∂r(X, u)/∂X − λ + β [ ∂g(X, u, 0)/∂X ]′ λ = 0

X − g(X, u, 0) = 0 (3.2.6)
This is a set of 2m + n equations in 2m + n unknowns X, u, λ. The fixed point of this system is the steady state, say (X̄, ū, λ̄), around which we take first and second-order Taylor expansions of g and r. Thus, we have the problem given by (3.2.1).
As shown in Kwakernaak and Sivan (1972) or Sargent (1980), if R < 0 and the system

X̃_{t+1} = Ã X̃_t + B̃ ũ_t
Y_t = D X̃_t (3.2.7)

is stabilizable and detectable, where Ã = √β (A − BR^{−1}W′), B̃ = √β B, D is some matrix satisfying Q̃ = D′ΩD for some Ω < 0, Q̃ = Q − WR^{−1}W′, X̃_t = β^{t/2} X_t, and ũ_t = β^{t/2}( u_t + R^{−1}W′X_t ), then the optimal policy function for the optimization problem (3.2.1) is the time-invariant linear rule:

u_t = −F X_t,  F = ( R + βB′PB )^{−1}( βB′PA + W′ ) = ( R + B̃′PB̃ )^{−1} B̃′PÃ + R^{−1}W′. (3.2.8)
The matrix P in (3.2.8) is the steady-state solution to the matrix Riccati difference equation

P_t = Q + βA′P_{t+1}A − ( βA′P_{t+1}B + W )( R + βB′P_{t+1}B )^{−1}( βB′P_{t+1}A + W′ )
    = Q̃ + Ã′P_{t+1}Ã − Ã′P_{t+1}B̃( R + B̃′P_{t+1}B̃ )^{−1} B̃′P_{t+1}Ã (3.2.9)

as t → −∞, with terminal condition P_T ≤ 0.
There have been many algorithms developed for the solution of the discrete-time Riccati equation. Here, we review several which will later be used to solve the stochastic growth model. (See Anderson and Moore (1979) for further discussion.) In all cases, we take as given the matrices A, B, Q, R, W and scalar β (or equivalently Ã, B̃, Q̃, and R), tolerance criteria γ_1 and γ_2, and a matrix norm ‖·‖.
Direct Iteration. Set an initial symmetric Riccati matrix, P^0 ≤ 0.
a) At iteration n, we compute P^{n+1} and F^n to be

P^{n+1} = Q̃ + Ã′P^n Ã − Ã′P^n B̃( R + B̃′P^n B̃ )^{−1} B̃′P^n Ã
F^n = ( R + B̃′P^n B̃ )^{−1} B̃′P^n Ã

b) If ‖P^{n+1} − P^n‖ < γ_1‖P^n‖ and ‖F^{n+1} − F^n‖ < γ_2‖F^n‖, go to (c); otherwise, increase n by one and return to (a).
c) Set F = F^n + R^{−1}W′, P = P^n.
Doubling Algorithm. Set additional initial conditions: a_0 = Ã, b_0 = B̃R^{−1}B̃′, p_0 = Q̃.
a) At iteration n, we compute a_{n+1}, b_{n+1}, p_{n+1}, F^n to be

a_{n+1} = a_n( I + b_n p_n )^{−1} a_n
b_{n+1} = b_n + a_n( I + b_n p_n )^{−1} b_n a_n′
p_{n+1} = p_n + a_n′ p_n( I + b_n p_n )^{−1} a_n
F^n = ( R + B̃′p_n B̃ )^{−1} B̃′p_n Ã.

b) If ‖p_{n+1} − p_n‖ < γ_1‖p_n‖ and ‖F^{n+1} − F^n‖ < γ_2‖F^n‖, go to (c); otherwise, increase n by one and return to (a).
c) Set F = F^n + R^{−1}W′, P = p_n.
Vaughan’s (1970) Algorithm.
a) Find the eigenvalues and eigenvectors of the Hamiltonian matrix H:

H = [ Ã^{−1}  Ã^{−1}B̃R^{−1}B̃′ ; Q̃Ã^{−1}  Q̃Ã^{−1}B̃R^{−1}B̃′ + Ã′ ]
  = [ V_{11} V_{12} ; V_{21} V_{22} ] [ Λ 0 ; 0 Λ^{−1} ] [ V_{11} V_{12} ; V_{21} V_{22} ]^{−1}.

Note that Λ is a diagonal matrix containing the eigenvalues of H that exceed unity in absolute value.
b) Set P = V_{21}V_{11}^{−1}, F = ( R + B̃′PB̃ )^{−1} B̃′PÃ + R^{−1}W′.
With a steady-state solution to the Riccati matrix, we can use (3.2.8) to compute F and the law of motion for the state variables:

X_{t+1} = (A − BF) X_t + C ε_{t+1} (3.2.10)

Furthermore, given an initial condition for the states, X_0, and a realization of the shocks, {ε_t, t ≥ 0}, we can generate time series for X_t via (3.2.10) and u_t via (3.2.8).
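For a scalar state and control (and W = 0, so the R^{−1}W′ correction drops out), the direct-iteration scheme above fits in a few lines of Python; the parameter values here are our own illustration:

```python
beta = 0.95
A, B = 1.2, 1.0          # linear transition x' = A x + B u
Q, R = -1.0, -0.25       # quadratic payoff x Q x + u R u (both negative)

P = 0.0                  # initial Riccati "matrix"
for _ in range(10_000):
    # Scalar version of (3.2.9) with W = 0
    P_new = Q + beta * A * P * A - (beta * A * P * B) ** 2 / (R + beta * B * P * B)
    if abs(P_new - P) < 1e-12:
        P = P_new
        break
    P = P_new

# Feedback rule u = -F x, scalar version of (3.2.8) with W = 0
F = (beta * B * P * A) / (R + beta * B * P * B)

# Fixed-point residual of the Riccati equation (should be ~0)
resid = P - (Q + beta * A * P * A
             - (beta * A * P * B) ** 2 / (R + beta * B * P * B))
```

Even though A > 1 here, the closed-loop coefficient A − BF is well inside the unit circle.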
Example 1.1 To illustrate these algorithms, let’s consider the version of the growth model
that was used in comparing alternative methods by Taylor and Uhlig. The problem in this
case is to find ct = h(kt, zt) that solves
max_{c_t} E[ Σ_{t=0}^{∞} β^t U(c_t) | z_0, k_0 ]

subject to constraints given by

c_t + k_{t+1} − k_t = z_t k_t^α
log z_t = ρ log z_{t−1} + ε_t,  ε_t ∼ N(0, σ_ε^2)
and subject to the initial conditions z_0 and k_0. Notice that the rate of depreciation is equal to 0 as in Taylor and Uhlig (1990). That simplifies some of the algebra below. In what follows, we assume that U(c) = c^{1−σ}/(1 − σ).
If we substitute the resource constraint into the objective function, we can rewrite the
problem as follows:
max_{{k_{t+1}−k_t}_{t=0}^{∞}} E_0 Σ_{t=0}^{∞} β^t (k_t + e^{ω_t} k_t^α − k_{t+1})^{1−σ} / (1 − σ)

subject to ω_{t+1} = ρω_t + ε_{t+1},

k_0, ω_0 given (3.2.11)
where β, 0 < β < 1, is a discount factor and ω_t = log z_t. In this formulation of the problem, we have eliminated consumption, c_t. However, given a policy function for k_{t+1} − k_t, we can compute the policy function for c_t since c_t = k_t − k_{t+1} + e^{ω_t} k_t^α.
To find the linear-quadratic version of (3.2.11), we must first find the steady state. Thus, we set ε_t = 0 and form the Lagrangian
L = Σ_{t=0}^{∞} β^t [ (k_t + e^{ω_t} k_t^α − k_{t+1})^{1−σ} / (1 − σ) − λ_{t+1}(ω_{t+1} − ρω_t) ] (3.2.12)
Taking the derivative with respect to k_{t+1}, we get

−(k_t + e^{ω_t} k_t^α − k_{t+1})^{−σ} + β( k_{t+1} + e^{ω_{t+1}} k_{t+1}^α − k_{t+2} )^{−σ} ( 1 + α e^{ω_{t+1}} k_{t+1}^{α−1} ) = 0
Eliminating time subscripts in this condition and in ω_{t+1} = ρω_t implies

ω̄ = 0,  k̄ = ( αβ/(1 − β) )^{1/(1−α)}. (3.2.13)
If we set ut = kt+1 − kt and xt = [kt 1 ωt]′, then the matrices of the transition function
for xt are given by
A = [ 1 0 0 ; 0 1 0 ; 0 0 ρ ],  B = [ 1 0 0 ]′,  C = [ 0 0 0 ; 0 0 0 ; 0 0 1 ].
The second element of xt captures constant terms. Since the constraints in the problem are
already linear, we need only approximate the objective function. Taking a second-order
Taylor expansion of the objective function in (3.2.11) around (3.2.13), we obtain
Q = ( k̄^{α(1−σ)}/2 ) ×
    [ (−σα^2 + α^2 − α)/k̄^2     (σα^2 − α^2 + 2α)/k̄            α(1 − σ)/k̄  ;
      (σα^2 − α^2 + 2α)/k̄       2/(1 − σ) − 3α + α^2(1 − σ)     1 + ασ − α  ;
      α(1 − σ)/k̄                1 + ασ − α                      1 − σ       ],
W = [ ασk̄^{−ασ−1}, −(1 + ασ)k̄^{−ασ}, σk̄^{−ασ} ]′/2, and R = −σk̄^{−ασ−α}/2. Thus, given a value for β and the parameters underlying A, B, C, Q, R, and W, we can compute the optimal controls via (3.2.8) and (3.2.9).
As a check on our solution, we can compare it to one found analytically. Since the
state space for the stochastic growth model is small, it is easy to find F analytically using
the fact that (3.2.5) implies:
[ x̃_{t+1} ; λ̃_{t+1} ] − ( H + H^{−1} ) [ x̃_t ; λ̃_t ] + [ x̃_{t−1} ; λ̃_{t−1} ] = 0 (3.2.14)

when r is quadratic and g is linear, where x̃_t = β^{t/2}x_t and λ̃_t = β^{t/2}λ_t/2. Taking the first equation of
(3.2.14), we have
βk_{t+1} − [ 1 + β + (1 − α)(1 − β)^2/(ασ) ] k_t + k_{t−1} = κ_0 + κ_1 ω_t

where κ_0 = β^2 α(α − 1)k̄^{2α−1}/σ and κ_1 = β(1 − ρ^{−1})k̄^α − β^2 αk̄^{2α−1}/σ and, hence,
k_{t+1} − k_t = (ψ − 1)k_t − ψκ_0/(1 − βψ) − [ ψκ_1 ρ/(1 − βρψ) ] ω_t, (3.2.15)

where ψ is the root of s^2 − ( 1 + 1/β + (1 − α)(1 − β)^2/(βασ) ) s + 1/β with modulus less than one. From (3.2.15), we get u_t = −Fx_t.
3.3. A Variant on Vaughan’s Method
Let x_t be the l-dimensional vector of endogenous state variables for the model we are interested in. Let s_t be the m-dimensional vector of exogenous state variables of the model, with

s_{t+1} = P s_t + Q ε_{t+1}

and ε_t iid. Finally, assume that z_t is an n-dimensional vector of choice variables and prices that are, in equilibrium, functions of x_t and s_t. The form of the solution we are seeking is x_{t+1} = A x_t + B s_t and z_t = C x_t + D s_t.
Assume that the first-order equations to be solved, after log-linearization, can be
written as follows
0 = Θ_1 x_{t+1} + Θ_2 x_t + Θ_3 z_t + Θ_4 s_t
0 = E_t[ Φ_1 x_{t+1} + Φ_2 x_t + Φ_3 z_{t+1} + Φ_4 z_t + Φ_5 s_{t+1} + Φ_6 s_t ] (3.3.1)
Theory tells us that we can do this in two steps: find the coefficients on the endogenous
state vector xt and then use the results to compute the coefficients on st.
In the first step, we stack up the matrices of the equilibrium equations as follows:
0 = A_1 [ x_{t+1} ; z_{t+1} ] + A_2 [ x_t ; z_t ] + stochastic shocks (3.3.2)

where

A_1 = [ Θ_1 0 ; Φ_1 Φ_3 ],  A_2 = [ Θ_2 Θ_3 ; Φ_2 Φ_4 ].
To compute A and C, we find generalized eigenvalues Λ (and associated eigenvectors V )
such that
A2V = −A1V Λ. (3.3.3)
For a unique stationary equilibrium, we need the same number of roots inside the unit
circle as there are elements of x. If we sort the eigenvalues and eigenvectors so that the
roots inside one are ordered first, then we have
A = V_{11} Λ_1 V_{11}^{−1} (3.3.4)

C = V_{21} V_{11}^{−1} (3.3.5)

where V_{11} is the l × l upper left partition of V, V_{21} is the n × l lower left partition of V, and Λ_1 contains the eigenvalues inside the unit circle.
Given A and C, solving for B and D involves solving a linear system of equations. To
see this, substitute the form of the solution into (3.3.1)
0 = Θ_1(Ax_t + Bs_t) + Θ_2 x_t + Θ_3(Cx_t + Ds_t) + Θ_4 s_t
  = (Θ_1 A + Θ_2 + Θ_3 C) x_t + (Θ_1 B + Θ_3 D + Θ_4) s_t (3.3.6)

0 = E_t[ Φ_1(Ax_t + Bs_t) + Φ_2 x_t + Φ_3(CAx_t + CBs_t + DPs_t + DQε_{t+1}) + Φ_4(Cx_t + Ds_t) + Φ_5(Ps_t + Qε_{t+1}) + Φ_6 s_t ]
  = (Φ_1 A + Φ_2 + Φ_3 CA + Φ_4 C) x_t + (Φ_1 B + Φ_3 CB + Φ_3 DP + Φ_4 D + Φ_5 P + Φ_6) s_t. (3.3.7)
It turns out that the coefficients on xt in (3.3.6) and (3.3.7) are equal to 0 if we evaluate
them at A in (3.3.4) and C in (3.3.5). We need to set elements of B and D so that the
coefficients on st are also 0. We do this by stacking the coefficients in vectors, using
vec(X) = [x11, . . . , xm1, x12, . . . , xm2, . . . , xmn]′, and setting the result equal to zero:

[ I ⊗ Θ1              I ⊗ Θ3            ] [ vec(B) ]       [ vec(Θ4)        ]
[ I ⊗ (Φ1 + Φ3 C)     P′ ⊗ Φ3 + I ⊗ Φ4  ] [ vec(D) ]  = −  [ vec(Φ5 P + Φ6) ]

where we used the facts that vec(A + B) = vec(A) + vec(B) and vec(ABC) = (C′ ⊗ A) vec(B).
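A sketch of setting up and solving this Kronecker-product system with numpy; the argument names mirror the Θ and Φ blocks above, shapes are kept generic, and column-major ("Fortran"-order) flattening plays the role of vec:

```python
import numpy as np

# Solve the stacked linear system above for vec(B) and vec(D).
def solve_BD(T1, T3, T4, P1, P3, P4, P5, P6, C, P):
    m = P.shape[0]                                   # number of shocks in s_t
    I = np.eye(m)
    lhs = np.block([
        [np.kron(I, T1), np.kron(I, T3)],
        [np.kron(I, P1 + P3 @ C), np.kron(P.T, P3) + np.kron(I, P4)],
    ])
    rhs = -np.concatenate([T4.flatten('F'), (P5 @ P + P6).flatten('F')])
    sol = np.linalg.solve(lhs, rhs)
    nB = T1.shape[1] * m
    B = sol[:nB].reshape((T1.shape[1], m), order='F')
    D = sol[nB:].reshape((T3.shape[1], m), order='F')
    return B, D
```

A quick way to check such a routine is to feed it arbitrary coefficient matrices and verify that the two s_t coefficient blocks in (3.3.6) and (3.3.7) are indeed zero at the returned B and D.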
Example 1.2. Let’s apply this to the growth model that we just solved in Example 1.1.
For that example, xt = kt and st = ωt, and—if we substitute out ct—zt = kt+1. In this
case, the capital stock is in levels which allows us to compare with the solution above.
After substituting for ct, the first-order condition is

0 = −(k_t + e^{ω_t} k_t^α − k_{t+1})^{−σ}
    + β (k_{t+1} + e^{ω_{t+1}} k_{t+1}^α − k_{t+2})^{−σ} (1 + α e^{ω_{t+1}} k_{t+1}^{α−1}).
Linearizing this equation yields the following

0 = a k_{t+2} + b k_{t+1} + c k_t + d ω_{t+1} + e ω_t + constant   (3.3.8)

where, with all variables evaluated at their steady-state values,

a = βσ c^{−σ−1} (1 + α k^{α−1})
b = −σ c^{−σ−1} − βσ c^{−σ−1} (1 + α k^{α−1})² + β c^{−σ} α (α − 1) k^{α−2}
c = σ c^{−σ−1} (1 + α k^{α−1})
d = −βσ c^{−σ−1} k^α (1 + α k^{α−1}) + β c^{−σ} α k^{α−1}
e = σ c^{−σ−1} k^α
The matrices in (3.3.2) are equal to

A1 = [ 1   0
       0   a ],   A2 = [ 0   −1
                         c    b ].
It is easy to show that computing the eigenvalues associated with (3.3.3) is equivalent to
finding the roots of
aλ2 + bλ+ c = 0. (3.3.9)
Since there is only one state variable, the dimension of V11 is 1×1 and therefore cancels in
(3.3.4). Thus, A = λ1 where λ1 is the root of the quadratic equation in (3.3.9) that is inside
the unit circle. Since z_t = x_{t+1}, it must be the case that A = C. It is easy to show
that this is indeed the case by deriving the eigenvectors V that satisfy A2 V = −A1 V Λ;
for this example, they are such that λ1 = V21 V11^{−1}.
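A quick numerical check of this equivalence, using hypothetical values for a, b, and c (not taken from the text):

```python
import numpy as np

# With assumed coefficients a, b, c, the stable root of a*lam^2 + b*lam + c = 0
# coincides with the stable generalized eigenvalue of the pair (A1, A2).
a, b, c = 1.0, -2.5, 1.0
lam1 = min(np.roots([a, b, c]), key=abs)             # root inside the unit circle

A1 = np.array([[1.0, 0.0], [0.0, a]])
A2 = np.array([[0.0, -1.0], [c, b]])
lam = np.linalg.eigvals(np.linalg.solve(-A1, A2))    # solves A2 v = -A1 v lam
stable = min(lam, key=abs)
```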
Homework Exercise 1. Redo Example 1.2 without first substituting for ct. In this case, we
set zt = [kt+1, ct]′ and use the linearized first-order conditions in (3.1.2) and linearize the
resource constraint. Here, it is necessary to compute generalized eigenvalues since A1 will
be singular.
Homework Exercise 2. Consider an extension of the growth model used in Examples 1.1
and 1.2 that allows for some depreciation of capital at rate δ and a positive elasticity of
labor. In this case, replace U(c) of Example 1.1 with the utility function

U(c, h) = [ (c (1 − h)^ψ)^{1−σ} − 1 ] / (1 − σ)

and the resource constraint by

c_t + k_{t+1} − (1 − δ) k_t = e^{ω_t} k_t^α h_t^{1−α},

where ω_t is an AR(1) process as before. Here the decisions are c_t, k_{t+1}, and h_t. The state
variables are k_t, ω_t, and a constant. For this example, compute two sets of solutions, one
set with all decisions and capital stocks in levels (e.g., c_t = a + b k_t + c ω_t) and the second
with these variables in logs (e.g., log c_t = a + b log k_t + c ω_t).
Chapter 4.
Computing Equilibria in Nonlinear Economies
4.1. The Method of Parameterized Expectations
Den Haan and Marcet (1990) describe a method of parameterized expectations. Instead
of approximating a decision function, they approximate the conditional expectation that
typically appears in the first order conditions of a stochastic model. For example, they
consider the following version of the stochastic growth model:
max_{c_t} E [ Σ_{t=0}^∞ β^t u(c_t) | k_0, θ_0 ],   u(c) = (c^{1−τ} − 1)/(1 − τ)

subject to

c_t + k_t − μ k_{t−1} = θ_t k_{t−1}^α,   (1)

ln θ_{t+1} = ρ ln θ_t + ε_{t+1},   E ε_t = 0,  E ε_t² = σ².   (2)
The intertemporal first order condition is given by

c_t^{−τ} = β E [ c_{t+1}^{−τ} (θ_{t+1} α k_t^{α−1} + μ) | k_{t−1}, θ_t ].
Den Haan and Marcet (1990) find an approximation to

E_t [ c_{t+1}^{−τ} (θ_{t+1} α k_t^{α−1} + μ) ]

rather than to c_t or k_{t+1}.
But the choice of the function to approximate is not the main difference between this
method and those we described earlier. The main difference is that the approximation
is based on simulating time series with a guess for φ, projecting the resulting series for
c_{t+1}^{−τ}(θ_{t+1} α k_t^{α−1} + μ) on the guess, and choosing the projection to minimize the mean
squared errors (i.e., they do nonlinear least squares).
To be more specific, suppose that the approximation for the conditional expectation
has the form:
φ(k_{t−1}, θ_t; δ) = exp( P_n(log k_{t−1}, log θ_t) )
where Pn is an nth order polynomial that depends on the logarithm of the state vector.
For example, Den Haan and Marcet (1990) use a first order polynomial. Thus φ is defined
as follows:
φ(k_{t−1}, θ_t; δ) = δ1 k_{t−1}^{δ2} θ_t^{δ3}.
Let ct(δ), kt(δ) be the sequence for consumption and the capital stock that is generated
for a particular δ by the following steps. First, draw a realization for ε from a normal
distribution; only one draw is ever made, and the same realization is reused in every
iteration. Second, generate a realization for θ_t using (2) and the simulated sequence
for ε_t. Third, recursively derive consumption and
capital stock values from
c_t^{−τ} = β φ(k_{t−1}, θ_t; δ)
and (1).
A candidate solution for the optimization problem is δ. Given δ, we can derive values
for the conditional expectation, consumption, and capital. We just showed how to simulate
time series given a candidate solution. What we want to do next is choose a particular δ
– one that minimizes or maximizes some criterion with attractive features. The criterion
that Den Haan and Marcet use is the mean squared error. Define S : IR^m → IR^m as
follows:

S(δ) = argmin_ξ E [ c_{t+1}^{−τ}(δ) ( θ_{t+1} α k_t^{α−1}(δ) + μ ) − φ( k_{t−1}(δ), θ_t; ξ ) ]².

That is, S(δ) is the parameter vector that best fits the series simulated under δ.
The goal then is to find the fixed point of δ = S(δ).
The steps to this fixed point, starting with an initial guess δ0, are as follows:
1. Generate time series for ε_t and θ_t, t = 1, . . . , T;
2. At iteration j, j = 0, 1, . . ., given δj, compute {c_t(δj), k_t(δj)}_{t=0}^T;
3. Run a nonlinear least squares regression of

c_{t+1}^{−τ}(δj) ( θ_{t+1} α k_t^{α−1}(δj) + μ )

on φ(k_{t−1}(δj), θ_t; ξ), with ξ as the regression parameters, to get an approximation
for S(δj);
4. Update as follows:

δ_{j+1} = (1 − λ) δj + λ S(δj),

where a smaller λ implies a more stable mapping;
5. Stop if ||δ_{j+1} − δj|| is small; otherwise, return to step 2.
Den Haan and Marcet (1990) accomplish step 3 by doing a sequence of ordinary least
squares regressions. The trick is to approximate φ(·) by a function that is linear in δ.
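A runnable sketch of the whole iteration for the special case τ = 1 and μ = 0 (full depreciation), where the exact solution c_t = (1 − αβ) θ_t k_{t−1}^α is known and φ is exactly log-linear, so step 3 collapses to the OLS-in-logs trick; all numerical values below are illustrative, not from the text:

```python
import numpy as np

# Den Haan-Marcet iteration for tau = 1, mu = 0, phi = d1 * k**d2 * theta**d3.
alpha, beta, rho, sigma_e, T, damp = 0.33, 0.95, 0.95, 0.01, 2000, 0.5
rng = np.random.default_rng(0)

# Step 1: one fixed draw of eps_t and the implied theta_t, reused throughout
eps = sigma_e * rng.standard_normal(T + 1)
logtheta = np.zeros(T + 1)
for t in range(1, T + 1):
    logtheta[t] = rho * logtheta[t - 1] + eps[t]
theta = np.exp(logtheta)

def simulate(d):
    """Step 2: recover c_t, k_t from c_t^{-1} = beta * phi(k_{t-1}, theta_t; d)."""
    c, k = np.zeros(T + 1), np.zeros(T + 1)
    k[0] = (alpha * beta) ** (1 / (1 - alpha))   # steady state of the exact rule
    for t in range(1, T + 1):
        c[t] = 1.0 / (beta * d[0] * k[t - 1] ** d[1] * theta[t] ** d[2])
        k[t] = theta[t] * k[t - 1] ** alpha - c[t]
    return c, k

d = np.array([1.5, -0.3, -0.9])                  # initial guess for delta
for j in range(300):
    c, k = simulate(d)
    # Step 3: regress log[c_{t+1}^{-1} theta_{t+1} alpha k_t^{alpha-1}]
    # on (1, log k_{t-1}, log theta_t) -- OLS in logs
    y = np.log(alpha * theta[2:] * k[1:-1] ** (alpha - 1) / c[2:])
    X = np.column_stack([np.ones(T - 1), np.log(k[:-2]), logtheta[1:-1]])
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    S = np.array([np.exp(b[0]), b[1], b[2]])
    d_new = (1 - damp) * d + damp * S            # Step 4: damped update
    if np.max(np.abs(d_new - d)) < 1e-10:        # Step 5: stop when converged
        d = d_new
        break
    d = d_new
```

Starting from a deliberately wrong guess, the damped iteration settles on δ ≈ (1/(β(1 − αβ)), −α, −1), which are the coefficients implied by the exact solution.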
4.2. Weighted Residual Methods
Many problems in economics require the solution to a functional equation as an inter-
mediate step. Typically, we seek decision functions that satisfy a set of Euler conditions
or a value function that satisfies Bellman’s equation. In many cases, we cannot derive
analytical solutions for these functions and instead must rely on numerical methods. In
this chapter, we will apply weighted residual and finite element methods to this type of
problem.
In the case of weighted residual methods, the approximate solution to the functional
equation is represented as a linear combination of known basis functions. In many cases,
the basis functions are polynomials. The coefficients on each basis function are the objects
to be computed to obtain an approximate solution. These coefficients are found by setting
the residual of the equation to zero in an average sense. In other words, a weighted integral
of the residual is set to zero.
The finite element method can be viewed as a piecewise application of the weighted
residual method. With the finite element method, the first step in solving the functional
equation is to subdivide the domain of the state space into nonintersecting subdomains
called elements. The domain is subdivided because the method relies on fitting low-order
polynomials on subdomains of the state space rather than high-order polynomials on the
entire state space. The local approximations are then pieced together to get a global
approximation. As the dimensionality of the problem increases, higher-order functions can
be used where needed, with fewer elements.
The primary goal in this chapter is to illustrate the application of weighted residual and
finite element methods by way of examples. We start with a simple differential equation
because the coefficients to be computed satisfy a linear system of equations. For this
problem, we can work through examples without a computer. We then apply the methods
to a deterministic growth model and a stochastic growth model – two standard models
in economics.1 In the growth model examples, the coefficients to be computed satisfy
1 See Taylor and Uhlig (1990) for a summary of alternative algorithms used to solve the stochastic growth model.
25
nonlinear systems of equations. Fortunately, these nonlinear systems are sparse in a way
that can be exploited when they are derived from a finite element method.
4.2.1. The General Procedure
The problem is to find d : IRm → IRn that satisfies a functional equation F (d) = 0, where
F : C1 → C2 and C1 and C2 are function spaces. As an example, think of d as decision or
policy variables and F as the first-order conditions from some maximization problem. Our goal
here is to find an approximation dn(x; θ) on x ∈ Ω which depends on a finite-dimensional
vector of parameters θ = [θ1, θ2, . . . , θn]′. Weighted residual methods assume that dn is a
finite linear combination of known functions, ψi(x), i = 0, . . . , n, called basis functions:
dn(x; θ) = ψ0(x) + Σ_{i=1}^n θi ψi(x).   (4.2.1)
The functions ψi(x), i = 0, . . . , n are typically simple functions. Standard examples of basis
functions include simple polynomials (for example, ψ0(x) = 1, ψi(x) = xi), orthogonal
polynomials (for example, Chebyshev polynomials), and piecewise linear functions.
Figure 3 displays the first five polynomials in the class of Chebyshev polynomials,
which is a popular choice for the basis functions. Chebyshev polynomials are defined on
[−1, 1] and are given recursively as follows: p0(x) = 1, p1(x) = x, and
pi (x) = 2x pi−1 (x) − pi−2 (x) , i = 2, 3, 4, . . .
(or, nonrecursively, as pi(x) = cos(i arccos x)). The domain Ω is not typically given by
[−1, 1]. If the domain is instead [a, b], then we can use ψi(x) = pi−1(2(x− a)/(b− a) − 1)
for i = 1, 2, . . . and ψ0(x) = 0.
[Figure 3. Five Chebyshev Polynomial Basis Functions: p1(x), . . . , p5(x) plotted on the domain [−1, 1].]
Chebyshev polynomials constitute a set of orthogonal polynomials with respect to the
weight function w(x) = 1/√(1 − x²): ∫_{−1}^{1} pi(x) pj(x) w(x) dx = 0 for all i ≠ j. Using
orthogonal polynomials in the representation dn rather than the simple polynomials x^i may
be preferable as n gets large. For large n, it is difficult to distinguish x^n from x^{n+1}. Thus,
the approximation is hardly improved when we add x^{n+1}. With orthogonal polynomials,
however, pn is easily distinguished from p_{n+1} because they are orthogonal to each other.
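A short sketch verifying the recursion above against the closed form cos(i arccos x), and checking the orthogonality of p2 and p3 under w(x) = 1/√(1 − x²) with a 20-point Gauss-Chebyshev rule (the node and weight formulas are standard, not from the text):

```python
import numpy as np

# Chebyshev polynomials via p_i = 2x p_{i-1} - p_{i-2}.
def cheb(i, x):
    p_prev, p = np.ones_like(x), np.asarray(x)
    if i == 0:
        return p_prev
    for _ in range(i - 1):
        p_prev, p = p, 2 * x * p - p_prev
    return p

x = np.linspace(-1, 1, 201)
# Gauss-Chebyshev nodes cos((2l-1)pi/(2n)) with equal weights pi/n integrate
# polynomials times w(x) = 1/sqrt(1-x^2) exactly up to degree 2n-1
nodes = np.cos((2 * np.arange(1, 21) - 1) * np.pi / 40)
integral = np.pi / 20 * np.sum(cheb(2, nodes) * cheb(3, nodes))
```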
Figure 4 displays basis functions that can be used to construct a piecewise linear
representation for dn. These basis functions are of the form

ψi(x) = (x − x_{i−1})/(x_i − x_{i−1})   if x ∈ [x_{i−1}, x_i],
        (x_{i+1} − x)/(x_{i+1} − x_i)   if x ∈ [x_i, x_{i+1}],
        0                               elsewhere.   (4.2.2)
We do not need to have the points xi, i = 1, . . . , n equally spaced. Therefore, if we want
to represent a function that has large gradients or kinks in certain places – say, because
inequality constraints bind – then we can cluster points in those regions. In regions where
the function is near-linear, we do not need many points.
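A minimal sketch of these "hat" functions on an unevenly spaced grid (the node values are illustrative); ψi is one at node xi, zero at every other node, and linear in between, so the combination Σ θi ψi(x) linearly interpolates the nodal values:

```python
import numpy as np

grid = np.array([0.0, 0.5, 1.5, 3.0, 5.0])      # unevenly spaced nodes

def psi(i, x):
    """Hat function (4.2.2) around grid[i], built with linear interpolation."""
    indicator = np.zeros(len(grid))
    indicator[i] = 1.0
    return np.interp(x, grid, indicator)

def approx(theta, x):
    """Piecewise linear representation sum_i theta_i psi_i(x)."""
    return sum(th * psi(i, x) for i, th in enumerate(theta))

theta = grid ** 2                                # interpolate f(x) = x^2 at the nodes
```

At x = 1.0, halfway between the nodes 0.5 and 1.5, the approximation returns the average of the nodal values 0.25 and 2.25, that is, 1.25.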
[Figure 4. Five Piecewise Linear Basis Functions: ψ1(x), . . . , ψ5(x) plotted on the domain [0, 10] with nodes x1, . . . , x5.]
We define the residual equation as the functional equation evaluated at the approxi-
mate solution dn:
R (x; θ) = F (dn (x; θ)) .
We want to choose θ so that R(x; θ) is close to zero for all x. Weighted residual methods
get the residual close to zero in a weighted integral sense. That is, we choose θ so that

∫_Ω φi(x) R(x; θ) dx = 0,   i = 1, . . . , n,

where φi(x), i = 1, . . . , n, are weight functions. Note that φi(x) and ψi(x) can be different
functions. Alternatively, the weighted integral can be written

∫_Ω w(x) R(x; θ) dx = 0,   (4.2.3)

where w(x) = Σ_i ωi φi(x) and (4.2.3) must hold for any nonzero weights ωi, i = 1, . . . , n.
Therefore, instead of setting R(x; θ) to zero for all x ∈ Ω, the method sets a weighted
integral of R to zero.
We consider three specific sets of weight functions and, hence, three ways of deter-
mining the coefficients θ1, . . . , θn.
1. Least Squares: φi(x) = ∂R(x; θ)/∂θi. This set of weights can be derived by calculating
the first-order conditions for the following optimization problem:

min_θ ∫_Ω R(x; θ)² dx.
2. Collocation: φi(x) = δ(x − xi), where δ is the Dirac delta function. This set of
weights implies that the residual is set to zero at n points x1, . . . , xn called the col-
location points: R(xi; θ) = 0, i = 1, . . . , n. If the basis functions are chosen from a
set of orthogonal polynomials with collocation points given as the roots of the nth
polynomial in the set, the method is called orthogonal collocation.
3. Galerkin: φi(x) = ψi(x). In this case, the set of weight functions is the same as the
basis functions used to represent d. Thus, the Galerkin method forces the residual to
be orthogonal to each of the basis functions. As long as the basis functions are chosen
from a complete set of functions, then equation (4.2.1) represents the exact solution,
given that enough terms are included. The Galerkin method is motivated by the fact
that a continuous function is zero if it is orthogonal to every member of a complete
set of functions.
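As noted earlier, a simple differential equation makes the mechanics transparent because the coefficients then satisfy a linear system. Here is a hypothetical collocation sketch (a toy problem, not from the text) for u′(x) = u(x), u(0) = 1 on [0, 1], with u_n(x) = 1 + Σ_i θi x^i so the boundary condition holds by construction and the residual R(x; θ) = u_n′(x) − u_n(x) is linear in θ:

```python
import numpy as np

n = 5
xc = np.linspace(0.1, 1.0, n)                    # collocation points
# R(x; theta) = sum_i theta_i (i x^{i-1} - x^i) - 1; set it to zero at each point
A = np.array([[i * xv ** (i - 1) - xv ** i for i in range(1, n + 1)] for xv in xc])
theta = np.linalg.solve(A, np.ones(n))

def u(xv):
    """Approximate solution; the exact solution is exp(x)."""
    return 1 + sum(t * xv ** (i + 1) for i, t in enumerate(theta))
```

With only five coefficients the approximation tracks e^x closely on the whole interval, which is the pattern the growth-model examples below exploit.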
To illustrate weighted residual methods, we apply the methods to standard growth
models. (See Aiyagari and McGrattan 1997, Braun and McGrattan 1993, and Chari,
Kehoe and McGrattan 1997 for other examples.)
4.2.2. Applied to the Deterministic Growth Model
We start with a version of the deterministic growth model:
maxct
∞∑
t=0
βtu (ct)
subject to ct + kt = f (kt−1) ,
(4.2.4)
where ct is consumption at date t, kt is the capital stock at date t, u(·) is the utility
function, f(·) is the production function, and β < 1 is a discount factor.2 From the Euler
2 See Sargent (1987) for a detailed discussion of the problems described here and in the next section.
equation, the functional equation is given by

F(c)(k) = β [ u′(c(f(k) − c(k))) / u′(c(k)) ] f′(f(k) − c(k)) − 1 = 0,
and the boundary condition is given by c(0) = 0. In this case, we want to compute an
approximation cn(k; θ) to the consumption function that sets F (c) approximately equal to
zero for all k.
Example 2.1. Let u(c) = ln(c) and f(k) = λk^α. In this case, the functional equation is

F(c)(k) = βαλ (λk^α − c(k))^{α−1} c(k) / c(λk^α − c(k)) − 1.

The solution for consumption in this case is

c(k) = (1 − βα) λk^α.
Suppose that we want to obtain an approximate solution of the form

cn(k; θ) = θ1 k + θ2 k² + · · · + θn k^n,

which satisfies the boundary condition at k = 0. The residual equation is therefore

R(k; θ) = βαλ (λk^α − Σ_{j=1}^n θj k^j)^{α−1} Σ_{i=1}^n θi k^i
          / [ Σ_{i=1}^n θi (λk^α − Σ_{j=1}^n θj k^j)^i ] − 1.
To apply weighted residual methods, we have to compute integrals of the form

∫_0^{k̄} φi(k) R(k; θ) dk,   i = 1, . . . , n,   (4.2.5)

where k̄ is the upper bound of the domain for the capital stock. Since the residual R is a
nonlinear function of θ, it makes sense to do numerical integration. If we apply Gaussian
quadrature (which is typically done), then equation (4.2.5) is replaced by

Σ_l ωl φi(kl) R(kl; θ),   i = 1, . . . , n,   (4.2.6)

where ωl are the quadrature weights and the grid points kl are the quadrature abscissas. (See
Press et al. 1986 for the quadrature formulas and a description of how they are derived.)
The values for ωl and kl do not depend on the function being integrated (φi(k)R(k; θ) in
this case). In other words, once we know the bounds of integration (for example, 0 and
k̄) and the number of quadrature points, we can look up the ωl's and kl's in a standard
quadrature table.3 Depending on the specific quadrature rule (for example, Legendre,
Chebyshev, Hermite) used, the ωl's and kl's will differ, but the calculations of R and φi will
look the same no matter what quadrature rule is used.
The final step is to solve the system of equations in (4.2.6). In this case, the system is
nonlinear. The problem is to find θ such that G(θ) = 0, where G has the same dimension
as θ. Applying Newton's method to G(θ) = 0 means iterating on

θ_{j+1} = θj − [ ∂G(θ)/∂θ |_{θ=θj} ]^{−1} G(θj),   j = 0, 1, 2, . . .

with some initial guess θ0, where θj is the vector of unknown coefficients at the jth iteration.
Notice that as we iterate, we solve a sequence of problems of the following form: find θ
such that Aθ = b, where A is the Jacobian matrix ∂G/∂θ evaluated at θj and b is the
function itself, G(θj).
For the three weighted residual applications below, assume that α = 0.25, β = 0.96,
λ = 1/(αβ), and k̄ = 2. For this set of parameters, the steady-state capital stock is equal to
one. Assume also that the quadrature rule is Legendre with 20 quadrature abscissas used to
approximate the integral in (4.2.5). In this case,

ωl = ∫_{−1}^{1} Π_{i=1, i≠l}^{20} (x − xi)/(xl − xi) dx,   l = 1, . . . , 20,

and x1, . . . , x20 are the roots of the 20th Legendre polynomial, found recursively
as follows: p0(x) = 1, p1(x) = x, and i pi(x) = (2i − 1) x pi−1(x) − (i − 1) pi−2(x) for
i = 2, . . . , 20. Since k̄ = 2, the points kl are given by kl = xl + 1, l = 1, . . . , 20.
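In practice the 20 abscissas and weights need not be tabulated by hand; a sketch using numpy's built-in Gauss-Legendre rule, shifted from [−1, 1] to [0, 2]:

```python
import numpy as np

# 20-point Gauss-Legendre rule on [-1, 1], mapped to [0, 2] (kbar = 2); since
# the interval has the same length, the map is k_l = x_l + 1 with unchanged
# weights.
x, w = np.polynomial.legendre.leggauss(20)
kl = x + 1.0
# sanity check: the rule integrates k^3 over [0, 2] exactly (integral = 4)
integral_k3 = np.sum(w * kl ** 3)
```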
5a. Least squares. To apply the method of least squares, we set φi(k) = ∂R(k; θ)/∂θi.
Writing k̃ = λk^α − cn(k; θ) for next period's capital stock, differentiating the residual
with respect to θl gives

∂R(k; θ)/∂θl = [ βαλ k̃^{α−1} / cn(k̃; θ) ] { k^l [ 1 − (α − 1) cn(k; θ)/k̃ ]
               − [ cn(k; θ)/cn(k̃; θ) ] ( k̃^l − k^l Σ_{i=1}^n i θi k̃^{i−1} ) },

l = 1, . . . , n. Figure 5 displays the approximate solution cn for
n = 5 along with the exact solution. Since the derivative of the true function is infinite
at k = 0 and relatively small for high values of k, we must add more polynomials to
3 The weights and abscissas are chosen so that the n-point quadrature rule is exact for integrals of all polynomials of order 2n − 1 times some weight function, which depends on the specific rule. For example, Gauss-Legendre quadrature uses a weight function of 1 and Gauss-Chebyshev quadrature uses a weight function of 1/√(1 − x²), where x is defined on (−1, 1).
completely resolve the solution at all capital stocks. We also plot the result for a more
restricted grid on the capital stocks, namely, [1/3, 5/3]. This is the grid Judd (1992) uses
when evaluating weighted residual methods for the deterministic growth model. For both
approximations, we assume that n = 5. Notice that the approximation on [1/3, 5/3]
is very close to the true solution; on this restricted domain, the exact solution is very
smooth – almost linear.
[Figure 5. Two Least-squares Approximations for the Deterministic Growth Model: the exact consumption function and approximations on [0, 2] and [1/3, 5/3].]
5b. Collocation. To apply the collocation method, we set φi(k) = δ(k − ki), where ki,
i = 1, . . . , n, are collocation points in [0, k̄]. Figure 6 shows two approximations: one with
five evenly spaced collocation points between 0.1 and 2 and one with five evenly spaced
collocation points between 1/3 and 5/3. The problem of fitting functions with steep gradients
becomes acute in this case, which is why we avoid the region of capital stocks below 0.1.
Even so, the approximation on [0.1, 2] is not very accurate. It is clear that we need better
choices for basis functions and collocation points to make this method competitive with
least squares. On [1/3, 5/3], we find that the approximation is not quite as good as that for
least squares, but it is not too different from the exact solution. Here again, the fit is good
because the exact solution is very smooth on [1/3, 5/3].
[Figure 6. Two Collocation Approximations for the Deterministic Growth Model: the exact consumption function and approximations on [0.1, 2] and [1/3, 5/3].]
5c. Galerkin. To apply the Galerkin method, we set φi(k) = k^i, i = 1, . . . , n. Figure 7
shows approximate functions on [0, 2] and [1/3, 5/3] along with the exact solution. The results
here are similar to the results of the least squares method.
[Figure 7. Two Galerkin Approximations for the Deterministic Growth Model: the exact consumption function and approximations on [0, 2] and [1/3, 5/3].]
Because we need to include more polynomials, which in the case of k^i, i = 1, . . . , n,
become similar to each other as n gets large, it makes sense to use a class of orthogonal
polynomials. Judd (1992) uses a representation for consumption of the form

cn(k; θ) = Σ_{i=1}^n θi ψi(k),   (4.2.7)

where ψi(k) = pi−1(2(k − k̲)/(k̄ − k̲) − 1), k̲ is a lower bound on the capital stocks, and
pi(x) is the ith Chebyshev polynomial defined earlier.4
Example 2.2. In this case, assume that u(c) = c^{1−τ}/(1 − τ) and f(k) = λk^α + (1 − δ)k. Let
τ = 5, α = 0.25, δ = 0.025, β = 0.99, and λ = (1 − β(1 − δ))/(αβ) (so that the steady-state
capital stock is equal to 1). Let cn take the form of (4.2.7) with n = 10. Figure 8 displays the
4 Note that this approximation will not satisfy the boundary condition c(0) = 0 when k̲ = 0 for any θ. However, if we make a slight modification, namely, ψi(k) = k pi−1(2k/k̄ − 1) defined on [0, k̄], then the boundary condition is satisfied for all possible choices of θ.
approximate solutions for k̲ = 0.03, k̄ = 2 (marked with squares) and for k̲ = 0.1, k̄ = 1.9
(marked with circles), along with the exact solution.5 The points marked by squares or
circles are located at the quadrature abscissas.
[Figure 8. Two Galerkin Approximations with Chebyshev Basis Functions for the Deterministic Growth Model: the exact consumption function and approximations on [0.03, 2] and [0.1, 1.9].]
It is clear from Figure 8 that more polynomials are needed for a good approximation
on [0.03,2]. This is because we are trying to approximate a very steep part of the function
and a very flat part of the function using the same basis functions. When we restrict
the domain to [0.1,1.9], there is a significant improvement in the approximation over this
region of the state space. The approximation is visually indistinguishable from the exact
solution. In this restricted region of the domain, the function does not have any large
gradients.
5 What we'll call the exact solution here is actually a finite element approximation with a large number of elements. Although this is itself an approximation, doubling the number of elements leaves Figure 8 unchanged.
Suppose that, instead of using Chebyshev polynomials, we apply the Galerkin method
with piecewise linear basis functions as is done for the finite element method.
Example 2.3. Assume that u(·), f(·), and the parameterization are the same as in Example
2.2. Let x1 = 0, x11 = 2, and xi = xi−1 + 0.005 exp(0.574(i − 2)). This partition implies
that there are 10 elements with lengths that increase exponentially. Thus, there will be
more points near the origin, where the function has a large (infinite in this case) gradient.
To compute the weighted integral, we use a Legendre quadrature rule with two quadrature
points per element. On an element of length ℓe, the Legendre quadrature rule with two
quadrature points implies the following weights and abscissas for (4.2.6): ωl = ℓe/2, l =
1, 2, and k1 = ke+0.211ℓe, k2 = ke+0.789ℓe, where ke is the first endpoint of the element.
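The grid construction and per-element rule can be sketched as follows; the nodes 0.211ℓe and 0.789ℓe are the two-point Gauss-Legendre abscissas (1 ∓ 1/√3)/2 scaled by the element length, so the rule is exact for cubics on each element:

```python
import numpy as np

# Exponentially spaced element grid: x_1 = 0 and
# x_i = x_{i-1} + 0.005 exp(0.574 (i - 2)) for i = 2, ..., 11 (0-based below).
x = np.zeros(11)
for i in range(1, 11):
    x[i] = x[i - 1] + 0.005 * np.exp(0.574 * (i - 1))

def element_rule(e):
    """Nodes and weights of the two-point rule on element [x[e], x[e+1]]."""
    le = x[e + 1] - x[e]
    nodes = x[e] + le * np.array([0.5 - 0.5 / np.sqrt(3), 0.5 + 0.5 / np.sqrt(3)])
    return nodes, np.full(2, le / 2)

# check: the piecewise rule integrates k^3 exactly over [0, x_11]
total = sum(np.sum(w * n ** 3) for n, w in (element_rule(e) for e in range(10)))
```

The element lengths grow exponentially, which clusters nodes near the origin, where the consumption function has an infinite gradient.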
Figure 9 displays the finite element approximation along with the exact solution.
Because the finite element method is a piecewise application of a weighted residual method,
it is possible to get a more accurate approximation over the entire [0,2] domain—we are
not using the same basis functions in the very steep region and the very flat region of the
consumption function.
[Figure 9. Finite Element Approximation for the Deterministic Growth Model: the exact and approximate consumption functions on [0, 2].]
To obtain the approximation in Figure 9, the main computational task is the inversion
of a 10×10 matrix. In this matrix, 68 of the 100 elements are zeros, and the structure of the
matrix is band diagonal. As the number of unknowns becomes large, it becomes expensive
and, in some cases, infeasible to invert the matrix without using inversion routines that
exploit the fact that the matrix is band diagonal. (See Saad 1996.)
4.2.3. Applied to the Stochastic Growth Model
Suppose that, instead of the deterministic growth model, we want to calculate the decision
functions for the stochastic growth model in which decisions depend on the capital stock
and a stochastic shock. 6 The stochastic growth model assumes that output at date t can
be allocated either to current consumption ct or to current investment it. The consump-
tion/savings decision is assumed to be optimal in that the preferences of households are
maximized. The preferences are given by

E [ Σ_{t=0}^∞ β^t u(c_t) | k_{−1} ],   0 < β < 1,   (4.2.8)
where kt is the capital stock at t and k−1 is known. The maximization of equation (4.2.8)
is done subject to the feasibility constraints

c_t + k_t − (1 − δ) k_{t−1} = λ_t k_{t−1}^α,   0 < α < 1,  0 ≤ δ ≤ 1,   (4.2.9)

the nonnegativity constraints c_t ≥ 0, k_t ≥ 0 for all t ≥ 0, and the process for the
technology shock,

ln λ_t = ρ ln λ_{t−1} + ε_t,   −1 < ρ < 1,   (4.2.10)
where εt is a serially uncorrelated, normally distributed random variable with mean zero
and variance σ2. Because ε is normally distributed, it does not have a compact support.
The technology shock in this case takes on values between 0 and infinity. On the computer,
we cannot specify an upper bound of infinity. Instead, we can either specify a large
upper bound (in which the probability of observing a larger value is small) or make a
transformation of variables and work with a bounded interval. Let z = tanh(ln(λ)), which
is defined on [−1, 1]. Then we can rewrite equation (4.2.10) as follows:

z_t = tanh( ρ tanh^{−1}(z_{t−1}) + √2 σ ν_t ),

where ν_t = ε_t/(√2 σ).

6 See Judd (1992) for more details on spectral methods as applied to this problem and McGrattan (1996) for more details on the finite element method as applied to this problem.
Because the stochastic shock takes on a continuum of values, we need to solve a
two-dimensional problem. The representation of the approximate solution is then

cn(k, z; θ) = Σ_{i=1}^n θi ψi(k, z).
A simple set of basis functions is all products of the elements of {1, k, k², . . . , k^{nk}} and
{1, z, z², . . . , z^{nz}}. Alternatively, we can use all products of the elements of two sets of
orthogonal polynomials. In either case, however, the number of unknowns starts to add up
quickly, especially if a large number of polynomials are needed to approximate consumption
at both high and low values of the capital stock.
One way to keep the problem tractable is to use the set of complete polynomials rather
than all products of terms in {k^i}_{i=0}^{nk} and {z^i}_{i=0}^{nz} (for example, the basis
{1, k, z, k², kz, z²} rather than {1, k, z, kz, k², z², k²z, kz², k²z²}). Using the set of complete
polynomials allows us to approximate higher-order functions but limits the number of
unknown coefficients.7 Another way to keep the problem tractable is to apply a finite
element method.
As earlier examples show, the system of equations to be solved for the unknown coefficients
θ is typically very sparse. Therefore, in big problems, we do not need as much storage as
in a typical spectral method, and we can apply algorithms for solving sparse systems of
equations.
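A small sketch of the savings from complete polynomials: enumerating exponent pairs (i, j), where each pair stands for the monomial k^i z^j, the tensor product of degree-2 polynomials in k and z has nine terms while the complete set of total degree 2 has six:

```python
from itertools import product

def tensor_basis(nk, nz):
    """All products of {1, ..., k^nk} and {1, ..., z^nz}, as exponent pairs."""
    return [(i, j) for i, j in product(range(nk + 1), range(nz + 1))]

def complete_basis(d):
    """Complete polynomials: monomials with total degree i + j <= d."""
    return [(i, j) for i, j in product(range(d + 1), range(d + 1)) if i + j <= d]
```

The gap widens quickly with the degree and with the dimension of the state space, which is why complete polynomials help in larger problems.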
Consider application of the finite element method to the stochastic growth model.
The first step is to write out the residual equation using the first-order condition for the
problem in (4.2.8):
R(k, z; θ) = (β/√π) ∫_{−∞}^{∞} [ cn(k̃, z̃; θ)^{−τ} / cn(k, z; θ)^{−τ} ]
             ( α k̃^{α−1} √((1 + z̃)/(1 − z̃)) + 1 − δ ) e^{−ν²} dν − 1 = 0,   (4.2.11)

where

k̃ = k^α √((1 + z)/(1 − z)) + (1 − δ) k − cn(k, z; θ),
z̃ = tanh( ρ tanh^{−1}(z) + √2 σ ν ),

cn(0, z; θ) = 0, ν is distributed normally with mean zero and variance 1/2, and the domain
for the state space is Ω = [0, k̄] × [−1, 1]. If we apply a Gauss-Hermite quadrature rule
7 See Judd (1992) for a comparison of complete polynomials and tensor products in the stochastic growth model example.
when computing the integral in equation (4.2.11), then the residual equation becomes

R(k, z; θ) ≈ (β/√π) Σ_{l=1}^{mν} [ cn(k̃, z̃l; θ)^{−τ} / cn(k, z; θ)^{−τ} ]
             ( α k̃^{α−1} √((1 + z̃l)/(1 − z̃l)) + 1 − δ ) ωl − 1,

where z̃l = tanh(ρ tanh^{−1}(z) + √2 σ νl) and νl, ωl, l = 1, . . . , mν, are the abscissas and
weights for an mν-point quadrature rule. (For the quadrature formulas, see Press et
al. 1986.)
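A sketch of the expectation step in isolation, using numpy's Gauss-Hermite rule for ν ~ N(0, 1/2); the ρ and σ values and the helper name are illustrative:

```python
import numpy as np

rho, sigma = 0.95, 0.1
# hermgauss gives nodes/weights for integrals of g(v) e^{-v^2}; dividing by
# sqrt(pi) turns this into an expectation over nu ~ N(0, 1/2)
nodes, weights = np.polynomial.hermite.hermgauss(10)

def expect_next_z(z, g):
    """Approximate E[ g(z') | z ] with z' = tanh(rho*atanh(z) + sqrt(2)*sigma*nu)."""
    znext = np.tanh(rho * np.arctanh(z) + np.sqrt(2) * sigma * nodes)
    return np.sum(weights * g(znext)) / np.sqrt(np.pi)
```

With g ≡ 1 the rule integrates e^{−ν²} to √π and the expectation is exactly 1, and by symmetry the expected next-period log shock at z = 0 is 0.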
The second step in applying the finite element method is to divide up the domain
into smaller nonoverlapping subdomains called elements. In this problem, the domain is
two-dimensional and rectangular: Ω = [0, k̄] × [−1, 1]. A reasonable choice for the element
shape, therefore, is a rectangle. Suppose that we divide the domain into smaller rectangular
subdomains which do not overlap.8 Each element will be a rectangle in Ω, say, [ki, ki+1]
× [zj, zj+1], where ki is the ith grid point for the capital stock and zj is the jth grid
point for the technology shock.
We consider two types of approximations over the rectangular elements: linear and
quadratic. Suppose the representation for consumption on some element e is linear,
cne(k, z) = a + b k + c z + d kz.   (4.2.12)

Because there are four unknowns, we require an element with four nodes. If we place
the four nodes at the corners of the rectangle, then we can uniquely define the geometry
of the element and use the values of the solution at the four nodes to pin down the
constants in equation (4.2.12). That is, as in the one-dimensional case, we can rewrite the
approximation in (4.2.12) so that cne(k, z; θ) = Σ_{i=1}^4 θ_i^e ψ_i^e(k, z), where the basis
functions are such that ψ_i^e is 1 at node i and zero at the other three nodes of the element.
Before we give formulas for the basis functions, it is convenient to first consider a
mapping from global coordinates (k, z) to local coordinates (ξ, η) defined on a master
element. This is done for convenience, since the master element has a fixed set of coor-
dinates, while each element in Ω has a different set of coordinates. Thus, we can con-
struct basis functions once but use them for each element. Consider functions ξ(k) and
η(z) that map a typical element [ki, ki+1] × [zj , zj+1] to the square [−1, 1] × [−1, 1]; that
is, ξ(k) = (2k − ki − ki+1)/(ki+1 − ki) and η(z) = (2z − zj − zj+1)/(zj+1 − zj). As-
sume that the four nodes of the master element are (−1,−1), (1,−1), (1, 1), and (−1, 1)
8 Extensions to non-rectangular element shapes require additional work but are not as useful in economic problems as in engineering problems, which sometimes involve irregularly shaped domains. (See, for example, Hughes 1987 and Reddy 1993.)
using the local coordinates. In this case, the basis functions are constructed so that
cne(ξ, η; θ) = Σ_i θ_i^e ψ_i^e(ξ, η) with θ_1^e = cne(−1, −1; θ), θ_2^e = cne(1, −1; θ),
θ_3^e = cne(1, 1; θ), and θ_4^e = cne(−1, 1; θ). These restrictions imply that

cne(ξ, η; θ) = (1/4)(1 − ξ)(1 − η) θ_1^e + (1/4)(1 + ξ)(1 − η) θ_2^e
             + (1/4)(1 + ξ)(1 + η) θ_3^e + (1/4)(1 − ξ)(1 + η) θ_4^e.   (4.2.13)
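A minimal sketch of these four bilinear shape functions on the master element, using the compact form ψi(ξ, η) = (1 + ξi ξ)(1 + ηi η)/4, which reproduces the four terms of (4.2.13):

```python
# Corner nodes of the master element [-1, 1] x [-1, 1], ordered as in the text.
nodes = [(-1, -1), (1, -1), (1, 1), (-1, 1)]

def psi(i, xi, eta):
    """Bilinear shape function: one at node i, zero at the other three nodes."""
    xi_i, eta_i = nodes[i]
    return 0.25 * (1 + xi_i * xi) * (1 + eta_i * eta)

def c_e(theta, xi, eta):
    """Element approximation c_n^e(xi, eta; theta) = sum_i theta_i psi_i."""
    return sum(theta[i] * psi(i, xi, eta) for i in range(4))
```

The shape functions sum to one everywhere on the element (a partition of unity), so c_e reproduces constants exactly and interpolates the nodal values θ_i^e.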
To attain a more accurate approximation, we can increase the number of elements
while retaining linear basis functions or use higher-order polynomials. Consider, for exam-
ple, quadratic functions in two dimensions. One simple way to construct these functions
is to take the product of one-dimensional quadratic polynomials. A unique set of coeffi-
cients for the polynomial requires that there be nine nodes and, hence, nine interpolation
functions. In this case, the approximation on the master element [−1, 1] × [−1, 1] is given
by
cne(ξ, η; θ) = (1/4) ξ(ξ − 1) η(η − 1) θ_1^e + (1/4) ξ(ξ + 1) η(η − 1) θ_2^e
             + (1/4) ξ(ξ + 1) η(η + 1) θ_3^e + (1/4) ξ(ξ − 1) η(η + 1) θ_4^e
             + (1/2) (1 − ξ²) η(η − 1) θ_5^e + (1/2) ξ(ξ + 1)(1 − η²) θ_6^e
             + (1/2) (1 − ξ²) η(η + 1) θ_7^e + (1/2) ξ(ξ − 1)(1 − η²) θ_8^e
             + (1 − ξ²)(1 − η²) θ_9^e.   (4.2.14)
Example 8. Let τ = 1, δ = 0, β = 0.95, α = 0.33, ρ = 0.95, and σ = 0.1. Assume that
the partition on z is given by [−0.391, −0.123, 0.123, 0.391] and that the partition on k
is given by [0, 0.010, 0.036, 0.102, 0.273, 0.714, 1.85]. We set the number of quadrature
points on each element to nine, that is, three points for integration with respect to the
capital stock and three points for integration with respect to the technology shock. For
integration over ν, we set the number of quadrature points, mν , equal to 10.
Figure 10 displays the approximate piecewise linear solution (marked with a square)
along with the exact solution. Even though there are only 18 elements, it is hard to
distinguish the two.
[Figure 10 here. Title: Two Finite Element Approximations for the Stochastic Growth Model. The figure plots the exact and approximate solutions for consumption (vertical axis, 0 to 1.4) against the capital stock (horizontal axis, 0 to 2). Legend: Exact; 18 Element, Linear Bases; 18 Element, Quadratic Bases.]
Example 9. Suppose that we use the same parameterization as in Example 8, but instead of linear basis functions, we use the quadratic functions in equation (4.2.14). In Figure 10, the solution is marked with a circle. Notice that the fit with quadratic bases is slightly better than that with linear bases; however, since the coarse piecewise linear approximation is very accurate, there is not much room for improvement.
Chapter 5.
Maximum Likelihood Estimation
We describe how to use the Kalman filter to obtain an innovations representation and how to use it to compute a Gaussian likelihood function. Finally, we display formulas for the gradient of the log of the Gaussian likelihood function with respect to the free parameters of an economic model. These formulas are messy but easy to program, and they are useful for accelerating the process of maximizing the likelihood function.
Constructing an innovations representation is a key step in deducing the implications
of a model for vector autoregressions and for evaluating a Gaussian likelihood function.9 An
innovations representation is a state-space representation in which the vector white noise
driving the system is of the correct dimension (equal to that of the vector of observables)
and lives in the proper space (the space spanned by current and lagged values of the
observables).
Suppose that our theorizing and data collection lead us to a system of the form
xt+1 = Aoxt + Cwt+1
zt = Gxt + vt
vt = Dvt−1 + ηt,
(5.1)
where D is a matrix whose eigenvalues are bounded in modulus by unity and ηt is a
martingale difference sequence that satisfies
Eηtη′t = R
Ewt+1η′s = 0 for all t and s.
In Eq. (5.1), vt is a serially correlated measurement error process that is orthogonal to the
xt process.
We define the quasi-differenced process

z̄t ≡ zt+1 − Dzt. (5.2)

From Eq. (5.1) and the definition (5.2) it follows that

z̄t = (GAo − DG)xt + GCwt+1 + ηt+1.
9 The calculations in this section are versions of ones described by Anderson and Moore (1979).
Then (xt, z̄t) is governed by the state-space system

xt+1 = Aoxt + Cwt+1
z̄t = Ḡxt + GCwt+1 + ηt+1, (5.3)

where Ḡ = GAo − DG. This system has nonzero covariance between the state noise Cwt+1 and the "measurement noise" (GCwt+1 + ηt+1). Let [Kt, Σt] be the Kalman gain and state covariance matrix associated with the Kalman filter, namely,

Kt = (CC′G′ + AoΣtḠ′) Ωt^{−1} (5.4)
Ωt = ḠΣtḠ′ + R + GCC′G′ (5.5)
Σt+1 = AoΣtAo′ + CC′ − (CC′G′ + AoΣtḠ′) Ωt^{−1} (ḠΣtAo′ + GCC′). (5.6)

Then an innovations representation for system (5.3) is

x̂t+1 = Aox̂t + Ktut
z̄t = Ḡx̂t + ut, (5.7)

where

x̂t = E[xt | z̄t−1, z̄t−2, . . . , z̄0, x̂0]
ut = z̄t − E[z̄t | z̄t−1, . . . , z̄0, x̂0]
Ωt ≡ Eutu′t = ḠΣtḠ′ + R + GCC′G′.
Initial conditions for the system are x̂0 and Σ0. From definition (5.2), it follows that [zt+1, zt, . . . , z0, x̂0] and [z̄t, z̄t−1, . . . , z̄0, x̂0] span the same space, so that

x̂t = E[xt | zt, zt−1, . . . , z0, x̂0]
ut = zt+1 − E[zt+1 | zt, . . . , z0, x̂0].

So ut is said to be an innovation in zt+1.
Equation (5.6) is a matrix Riccati difference equation. The Kalman filter has a steady-
state solution if there exists a time-invariant matrix Σ which satisfies Eq. (5.6), i.e., one
that satisfies the algebraic matrix Riccati equation. In this case, the same computational
procedures used for the optimal linear regulator problem apply. This is a benefit of the
duality of filtering and control referred to earlier. The steady-state Kalman gain, K, is
given by Eq. (5.4) with Σt = Σ and Ωt = ḠΣḠ′ + R + GCC′G′.
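One way to obtain the steady-state gain is simply to iterate on (5.4)-(5.6) until Σt converges. Here is a minimal sketch (in Python; the function name and the starting value for Σ are our choices, and for ill-conditioned problems a doubling algorithm or QZ-based solver would be preferable):

```python
import numpy as np

def kalman_steady_state(Ao, C, G, D, R, tol=1e-10, max_iter=10000):
    """Iterate the Riccati equation (5.6) to a fixed point; return the
    steady-state Kalman gain K, innovation covariance Omega, and Sigma
    for the quasi-differenced system (5.3)."""
    Gbar = G @ Ao - D @ G          # coefficient on x_t in (5.3)
    CC = C @ C.T
    Sigma = np.eye(Ao.shape[0])    # arbitrary positive definite start
    for _ in range(max_iter):
        Omega = Gbar @ Sigma @ Gbar.T + R + G @ CC @ G.T        # (5.5)
        K = (CC @ G.T + Ao @ Sigma @ Gbar.T) @ np.linalg.inv(Omega)  # (5.4)
        Sigma_new = (Ao @ Sigma @ Ao.T + CC
                     - K @ (Gbar @ Sigma @ Ao.T + G @ CC))      # (5.6)
        if np.max(np.abs(Sigma_new - Sigma)) < tol:
            return K, Omega, Sigma_new
        Sigma = Sigma_new
    raise RuntimeError("Riccati iteration did not converge")
```

At the fixed point, the returned Σ satisfies the algebraic matrix Riccati equation, which is exactly the steady-state condition described in the text.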
The innovations representation is equivalent to a Wold representation or vector
autoregression. Estimates of these representations are recovered in empirical work using
the vector autoregressive techniques promoted by Sims (1980) and Doan, Litterman, and
Sims (1984). It is convenient to have a quick way of deducing the vector autoregression
implied by a particular theoretical structure. To get a Wold representation for zt, substitute
Eq. (5.2) into Eq. (5.7) to obtain

x̂t+1 = Aox̂t + Kut
zt+1 − Dzt = Ḡx̂t + ut. (5.8)

A Wold representation for zt is

zt+1 = [I − DL]^{−1} [I + Ḡ(I − AoL)^{−1}KL] ut, (5.9)

where again L is the lag operator. From Eq. (5.8) a recursive whitening filter for obtaining ut from zt is given by

ut = zt+1 − Dzt − Ḡx̂t
x̂t+1 = Aox̂t + Kut. (5.10)
5.1. Vector autoregressive representation
Hansen and Sargent (1994) show that an autoregressive representation for zt is

zt+1 = [D + (I − DL) Ḡ [I − (Ao − KḠ)L]^{−1} K] zt + ut, (5.1.1)

or

zt+1 = [D + ḠK] zt + Σ_{j=1}^{∞} [Ḡ(Ao − KḠ)^j K − DḠ(Ao − KḠ)^{j−1} K] zt−j + ut. (5.1.2)
This equation expresses zt+1 as the sum of the one-step-ahead linear least squares forecast
and the one-step prediction error.
5.2. The Likelihood Function
We start with a “raw” time series yt that determines an adjusted series zt according to
zt = f (yt,Θ) ,
where Θ is the vector containing the free parameters of the model, including parameters
determining particular detrending procedures. For example, if our raw series has a geometric growth trend equal to µ^t which is to be removed before estimation, then the adjusted series is zt = yt/µ^t. We assume that the state-space model of the form (5.3) and the associated innovations representation (5.7) pertain to the adjusted data zt. We can
use the innovations representation (5.7) recursively to compute the innovation series, then
calculate the log-likelihood function
L(Θ) = Σ_{t=0}^{T−1} { log |Ωt| + trace(Ωt^{−1} utu′t) − log |∂f(yt, Θ)/∂yt| } (5.2.1)

and find estimates, Θ̂ = argminΘ L(Θ), where Ωt = Eutu′t is the covariance matrix of the
innovations. To find the minimizer Θ, we can use a standard optimization program. In
practice, it is best if we can calculate both the log-likelihood function and its derivatives
analytically. First, the computational burden is much lower with analytical derivatives.
Consider, for example, the model of McGrattan, Rogerson, and Wright (1993), which has 84 elements in Θ. For each step of a quasi-Newton optimization routine, L and ∂L/∂θ are computed. To obtain ∂L/∂θ numerically for the McGrattan, Rogerson, and Wright (1993) example, the log-likelihood function must be evaluated 168 times if central differences are used in computing an approximation for ∂L/∂θ, e.g.,

∂L/∂θ ≈ [L(Θ + εe) − L(Θ − εe)] / (2ε), (5.2.2)

where e is a vector of zeros except for a 1 in the element corresponding to θ and ε is some positive number. Usually, the costs of computing L a large number of times far outweigh the costs of computing ∂L/∂θ once. If L and ∂L/∂θ are to be computed many times, which is typically the case, then the costs of computing numerical derivatives can be quite large.
A second advantage to analytical derivatives is numerical accuracy. If the log-likelihood
function is not very smooth for the entire parameter space, there may be problems with the
accuracy of approximations such as Eq. (5.2.2). With inaccurate derivatives, it is difficult
to determine the curvature of the function and, hence, to find a minimum.
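The cost of formula (5.2.2) is easy to see in code: a central-difference gradient takes two likelihood evaluations per parameter, so 84 parameters require 168 evaluations per optimization step. A generic sketch (in Python; the step size eps is a choice, not a prescription):

```python
import numpy as np

def central_diff_grad(f, theta, eps=1e-6):
    """Approximate the gradient of f at theta by central differences,
    as in equation (5.2.2): two evaluations of f per parameter."""
    theta = np.asarray(theta, dtype=float)
    grad = np.empty_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps           # perturb one element, as the vector e in (5.2.2)
        grad[i] = (f(theta + e) - f(theta - e)) / (2 * eps)
    return grad
```

Note also the accuracy issue discussed in the text: the result depends on eps, and for a likelihood that is not smooth everywhere no single eps may work well across the parameter space.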
For L(Θ) in Eq. (5.2.1), the derivatives ∂L(Θ)/∂θ are easy to derive. We derive them
in Anderson et al. (1996) and distinguish formulas that are steps in the derivation from
those that would be put into a computer code.
Once we have the log-likelihood function and its derivatives, we can apply standard
optimization methods to the problem of finding the maximum likelihood estimates. In
practice, we will have a constrained optimization problem since the equilibrium is not
typically computable for all possible parameterizations. For example, we may have simple
constraints such as ℓ < Θ < u, where ℓ and u are the lower and upper bounds for the
parameter vector. In this case, we use either a constrained optimization package or penalty
functions (see Fletcher 1987).
After computing the maximum likelihood estimates, we need to compute their standard errors,

Se(Θ) = diag( √( [ Σ_t (∂Lt/∂Θ)(∂Lt/∂Θ)′ ]^{−1} ) ), (5.2.3)

where Lt(Θ) is the logarithm of the density function of the date t innovation, i.e.,

Lt(Θ) = log |Ωt| + u′tΩt^{−1}ut − log |∂f(yt, Θ)/∂yt|. (5.2.4)

See Anderson et al. (1996) for the formula for ∂Lt/∂θ.
Chapter 6.
A Prototype Real Business Cycle Model
We consider two versions of the model. The first has technology parameters that are
autoregressive of order one and the second has technology parameters that are unit root
processes.
6.1. A Version of the Model with AR(1) Technology
6.1.1. Maximization problems
Consider an economy with households, firms, and the government. The representative
household chooses consumption, investment, and labor to solve the following maximization
problem:
max_{ct, xt, lt} E Σ_{t=0}^{∞} β^t U(ct, 1 − lt) Nt

subject to

(1 + τct)ct + (1 + τxt)xt = (1 − τkt)rtkt + (1 − τlt)wtlt + τktδkt + trt
Nt+1kt+1 = [(1 − δ)kt + xt]Nt
ct, xt ≥ 0 in all states,

taking processes for the rental rate rt, the wage rate wt, the tax rates τct, τxt, τkt, τlt, and transfers trt as given. The representative firm solves a simple static problem at t:

max_{Kt, Lt} F(Kt, ZtLt) − rtKt − wtLt.

The government sets rates of taxes and transfers in such a way that its budget constraint at t, namely,

Gt + Nt trt = τkt(rt − δ)Ntkt + τltwtltNt + τctNtct + τxtNtxt,
is satisfied. In equilibrium, the following conditions must hold:
Nt (ct + xt) +Gt = F (Kt, ZtLt) (6.1.1)
Ntkt = Kt
Ntlt = Lt.
6.1.2. First-order conditions
Next, consider the first-order conditions in this economy. The Lagrangian for the household
optimization problem is given by
L = E Σ_t β^t Nt { U(ct, 1 − lt)
+ µt [ (1 − τkt)rtkt + (1 − τlt)wtlt + τktδkt + trt − (1 + τct)ct − (1 + τxt)xt ]
+ λt [ (1 − δ)kt + xt − (1 + gn)kt+1 ] }.

Here, the nonnegativity constraints on consumption and investment have been ignored. These constraints will not bind for postwar-size business cycles. They do bind for large shocks such as occurred during the Great Depression or World War II. When analyzing those periods, we need to include a penalty function to enforce the nonnegativity constraints. (See Chari, Kehoe, and McGrattan's Staff Report 328 or McGrattan and Ohanian's Staff Report 315.)

The relevant first-order conditions are found by taking derivatives of L with respect to ct, lt, xt, and kt+1:

0 = U1(ct, 1 − lt) − µt(1 + τct)
0 = −U2(ct, 1 − lt) + µt(1 − τlt)wt
0 = −µt(1 + τxt) + λt
0 = −(1 + gn)λt + β(1 + gn)Et { µt+1 [(1 − τkt+1)rt+1 + δτkt+1] + λt+1(1 − δ) }.
Eliminating multipliers yields:
U2(ct, 1 − lt) / U1(ct, 1 − lt) = [(1 − τlt)/(1 + τct)] wt (6.1.2)

[(1 + τxt)/(1 + τct)] U1(ct, 1 − lt) = βEt { [U1(ct+1, 1 − lt+1)/(1 + τct+1)]
× [ (1 − τkt+1)rt+1 + δτkt+1 + (1 − δ)(1 + τxt+1) ] }. (6.1.3)
In addition, there are first-order conditions for the firm’s static problem. These are
rt = F1 (Kt, ZtLt) (6.1.4)
wt = F2 (Kt, ZtLt)Zt. (6.1.5)
Finally, we have a resource constraint given by (6.1.1).
From here on, we make the following functional form assumptions and auxiliary
choices:
F(k, l) = k^θ l^{1−θ} (6.1.6)
U(c, 1 − l) = (c(1 − l)^ψ)^{1−σ} / (1 − σ) (6.1.7)
τkt = τct = 0
st = [log zt, τlt, τxt, log gt]′
st+1 = P0 + P st + Q εs,t+1,  εs ∼ N(0_{4×1}, I_{4×4}). (6.1.8)

We have turned off τc since it plays a similar role to τl in distorting the labor-leisure choice. Similarly, we have turned off τk since it plays a similar role to τx in distorting the intertemporal margin.
If we substitute the choices (6.1.6)-(6.1.7) into (6.1.1) and (6.1.2)-(6.1.5), then substitute the equilibrium rates rt and wt into (6.1.2) and (6.1.3), we have:

Nt(ct + gt) + Nt+1kt+1 − (1 − δ)Ntkt = (Ntkt)^θ (ZtNtlt)^{1−θ} (6.1.9)

ψct/(1 − lt) = (1 − τlt)(1 − θ)(Ntkt)^θ Zt^{1−θ} (Ntlt)^{−θ} (6.1.10)

(1 + τxt) ct^{−σ}(1 − lt)^{ψ(1−σ)} = βEt [ ct+1^{−σ}(1 − lt+1)^{ψ(1−σ)}
× { (1 − τkt+1)θ(Nt+1kt+1)^{θ−1}(Zt+1Nt+1lt+1)^{1−θ} + δτkt+1 + (1 − δ)(1 + τxt+1) } ]. (6.1.11)
6.1.3. Log-linear computation
The next big step is to approximate the decision function for capital. Given an approximate function for kt+1, we can use the static equations (6.1.12) and (6.1.13) to determine the decisions ct and lt.

Log-linearizations are done for a stationary version of the equations (6.1.9)-(6.1.11). Thus, before proceeding, we need to normalize variables. Dividing all variables that grow by (1 + gz)^t gives us:
ct + gt + (1 + gz)(1 + gn)kt+1 − (1 − δ)kt = yt = kt^θ (ztlt)^{1−θ} (6.1.12)

ψct/(1 − lt) = (1 − τlt)(1 − θ) kt^θ lt^{−θ} zt^{1−θ} (6.1.13)

(1 + τxt) ct^{−σ}(1 − lt)^{ψ(1−σ)} = β̂Et ct+1^{−σ}(1 − lt+1)^{ψ(1−σ)}
× [ θkt+1^{θ−1}(zt+1lt+1)^{1−θ} + (1 − δ)(1 + τxt+1) ], (6.1.14)

where β̂ = β(1 + gz)^{−σ}.
To do the log-linear approximation, we will also need the steady-state values of the variables in (6.1.12)-(6.1.14) (assuming constant values for z, the taxes, and government spending):

k/l = [ (1 + τx)(1 − β̂(1 − δ)) / (β̂θz^{1−θ}) ]^{1/(θ−1)}

c = [ (k/l)^{θ−1} z^{1−θ} − (1 + gz)(1 + gn) + 1 − δ ] k − g = ξ1 k − g

c = [ (1 − τl)(1 − θ)(k/l)^θ z^{1−θ}/ψ ] (1 − (1/(k/l)) k) = ξ2 − ξ3 k,

where the last two equations imply k = (ξ2 + g)/(ξ1 + ξ3), c = ξ1 k − g, l = (1/(k/l)) k.
Assume that the solution for the capital decision takes the form:

log kt+1 = γk log kt + γ [log zt, τlt, τxt, log gt]′ + constant, (6.1.15)

where γk is a scalar and γ is 1 × 4 and equal to [γz, γl, γx, γg]. Assume the residual from the dynamic first-order condition (6.1.14) can be written (after substitutions from (6.1.12) and (6.1.13)):

f(Et log kt+2, log kt+1, log kt, log zt+1, log zt, τlt+1, τlt, τxt+1, τxt, log gt+1, log gt)
≈ a0 Et log kt+2 + a1 log kt+1 + a2 log kt + b0 Et st+1 + b1 st.

Then the general solution algorithm is to find γk that solves the quadratic equation

a0 γk² + a1 γk + a2 = 0,

and γ that solves the linear equations:

a0 γk γ + a0 γP + a1 γ + b0 P + b1 = 0_{1×4}.

Note that this implies:

γ′ = − [(a0 γk + a1) I_{4×4} + a0 P′]^{−1} (b0 P + b1)′.

Once we have values for the coefficients γk and γ, we can use (6.1.12) and (6.1.13) to back out ct and lt (either nonlinearly or by way of a log-linear approximation).
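The two-step algorithm above (stable root of the quadratic, then a linear solve for γ) can be sketched as follows (in Python; the function name is ours, and b0, b1 are taken to be 1 × 4 row vectors as in the text):

```python
import numpy as np

def solve_decision_rule(a0, a1, a2, b0, b1, P):
    """Coefficients of the decision rule (6.1.15): gamma_k is the root of
    a0 x^2 + a1 x + a2 = 0 inside the unit circle, and gamma solves
    a0 gamma_k gamma + a0 gamma P + a1 gamma + b0 P + b1 = 0."""
    roots = np.roots([a0, a1, a2])
    stable = roots[np.abs(roots) < 1]        # pick the stationary root
    gamma_k = float(stable[0].real)
    n = P.shape[0]
    rhs = -(b0 @ P + b1).T                   # transpose of the linear system
    gamma = np.linalg.solve((a0 * gamma_k + a1) * np.eye(n) + a0 * P.T, rhs)
    return gamma_k, gamma.ravel()
```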
6.2. A Version of the Model with Random Walk Technology
The only change relative to the model with AR(1) technology is: Zt = Zt−1zt, where zt is the innovation to technology. In this case, detrending is done slightly differently. We use vt = Vt/[NtZt] to denote the detrended, per-capita variable Vt, except in the case of capital. There, we use kt = Kt/[NtZt−1].
6.2.1. Maximization problems
The maximization problems are the same as above except that households in this version
assume Zt = Zt−1zt with the process for log zt assumed to be autoregressive.
6.2.2. First-order conditions
The first-order conditions are the same as above.
6.2.3. Log-linear computation
The main difference between the benchmark model and the version with random-walk technology is the step taken to normalize variables. In this version, the normalized variables are:

ct = Ct/[NtZt], xt = Xt/[NtZt], gt = Gt/[NtZt], yt = Yt/[NtZt], kt = Kt/[NtZt−1].
Using the functional forms for F and U in (6.1.6) and (6.1.7), respectively, the equilibrium
rental and wage rates are:

rt = θKt^{θ−1}(ZtLt)^{1−θ} = θkt^{θ−1}(ztlt)^{1−θ}
wt = (1 − θ)Kt^θ(ZtLt)^{−θ}Zt = (1 − θ)kt^θ(ztlt)^{−θ}Zt.

This implies the following first-order conditions:

ct + gt + (1 + gn)kt+1 − (1 − δ)zt^{−1}kt = yt = kt^θ lt^{1−θ} zt^{−θ} (6.2.1)

ψct/(1 − lt) = (1 − τlt)(1 − θ)kt^θ(ztlt)^{−θ} (6.2.2)

(1 + τxt) ct^{−σ}(1 − lt)^{ψ(1−σ)} = βEt zt+1^{−σ} ct+1^{−σ}(1 − lt+1)^{ψ(1−σ)}
× [ θkt+1^{θ−1}(zt+1lt+1)^{1−θ} + (1 − δ)(1 + τxt+1) ]. (6.2.3)
Next, we compute the steady state of the system for constant values for z, the taxes,
and government spending:

k/l = [ (1 + τx)(1 − βz^{−σ}(1 − δ)) / (βz^{−σ}θz^{1−θ}) ]^{1/(θ−1)}

c = [ (k/l)^{θ−1} z^{−θ} − (1 + gn) + (1 − δ)z^{−1} ] k − g = ξ1 k − g

c = [ (1 − τl)(1 − θ)(k/l)^θ z^{−θ}/ψ ] (1 − (1/(k/l)) k) = ξ2 − ξ3 k,

where the last two equations imply k = (ξ2 + g)/(ξ1 + ξ3), c = ξ1 k − g, l = (1/(k/l)) k.
The form of the solution and the procedure for computing it are the same as in the benchmark case.
6.3. MLE Estimation
The next step is to describe a standard method we can use to estimate the processes
governing the four exogenous variables in st with the data described above.
6.3.1. State-space form in the general case
Assume that X is a vector of state variables from the model and Y are observables. The
state-space form then is

Xt+1 = AXt + Bεt+1
Yt = CXt + ωt
ωt = Dωt−1 + ηt,

where D is a matrix of parameters governing the serial correlation of the measurement error. Assume that Eηtη′t = R and Eεtη′s = 0 for all periods t and s. Define Ȳt ≡ Yt+1 − DYt. Then the system can be rewritten as follows:

Xt+1 = AXt + Bεt+1
Ȳt = C̄Xt + CBεt+1 + ηt+1,

where C̄ = CA − DC.
6.3.2. Log-likelihood function
The log-likelihood function is

L(Θ) = Σ_{t=0}^{T−1} { log |Ωt| + trace(Ωt^{−1} utu′t) − log |∂f(Zt, Θ)/∂Zt| }, (6.3.1)

where the parameters to be estimated are stacked in the vector Θ, the innovation vector is ut, and its covariance is Ωt. The last term in (6.3.1) is nonzero if the Y are not the raw series but depend on the raw series Z plus the parameter vector. For example, if we estimate gz and use per-capita values as our raw data, then Z is per-capita data and Y is detrended, per-capita data.
The innovation vector ut and its covariance Ωt are defined as follows:

ut = Ȳt − E[Ȳt | Ȳt−1, Ȳt−2, . . . , Ȳ0, X̂0]
= Yt+1 − E[Yt+1 | Yt, Yt−1, . . . , Y0, X̂0]
= Yt+1 − DYt − C̄X̂t

Ωt = Eutu′t = C̄ΣtC̄′ + R + CBB′C′,

which in turn depend on the predicted state X̂t:

X̂t = E[Xt | Yt, Yt−1, . . . , Y0, X̂0].

The predicted state evolves according to

X̂t+1 = AX̂t + Ktut,

where Kt is the Kalman gain,

Kt = (BB′C′ + AΣtC̄′) Ωt^{−1}
Σt+1 = AΣtA′ + BB′ − (BB′C′ + AΣtC̄′) Ωt^{−1} (C̄ΣtA′ + CBB′),

with state covariance Σt.
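Putting the recursions above together with the criterion (6.3.1) gives a routine for evaluating the likelihood. A sketch (in Python; the function name is ours, the Jacobian term of (6.3.1) is omitted, and the initial conditions are the user's choices):

```python
import numpy as np

def neg_log_likelihood(Y, A, B, C, D, R, X0, Sigma0):
    """Quasi-difference the data, run the Kalman filter of Section 6.3.2,
    and return sum_t [ log|Omega_t| + u_t' Omega_t^{-1} u_t ]."""
    Ybar = Y[1:] - Y[:-1] @ D.T          # Ybar_t = Y_{t+1} - D Y_t
    Cbar = C @ A - D @ C
    BB = B @ B.T
    X, Sigma, total = X0.copy(), Sigma0.copy(), 0.0
    for yb in Ybar:
        u = yb - Cbar @ X                # innovation
        Omega = Cbar @ Sigma @ Cbar.T + R + C @ BB @ C.T
        K = (BB @ C.T + A @ Sigma @ Cbar.T) @ np.linalg.inv(Omega)
        total += np.log(np.linalg.det(Omega)) + u @ np.linalg.inv(Omega) @ u
        X = A @ X + K @ u                # update predicted state
        Sigma = A @ Sigma @ A.T + BB - K @ (Cbar @ Sigma @ A.T + C @ BB)
    return total
```

This value is what a numerical optimizer would minimize over Θ, with (A, B, C, D, R) rebuilt from Θ at each trial point.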
6.3.3. MLE in the Benchmark Case
In the benchmark case, we have Xt = [log kt, log zt, τlt, τxt, log gt, 1]′, Yt = [log yt, log xt, log lt, log gt]′, and

A = [ γk      γz  γl  γx  γg  γ0
      0_{4×1}     P           P0
      0       0_{1×4}         1  ]

B = [ 0_{1×4}
      Q
      0_{1×4} ]

C = [ φyk  φyz  φyl  0  φyg  φy0
      φxk   0    0   0   0   φx0
      φlk  φlz  φll  0  φlg  φl0
       0    0    0   0   1    0  ]
  + [ φyk′
      φxk′
      φlk′
       0   ] [ γk  γz  γl  γx  γg  0 ]. (6.3.2)
The coefficients φ are derived by log-linearizing (6.1.13) after substituting in for consumption from (6.1.12):

0 ≈ ψ { k^θ(zl)^{1−θ} [ θ log kt + (1 − θ)(log zt + log lt) ]
− (1 + gz)(1 + gn) k log kt+1 + (1 − δ) k log kt − g log gt }
+ (1 − θ)(1 − τl) k^θ l^{−θ} z^{1−θ}(1 − l) { [1/(1 − τl)] τlt
− θ log kt + θ log lt − (1 − θ) log zt + [l/(1 − l)] log lt },

which we write succinctly as

log lt = φlk log kt + φlz log zt + φll τlt + φlg log gt + φlk′ log kt+1. (6.3.3)
Using this equation for log l, we use the production relation and the capital accumulation
equation to write log y and log x as follows:
log yt = (θ + (1 − θ)φlk) log kt + (1 − θ)(1 + φlz) log zt
+ (1 − θ)[ φll τlt + φlg log gt + φlk′ log kt+1 ]
≡ φyk log kt + φyz log zt + φyl τlt + φyg log gt + φyk′ log kt+1 (6.3.4)

log xt = (1 + gz)(1 + gn)(k/x) log kt+1 − (1 − δ)(k/x) log kt
≡ φxk log kt + φxk′ log kt+1. (6.3.5)
We fixed parameters of preferences, production, and growth and estimated the processes for the shocks. The parameters that were fixed were: ψ = 2.24, σ = 1, β = .9722, θ = .35, δ = .0464, gn = 1.5%, and gz = 1.6%. We also set D = 0_{4×4} and R = .0001 × I_{4×4}. The parameters that were estimated were the elements of P0, P, and Q.
6.3.4. MLE in the Random Walk Case
In the case of random-walk technology, the settings are slightly different. In this case, we have Xst = [log kt, log zt, τlt, τxt, log gt, 1]′, Xt = [X′st, X′st−1]′, and Yt = [log yt − log yt−1, log xt − log xt−1, log lt, log gt − log gt−1]′, where the growth rates are those of the raw (undetrended) series. We can write the growth rates in Yt as elements of Xt as follows: since raw per-capita output is ytZt in terms of the detrended series yt,

log(ytZt) − log(yt−1Zt−1) = log yt − log yt−1 + log zt
= φyk (log kt − log kt−1) + (1 + φyz) log zt − φyz log zt−1
+ φyl (τlt − τlt−1) + φyg (log gt − log gt−1) + φyk′ (log kt+1 − log kt).
Similarly, the growth rates for xt and gt can be written in terms of the elements of Xt.

To obtain the φ coefficients, we log-linearize (6.2.2) after substituting in for consumption from (6.2.1):

0 ≈ ψ { k^θ l^{1−θ} z^{−θ} [ θ(log kt − log zt) + (1 − θ) log lt ]
− (1 + gn) k log kt+1 + (1 − δ) z^{−1} k (log kt − log zt) − g log gt }
+ (1 − θ)(1 − τl) k^θ (zl)^{−θ}(1 − l) { [1/(1 − τl)] τlt
− θ log kt + θ(log lt + log zt) + [l/(1 − l)] log lt },

which again can be written succinctly as in (6.3.3). Using the equation for log l, the production relation and the capital accumulation equation can be used to write log y and log x as follows:
log yt = (θ + (1 − θ)φlk) log kt + ((1 − θ)φlz − θ) log zt
+ (1 − θ)[ φll τlt + φlg log gt + φlk′ log kt+1 ]
≡ φyk log kt + φyz log zt + φyl τlt + φyg log gt + φyk′ log kt+1 (6.3.6)

log xt = (1 + gn)(k/x) log kt+1 − (1 − δ) z^{−1}(k/x)(log kt − log zt)
≡ φxk log kt + φxz log zt + φxk′ log kt+1. (6.3.7)
The matrices in the state-space form are

A = [ As  0
      I   0 ]

B = [ Bs
      0  ]

where

As = [ γk      γz  γl  γx  γg  γ0
       0_{4×1}     P           P0
       0       0_{1×4}         1  ]

Bs = [ 0_{1×4}
       Q
       0_{1×4} ]

and

C = [ φyk − φyk′  1 + φyz  φyl  0  φyg  φy0  −φyk  −φyz  −φyl  0  −φyg  −φy0
      φxk − φxk′  1 + φxz   0   0   0   φx0  −φxk  −φxz    0   0    0   −φx0
      φlk         φlz      φll  0  φlg  φl0    0     0     0   0    0     0
       0           1        0   0   1    0     0     0     0   0   −1     0  ]
  + [ φyk′
      φxk′
      φlk′
       0   ] [ γk  γz  γl  γx  γg  γ0  0  0  0  0  0  0 ]. (6.3.8)
6.4. Simulating Data from the Models
We first draw 1000 sequences εs,t. Given MLE estimates for P0, P, Q, and initial conditions for s, we can use (6.1.8) to derive sequences for technology, tax rates, and spending. Given an initial condition for the capital stock k0, we can use (6.1.15) to derive the time path for the sequence kt. With technology, tax rates, spending, and capital, we have the entire state vector Xt period by period. We then use Yt = CXt (since we have assumed negligible measurement error) for our observable vector, where C is (6.3.2) in the benchmark case and (6.3.8) in the random-walk case.
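The simulation loop just described can be sketched as follows (in Python; the function name, the initial conditions, and the horizon T are the user's choices):

```python
import numpy as np

def simulate(P0, P, Q, gamma_k, gamma, gamma0, C, logk0, s0, T, rng):
    """One simulated path: shocks from (6.1.8), capital from (6.1.15),
    observables Y_t = C X_t with X_t = [log k_t, s_t', 1]'."""
    s, logk, Y = s0.copy(), logk0, []
    for _ in range(T):
        X = np.concatenate(([logk], s, [1.0]))
        Y.append(C @ X)
        logk = gamma_k * logk + gamma @ s + gamma0     # (6.1.15)
        s = P0 + P @ s + Q @ rng.standard_normal(4)    # (6.1.8)
    return np.array(Y)
```

Drawing 1000 paths then amounts to calling this routine 1000 times with fresh shock draws.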
Chapter 7.
A Prototype Sticky Price Model
7.1. Model Economy
Since we will use the first order conditions over and over again in these notes, we start
with a statement of the optimization problems solved by all of the agents in the economy
and the associated first order conditions.
The problem solved by the final goods producers each period is
max_{Y(i,st)} P(st) − ∫ P(i, st−1) Y(i, st)/Y(st) di (7.1.1)

subject to

∫ g( Y(i, st)/Y(st) ) di = 1. (7.1.2)
The first-order conditions for this problem are

P(i, st−1) = λ(st) g′( Y(i, st)/Y(st) ),

where λ is the Lagrange multiplier on the constraint (7.1.2). The zero-profit condition,

P(st) = ∫ P(i, st−1) Y(i, st)/Y(st) di, (7.1.3)

and the first-order condition for P(i) imply the following for the relative price:

P(i)/P = g′(Y(i)/Y) / ∫ g′(Y(j)/Y) [Y(j)/Y] dj.

Inverting this equation gives the input demand functions

Y(i, st) = D( [P(i, st−1)/P(st)] ∫ g′(Y(j, st)/Y(st)) [Y(j, st)/Y(st)] dj ) Y(st), (7.1.4)

where D ≡ (g′)^{−1}.
If we assume that g(y) = y^θ, which is the case in most of what follows, then we have:

Y(i, st) = [ P(st)/P(i, st−1) ]^{1/(1−θ)} Y(st) (7.1.5)

P(st) = [ ∫ P(i, st−1)^{θ/(θ−1)} di ]^{(θ−1)/θ}. (7.1.6)
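The demand curve (7.1.5) and price index (7.1.6) are easy to check numerically. A sketch (in Python; a finite, equally weighted set of goods stands in for the integral, and the function names are ours):

```python
import numpy as np

def price_index(prices, theta):
    """CES price index (7.1.6) for g(y) = y^theta, with the integral
    replaced by an equally weighted average over a finite set of goods."""
    p = np.asarray(prices)
    return np.mean(p ** (theta / (theta - 1))) ** ((theta - 1) / theta)

def demand(p_i, P, Y, theta):
    """Input demand (7.1.5): Y_i = (P / p_i)^(1/(1-theta)) Y."""
    return (P / p_i) ** (1 / (1 - theta)) * Y
```

With these two functions, one can verify that the zero-profit condition (7.1.3) holds: the average of P(i)Y(i)/Y across goods equals the price index.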
The problem solved by consumers is
max Σ_{t=0}^{∞} Σ_{st} β^t π(st) U( C(st), L(st), M(st)/P(st) ), (7.1.7)

subject to the sequence of budget constraints

P(st)C(st) + M(st) + Σ_{st+1} Q(st+1|st) B(st+1)
≤ P(st)W(st)L(st) + M(st−1) + B(st) + Π(st) + T(st),  t = 0, 1, . . . , (7.1.8)
and borrowing constraints B(st+1) ≥ B for some large negative number B.
The first-order conditions for the consumer are therefore given by the following equations:

−Ul(st)/Uc(st) = W(st) (7.1.9)

Uc(st)/P(st) = β Σ_{st+1} π(st+1|st) Uc(st+1)/P(st+1) + Um(st)/P(st) (7.1.10)

Q(sτ|st) = β^{τ−t} π(sτ|st) [Uc(sτ)/Uc(st)] [P(st)/P(sτ)]  for all τ > t, (7.1.11)

where U(st) is shorthand for U(C(st), L(st), M(st)/P(st)).
The problem solved by the monopolist adjusting his price is to choose P(i, st−1), K(i, sτ), X(i, sτ), and L(i, sτ), τ = t, . . . , t + N − 1, to maximize

Σ_{τ=t}^{∞} Σ_{sτ} Q(sτ|st−1) [ P(i, st−1) Y(i, sτ) − P(sτ)W(sτ)L(i, sτ) − P(sτ)X(i, sτ) ] (7.1.12)

subject to the demand for good i in (7.1.4), the production technology

Y(i, st) = F( K(i, st−1), L(i, st) ), (7.1.13)
and the law of motion for capital used in producing i:

K(i, st) = (1 − δ)K(i, st−1) + X(i, st) − φ( X(i, st)/K(i, st−1) ) K(i, st−1). (7.1.14)
The first-order conditions for the case with F(K, L) = K^{α1}L^{α2} are given by

Σ_τ Σ_{sτ} Q(sτ|st−1) { Y(i, sτ) + Y(sτ) [ 1 − P(sτ)V(i, sτ)/P(i, st−1) ]
× D′( [P(i, st−1)/P(sτ)] ∫ g′(Y(j, sτ)/Y(sτ)) [Y(j, sτ)/Y(sτ)] dj ) g′( Y(i, sτ)/Y(sτ) ) } = 0 (7.1.15)

V(i, st) = W(st)/Fl(i, st) (7.1.16)

1/(1 − φ′(i, st)) = Σ_{st+1} [ Q(st+1|st−1)P(st+1) / (Q(st|st−1)P(st)) ] [ V(i, st+1)Fk(i, st+1)
+ (1/(1 − φ′(i, st+1))) { 1 − δ − φ(i, st+1) + φ′(i, st+1) X(i, st+1)/K(i, st) } ], (7.1.17)

where F(i, st) and φ(i, st) are shorthand for F(K(i, st−1), L(i, st)) and φ(X(i, st)/K(i, st−1)), respectively. The monopolists not setting prices will still maximize with respect to labor, investment, and capital. Therefore, there will be one pricing equation and N Euler equations for capital. The first-order conditions for those monopolists not setting prices depend on the prices that they last set.
Note that if the technology of the final goods producer is given by g(y) = y^θ, then the first-order condition in (7.1.15) can be written more simply as follows:

P(i) = (1/θ) [ Σ_τ Σ_{sτ} Q(sτ|st−1) Y(sτ) P(sτ)^{(2−θ)/(1−θ)} V(i, sτ) ]
/ [ Σ_τ Σ_{sτ} Q(sτ|st−1) Y(sτ) P(sτ)^{1/(1−θ)} ]. (7.1.18)

If adjustment costs are equal to zero, then the Euler equation for capital (7.1.17) can be written more simply as follows:

Uc(st) = β Σ_{st+1} π(st+1|st) Uc(st+1) [ V(i, st+1)Fk(i, st+1) + 1 − δ ].
Finally, the following equilibrium constraints must hold:

M(st) = µ(st)M(st−1) (7.1.19)
T(st) = M(st) − M(st−1) (7.1.20)
L(st) = ∫ L(i, st) di (7.1.21)
Y(st) = C(st) + ∫ X(i, st) di. (7.1.22)

To summarize, we have equations (7.1.3)-(7.1.4) from the final goods producers, equations (7.1.9)-(7.1.11) from the consumers, equations (7.1.13)-(7.1.17) from the intermediate goods producers, and equations (7.1.19)-(7.1.22) that must hold in equilibrium.
7.2. Computing an Equilibrium
In this section, we describe in some detail the numerical algorithm used to solve the full-
blown model of Section 2. The solution takes the form
Zt = AZt−1 +BSt (7.2.1)
where Zt = [z′t, . . . , z′t−N+2]
′ is a (N + 2)(N − 1) × 1 vector and
zt = [pt−1 −mt−1, k1,t, . . . , kN,t, yt]′
St = [µt, µt−1, . . . , µt−N+1]′.
The matrices A and B are chosen to satisfy the first order conditions which can be written
generally as follows
Et [a0Zt+N−1 + a1Zt+N−2 + . . . aNZt−1 + b0St+N−1 + b1St+N−2 + . . . bN−1St] = 0
(7.2.2)
where Et ≡ E[·|st−1] for the first residual (the pricing equation) and Et = E[·|st] for all
other residuals.
Writing the residuals as in ((7.2.2)) makes the notation simpler but actually implies
lots of duplication. For example, lagged prices appear multiple times. Our subroutine
uses a smaller set of variables when constructing residuals of the first order conditions.
In particular, there are two inputs: the vector of parameters appearing in the first order
conditions and the following vector of variables:
Z ≡ [zt+N−1, zt+N−2, . . . , zt, pt−2 −mt−2, . . . , pt−N −mt−N , k1,t−1, . . . , kN,t−1,
µt+N−1, . . . , µt−N+1]′.
We show later that all other variables can be constructed once we know those in Z.
Above we assumed that β ≈ 1 when deriving the linearized pricing equation. In writing the code, we will not make this assumption. If we linearize (7.1.15) we get

pt−1 = 1/[(N − 1) Σ_{i=0}^{N−1} β^i] Et−1 [ pt−N + (1 + β)pt−N+1 + . . . + (1 + β + . . . + β^{N−2})pt−2
+ (β + . . . + β^{N−1})pt + . . . + β^{N−1}pt+N−2
+ ϕN { vi,t + βvi,t+1 + . . . + β^{N−1}vi,t+N−1 } ], (7.2.3)

where the constant terms have been ignored. It turns out that it is most convenient to write the residuals by first normalizing the prices: we divide them by the money supply. If we do this, then the pricing equation in (7.2.3) is equivalent to

pt−1 − mt−1 = 1/[(N − 1) Σ_{i=0}^{N−1} β^i] Et−1 [ (pt−N − mt−N) + . . . + (1 + β + . . . + β^{N−2})(pt−2 − mt−2)
+ (β + . . . + β^{N−1})(pt − mt) + . . . + β^{N−1}(pt+N−2 − mt+N−2)
+ ϕN { vi,t + βvi,t+1 + . . . + β^{N−1}vi,t+N−1 }
+ [ (β + . . . + β^{N−1}) + (β² + . . . + β^{N−1}) + . . . + β^{N−1} ] µt
+ [ (β² + . . . + β^{N−1}) + . . . + β^{N−1} ] µt+1
+ . . . + β^{N−1}µt+N−2
− µt−N+1 − [1 + (1 + β)]µt−N+2 − [1 + (1 + β) + (1 + β + β²)]µt−N+3
− . . . − [1 + (1 + β) + . . . + (1 + β + . . . + β^{N−2})]µt−1 ].
We had to write the pricing equation as above because we do not have explicit functional forms for g(·) (and hence the demand function D(·) in (7.1.15)). The other residuals can either be linearized by hand or numerically. For ease of reading the code, we chose to linearize them numerically.

In addition to the pricing equation, we have the money demand equation and N Euler equations for capital:

Uc(st)/P(st) = β Σ_{st+1} π(st+1|st) Uc(st+1)/P(st+1) + Um(st)/P(st) (7.2.4)
Uc(st) = [1 − φ′(i, st)] β Σ_{st+1} π(st+1|st) Uc(st+1) [ V(i, st+1)Fk(i, st+1)
+ (1/(1 − φ′(i, st+1))) { 1 − δ − φ(i, st+1) + φ′(i, st+1) X(i, st+1)/K(i, st) } ]. (7.2.5)
In writing the residuals, we will use the following convention for naming the cohorts (which is different from that used above): we will assume that monopolists named i are those that set their prices i periods ago. For example, in t, group 1 charges pt−1, group 2 charges pt−2, and so on. Note that the particular assignment is not important. To evaluate the pricing equation we need the unit costs of group 1 for t, t + 1, . . ., t + N − 1. For these costs, we will use the notation v1,t, v2,t+1, . . ., vN,t+N−1, where

v1,t = wt + (1 − α2)l1,t − α1kN,t−1 − log(α2)
v2,t+1 = wt+1 + (1 − α2)l2,t+1 − α1k1,t − log(α2)
...
vN,t+N−1 = wt+N−1 + (1 − α2)lN,t+N−1 − α1kN−1,t+N−2 − log(α2)
if F(K, L) = K^{α1}L^{α2}. The capital stocks are included in Z. The labor inputs are given by

l1,t = (ε/α2)(pt − pt−1) − (α1/α2)kN,t−1 + (1/α2)yt
l2,t+1 = (ε/α2)(pt+1 − pt−1) − (α1/α2)k1,t + (1/α2)yt+1
...
lN,t+N−1 = (ε/α2)(pt+N−1 − pt−1) − (α1/α2)kN−1,t+N−2 + (1/α2)yt+N−1,

which follows from yi,t − yt = −ε(pi,t−1 − pt) and exp(yi,t) = F(exp(ki−1,t−1), exp(li,t)).
Aggregate output is given in Z. For the relative prices, we write

pt − pt−1 = (1/N)[(pt−1 − mt−1) + . . . + (pt−N − mt−N)] − (pt−1 − mt−1)
− [(N − 1)/N]µt−1 − [(N − 2)/N]µt−2 − . . . − (1/N)µt−N+1 (7.2.6)

pt+1 − pt−1 = (1/N)[(pt − mt) + . . . + (pt−N+1 − mt−N+1)] − (pt−1 − mt−1)
+ (1/N)µt − [(N − 2)/N]µt−1 − . . . − (1/N)µt−N+2

pt+2 − pt−1 = (1/N)[(pt+1 − mt+1) + . . . + (pt−N+2 − mt−N+2)] − (pt−1 − mt−1)
+ (1/N)µt+1 + (2/N)µt − [(N − 3)/N]µt−1 − . . . − (1/N)µt−N+3

...

pt+N−1 − pt−1 = (1/N)[(pt+N−1 − mt+N−1) + . . . + (pt−1 − mt−1)] − (pt−1 − mt−1)
+ (1/N)µt+N−2 + (2/N)µt+N−3 + . . . + [(N − 1)/N]µt, (7.2.7)

which depend on terms in Z.
The wage rate appears in the equation for unit costs. To construct wage rates we need C(st), L(st), and M(st)/P(st). For aggregate consumption, we need aggregate output and the individual investments:

C(st) = Y(st) − (1/N) Σ_i X(i, st)

X(i, st) = (1/b) ( 1 + bδ − √( 1 + 2bδ − 2b(K(i, st)/K(i − 1, st−1) − 1 + δ) ) ) K(i − 1, st−1),

where the capital stocks and output are in Z. When linearized, these equations look like

ct = ( Y yt − X Σ_i xi,t/N ) / C
ki,t = (1 − δ)ki−1,t−1 + δxi,t,

where constant terms have been ignored. Notice that the monopolists with capital stocks K(i − 1, st−1) in t − 1 are the same monopolists with capital K(i, st) using our new naming convention. Monopolists named N this period are named 1 next period since they are the next to change prices.
next to change prices. Aggregate labor is given by
L(
st)
=1
N
∑
i
L(
i, st)
or, in logs, by
lt =1
N
∑
i
li,t.
Finally, logged real balances are given by
mt − pt = mt −1
N(pt−1 + . . .+ pt−N )
63
=1
N(mt−1 + µt) + (mt−2 + µt + µt−1) + . . .+ (mt−N + µt + µt−1 + . . . µt−N+1)
− 1
N(pt−1 + . . .+ pt−N )
= − 1
N
N∑
i=1
(pt−i −mt−i) +1
N(Nµt + (N − 1)µt−1 + . . . µt−N+1) .
For the pricing equation, we need real balances in t, t + 1, . . . , t + N − 1, so we need to know the sequences pt−N − mt−N, . . . , pt+N−2 − mt+N−2 and µt−N+1, . . . , µt+N−1. These are in Z. The formulas for the relative prices pτ − pτ−i can be found in (7.2.6)-(7.2.7). All of the variables appearing in the money demand equation (7.2.4) have at this point been constructed.
There are two steps to solving the system of equations in (7.2.1). We start with
the first step: computing A. We use standard methods to solve for the deterministic solution
Z_t = A Z_{t−1}.
Define X_t to be the following vector of state variables:

    X_t = [p_{t−2} − m_{t−2}, …, p_{t−N} − m_{t−N}, k_{1,t−1}, …, k_{N,t−1}]′.        (7.2.8)

Using the definition of X in (7.2.8), the residuals (dropping terms with μ) can be written
as

    A_1 [X_{t+1}; Z_{t+N−1}] + A_2 [X_t; Z_{t+N−2}] + (shock terms) = 0
where elements of A1 and A2 are either coefficients of linearized residuals or 1’s and 0’s
used to associate variables with their lagged values. To compute A, we find generalized
eigenvalues Λ (and associated eigenvectors D) such that A2D = −A1DΛ. For a unique
stationary equilibrium, we need 2N−1 roots inside the unit circle. Note that X has length
2N−1. If we sort the eigenvalues and eigenvectors so that the roots inside one are ordered
first, then we have

    X_{t+1} = D_{11} Λ_1 D_{11}^{−1} X_t

    Z_{t+N−2} = D_{21} D_{11}^{−1} X_t

where D_{11} is the upper left partition of D and is (2N−1) × (2N−1), D_{21} is the lower left
partition of D and has dimension (N+2)(N−2) × (2N−1), and Λ_1 is the upper left
partition of the matrix of eigenvalues. Recall that Z_{t+N−2} = [z′_{t+N−2}, …, z′_t]′. Recall also
that all of the elements in X_t are also in Z_{t−1}. Therefore, we can use the solutions above
to fill in the elements of A.
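The eigenvalue step can be sketched in a few lines. The code below is an illustrative implementation, not the notes' own; the helper name `decouple` and the small diagonal test system are our inventions. It solves the generalized eigenproblem A_2 d = −λ A_1 d, orders stable roots first, and forms the two laws of motion.

```python
import numpy as np
from scipy.linalg import eig

def decouple(A1, A2, n_states):
    """Sketch of the eigenvalue decoupling step: solve A2 d = -lam A1 d,
    order roots inside the unit circle first, and return M, C such that
    X_{t+1} = M X_t and Z_{t+N-2} = C X_t. (Names M, C are ours.)"""
    lam, D = eig(A2, -A1)                 # generalized eigenvalues/vectors
    order = np.argsort(np.abs(lam))       # stable roots ordered first
    lam, D = lam[order], D[:, order]
    assert (np.abs(lam[:n_states]) < 1).all(), "too few stable roots"
    D11 = D[:n_states, :n_states]         # upper left partition of D
    D21 = D[n_states:, :n_states]         # lower left partition of D
    L1 = np.diag(lam[:n_states])
    M = (D11 @ L1 @ np.linalg.inv(D11)).real
    C = (D21 @ np.linalg.inv(D11)).real
    return M, C
```

In practice a QZ (generalized Schur) decomposition is more robust than an explicit eigenvector inversion when D_{11} is nearly singular, but the logic is the same.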
Given A, solving for B involves solving a linear system of equations. The law of
motion for the shocks is given by

    S_t = P S_{t−1} + ε_t        (7.2.9)

where the (1,1) element of the transition matrix P is ρ, the remaining elements of P are 1's or
0's, and the first element of ε_t is nonzero while all other elements are zero. We plug this law
of motion and the law of motion for Z_t into (7.2.2), using recursion to write the equation in
terms of Z_{t−1} and S_t. The coefficients on these variables are both set equal to zero. Setting
the coefficients on S_t equal to zero gives us the equations we need for solving the elements of B.
The problem then is to find B such that E_t[F B G S_t] + H S_t = 0, where the elements of the
matrices F, G, and H are functions of the parameters and the computed elements of A. If
E_t ≡ E[·|μ_t, μ_{t−1}, …], then F B G + H = 0, and B is given by

    vec(B) = −(G′ ⊗ F)^{−1} vec(H)        (7.2.10)

where vec(B) is a vector with the columns of B stacked one after another.
Note, however, that the first residual equation has Et ≡ E[·|µt−1, µt−2, . . .]. Therefore,
we have to treat it slightly differently from the others. Even so, the solution procedure is
an application of undetermined coefficients, and all elements of B can be found by solving
a system of linear equations.
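The vec step can be sketched as follows. This is an illustrative implementation for the simple information set, using the standard identity vec(FBG) = (G′ ⊗ F) vec(B); the function name and the random test system are ours.

```python
import numpy as np

def solve_B(F, G, H):
    """Solve F @ B @ G + H = 0 for B via vec(F B G) = kron(G.T, F) vec(B).
    A sketch of the undetermined-coefficients step; assumes kron(G.T, F)
    is nonsingular."""
    n, m = H.shape
    K = np.kron(G.T, F)
    vecB = np.linalg.solve(K, -H.flatten(order="F"))   # column-stacked vec
    return vecB.reshape((n, m), order="F")
```

Note the `order="F"` arguments: vec(·) stacks columns, so both the flattening of H and the reshaping of the solution must be column-major.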
Chapter 8.
Business Cycle Accounting
Business cycle accounting is a simple method to help researchers develop quantitative
models of economic fluctuations. The method rests on the insight that many models are
equivalent to a prototype growth model with time-varying wedges which resemble produc-
tivity, labor and investment taxes, and government consumption. Wedges corresponding
to these variables—efficiency, labor, investment, and government consumption wedges—
are measured and then fed back into the model in order to assess the fraction of various
fluctuations they account for.
8.1. The Prototype Model with Time-Varying Wedges
The prototype model is a version of the RBC model described earlier. The main difference
is that we have a different set of shocks. We also keep track of the stochastic events so
as to be very clear about the timing of these shocks. Specifically, in each period t, the
economy experiences one of finitely many events st, which index the shocks. We denote
by st = (s0, ..., st) the history of events up through and including period t and often
refer to st as the state. The probability, as of period 0, of any particular history st is
π_t(s^t). The initial realization s_0 is given. The economy has four exogenous stochastic
variables, all of which are functions of the underlying random variable s_t: the efficiency
wedge A_t(s^t), the labor wedge 1 − τ_{lt}(s^t), the investment wedge 1/[1 + τ_{xt}(s^t)], and the
government consumption wedge g_t(s^t).
Consumers maximize expected utility over per capita consumption c_t and per capita
labor l_t,

    Σ_{t=0}^∞ Σ_{s^t} β^t π_t(s^t) U(c_t(s^t), l_t(s^t)) N_t,

subject to the budget constraint

    c_t(s^t) + [1 + τ_{xt}(s^t)] x_t(s^t) = [1 − τ_{lt}(s^t)] w_t(s^t) l_t(s^t) + r_t(s^t) k_t(s^{t−1}) + T_t(s^t)

and the capital accumulation law

    (1 + γ_n) k_{t+1}(s^t) = (1 − δ) k_t(s^{t−1}) + x_t(s^t),        (8.1.1)
where k_t(s^{t−1}) denotes the per capita capital stock, x_t(s^t) per capita investment, w_t(s^t)
the wage rate, r_t(s^t) the rental rate on capital, β the discount factor, δ the depreciation
rate of capital, N_t the population with growth rate equal to 1 + γ_n, and T_t(s^t) per capita
lump-sum transfers.
The production function is F(k_t(s^{t−1}), (1 + γ)^t l_t(s^t)), where 1 + γ is the rate of labor-
augmenting technical progress, which is assumed to be a constant. Firms maximize profits
given by A_t(s^t) F(k_t(s^{t−1}), (1 + γ)^t l_t(s^t)) − r_t(s^t) k_t(s^{t−1}) − w_t(s^t) l_t(s^t).
The equilibrium of this benchmark prototype economy is summarized by the resource
constraint,

    c_t(s^t) + x_t(s^t) + g_t(s^t) = y_t(s^t),        (8.1.2)

where y_t(s^t) denotes per capita output, together with

    y_t(s^t) = A_t(s^t) F(k_t(s^{t−1}), (1 + γ)^t l_t(s^t)),        (8.1.3)

    −U_{lt}(s^t)/U_{ct}(s^t) = [1 − τ_{lt}(s^t)] A_t(s^t) (1 + γ)^t F_{lt},        (8.1.4)

    U_{ct}(s^t) [1 + τ_{xt}(s^t)]        (8.1.5)
        = β Σ_{s^{t+1}} π_t(s^{t+1}|s^t) U_{ct+1}(s^{t+1}) {A_{t+1}(s^{t+1}) F_{kt+1}(s^{t+1}) + (1 − δ)[1 + τ_{xt+1}(s^{t+1})]},

where, here and throughout, notations like U_{ct}, U_{lt}, F_{lt}, and F_{kt} denote the derivatives
of the utility function and the production function with respect to their arguments and
π_t(s_{t+1}|s^t) denotes the conditional probability π_t(s^{t+1})/π_t(s^t). We assume that g_t(s^t)
fluctuates around a trend of (1 + γ)^t.
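Given functional forms, the efficiency and labor wedges can be read directly off (8.1.3) and (8.1.4). The sketch below assumes, purely for illustration, Cobb-Douglas production F(k, l) = k^θ l^{1−θ} and utility U(c, l) = log c + ψ log(1 − l); the parameter values and function name are ours, not the notes'.

```python
import numpy as np

# Illustrative backing-out of the efficiency and labor wedges from
# (8.1.3)-(8.1.4), assuming (our choice) F(k, l) = k**theta * l**(1 - theta)
# and U(c, l) = log(c) + psi * log(1 - l).
theta, psi, gamma = 0.35, 2.0, 0.02

def wedges(y, k, l, c, t):
    zl = (1 + gamma)**t * l                      # labor in efficiency units
    A = y / (k**theta * zl**(1 - theta))         # efficiency wedge from (8.1.3)
    # right side of (8.1.4) per unit of (1 - tau_l): A (1+gamma)^t F_l
    mpl = A * (1 + gamma)**t * (1 - theta) * (k / zl)**theta
    mrs = psi * c / (1 - l)                      # -U_l / U_c
    one_minus_tau_l = mrs / mpl                  # labor wedge from (8.1.4)
    return A, one_minus_tau_l
```

As noted later in the chapter, these two wedges require no equilibrium computation; only the investment wedge in (8.1.5) does.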
Notice that in this benchmark prototype economy, the efficiency wedge resembles a
blueprint technology parameter, and the labor wedge and the investment wedge resem-
ble tax rates on labor income and investment. Other more elaborate models could be
considered, models with other kinds of frictions that look like taxes on consumption or
on capital income. Consumption taxes induce a wedge between the consumption-leisure
marginal rate of substitution and the marginal product of labor in the same way as do
labor income taxes. Such taxes, if time-varying, also distort the intertemporal margins in
(8.1.5). Capital income taxes induce a wedge between the intertemporal marginal rate of
substitution and the marginal product of capital which is only slightly different from the
distortion induced by a tax on investment.
We emphasize that each of the wedges represents the overall distortion to the relevant
equilibrium condition of the model. For example, distortions both to labor supply affect-
ing consumers and to labor demand affecting firms distort the static first-order condition
(8.1.4). Our labor wedge represents the sum of these distortions. Thus, our method iden-
tifies the overall wedge induced by both distortions and does not identify each separately.
Likewise, liquidity constraints on consumers distort the consumer’s intertemporal Euler
equation, while investment financing frictions on firms distort the firm’s intertemporal Eu-
ler equation. Our method combines the Euler equations for the consumer and the firm
and therefore identifies only the overall wedge in the combined Euler equation given by
(8.1.5). We focus on the overall wedges because what matters in determining business
cycle fluctuations is the overall wedges, not each distortion separately.
8.2. Mapping Frictions to Wedges
Now we illustrate the mapping between detailed economies and prototype economies for
two types of wedges. We show that input-financing frictions in a detailed economy map
into efficiency wedges in our prototype economy. Sticky wages in a monetary economy map
into our prototype (real) economy with labor wedges. In an appendix, we show as well
that investment-financing frictions map into investment wedges and that fluctuations in net
exports in an open economy map into government consumption wedges in our prototype
(closed) economy. In general, our approach is to show that the frictions associated with
specific economic environments manifest themselves as distortions in first-order conditions
and resource constraints in a growth model. We refer to these distortions as wedges.
We choose simple models in order to illustrate how the detailed models map into the
prototypes. Since many models map into the same configuration of wedges, identifying one
particular configuration does not uniquely identify a model; rather, it identifies a whole
class of models consistent with that configuration. In this sense, our method does not
uniquely determine the model most promising to analyze business cycle fluctuations. It
does, however, guide researchers to focus on the key margins that need to be distorted in
order to capture the nature of the fluctuations.
8.2.1. Efficiency Wedges
In many economies, underlying frictions either within or across firms cause factor inputs to
be used inefficiently. These frictions in an underlying economy often show up as aggregate
productivity shocks in a prototype economy similar to our benchmark economy. Schmitz
(2005) presents an interesting example of within-firm frictions resulting from work rules
that lower measured productivity at the firm level. Lagos (2006) studies how labor market
policies lead to misallocations of labor across firms and, thus, to lower aggregate produc-
tivity. And Chu (2001) and Restuccia and Rogerson (2003) show how government policies
at the levels of plants and establishments lead to lower aggregate productivity.
Here we develop a detailed economy with input-financing frictions and use it to make
two points. This economy illustrates the general idea that frictions which lead to ineffi-
cient factor utilization map into efficiency wedges in a prototype economy. Beyond that,
however, the economy also demonstrates that financial frictions can show up as efficiency
wedges rather than as investment wedges. In our detailed economy, financing frictions lead
some firms to pay higher interest rates for working capital than do other firms. Thus, these
frictions lead to an inefficient allocation of inputs across firms.
A Detailed Economy With Input-Financing Frictions
Consider a simple detailed economy with financing frictions which distort the alloca-
tion of intermediate inputs across two types of firms. Both types of firms must borrow to
pay for an intermediate input in advance of production. One type of firm is more finan-
cially constrained, in the sense that it pays a higher interest rate on borrowing than does
the other type. We think of these frictions as capturing the idea that some firms, such as
small firms, often have difficulty borrowing. One motivation for the higher interest rate
faced by the financially constrained firms is that moral hazard problems are more severe
for small firms.
Specifically, consider the following economy. Aggregate gross output qt is a combi-
nation of the gross output qit from the economy’s two sectors, indexed i = 1, 2, where 1
indicates the sector of firms that are more financially constrained and 2 the sector of firms
that are less financially constrained. The sectors’ gross output is combined according to
    q_t = q_{1t}^φ q_{2t}^{1−φ},        (8.2.1)
where 0 < φ < 1. The representative producer of the gross output qt chooses q1t and q2t
to solve this problem:
max qt − p1tq1t − p2tq2t
subject to (8.2.1), where pit is the price of the output of sector i.
The resource constraint for gross output in this economy is
ct + kt+1 +m1t +m2t = qt + (1 − δ) kt, (8.2.2)
where ct is consumption, kt is the capital stock, and m1t and m2t are intermediate goods
used in sectors 1 and 2, respectively. Final output, given by yt = qt− m1t− m2t, is gross
output less the intermediate goods used.
The gross output of each sector i, qit, is made from intermediate goods mit and a
composite value-added good zit according to
    q_{it} = m_{it}^θ z_{it}^{1−θ},        (8.2.3)
where 0 < θ < 1. The composite value-added good is produced from capital kt and labor
lt according to
z1t + z2t = zt = F (kt, lt) . (8.2.4)
The producer of gross output of sector i chooses the composite good zit and the
intermediate good mit to solve this problem:
    max p_{it} q_{it} − v_t z_{it} − R_{it} m_{it}
subject to (8.2.3). Here vt is the price of the composite good and Rit is the gross within-
period interest rate paid on borrowing by firms in sector i. If firms in sector 1 are more
financially constrained than those in sector 2, then R1t > R2t. Let Rit = Rt(1+τit), where
Rt is the rate consumers earn within period t and τit measures the within-period spread,
induced by financing constraints, between the rate paid to consumers who save and the
rate paid by firms in sector i. Since consumers do not discount utility within the period,
Rt = 1.
In this economy, the representative producer of the composite good zt chooses kt and
lt to solve this problem:
    max v_t z_t − w_t l_t − r_t k_t
subject to (8.2.4), where wt is the wage rate and rt is the rental rate on capital.
Consumers solve this problem:
    max Σ_{t=0}^∞ β^t U(c_t, l_t)        (8.2.5)

subject to

    c_t + k_{t+1} = r_t k_t + w_t l_t + (1 − δ) k_t + T_t,

where l_t = l_{1t} + l_{2t} is the economy's total labor supply and T_t = R_t Σ_i τ_{it} m_{it} denotes lump-sum
transfers. Here we assume that the financing frictions act like distorting taxes, and the
proceeds are rebated to consumers. If, instead, we assumed that these frictions represent,
say, lost gross output, then we would adjust the economy’s resource constraint (8.2.2)
appropriately.
The Associated Prototype Economy
Now consider a version of the benchmark prototype economy that will have the same
aggregate allocations as the input-financing frictions economy just detailed. This prototype
economy is identical to our benchmark prototype except that the new prototype economy
has an investment wedge that resembles a tax on capital income rather than a tax on
investment. Here the government consumption wedge is set equal to zero.
Now the consumer’s budget constraint is
    c_t + k_{t+1} = (1 − τ_{kt}) r_t k_t + (1 − τ_{lt}) w_t l_t + (1 − δ) k_t + T_t,        (8.2.6)
and the efficiency wedge is

    A_t = κ (a_{1t}^{1−φ} a_{2t}^{φ})^{θ/(1−θ)} [1 − θ(a_{1t} + a_{2t})],        (8.2.7)

where a_{1t} = φ/(1 + τ_{1t}), a_{2t} = (1 − φ)/(1 + τ_{2t}), κ = [φ^φ (1 − φ)^{1−φ} θ^θ]^{1/(1−θ)}, and τ_{1t} and τ_{2t}
are the interest rate spreads in the detailed economy.
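A quick numerical illustration of (8.2.7) may help; the function and parameter values below are ours. Two spread configurations with the same weighted average a_{1t} + a_{2t} but a different split deliver different efficiency wedges.

```python
import numpy as np

# Illustrative evaluation of the efficiency wedge (8.2.7); phi and theta
# are made-up parameter values, not calibrated ones.
phi, theta = 0.5, 0.3
kappa = (phi**phi * (1 - phi)**(1 - phi) * theta**theta)**(1 / (1 - theta))

def efficiency_wedge(tau1, tau2):
    a1 = phi / (1 + tau1)                  # a_1t
    a2 = (1 - phi) / (1 + tau2)            # a_2t
    return (kappa * (a1**(1 - phi) * a2**phi)**(theta / (1 - theta))
            * (1 - theta * (a1 + a2)))
```

Holding a_{1t} + a_{2t} fixed while reshuffling the spreads across sectors moves A_t, which is exactly the special case taken up in the discussion of Proposition 1.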
Comparing the first-order conditions in the detailed economy with input-financing
frictions to those of the associated prototype economy with efficiency wedges leads imme-
diately to this proposition:
Proposition 1: Consider the prototype economy with resource constraint (8.1.2) and
consumer budget constraint (8.2.6) with exogenous processes for the efficiency wedge A_t
given in (8.2.7), the labor wedge given by

    1/(1 − τ_{lt}) = (1/(1 − θ)) [1 − θ(φ/(1 + τ*_{1t}) + (1 − φ)/(1 + τ*_{2t}))],        (8.2.8)

and the investment wedge given by τ_{kt} = τ_{lt}, where τ*_{1t} and τ*_{2t} are the interest rate spreads
from the detailed economy with input-financing frictions. Then the equilibrium allocations
for aggregate variables in the detailed economy are equilibrium allocations in this prototype
economy.
Consider the following special case of Proposition 1 in which only the efficiency wedge
fluctuates. Specifically, suppose that in the detailed economy the interest rate spreads τ1t
and τ2t fluctuate over time, but in such a way that the weighted average of these spreads,
    a_{1t} + a_{2t} = φ/(1 + τ_{1t}) + (1 − φ)/(1 + τ_{2t}),        (8.2.9)

is constant while a_{1t}^{1−φ} a_{2t}^{φ} fluctuates. Then from (8.2.8) we see that the labor and
investment wedges are constant, and from (8.2.7) we see that the efficiency wedge fluctuates. In
this case, on average, financing frictions are unchanged, but relative distortions fluctuate.
An outside observer who attempted to fit the data generated by the detailed economy
with input-financing frictions to the prototype economy would identify the fluctuations in
relative distortions with fluctuations in technology and would see no fluctuations in either
the labor wedge 1 − τlt or the investment wedge τkt. In particular, periods in which the
relative distortions increase would be misinterpreted as periods of technological regress.
8.2.2. Labor Wedges
Now we show that a monetary economy with sticky wages is equivalent to a (real) prototype
economy with labor wedges. In the detailed economy, the shocks are to monetary policy,
while in the prototype economy, the shocks are to the labor wedge.
A Detailed Economy With Sticky Wages
Consider a monetary economy populated by a large number of identical, infinitely lived
consumers. The economy consists of a competitive final goods producer and a continuum
of monopolistically competitive unions that set their nominal wages in advance of the
realization of shocks to the economy. Each union represents all consumers who supply a
specific type of labor.
In each period t, the commodities in this economy are a consumption-capital good,
money, and a continuum of differentiated types of labor, indexed by j ∈ [0, 1]. The
technology for producing final goods from capital and a labor aggregate at history, or
state, s^t has constant returns to scale and is given by y(s^t) = F(k(s^{t−1}), l(s^t)), where y(s^t)
is output of the final good, k(s^{t−1}) is capital, and

    l(s^t) = [∫ l(j, s^t)^v dj]^{1/v}        (8.2.10)
is an aggregate of the differentiated types of labor l(j, st).
The final goods producer in this economy behaves competitively. This producer
has some initial capital stock k(s_{−1}) and accumulates capital according to k(s^t) = (1 − δ)k(s^{t−1}) + x(s^t), where x(s^t) is investment. The present discounted value of profits for
this producer is
    Σ_{t=0}^∞ Σ_{s^t} Q(s^t) [P(s^t) y(s^t) − P(s^t) x(s^t) − W(s^{t−1}) l(s^t)],        (8.2.11)
where Q(st) is the price of a dollar at st in an abstract unit of account, P (st) is the dollar
price of final goods at st, and W (st−1) is the aggregate nominal wage at st which depends
on only st−1 because of wage stickiness.
The producer’s problem can be stated in two parts. First, the producer chooses se-
quences for capital k(st−1), investment x(st), and aggregate labor l(st) in order to maximize
(8.2.11) given the production function and the capital accumulation law. The first-order
conditions can be summarized by
    P(s^t) F_l(s^t) = W(s^{t−1})        (8.2.12)

    Q(s^t) P(s^t) = Σ_{s^{t+1}} Q(s^{t+1}) P(s^{t+1}) [F_k(s^{t+1}) + 1 − δ].        (8.2.13)
Second, for any given amount of aggregate labor l(s^t), the producer's demand for each
type of differentiated labor is given by the solution to

    min_{l(j,s^t), j∈[0,1]} ∫ W(j, s^{t−1}) l(j, s^t) dj        (8.2.14)

subject to (8.2.10); here W(j, s^{t−1}) is the nominal wage for differentiated labor of type j.
Nominal wages are set by unions before the realization of the event in period t; thus, wages
depend on, at most, s^{t−1}. The demand for labor of type j by the final goods producer is

    l^d(j, s^t) = [W(s^{t−1})/W(j, s^{t−1})]^{1/(1−v)} l(s^t),        (8.2.15)

where W(s^{t−1}) ≡ [∫ W(j, s^{t−1})^{v/(v−1)} dj]^{(v−1)/v} is the aggregate nominal wage. The minimized
value in (8.2.14) is, thus, W(s^{t−1}) l(s^t).
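The demand schedule (8.2.15) and the wage index can be verified numerically by approximating the continuum of types with J discrete types of mass 1/J; the discretization and the numbers below are ours, chosen only so the check runs.

```python
import numpy as np

# Check (ours) of the labor demand schedule (8.2.15): with the continuum
# approximated by J types of mass 1/J, the implied demands attain the
# aggregate l(s^t) and the total wage bill equals Wbar * l(s^t).
v, J = 0.8, 5
W = np.array([1.0, 1.1, 0.9, 1.05, 0.95])        # posted wages W(j, s^{t-1})
l_agg = 2.0                                       # aggregate labor l(s^t)

Wbar = np.mean(W**(v / (v - 1)))**((v - 1) / v)   # aggregate wage index
l_d = (Wbar / W)**(1 / (1 - v)) * l_agg           # demand for each type j
```

Both properties hold exactly, confirming that the minimized cost in (8.2.14) is W(s^{t−1}) l(s^t).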
In this economy, consumers can be thought of as being organized into a continuum of
unions indexed by j. Each union consists of all the consumers in the economy with labor
of type j. Each union realizes that it faces a downward-sloping demand for its type of
labor, given by (8.2.15). In each period, the new wages are set before the realization of
the economy’s current shocks.
The preferences of a representative consumer in the jth union are

    Σ_{t=0}^∞ Σ_{s^t} β^t π_t(s^t) [U(c(j, s^t), l(j, s^t)) + V(M(j, s^t)/P(s^t))],        (8.2.16)
where c(j, s^t), l(j, s^t), and M(j, s^t) are the consumption, labor supply, and money holdings of
this consumer, and P (st) is the economy’s overall price level. Note that the utility function
is separable in real balances. This economy has complete markets for state-contingent
nominal claims. The asset structure is represented by a set of complete, contingent, one-
period nominal bonds. Let B(j, st+1) denote the consumers’ holdings of such a bond
purchased in period t at history st, with payoffs contingent on some particular event st+1
in t+ 1, where st+1 = (st, st+1). One unit of this bond pays one dollar in period t + 1 if
the particular event st+1 occurs and 0 otherwise. Let Q(st+1|st) denote the dollar price of
this bond in period t at history st, where Q(st+1|st) = Q(st+1)/Q(st).
The problem of the jth union is to maximize (8.2.16) subject to the budget constraint

    P(s^t) c(j, s^t) + M(j, s^t) + Σ_{s^{t+1}} Q(s^{t+1}|s^t) B(j, s^{t+1})
        ≤ W(j, s^{t−1}) l(j, s^t) + M(j, s^{t−1}) + B(j, s^t) + P(s^t) T(s^t) + D(s^t),

the constraint l(j, s^t) = l^d(j, s^t), and the borrowing constraint B(s^{t+1}) ≥ −P(s^t) b, where
l^d(j, s^t) is given by (8.2.15). Here T(s^t) is transfers and the positive constant b constrains
the amount of real borrowing by the union. Also, D(s^t) = P(s^t) y(s^t) − P(s^t) x(s^t) −
W(s^{t−1}) l(s^t) are the dividends paid by the firms. The initial conditions M(j, s^{−1}) and
B(j, s_0) are given and assumed to be the same for all j. Notice that in this problem, the
union chooses the wage and agrees to supply whatever labor is demanded at that wage.
The first-order conditions for this problem can be summarized by

    V_m(j, s^t)/P(s^t) − U_c(j, s^t)/P(s^t) + β Σ_{s^{t+1}} π(s^{t+1}|s^t) U_c(j, s^{t+1})/P(s^{t+1}) = 0,        (8.2.17)

    Q(s^t|s^{t−1}) = β π_t(s^t|s^{t−1}) [U_c(j, s^t)/U_c(j, s^{t−1})] [P(s^{t−1})/P(s^t)], and        (8.2.18)

    W(j, s^{t−1}) = − [Σ_{s^t} Q(s^t) P(s^t) (U_l(j, s^t)/U_c(j, s^t)) l^d(j, s^t)] / [v Σ_{s^t} Q(s^t) l^d(j, s^t)].        (8.2.19)
Here π_t(s_{t+1}|s^t) = π_t(s^{t+1})/π_t(s^t) is the conditional probability of s_{t+1} given s^t. Notice
that in a steady state, (8.2.19) reduces to W/P = (1/v)(−U_l/U_c), so that real wages are
set as a markup over the marginal rate of substitution between labor and consumption.
Given the symmetry among the unions, all of them choose the same consumption, labor,
money balances, bond holdings, and wages, which are denoted simply by c(s^t), l(s^t), M(s^t),
B(s^{t+1}), and W(s^t).
Consider next the specification of the money supply process and the market-clearing
conditions for this sticky-wage economy. The nominal money supply process is given
by M(st) = µ(st)M(st−1), where µ(st) is a stochastic process. New money balances
are distributed to consumers in a lump-sum fashion by having nominal transfers satisfy
P (st)T (st) = M(st)−M(st−1). The resource constraint for this economy is c(st)+k(st) =
y(st) + (1 − δ)k(st−1). Bond market–clearing requires that B(st+1) = 0.
The Associated Prototype Economy
Consider now a real prototype economy with labor wedges and the production function
for final goods given above in the detailed economy with sticky wages. The representative
firm maximizes (8.2.11) subject to the capital accumulation law given above. The first-
order conditions can be summarized by (8.2.12) and (8.2.13). The representative consumer
maximizes

    Σ_{t=0}^∞ Σ_{s^t} β^t π_t(s^t) U(c(s^t), l(s^t))

subject to the budget constraint

    c(s^t) + Σ_{s^{t+1}} q(s^{t+1}|s^t) b(s^{t+1}) ≤ [1 − τ_l(s^t)] w(s^t) l(s^t) + b(s^t) + v(s^t) + d(s^t)

with w(s^t) replacing W(s^{t−1})/P(s^t) and q(s^{t+1}|s^t) replacing Q(s^{t+1})P(s^{t+1})/Q(s^t)P(s^t),
and a bound on real bond holdings, where the lowercase letters q, b, w, v, and d denote the
real values of bond prices, debt, wages, lump-sum transfers, and dividends. Here the first-
order condition for bonds is identical to that in (8.2.18) once symmetry has been imposed,
with q(s^t|s^{t−1}) replacing Q(s^t|s^{t−1})P(s^t)/P(s^{t−1}). The first-order condition for labor is
given by

    −U_l(s^t)/U_c(s^t) = (1 − τ_l(s^t)) w(s^t).
Consider an equilibrium of the sticky wage economy for some given stochastic process
M∗(st) on money supply. Denote all of the allocations and prices in this equilibrium with
asterisks. Then this proposition can be easily established:
Proposition 2: Consider the prototype economy just described with labor wedges
given by

    1 − τ_l(s^t) = −[U*_l(s^t)/U*_c(s^t)] [1/F*_l(s^t)],        (8.2.20)

where U*_l(s^t), U*_c(s^t), and F*_l(s^t) are evaluated at the equilibrium of the sticky wage econ-
omy and where real transfers are equal to the real value of transfers in the sticky wage
economy adjusted for the interest cost of holding money. Then the equilibrium allocations
and prices in the sticky wage economy are the same as those in the prototype economy.
The proof of this proposition is immediate from comparing the first-order conditions,
the budget constraints, and the resource constraints for the prototype economy with labor
wedges to those of the detailed economy with sticky wages. The key idea is that distortions
in the sticky-wage economy between the marginal product of labor implicit in (8.2.19) and
the marginal rate of substitution between leisure and consumption are perfectly captured
by the labor wedges (8.2.20) in the prototype economy.
8.3. The Accounting Procedure
Having established our equivalence result, we now describe our accounting procedure at a
conceptual level and discuss a Markovian implementation of it.
Our procedure is to conduct experiments that isolate the marginal effect of each wedge
as well as the marginal effects of combinations of these wedges on aggregate variables. In
the experiment in which we isolate the marginal effect of the efficiency wedge, for example,
we hold the other wedges fixed at some constant values in all periods. In conducting this
experiment, we ensure that the probability distribution of the efficiency wedge coincides
with that in the prototype economy. In effect, we ensure that agents’ expectations of
how the efficiency wedge will evolve are the same as in the prototype economy. For each
experiment, we compare the properties of the resulting equilibria to those of the prototype
economy. These comparisons, together with our equivalence results, allow us to identify
promising classes of detailed economies.
8.3.1. The Accounting Procedure at a Conceptual Level
Suppose for now that the stochastic process πt(st) and the realizations of the state st in
some particular episode are known. Recall that the prototype economy has one underlying
(vector-valued) random variable, the state st, which has a probability of πt(st). All of the
other stochastic variables, including the four wedges—the efficiency wedge A_t(s^t), the labor
wedge 1 − τ_{lt}(s^t), the investment wedge 1/[1 + τ_{xt}(s^t)], and the government consumption
wedge g_t(s^t)—are simply functions of this random variable. Hence, when the state s^t is
known, so are the wedges.
To evaluate the effects of just the efficiency wedge, for example, we consider an economy,
referred to as an efficiency wedge alone economy, with the same underlying state
s_t and probability π_t(s^t) and the same function A_t(s^t) for the efficiency wedge as in the
prototype economy, but in which the other three wedges are set to constants, in that
τ_{lt}(s^t) = τ_l, τ_{xt}(s^t) = τ_x, and g_t(s^t) = g. Note that this construction ensures that the
probability distribution of the efficiency wedge in this economy is identical to that in the
prototype economy.
For the efficiency wedge alone economy, we then compute the equilibrium outcomes
associated with the realizations of the state st in a particular episode and compare these
outcomes to those of the economy with all four wedges. We find this comparison to be
of particular interest because in our applications, the realizations st are such that the
economy with all four wedges exactly reproduces the data on output, labor, investment,
and consumption.
In a similar manner, we define the labor wedge alone economy, the investment wedge
alone economy, and the government consumption wedge alone economy, as well as economies
with a combination of wedges such as the efficiency and labor wedge economy.
8.3.2. A Markovian Implementation
So far we have described our procedure assuming that we know the stochastic process
πt(st) and that we can observe the state st. In practice, of course, we need to either specify
the stochastic process a priori or use data to estimate it, and we need to uncover the state
st from the data. Here we describe a set of assumptions that makes these efforts easy.
Then we describe in detail the three steps involved in implementing our procedure.
We assume that the state s_t follows a Markov process of the form π(s_t|s_{t−1}) and that
the wedges in period t can be used to uniquely uncover the event s_t, in the sense that the
mapping from the event s_t to the wedges (A_t, τ_{lt}, τ_{xt}, g_t) is one-to-one and onto. Given this
assumption, without loss of generality, let the underlying event s_t = (s_{At}, s_{lt}, s_{xt}, s_{gt}), and
let A_t(s^t) = s_{At}, τ_{lt}(s^t) = s_{lt}, τ_{xt}(s^t) = s_{xt}, and g_t(s^t) = s_{gt}. Note that we have effectively
assumed that agents use only past wedges to forecast future wedges and that the wedges
in period t are sufficient statistics for the event in period t.
The first step in our procedure is to use data on yt, lt, xt, and gt from an actual
economy to estimate the parameters of the Markov process π(st|st−1). We can do so using
a variety of methods, including the maximum likelihood procedure described below.
The second step in our procedure is to uncover the event st by measuring the realized
wedges. We measure the government consumption wedge directly from the data as the sum
of government spending and net exports. To obtain the values of the other three wedges,
we use the data and the model's decision rules. With y^d_t, l^d_t, x^d_t, g^d_t, and k^d_0 denoting the
data and y(s_t, k_t), l(s_t, k_t), and x(s_t, k_t) denoting the decision rules of the model, the
realized wedge series s^d_t solves

    y^d_t = y(s^d_t, k_t),  l^d_t = l(s^d_t, k_t),  and  x^d_t = x(s^d_t, k_t),        (8.3.1)
with k_{t+1} = (1 − δ)k_t + x^d_t, k_0 = k^d_0, and g_t = g^d_t. Note that we construct a series for the
capital stock using the capital accumulation law (8.1.1), data on investment xt, and an
initial choice of capital stock k0. In effect, we solve for the three unknown elements of the
vector st using the three equations (8.1.3)–(8.1.5) and thereby uncover the state. We use
the associated values for the wedges in our experiments.
Note that the four wedges account for all of the movement in output, labor, investment,
and government consumption, in that if we feed the four wedges into the three decision
rules in (8.3.1) and use g_t(s^d_t) = s_{gt} along with the law of motion for capital, we simply
recover the original data.
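The inversion in (8.3.1) can be sketched as a period-by-period root-finding problem. Everything below is illustrative: `rules` stands in for the model's computed decision rules (here a made-up linear system so the sketch runs), and `uncover_state` is our name for the loop.

```python
import numpy as np
from scipy.optimize import root

def rules(s, k):
    """Stand-in decision rules (y, l, x) as a function of the state s and
    capital k; a made-up invertible linear example, not the model's rules."""
    sA, sl, sx = s
    y = sA + 0.3 * k
    l = 0.5 * sA - sl
    x = 0.2 * y - sx
    return np.array([y, l, x])

def uncover_state(data, k0, delta=0.06):
    """Invert (8.3.1) period by period, updating capital with
    k_{t+1} = (1 - delta) k_t + x_t^d along the way."""
    k, states = k0, []
    for y_d, l_d, x_d in data:
        sol = root(lambda s: rules(s, k) - np.array([y_d, l_d, x_d]),
                   np.zeros(3))
        states.append(sol.x)
        k = (1 - delta) * k + x_d       # capital law of motion with data x
    return np.array(states)
```

Feeding the recovered states back through the same decision rules reproduces the data by construction, which is the consistency check described in the text.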
Note also that, in measuring the realized wedges, the estimated stochastic process
plays a role in measuring only the investment wedge. To see that the stochastic process
does not play a role in measuring the efficiency and labor wedges, note that these wedges
can equivalently be directly calculated from (8.1.3) and (8.1.4) without computing the
equilibrium of the model. In contrast, calculating the investment wedge requires computing
the equilibrium of the model because the right side of (8.1.5) has expectations over future
values of consumption, the capital stock, the wedges, and so on. The equilibrium of the
model depends on these expectations and, therefore, on the stochastic process driving the
wedges.
The third step in our procedure is to conduct experiments to isolate the marginal
effects of the wedges. To do that, we allow a subset of the wedges to fluctuate as they do
in the data while the others are set to constants. To evaluate the effects of the efficiency
wedge, we compute the decision rules for the efficiency wedge alone economy, denoted
y^e(s_t, k_t), l^e(s_t, k_t), and x^e(s_t, k_t), in which A_t(s^t) = s_{At}, τ_{lt}(s^t) = τ_l, τ_{xt}(s^t) = τ_x, and
g_t(s^t) = g. Starting from k^d_0, we then use s^d_t, the decision rules, and the capital accumulation
law to compute the realized sequence of output, labor, and investment, y^e_t, l^e_t, and x^e_t,
which we call the efficiency wedge components of output, labor, and investment. We com-
pare these components to output, labor, and investment in the data. Other components
are computed and compared similarly.
Notice that in this experiment we computed the decision rules for an economy in
which only one wedge fluctuates and the others are set to be constants in all events. The
fluctuations in the one wedge are driven by fluctuations in the four-dimensional state s_t.
Notice also that our experiments are designed to separate out the direct effect and the
forecasting effect of fluctuations in wedges. As a wedge fluctuates, it directly affects either
budget constraints or resource constraints. This fluctuation also affects the forecasts of
that wedge as well as of other wedges in the future. Our experiments are designed so that
when we hold a particular wedge constant, we eliminate the direct effect of that wedge,
but we retain its forecasting effect on the other wedges. By doing so, we ensure that
expectations of the fluctuating wedges are identical to those in the prototype economy.
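The bookkeeping in these experiments, feeding the measured state path through one-wedge decision rules while accumulating capital, can be sketched as follows. This is a minimal illustration in Python; the function and argument names (and the toy rules in the usage below) are ours, not from the notes, and the decision rules themselves must come from solving the one-wedge economy:

```python
def component_series(k0, s_path, y_rule, l_rule, x_rule, law_of_motion):
    """Feed the measured state path s_path through one-wedge decision rules
    (callables of (s, k)) and accumulate capital with law_of_motion(k, x).
    Returns the realized (y, l, x) path, i.e., the wedge components."""
    k, out = k0, []
    for s in s_path:
        y, l, x = y_rule(s, k), l_rule(s, k), x_rule(s, k)
        out.append((y, l, x))
        k = law_of_motion(k, x)   # capital accumulation between periods
    return out
```

The same loop serves for the labor-wedge or investment-wedge components: only the decision rules passed in change.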
Chapter 9.
Structural VARs
9.1. A Version of the RBC Model
9.1.1. Maximization problems
Consider an economy with households, firms, and the government. The representative
household chooses consumption, investment, and labor to solve the following maximization
problem:
    max_{ct,xt,lt}  E ∑_{t=0}^∞ β^t U(ct, 1 − lt) Nt

subject to

    (1 + τct) ct + (1 + τxt) xt = (1 − τkt) rt kt + (1 − τlt) wt lt + τkt δ kt + trt
    Nt+1 kt+1 = [(1 − δ) kt + xt] Nt
    ct, xt ≥ 0 in all states
taking processes for the rental rate, wage rate, the tax rates, and transfers as given. The
representative firm solves a simple static problem at t:
    max_{Kt,Lt}  F(Kt, Zt Lt) − rt Kt − wt Lt.
The government sets rates of taxes and transfers in such a way that its budget constraint
at t, namely,

    Gt + Nt trt = τkt (rt − δ) Nt kt + τlt wt lt Nt + τct Nt ct + τxt Nt xt,
is satisfied. In equilibrium, the following conditions must hold:
    Nt (ct + xt) + Gt = F(Kt, Zt Lt)                                          (9.1.1)
    Nt kt = Kt
    Nt lt = Lt.
9.1.2. First-order conditions
We now derive first-order conditions in this economy. The Lagrangian for the household
optimization problem is given by
    L = E ∑_t β^t Nt { U(ct, 1 − lt)
          + μt [(1 − τkt) rt kt + (1 − τlt) wt lt + τkt δ kt + trt
                − (1 + τct) ct − (1 + τxt) xt]
          + λt [(1 − δ) kt + xt − (1 + gn) kt+1] }
Here, it is assumed that the investment decision will be interior.
The relevant first-order conditions are found by taking derivatives of L with respect
to ct, lt, xt, and kt+1:
    0 = U1(ct, 1 − lt) − μt (1 + τct)
    0 = −U2(ct, 1 − lt) + μt (1 − τlt) wt
    0 = −μt (1 + τxt) + λt
    0 = −(1 + gn) λt + β (1 + gn) Et { μt+1 [(1 − τkt+1) rt+1 + δ τkt+1] + λt+1 (1 − δ) }
Eliminating multipliers yields:
    U2(ct, 1 − lt) / U1(ct, 1 − lt) = [(1 − τlt)/(1 + τct)] wt                (9.1.2)

    [(1 + τxt)/(1 + τct)] U1(ct, 1 − lt)
        = β Et { [U1(ct+1, 1 − lt+1)/(1 + τct+1)]
                 [(1 − τkt+1) rt+1 + δ τkt+1 + (1 − δ)(1 + τxt+1)] }.         (9.1.3)
In addition, there are first-order conditions for the firm’s static problem. These are
rt = F1 (Kt, ZtLt) (9.1.4)
wt = F2 (Kt, ZtLt)Zt. (9.1.5)
Finally, there is a resource constraint given by (9.1.1).
From here on, the following functional form assumptions and auxiliary choices are
made:
    F(k, l) = k^θ l^{1−θ}                                                     (9.1.6)

    U(c, 1 − l) = (c (1 − l)^ψ)^{1−σ} / (1 − σ)                               (9.1.7)

    τkt = τct = 0

    st = [log zt, τlt, τxt, log gt]′

    st+1 = P0 + P st + Q ηs,t+1,   ηs ∼ N(0_{4×1}, I_{4×4}).                  (9.1.8)
The tax rate τc has been set to 0 in all periods since it plays a similar role to τl in distorting
the labor-leisure choice. Similarly, τk has been set to 0 since it plays a similar role to τx
in distorting the intertemporal margin.
If we substitute the choices (9.1.6)-(9.1.7) into (9.1.1) and (9.1.2)-(9.1.5), and then
substitute the equilibrium rates rt and wt into (9.1.2) and (9.1.3), we have:
    Nt (ct + gt) + Nt+1 kt+1 − (1 − δ) Nt kt = (Nt kt)^θ (Zt Nt lt)^{1−θ}     (9.1.9)

    ψ ct / (1 − lt) = (1 − τlt)(1 − θ) (Nt kt)^θ Zt^{1−θ} (Nt lt)^{−θ}        (9.1.10)

    (1 + τxt) ct^{−σ} (1 − lt)^{ψ(1−σ)}
        = β Et { ct+1^{−σ} (1 − lt+1)^{ψ(1−σ)}
                 [θ (Nt+1 kt+1)^{θ−1} (Zt+1 Nt+1 lt+1)^{1−θ}
                  + (1 − δ)(1 + τxt+1)] }.                                    (9.1.11)
9.1.3. Log-linear computation
We first normalize the variables, replacing each original variable by its detrended
counterpart:

    ct := ct/Zt,   xt := xt/Zt,   gt := gt/Zt,   yt := yt/Zt,   kt := kt/Zt−1,

and we let zt ≡ Zt/Zt−1 denote the growth rate of technology.
Using the functional forms for F and U in (9.1.6) and (9.1.7), respectively, the equilibrium
rental and wage rates are:
    rt = θ Kt^{θ−1} (Zt Lt)^{1−θ} = θ kt^{θ−1} (zt lt)^{1−θ}

    wt = (1 − θ) Kt^θ (Zt Lt)^{−θ} Zt = (1 − θ) kt^θ (zt lt)^{−θ} Zt.
This implies the following first-order conditions
    ct + gt + (1 + gn) kt+1 − (1 − δ) zt^{−1} kt = yt = kt^θ lt^{1−θ} zt^{−θ} (9.1.12)

    ψ ct / (1 − lt) = (1 − τlt)(1 − θ) kt^θ (zt lt)^{−θ}                      (9.1.13)

    (1 + τxt) ct^{−σ} (1 − lt)^{ψ(1−σ)}
        = β Et { zt+1^{−σ} ct+1^{−σ} (1 − lt+1)^{ψ(1−σ)}
                 [θ kt+1^{θ−1} (zt+1 lt+1)^{1−θ} + (1 − δ)(1 + τxt+1)] }.     (9.1.14)
Next, we compute the steady state of the system for constant values for z, the taxes,
and government spending:
    k/l = [ (1 + τx)(1 − β z^{−σ} (1 − δ)) / (β z^{−σ} θ z^{1−θ}) ]^{1/(θ−1)}

    c = [ (k/l)^{θ−1} z^{−θ} − (1 + gn) + (1 − δ) z^{−1} ] k − g ≡ ξ1 k − g

    c = [ (1 − τl)(1 − θ) (k/l)^θ z^{−θ} / ψ ] (1 − (1/(k/l)) k) ≡ ξ2 − ξ3 k,

where the last two equations imply k = (ξ2 + g)/(ξ1 + ξ3), c = ξ1 k − g, and
l = (1/(k/l)) k.
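These steady-state formulas can be evaluated directly. The following Python sketch uses illustrative parameter values of our own choosing, not calibrated values from the notes:

```python
# Steady state of the normalized model; parameter values are illustrative.
beta, sigma, delta, theta, psi = 0.98, 1.0, 0.05, 0.35, 2.0
gn, z, tau_l, tau_x, g = 0.015, 1.02, 0.25, 0.0, 0.2

# k/l from the intertemporal condition (9.1.14) at the steady state
kl = ((1 + tau_x) * (1 - beta * z ** (-sigma) * (1 - delta))
      / (beta * z ** (-sigma) * theta * z ** (1 - theta))) ** (1 / (theta - 1))

xi1 = kl ** (theta - 1) * z ** (-theta) - (1 + gn) + (1 - delta) / z
xi2 = (1 - tau_l) * (1 - theta) * kl ** theta * z ** (-theta) / psi
xi3 = xi2 / kl

k = (xi2 + g) / (xi1 + xi3)   # capital: intersection of the two c equations
c = xi1 * k - g               # consumption from the resource constraint
l = k / kl                    # hours from the capital-labor ratio
```

Both expressions for consumption agree at the computed k, which is a useful internal check on the algebra.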
Assume that the solution for the capital decision takes the form:
log kt+1 = γk log kt + γ [ log zt τlt τxt log gt ]′+ constant, (9.1.15)
where γk is a scalar and γ is 1 × 4 and equal to [γz, γl, γx, γg]. Assume the residual from
the dynamic first-order condition (9.1.14) can be written (after substitutions from (9.1.12)
and (9.1.13)):
    f(Et log kt+2, log kt+1, log kt, log zt+1, log zt, τlt+1, τlt, τxt+1, τxt,
      log gt+1, log gt)
        ≈ a0 Et log kt+2 + a1 log kt+1 + a2 log kt + b0 Et st+1 + b1 st.
Then the general solution algorithm is to find γk that solves the quadratic equation

    a0 γk^2 + a1 γk + a2 = 0,

and γ that solves the linear equations:

    a0 γk γ + a0 γ P + a1 γ + b0 P + b1 = 0_{1×4}.

Note that this implies:

    γ′ = −[(a0 γk + a1) I_{4×4} + a0 P′]^{−1} (b0 P + b1)′.                   (9.1.16)
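The two steps, picking the stable root of the quadratic for γk and solving the linear system for γ, can be sketched as follows (Python; the function names, the small Gaussian-elimination solver, and the test parameters are ours):

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def decision_rule(a0, a1, a2, b0, b1, P):
    """Stable root gamma_k of a0 g^2 + a1 g + a2 = 0, and the row vector gamma
    solving a0 gk gamma + a0 gamma P + a1 gamma + b0 P + b1 = 0."""
    disc = math.sqrt(a1 * a1 - 4 * a0 * a2)
    gk = min([(-a1 + disc) / (2 * a0), (-a1 - disc) / (2 * a0)], key=abs)
    n = len(P)
    # gamma [(a0 gk + a1) I + a0 P] = -(b0 P + b1); transpose and solve
    M = [[(a0 * gk + a1) * (i == j) + a0 * P[i][j] for j in range(n)]
         for i in range(n)]
    rhs = [-(sum(b0[i] * P[i][j] for i in range(n)) + b1[j]) for j in range(n)]
    MT = [[M[i][j] for i in range(n)] for j in range(n)]  # transpose of M
    return gk, solve_linear(MT, rhs)
```

Picking the smaller-modulus root of the quadratic implements the usual stability selection for the capital decision rule.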
Once we have values for the coefficients γk and γ, we can use (9.1.12) and (9.1.13) to
back out ct and lt (either nonlinearly or by way of a log-linear approximation).
One property of the solution that we use later is the fact that γk = −γz. This is true
because kt is everywhere divided by zt in the first-order conditions (9.1.12)-(9.1.13).
Thus, when the first-order conditions are log-linearized, the same coefficients hit
log(kt) and − log(zt).
Given values for the coefficients in (9.1.15), we can derive expressions for labor,
consumption, and investment using the static first-order conditions. In particular, we
log-linearize (9.1.13) after substituting in for consumption from (9.1.12):

    0 ≈ ψ { k^θ l^{1−θ} z^{−θ} [θ (log kt − log zt) + (1 − θ) log lt]
            − (1 + gn) k log kt+1 + (1 − δ) z^{−1} k (log kt − log zt) − g log gt }
        + (1 − θ)(1 − τl) k^θ (z l)^{−θ} (1 − l)
          { τlt/(1 − τl) − θ log kt + θ (log lt + log zt) + [l/(1 − l)] log lt },
which can be written succinctly as
log lt = φlk log kt + φlz log zt + φllτlt + φlg log gt + φlk′ log kt+1.
With this equation for log l, we use the production relation and the capital accumulation
equation to write log y and log x as follows:
    log yt = (θ + (1 − θ) φlk) log kt + ((1 − θ) φlz − θ) log zt
             + (1 − θ) [φll τlt + φlg log gt + φlk′ log kt+1]
           ≡ φyk log kt + φyz log zt + φyl τlt + φyg log gt + φyk′ log kt+1   (9.1.17)

    log xt ≈ (1 + gn)(k/x) log kt+1 − (1 − δ) z^{−1} (k/x) (log kt − log zt)
           ≡ φxk log kt + φxz log zt + φxk′ log kt+1.                         (9.1.18)
Finally, we can log-linearize (9.1.12) to get
    log ct ≈ { y [θ (log kt − log zt) + (1 − θ) log lt] − g log gt
               − (1 + gn) k log kt+1 + (1 − δ) z^{−1} k [log kt − log zt] } / c

           = [θ y/c + (1 − θ) φlk y/c + (1 − δ) k/(cz)] log kt
             − [θ y/c − (1 − θ) φlz y/c + (1 − δ) k/(cz)] log zt
             + [(1 − θ) φll y/c] τlt
             + [(1 − θ) φlg y/c − g/c] log gt
             + [(1 − θ) φlk′ y/c − (1 + gn) k/c] log kt+1

           ≡ φck log kt + φcz log zt + φcl τlt + φcg log gt + φck′ log kt+1.  (9.1.19)
9.2. VARs and the 2-Shock Version of the Model
9.2.1. The Decision Functions
Assume the economy has only two shocks and they are orthogonal: a unit root in
technology, log z, and an AR(1) in the tax rate on labor, τ. (For convenience, we drop
the subscript l on τlt throughout this section.) The capital decision function has the form:

    log kt+1 = γ0 + γk log kt + γz log zt + γl τt

and the labor decision function can be written:

    log lt = φlz log zt + φll τt + φlk log kt + φlk′ log kt+1
           = φlz log zt + φll τt + φlk log kt
             + φlk′ [γ0 + γk log kt + γz log zt + γl τt]
           = (φlk + φlk′ γk) log kt + (φlz + φlk′ γz) log zt + (φll + φlk′ γl) τt.
These imply that output from a Cobb-Douglas production technology with capital share
θ is:
    log yt = θ (log kt − log zt) + (1 − θ) log lt
           = (θ + (1 − θ) φlk) log kt − (θ − (1 − θ) φlz) log zt + (1 − θ) φll τt
             + (1 − θ) φlk′ log kt+1
           = (θ + (1 − θ)(φlk + φlk′ γk)) log kt − (θ − (1 − θ)(φlz + φlk′ γz)) log zt
             + (1 − θ)(φll + φlk′ γl) τt.
We can write the capital stock in terms of all lagged shocks as follows:
    log kt = γ0 + γk (γ0 + γk log kt−2 + γz log zt−2 + γl τt−2)
             + γz log zt−1 + γl τt−1
           = γ0 [1 + γk + γk^2 + . . .]
             + γz [log zt−1 + γk log zt−2 + γk^2 log zt−3 + . . .]
             + γl [τt−1 + γk τt−2 + γk^2 τt−3 + . . .]

or in differences as follows:

    log kt − log kt−1
      = γz [log zt−1 + (γk − 1)(log zt−2 + γk log zt−3 + γk^2 log zt−4 + . . .)]
        + γl [τt−1 + (γk − 1)(τt−2 + γk τt−3 + γk^2 τt−4 + . . .)]
or in quasi-differences as follows:
    log kt − α log kt−1
      = γz [log zt−1 + (γk − α)(log zt−2 + γk log zt−3 + γk^2 log zt−4 + . . .)]
        + γl [τt−1 + (γk − α)(τt−2 + γk τt−3 + γk^2 τt−4 + . . .)]
We can also write hours in terms of past shocks as follows:
    log lt = φlz log zt + φll τt + φlk log kt + φlk′ log kt+1
           = φlz log zt + φll τt
             + φlk γz [log zt−1 + γk log zt−2 + γk^2 log zt−3 + . . .]
             + φlk γl [τt−1 + γk τt−2 + γk^2 τt−3 + . . .]
             + φlk′ γz [log zt + γk log zt−1 + γk^2 log zt−2 + . . .]
             + φlk′ γl [τt + γk τt−1 + γk^2 τt−2 + . . .]
           = [(φlz + φlk′ γz) log zt + (φlk + φlk′ γk) γz log zt−1
              + (φlk + φlk′ γk) γk γz log zt−2 + . . .]
             + [(φll + φlk′ γl) τt + (φlk + φlk′ γk) γl τt−1
                + (φlk + φlk′ γk) γk γl τt−2 + . . .]
where constant terms have been ignored.
We can write logged hours in differences as follows:
    log lt − log lt−1
      = φlz (log zt − log zt−1) + φll (τt − τt−1)
        + φlk′ (log kt+1 − log kt) + φlk (log kt − log kt−1)
      = φlz (log zt − log zt−1) + φll (τt − τt−1)
        + φlk′ γz [log zt + (γk − 1)(log zt−1 + γk log zt−2 + γk^2 log zt−3 + . . .)]
        + φlk′ γl [τt + (γk − 1)(τt−1 + γk τt−2 + γk^2 τt−3 + . . .)]
        + φlk γz [log zt−1 + (γk − 1)(log zt−2 + γk log zt−3 + γk^2 log zt−4 + . . .)]
        + φlk γl [τt−1 + (γk − 1)(τt−2 + γk τt−3 + γk^2 τt−4 + . . .)]
      = [φlz + φlk′ γz] log zt − [φlz − φlk γz − φlk′ γz (γk − 1)] log zt−1
        + γz (γk − 1)[φlk′ γk + φlk] [log zt−2 + γk log zt−3 + γk^2 log zt−4 + . . .]
        + [φll + φlk′ γl] τt − [φll − φlk γl − φlk′ γl (γk − 1)] τt−1
        + γl (γk − 1)[φlk′ γk + φlk] [τt−2 + γk τt−3 + γk^2 τt−4 + . . .]
or in quasi-difference form as follows:
    log lt − α log lt−1
      = φlz (log zt − α log zt−1) + φll (τt − α τt−1)
        + φlk′ (log kt+1 − α log kt) + φlk (log kt − α log kt−1)
      = φlz (log zt − α log zt−1) + φll (τt − α τt−1)
        + φlk′ γz [log zt + (γk − α)(log zt−1 + γk log zt−2 + γk^2 log zt−3 + . . .)]
        + φlk′ γl [τt + (γk − α)(τt−1 + γk τt−2 + γk^2 τt−3 + . . .)]
        + φlk γz [log zt−1 + (γk − α)(log zt−2 + γk log zt−3 + γk^2 log zt−4 + . . .)]
        + φlk γl [τt−1 + (γk − α)(τt−2 + γk τt−3 + γk^2 τt−4 + . . .)]
      = [φlz + φlk′ γz] log zt − [α φlz − φlk γz − φlk′ γz (γk − α)] log zt−1
        + γz (γk − α)[φlk′ γk + φlk] [log zt−2 + γk log zt−3 + γk^2 log zt−4 + . . .]
        + [φll + φlk′ γl] τt − [α φll − φlk γl − φlk′ γl (γk − α)] τt−1
        + γl (γk − α)[φlk′ γk + φlk] [τt−2 + γk τt−3 + γk^2 τt−4 + . . .]
We can use the expressions for output and hours to write out the change in productivity
as follows:

    log(yt/lt) − log(yt−1/lt−1)
      = log yt − log yt−1 + log zt − (log lt − log lt−1)
      = log zt + θ (log kt − log kt−1 − log lt + log lt−1 − log zt + log zt−1)
      = (1 − θ) log zt + θ log zt−1 − θ (log lt − log lt−1 − log kt + log kt−1)
      = (1 − θ) log zt + θ log zt−1
        − θ { [φlz + φlk′ γz] log zt
              − [φlz − (φlk − 1) γz − φlk′ γz (γk − 1)] log zt−1
              + γz (γk − 1)[φlk′ γk + φlk − 1] [log zt−2 + γk log zt−3 + γk^2 log zt−4 + . . .]
              + [φll + φlk′ γl] τt − [φll − (φlk − 1) γl − φlk′ γl (γk − 1)] τt−1
              + γl (γk − 1)[φlk′ γk + φlk − 1] [τt−2 + γk τt−3 + γk^2 τt−4 + . . .] }
      = [1 − θ − θ (φlz + φlk′ γz)] log zt
        + θ [1 + φlz − (φlk − 1) γz − φlk′ γz (γk − 1)] log zt−1
        − θ γz (γk − 1)[φlk′ γk + φlk − 1] [log zt−2 + γk log zt−3 + γk^2 log zt−4 + . . .]
        − θ [φll + φlk′ γl] τt
        + θ [φll − (φlk − 1) γl − φlk′ γl (γk − 1)] τt−1
        − θ γl (γk − 1)[φlk′ γk + φlk − 1] [τt−2 + γk τt−3 + γk^2 τt−4 + . . .]
9.2.2. The Model’s Moving Average
The moving average for the model is given by:

    [ (1 − L) log(yt/lt) ]
    [ (1 − αL) log lt    ] ≡ Xt = D0 ωt + D1 ωt−1 + D2 ωt−2 + . . .

where ωt = [log zt, τt]′ and

    D0 = [ 1 − θ − θ (φlz + φlk′ γz)    −θ (φll + φlk′ γl) ]
         [ φlz + φlk′ γz                 φll + φlk′ γl      ]

    D1 = [ θ (1 + φlz − (φlk − 1) γz − φlk′ γz (γk − 1))   θ (φll − (φlk − 1) γl − φlk′ γl (γk − 1)) ]
         [ −α φlz + (φlk + φlk′ (γk − α)) γz               −α φll + (φlk + φlk′ (γk − α)) γl          ]

    D2 = [ −θ γz (γk − 1)[φlk′ γk + φlk − 1]    −θ γl (γk − 1)[φlk′ γk + φlk − 1] ]
         [ (φlk + φlk′ γk)(γk − α) γz           (φlk + φlk′ γk)(γk − α) γl        ]

and Dj = γk Dj−1 for j ≥ 3.
Let a = φlk + φlk′ γk and b = φll + φlk′ γl. Also, note that φlz = −φlk and γz = −γk
hold in the model economy with a unit root in technology. Then

    D0 = [ 1 − θ + θa    −θb ]
         [ −a             b  ]

    D1 = [ θ (1 − γk)(1 − a)    θ (b + (1 − a) γl) ]
         [ (α − γk) a           −αb + γl a         ]

    D2 = [ γk (1 − a) θ (1 − γk)    −γl (1 − a) θ (1 − γk) ]
         [ γk a (α − γk)            −γl a (α − γk)         ]

and, again, Dj = γk^{j−2} D2 for j ≥ 3. Note that D2 is singular.
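These matrices are easy to construct and check numerically. A Python sketch follows (the function names are ours, and the parameter values in the usage are arbitrary); note that D2 is built as an outer product gh′, so it is singular by construction:

```python
def D_matrices(theta, alpha, gamma_k, gamma_l, a, b):
    """MA coefficients D0, D1, D2 in the simplified form, using
    phi_lz = -phi_lk and gamma_z = -gamma_k (unit root in technology)."""
    D0 = [[1 - theta + theta * a, -theta * b],
          [-a, b]]
    D1 = [[theta * (1 - gamma_k) * (1 - a), theta * (b + (1 - a) * gamma_l)],
          [(alpha - gamma_k) * a, -alpha * b + gamma_l * a]]
    g = [(1 - a) * theta * (1 - gamma_k), (alpha - gamma_k) * a]
    h = [gamma_k, -gamma_l]
    D2 = [[gi * hj for hj in h] for gi in g]   # D2 = g h' (rank one)
    return D0, D1, D2

def D_j(j, D2, gamma_k):
    """Higher-order MA coefficients: D_j = gamma_k**(j-2) * D2 for j >= 2."""
    s = gamma_k ** (j - 2)
    return [[s * x for x in row] for row in D2]
```

A quick numerical check confirms det(D2) = 0 and the geometric decay of the higher coefficients.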
If τt is an AR(1), it is more convenient to write the MA process in terms of ηt =
[log zt, ηlt]′ rather than in terms of ωt. In this case,

    Xt = D0 ηt + (D0 P + D1) ηt−1 + (D0 P^2 + D1 P + D2) ηt−2
         + (D0 P^3 + D1 P^2 + D2 P + D3) ηt−3 + . . . .

We normalize the MA so it has an identity for the first coefficient. That is, set C0 = I,
C1 = (D0 P + D1) D0^{−1}, and Cj = Cj−1 D0 P D0^{−1} + Dj D0^{−1}.
9.2.3. Special Property of the D’s
Next, we will see that the D matrices have a special property that will be exploited when
we characterize coefficients of the VAR found by regressing Xt on lags of itself. The D’s
for the RBC model satisfy the relation:
    (γk I − (D0 P^2 + D1 P + D2)(D0 P + D1)^{−1}) D2 = 0.                     (9.2.1)
One method of proof is to multiply all terms of the matrices in (9.2.1) and show that all
elements are zero. We have done this but the algebra is messy.
A simpler proof is as follows. Note that

    D2 = [ (1 − a) θ (1 − γk) ] [ γk   −γl ] ≡ g h′.                          (9.2.2)
         [ (α − γk) a         ]
Thus, we can rewrite the left hand side of (9.2.1) as follows
    (γk I − (D0 P^2 + D1 P + D2)(D0 P + D1)^{−1}) D2
      = [γk (g h′) − (g h′)(D0 P + D1)^{−1} (g h′)]
        − [(D0 P + D1) P (D0 P + D1)^{−1} g h′].                              (9.2.3)

We will prove that both terms in square brackets in (9.2.3) are equal to 2×2 zero matrices.
The first step of the proof is to show that

    (D0 P + D1)^{−1} g = [ 1 ]
                         [ 0 ].                                               (9.2.4)

The proof of this step is trivial since the first column of D0 P + D1 is equal to g.
Substituting (9.2.4) into (9.2.3), the result (9.2.1) follows immediately from the fact
that h′ [1, 0]′ = γk and P [1, 0]′ = 0.
9.2.4. VAR Coefficients
Given expressions for the D coefficients in the model MA, and thus the normalized C
coefficients, we can directly write out expressions for the coefficients in the VAR of Xt
regressed on lags of itself. We will denote the VAR coefficients by Bj, j = 1, 2, . . .. They
are related to the MA coefficients as follows:

    Bj = Cj − B1 Cj−1 − B2 Cj−2 − . . . − Bj−1 C1.                            (9.2.5)
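Relation (9.2.5) maps MA coefficients to VAR coefficients recursively. A Python sketch follows (the helper names are ours). As a sanity check, if the Cj come from a true VAR(1), so that Cj = A^j, the recursion should return B1 = A and Bj = 0 for j ≥ 2:

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matsub(A, B):
    return [[A[i][j] - B[i][j] for j in range(len(A[0]))] for i in range(len(A))]

def var_from_ma(C):
    """Map MA coefficients [C1, C2, ...] (C0 = I implicit) to VAR
    coefficients [B1, B2, ...] via Bj = Cj - B1 Cj-1 - ... - Bj-1 C1."""
    B = []
    for j in range(1, len(C) + 1):
        acc = C[j - 1]
        for i in range(1, j):                 # subtract B_i C_{j-i}
            acc = matsub(acc, matmul(B[i - 1], C[j - i - 1]))
        B.append(acc)
    return B
```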
9.2.5. Proposition 1: Model has infinite-order VAR
Proposition 1. The model described above has a VAR representation with coefficients Bj
that satisfy

    Bj = M Bj−1                                                               (9.2.6)

for j ≥ 2, with B1 = C1 = (D0 P + D1) D0^{−1}. The matrix M is a 2×2 matrix with
eigenvalues equal to α and (γk − γl a/b − θ)/(1 − θ), where a = φlk + φlk′ γk and
b = φll + φlk′ γl are the coefficients on k and τl in the labor decision function.
Proof of Proposition 1. Choose M = C2 C1^{−1} − C1. Using the formula (9.2.5) for the
VAR coefficients, it is easy to show that M = B2 B1^{−1}. Therefore, B2 = M B1 holds.
Consider the next coefficient. Using the formula (9.2.5), we have

    B3 − M B2 = C3 − B1 C2 − B2 C1 − M (C2 − B1 C1)
              = C3 − B1 C2 − M C2
              = C3 − C1 C2 − (C2 C1^{−1} − C1) C2
              = C3 − C2 C1^{−1} C2
              = C2 D0 P D0^{−1} + γk D2 D0^{−1} − C2 C1^{−1} (C1 D0 P D0^{−1} + D2 D0^{−1})
              = γk D2 D0^{−1} − C2 C1^{−1} D2 D0^{−1}
              = (γk I − C2 C1^{−1}) D2 D0^{−1}
              = (γk I − (D0 P^2 + D1 P + D2)(D0 P + D1)^{−1}) D2 D0^{−1}
              = 0
where the last relation follows from the intermediate calculations done in Section 9.2.3.
The same calculation can be done for any j∗ using the fact that (9.2.6) holds for all
j < j∗, namely
    Bj − M Bj−1 = Cj − B1 Cj−1 − . . . − Bj−1 C1 − M (Cj−1 − B1 Cj−2 − . . . − Bj−2 C1)
                = Cj − B1 Cj−1 − M Cj−1
                = Cj − C1 Cj−1 − (C2 C1^{−1} − C1) Cj−1
                = Cj − C2 C1^{−1} Cj−1
                = Cj−1 D0 P D0^{−1} + Dj D0^{−1} − C2 C1^{−1} (Cj−2 D0 P D0^{−1} + Dj−1 D0^{−1})
                = Cj−1 D0 P D0^{−1} + γk^{j−2} D2 D0^{−1}
                  − C2 C1^{−1} (Cj−2 D0 P D0^{−1} + γk^{j−3} D2 D0^{−1})
                = (Cj−1 − C2 C1^{−1} Cj−2) D0 P D0^{−1} + γk^{j−3} (γk I − C2 C1^{−1}) D2 D0^{−1}
                = (Bj−1 − M Bj−2) D0 P D0^{−1} + γk^{j−3} (γk I − C2 C1^{−1}) D2 D0^{−1}
                = γk^{j−3} (γk I − (D0 P^2 + D1 P + D2)(D0 P + D1)^{−1}) D2 D0^{−1}
                = 0.
Next, we prove that the two eigenvalues of M are λ1 = α and λ2 = (γk − γl a/b − θ)/(1 − θ).
One way to do this is to write out all of the terms of the matrix M and derive expressions
for the trace and the determinant. The trace is equal to the sum of the eigenvalues, and
the determinant is equal to the product of the eigenvalues. This gives two equations in
two unknowns. We have done this, but the algebra is messy.
A simpler proof that the eigenvalues are λ1 = α and λ2 = (γk − γla/b− θ)/(1 − θ) is
as follows. Using (9.2.2) and the definitions of the C’s in terms of the D’s, we can derive
the following expression for M in terms of the D’s, P , and h:
    M = C2 C1^{−1} − C1
      = (D0 P^2 + D1 P + D2)(D0 P + D1)^{−1} − (D0 P + D1) D0^{−1}
      = (D0 P + D1) P (D0 P + D1)^{−1} + D2 (D0 P + D1)^{−1}
        − (D0 P + D1) D0^{−1} (D0 P + D1)(D0 P + D1)^{−1}
      = D2 (D0 P + D1)^{−1} − (D0 P + D1) D0^{−1} D1 (D0 P + D1)^{−1}
      = (D0 P + D1) [1, 0]′ h′ (D0 P + D1)^{−1} − (D0 P + D1) D0^{−1} D1 (D0 P + D1)^{−1}
      = (D0 P + D1) ([1, 0]′ h′ − D0^{−1} D1) (D0 P + D1)^{−1}.

Appealing to standard results in linear algebra, the eigenvalues of M are equal to the
eigenvalues of the simpler matrix [1, 0]′ h′ − D0^{−1} D1, which is equal to

    [1, 0]′ h′ − D0^{−1} D1
      = 1/(1 − θ) [ γk − θ (1 − a + aα)               −γl + θ b (1 − α)              ]
                  [ (γk − α − θ (1 − a)(1 − α)) a/b    α − θ (a + α (1 − a)) − γl a/b ]
Taking the trace, we get

    trace([1, 0]′ h′ − D0^{−1} D1) = α + (γk − γl a/b − θ)/(1 − θ).           (9.2.7)

Taking the determinant, we get

    det([1, 0]′ h′ − D0^{−1} D1) = α × (γk − γl a/b − θ)/(1 − θ).             (9.2.8)

The two equations (9.2.7) and (9.2.8) uniquely determine the two eigenvalues, which are
those proposed.
9.2.6. Blanchard-Quah Identification
We now consider the procedure of Blanchard and Quah (1989) when applied to data from
the 2-shock version of the model described above.
Blanchard and Quah start with a VAR

    Xt = B1 Xt−1 + B2 Xt−2 + . . . + Bp Xt−p + vt,   E vt vt′ = Ω,

which is estimated using time series Xt. As described above, this implies the MA

    Xt = vt + C1 vt−1 + C2 vt−2 + . . . .                                     (9.2.9)
Some structure is needed to derive a “structural MA” with shocks that have economic
interpretation. In this case, we will use the following notation for the structural MA:
    Xt = A0 εt + A1 εt−1 + A2 εt−2 + . . . ,                                  (9.2.10)

where A0 εt = vt and Aj = Cj A0.
Because we will impose restrictions on the sums of the A's and C's, we define

    C = I + C1 + C2 + C3 + . . .
    A = A0 + A1 + A2 + A3 + . . . = C A0.
Since A0 εt = vt, it must be the case that A0 (E εt εt′) A0′ = Ω. Blanchard and Quah
assume that the elements of εt are orthogonal and that demand shocks do not have a
long-run effect on productivity. Without loss of generality, we can normalize the
magnitude of the variances of the elements of εt and assume, therefore, that

    A0 A0′ = Ω
    C(1, 1) A0(1, 2) + C(1, 2) A0(2, 2) = 0,                                  (9.2.11)

which is four equations in the four unknown elements of A0. Condition (9.2.11) ensures
that the demand shock does not have a long-run effect on productivity. Writing out the
system of four equations and four unknowns yields:

    ω11 = A0(1, 1)^2 + A0(1, 2)^2
    ω12 = A0(1, 1) A0(2, 1) + A0(1, 2) A0(2, 2)                               (9.2.12)
    ω22 = A0(2, 1)^2 + A0(2, 2)^2
    0 = C(1, 1) A0(1, 2) + C(1, 2) A0(2, 2).
Eliminate A0(1, 2) using the fact that A0(1, 2) = −C(1, 2) A0(2, 2)/C(1, 1):

    ω11 = A0(1, 1)^2 + f^2 A0(2, 2)^2
    ω12 = A0(1, 1) A0(2, 1) + f A0(2, 2)^2
    ω22 = A0(2, 1)^2 + A0(2, 2)^2,

where f = −C(1, 2)/C(1, 1). Solve for A0(1, 1) and A0(2, 1):

    A0(1, 1) = [ω11 − f^2 A0(2, 2)^2]^{1/2}                                   (9.2.13)
    A0(2, 1) = [ω22 − A0(2, 2)^2]^{1/2}                                       (9.2.14)

and substitute to get:

    ω12 = f A0(2, 2)^2 + [ω11 − f^2 A0(2, 2)^2]^{1/2} [ω22 − A0(2, 2)^2]^{1/2}.

Let λ = A0(2, 2)^2 and the result is a quadratic in λ:

    (ω12 − f λ)^2 = (ω11 − f^2 λ)(ω22 − λ),

which can be written out:

    ω12^2 − 2 f λ ω12 + f^2 λ^2 = ω11 ω22 − f^2 λ ω22 − ω11 λ + f^2 λ^2

and simplified as follows:

    λ = (ω11 ω22 − ω12^2)/(ω11 + f^2 ω22 − 2 f ω12).                          (9.2.15)
In addition, we need to impose sign conventions since impulse responses can be either
positive or negative. We will consider one sign convention for the demand shock and two
different sign conventions for the technology shock.
The demand shock in our example is a shock to the tax rate on labor. For this choice
of shock, we want to impose A0(2, 2) < 0 so that hours fall with a positive shock to the tax
rate on labor. For A0(2, 2), it must be the case that A0(2, 2) = −√λ since λ is positive.
Given A0(2, 2), it immediately follows that A0(1, 2) = fA0(2, 2).
9.2.6.1. Sign convention on A0(1, 1)
For the technology shock, we first consider the sign convention that productivity rises on
impact in response to a positive technology shock, namely A0(1, 1) > 0. In this case, we
need to use the positive root of A0(1, 1)^2 = ω11 − f^2 λ:

    A0(1, 1) = √(ω11 − f^2 λ).

Given a value for A0(1, 1), we have A0(2, 1) from:

    A0(2, 1) = (ω12 − f λ)/A0(1, 1).
9.2.6.2. Sign convention on A(1, 1)
The second sign convention assumes that productivity is positive in the long run so that
A(1, 1) > 0 and therefore

    C(1, 1) A0(1, 1) + C(1, 2) A0(2, 1) > 0.

This condition can also be written in terms of A0(1, 1) and known parameters:

    C(1, 1) A0(1, 1) + C(1, 2)(ω12 − f λ)/A0(1, 1) > 0.                       (9.2.16)

In this case, we choose the sign of the square root of ω11 − f^2 λ so that (9.2.16) is
satisfied.
9.2.6.3. Full solution
The full solution is

    A0(2, 2) = −√λ
    A0(1, 2) = f A0(2, 2)
    A0(1, 1) = root of ω11 − f^2 λ satisfying the sign convention
    A0(2, 1) = (ω12 − f λ)/A0(1, 1),

where λ is defined in (9.2.15) and f = −C(1, 2)/C(1, 1).
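This closed-form solution is straightforward to implement. A Python sketch follows (the function name is ours) that recovers A0 from Ω and the long-run sum C, under the sign conventions A0(1,1) > 0 and A0(2,2) < 0:

```python
import math

def blanchard_quah_A0(omega, C):
    """Impact matrix A0 from the 2x2 innovation variance omega and the
    long-run MA sum C, using the closed-form solution above."""
    w11, w12, w22 = omega[0][0], omega[0][1], omega[1][1]
    f = -C[0][1] / C[0][0]
    lam = (w11 * w22 - w12 ** 2) / (w11 + f * f * w22 - 2 * f * w12)
    a22 = -math.sqrt(lam)                 # hours fall after a positive tax shock
    a12 = f * a22                         # long-run restriction
    a11 = math.sqrt(w11 - f * f * lam)    # productivity rises on impact
    a21 = (w12 - f * lam) / a11
    return [[a11, a12], [a21, a22]]
```

On an example built from a known impact matrix that satisfies the long-run restriction, the function recovers that matrix exactly.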
9.2.6.4. Cholesky decomposition
In the literature, many report using the following formula for A0:

    A0 = C^{−1} L,

where L is a lower triangular matrix with positive elements on the diagonal that satisfies
L L′ = C Ω C′. This choice imposes the long-run restriction in (9.2.11) and the long-run
sign convention A(1, 1) > 0 automatically.
It does not impose A0(2, 2) < 0. However, in most cases, responses to demand shocks
are not discussed.
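For the 2×2 case, this computation can be sketched as follows (Python; the helper names are ours). On an example built from a known impact matrix, it reproduces that matrix up to the signs of the demand-shock column, consistent with the remark that A0(2,2) < 0 is not imposed:

```python
import math

def bq_cholesky_A0(omega, C):
    """A0 = C^{-1} L with L lower triangular, positive diagonal,
    and L L' = C omega C' (long-run Cholesky identification)."""
    # S = C omega C'
    CO = [[sum(C[i][k] * omega[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    S = [[sum(CO[i][k] * C[j][k] for k in range(2)) for j in range(2)]
         for i in range(2)]
    # 2x2 Cholesky factor of S
    l11 = math.sqrt(S[0][0])
    l21 = S[1][0] / l11
    l22 = math.sqrt(S[1][1] - l21 * l21)
    L = [[l11, 0.0], [l21, l22]]
    # C^{-1} L
    det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    Cinv = [[C[1][1] / det, -C[0][1] / det], [-C[1][0] / det, C[0][0] / det]]
    return [[sum(Cinv[i][k] * L[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]
```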
9.2.7. Proposition 2: OLS Results
Proposition 1 says that the model has an infinite-lag vector autoregressive structure. The
next proposition considers the outcome when OLS regressions are run with one lag. Let
V0 = E Xt Xt′ be the theoretical variance matrix for Xt. Let V1 = E Xt Xt−1′ be the
covariance matrix for Xt and its lag. If E D0 ηt ηt′ D0′ = Ω is the theoretical variance-
covariance matrix of the model's shock vector, then

    V0 = Ω + C1 Ω C1′ + C2 Ω C2′ + . . .                                      (9.2.17)

    V1 = C1 Ω + C2 Ω C1′ + C3 Ω C2′ + . . . .                                 (9.2.18)
Proposition 2. Assume that a regression is run of the form

    Xt = Bols Xt−1 + vt,   E vt vt′ = Ωols

with Xt from the RBC model. Then, the variance-covariance matrix is

    Ωols = V0 − V1 V0^{−1} V1′                                                (9.2.19)
         = Ω + M Ω M′ − M Ω V0^{−1} Ω M′                                      (9.2.20)

where M = C2 C1^{−1} − C1 and the inverse of the sum of the MA coefficients is

    Cols^{−1} = I − Bols
              = C^{−1} + M (I − M)^{−1} C1 + M (Ω − V0) V0^{−1}.
In other words, the OLS matrices Ωols and Cols are not equal to their theoretical counter-
parts, Ω and C.
Proof of Proposition 2. The relation (9.2.19) follows from the standard projection
formulas,

    Bols = (E Xt Xt−1′)(E Xt−1 Xt−1′)^{−1} = V1 V0^{−1}

    E vt vt′ = E (Xt − Bols Xt−1)(Xt − Bols Xt−1)′ = V0 − V1 V0^{−1} V1′.     (9.2.21)
Before substituting in (9.2.17) and (9.2.18), we can exploit the nature of the model's MA.
In particular, we can use the fact that

    Cj = (C1 + M) Cj−1,                                                       (9.2.22)

which follows from the formula (9.2.5) and Proposition 1. That is,

    Cj − (C1 + M) Cj−1 = (Bj + Bj−1 C1 + Bj−2 C2 + . . . + B1 Cj−1)
                         − C1 Cj−1 − M (Bj−1 + Bj−2 C1 + . . . + B1 Cj−2)
                       = B1 Cj−1 − C1 Cj−1
                       = 0.
Thus, we can write V0 as follows:

    V0 = Ω + C1 Ω C1′ + (C1 + M) C1 Ω C1′ (C1 + M)′
         + (C1 + M)^2 C1 Ω C1′ [(C1 + M)^2]′ + . . . ,

which implies that

    V0 = (C1 + M) V0 (C1 + M)′ + Ω + C1 Ω C1′ − (C1 + M) Ω (C1 + M)′.         (9.2.23)

For V1,

    V1 = C1 Ω + (C1 + M) C1 Ω C1′ + (C1 + M)^2 C1 Ω C1′ (C1 + M)′ + . . .
       = (C1 + M) V0 − M Ω.                                                   (9.2.24)
Substituting (9.2.23) and (9.2.24) into (9.2.21) yields

    V0 − V1 V0^{−1} V1′ = V0 − [(C1 + M) V0 − M Ω] V0^{−1} [(C1 + M) V0 − M Ω]′
                        = V0 − (C1 + M) V0 (C1 + M)′ + M Ω (C1 + M)′
                          + (C1 + M) Ω M′ − M Ω V0^{−1} Ω M′
                        = Ω + C1 Ω C1′ − (C1 + M) Ω (C1 + M)′ + M Ω (C1 + M)′
                          + (C1 + M) Ω M′ − M Ω V0^{−1} Ω M′
                        = Ω + M Ω M′ − M Ω V0^{−1} Ω M′
which is the same as (9.2.20). This proves the first part of the proposition.
For the second part, we need to construct the matrix Cols using the relation between
the AR coefficients and the MA coefficients in (9.2.5). In this case,

    Cols = (I − Bols)^{−1}.

Thus, we have

    Cols = (I − V1 V0^{−1})^{−1}
         = (I − [(C1 + M) V0 − M Ω] V0^{−1})^{−1}
         = (I − C1 − M + M Ω V0^{−1})^{−1}
         = (I − (I − M)^{−1} C1 + (I − M)^{−1} C1 − C1 − M + M Ω V0^{−1})^{−1}
         = (C^{−1} + (I + M + M^2 + . . .) C1 − C1 − M + M Ω V0^{−1})^{−1}
         = (C^{−1} + M (I − M)^{−1} C1 + M (Ω − V0) V0^{−1})^{−1}.            (9.2.25)

The term M (I − M)^{−1} C1 + M (Ω − V0) V0^{−1} is not generically zero. This proves
the second part of the proposition.
The term M Ω M′ − M Ω V0^{−1} Ω M′ is zero if the RBC model's VAR representation has
only one lag (i.e., M = 0). It is close to zero if one of the shocks is close to zero. In this
latter case, Ω V0^{−1} Ω ≈ Ω, and the SVAR user correctly detects the variance of the one
shock driving the system. This is true even if the VAR coefficients are wrong (i.e., M is
very different from 0).
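Proposition 2 can be verified numerically: (9.2.23) says V0 solves a discrete Lyapunov-type fixed point, V0 = F V0 F′ + Q with F = C1 + M, which can be found by iteration, and Ωols then follows from (9.2.20). A Python sketch with 2×2 helpers follows (the names are ours); with M = 0 the computed Ωols equals the true Ω, as the proposition implies:

```python
def mm(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def madd(A, B, s=1.0):
    return [[A[i][j] + s * B[i][j] for j in range(2)] for i in range(2)]

def tr(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def ols_error_variance(C1, M, omega, iters=500):
    """Population Omega_ols = Omega + M Omega M' - M Omega V0^{-1} Omega M',
    with V0 = F V0 F' + Q, F = C1 + M, Q = Omega + C1 Omega C1' - F Omega F'."""
    F = madd(C1, M)
    Q = madd(madd(omega, mm(mm(C1, omega), tr(C1))), mm(mm(F, omega), tr(F)), -1.0)
    V0 = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(iters):                  # Lyapunov iteration (F stable)
        V0 = madd(mm(mm(F, V0), tr(F)), Q)
    det = V0[0][0] * V0[1][1] - V0[0][1] * V0[1][0]
    V0inv = [[V0[1][1] / det, -V0[0][1] / det],
             [-V0[1][0] / det, V0[0][0] / det]]
    corr = mm(mm(mm(M, omega), V0inv), mm(omega, tr(M)))
    return madd(madd(omega, mm(mm(M, omega), tr(M))), corr, -1.0)
```

The iteration converges whenever the eigenvalues of F = C1 + M lie inside the unit circle, which holds in the stationary transformed model.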
What happens if we have a VAR with n lags? In this case, the formula is messy, but
E vt vt′ can be written

    E vt vt′ = V0 − [V1 V2 · · · Vn] [ V0      V1      · · ·  Vn−1 ]^{−1} [ V1′ ]
                                     [ V1′     V0      · · ·  Vn−2 ]      [ V2′ ]
                                     [ ...     ...     ...    ...  ]      [ ... ]
                                     [ Vn−1′   Vn−2′   · · ·  V0   ]      [ Vn′ ]

with Vj = (C1 + M)^{j−1} V1, where V0 is the matrix in (9.2.23) and V1 is the matrix in
(9.2.24).
9.2.8. The Propositions for Two Special Cases
In this section, we consider two special cases. The first has θ = 0. The second has
στ = 0. We show in these very special cases that the SVAR can uncover the true impulse
response for hours in response to a technology shock even if only one lag is used in the
VAR regression.
9.2.8.1. Proposition 3a: No capital in the model
Proposition 3a. Assume that θ is set to 0 in the RBC model. If a regression is run of the
form
Xt = BolsXt−1 + vt
with Xt from the RBC model, then the Blanchard-Quah procedure recovers the true
impulse response function for hours in response to technology, namely
Aj (2, 1) = 0 (9.2.26)
for all j.
Proof of Proposition 3a. It is important to note that C1 is singular in this case. Thus,
we cannot write M as C2 C1^{−1} − C1, but rather, we simply work with:

    M = [ 0        0 ]
        [ M(2, 1)  α ]

for arbitrary M(2, 1) (which is what we would have in the limit as θ goes to 0). It is
easy to show that B2 = C2 − C1^2 = M B1, B3 = C3 − B1 C2 − B2 C1 = M B2, and so on.
To prove the result in (9.2.26), we need to show that the errors that a SVAR user
encounters in estimating E vt vt′ and C do not affect the (2,1) elements of the Aj's.
Starting with E vt vt′:

    E vt vt′ = Ω + M Ω M′ − M Ω V0^{−1} Ω M′
             = [ σz^2  0    ]   [ 0  0 ]
               [ 0     στ^2 ] + [ 0  x ]

where the second matrix is M Ω M′ − M Ω V0^{−1} Ω M′ with a nonzero (2,2) element x.
The value of x does not affect the result, so we don't need to specify it precisely. Next, we
consider the C matrix:
    Cols = (C^{−1} + M (I − M)^{−1} C1 + M (Ω − V0) V0^{−1})^{−1}

         = ( [ 1  0               ]^{−1}     [ 0  0 ] )^{−1}
           ( [ 0  (1 − α)/(1 − ρ) ]       +  [ 0  y ] )

where y is a nonzero term in the SVAR error. The magnitude of y does not affect the
result. Notice that the (1,1) and (1,2) elements of C are correctly computed. Notice also
that this implies f = −C(1, 2)/C(1, 1) = 0 and therefore A0(2, 1) = 0. For all higher
terms, Aj(2, 1) = 0 because V1 V0^{−1} has zeros in the first column.
9.2.8.2. Proposition 3b: Only one shock
Proposition 3b. Assume that στ = 0 in the RBC model. If a regression is run of the form
Xt = BolsXt−1 + vt
with Xt from the RBC model, then the Blanchard-Quah procedure recovers the true
impulse responses to technology, namely
Aj = DjQ
for all j.
Proof of Proposition 3b. The first part of the proof is concerned with the impact
coefficient A0. We show that E vt vt′ = Ω if στ = 0, where Ω is the true variance-covariance
matrix for the model. This is the main step in showing that the impact coefficient is
correct. Then we show that the other coefficients can also be recovered by the SVAR.
From Proposition 2, the following holds for the one-lag regression regardless of the size
of the shocks:

    E vt vt′ = Ω + M Ω M′ − M Ω V0^{−1} Ω M′.

We now show that Ω = Ω V0^{−1} Ω if στ = 0, and therefore E vt vt′ = Ω. We do this in
three steps. First, we show that

    Ω = Ω(1, 1) [ 1  ζ   ]
                [ ζ  ζ^2 ]                                                    (9.2.27)
where ζ = −a/(1 − θ + θa). Second, we show that

    1/(1 + ζν) [ 1  ν  ] V0 = Ω                                               (9.2.28)
               [ ζ  ζν ]

where ν = −θ(1 − γk)(1 − a)/[(α − γk) a]. Third, we show that (9.2.27) and (9.2.28)
imply:

    Ω − Ω V0^{−1} Ω = 0.                                                      (9.2.29)
Writing out Ω yields

    Ω = D0 Q Q′ D0′
      = [ 1 − θ + θa  −θb ] [ σz^2  0 ] [ 1 − θ + θa  −a ]
        [ −a           b  ] [ 0     0 ] [ −θb          b ]
      = [ (1 − θ + θa)^2       −(1 − θ + θa) a ]
        [ −(1 − θ + θa) a       a^2            ] σz^2
      = Ω(1, 1) [ 1                    −a/(1 − θ + θa)     ]
                [ −a/(1 − θ + θa)       a^2/(1 − θ + θa)^2 ]

and (9.2.27) holds. Writing out V0 in (9.2.28) yields

    1/(1 + ζν) [ 1  ν  ] (Ω + C1 Ω C1′ + C2 Ω C2′ + C3 Ω C3′ + . . .).        (9.2.30)
               [ ζ  ζν ]
The second term on the right hand side of (9.2.30) is equal to a 2×2 matrix of zeros:

    [ 1  ν  ] C1 Ω C1′
    [ ζ  ζν ]
      = [ 1  ν  ] [ θ^2 (1 − γk)^2 (1 − a)^2         θ (1 − γk)(1 − a)(α − γk) a ]
        [ ζ  ζν ] [ θ (1 − γk)(1 − a)(α − γk) a      (α − γk)^2 a^2              ] σz^2
      = [ 0  0 ]
        [ 0  0 ].
All higher terms are also equal to 2×2 matrices of zeros:

    [ 1  ν  ] Cj Ω Cj′
    [ ζ  ζν ]
      = [ 1  ν  ] (Cj−1 D0 P D0^{−1} + Dj D0^{−1}) (D0 Q Q′ D0′)
        [ ζ  ζν ]                                  (Cj−1 D0 P D0^{−1} + Dj D0^{−1})′
      = [ 1  ν  ] (γk^{j−2} D2) (Q Q′) (γk^{j−2} D2)′
        [ ζ  ζν ]
      = [ 1  ν  ] γk^{2j−4} [ [γk (1 − a) θ (1 − γk)]^2             γk^2 (1 − a) θ (1 − γk) a (α − γk) ]
        [ ζ  ζν ]           [ γk^2 (1 − a) θ (1 − γk) a (α − γk)    [γk a (α − γk)]^2                  ] σz^2
      = [ 0  0 ]
        [ 0  0 ].
Thus,

    1/(1 + ζν) [ 1  ν  ] (Ω + C1 Ω C1′ + C2 Ω C2′ + C3 Ω C3′ + . . .)
               [ ζ  ζν ]
      = 1/(1 + ζν) [ 1  ν  ] Ω
                   [ ζ  ζν ]
      = Ω(1, 1)/(1 + ζν) [ 1  ν  ] [ 1  ζ   ]
                         [ ζ  ζν ] [ ζ  ζ^2 ]
      = Ω
which proves (9.2.28). Now, we are ready for
    Ω − Ω V0^{−1} Ω = Ω(1, 1) [ 1  ζ   ] − Ω(1, 1)/(1 + ζν) [ 1  ν  ] [ 1  ζ   ]
                              [ ζ  ζ^2 ]                    [ ζ  ζν ] [ ζ  ζ^2 ]
                    = Ω(1, 1) [ 1  ζ   ] − Ω(1, 1)/(1 + ζν) [ 1 + ζν      ζ (1 + ζν)   ]
                              [ ζ  ζ^2 ]                    [ ζ (1 + ζν)  ζ^2 (1 + ζν) ]
                    = [ 0  0 ]
                      [ 0  0 ]

which proves that there is no error in computing E vt vt′, that is, E vt vt′ = Ω. The next
step is to show that this is all that is needed for the correct inference. Recall the formulas
for the elements of A0 and λ in Section 9.2.6. Because Ωols = Ω and det(Ω) = 0, it must
be the case that λ = 0. Thus, the A0 found by the SVAR is

    A0 = [ √ω11        0 ]
         [ ω12/√ω11    0 ]

where ωij = Ω(i, j). Using the formulas above, we have A0(1, 1) = (1 − θ + θa) σz and
A0(2, 1) = −a σz. Thus, A0 = D0 Q. This proves that there is no mistaken inference for
the impact coefficient. Next, we check Aj for j ≥ 1, which is equal to Bols^j A0. For
j = 1,

    Bols A0 = V1 V0^{−1} A0
            = V1 V0^{−1} D0 Q
            = (C1 + M − M Ω V0^{−1}) D0 Q
            = C1 D0 Q
            = (D0 P + D1) D0^{−1} D0 Q
            = D1 Q

where (I − Ω V0^{−1}) D0 Q = 0 has been used. For j = 2,

    Bols^2 A0 = (V1 V0^{−1})^2 A0
              = (V1 V0^{−1}) C1 D0 Q
              = (C1 + M − M Ω V0^{−1}) C1 D0 Q
              = C2 D0 Q − M Ω V0^{−1} C1 D0 Q
              = C2 D0 Q
              = (D0 P^2 + D1 P + D2) D0^{−1} D0 Q
              = D2 Q

where Ω V0^{−1} C1 D0 Q = 0 has been used. Similarly, we can prove it for higher terms
by noting that if Bols^{j−1} A0 = Cj−1 D0 Q holds, then Bols^j A0 = Cj D0 Q holds and
so does the following:

    Bols^j A0 = (V1 V0^{−1}) Cj−1 D0 Q
              = (C1 + M − M Ω V0^{−1}) Cj−1 D0 Q
              = Cj D0 Q − M Ω V0^{−1} Cj−1 D0 Q
              = Cj D0 Q
              = (D0 P^j + D1 P^{j−1} + . . . + Dj) D0^{−1} D0 Q
              = Dj Q.
This establishes that in the case with στ = 0, the SVAR uncovers the true impulse responses
to technology.
What is interesting about the last two propositions is that the special cases are not
relevant for modern business cycle theory. Modern business cycle theorists assume that
both capital accumulation and shocks in addition to technology (e.g., distortions to labor)
are quantitatively important. Furthermore, adding these factors is not a recent
phenomenon; they are central to the work following Kydland and Prescott (1982) (which
includes my thesis).
9.3. VARs and 3-Shock Versions of the Model
We consider several versions of an RBC model with three shocks and three variables in
the VAR. The first has an investment tax shock and the log of the investment-output
ratio in the VAR. The second has a government spending shock and the log of the
investment-output ratio in the VAR. The third has an investment tax shock and the log
of the consumption-output ratio in the VAR.
9.3.1. Adding an Investment Tax Shock
Assume the economy is an RBC model with three orthogonal shocks: a unit root in
technology log z, an AR(1) in the tax rate on labor τl, and an AR(1) in the tax rate on
investment τx. The capital decision function has the form:
$$
\log k_{t+1} = \gamma_0 + \gamma_k \log k_t + \gamma_z \log z_t + \gamma_l \tau_{lt} + \gamma_x \tau_{xt} \qquad (9.3.1)
$$
and the labor decision function can be written:
$$
\begin{aligned}
\log l_t &= \phi_{lz} \log z_t + \phi_{ll} \tau_{lt} + \phi_{lx} \tau_{xt} + \phi_{lk} \log k_t + \phi_{lk'} \log k_{t+1} \\
&= \phi_{lz} \log z_t + \phi_{ll} \tau_{lt} + \phi_{lx} \tau_{xt} + \phi_{lk} \log k_t
   + \phi_{lk'} \left[ \gamma_0 + \gamma_k \log k_t + \gamma_z \log z_t + \gamma_l \tau_{lt} + \gamma_x \tau_{xt} \right] \\
&= (\phi_{lk} + \phi_{lk'} \gamma_k) \log k_t + (\phi_{lz} + \phi_{lk'} \gamma_z) \log z_t
   + (\phi_{ll} + \phi_{lk'} \gamma_l) \tau_{lt} + (\phi_{lx} + \phi_{lk'} \gamma_x) \tau_{xt}.
\end{aligned}
$$
Note that we include the term φlxτxt here even though it is equal to 0 in equilibrium.
We do so because the same mathematics will be used later in the case of the government
spending shock.
Next, we write out output. From a Cobb-Douglas production technology with capital share θ, output is:
$$
\begin{aligned}
\log y_t &= \theta \left( \log k_t - \log z_t \right) + (1-\theta) \log l_t \\
&= \left( \theta + (1-\theta)\phi_{lk} \right) \log k_t - \left( \theta - (1-\theta)\phi_{lz} \right) \log z_t + (1-\theta)\phi_{ll} \tau_{lt} \\
&\quad + (1-\theta)\phi_{lx} \tau_{xt} + (1-\theta)\phi_{lk'} \log k_{t+1} \\
&= \left( \theta + (1-\theta)(\phi_{lk} + \phi_{lk'}\gamma_k) \right) \log k_t - \left( \theta - (1-\theta)(\phi_{lz} + \phi_{lk'}\gamma_z) \right) \log z_t \\
&\quad + (1-\theta)(\phi_{ll} + \phi_{lk'}\gamma_l) \tau_{lt} + (1-\theta)(\phi_{lx} + \phi_{lk'}\gamma_x) \tau_{xt}.
\end{aligned}
$$
We can write the capital stock in terms of all lagged shocks as follows:
$$
\begin{aligned}
\log k_t &= \gamma_0 + \gamma_k \left( \gamma_0 + \gamma_k \log k_{t-2} + \gamma_z \log z_{t-2} + \gamma_l \tau_{lt-2} + \gamma_x \tau_{xt-2} \right)
  + \gamma_z \log z_{t-1} + \gamma_l \tau_{lt-1} + \gamma_x \tau_{xt-1} \\
&= \gamma_0 \left[ 1 + \gamma_k + \gamma_k^2 + \cdots \right]
  + \gamma_z \left[ \log z_{t-1} + \gamma_k \log z_{t-2} + \gamma_k^2 \log z_{t-3} + \cdots \right] \\
&\quad + \gamma_l \left[ \tau_{lt-1} + \gamma_k \tau_{lt-2} + \gamma_k^2 \tau_{lt-3} + \cdots \right]
  + \gamma_x \left[ \tau_{xt-1} + \gamma_k \tau_{xt-2} + \gamma_k^2 \tau_{xt-3} + \cdots \right]
\end{aligned}
$$
or in differences as follows:
$$
\begin{aligned}
\log k_t - \log k_{t-1} &= \gamma_z \left[ \log z_{t-1} + (\gamma_k - 1) \left( \log z_{t-2} + \gamma_k \log z_{t-3} + \gamma_k^2 \log z_{t-4} + \cdots \right) \right] \\
&\quad + \gamma_l \left[ \tau_{lt-1} + (\gamma_k - 1) \left( \tau_{lt-2} + \gamma_k \tau_{lt-3} + \gamma_k^2 \tau_{lt-4} + \cdots \right) \right] \\
&\quad + \gamma_x \left[ \tau_{xt-1} + (\gamma_k - 1) \left( \tau_{xt-2} + \gamma_k \tau_{xt-3} + \gamma_k^2 \tau_{xt-4} + \cdots \right) \right]
\end{aligned}
$$
or in quasi-differences as follows:
$$
\begin{aligned}
\log k_t - \alpha \log k_{t-1} &= \gamma_z \left[ \log z_{t-1} + (\gamma_k - \alpha) \left( \log z_{t-2} + \gamma_k \log z_{t-3} + \gamma_k^2 \log z_{t-4} + \cdots \right) \right] \\
&\quad + \gamma_l \left[ \tau_{lt-1} + (\gamma_k - \alpha) \left( \tau_{lt-2} + \gamma_k \tau_{lt-3} + \gamma_k^2 \tau_{lt-4} + \cdots \right) \right] \\
&\quad + \gamma_x \left[ \tau_{xt-1} + (\gamma_k - \alpha) \left( \tau_{xt-2} + \gamma_k \tau_{xt-3} + \gamma_k^2 \tau_{xt-4} + \cdots \right) \right]
\end{aligned}
$$
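These quasi-difference weights are easy to confirm by simulation. The sketch below (Python, with purely illustrative parameter values) feeds a one-time unit impulse in log z through the capital decision rule and checks that log kt − α log kt−1 loads on lagged shocks with weights γz[1, (γk − α), (γk − α)γk, (γk − α)γk², ...]:

```python
import numpy as np

# Illustrative parameter values (not calibrated to any model)
gamma_k, gamma_z, alpha = 0.95, 0.08, 0.30

# Feed a one-time unit impulse in log z at t = 0 through the capital rule
# log k_{t+1} = gamma_k log k_t + gamma_z log z_t  (constants dropped)
T = 20
logk = np.zeros(T + 1)
for t in range(T):
    logk[t + 1] = gamma_k * logk[t] + gamma_z * (1.0 if t == 0 else 0.0)

# Quasi-differenced capital, log k_t - alpha log k_{t-1}, for t = 1, ..., T
qd = logk[1:] - alpha * logk[:-1]

# Weights claimed in the text:
# gamma_z * [1, (gamma_k - alpha), (gamma_k - alpha) gamma_k, ...]
weights = gamma_z * np.array(
    [1.0] + [(gamma_k - alpha) * gamma_k ** j for j in range(T - 1)])
assert np.allclose(qd, weights)
```

The same exercise with α = 1 reproduces the first-difference weights, and with α = 0 the levels weights.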
We can also write hours in terms of past shocks as follows:
$$
\begin{aligned}
\log l_t &= \phi_{lz} \log z_t + \phi_{ll} \tau_{lt} + \phi_{lx} \tau_{xt} + \phi_{lk} \log k_t + \phi_{lk'} \log k_{t+1} \\
&= \phi_{lz} \log z_t + \phi_{ll} \tau_{lt} + \phi_{lx} \tau_{xt} \\
&\quad + \phi_{lk} \gamma_z \left[ \log z_{t-1} + \gamma_k \log z_{t-2} + \gamma_k^2 \log z_{t-3} + \cdots \right]
  + \phi_{lk} \gamma_l \left[ \tau_{lt-1} + \gamma_k \tau_{lt-2} + \gamma_k^2 \tau_{lt-3} + \cdots \right] \\
&\quad + \phi_{lk} \gamma_x \left[ \tau_{xt-1} + \gamma_k \tau_{xt-2} + \gamma_k^2 \tau_{xt-3} + \cdots \right]
  + \phi_{lk'} \gamma_z \left[ \log z_t + \gamma_k \log z_{t-1} + \gamma_k^2 \log z_{t-2} + \cdots \right] \\
&\quad + \phi_{lk'} \gamma_l \left[ \tau_{lt} + \gamma_k \tau_{lt-1} + \gamma_k^2 \tau_{lt-2} + \cdots \right]
  + \phi_{lk'} \gamma_x \left[ \tau_{xt} + \gamma_k \tau_{xt-1} + \gamma_k^2 \tau_{xt-2} + \cdots \right] \\
&= \left[ (\phi_{lz} + \phi_{lk'}\gamma_z) \log z_t + (\phi_{lk} + \phi_{lk'}\gamma_k) \gamma_z \log z_{t-1} + (\phi_{lk} + \phi_{lk'}\gamma_k) \gamma_k \gamma_z \log z_{t-2} + \cdots \right] \\
&\quad + \left[ (\phi_{ll} + \phi_{lk'}\gamma_l) \tau_{lt} + (\phi_{lk} + \phi_{lk'}\gamma_k) \gamma_l \tau_{lt-1} + (\phi_{lk} + \phi_{lk'}\gamma_k) \gamma_k \gamma_l \tau_{lt-2} + \cdots \right] \\
&\quad + \left[ (\phi_{lx} + \phi_{lk'}\gamma_x) \tau_{xt} + (\phi_{lk} + \phi_{lk'}\gamma_k) \gamma_x \tau_{xt-1} + (\phi_{lk} + \phi_{lk'}\gamma_k) \gamma_k \gamma_x \tau_{xt-2} + \cdots \right]
\end{aligned}
$$
where constant terms have been ignored.
We can write logged hours in differences as follows:
$$
\begin{aligned}
\log l_t - \log l_{t-1} &= \phi_{lz} (\log z_t - \log z_{t-1}) + \phi_{ll} (\tau_{lt} - \tau_{lt-1}) + \phi_{lx} (\tau_{xt} - \tau_{xt-1}) \\
&\quad + \phi_{lk'} \left( \log k_{t+1} - \log k_t \right) + \phi_{lk} \left( \log k_t - \log k_{t-1} \right) \\
&= \phi_{lz} (\log z_t - \log z_{t-1}) + \phi_{ll} (\tau_{lt} - \tau_{lt-1}) + \phi_{lx} (\tau_{xt} - \tau_{xt-1}) \\
&\quad + \phi_{lk'} \gamma_z \left[ \log z_t + (\gamma_k - 1) \left( \log z_{t-1} + \gamma_k \log z_{t-2} + \gamma_k^2 \log z_{t-3} + \cdots \right) \right] \\
&\quad + \phi_{lk'} \gamma_l \left[ \tau_{lt} + (\gamma_k - 1) \left( \tau_{lt-1} + \gamma_k \tau_{lt-2} + \gamma_k^2 \tau_{lt-3} + \cdots \right) \right] \\
&\quad + \phi_{lk'} \gamma_x \left[ \tau_{xt} + (\gamma_k - 1) \left( \tau_{xt-1} + \gamma_k \tau_{xt-2} + \gamma_k^2 \tau_{xt-3} + \cdots \right) \right] \\
&\quad + \phi_{lk} \gamma_z \left[ \log z_{t-1} + (\gamma_k - 1) \left( \log z_{t-2} + \gamma_k \log z_{t-3} + \gamma_k^2 \log z_{t-4} + \cdots \right) \right] \\
&\quad + \phi_{lk} \gamma_l \left[ \tau_{lt-1} + (\gamma_k - 1) \left( \tau_{lt-2} + \gamma_k \tau_{lt-3} + \gamma_k^2 \tau_{lt-4} + \cdots \right) \right] \\
&\quad + \phi_{lk} \gamma_x \left[ \tau_{xt-1} + (\gamma_k - 1) \left( \tau_{xt-2} + \gamma_k \tau_{xt-3} + \gamma_k^2 \tau_{xt-4} + \cdots \right) \right] \\
&= \left[ \phi_{lz} + \phi_{lk'}\gamma_z \right] \log z_t - \left[ \phi_{lz} - \phi_{lk}\gamma_z - \phi_{lk'}\gamma_z (\gamma_k - 1) \right] \log z_{t-1} \\
&\quad + \gamma_z (\gamma_k - 1) \left[ \phi_{lk'}\gamma_k + \phi_{lk} \right] \left( \log z_{t-2} + \gamma_k \log z_{t-3} + \gamma_k^2 \log z_{t-4} + \cdots \right) \\
&\quad + \left[ \phi_{ll} + \phi_{lk'}\gamma_l \right] \tau_{lt} - \left[ \phi_{ll} - \phi_{lk}\gamma_l - \phi_{lk'}\gamma_l (\gamma_k - 1) \right] \tau_{lt-1} \\
&\quad + \gamma_l (\gamma_k - 1) \left[ \phi_{lk'}\gamma_k + \phi_{lk} \right] \left( \tau_{lt-2} + \gamma_k \tau_{lt-3} + \gamma_k^2 \tau_{lt-4} + \cdots \right) \\
&\quad + \left[ \phi_{lx} + \phi_{lk'}\gamma_x \right] \tau_{xt} - \left[ \phi_{lx} - \phi_{lk}\gamma_x - \phi_{lk'}\gamma_x (\gamma_k - 1) \right] \tau_{xt-1} \\
&\quad + \gamma_x (\gamma_k - 1) \left[ \phi_{lk'}\gamma_k + \phi_{lk} \right] \left( \tau_{xt-2} + \gamma_k \tau_{xt-3} + \gamma_k^2 \tau_{xt-4} + \cdots \right)
\end{aligned}
$$
or in quasi-difference form as follows:
$$
\begin{aligned}
\log l_t - \alpha \log l_{t-1} &= \phi_{lz} (\log z_t - \alpha \log z_{t-1}) + \phi_{ll} (\tau_{lt} - \alpha \tau_{lt-1}) + \phi_{lx} (\tau_{xt} - \alpha \tau_{xt-1}) \\
&\quad + \phi_{lk'} \left( \log k_{t+1} - \alpha \log k_t \right) + \phi_{lk} \left( \log k_t - \alpha \log k_{t-1} \right) \\
&= \phi_{lz} (\log z_t - \alpha \log z_{t-1}) + \phi_{ll} (\tau_{lt} - \alpha \tau_{lt-1}) + \phi_{lx} (\tau_{xt} - \alpha \tau_{xt-1}) \\
&\quad + \phi_{lk'} \gamma_z \left[ \log z_t + (\gamma_k - \alpha) \left( \log z_{t-1} + \gamma_k \log z_{t-2} + \gamma_k^2 \log z_{t-3} + \cdots \right) \right] \\
&\quad + \phi_{lk'} \gamma_l \left[ \tau_{lt} + (\gamma_k - \alpha) \left( \tau_{lt-1} + \gamma_k \tau_{lt-2} + \gamma_k^2 \tau_{lt-3} + \cdots \right) \right] \\
&\quad + \phi_{lk'} \gamma_x \left[ \tau_{xt} + (\gamma_k - \alpha) \left( \tau_{xt-1} + \gamma_k \tau_{xt-2} + \gamma_k^2 \tau_{xt-3} + \cdots \right) \right] \\
&\quad + \phi_{lk} \gamma_z \left[ \log z_{t-1} + (\gamma_k - \alpha) \left( \log z_{t-2} + \gamma_k \log z_{t-3} + \gamma_k^2 \log z_{t-4} + \cdots \right) \right] \\
&\quad + \phi_{lk} \gamma_l \left[ \tau_{lt-1} + (\gamma_k - \alpha) \left( \tau_{lt-2} + \gamma_k \tau_{lt-3} + \gamma_k^2 \tau_{lt-4} + \cdots \right) \right] \\
&\quad + \phi_{lk} \gamma_x \left[ \tau_{xt-1} + (\gamma_k - \alpha) \left( \tau_{xt-2} + \gamma_k \tau_{xt-3} + \gamma_k^2 \tau_{xt-4} + \cdots \right) \right] \\
&= \left[ \phi_{lz} + \phi_{lk'}\gamma_z \right] \log z_t - \left[ \alpha \phi_{lz} - \phi_{lk}\gamma_z - \phi_{lk'}\gamma_z (\gamma_k - \alpha) \right] \log z_{t-1} \\
&\quad + \gamma_z (\gamma_k - \alpha) \left[ \phi_{lk'}\gamma_k + \phi_{lk} \right] \left( \log z_{t-2} + \gamma_k \log z_{t-3} + \gamma_k^2 \log z_{t-4} + \cdots \right) \\
&\quad + \left[ \phi_{ll} + \phi_{lk'}\gamma_l \right] \tau_{lt} - \left[ \alpha \phi_{ll} - \phi_{lk}\gamma_l - \phi_{lk'}\gamma_l (\gamma_k - \alpha) \right] \tau_{lt-1} \\
&\quad + \gamma_l (\gamma_k - \alpha) \left[ \phi_{lk'}\gamma_k + \phi_{lk} \right] \left( \tau_{lt-2} + \gamma_k \tau_{lt-3} + \gamma_k^2 \tau_{lt-4} + \cdots \right) \\
&\quad + \left[ \phi_{lx} + \phi_{lk'}\gamma_x \right] \tau_{xt} - \left[ \alpha \phi_{lx} - \phi_{lk}\gamma_x - \phi_{lk'}\gamma_x (\gamma_k - \alpha) \right] \tau_{xt-1} \\
&\quad + \gamma_x (\gamma_k - \alpha) \left[ \phi_{lk'}\gamma_k + \phi_{lk} \right] \left( \tau_{xt-2} + \gamma_k \tau_{xt-3} + \gamma_k^2 \tau_{xt-4} + \cdots \right) \qquad (9.3.2)
\end{aligned}
$$
We can use the expressions for output and hours to write out the change in productivity as follows:
$$
\begin{aligned}
\log (y_t/l_t) - \log (y_{t-1}/l_{t-1})
&= \log y_t - \log y_{t-1} + \log z_t - (\log l_t - \log l_{t-1}) \\
&= \log z_t + \theta \left( \log k_t - \log k_{t-1} - \log l_t + \log l_{t-1} - \log z_t + \log z_{t-1} \right) \\
&= (1-\theta) \log z_t + \theta \log z_{t-1} - \theta \left( \log l_t - \log l_{t-1} - \log k_t + \log k_{t-1} \right) \\
&= (1-\theta) \log z_t + \theta \log z_{t-1} \\
&\quad - \theta \Big\{ \left[ \phi_{lz} + \phi_{lk'}\gamma_z \right] \log z_t - \left[ \phi_{lz} - (\phi_{lk}-1)\gamma_z - \phi_{lk'}\gamma_z(\gamma_k-1) \right] \log z_{t-1} \\
&\qquad + \gamma_z (\gamma_k-1) \left[ \phi_{lk'}\gamma_k + \phi_{lk} - 1 \right] \left( \log z_{t-2} + \gamma_k \log z_{t-3} + \gamma_k^2 \log z_{t-4} + \cdots \right) \\
&\qquad + \left[ \phi_{ll} + \phi_{lk'}\gamma_l \right] \tau_{lt} - \left[ \phi_{ll} - (\phi_{lk}-1)\gamma_l - \phi_{lk'}\gamma_l(\gamma_k-1) \right] \tau_{lt-1} \\
&\qquad + \gamma_l (\gamma_k-1) \left[ \phi_{lk'}\gamma_k + \phi_{lk} - 1 \right] \left( \tau_{lt-2} + \gamma_k \tau_{lt-3} + \gamma_k^2 \tau_{lt-4} + \cdots \right) \\
&\qquad + \left[ \phi_{lx} + \phi_{lk'}\gamma_x \right] \tau_{xt} - \left[ \phi_{lx} - (\phi_{lk}-1)\gamma_x - \phi_{lk'}\gamma_x(\gamma_k-1) \right] \tau_{xt-1} \\
&\qquad + \gamma_x (\gamma_k-1) \left[ \phi_{lk'}\gamma_k + \phi_{lk} - 1 \right] \left( \tau_{xt-2} + \gamma_k \tau_{xt-3} + \gamma_k^2 \tau_{xt-4} + \cdots \right) \Big\} \\
&= \left[ 1 - \theta - \theta \left( \phi_{lz} + \phi_{lk'}\gamma_z \right) \right] \log z_t
 + \theta \left[ 1 + \phi_{lz} - (\phi_{lk}-1)\gamma_z - \phi_{lk'}\gamma_z(\gamma_k-1) \right] \log z_{t-1} \\
&\quad - \theta \gamma_z (\gamma_k-1) \left[ \phi_{lk'}\gamma_k + \phi_{lk} - 1 \right] \left( \log z_{t-2} + \gamma_k \log z_{t-3} + \gamma_k^2 \log z_{t-4} + \cdots \right) \\
&\quad - \theta \left[ \phi_{ll} + \phi_{lk'}\gamma_l \right] \tau_{lt} + \theta \left[ \phi_{ll} - (\phi_{lk}-1)\gamma_l - \phi_{lk'}\gamma_l(\gamma_k-1) \right] \tau_{lt-1} \\
&\quad - \theta \gamma_l (\gamma_k-1) \left[ \phi_{lk'}\gamma_k + \phi_{lk} - 1 \right] \left( \tau_{lt-2} + \gamma_k \tau_{lt-3} + \gamma_k^2 \tau_{lt-4} + \cdots \right) \\
&\quad - \theta \left[ \phi_{lx} + \phi_{lk'}\gamma_x \right] \tau_{xt} + \theta \left[ \phi_{lx} - (\phi_{lk}-1)\gamma_x - \phi_{lk'}\gamma_x(\gamma_k-1) \right] \tau_{xt-1} \\
&\quad - \theta \gamma_x (\gamma_k-1) \left[ \phi_{lk'}\gamma_k + \phi_{lk} - 1 \right] \left( \tau_{xt-2} + \gamma_k \tau_{xt-3} + \gamma_k^2 \tau_{xt-4} + \cdots \right) \qquad (9.3.3)
\end{aligned}
$$
Now we write out the log of the investment share:
$$
\begin{aligned}
\log (x_t/y_t) &= \log x_t - \log y_t \\
&= \phi_{xk} \left( \log k_t - \log z_t \right) + \phi_{xk'} \log k_{t+1} - \theta \left( \log k_t - \log z_t \right) - (1-\theta) \log l_t \\
&= (\phi_{xk} - \theta) \left( \log k_t - \log z_t \right) + \phi_{xk'} \left[ \gamma_k \log k_t + \gamma_z \log z_t + \gamma_l \tau_{lt} + \gamma_x \tau_{xt} \right] \\
&\quad - (1-\theta) \left[ (\phi_{lk} + \phi_{lk'}\gamma_k) \log k_t + (\phi_{lz} + \phi_{lk'}\gamma_z) \log z_t + (\phi_{ll} + \phi_{lk'}\gamma_l) \tau_{lt} + (\phi_{lx} + \phi_{lk'}\gamma_x) \tau_{xt} \right] \\
&= \left[ -\phi_{xk} + \theta + \phi_{xk'}\gamma_z - (1-\theta)(\phi_{lz} + \phi_{lk'}\gamma_z) \right] \log z_t \\
&\quad + \left[ \phi_{xk'}\gamma_l - (1-\theta)(\phi_{ll} + \phi_{lk'}\gamma_l) \right] \tau_{lt}
 + \left[ \phi_{xk'}\gamma_x - (1-\theta)(\phi_{lx} + \phi_{lk'}\gamma_x) \right] \tau_{xt} \\
&\quad + \left[ \phi_{xk} - \theta + \phi_{xk'}\gamma_k - (1-\theta)(\phi_{lk} + \phi_{lk'}\gamma_k) \right]
 \Big\{ \gamma_z \left[ \log z_{t-1} + \gamma_k \log z_{t-2} + \gamma_k^2 \log z_{t-3} + \cdots \right] \\
&\qquad + \gamma_l \left[ \tau_{lt-1} + \gamma_k \tau_{lt-2} + \gamma_k^2 \tau_{lt-3} + \cdots \right]
 + \gamma_x \left[ \tau_{xt-1} + \gamma_k \tau_{xt-2} + \gamma_k^2 \tau_{xt-3} + \cdots \right] \Big\}
\end{aligned}
$$
9.3.1.1. The Model’s Moving Average
The moving average for the model is given by:
$$
X_t \equiv \begin{bmatrix} (1-L) \log (y_t/l_t) \\ (1-\alpha L) \log l_t \\ \log (x_t/y_t) \end{bmatrix}
 = D_0 \omega_t + D_1 \omega_{t-1} + D_2 \omega_{t-2} + \cdots
$$
where $\omega_t = [\log z_t, \tau_{lt}, \tau_{xt}]'$ and
$$
D_0 = \begin{bmatrix} 1-\theta+\theta a & -\theta b & -\theta c \\ -a & b & c \\ -d & e & f \end{bmatrix} \qquad (9.3.4)
$$
$$
D_1 = \begin{bmatrix} \theta(1-a)(1-\gamma_k) & \theta\left(b+(1-a)\gamma_l\right) & \theta\left(c+(1-a)\gamma_x\right) \\ (\alpha-\gamma_k)a & -\alpha b + \gamma_l a & -\alpha c + \gamma_x a \\ -d\gamma_k & d\gamma_l & d\gamma_x \end{bmatrix} \qquad (9.3.5)
$$
$$
D_2 = \begin{bmatrix} \gamma_k(1-a)\theta(1-\gamma_k) & -\gamma_l(1-a)\theta(1-\gamma_k) & -\gamma_x(1-a)\theta(1-\gamma_k) \\ \gamma_k a(\alpha-\gamma_k) & -\gamma_l a(\alpha-\gamma_k) & -\gamma_x a(\alpha-\gamma_k) \\ -d\gamma_k^2 & d\gamma_l\gamma_k & d\gamma_x\gamma_k \end{bmatrix} \qquad (9.3.6)
$$
and Dj = γkDj−1 for j ≥ 3, where a = φlk + φlk′γk, b = φll + φlk′γl, c = φlx + φlk′γx, d = φxk + φxk′γk − θ − (1 − θ)a, e = φxk′γl − (1 − θ)b, and f = φxk′γx − (1 − θ)c. Note that φlz = −φlk, φxz = −φxk, and γz = −γk hold in the model economy with a unit root in technology. Note also that D2 is singular for all parameterizations, and D1 is singular if α = 0.
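These singularity claims can be checked numerically. The sketch below (Python; the coefficient values are hypothetical, not calibrated) builds D1 from (9.3.5) and D2 as the outer product implied by (9.3.6), and verifies that D2 has rank one and that D1 is singular when α = 0:

```python
import numpy as np

# Hypothetical reduced-form coefficients, for illustration only
theta, alpha = 0.33, 0.0           # alpha = 0 to exhibit the singular D1
gamma_k, gamma_l, gamma_x = 0.94, -0.05, -0.07
a, b, c, d = 0.6, -0.4, -0.3, 0.2  # a = phi_lk + phi_lk' gamma_k, etc.

D1 = np.array([[theta * (1 - a) * (1 - gamma_k),
                theta * (b + (1 - a) * gamma_l),
                theta * (c + (1 - a) * gamma_x)],
               [(alpha - gamma_k) * a,
                -alpha * b + gamma_l * a,
                -alpha * c + gamma_x * a],
               [-d * gamma_k, d * gamma_l, d * gamma_x]])

# D2 = g h' from (9.3.6): an outer product, hence rank one
g = np.array([(1 - a) * theta * (1 - gamma_k), (alpha - gamma_k) * a, -d * gamma_k])
h = np.array([gamma_k, -gamma_l, -gamma_x])
D2 = np.outer(g, h)

assert np.linalg.matrix_rank(D2) == 1      # D2 singular for all parameterizations
assert abs(np.linalg.det(D1)) < 1e-12      # D1 singular when alpha = 0
assert np.allclose(np.cross(D1[1], D1[2]), 0)  # rows 2 and 3 proportional
```

With α ≠ 0, the second row of D1 is no longer proportional to the third, and D1 is generically invertible.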
If τlt and τxt are AR(1) processes, then it is more convenient to write the MA process in terms of $\eta_t = [\log z_t, \eta_{lt}, \eta_{xt}]'$ rather than in terms of $\omega_t$. In this case,
$$
X_t = D_0 \eta_t + (D_0 P + D_1) \eta_{t-1} + \left( D_0 P^2 + D_1 P + D_2 \right) \eta_{t-2} + \left( D_0 P^3 + D_1 P^2 + D_2 P + D_3 \right) \eta_{t-3} + \cdots.
$$
We normalize the MA so it has an identity for the first coefficient. That is, set $C_0 = I$, $C_1 = (D_0 P + D_1) D_0^{-1}$, and $C_j = C_{j-1} D_0 P D_0^{-1} + D_j D_0^{-1}$.
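The recursive normalization can be verified against the direct formula Cj = (D0P^j + D1P^{j−1} + · · · + Dj)D0^{−1}; the check below uses random matrices (any invertible D0 and arbitrary D1, ..., D4, P will do) rather than model-specific ones:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-ins: any invertible D0 and arbitrary D1, ..., D4 and P suffice
D = [rng.standard_normal((3, 3)) for _ in range(5)]
P = rng.standard_normal((3, 3))
D0inv = np.linalg.inv(D[0])

def C_direct(j):
    """MA coefficient on eta_{t-j}: (D0 P^j + D1 P^{j-1} + ... + Dj) D0^{-1}."""
    return sum(D[i] @ np.linalg.matrix_power(P, j - i) for i in range(j + 1)) @ D0inv

# Recursive normalization from the text:
# C0 = I, Cj = C_{j-1} D0 P D0^{-1} + Dj D0^{-1}
C = [np.eye(3)]
for j in range(1, 5):
    C.append(C[-1] @ D[0] @ P @ D0inv + D[j] @ D0inv)

assert all(np.allclose(C[j], C_direct(j)) for j in range(5))
```

The recursion is convenient because it never requires forming powers of P explicitly.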
9.3.1.2. Special Property of the D’s
As in the 2-shock case, the D matrices have a special property that can be exploited when we characterize the coefficients of the VAR of Xt. In other words, the D’s for the 3-shock RBC model also satisfy the relation:
$$
\left( \gamma_k I - \left( D_0 P^2 + D_1 P + D_2 \right) (D_0 P + D_1)^{-1} \right) D_2 = 0, \qquad (9.3.7)
$$
which is the same as (9.2.1). Because D1 is singular when α = 0, we will assume that the choice of P and α is such that D0P + D1 is invertible. This rules out the case with P and α identically equal to 0. If that is the case of interest, assume that α is positive but very close to zero.
The steps of the proof of (9.3.7) in the 3-variable case are the same as in the 2-variable case. First note from (9.3.6) that
$$
D_2 = \begin{bmatrix} (1-a)\theta(1-\gamma_k) \\ (\alpha-\gamma_k)a \\ -d\gamma_k \end{bmatrix}
 \begin{bmatrix} \gamma_k & -\gamma_l & -\gamma_x \end{bmatrix} \equiv g h'.
$$
Thus, we can rewrite the left-hand side as follows:
$$
\begin{aligned}
\Big( \gamma_k I - \left( D_0 P^2 + D_1 P + D_2 \right) (D_0 P + D_1)^{-1} \Big) D_2
&= \left[ \gamma_k (g h') - (g h') (D_0 P + D_1)^{-1} (g h') \right] \\
&\quad - \left[ (D_0 P + D_1) P (D_0 P + D_1)^{-1} g h' \right]. \qquad (9.3.8)
\end{aligned}
$$
Both terms in square brackets in (9.3.8) are equal to 3×3 zero matrices. The first step is to show that
$$
(D_0 P + D_1)^{-1} g = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}. \qquad (9.3.9)
$$
The proof of this step is trivial since the first column of D0P + D1 is equal to g. Substituting (9.3.9) into (9.3.8), the result (9.3.7) follows immediately from the fact that h'[1, 0, 0]' = γk and P[1, 0, 0]' = 0.
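A quick numerical sanity check of (9.3.7) using only the structure exploited in the proof (illustrative numbers; a random matrix M1 with first column g stands in for D0P + D1, and P is chosen with P e1 = 0, as when technology has a unit root):

```python
import numpy as np

rng = np.random.default_rng(1)

# Structure used in the proof: D2 = g h' with h = [gamma_k, -gamma_l, -gamma_x]';
# g is the first column of D0 P + D1 (a random stand-in M1 here);
# and P annihilates e1 = [1, 0, 0]'.
gamma_k, gamma_l, gamma_x = 0.94, -0.05, -0.07
h = np.array([gamma_k, -gamma_l, -gamma_x])
g = rng.standard_normal(3)

M1 = rng.standard_normal((3, 3))    # stands in for D0 P + D1
M1[:, 0] = g                        # its first column equals g
P = np.diag([0.0, 0.8, 0.7])        # unit root in z: P e1 = 0; taxes AR(1)
D2 = np.outer(g, h)

# (9.3.7), using D0 P^2 + D1 P + D2 = (D0 P + D1) P + D2
lhs = (gamma_k * np.eye(3) - (M1 @ P + D2) @ np.linalg.inv(M1)) @ D2
assert np.allclose(lhs, 0.0)
```

The cancellation works for any invertible M1 whose first column is g, which is exactly what (9.3.9) requires.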
9.3.1.3. Proposition 4: Model Has an Infinite-Order VAR
The map between the theoretical MA and the VAR is the same as before. What is new is
the VAR representation.
Proposition 4. The model described above has a VAR representation with coefficients Bj that satisfy
$$
B_j = M B_{j-1} \qquad (9.3.10)
$$
for j ≥ 2, with $B_1 = C_1 = (D_0 P + D_1) D_0^{-1}$. The matrix M is 3×3 with eigenvalues equal to 0, α, and (1 − δ)/[z(1 + gn)].
Proof of Proposition 4. The first part of the proof is the same as for Proposition 1. The second part, involving the expressions for the eigenvalues, is different. In the three-shock case, one can use the same derivations as those in Proposition 1 to show that $[1,0,0]'h' - D_0^{-1}D_1$ has the same eigenvalues as M. In this case, $D_0^{-1}$ is given by:
$$
D_0^{-1} = \frac{1}{|D_0|}
\begin{bmatrix}
bf - ce & \theta(bf - ce) & 0 \\
af - cd & (1-\theta)f + \theta(af - cd) & -(1-\theta)c \\
bd - ae & -(1-\theta)e + \theta(bd - ae) & (1-\theta)b
\end{bmatrix}
$$
and the elements of $[1,0,0]'h' - D_0^{-1}D_1$ are given by
$$
\begin{aligned}
(1,1) &= \gamma_k - \theta (1 - \gamma_k - a + a\alpha)/(1-\theta) \\
(1,2) &= -\gamma_l - \theta (\gamma_l + b - b\alpha)/(1-\theta) \\
(1,3) &= -\gamma_x - \theta (\gamma_x + c - c\alpha)/(1-\theta) \\
(2,1) &= \left[ (af - cd)(\gamma_k - \theta + \theta a - \theta a\alpha) - af(1-\theta)\alpha \right]/|D_0| \\
(2,2) &= \left[ (af - cd)(-\gamma_l - \theta b + \theta b\alpha) + bf(1-\theta)\alpha \right]/|D_0| \\
(2,3) &= \left[ (af - cd)(-\gamma_x - \theta c + \theta c\alpha) + cf(1-\theta)\alpha \right]/|D_0| \\
(3,1) &= \left[ (bd - ae)(\gamma_k - \theta + \theta a - \theta a\alpha) + ae(1-\theta)\alpha \right]/|D_0| \\
(3,2) &= \left[ (bd - ae)(-\gamma_l - \theta b + \theta b\alpha) - be(1-\theta)\alpha \right]/|D_0| \\
(3,3) &= \left[ (bd - ae)(-\gamma_x - \theta c + \theta c\alpha) - ce(1-\theta)\alpha \right]/|D_0|
\end{aligned}
$$
where $|D_0| = (1-\theta)(bf - ce)$. To prove the proposition, we will show that the trace of $[1,0,0]'h' - D_0^{-1}D_1$ equals the sum of the proposed eigenvalues, that $|[1,0,0]'h' - D_0^{-1}D_1| = 0$, and that $|[1,0,0]'h' - D_0^{-1}D_1 - \alpha I| = 0$. These three conditions uniquely determine the three eigenvalues.
To compute the trace, sum the (1,1), (2,2), and (3,3) elements:
$$
\begin{aligned}
\mathrm{trace}\left( [1,0,0]'h' - D_0^{-1} D_1 \right)
&= \gamma_k - \theta (1 - \gamma_k - a + a\alpha)/(1-\theta) \\
&\quad + \big[ (af - cd)(-\gamma_l - \theta b + \theta b\alpha) + bf(1-\theta)\alpha \\
&\qquad + (bd - ae)(-\gamma_x - \theta c + \theta c\alpha) - ce(1-\theta)\alpha \big]/|D_0| \\
&= \alpha + \left[ (\gamma_k - \theta)(bf - ce) - \gamma_l (af - cd) - \gamma_x (bd - ae) \right]/|D_0| \\
&= \alpha + \big[ (\gamma_k - \theta)\phi_{xk'}(b\gamma_x - c\gamma_l) - \gamma_l \left( a\phi_{xk'}\gamma_x - c(\phi_{xk} + \phi_{xk'}\gamma_k - \theta) \right) \\
&\qquad - \gamma_x \left( b(\phi_{xk} + \phi_{xk'}\gamma_k - \theta) - a\phi_{xk'}\gamma_l \right) \big] / \left[ (1-\theta)\phi_{xk'}(b\gamma_x - c\gamma_l) \right] \\
&= \alpha + \frac{\theta(1 - \phi_{xk'}) - \phi_{xk}}{\phi_{xk'}(1-\theta)} \qquad (9.3.11) \\
&= \alpha + \frac{1-\delta}{z(1+g_n)} \qquad (9.3.12)
\end{aligned}
$$
where z without a subscript is the steady-state value.
Next we compute the determinant of $[1,0,0]'h' - D_0^{-1}D_1$ and show it is 0. Denoting the matrix by $\mathcal{M}$, we get
$$
\begin{aligned}
\det(\mathcal{M}) &= \mathcal{M}_{1,1} |\mathcal{M}([2,3],[2,3])| - \mathcal{M}_{1,2} |\mathcal{M}([2,3],[1,3])| + \mathcal{M}_{1,3} |\mathcal{M}([2,3],[1,2])| \\
&= \left( \gamma_k - \theta(1-\gamma_k - a + a\alpha)/(1-\theta) \right) (1-\theta)\alpha d (fb - ec)(\gamma_l c - \gamma_x b)/|D_0|^2 \\
&\quad + \left( \gamma_l + \theta(\gamma_l + b - b\alpha)/(1-\theta) \right) (1-\theta)\alpha d (fb - ec)(-\gamma_k c + \theta c + \gamma_x a)/|D_0|^2 \\
&\quad - \left( \gamma_x + \theta(\gamma_x + c - c\alpha)/(1-\theta) \right) (1-\theta)\alpha d (fb - ec)(-\gamma_k b + \theta b + \gamma_l a)/|D_0|^2 \\
&= \Big[ \left( \gamma_k(1-\theta) - \theta(1-\gamma_k - a + a\alpha) \right) (\gamma_l c - \gamma_x b) \\
&\qquad + \left( \gamma_l(1-\theta) + \theta(\gamma_l + b - b\alpha) \right) (-\gamma_k c + \theta c + \gamma_x a) \\
&\qquad - \left( \gamma_x(1-\theta) + \theta(\gamma_x + c - c\alpha) \right) (-\gamma_k b + \theta b + \gamma_l a) \Big] \, \alpha d (fb - ec)/|D_0|^2 \\
&= \Big[ \left( \gamma_k - \theta + \theta a(1-\alpha) \right) (\gamma_l c - \gamma_x b)
 + \left( \gamma_l + \theta b(1-\alpha) \right) \left( -(\gamma_k - \theta)c + \gamma_x a \right) \\
&\qquad - \left( \gamma_x + \theta c(1-\alpha) \right) \left( -(\gamma_k - \theta)b + \gamma_l a \right) \Big] \, \alpha d (fb - ec)/|D_0|^2 \\
&= 0. \qquad (9.3.13)
\end{aligned}
$$
Finally, we take the determinant of $\mathcal{M} - \alpha I$ and show it is 0 as follows:
$$
\begin{aligned}
\det(\mathcal{M} - \alpha I)
&= (\mathcal{M}_{1,1} - \alpha) |\mathcal{M}([2,3],[2,3]) - \alpha I|
 - \mathcal{M}_{1,2} \left( |\mathcal{M}([2,3],[1,3])| - \alpha \mathcal{M}_{2,1} \right) \\
&\quad + \mathcal{M}_{1,3} \left( |\mathcal{M}([2,3],[1,2])| + \alpha \mathcal{M}_{3,1} \right) \\
&= \alpha \Big\{ \alpha \left[ \mathcal{M}_{1,1} + \mathcal{M}_{2,2} + \mathcal{M}_{3,3} - \alpha \right]
 - \mathcal{M}_{1,1}\mathcal{M}_{2,2} - \mathcal{M}_{1,1}\mathcal{M}_{3,3} - \mathcal{M}_{2,2}\mathcal{M}_{3,3} \\
&\qquad + \mathcal{M}_{1,2}\mathcal{M}_{2,1} + \mathcal{M}_{1,3}\mathcal{M}_{3,1} + \mathcal{M}_{2,3}\mathcal{M}_{3,2} \Big\} \\
&= \alpha \Big\{ \alpha \left[ \mathrm{trace}(\mathcal{M}) - \alpha \right]
 - \left( \mathcal{M}_{1,1}\mathcal{M}_{2,2} - \mathcal{M}_{1,2}\mathcal{M}_{2,1} \right)
 - \left( \mathcal{M}_{1,1}\mathcal{M}_{3,3} - \mathcal{M}_{1,3}\mathcal{M}_{3,1} \right) \\
&\qquad - \left( \mathcal{M}_{2,2}\mathcal{M}_{3,3} - \mathcal{M}_{2,3}\mathcal{M}_{3,2} \right) \Big\} \\
&= \alpha \Big\{ \alpha \big[ (bf - ce)(\gamma_k - \theta + \theta a - \theta a\alpha)
 + (af - cd)(-\gamma_l - \theta b + \theta b\alpha) + bf(1-\theta)\alpha \\
&\qquad + (bd - ae)(-\gamma_x - \theta c + \theta c\alpha) - ce(1-\theta)\alpha
 - \alpha(1-\theta)(bf - ce) \big] \\
&\qquad - f\alpha \left[ b(\gamma_k - \theta) - a\gamma_l \right]
 + e\alpha \left[ c(\gamma_k - \theta) - a\gamma_x \right]
 - d\alpha \left[ \gamma_l c - \gamma_x b \right] \Big\}/|D_0| \\
&= 0. \qquad (9.3.14)
\end{aligned}
$$
The result in (9.3.14) implies that α is an eigenvalue. The result in (9.3.13) implies that 0 is an eigenvalue. Given these results, the fact that the trace is (9.3.12) implies that (1 − δ)/[z(1 + gn)] is the third eigenvalue. This completes the proof.
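The eigenvalue characterization can be confirmed numerically. The sketch below builds D0 and D1 from hypothetical decision-rule coefficients (illustrative values only, no calibration intended), forms [1,0,0]′h′ − D0⁻¹D1, and checks that its eigenvalues are 0, α, and the constant term in (9.3.11), [θ(1 − φxk′) − φxk]/[φxk′(1 − θ)]:

```python
import numpy as np

# Hypothetical decision-rule coefficients (illustration only, no calibration)
theta, alpha = 0.33, 0.50
gamma_k, gamma_l, gamma_x = 0.94, -0.05, -0.07
phi_lk, phi_lkp = 0.20, 0.30          # phi_lkp stands for phi_lk'
phi_ll, phi_lx = -0.50, -0.40
phi_xk, phi_xkp = 0.10, 2.50          # phi_xkp stands for phi_xk'

a = phi_lk + phi_lkp * gamma_k
b = phi_ll + phi_lkp * gamma_l
c = phi_lx + phi_lkp * gamma_x
d = phi_xk + phi_xkp * gamma_k - theta - (1 - theta) * a
e = phi_xkp * gamma_l - (1 - theta) * b
f = phi_xkp * gamma_x - (1 - theta) * c

D0 = np.array([[1 - theta + theta * a, -theta * b, -theta * c],
               [-a, b, c],
               [-d, e, f]])
D1 = np.array([[theta * (1 - a) * (1 - gamma_k),
                theta * (b + (1 - a) * gamma_l),
                theta * (c + (1 - a) * gamma_x)],
               [(alpha - gamma_k) * a, -alpha * b + gamma_l * a, -alpha * c + gamma_x * a],
               [-d * gamma_k, d * gamma_l, d * gamma_x]])

h = np.array([gamma_k, -gamma_l, -gamma_x])
e1 = np.array([1.0, 0.0, 0.0])
Mtilde = np.outer(e1, h) - np.linalg.solve(D0, D1)  # same eigenvalues as M

lam3 = (theta * (1 - phi_xkp) - phi_xk) / (phi_xkp * (1 - theta))
eigs = np.sort(np.linalg.eigvals(Mtilde).real)
assert np.allclose(eigs, np.sort([0.0, alpha, lam3]), atol=1e-8)
```

Only the third eigenvalue depends on the model's investment-share coefficients; with the model-implied values of φxk and φxk′ it equals (1 − δ)/[z(1 + gn)], as in (9.3.12).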
9.3.1.4. A Way to Make M Singular
Above we included the investment share in the VAR. The investment share is typically added to capture the capital dynamics when capital is unobserved. What if we assume that capital is observed and use the log of the capital-output ratio instead?
We can see the answer directly from the proof of Proposition 4. At the step (equation (9.3.11)) where we fill in expressions for φxk and φxk′ using (9.1.18), we could instead use φxk = 1 and φxk′ = 0. This yields a third eigenvalue equal to −1/0, or −∞. This clearly doesn't work since the MA is not invertible.
However, it shows me what would work: adding next period's capital relative to output, log(kt+1/yt), and therefore setting φxk = 0 and φxk′ = 1. If α = 0, then the matrix M has 3 zero eigenvalues. A researcher running a VAR would find that B2 is singular and that the rest of the Bj, j ≥ 3, are zero matrices. In fact, the structure would be such that the second and third columns of B2 would be equal, and equal to the negative of the first column of B2. That is how it works: certain lags cancel so that the VAR effectively mimics the model's finite-lag representation.
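The truncation mechanism can be illustrated with any 3×3 matrix M whose eigenvalues are all zero; the example below (hypothetical, not derived from the model) uses a rank-one nilpotent M, so B2 = MB1 is singular and Bj = 0 for j ≥ 3:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical illustration: a rank-one nilpotent M (all eigenvalues zero)
u = rng.standard_normal(3)
v = rng.standard_normal(3)
v -= (v @ u) / (u @ u) * u        # make v orthogonal to u, so (u v')^2 = 0
M = np.outer(u, v)

B1 = rng.standard_normal((3, 3))
B2 = M @ B1                        # singular: rank at most one
B3 = M @ B2                        # vanishes, as do all later B_j

assert max(abs(np.linalg.eigvals(M))) < 1e-6
assert np.linalg.matrix_rank(B2) <= 1
assert np.allclose(B3, 0.0)
```

A VAR fit to such data would appear to be a finite-lag VAR even though the map from the MA to the VAR is, in general, infinite-order.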
What is interesting is that this won't work if we divide kt+1 by kt and include the log of the growth rate of capital. If we add log(kt+1/kt) to the VAR, then we proceed the same way through the proof of Proposition 4 using d = γk − 1, e = γl, and f = γx. The result is that the eigenvalues of M are 0, α, and 1. The fact that one eigenvalue is 1 means that the MA is not invertible.
What these results tell me is that one has to proceed carefully, using many details of the model, to determine whether the SVAR has a short-lag representation. Since most business cycle models have a short-lag state-space representation, we advise using it directly. The state-space representation also allows us to treat the capital stocks as unobserved. This is certainly necessary in business cycle models with sticky prices and staggered contracts; there the state vector includes the distribution of capital stocks, which is unobserved.
References
Aiyagari, S. Rao and Ellen R. McGrattan. 1998. The optimum quantity of debt, Journal
of Monetary Economics, 42: 447-469.
Anderson, Brian D. O. and John B. Moore. 1979. Optimal Filtering, Englewood Cliffs:
Prentice-Hall.
Anderson, Evan, Lars Peter Hansen, Ellen R. McGrattan, and Thomas J. Sargent. 1996.
Mechanics of Forming and Estimating Dynamic Linear Economies, Handbook of
Computational Economics, eds. H. Amman, D. Kendrick, and J. Rust, (North-
Holland).
Bertsekas, Dimitri and Steven Shreve. 1978. Stochastic Optimal Control: The Discrete
Time Case, New York: Academic Press.
Blanchard, Olivier J. and Charles M. Kahn. 1980. The solution of linear difference models
under rational expectations, Econometrica, 48: 1305-1311.
Braun, Richard A. and Ellen R. McGrattan. 1993. The macroeconomics of war and peace.
NBER Macroeconomics Annual 1993. Cambridge: MIT Press.
Candler, G. V., M. J. Wright, and J. D. McDonald. 1994. A data parallel LU relaxation method for reacting flows, AIAA Journal, 32(12): 2380-2386.
Chari, V.V., Patrick Kehoe, and Ellen R. McGrattan. 1997. The poverty of nations: A
quantitative investigation. Staff Report #204, Federal Reserve Bank of Minneapo-
lis.
Chari, V. V., Patrick J. Kehoe, and Ellen R. McGrattan. 2000. Sticky price models of the business cycle: Can the contract multiplier solve the persistence problem? Econometrica, 68(5): 1151-1179.
Chari, V. V., Patrick J. Kehoe, and Ellen R. McGrattan. 2002. Accounting for the Great Depression, American Economic Review, Papers and Proceedings, 92(2): 22-27.
Chari, V. V., Patrick J. Kehoe, and Ellen R. McGrattan. 2005. A critique of structural VARs using business cycle theory, Staff Report #364, Federal Reserve Bank of Minneapolis.
Chari, V. V., Patrick J. Kehoe, and Ellen R. McGrattan. 2006. Business cycle accounting,
Econometrica, forthcoming.
Christiano, Lawrence J. 1990. Solving the stochastic growth model by linear-quadratic
approximation and by value-function iteration, Journal of Business and Economic
Statistics, 8: 23-26.
Christiano, Lawrence J. and Jonas D. Fisher. 2000. Algorithms for solving dynamic models
with occasionally binding constraints, Journal of Economic Dynamics and Control,
24(8): 1179–1232.
Computational Methods for the Study of Dynamic Economies, eds. R. Marimon and A. Scott (Oxford University Press, Oxford, U.K.).
Courant, R. and D. Hilbert. 1962. Methods of Mathematical Physics, Vols. I and II, New York: Interscience.
Den Haan, W. J. and A. Marcet. 1990. Solving the stochastic growth model by parameterized expectations, Journal of Business and Economic Statistics, 8: 31-34.
Ferziger, J. H. and M. Peric. 1996. Computational Methods for Fluid Dynamics, Berlin: Springer-Verlag.
Fletcher, R. 1987. Practical Methods of Optimization (Wiley: Chichester, U.K.).
Frontiers of Business Cycle Research. 1994. ed. T. F. Cooley (Princeton University Press, Princeton, NJ).
Golub, G. H. and J. M. Ortega. 1992. Scientific Computing and Differential Equations:
An Introduction to Numerical Methods (Academic Press: New York, NY).
Golub, G. H. and C. F. Van Loan. 1989. Matrix Computations (Johns Hopkins Press:
Baltimore, MD).
Hansen, Lars Peter. 1982. Large sample properties of generalized method of moments
estimators, Econometrica, 50:1029–1054.
Hansen, Lars Peter, and Kenneth Singleton. 1982. Generalized instrumental variables
estimation of nonlinear rational expectations models, Econometrica, 50(5): 1269–
1286.
Hirsch, C. 1988. Numerical Computation of Internal and External Flows, Vols. I and II, New York: Wiley.
Hughes, Thomas J. R. 1987. The Finite Element Method: Linear Static and Dynamic Finite Element Analysis. Englewood Cliffs: Prentice-Hall.
Judd, Kenneth L. 1992. Projection methods for solving aggregate growth models, Journal of Economic Theory, 58: 410-452.
Judd, Kenneth L. 1998. Numerical Methods in Economics (MIT Press, Cambridge, MA).
Kwakernaak, Huibert and Raphael Sivan. 1972. Linear Optimal Control Systems, New York: Wiley and Sons.
Kydland, Finn E. and Edward C. Prescott. 1982. Time to build and aggregate fluctuations,
Econometrica, 50, 1345-1370.
McGrattan, Ellen R. 1989. Computation and Application of Equilibrium Models with
Distortionary Taxes, Stanford University, Thesis.
McGrattan, Ellen R. 1990. Solving the stochastic growth model by linear-quadratic ap-
proximation, Journal of Business and Economic Statistics, 8: 41-44.
McGrattan, Ellen R. 1994. The macroeconomic effects of distortionary taxation, Journal
of Monetary Economics, 33: 573-601.
McGrattan, Ellen R. 1994. A note on computing competitive equilibria in linear models,
Journal of Economic Dynamics and Control, 18: 149-160.
McGrattan, Ellen R. 1994. A progress report on business cycle models, Federal Reserve
Bank of Minneapolis Quarterly Review, Fall.
McGrattan, Ellen R. 1996. Solving the stochastic growth model with a finite element
method, Journal of Economic Dynamics and Control, 20: 19-42.
McGrattan, Ellen R., Richard Rogerson, and R. Wright. 1997. An equilibrium model of the business cycle with household production and fiscal policy, International Economic Review, 38: 267-290.
McGrattan, Ellen R. and Edward C. Prescott. 2004. The 1929 stock market: Irving Fisher was right, International Economic Review, 45(4): 991-1009.
Press, W.H., B.P. Flannery, S.A. Teukolsky, and W.T. Vetterling. 1986. Numerical recipes:
The art of scientific computing. Cambridge: Cambridge University Press.
Reddy, J.N. 1993. An introduction to the finite element method. New York: McGraw-Hill.
Saad, Yousef. 1996. Iterative Methods for Sparse Linear Systems. Boston: PWS.
Sargent, Thomas J. 1980. Notes on Filtering, Control, and Rational Expectations, unpub-
lished manuscript, University of Minnesota.
Sargent, Thomas J. 1987. Dynamic Macroeconomic Theory. Cambridge: Harvard Univer-
sity Press.
Taylor, John B. and Harald Uhlig. 1990. Solving nonlinear stochastic growth models:
A comparison of alternative solution methods. Journal of Business and Economic
Statistics 8: 1-17.
Vaughan, David R. 1970. A Nonrecursive Algebraic Solution for the Discrete Riccati
Equation, IEEE Transactions on Automatic Control, AC-15, 597-599.