Wellesley Bellman Notes 1


8/3/2019 Wellesley Bellman Notes 1

    Fall Semester 05-06

    Akila Weerapana

    Lecture 22: Dynamic Optimization in Discrete Time

    I. INTRODUCTION

The last lecture served as a general introduction to dynamic optimization problems. We looked at how we can use the Lagrange multiplier method to solve simple dynamic optimization problems.

Today's lecture focuses on solving dynamic optimization problems without resorting to the sometimes cumbersome Lagrange multiplier technique. Keep in mind that the Lagrange multiplier method can always be used to solve a dynamic problem; we are simply looking for a more tractable and more powerful alternative.

The technique we use is known as the Bellman equation. The Bellman equation is a recursive representation of a maximization decision; in other words, it represents a maximization decision as a function of a smaller maximization decision.

    II. BELLMAN EQUATIONS

Consider a very general $T$-period optimization decision faced by an agent today, which we call period 1.

$$\max_{x_1, x_2, \ldots, x_T} \; \sum_{t=1}^{T} \beta^{t-1} f(x_t, A_{t-1})$$

subject to the constraints $A_t = g(x_t, A_{t-1})$ for $t = 1, 2, \ldots, T$, where $A_0$ and $A_T$ are given to us.

In this type of optimization, we can identify several variables that will be useful in formulating the Bellman equation for the problem.

$f(x_t, A_{t-1})$ is the objective function: the present discounted value of the sum of the objective functions in each time period is our maximization objective.

$x_t$ is the choice variable(s): the variable(s) whose time path we are choosing in order to maximize the PDV of the sum of the objective functions.

$A_{t-1}$ is known as the state variable: a stock variable that the agent inherits from the past at time $t$, which is affected by her choice at time $t$, and which she passes on to the next period. Note that depending on the setup of the problem, the $A$ variable may have a time $t$ subscript instead of a time $t-1$ subscript; examples can be seen later.

$g(x_t, A_{t-1})$ is known as the transition equation: it describes how the choice variable at time $t$ interacts with the state variable inherited from time $t-1$ to determine the value of the state variable we pass on to time $t+1$.


The centerpiece of the Bellman equation is a function known as the value function. The value function denotes the maximized value of the objective function from time $t$ onwards. The value of $V$ at any given period in time depends on the value of the state variable at that time, since that is what the decision maker inherits from the past.

    So the definition of the value function in the general problem defined above is

$$V_1(A_0) = \max_{x_1, x_2, \ldots, x_T} \; \sum_{t=1}^{T} \beta^{t-1} f(x_t, A_{t-1}) \quad \text{subject to } A_t = g(x_t, A_{t-1}) \text{ for } t = 1, \ldots, T$$

We can think of formulating the Bellman equation in a recursive manner by thinking of the dynamic decision as follows. Picking a value for the choice variable today (say $x_1$), given an initial value for the stock variable $A_0$, has two effects: it directly affects today's objective function (through $f(x_1, A_0)$), and it also affects the optimal decisions for next period's choice variables (by changing $A_1$).

Thus, in picking $x_1$ we have to be concerned about two things: i) what the choice of $x_1$ means for the current period's objective function, and ii) how the choice of $x_1$ affects the best we can do after today if we pass on $A_1 = g(x_1, A_0)$.

Since we defined $V_1(A_0)$ as the maximized value of the objective function at time 1 given an initial stock of assets $A_0$, then $V_2(A_1)$ is the maximized value of the objective function at time 2 given an initial stock of assets $A_1$. In other words, our maximization decision can be simplified greatly in a recursive form using the Bellman equation as

$$V_1(A_0) = \max_{x_1} \; \left[f(x_1, A_0) + \beta V_2(A_1)\right] \quad \text{where } A_1 = g(x_1, A_0)$$

The same recursive definition applies to $V_2$ as a function of $V_3$ and so on. In general, the decision of the individual at any point in time $t$ can be written as

$$V_t(A_{t-1}) = \max_{x_t} \; \left[f(x_t, A_{t-1}) + \beta V_{t+1}(A_t)\right] \quad \text{where } A_t = g(x_t, A_{t-1})$$

This recursive definition is enormously useful in solving discrete time dynamic optimization problems. We do not have to carry around many different choice variables and Lagrange multipliers, and the problem is reduced to a simple two-step problem.
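The recursion also suggests a direct computational strategy: set $V_{T+1} = 0$ and solve each period's one-step problem backwards. Below is a minimal backward-induction sketch in Python on a discretized state grid; the horizon, discount factor, grid, objective $f(x) = 2\sqrt{x}$ and transition $g(x, A) = A - x$ are all illustrative assumptions, not part of the notes.

```python
import math

# A minimal backward-induction sketch of the recursion
#   V_t(A) = max_x [ f(x, A) + beta * V_{t+1}(g(x, A)) ].
# Horizon, grid, objective and transition are illustrative assumptions
# (a small cake-style problem), not taken from the notes.

T = 5
beta = 0.95
states = [round(i / 100, 2) for i in range(0, 201)]  # state grid: 0.00 .. 2.00

def f(x, A):
    return 2 * math.sqrt(x)          # per-period objective f(x_t, A_{t-1})

def g(x, A):
    return round(A - x, 2)           # transition A_t = g(x_t, A_{t-1})

V = {T + 1: {A: 0.0 for A in states}}  # V_{T+1} = 0: nothing after the horizon
policy = {}

for t in range(T, 0, -1):            # solve each period's problem backwards
    V[t], policy[t] = {}, {}
    for A in states:
        best_val, best_x = -1.0, 0.0
        for x in states:             # feasible choices: 0 <= x <= A
            if x > A:
                break
            val = f(x, A) + beta * V[t + 1][g(x, A)]
            if val > best_val:
                best_val, best_x = val, x
        V[t][A], policy[t][A] = best_val, best_x
```

The loop fills in $V_t$ and the optimal choice (`policy`) for every grid state, mirroring the two-step structure above: a maximization over today's choice plus the discounted continuation value.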

We will write down Bellman equations for a variety of optimization problems and then demonstrate how to solve them.

    III. APPLICATIONS OF BELLMAN EQUATIONS

    Utility Maximization

Let's return to the first problem, the multi-period consumer optimization decision

$$\max_{C_1, \ldots, C_T} \; \sum_{t=1}^{T} \beta^{t-1} U(C_t)$$

where $A_t = (1+r)A_{t-1} + Y_t - C_t$ for $t = 1, 2, \ldots, T$ and $A_0$, $A_T$ are given.


The choice variable here is consumption, $C_t$, while the state variable is assets, $A_t$. The objective function is the utility function; the transition equation simply states that income minus consumption either adds to or subtracts from the principal and interest on assets inherited from the past, determining next period's assets.

    We will define the maximization problem faced at time t using the Bellman equation as

$$V_t(A_{t-1}) = \max_{C_t} \; \left[U(C_t) + \beta V_{t+1}(A_t)\right] \quad \text{where } A_t = (1+r)A_{t-1} + Y_t - C_t$$

Since the form of the utility function does not depend on time, we can drop the $t$ subscript on the value function and write $V(A_t)$ instead of $V_t(A_t)$.

The next task is to find the solution using the Bellman equation. Finding the solution to a Bellman equation involves two steps. First, we take the traditional FOC with respect to the choice variable, in this case $C_t$. The FOC will be

$$U'(C_t) + \beta V'(A_t)\frac{\partial A_t}{\partial C_t} = 0 \;\Longrightarrow\; U'(C_t) - \beta V'(A_t) = 0$$

In order to solve this FOC, we need to know the value of $V'(A_t)$. We do this by taking the derivative of the value function with respect to the state variable, $A_{t-1}$. The envelope theorem says that we can ignore the impact of changing $A$ on our choice variable in calculating that derivative.

    This results in the following equation

$$V'(A_{t-1}) = \beta V'(A_t)\frac{\partial A_t}{\partial A_{t-1}} \;\Longrightarrow\; V'(A_{t-1}) = \beta(1+r)V'(A_t)$$

From the FOC we know that $\beta V'(A_t) = U'(C_t)$ and, lagging one period, that $\beta V'(A_{t-1}) = U'(C_{t-1})$. Multiplying the envelope condition through by $\beta$ and substituting these in, we get

$$U'(C_{t-1}) = \beta(1+r)U'(C_t)$$

    This equation is more commonly written as

$$U'(C_t) = \beta(1+r)U'(C_{t+1})$$

This equation, which relates consumption in one period to consumption in the next period (more broadly speaking, the choice variable in one period to the choice variable in subsequent periods), is called an EULER equation. The Euler equation states that at the optimal choices, we cannot gain utility by making a feasible switch of consumption from one period to the next.

What is a feasible switch in consumption? For example, by consuming 1 unit less in period $t$ we will have $(1+r)$ units more to consume in period $t+1$. If we lower our consumption by 1 unit at time $t$, the net impact on our maximized utility is $-U'(C_t)$. Increasing


our consumption by $(1+r)$ units next period will have a net impact on our maximized utility of $(1+r)U'(C_{t+1})$. Since this gain is in the next period, we need to discount it, which gives us $\beta(1+r)U'(C_{t+1})$.

At the optimal point, the discounted gain from any feasible reallocation is zero, so we get $\beta(1+r)U'(C_{t+1}) - U'(C_t) = 0$, i.e. $\beta(1+r)U'(C_{t+1}) = U'(C_t)$. The Euler equation states that at the optimal choices, we cannot gain by making a feasible switch of consumption from one period to the next.

This holds for all $T-1$ pairs of consecutive periods. If we had a particular functional form, say $U(C_t) = 2\sqrt{C_t}$, we can show that the (lagged) Euler equation implies that $C_t = [\beta(1+r)]^2 C_{t-1}$. This difference equation, combined with the budget constraint $A_t = (1+r)A_{t-1} + Y_t - C_t$ and the initial and terminal conditions $A_0 = 0$ and $A_T = 0$, forms a two-variable system of difference equations which can be solved to find the time path of consumption.
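A quick numerical sketch can make the system concrete. Assuming illustrative values for $T$, $r$, $\beta$ and a flat income stream (none of these numbers come from the notes), the lagged Euler equation $C_t = [\beta(1+r)]^2 C_{t-1}$ and the present-value form of the budget constraint pin down the whole path:

```python
# Solve the consumption path implied by C_t = [beta*(1+r)]^2 * C_{t-1}
# and the budget constraint, with A_0 = A_T = 0.
# All parameter values below are illustrative assumptions.
T, r, beta = 10, 0.05, 0.96
Y = [1.0] * T                      # assumed flat income stream Y_1 .. Y_T
k = (beta * (1 + r)) ** 2          # per-period growth factor of consumption

# Present-value budget: sum C_t/(1+r)^t = sum Y_t/(1+r)^t when A_0 = A_T = 0
pdv_income = sum(Y[t] / (1 + r) ** (t + 1) for t in range(T))
denom = sum(k ** t / (1 + r) ** (t + 1) for t in range(T))
C = [(pdv_income / denom) * k ** t for t in range(T)]

# Recover the asset path from the transition A_t = (1+r)*A_{t-1} + Y_t - C_t
A = [0.0]
for t in range(T):
    A.append((1 + r) * A[-1] + Y[t] - C[t])
```

By construction the terminal condition $A_T = 0$ holds, and consumption grows at the constant factor $[\beta(1+r)]^2$ along the path.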

    Cake Eating Problem

Let's consider a classic dynamic optimization problem, known as the cake-eating problem.

This is a simplified version of the consumption problem defined earlier. Define $\Omega_t$ to be the size of the cake at time $t$ and assume that the utility function is $U(C_t) = 2\sqrt{C_t}$. The problem is

$$\max_{C_1, \ldots, C_T} \; \sum_{t=1}^{T} \beta^{t-1} U(C_t)$$

where $\Omega_t = \Omega_{t-1} - C_t$ for $t = 1, 2, \ldots, T$ and $\Omega_0$, $\Omega_T$ are given.

The choice variable here is consumption, $C_t$, while the state variable is the size of the cake, $\Omega_t$. The objective function is the utility function; the transition equation simply states that whatever is eaten at time $t$ reduces the size of the cake passed on to the next period.

    We will define the maximization problem faced at time t using the Bellman equation as

$$V(\Omega_{t-1}) = \max_{C_t} \; \left[2\sqrt{C_t} + \beta V(\Omega_t)\right] \quad \text{where } \Omega_t = \Omega_{t-1} - C_t$$

The FOC will be

$$\frac{1}{\sqrt{C_t}} + \beta V'(\Omega_t)(-1) = 0 \;\Longrightarrow\; \frac{1}{\sqrt{C_t}} = \beta V'(\Omega_t)$$

The envelope condition is

$$V'(\Omega_{t-1}) = \beta V'(\Omega_t)$$

From the FOC we know that $\beta V'(\Omega_t) = 1/\sqrt{C_t}$ and, lagging one period, that $\beta V'(\Omega_{t-1}) = 1/\sqrt{C_{t-1}}$. Multiplying the envelope condition through by $\beta$ and substituting these in, we get

$$\frac{1}{\sqrt{C_{t-1}}} = \frac{\beta}{\sqrt{C_t}} \;\Longrightarrow\; C_t = \beta^2 C_{t-1}$$


This equation is more commonly written as

$$C_{t+1} = \beta^2 C_t$$

This is the Euler equation for the model. The Euler equation states that at the optimal choices, we cannot gain utility by making a feasible switch of consumption from one period to the next. What is a feasible switch in consumption? For example, by consuming 1 unit less in period $t$

we will have 1 unit more to consume in period $t+1$. If we lower our consumption by 1 unit at time $t$, the net impact on our maximized utility is $-U'(C_t) = -1/\sqrt{C_t}$. Increasing our consumption by 1 unit next period will have a net impact on our maximized utility of $U'(C_{t+1}) = 1/\sqrt{C_{t+1}}$. Since this gain is in the next period, we need to discount it, which gives us $\beta/\sqrt{C_{t+1}}$.

At the optimal point, the discounted gain from any feasible reallocation is zero, so we get

$$\frac{\beta}{\sqrt{C_{t+1}}} - \frac{1}{\sqrt{C_t}} = 0$$

This difference equation (lagged one period), combined with the budget constraint $\Omega_t = \Omega_{t-1} - C_t$ and the initial and terminal conditions $\Omega_0$ and $\Omega_T$, forms a two-variable system of difference equations which can be solved to find the time path of consumption.
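Here the system has a closed form: consumption declines geometrically at rate $\beta^2$, and taking $\Omega_T = 0$ as an illustrative terminal condition (the whole cake is eaten, so $\sum_{t=1}^T C_t = \Omega_0$) pins down $C_1$ via a geometric sum. A sketch with assumed parameter values:

```python
# Cake-eating path: C_{t+1} = beta^2 * C_t, with the whole cake eaten
# (sum of C_t equal to Omega_0). T, beta and Omega_0 are illustrative assumptions.
T, beta, omega0 = 8, 0.9, 1.0
q = beta ** 2                           # per-period decline factor
C1 = omega0 * (1 - q) / (1 - q ** T)    # from the geometric-series budget sum
C = [C1 * q ** t for t in range(T)]

omega = [omega0]                        # cake remaining after each period
for c in C:
    omega.append(omega[-1] - c)
```

The path is strictly declining (since $\beta < 1$) and exhausts the cake exactly in period $T$.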

    Profit Maximization

The second problem was the multi-period firm profit maximization decision

$$\max_{I_1, \ldots, I_T, L_1, \ldots, L_T} \; \sum_{t=1}^{T} \left(\frac{1}{1+r}\right)^{t-1} \left[F(K_t, L_t) - w_t L_t - p_t I_t\right]$$

subject to the constraints $K_{t+1} = (1-\delta)K_t + I_t$ for $t = 1, 2, \ldots, T$, where $K_1$ and $K_{T+1}$ are given.

A couple of differences from the previous example: the state variable here is $K_t$. It has a $t$ subscript because that is what we inherit from the past (i.e., it is the value of the variable at the beginning of time $t$). There are now TWO choice variables, $I_t$ and $L_t$.

    The recursive Bellman equation is

$$V_t(K_t) = \max_{I_t, L_t} \; \left[F(K_t, L_t) - w_t L_t - p_t I_t + \frac{1}{1+r}V_{t+1}(K_{t+1})\right] \quad \text{where } K_{t+1} = (1-\delta)K_t + I_t$$

Once again, the maximized value of profits from time $t$ onwards can be recursively defined as the optimal value(s) of the choice variables at time $t$, taking into account the direct impact of these variables on the objective function today and the indirect effect on the state variable, which affects the optimal choices that can be made next period. Here $w_t$ is the real wage and $p_t$ is the real price of investment goods.

Note that we have to keep the subscript $t$ on the value function here because the profit function changes depending on the wage prevailing at that time. Two different time periods will have two different wages, and therefore two different levels of profits, even if $K$, $L$ and $I$ were the same. In the previous example, the value function did not have a $t$ subscript because the utility function did not have any parameters that varied with time.


The FOCs of this model are

$$\frac{\partial F}{\partial L_t} - w_t = 0$$

$$-p_t + \frac{1}{1+r}V'_{t+1}(K_{t+1})\frac{\partial K_{t+1}}{\partial I_t} = 0 \;\Longrightarrow\; -p_t + \frac{1}{1+r}V'_{t+1}(K_{t+1}) = 0$$

The first equation says that the marginal product of labor is set equal to its marginal cost. Since labor hiring is not dynamic, we get the same condition as in the static case.
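To see the static condition in isolation, suppose, purely for illustration (the notes do not fix a functional form), that $F(K, L) = K^a L^{1-a}$. Then $\partial F/\partial L_t = w_t$ can be solved for labor in closed form:

```python
# Static labor choice: F_L = w with an assumed Cobb-Douglas
# F(K, L) = K**a * L**(1 - a). K, a and w below are illustrative numbers.
K, a, w = 4.0, 0.3, 1.2
# F_L = (1 - a) * K**a * L**(-a) = w  =>  L = K * ((1 - a) / w) ** (1 / a)
L = K * ((1 - a) / w) ** (1 / a)
mpl = (1 - a) * K ** a * L ** (-a)   # marginal product of labor at the chosen L
```

At the chosen $L$, the marginal product of labor equals the wage, exactly as the static FOC requires.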

The final step is to take the derivative with respect to the state variable so that we can eliminate the $V'$ terms. This results in the following equation

$$V'_t(K_t) = \frac{\partial F}{\partial K_t} + \frac{1}{1+r}V'_{t+1}(K_{t+1})\frac{\partial K_{t+1}}{\partial K_t} \;\Longrightarrow\; V'_t(K_t) = \frac{\partial F}{\partial K_t} + \frac{1}{1+r}(1-\delta)V'_{t+1}(K_{t+1})$$

From the FOC we know that $\frac{1}{1+r}V'_{t+1}(K_{t+1}) = p_t$ and, lagging one period, that $V'_t(K_t) = (1+r)p_{t-1}$.

Putting the last two conditions together with the envelope condition, we get the Euler equation

$$(1+r)p_{t-1} = \frac{\partial F}{\partial K_t} + (1-\delta)p_t$$

We can move this one period forward and rearrange as

$$p_t = \frac{1}{1+r}\left[\frac{\partial F}{\partial K_{t+1}} + (1-\delta)p_{t+1}\right]$$

This is the Euler equation for this problem: it relates investment in one period to investment in the next period. Recall that the Euler equation reflects the core intuition that at the optimal choices, we cannot gain any profits by making a feasible switch of investment from one period to the next.

What is a feasible switch in investment? If we invest 1 unit more in period $t$, we can invest $(1-\delta)$ units less in period $t+1$ and leave every other period's investment path unchanged.

If we increase investment by 1 unit at time $t$, the net impact on our maximized profits is $-p_t + \frac{1}{1+r}\frac{\partial F}{\partial K_{t+1}}$. The first term is the cost of buying 1 more unit of the investment good; the second is the additional profit that we can make in the next period as a result of increasing our capital stock by 1 unit via investment.

If we decrease investment next period by $(1-\delta)$ units, it will have a net impact on our maximized profits of $(1-\delta)p_{t+1}$. Since this is in the next period, we need to discount it, which gives us $\frac{1}{1+r}(1-\delta)p_{t+1}$.

At the optimal point, the discounted gain from any feasible reallocation is zero, so we get

$$-p_t + \frac{1}{1+r}\frac{\partial F}{\partial K_{t+1}} + \frac{1}{1+r}(1-\delta)p_{t+1} = 0,$$

which is the Euler equation.
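One useful special case: if the price of investment goods is constant over time ($p_t = p$), the Euler equation collapses to the user-cost condition $\partial F/\partial K = (r+\delta)p$. A sketch with an assumed Cobb-Douglas technology and illustrative numbers (none of which come from the notes):

```python
# Steady state of the investment Euler equation (1+r)p = F_K + (1-delta)p,
# i.e. F_K = (r + delta) * p, for an assumed F(K, L) = K**a * L**(1-a) with L = 1.
r, delta, p, a = 0.05, 0.1, 1.0, 0.3
# F_K = a * K**(a-1) = (r + delta) * p  =>  K = (a / ((r + delta) * p)) ** (1/(1-a))
K = (a / ((r + delta) * p)) ** (1 / (1 - a))
fk = a * K ** (a - 1)                   # marginal product of capital at K
steady_I = delta * K                     # investment that keeps K constant
```

At this $K$, the forward Euler equation holds with equality period after period, so the firm has no incentive to shift investment across time.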