Chapter 13 Stochastic Optimal Control The state of the system is represented by a controlled stochastic process. Section 13.2 formulates a stochastic optimal

Chapter 13 Stochastic Optimal Control

The state of the system is represented by a controlled

stochastic process. Section 13.2 formulates a stochastic

optimal control problem. We shall consider stochastic

differential equations of a type known as Itô equations,

which are perturbed by Markov diffusion processes,

and our goal will be to synthesize optimal feedback

controls for systems subject to Itô equations.

In Section 13.3, we shall extend the production

planning model of Chapter 6.

In Section 13.4, we solve an optimal stochastic

advertising problem.

In Section 13.5, we will introduce investment decisions

in the consumption model of Example 1.3, and

consider both risk-free and risky investments.

In Section 13.6, we will conclude the chapter by

mentioning stochastic optimal control problems

involving jump Markov processes, which are treated in

Sethi and Zhang (1994a, 1994c).

13.2 Stochastic Optimal Control

We assume the state variables to be observable, and

we use the dynamic programming or the Hamilton-

Jacobi-Bellman framework rather than stochastic

maximum principles.

Maximize

subject to the Itô stochastic differential equation

V(x,t), the value function satisfies

By Taylor’s expansion, we have

From (3.29), we can formally write

The multiplication rules of the stochastic calculus are:

13.3 A stochastic Production Planning Model

Xt = the inventory level at time t (state variable),

Ut = the production rate at time t (control variable),S = the constant demand rate at time t ; S >0,T = the length of planning period, = the factory-optimal inventory level, = the factory-optimal production level,

x0 = the initial inventory level, h = the inventory holding cost coefficient, c = the production cost coefficient, B = the salvage value per unit of inventory at time T,

zt = the standard Wiener process, = the constant diffusion coefficient.

Let V(x,t) be the value function. It satisfies the Hamilton-Jacobi-Bellman (HJB) equation

Remark 13.1: If production rate were restricted to be

nonnegative, then (13.44) would be changed to

13.3.1 Solution for the Production Planning Problem

Since (13.51) must hold for any value of x , we musthave

where the boundary conditions are obtained by comparing (13.47) with the boundary condition V(x,T) = Bx of (13.43).

To solve (13.52), we expand by partial

fractions to obtain

where

Since S is assumed to be a constant, we can reduce

(13.53) to

By the change of variable defined by the

solution is given by

Remark 13.2 The optimal production rate in (13.59)

equals the demand plus a correction term which

depends on the level of inventory and the distance

from the horizon time T. Since (y-1) < 0 for t < T, it is

clear that for lower values of x, the optimal production

rate is likely to be positive. However, if x is very high,

the correction term will become smaller than –S, and

the optimal control will be negative. In other words, if

inventory level is too high, the factory can save money

by disposing a part of the inventory resulting in lower

holding costs.

Figure 13.2: A Sample Path of Xt with X0=x0 >0 and

B > 0

13.4 A Stochastic Advertising Problem

The model is :

where Xt is the market share and Ut is the rate of

advertising at time t .

V(x) is the expected value of the discounted profits

from time t to infinity. Since T = , the future looks the

same from any time t, and therefore the value function

does not depend on t.

We can write the HJB equation as

We obtain the explicit formula for the optimal feedback

control as

Eventually, the market share process hovers around

the equilibrium level

13.5 An Optimal Consumption-Investment Problem

Consider investing a part of Rich’s wealth in a

risky security or stock that earns an expected rate of

return that equals > r . The problem of Rich, known

now as Rich Investor, is to optimally allocate his

wealth between the risky-free savings account and the

risky stock over time and also consume over time so

as to maximize his total utility of consumption.

The savings account is easy to model.

Modeling the stock is more complicated.

where is the average rate of return on stock, is

the standard deviation associated with the return, and

zt is a standard Wiener process.

The price process Pt given by (13.74) is often referred

to as a logarithmic Brownian Motion.

Notation:

Wt = the wealth at time t ,

Ct = the consumption rate at time t ,

Qt = the fraction of the wealth invested in stock at time t,

1-Qt= the fraction of the wealth kept in the savings account at time t ,U(c) = the utility of consumption when consumption is at the rate c; the function U(c) is assumed to be increasing and concave, = the rate of discount applied to consumption utility, B = the bankruptcy parameter to be explained later.

The term QtWtdt represents the expected return from

the risky investment of QtWt dollars during the period

from t to t+dt. The term QtWtdzt represents the risk

involved in investing QtWt dollars in stock. The term

(1-Qt)rWt dt is the amount of interest earned on the

balance of (1-Qt)Wt dollars in the savings account.

Finally, Ctdt represents the amount of consumption

during the interval from t to t+dt .

We shall say that Rich goes bankrupt at time T , when

his wealth falls to zero at that time. T is a random

variable, called a stopping time, since it is observed

exactly at the instant of time when wealth falls to zero.

Rich’s objective function is:

See Sethi(1997a) for a detailed discussion of the

bankruptcy parameter B .

Simplify the problem by assuming:

We also assume B =-. The condition (13.81) together

with B =- implies a strictly positive consumpton level

at all times and no bankruptcy.

Karatzas, Lehoczky, Sethi, and Shreve(1986) assumed

that the value function is strictly concave and, therefore,

Vx is monotonically decreasing in x . This means that

The function c(•) defined in (13.83) has an inverse X(•)

such that (13.84) can be written as

Note that c(X(c))=c, c’(X)X’(c)=1, and therefore,

c’(X)= 1/ X’(c).

Differentiation with respect to c yields the intendedsecond-order, linear ordinary differential equation

This problem and its many extensions have been studied in great detail. See,e.g., Sethi(1997a).

13.6 Concluding Remarks

For stochastic optimal control application to

manufacturing problems, see Sethi and Zhang (1994a)

and Yin and Zhang (1997). For applications to

problems in finance, see Sethi(1997a) and Karatzas

and Shreve(1998). For applications in marketing, see

Tapiero(1988). For applications in economics including

economics of natural resources, see Derzko and

Sethi(1981a), and Malliaris and Brock(1982).

Documents

Chapter 13 Stochastic Optimal Control The state of the system is represented by a controlled stochastic process. Section 13.2 formulates a stochastic optimal