Dynamic Programming Lecture 13 (5/21/2014)


Page 1:

Dynamic Programming

Lecture 13 (5/21/2014)

Page 2:

- A Forest Thinning Example -

[Network diagram: node 1 (260 m3/ha, age 10) at Stage 1; nodes 2-4 (650, 535, 410 m3/ha, age 20) at Stage 2; nodes 5-10 (850, 750, 650, 600, 500, 400 m3/ha, age 30) at Stage 3; node 11 (260 m3/ha, age 10 of the next rotation). Arcs are labeled with the volume removed by thinning or final harvest (0, 50, 100, 150, 175, 200, 75, and the age-30 volumes).]

Source: Dykstra’s Mathematical Programming for Natural Resource Management (1984)

Projected yield (m3/ha) at age 20 as a function of the action taken at age 10

Beginning   Volume    Residual   Ten-year   Volume
volume      thinned   volume     growth     at age 20
260         0         260        390        650
260         50        210        325        535
260         100       160        250        410

Projected yield (m3/ha) at age 30 as a function of the beginning volume at age 20 and the action taken at age 20

Beginning   Volume    Residual   Ten-year   Volume
volume      thinned   volume     growth     at age 30
650         0         650        200        850
650         150       500        250        750
650         200       450        200        650
535         0         535        215        750
535         100       435        215        650
535         175       360        140        500
410         0         410        190        600
410         75        335        165        500
410         150       260        140        400

Page 3:

Dynamic Programming Cont.

• Stages, states and actions (decisions)

• Backward and forward recursion

$x_i$ = decision variable: the immediate destination node at stage $i$;

$f_i(s, x_i)$ = the maximum total volume harvested during all remaining stages, given that the stand has reached the state corresponding to node $s$ at stage $i$ and the immediate destination node is $x_i$;

$x_i^*$ = the value of $x_i$ that maximizes $f_i(s, x_i)$; and

$f_i^*(s)$ = the maximum value of $f_i(s, x_i)$, i.e., $f_i^*(s) = f_i(s, x_i^*)$.

Page 4:

Solving Dynamic Programs

• Recursive relation at stage i (Bellman Equation):

$$f_i^*(s) = \max_{x_i} \left\{ H_{s,x_i} + f_{i+1}^*(x_i) \right\},$$

where $H_{s,x_i}$ is the volume harvested by thinning at the beginning of stage $i$ in order to move the stand from state $s$ at the beginning of stage $i$ to state $x_i$ at the end of stage $i$.
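To make the recursion concrete, here is a minimal backward-recursion sketch in Python (not part of the original lecture) that solves the three-stage thinning example using the projected-yield tables above. The dictionaries map each state (stand volume) to its thinning options and the resulting volume at the next age.

```python
# Backward-recursion sketch for the three-stage thinning example.
# Stage 1: age 10 -> 20; Stage 2: age 20 -> 30; Stage 3: final harvest at age 30.
# Transitions come straight from the projected-yield tables above:
# {state: {volume thinned: resulting volume at the next age}}.

stage1 = {260: {0: 650, 50: 535, 100: 410}}            # options at age 10
stage2 = {650: {0: 850, 150: 750, 200: 650},
          535: {0: 750, 100: 650, 175: 500},
          410: {0: 600, 75: 500, 150: 400}}            # options at age 20

# Stage 3: f3*(s) = s, because the whole remaining stand is harvested at age 30.
f3 = {s: s for options in stage2.values() for s in options.values()}

def backward_step(transitions, f_next):
    """Return f_i*(s) and the maximizing thinning volume x_i*(s) for one stage."""
    f, best = {}, {}
    for s, options in transitions.items():
        f[s], best[s] = max((h + f_next[s2], h) for h, s2 in options.items())
    return f, best

f2, best2 = backward_step(stage2, f3)
f1, best1 = backward_step(stage1, f2)

print(f1[260], best1[260])   # 900 0  -> thin nothing at age 10
print(best2[650])            # 150    -> thin 150 m3/ha at age 20
```

With these data the maximum total harvest is 900 m3/ha: no thinning at age 10, a 150 m3/ha thinning at age 20, and a 750 m3/ha final harvest at age 30.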

Page 5:

Dynamic Programming

• Structural requirements of DP
  – The problem can be divided into stages
  – Each stage has a finite set of associated states (discrete-state DP)
  – The impact of a policy decision to transform a state in a given stage to another state in the subsequent stage is deterministic
  – Principle of optimality: given the current state of the system, the optimal policy for the remaining stages is independent of any prior policy adopted

Page 6:

Examples of DP

• The Floyd-Warshall Algorithm (used in the Bucket formulation of ARM):

Let $c(i,j)$ denote the length (or weight) of the path between nodes $i$ and $j$; and let $s(i,j,k)$ denote the length (or total weight) of the shortest path between nodes $i$ and $j$ going through intermediate nodes $1, 2, \ldots, k$ only.

Then the following recursion gives the length of the shortest paths between all pairs of nodes in a graph:

$$s(i,j,k) = \begin{cases} c(i,j) & \text{if } k = 0, \\ \min\left\{ s(i,j,k-1),\; s(i,k,k-1) + s(k,j,k-1) \right\} & \text{otherwise.} \end{cases}$$
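A compact Python sketch of this recursion follows; it is illustrative only, and the node indices and arc weights below are made up, not taken from the lecture.

```python
from itertools import product

INF = float("inf")

def floyd_warshall(arc_weights, n):
    """All-pairs shortest path lengths for nodes 0..n-1; arc_weights[(i, j)] = c(i, j)."""
    # Base case k = 0: only direct arcs are allowed (no intermediate nodes).
    s = [[0 if i == j else arc_weights.get((i, j), INF) for j in range(n)]
         for i in range(n)]
    # Allow one more intermediate node k at a time, as in the recursion above.
    for k in range(n):
        for i, j in product(range(n), repeat=2):
            s[i][j] = min(s[i][j], s[i][k] + s[k][j])
    return s

# Made-up example: 4 nodes and 5 directed arcs.
arcs = {(0, 1): 5, (1, 2): 3, (0, 2): 10, (2, 3): 1, (0, 3): 20}
dist = floyd_warshall(arcs, 4)
print(dist[0][3])   # 9, via 0 -> 1 -> 2 -> 3
```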

Page 7:

Examples of DP cont.

• Minimizing the risk of losing an endangered species (non-linear DP):

Source: Buongiorno and Gilless (2003), Decision Methods for Forest Resource Management

Probability of project failure by amount spent

Project (stage)   $0     $1M    $2M
1                 50%    30%    20%
2                 70%    50%    30%
3                 80%    50%    40%

[Decision network: states are the funding levels still available ($2M, $1M, $0) at Stages 1, 2, and 3; arcs represent the budget allocated to each project.]

Page 8:

Minimizing the risk of losing an endangered species (DP example)

• Stages (t = 1, 2, 3) represent $ allocation decisions for Projects 1, 2, and 3;
• States (i = 0, 1, 2) represent the budgets ($M) available at each stage t;
• Decisions (j = 0, 1, 2) represent the budgets available at stage t+1;
• $p_t(i, j)$ denotes the probability of failure of Project t if decision j is made (i.e., $i - j$ is spent on Project t); and
• $V_t^*(i)$ is the smallest probability of total failure from stage t onward, starting in state i and making the best decision j*.

Then, the recursive relation can be stated as:

$$V_t^*(i) = \min_j \left\{ p_t(i, j)\, V_{t+1}^*(j) \right\}.$$
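The recursion can be run directly on the failure-probability table above. A minimal Python sketch (not from the lecture; the variable names are mine):

```python
# Backward recursion for the species-protection example (a sketch, not the
# lecture's own code). fail[t][k] = probability that Project t fails if $k M is spent.
fail = {1: [0.50, 0.30, 0.20],
        2: [0.70, 0.50, 0.30],
        3: [0.80, 0.50, 0.40]}

V = {4: {j: 1.0 for j in range(3)}}     # boundary: V*_4(j) = 1 for all j
best = {}

for t in (3, 2, 1):                     # stages, processed backward
    V[t], best[t] = {}, {}
    for i in range(3):                  # state i: budget ($M) still available
        # decision j: budget carried to stage t+1, so i - j is spent on Project t
        V[t][i], best[t][i] = min((fail[t][i - j] * V[t + 1][j], j)
                                  for j in range(i + 1))

print(round(V[1][2], 3))   # 0.105 -> smallest probability that all three projects fail
print(best[1][2])          # 1     -> carry $1M forward, i.e. spend $1M on Project 1
```

With the full $2M budget, the best allocation spends $1M on Project 1, nothing on Project 2, and $1M on Project 3, for a 10.5% probability that every project fails.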

Page 9:

Markov Chains (based on Buongiorno and Gilless 2003)

Volume by stand state

State i    Volume (m3/ha)
L          < 400
M          400-700
H          > 700

20-yr transition probabilities without management (start state i in rows, end state j in columns)

State i    L      M      H
L          40%    60%    0%
M          0%     30%    70%
H          5%     5%     90%

Let P denote the transition probability matrix and $p_t$ the probability distribution of stand states in period $t$. Then

$$p_t = p_{t-1} P.$$

• The vector $p_t$ converges to a vector of steady-state probabilities $p^*$.
• $p^*$ is independent of $p_0$!
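A short Python sketch (mine, not the lecture's) that iterates $p_t = p_{t-1} P$ with the no-management matrix above; starting from any other $p_0$ gives the same limit.

```python
import numpy as np

# 20-yr transition matrix without management (rows/columns ordered L, M, H).
P = np.array([[0.40, 0.60, 0.00],
              [0.00, 0.30, 0.70],
              [0.05, 0.05, 0.90]])

p = np.array([1.0, 0.0, 0.0])   # p0: start with every stand in state L
for _ in range(200):            # iterate p_t = p_{t-1} P until it settles
    p = p @ P

print(p.round(3))               # ~ [0.068 0.117 0.816] = steady state p*
```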

Page 10:

Markov Chains cont.

• Berger-Parker Landscape Index:

$$BP_t = \frac{\max(p_{Lt},\, p_{Mt},\, p_{Ht})}{p_{Lt} + p_{Mt} + p_{Ht}},$$

where $p_{it}$ is the probability that the stand is in state $i$ ($i$ = L, M, or H) in period $t$.

• Mean Residence Times:

$$m_i = \frac{D}{1 - p_{ii}},$$

where $D$ is the length of each period, and $p_{ii}$ is the probability that a stand in state $i$ at the beginning of period $t$ stays there till the end of period $t$.

• Mean Recurrence Times:

$$m_{ii} = \frac{D}{\pi_i},$$

where $\pi_i$ is the steady-state probability of state $i$.
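A quick numerical illustration (my own arithmetic, assuming D = 20 years and the steady-state probabilities implied by the no-management matrix above, roughly 0.068, 0.117, and 0.816 for L, M, and H):

```python
D = 20.0                                            # period length in years
p_stay = {"L": 0.40, "M": 0.30, "H": 0.90}          # diagonal of the transition matrix
pi = {"L": 7 / 103, "M": 12 / 103, "H": 84 / 103}   # steady-state probabilities

residence = {i: D / (1 - p_stay[i]) for i in p_stay}   # mean residence time m_i
recurrence = {i: D / pi[i] for i in pi}                # mean recurrence time m_ii
bp = max(pi.values()) / sum(pi.values())               # Berger-Parker index at steady state

print(residence)   # ~ {'L': 33.3, 'M': 28.6, 'H': 200.0} years
print(recurrence)  # ~ {'L': 294.3, 'M': 171.7, 'H': 24.5} years
print(bp)          # ~ 0.82
```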

Page 11:

Markov Chains cont.

• Forest dynamics (expected revenues or biodiversity with vs. without management):

20-yr transition probabilities without management (start state i in rows, end state j in columns)

State i    L      M      H
L          40%    60%    0%
M          0%     30%    70%
H          5%     5%     90%

20-yr transition probabilities with management (start state i in rows, end state j in columns)

State i    L      M      H
L          40%    60%    0%
M          0%     30%    70%
H          40%    60%    0%

Expected long-term biodiversity: $B = \pi_L B_L + \pi_M B_M + \pi_H B_H$

Expected long-term periodic income: $R = \pi_L R_L + \pi_M R_M + \pi_H R_H$
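A sketch comparing the two regimes. The transition matrices are from the tables above, but the per-state biodiversity scores B are invented placeholders purely for illustration.

```python
import numpy as np

def steady_state(P, iters=500):
    """Steady-state distribution found by iterating p_t = p_{t-1} P."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(iters):
        pi = pi @ P
    return pi

P_without = np.array([[0.40, 0.60, 0.00],
                      [0.00, 0.30, 0.70],
                      [0.05, 0.05, 0.90]])
P_with    = np.array([[0.40, 0.60, 0.00],
                      [0.00, 0.30, 0.70],
                      [0.40, 0.60, 0.00]])

B = np.array([0.2, 0.5, 0.9])   # hypothetical biodiversity scores for L, M, H

for label, P in (("w/o management", P_without), ("w/ management", P_with)):
    pi = steady_state(P)
    print(label, pi.round(3), "expected long-term biodiversity:", round(pi @ B, 2))
```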

Page 12:

Markov Chains cont.

• Present value of expected returns

$$V_{i,t} = R_i + \frac{1}{(1+r)^{20}} \left( p_{iL} V_{L,t-1} + p_{iM} V_{M,t-1} + p_{iH} V_{H,t-1} \right),$$

where $V_{i,t}$ is the present value of the expected return from a stand in state $i$ ($i$ = L, M, or H) managed with a specific harvest policy, with $t$ periods to go before the end of the planning horizon. $R_i$ is the immediate return from managing a stand in state $i$ with the given harvest policy. $p_{iL}$, $p_{iM}$, $p_{iH}$ are the probabilities that a stand in state $i$ moves to state L, M, or H, respectively.
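As $t$ grows large the recursion converges, and for a fixed policy the limiting values solve the linear system $V = R + d\,P\,V$. A sketch, with a hypothetical return vector R and an assumed 3%/yr discount rate (neither is given in the lecture):

```python
import numpy as np

P = np.array([[0.40, 0.60, 0.00],
              [0.00, 0.30, 0.70],
              [0.05, 0.05, 0.90]])           # no-management transitions (L, M, H)
R = np.array([0.0, 0.0, 1200.0])             # hypothetical periodic returns per state
d = 1.0 / (1.03 ** 20)                       # discount factor for one 20-yr period

# V = R + d P V  =>  (I - d P) V = R
V = np.linalg.solve(np.eye(3) - d * P, R)
print(V.round(1))                            # long-horizon present value by starting state
```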

Page 13:

Markov Decision Processes

20-yr transition probabilities depending on the harvest decision

No Cut (end state j in columns)
Start state i   L     M     H
L               40%   60%   0%
M               0%    30%   70%
H               5%    5%    90%

Cut (end state j in columns)
Start state i   L     M     H
L               40%   60%   0%
M               40%   60%   0%
H               40%   60%   0%

$$V_{i,t}^* = \max_j \left\{ R_{ij} + \frac{1}{(1+r)^{20}} \left( p_{iLj} V_{L,t-1}^* + p_{iMj} V_{M,t-1}^* + p_{iHj} V_{H,t-1}^* \right) \right\},$$

where $V_{i,t}^*$ is the highest present value of the expected return from a stand in state $i$ ($i$ = L, M, or H) with $t$ periods to go before the end of the planning horizon. $R_{ij}$ is the immediate return from managing a stand in state $i$ with harvest policy $j$. $p_{iLj}$, $p_{iMj}$, $p_{iHj}$ are the probabilities that a stand in state $i$ moves to state L, M, or H, respectively, if the harvest policy is $j$.
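Finally, a minimal value-iteration sketch of this Markov decision process. The two transition matrices are taken from the tables above; the immediate returns R_ij and the 3%/yr discount rate are hypothetical, since the lecture gives no numbers for them.

```python
import numpy as np

# Transition matrices by decision j (rows/columns ordered L, M, H).
P = {"no_cut": np.array([[0.40, 0.60, 0.00],
                         [0.00, 0.30, 0.70],
                         [0.05, 0.05, 0.90]]),
     "cut":    np.array([[0.40, 0.60, 0.00],
                         [0.40, 0.60, 0.00],
                         [0.40, 0.60, 0.00]])}

# Hypothetical immediate returns R_ij ($/ha): cutting pays more in higher-volume states.
R = {"no_cut": np.array([0.0, 0.0, 0.0]),
     "cut":    np.array([100.0, 600.0, 1500.0])}

d = 1.0 / (1.03 ** 20)          # assumed discount factor for one 20-yr period
V = np.zeros(3)                 # V*_{i,0} = 0 at the end of the planning horizon

for t in range(1, 51):          # backward recursion with up to 50 periods to go
    Q = {j: R[j] + d * P[j] @ V for j in P}            # value of each decision j
    policy = ["cut" if Q["cut"][i] >= Q["no_cut"][i] else "no cut" for i in range(3)]
    V = np.maximum(Q["no_cut"], Q["cut"])              # V*_{i,t}

print(V.round(1), policy)       # optimal values and decisions for states L, M, H
```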