DYNAMIC PROGRAMMING · 08/03/2020

8 CONTINUOUS TIME DYNAMIC PROGRAMMING

ABSTRACT DEFINITION

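PLANT EQUATION:

$$\frac{dx_t}{dt} = f(x_t, a_t), \qquad a_t \in \mathcal{A}, \quad x_t \in \mathcal{X}, \quad t \in \mathbb{R}_+$$

OBJECTIVE: MINIMIZE THE SUM OF COSTS (EQUIVALENTLY, MAXIMIZE THE SUM OF REWARDS $-c$)

$$L(x_0) = \min \left\{ \int_0^T e^{-\alpha t} c(x_t, a_t)\, dt + e^{-\alpha T} c(x_T, T) \right\} \quad \text{OVER } a_t \in \mathcal{A},\ t \in \mathbb{R}_+$$

VALUE FUNCTION: $L_t(x)$ IS THE OPTIMAL COST WHEN STARTING FROM STATE $x$ AT TIME $t$:

$$L_t(x) = \min \left\{ \int_t^T e^{-\alpha (s-t)} c(x_s, a_s)\, ds + e^{-\alpha (T-t)} c(x_T, T) \right\}, \qquad x_t = x$$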
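As a rough illustration of the definition above, the following Python sketch approximates the discounted cost of a fixed policy by forward-Euler integration of the plant equation. The function name `simulate_cost`, the scalar example (dx/dt = a, c(x, a) = x^2 + a^2, zero terminal cost) and the feedback policy a = -x are assumptions made for illustration; none of them appear in the notes.

```python
import math

def simulate_cost(f, c, terminal_cost, policy, x0, T, alpha=0.0, n_steps=10_000):
    """Approximate int_0^T e^{-alpha t} c(x_t, a_t) dt + e^{-alpha T} * terminal cost
    for a given policy, using forward-Euler steps of the plant equation."""
    dt = T / n_steps
    x, total = x0, 0.0
    for k in range(n_steps):
        t = k * dt
        a = policy(x, t)                               # action chosen by the policy
        total += math.exp(-alpha * t) * c(x, a) * dt   # discounted running cost
        x += f(x, a) * dt                              # Euler step of dx/dt = f(x, a)
    return total + math.exp(-alpha * T) * terminal_cost(x)

# Hypothetical scalar example (an assumption, not taken from the notes).
cost = simulate_cost(
    f=lambda x, a: a,
    c=lambda x, a: x * x + a * a,
    terminal_cost=lambda x: 0.0,
    policy=lambda x, t: -x,
    x0=1.0,
    T=5.0,
)
print(cost)
```

Under this policy the state is x_t = e^{-t}, so the exact cost is 1 - e^{-10}; the Euler estimate should agree with it up to an error of order dt.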

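THE HAMILTON-JACOBI-BELLMAN EQUATION (THE HJB EQUATION)

$$0 = \min_{a \in \mathcal{A}} \left\{ c(x, a) + \partial_t L_t(x) + f(x, a)\, \partial_x L_t(x) - \alpha L_t(x) \right\}$$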

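A HEURISTIC DERIVATION OF THE HJB EQUATION: WE USE TAYLOR APPROXIMATIONS WITH A SMALL TIME STEP $\delta$

$$x_{t+\delta} \approx x_t + \delta f(x_t, a_t), \qquad e^{-\alpha t} \approx (1 - \alpha\delta)^{t/\delta}$$

$$C(x_0, a) \approx \sum_{t \in \{0, \delta, \dots, T-\delta\}} (1 - \alpha\delta)^{t/\delta}\, c(x_t, a_t)\, \delta + (1 - \alpha\delta)^{T/\delta}\, c(x_T, T)$$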
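A discretization with time step δ replaces the discount factor e^{-αt} by (1 - αδ)^{t/δ}. The short Python check below, with the hypothetical helper `discount_gap` and arbitrarily chosen values α = 0.5 and t = 2, is only meant to show that the gap between the two shrinks as δ decreases.

```python
import math

def discount_gap(alpha, t, delta):
    """Absolute gap between the discrete and continuous discount factors."""
    return abs((1.0 - alpha * delta) ** (t / delta) - math.exp(-alpha * t))

# Illustrative values only; alpha, t and the step sizes are assumptions.
gaps = [discount_gap(alpha=0.5, t=2.0, delta=d) for d in (0.1, 0.01, 0.001)]
print(gaps)
```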

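THE BELLMAN EQUATION FOR THE DISCRETE-TIME DYNAMIC PROGRAM WITH COSTS $\delta c(x_t, a_t)$ AND DYNAMICS $x_{t+\delta} = x_t + \delta f(x_t, a_t)$ IS

$$L_t(x) = \min_{a \in \mathcal{A}} \left\{ \delta c(x, a) + (1 - \alpha\delta)\, L_{t+\delta}\big(x + \delta f(x, a)\big) \right\}$$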
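A minimal sketch of how a discretized Bellman recursion of the form L_t(x) = min_a { δ c(x, a) + (1 - αδ) L_{t+δ}(x + δ f(x, a)) } could be solved by backward induction on a grid. The state and action grids, the linear interpolation via `numpy.interp` (which clamps outside the grid), the function name `solve_discretized_bellman`, and the scalar test problem (dx/dt = a, c = x^2 + a^2, zero terminal cost, α = 0) are all assumptions made for illustration.

```python
import numpy as np

def solve_discretized_bellman(f, c, terminal_cost, xs, actions, T, delta, alpha=0.0):
    """Backward induction for
    L_t(x) = min_a { delta*c(x,a) + (1 - alpha*delta) * L_{t+delta}(x + delta*f(x,a)) }
    on a one-dimensional state grid xs and a finite action grid."""
    n_steps = int(round(T / delta))
    X, A = np.meshgrid(xs, actions, indexing="ij")  # rows: states, columns: actions
    stage_cost = delta * c(X, A)
    x_next = X + delta * f(X, A)
    L = terminal_cost(xs)  # value function at the final time T
    for _ in range(n_steps):
        # Interpolate L_{t+delta} at the successor states (clamped at grid edges).
        L_next = np.interp(x_next.ravel(), xs, L).reshape(x_next.shape)
        L = np.min(stage_cost + (1.0 - alpha * delta) * L_next, axis=1)
    return L  # approximation of L_0 on the grid

# Hypothetical scalar test problem (not from the notes).
xs = np.linspace(-2.0, 2.0, 201)
actions = np.linspace(-3.0, 3.0, 121)
L0 = solve_discretized_bellman(
    f=lambda x, a: a,
    c=lambda x, a: x**2 + a**2,
    terminal_cost=lambda x: np.zeros_like(x),
    xs=xs, actions=actions, T=1.0, delta=0.01,
)
i = int(np.argmin(np.abs(xs - 1.0)))  # grid index of x = 1
print(L0[i], np.tanh(1.0))
```

For this test problem the continuous-time value function is tanh(T - t) x^2, so the grid value at x = 1 should be close to tanh(1), up to discretization and interpolation error.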

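NOTICE THAT, AS $\delta \to 0$,

$$\frac{L_{t+\delta}\big(x + \delta f(x, a)\big) - L_t(x)}{\delta} \to \partial_t L_t(x) + f(x, a)\, \partial_x L_t(x)$$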

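SO, SUBTRACTING $L_t(x)$ FROM BOTH SIDES, DIVIDING BY $\delta$ AND LETTING $\delta \to 0$, THE BELLMAN EQUATION BECOMES

$$0 = \min_{a \in \mathcal{A}} \left\{ c(x, a) + \partial_t L_t(x) + f(x, a)\, \partial_x L_t(x) - \alpha L_t(x) \right\}$$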
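One way to sanity-check a candidate value function against an HJB equation of the form 0 = min_a { c(x, a) + ∂_t L_t(x) + f(x, a) ∂_x L_t(x) - α L_t(x) } is to evaluate the right-hand side numerically. The sketch below uses central differences and a fine action grid; the helper `hjb_residual`, the test problem (dx/dt = a, c = x^2 + a^2, α = 0, T = 1) and the candidate L_t(x) = tanh(T - t) x^2 are assumptions chosen for illustration.

```python
import math

def hjb_residual(L, f, c, x, t, actions, alpha=0.0, h=1e-5):
    """Right-hand side of the HJB equation at (x, t):
    min over a of c(x,a) + dL/dt + f(x,a) * dL/dx - alpha * L,
    with derivatives approximated by central differences."""
    dL_dt = (L(x, t + h) - L(x, t - h)) / (2 * h)
    dL_dx = (L(x + h, t) - L(x - h, t)) / (2 * h)
    return min(c(x, a) + dL_dt + f(x, a) * dL_dx - alpha * L(x, t) for a in actions)

T = 1.0
candidate = lambda x, t: math.tanh(T - t) * x * x   # assumed candidate solution
actions = [-3.0 + 0.001 * i for i in range(6001)]   # action grid on [-3, 3]

residual = hjb_residual(
    candidate,
    f=lambda x, a: a,
    c=lambda x, a: x * x + a * a,
    x=0.7, t=0.3, actions=actions,
)
print(residual)
```

A residual near zero at several (x, t) points is consistent with the candidate solving the HJB equation; it is a spot check, not a proof.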

THEOREM 1. SUPPOSE THE COST FUNCTION $C_t(x, \Pi)$ OF A POLICY $\Pi$ SATISFIES THE HJB EQUATION FOR ALL $x$ AND $t$. THEN $\Pi$ IS OPTIMAL.

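PROOF. LET $\tilde{x}_t, \tilde{a}_t$ BE THE STATES AND ACTIONS FROM SOME OTHER POLICY $\tilde{\Pi}$. SINCE $\tilde{a}_t$ NEED NOT ATTAIN THE MINIMUM IN THE HJB EQUATION,

$$\frac{d}{dt}\Big[ e^{-\alpha t} C_t(\tilde{x}_t, \Pi) \Big] = e^{-\alpha t} \Big[ \partial_t C_t(\tilde{x}_t, \Pi) + f(\tilde{x}_t, \tilde{a}_t)\, \partial_x C_t(\tilde{x}_t, \Pi) - \alpha C_t(\tilde{x}_t, \Pi) \Big] \geq -e^{-\alpha t} c(\tilde{x}_t, \tilde{a}_t)$$

INTEGRATING OVER $[0, T]$ GIVES

$$C_0(x_0, \Pi) - e^{-\alpha T} C_T(\tilde{x}_T, \Pi) \leq \int_0^T e^{-\alpha t} c(\tilde{x}_t, \tilde{a}_t)\, dt$$

SINCE $C_T(x, \Pi) = c(x, T)$, THIS IS

$$C_0(x_0, \Pi) \leq \int_0^T e^{-\alpha t} c(\tilde{x}_t, \tilde{a}_t)\, dt + e^{-\alpha T} c(\tilde{x}_T, T) = C_0(x_0, \tilde{\Pi}) \qquad \square$$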
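The optimality claim can be illustrated numerically on a small example. The sketch below is an assumption-laden illustration rather than part of the proof: it takes the scalar problem dx/dt = a with cost x^2 + a^2, no discounting and T = 1, for which the feedback a = -tanh(T - t) x has a cost function tanh(T - t) x^2 satisfying the HJB equation, and compares its Euler-simulated cost with that of rescaled feedback gains. The helper `policy_cost` and the scale factors are hypothetical choices.

```python
import math

def policy_cost(gain, x0=1.0, T=1.0, n_steps=20_000):
    """Euler estimate of int_0^T (x^2 + a^2) dt for dx/dt = a with a = -gain(t) * x."""
    dt = T / n_steps
    x, total = x0, 0.0
    for k in range(n_steps):
        a = -gain(k * dt) * x          # linear feedback action
        total += (x * x + a * a) * dt  # running cost
        x += a * dt                    # Euler step of the plant equation
    return total

T = 1.0
# Candidate optimal policy for this example.
optimal = policy_cost(lambda t: math.tanh(T - t))
# Other policies: the same gain rescaled by an arbitrary factor s.
others = [policy_cost(lambda t, s=s: s * math.tanh(T - t)) for s in (0.0, 0.5, 1.5, 2.0)]
print(optimal, others)
```

Each alternative policy should come out more expensive than the candidate, whose cost should be close to tanh(1) for x_0 = 1.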

LINEAR QUADRATIC REGULARIZATION (CONTINUOUS TIME)

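PROOF. FOR THE PLANT EQUATION $\dot{x}_t = A x_t + B a_t$ WITH COST $c(x, a) = x^\top Q x + a^\top R a$ AND NO DISCOUNTING, THE HJB EQUATION IS

$$0 = \min_a \left\{ x^\top Q x + a^\top R a + \partial_t L_t(x) + (Ax + Ba)^\top \partial_x L_t(x) \right\}$$

WE GUESS $L_t(x) = x^\top \Lambda(t) x$ WITH $\Lambda(t)$ SYMMETRIC, SO THAT $\partial_x L_t(x) = 2 \Lambda(t) x$ AND $\partial_t L_t(x) = x^\top \dot{\Lambda}(t) x$.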

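SUBSTITUTING GIVES

$$0 = \min_a \left\{ a^\top R a + 2 x^\top \Lambda(t) B a + x^\top Q x + x^\top \dot{\Lambda}(t) x + 2 x^\top \Lambda(t) A x \right\}$$

SETTING THE GRADIENT IN $a$, $2Ra + 2B^\top \Lambda(t) x$, TO ZERO, THE MINIMIZER IS $a = -R^{-1} B^\top \Lambda(t) x$.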

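PROOF (CONT.) FINALLY, SUBSTITUTING $a = -R^{-1} B^\top \Lambda(t) x$ BACK GIVES

$$0 = x^\top \left[ Q + \dot{\Lambda}(t) + \Lambda(t) A + A^\top \Lambda(t) - \Lambda(t) B R^{-1} B^\top \Lambda(t) \right] x$$

WHICH, HOLDING FOR ALL $x$, IMPLIES THE RICCATI EQUATION MUST HOLD:

$$\dot{\Lambda}(t) = -Q - \Lambda(t) A - A^\top \Lambda(t) + \Lambda(t) B R^{-1} B^\top \Lambda(t) \qquad \square$$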
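The Riccati equation, written as dΛ/dt = -(Q + ΛA + AᵀΛ - Λ B R⁻¹ Bᵀ Λ), can be integrated backward in time from a terminal condition. The sketch below uses a hand-written classical Runge-Kutta (RK4) scheme with NumPy; the function names, the zero terminal condition Λ(T) = 0 and the scalar test case (A = 0, B = Q = R = 1, T = 1) are assumptions for illustration, not part of the notes.

```python
import numpy as np

def riccati_rhs(Lam, A, B, Q, R_inv):
    """d(Lambda)/dt = -(Q + Lambda A + A^T Lambda - Lambda B R^{-1} B^T Lambda)."""
    return -(Q + Lam @ A + A.T @ Lam - Lam @ B @ R_inv @ B.T @ Lam)

def solve_riccati_backward(A, B, Q, R, Lam_T, T, n_steps=1000):
    """Integrate the Riccati ODE from t = T down to t = 0 with classical RK4."""
    R_inv = np.linalg.inv(R)
    h = -T / n_steps  # negative step: march backward in time
    Lam = np.array(Lam_T, dtype=float)
    for _ in range(n_steps):
        k1 = riccati_rhs(Lam, A, B, Q, R_inv)
        k2 = riccati_rhs(Lam + 0.5 * h * k1, A, B, Q, R_inv)
        k3 = riccati_rhs(Lam + 0.5 * h * k2, A, B, Q, R_inv)
        k4 = riccati_rhs(Lam + h * k3, A, B, Q, R_inv)
        Lam = Lam + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return Lam  # approximation of Lambda(0)

# Hypothetical scalar test case: dx/dt = a, cost x^2 + a^2, Lambda(T) = 0.
A = np.array([[0.0]])
B = np.array([[1.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
Lam0 = solve_riccati_backward(A, B, Q, R, Lam_T=np.zeros((1, 1)), T=1.0)
gain0 = np.linalg.inv(R) @ B.T @ Lam0  # feedback at t = 0 is a = -gain0 @ x
print(Lam0, gain0)
```

In this scalar case the equation reduces to dΛ/dt = Λ^2 - 1 with Λ(1) = 0, whose solution is Λ(t) = tanh(1 - t), which gives a convenient check on the integrator.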
