8 CONTINUOUS TIME DYNAMIC PROGRAMMING
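DEFINITION
THE STATE $x_t$ EVOLVES UNDER ACTIONS $a_t \in \mathcal{A}$ ACCORDING TO
$$\frac{dx_t}{dt} = f(x_t, a_t).$$
OBJECTIVE: MAXIMIZE THE SUM OF REWARDS
$$L_t(x) = \max \left\{ \int_t^T c(x_s, a_s)\, ds + c_T(x_T) \right\} \quad \text{OVER } a_s \in \mathcal{A},\ s \in [t, T],\ \text{WITH } x_t = x.$$
$L_t(x)$ IS THE VALUE FUNCTION.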
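THE HAMILTON-JACOBI-BELLMAN EQUATION (THE HJB EQUATION)
$$0 = \max_{a \in \mathcal{A}} \left\{ c(x, a) + \partial_t L_t(x) + f(x, a) \cdot \partial_x L_t(x) \right\}$$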
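A HEURISTIC DERIVATION OF THE HJB EQN: WE USE TAYLOR APPROXIMATIONS. FOR A SMALL TIME STEP $\delta$,
$$x_{t+\delta} = x_t + \delta f(x_t, a_t) + o(\delta)$$
AND THE OBJECTIVE IS APPROXIMATELY
$$\sum_{t \in \{0, \delta, \dots, T-\delta\}} \delta\, c(x_t, a_t) + c_T(x_T).$$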
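THE BELLMAN EQUATION FOR THE DISCRETE-TIME DYNAMIC PROGRAM WITH REWARDS $\delta\, c(x_t, a_t)$ IS
$$L_t(x) = \max_{a \in \mathcal{A}} \left\{ \delta\, c(x, a) + L_{t+\delta}\big(x + \delta f(x, a)\big) \right\}.$$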
NOTICE THAT
$$L_{t+\delta}\big(x + \delta f(x, a)\big) = L_t(x) + \delta\, \partial_t L_t(x) + \delta f(x, a) \cdot \partial_x L_t(x) + o(\delta).$$
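SO, SUBTRACTING $L_t(x)$ FROM BOTH SIDES, DIVIDING BY $\delta$ AND LETTING $\delta \to 0$, THE BELLMAN EQUATION BECOMES
$$0 = \max_{a \in \mathcal{A}} \left\{ c(x, a) + \partial_t L_t(x) + f(x, a) \cdot \partial_x L_t(x) \right\}.$$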
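The discrete-time Bellman recursion used in this derivation can be run directly as a rough numerical scheme. The sketch below is only an illustration under stated assumptions: the toy problem ($dx/dt = a$, reward rate $-(x^2 + a^2)$, no terminal reward, $T = 1$), the grids, and the names `solve_bellman`, `f`, `c` and `interp` are all hypothetical choices that do not come from the notes. For this toy problem, substituting into the HJB equation shows that $L_t(x) = -\tanh(T - t)\,x^2$, which gives a reference value to compare against.

```python
import math


def f(x, a):
    """Toy plant equation dx/dt = a (an assumption for illustration)."""
    return a


def c(x, a):
    """Toy reward rate -(x^2 + a^2) (an assumption for illustration)."""
    return -(x * x + a * a)


def solve_bellman(T=1.0, delta=0.02, x_max=2.0, h=0.025, a_max=3.0, da=0.1):
    """Backward induction on a grid for
    L_t(x) = max_a { delta * c(x, a) + L_{t+delta}(x + delta * f(x, a)) }.
    Returns the state grid and the approximate values L_0(x) on it."""
    n = round(2 * x_max / h)                      # number of state intervals
    xs = [-x_max + i * h for i in range(n + 1)]
    m = round(2 * a_max / da)                     # number of action intervals
    actions = [-a_max + j * da for j in range(m + 1)]
    L = [0.0] * (n + 1)                           # L_T(x) = 0: no terminal reward

    def interp(values, x):
        # Linear interpolation on the state grid, clamped at the boundary.
        u = min(max((x + x_max) / h, 0.0), float(n))
        i = min(int(u), n - 1)
        w = u - i
        return (1.0 - w) * values[i] + w * values[i + 1]

    for _ in range(round(T / delta)):
        # One Bellman step backward in time; the old L is used on the right.
        L = [
            max(delta * c(x, a) + interp(L, x + delta * f(x, a)) for a in actions)
            for x in xs
        ]
    return xs, L


xs, L0 = solve_bellman()
i = min(range(len(xs)), key=lambda k: abs(xs[k] - 1.0))
print("numerical L_0(1):", L0[i])
print("closed form -tanh(1):", -math.tanh(1.0))
```

The two printed numbers should agree only approximately, since the scheme carries errors from the time step, the action grid and the interpolation; shrinking `delta`, `h` and `da` should tighten the match at the price of a longer run time.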
THEOREM 1: SUPPOSE $L_t(x)$ FOR A POLICY $\Pi$ SATISFIES THE HJB EQUATION, THEN $\Pi$ IS OPTIMAL.
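PROOF: LET $\tilde{x}_t, \tilde{a}_t$ BE FROM SOME OTHER POLICY. SINCE THE MAXIMUM IN THE HJB EQUATION IS AT LEAST THE VALUE AT $\tilde{a}_t$,
$$0 \ge c(\tilde{x}_t, \tilde{a}_t) + \partial_t L_t(\tilde{x}_t) + f(\tilde{x}_t, \tilde{a}_t) \cdot \partial_x L_t(\tilde{x}_t) = c(\tilde{x}_t, \tilde{a}_t) + \frac{d}{dt} L_t(\tilde{x}_t).$$
INTEGRATING GIVES
$$L_T(\tilde{x}_T) - L_0(x_0) \le -\int_0^T c(\tilde{x}_t, \tilde{a}_t)\, dt.$$
SINCE $L_T(x) = c_T(x)$, THIS SAYS
$$\int_0^T c(\tilde{x}_t, \tilde{a}_t)\, dt + c_T(\tilde{x}_T) \le L_0(x_0),$$
WITH EQUALITY UNDER $\Pi$. □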
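The inequality in this proof can be checked numerically on a small example. The sketch below is an illustration only, under assumptions that are not in the notes: the toy problem $dx/dt = a$ with reward rate $-(x^2 + a^2)$, no terminal reward, $T = 1$, $x_0 = 1$, and the hypothetical names `total_reward`, `hjb_policy` and `alternatives`. For this problem $L_t(x) = -\tanh(T - t)\,x^2$ satisfies the HJB equation with maximizing action $a = -\tanh(T - t)\,x$, so every other policy should earn no more than $L_0(x_0) = -\tanh(1)$.

```python
import math


def total_reward(policy, x0=1.0, T=1.0, steps=10000):
    """Euler simulation of dx/dt = a with reward rate -(x^2 + a^2).
    Returns the accumulated reward over [0, T] (no terminal reward)."""
    dt = T / steps
    x, total = x0, 0.0
    for k in range(steps):
        a = policy(k * dt, x)
        total += -(x * x + a * a) * dt   # reward collected over one step
        x += a * dt                      # plant equation, one Euler step
    return total


def hjb_policy(t, x, T=1.0):
    """Action attaining the max in the HJB equation for the toy problem."""
    return -math.tanh(T - t) * x


# Hypothetical competing policies, chosen only for comparison.
alternatives = {
    "a = 0": lambda t, x: 0.0,
    "a = -x": lambda t, x: -x,
    "a = -2x": lambda t, x: -2.0 * x,
}

best = total_reward(hjb_policy)
print("HJB policy:", best, " L_0(x_0) =", -math.tanh(1.0))
for name, policy in alternatives.items():
    print(name, ":", total_reward(policy))
```

Each alternative should report a strictly smaller total than the HJB policy, whose total should sit close to $-\tanh(1) \approx -0.7616$ up to the Euler discretization error.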
LINEAR QUADRATIC REGULARIZATION (CONTINUOUS TIME)
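PROOF: FOR THE PLANT EQUATION $dx_t/dt = A x_t + B a_t$ WITH COST RATE $x^\top Q x + a^\top R a$ TO BE MINIMIZED, THE HJB EQN IS
$$0 = \min_a \left\{ x^\top Q x + a^\top R a + \partial_t L_t(x) + (A x + B a)^\top \partial_x L_t(x) \right\}.$$
WE GUESS $L_t(x) = x^\top \Lambda(t)\, x$ WITH $\Lambda(t)$ SYMMETRIC, SO THAT $\partial_x L_t(x) = 2 \Lambda(t)\, x$ AND $\partial_t L_t(x) = x^\top \dot{\Lambda}(t)\, x$.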
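SUBSTITUTING GIVES
$$0 = \min_a \left\{ a^\top R a + 2 x^\top \Lambda(t) B a + x^\top Q x + x^\top \dot{\Lambda}(t)\, x + 2 x^\top \Lambda(t) A x \right\}.$$
THE MINIMUM IS ATTAINED AT $a = -R^{-1} B^\top \Lambda(t)\, x$, AS ABOVE.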
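PROOF CONT. FINALLY, SUBSTITUTING BACK $a = -R^{-1} B^\top \Lambda(t)\, x$ GIVES
$$0 = x^\top \left[ Q + \dot{\Lambda}(t) + \Lambda(t) A + A^\top \Lambda(t) - \Lambda(t) B R^{-1} B^\top \Lambda(t) \right] x$$
WHICH, HOLDING FOR ALL $x$, IMPLIES THE RICCATI EQUATION
$$\dot{\Lambda}(t) = -Q - \Lambda(t) A - A^\top \Lambda(t) + \Lambda(t) B R^{-1} B^\top \Lambda(t)$$
MUST HOLD. □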
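As a small worked illustration, the Riccati equation can be integrated backward in time from a terminal condition. The sketch below makes several assumptions that are not in the notes: it treats the scalar case only (so $A$, $B$, $Q$, $R$ are plain numbers), it takes $\Lambda(T) = 0$ (no terminal cost), it uses a classical fourth-order Runge-Kutta step, and the names `riccati_rhs` and `solve_riccati` are hypothetical. With $A = 0$ and $B = Q = R = 1$ the equation reads $\dot{\Lambda} = \Lambda^2 - 1$, whose solution with $\Lambda(T) = 0$ is $\Lambda(t) = \tanh(T - t)$, which gives a closed form to compare against.

```python
import math


def riccati_rhs(lam, A, B, Q, R):
    """Scalar Riccati right-hand side: dLambda/dt = -Q - 2*A*Lambda + Lambda^2 * B^2 / R."""
    return -Q - 2.0 * A * lam + lam * lam * B * B / R


def solve_riccati(A, B, Q, R, T, steps=1000, lam_T=0.0):
    """Integrate from Lambda(T) = lam_T back to t = 0 with RK4; returns Lambda(0).
    lam_T = 0 is an assumption (no terminal cost)."""
    h = -T / steps          # negative step: march from T down to 0
    lam = lam_T
    for _ in range(steps):
        k1 = riccati_rhs(lam, A, B, Q, R)
        k2 = riccati_rhs(lam + 0.5 * h * k1, A, B, Q, R)
        k3 = riccati_rhs(lam + 0.5 * h * k2, A, B, Q, R)
        k4 = riccati_rhs(lam + h * k3, A, B, Q, R)
        lam += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return lam


A, B, Q, R, T = 0.0, 1.0, 1.0, 1.0, 1.0
lam0 = solve_riccati(A, B, Q, R, T)
print("Lambda(0):", lam0, " tanh(T):", math.tanh(T))

# Feedback action at time 0 for state x = 1, from a = -R^{-1} B Lambda x.
x = 1.0
print("a at t = 0:", -(B / R) * lam0 * x)
```

For matrix-valued $\Lambda(t)$ the same backward integration applies with matrix products in place of the scalar ones; a general-purpose ODE solver could be swapped in for the hand-written RK4 step.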