8 CONTINUOUS TIME DYNAMIC PROGRAMMING

DYNAMIC PROGRAMMING · 08/03/2020


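ABSTRACT DEFINITION

DYNAMICS: $\dot{x}_t = \dfrac{dx_t}{dt} = f(x_t, a_t)$, WITH ACTIONS $a_t \in \mathcal{A}$.

OBJECTIVE: MAXIMIZE sum of REWARDS

$$L(x) = \max \int_0^T r(x_t, a_t)\,dt + r(x_T) \quad \text{OVER } a_t \in \mathcal{A},\ t \in \mathbb{R}_+, \text{ WITH } x_0 = x.$$

VALUE FUNCTION: $L_\tau(x)$ IS THE OPTIMAL OBJECTIVE WHEN THE INTEGRAL STARTS AT TIME $\tau$ FROM $x_\tau = x$:

$$L_\tau(x) = \max \int_\tau^T r(x_t, a_t)\,dt + r(x_T).$$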
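As a rough numerical reading of the objective above, the sketch below approximates the integral of rewards along one trajectory with a forward Euler scheme. The name `evaluate_policy`, the feedback form `policy(x, t)`, and the toy choices $f(x,a) = a$, reward $-(x^2 + a^2)$ and zero terminal reward are assumptions made for illustration; they are not taken from the slides.

```python
import math

def evaluate_policy(f, r, r_final, policy, x0, T, steps=10_000):
    """Approximate int_0^T r(x_t, a_t) dt + r_final(x_T) by forward Euler."""
    dt = T / steps
    x, total = x0, 0.0
    for k in range(steps):
        t = k * dt
        a = policy(x, t)          # action chosen by the policy at time t
        total += r(x, a) * dt     # accumulate the running reward
        x += f(x, a) * dt         # Euler step of dx/dt = f(x, a)
    return total + r_final(x)     # add the terminal reward

# Hypothetical toy problem: dx/dt = a, reward -(x^2 + a^2), policy a = -x.
value = evaluate_policy(
    f=lambda x, a: a,
    r=lambda x, a: -(x * x + a * a),
    r_final=lambda x: 0.0,
    policy=lambda x, t: -x,
    x0=1.0,
    T=1.0,
)
# For this policy x_t = exp(-t), so the exact objective is -(1 - exp(-2)).
exact = -(1.0 - math.exp(-2.0))
```

Comparing `value` with `exact` gives a quick check that the step size is small enough before trying other policies.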


THE HAMILTON-JACOBI-BELLMAN EQUATION (THE HJB EQUATION)

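WITH COSTS $c(x,a)$ IN PLACE OF REWARDS, THE OBJECTIVE IS MINIMIZED AND $L_t(x)$ IS THE MINIMAL COST FROM TIME $t$ WITH $x_t = x$. THE HJB EQUATION IS

$$0 = \min_{a \in \mathcal{A}} \left\{ c(x,a) + \partial_t L_t(x) + f(x,a)\,\partial_x L_t(x) \right\}.$$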


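A HEURISTIC DERIVATION OF THE HJB EQN

WE USE TAYLOR APPROXIMATIONS. FOR SMALL $\delta > 0$,

$$x_{t+\delta} \approx x_t + \delta f(x_t, a_t),$$

$$C(x_0, a) \approx \sum_{t \in \{0, \delta, \dots, T-\delta\}} \delta\, c(x_t, a_t) + c(x_T).$$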

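THE BELLMAN EQUATION FOR THE DYNAMIC PROGRAM WITH COSTS $\delta c$ AND DYNAMICS $x_{t+\delta} = x_t + \delta f(x_t, a_t)$ IS

$$L_t(x) = \min_{a} \left\{ \delta\, c(x,a) + L_{t+\delta}\big(x + \delta f(x,a)\big) \right\}.$$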


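NOTICE THAT

$$\frac{L_{t+\delta}\big(x + \delta f(x,a)\big) - L_t(x)}{\delta} \xrightarrow[\delta \to 0]{} \partial_t L_t(x) + f(x,a)\,\partial_x L_t(x).$$

SO, SUBTRACTING $L_t(x)$, DIVIDING BY $\delta$ AND LETTING $\delta \to 0$, THE BELLMAN EQUATION BECOMES

$$0 = \min_{a} \left\{ c(x,a) + \partial_t L_t(x) + f(x,a)\,\partial_x L_t(x) \right\}.$$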
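The limit above can be checked numerically on a toy problem. The sketch below assumes $f(x,a) = a$, cost $c(x,a) = x^2 + a^2$, zero terminal cost and horizon $T = 1$, for which $L_t(x) = \tanh(T - t)\,x^2$ solves the HJB equation. These choices and the names `L` and `bellman_gap` are illustrative assumptions, not part of the slides. It applies one discrete Bellman backup with step $\delta$, minimizing over a grid of actions, and reports the mismatch divided by $\delta$.

```python
import math

T = 1.0  # assumed horizon for the toy problem

def L(t, x):
    """Assumed closed-form value function of the toy problem."""
    return math.tanh(T - t) * x * x

def bellman_gap(t, x, delta, n_actions=4001, a_max=2.0):
    """(min_a {delta*c(x,a) + L(t+delta, x + delta*f(x,a))} - L(t,x)) / delta."""
    best = math.inf
    for i in range(n_actions):
        a = -a_max + 2.0 * a_max * i / (n_actions - 1)  # grid over actions
        cost = delta * (x * x + a * a)                   # delta * c(x, a)
        best = min(best, cost + L(t + delta, x + delta * a))
    return (best - L(t, x)) / delta

# The gap should shrink roughly in proportion to delta.
gaps = [bellman_gap(0.3, 1.0, d) for d in (0.1, 0.01, 0.001)]
```

A gap that shrinks with $\delta$ is what the heuristic derivation predicts: the discrete Bellman equation and the HJB equation agree to first order in $\delta$.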


THEOREM: SUPPOSE $C_t(x, \Pi)$, THE COST FROM TIME $t$ FOR A POLICY $\Pi$, SATISFIES THE HJB EQUATION FOR ALL $x, t$. THEN $\Pi$ IS OPTIMAL.

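PROOF: LET $\tilde{x}_t, \tilde{\pi}_t$ BE FROM SOME OTHER POLICY. SINCE $C_t(x, \Pi)$ SATISFIES THE HJB EQUATION, WRITING $C = C_t(\tilde{x}_t, \Pi)$,

$$-\frac{d}{dt} C_t(\tilde{x}_t, \Pi) = -\partial_t C - f(\tilde{x}_t, \tilde{\pi}_t)\,\partial_x C \le c(\tilde{x}_t, \tilde{\pi}_t).$$

INTEGRATING GIVES

$$C_0(x_0, \Pi) - C_T(\tilde{x}_T, \Pi) \le \int_0^T c(\tilde{x}_t, \tilde{\pi}_t)\,dt.$$

SINCE $C_T(\tilde{x}_T, \Pi) = c(\tilde{x}_T)$ IS THE TERMINAL COST,

$$C_0(x_0, \Pi) \le C_0(x_0, \tilde{\pi}).$$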
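A small numerical illustration of the theorem, under assumed toy data: dynamics $dx/dt = a$, cost $x^2 + a^2$, zero terminal cost, $T = 1$. For this problem the feedback $a = -\tanh(T - t)\,x$ is the one whose cost $\tanh(T - t)\,x^2$ satisfies the HJB equation. The helper name `cost_of_policy` and the constant-gain competitors $a = -kx$ are hypothetical choices for the comparison.

```python
import math

T = 1.0  # assumed horizon

def cost_of_policy(policy, x0=1.0, steps=20_000):
    """Forward-Euler estimate of int_0^T (x^2 + a^2) dt under dx/dt = a."""
    dt = T / steps
    x, total = x0, 0.0
    for k in range(steps):
        a = policy(x, k * dt)
        total += (x * x + a * a) * dt
        x += a * dt
    return total  # terminal cost is zero in this toy problem

# Policy derived from the assumed HJB solution L_t(x) = tanh(T - t) x^2.
hjb_cost = cost_of_policy(lambda x, t: -math.tanh(T - t) * x)

# Some other policies: constant linear feedback a = -k x.
other_costs = {
    k: cost_of_policy(lambda x, t, k=k: -k * x) for k in (0.0, 0.5, 1.0, 2.0)
}
```

In this example `hjb_cost` should be close to $\tanh(1)$ and below every entry of `other_costs`, in line with $C_0(x_0, \Pi) \le C_0(x_0, \tilde{\pi})$.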


LINEAR QUADRATIC REGULARIZATION: Continuous time


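PROOF: THE HJB EQ IS

$$0 = \min_{a} \left\{ x^\top Q x + a^\top R a + \partial_t L_t(x) + (Ax + Ba)^\top \partial_x L_t(x) \right\}.$$

WE GUESS $L_t(x) = x^\top \Lambda(t)\, x$, SO $\partial_x L_t(x) = 2\Lambda(t)\,x$ AND $\partial_t L_t(x) = x^\top \dot{\Lambda}(t)\, x$.

SUBSTITUTING GIVES

$$0 = \min_{a} \left\{ a^\top R a + 2x^\top \Lambda(t) B a + x^\top Q x + x^\top \dot{\Lambda}(t)\, x + 2x^\top \Lambda(t) A x \right\}.$$

THE MINIMUM IS ATTAINED AT $a^* = -R^{-1} B^\top \Lambda(t)\, x$, AS ABOVE.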


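PROOF CONT.: FINALLY, SUBSTITUTING BACK $a^*$ GIVES

$$0 = x^\top \left[ Q + \dot{\Lambda}(t) + \Lambda(t) A + A^\top \Lambda(t) - \Lambda(t) B R^{-1} B^\top \Lambda(t) \right] x,$$

WHICH IMPLIES THE RICCATI EQUATION MUST HOLD. $\square$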
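To see the Riccati equation in action, the sketch below integrates it backwards in time for the scalar special case, where it reads $-\dot{\Lambda} = Q + 2A\Lambda - (B^2/R)\Lambda^2$. The terminal condition $\Lambda(T) = 0$ (no terminal cost), the classical fourth-order Runge-Kutta scheme, and the function name `riccati_backward` are assumptions for illustration; the slides do not specify a numerical method.

```python
import math

def riccati_backward(A, B, Q, R, T, lam_T=0.0, steps=1000):
    """Scalar Riccati ODE integrated from t = T back to t = 0 with RK4.

    Returns [Lambda(t_0), ..., Lambda(t_N)] on the uniform grid t_k = k*T/steps.
    """
    dt = T / steps

    def rhs(lam):
        # dLambda/ds in reversed time s = T - t
        return Q + 2.0 * A * lam - (B * B / R) * lam * lam

    lam = lam_T
    values = [lam]
    for _ in range(steps):
        k1 = rhs(lam)
        k2 = rhs(lam + 0.5 * dt * k1)
        k3 = rhs(lam + 0.5 * dt * k2)
        k4 = rhs(lam + dt * k3)
        lam += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        values.append(lam)
    values.reverse()  # values[0] is now Lambda(0), values[-1] is Lambda(T)
    return values

# Toy data: A = 0, B = Q = R = 1, T = 1, where Lambda(t) = tanh(T - t).
lams = riccati_backward(A=0.0, B=1.0, Q=1.0, R=1.0, T=1.0)
gain_at_0 = -(1.0 / 1.0) * lams[0]  # feedback a* = -(B/R) * Lambda(0) * x
```

For matrix-valued problems the same backward integration applies to the full equation, with $\Lambda(t)$ a symmetric matrix and the gain given by $-R^{-1} B^\top \Lambda(t)$.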
