127
Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGill University Joint work: Ashutosh Nayyar (UIUC) and Demosthenis Teneketzis (Univ of Michigan) GERAD Seminar, April 23, 2012

Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Optimal decentralized stochastic control:A common information approach

Aditya MahajanMcGill University

Joint work: Ashutosh Nayyar (UIUC) and Demosthenis Teneketzis (Univ of Michigan)

GERAD Seminar, April 23, 2012

Page 2: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Common theme:

multi-stage multi-agent decision

making under uncertainty

Page 3: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Interconnected Power Systems

Topics

Aditya Mahajan Optimal decentralized stochastic control

Page 4: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Interconnected Power Systems

Topics

Aditya Mahajan Optimal decentralized stochastic control1

Region 1 Region 2

Page 5: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Interconnected Power Systems

Topics

Aditya Mahajan Optimal decentralized stochastic control1

Region 1 Region 2

Controller 1 Controller 2

Page 6: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Interconnected Power Systems

Topics

Aditya Mahajan Optimal decentralized stochastic control1

Region 1 Region 2

Controller 1 Controller 2

Interconnect

Page 7: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Interconnected Power Systems

Topics

Aditya Mahajan Optimal decentralized stochastic control1

Region 1 Region 2

Controller 1 Controller 2

Interconnect

Communication

Page 8: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Interconnected Power Systems

Topics

Aditya Mahajan Optimal decentralized stochastic control1

Region 1 Region 2

Controller 1 Controller 2

Interconnect

Communication

Challenges

How to coordinate?

When, what, and how to communicate?

Page 9: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Sensor and Surveillance Networks

Topics

Aditya Mahajan Optimal decentralized stochastic control2

Page 10: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Sensor and Surveillance Networks

Topics

Aditya Mahajan Optimal decentralized stochastic control2

Limited resources

Page 11: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Sensor and Surveillance Networks

Topics

Aditya Mahajan Optimal decentralized stochastic control2

Limited resources Noisy observations

Page 12: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Sensor and Surveillance Networks

Topics

Aditya Mahajan Optimal decentralized stochastic control2

Fusion Center

Limited resources Noisy observations

Communication

Page 13: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Sensor and Surveillance Networks

Topics

Aditya Mahajan Optimal decentralized stochastic control2

Fusion Center

Limited resources Noisy observations

Communication

Challenges

Real-time communication

Scheduling measurements and communication

Detect node failures

Page 14: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Networked Control Systems

Topics

Aditya Mahajan Optimal decentralized stochastic control3

Page 15: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Networked Control Systems

Topics

Aditya Mahajan Optimal decentralized stochastic control3

Page 16: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Networked Control Systems

Topics

Aditya Mahajan Optimal decentralized stochastic control3

Page 17: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Networked Control Systems

Topics

Aditya Mahajan Optimal decentralized stochastic control3

Challenges

Control and communication over networks

(internet ⇒ delay, wireless ⟹ losses)

Page 18: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Networked Control Systems

Topics

Aditya Mahajan Optimal decentralized stochastic control3

Challenges

Control and communication over networks

(internet ⇒ delay, wireless ⟹ losses)

Distributed estimation

Page 19: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Networked Control Systems

Topics

Aditya Mahajan Optimal decentralized stochastic control3

Challenges

Control and communication over networks

(internet ⇒ delay, wireless ⟹ losses)

Distributed estimation

Distribued learning

Page 20: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Salient features indecentralized decision making

Topics

Aditya Mahajan Optimal decentralized stochastic control4

Multiple decision makersDecisions made by multiple controllers in a stochastic environment

Page 21: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Salient features indecentralized decision making

Topics

Aditya Mahajan Optimal decentralized stochastic control4

Multiple decision makersDecisions made by multiple controllers in a stochastic environment

Coordination issuesAll controllers must coordinate to achieve a system-wide objective

Page 22: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Salient features indecentralized decision making

Topics

Aditya Mahajan Optimal decentralized stochastic control4

Multiple decision makersDecisions made by multiple controllers in a stochastic environment

Coordination issuesAll controllers must coordinate to achieve a system-wide objective

Communication issuesControllers can communicate either directly or indirectly

Page 23: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Salient features indecentralized decision making

Topics

Aditya Mahajan Optimal decentralized stochastic control4

Multiple decision makersDecisions made by multiple controllers in a stochastic environment

Coordination issuesAll controllers must coordinate to achieve a system-wide objective

Communication issuesControllers can communicate either directly or indirectly

RobustnessSystem model may not be completely known

Page 24: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Outline of this talk

Topics

Aditya Mahajan Optimal decentralized stochastic control5

Decentralized stochastic controlClassification and examples

Solution approachesA common information based approach

Delayed sharing information structureStructure of optimal strategies and dynamic programming decomposition

Concluding remarksGeneralizations and Connection to other results

Page 25: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Outline of this talk

Topics

Aditya Mahajan Optimal decentralized stochastic control6

Decentralized stochastic controlClassification and examples

Solution approachesA common information based approach

Delayed sharing information structureStructure of optimal strategies and dynamic programming decomposition

Concluding remarksGeneralizations and Connection to other results

Page 26: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Classification of decentralized systems

Topics

Aditya Mahajan Optimal decentralized stochastic control7

Controllers/agents are coupled in two ways:1. Coupling due to cost/utility

2. Coupling due to dynamics

Page 27: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Classification of decentralized systems

Topics

Aditya Mahajan Optimal decentralized stochastic control7

Controllers/agents are coupled in two ways:1. Coupling due to cost/utility

2. Coupling due to dynamics

Decentralized systems may be classified according to:

Page 28: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Classification of decentralized systems

Topics

Aditya Mahajan Optimal decentralized stochastic control7

Controllers/agents are coupled in two ways:1. Coupling due to cost/utility

2. Coupling due to dynamics

Decentralized systems may be classified according to:1. Objective

Team vs Games

Page 29: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Classification of decentralized systems

Topics

Aditya Mahajan Optimal decentralized stochastic control7

Controllers/agents are coupled in two ways:1. Coupling due to cost/utility

2. Coupling due to dynamics

Decentralized systems may be classified according to:1. Objective

Team vs Games

2. DynamicsStatic vs Dynamic

Page 30: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Classification of decentralized systems

Topics

Aditya Mahajan Optimal decentralized stochastic control7

Controllers/agents are coupled in two ways:1. Coupling due to cost/utility

2. Coupling due to dynamics

Decentralized systems may be classified according to:1. Objective

Team vs Games

2. DynamicsStatic vs Dynamic

This talk will focus on Dynamic Teams

Page 31: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Classification of decentralized systems

Topics

Aditya Mahajan Optimal decentralized stochastic control7

Controllers/agents are coupled in two ways:1. Coupling due to cost/utility

2. Coupling due to dynamics

Decentralized systems may be classified according to:1. Objective

Team vs Games

2. DynamicsStatic vs Dynamic

This talk will focus on Dynamic TeamsStudied in economics and systems and control since the mid 50s.

Unlike games, agents have no incentive to cheat.

Instead of equilibrium, we seek globally optimal strategies.

Page 32: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Why is decentralized

stochastic control difficult?

Page 33: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of centralized static optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control8

� = [ • • • • ]

Page 34: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of centralized static optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control8

� = [ • • • • ]= 1 1 2 2

Page 35: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of centralized static optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control8

� = [ • • • • ]= 1 1 2 2

= ∈ { , , }

Page 36: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of centralized static optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control8

� = [ • • • • ]= 1 1 2 2

= ∈ { , , }

, = • • • •= • • • •= • • • •= ��[ , ]

Page 37: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of centralized static optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control8

� = [ • • • • ]= 1 1 2 2

= ∈ { , , }

, = • • • •= • • • •= • • • •= ��[ , ]Brute force search min� , | | = | || | = 9 possibilities.

Page 38: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of centralized static optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control8

� = [ • • • • ]= 1 1 2 2

= ∈ { , , }

, = • • • •= • • • •= • • • •= ��[ , ]Brute force search min� , | | = | || | = 9 possibilities.

Systematic search + = 6 possibilities= =min �[ , | = ] min �[ , | = ]

Page 39: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of centralized static optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control8

� = [ • • • • ]= 1 1 2 2

= ∈ { , , }

, = • • • •= • • • •= • • • •= ��[ , ]Brute force search min� , | | = | || | = 9 possibilities.

(functional opt.)

Systematic search + = 6 possibilities (parametric opt.)= =min �[ , | = ] min �[ , | = ]

Page 40: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of decentralizedstatic optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control9

� = [ • • • • ]

Page 41: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of decentralizedstatic optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control9

� = [ • • • • ]= 1 1 2 2= 2 1 1 2

Page 42: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of decentralizedstatic optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control9

� = [ • • • • ]= 1 1 2 2= 2 1 1 2= ∈ { , , } = ℎ ∈ { , }

Page 43: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of decentralizedstatic optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control9

� = [ • • • • ]= 1 1 2 2= 2 1 1 2= ∈ { , , } = ℎ ∈ { , }

, , = • • • • • • • •= • • • • • • • •= • • • • • • • •=, ℎ = ��,ℎ[ , , ]

Page 44: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of decentralizedstatic optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control9

� = [ • • • • ]= 1 1 2 2= 2 1 1 2= ∈ { , , } = ℎ ∈ { , }

, , = • • • • • • • •= • • • • • • • •= • • • • • • • •=, ℎ = ��,ℎ[ , , ]

Brute force search min�,ℎ , ℎ , | | = | || |, |ℎ| = | || |,9 × = 6 possibilities.

Page 45: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of decentralizedstatic optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control9

� = [ • • • • ]= 1 1 2 2= 2 1 1 2= ∈ { , , } = ℎ ∈ { , }

, , = • • • • • • • •= • • • • • • • •= • • • • • • • •=, ℎ = ��,ℎ[ , , ]

Brute force search min�,ℎ , | | = | || |, |ℎ| = | || |,9 × = 6 possibilities.

For one controller/agent to choose an optimal action, it must

second guess the other controller’s/agent’s policy

Page 46: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of decentralizedstatic optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control9

� = [ • • • • ]= 1 1 2 2= 2 1 1 2= ∈ { , , } = ℎ ∈ { , }

, , = • • • • • • • •= • • • • • • • •= • • • • • • • •=, ℎ = ��,ℎ[ , , ]

Orthogonal search1. Suppose ℎ is fixed: min�ℎ[ , , | = ], = , , .

2. Suppose is fixed: min��[ , , | = ], = , .

Page 47: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of decentralizedstatic optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control9

� = [ • • • • ]= 1 1 2 2= 2 1 1 2= ∈ { , , } = ℎ ∈ { , }

, , = • • • • • • • •= • • • • • • • •= • • • • • • • •=, ℎ = ��,ℎ[ , , ]

Orthogonal search yields person-by-person opt strategy

1. Suppose ℎ is fixed: min�ℎ[ , , | = ], = , , .

2. Suppose is fixed: min��[ , , | = ], = , .

Page 48: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

To find globally optimal strategies,

in general, we cannot do

better than brute force search

Page 49: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of centralizedmulti-stage optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control10

Page 50: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of centralizedmulti-stage optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control10

==

Page 51: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of centralizedmulti-stage optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control10

= = ∈ { , }=

Page 52: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of centralizedmulti-stage optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control10

= = ∈ { , }== ⟹ = 1 1 2 2= ⟹ = 1 1 2 2= ⟹ = 1 2 2 1= ⟹ = 1 2 2 1

Page 53: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of centralizedmulti-stage optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control10

= = ∈ { , }== ⟹ = 1 1 2 2 = , , ∈ { , }= ⟹ = 1 1 2 2= ⟹ = 1 2 2 1= ⟹ = 1 2 2 1

Page 54: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of centralizedmulti-stage optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control10

= = ∈ { , }== ⟹ = 1 1 2 2 = , , ∈ { , }= ⟹ = 1 1 2 2= ⟹ = 1 2 2 1= ⟹ = 1 2 2 1 , + ,, = �� ,� [ , + , ]

Page 55: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of centralizedmulti-stage optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control10

= = ∈ { , }= = { }= ⟹ = 1 1 2 2 = , , ∈ { , }= ⟹ = 1 1 2 2 = { , , }= ⟹ = 1 2 2 1= ⟹ = 1 2 2 1 , + ,, = �� ,� [ , + , ]

Page 56: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of centralizedmulti-stage optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control10

= = ∈ { , }= = { }= ⟹ = 1 1 2 2 = , , ∈ { , }= ⟹ = 1 1 2 2 = { , , }= ⟹ = 1 2 2 1= ⟹ = 1 2 2 1 , + ,, = �� ,� [ , + , ]Critical Assumption: Centralized information ⊆

Page 57: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Solution approach for centralizedmulti-stage optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control11

Brute force search min� ,� , .| | = | || |, | | = | || |×| |×|� |. × = possiblities.

Page 58: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Solution approach for centralizedmulti-stage optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control11

Brute force search min� ,� , .| | = | || |, | | = | || |×| |×|� |. × = possiblities.

Dynamic programming decomposition= min �[ , | , ]= min �[ , + | , ]

Page 59: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Solution approach for centralizedmulti-stage optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control11

Brute force search min� ,� , . (functional opt.)| | = | || |, | | = | || |×| |×|� |. × = possiblities.

Dynamic programming decomposition (parametric opt.)= min �[ , | , ]= min �[ , + | , ]

Page 60: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Solution approach for centralizedmulti-stage optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control11

Brute force search min� ,� , . (functional opt.)| | = | || |, | | = | || |×| |×|� |. × = possiblities.

Dynamic programming decomposition (parametric opt.)= min �[ , | , ]= min �[ , + | , ]Step 1 works because ℙ | does not depend on .

Step 2 works because ℙ | , does not depend on .

Page 61: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Solution approach for centralizedmulti-stage optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control11

Brute force search min� ,� , . (functional opt.)| | = | || |, | | = | || |×| |×|� |. × = possiblities.

Dynamic programming decomposition (parametric opt.)= min �[ , | , ]= min �[ , + | , ]Step 1 works because ℙ | does not depend on .

Step 2 works because ℙ | , does not depend on .

Both steps work because ⊆

Page 62: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of decentralizedmulti-stage optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control12

= = ∈ { , }= = { }= ⟹ = 1 1 2 2 = ∈ { , }= ⟹ = 1 1 2 2 = { }= ⟹ = 1 2 2 1= ⟹ = 1 2 2 1 , + ,, = �� ,� [ , + , ]

Page 63: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An example of decentralizedmulti-stage optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control12

= = ∈ { , }= = { }= ⟹ = 1 1 2 2 = ∈ { , }= ⟹ = 1 1 2 2 = { }= ⟹ = 1 2 2 1= ⟹ = 1 2 2 1 , + ,, = �� ,� [ , + , ]Critical Assumption: Decentralized information ⊈

Can we do better than brute force search?

Page 64: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Usual Dynamic programming does not work?

Topics

Aditya Mahajan Optimal decentralized stochastic control13

≟ min�� [ , | , ]≟ min�� [ , + | , ]

Page 65: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Usual Dynamic programming does not work?

Topics

Aditya Mahajan Optimal decentralized stochastic control13

≟ min�� [ , | , ]≟ min�� [ , + | , ]A sequential decomposition is possible (Witsenhausen, 1973)

Define � = ℙ | : − .� = min�� ���[ , + + � + | � ]But, the worst case complexity remains the same.

Page 66: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Can we obtain a systematic

approach to find optimal

strategies that does better

than brute force search?

Page 67: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Outline of this talk

Topics

Aditya Mahajan Optimal decentralized stochastic control14

Decentralized stochastic controlClassification and examples

Solution approachesA common information based approach

Delayed sharing information structureStructure of optimal strategies and dynamic programming decomposition

Concluding remarksGeneralizations and Connection to other results

Page 68: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

The intrinsic model forcontrolled dynamical systems

Topics

Aditya Mahajan Optimal decentralized stochastic control15

Ω,ℱ, �

Dynamical Model

Page 69: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

The intrinsic model forcontrolled dynamical systems

Topics

Aditya Mahajan Optimal decentralized stochastic control15

Ω,ℱ, �Dynamical

system

Controller

Dynamical Model

Page 70: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

The intrinsic model forcontrolled dynamical systems

Topics

Aditya Mahajan Optimal decentralized stochastic control15

Ω,ℱ, �Dynamical

system

Controller

Dynamical Model

Page 71: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

The intrinsic model forcontrolled dynamical systems

Topics

Aditya Mahajan Optimal decentralized stochastic control15

Ω,ℱ, �Dynamical

system

Controller

Dynamical Model

Page 72: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

The intrinsic model forcontrolled dynamical systems

Topics

Aditya Mahajan Optimal decentralized stochastic control15

Ω,ℱ, �Dynamical

system

Controller

Ω,ℱ, �

Dynamical Model Intrinsic Model

Page 73: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

The intrinsic model forcontrolled dynamical systems

Topics

Aditya Mahajan Optimal decentralized stochastic control15

Ω,ℱ, �Dynamical

system

Controller

Ω,ℱ, � −

+ +Dynamical Model Intrinsic Model

Page 74: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

The intrinsic model forcontrolled dynamical systems

Topics

Aditya Mahajan Optimal decentralized stochastic control15

Ω,ℱ, �Dynamical

system

Controller

Ω,ℱ, � −

+ +all obs data

: −

Dynamical Model Intrinsic Model

Page 75: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

The intrinsic model forcontrolled dynamical systems

Topics

Aditya Mahajan Optimal decentralized stochastic control15

Ω,ℱ, �Dynamical

system

Controller

Ω,ℱ, � −

+ +

: −

Dynamical Model Intrinsic Model

Page 76: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Information state and a general solutionapproach for centralized stochastic systems

Topics

Aditya Mahajan Optimal decentralized stochastic control16

In a centralized system, i.e., ⊆ + , a

function � = � is an information

state if it satisfies:

1. The controller Markov property��[� + | , ] = �[� + | � , ]2. The expected cost property��[ | , ] = �[ | � , ]

Page 77: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Information state and a general solutionapproach for centralized stochastic systems

Topics

Aditya Mahajan Optimal decentralized stochastic control16

In a centralized system, i.e., ⊆ + , a

function � = � is an information

state if it satisfies:

1. The controller Markov property��[� + | , ] = �[� + | � , ]2. The expected cost property��[ | , ] = �[ | � , ]

Info-state in MDPs: current state

Info-state in POMDPs:

posterior belief on current state

Page 78: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Information state and a general solutionapproach for centralized stochastic systems

Topics

Aditya Mahajan Optimal decentralized stochastic control16

In a centralized system, i.e., ⊆ + , a

function � = � is an information

state if it satisfies:

1. The controller Markov property��[� + | , ] = �[� + | � , ]2. The expected cost property��[ | , ] = �[ | � , ]

Info-state in MDPs: current state

Info-state in POMDPs:

posterior belief on current state

Structure of optimal strategyRestricting attention to control strategies

of the form = �is without any loss.

Page 79: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Information state and a general solutionapproach for centralized stochastic systems

Topics

Aditya Mahajan Optimal decentralized stochastic control16

In a centralized system, i.e., ⊆ + , a

function � = � is an information

state if it satisfies:

1. The controller Markov property��[� + | , ] = �[� + | � , ]2. The expected cost property��[ | , ] = �[ | � , ]

Info-state in MDPs: current state

Info-state in POMDPs:

posterior belief on current state

Structure of optimal strategyRestricting attention to control strategies

of the form = �is without any loss.

Search of optimal strategyAn optimal strategy of the form

above is given by the solution of the

following dynamic program:� = min� �[ + + � + | � , ]

Page 80: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

How do we define an information

state for a decentralized system?

Page 81: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Common Knowledge (Aumann, 1976)

Topics

Aditya Mahajan Optimal decentralized stochastic control17

Ω,ℱ, �

Page 82: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Common Knowledge (Aumann, 1976)

Topics

Aditya Mahajan Optimal decentralized stochastic control17

Ω,ℱ, � � ∩ �

Page 83: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Common Knowledge (Aumann, 1976)

Topics

Aditya Mahajan Optimal decentralized stochastic control17

Ω,ℱ, � � ∩ �

Page 84: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Common Knowledge (Aumann, 1976)

Topics

Aditya Mahajan Optimal decentralized stochastic control17

Ω,ℱ, � � ∩ �

Page 85: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Exploiting common knowledge tosimplify decentralized static optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control18

= , = ℎ, ℎ = ��,ℎ[ , , ]

Page 86: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Exploiting common knowledge tosimplify decentralized static optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control18

= , = ℎ, ℎ = ��,ℎ[ , , ]Let denote the common knowledge

between and . Write:≡ , , ≡ , ,= ˜ , . = ℎ̃ , .

Page 87: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Exploiting common knowledge tosimplify decentralized static optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control18

= , = ℎ, ℎ = ��,ℎ[ , , ]Let denote the common knowledge

between and . Write:≡ , , ≡ , ,= ˜ , . = ℎ̃ , .˜ : , ↦ , ˜ : ↦ ↦⏝⎵⏟⎵⏝�

Page 88: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Exploiting common knowledge tosimplify decentralized static optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control18

= , = ℎ, ℎ = ��,ℎ[ , , ]Let denote the common knowledge

between and . Write:≡ , , ≡ , ,= ˜ , . = ℎ̃ , .˜ : , ↦ , ˜ : ↦ ↦⏝⎵⏟⎵⏝�Let � ⋅ = ˜ , ⋅ and � ⋅ = ℎ̃ , ⋅

Page 89: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Exploiting common knowledge tosimplify decentralized static optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control18

= , = ℎ, ℎ = ��,ℎ[ , , ]Let denote the common knowledge

between and . Write:≡ , , ≡ , ,= ˜ , . = ℎ̃ , .˜ : , ↦ , ˜ : ↦ ↦⏝⎵⏟⎵⏝�Let � ⋅ = ˜ , ⋅ and � ⋅ = ℎ̃ , ⋅A common knowledge based solutionmin�,� ��,�[ , , | ]

Page 90: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Exploiting common knowledge tosimplify decentralized static optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control18

= , = ℎ, ℎ = ��,ℎ[ , , ]Let denote the common knowledge

between and . Write:≡ , , ≡ , ,= ˜ , . = ℎ̃ , .˜ : , ↦ , ˜ : ↦ ↦⏝⎵⏟⎵⏝�Let � ⋅ = ˜ , ⋅ and � ⋅ = ℎ̃ , ⋅A common knowledge based solution (functional opt. over smaller space)min�,� ��,�[ , , | ]

Page 91: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Exploiting common knowledge tosimplify decentralized static optimization

Topics

Aditya Mahajan Optimal decentralized stochastic control18

= , = ℎ, ℎ = ��,ℎ[ , , ]Let denote the common knowledge

between and . Write:≡ , , ≡ , ,= ˜ , . = ℎ̃ , .˜ : , ↦ , ˜ : ↦ ↦⏝⎵⏟⎵⏝�Let � ⋅ = ˜ , ⋅ and � ⋅ = ℎ̃ , ⋅A common knowledge based solution (functional opt. over smaller space)min�,� ��,�[ , , | ]Brute force: × possiblities. CK-based soln: ⋅ × possibilities.

Page 92: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Main idea: Extend CK-based

approach to decentralized

multi-stage systems.

Page 93: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Main idea: Extend CK-based

approach to decentralized

multi-stage systems.

Page 94: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

A common information based approachfor decentralized multi-stage systems

Topics

Aditya Mahajan Optimal decentralized stochastic control19

(Nayyar, 2010; Nayyar, Mahajan, Teneketzis, 2011)

Split data at each controller/agent into two parts:

Common information: = ⋂≥Private information: = ∖

Page 95: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

A common information based approachfor decentralized multi-stage systems

Topics

Aditya Mahajan Optimal decentralized stochastic control19

(Nayyar, 2010; Nayyar, Mahajan, Teneketzis, 2011)

Split data at each controller/agent into two parts:

Common information: = ⋂≥Private information: = ∖

Objective Choose = , to minimize

:� = �� :�[ , :� ]

Page 96: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

A common information based approachfor decentralized multi-stage systems

Topics

Aditya Mahajan Optimal decentralized stochastic control19

(Nayyar, 2010; Nayyar, Mahajan, Teneketzis, 2011)

Split data at each controller/agent into two parts:

Common information: = ⋂≥ ⊆ +Private information: = ∖

Objective Choose = , to minimize

:� = �� :�[ , :� ]

Page 97: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

A common information based approachfor decentralized multi-stage systems

Topics

Aditya Mahajan Optimal decentralized stochastic control19

(Nayyar, 2010; Nayyar, Mahajan, Teneketzis, 2011)

Split data at each controller/agent into two parts:

Common information: = ⋂≥ ⊆ +Private information: = ∖

Objective Choose = , to minimize

:� = �� :�[ , :� ]Solution approach

1. Construct a coordinated system (that has classical info-struct.)

2. Show that coordinated system ≡ original system.

3. Find a solution to coordinated system using centralized stoc. control.

4. Translate the result back to original system

Page 98: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

A common information based approachfor decentralized multi-stage systems

Topics

Aditya Mahajan Optimal decentralized stochastic control20

⋯ ⋯ ����

Page 99: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

A common information based approachfor decentralized multi-stage systems

Topics

Aditya Mahajan Optimal decentralized stochastic control20

⋯ ⋯ ����

Coordinator{…, , …}{…, � , …}

Prescription: � : ↦ ,

chosen according to� = , � : −= �

Page 100: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

A common information based approachfor decentralized multi-stage systems

Topics

Aditya Mahajan Optimal decentralized stochastic control20

⋯ ⋯ ����

Coordinator{…, , …}{…, � , …}

Prescription: � : ↦ ,

chosen according to� = , � : −= �The two systems are equivalent, = �⏟�� �,� :�−

Page 101: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

A common information based approachfor decentralized multi-stage systems

Topics

Aditya Mahajan Optimal decentralized stochastic control20

⋯ ⋯ ����

Coordinator{…, , …}{…, � , …}

Prescription: � : ↦ ,

chosen according to� = , � : −= �The two systems are equivalent, = �⏟�� �,� :�−Coordinated system is centralized

Find information state � .

Without loss of optimality, choose � = �Write DP in terms of � : � = min�� �[ ⋅ + + � + | � , � ]

Page 102: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

A common information based approachfor decentralized multi-stage systems

Topics

Aditya Mahajan Optimal decentralized stochastic control20

⋯ ⋯ ����

Coordinator{…, , …}{…, � , …}

Prescription: � : ↦ ,

chosen according to� = , � : −= �The two systems are equivalent, = �⏟�� �,� :�−Coordinated system is centralized

Find information state � .

Without loss of optimality, choose � = � ≡ = � ,Write DP in terms of � : � = min�� �[ ⋅ + + � + | � , � ]

Page 103: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Outline of this talk

Topics

Aditya Mahajan Optimal decentralized stochastic control21

Decentralized stochastic controlClassification and examples

Solution approachesA common information based approach

Delayed sharing information structureStructure of optimal strategies and dynamic programming decomposition

Concluding remarksGeneralizations and Connection to other results

Page 104: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Delayed sharing information structure

Topics

Aditya Mahajan Optimal decentralized stochastic control22

Sys

Obs channel

Obs channel

Controller 1

Controller 2

Page 105: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Delayed sharing information structure

Topics

Aditya Mahajan Optimal decentralized stochastic control22

Sys

Obs channel

Obs channel

Controller 1

Controller 2

+ = , : , = ℎ , �

Page 106: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Delayed sharing information structure

Topics

Aditya Mahajan Optimal decentralized stochastic control22

Sys

Obs channel

Obs channel

Controller 1

Controller 2

+ = , : , = ℎ , ��-step delayed info sharing Perfect recall at controller

Page 107: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Delayed sharing information structure

Topics

Aditya Mahajan Optimal decentralized stochastic control22

Sys

Obs channel

Obs channel

Controller 1

Controller 2

+ = , : , = ℎ , ��-step delayed info sharing Perfect recall at controller,:� = �� .:�[ , , ]

Page 108: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Literature Overview

Topics

Aditya Mahajan Optimal decentralized stochastic control23

(Witsenhausen, 1971):

Proposed delayed-sharing information structure.

Asserted a structure of optimal control law (without proof).

Page 109: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Literature Overview

Topics

Aditya Mahajan Optimal decentralized stochastic control23

(Witsenhausen, 1971):

Proposed delayed-sharing information structure.

Asserted a structure of optimal control law (without proof).

(Varaiya and Walrand, 1978):

Proved Witsenhausen’s assertion for � = .

Counter-example to disproved the assertion for delay � > .

Page 110: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Literature Overview

Topics

Aditya Mahajan Optimal decentralized stochastic control23

(Witsenhausen, 1971):

Proposed delayed-sharing information structure.

Asserted a structure of optimal control law (without proof).

(Varaiya and Walrand, 1978):

Proved Witsenhausen’s assertion for � = .

Counter-example to disproved the assertion for delay � > .

The result of one-step delayed sharing used in various applications:

Queueing theory: Kuri and Kumar, 1995

Communication networks: Altman et. al, 2009, Grizzle et. al, 1982

Stochastic games: Papavassilopoulos, 1982; Chang and Cruz, 1983

Economics: Li and Wu, 1991

Page 111: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Solution based on commoninformation approach

Topics

Aditya Mahajan Optimal decentralized stochastic control24

Common information = ,: −�, ,: −� .

Private information � = −�+ : , −�+ : −Control actions = , � , = , �

Page 112: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Solution based on commoninformation approach

Topics

Aditya Mahajan Optimal decentralized stochastic control24

Common information = ,: −�, ,: −� .

Private information � = −�+ : , −�+ : −Control actions = , � , = , �Coordinated System

Data observerd (increasing with time)

Control actions � , � , where � : � ↦

Page 113: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Solution based on commoninformation approach

Topics

Aditya Mahajan Optimal decentralized stochastic control24

Common information = ,: −�, ,: −� .

Private information � = −�+ : , −�+ : −Control actions = , � , = , �Coordinated System

Data observerd (increasing with time)

Control actions � , � , where � : � ↦Find a solution to the coordinated system and translate it back to

the original system.

Page 114: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

The coordinated system:state for I/O mapping

Topics

Aditya Mahajan Optimal decentralized stochastic control25

� �� , �

� �,

Page 115: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

The coordinated system:state for I/O mapping

Topics

Aditya Mahajan Optimal decentralized stochastic control25

� �� , �

� �,

State for I/O mapping: , � , �

Page 116: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Information state for coordinated system

Topics

Aditya Mahajan Optimal decentralized stochastic control26

The coordinated system is a centralized partially observed system.

Info state = ℙ state for I/O mapping | data at controller

Page 117: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Information state for coordinated system

Topics

Aditya Mahajan Optimal decentralized stochastic control26

The coordinated system is a centralized partially observed system.

Info state = ℙ state for I/O mapping | data at controller� = ℙ , � , � | , � , �Structural Result There is no loss of optimality in restricting

prescriptions of the form� = � and hence, = � , �

Page 118: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Information state for coordinated system

Topics

Aditya Mahajan Optimal decentralized stochastic control26

The coordinated system is a centralized partially observed system.

Info state = ℙ state for I/O mapping | data at controller� = ℙ , � , � | , � , �Structural Result There is no loss of optimality in restricting

prescriptions of the form� = � and hence, = � , �Dynamic Programming decomposition An optimal coordination strategy

is given by the solution to the following dynamic program� = min�� ,�� �[ , � � , � � + + � + | � , � , � ]

Page 119: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Information state for coordinated system

Topics

Aditya Mahajan Optimal decentralized stochastic control26

The coordinated system is a centralized partially observed system.

Info state = ℙ state for I/O mapping | data at controller� = ℙ , � , � | , � , �Structural Result There is no loss of optimality in restricting

prescriptions of the form� = � and hence, = � , �Dynamic Programming decomposition An optimal coordination strategy

is given by the solution to the following dynamic program� = min�� ,�� �[ , � � , � � + + � + | � , � , � ]Setting � , � = � � gives optimal control strategy.

Page 120: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

An easy solution to long

standing open problem

Page 121: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Outline of this talk

Topics

Aditya Mahajan Optimal decentralized stochastic control27

Decentralized stochastic controlClassification and examples

Solution approachesA common information based approach

Delayed sharing information structureStructure of optimal strategies and dynamic programming decomposition

Concluding remarksGeneralizations and Connection to other results

Page 122: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Connections

Topics

Aditya Mahajan Optimal decentralized stochastic control28

Many existing results on decentralized control are special casesDelayed state sharing (Aicardi et al, 1987)

Periodic sharing information structures (Ooi et al, 1997)

Control sharing (Bismut, 1972; Sandell and Athans, 1974; Mahajan 2011)

Finite sate memory controllers (Sandell, 1974, Mahajan, 2008)

Page 123: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Connections

Topics

Aditya Mahajan Optimal decentralized stochastic control28

Many existing results on decentralized control are special casesDelayed state sharing (Aicardi et al, 1987)

Periodic sharing information structures (Ooi et al, 1997)

Control sharing (Bismut, 1972; Sandell and Athans, 1974; Mahajan 2011)

Finite sate memory controllers (Sandell, 1974, Mahajan, 2008)

Generalization to other modelsInfinite horizon (discounted and average cost) models using

standard results for POMDPs

Computation algorithms based on algorithms for POMDPs

Extend results to systems with unknown models based on

Q-learning and adaptive control algorithms

Page 124: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Conclusion

Topics

Aditya Mahajan Optimal decentralized stochastic control29

Summary of the main ideaFind common information at the controllers

Look from the point of view of a coordinator that observes common

information and chooses prescriptions to the controllers

Find information state for the coordinated system and use it to set

up a dynamic program

Page 125: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Conclusion

Topics

Aditya Mahajan Optimal decentralized stochastic control29

Summary of the main ideaFind common information at the controllers

Look from the point of view of a coordinator that observes common

information and chooses prescriptions to the controllers

Find information state for the coordinated system and use it to set

up a dynamic program

Future DirectionsComputational algorithms

Connections with sequential games

Connections with large scale systems/mean field theory

Page 126: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

Thank you

Page 127: Optimal decentralized stochastic control: A common ...amahaj1/talks/gerad-2012.pdf · Optimal decentralized stochastic control: A common information approach Aditya Mahajan McGillUniversity

References

Topics

Aditya Mahajan Optimal decentralized stochastic control30

1. A. Nayyar, A. Mahajan, D. Teneketzis,

Optimal control strategies for delayed sharing information structures,

IEEE Trans. on Automatic Control, vol. 56, no. 7, pp. 1606-1620, July 2011.

2. A. Nayyar, A. Mahajan, D. Teneketzis,

Dynamic programming for decentralized stochastic control with partial

information sharing: a common information approach,

submitted to IEEE Trans. on Automatic Control, Dec 2011.