
Game Theoretic Learning and Social Influence

Jeff S. Shamma

IPAM Workshop on Decision Support for Traffic
November 16–20, 2015


Interconnected decisions

Traffic decision makers:

Users
Fleet managers
Infrastructure planners

cf., Ritter: “Traffic decision support”, Thursday AM

Satisfaction with decision depends on decisions of others.


Illustrations

Commuting:

Decision = path
Satisfaction depends on paths of others

Planning:

Decision = infrastructure allocation
Satisfaction depends on utilization


Viewpoint: Game theory

“... the study of mathematical models of conflict and cooperation between intelligent rational decision-makers”

Myerson (1991), Game Theory.

Extensive literature...

Game elements:

Players/Agents/Actors
Actions/Strategies/Choices
Preferences over joint choices


Outline

Game models

Influence models


Viewpoint: Game theory

Game elements:

Players/Agents/Actors
Actions/Strategies/Choices
Preferences over joint choices

Solution concept: What to expect?


Making decisions vs modeling decisions

Single decision maker:

Choice set: $x \in X$
Utility function: $U(x)$

$x = \arg\max_{x' \in X} U(x')$

Linear/Convex/Semidefinite/Integer/Dynamic/... programming

Good model?

Issues:

Complexity, randomness, incompleteness, framing, ...
Furthermore... are preferences even consistent with a utility function?


Modeling decisions in games

Elements: Players, choices, and preferences over joint choices:

$U_i(x) = U_i(x_i, x_{-i})$

Solution concept: What to expect?

Nash Equilibrium

Everyone’s choice is optimal given the choices of others.

Alternatives:

Bounded rationality models
Hannan consistency, correlated equilibrium, ...


Illustration: Congestion games (discrete)

Setup:

Players: $1, 2, \ldots, n$
Set of resources: $R = \{r_1, \ldots, r_m\}$ (roads)
Action sets: $A_i \subset 2^R$ (paths)
Joint action: $a = (a_1, a_2, \ldots, a_n)$

Cost (vs. utility):

Resource level: $c_r(a) = \phi_r(\sigma_r(a))$, where $\sigma_r(a)$ is the number of users of resource $r$
User level: $C_i(a_i, a_{-i}) = \sum_{r \in a_i} c_r(a)$

Nash equilibrium: $a^* = (a^*_1, \ldots, a^*_n)$ such that for every player $i$ and every $a'_i \in A_i$,

$C_i(a^*_i, a^*_{-i}) \le C_i(a'_i, a^*_{-i})$

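As a concrete companion to these definitions, here is a minimal Python sketch of a discrete congestion game with made-up latency functions (the resources, action sets, and $\phi_r$ below are illustrative, not from the talk). It computes user costs and checks whether a joint action is a pure Nash equilibrium by testing unilateral deviations.

    # Illustrative discrete congestion game: 3 players, resources are road segments.
    PHI = {"r1": lambda k: k,        # latency of r1 with k users
           "r2": lambda k: 2 * k,
           "r3": lambda k: k + 1}
    ACTIONS = [{"r1", "r2"}, {"r3"}]          # each path is a set of resources

    def loads(profile):
        """sigma_r(a): number of users of each resource under joint action 'profile'."""
        return {r: sum(1 for path in profile if r in path) for r in PHI}

    def user_cost(i, profile):
        """C_i(a) = sum over resources in player i's path of phi_r(sigma_r(a))."""
        sigma = loads(profile)
        return sum(PHI[r](sigma[r]) for r in profile[i])

    def is_nash(profile):
        """Pure NE check: no player can lower its cost by a unilateral path change."""
        for i in range(len(profile)):
            current = user_cost(i, profile)
            for alt in ACTIONS:
                trial = list(profile)
                trial[i] = alt
                if user_cost(i, trial) < current:
                    return False
        return True

    profile = [ACTIONS[0], ACTIONS[1], ACTIONS[1]]   # a candidate joint action
    print("costs:", [round(user_cost(i, profile), 2) for i in range(3)])
    print("pure NE:", is_nash(profile))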

Social influence: Equilibrium shaping

Claim: For such congestion games, NE minimizes

$P(a) = \sum_r \sum_{k=0}^{\sigma_r(a)} \phi_r(k)$

Overall congestion:

$G(a) = \sum_r \underbrace{\phi_r(\sigma_r(a))}_{\text{cost}} \cdot \underbrace{\sigma_r(a)}_{\#\,\text{users}}$

Claim: Modified resource cost

$\phi_r(\sigma_r(a)) + \underbrace{(\sigma_r(a) - 1) \cdot \big(\phi_r(\sigma_r(a)) - \phi_r(\sigma_r(a) - 1)\big)}_{\text{imposition toll}}$

results in NE that minimizes overall congestion.

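One way to see the second claim (a standard marginal-cost argument, spelled out here rather than on the slide): with the toll, a user on resource $r$ carrying load $k = \sigma_r(a)$ pays

\[
\phi_r(k) + (k-1)\big(\phi_r(k) - \phi_r(k-1)\big) \;=\; k\,\phi_r(k) - (k-1)\,\phi_r(k-1),
\]

which is exactly the increase in resource $r$'s contribution to $G$ caused by that user joining. Summing over $r \in a_i$, each user's tolled cost equals its marginal contribution to $G$, so $G$ becomes an exact potential of the tolled game and its minimizers are Nash equilibria.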

NE & Price of X

Discussion:

Model presumes NE as outcome
How does NE compare to the social planner optimum measured by G(·)?

Price-of-Anarchy (PoA):

$\mathrm{PoA} = \dfrac{\max_{a \in NE} G(a)}{\min_a G(a)}$

pessimistic ratio of performance at NE vs optimal performance

Price-of-Stability (PoS):

$\mathrm{PoS} = \dfrac{\min_{a \in NE} G(a)}{\min_a G(a)}$

optimistic ratio of performance at NE vs optimal performance

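For small games these ratios can be computed by brute force. The sketch below enumerates all joint actions of a toy two-player, two-road congestion game (the latency numbers are invented for illustration), finds the pure Nash equilibria, and reports PoA and PoS with G(a) taken as the sum of individual costs.

    from itertools import product

    ROADS = ["high", "low"]
    PHI = {"high": lambda k: 1.0 * k,     # illustrative latencies
           "low":  lambda k: 2.5 * k}

    def cost(i, profile):
        load = sum(1 for r in profile if r == profile[i])
        return PHI[profile[i]](load)

    def G(profile):
        return sum(cost(i, profile) for i in range(len(profile)))

    def is_nash(profile):
        for i in range(len(profile)):
            for r in ROADS:
                trial = list(profile)
                trial[i] = r
                if cost(i, trial) < cost(i, profile):
                    return False
        return True

    profiles = list(product(ROADS, repeat=2))
    nash = [p for p in profiles if is_nash(p)]
    opt = min(G(p) for p in profiles)
    poa = max(G(p) for p in nash) / opt
    pos = min(G(p) for p in nash) / opt
    print("NE set:", nash, " PoA =", round(poa, 2), " PoS =", round(pos, 2))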

Illustration: Equal cost sharing

Setup:

Equally shared cost of resource: $\dfrac{\alpha}{\#\,\text{users}}$ (vs. increasing congestion)
High road: $\alpha = n - \varepsilon$
Low road: $\alpha = 1$

[Diagram: source S to destination D connected by two parallel links, a High road and a Low road]

NE:

All use High road at individual cost $\dfrac{n-\varepsilon}{n} < 1$

All use Low road at individual cost $\dfrac{1}{n}$

PoX: $G(a)$ is the sum of individual costs

PoA ≈ n & PoS = 1

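To make the numbers explicit (a short check, filling in arithmetic the slide leaves implicit): both profiles are equilibria, since a lone deviator would bear the full cost of the other road. With all $n$ users on the High road, each pays $\frac{n-\varepsilon}{n}$ and the total cost is $G = n - \varepsilon$; with all on the Low road, each pays $\frac{1}{n}$ and $G = 1$. Hence

\[
\mathrm{PoA} = \frac{n-\varepsilon}{1} \approx n, \qquad \mathrm{PoS} = \frac{1}{1} = 1 .
\]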

Illustration: Equal cost sharing, cont.

Setup:

Equally shared cost as before
User-specific starting points

NE:

All use private resource at individual cost 1
All use shared resource at individual cost $\dfrac{k}{n}$

PoX: $G(a)$ is the sum of individual costs (total costs $n$ and $k$, respectively)

PoA = n/k & PoS = 1

Balcan, Blum, & Mansour (2013), “Circumventing the Price of Anarchy: Leading Dynamics to Good Behavior”


PoX analysis

Extensions: Broader solution concepts, various families of games, price-of-uncertainty, price-of-byzantine, price-of-...

$(\lambda, \mu)$-smoothness: For any two action profiles $a^*$ and $a'$,

$\sum_i C_i(a^*_i, a'_{-i}) \le \lambda \sum_i C_i(a^*) + \mu \sum_i C_i(a')$

Theorem: Under $(\lambda, \mu)$-smoothness,

$\mathrm{PoA} \le \dfrac{\lambda}{1 - \mu}$

Think of $a'$ as NE and $a^*$ as central optimum

Roughgarden (2009), “Intrinsic robustness of the price of anarchy”

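The bound follows from a short chain of inequalities (the standard smoothness argument, written out here for completeness). Let $a'$ be a Nash equilibrium and $a^*$ a cost-minimizing profile, and write $C(a) = \sum_i C_i(a)$:

\[
C(a') = \sum_i C_i(a') \le \sum_i C_i(a^*_i, a'_{-i}) \le \lambda\, C(a^*) + \mu\, C(a'),
\]

where the first inequality holds because at equilibrium no player can reduce its cost by deviating to $a^*_i$. Rearranging (for $\mu < 1$) gives $C(a') \le \frac{\lambda}{1-\mu}\, C(a^*)$, i.e., $\mathrm{PoA} \le \frac{\lambda}{1-\mu}$.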

Outline/Recap

Game models

Setup & equilibrium
Price-of-X

Influence models

Equilibrium shaping

Lingering issues:
Uncertain landscapes
Equilibrium analysis


Equilibrium shaping & uncertain landscapes

Marginal contribution utility:

Assume global objective, G(·)
Define “null” action ∅
Set

$U(a_i, a_{-i}) = G(a_i, a_{-i}) - G(\emptyset, a_{-i})$

Claim: PoS = 1

Recall:

$\phi_r(\sigma_r(a)) + \underbrace{\overbrace{\beta}^{\text{new term}} \, (\sigma_r(a) - 1) \cdot \big(\phi_r(\sigma_r(a)) - \phi_r(\sigma_r(a) - 1)\big)}_{\text{imposition toll } \tau(a)}$

Uncertain landscape: What if users have different β's?
...and many more sources of uncertainty

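A brief reason the PoS = 1 claim holds (the standard potential-game argument, added here for context): with the marginal-contribution utility,

\[
U(a_i, a_{-i}) - U(a'_i, a_{-i}) = G(a_i, a_{-i}) - G(a'_i, a_{-i}),
\]

so G is an exact potential of the game. Any profile maximizing G is then a Nash equilibrium, and the best equilibrium attains the global optimum, i.e., PoS = 1.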

Equilibrium shaping under uncertainty

$\phi_r(\sigma_r(a)) + \underbrace{\overbrace{\beta}^{\text{new term}} \, (\sigma_r(a) - 1) \cdot \big(\phi_r(\sigma_r(a)) - \phi_r(\sigma_r(a) - 1)\big)}_{\text{imposition toll } \tau(a)}$

Theorem: As $\kappa \to \infty$,

$\kappa \cdot \big(\phi_r(\sigma_r(a)) + \tau(a)\big)$

leads to PoA → 1

Extensions: Optimal bounded tolls in the special case of parallel links & affine costs.

Brown & Marden (2015), “Optimal mechanisms for robust coordination in congestion games”


Social influence: Mechanism design

Private info --D--> Social decision

vs

Private info --S--> Messages --M--> Social decision

A “mechanism” M is a rule from reports to decisions.

Mechanism M induces a game in reporting strategies.

Seek to implement D as solution of game, i.e.,

$D = M \circ S$?


Standard illustration: Sealed bid/second price auctions

Private info --D--> Social decision

vs

Private info --S--> Messages --M--> Social decision

Planner objective: Assign item to highest private valuation

Agent objective: Item value minus payment

Messages: Bids

Social decision:

Item to high bidder
Payment from high bidder = second-highest bid

Claim: Truthful bidding is a NE.

Special case of broad discussion...

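A small Python sketch of this mechanism (the bidder values and tested deviations are arbitrary; this is an illustration, not part of the talk). It implements the allocation and second-price payment rule and spot-checks that bidding one's true value is never worse than a handful of deviations.

    import random

    def second_price(bids):
        """Item to the highest bidder; winner pays the second-highest bid."""
        order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
        return order[0], bids[order[1]]          # (winner index, payment)

    def utility(value, my_bid, other_bids):
        """Payoff to bidder 0: item value minus payment if it wins, else 0."""
        winner, payment = second_price([my_bid] + list(other_bids))
        return value - payment if winner == 0 else 0.0

    random.seed(0)
    for _ in range(1000):
        value = random.uniform(0, 10)
        others = [random.uniform(0, 10) for _ in range(3)]
        truthful = utility(value, value, others)
        deviations = max(utility(value, b, others) for b in (0.0, value / 2, 2 * value, 10.0))
        assert truthful >= deviations - 1e-9      # truthful bidding is a best response here
    print("truthful bidding was never beaten in the sampled cases")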

Illustration: Sequential resource allocation

Private info --D--> Social decision

vs

Private info --S--> Messages --M--> Social decision

Private info is revealed sequentially

Agents do not know own valuations in advance

Decisions based on sequential messages


Example

Two users: {1, 2}
User's need for resource is low or high: $\theta_i \in \{L, H\}$
Planner allocates resource $a \in \{1, 2\}$
Planner objective: Fair allocation with User 2 priority

Dynamics: Coupled state transitions according to $4 \times 4$ matrices over the set

$\{(L, L), (H, L), (L, H), (H, H)\}$


Sequential resource allocation, cont

Private info --D--> Social decision

vs

Private info --S--> Messages --M--> Social decision

Planner objective: Induce truthful reporting

Agent objective: Future discounted resource access minus payments

$\displaystyle \sum_{t=0}^{\infty} \delta^t \Big( v_i\big(\pi(r_1^t, \ldots, r_n^t), \theta_i^t\big) - q_i^t(r_1^t, \ldots, r_n^t) \Big)$

Messages: High/Low resource need

Theorem: LP computations so that truthful reporting is a NE.

Kotsalis & Shamma (2013), “Dynamic mechanism design in correlated environments”


Example, cont

Optimal efficient policy: Favor Agent 2

$\pi(L, L) = 1/2,\ \pi(H, L) = 1,\ \pi(L, H) = 2,\ \pi(H, H) = 2$

Agent 2 can monopolize by misreporting

Payment rule: $q_1(\cdot, \cdot) = 0$

$q_2(L, L) < 0,\ q_2(L, H) < 0,\ q_2(H, L) < 0$

$q_2(H, H) > 0$

Ex ante payment from agent 2 = 0


Critique

Lingering issues:
Uncertain landscapes
Equilibrium analysis

Tuned for behavior only at specific Nash equilibrium

Presumes agents solve coordinated (dynamic) optimization

Presumes knowledge of full system dynamics available to all

Neglects model mismatch


Outline/Recap

Game models

Setup & equilibrium
Price-of-X

Influence models

Equilibrium shaping
Mechanism design

Lingering issues:
Uncertain landscapes
Equilibrium analysis


Learning/evolutionary games

Shift of focus:

Away from equilibrium (Nash equilibrium)

Towards how players might arrive at a solution, i.e., dynamics

“The attainment of equilibrium requires a disequilibrium process.”

Arrow, 1987.

“The explanatory significance of the equilibrium concept depends on the underlying dynamics.”

Skyrms, 1992.


Literature

Monographs:

Weibull, Evolutionary Game Theory, 1997.

Young, Individual Strategy and Social Structure, 1998.

Fudenberg & Levine, The Theory of Learning in Games, 1998.

Samuelson, Evolutionary Games and Equilibrium Selection, 1998.

Young, Strategic Learning and Its Limits, 2004.

Sandholm, Population Dynamics and Evolutionary Games, 2010.

Surveys:

Hart, “Adaptive heuristics”, Econometrica, 2005.

Fudenberg & Levine, “Learning and equilibrium”, Annual Review of Economics, 2009.


Stability & multi-agent learning

Caution! Single agent learning ≠ multi-agent learning

Sato, Akiyama, & Farmer, “Chaos in a simple two-person game”, PNAS, 2002.
Piliouras & JSS, “Optimization despite chaos: Convex relaxations to complete limit sets via Poincare recurrence”, SODA, 2014.


Best reply dynamics (with inertia)

$a_i(t) \in \begin{cases} B_i(a_{-i}(t-1)) & \text{w.p. } \rho \\ a_i(t-1) & \text{w.p. } 1 - \rho \end{cases}$, with $0 < \rho < 1$

Features:

Pure NE is a stationary point

Based on greedy response to myopic forecast:

$a^{\text{guess}}_{-i}(t) = a_{-i}(t-1)$?

Need not converge to NE

Theorem

For finite-improvement-property games under best reply with inertia, player strategies converge to NE.

(Includes anonymous congestion games...)

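The following Python sketch runs best reply with inertia on a toy two-road congestion game (the latency functions and parameters are invented for illustration). Each stage, every player best responds to the previous joint action with probability ρ and otherwise repeats its last action; the final profile is then checked against the pure NE condition.

    import random

    # Toy congestion game: n players pick one of two parallel roads (illustrative latencies).
    PHI = {"A": lambda k: 2.0 * k,       # road A latency with k users
           "B": lambda k: k + 3.0}       # road B latency with k users

    def cost(i, profile):
        """Cost to player i: latency of its chosen road under the joint action."""
        road = profile[i]
        load = sum(1 for r in profile if r == road)
        return PHI[road](load)

    def best_reply(i, profile):
        """Road minimizing player i's cost, holding the others fixed."""
        trials = []
        for road in PHI:
            trial = list(profile)
            trial[i] = road
            trials.append((cost(i, trial), road))
        return min(trials)[1]

    def run(n=6, rho=0.3, steps=200, seed=1):
        random.seed(seed)
        profile = [random.choice(list(PHI)) for _ in range(n)]
        for _ in range(steps):
            prev = list(profile)
            for i in range(n):
                # best respond with probability rho, otherwise keep the previous action
                profile[i] = best_reply(i, prev) if random.random() < rho else prev[i]
        return profile

    final = run()
    print("final profile:", final)
    print("pure NE reached:", all(final[i] == best_reply(i, final) for i in range(len(final))))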

Fictitious play

Each player:

Maintain empirical frequencies (histograms) of opposing actions
Forecasts (incorrectly) that others play independently according to observed empirical frequencies
Selects an action that maximizes expected payoff

Compare to best reply:

$a^{\text{guess}}_{-i}(t) = a_{-i}(t-1)$?

vs

$a^{\text{guess}}_{-i}(t) \sim q_{-i}(t) \in \Delta(A_{-i})$

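Here is a minimal two-player fictitious play sketch in Python (the Rock-Paper-Scissors payoff matrix is just a convenient test case; any bimatrix game fits the same template). Each player best responds to the opponent's empirical frequencies; for this zero-sum game the empirical frequencies approach the mixed NE (1/3, 1/3, 1/3).

    import numpy as np

    A = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]], dtype=float)   # payoffs to player 0 (RPS)
    B = -A                                                            # zero-sum: payoffs to player 1

    def fictitious_play(T=5000):
        counts = [np.ones(3), np.ones(3)]       # action counts, seeded with a uniform "prior"
        for _ in range(T):
            q1 = counts[1] / counts[1].sum()    # player 0's forecast of player 1
            q0 = counts[0] / counts[0].sum()    # player 1's forecast of player 0
            a0 = int(np.argmax(A @ q1))         # best response to the forecast
            a1 = int(np.argmax(B.T @ q0))
            counts[0][a0] += 1
            counts[1][a1] += 1
        return [c / c.sum() for c in counts]

    freqs = fictitious_play()
    print("empirical frequencies:", [np.round(f, 3) for f in freqs])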

Fictitious play: Convergence results

Theorem

For zero-sum games (1951), 2 × 2 games (1961), potential games (1996), and 2 × N games (2003) under fictitious play, player empirical frequencies converge to NE.

Theorem

For Shapley “fashion game” (1964), Jordan anti-coordination game (1993), Foster & Young merry-go-round game (1998) under fictitious play, player empirical frequencies DO NOT converge to NE.

Detail: Discussion extended to mixed/randomized NE


FP simulations

[Figure: two panels of empirical action frequencies (0 to 1) over 250 stages of fictitious play in Rock-Paper-Scissors]


FP simulations

Shapley “Fashion Game”

      R     G     B
R   0, 1  0, 0  1, 0
G   1, 0  0, 1  0, 0
B   0, 0  1, 0  0, 1


FP & large games

FP bookkeeping

Observe actions of all players

Construct probability distribution of all possible opponent configurations

Prohibitive for large games


Joint strategy FP

Modification:

Maintain empirical frequencies (histograms) of opposing actions

Forecasts (incorrectly) that others play according to the observed empirical frequencies of their joint actions (FP's independence assumption is dropped)

Selects an action that maximizes expected payoff


JSFP bookkeeping

Virtual payoff vector: The payoffs that could have been obtained

$U_i(t) = \begin{bmatrix} u_i(1, a_{-i}(t)) \\ u_i(2, a_{-i}(t)) \\ \vdots \\ u_i(m, a_{-i}(t)) \end{bmatrix}$

Time-averaged virtual payoff:

$V_i(t+1) = (1 - \rho)\, V_i(t) + \rho\, U_i(t)$

Stepsize ρ is either constant (fading) or diminishing (averaging)

Equivalent JSFP: At each stage, select best virtual payoff action

Viewpoint: Bookkeeping is oracle based (cf., traffic reports)

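A minimal Python sketch of this bookkeeping for a single player with m actions (the payoff table and the opponents' behavior below are placeholders, chosen only to make the recursion runnable). It maintains the virtual payoff vector with a constant stepsize and picks the best virtual-payoff action each stage.

    import numpy as np

    m = 3
    rng = np.random.default_rng(0)
    payoff_table = rng.uniform(0.0, 1.0, size=(m, m))   # hypothetical u_i(k, a_{-i})

    def virtual_payoffs(a_other):
        """U_i(t): payoffs player i could have obtained, one entry per own action."""
        return payoff_table[:, a_other]

    V = np.zeros(m)        # time-averaged virtual payoff vector V_i
    rho = 0.1              # constant stepsize (fading memory); use 1/(t+1) for plain averaging
    for t in range(500):
        a_other = int(rng.integers(m))                   # stand-in for the opponents' action
        V = (1 - rho) * V + rho * virtual_payoffs(a_other)
        action = int(np.argmax(V))                       # JSFP: best action against virtual payoffs
    print("final action:", action, "virtual payoffs:", np.round(V, 3))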

JSFP simulation

Anonymous congestion


Outline/Recap

Game models

Setup & equilibrium
Price-of-X
Learning/evolutionary games

Influence models

Equilibrium shaping
Mechanism design

Lingering issues:
Uncertain landscapes
Equilibrium analysis


Social influence: Sparse seeding

Public service advertising

Phase I:

Receptive agents: Follow planner's advice (e.g., with probability α)
Non-receptive agents: Unilateral best-response dynamics

Phase II: Receptive agents may revert to best-response dynamics

Main results: Desirable bounds on resulting PoA for various settings (anonymous congestion, shared cost, set coverage...)

Compare: Nash equilibrium vs learning agents

Balcan, Blum, & Mansour (2013), “Circumventing the Price of Anarchy: Leading Dynamics to Good Behavior”
Balcan, Krehbiel, Piliouras, and Shin (2012), “Minimally invasive mechanism design: Distributed covering with carefully chosen advice”


Outline/Recap

Game models

Setup & equilibrium
Price-of-X
Learning/evolutionary games

Influence models

Equilibrium shaping
Mechanism design
Dynamic incentives

Lingering issues:
Uncertain landscapes
Equilibrium analysis


Social influence & feedback control


Benefits of feedback (Astrom)

Reliable behavior from unreliable components (here: humans)

Mitigate disturbances

Shape dynamic behavior


Dynamic incentive challenges

Modeling:

Order?
Time-scale?
Heterogeneity?
Non-stationarity?
Resolution?
Social network effects?


“Thus unless we know quite a lot about the topology of interaction and the agents’ decision-making processes, estimates of the speed of adjustment could be off by many orders of magnitude.”

Young, “Social dynamics: Theory and applications”, 2001.


Reinforcement learning vs Trend-based reinforcement learning


Measurement & actuation


Recap/Outline/Conclusions

Game models

Setup & equilibrium
Price-of-X
Learning/evolutionary games

Influence models

Equilibrium shaping
Mechanism design
Dynamic incentives

Challenges

Modeling
Measurement & actuation
