8/20/2019 2014 - Lectures Notes on Game Theory - WIlliam H Sandholm
Lecture Notes on Game Theory∗
William H. Sandholm†
October 21, 2014
Contents
0 Basic Decision Theory
   0.1 Ordinal Utility
   0.2 Expected Utility and the von Neumann-Morgenstern Theorem
   0.3 Bayesian Rationality

1 Normal Form Games
   1.1 Basic Concepts
       1.1.1 Definition
       1.1.2 Randomized strategies and beliefs
   1.2 Dominance and Iterated Dominance
       1.2.1 Strictly dominant strategies
       1.2.2 Strictly dominated strategies
       1.2.3 Iterated strict dominance
       1.2.4 Weak dominance
   1.3 Rationalizability
       1.3.1 Definition and examples
       1.3.2 The separating hyperplane theorem
       1.3.3 A positive characterization
   1.4 Nash Equilibrium
       1.4.1 Definition
       1.4.2 Computing Nash equilibria
       1.4.3 Interpretations of Nash equilibrium
       1.4.4 Existence of Nash equilibrium and structure of the equilibrium set
   1.5 Correlated Equilibrium
∗Many thanks to Katsuhiko Aiba, Emin Dokumacı, Danqing Hu, Rui Li, Allen Long, Ignacio Monzón, Michael Rapp, and Ryoji Sawa for creating the initial draft of this document from my handwritten notes and various other primitive sources.
†Department of Economics, University of Wisconsin, 1180 Observatory Drive, Madison, WI 53706, USA. E-mail: whs@ssc.wisc.edu; website: http://www.ssc.wisc.edu/~whs.
       1.5.1 Definition and examples
       1.5.2 Interpretation
   1.6 The Minmax Theorem

2 Extensive Form Games
   2.1 Basic Concepts
       2.1.1 Defining extensive form games
       2.1.2 Pure strategies in extensive form games
       2.1.3 Randomized strategies in extensive form games
       2.1.4 Reduced normal form games
   2.2 The Principle of Sequential Rationality
   2.3 Games of Perfect Information and Backward Induction
       2.3.1 Subgame perfect equilibrium, sequential rationality, and backward induction
       2.3.2 Epistemic foundations for backward induction
       2.3.3 Applications
       2.3.4 Subgame perfect equilibrium in more general classes of games
   Interlude: Asymmetric Information, Economics, and Game Theory
   2.4 Games of Imperfect Information and Sequential Equilibrium
       2.4.1 Subgames and subgame perfection in games of imperfect information
       2.4.2 Beliefs and sequential rationality
       2.4.3 Definition of sequential equilibrium
       2.4.4 Computing sequential equilibria
       2.4.5 Existence of sequential equilibrium and structure of the equilibrium set
   2.5 Invariance and Proper Equilibrium
   2.6 Forward Induction
       2.6.1 Motivation and discussion
       2.6.2 Forward induction in signaling games
   2.7 Full Invariance and Kohlberg-Mertens Stability
       2.7.1 Fully reduced normal forms and full invariance
       2.7.2 KM stability and set-valued solution concepts

3 Bayesian Games
   3.1 Definition
   3.2 Interpretation
   3.3 Examples

4 Repeated Games
   4.1 The Repeated Prisoner's Dilemma
   4.2 Basic Concepts
   4.3 The Folk Theorem
   4.4 Computing the Set of Subgame Perfect Equilibrium Payoffs
       4.4.1 Dynamic programming
       4.4.2 Dynamic programs vs. repeated games
       4.4.3 Factorization and self-generation
   4.5 Simple and Optimal Penal Codes
0. Basic Decision Theory
0.1 Ordinal Utility
We consider a decision maker (or agent) who chooses among alternatives (or outcomes) in
some set Z. To begin we assume that Z is finite.
The primitive description of preferences is in terms of a preference relation ≽. For any
ordered pair of alternatives (x, y) ∈ Z × Z, the agent can tell us whether or not he weakly
prefers x to y. If yes, we write x ≽ y. If no, we write x ⋡ y. We can use these to define

strict preference: a ≻ b means [a ≽ b and b ⋡ a].
indifference: a ∼ b means [a ≽ b and b ≽ a].
We say that the preference relation ≽ is a weak order if it satisfies the two weak order axioms:

Completeness: For all a, b ∈ Z, either a ≽ b or b ≽ a (or both).
Transitivity: For all a, b, c ∈ Z, if a ≽ b and b ≽ c, then a ≽ c.
Completeness says that there are no alternatives that the agent is unwilling or unable to
compare. (Consider Z = {do nothing, save five lives by murdering a person chosen at
random}.)
Transitivity rules out preference cycles. (Consider Z = {a scoop of ice-cream, an enormous
hunk of chocolate cake, a small plain salad}.)
The function u : Z → R is an ordinal utility function that represents ≽ if

u(a) ≥ u(b) if and only if a ≽ b.

Theorem 0.1. Let Z be finite and let ≽ be a preference relation. Then there is an ordinal utility
function u : Z → R that represents ≽ if and only if ≽ is complete and transitive.
Moreover, the function u is unique up to increasing transformations: v : Z → R also represents
≽ if and only if v = f ◦ u for some increasing function f : R → R.
In the first part of the theorem, the “only if” direction follows immediately from the
fact that the real numbers are ordered. For the “if” direction, assign the elements of Z
utility values sequentially; the weak order axioms ensure that this can be done without
contradiction.
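The sequential-assignment argument can be sketched in code. This is an illustrative sketch, not part of the notes: `weakly_prefers` is a hypothetical stand-in for the relation ≽, which is assumed complete and transitive.

```python
from functools import cmp_to_key

def ordinal_utility(Z, weakly_prefers):
    """Build u : Z -> R representing a complete, transitive relation.

    weakly_prefers(a, b) should return True iff a is weakly preferred to b.
    """
    def cmp(a, b):
        ab, ba = weakly_prefers(a, b), weakly_prefers(b, a)
        if ab and ba:        # a ~ b: indifferent
            return 0
        return 1 if ab else -1  # strictly better alternatives sort later

    # Sort from worst to best; ties (indifference) receive equal utility.
    ordered = sorted(Z, key=cmp_to_key(cmp))
    u, value = {}, 0
    for i, z in enumerate(ordered):
        if i > 0:
            prev = ordered[i - 1]
            if not (weakly_prefers(z, prev) and weakly_prefers(prev, z)):
                value += 1  # strictly better than its predecessor
        u[z] = value
    return u

# Example: preferences induced by a fixed "taste" score (illustrative only)
score = {'salad': 0, 'cake': 2, 'sundae': 2}
prefers = lambda a, b: score[a] >= score[b]
u = ordinal_utility(score.keys(), prefers)
```

By construction, u(a) ≥ u(b) holds exactly when a ≽ b, and indifferent alternatives (here cake and sundae) receive the same utility value.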
“Ordinal” refers to the fact that only the order of the values of the utility function has
meaning. Neither the values nor differences between them convey information about
intensity of preferences. This is captured by the second part of the theorem, which says
that utility functions are only unique up to increasing transformations.
If Z is (uncountably) infinite, weak order is not enough to ensure that there is an ordinal
utility representation:
Example 0.2. Lexicographic preferences. Let Z = R^2, and suppose that a ≽ b ⇔ a1 > b1 or
[a1 = b1 and a2 ≥ b2]. In other words, the agent’s first priority is the first component of
the prize; he only uses the second component to break ties. While ≽ satisfies the weak
order axioms, it can be shown that there is no ordinal utility function that represents ≽.
In essence, there are too many levels of preference to fit them all into the real line. ♦
There are various additional assumptions that rule out such examples. One is
Continuity: Z ⊆ R^n, and for every a ∈ Z, the sets {b : b ≽ a} and {b : a ≽ b} are closed.
Notice that Example 0.2 violates this axiom.
Theorem 0.3. Let Z ⊆ R^n and let ≽ be a preference relation. Then there is a continuous
ordinal utility function u : Z → R that represents ≽ if and only if ≽ is complete, transitive, and
continuous.
In the next section we consider preferences over lotteries—probability distributions over
a finite set of prizes. Theorem 0.3 ensures that if preferences satisfy the weak order and
continuity axioms, then they can be represented by a continuous ordinal utility function.
By introducing an additional axiom, one can obtain a more discriminating representation.
0.2 Expected Utility and the von Neumann-Morgenstern Theorem
Now we consider preferences in settings with uncertainty: an agent chooses among
“lotteries” in which different alternatives in Z have different probabilities of being realized.

Example 0.4. Suppose you are offered a choice between
lottery 1: $1M for sure
lottery 2: $2M with probability 1/2, $0 with probability 1/2.

One tempting possibility is to look at expected values: the weighted averages of the
possible values, with weights given by probabilities.

lottery 1: $1M × 1 = $1M
lottery 2: $2M × 1/2 + $0M × 1/2 = $1M
But most people strictly prefer lottery 1.
The lesson: if outcomes are in dollars, ranking outcomes in terms of expected numbers of
dollars may not capture preferences. ♦
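Looking ahead to the expected utility framework of the next section, the preference for lottery 1 is consistent with maximizing the expectation of a concave (“risk-averse”) utility of money. A small sketch, with u(z) = √z as an arbitrary illustrative choice not taken from the notes:

```python
import math

def expected_utility(lottery, u):
    """Expected utility of a lottery given as {dollar outcome: probability}."""
    return sum(p * u(z) for z, p in lottery.items())

u = math.sqrt  # a concave utility of money: diminishing marginal value

lottery1 = {1_000_000: 1.0}
lottery2 = {2_000_000: 0.5, 0: 0.5}

# Both lotteries have expected value $1M, but under the concave u,
# lottery 1 yields strictly higher expected utility.
```

Any strictly concave u produces the same ranking here; √z is merely one convenient example.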
If Z is a finite set, then we let ∆Z represent the set of probability distributions over Z:
∆Z = {p : Z → R+ | Σ_{a∈Z} p(a) = 1}.
The objects a player must evaluate, p ∈ ∆Z, are distributions over an outcome set Z.
We can imagine he has preferences ≽, where “p ≽ q” means that he likes p at least as much
as q. When can these preferences be represented using numerical assessments of each p?
When can these assessments take the form of expected utilities?
Let Z be a finite set of alternatives, so that ∆Z is the set of lotteries over alternatives.
Example 0.5. Let Z = {$0, $10, $100}, so that a lottery is a vector p = (p($0), p($10), p($100)).
For example:

p = (.2, .8, 0), q = (.9, 0, .1), r = (0, 0, 1). ♦
Let ≽ be a preference relation on ∆Z, where p ≽ q means that lottery p ∈ ∆Z is weakly
preferred to lottery q ∈ ∆Z.

If p and q are lotteries and α ∈ [0, 1] is a scalar, then the compound lottery c = αp + (1 − α)q
is the lottery defined by c(z) = αp(z) + (1 − α)q(z) for all z ∈ Z.
Example 0.6. c = .7p + .3q = .7(.2, .8, 0) + .3(.9, 0, .1) = (.41, .56, .03). ♦
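The arithmetic of compound lotteries is easy to mechanize. A small sketch (the dictionary encoding of lotteries is this illustration's assumption, not the notes'):

```python
def compound(alpha, p, q):
    """Compound lottery alpha*p + (1 - alpha)*q over a common outcome set."""
    return {z: alpha * p[z] + (1 - alpha) * q[z] for z in p}

# The lotteries over Z = {$0, $10, $100} from Example 0.5
p = {0: 0.2, 10: 0.8, 100: 0.0}
q = {0: 0.9, 10: 0.0, 100: 0.1}

c = compound(0.7, p, q)  # approximately {0: 0.41, 10: 0.56, 100: 0.03}
```

Since each c(z) is a convex combination of probabilities, the result is again a lottery: its entries are nonnegative and sum to 1.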
Preference axioms:

(NM1) Weak order: ≽ is complete and transitive.

(NM2) Continuity: For all p, q, and r such that p ≻ q ≻ r, there exist δ, ε ∈ (0, 1) such that

(1 − δ)p + δr ≻ q ≻ (1 − ε)r + εp.
Example 0.7. p = win Nobel prize, q = nothing, r = get hit by a bus. (Since p, q and r are
supposed to be lotteries, we should really write p = win Nobel prize with probability 1,
etc.) ♦
(NM3) Independence: For all p, q, and r and all α ∈ (0, 1),

p ≽ q ⇔ αp + (1 − α)r ≽ αq + (1 − α)r.
Example 0.8. p = (.2, .8, 0), q = (.9, 0, .1), r = (0, 0, 1). Then

p̂ = .5p + .5r = (.1, .4, .5) and q̂ = .5q + .5r = (.45, 0, .55).

By (NM3), p ≽ q if and only if p̂ ≽ q̂. ♦
We say that u : Z → R provides an expected utility representation for the preference relation ≽ on ∆Z if

(1) p ≽ q ⇔ Σ_{z∈Z} u(z) p(z) ≥ Σ_{z∈Z} u(z) q(z).
The function u is then called a von Neumann-Morgenstern (or NM) utility function.
Theorem 0.9 (von Neumann and Morgenstern (1944)).
Let Z be a finite set, and let ≽ be a preference relation on ∆Z. Then there is an NM utility function
u : Z → R that provides an expected utility representation for ≽ if and only if ≽ satisfies (NM1),
(NM2), and (NM3).
Moreover, the function u is unique up to positive affine transformations. That is, v also satisfies
(1) if and only if v ≡ au + b for some a > 0 and b ∈ R.

If outcomes are money, u(x) need not equal x. If it does, we say that utility is linear in money
or that the player is risk neutral.
Discussion of Theorem 0.9
(i) The theorem tells us that as long as (NM1)–(NM3) hold, there is some way of
assigning numbers to the alternatives such that taking expected values of these
numbers is the right way to evaluate lotteries over alternatives.

(ii) The values of an NM utility function are sometimes called cardinal utilities (as
opposed to ordinal utilities). What more-than-ordinal information do cardinal
utilities provide? The nature of this information can be deduced from the fact that
an NM utility function is unique up to positive affine transformations.
Example 0.10. Let a, b, c ∈ Z, and suppose that u_a > u_c > u_b. Let λ = (u_c − u_b)/(u_a − u_b).
This quantity is not affected by positive affine transformations. Indeed, if v = αu + β, then

(v_c − v_b)/(v_a − v_b) = ((αu_c + β) − (αu_b + β))/((αu_a + β) − (αu_b + β)) = α(u_c − u_b)/(α(u_a − u_b)) = λ.

To interpret λ, rearrange its definition to obtain

u_c = λ u_a + (1 − λ) u_b.

This says that λ is the probability on a in a lottery over a and b that makes this lottery
exactly as good as getting c for sure. ♦
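A quick numerical check of this invariance, with arbitrarily chosen utility values (not from the notes):

```python
def ratio(u, a, b, c):
    """lambda = (u[c] - u[b]) / (u[a] - u[b]), the calibration probability."""
    return (u[c] - u[b]) / (u[a] - u[b])

u = {'a': 10.0, 'b': 2.0, 'c': 6.0}           # u_a > u_c > u_b
v = {z: 3.0 * x + 7.0 for z, x in u.items()}  # a positive affine transform

lam = ratio(u, 'a', 'b', 'c')  # here lambda = (6-2)/(10-2) = 0.5
# The ratio is unchanged under v, and u_c = lam*u_a + (1-lam)*u_b.
```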
0.3 Bayesian Rationality
In settings with uncertainty, where all relevant probabilities are objective and known, we
call an agent NM rational if he acts as if he is maximizing an NM expected utility function.
What if the probabilities are not given? We call an agent Bayesian rational (or say that he
has subjective expected utility preferences) if
(i) In settings with uncertainty, he forms beliefs describing the probabilities of all
relevant events.
(ii) When making decisions, he acts to maximize his expected utility given his beliefs.
(iii) After receiving new information, he updates his beliefs by taking conditional prob-
abilities whenever possible.
In game theory, it is standard to begin analyses with the assumption that players are
Bayesian rational.
Foundations for subjective expected utility preferences are obtained from state-space models
of uncertainty. These models begin with a set of possible states whose probabilities are
not given, and consider preferences over maps from states to outcomes. Savage (1954)
provides an axiomatization of subjective expected utility preferences in this framework.
Both the utility function and the assignment of probabilities to states are determined as
part of the representation. Anscombe and Aumann (1963) consider a state-space model in
which preferences are not over state-contingent alternatives, but over maps from states to
lotteries à la von Neumann-Morgenstern. This formulation allows for a much simpler
derivation of subjective expected utility preferences, and fits very naturally into game-theoretic
models. See Gilboa (2009) for a textbook treatment of these and more general
models of decision under uncertainty.
1. Normal Form Games
Game theory models situations in which multiple players make strategically interdepen-
dent decisions. Strategic interdependence means that your outcomes depend both on what
you do and on what others do.
This course focuses on noncooperative game theory, which works from the hypothesis that
agents act independently, each in his own self-interest. Cooperative game theory studies
situations in which subsets of the agents can make binding agreements.
We study some basic varieties of games and the connections among them:
1. Normal form games: moves are simultaneous
2. Extensive form games: moves take place over time
3. Bayesian games: players receive private information before play begins
4. Repeated games: a normal form game is played repeatedly, with all previous moves
being observed before each round of play
1.1 Basic Concepts
Example 1.1. Prisoner’s Dilemma.
Story 1: Two bankers are each asked to report on excessive risk taking by the other. If
neither reports such activity, both get a $2M bonus. If only one reports such activity, he
gets a $3M bonus and the other gets nothing. If both report such activity, then each gets a
$1M bonus.
Story 2: There are two players and a pile of money. Each player can either let the opponent
take $2, or take $1 for himself.
          2
       c      d
1  C  2, 2   0, 3
   D  3, 0   1, 1
(i) Players: P = {1, 2}
(ii) Pure strategy sets S1 = {C, D}, S2 = {c, d}
Set of pure strategy profiles: S = S1 × S2. For example: (C, d) ∈ S
(iii) Utility functions ui : S → R. For example: u1(C, d) = 0, u2(C, d) = 3. ♦
1.1.1 Definition
A normal form game G = {P, {Si}i∈P, {ui}i∈P} consists of:

(i) a finite set of players P = {1, . . . , n},
(ii) a finite set of pure strategies Si for each player,
(iii) a von Neumann-Morgenstern (NM) utility function ui : S → R for each player, where
S = ∏_{i∈P} Si is the set of pure strategy profiles (lists of strategies, one for each player).
If each player chooses some si ∈ Si, the strategy profile is s = (s1, . . . , sn) and player j’s
payoff is uj(s).
1.1.2 Randomized strategies and beliefs
In our description of a game above, players each choose a particular pure strategy si ∈ Si.
But it is often worth considering the possibility that each player makes a randomized
choice.
Mixed strategies and mixed strategy profiles
If A is a finite set, then we let ∆A represent the set of probability distributions over A: that
is, ∆A = {p : A → R+ | Σ_{a∈A} p(a) = 1}.
Then σi ∈ ∆Si is a mixed strategy for i, while σ = (σ1, . . . , σn) ∈ ∏_{i∈P} ∆Si is a mixed
strategy profile.
Under a mixed strategy profile, players are assumed to randomize independently: for
instance, learning that 1 played C provides no information about what 2 did.
In other words, the distribution on the set S of pure strategy profiles created by σ is a
product distribution.
Example 1.2. Battle of the Sexes.

          2
       a      b
1  A  3, 1   0, 0
   B  0, 0   1, 3

Suppose that 1 plays A with probability 3/4, and 2 plays a with probability 1/4. Then

σ = (σ1, σ2) = ((σ1(A), σ1(B)), (σ2(a), σ2(b))) = ((3/4, 1/4), (1/4, 3/4)).

The pure strategy profile (A, a) is played with probability σ1(A) · σ2(a) = 3/4 · 1/4 = 3/16. The
complete product distribution is presented in the matrix below.

               2
         a (1/4)   b (3/4)
1  A (3/4)  3/16     9/16
   B (1/4)  1/16     3/16

♦
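The product distribution in this example can be computed directly. A sketch using exact rational arithmetic (the dictionary encoding of mixed strategies is an assumption of this illustration):

```python
from fractions import Fraction as F

def product_distribution(sigma1, sigma2):
    """Joint distribution on S1 x S2 induced by independent randomization."""
    return {(s1, s2): p1 * p2
            for s1, p1 in sigma1.items()
            for s2, p2 in sigma2.items()}

# Mixed strategy profile from Example 1.2
sigma1 = {'A': F(3, 4), 'B': F(1, 4)}
sigma2 = {'a': F(1, 4), 'b': F(3, 4)}

joint = product_distribution(sigma1, sigma2)
# joint[('A', 'a')] == 3/16, joint[('A', 'b')] == 9/16, and so on.
```

Using `Fraction` keeps the probabilities exact, so the entries match the matrix above term for term.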
When player i has two strategies, his set of mixed strategies ∆Si is the simplex in R^2, which
is an interval.

When player i has three strategies, his set of mixed strategies ∆Si is the simplex in R^3,
which is a triangle.

When player i has four strategies, his set of mixed strategies ∆Si is the simplex in R^4,
which is a pyramid (use imagination here).
Correlated strategies
In some circumstances we need to consider the possibility that players all have access to
the same randomizing device, and so are able to correlate their behavior. This is not as
strange as it may seem, since any uncertain event that is commonly observed can serve to
correlate behavior.
Example 1.3. Battle of the Sexes revisited.
Suppose that the players observe a toss of a fair coin. If the outcome is Heads, they play
( A, a); if it is Tails, they play (B, b).
A formal description of their behavior specifies the probability of each pure strategy
profile: ρ = (ρ(A, a), ρ(A, b), ρ(B, a), ρ(B, b)) = (1/2, 0, 0, 1/2).

          2
       a      b
1  A  1/2    0
   B  0      1/2
This behavior cannot be achieved using a mixed strategy profile, since it requires correla-
tion: any mixed strategy profile putting weight on ( A, a) and (B, b) would also put weight
on ( A, b) and (B, a):
                     2
             a (y > 0)     b ((1 − y) > 0)
1  A (x > 0)        xy          x(1 − y)
   B ((1 − x) > 0)  (1 − x)y    (1 − x)(1 − y)

all marginal probabilities > 0 ⇒ all joint probabilities > 0 ♦
We call ρ ∈ ∆(∏_{i∈P} Si) = ∆S a correlated strategy. It is an arbitrary joint distribution on
∏_{i∈P} Si.

Example 1.4. Suppose that P = {1, 2, 3} and Si = {1, . . . , k_i}. Then a mixed strategy profile
σ = (σ1, σ2, σ3) ∈ ∏_{i∈P} ∆Si consists of three probability vectors of lengths k_1, k_2, and k_3,
while a correlated strategy ρ ∈ ∆(∏_{i∈P} Si) is a single probability vector of length k_1 · k_2 · k_3.
♦
Because players randomize independently in mixed strategy profiles, mixed strategy
profiles generate the product distributions on ∏_{i∈P} Si. Thus:

mixed strategy profiles = ∏_{i∈P} ∆Si “⊂” ∆(∏_{i∈P} Si) = correlated strategies.

We write “⊂” because the items on each side are not the same kinds of mathematical
objects (i.e., they live in different spaces).
Example 1.5. If S1 = { A, B} and S2 = {a, b}, then the set of mixed strategies ∆{ A, B} × ∆{a, b}
is the product of two intervals, and hence a square. The set of correlated strategies
∆({ A, B} × {a, b}) is a pyramid.
The set of correlated strategies that correspond to mixed strategy profiles—in other words,
the product distributions on { A, B} × {a, b}—form a surface in the pyramid.
♦
Beliefs
One can divide traditional game-theoretic analyses into two classes: equilibrium and
non-equilibrium. In equilibrium analyses (e.g., using Nash equilibrium), one assumes
that players correctly anticipate how opponents will act. In this case, Bayesian rational
players will maximize their expected utility with respect to correct predictions about how
opponents will act. In nonequilibrium analyses (e.g., dominance arguments) this is not
assumed. Instead, Bayesian rationality requires players to form beliefs about how their
opponents will act, and to maximize their expected payoffs given their beliefs. In some
cases knowledge of opponents’ rationality leads to restrictions on plausible beliefs, and
hence on our predictions of play.
Let us consider beliefs in a two-player game. Suppose first that player i expects
his opponent to play a pure strategy, but that i may not be certain of which strategy j will
play.
play. Then player i should form beliefs µi ∈ ∆S j about what his opponent will do. Note
that in this case, player i’s beliefs about player j are the same sort of object as a mixed
strategy of player j.
Remarks:
(i) If player i thinks that player j might randomize, then i’s beliefs would need to be
a probability measure on ∆Sj (so that, loosely speaking, µi ∈ ∆(∆Sj)). Such beliefs
can be reduced to a probability measure on Sj by taking expectations. Specifically,
let µ̄i = E_{µi} σj be the mean of a random variable that takes values in ∆Sj and whose
distribution is µi. Then µ̄i ∈ ∆Sj, and µ̄i(sj) represents the probability that i assigns
to the realization of j’s mixed strategy being the pure strategy sj. In the end, these
probabilities are all that matter for player i’s expected utility calculations. Thus in
nonequilibrium analyses, there is no loss in restricting attention to beliefs that only
put weight on opponents’ pure strategies. We do just this in Sections 1.2 and 1.3.
On the other hand, if player j plays mixed strategy σj, then player i’s beliefs are only
correct if µi(σj) = 1. But when we consider solution concepts that require correct
beliefs (especially Nash equilibrium—see Section 1.4), there will be no need to refer
to beliefs explicitly in the formal definitions, since the definitions will implicitly
assume that beliefs are correct.

(ii) When player i’s beliefs µi assign probability 1 to player j choosing a pure strategy
but put weight on multiple pure strategies, these beliefs are formally identical to
a mixed strategy σj of player j. Therefore, the optimization problem player i faces
when he is uncertain and holds beliefs µi is equivalent to the optimization
problem he faces when he knows player j will play the mixed strategy σj (see
below). The implications of this point will be explored in Section 1.3.
Now we consider beliefs in games with many players. Suppose again that each player
expects his opponents to play pure strategies, although he is not sure which pure strategies
they will choose. In this case, player i’s beliefs µi are an element of ∆(∏_{j≠i} Sj), and so are
equivalent to a correlated strategy among player i’s opponents. Remark (i) above applies
here as well: in nonequilibrium analyses, defining beliefs as just described is without loss
of generality.
(It may be preferable in some applications to restrict a player’s beliefs about different
opponents’ choices to be independent, in which case beliefs are described by elements
of ∏_{j≠i} ∆Sj, the set of opponents’ mixed strategy profiles. We do not do so here, but we
discuss this point further in Section 1.3.)
In all cases, we assume that if a player chooses a mixed strategy, learning which of his
pure strategies is realized does not alter his beliefs about his opponents.
Expected utility
To compute a numerical assessment of a correlated strategy or mixed strategy profile, a
player takes the weighted average of the utility of each pure strategy profile, with the
weights given by the probabilities that each pure strategy profile occurs. This is called the
expected utility associated with σ. See Section 0.2.
Example 1.6. Battle of the Sexes once more.

payoffs:

          2
       a      b
1  A  3, 1   0, 0
   B  0, 0   1, 3

probabilities:

               2
         a (1/4)   b (3/4)
1  A (3/4)  3/16     9/16
   B (1/4)  1/16     3/16
Suppose σ = (σ1, σ2) = ((σ1(A), σ1(B)), (σ2(a), σ2(b))) = ((3/4, 1/4), (1/4, 3/4)) is played. Then

u1(σ) = 3 · 3/16 + 0 · 9/16 + 0 · 1/16 + 1 · 3/16 = 3/4,
u2(σ) = 1 · 3/16 + 0 · 9/16 + 0 · 1/16 + 3 · 3/16 = 3/4. ♦
In general, player i’s expected utility from correlated strategy ρ is

(2) ui(ρ) = Σ_{s∈S} ui(s) · ρ(s).

Player i’s expected utility from mixed strategy profile σ = (σ1, . . . , σn) is

(3) ui(σ) = Σ_{s∈S} ui(s) · (∏_{j∈P} σj(sj)).
In (3), the term in parentheses is the probability that s = (s1, . . . , sn) is played.
We can also write down an agent’s expected utility in a setting in which he is uncertain
about his opponents’ strategies. If player i plays mixed strategy σi ∈ ∆Si and his beliefs
about his opponents’ behavior are given by µi ∈ ∆(∏_{j≠i} Sj), his expected utility is

(4) ui(σi, µi) = Σ_{s∈S} ui(s) · σi(si) µi(s−i).

There is a standard abuse of notation here. In (2), ui acts on correlated strategies (so that
ui : ∆(∏_{j∈P} Sj) → R), in (3), ui acts on mixed strategy profiles (so that ui : ∏_{j∈P} ∆Sj → R),
and in (4), ui acts on mixed strategy/belief pairs (so that ui : ∆Si × ∆(∏_{j≠i} Sj) → R).
Sometimes we even combine mixed strategies with pure strategies, as in ui(si, σ−i). In the
end we are always taking the expectation of ui(s) over the relevant distribution on pure
strategy profiles s, so there is really no room for confusion.
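Formulas (2) and (3) translate directly into code. A sketch (the dictionary encodings of payoffs and strategies are this illustration’s assumptions):

```python
from fractions import Fraction as F
from itertools import product

def eu_correlated(u, rho):
    """(2): expected utility under a correlated strategy rho on S."""
    return sum(u[s] * p for s, p in rho.items())

def eu_mixed(u, sigmas):
    """(3): expected utility under independent mixed strategies."""
    total = 0
    for s in product(*(sig.keys() for sig in sigmas)):
        prob = 1
        for sig, si in zip(sigmas, s):
            prob *= sig[si]  # product of the players' marginal probabilities
        total += u[s] * prob
    return total

# Battle of the Sexes, player 1's payoffs
u1 = {('A', 'a'): 3, ('A', 'b'): 0, ('B', 'a'): 0, ('B', 'b'): 1}

# Mixed strategy profile from Example 1.6: eu_mixed gives 3/4
sigma = [{'A': F(3, 4), 'B': F(1, 4)}, {'a': F(1, 4), 'b': F(3, 4)}]

# The coin-flip correlated strategy from Example 1.3: eu_correlated gives 2
rho = {('A', 'a'): F(1, 2), ('A', 'b'): 0, ('B', 'a'): 0, ('B', 'b'): F(1, 2)}
```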
1.2 Dominance and Iterated Dominance
Suppose we are given some normal form game G. How should we expect Bayesian rational
players (i.e., players who form beliefs about opponents’ strategies and choose optimally
given their beliefs) playing G to behave? We consider a sequence of increasingly restrictive
methods for analyzing normal form games. We start by considering the implications of
Bayesian rationality and of common knowledge of rationality. After this, we introduce
equilibrium assumptions.
We always assume that the structure and payoff s of the game are common knowledge: that
everyone knows these things, that everyone knows that everyone knows them, and so on.
Notation:

G = {P, {Si}i∈P, {ui}i∈P}     a normal form game
s−i ∈ S−i = ∏_{j≠i} Sj        a profile of pure strategies for i’s opponents
µi ∈ ∆(∏_{j≠i} Sj)            i’s beliefs about his opponents’ strategies
                              (formally equivalent to a correlated strategy for i’s opponents)
Remember that (i) in a two-player game, player i’s beliefs µi are the same kind of object
as player j’s mixed strategy σ j, and (ii) in a game with more than two players, player i’s
beliefs µi can emulate any mixed strategy profile σ−i of i’s opponents, but in addition can
allow for correlation.
1.2.1 Strictly dominant strategies
“Dominance” concerns strategies whose performance is good (or bad) regardless of how
opponents behave.
Pure strategy si ∈ Si is strictly dominant if

(5) ui(si, s−i) > ui(s′i, s−i) for all s′i ≠ si and s−i ∈ S−i.

In words: player i prefers si to any alternative s′i regardless of the pure strategy profile
played by his opponents.
Example 1.7. Prisoners’ Dilemma revisited.

          2
       c      d
1  C  2, 2   0, 3
   D  3, 0   1, 1

Joint payoffs are maximized if both players cooperate. But regardless of what player 2
does, player 1 is better off defecting. The same is true for player 2. In other words, D and
d are strictly dominant strategies.
The entries in the payoff bimatrix are the players’ NM utilities. If the game is supposed to
represent the banker story from Example 1.1, then having these entries correspond to the
dollar amounts in the story is tantamount to assuming that (i) each player is risk neutral,
and (ii) each player cares only about his own dollar payoffs. If other considerations are
important—for instance, if the two bankers are friends and care about each other’s fates—
then the payoff matrix would need to be changed to reflect this, and the analysis would
differ correspondingly. Put differently, the analysis above tells us only that if each banker
is rational and cares only about his dollar payoffs, then we should expect to see (D, d). ♦
The next observation shows that a Bayesian rational player must play a strictly dominant
strategy whenever one is available.
Observation 1.8. Strategy si is strictly dominant if and only if

(6) ui(si, µi) > ui(s′i, µi) for all s′i ≠ si and µi ∈ ∆S−i.
Thus, if strategy si is strictly dominant, then it earns the highest expected utility regardless of
player i’s beliefs.
While condition (6) directly addresses Bayesian rationality, condition (5) is easier to check.
Why are the conditions equivalent? (⇐) is immediate. (⇒) follows from the fact that the
inequality in (6) is a weighted average of those in (5).
Considering player i’s mixed strategies would not allow anything new here: First, a pure
strategy that strictly dominates all other pure strategies also dominates all other mixed
strategies. Second, a mixed strategy that puts positive probability on more than one pure
strategy cannot be strictly dominant (since it cannot be the unique best response to any
s−i; see Observation 1.14).
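Condition (5) can be checked by brute-force enumeration. A sketch, using the Prisoners’ Dilemma from Example 1.7 (the dictionary-of-profiles payoff encoding is an assumption of this illustration):

```python
from itertools import product

def strictly_dominant(i, payoffs, strategy_sets):
    """Return player i's strictly dominant pure strategy per (5), or None.

    payoffs[i] maps pure strategy profiles (tuples) to player i's utility.
    """
    others = [S for j, S in enumerate(strategy_sets) if j != i]

    def profile(si, s_minus):
        s = list(s_minus)
        s.insert(i, si)  # splice i's strategy back into the profile
        return tuple(s)

    for si in strategy_sets[i]:
        if all(payoffs[i][profile(si, sm)] > payoffs[i][profile(ti, sm)]
               for ti in strategy_sets[i] if ti != si
               for sm in product(*others)):
            return si
    return None

# Prisoners' Dilemma: D and d are strictly dominant.
u = [
    {('C', 'c'): 2, ('C', 'd'): 0, ('D', 'c'): 3, ('D', 'd'): 1},  # player 1
    {('C', 'c'): 2, ('C', 'd'): 3, ('D', 'c'): 0, ('D', 'd'): 1},  # player 2
]
S = [('C', 'D'), ('c', 'd')]
```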
1.2.2 Strictly dominated strategies
Most games do not have strictly dominant strategies. How can we get more mileage out
of the notion of dominance?
A strategy σi ∈ ∆Si is strictly dominated if there exists a σ′i ∈ ∆Si such that

ui(σ′i, s−i) > ui(σi, s−i) for all s−i ∈ S−i.
Remarks on strictly dominated strategies:
(i) σi is strictly dominated (by σ′i) if and only if

ui(σ′i, µi) > ui(σi, µi) for all µi ∈ ∆(∏_{j≠i} Sj).
Thus, Bayesian rational players never choose strictly dominated strategies.
(ii) A strategy that is not dominated by any pure strategy may be dominated by a
mixed strategy:

            2
         L       R
 1  T    3, −    0, −
    M    0, −    3, −
    B    1, −    1, −

B is not dominated by T or M, but it is dominated by ½T + ½M. Note how this
conclusion depends on taking expected utility seriously: the payoff of 1.5 generated
by playing ½T + ½M against L is just as “real” as the payoffs of 3 and 0 obtained by
playing T and M against L.
(iii) If a pure strategy s′i is strictly dominated, then so is any mixed strategy σi with s′i
in its support (i.e., that uses s′i with positive probability). This is because any weight
placed on a strictly dominated strategy can instead be placed on the dominating
strategy, which raises player i’s payoffs regardless of how his opponents act.

For instance, in the example from part (ii), ⅔M + ⅓B is strictly dominated by
⅔M + ⅓(½T + ½M) = ⅙T + ⅚M.
(iv) But even if a group of pure strategies are not dominated, mixed strategies that
combine them may be:

            2
         L       R
 1  T    3, −    0, −
    M    0, −    3, −
    B    2, −    2, −

T, M, and B are all best responses to some σ2 ∈ ∆S2, and so are not strictly dominated.
But ½T + ½M (which guarantees 3/2) is strictly dominated by B (which guarantees 2).
In fact, any mixed strategy with both T and M in its support is strictly dominated.
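The dominance relations in remarks (ii) and (iv) are easy to verify numerically. A minimal sketch (the helper names and the payoff tuples, listing player 1's payoffs against L and R, are our own bookkeeping, not notation from the text):

```python
# Player 1's payoffs (vs L, vs R) in the game from remark (ii).
game_ii = {"T": (3.0, 0.0), "M": (0.0, 3.0), "B": (1.0, 1.0)}

def mix(game, weights):
    """Payoff vector of the mixed strategy given by `weights` over pure strategies."""
    return tuple(sum(w * game[s][k] for s, w in weights.items()) for k in range(2))

def strictly_dominates(v, w):
    """Does payoff vector v strictly dominate payoff vector w?"""
    return all(a > b for a, b in zip(v, w))

# Remark (ii): B is dominated by 1/2 T + 1/2 M, but by neither pure strategy alone.
half_TM = mix(game_ii, {"T": 0.5, "M": 0.5})          # (1.5, 1.5)
assert strictly_dominates(half_TM, game_ii["B"])
assert not strictly_dominates(game_ii["T"], game_ii["B"])
assert not strictly_dominates(game_ii["M"], game_ii["B"])

# Remark (iv): with B's payoffs raised to (2, 2), the direction reverses and
# the mixture 1/2 T + 1/2 M is itself dominated by B.
game_iv = dict(game_ii, B=(2.0, 2.0))
assert strictly_dominates(game_iv["B"], mix(game_iv, {"T": 0.5, "M": 0.5}))
```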
1.2.3 Iterated strict dominance
Some games without a strictly dominant strategy can still be solved using the idea of
dominance.
Example 1.9. In the game below, player 2 does not have a dominated pure strategy.

            2
         L       C       R
 1  T    2, 2    6, 1    1, 1
    M    1, 3    5, 5    9, 2
    B    0, 0    4, 2    8, 8
B is dominated for 1 (by M), so if 1 is rational he will not play B.
If 2 knows that 1 is rational, she knows that he will not play B.
So if 2 is rational, she won’t play R, which is strictly dominated by L once B is removed.
Now if 1 knows:
(i) that 2 knows that 1 is rational
(ii) that 2 is rational
then 1 knows that 2 will not play R. Hence, since 1 is rational he will not play M.
Continuing in a similar vein: 2 will not play C.
Therefore, (T , L) solves G by iterated strict dominance. ♦
Iterated strict dominance is driven by common knowledge of rationality—by the assumption
that all statements of the form “i knows that j knows that . . . k is rational” are true—as
well as by common knowledge of the game itself.
To see which strategies survive iterated strict dominance it is enough to
(i) Iteratively remove all dominated pure strategies.
(ii) When no further pure strategies can be removed, check all remaining mixed
strategies. (We do not have to do this earlier because in the early rounds, we are
only going to check performance versus pure strategies anyway.)
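The pure-strategy part of this procedure, step (i), is easy to automate. A sketch for two-player games (the function name and payoff dictionaries are our own bookkeeping); running it on the game of Example 1.9 recovers (T, L):

```python
def iterated_strict_dominance(rows, cols, u1, u2):
    """Iteratively delete pure strategies strictly dominated by another
    remaining pure strategy; u1[r][c] and u2[r][c] are the two players' payoffs."""
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        for r in rows[:]:
            if any(all(u1[r2][c] > u1[r][c] for c in cols)
                   for r2 in rows if r2 != r):
                rows.remove(r); changed = True
        for c in cols[:]:
            if any(all(u2[r][c2] > u2[r][c] for r in rows)
                   for c2 in cols if c2 != c):
                cols.remove(c); changed = True
    return rows, cols

# Example 1.9: rows T, M, B for player 1 and columns L, C, R for player 2.
u1 = {"T": {"L": 2, "C": 6, "R": 1}, "M": {"L": 1, "C": 5, "R": 9}, "B": {"L": 0, "C": 4, "R": 8}}
u2 = {"T": {"L": 2, "C": 1, "R": 1}, "M": {"L": 3, "C": 5, "R": 2}, "B": {"L": 0, "C": 2, "R": 8}}
print(iterated_strict_dominance(["T", "M", "B"], ["L", "C", "R"], u1, u2))
# → (['T'], ['L'])
```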
A basic fact about iterated strict dominance is:
Proposition 1.10. The set of strategies that remains after iteratively removing strictly dominated
strategies does not depend on the order in which the dominated strategies are removed.
See Dufwenberg and Stegeman (2002) or Ritzberger (2002) for a proof.
Often, iterated strict dominance will eliminate a few strategies but not completely solve
the game.
1.2.4 Weak dominance
Strategy σi ∈ ∆Si is weakly dominated by σ′i if

ui(σ′i, s−i) ≥ ui(σi, s−i) for all s−i ∈ S−i, and
ui(σ′i, s−i) > ui(σi, s−i) for some s−i ∈ S−i.

Strategy si ∈ Si is weakly dominant if it weakly dominates all other strategies.
Example 1.11. Weakly dominated strategies are not ruled out by Bayesian rationality alone.

            2
         L       R
 1  T    1, −    0, −
    B    0, −    0, −

(Here T weakly dominates B, yet B is a best response to beliefs that place probability 1 on R.) ♦
While the use of weakly dominated strategies is not ruled out by Bayesian rationality
alone, the avoidance of such strategies is often taken as a first principle. In decision
theory, this principle is referred to as admissibility; see Kohlberg and Mertens (1986) for
discussion and historical comments. In game theory, admissibility is sometimes deduced
from the principle of cautiousness, which requires that players not view any opponents’
behavior as impossible; see Asheim (2006) for discussion.
It is natural to contemplate iteratively removing weakly dominated strategies. However,
iterated removal and cautiousness conflict with one another: removing a strategy means
viewing it as impossible, which contradicts cautiousness. See Samuelson (1992) for
discussion and analysis. One consequence is that the order of removal of weakly dominated
strategies can matter—see Example 1.12 below. (For results on when order of removal
does not matter, see Marx and Swinkels (1997) and Østerdal (2005).) But versions of
iterated weak dominance can be placed on a secure epistemic footing (see Brandenburger
et al. (2008)), and moreover, iterated weak dominance is a powerful tool for analyzing
extensive form games (see Section 2.6.1).
Example 1.12. Order of removal matters under IWD.

            2
         L       R
 1  U    5, 1    4, 0
    M    6, 0    3, 1
    D    6, 4    4, 4
In the game above, removing the weakly dominated strategy U first leads to the prediction (D, R),
while removing the weakly dominated strategy M first leads to the prediction (D, L). ♦
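The order dependence can be checked mechanically. A sketch that simply runs the two removal orders from the example, using a helper (our own bookkeeping) that tests weak dominance among the strategies remaining at each step:

```python
# Payoffs for Example 1.12: rows U, M, D for player 1, columns L, R for player 2.
U1 = {"U": {"L": 5, "R": 4}, "M": {"L": 6, "R": 3}, "D": {"L": 6, "R": 4}}
U2 = {"U": {"L": 1, "R": 0}, "M": {"L": 0, "R": 1}, "D": {"L": 4, "R": 4}}

def weakly_dominated_row(r, rows, cols):
    """Is row r weakly dominated by some other remaining row, given remaining cols?"""
    return any(all(U1[r2][c] >= U1[r][c] for c in cols) and
               any(U1[r2][c] > U1[r][c] for c in cols)
               for r2 in rows if r2 != r)

def weakly_dominated_col(c, rows, cols):
    """Is column c weakly dominated by some other remaining column?"""
    return any(all(U2[r][c2] >= U2[r][c] for r in rows) and
               any(U2[r][c2] > U2[r][c] for r in rows)
               for c2 in cols if c2 != c)

# Order 1: remove U (weakly dominated by D), then L, then M  ->  prediction (D, R).
assert weakly_dominated_row("U", ["U", "M", "D"], ["L", "R"])
assert weakly_dominated_col("L", ["M", "D"], ["L", "R"])
assert weakly_dominated_row("M", ["M", "D"], ["R"])

# Order 2: remove M (weakly dominated by D), then R, then U  ->  prediction (D, L).
assert weakly_dominated_row("M", ["U", "M", "D"], ["L", "R"])
assert weakly_dominated_col("R", ["U", "D"], ["L", "R"])
assert weakly_dominated_row("U", ["U", "D"], ["L"])
```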
An intermediate solution concept between ISD and IWD is introduced by Dekel and
Fudenberg (1990), who suggest one round of elimination of all weakly dominated strategies,
followed by iterated elimination of strictly dominated strategies. Since weak dominance
is not applied iteratively, the tensions described above do not arise. Strategies that survive
this Dekel-Fudenberg procedure are sometimes called permissible. See Section 2.5 for further
discussion.
1.3 Rationalizability
1.3.1 Definition and examples
Q: What is the tightest prediction that we can make assuming only common knowledge
of rationality?
A: Bayesian rational players not only avoid dominated strategies; they also avoid
strategies that are never a best response. If we apply this idea iteratively, we obtain
the sets of rationalizable strategies.
Strategy σi is a best response to beliefs µi (denoted σi ∈ Bi(µi)) if

ui(σi, µi) ≥ ui(σ′i, µi) for all σ′i ∈ ∆Si.

(In contrast with dominance, there is no “for all µi”.)
The set-valued map Bi is called player i’s best response correspondence. As with the notation
ui(·), we will abuse the notation Bi(·) as necessary, writing both Bi(µi) and Bi(σ−i).
Informally, the rationalizable strategies (Bernheim (1984), Pearce (1984)) are those that
remain after we iteratively remove all strategies that cannot be a best response, accounting
for each player’s uncertainty about his opponents’ behavior. We provide a definition
below, and an alternate characterization in Theorem 1.23.
Because it only requires CKR, rationalizability is a relatively weak solution concept. Still,
when rationalizability leads to many rounds of removal, it can result in stark predictions.
Example 1.13. Guessing ¾ of the average.
There are n players. Each player’s strategy set is Si = {0, 1, . . . , 100}.
The target integer is defined to be ¾ of the average strategy chosen, rounded down.
All players choosing the target integer split a prize worth V > 0 (or, alternatively, each is
given the prize with equal probability). If no one chooses the target integer, the prize is
not awarded.
Which pure strategies are rationalizable in this game?
To start, we claim that for any pure strategy profile s−i of his opponents, player i has a
response ri ∈ Si such that the target integer generated by (ri, s−i) is ri. (You are asked to
prove this on the problem set.) Thus for any beliefs µi about his opponents, player i can
obtain a positive expected payoff (for instance, by playing a best response to some s−i in
the support of µi).
So: Since Si = {0, 1, . . . , 100},
⇒ The highest possible average is 100.
⇒ The highest possible target is ¾ · 100 = 75.
⇒ Strategies in {76, 77, . . . , 100} yield a payoff of 0.
⇒ Since player i has a strategy that earns a positive expected payoff given his beliefs,
strategies in {76, 77, . . . , 100} are not best responses.
Thus if players are rational, no player chooses a strategy above 75.
⇒ The highest possible average is 75.
⇒ The highest possible target is ¾ · 75 = 56¼, which rounds down to 56.
⇒ Strategies in {57, . . . , 100} yield a payoff of 0.
⇒ Since player i has a strategy that earns a positive expected payoff given his beliefs,
strategies in {57, . . . , 100} are not best responses.
Thus if players are rational and know that others are rational, no player chooses a strategy
above 56.
Proceeding through the rounds of eliminating strategies that cannot be best responses, we
find that no player will choose a strategy higher than
75 . . . 56 . . . 42 . . . 31 . . . 23 . . . 17 . . . 12 . . . 9 . . . 6 . . . 4 . . . 3 . . . 2 . . . 1 . . . 0.
Thus, after 14 rounds of iteratively removing strategies that cannot be best responses, we
conclude that each player’s unique rationalizable strategy is 0. ♦
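The chain of upper bounds above can be generated mechanically: if everyone plays at most M, the target is at most ⌊¾M⌋. A sketch (the function name is our own):

```python
def elimination_bounds(start=100, factor=3/4):
    """Upper bounds on surviving strategies after each round of removing
    strategies that cannot be best responses."""
    bounds, m = [], start
    while m > 0:
        m = int(factor * m)   # the target is at most floor(3/4 * m)
        bounds.append(m)
    return bounds

print(elimination_bounds())
# → [75, 56, 42, 31, 23, 17, 12, 9, 6, 4, 3, 2, 1, 0]
print(len(elimination_bounds()))   # 14 rounds, as in the text
```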
When applying rationalizability, we may reach a point in our analysis at which a player
has multiple pure strategies, none of which can be removed (meaning that for each such
strategy, there are beliefs against which that strategy is optimal). In this case, we should
consider whether any mixtures of these pure strategies can be removed.
The following observation provides an easy way of checking whether a mixed strategy is
a best response.
Observation 1.14. Strategy σi is a best response to µi if and only if every pure strategy si
in the support of σi is a best response to µi.

This follows immediately from the fact that the payoff to a mixed strategy is the appropriate
weighted average of the payoffs to the pure strategies in its support.
Example 1.15. Determining the rationalizable strategies in a normal form game.

            2
         L       C       R
 1  T    3, 3    0, 0    0, 2
    M    0, 0    3, 3    0, 2
    B    2, 2    2, 2    2, 0

To find B1 : ∆S2 ⇒ ∆S1, write µ1 = (l, c, r) for player 1’s beliefs:

u1(T, µ1) ≥ u1(M, µ1) ⇔ 3l ≥ 3c ⇔ l ≥ c
u1(T, µ1) ≥ u1(B, µ1) ⇔ 3l ≥ 2 ⇔ l ≥ ⅔

To find B2 : ∆S1 ⇒ ∆S2, write µ2 = (t, m, b) for player 2’s beliefs:

u2(µ2, L) ≥ u2(µ2, C) ⇔ 3t + 2b ≥ 3m + 2b ⇔ t ≥ m
u2(µ2, L) ≥ u2(µ2, R) ⇔ 3t + 2b ≥ 2t + 2m ⇔ t + 2b ≥ 2m
[Figure: the correspondences B1 : ∆S2 ⇒ ∆S1 and B2 : ∆S1 ⇒ ∆S2, drawn on the strategy simplexes. Left: the regions of ∆S2 where T, M, and B are best responses for player 1, with the boundary points ⅓C + ⅔L, ⅔C + ⅓L, ½C + ½L, ⅔L + ⅓R, and ⅔C + ⅓R labeled. Right: everything in ∆S2 is a best response for player 2; the beliefs ½B + ½T, ⅖M + ⅖T + ⅕B, ½M + ½B, ½M + ½T, ⅓M + ⅔T, and ⅔M + ⅓T are labeled.]
Q: No mixture of T and M is a best response for player 1. Since player 2 knows this, can it
be a best response for her to play R?

A: R is not a best response to any point on the dark lines TB and BM, which represent
mixtures between strategies T and B and between B and M.

Since player 2 is uncertain about which best response player 1 will play, Bayesian
rationality requires her to form beliefs about this. These beliefs µ2 are a probability
measure on the set of player 1’s best responses.

If 2’s beliefs about 1’s behavior are µ2(T) = µ2(M) = ½, then it is as if 2 knows that 1 will
play ½T + ½M, and R is a best response to these beliefs.

In fact, if µ2(T) = µ2(M) = ⅖ and µ2(B) = ⅕, then it is as if 2 knows that 1 will play
⅖T + ⅖M + ⅕B, so all of 2’s mixed strategies are possible best responses.

Thus, player 1’s set of rationalizable strategies is R∗1 = {σ1 ∈ ∆S1 : σ1(T) = 0 or σ1(M) = 0},
and player 2’s set of rationalizable strategies is simply R∗2 = ∆S2. ♦
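These conclusions can be double-checked numerically, using player 2's expected payoffs against beliefs (t, m, b) over (T, M, B) from the computation above. A sketch:

```python
def u2(strategy, t, m, b):
    """Player 2's expected payoff against beliefs (t, m, b) over (T, M, B)."""
    return {"L": 3*t + 2*b, "C": 3*m + 2*b, "R": 2*t + 2*m}[strategy]

# R is a best response to the beliefs mu2(T) = mu2(M) = 1/2 ...
assert u2("R", 0.5, 0.5, 0) >= max(u2("L", 0.5, 0.5, 0), u2("C", 0.5, 0.5, 0))

# ... and against mu2(T) = mu2(M) = 2/5, mu2(B) = 1/5, all of L, C, R tie.
vals = [u2(s, 0.4, 0.4, 0.2) for s in "LCR"]
assert max(vals) - min(vals) < 1e-12

# But R is never a strict best response to a rationalizable sigma1,
# i.e., to any (t, m, b) with t = 0 or m = 0.
for k in range(101):
    w = k / 100
    assert u2("C", 0, w, 1 - w) >= u2("R", 0, w, 1 - w)   # sigma1(T) = 0
    assert u2("L", w, 0, 1 - w) >= u2("R", w, 0, 1 - w)   # sigma1(M) = 0
```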
When we compute the rationalizable strategies, we must account for each player’s
uncertainty about his opponents’ strategies. Thus, during each iteration we must leave in
all of his best responses to any mixture of the opponents’ surviving pure strategies, even
mixtures that are never a best response. Put differently, strategic uncertainty leads us to
include the convex hull of the surviving mixed strategies at each intermediate stage of the
elimination process.
Iterative definition of (and procedure to compute) rationalizable strategies:
(i) Iteratively remove pure strategies that are never a best response (to any allowable
beliefs).
(ii) When no further pure strategies can be removed, remove mixed strategies that are
never a best response.
The mixed strategies that remain are the rationalizable strategies.
There are refinements of rationalizability based on assumptions beyond CKR that generate
tighter predictions in some games, while still avoiding the use of equilibrium knowledge
assumptions—see Section 2.5.
Rationalizability and iterated strict dominance in two-player games
It is obvious that
Observation 1.16. If σi is strictly dominated, then σi is never a best response.
In two-player games, the converse statement is not obvious, but is nevertheless true:
Proposition 1.17. In a two-player game, any strategy that is never a BR is strictly dominated.
The proof is based on the separating hyperplane theorem: see Section 1.3.2.
So “never a best response” and “strictly dominated” are equivalent in two-player games.
Iterating yields
Theorem 1.18. In a two-player game, a strategy is rationalizable if and only if it satisfies iterated
strict dominance.
Rationalizability and iterated strict dominance in games with three or more players
For games with three or more players, there are two definitions of rationalizability in use.
The original one (sometimes called independent rationalizability) computes best responses
under the assumption that a player’s beliefs about different opponents’ choices are
independent, so that these beliefs are formally equivalent to a profile of opponents’ mixed
strategies. The alternative (sometimes called correlated rationalizability) allows correlation
in a player’s beliefs about different opponents’ choices. This agrees with the way we defined
beliefs in Section 1.1.2. In either case, [σi strictly dominated] ⇒ [σi never a best response],
so all rationalizable strategies survive iterated strict dominance. But the analogues of
Proposition 1.17 and Theorem 1.18 are only true under correlated rationalizability.
While opinion is not completely uniform, most game theorists would choose correlated
rationalizability as the more basic of the two concepts. See Hillas and Kohlberg (2002) for
a compelling defense of this point of view. We take “rationalizability” to mean “correlated
rationalizability” unless otherwise noted.
Example 1.19. Consider the following three-player game in which only player 3’s payoffs
are shown.

3: A
         L          R
 1  T    −, −, 5    −, −, 2
    B    −, −, 2    −, −, 1

3: B
         L          R
 1  T    −, −, 4    −, −, 0
    B    −, −, 0    −, −, 4

3: C
         L          R
 1  T    −, −, 1    −, −, 2
    B    −, −, 2    −, −, 5

Strategy B is not strictly dominated, since a dominating mixture of A and C would need
to put probability of at least ¾ on A (in case 1 and 2 play (T, L)) and of at least ¾ on C (in
case 1 and 2 play (B, R)). If player 3’s beliefs about player 1’s choice and player 2’s choice
are independent, then B is not a best response: Independence implies that for some
t, l ∈ [0, 1], we can write µ3(T, L) = tl, µ3(T, R) = t(1 − l), µ3(B, L) = (1 − t)l, and
µ3(B, R) = (1 − t)(1 − l). Then

u3(C, µ3) > u3(B, µ3)
⇔ tl + 2t(1 − l) + 2(1 − t)l + 5(1 − t)(1 − l) > 4tl + 4(1 − t)(1 − l)
⇔ 1 + t + l > 6tl,

which is true whenever t + l ≤ 1 (why?); symmetrically, u3(A, µ3) > u3(B, µ3) whenever
t + l ≥ 1. But B is a best response to the correlated beliefs µ3(T, L) = µ3(B, R) = ½. ♦
1.3.2 The separating hyperplane theorem
A hyperplane is a set of points in Rn that satisfy a scalar linear equality. More specifically,
the hyperplane H p,c = {x ∈ Rn : p · x = c} is identified by some normal vector p ∈ Rn − {0}
and intercept c ∈ R. Since the hyperplane is an n − 1 dimensional affine subset of Rn, its
normal vector is unique up to a multiplicative constant.
A half space is a set {x ∈ Rn : p · x ≤ c}.
Example 1.20. In R2, a hyperplane is a line: x2 = ax1 + b ⇔ (−a, 1) · x = b, so p = (−a, 1).
The figure below displays cases in which a = −½, so that p = (½, 1).
[Figure: the parallel hyperplanes p · x = 0, p · x = 2, and p · x = 4 in R2 for p = (½, 1), with the normal vector p drawn orthogonal to them.]
Interpreting the figure:

Hp,0 = {x : p · x = 0} is the hyperplane through the origin containing all vectors orthogonal
to p.

Hp,c is a hyperplane parallel to Hp,0. (Why? If y, ŷ ∈ Hp,c = {x : p · x = c}, then the tangent
vector ŷ − y is orthogonal to p: that is, p · (ŷ − y) = 0, or equivalently, ŷ − y ∈ Hp,0.)

The normal vector p points towards Hp,c with c > 0. (Why? Because x · y = |x||y| cos θ > 0
when the angle θ formed by x and y is acute.) ♦
Theorem 1.21 (The Separating Hyperplane Theorem).
Let A, B ⊂ Rn be closed convex sets such that A ∩ B ⊆ bd( A) ∩ bd(B). Then there exists a
p ∈ Rn − {0} such that p · x ≤ p · y for all x ∈ A and y ∈ B.
[Figure: convex sets A and B touching at a boundary point z, separated by the hyperplane p · x = c: p · x < c on the rest of A’s side, p · z = c, and p · x > c on the rest of B’s side, with normal vector p.]
In cases where B consists of a single point on the boundary of A, the hyperplane whose
existence is guaranteed by the theorem is often called a supporting hyperplane.
For proofs, discussion, examples, etc. see Hiriart-Urruty and Lemaréchal (2001).
Application: Best responses and dominance in two-player games
Observation 1.16. If σi is strictly dominated, then σi is never a best response.
Proposition 1.17. Let G be a two-player game. Then σi ∈ ∆Si is strictly dominated if and only if
σi is not a best response to any µi ∈ ∆S−i.
Theorem 1.18. In a two-player game, a strategy is rationalizable if and only if it satisfies iterated
strict dominance.
We illustrate the idea of the proof of Proposition 1.17 with an example.
Example 1.22. Our goal is to show that in the two-player game below, [σi ∈ ∆Si is not
strictly dominated] implies that [σi is a best response to some µi ∈ ∆S−i].
            2
         L       R
 1  A    2, −    5, −
    B    6, −    3, −
    C    7, −    1, −
    D    3, −    2, −
Let v1(σ1) = (u1(σ1, L), u1(σ1, R)) be the vector payoff induced by σ1. Note that u1(σ1, µ1) =
µ1 · v1(σ1).

Let V1 = {v1(σ1) : σ1 ∈ ∆S1} be the set of such vector payoffs. Equivalently, V1 is the convex
hull of the vector payoffs to player 1’s pure strategies. It is closed and convex.

Now σ1 ∈ ∆S1 is not strictly dominated if and only if v1(σ1) lies on the northeast boundary
of V1. For example, σ̃1 = ½A + ½B is not strictly dominated, with v1(σ̃1) = (4, 4). We want
to show that σ̃1 is a best response to some µ̃1 ∈ ∆S2.
[Figure: the set V1 ⊂ R2, the convex hull of v1(A) = (2, 5), v1(B) = (6, 3), v1(C) = (7, 1), and v1(D) = (3, 2). The point v1(σ̃1) = v1(½A + ½B) = (4, 4) lies on the hyperplane µ̃1 · w1 = 4 with normal vector µ̃1 = (⅓, ⅔); below that hyperplane, µ̃1 · w1 < 4.]
A general principle: when you are given a point on the boundary of a convex set, the normal
vector at that point often reveals something interesting.
The point v1(σ̃1) lies on the hyperplane µ̃1 · w1 = 4, where µ̃1 = (⅓, ⅔).

This hyperplane separates the point v1(σ̃1) from the set V1, on which µ̃1 · w1 ≤ 4.

Put differently,

µ̃1 · w1 ≤ µ̃1 · v1(σ̃1) for all w1 ∈ V1
⇒ µ̃1 · v1(σ1) ≤ µ̃1 · v1(σ̃1) for all σ1 ∈ ∆S1 (by the definition of V1)
⇒ u1(σ1, µ̃1) ≤ u1(σ̃1, µ̃1) for all σ1 ∈ ∆S1.

Therefore, σ̃1 is a best response to µ̃1.
The same argument shows that every mixture of A and B is a best response to µ̃1.
We can repeat this argument for all mixed strategies of player 1 corresponding to points on
the northeast frontier of V1, as in the figure below at left. The figure below at right presents
player 1’s best response correspondence, drawn beneath graphs of his pure strategy payoff
functions. Both figures link player 1’s beliefs and best responses: in the left figure, player
1’s beliefs are the normal vectors, while in the right figure, player 1’s beliefs correspond
to different horizontal coordinates.
[Figure: at left, the northeast frontier of V1 with the normal vectors µ1 = (⅓, ⅔) and µ1 = (⅔, ⅓) drawn at its kinks. At right, the graphs of u1(A, µ1), u1(B, µ1), u1(C, µ1), and u1(D, µ1) as µ1 ranges from L to R, with player 1’s best response correspondence beneath: C is optimal for beliefs between L and ⅔L + ⅓R, B between ⅔L + ⅓R and ⅓L + ⅔R, and A between ⅓L + ⅔R and R.]
♦
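The separation argument of Example 1.22 reduces to a finite check: under µ̃1 = (⅓, ⅔), the expected payoffs of A and B tie at 4 and those of C and D fall below. A sketch (the helper names are our own):

```python
# Player 1's vector payoffs (vs L, vs R) from Example 1.22.
v1 = {"A": (2, 5), "B": (6, 3), "C": (7, 1), "D": (3, 2)}

def expected(mu, v):
    """u1(s, mu) = mu . v1(s) for beliefs mu = (Pr(L), Pr(R))."""
    return mu[0] * v[0] + mu[1] * v[1]

mu_tilde = (1/3, 2/3)
payoffs = {s: expected(mu_tilde, v) for s, v in v1.items()}
best = max(payoffs.values())

assert abs(payoffs["A"] - 4) < 1e-12 and abs(payoffs["B"] - 4) < 1e-12
assert payoffs["C"] < best and payoffs["D"] < best
# Hence every mixture of A and B (in particular 1/2 A + 1/2 B) is a best
# response to mu_tilde, by Observation 1.14.
```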
1.3.3 A positive characterization
The procedure introduced earlier defines rationalizability in a “negative” fashion, by
iteratively removing strategies that are not rationalizable. It is good to have a “positive”
characterization, describing rationalizability in terms of what it requires rather than what
it rules out.
One informal way to state a positive characterization says that each rationalizable strategy
is a best response to beliefs that only put weight on best responses . . . to beliefs that only
put weight on best responses . . . to beliefs that only put weight on best responses . . .
In this way, the choice of the strategy is justified by a chain of expectations of rational
behavior.
[Diagram: best responses, justified by beliefs, placed on best responses, justified by beliefs, placed on best responses, . . .]
It is possible to eliminate this infinite regress by introducing a fixed point.
[Diagram: best responses, justified by beliefs that are placed back on those same best responses—a fixed point.]
The precise version of this fixed point idea is provided by part (i) of Theorem 1.23, which
later will allow us to relate rationalizability to Nash equilibrium. Part (ii) of the theorem
provides the new characterization of rationalizability. We state the characterization for
pure strategies. (To obtain the version for mixed strategies, take Ψi ⊆ ∆Si as the candidate
set and let Ri = ∪σi∈Ψi support(σi).)

Theorem 1.23. Let Ri ⊆ Si for all i ∈ P, and let R−i = ∏j≠i Rj.
(i) Suppose that for each i ∈ P and each si ∈ Ri, there is a µi ∈ ∆S−i such that
(a) si is a best response to µi, and
(b) the support of µi is contained in R−i.
Then for each player i, all strategies in Ri are rationalizable.
(ii) There is a largest product set ∏j∈P R∗j such that the collection R∗1, . . . , R∗n satisfies (i).
Moreover, for each i ∈ P, R∗i is player i’s set of rationalizable pure strategies.
Osborne (2004, pp. 380–382) provides a clear discussion of these ideas.
Example 1.24. Which pure strategies are rationalizable in the following game?

            2
         a       b       c       d
 1  A    7, 0    2, 5    0, 7    0, 1
    B    5, 2    3, 3    5, 2    0, 1
    C    0, 7    2, 5    7, 0    0, 1
    D    0, 0    0, −2   0, 0    9, −1
Notice that d is strictly dominated by ½a + ½c. Once d is removed, D is strictly dominated
in the game that remains.
To show that all remaining pure strategies are rationalizable, we apply Theorem 1.23.
Let (R∗1, R∗2) = ({A, B, C}, {a, b, c}).

Then:
B is optimal for 1 when µ1 = b ∈ R∗2, and
b is optimal for 2 when µ2 = B ∈ R∗1.

Also:
A is optimal for 1 when µ1 = a ∈ R∗2,
a is optimal for 2 when µ2 = C ∈ R∗1,
C is optimal for 1 when µ1 = c ∈ R∗2, and
c is optimal for 2 when µ2 = A ∈ R∗1.

Thus the strategies in (R∗1, R∗2) are rationalizable.
We can gain further insight by focusing on the collections of smaller sets that satisfy the
conditions of Theorem 1.23.
(R1, R2) = ({B}, {b}) satisfies these conditions. When each Ri is a singleton, as in this case,
there is no flexibility in choosing the beliefs µi: the beliefs must be correct. Indeed, the
strategy profile generated by the Ri is a pure strategy Nash equilibrium.

(R1, R2) = ({A, C}, {a, c}) also satisfies the conditions of Theorem 1.23(i). The strategies in
these sets form a best response cycle. In this case, each strategy si ∈ Ri is justified using
different beliefs µi. Thus, the fact that rationalizability does not assume that players’
beliefs are correct plays a crucial role here. ♦
1.4 Nash Equilibrium
Rationalizability only relies on common knowledge of rationality. Unfortunately, it often
fails to provide tight predictions of play. To obtain tighter predictions, we need to impose
stronger restrictions on players’ beliefs about their opponents’ behavior. Doing so will
lead to the central solution concept of noncooperative game theory.
1.4.1 Definition
To reduce the amount of notation, let Σi = ∆Si denote player i’s set of mixed strategies.
Similarly, let Σ = ∏j∈P ∆Sj and Σ−i = ∏j≠i ∆Sj.

Define player i’s best response correspondence Bi : Σ−i ⇒ Σi by

Bi(σ−i) = argmax_{σi∈Σi} ui(σi, σ−i).
Strategy profile σ ∈ Σ is a Nash equilibrium (Nash (1950)) if
σi ∈ Bi(σ−i) for all i ∈ P .
In words: each player plays a best response to the strategies of his opponents.
Underlying assumptions:
(i) Each player has correct beliefs about what opponents will do (vs. rationalizability:
reasonable beliefs).
(ii) Each behaves rationally given these beliefs.
Example 1.25. Good Restaurant, Bad Restaurant.

            2
         g       b
 1  G    2, 2    0, 0
    B    0, 0    1, 1

Everything is rationalizable.

The Nash equilibria are: (G, g), (B, b), and (⅓G + ⅔B, ⅓g + ⅔b).
Checking the mixed equilibrium:

u2(⅓G + ⅔B, g) = ⅓ · 2 + ⅔ · 0 = ⅔
u2(⅓G + ⅔B, b) = ⅓ · 0 + ⅔ · 1 = ⅔
⇒ All strategies in Σ2 are best responses.

In the mixed equilibrium each player is indifferent among his mixed strategies. Each
chooses the mixture that makes his opponent indifferent. ♦
We can refine our prediction by applying the notion of strict equilibrium. s∗ is a strict
equilibrium if for each i, s∗i is the unique best response to s∗−i. That is, Bi(s∗−i) = {s∗i} for
all i.
Strict equilibria seem especially compelling.
But strict equilibria do not exist in all games (unlike Nash equilibria: see Section 1.4.4).
In the previous example, the Nash equilibrium (G, g) maximizes both players’ payoffs.
One might be tempted to say that a Nash equilibrium with this property is always the one
to focus on. But this criterion is not always compelling:
Example 1.26. Joint investment.

Each player can make a safe investment that pays 8 for sure, or a risky investment that
pays 9 if the other player joins in the investment and 0 otherwise.

            2
         s       r
 1  S    8, 8    8, 0
    R    0, 8    9, 9

The Nash equilibria here are (S, s), (R, r), and (⅑S + ⁸⁄₉R, ⅑s + ⁸⁄₉r). Although (R, r) yields
both players the highest payoff, each player might be tempted by the sure payoff of 8 that
the safe investment guarantees. ♦
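The mixed equilibrium can be verified through the indifference condition: against ⅑S + ⁸⁄₉R, both of the opponent's pure strategies earn exactly 8. A sketch using exact rational arithmetic:

```python
from fractions import Fraction as F

# Payoff to a player choosing s or r when the opponent invests safely
# with probability p (the game is symmetric).
def u(choice, p):
    if choice == "s":
        return F(8)               # the safe investment pays 8 for sure
    return (1 - p) * F(9)         # the risky one pays 9 iff the opponent also risks

p = F(1, 9)                       # equilibrium weight on the safe investment
assert u("s", p) == u("r", p) == 8   # indifference: any mix is a best response
```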
1.4.2 Computing Nash equilibria
The next proposition provides links between Nash equilibrium and rationalizability.
Proposition 1.27. (i) Any pure strategy used with positive probability in a Nash equilibrium
is rationalizable.
(ii) If each player has a unique rationalizable strategy, the profile of these strategies is a Nash
equilibrium.
Proof. Theorem 1.23 provided conditions under which strategies in the sets Ri ⊆ Si are
rationalizable: for each i ∈ P and each si ∈ Ri, there is a µi ∈ ∆S−i such that
(a) si is a best response to µi, and
(b) the support of µi is contained in R−i.
To prove part (i) of the proposition, let σ = (σ1, . . . , σn) be a mixed equilibrium, and let Ri
be the support of σi. Observation 1.14 tells us that each si ∈ Ri is a best response to σ−i.
Thus (a) and (b) hold with µi determined by σ−i.

To prove part (ii) of the proposition, suppose that s = (s1, . . . , sn) is the unique rationalizable
strategy profile. Then (a) and (b) say that si is a best response to s−i, and so s is a Nash
equilibrium.

See Osborne (2004, pp. 383–384) for further discussion.
Proposition 1.27 provides guidelines for computing the Nash equilibria of a game. First
eliminate all non-rationalizable strategies. If this leaves only one pure strategy profile,
this profile is a Nash equilibrium.
Guidelines for computing all Nash equilibria:
(i) Eliminate pure strategies that are not rationalizable.
(ii) For each profile of supports, find all equilibria.
Once the profile of supports is fixed, one identifies all equilibria with this profile of
supports by introducing the optimality conditions implied by the supports: namely, that
the pure strategies in the support of a player’s equilibrium strategy receive the same
payoff, which is at least as high as the payoffs to strategies outside the support. In this way,
each player’s optimality conditions restrict what the other players’ strategies may be.

This approach is simply a convenient way of evaluating every strategy profile. In effect,
one finds all equilibria by ruling out all non-equilibria and keeping what remains.

Inevitably, this approach is computationally intensive: if player i has ki strategies, there
are ∏i∈P (2^ki − 1) possible profiles of supports, and each can have multiple equilibria. (In
practice, one fixes the supports of only n − 1 players’ strategies, and determines the support
for the nth player’s strategy using the optimality conditions—see the examples below.)
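The count of support profiles is a one-liner; the concrete numbers below are illustrations we computed, not figures from the text:

```python
from math import prod

def support_profiles(num_strategies):
    """Number of profiles of nonempty supports, prod_i (2^{k_i} - 1)."""
    return prod(2**k - 1 for k in num_strategies)

assert support_profiles([3, 3]) == 49      # a 3x3 game, e.g. Example 1.28
assert support_profiles([2, 2, 2]) == 27   # three players with 2 strategies each
```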
Example 1.28. (Example 1.15 revisited).

            2
         L       C       R
 1  T    3, 3    0, 0    0, 2
    M    0, 0    3, 3    0, 2
    B    2, 2    2, 2    2, 0
[Figure: the best response correspondences from Example 1.15, with the rationalizable sets R∗1 and R∗2 marked.]
R is rationalizable since it is a best response to some probability distributions over R∗1, as
such distributions can replicate every point in ∆S1.

But since R is not a best response to any σ1 ∈ R∗1, R is never played in a Nash equilibrium.
The key point here is that in Nash equilibrium, player 2’s beliefs are correct (i.e., they place
probability 1 on player 1’s actual strategy).
Thus, we need not consider any support for σ2 that includes R. Three possible supports
for σ2 remain:

{L} ⇒ 1’s BR is T ⇒ 2’s BR is L ∴ (T, L) is Nash
{C} ⇒ 1’s BR is M ⇒ 2’s BR is C ∴ (M, C) is Nash
{L, C} ⇒ u2(σ1, L) = u2(σ1, C) (condition (i)) and u2(σ1, C) ≥ u2(σ1, R) (condition (ii)):
look at B2, or compute as follows:

(i) 3t + 2b = 3m + 2b ⇒ t = m
(ii) 3m + 2b ≥ 2t + 2m; using t = m and b = 1 − m − t = 1 − 2t, this becomes
3t + 2(1 − 2t) ≥ 4t
∴ t = m ≤ ⅖

Looking at B1 (or R∗1), we see that this is only possible if player 1 plays B for sure. Player
1 is willing to do this if

u1(B, σ2) ≥ u1(T, σ2) ⇔ l ≤ ⅔, and
u1(B, σ2) ≥ u1(M, σ2) ⇔ c ≤ ⅔.

Since we know that R is not used in any Nash equilibrium, we conclude that (B, αL + (1 − α)C)
is a Nash equilibrium for α ∈ [⅓, ⅔].

Since we have checked all possible supports for σ2, we are done. ♦
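The equilibrium component (B, αL + (1 − α)C) can be checked directly: B must weakly beat T and M against σ2, and L and C must tie (and weakly beat R) against B. A sketch in exact arithmetic:

```python
from fractions import Fraction as F

def u1(s, l, c):       # player 1's payoff against sigma2 = (l, c, r)
    return {"T": 3*l, "M": 3*c, "B": 2}[s]

def u2(s, t, m, b):    # player 2's payoff against sigma1 = (t, m, b)
    return {"L": 3*t + 2*b, "C": 3*m + 2*b, "R": 2*t + 2*m}[s]

for alpha in [F(1, 3), F(1, 2), F(2, 3)]:
    l, c = alpha, 1 - alpha
    # player 1: B is a best response to alpha L + (1 - alpha) C
    assert u1("B", l, c) >= max(u1("T", l, c), u1("M", l, c))
    # player 2: L and C tie against B, and beat R
    assert u2("L", 0, 0, 1) == u2("C", 0, 0, 1) > u2("R", 0, 0, 1)

# Outside [1/3, 2/3] the component breaks down: at alpha = 4/5, T beats B.
assert u1("T", F(4, 5), F(1, 5)) > u1("B", F(4, 5), F(1, 5))
```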
Example 1.29. Zeeman’s (1980) game.

            2
         A       B       C
 1  A    0, 0    6, −3   −4, −1
    B    −3, 6   0, 0    5, 3
    C    −1, −4  3, 5    0, 0

Since the game is symmetric, both players have the same incentives as a function of the
opponent’s behavior. Writing (a, b, c) for the opponent’s mixed strategy:

A ≿ B ⇔ 6b − 4c ≥ −3a + 5c ⇔ a + 2b ≥ 3c;
A ≿ C ⇔ 6b − 4c ≥ −a + 3b ⇔ a + 3b ≥ 4c;
B ≿ C ⇔ −3a + 5c ≥ −a + 3b ⇔ 5c ≥ 2a + 3b.
[Figure: the simplex of the opponent’s mixed strategies, partitioned into the regions where A, B, and C are best responses; the boundary points ⅘A + ⅕C, ⁵⁄₇A + ²⁄₇C, and ⅗B + ⅖C are labeled.]
Now consider each possible support of player 1’s equilibrium strategy.

{A}: Implies that 2 plays A, and hence that 1 plays A. Equilibrium.
{B}: Implies that 2 plays A, and hence that 1 plays A.
{C}: Implies that 2 plays B, and hence that 1 plays A.
{A, B}: Implies that 2 plays A, and hence that 1 plays A.
{A, C}: This allows many best responses for player 2, but the only one that
makes both A and C a best response for 1 is ⅘A + ⅕C, which is only
a best response for 2 if 1 plays ⅘A + ⅕C himself. Equilibrium.
{B, C}: Implies that 2 plays A, B, or a mixture of the two, and hence that 1
plays A.
{A, B, C}: This is only optimal for 1 if 2 plays ⅓A + ⅓B + ⅓C, which 2 is only
willing to do if 1 plays ⅓A + ⅓B + ⅓C. Equilibrium.
∴ There are three Nash equilibria:
(A, A)
(⅘A + ⅕C, ⅘A + ⅕C)
(⅓A + ⅓B + ⅓C, ⅓A + ⅓B + ⅓C) ♦
Example 1.30. Selten’s (1975) horse.
[Figure: game tree for Selten’s horse. Player 1 chooses A or D; after A, player 2 chooses a or d; player 3, whose information set contains the nodes reached after D and after (A, d), chooses L or R. Terminal payoffs (to players 1, 2, 3): (A, a) → (2, 2, 2); (A, d, L) → (0, 0, 1); (A, d, R) → (0, 3, 3); (D, L) → (0, 0, 3); (D, R) → (1, 0, 2).]
3: L
           2
        a         d
1  A  2, 2, 2   0, 0, 1
   D  0, 0, 3   0, 0, 3

3: R
           2
        a         d
1  A  2, 2, 2   0, 3, 3
   D  1, 0, 2   1, 0, 2
Consider all possible mixed strategy supports for players 1 and 2:
(D, d)      Implies that 3 plays L. Since 1 and 2 are also playing best responses, this is a Nash equilibrium.
(D, a)      Implies that 3 plays L, which implies that 1 prefers to deviate to A.
(D, mix)    Implies that 3 plays L, which with 2 mixing implies that 1 prefers to deviate to A.
(A, d)      Implies that 3 plays R, which implies that 1 prefers to deviate to D.
(A, a)      1 and 2 are willing to do this if σ3(L) ≥ 1/3. Since 3 cannot affect his payoffs given the behavior of 1 and 2, these are Nash equilibria.
(A, mix)    2 only mixes if σ3(L) = 1/3; but if 1 plays A and 2 mixes, 3 strictly prefers R – a contradiction.
(mix, a)    Implies that 3 plays L, which implies that 1 strictly prefers A.
(mix, d)    If 2 plays d, then for 1 to be willing to mix, 3 must play L; this leads 2 to deviate to a.
(mix, mix)  Notice that 2 can only affect her own payoffs when 1 plays A. Hence, for 2 to be indifferent, σ3(L) = 1/3. Given this, 1 is willing to mix if σ2(d) = 2/3. Then for 3 to be indifferent, σ1(D) = 4/7. This is a Nash equilibrium.
∴ There are three components of Nash equilibria: (D, d, L); (A, a, σ3) with σ3(L) ≥ 1/3; and (3/7 A + 4/7 D, 1/3 a + 2/3 d, 1/3 L + 2/3 R). ♦
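The indifference conditions behind the mixed equilibrium can be checked exactly. The following sketch is not part of the notes (plain Python with exact rational arithmetic); it recomputes each player’s payoffs at the profile (3/7 A + 4/7 D, 1/3 a + 2/3 d, 1/3 L + 2/3 R).

```python
from fractions import Fraction as F

# Candidate from the (mix, mix) case: σ1(D) = 4/7, σ2(d) = 2/3, σ3(L) = 1/3.
pD, pd, pL = F(4, 7), F(2, 3), F(1, 3)

def u(s1D, s2d, s3L):
    """Expected payoffs (u1, u2, u3) when 1 plays D with prob s1D,
    2 plays d with prob s2d, and 3 plays L with prob s3L."""
    s1A, s2a, s3R = 1 - s1D, 1 - s2d, 1 - s3L
    # Probability of reaching each terminal node, paired with its payoff vector.
    outcomes = [(s1A * s2a,       (2, 2, 2)),   # (A, a)
                (s1A * s2d * s3L, (0, 0, 1)),   # (A, d, L)
                (s1A * s2d * s3R, (0, 3, 3)),   # (A, d, R)
                (s1D * s3L,       (0, 0, 3)),   # (D, L)
                (s1D * s3R,       (1, 0, 2))]   # (D, R)
    return [sum(p * pay[i] for p, pay in outcomes) for i in range(3)]

# Each player is exactly indifferent between her two pure strategies:
print(u(0, pd, pL)[0] == u(1, pd, pL)[0])   # 1: A vs D -> True (both 2/3)
print(u(pD, 0, pL)[1] == u(pD, 1, pL)[1])   # 2: a vs d -> True (both 6/7)
print(u(pD, pd, 1)[2] == u(pD, pd, 0)[2])   # 3: L vs R -> True (both 16/7)
```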
1.4.3 Interpretations of Nash equilibrium
Example 1.31. (The Good Restaurant, Bad Restaurant game (Example 1.25))

         2
       g       b
1  G  2, 2    0, 0
   B  0, 0    1, 1

NE: (G, g), (B, b), and (1/3 G + 2/3 B, 1/3 g + 2/3 b). ♦
Example 1.32. Matching Pennies.

         2
       h        t
1  H  1, −1   −1, 1
   T  −1, 1    1, −1

Unique NE: (1/2 H + 1/2 T, 1/2 h + 1/2 t). ♦
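In a 2×2 game with a fully mixed equilibrium, the indifference conditions pin the equilibrium down in closed form: player 1’s mixture makes player 2 indifferent between her two columns, and vice versa. A short sketch, not from the notes (the function name and layout are ad hoc), recovers the mixed equilibria of both examples:

```python
from fractions import Fraction as F

def interior_mixed_ne(A, B):
    """Fully mixed NE of a 2x2 game (A: row player's payoffs, B: column
    player's). Returns (p, q): the probabilities that 1 plays her first
    row and 2 plays her first column, from the two indifference conditions."""
    # 2 indifferent between columns: p*B[0][0] + (1-p)*B[1][0] = p*B[0][1] + (1-p)*B[1][1]
    p = F(B[1][1] - B[1][0], B[0][0] - B[1][0] - B[0][1] + B[1][1])
    # 1 indifferent between rows:    q*A[0][0] + (1-q)*A[0][1] = q*A[1][0] + (1-q)*A[1][1]
    q = F(A[1][1] - A[0][1], A[0][0] - A[0][1] - A[1][0] + A[1][1])
    return p, q

# Good Restaurant, Bad Restaurant: both put 1/3 on the first strategy.
print(interior_mixed_ne([[2, 0], [0, 1]], [[2, 0], [0, 1]]))
# Matching Pennies: both mix 1/2.
print(interior_mixed_ne([[1, -1], [-1, 1]], [[-1, 1], [1, -1]]))
```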
Nash equilibrium is a minimal condition for self-enforcing behavior.
This explains why we should not expect players to behave in a way that is not Nash, but
not why we should expect players to coordinate on a Nash equilibrium.
Justifications of equilibrium knowledge: why expect correct beliefs?
There is no general justification for assuming equilibrium knowledge. But justifications can be
found in certain specific instances:
(i) Coordination of play by a mediator.
If a mediator proposes a Nash equilibrium, no player can benefit from deviating.
Of course, this only helps if there actually is a mediator.
(ii) Pre-play agreement.
But it may be more appropriate to include the “pre-play” communication explicitly
in the game. (The result is a model of cheap talk: see Crawford and Sobel (1982)
and a large subsequent literature.) This raises two new issues: (i) one now needs
equilibrium knowledge in a larger game, and (ii) the expanded game typically has
a “babbling” equilibrium in which all communication is ignored.
(iii) Focal points (Schelling (1960)). Something about the game makes some Nash
equilibrium the obvious choice about how to behave.
ex: meeting in NYC at the information booth at Grand Central Station at noon.
ex: coordinating on the good restaurant.
Focal points can also be determined abstractly, using (a)symmetry to single out
certain distinct strategies: see Alós-Ferrer and Kuzmics (2013).
(iv) Learning / Evolution: If players repeatedly face the same game, they may find their
way from arbitrary initial behavior to Nash equilibrium.
Heuristic learning: Small groups of players, typically employing rules that
condition on the empirical distribution of past play (Young (2004))
Evolutionary game theory: Large populations of agents using myopic up-
dating rules (Sandholm (2010))
In some classes of games (that include the two examples above), many learning
and evolutionary processes do converge to Nash equilibrium.
But there is no general guarantee of convergence:
Many games lead to cycling or chaotic behavior, and in some games any “reason-
able” dynamic process fails to converge to equilibrium (Shapley (1964), Hofbauer
and Swinkels (1996), Hart and Mas-Colell (2003)).
Some games introduced in applications are known to have poor convergence prop-
erties (Hopkins and Seymour (2002), Lahkar (2011)).
In fact, evolutionary game theory models do not even support the elimination of
strictly dominated strategies in all games (Hofbauer and Sandholm (2011)).
Interpretation of mixed strategy Nash equilibrium: why mix in precisely the way that makes your opponents indifferent?
In the unique equilibrium of Matching Pennies, player 1 is indifferent among all of his mixed strategies. He chooses (1/2, 1/2) because this makes player 2 indifferent. Why should we expect player 1 to behave in this way?
(i) Deliberate randomization
Sometimes it makes sense to expect players to deliberately randomize (ex.: poker).
In zero-sum games (Section 1.6), randomization can be used to ensure that you obtain
at least the equilibrium payoff regardless of how opponents behave:
In a mixed equilibrium, you randomize to make your opponent indifferent between her strategies. In a zero-sum game, this implies that you are indifferent between your opponent’s strategies. This implies that you do not care if your opponent finds out your randomization probabilities in advance, as this does not enable her to take advantage of you.
(ii) Mixed equilibrium as equilibrium in beliefs
We can interpret σ∗i as describing the beliefs that player i’s opponents have about
player i’s behavior. The fact that σ∗i is a mixed strategy then reflects the opponents’
uncertainty about how i will behave, even if i is not actually planning to randomize.
But as Rubinstein (1991) observes, this interpretation
“. . . implies that an equilibrium does not lead to a prediction (statistical or otherwise) of the players’ behavior. Any player i’s action which is a best response given his expectation about the other players’ behavior (the other n − 1 strategies) is consistent as a prediction for i’s action (this might include actions which are outside the support of the mixed strategy). This renders meaningless any comparative statics or welfare analysis of the mixed strategy equilibrium and brings into question the enormous economic literature which utilizes mixed strategy equilibrium.”
(iii) Mixed equilibria as time averages of play: fictitious play (Brown (1951))
Suppose that the game is played repeatedly, and that in each period, each player
chooses a best response to the time average of past play.
Then in certain classes of games, the time average of each player’s behavior converges to his part in some Nash equilibrium strategy profile.
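A minimal simulation illustrates this for Matching Pennies. The sketch is not the notes’ own code (numpy is assumed, the run length is arbitrary, and best-response ties are broken toward the first action):

```python
import numpy as np

U1 = np.array([[1., -1.], [-1., 1.]])   # player 1's payoffs; player 2's are -U1

def fictitious_play(T=20000):
    """Each period, each player plays a best response to the empirical
    distribution of the opponent's past play; returns both empirical mixes."""
    counts1 = np.array([1.0, 0.0])      # arbitrary initial observation: H
    counts2 = np.array([0.0, 1.0])      # arbitrary initial observation: t
    for _ in range(T):
        a1 = np.argmax(U1 @ (counts2 / counts2.sum()))    # BR to 2's history
        a2 = np.argmax(-(counts1 / counts1.sum()) @ U1)   # BR to 1's history
        counts1[a1] += 1
        counts2[a2] += 1
    return counts1 / counts1.sum(), counts2 / counts2.sum()

x, y = fictitious_play()
print(np.round(x, 2), np.round(y, 2))   # both close to [0.5 0.5]
```

Period-by-period play cycles through the four pure profiles, but the empirical frequencies approach the mixed equilibrium (1/2 H + 1/2 T, 1/2 h + 1/2 t).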
(iv) Mixed equilibria as population equilibria (Nash (1950))
Suppose that there is one population for the player 1 role and another for the player
2 role, and that players are randomly matched to play the game.
If half of the players in each population play Heads, no one has a reason to deviate.
Hence, the mixed equilibrium describes stationary distributions of pure strategies in
each population.
(v) Purification: mixed equilibria as pure equilibria of games with payoff uncertainty
(Harsanyi (1973))
Example 1.33. Purification in Matching Pennies. Suppose that while the Matching Pennies payoff bimatrix gives players’ approximate payoffs, players’ actual payoffs also contain small terms εH, εh representing a bias toward playing heads, and that each player only knows his own bias. (The formal framework for modeling this situation is called a Bayesian game—see Section 3.)
         2
       h                  t
1  H  1 + εH, −1 + εh   −1 + εH, 1
   T  −1, 1 + εh         1, −1
Specifically, suppose that εH and εh are independent random variables with P(εH > 0) = P(εH < 0) = P(εh > 0) = P(εh < 0) = 1/2. In equilibrium of the perturbed game, optimal play requires each player to follow his bias. From the ex ante point of view, the distribution over actions that this equilibrium generates in the original normal form game is (1/2 H + 1/2 T, 1/2 h + 1/2 t).
Harsanyi (1973) shows that any mixed equilibrium can be purified in this way. This
includes not only “reasonable” mixed equilibria like that in Matching Pennies, but also
“unreasonable” ones like those in coordination games. ♦
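The bias-following logic is easy to check directly: if player 2 plays h with probability q, player 1’s payoff gain from H over T is (1 + εH)q + (−1 + εH)(1 − q) − [(−1)q + (1 − q)] = 4q − 2 + εH, which at q = 1/2 equals εH exactly. The sketch below is not from the notes; the uniform bias distribution and its scale are illustrative assumptions.

```python
import random

random.seed(0)
eps = 0.05   # bias scale: an illustrative assumption, not from the notes

def br1(eH, q):
    """Player 1's best response given bias eH, when 2 plays h with prob q.
    The payoff gain of H over T is 4q - 2 + eH (derived above)."""
    return 'H' if 4 * q - 2 + eH > 0 else 'T'

# With q = 1/2, the gain from H is exactly eH, so 1 follows his bias;
# since the bias is symmetric around 0, play is 50/50 from the ex ante view.
draws = [random.uniform(-eps, eps) for _ in range(100000)]
plays = [br1(e, 0.5) for e in draws]
print(plays.count('H') / len(plays))   # approximately 0.5
```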
1.4.4 Existence of Nash equilibrium and structure of the equilibrium set
Existence and structure theorems for finite normal form games
When does Nash equilibrium provide us with at least one prediction of play? Always, at
least in the context of finite normal form games.