
The Evolution of Conventions

H. Peyton Young

What is a convention?

Customary behavior

Self-enforcing

Not always symmetric

Follow given that other people do

Examples
• Driving on the right
• Eating with utensils
• Men propose to women

How are conventions “chosen”?

A convention is an equilibrium, but there could be others

Some equilibria are inherently more reasonable (Harsanyi and Selten)

One equilibrium more prominent (Schelling)

Evolutionary explanation

Past plays influence players’ choices

One equilibrium eventually becomes more prevalent

This paper shows that behavior will converge over time to a Nash equilibrium, given some limitations on the game

The model

n people randomly selected from large population

Base actions on sampling of plays from recent past

No individual learning

Mistakes possible

“Adaptive play”


In weakly acyclic games:
– If samples are sufficiently incomplete and memory is finite, converge to Nash

With mistakes:
– Almost always converges to a particular equilibrium

Adaptive play

n-person game G, strategy set S_i

Population N divided into classes C_1, C_2, ..., C_n

G played once per period; t = 1, 2, ...

Play at time t is s(t) = (s_1(t), s_2(t), ..., s_n(t))

In class C_i, utility u_i(s)

History of plays is h(t) = (s(1), s(2), ..., s(t))

Choosing strategies

Choose m, k such that 1 ≤ k ≤ m

In period t + 1, where t ≥ m:
– Each player sees k plays from the past m periods
– k/m is the completeness of information
– Plays are not necessarily equally likely to be seen

First m plays are random

H consists of all sequences of length m drawn from ∏ S_i

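Finite Markov chain on H with initial state h(m)

Successor of h ∈ H is h’ ∈ H: drop the oldest play of h and append the new play s

For s ∈ S_i, p_i(s|h) is the probability that player i chooses s given history h

p_i( · |h) is a best-reply distribution
– p_i(s|h) > 0 iff s is i’s best reply to some sample of size k drawn from h
– p_i(s|h) is independent of t

Probability of moving from h to h’ is ∏_{i=1}^{n} p_i(s_i|h)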
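One transition of this chain can be sketched for a two-player game as follows. This is a minimal illustration, not the paper's construction: the function names, the uniform sampling of k plays, the uniform tie-breaking among best replies, and the "driving game" payoffs are all assumptions made for the example (the slides only require that every best reply to some sample has positive probability).

```python
import random

def best_replies(payoff, opp_sample, actions):
    """Actions that maximize total payoff against the sampled opponent plays.

    payoff is keyed by (own action, opponent action).
    """
    totals = {a: sum(payoff[(a, o)] for o in opp_sample) for a in actions}
    top = max(totals.values())
    return [a for a in actions if abs(totals[a] - top) < 1e-12]

def adaptive_play_step(history, k, payoffs, actions, rng):
    """Move from state h (the last m plays) to its successor h'.

    Each player independently samples k of the m plays (uniformly, by
    assumption) and plays a best reply to the opponent's sampled actions.
    """
    play = []
    for i in (0, 1):
        sample = rng.sample(history, k)            # k of the last m plays
        opp_sample = [s[1 - i] for s in sample]    # what the opponent did
        play.append(rng.choice(best_replies(payoffs[i], opp_sample, actions)))
    return history[1:] + [tuple(play)]             # drop oldest, append newest

# Hypothetical driving game: payoff 1 if both pick the same side, else 0.
ACTIONS = ("L", "R")
COORD = {("L", "L"): 1, ("L", "R"): 0, ("R", "L"): 0, ("R", "R"): 1}

rng = random.Random(0)
h = [("L", "R"), ("R", "R"), ("L", "L")]           # m = 3 initial plays
for _ in range(20):
    h = adaptive_play_step(h, 1, [COORD, COORD], ACTIONS, rng)
print(h)
```

Under these assumptions, a history consisting of the same strict Nash equilibrium repeated m times maps to itself, since every possible sample then has a unique best reply.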

Convergence of adaptive play

h is an absorbing state iff it is a strict Nash equilibrium played m times in a row

h = h’ = (s, s, ..., s)

Convergence ⇒ strict Nash
– But the existence of a strict Nash does not guarantee convergence
– Cycling is possible

Use weakly acyclic games

Best-reply graph





G is a weakly acyclic n-person game

L(s) = length of the shortest path from s to a strict Nash equilibrium in the best-reply graph

L_G = max_s L(s)

If k ≤ m/(L_G + 2), adaptive play “almost surely” converges to a convention
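The quantities L(s) and L_G can be computed by a breadth-first search on the best-reply graph. The sketch below assumes the usual definition of that graph (an edge s → s’ when exactly one player switches to a best reply against the others' current actions, so sinks are strict Nash equilibria). It uses the yield / not yield payoffs from the Battle of the Sexes example in these slides, read as (man, woman); the function names are hypothetical.

```python
from collections import deque
from itertools import product
from math import sqrt

def best_reply_successors(profile, payoffs, actions):
    """Profiles reachable from `profile` by one player switching to a best reply."""
    successors = []
    for i, own in enumerate(profile):
        def swap(a):
            return profile[:i] + (a,) + profile[i + 1:]
        utilities = {a: payoffs[swap(a)][i] for a in actions[i]}
        top = max(utilities.values())
        successors += [swap(a) for a in actions[i]
                       if a != own and utilities[a] == top]
    return successors

def distances_to_sinks(payoffs, actions):
    """L(s) for every profile with a path to a sink (strict Nash equilibrium)."""
    profiles = list(product(*actions))
    succ = {s: best_reply_successors(s, payoffs, actions) for s in profiles}
    pred = {s: [] for s in profiles}
    for s, outs in succ.items():
        for t in outs:
            pred[t].append(s)
    sinks = [s for s in profiles if not succ[s]]
    dist = {s: 0 for s in sinks}
    queue = deque(sinks)
    while queue:                      # BFS backwards from the sinks
        t = queue.popleft()
        for s in pred[t]:
            if s not in dist:
                dist[s] = dist[t] + 1
                queue.append(s)
    return dist

# Payoffs keyed by (man's action, woman's action) -> (man's payoff, woman's payoff).
GAME = {
    ("Y", "Y"): (0.0, 0.0), ("Y", "N"): (1.0, sqrt(2)),
    ("N", "Y"): (sqrt(2), 1.0), ("N", "N"): (0.0, 0.0),
}
ACTION_SETS = (("Y", "N"), ("Y", "N"))

L = distances_to_sinks(GAME, ACTION_SETS)
weakly_acyclic = len(L) == len(GAME)  # every profile reaches a sink
L_G = max(L.values())
print(L, weakly_acyclic, L_G)
```

For this game every profile reaches a strict Nash equilibrium in at most one step, so the condition in the theorem reads k ≤ m/3.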

Main idea: If information is sufficiently incomplete, adaptive play converges

Proof

Positive probability that:

– At some t + 1, all agents sample last k plays (call this µ)

– From periods t + 1 to t + k, all agents choose sample µ

– Each agent makes same best-reply to µ k times in a row

So positive probability of a run (s, s, ..., s) from t + 1 to t + k

If s is a strict Nash:

Positive probability that from t + k + 1 to t + m, each agent samples last k plays

s is played for m - k more periods, then absorbing state has been reached

If s is not a strict Nash:

There is a best-reply path s → s^1 → ... → s^r, where s^r is a strict Nash

For s → s^1:
– Player i samples from periods t + 1 to t + k (i.e. samples s)
– Everyone else samples µ
– Positive probability that these will occur for the next k periods

By similar argument, you can move from s^1 to s^2, and so on to s^r

Hence the limit on the size of k: every sample used along the path must still lie within the last m periods


Battle of the sexes
– Opera vs. football game: yield or not yield

                Woman
                Yield       Not Yield
Man  Yield      0, 0        1, √2
     Not Yield  √2, 1       0, 0

Why must we limit k?

Let k = m

Consider an initial sequence where they both yielded or both didn’t yield

To decide the next round: pick the choice with the highest expected payoff (in this case, each yields if 1 - f > f√2, where f is the fraction of sampled rounds in which the other player yielded)

What would happen if k is bounded as specified by adaptive play?
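The k = m case above can be checked with a short deterministic simulation. The rule 1 - f > f√2 is equivalent to f < 1/(1 + √2) = √2 - 1 ≈ 0.414, and a tie is impossible because f is rational. The sketch below assumes the history starts with m rounds in which both yielded; the function name and the default starting history are choices made for the illustration.

```python
from math import sqrt

def simulate_full_sample(m, periods, start=("Y", "Y")):
    """Adaptive play with k = m: each player sees all of the last m plays.

    Plays are (man's action, woman's action); "Y" = yield, "N" = not yield.
    """
    history = [start] * m
    played = []
    for _ in range(periods):
        # Fraction of remembered rounds in which the *other* player yielded.
        f_woman = sum(1 for _, woman in history if woman == "Y") / m  # seen by man
        f_man = sum(1 for man, _ in history if man == "Y") / m        # seen by woman
        man_next = "Y" if 1 - f_woman > f_woman * sqrt(2) else "N"
        woman_next = "Y" if 1 - f_man > f_man * sqrt(2) else "N"
        history = history[1:] + [(man_next, woman_next)]
        played.append((man_next, woman_next))
    return played

plays = simulate_full_sample(m=4, periods=50)
print(plays[:8])
```

Because both players see the same complete history, they see the same f and always make the same choice, so they miscoordinate in every period and never reach either equilibrium.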

Is this the best we can do?

Note that the theorem guarantees convergence to an equilibrium
– But which equilibrium?

Also, it seems unlikely that people would always play best response perfectly

Back to our example...

With slightly different payoffs



                Woman
                Yield       Not Yield
Man  Yield      0, 0        1, √2
     Not Yield  √2/2, 1/2   0, 0

Let k = 1, m = 3

We can imagine a situation where
– Both yield on first round
– Both not yield on second round
– On 3rd round, woman samples yielding round, man samples not yielding round
– What would be each player’s best reply?
– Next round?
– Get stuck in suboptimal equilibrium
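The best replies in the third round can be worked out directly from the table. The sketch below assumes the payoffs are listed as (man, woman), with the man choosing the row, and computes the third-round play for both ways the two earlier rounds could be split between the players' samples. The helper names are hypothetical.

```python
from math import sqrt

# Second payoff table, keyed by (man's action, woman's action);
# values are (man's payoff, woman's payoff). This reading is an assumption.
PAYOFFS = {
    ("Y", "Y"): (0.0, 0.0),
    ("Y", "N"): (1.0, sqrt(2)),
    ("N", "Y"): (sqrt(2) / 2, 0.5),
    ("N", "N"): (0.0, 0.0),
}

def man_best_reply(woman_sampled):
    """Man's best reply to the single sampled action of the woman (k = 1)."""
    return max("YN", key=lambda a: PAYOFFS[(a, woman_sampled)][0])

def woman_best_reply(man_sampled):
    """Woman's best reply to the single sampled action of the man (k = 1)."""
    return max("YN", key=lambda a: PAYOFFS[(man_sampled, a)][1])

rounds = [("Y", "Y"), ("N", "N")]  # round 1: both yield; round 2: neither yields

# Case A: woman samples round 1, man samples round 2.
case_a = (man_best_reply(rounds[1][1]), woman_best_reply(rounds[0][0]))
# Case B: woman samples round 2, man samples round 1.
case_b = (man_best_reply(rounds[0][1]), woman_best_reply(rounds[1][0]))

print(case_a, PAYOFFS[case_a])
print(case_b, PAYOFFS[case_b])
```

Under this reading, which equilibrium the third round lands on depends only on which round each player happened to sample, and one of the two equilibria gives both players a lower payoff than the other.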

Perhaps introducing mistakes could solve this problem


