Overview
- Evolution of Cooperation
- Repeated Prisoner’s Dilemma
- The Logic of Indirect Speech
- Signaling Games
- Horn’s Division of Pragmatic Labor
Human Language is grounded on Cooperation
Reasons for cooperation:
- kin selection (r > c/b)
- direct reciprocity
- indirect reciprocity
- spatial selection
- group selection
b > c > 0; b = 3, c = 1

             Cooperate      Defect
Cooperate    b-c  (2;2)     0  (0;3)
Defect       b    (3;0)     c  (1;1)

(symbol: row player's payoff; in parentheses: payoffs row;column for b = 3, c = 1)

Table: Prisoner’s Dilemma
Repeated PD: The Evolution of Cooperation
Robert Axelrod’s Computer tournament (1979):
       C      D
C     3;3    0;5
D     5;0    1;1

Table: Prisoner’s Dilemma
- Finding the best strategy for the Repeated Prisoner’s Dilemma (RPD)
- Game theorists were invited to submit their favourite strategy (decision rule)
- All submitted strategies play against each other for 200 rounds
- The strategy with the highest average score wins the tournament
Repeated PD: The Evolution of Cooperation
- TIT FOR TAT: Cooperate in the first round, then do what your opponent did in the previous round
- FRIEDMAN: Cooperate until the opponent defects, then defect all the time
- DOWNING:
  - Estimate the probabilities p1 = P(C_O^t | C_I^{t-1}) and p2 = P(C_O^t | D_I^{t-1}), i.e. how likely the opponent (O) is to cooperate in round t after I cooperated or defected in round t-1
  - If p1 >> p2, the opponent is responsive: Cooperate
  - Else the opponent is not responsive: Defect
- TRANQUILIZER:
  - Cooperate for the first moves and check the opponent's response
  - If a pattern of mutual cooperation arises: Defect from time to time
  - If the opponent continues cooperating, defections become more frequent
- TIT FOR 2 TATS: Play TIT FOR TAT, but respond with a defection only if the opponent defected on the previous two moves
- JOSS: Play TIT FOR TAT, but respond with a defection to 10% of the opponent’s cooperative moves
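The simpler decision rules above can be written as small functions. The following is a minimal sketch, assuming the payoffs of the tournament table (3;3, 0;5, 5;0, 1;1) and 200 rounds per match. The function names, the `play` helper and the `always_defect` baseline are illustrative assumptions and not part of the tournament description; DOWNING and TRANQUILIZER are left out because the slides do not specify them in enough detail.

```python
import random

# Row player's payoff from the tournament table: (my move, opponent's move)
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def tit_for_tat(own, opp):
    """Cooperate first, then copy the opponent's previous move."""
    return opp[-1] if opp else "C"

def friedman(own, opp):
    """Cooperate until the opponent defects once, then always defect."""
    return "D" if "D" in opp else "C"

def tit_for_two_tats(own, opp):
    """Defect only after two consecutive defections by the opponent."""
    return "D" if opp[-2:] == ["D", "D"] else "C"

def make_joss(rng, p_defect=0.1):
    """TIT FOR TAT that answers a cooperation with a defection 10% of the time."""
    def joss(own, opp):
        if opp and opp[-1] == "C" and rng.random() < p_defect:
            return "D"
        return tit_for_tat(own, opp)
    return joss

def always_defect(own, opp):
    """Hypothetical baseline opponent (not one of the listed entries)."""
    return "D"

def play(strategy_a, strategy_b, rounds=200):
    """Play one match and return the total scores (score_a, score_b)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))      # mutual cooperation in every round
print(play(tit_for_tat, always_defect))    # exploited once, then mutual defection
print(play(make_joss(random.Random(0)), tit_for_tat))
```

Two nice strategies earn 3 points in each of the 200 rounds (600 each), while TIT FOR TAT against permanent defection loses only the first round (199 against 204 points).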
Repeated PD: The Evolution of Cooperation
Results:
1. The winner was TIT FOR TAT with 504 points
2. Success in such a game correlated with the following characteristics:
   - Be nice: cooperate, never be the first to defect.
   - Be provocable: return defection for defection, cooperation for cooperation.
   - Don’t be envious: be fair with your partner.
   - Don’t be too clever: or, don’t try to be tricky.
The Logic of Indirect Speech
Three major relationship types
- Dominance: "Don’t mess with me" (inherited from primates’ dominance hierarchy)
- Communality: "Share and share alike" (kin selection and mutualism, group selection)
- Reciprocity: "You scratch my back, I’ll scratch yours" (Tit-For-Tat exchanges)
The Logic of Indirect Speech
- Would you like to come up and see my etchings? [sexual come-on]
- If you could pass the guacamole, that would be awesome. [polite request]
- Nice store you got here. Would be a real shame if something happened to it. [a threat]
- Gee, officer, is there some way we could take care of the ticket here? [a bribe]
                   Dishonest Officer    Honest Officer
Don’t bribe        Ticket               Ticket
Bribe              Go free              Arrest for bribery
Implicate bribe    Go free              Ticket

Table: Bribe Game
The Logic of Indirect Speech: Mutual Knowledge
- Indirect speech acts prevent mutual knowledge
- Do conventions need mutual knowledge?

        aS     aG
tS     1;1    0;0
tG     0;0    1;1

- Linguistic conventions can be modeled by signaling games
Signaling Games: Coordination games
Coordination games:
- The two players’ shared goal is to coordinate on the same strategy
- Any 2×2 coordination game has two pure-strategy Nash equilibria
- Which equilibrium is the best / unique solution?
Solution concepts:
- Aligned preferences
- Focal point: Convention
- Coordination by communication ⇒ Signaling games
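The claim about two equilibria can be checked by brute force. This is a minimal sketch, assuming the 1;1 / 0;0 payoff table used in these slides; the function name `pure_nash_equilibria` and the list-of-lists encoding are illustrative choices, and only pure strategies are examined.

```python
from itertools import product

# payoffs[row][col] = (row player's payoff, column player's payoff)
COORDINATION = [[(1, 1), (0, 0)],
                [(0, 0), (1, 1)]]

def pure_nash_equilibria(payoffs):
    """Return all cells where neither player gains by deviating alone."""
    rows, cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for r, c in product(range(rows), range(cols)):
        # The row player cannot do better by switching rows ...
        row_ok = all(payoffs[r][c][0] >= payoffs[r2][c][0] for r2 in range(rows))
        # ... and the column player cannot do better by switching columns.
        col_ok = all(payoffs[r][c][1] >= payoffs[r][c2][1] for c2 in range(cols))
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria

print(pure_nash_equilibria(COORDINATION))  # [(0, 0), (1, 1)]
```

Both diagonal cells are equilibria with identical payoffs, so the payoffs alone cannot tell the players which one to pick; this is the selection problem the solution concepts address.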
Signaling Games: An example
The situation
- There is a monkey society whose members can use two alarm calls in an emergency: "Ooh!" and "Aah!"
- There are two main kinds of emergency: attacks by sky predators (tS) and by ground predators (tG)
- There are two possible actions: hide in the bushes (aS) or climb the trees (aG)

        aS     aG
tS     1;1    0;0
tG     0;0    1;1

Question: What could the monkeys do to coordinate their behavior?
Signaling Games: An example
A simple signaling game:
- A set of states T = {tS, tG}
- A set of actions A = {aS, aG}
- A set of messages M = {mOoh!, mAah!}
- A probability function Pr ∈ ∆(T): Pr(tS) = .5, Pr(tG) = .5
Signaling Games: Strategies
"Language use" can be depicted as a strategy:
- A sender strategy S : T → M
- A receiver strategy R : M → A

S1: tS → mOoh!, tG → mAah!
S2: tS → mAah!, tG → mOoh!
S3: tS → mOoh!, tG → mOoh!
S4: tS → mAah!, tG → mAah!

R1: mOoh! → aS, mAah! → aG
R2: mOoh! → aG, mAah! → aS
R3: mOoh! → aS, mAah! → aS
R4: mOoh! → aG, mAah! → aG

Question: What is S1(tS), S4(tG), R2(mOoh!), R3(mAah!)?
Signaling Games: Signaling Systems
Resulting strategy matrix (expected utility of each strategy pair):

        R1    R2    R3    R4
S1      1     0     .5    .5
S2      0     1     .5    .5
S3      .5    .5    .5    .5
S4      .5    .5    .5    .5

- (S1,R1) and (S2,R2) are combinations of perfect communication and are called signaling systems (Lewis 1969).

(S1,R1): tS → mOoh! → aS, tG → mAah! → aG
(S2,R2): tS → mAah! → aS, tG → mOoh! → aG
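The strategy matrix can be recomputed from the definitions: each entry is the prior-weighted probability that the receiver's action matches the state. This is a minimal sketch, assuming Pr(tS) = Pr(tG) = .5 and a payoff of 1 for a matching action and 0 otherwise. The dictionary encoding is illustrative, and which pooling strategy is labelled S3 or S4 (and R3 or R4) is an assumption; the resulting matrix does not depend on that labelling.

```python
STATES = ["tS", "tG"]
PRIOR = {"tS": 0.5, "tG": 0.5}

SENDERS = {
    "S1": {"tS": "Ooh", "tG": "Aah"},
    "S2": {"tS": "Aah", "tG": "Ooh"},
    "S3": {"tS": "Ooh", "tG": "Ooh"},  # pooling; labelling assumed
    "S4": {"tS": "Aah", "tG": "Aah"},  # pooling; labelling assumed
}
RECEIVERS = {
    "R1": {"Ooh": "aS", "Aah": "aG"},
    "R2": {"Ooh": "aG", "Aah": "aS"},
    "R3": {"Ooh": "aS", "Aah": "aS"},  # pooling; labelling assumed
    "R4": {"Ooh": "aG", "Aah": "aG"},  # pooling; labelling assumed
}

def utility(state, action):
    """1 if the action fits the state (tS with aS, tG with aG), else 0."""
    return 1.0 if state[1] == action[1] else 0.0

def expected_utility(sender, receiver):
    """Average utility over states when the receiver reacts to the sender's message."""
    return sum(PRIOR[t] * utility(t, receiver[sender[t]]) for t in STATES)

matrix = {s: {r: expected_utility(S, R) for r, R in RECEIVERS.items()}
          for s, S in SENDERS.items()}

for s, row in matrix.items():
    print(s, [row[r] for r in RECEIVERS])
```

Only the pairs (S1,R1) and (S2,R2) reach an expected utility of 1; every pair that involves a pooling strategy is stuck at .5.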
Signaling Games: Simulation result
Simulation:
- Agents are placed on a lattice and can only communicate with their direct neighbors
- Agents repeatedly play Lewis’ signaling game and learn via learning dynamics
- Agents are colored according to their strategy combination
Results
- The society splits into two types of language users.
- One group uses (S1,R1), the other uses (S2,R2).
- The agents on the borders miscommunicate
Review: Reinforcement Learning
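[Figure: sender S with one urn per state (ts, tg), each holding balls for the messages m1 and m2; receiver R with one urn per message (m1, m2), each holding balls for the actions as and ag]

- the sender has an urn for each state t ∈ T
- each urn contains balls of each message m ∈ M
- the sender decides by drawing from the urn of the current state t
- the receiver has an urn for each message m ∈ M
- each urn contains balls of each action a ∈ A
- the receiver decides by drawing from the urn of the received message m
- successful communication → urn update
- in general, a signaling system emerges over time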
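The urn scheme above can be simulated in a few lines. This is a minimal sketch under stated assumptions: every urn starts with one ball per option, a successful round adds one ball of the chosen kind to the two urns that were used, failures change nothing, and both states are equally likely. The names `draw` and `simulate` and the parameter values are illustrative and not taken from the slides.

```python
import random

STATES, MESSAGES, ACTIONS = ["ts", "tg"], ["m1", "m2"], ["as", "ag"]
CORRECT = {"ts": "as", "tg": "ag"}  # the action that fits each state

def draw(urn, rng):
    """Draw one ball; each label is chosen in proportion to its count."""
    labels = list(urn)
    return rng.choices(labels, weights=[urn[label] for label in labels], k=1)[0]

def simulate(rounds=5000, seed=0):
    """Play the signaling game repeatedly and return the urns and the success count."""
    rng = random.Random(seed)
    sender = {t: {m: 1 for m in MESSAGES} for t in STATES}      # one urn per state
    receiver = {m: {a: 1 for a in ACTIONS} for m in MESSAGES}   # one urn per message
    successes = 0
    for _ in range(rounds):
        t = rng.choice(STATES)       # nature picks a state
        m = draw(sender[t], rng)     # sender draws a message from the urn for t
        a = draw(receiver[m], rng)   # receiver draws an action from the urn for m
        if CORRECT[t] == a:          # successful communication: reinforce both choices
            sender[t][m] += 1
            receiver[m][a] += 1
            successes += 1
    return sender, receiver, successes

sender, receiver, successes = simulate()
print("sender urns:  ", sender)
print("receiver urns:", receiver)
print("success rate: ", successes / 5000)
```

Inspecting the final urns shows which message each state has become associated with; whether the run ends in (S1,R1) or (S2,R2) depends on the random seed.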
Signaling Games: Research in Tübingen
We are applying signaling games...
- as dynamic games (repeatedly played)
- on a multi-agent system (lattice, network)
- combined with update dynamics and learning accounts
Main results:
If all members of a group of agents use / have learned a unique signaling system...
- a convention has emerged (Lewis, 1969)
- in which meanings are assigned to messages
- ergo: a fragment of a language / dialect / slang has emerged
- Since simple learning dynamics like reinforcement learning lead to the evolution of conventions, mutual knowledge and rationality are not necessary to explain the phenomena. But can we necessarily exclude them?
Signaling Games: Horn’s division of pragmatic labor
The model
A simple signaling game:
- A set of states T = {tf, tr} (frequent, rare)
- A set of actions A = {af, ar}
- A set of messages M = {mu, mm} (unmarked, marked)
- A probability function Pr ∈ ∆(T) with Pr(tf) > Pr(tr)
- A cost function C : M → ℝ with C(mm) > C(mu)
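A small calculation shows why the prior and the message costs matter in this model. This is a minimal sketch under stated assumptions: the numbers Pr(tf) = 0.75, C(mu) = 0.1 and C(mm) = 0.2 are invented for illustration, the marked message mm is assumed to be the costlier one, the sender earns 1 for a correct interpretation minus the message cost, and "anti-Horn" is the name used here for the reversed mapping.

```python
PRIOR = {"tf": 0.75, "tr": 0.25}  # assumed values with Pr(tf) > Pr(tr)
COST = {"mu": 0.1, "mm": 0.2}     # assumed values; the marked message costs more

# Each pair is (sender strategy, receiver strategy).
HORN = ({"tf": "mu", "tr": "mm"}, {"mu": "af", "mm": "ar"})
ANTI_HORN = ({"tf": "mm", "tr": "mu"}, {"mm": "af", "mu": "ar"})

def expected_sender_utility(sender, receiver):
    """Prior-weighted sender payoff: 1 for a correct action, minus the message cost."""
    total = 0.0
    for t, p in PRIOR.items():
        m = sender[t]
        success = 1.0 if receiver[m][1] == t[1] else 0.0  # af fits tf, ar fits tr
        total += p * (success - COST[m])
    return total

print("Horn:     ", expected_sender_utility(*HORN))
print("anti-Horn:", expected_sender_utility(*ANTI_HORN))
```

Both mappings communicate perfectly, but pairing the cheap message with the frequent state pays the higher cost less often, so under these assumed numbers the Horn mapping yields the higher expected utility (0.875 against 0.825).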
Signaling Games: Horn’s division of pragmatic labor
Examples:
- tf: Miss X sang Home Sweet Home in a normal way
  tr: Miss X sang Home Sweet Home in a strange way
  mu: "Miss X sang Home Sweet Home"
  mm: "Miss X produced a series of sounds that corresponded closely with the score of Home Sweet Home"
- tf: John became a prisoner
  tr: John went to the prison building
  mu: "John went to jail"
  mm: "John went to the jail"
- tf: a person who cooks / a tool that drills
  tr: a tool that cooks / a person who drills
  mu: "cook" / "drill"
  mm: "cooker" / "driller"