ECAL11 Paris - August 2011

Devil in the Details: Analysis of a Coevolutionary Model of Language Evolution via Relaxation of Selection

Luke McCrohon & Olaf Witkowski, University of Tokyo, Japan

A Model of Linguistic Gene-Culture Coevolution

Figure 1: An illustration of the simulation model. Each agent is born with innate knowledge of language (UG) which is inherited through the genetic channel (1). By receiving linguistic data from the previous generation, agents develop their own I-language. With the knowledge, they communicate with each other, and leave data for the next generation. By doing so, linguistic knowledge is not only genetically but also culturally inherited at both I-language (2) and E-language levels (3).

(Source: Yamauchi & Hashimoto 2010)

Yamauchi & Hashimoto 2010 : The Baldwin Effect

• Learned behavior gradually assimilated into the agents' genetic repertoire

[Figure: fitness landscape, with learning and with no learning]

Yamauchi & Hashimoto 2010 : Cultural Masking

• Culture compensating for genetically maladaptive traits

• e.g. biosynthesis of vitamin C

Yamauchi & Hashimoto 2010: Why Do We Care ?

• Shows cyclic repetition of stages

• Biological selection is masked, innate behavior degrades

• Selection is unmasked, behaviors are nativized (Baldwin)

• We initially wanted to investigate the influence of the rates of cultural change


Yamauchi & Hashimoto 2010

• Agents

• Chromosome: Length 12 Array (0 or 1)

• Grammar: Length 12 Array (0, 1 or null)

• Learning resource: Integer value (initially 24)

• Fitness: Integer value (initially 1)
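The agent description above can be sketched as a small data structure. This is only an illustrative Python sketch, not the authors' Java implementation: the class name `Agent`, the helper `random_agent`, and the use of `None` for null grammar slots are assumptions made here; only the lengths and initial values come from the slide.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional

GENOME_LENGTH = 12      # chromosome and grammar length (from the slide)
INITIAL_RESOURCE = 24   # initial learning resource (from the slide)
INITIAL_FITNESS = 1     # initial fitness (from the slide)


@dataclass
class Agent:
    # Chromosome (UG): 12 alleles, each 0 or 1.
    chromosome: List[int]
    # Grammar (I-language): 12 slots, each 0, 1, or None (null, not yet acquired).
    grammar: List[Optional[int]] = field(
        default_factory=lambda: [None] * GENOME_LENGTH)
    learning_resource: int = INITIAL_RESOURCE
    fitness: int = INITIAL_FITNESS


def random_agent(rng: random.Random) -> Agent:
    """Hypothetical helper: build an agent with a uniformly random chromosome."""
    return Agent(chromosome=[rng.randint(0, 1) for _ in range(GENOME_LENGTH)])
```

A newly created agent starts with an empty grammar, so everything it knows beyond its chromosome has to be learned or invented.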

[Figure: schematic of the model over time t. Labels: UG, I-language, E-language, Agent, Learning, Biological Inheritance, Cultural Inheritance, Biological Niche Construction, Cultural Niche Construction]

Figure 2: A schematized version of Fig. 1 capturing the niche-constructive aspect of the model. In this model of language evolution, the dual inheritance mechanism is implemented as genetic inheritance of linguistic knowledge (UG), and cultural inheritance of linguistic activities (E-language). E-language here is “biphysitic” i.e., it plays the role of selective environment, and learning environment. E-language is niche-constructed: it is dynamically constructed/modified by linguistic activities based on the acquired I-language. As the acquisition of I-language is sensitive to E-language, both learning and selection are closely interrelated within one single feedback loop.

Figure 3: The spatial organization of the population.


• Every generation, for each agent:

• Learning: the agent is exposed to the utterances of the previous generation, with a chance to learn them

• Invention: if null values remain in its grammar, the agent has a chance to invent new values

• Communication: the agent interacts with its neighbors in the current generation to determine its fitness

• Reproduction: a new generation is created, replacing the previous one



• An agent learning a word

• matching grammar: costs 1 learning resource

• mismatching grammar: costs 4 learning resources
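The learning cost rule above can be sketched as follows. The function name `try_learn`, its argument order, and the behavior when a slot is already filled or when the agent cannot afford the cost are assumptions made for illustration; only the costs of 1 and 4 come from the slide.

```python
from typing import List, Optional

MATCH_COST = 1      # learned value equals the chromosome allele at that position
MISMATCH_COST = 4   # learned value differs from the chromosome allele


def try_learn(chromosome: List[int], grammar: List[Optional[int]],
              resource: int, position: int, value: int) -> int:
    """Try to store `value` at `position` in `grammar`.

    Returns the learning resource left afterwards. The grammar is modified
    in place only if the slot is empty and the cost is affordable
    (both conditions are assumptions of this sketch).
    """
    if grammar[position] is not None:
        return resource  # slot already filled, nothing to learn
    cost = MATCH_COST if chromosome[position] == value else MISMATCH_COST
    if cost > resource:
        return resource  # not enough learning resource left
    grammar[position] = value
    return resource - cost
```

For example, with chromosome `[0, 1]` and 4 resources, learning a matching `0` at position 0 leaves 3 resources, after which a mismatching value at position 1 can no longer be afforded.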

[Figure: teacher grammar, utterance, learner grammar]








Yamauchi & Hashimoto 2010 : The Original Results

• The paper identifies 3 distinct stages :

• Stage 1 - Baldwin effect

• Stage 2 - Functional Redundancy

• Stage 3 - Unmasking of Natural Selection

[Figure 5 panels: (a) Overall Result, generations 0 to 5000, with stages marked 1, 2, 3, 2*; (b) Stage 1, generations 0 to 1000; (c) Stage 2, generations 1000 to 3000; (d) Stage 3, generations 3000 to 3500. Each panel plots Learning Intensity (0 to 24) and Gene-Grammar Match (0 to 12) against Generations.]

Figure 5: Evolution of the population at the underlying level. On each graph, the expended cognitive resource (after the learning process) indicates the intensity of learning. The degree of assimilation is measured by the number of grammatical elements whose value is the same as that of their corresponding allele on UG.
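The gene-grammar match described in the Figure 5 caption can be computed as in the following sketch. The function names and the representation of a population as a list of (chromosome, grammar) pairs are assumptions made here; treating null grammar slots as non-matching is also an assumption.

```python
from typing import List, Optional, Sequence, Tuple


def gene_grammar_match(chromosome: Sequence[int],
                       grammar: Sequence[Optional[int]]) -> int:
    """Count grammar slots whose value equals the corresponding allele.

    Null (None) slots are counted as non-matching in this sketch.
    """
    return sum(1 for allele, value in zip(chromosome, grammar)
               if value is not None and value == allele)


def mean_gene_grammar_match(
        population: List[Tuple[Sequence[int], Sequence[Optional[int]]]]) -> float:
    """Average gene-grammar match over (chromosome, grammar) pairs."""
    return sum(gene_grammar_match(c, g) for c, g in population) / len(population)
```

With 12 alleles the per-agent value is an integer between 0 and 12, while the population average can take non-integer values.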


• Stage 1: Baldwin effect, Niche Construction

• Agents go from no culturally transmitted language to a highly uniform language shared between agents, and consequently a high fitness


• Stage 2: Functional Redundancy

• Cultural transmission masks biological selection, causing a progressive drop in correlation between the gene pool and the environment



• Stage 3: Unmasking of Natural Selection

• Convergence on different languages; the gene-grammar match has deteriorated so much that biological selection is no longer masked.

• A biological assimilatory process like that in Stage 1 follows, leading to cycles between Stages 2 and 3


Model Reimplemented

Code (Java) available at http://code.google.com/p/suzume/

Analysis : Genetic Diversity

• Is “Stage 1” really showing a Baldwin effect?

• We ran the simulation with neutral biological selection, ignoring fitness

• The same reduction of genetic diversity is observed

• The diversity wanders between 5 and 10, because of genetic drift

• Fast-forward simulation. Ready?

• Results:

• Phenotypes (grammars) are even fewer

(10 separate runs, over 20000 generations)
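The idea of running with neutral selection can be illustrated with a generic drift sketch. This is not the authors' model: it assumes asexual reproduction with uniformly random parent choice (fitness ignored), a hypothetical per-allele mutation rate, and it measures diversity as the number of distinct chromosomes. All function names and default parameter values are made up for illustration.

```python
import random
from typing import List, Tuple

Chromosome = Tuple[int, ...]


def neutral_generation(population: List[Chromosome], rng: random.Random,
                       mutation_rate: float) -> List[Chromosome]:
    """Each child copies a uniformly chosen parent, with per-allele bit flips."""
    children = []
    for _ in range(len(population)):
        parent = rng.choice(population)  # fitness plays no role here
        child = tuple(bit ^ 1 if rng.random() < mutation_rate else bit
                      for bit in parent)
        children.append(child)
    return children


def diversity(population: List[Chromosome]) -> int:
    """Number of distinct chromosomes currently in the population."""
    return len(set(population))


def run_neutral(pop_size: int = 200, length: int = 12, generations: int = 500,
                mutation_rate: float = 0.001, seed: int = 0) -> List[int]:
    """Return the diversity after each generation, starting from random genes."""
    rng = random.Random(seed)
    population = [tuple(rng.randint(0, 1) for _ in range(length))
                  for _ in range(pop_size)]
    history = [diversity(population)]
    for _ in range(generations):
        population = neutral_generation(population, rng, mutation_rate)
        history.append(diversity(population))
    return history
```

With `mutation_rate=0.0` the diversity can only stay level or fall, since each generation's chromosomes are sampled from the previous one. That is the sense in which drift alone, without any Baldwin effect, can reduce genetic diversity.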

Analysis : Masked Genetic Selection

• We observe no significant drop of the gene-grammar match below 8. Why?

• 24 learning resources = 4 non-matching × 4 + 8 matching × 1

• Any agent dropping below a gene-grammar match of 8 would have its fitness penalized.
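The arithmetic behind the threshold of 8 can be checked with a short sketch. The function names are hypothetical; the default values (24 resources, 12 alleles, costs 1 and 4) are the ones given on the slides.

```python
from typing import Optional


def max_affordable_mismatches(resource: int = 24, length: int = 12,
                              match_cost: int = 1,
                              mismatch_cost: int = 4) -> Optional[int]:
    """Largest number of non-matching alleles that still lets the agent
    fill all `length` grammar slots, or None if even zero is unaffordable."""
    for mismatches in range(length, -1, -1):
        total = mismatches * mismatch_cost + (length - mismatches) * match_cost
        if total <= resource:
            return mismatches
    return None


def shielding_level(resource: int = 24, length: int = 12,
                    match_cost: int = 1, mismatch_cost: int = 4) -> Optional[int]:
    """Lowest gene-grammar match at which the grammar can still be filled."""
    k = max_affordable_mismatches(resource, length, match_cost, mismatch_cost)
    return None if k is None else length - k
```

With the default values, four mismatches cost 4 × 4 + 8 × 1 = 24, exactly the available resource, while five would cost 27, so the shielding level is 12 − 4 = 8. Changing the costs or the resource supply moves this level accordingly.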



Analysis : Coevolutionary Attractors

• A few values of gene-grammar match occur more frequently than others, showing potential local attractors

• This is caused by language uniformity and lack of genetic variation

• Attractors are around integer values


• State Transitions Graph and Density Plot

Agents begin with 24 learning resources and need to fill the 12 alleles in their grammar. The cost of filling an allele that matches their chromosome is 1, and the cost of filling a non-matching allele is 4. Therefore the maximum number of non-matching alleles an agent can learn while still filling its grammar completely is four; any more and it will not have sufficient resources left to fill the remaining alleles (4 non-matching × 4 + 8 matching × 1 = 24). Experiments changing learning costs and the agents' supply of learning resource alter the shielding level as expected.

This means any agent dropping below a gene-grammar match of 8 will not be able to fill its grammar, and so would have its fitness penalized and would be selected against by biological selection. This differs from the reasoning presented in the original paper, where it was suggested that at this point the agents would be selected against due to increasing variation in their learning input. This will occur in subsequent generations, with agents being subjected to null inputs, but is not the original cause of the unmasking of biological selection.

Coevolutionary Attractors

A close inspection of figure 6 shows that there are certain values at which the gene-grammar match occurs more frequently, specifically values centered around the integers between 8 and 12. This can be seen clearly in the probability density plot shown in figure 7.

Figure 7: Gene-Grammar Match Density [Seed=1303046232707, Runs=100, Generations=5000]

The gene-grammar match occurs most frequently around these values because, as previously mentioned, the language is uniform across agents and genetic variation is highly limited. If there is only one language, and if the vast majority of agents share the same genes, then the average gene-grammar match will fall close to an integer value. It is only when significant portions of the population possess different genes that the population will move away from these points. In cases where this does happen, genetic drift will usually sweep the population back to its original integer value. In rare cases, however, if the population moves sufficiently far away from its previously stable genetic state, drift may cause the population to be swept to a new uniform genetic state (and hence a different integer gene-grammar match value).

Of course there may be several different chromosome-grammar matches that result in agents exhibiting the same gene-grammar match value. However, as nothing in the agent's learning algorithm changes their probability of learning individual grammatical alleles due to a particular set of genetic biases (only the number of matches ultimately influences learning), these different model states will behave identically. Because of this it is safe to view the integer-valued gene-grammar matches as attractor states in the simulation, despite them potentially representing a number of different underlying gene-culture states.

          to 12   to 11   to 10   to 9    to 8
from 12   .55     .05     –       –       –
from 11   .01     .52     .07     .01     –
from 10   –       .02     .42     .08     –
from 9    –       –       .02     .61     .03
from 8    –       –       –       .04     .78

We calculated the likelihood of the simulation jumping between each of these attractor states (±0.2 units) over a period of 200 generations. The transition probability matrix is presented in the table above and in the transition diagram in figure 8. We tested these results against transitions between equally sized intervals positioned directly between the attractor states and obtained probabilities of the simulation staying in those intervals approximately 5 times lower than in the case of the attractors. This indicates that the attractors are significantly more stable.

Figure 8: State Transition Diagram [Seed=1303037425613, Runs=50, Generations=20000]
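One way such a transition table could be estimated from a recorded series of average gene-grammar match values is sketched below. The exact counting procedure is not given in the source, so this is an assumption: each time step is assigned to an attractor if it lies within ±0.2 of an integer from 8 to 12, and transitions are counted between time t and t + 200. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Optional, Sequence

ATTRACTORS = (12, 11, 10, 9, 8)
HALF_WIDTH = 0.2   # the ±0.2 band around each attractor


def attractor_of(value: float) -> Optional[int]:
    """Return the attractor whose band contains `value`, or None."""
    for a in ATTRACTORS:
        if abs(value - a) <= HALF_WIDTH:
            return a
    return None


def transition_matrix(series: Sequence[float],
                      lag: int = 200) -> Dict[int, Dict[int, float]]:
    """Estimate P(in band `dst` at t + lag | in band `src` at t).

    Rows are normalized by the number of start points in `src`, so a row
    sums to less than 1 when the series ends up outside every band.
    """
    counts = {a: defaultdict(int) for a in ATTRACTORS}
    starts = defaultdict(int)
    for t in range(len(series) - lag):
        src = attractor_of(series[t])
        if src is None:
            continue
        starts[src] += 1
        dst = attractor_of(series[t + lag])
        if dst is not None:
            counts[src][dst] += 1
    return {src: {dst: counts[src][dst] / starts[src] for dst in ATTRACTORS}
            for src in ATTRACTORS if starts[src] > 0}
```

Under this convention, time spent between bands is not assigned to any destination state, so row sums below 1 are expected.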

Shape of the Attractors

As can be seen in figure 7, the attractors are not symmetrical. For all attractors except the lowest one (at a gene-grammar match of 8) there are significantly more values in the region directly below them than in the region above. This is representative of the fact that deviations from the fixed point are more likely to be in a downward direction.

• Shape of the attractors

• attractors not symmetrical

• deviation downward more probable

• result of biological change: random change from optimal is often sub-optimal


Analysis: Steady State in the Long Run

• Transient can be ignored

• Lower attractors favored in the long run


Analysis : Sensitivity to Initial Conditions

[Figure: gene-grammar match density plots for populations of 400, 1000, and 50 agents]

• Changing the population size, we get this kind of density for the gene-grammar matches.

• Drift effect masked for a larger population

• Qualitatively different behavior


Conclusions

• The model from Yamauchi & Hashimoto 2010 does capture some of the intended phenomena (e.g. Cultural Shielding, Niche Construction)

• “Stage 2” does not show the claimed degradation of gene-grammar matches, but rather is a random walk between a set of attractors

• Observed model behavior is the result of limited Cultural and Biological Diversity which are themselves the result of the small agent population

• Population Structure and Organization are factors that can potentially increase diversity without the computational costs of a larger agent population

Thank you
