47
12/1/2014 1 Natural Language Processing Diachronics Dan Klein – UC Berkeley Includes joint work with Alex BouchardCote, Tom Griffiths, and David Hall

Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

1

Natural Language Processing

DiachronicsDan Klein – UC Berkeley

Includes joint work with Alex Bouchard‐Cote, Tom Griffiths, and David Hall

Page 2: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

2

The Task

Page 3: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

3

Latin

focus

Lexical Reconstruction

French Spanish Italian Portuguese

feu fuego fuoco fogo

Page 4: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

4

Tree of Languages

We assume the phylogeny is known Much work in 

biology, e.g. work by Warnow, Felsenstein, Steele…

Also in linguistics, e.g. Warnow et al., Gray and Atkinson…

http://andromeda.rutgers.edu/~jlynch/language.html

Page 5: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

5

Evolution through Sound Changes

camera /kamera/Latin

chambre /ṏambЙ/French

Deletion: /e/, /a/

Change: /k/ .. /tṏ/ .. /ṏ/

Insertion: /b/

Eng. camera from Latin, “camera obscura”

Eng. chamber from Old Fr. before the initial /t/ dropped

Page 6: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

6

Changes are Systematic

camra /kamra/

camera /kamera/

e _

numrus /numrus/

numerus /numerus/

e _

Page 7: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

7

Changes are Contextual

camra /kamra/

camera /kamera/

e _ / after stress

e _

Page 8: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

8

Changes Have Structure

cambra /kambra/

camra /kamra/

_ b / m_r

_ b

_ [stop x] / [nasal x]_r

Page 9: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

9

Changes are Systematic

English Great Vowel Shift (Simplified!)

e

i

a

“time” = teem “time” = taim

Page 10: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

10

Diachronic Evidence

tonitru non tonotrutonight not tonite

Yahoo! Answers [ca 2000] Appendix Probi [ca 300]

Page 11: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

11

Synchronic (Comparative) Evidence

Key idea: changes occur uniformly across the lexicon

Page 12: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

12

The Data

Page 13: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

13

The Data Data sets Small: Romance

French, Italian, Portuguese, Spanish 2344 words Complete cognate sets Target: (Vulgar) Latin

FR IT PT ES

Page 14: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

14

The Data Data sets Small: Romance

French, Italian, Portuguese, Spanish 2344 words Complete cognate sets Target: (Vulgar) Latin

Large: Austronesian 637 languages 140K words Incomplete cognate sets Target: Proto‐Austronesian

FR IT PT ES

Page 15: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

15

Austronesian

Page 16: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

16

Austronesian Examples

From the Austronesian Basic Vocabulary Database

Page 17: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

17

The Model

Page 18: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

18

Simple Model: Single Characters

CG CC CC GG

G

C GG

[cf. Felsenstein 81]

Page 19: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

19

Changes are Systematic

/fokus/

/fweɋo/

/fogo/

/fogo//fwꜼko/

/fokus/

/fweɋo/

/fogo/

/fogo//fwꜼko/

/kentrum/

/sentro/

/sentro/

/sentro//tṏƌntro/

Page 20: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

20

Parameters are Branch‐Specific

focus /fokus/

fuego /fweɋo/

/fogo/

fogo /fogo/

fuoco /fwꜼko/

IB

IT ES PT

IB

LAES

IT PT

[Bouchard‐Cote, Griffiths, Klein, 07]

Page 21: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

21

Edits are Contextual, Structured

/fokus/

/fwꜼko/

f# o

f# w Ꜽ

IT

Page 22: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

22

Inference

Page 23: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

23

Learning: Objective

/fokus/

/fweɋo/

/fogo/

/fogo//fwꜼko/

z

w

Page 24: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

24

Learning: EM M‐Step Find parameters which fit 

(expected) sound change counts

Easy: gradient ascent on theta

E‐Step Find (expected) change 

counts given parameters Hard: variables are string‐

valued

/fokus/

/fweɋo/

/fogo/

/fogo//fwꜼko/

/fokus/

/fweɋo/

/fogo/

/fogo//fwꜼko/

Page 25: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

25

Computing Expectations

‘grass’

Standard approach, e.g. [Holmes 2001]: Gibbs sampling each sequence

[Holmes 01, Bouchard‐Cote, Griffiths, Klein 07]

Page 26: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

26

A Gibbs Sampler

‘grass’

Page 27: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

27

A Gibbs Sampler

‘grass’

Page 28: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

28

A Gibbs Sampler

‘grass’

Page 29: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

29

Getting Stuck

How could we jump to a state where the liquids /r/ and /l/ have a common 

ancestor?

?

Page 30: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

30

Getting Stuck

Page 31: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

31

Efficient Sampling: Vertical Slices

Single Sequence 

Resampling

Ancestry Resampling

[Bouchard‐Cote, Griffiths, Klein, 08]

Page 32: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

32

Results

Page 33: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

33

Results: Romance

Page 34: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

34

Learned Rules / Mutations

Page 35: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

35

Learned Rules / Mutations

Page 36: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

36

Results: Austronesian

Page 37: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

37

Examples: Austronesian

[Bouchard‐Cote, Hall, Griffiths, Klein, 13]

Page 38: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

38

Result: More Languages Help

Number of modern languages used

Mean ed

it distance

Distance from Blust [1993] Reconstructions

Page 39: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

39

Visualization: Learned Universals

*The model did not have features encoding natural classes

Page 40: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

40

Regularity and Functional Load

In a language, some pairs of sounds are more contrastive than others (higher functional load)

Example: English p/d versus t/th

High Load: p/d:  pot/dot, pin/dindress/press, pew/dew, ...

Low Load: t/th:  thin/tin

Page 41: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

41

Functional Load: Timeline1955: Functional Load Hypothesis (FLH): Sound changes are 

less frequent when they merge phonemes with high functional load    [Martinet, 55]

1967: Previous research within linguistics: “FLH does not seem to be supported by the data” [King, 67] (Based on 4 languages as noted by [Hocket, 67; Surandran et al., 06])

Our approach: we reexamined the question with two orders of magnitude more data [Bouchard‐Cote, Hall, Griffiths, Klein, 13]

Page 42: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

42

Regularity and Functional Load

Functional load as computed by [King, 67]

Data: only 4 languages from the Austronesian dataMerger p

osterio

r probability

Each dot is a sound change identified by the system

Page 43: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

43

Regularity and Functional Load

Data: all 637 languages from the Austronesian data

Functional load as computed by [King, 67]

Merger p

osterio

r probability

Page 44: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

44

Extensions

Page 45: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

45

Cognate Detection

/fweɋo/

/fogo/

/fwꜼko/

/berǍo/

/vƌrbo/

/vƌrbo/ /tṏƌntro/

/sentro/

/sƌntro/

‘fire’

[Hall and Klein, 11]

Page 46: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

46

Grammar Induction

010203040506070

Dut

ch

Dan

ish

Swed

ish

Span

ish

Portu

gues

e

Slov

ene

Chi

nese

Engl

ish

WG NGRMG

IEGLAvg rel gain: 29% 

[Berg‐Kirkpatrick and Klein, 07]

Page 47: Natural Language Processing - Peopleklein/cs288/fa14/slides/fa14lecture25-1pp.pdfEnglish Great Vowel Shift (Simplified!) e i a “time” = teem “time” = taim. 12/1/2014 10 Diachronic

12/1/2014

47

Language Diversity

Why are the languages of the world so similar?

Universal grammar answer: Hardware constraints

Common source answer: Not much time has passed

[Rafferty, Griffiths, and Klein, 09]