Introduction
Language change
Models of language change
Diversity and differences
Questions, answers and contributions
Acknowledgements
References
Vocabulary lists in computational historical linguistics
Licentiate Seminar
Taraka Rama
Språkbanken & GSLT
University of Gothenburg
Outline
Introduction
Language change
Models of language change
Diversity and differences
Questions, answers and contributions
Acknowledgements
References
Goal of the thesis
Applying techniques from Language Technology (LT) to the following problems:
- Dating of language families
- Structural similarity vs. genetic similarity
- Language classification
Language count
- More than 7000 languages or
- 100,000 languoids¹
- 400 language families
¹ Defined as a set of documented and closely related linguistic varieties (Nordhoff & Hammarström 2012).
Historical linguistics I
Concerned with:
- Language change: phonological, lexical, grammatical, and semantic change
- The processes introducing the language change
- Identifying the (pre-)historic relationships between languages
Historical linguistics II
From Diamond (2011), Vajda (2010).
Computational Historical Linguistics
Employs LT (including computational) techniques to:
- Classify languages
- Evaluate hypotheses of language relatedness
- Devise phonological rule systems
- Reconstruct proto-forms
Basics: I
Cognates:
- Inherited words whose origin can be traced back to a common form
- Ex: Sanskrit dva ~ Armenian erku ‘two’
- Ex: Sanskrit chakra ~ English wheel < PIE kwekwelo
Basics: II
Cognacy representation²

Item      Danish    Swedish          Dutch    English
‘skin’    skind/1   skinn/1, hud/2   huid/2   skin/1

Items      Danish   Swedish   Dutch   English
‘skin-1’   1        1         0       1
‘skin-2’   0        1         1       0
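The step from the annotated word list to the binary matrix can be sketched as follows. This is a minimal illustration that assumes each form carries the number of its cognate class, as in the table above; the dictionary layout and the function name `binarize` are hypothetical and not taken from the thesis.

```python
# Cognate-class annotations for the concept 'skin', copied from the table above.
# Each form is paired with the identifier of its cognate class.
cognate_sets = {
    "Danish": [("skind", 1)],
    "Swedish": [("skinn", 1), ("hud", 2)],
    "Dutch": [("huid", 2)],
    "English": [("skin", 1)],
}


def binarize(concept, annotations):
    """Return {'concept-k': {language: 0 or 1}} for every cognate class k."""
    classes = sorted({cls for forms in annotations.values() for _, cls in forms})
    matrix = {}
    for cls in classes:
        row = {}
        for language, forms in annotations.items():
            # 1 if the language has at least one form in this cognate class.
            row[language] = int(any(c == cls for _, c in forms))
        matrix[f"{concept}-{cls}"] = row
    return matrix


matrix = binarize("skin", cognate_sets)
for item, row in matrix.items():
    print(item, [row[lang] for lang in cognate_sets])
# skin-1 [1, 1, 0, 1]
# skin-2 [0, 1, 1, 0]
```

A language with two synonyms in different classes (Swedish here) simply gets a 1 in both rows, which is why the binary form can hold more rows than the word list has concepts.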
Basics: III
Families:
- Language families: related languages descended from a common ancestor
  - Ex: Indo-European, Dravidian, Niger-Congo, Mixe-Zoquean, and Austronesian
- Language group: subset of a language family
  - Indo-European: Slavic, Germanic, Indo-Iranian (Iranian and Indic)
Basics: IV
Spread of Indo-European family³
² Wichmann (2010)
³ Bouckaert et al. (2012)
Word lists
(from Grant 2010)
Holman et al. (2008): 40-word listᵃ
blood, bone, breasts, come, die, dog, drink, ear, eye, fire, fish, full, hand, hear, horn, I, knee, leaf, liver, louse, mountain, name, new, night, nose, one, path, person, see, skin, star, stone, sun, tongue, tooth, tree, two, water, we, you
ᵃ Swadesh (1955): 100-word list
Phonological change: I
- Sound addition: Cypriot Arabic developed a [k] as in *pjara > pkjara
- Sound loss: *tracu > Pengo racu ‘snake’
- Metathesis: Latin miraculum > Spanish milagro ‘miracle’
Phonological change: II
Levenshtein (1966): computes the distance between two strings as the minimum number of insertions, deletions, and substitutions needed to transform a source string into a target string (LD).
Damerau (1964) is an extension to LD that also counts the transposition of two adjacent symbols as a single operation.
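The definition of LD can be sketched with the standard dynamic-programming recurrence. This is a textbook formulation with unit costs, not code from the thesis; the function name `levenshtein` is chosen here for illustration.

```python
def levenshtein(source, target):
    """Minimum number of insertions, deletions and substitutions
    needed to turn `source` into `target` (unit cost for each)."""
    # previous[j] holds the distance between the processed prefix of
    # `source` and the first j symbols of `target`.
    previous = list(range(len(target) + 1))
    for i, s in enumerate(source, start=1):
        current = [i]
        for j, t in enumerate(target, start=1):
            current.append(min(
                previous[j] + 1,             # delete s
                current[j - 1] + 1,          # insert t
                previous[j - 1] + (s != t),  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


print(levenshtein("skind", "skinn"))  # 1 (one substitution)
print(levenshtein("hud", "huid"))     # 1 (one insertion)
```

The nested loops visit every pair of positions, so the running time grows with the product of the two string lengths.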
Phonological change: III
Linguistically sensitive Levenshtein distance⁴
- Represent each symbol as a vector of phonetic features
- Compare the vectors of phonetic features using a distance measure
⁴ Kessler (2005)
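The two steps above can be sketched by replacing the unit substitution cost of LD with a distance between feature vectors. The feature table below is a hypothetical, heavily simplified toy (five binary features for eight consonants), and the normalized Hamming distance is only one possible choice of measure; neither is taken from Kessler (2005) or from the thesis.

```python
# Hypothetical binary features: (voiced, nasal, labial, coronal, velar).
FEATURES = {
    "p": (0, 0, 1, 0, 0),
    "b": (1, 0, 1, 0, 0),
    "m": (1, 1, 1, 0, 0),
    "t": (0, 0, 0, 1, 0),
    "d": (1, 0, 0, 1, 0),
    "n": (1, 1, 0, 1, 0),
    "k": (0, 0, 0, 0, 1),
    "g": (1, 0, 0, 0, 1),
}


def substitution_cost(a, b):
    """Share of features on which the two symbols disagree (0.0 to 1.0)."""
    fa, fb = FEATURES[a], FEATURES[b]
    return sum(x != y for x, y in zip(fa, fb)) / len(fa)


def feature_levenshtein(source, target, indel=1.0):
    """Levenshtein distance with feature-based substitution costs."""
    previous = [j * indel for j in range(len(target) + 1)]
    for i, s in enumerate(source, start=1):
        current = [i * indel]
        for j, t in enumerate(target, start=1):
            current.append(min(
                previous[j] + indel,                        # deletion
                current[j - 1] + indel,                     # insertion
                previous[j - 1] + substitution_cost(s, t),  # graded substitution
            ))
        previous = current
    return previous[-1]


# p and b differ only in voicing, p and g in voicing and place.
print(substitution_cost("p", "b"), substitution_cost("p", "g"))
```

With such costs, replacing [p] by [b] is cheaper than replacing [p] by [g], which plain LD cannot express.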
Semantic change: I
- Semantic change
- Lexical change
- Grammatical change
Semantic change: II
Typology:
- Broadening and narrowing: English dog (broadening), hound (narrowing)
- Melioration and pejoration: OHG diorna ‘young girl’ > MHG dirne ‘prostitute’
- Metaphoric extension: head, tail, star
- Metonymic extension: Sanskrit ratha ‘chariot’ ~ Latin rota ‘wheel’
Semantic change: III
Lexical change
- Borrowing: beef ‘cow’ from Norman French
- Neologisms: new words in a language
  - vandalize (from Vandals, a Germanic tribe)
  - all + together ⇒ altogether
  - gym < gymnasium
Semantic change: IV
Grammatical change
- Morphological change: English umlaut ⇒ foot : feet, mouse : mice
- Syntactic change: word order, morphological complexity, verb chains, and grammaticalization
Tree model
Figure: IE family from Garrett (1999).
Wave model
Figure: Indo-European isoglosses from Bloomfield (1935, 316) and tree-envelope from Southworth (1964).
1. Sibilants for velars in certain forms.
2. Case-endings with [m] for [bh].
3. Passive-voice endings with [r].
4. Prefix [e-] in past tenses.
5. Feminine nouns with masculine suffixes.
6. Perfect tense used as general past tense.
Network model⁵
⁵ Gray & Atkinson (2003) and Huson & Bryant (2006)
Linguistic diversity: I
As defined by Nettle (1999):
- Language diversity
- Phylogenetic diversity
- Structural diversity
Linguistic diversity: II
Language diversity
- Languages per square kilometer
- Ex. New Guinea has 800 languages (786 × 10³ km²) ~ Iceland has only one language (103 × 10³ km²)
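The comparison can be made concrete by dividing the language counts by the areas quoted above. The snippet below is a small arithmetic sketch; the function name `language_density` and the choice to report values per 1000 km² are illustrative assumptions.

```python
# Figures quoted on the slide: (number of languages, area in km^2).
regions = {
    "New Guinea": (800, 786e3),
    "Iceland": (1, 103e3),
}


def language_density(languages, area_km2):
    """Languages per square kilometer."""
    return languages / area_km2


for name, (languages, area) in regions.items():
    density = language_density(languages, area)
    print(f"{name}: {density * 1000:.3f} languages per 1000 km^2")

# How many times denser New Guinea is than Iceland.
ratio = language_density(*regions["New Guinea"]) / language_density(*regions["Iceland"])
print(f"ratio: {ratio:.1f}")
```

On these figures New Guinea comes out at roughly one language per 1000 km², about a hundred times the density of Iceland.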
Linguistic diversity: III
Hotspots of language diversity⁶
Linguistic diversity: IV
Phylogenetic diversity
- Number of families per square kilometer.
- North America has more than 20 language families in 24.49 × 10⁶ km².
- South America has 53 families in 17.84 × 10⁶ km².
Linguistic diversity: V
North American family distribution⁷
Linguistic diversity: VI
Structural diversity
- Languages per square kilometer with respect to a linguistic feature.
- Ex. Word order, size of phoneme inventory, morphological type, or suffixing vs. prefixing.
Linguistic diversity: VII
Plosive systems⁸
Encoding
Linguistic diversity: VIII
Phonological segment distribution
Figure: Segment identity. White circles: Romance; black circles: North Sea; grey squares: Slavic; grey triangles: Ireland, Britain; the remaining points are isolates.
Linguistic diversity: IX
Phonological diversity
- Similarity between the phonetic inventories translates into language relatedness (Lohr 1998)
- Differences translate into inter-language distance (or divergence)
⁶ Gorenflo et al. (2012)
⁷ http://www.freelang.net/families/language_maps.php
⁸ Donohue (2012)
Questions:
1. Can language relationships be inferred from parallel corpora?
   Corpus-based phylogenetic inference
2. How well can structural relations be employed for the task of language classification?
   Structural similarity and genetic classification
3. How to develop a system for dating the split/divergence of language groups present in the world’s language families?
   Estimating age of language families
4. How to generate a ranked list of concepts which can be used for investigating the problem of automatic language classification?
   Item stability
5. Which string similarity measure is the best for the tasks of automatic discrimination and internal classification of languages?
   Comparison of string similarity measures for automated language classification
Corpus-based phylogenetic inference⁹
- Use three different string similarity measures
- Show that parallel corpora can be used to automatically extract cognates and infer a phylogenetic tree
- Work with 10 European languages
Dice and LCSR tree
⁹ Rama & Borin (2011)
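Two of the measures named on this slide, Dice and LCSR, can be sketched as below. The definitions used are the common ones (Dice over character bigrams, LCSR as the longest common subsequence divided by the length of the longer word); the exact variants in Rama & Borin (2011) may differ, and the function names are illustrative.

```python
from collections import Counter


def bigrams(word):
    """Multiset of adjacent character pairs."""
    return Counter(word[i:i + 2] for i in range(len(word) - 1))


def dice(a, b):
    """Dice coefficient over character bigrams."""
    ba, bb = bigrams(a), bigrams(b)
    total = sum(ba.values()) + sum(bb.values())
    if total == 0:
        return 0.0
    shared = sum((ba & bb).values())  # multiset intersection
    return 2 * shared / total


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b."""
    previous = [0] * (len(b) + 1)
    for x in a:
        current = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def lcsr(a, b):
    """Longest common subsequence ratio."""
    longest = max(len(a), len(b))
    return lcs_length(a, b) / longest if longest else 0.0


print(dice("jus", "juice"), lcsr("jus", "juice"))
```

Both functions return values between 0 and 1, so a distance for tree building can be taken as one minus the similarity.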
The grouping of Dutch, English and French is explained by the fact that the first two have borrowed large parts of the vocabulary used in the Europarl corpus (administrative and legal terms) from French, and additionally in many cases have a spelling close to the original French form of the words (whereas French loanwords in e.g. Swedish have often been orthographically adapted, for example French jus ∼ English juice ∼ Swedish sky ‘meat juice’).
Structural similarity and genetic classification¹⁰
Correlate typological distances with genealogical classification
¹⁰ Rama & Kolachina (2012)
Correlate typological distances with the lexical distances computed from 40-word Swadesh lists
Estimating the age of language families I
The combination of phonotactic diversity and lexical divergence is used to predict the dates of splits for more than 50 language families¹¹
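One simple way to quantify phonotactic diversity is to count the distinct n-grams attested in a group's word lists. The sketch below assumes that reading (N-gram diversity as the number of n-gram types); the toy forms are the 'skin' words from the cognacy table earlier in the talk, and the function names are hypothetical rather than taken from Rama (2013).

```python
def ngrams(word, n):
    """All contiguous substrings of length n, in order."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def ngram_diversity(words, n):
    """Number of distinct n-grams attested in a collection of word forms."""
    return len({gram for word in words for gram in ngrams(word, n)})


# Toy input: a real study would pool the full word lists of every
# language in the group, preferably in phonetic transcription.
group_words = ["skind", "skinn", "hud", "huid", "skin"]
for n in range(1, 4):
    print(f"{n}-gram diversity:", ngram_diversity(group_words, n))
```

Counts of this kind, computed per language group, could then be related to known split dates on a log-log scale.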
Estimating the age of language families II
Subfamily           NOL   CD     Type   FN    MOS   GA¹²
Brythonic           2     1450   H      IE    AGR   Eurasia
Dardic              22    3550   A      IE    AGR   Eurasia
Inuit               4     800    A      EA    PAS   Americas
Malayo-Polynesian   954   4250   A      An    AGR   Oceania
Ongamo-Maa          4     1150   A      NS    AGR   Africa
Slavic              16    1450   H      IE    AGR   Eurasia
Turkic              51    2500   AH     Alt   AGR   Eurasia
Estimating the age of language families III
Figure: Pairwise scatterplot matrix of group size, N-gram diversity and date; the lower matrix panels show scatterplots and LOESS lines; the upper matrix panels show Spearman rank correlation (ρ) and level of statistical significance (*). The diagonal panels display variable names (LGS, 1-grams, 2-grams, 3-grams, 4-grams, 5-grams, CD). All 21 pairwise correlations are marked *** and range from 0.646 to 0.987. All the plots are on a log-log scale.
¹¹ Rama (2013)
¹² FN: family name; MOS: mode of subsistence; GA: geographical area; An: Austronesian; Alt: Altaic; EA: Eskimo-Aleut; NS: Nilo-Saharan
Comparison of string similarity measures I
- Compare the performance of 14 different string similarity techniques:¹³
1. IDENT
2. PREFIX
3. DICE
4. LCSR
5. TRIGRAM
6. XDICE
7. Jaccard’s index JCD
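Three of the listed measures are simple enough to sketch directly. The definitions below follow common usage (IDENT as exact match, PREFIX as shared-prefix length over the longer word, JCD as the Jaccard index over bigram sets); they are assumptions for illustration, and the normalizations used in Rama & Borin (2014a) may differ.

```python
def ident(a, b):
    """IDENT: 1.0 if the two strings are identical, else 0.0."""
    return 1.0 if a == b else 0.0


def prefix(a, b):
    """PREFIX: length of the common prefix over the length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return common / longest


def jaccard(a, b, n=2):
    """JCD: Jaccard index over the sets of character n-grams (bigrams by default)."""
    grams_a = {a[i:i + n] for i in range(len(a) - n + 1)}
    grams_b = {b[i:i + n] for i in range(len(b) - n + 1)}
    union = grams_a | grams_b
    return len(grams_a & grams_b) / len(union) if union else 0.0


for measure in (ident, prefix, jaccard):
    print(measure.__name__, measure("skind", "skinn"))
```

For Danish skind and Swedish skinn, IDENT gives 0 while PREFIX and JCD still register the overlap, which illustrates why the measures can rank word pairs differently.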
Comparison of string similarity measures II
- The FDR procedure¹⁴
  - Suggests that the choice of string similarity measure is immaterial for internal classification
  - For Dist, suggests that JCD(D) > JCD, JCD > TRI(D), DICED > IDENTD, LDND > LCSD, and LCSD > LDN
¹³ Rama & Borin (2014a)
¹⁴ False Discovery Rate (Benjamini & Hochberg 1995)
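The Benjamini & Hochberg (1995) step-up procedure can be sketched as follows: sort the p-values, find the largest rank k whose p-value is at most (k / m) · α, and reject the k hypotheses with the smallest p-values. The p-values in the example are hypothetical and are not results from the thesis.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected
    while controlling the false discovery rate at level alpha."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k with p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * alpha:
            cutoff = rank
    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[index] = True
    return rejected


# Hypothetical p-values from four pairwise comparisons of measures.
print(benjamini_hochberg([0.039, 0.001, 0.205, 0.008]))
# [False, True, False, True]
```

Because the cutoff is the largest qualifying rank, a p-value can be rejected even if it misses its own threshold, as long as a larger p-value meets a later one.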
Item stability I
Employ n-grams to quantify the resistance to lexical replacement across the branches of a language family¹⁵
Item stability II
Lexical items with widespread and numerous cognates are stable. This notion can be captured using self-entropy of n-grams.
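A rough sketch of the idea: pool the n-grams of all forms attested for one item and compute the Shannon entropy of their distribution. Forms that are cognate share n-grams, so the distribution is more peaked and the entropy lower. This formulation, the function name, and the toy forms are assumptions for illustration; the exact definition in Rama & Borin (2014b) may differ.

```python
import math
from collections import Counter


def ngram_self_entropy(forms, n=2):
    """Shannon entropy (in bits) of the pooled n-gram distribution of `forms`."""
    counts = Counter(
        form[i:i + n] for form in forms for i in range(len(form) - n + 1)
    )
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# Toy data: three look-alike forms versus three unrelated-looking forms.
similar_forms = ["skin", "skinn", "skind"]
dissimilar_forms = ["hud", "peau", "cutis"]
print(ngram_self_entropy(similar_forms))
print(ngram_self_entropy(dissimilar_forms))
```

Under this reading, items would be ranked by ascending entropy, with the lowest-entropy items treated as the most stable.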
Item stability III
Ranks derived from n-gram analysis largely agree with the item stability ranks based on phonological matches found by Holman et al. (2008) using LD as the similarity measure.
Item stability IV
At the same time, n-gram analysis is cheaper in terms of computational resources (the fundamental comparison step has linear complexity, against quadratic complexity for LD), which is important when processing large quantities of language data.
¹⁵ Rama & Borin (2014b)
Future work: I
- Exploit longer word lists such as IDS¹⁶ and LWT¹⁷ (Borin, Comrie & Saxena 2013)
- Apply all the available string similarity measures¹⁸
Future work: II
- Check the relationship between reticulation and typological distances (Donohue 2012)
- Use multilingual tree-banks for the comparison of word order, part-of-speech, and syntactic subtree (or treelet) distributions (Wiersma et al. 2011)
Future work: III
- Include the phylogenetic tree structure in automatic dating (Pagel 1999)
- Extract typological and phonological databases (Nordhoff 2012) from:
  - digitized grammatical descriptions
  - public resources such as Wikipedia and Wiktionary
¹⁶ Intercontinental Dictionary Series
¹⁷ Loanword Typology
¹⁸ SimMetrics: http://sourceforge.net/projects/simmetrics/
Thanks for listening!
References: I
Benjamini, Y. & Hochberg, Y. (1995), ‘Controlling the false discovery rate: A practical and powerful approach to multiple testing’, Journal of the Royal Statistical Society. Series B (Methodological) 57(1), 289–300.
Bloomfield, L. (1935), Language, George Allen and Unwin, London.
Borin, L., Comrie, B. & Saxena, A. (2013), The intercontinental dictionary series – a rich and principled database for language comparison, in L. Borin & A. Saxena, eds, ‘Approaches to Measuring Linguistic Differences’, De Gruyter Mouton, Berlin, pp. 285–302.
Bouckaert, R., Lemey, P., Dunn, M., Greenhill, S. J., Alekseyenko, A. V., Drummond, A. J., Gray, R. D., Suchard, M. A. & Atkinson, Q. D. (2012), ‘Mapping the origins and expansion of the Indo-European language family’, Science 337(6097), 957–960.
Damerau, F. J. (1964), ‘A technique for computer detection and correction of spelling errors’, Communications of the ACM 7(3), 171–176.
Diamond, J. (2011), ‘Linguistics: Deep relationships between languages’, Nature 476(7360), 291–292.
Donohue, M. (2012), ‘Typology and Areality’, Language Dynamics and Change 2(1), 98–116.
Garrett, A. (1999), A new model of Indo-European subgrouping and dispersal, in S. S. Chang, L. Liaw & J. Ruppenhofer, eds, ‘Proceedings of the Twenty-Fifth Annual Meeting of the Berkeley Linguistics Society’, Berkeley Linguistics Society, Berkeley, pp. 146–156.
Gorenflo, L. J., Romaine, S., Mittermeier, R. A. & Walker-Painemilla, K. (2012), ‘Co-occurrence of linguistic and biological diversity in biodiversity hotspots and high biodiversity wilderness areas’, Proceedings of the National Academy of Sciences 109(21), 8032–8037.
References: II
Grant, A. P. (2010), ‘Swadesh’s life and place in linguistics’, Diachronica 27(2), 191–196.
Gray, R. D. & Atkinson, Q. D. (2003), ‘Language-tree divergence times support the Anatolian theory of Indo-European origin’, Nature 426(6965), 435–439.
Holman, E. W., Wichmann, S., Brown, C. H., Velupillai, V., Müller, A. & Bakker, D. (2008), ‘Explorations in automated language classification’, Folia Linguistica 42(3-4), 331–354.
Huson, D. H. & Bryant, D. (2006), ‘Application of phylogenetic networks in evolutionary studies’, Molecular Biology and Evolution 23(2), 254–267.
Kessler, B. (2005), ‘Phonetic comparison algorithms’, Transactions of the Philological Society 103(2), 243–260.
Levenshtein, V. I. (1966), Binary codes capable of correcting deletions, insertions and reversals, in ‘Soviet Physics Doklady’, Vol. 10, p. 707.
Lohr, M. (1998), Methods for the genetic classification of languages, PhD thesis, University of Cambridge.
Nettle, D. (1999), Linguistic Diversity, Oxford University Press, Oxford.
Nordhoff, S., ed. (2012), Electronic Grammaticography, University of Hawai‘i, Honolulu, Hawai‘i.
Nordhoff, S. & Hammarström, H. (2012), Glottolog/Langdoc: Increasing the visibility of grey literature for low-density languages, in ‘Language Resources and Evaluation Conference’, pp. 3289–3294.
References: III
Pagel, M. (1999), ‘Inferring the historical patterns of biological evolution’, Nature 401(6756), 877–884.
Rama, T. (2013), ‘Phonotactic diversity predicts the time depth of the world’s language families’, PLoS ONE 8(5), e63238.
Rama, T. & Borin, L. (2011), Estimating language relationships from a parallel corpus. A study of the Europarl corpus, in ‘NEALT Proceedings Series (NODALIDA 2011 Conference Proceedings)’, Vol. 11, pp. 161–167. URL: http://hdl.handle.net/10062/17303
Rama, T. & Borin, L. (2014a), ‘Comparison of string similarity measures for automated language classification’. Under review.
Rama, T. & Borin, L. (2014b), ‘N-gram approaches to the historical dynamics of basic vocabulary’, Journal of Quantitative Linguistics 21(1), 50–64.
Rama, T. & Kolachina, P. (2012), How good are typological distances for determining genealogical relationships among languages?, in ‘COLING (Posters)’, pp. 975–984. URL: http://aclweb.org/anthology/C/C12/C12-2095.pdf
Southworth, F. C. (1964), ‘Family-tree diagrams’, Language 40(4), 557–565.
Swadesh, M. (1955), ‘Towards greater accuracy in lexicostatistic dating’, International Journal of American Linguistics 21(2), 121–137.
Vajda, E. (2010), Yeniseian, Na-Dene, and historical linguistics, in J. Kari & B. A. Potter, eds, ‘The Dene-Yeniseian Connection’, Anthropological Papers of the University of Alaska, pp. 100–118.
References: IV
Wichmann, S. (2010), Internal language classification, in S. Luraghi & V. Bubeník, eds, ‘Continuum Companion to Historical Linguistics’, Continuum International Publishing Group, pp. 70–88.
Wiersma, W., Nerbonne, J. & Lauttamus, T. (2011), ‘Automatically extracting typical syntactic differences from corpora’, Literary and Linguistic Computing 26(1), 107–124.