
CORPORA-2013

A study of inflectional morpheme development in English-speaking children using CHILDES Corpus

Myung Sook Min 1, Sun-Young Lee 2, Jong-Sup Jun 1

1 Hankuk University of Foreign Language & 2 Cyber Hankuk University of Foreign Language

26 JUNE, 2013

The International Conference on Corpus Linguistics CORPORA-2013


Research Goal

Using the CHILDES (Child Language Data Exchange System) database, this study investigated the order of acquisition of inflectional morphemes and the overregularization found in English-speaking children’s L1 acquisition.


1. Introduction

Background

Children’s L1 development proceeds by regularizing the linguistic knowledge acquired through diverse input from caregivers.

In English, language development can be measured by the use of inflectional morphemes such as -ing and -(e)d.

Brown (1973) proposed a mean order of acquisition for 14 morphemes, and Marcus et al. (1992) confirmed U-shape development in the acquisition of the English irregular past tense.

Research purpose

Using the whole CHILDES database, this study re-examines the findings of previous studies on inflectional morpheme development in child language, which were based on a limited number of subjects.


2. Literature Review

2.1 Acquisition order of inflectional morphemes

Berko (1958)

Studied the acquisition of morphemes in 4-7 year old American children using the WUG Test, which investigates children’s ability to apply inflectional morphemes to nonsense words.

Order of acquisition of Infl.

(1) Present progressive (-ing)

(2) Past regular (-ed)

(3) Third person regular (-s)

(4) Possessive (-’s)

Brown (1973)

Studied the acquisition of grammatical morphemes by analyzing the spontaneous utterances produced by 3 children.

Order of acquisition of Infl.

(1) Present progressive

(2) Plural

(3) Past irregular

(4) Possessive

(5) Past regular

(6) Third person regular

(7) Third person irregular


2.2 Overregularization

Marcus et al. (1992): past tense (-ed)

Studied the overregularization of the past tense morpheme in the spontaneous utterances produced by 83 subjects.

The overregularization rate was not high, but the tendency existed.

Overregularization errors were found from the age of 2 until the beginning of school age.

U-shape development was confirmed.

Kuczaj (1977): present progressive (-ing)

Studied the overregularization of the present progressive morpheme in the spontaneous utterances produced by 15 subjects.

Overregularization was rarely found.

Claimed that this is because irregular verbs have no irregular present progressive form.


2.3 Research questions

Limitation of previous studies

The results of previous studies are insufficient for generalizing about children’s language development because of their limited number of participants.

Research questions

1) Do children apply the inflectional morphemes to more diverse verbs as they get older?

2) Are overregularization errors found, and is the U-shape developmental pattern found in children’s language acquisition?

3) Related to questions 1-2 above, is there a difference between the UK and the USA children’s language development? If so, is it due to mothers’ input?


2.4 Research Method

Method

The number of inflectional word types, their token frequency, the type-token ratio (TTR), and D, a measure of lexical diversity, were calculated to measure the development of inflectional morphemes by age.

D indicates the lexical diversity of randomly selected sentences.

The higher D is, the more diverse the words to which the children apply the inflectional morphemes.

D is calculated with the VocD command in CLAN on CHILDES Corpus texts of different lengths.
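As a rough illustration of what D measures, the sketch below follows the published description of vocd (Malvern et al. 2004): it averages the TTR of random samples of 35 to 50 tokens and looks for the D whose theoretical curve TTR = (D/N)(sqrt(1 + 2N/D) - 1) fits those averages best. It is not CLAN's implementation. The function names, the grid search, and the toy token lists are assumptions made for illustration, and the sketch samples tokens rather than sentences.

```python
import math
import random


def model_ttr(n, d):
    """TTR predicted by the D curve for a sample of n tokens."""
    return (d / n) * (math.sqrt(1.0 + 2.0 * n / d) - 1.0)


def mean_sampled_ttr(tokens, n, trials, rng):
    """Average TTR over `trials` random samples of n tokens (no replacement)."""
    total = 0.0
    for _ in range(trials):
        sample = rng.sample(tokens, n)
        total += len(set(sample)) / n
    return total / trials


def estimate_d(tokens, sizes=range(35, 51), trials=100, seed=0):
    """Least-squares estimate of D; needs at least max(sizes) tokens."""
    rng = random.Random(seed)
    observed = [(n, mean_sampled_ttr(tokens, n, trials, rng)) for n in sizes]

    def squared_error(d):
        return sum((ttr - model_ttr(n, d)) ** 2 for n, ttr in observed)

    # Plain grid search over D = 0.01 .. 200.00 (an assumption, chosen for clarity).
    candidates = [step / 100 for step in range(1, 20001)]
    return min(candidates, key=squared_error)


# Hypothetical token lists: 60 distinct word forms versus only 5.
diverse = [f"w{i % 60}" for i in range(300)]
repetitive = [f"w{i % 5}" for i in range(300)]
print(estimate_d(diverse) > estimate_d(repetitive))  # True
```

Because the samples have a fixed size range, the estimate depends much less on text length than raw TTR does, which is why D suits age groups with very different token counts.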


3. Corpus Study

3.1 CHILDES Corpus

The CHILDES Corpus is one of the most frequently used corpora for research on language acquisition and on the influence of caregivers’ input.

The entire CHILDES Corpus was rearranged for easier analysis, focusing on the data from ages 1 to 7, which account for 97% of the entire corpus.

7,841 files were created: 2,272 files from 275 UK children and 5,569 files from 1,355 USA children.

35,130 word types with 1,937,624 tokens from the UK children and 63,705 word types with 2,771,312 tokens from the USA children were extracted with the FREQ command in CLAN.


3.3 Analysis

First, the 4,700,000 words were classified by regular inflectional morphemes such as -(e)d; irregular inflectional forms such as ‘wore’ were then extracted and integrated with the regular inflectional words.

(1) Present progressive (-ing)

(2) Regular and irregular past tense (-(e)d, irr)

(3) Comparative and superlative (-er, -est, irr)

(4) Third person singular present / plural (-(e)s, irr)

(5) Possessive singular and plural (-’s, -s’)

(6) Pronoun

Type, Token, and TTR were calculated with the FREQ command in CLAN.

- Command: freq +t*CHI +u +f @ file

D was calculated with the VocD command in CLAN.

- Command: vocd +t"CHI" +r6 +s"@C:\CHILDES\CLAN\lib\17133_ ed_d_irr_2556.cut" +u +f @ file
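The Type, Token, and TTR columns reported in the results can be understood with a small sketch. The token list below is hypothetical and the helper name `type_token_stats` is an assumption; the actual counts in this study come from CLAN's FREQ output, not from this code.

```python
from collections import Counter


def type_token_stats(tokens):
    """Return (number of types, number of tokens, type-token ratio)."""
    counts = Counter(tokens)
    n_types = len(counts)
    n_tokens = sum(counts.values())
    ttr = n_types / n_tokens if n_tokens else 0.0
    return n_types, n_tokens, ttr


# Hypothetical -ing forms produced by one child.
sample = ["going", "playing", "going", "eating", "going", "playing"]
print(type_token_stats(sample))  # (3, 6, 0.5)

# The same ratio applied to the reported UK -ing counts (1,229 types, 39,759 tokens).
print(round(1229 / 39759, 3))  # 0.031
```

TTR falls as a sample grows, because frequent forms keep repeating, so it is hard to compare across age groups of very different sizes; this is the usual motivation for reporting D alongside it.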


3.4. Results

Extracted 13,528 inflectional word types and 1,221,916 tokens.

TTR and D of inflectional morphemes by country

                  UK                                 USA
Morpheme          Type    Token     TTR     D        Type    Token     TTR     D
-ing              1,229   39,759    0.031   33.26    1,084   47,458    0.023   27.38
-d_ed_irr(V)      1,006   82,474    0.012   10.78    1,472   132,405   0.011   19.23
-er_-est_irr(A)   217     11,499    0.008   0.77     198     13,978    0.014   1.57
-es_-s_irr(N)     4,245   114,524   0.037   18.18    3,905   172,631   0.023   30.78
pronoun           52      165,778   0.000   2.64     51      321,019   0.000   1.82
-s'_-'s           1,359   49,219    0.028   4.48     1,904   71,172    0.027   3.77
Total             8,108   463,253   0.019   11.685   8,614   758,663   0.016   14.09


3.4.1 Present progressive (-ing)

TTR and D

      UK                                USA
Age   Type    Token    TTR     D        Type    Token    TTR     D
1     97      719      0.135   13.83    290     3,378    0.086   27.60
2     1,158   33,264   0.035   33.27    637     15,904   0.040   28.14
3     256     3,071    0.083   20.18    558     10,924   0.051   23.93
4     131     642      0.204   21.84    569     11,343   0.050   26.67
5     154     1,009    0.153   27.07    383     3,740    0.102   28.95
6     83      264      0.314   27.07    244     1,210    0.202   36.05
7     118     790      0.149   22.48    200     959      0.209   27.23
Total 1,997   39,759   0.153   23.68    2,881   47,458   0.106   28.37


No difference in D was found between the UK and the USA children.

The correlations between D and the children’s age were not significant, which seems to indicate that children already apply the present progressive morpheme to diverse verbs from the age of 1.

- UK children: r = 0.025, p > .05 / USA children: r = 0.385, p > .05

80-90% of the 50 most frequently used words in the children’s speech were also among the 50 most frequently used words in the mothers’ speech.

Overregularization errors were rarely found.

- Noun + -ing (tennising, swording, appetizing): once or twice each

- Adjective + -ing (noticeabling): only once

- However, because the present progressive and the gerund share the same form, further study reviewing their usage is needed.


3.4.2 Past tense (-(e)d_irr(V))

TTR and D

      UK                                USA
Age   Type    Token    TTR     D        Type    Token    TTR     D
1     94      1,800    0.052   5.24     223     5,145    0.043   16.34
2     913     68,022   0.013   11.45    726     33,190   0.022   19.92
3     262     6,474    0.040   12.63    757     33,020   0.023   21.24
4     166     1,600    0.104   15.68    820     38,482   0.021   21.00
5     181     2,327    0.078   14.82    547     13,106   0.042   22.60
6     108     610      0.177   12.34    352     4,755    0.074   24.52
7     162     1,641    0.099   13.39    321     4,707    0.068   20.05
Total 1,886   82,474   0.080   12.22    3,746   132,405  0.042   20.81


As children got older, the D of the past tense increased until the age of 5 or 6 and decreased at age 7 in both the UK and the USA.

A marginal correlation was found between D and the children’s age (the critical value of the correlation coefficient for significance was 0.68). This suggests that children tend to apply past tense morphemes to more diverse verbs as they get older.

- UK children: r = 0.643, p > .05 / USA children: r = 0.66, p > .05

In all age groups, the D of the USA children is higher than that of the UK children.

That the D of the past tense is lower than that of the present progressive confirms the grammatical morpheme developmental order proposed by Brown (1973).
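The reported coefficients are consistent with a plain Pearson correlation between age (1 to 7) and the per-age D values in the past tense table. The sketch below recomputes them; the function name `pearson_r` is an assumption, and whether the authors computed r in exactly this way is not stated in the slides.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


ages = [1, 2, 3, 4, 5, 6, 7]
# D of past tense forms by age, copied from the TTR and D table.
d_uk = [5.24, 11.45, 12.63, 15.68, 14.82, 12.34, 13.39]
d_usa = [16.34, 19.92, 21.24, 21.00, 22.60, 24.52, 20.05]

print(round(pearson_r(ages, d_uk), 2))   # 0.64
print(round(pearson_r(ages, d_usa), 2))  # 0.66
```

With only seven age groups, each coefficient rests on seven data points, which is why fairly large values of r still fall short of significance.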


The highest-frequency words are mostly irregular verbs. Among the 50 most frequent words, irregular verbs were about four times as numerous as regular verbs in both countries.

- UK: 25 irregular verbs, 8 irregular verbs whose bare form is identical to the past and the past participle, 7 auxiliary verbs, 6 regular verbs, 4 words with regular past tense morphemes that were probably used as adjectives

- USA: 25 irregular verbs, 9 irregular verbs whose bare form is identical to the past and the past participle (such as ‘put’), 5 auxiliary verbs, 5 regular verbs, 6 words with regular past tense morphemes that were probably used as adjectives


‘go’ and ‘fall’ were the irregular verbs most often overregularized with the regular past tense morpheme ‘-(e)d’.

Overregularization error type and frequency of irregular verb ‘go’

      Correct                                   Overregularization                                      Total
      went          gone          Subtotal      goed          goned         wented        Subtotal
Age   UK     USA    UK     USA    UK     USA    UK     USA    UK     USA    UK     USA    UK     USA    UK     USA
1     -      29     580    239    580    268    1      -      -      -      -      -      1      -      581    268
2     784    572    4,978  627    5,762  1,199  17     38     3      2      -      1      20     41     5,782  1,240
3     73     675    192    142    265    817    3      52     -      -      -      -      3      52     268    869
4     23     860    21     109    44     969    -      4      -      -      -      -      -      4      44     973
5     33     286    22     30     55     316    -      -      -      -      -      -      -      -      55     316
6     24     158    2      6      26     164    -      -      -      -      -      -      -      -      26     164
7     100    74     15     22     115    96     -      -      -      -      -      -      -      -      115    96


Overregularization errors were not found at the age of 1 but appeared between the ages of 2 and 3 and then they began to disappear from the age of 4 or 5.

U-shape developmental pattern of irregular verb ‘go’

The overregularization rate of ‘go’ differed significantly between the UK and the USA (Pearson chi-square test).

[Figure: developmental pattern of ‘go’ by age (1 to 7), UK vs. USA; vertical axis from 91% to 100%]
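A Pearson chi-square comparison of the two countries can be sketched from the ‘go’ table. The slides do not say which counts entered the test, so pooling all ages into one 2x2 table (country by correct vs. overregularized) is an assumption, as are the variable and function names.

```python
# (correct subtotal, overregularization subtotal) per age 1..7, from the 'go' table.
uk_rows = [(580, 1), (5762, 20), (265, 3), (44, 0), (55, 0), (26, 0), (115, 0)]
usa_rows = [(268, 0), (1199, 41), (817, 52), (969, 4), (316, 0), (164, 0), (96, 0)]


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


uk_correct = sum(correct for correct, _ in uk_rows)
uk_overreg = sum(overreg for _, overreg in uk_rows)
usa_correct = sum(correct for correct, _ in usa_rows)
usa_overreg = sum(overreg for _, overreg in usa_rows)

chi2 = chi_square_2x2(uk_correct, uk_overreg, usa_correct, usa_overreg)
CRITICAL_DF1 = 3.841  # chi-square critical value for df = 1 at alpha = .05

print(f"UK rate:  {uk_overreg / (uk_correct + uk_overreg):.2%}")
print(f"USA rate: {usa_overreg / (usa_correct + usa_overreg):.2%}")
print(chi2 > CRITICAL_DF1)  # True
```

Under this pooling the statistic is far above the critical value, in line with the significant difference reported for ‘go’.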


More types of overregularization error for ‘fall’ were found in the UK children.

Overregularization error type and frequency of irregular verb ‘fall’

      Correct                                   Overregularization                                      Total
      fell          fallen        Subtotal      falled        felled        fallened      Subtotal
Age   UK     USA    UK     USA    UK     USA    UK     USA    UK     USA    UK     USA    UK     USA    UK     USA
1     3      73     -      -      3      73     -      -      -      1      -      -      -      1      3      74
2     315    462    290    2      605    464    57     94     6      5      4      -      67     99     672    563
3     37     324    16     5      53     329    4      34     -      2      -      -      4      36     57     365
4     15     264    2      1      17     265    1      8      -      1      -      -      1      9      18     274
5     16     94     2      -      18     94     -      2      -      1      -      -      -      3      18     97
6     7      32     -      1      7      33     1      -      -      -      -      -      1      -      8      33
7     5      15     1      1      6      16     -      1      -      1      -      -      -      2      6      18


Overregularization errors in the irregular past tense tended to appear at the age of 2, began to decrease from the age of 3, and disappeared at the age of 4 or 5.

U-shape developmental pattern of irregular verb ‘fall’

The overregularization rate of ‘fall’ did not differ significantly between the UK and the USA (Pearson chi-square test).

[Figure: developmental pattern of ‘fall’ by age (1 to 7), UK vs. USA; vertical axis from 80% to 100%]
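The U-shape can be read directly off the ‘fall’ table by computing, for each age, the share of correct forms among all past forms of the verb. Treating the figure's vertical axis as this correct-form rate is an assumption; the dictionary and function names are illustrative only.

```python
# age -> (correct subtotal, overregularization subtotal), from the 'fall' table.
fall_uk = {1: (3, 0), 2: (605, 67), 3: (53, 4), 4: (17, 1),
           5: (18, 0), 6: (7, 1), 7: (6, 0)}
fall_usa = {1: (73, 1), 2: (464, 99), 3: (329, 36), 4: (265, 9),
            5: (94, 3), 6: (33, 0), 7: (16, 2)}


def correct_rates(rows):
    """Map each age to correct / (correct + overregularized)."""
    return {age: correct / (correct + overreg)
            for age, (correct, overreg) in rows.items()
            if correct + overreg > 0}


rates_uk = correct_rates(fall_uk)
rates_usa = correct_rates(fall_usa)

for age in sorted(rates_usa):
    print(age, f"UK {rates_uk[age]:.1%}", f"USA {rates_usa[age]:.1%}")
```

In both countries the rate is high at age 1, drops at age 2, and recovers by age 4, which is the dip-and-recovery shape the term U-shape refers to. The counts at ages 6 and 7 are very small, so the rates there move a lot with a single error.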


3.4.3 Comparative and Superlative (-er_-est_irr(A))

TTR and D

      UK                                USA
Age   Type    Token    TTR     D        Type    Token    TTR     D
1     11      661      0.017   0.10     19      1,660    0.011   0.26
2     76      9,940    0.008   0.74     70      4,242    0.017   0.92
3     26      420      0.062   1.24     99      2,758    0.036   2.03
4     21      126      0.167   2.89     122     3,349    0.036   2.70
5     28      194      0.144   2.92     84      1,211    0.069   3.44
6     20      57       0.351   5.82     47      376      0.125   4.06
7     17      101      0.168   2.54     41      382      0.107   2.17
Total 199     11,499   0.131   2.32     482     13,978   0.057   2.23


As children got older, the D of comparative -er and superlative -est increased until the age of 6 and slightly decreased at age 7 in both the UK and the USA. This confirms that children apply comparative and superlative forms to more diverse adjectives as they get older.

Strong correlations were found between D and the children’s age.

- UK: r = 0.779, p < .05 / USA: r = 0.776, p < .05

The difference in D between the UK and the USA was not noticeable.
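One common way to check whether a correlation over seven age groups is significant is to convert r to a t statistic with n - 2 degrees of freedom. The slides do not state which test was used, so the two-tailed test at alpha = .05 below is an assumption, and `t_from_r` is a hypothetical helper name.

```python
import math


def t_from_r(r, n):
    """t statistic for a Pearson r computed from n paired observations."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


N_AGE_GROUPS = 7
T_CRITICAL_DF5 = 2.571  # two-tailed critical t for df = 5 at alpha = .05

reported = {
    "comparative/superlative, UK": 0.779,
    "comparative/superlative, USA": 0.776,
    "past tense, UK": 0.643,
}
for label, r in reported.items():
    t = t_from_r(r, N_AGE_GROUPS)
    print(label, round(t, 2), "significant" if t > T_CRITICAL_DF5 else "not significant")
```

Under this assumption the comparative and superlative correlations pass the threshold while the past tense one does not, which agrees with the p values reported on the slides.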


The most frequent words were ‘more’ and ‘better’, followed by ‘last’, ‘bigger’, and ‘higher’ for the UK children and by ‘cleaner’, ‘higher’, ‘bigger’, and ‘later’ for the USA children.

Overregularization error type and frequency of ‘little’

      Correct       Overregularization          Total
      less          littler       littlest
Age   UK     USA    UK     USA    UK     USA    UK     USA
1     0      0      0      0      0      0      0      0
2     3      1      1      7      0      5      4      12
3     0      3      2      6      1      9      3      18
4     0      6      0      6      1      4      1      16
5     0      7      0      4      1      0      1      11
6     0      0      1      4      1      3      2      7
7     0      1      0      0      0      0      0      1


Overregularization errors were found until the age of 6 but still showed the U-shape developmental pattern.

U-shape developmental pattern of ‘little’

The overregularization rate of ‘little’ differed significantly between the UK and the USA (Pearson chi-square test).

[Figure: developmental pattern of ‘little’ by age (1 to 7); series: UK_littler, UK_littlest, USA_littler, USA_littlest; vertical axis from 0% to 100%]


There were 4 files in which ‘littler’ was produced by both the child and the mother. In these files, the children produced it 8 times and the mothers 17 times.

This finding points to a possible influence of mothers’ input on child language.


4. Discussion

1) Do children apply the inflectional morphemes to more diverse verbs as they get older?

D (UK ~ USA): Present progressive (23.68 ~ 28.37) > Past tense (12.22 ~ 20.81) > Comparative/Superlative (2.32 ~ 2.23)

- D confirms the grammatical morpheme developmental order proposed by Brown (1973).

The developmental pattern of each inflectional morpheme differed as children got older.

That irregular verbs were more than four times as numerous as regular verbs among the 50 most frequently used verbs supports Brown (1973)’s claim that children acquire irregular verbs earlier than regular verbs.


2) Are overregularization errors found, and is the U-shape developmental pattern found in children’s language acquisition?

Overregularization errors were found, and the U-shape developmental pattern claimed in previous studies such as Brown (1973) and Marcus et al. (1992) was confirmed on a large scale in the CHILDES Corpus.

Overregularization errors were found most often in the past tense and rarely in the present progressive.


3) Related to questions 1-2 above, is there a difference between the UK and the USA children’s language development? If so, is it due to mothers’ input?

Similarities

(1) As children get older, they apply the inflectional morphemes to more diverse words.

(2) U-shape developmental patterns were found in both the UK and the USA.

Differences

(1) The overregularization error rate of the UK children was lower than that of the USA children.

(2) The results suggest a possible influence of mothers’ input on children’s language.


5. Conclusion

This study investigated inflectional morpheme development in child language using CHILDES Corpus data from children aged 1 to 7.

Our findings are:

1) Children tended to apply the inflectional morphemes to more diverse words as they got older.

2) The U-shape developmental pattern was confirmed.

3) Overregularization errors were found when children applied the inflectional morphemes to words.

4) Based on D, this study supports the grammatical morpheme developmental order proposed by Brown (1973).

5) The different developmental patterns of the UK and the USA children point to a possible influence of mothers’ input on children’s language.


References

[1] Berko, Jean (1958), The child’s learning of English morphology. Word 14, p. 47-56.

[2] Brown, Roger (1973), A First Language: The Early Stages. Harvard University Press.

[3] CHILDES (http://childes.psy.cmu.edu/)

[4] Johansson, Victoria (2008), Lexical diversity and lexical density in speech and writing: a developmental perspective. Lund University, Dept. of Linguistics and Phonetics, Working Papers 53, p. 61-79.

[5] Kuczaj, Stan A. (1977), Why do children fail to overgeneralize the progressive inflection? Journal of Child Language 5, p. 167-171.

[6] MacWhinney, B. & Snow, C. E. (2000), The Child Language Data Exchange System: An Update. Journal of Child Language 17, p. 457-472.

[7] Marcus, Gary F.; Pinker, Steven; Ullman, Michael; Hollander, Michelle; Rosen, T. John; and Xu, Fei (1992), Overregularization in Language Acquisition. Monographs of the Society for Research in Child Development, Serial No. 228, Vol. 57.

[8] Malvern, David; Richards, Brian; Chipere, Ngoni & Duran, Pilar (2004), Lexical diversity and language development: quantification and assessment. New York: Palgrave Macmillan.

[9] Maslen, Robert J. C.; Theakston, Anna L.; Lieven, Elena V. M.; Tomasello, Michael (2004), A Dense Corpus Study of Past Tense and Plural Overregularization in English. Journal of Speech, Language, and Hearing Research 47.6, p. 1319-1333.

[10] McCarthy, Philip M. & Jarvis, S. (2004), vocd: A theoretical and empirical evaluation. Language Testing 24.4, p. 459-488.

[11] Richards, Brian J. & Malvern, David (1997), Quantifying lexical diversity in the study of language development. Reading: Faculty of Education and Community Studies.

[12] Templin, M. C. (1957), Certain language skills in children. Minneapolis: University of Minnesota Press.


Contact Info.

Myung Sook Min ([email protected])

Sun-Young Lee ([email protected])

Jong-Sup Jun ([email protected])