Automatic Detection of “g-dropping” in American English Using Forced Alignment Automatic Detection of “g-dropping” in American English Using Forced Alignment Jiahong Yuan and Mark Liberman University of Pennsylvania Test material: 200 words randomly selected from Buckeye; 100 were transcribed as -in’ and 100 as -ing. Identification test: -in’/-ing forced choice 8 native English speakers 10 native Mandarin speakers Pairwise percentage agreements: Among the native English speakers: 79% - 96%, mean = 0.863 Between the aligner and the English speakers: 79% - 90%, mean = 0.849 Vowel quality difference: •Using the majority vote of the English listeners as gold standard, Mandarin listeners performed poorly (Mandarin has N and NG distinction, but no tense and lax vowel distinction). •We computed an approximate KL- divergence between the GMM acoustic models at each of the five HMM states. The distance between the two models reaches “g-dropping”: -ing is pronounced with an alveolar (instead of a velar) nasal coda, e.g., workin’, tryin’ . Automatic detection of “g-dropping”: Step 1. Two acoustic models were trained, one for -in' (/IHN/) and the other for -ing (/IHNG/). The models were GMM- based, five-state HMMs on 39 PLP coefficients. Step 2. The parameters of the models were initially estimated using the Buckeye corpus and then re-estimated using the SCOTUS corpus. Step 3. The models were added to the Penn Phonetics Lab Forced Aligner, and forced alignment will choose the more probable pronunciation from the two alternatives.

Automatic Detection of “g-dropping” in American English Using Forced Alignment

Download PPT Report

Upload
yank
View
48
Download
5

Tags:

Embed Size (px)

DESCRIPTION

Automatic Detection of “g-dropping” in American English Using Forced Alignment Jiahong Yuan and Mark Liberman University of Pennsylvania. “g-dropping” : -ing is pronounced with an alveolar (instead of a velar) nasal coda, e.g., workin’, tryin’ . Automatic detection of “g-dropping”: - PowerPoint PPT Presentation

Citation preview

Page 1: Automatic Detection of “g-dropping” in American English Using Forced Alignment

Automatic Detection of “g-dropping” in American English Using Forced AlignmentAutomatic Detection of “g-dropping” in American English Using Forced AlignmentJiahong Yuan and Mark Liberman

University of Pennsylvania

Test material: 200 words randomly selected from Buckeye; 100 were transcribed as -in’ and 100 as -ing.

Identification test: -in’/-ing forced choice 8 native English speakers 10 native Mandarin speakers

Pairwise percentage agreements: Among the native English speakers: 79% - 96%, mean = 0.863 Between the aligner and the English speakers: 79% - 90%, mean = 0.849

Vowel quality difference:•Using the majority vote of the English listeners as gold standard, Mandarin listeners performed poorly (Mandarin has N and NG distinction, but no tense and lax vowel distinction).

•We computed an approximate KL-divergence between the GMM acoustic models at each of the five HMM states. The distance between the two models reaches its peak at the middle state, and it is larger on the left side (the vowel side) than the right (the nasal coda side).

“g-dropping”: -ing is pronounced with an alveolar (instead of a velar) nasal coda, e.g., workin’, tryin’.

Automatic detection of “g-dropping”: Step 1. Two acoustic models were trained, one for -in' (/IHN/) and the other for -ing (/IHNG/). The models were GMM-based, five-state HMMs on 39 PLP coefficients. Step 2. The parameters of the models were initially estimated using the Buckeye corpus and then re-estimated using the SCOTUS corpus. Step 3. The models were added to the Penn Phonetics Lab Forced Aligner, and forced alignment will choose the more probable pronunciation from the two alternatives.