Phoneme Hierarchy
Techniques
Dual-domain Hierarchical Phoneme Classification
Hossein Hamooni and Abdullah Mueen
Department of Computer Science, University of New Mexico
Abstract
Phonemes are the smallest units of human speech in any language. We use both frequency-domain and time-domain features for English phoneme classification. We arrange phonemes in a hierarchy based on their manner of pronunciation and apply non-parametric conditional classification at each node. The classifier is tested on three novel datasets, and the results are significantly better than those of parametric methods.
Motivation
Accurate classification of phonemes can lead to a better understanding of speech variations such as accents, dialects, and disorders. Phoneme-based speech recognizers can be robust to such variations.
[Figure: four waveform plots illustrating the two pronunciations below]
British: bɒs (b ɒ s)
American: bɑːs (b ɑː s)
Data Preparation
• Crawling online dictionaries
  • Google Translate
  • Oxford Dictionary
  • Merriam-Webster
• Segmentation
• Silence removal
• Normalization
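The last two preparation steps can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the poster does not say how silence is detected or which normalization is used, so the amplitude threshold, the function names `trim_silence` and `z_normalize`, and the choice of z-normalization are all assumptions.

```python
import math

def trim_silence(samples, threshold=0.02):
    """Drop leading and trailing samples whose magnitude is below threshold.

    The threshold value is a hypothetical choice for illustration.
    """
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return []
    return samples[loud[0]:loud[-1] + 1]

def z_normalize(samples):
    """Rescale a signal to zero mean and unit standard deviation (assumed scheme)."""
    n = len(samples)
    if n == 0:
        return []
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / n)
    if std == 0:
        # A constant signal carries no shape information; map it to zeros.
        return [0.0] * n
    return [(s - mean) / std for s in samples]

# Toy clip: near-silent samples surround three voiced samples.
clip = [0.0, 0.01, 0.5, -0.4, 0.3, 0.005, 0.0]
trimmed = trim_silence(clip)
normalized = z_normalize(trimmed)
```

Trimming before normalizing keeps silent padding from shifting the mean and shrinking the variance of the voiced part.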
Obstruent
  Fricative: S, SH
  Affricate: CH
Examples:
gasser /G AE S ER/
unattached /AH N AH T AE CH T/
appreciable /AH P R IY SH AH B AH L/
cliched /K L IY SH EY D/
Columns: DTW label, MFCC label, Hierarchy label
Sonorants
  Vowel: EY, IY
  Semi-vowel: Y
Examples:
savagely /S AE V IH JH L IY/
deactivate /D IY AE K T IH V EY T/
valueless /V AE L Y UW L AH S/
philosophically /F IH L AH S AA F IH K L IY/
Columns: DTW label, MFCC label, Hierarchy label
[Figure: three example time series, and Equal length DTW (y-axis) plotted against Original DTW (x-axis)]
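For readers unfamiliar with the two quantities compared in the figure, here is a minimal sketch of dynamic time warping (DTW) in plain Python. It assumes that "Equal length DTW" means resampling both signals to a common length before running DTW; the poster does not spell this out, so the `resample` and `equal_length_dtw` helpers and the target length are hypothetical.

```python
def dtw_distance(a, b):
    """Classic unconstrained DTW with squared point cost, O(len(a) * len(b))."""
    inf = float("inf")
    m = len(b)
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # empty prefixes align at zero cost
    for i in range(1, len(a) + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of: insertion, deletion, match.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m] ** 0.5

def resample(series, length):
    """Linearly interpolate a non-empty series to the requested length."""
    if length == 1 or len(series) == 1:
        return [series[0]] * length
    scale = (len(series) - 1) / (length - 1)
    out = []
    for k in range(length):
        pos = k * scale
        lo = int(pos)
        hi = min(lo + 1, len(series) - 1)
        frac = pos - lo
        out.append(series[lo] * (1 - frac) + series[hi] * frac)
    return out

def equal_length_dtw(a, b, length=100):
    """Assumed variant: bring both signals to the same length, then run DTW."""
    return dtw_distance(resample(a, length), resample(b, length))

# A stretched copy of a signal warps onto the original at zero cost.
same_shape = dtw_distance([1, 2, 3], [1, 1, 2, 2, 3, 3])
```

Resampling fixes the cost of every comparison regardless of how long the original phoneme segments are, at the price of discarding duration information.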
[Figure: accuracy versus training set size (x-axis up to 4 × 10^5) in three panels; legend: MFCC, DTW, OAHPC, DTW-MFCC, MFCC-DTW]
[Figure: comparison of Prefixed Lower Bound and Original DTW]
Phoneme
  Obstruent
    Aspirate: HH
    Fricative: DH, TH, F, V, Z, S, SH, ZH
    Affricate: CH, JH
    Stop: B, D, G, K, P, T
  Sonorants
    Nasal: M, N, NG
    Vowel: AE, AH, UW, OY, OW, AW, EY, AY, AA, AO, ER, UH, IH, IY, EH
    Semi-vowel: W, Y
    Liquid: L, R
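The idea of classifying conditionally at each node of the phoneme hierarchy can be sketched as a top-down nearest-neighbor search. This is an illustration under stated assumptions, not the authors' classifier: the assignment of phonemes to groups follows the standard manner-of-articulation classes and is a reconstruction where the poster layout is ambiguous, 1-nearest-neighbor stands in for the unspecified non-parametric method, and the names `classify`, `nearest_label`, and `leaves` as well as the toy one-dimensional features are hypothetical.

```python
HIERARCHY = {
    "Obstruent": {
        "Aspirate": ["HH"],
        "Fricative": ["DH", "TH", "F", "V", "Z", "S", "SH", "ZH"],
        "Affricate": ["CH", "JH"],
        "Stop": ["B", "D", "G", "K", "P", "T"],
    },
    "Sonorants": {
        "Nasal": ["M", "N", "NG"],
        "Vowel": ["AE", "AH", "UW", "OY", "OW", "AW", "EY", "AY",
                  "AA", "AO", "ER", "UH", "IH", "IY", "EH"],
        "Semi-vowel": ["W", "Y"],
        "Liquid": ["L", "R"],
    },
}

def leaves(node):
    """Return the set of phonemes below a node (a dict of children or a leaf list)."""
    if isinstance(node, list):
        return set(node)
    found = set()
    for child in node.values():
        found |= leaves(child)
    return found

def nearest_label(query, examples, distance):
    """1-nearest-neighbor: label of the example closest to the query."""
    return min(examples, key=lambda ex: distance(query, ex[0]))[1]

def classify(query, training, distance, node=HIERARCHY):
    """Walk down the hierarchy, restricting training data to the current node."""
    while isinstance(node, dict):
        group_of = {ph: name for name, child in node.items() for ph in leaves(child)}
        candidates = [(x, group_of[ph]) for x, ph in training if ph in group_of]
        if not candidates:
            return None
        node = node[nearest_label(query, candidates, distance)]
    candidates = [(x, ph) for x, ph in training if ph in node]
    if not candidates:
        return None
    return nearest_label(query, candidates, distance)

# Toy data: one-dimensional "features" and an absolute-difference distance.
training = [([0.0], "S"), ([1.0], "SH"), ([5.0], "IY"), ([6.0], "EY")]
toy_distance = lambda p, q: abs(p[0] - q[0])
predicted = classify([0.9], training, toy_distance)
```

Because each node receives its own `distance` call, a different feature domain (for example a time-domain distance at one level and a frequency-domain one at another) could be plugged in per level; which domain the authors use at which node is not stated on the poster.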
Results