Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
Vocal Fold ModelingMartin Hagmuller
Signal Processing and Speech Communication Laboratory
Institute for Communication and Wave Propagation
Graz University of Technology
Vocal Fold Modeling – p.1/20
Overview
Motivation
Anatomy and Physiology
Vocal Fold Models
Conclusion
Vocal Fold Modeling – p.2/20
Motivation (1)
Waveform ModelParametric description of waveforme.g. Liljencrants-Fant modelflow derivative with four parameters
-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
0 0.002 0.004 0.006 0.008 0.01
Tp Te
TaT0
Ee
dU
g/d
t(t)
L F- model
tim e (s ) (T 0=1 0ms)
Physiologically motivated voice generationApproximation of human voice generationFlexible tool to generate different voices
Study of physiology of voiceBetter understanding of physiologyStudy of voice source dynamics
Vocal Fold Modeling – p.3/20
Motivation (1)
Waveform ModelParametric description of waveforme.g. Liljencrants-Fant modelflow derivative with four parameters
-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
0 0.002 0.004 0.006 0.008 0.01
Tp Te
TaT0
Ee
dU
g/d
t(t)
L F- model
tim e (s ) (T 0=1 0ms)
Physiologically motivated voice generationApproximation of human voice generationFlexible tool to generate different voices
Study of physiology of voiceBetter understanding of physiologyStudy of voice source dynamics
Vocal Fold Modeling – p.3/20
Motivation (1)
Waveform ModelParametric description of waveforme.g. Liljencrants-Fant modelflow derivative with four parameters
-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
0 0.002 0.004 0.006 0.008 0.01
Tp Te
TaT0
Ee
dU
g/d
t(t)
L F- model
tim e (s ) (T 0=1 0ms)
Physiologically motivated voice generationApproximation of human voice generationFlexible tool to generate different voices
Study of physiology of voiceBetter understanding of physiologyStudy of voice source dynamics
Vocal Fold Modeling – p.3/20
Motivation (2)
Voice SourceVoice Quality (Individuality of Voice)Prosody (Speech Melody)
sub- and supraglottalpressure
Vocal foldparameters
Boundaryconditions
Vocal foldmodel
Glottalflow
Glottalarea
DisadvantagesMany parametersNo intuitive mapping to perceptionHard to controlBased on measured data → hard to obtain
Vocal Fold Modeling – p.4/20
Motivation (2)
Voice SourceVoice Quality (Individuality of Voice)Prosody (Speech Melody)
sub- and supraglottalpressure
Vocal foldparameters
Boundaryconditions
Vocal foldmodel
Glottalflow
Glottalarea
DisadvantagesMany parametersNo intuitive mapping to perceptionHard to controlBased on measured data → hard to obtain
Vocal Fold Modeling – p.4/20
Anatomy & Physiology (1)
play movie
Vocal Fold Modeling – p.5/20
Anatomy & Physiology (1)
play movie
Vocal Fold Modeling – p.5/20
Anatomy & Physiology (1a)
Self oscillating system
Steady airflow from lungs is converted into series offlow pulses by opening & closing airspace betweenvocal folds (glottis)
”Myoelastic-aerodynamic theory of voice production(Van den Berg, 1958)→ Oscillation due to Bernoulli effect
Vocal Fold Modeling – p.6/20
Anatomy & Physiology (2)
Upper and lower portion display phase difference→ Mucosal wave
Vocal Fold Modeling – p.7/20
Vocal Fold Models
Lumped elements models (n-mass models)one mass model (Flanagan & Landgraf, 1968)two mass model (Ishizaka & Flanagan, 1972)three mass model (Story & Titze, 1994)
Physically informed model
Finite elements model
Vocal Fold Modeling – p.8/20
One Mass Model
m
xz
Dk
Psub
P1 P2
Psup
Ug
x 0
x
lc th
d
A
Very coarse model
Vocal fold lumped into one mass.Only lateral displacement
Modelling of basic principles of self-sustained oscillationVocal Fold Modeling – p.9/20
Two Mass Model - Mechanics
m1
r 2
xz
m2
r1k2
k1
k1,2
Psub
P1 P2 P3 P4
Psup
Ug
x 1,0
x1x 2 ,0
x2
lc d 1 d 2
A1A 2
Thyroid Cartilage
Vocal TractTrachea
supA
Includes representation of mucosal wave
Reasonable agreement with physiological data
Very popular
Equations for the mechanical system:
m1x1(t) + r1x1(t) + k1(x1)[x1(t) − x01] + k12[x1(t) − x2(t)] = lgd1pm1(t)
m2x2(t) + r2x2(t) + k2(x2)[x2(t) − x02] − k12[x1(t) − x2(t)] = lgd2pm2(t)
Vocal Fold Modeling – p.10/20
Two Mass Model - Mechanics
m1
r 2
xz
m2
r1k2
k1
k1,2
Psub
P1 P2 P3 P4
Psup
Ug
x 1,0
x1x 2 ,0
x2
lc d 1 d 2
A1A 2
Thyroid Cartilage
Vocal TractTrachea
supA
Includes representation of mucosal wave
Reasonable agreement with physiological data
Very popular
Equations for the mechanical system:
m1x1(t) + r1x1(t) + k1(x1)[x1(t) − x01] + k12[x1(t) − x2(t)] = lgd1pm1(t)
m2x2(t) + r2x2(t) + k2(x2)[x2(t) − x02] − k12[x1(t) − x2(t)] = lgd2pm2(t)
Vocal Fold Modeling – p.10/20
Two Mass Model - Mechanics
m1
r 2
xz
m2
r1k2
k1
k1,2
Psub
P1 P2 P3 P4
Psup
Ug
x 1,0
x1x 2 ,0
x2
lc d 1 d 2
A1A 2
Thyroid Cartilage
Vocal TractTrachea
supA
Includes representation of mucosal wave
Reasonable agreement with physiological data
Very popular
Equations for the mechanical system:
m1x1(t) + r1x1(t) + k1(x1)[x1(t) − x01] + k12[x1(t) − x2(t)] = lgd1pm1(t)
m2x2(t) + r2x2(t) + k2(x2)[x2(t) − x02] − k12[x1(t) − x2(t)] = lgd2pm2(t)
Vocal Fold Modeling – p.10/20
Two Mass Model - Pressure
PP
P
P
Psub
1
2
3
supm
P4
P0
1 m2
P
z
Pressure distribution along glottis:(assumes continuity of the flow)
psub − p1(t) = 1.371
2ρair
u(t)2
A1(t)2
p1(t) − p2(t) = 12νd1
l2gu(t)
A1(t)3
p2(t) − p3(t) =1
2ρairu(t)2
�
1
A2(t)2−
1
A1(t)2�
p3(t) − p4(t) = 12νd2
l2gu(t)
A2(t)3
p4(t) − psup(t) =1
2ρair
u(t)2
A2(t)2�
2A2(t)
Asup
�
1 −
A2(t)
Asup
��
ν . . . air shear viscosity coefficient
ρair . . . air density
Vocal Fold Modeling – p.11/20
Body – Cover Concept
Vocal folds are divided into 2 tissue layersCover: pliable, non-contractible mucosa tissueBody: muscle fiber & ligamentous tissue
Vocal Fold Modeling – p.12/20
Three Mass Model
m1
D
xz
m2
2D1 k2k1
k12
Psub
P1 P2 P3 P4
Psup
Ug
x 1,0
x1x 2 ,0
x2
lc th 1 th 2
d1
d2
A1A 2
D3k3
m3
Thyroid Cartilage
Vocal Tract
Trachea
m3: Body mass
m1,m2: Cover masses
Vocal Fold Modeling – p.13/20
Multiple Mass Model
PSfrag replacements
One fold, side view, y [mm] –>
z[m
m]–
>
x [mm]y [mm]
z[m
m]
Both folds, top view, x [mm] –>
y[m
m]–
>
-3 -2 -1 0 1 2 3
00.5
11.5
22.5
3
0 2 4 6 8 10 12 14
0
2
4
6
8
10
12
14
05
1015
-2
-1
0
1
2
3
4
-2-10123
+ Possible modification ofproperties inlongitudinal dimension
- Increased complexity
- High calculation times
Vocal Fold Modeling – p.14/20
Finite Elements Model
Alipour & Titze 1996 Does not assume lumpedelements
Describes geometry & phys.properties of vocal folds withhuge number of very smallelements
+ Very good solution if allproperties are known
- Calculation times very high
- Real-time models notpossible (yet)
- huge number of parameters(hard to determine)
Vocal Fold Modeling – p.15/20
Physically-informed model
Physical model not for entire systemMechanical Model of Oscillator(calculation of displacement, fundamental frequency)
Nonlinear Map for calulation of glottal flow(interaction of vocal fold motion xi, flow u g and pressure pl)
z−1
fNL
Hres(z )
z−1
ugpl
x1 x2Nonlinear Map
Oscillator
Vocal Fold Modeling – p.16/20
Conclusion
Large Variety of Vocal Fold models
Trade off between Accuracy & Complexityreliable data & level of detail
Rather study of physiology than speech synthesis
Vocal Fold Modeling – p.17/20
Small dictionary
mucosa Schleimhauttissue Gewebe
ligament Bandcartilage Knorpeltrachea Luftröhre
inertia Massenträgheitprosody Sprachmelodie
lumped elements konzentrierte Elemente
Vocal Fold Modeling – p.18/20
References[1] J. Flanagan and L. Landgraf. Self-oscillating source for
vocal-tract synthesizers. IEEE Transactions on Audioand Electroacoustics, 16(1):57–64, March 1968.
[2] K. Ishizaka and J. L. Flanagan. Synthesis of voicedsounds from a two-mass model of the vocal cords. BellSystems Techn. J., 51(6):1233–1268, 1972.
[3] Brad H. Story and Ingo R. Titze. Voice simulation with abody-cover model of the vocal folds. J. Acoust. Soc.Am., 97(2):1249–1260, 1995.
Vocal Fold Modeling – p.19/20
THANK YOU
Vocal Fold Modeling – p.20/20