20
Bayesian Method in Speaker Verification Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method in Speaker Verification. December 2, 2004 – p. 1/?

Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

Bayesian Method in Speaker Verification

Chengyuan Ma and Jinyu Li

School of Electrical and Computer Engineering

Bayesian Method in Speaker Verification. December 2, 2004 – p. 1/??

Page 2: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

Outline

Speaker Verification System

MAP adaptation of Speaker Model

Experiment Setup, Conclusion and Future Work

Bayesian Method in Speaker Verification. December 2, 2004 – p. 2/??

Page 3: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

Speaker Verification System

Bayesian Method in Speaker Verification. December 2, 2004 – p. 3/??

Page 4: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

Speaker Verification SystemFramework

Speaker Verification

Determines whether person is who they claim to be

User makes identity claim (one to one mapping)

Three Main Components

Front-end processing (Feature Extraction)

Speaker and background modeling (Speaker Modeling)

Log-likelihood-ratio score normalization (Decision

Making)

Bayesian Method in Speaker Verification. December 2, 2004 – p. 4/??

Page 5: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

Speaker Verification SystemFramework (Cont.)

Bayesian Method in Speaker Verification. December 2, 2004 – p. 5/??

Page 6: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

Speaker and Background Modeling

GMM (Gaussian Mixtures Model) represent static acoustic

event distribution for each speaker

p(ot|Λ) =M∑

m=1

wm

(2π)d

2 |Σm|1

2

· e−1

2(ot−µm)τΣ−1

m (ot−µm)

M∑

m=1

wm = 1 and wm > 0

Λ is used to represent speaker model parameters. wm is the mth weight ofGaussian mixture N (ot; µm, Σm) with mean µm and covariance matrix Σm.

Bayesian Method in Speaker Verification. December 2, 2004 – p. 6/??

Page 7: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

Speaker Recognition Scoring

p(O|Λ) = p(o1, o2, · · · , oT |Λ) =

T∏

t=1

p(ot|Λ)

The score (average Log-Likelihood Ratio (LLR)) of a given test segment O is

computed as follow,

score =1

T

T∑

t=1

(log p(ot|Λtar) − log p(ot|ΛUBM))

T is the length of the test segment, ot is the feature vector at time t and Λtar and

ΛUBM are parameters of the target model and UBM (Universal Background

Model) respectively.

Bayesian Method in Speaker Verification. December 2, 2004 – p. 7/??

Page 8: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

Performance Measurement ofSpeaker Verification System

ROC and DET plots are used for performance measurement

Bayesian Method in Speaker Verification. December 2, 2004 – p. 8/??

Page 9: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

MAP adaptation of Speaker Model

Bayesian Method in Speaker Verification. December 2, 2004 – p. 9/??

Page 10: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

Speaker Model Adaptation UsingMAP Method

EM training is reliable only when sufficient training data are

available

If only a small amount of training data, adaptation method

will be used for model training.

Background model is regarded as prior information for each

speaker, speaker-specified training data are the observation

samples, these two can be combined in the Bayesian

framework.

Bayesian Method in Speaker Verification. December 2, 2004 – p. 10/??

Page 11: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

Speaker Model Adaptation UsingMAP Method (Cont.)

Speaker model parameters to be estimated

Λ = (w1, w2, · · · , wM , µ1, µ2, · · · , µM ,Σ1,Σ2, · · · ,ΣM )

MAP estimation of speaker model parameters

θMAP = arg maxθ

p(O|Λ)g(Λ)

Bayesian Method in Speaker Verification. December 2, 2004 – p. 11/??

Page 12: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

Speaker Model Adaptation UsingMAP Method (Cont.)

Prior distribution of mixture weights is a conjugete density

such as Dirichlet density

g(w1, w2, · · · , wM ) ∝M∏

m=1

wvk−1m

where vk > 0 are the parameters for the Dirichlet density. a

aJ.-L. Gauvain and C.-H. Lee, ”Maximum a Posterior Estimation for Multivariate Gaussian Mixture Ob-

servations of Markov Chains,” IEEE Trans. Speech and Audio Processing, vol. 2, no. 2, pp. 291-298, April

1994.

Bayesian Method in Speaker Verification. December 2, 2004 – p. 12/??

Page 13: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

Speaker Model Adaptation UsingMAP Method (Cont.)

Prior distribution of (µm,Σm) is a normal_Wishart density

g(µm,Σm) ∝ |γm|(αm−p)/2

exp[

−τm

2(µm − mm)tΣm(µm − mm)

]

exp

[

−1

2tr(umΣm)

]

where (τm, mm, αm, um) are the prior density parameters

such that αm > p − 1, τm > 0, mm is a vector of dimension p,

and um is a p ∗ p positive definite matrix.

Bayesian Method in Speaker Verification. December 2, 2004 – p. 13/??

Page 14: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

Speaker Model Adaptation UsingMAP Method (Cont.)

Practical method for parameter adaptation only the mean

vectors will be updated

µ̂m = αmEm(O) + (1 − αm)µm

Bayesian Method in Speaker Verification. December 2, 2004 – p. 14/??

Page 15: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

Experiment Setup, Conclusion and Future Work

Bayesian Method in Speaker Verification. December 2, 2004 – p. 15/??

Page 16: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

Experiment Setup

1996 NIST(National Institute of Standards and Technology)

dataset was used for experiment,

225 Speakers in the Corpus

1 minutes for training and 15 second for testing for each

speaker

Bayesian Method in Speaker Verification. December 2, 2004 – p. 16/??

Page 17: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

System Performance using EMTraining

Bayesian Method in Speaker Verification. December 2, 2004 – p. 17/??

Page 18: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

System Performance using MAPAdaptation

Bayesian Method in Speaker Verification. December 2, 2004 – p. 18/??

Page 19: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

Future Work

Acoustic features robust against noisy environment and

different channels

Incorporation dynamical acoustic features and prosodic

features

Higher level information

novel probabilistic models for speaker modeling

Bayesian Method in Speaker Verification. December 2, 2004 – p. 19/??

Page 20: Chengyuan Ma and Jinyu Li - ISyE Home | ISyEbrani/isyebayes/bank/ISYE8843_Ma_and_Li.pdf · Chengyuan Ma and Jinyu Li School of Electrical and Computer Engineering Bayesian Method

Q & A

Bayesian Method in Speaker Verification. December 2, 2004 – p. 20/??