Research & development Component Score Weighting for GMM based Text-Independent Speaker Verification Liang Lu SNLP Unit, France Telecom R&D Beijing 2008-01-21




Outline

Introduction

Conventional LLR and Motivation for detailed score processing

Component Score Weighting

Experimental Results

Conclusion


Introduction

State of the art GMM-UBM framework

GMM based model construction

Log-likelihood Ratio (LLR) based decision making

Score normalisation (TNorm, HNorm, etc.) for robustness
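The GMM-UBM scoring chain listed above can be sketched as follows. This is a minimal illustration, not the system used in these slides: it assumes diagonal-covariance GMMs stored as hypothetical `(weights, means, variances)` tuples, uses the average per-frame log-likelihood, and implements TNorm in its common form (subtract the mean and divide by the standard deviation of the scores the same test utterance obtains against a cohort of impostor models). The toy models and frames are invented for the example.

```python
import numpy as np

def component_log_scores(X, weights, means, variances):
    """Return log(w_i * p_i(x_t)) for every frame t and component i, shape (T, M)."""
    X = np.asarray(X, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    diff = X[:, None, :] - means[None, :, :]                       # (T, M, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
    log_pdf = log_norm[None, :] - 0.5 * np.sum(diff ** 2 / variances[None, :, :], axis=2)
    return np.log(weights)[None, :] + log_pdf

def avg_log_likelihood(X, gmm):
    """Average over frames of log sum_i w_i p_i(x_t)."""
    log_wp = component_log_scores(X, *gmm)
    return float(np.mean(np.logaddexp.reduce(log_wp, axis=1)))

def llr(X, spk_gmm, ubm_gmm):
    """Log-likelihood ratio between the speaker model and the UBM."""
    return avg_log_likelihood(X, spk_gmm) - avg_log_likelihood(X, ubm_gmm)

def tnorm(score, cohort_scores):
    """TNorm: normalise a score by the statistics of cohort-model scores."""
    cohort = np.asarray(cohort_scores, dtype=float)
    return (score - cohort.mean()) / cohort.std()

# Toy 2-component models (hypothetical values, for illustration only).
ubm = ([0.5, 0.5], [[0.0, 0.0], [3.0, 3.0]], [[1.0, 1.0], [1.0, 1.0]])
spk = ([0.5, 0.5], [[1.0, 1.0], [4.0, 4.0]], [[1.0, 1.0], [1.0, 1.0]])
target_frames = [[1.0, 1.0], [4.0, 4.0]]     # frames close to the speaker model
impostor_frames = [[0.0, 0.0], [3.0, 3.0]]   # frames close to the UBM

print(llr(target_frames, spk, ubm))    # positive: speaker model fits better
print(llr(impostor_frames, spk, ubm))  # negative: UBM fits better
```

A decision is then made by comparing the (optionally normalised) score with a threshold.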


Introduction

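$$\hat{\theta} = \arg\max_{\theta} \frac{|\theta^{T} \Sigma_{bc}\, \theta|}{|\theta^{T} \Sigma_{wc}\, \theta|}$$

$$\Sigma_{bc} = \frac{1}{N} \sum_{j=1}^{J} N_j \left(\mu^{(j)} - \mu\right)\left(\mu^{(j)} - \mu\right)^{T}, \qquad \Sigma_{wc} = \frac{1}{N} \sum_{j=1}^{J} N_j\, \Sigma^{(j)}$$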

Major challenges

Limited data for speaker model training

Mismatch between training and testing data


Motivation for Component Score Weighting

Motivation

Insufficient training data and mismatch between training and testing conditions make the mixtures in a GMM differ in discriminative capability.

The LLR just sums the scores of all mixtures without considering their reliability.

Would it help if the LLR considered the discriminative capability of each mixture?


Question

If it does, how can we explore the discriminative capability of each Gaussian mixture component?


Component Score Weighting

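Our Method

First, scatter the LLR over the individual Gaussian mixtures:

$$\frac{1}{T}\log p(x_t) = \frac{1}{T}\log\Big[w_k p_k(x_t) + \sum_{i=1,\, i\neq k}^{M} w_i p_i(x_t)\Big] = \frac{1}{T}\log w_k p_k(x_t) + \frac{1}{T}\log\Big(1 + \sum_{i=1,\, i\neq k}^{M} \frac{w_i p_i(x_t)}{w_k p_k(x_t)}\Big) = s_{kt} + \tilde{s}_{kt}$$

where the k-th mixture is dominant for frame $x_t$, namely,

$$w_k p_k(x_t) \ge w_i p_i(x_t), \quad i = 1, \dots, M,\ i \neq k.$$

We call $s_{kt}$ the dominant score and $\tilde{s}_{kt}$ the residual score in the original LLR.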
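The split of each frame's log-likelihood into a dominant part (the log of the largest weighted component likelihood) and a residual part (the log of one plus the sum of the remaining weighted likelihoods relative to the dominant one) can be checked numerically. The sketch below assumes the per-frame, per-component values log(w_i p_i(x_t)) are already available in a `(T, M)` array; the function name and the toy numbers are hypothetical.

```python
import numpy as np

def split_frame_scores(log_wp):
    """Split frame log-likelihoods into dominant and residual scores.

    log_wp: (T, M) array holding log(w_i * p_i(x_t)).
    Returns (k, s, s_tilde): the dominant component index per frame, the
    dominant score and the residual score, both scaled by 1/T.
    """
    log_wp = np.asarray(log_wp, dtype=float)
    T = log_wp.shape[0]
    k = np.argmax(log_wp, axis=1)              # dominant mixture per frame
    dom = log_wp[np.arange(T), k]              # log(w_k * p_k(x_t))
    # sum_{i != k} w_i p_i / (w_k p_k): the i == k term contributes exactly 1
    ratio = np.sum(np.exp(log_wp - dom[:, None]), axis=1) - 1.0
    s = dom / T
    s_tilde = np.log1p(ratio) / T
    return k, s, s_tilde

# Toy weighted likelihoods w_i * p_i(x_t) for T = 2 frames and M = 3 mixtures.
wp = np.array([[0.6, 0.3, 0.1],
               [0.2, 0.2, 0.5]])
k, s, s_tilde = split_frame_scores(np.log(wp))

# Dominant plus residual reproduces (1/T) * log p(x_t) for every frame.
full = np.log(wp.sum(axis=1)) / wp.shape[0]
print(k, np.allclose(s + s_tilde, full))
```

Because the dominant term is subtracted before exponentiating, the residual is computed without underflow even when the raw likelihoods are very small.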


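Extend the original LLR

After doing this, the original LLR is split into two score series, the dominant score series and the residual score series:

$$S_d = \{s_1, s_2, \dots, s_M\}, \qquad \tilde{S}_r = \{\tilde{s}_1, \tilde{s}_2, \dots, \tilde{s}_M\}$$

Original:

$$LLR(X) = \sum_{k=1}^{M} \left(s_k + \tilde{s}_k\right)$$

If we consider the discriminative capacity of each Gaussian mixture

Extended:

$$f(X) = \sum_{k=1}^{M} \left[W_d(s_k)\, s_k + W_r(\tilde{s}_k)\, \tilde{s}_k\right]$$

The original LLR corresponds to $W_d = 1,\ W_r = 1$.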
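A small sketch of the extended score, in which each per-component dominant score s_k and residual score s~_k is multiplied by a weight that depends on the score itself. The slides do not spell out how per-frame scores are pooled into per-component scores, so this example assumes that s_k is the sum of the per-frame dominant scores over the frames for which mixture k is dominant (and likewise for the residuals). All names and numbers are hypothetical.

```python
import numpy as np

def component_scores(k, s, s_tilde, M):
    """Pool per-frame scores into per-component totals (assumed pooling rule)."""
    s_k = np.bincount(k, weights=s, minlength=M)
    s_tilde_k = np.bincount(k, weights=s_tilde, minlength=M)
    return s_k, s_tilde_k

def extended_score(s_k, s_tilde_k, W_d=np.ones_like, W_r=np.ones_like):
    """Sum over k of W_d(s_k) * s_k + W_r(s~_k) * s~_k.

    With the default unit weights this reduces to the unweighted sum.
    """
    return float(np.sum(W_d(s_k) * s_k + W_r(s_tilde_k) * s_tilde_k))

# Toy per-frame values: dominant index, dominant score, residual score.
k = np.array([0, 2, 0])
s = np.array([-1.0, -2.0, -0.5])
s_tilde = np.array([0.1, 0.2, 0.3])

s_k, s_tilde_k = component_scores(k, s, s_tilde, M=3)
plain = extended_score(s_k, s_tilde_k)                   # unit weights
no_residual = extended_score(s_k, s_tilde_k, W_r=np.zeros_like)
print(plain, no_residual)
```

Passing the weighting functions as arguments keeps the unweighted score, a dominant-only score, and any other weighting as special cases of one routine.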



Now the question is: how can we know the discriminative capability of each Gaussian mixture, and what should the weighting function $W$ be?

Our assumption: we believe that high dominant scores have better discriminative capability and should be highlighted.



Why the high dominant scores?

If the test utterance is from the target speaker, more components of the speaker GMM should obtain high scores than is the case for the UBM.

If the utterance is from an impostor, the speaker GMM will hardly have more high-valued components than the UBM.

If the test utterance is from the target speaker, low-valued components in the speaker GMM are due to mixtures that are not well trained or to mismatch between training and testing data.



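We simply use an exponential function as the weighting function:

$$W(x) = \exp(x)$$

Figure: exponential weighting curve, with low scores restrained and high scores emphasized.

The residual scores have little importance, so we ignore them in the end.

The final LLR score is as follows:

$$f(X) = \sum_{k=1}^{M} \left[\exp\!\left(s_k^{spk}\right) s_k^{spk} - \exp\!\left(s_k^{ubm}\right) s_k^{ubm}\right]$$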
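The final score, with the exponential weighting applied to the dominant scores of the speaker model and of the UBM and the residual scores dropped, can be written in a few lines. This is a sketch under the assumption that the per-component dominant scores of both models are already available as arrays; `csw_score` and the toy values are hypothetical.

```python
import numpy as np

def csw_score(s_spk, s_ubm):
    """Sum over k of exp(s_k^spk) * s_k^spk - exp(s_k^ubm) * s_k^ubm."""
    s_spk = np.asarray(s_spk, dtype=float)
    s_ubm = np.asarray(s_ubm, dtype=float)
    return float(np.sum(np.exp(s_spk) * s_spk - np.exp(s_ubm) * s_ubm))

# Toy per-component dominant scores for M = 3 mixtures.
s_spk = [-0.5, -1.0, -3.0]
s_ubm = [-1.0, -1.5, -2.5]

unweighted = sum(s_spk) - sum(s_ubm)   # dominant scores with unit weights
weighted = csw_score(s_spk, s_ubm)     # exponential component score weighting
print(unweighted, weighted)
```

When the two models produce identical component scores the weighted score is zero, just as the unweighted difference would be.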


Experimental Results

Figure: DET curves (miss probability vs. false alarm probability, both in %) for four systems: Cepstral GMM-UBM, Cepstral GMM-UBM with CSW, Cepstral GMM-UBM with TNorm, and Cepstral GMM-UBM with CSW&TNorm.

system                EER (%)   MinDCF (x100)
GMM baseline          7.64      4.16
GMM with CSW          7.45      3.66
GMM with TNorm        6.96      3.48
GMM with CSW&TNorm    7.14      3.10

Table: Results for the GMM baseline and for GMM with Component Score Weighting, with and without TNorm

Experiments are performed on the 1conv4w-1conv4w task of the 2006 NIST SRE corpus.
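To make the table easier to read, the relative change each system brings can be computed directly from the reported figures. The snippet below only re-uses the EER and MinDCF values given in the table; the helper name is hypothetical.

```python
# (EER %, MinDCF x100) as reported in the table.
results = {
    "GMM baseline": (7.64, 4.16),
    "GMM with CSW": (7.45, 3.66),
    "GMM with TNorm": (6.96, 3.48),
    "GMM with CSW&TNorm": (7.14, 3.10),
}

def relative_reduction(before, after):
    """Relative reduction in percent; negative means the metric got worse."""
    return 100.0 * (before - after) / before

comparisons = [
    ("GMM baseline", "GMM with CSW"),
    ("GMM with TNorm", "GMM with CSW&TNorm"),
]
for ref, sys_name in comparisons:
    eer_ref, dcf_ref = results[ref]
    eer_sys, dcf_sys = results[sys_name]
    print(f"{sys_name} vs {ref}: "
          f"EER {relative_reduction(eer_ref, eer_sys):+.1f}%, "
          f"MinDCF {relative_reduction(dcf_ref, dcf_sys):+.1f}%")
```

On these figures CSW lowers MinDCF both with and without TNorm, while EER improves without TNorm and rises slightly when CSW is combined with TNorm.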


Conclusion

Splitting the LLR score and considering the discriminative capacity of the Gaussian mixtures helps to cope with insufficient training data and with mismatch between training and testing conditions.

The score weighting function should be consistent with the component score distribution and discriminative capacity.

The exponential weighting function used in this investigation is not universal and may not be optimal. More work is needed to find an optimal weighting function.

