Digital Signal Processing on Speech Recognition

7/28/2019 Digital Signal Processing on Speech Recognition

1/59

Digital Signal Processing

Related to Speech Recognition

References:

1. A. V. Oppenheim and R. W. Schafer,Discrete-time Signal Processing, 1999

2. X. Huang et. al., Spoken Language Processing, Chapters 5, 6

3. J. R. Deller et. al., Discrete-Time Processing of Speech Signals, Chapters 4-6

4. J. W. Picone, Signal modeling techniques in speech recognition, proceedings of the IEEE,September 1993, pp. 1215-1247

Berlin Chen 2005


2/59

SP- Berlin Chen 2


Digital Signal

Discrete-time signal with discrete amplitude


Manipulate digital signals in a digital computer

[ ]nx


3/59

SP- Berlin Chen 3

Two Main Approaches to Digital Signal

Processing Filtering

Parameter Extraction

Filter

Signal in Signal out

[ ]nx [ ]ny

Parameter

Extraction

Signal in Parameter out

[ ]nx

1

12

11

mc

cc

2

22

21

mc

cc

2

1

Lm

L

L

c

cc

e.g.:

1. Spectrum Estimation

2. Parameters for Recognition

Amplify or attenuate some frequency

components of [ ]nx


4/59

SP- Berlin Chen 4

Sinusoid Signals

: amplitude () : angular frequency (

),

: phase ()

E.g., speech signals can be decomposed assums of sinusoids

[ ] ( ) += nAnx cos

Tf

2

2==

[ ]

=

2cos

nAnx

samples25=T

10

frequencynormalized:

f

f

Period, represented

by number of samples

[ ] [ ]

( ),...2,12

2

2)(cos

===

+=+=

kN

kkN

NnANnxnx

Right-shifted


5/59

SP- Berlin Chen 5

Sinusoid Signals (cont.)

is periodic with a period ofN(samples)

Examples (sinusoid signals)

is periodic with period N=8 is periodic with period N=16

is not periodic

[ ]nx

[ ] [ ]nxNnx=+

( ) ( ) +=++ nANnA cos)(cos)21(2 ,...,kkN ==

N

2

=

[ ] ( )4/cos1 nnx =[ ] ( )8/3cos2 nnx =

[ ] ( )nnx cos3 =


6/59

SP- Berlin Chen 6


[ ] ( )

( )

8

integerspositiveareand824

44cos)(

4cos

4cos

4/cos

1

111

11

1

=

==

+=

+=

=

=

N

kNkNkN

NnNnn

nnx

[ ] ( )

( )

( )

16

numberspositiveareand3

162

8

3

8

3

8

3cos8

3cos8

3cos

8/3cos

2

222

22

2

=

==

+=

+=

=

=

N

kNkNkN

NnNnn

nnx

[ ] ( )( ) ( )( ) ( )

!existtdoesn'

integerspositiveareand

2

cos1cos1cos

cos

3

3

3

33

3

N

kN

kN

NnNnn

nnx

=

+=+==

=

Q


7/59

SP- Berlin Chen 7


Complex Exponential Signal

Use Eulers relation to express complex numbers

( ) sincos

jAAez

jyxz

j +==

+=

( )numberrealaisA

Re

Im

sin

cos

Ay

Ax

=

=

22 yx + 22 yxx

+22 yx

y

+

Imaginary part

real part


8/59

SP- Berlin Chen 8


A Sinusoid Signal

[ ] ( )( ){ }

{ }

jnj

nj

eAe

Ae

nAnx

Re

Re

cos

=

=

+=+

real part


9/59

SP- Berlin Chen 9


Sum of two complex exponential signals with

same frequency

When only the real part is considered

The sum ofNsinusoids of the same frequency isanother sinusoid of the same frequency

( ) ( )

( )

( )

+

++

=

=

+=

+

nj

jnj

jjnj

njnj

Ae

Aee

eAeAe

eAeA

10

10

10

10

( ) ( ) ( ) +=+++ nAnAnA coscoscos 1100

numbersrealareand, 10 AAA


10/59

SP- Berlin Chen 10


Trigonometric Identities

1100

1100

coscos

sinsintan

AA

AA

+

+=

( ) ( )

( )

( )10102120

1010102

120

2

21100

21100

2

cos2

sinsincoscos2

sinsincoscos

++=

+++=

+++=

AAAA

AAAAA

AAAAA


11/59

SP- Berlin Chen 11

Some Digital Signals


12/59

SP- Berlin Chen 12

Some Digital Signals

Any signal sequence can be represented

as a sum ofshift and scaled unit impulse

sequences (signals)

[ ]nx

[ ] [ ] [ ]knkxnxk

=

=

scale/weighted

Time-shifted unit

impulse sequence

[ ] [ ] [ ] [ ] [ ]

[ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ]

( ) [ ] ( ) [ ] ( ) [ ] ( ) [ ] ( ) [ ] ( ) [ ]31211121221

33221101122

3

2

+++++++=

+++++++=

== =

=

nnnnnn

nxnxnxnxnxnx

knkxknkxnxkk

,...3,2,1,0,1,2..., =n


13/59

SP- Berlin Chen 13

Digital Systems

A digital system Tis a system that, given an

input signalx[n], generates an output signal y[n]

[ ] [ ]{ }nxTny =

[ ]nx { }T [ ]ny


14/59

SP- Berlin Chen 14

Properties of Digital Systems

Linear

Linear combination of inputs maps to linear

combination of outputs

Time-invariant (Time-shift) A time shift in the input by m samples gives a shift in

the output by m samples

[ ] [ ]{ } [ ]{ } [ ]{ }nxbTnxaTnbxnaxT 2121 +=+

[ ] [ ]{ } mmnxTmny = ,

time sift [ ][ ] samplesshiftleft)0(if

samplesshiftright)0(if

mmmnx

mmmnx

>+

>


15/59

SP- Berlin Chen 15

Properties of Digital Systems (cont.)

Linear time-invariant (LTI)

The system output can be expressed as a

convolution () of the inputx[n] and theimpulse response h[n]

The system can be characterized by the systems

impulse response h[n], which also is a signal

sequence

If the inputx[n] is impulse , the output is h[n][ ]n

Digital

System

[ ]n [ ]nh

[ ]{ } [ ] [ ] [ ] [ ] [ ] [ ]khknxknhkxnhnxnxTkk

===

=

=


16/59

SP- Berlin Chen 16



Explanation:

[ ] [ ] [ ]knkxnxk

=

=

[ ]{ } [ ] [ ]{ }[ ] [ ]{ }

[ ] [ ]

[ ] [ ]nhnx

knhkx

knTkx

knkxTnxT

k

k

k

=

=

=

=

=

=

=

scale

Time-shifted unitimpulse sequence

Time invariant

Digital

System

[ ]n [ ]nh

linear

Time-invariant

[ ] [ ]

[ ] [ ]knhkn

nhn

T

T

Impulse response

convolution


17/59

SP- Berlin Chen 17


Linear time-invariant (LTI) Convolution

Example[ ]n

0 1

21

3

-2[ ]nh

LTI[ ]nx

?0 1 2

2

3

1

0

3

0 1

23

9

-6

1

2

1 2

32

6

-4

LTI

2

1

2 3

41 3

-2

Length=M=3

Length=L=3

Length=L+M-1

[ ]n3

[ ]12 n

[ ]21 n

0

3

1

11

2

13

-1

4

-2

Sum up

[ ]ny

[ ]nh3

[ ]12 nh

[ ]2nh


18/59

SP- Berlin Chen 18



Convolution: Generalization

Reflect h[k] about the origin ( h[-k]) Slide (h[-k] h[-k+n] orh[-(k-n)] ), multiply it withx[k]

Sum up[ ]kx

[ ]kh

[ ]kh

Reflect Multiply

slide

Sum up

n

[ ] [ ] [ ]

[ ] ( )[ ]nkhkx

knhkxny

k

k

=

=

=

=


19/59

SP- Berlin Chen 19

0 1

21

3

-2

0-1

-21

3

-2

0 1 2

2

3

1

0

3

10

-11

3

-2

1

11

21

01

3

-2

1

2

32

11

3

-2 -1

3

432

1

3

-2 -2

4

0

3

1

11

2

13

-1

4

-2

Sum up

[ ]kh

[ ]kx

[ ]kh [ ] 0, =nny

[ ]1+ kh

[ ]2+ kh

[ ]3+ kh

[ ]4+ kh

[ ] 1, =nny

[ ] 2, =nny

[ ] 3, =nny

[ ] 4, =nny

[ ]ny

Reflect [ ] [ ] [ ]

[ ] [ ]knhkx

nhnxny

k=

=

=

Convolution


20/59

SP- Berlin Chen 20



Convolution is commutative and distributive

[ ] [ ] [ ]

[ ] [ ][ ] [ ]

[ ] [ ]knxkh

knhkx

nxnh

nhnxny

k

k

=

==

=

=

=

*

*

[ ]nh 1 [ ]nh 2 [ ]nh 2 [ ]nh 1

[ ]nh 2

[ ]nh 1

[ ] [ ]nhnh 21 +

An impulse response has finite duration

Finite-Impulse Response (FIR)

An impulse response has infinite duration

Infinite-Impulse Response (IIR)

[ ] [ ] [ ][ ] [ ] [ ]nhnhnx

nhnhnx

12

21

****

=

[ ] [ ] [ ]( )[ ] [ ] [ ] [ ]nhnxnhnx

nhnhnx

21

21

**

*

+=

+

Commutation

Distribution


21/59

SP- Berlin Chen 21


Prove convolution is commutative

[ ] [ ] [ ][ ] [ ]

[ ] [ ] ( )

[ ] [ ]

[ ] [ ]nxnh

knxkh

knmhmnx

knhkx

nhnxny

k

m

k

*

mlet

*

=

=

= =

=

=

=

=

=


22/59

SP- Berlin Chen 22


Linear time-varying System

E.g., is an amplitude modulator[ ] [ ] nnxny 0cos =

[ ] [ ] [ ] [ ]

[ ] [ ] [ ]

[ ] [ ] ( )0000

00011

0

?

101

cosBut

coscos

thatsuppose

nnnnxnny

nnnxnnxny

nnynynnxnx

=

==

==


23/59

SP- Berlin Chen 23


Bounded Input and Bounded Output (BIBO): stable

A LTI system is BIBO if only ifh[n] is absolutely summable

[ ] =k

kh

[ ][ ] nBnynBnx

y

x


24/59

SP- Berlin Chen 24


Causality

A system is casual if for every choice ofn0, the output

sequence value at indexing n=n0 depends on only the inputsequence value forn n0

Any noncausal FIR can be made causal by adding sufficient long

delay

[ ] [ ] [ ] = =

+=K

k

M

mkkk mnxknyny

1000

z-1

z-1

z-1

M

2

0[ ]ny[ ]nx

1

z-1

z-1

z-1

1

2

N


25/59

SP- Berlin Chen 25

Discrete-Time Fourier Transform (DTFT)

Frequency Response

Defined as the discrete-time Fourier Transform of

is continuous and is periodic with period=

is a complex function of

[ ]nh( )jeH

2( )jeH

( ) ( )

jeHjj

j

i

j

r

j

eeH

ejHeHeH

=

+=

magnitudephase

( )j

eH

proportional to twotimes of the sampling

frequency

njne nj sincos +=

Im

Re


26/59

SP- Berlin Chen 26

Discrete-Time Fourier Transform (cont.)

Representation of Sequences by Fourier Transform

A sufficient condition for the existence of Fourier transform

[ ]


27/59

SP- Berlin Chen 27


Convolution Property

( ) [ ]

[ ] [ ] [ ]

( ) [ ] [ ]

[ ] [ ]

( ) ( )

[ ] ( ) ( )

jj

jj

nj

nk

kj

nj

n k

j

k

n

njj

eHeXnhnx

eHeX

enhekx

eknhkxeY

knhkxnhnxny

enheH

=

=

=

==

=

=

=

=

=

=

=

][

'

][][

'

'

( ) ( ) ( )( ) ( ) ( )

( ) ( ) ( )

jjj

jjj

jjj

eHeXeY

eHeXeY

eHeXeY

+=

=

=

knnknn

knn

=+=

=

''

'


28/59

SP- Berlin Chen 28


Parsevals Theorem

Define the autocorrelation of signal

[ ] ( )

= =

deXnxj

n

22

2

1

power spectrum

[ ]nx[ ]

( ) ][][][][

][][

*

*

nxnxlnxlx

mxnmxnR

l

mxx

==

+=

=

=

( ) ( ) ( ) ( )2*

XXXSxx ==

[ ] ( ) ( ) ==

deXdeSnR njnjxxxx2

2

1

2

1

[ ] [ ] [ ] [ ] ( )

=

=

===

=

mm

xx dXmxmxmxR

n

22*

2

10

0Set

The total energy of a signal

can be given in either thetime or frequency domain.

)(

lnnlm

nml

==

+=

222

*

conjugatecomplex:

zyxzz

jyxzjyxz

*

*

=+=

=+=


29/59

SP- Berlin Chen 29

Discrete-Time Fourier Transform (DTFT)

A LTI system with impulse response

What is the output for

[ ]nh[ ]ny [ ] ( ) += nAnx cos

[ ] ( )

[ ] [ ] [ ] ( )

[ ] ( ) [ ]

[ ] ( )( )

( ) [ ]

( ) ( )( ) ( ) ( )

( ) ( ) ( )

( ) ( ) ( )( ) ( ) ( ) ( )( )

[ ] ( ) ( ) ( )( )

jj

jjjj

eHjnjj

eHjjnj

jnj

kj

k

nj

knj

k

nj

nj

eHneHAny

eHneHjAeHneHA

eeHA

eeHAe

eHAe

ekhAe

Aekh

nhAeny

Aenxnxnx

njAnx

j

j

++=

+++++=

=

=

=

=

=

=

=+=

+=

++

+

+

=

+

+

=

+

+

cos

sincos

*

sin

0

10

1

[ ]ny [ ]ny1

( )

( ) attenuate1

amplify1

j

j

eH

eH

Systems frequency

response


30/59

SP- Berlin Chen 30



31/59

SP- Berlin Chen 31

Z-Transform

z-transform is a generalization of (Discrete-Time) Fourier

transform

z-transform of is defined as

Where , a complex-variable

For Fourier transform

z-transform evaluated onthe unit circle

( ) [ ]

=

=n

nznhzH

[ ] ( )zHnh [ ] ( )

j

eHnh

[ ]nh

jrez =

( ) ( ) jez

jzHeH

==

)1( == zez j

complex plane

unit circle

Im

Re

jez=


32/59

SP- Berlin Chen 32

Z-Transform (cont.)

Fourier transform vs. z-transform

Fourier transform used to plot the frequency response of a filter

z-transform used to analyze more general filter characteristics, e.g.

stability

ROC (Region of Converge) Is the set ofzfor which z-transform exists (converges)

In general, ROC is a ring-shaped region and the Fourier

transform exists if ROC includes the unit circle ( )

[ ]


33/59

SP- Berlin Chen 33

Z-Transform (cont.)

An LTI system is defined to be causal, if its impulse

response is a causal signal, i.e.

Similarly, anti-causalcan be defined as

An LTI system is defined to be stable, if for every

bounded input it produces a bounded output

Necessary condition:

That is Fourier transform exists, and therefore z-transform

includes the unit circle in its region of converge

[ ] 0for0 = nnh

Right-sided sequence

Left-sided sequence

[ ]


34/59

SP- Berlin Chen 34

Z-Transform (cont.)

Right-Sided Sequence

E.g., the exponential signal

[ ] [ ] [ ]


37/59

SP- Berlin Chen 37

Z-Transform (cont.)

Finite-length Sequence

E.g.

[ ]

= others,0

10,

.3 4Nna

nh

n

( ) ( ) ( )az

az

zaz

azazzazH

NN

N

NN

n

nN

n

nn

=

===

=

=

11

11

0

11

04

1

1

1

0exceptplane-entireis4 = zzROC

13221 ..... ++++ NNNN azaazz

Im

the unit cycle

3

1

7 poles at zero A pole and zero atis cancelledaz=

4

( )11,

2

== ,..,Nkaez Nkj

k

IfN=8

0 N-1

Re


38/59

SP- Berlin Chen 38

Z-Transform (cont.)

Properties ofz-transform

1. If is right-sided sequence, i.e. and ifROC

is the exterior of some circle, the all finite for which

will be in ROC

If ,ROCwill include

2. If is left-sided sequence, i.e. , the ROCisthe interior of some circle,

If ,ROCwill include

3. If is two-sided sequence, the ROCis a ring

4. The ROC cant contain any poles

[ ]nh [ ] 1,0 nnnh =

01 n

0

rz >z

=z

[ ]nh [ ] 2,0 nnnh =

02


39/59

SP- Berlin Chen 39

Summary of the Fourier and z-transforms


40/59

SP- Berlin Chen 40

LTI Systems in the Frequency Domain

Example 1: A complex exponentialsequence System impulse response

Therefore, a complex exponential input to an LTI systemresults in the same complex exponential at the output, butmodified by

The complex exponential is an eigenfunction of an LTI

system, and is the associated eigenvalue

[ ]nh

[ ] [ ] [ ] [ ]

[ ]

( ) njjkjnj

knj

eeH

ekhe

ekhnhnxny

=

=

==

=

=

-k

)(

-k

( )

response.frequencysystem

theastoreferredoftenisItresponse.impulsesystemthe

ofansformFourier trthe:jeH

[ ] njenx =[ ] [ ] [ ]

[ ] [ ]

[ ] [ ][ ] [ ]knxkh

knhkx

nxnh

nhnxny

k

k

=

=

=

=

=

=

*

*

[ ]{ } [ ]nxeHnxT j=

jeH

jeH

scalar


41/59

SP- Berlin Chen 41

[ ] ( ) ( )

( ) ( )[ ]( ) ( ) ( ) ( )[ ]

( ) ( )[ ]00

00

000

0

0000

0000

0

)()(

)()(

cos

2

2

22

jj

njeHjjnjeHjj

njj*njj

njjjnjjj

eHneHA

eeeHeeeHA

eeHeeHA

eeA

eHeeA

eHny

jj

++=

+=

+=

+=

++

++

LTI Systems in the Frequency Domain (cont.)

Example 2: A sinusoidal sequence

System impulse response

[ ] ( )+= nwAnx 0cos

[ ] ( )

njjnjj eeAeeA

nAnx

00

22

cos 0

+=

+=

[ ]nh

( )

jj

j

j

ee

ie

ie

+=

=

+=

2

1cos

sincos

sincos

( ) ( )

( ) ( )( )000

00

jeHjjj*

j*j

eeHeH

eHeH

=

=

( )

( ) ( )

sincossincos

sincos*

sincos

*

jje

jejyxz

jejyxz

j

j

j

=+=

==

+=+=

magnitude response phase response


42/59

SP- Berlin Chen 42


Example 3: A sum of sinusoidal sequences

A similar expression is obtained for an input consisting of a sum

of complex exponentials

[ ] ( )= +=

K

kkkk nAnx 1 cos

[ ] ( ) ( )[ ]=

++=K

k

j

kk

j

kkk eHneHAny

1

cos

magnitude response phase response


43/59

SP- Berlin Chen 43


Example 4: Convolution Theorem

[ ] [ ]

=

=k

kPnnx

[ ] [ ] 1,


44/59

SP- Berlin Chen 44


Example 5: Windowing Theorem [ ] [ ] ( ) ( )

jj eXeWnwnx 2

1

[ ] [ ]

=

=k

kPnnx

[ ]

=

=

otherwise0

1,......,1,0,1

2cos46.054.0 Nn

N

n

nw

Hamming window

( ) ( ) ( )( )

( )

( )

=

=

=

=

=

=

=

=

=

=

=

k

kP

j

k

m

m

jm

k

j

k

j

jjj

eWP

mkP

eWP

kP

eWP

kPP

eW

eXeWeY

21

2

1

21

22

2

1

2

1

has a nonzero

value when

k

P

m

2

=

Difference Equation Realization


45/59

SP- Berlin Chen 45


for a Digital Filter

The relation between the output and input of a digital

filter can be expressed by

[ ] [ ] [ ] = = +=N

k

M

kkk knxknyny

1 0

( ) ( ) ( ) kN

k

M

kk

k

k zzXzzYzY

= =

+=

1 0

delay property

( )( )( )

=

=

==

N

k

k

k

M

k

k

k

z

z

zX

zYzH

1

0

1

[ ] ( )

[ ] ( )0

0

n

zzXnnx

zXnx

linearity and delay properties

A rational transfer function

Causal:

Rightsided, the ROC outside the

outmost pole

Stable:

The ROC includes the unit circle

Causal and Stable:

all poles must fall inside the unitcircle (not including zeros)

z-1

z-1

z-1

M

2

0[ ]ny[ ]nx

1

z-1

z-1

z-1

1

2

N



46/59

SP- Berlin Chen 46


for a Digital Filter (cont.)

M it d Ph R l ti hi


47/59

SP- Berlin Chen 47

Magnitude-Phase Relationship

Minimum phase system:

The z-transform of a system impulse response sequence ( a

rational transfer function) has all zeros as well as poles inside the

unit cycle

Poles and zeros called minimum phase components

Maximum phase: all zeros (or poles) outside the unit cycle

All-pass system: Consist a cascade of factor of the form

Characterized by a frequency response

with unit (or flat) magnitude for all frequencies

1

11

1

az

z-a*

Poles and zeros occur at conjugate

reciprocal locations

11

11

= az

z-a*

M it d Ph R l ti hi ( t )


48/59

SP- Berlin Chen 48

Magnitude-Phase Relationship (cont.)

Any digital filter can be represented by the cascade of a

minimum-phase system and an all-pass system

( ) ( ) ( )zHzHzH apmin=( )

( )

( ) ( )( ) ( )

( )( ) ( )( )

( )( )( )

( )filter.pass-allais

1

1

filter.phaseminimumaalsois1

:where

1

11

filter)phaseminimumais(1

:asexpressedbecancircle.unittheoutside

1)(1zerooneonlyhasthatSuppose

1

*

1

1

1

*1

1

1

*

1

=

=

sampling rate=8kHz

Digital Signal:Discrete-time

signal with discrete

amplitude

Analog Signal to Digital Signal (cont )


56/59

SP- Berlin Chen 56

Analog Signal to Digital Signal (cont.)

Impulse Train

To

Sequence [ ] ( ))( nTxnxa

=

Sampling

( )

( ) ( ) ( ) ( )

( ) ( ) [ ] ( )

=

=

=

==

==

nn

a

n

aa

s

nTtnxnTtnTx

nTttxtstx

tx

( )tx a

Continuous-Time

Signal

Discrete-TimeSignal

Digital Signal

Discrete-time

signal with discrete

amplitude

[ ]nx

Continuous-Time to Discrete-Time Conversion

( ) ( )

==

nnTtts

Periodic Impulse Train ( ) [ ]nxtxs

byspecifieduniquelybecan

switch

( )tx a

( ) ( )

==

nnTtts

( )

( )

=

=

1

0,0

dtt

tt

1

T0 2T 3T 4T 5T 6T 7T 8T-T-2T



57/59

SP- Berlin Chen 57


A continuous signal sampled at different periods

T

1

( )tx a ( )tx a

( )

( ) ( ) ( ) ( )

( ) ( ) [ ] ( )

=

=

=

==

==

nn

a

n

aa

s

nTtnxnTtnTx

nTttxtstx

tx



58/59

SP- Berlin Chen 58


( )jXa

frequency)(sampling22

ss FT

==

=

TsN

2

1

aliasing distortion

( ) ( )

==

ksk

TjS

2

( ) ( ) ( )

( ) ( )( )

=

=

=

k

sas

as

kjXT

jX

jSjXjX

1

21

N

( )

>

>

2 Ns

NNsQ

( )


59/59

SP- Berlin Chen 59


To avoid aliasing (overlapping, fold over) The sampling frequency should be greater than two times of

frequency of the signal to be sampled (Nyquist) sampling theorem

To reconstruct the original continuous signal

Filtered with a low pass filter with band limit

Convolved in time domain

s

( ) tth s= sinc

( ) ( ) ( )

( ) ( )

=

=

=

=

n

sa

n

aa

nTtnTx

nTthnTxtx

sinc

Ns > 2

Documents

Digital Signal Processing on Speech Recognition