
10/20/2009


Digital Microphone Array

Design, Implementation and Speech Recognition Experiments

Erich Zwyssig

EADS IW UK Ltd.

CSTR - The University of Edinburgh

19th October 2009

Outline

• Motivation

• Background

• Digital Microphone Array – Background

• Digital Microphone Array – Building

• ASR – Methodology

• ASR – Setup

• Results

• Conclusions


Motivation

• Meetings shall be (more)

– efficient

– productive

• AMI / AMIDA consortium

– One research topic

• Instrumented Meeting Room

Instrumented Meeting Room

• Recording Devices

– Audio

– Video

– …

• Distant Speech Recognition

– People don’t like to wear head-mounted microphones


Background

• Distant Speech Recognition (DSR) combines:

– acoustic array processing

– automatic speech recognition (ASR).

• Problems

– Dereverberation

– Noise

Distant Speech Recognition

• A complete DSR system includes:

– microphone array

– algorithm to track the active speaker(s)

– beamforming algorithm to focus on the desired speaker(s)

– post-filtering to enhance the beamforming

– speech recognition engine

– speaker adaptation component


Microphone Array

• Mono

– Directivity through mechanics/acoustics

• Stereo

– add two signals

• Array

– delay-sum

– superdirective beamforming

• linear endfire arrays

Microphone Array

• delay-sum beamforming
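A minimal sketch of delay-sum beamforming, assuming integer sample delays that are already known for the desired speaker. The function name, the `delays` input and the toy pulse are illustrative and not taken from the slides.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay and average them.

    channels: list of equal-length sample lists, one per microphone.
    delays:   per-channel arrival delay in samples (hypothetical steering input).
    """
    n = len(channels[0])
    out = []
    for t in range(n):
        acc = 0.0
        for ch, d in zip(channels, delays):
            idx = t + d  # look ahead by the channel's delay to undo it
            if 0 <= idx < n:
                acc += ch[idx]
        out.append(acc / len(channels))
    return out


# A unit pulse reaches three microphones 0, 1 and 2 samples late.
pulse = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
mics = [pulse, [0.0] + pulse[:-1], [0.0, 0.0] + pulse[:-2]]
aligned = delay_and_sum(mics, [0, 1, 2])    # steered at the source
unsteered = delay_and_sum(mics, [0, 0, 0])  # plain average, no steering
```

With the correct delays the pulse adds coherently and keeps its full height; without steering it is smeared over three samples at a third of the height.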


Digital MEMS microphone array

• MEMS (Micro Electro Mechanical System)

• ultra small microphones

– withstand reflow soldering in automatic manufacturing

– cheap

MEMS devices

• Accelerometers

– Phones

– Game Consoles

• Microphones

• Chemical sensors

• etc.


MEMS microphone

• 20 years of research

• Business

– 2004: $2 million

– 2006: $140 million

– 2011: $922 million (est.)

• Currently about 20 providers

• Main (novel) part is the membrane

MEMS microphone

• Two principles

– frequency modulation scheme (capacitor C modulates oscillator)

– pre-charged capacitor C: Q = V * C = const. (C modulates V)
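The pre-charged capacitor principle can be illustrated with a small worked example. The component values below are hypothetical and only show the direction of the effect: with the charge Q fixed, a change in C produces an opposite change in V.

```python
def capacitor_voltage(charge, capacitance):
    """Voltage across a capacitor holding a fixed charge: V = Q / C."""
    return charge / capacitance


# Hypothetical values: 1 pF at rest, pre-charged to 10 V.
c_rest = 1e-12
q = 10.0 * c_rest

v_rest = capacitor_voltage(q, c_rest)
# Sound pressure moves the membrane closer, so C rises by an assumed 1 %.
v_closer = capacitor_voltage(q, c_rest * 1.01)
```

The voltage drops by roughly 1 % for a 1 % rise in capacitance, which is the signal the microphone's amplifier picks up.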


Digital microphone array - System

• Digital MEMS microphone array

• Digital Signal Processing (DSP)

• Interface (IF)

• Personal Computer (PC)

Digital microphone array – Background

• Analogue Digital Converter (ADC)

• Oversampling ADC

• Digital Signal Processing (DSP)

• Interfaces (IF)


ADC

• Analogue Digital Conversion

– Nyquist converter

– Oversampling converter

• Building blocks

– Low Pass Filter (LPF)

– Analogue Digital Converter (ADC)

– Digital Signal Processing (DSP)

Nyquist converter

• Sample at the Nyquist rate (twice the signal bandwidth)

– e.g. 16 kHz

– Need 96 dB stopband attenuation for audio HiFi

• for the analogue low pass filter -> impossible
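The 96 dB figure matches the ratio between full scale and one quantisation step of a 16-bit converter. The slide does not state the word length, so the 16 bits below are an assumption used to show the arithmetic.

```python
import math


def dynamic_range_db(bits):
    """Ratio between full scale and one quantisation step, in dB."""
    return 20.0 * math.log10(2 ** bits)


range_16 = dynamic_range_db(16)  # assumed 16-bit word length
per_bit = dynamic_range_db(1)    # each extra bit adds about 6 dB
```

An anti-aliasing filter for a Nyquist converter would have to reach this attenuation within a very narrow transition band, which is what makes the analogue filter so demanding.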


Oversampling converter

• Swap resolution with sample frequency
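One common way to trade resolution for sample frequency is a sigma-delta modulator, which emits a 1-bit stream whose local average follows the input. The first-order modulator below is a generic textbook sketch; the slides do not say which modulator the microphones use.

```python
def sigma_delta_1bit(samples):
    """First-order sigma-delta modulator: inputs in [-1, 1] -> stream of +1/-1."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback  # accumulate the error of the last decision
        feedback = 1.0 if integrator >= 0.0 else -1.0
        bits.append(feedback)
    return bits


# A constant input of 0.5, oversampled 64 times.
pdm = sigma_delta_1bit([0.5] * 64)
mean_level = sum(pdm) / len(pdm)
```

Each output sample carries only one bit, yet the average over the 64 samples recovers the input level; a digital low-pass filter plays the role of this average in a real converter.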

Digital Signal Processing

• FIR filter

– e.g. moving average filter
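A moving average is the simplest FIR filter: all coefficients are equal. The sketch below assumes zero history before the first sample; the function name is illustrative.

```python
def moving_average(samples, taps):
    """FIR filter whose `taps` coefficients all equal 1/taps."""
    out = []
    for n in range(len(samples)):
        # Samples before the start of the signal are treated as zero.
        window = samples[max(0, n - taps + 1): n + 1]
        out.append(sum(window) / taps)
    return out


# An alternating signal is flattened completely by a 2-tap average.
smoothed = moving_average([4.0, 0.0, 4.0, 0.0, 4.0, 0.0], 2)
step = moving_average([1.0, 1.0, 1.0, 1.0], 4)
```

The alternating input sits at half the sample rate, where a 2-tap moving average has a zero, so only the mean value survives.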


Digital Signal Processing

• Differentiator

• Integrator

• CIC filter (cascaded integrator-comb filter)
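The three blocks above combine into a CIC decimator: integrators at the input rate, a rate change, then differentiators (combs) at the output rate. This is a generic integer sketch with a comb delay of one output sample, not the filter from the actual design.

```python
def cic_decimate(samples, ratio, stages=1):
    """Cascaded integrator-comb decimator on integer samples."""
    data = list(samples)
    for _ in range(stages):        # integrator section, input rate
        acc = 0
        integrated = []
        for x in data:
            acc += x
            integrated.append(acc)
        data = integrated
    data = data[ratio - 1::ratio]  # keep every `ratio`-th sample
    for _ in range(stages):        # comb section: y[n] = x[n] - x[n-1]
        prev = 0
        combed = []
        for x in data:
            combed.append(x - prev)
            prev = x
        data = combed
    return data


# A 1-bit style stream (+1/-1) decimated by 8: each output sums 8 inputs.
stream = [1, 1, 1, -1] * 8
decimated = cic_decimate(stream, ratio=8)
```

No multiplications are needed, which is why this structure suits the first, high-rate decimation stage. The DC gain is `ratio ** stages`, so the word length has to grow accordingly.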

Interfaces

• PDM

• I2C

• I2S

• AC’97

• USB

• Firewire


Digital microphone array – building

• System Design

• Signal Processing

• DSP implementation

• USB interface

System Design

• Interfaces define the system

– Microphone to DSP -> PDM

– DSP to PC -> ????

• DSP to IF -> AC’97

• IF to PC -> USB


Digital MEMS Microphone Array

System

Signal Processing

• HiFi audio ADCs typically work @ 64 fs

• Need to downsample to fs

• Two options

– Downsample using one filter (requires an FIR filter of 3400th order)

– Downsample in steps

• CIC from 64fs to 8fs

• FIR from 8fs to 4fs to 2fs to fs (using halfband filters)
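The stepwise option can be sketched as a rate plan plus a halfband filter. The value fs = 16 kHz and the Hamming-windowed design are assumptions for illustration; the filters in the actual design were specified separately and are not reproduced here.

```python
import math


def stage_rates(fs, ratios):
    """Sample rate after each decimation stage, starting from 64 * fs."""
    rate = 64 * fs
    rates = [rate]
    for r in ratios:
        rate //= r
        rates.append(rate)
    return rates


def halfband_taps(half_length):
    """Hamming-windowed sinc low-pass with cutoff at a quarter of the sample rate."""
    taps = []
    for k in range(-half_length, half_length + 1):
        ideal = 0.5 if k == 0 else math.sin(math.pi * k / 2) / (math.pi * k)
        window = 0.54 + 0.46 * math.cos(math.pi * k / half_length)
        taps.append(ideal * window)
    return taps


rates = stage_rates(16000, [8, 2, 2, 2])  # CIC by 8, then three halfband stages
taps = halfband_taps(7)                   # 15 taps, centre tap at index 7
```

Every second tap away from the centre is zero (up to rounding), so a halfband stage needs roughly half the multiplications of a general FIR filter of the same length.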


Signal Processing

DSP implementation

• Microphone Interface

• DSP

• FIFO

• AC’97 Interface

• Controller

• Clocking


DSP implementation

USB Interface

• (TI) TAS1020B USB Streaming Controller

– First trials (stereo)

• (TI) TUSB3200A USB Streaming Controller

– 8052 core

– DMA

– Full AC’97 IF support


HW design flow

• DSP design

• HDL design

• Simulation

• Synthesis

• Debugging

DSP design

• Matlab©

– specify filter

• e.g. stopband attenuation

– specify constraints

• e.g. bit width

– export Xilinx coefficients


HDL Design

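• Verilog HDL example (counter)

// functional: counter with asynchronous active-low reset and clock enable
reg [width-1:0] count;

always @(posedge clk or negedge resetn)
begin
  if (~resetn)
    count <= 0;            // reset clears the counter
  else if (ena)
    count <= count + 1;    // count up while enabled
  else
    count <= count;        // otherwise hold the value
end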
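A small behavioural model can serve as a reference when checking simulation waveforms of the counter. The Python sketch below is an illustration only: it evaluates the reset at clock edges, whereas the Verilog reset is asynchronous, and the default width of 8 bits is an assumption.

```python
def counter_step(count, resetn, ena, width=8):
    """One clock edge of the counter: reset wins, then enable, else hold."""
    if not resetn:
        return 0
    if ena:
        return (count + 1) % (1 << width)  # a width-bit register wraps around
    return count


# Five enabled clock edges on a 2-bit counter: 0 -> 1 -> 2 -> 3 -> 0 -> 1.
state = 0
for _ in range(5):
    state = counter_step(state, resetn=1, ena=1, width=2)
```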

Simulation

• Modelsim™ XE example


Synthesis

• Xilinx© ISE® Example

Debugging

• Setup: Logic Analyser and Digital Sampling Oscilloscope


Digital Microphone Array

Limitations

• Windows XP/Vista

– Not more than stereo over USB

• Xubuntu (Linux)

– Limited to 7 channels over USB

• Mac

– tbd


ASR Methodology

• Beamforming (mdm-tools)

– noise removal

– speaker tracking

– beamforming (and post-filtering)

• HMMs trained with HTK on the WSJCAM0 database

– 53 male and 39 female speakers with British English accents

– 11,000 tied-state triphones

– three emitting states per triphone

– 6 Gaussian mixture components per state

– 52-element feature vectors (13 MFCCs including the 0th cepstral coefficient, with 1st, 2nd and 3rd order derivatives)
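The 52 elements follow from 13 static coefficients plus three orders of derivatives (13 x 4). The sketch below uses the regression formula commonly used for delta features, with a window of two frames and repeated edge frames; whether the recognition setup used exactly these settings is not stated on the slide, and the input frames are dummy data.

```python
def deltas(frames, theta=2):
    """Regression-based time derivatives of a feature sequence (list of vectors)."""
    norm = 2.0 * sum(t * t for t in range(1, theta + 1))
    last = len(frames) - 1
    result = []
    for i in range(len(frames)):
        row = []
        for d in range(len(frames[0])):
            acc = 0.0
            for t in range(1, theta + 1):
                ahead = frames[min(i + t, last)][d]  # repeat edge frames
                behind = frames[max(i - t, 0)][d]
                acc += t * (ahead - behind)
            row.append(acc / norm)
        result.append(row)
    return result


def add_derivatives(static, orders=3):
    """Append 1st to `orders`-th derivatives to each static vector."""
    streams = [static]
    for _ in range(orders):
        streams.append(deltas(streams[-1]))
    return [sum((s[i] for s in streams), []) for i in range(len(static))]


static = [[float(t)] * 13 for t in range(10)]  # 10 dummy frames, 13 coefficients
features = add_derivatives(static)
```

For this linear ramp the first derivative in the middle of the sequence is 1 per frame and the higher derivatives vanish there.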

ASR Adaptation

• MAP

– Too little data

• MLLR

– Means-only

– Means and variances (constrained)

• Channel, gender and individual

– Channel -> analogue vs. digital

– Gender -> female vs. male
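Means-only MLLR adapts each Gaussian mean with a shared affine transform, adapted mean = A * mean + b, estimated from the adaptation data. The sketch below only applies a given transform; the 2-dimensional values are hypothetical, and estimating A and b is left to the toolkit.

```python
def mllr_adapt_mean(mean, A, b):
    """Apply an MLLR mean transform: adapted = A @ mean + b."""
    return [sum(a * m for a, m in zip(row, mean)) + offset
            for row, offset in zip(A, b)]


# Hypothetical transform shared by all Gaussians of one regression class.
A = [[1.0, 0.5],
     [0.0, 2.0]]
b = [0.5, -1.0]
adapted = mllr_adapt_mean([2.0, 4.0], A, b)
```

Because one transform is shared by many Gaussians, far less data is needed than for MAP, which updates each Gaussian separately.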


ASR Setup

• WSJCAM0 prompts

– Adaptation sentences (17, approx. 1 min)

– 5k sentences (38, approx. 7 min)

– 20k sentences (38, approx. 7 min)

• 12 participants

– 6 female / 6 male

– All UK English

• Recorded with

– AMI/AMIDA analogue microphone array

– Newly developed digital microphone array

Recording setup


ASR adaptation scenarios

Results

[Bar chart: WER [%] (axis 0 to 60) for the analogue and the digital array; x-axis labels: None, MLLR Channel, MLLR Speaker and Channel]
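WER (word error rate) is the word-level edit distance between the reference and the recognised sentence, divided by the number of reference words. The sketch below is a generic implementation, not the scoring tool used for these experiments, and the sentences are invented.

```python
def word_error_rate(reference, hypothesis):
    """WER in percent: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return 100.0 * prev[-1] / len(ref)


# One deleted word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```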


Conclusions

• The newly designed digital microphone array compares well with the analogue one

• No effect of speaking rate

• Two categories of speakers

– Ones the system likes (sheep), and others it dislikes (goats)

• With adaptation, the gap between speakers and between channels closes

Conclusions


Conclusions

Acknowledgements

• Steve and Mike for the use of the AMI/AMIDA setup

• Knowles Electronics and Bernafon Ltd. for supplying the digital MEMS microphones

• Wolfson Microelectronics plc for access to DSP filters, Keil µVision Compiler and EEPROM programmer


Thanks

• To you, the audience, for coming here today

• Any questions?

Demo

• IF 3.07 (Instrumented Meeting Room) for the next 30 min


References

• Zwyssig, E., “Digital Microphone Array - Design, Implementation and Speech Recognition Experiments”, thesis submitted towards the degree of MSc, The University of Edinburgh, Aug 2009

• Zwyssig, E., Lincoln, M., Renals, S., “A digital microphone array for distant speech recognition”, submitted for ICASSP 2010, Dallas, February 2010

Distant Speech Recognition

• ASR

– HMM

– GMM

– Viterbi

– LM

• RSR

– Adaptation

• MAP

• MLLR

– FE

• CMN

• CVN

• VTLN

• DSR

– Microphone Array

– Beamforming

– Wiener filter

• Noise removal

• AMI/AMIDA projects

– develop smart meeting room

• Current limitations are

– portability

– cheap commodity HW


ASR – Flow (adaptive)

ASR – Methodology
