36
Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven [email protected] homes.esat.kuleuven.be/~moonen/

Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven [email protected] homes.esat.kuleuven.be/~moonen

Embed Size (px)

Citation preview

Page 1: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Speech & Audio Processing - Part–II

Digital Audio Signal Processing

Marc MoonenDept. E.E./ESAT-STADIUS, KU Leuven

[email protected]

homes.esat.kuleuven.be/~moonen/

Page 2: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 2

Speech & Audio Processing

• Part-I (H. Van hamme) speech recognition speech coding (+audio coding) speech synthesis (TTS)

• Part-II (M. Moonen): Digital Audio Signal Processing microphone array processing noise- ,echo-, feedback- cancellation (de)reverberation active noise control, 3D audio

PS: selection of topics

Page 3: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 3

Digital Audio Signal Processing

• Aims/scope • Case study: Hearing instruments• Overview• Prerequisites• Lectures/course material/literature• Exercise sessions/project• Exam

Page 4: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 4

Aims/Scope

Aim is 2-fold :

• Speech & audio per se

S & A industry in Belgium/Europe/…

• Basic signal processing theory/principles : Optimal filters

Adaptive filter algorithms (APA, Filtered-X LMS,..)

Kalman filters (linear/nonlinear)

etc...

Page 5: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 5

192

120

07

(Oti

con

)

Case Study: Hearing Instruments 1/12

Hearing Aids (HAs)• Audio input/audio output (`microphone-processing-loudspeaker’)• ‘Amplifier’, but so much more than an amplifier!!• History:

• Horns/trumpets/…• `Desktop’ HAs (1900)• Wearable HAs (1930)• Digital HAs (1980)

• State-of-the-art: • MHz’s clock speed• Millions of arithmetic operations/sec, …• Multiple microphones

Page 6: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 6

Ale

ssa

nd

ro V

olt

a 1

745-

1827

©

Co

chle

ar L

td

Case Study: Hearing Instruments 2/12

Electrical stimulation

for low frequency

Electrical stimulation

for high frequency

Cochlear Implants (CIs)

• Audio input/electrode stimulation output• Stimulation strategy + preprocessing similar to HAs• History:

• Volta’s experiment…• First implants (1960)• Commercial CIs (1970-1980)• Digital CIs (1980)

• State-of-the-art: • MHz’s clock speed, Mops/sec, …• Multiple microphones

Other: Bone anchored HAs, middle ear implants, …

Intra-cochlear electrode

Page 7: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 7

• Hearing loss types:• conductive• sensorineural• mixed

• One in six adults (Europe)

…and still increasing• Typical causes:

• aging • exposure to loud sounds• …

Case Study: Hearing Instruments 3/12

[Source: Lapperre]

Page 8: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 8

Hearing impairment : Dynamic range & audibility

Normal hearing Hearing impaired subjects subjects

Case Study: Hearing Instruments 4/12

Level

100dB

0dB

Page 9: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 9

Hearing impairment : Dynamic range & audibility

Dynamic range compression (DRC) (…rather than `amplification’)

Case Study: Hearing Instruments 5/12

Level

100dB

0dB Input Level (dB)

Ou

tpu

t L

eve

l (d

B)

0dB 100dB

0dB

100dB

Design: multiband DRC, attack time, release time, …

Page 10: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 10

Hearing impairment : Audibility vs speech intelligibility

• Audibility does not imply

intelligibility• Hearing impaired subjects

need 5..10dB larger signal-to-noise ratio (SNR) for speech understanding in noisy environments

• Need for noise reduction (=speech enhancement) algorithms: • State-of-the-art: monaural 2-microphone adaptive noise

reduction• Near future: binaural noise reduction (see below)• Not-so-near future: multi-node noise reduction (see below)

Case Study: Hearing Instruments 6/12

SNR

20dB

0dB30 50 70 90

Hearing loss (dB, 3-freq-average)

Page 11: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 11

HA technology requirements• Small form factor (cfr. user acceptance)

• Low power: 1…5mW (cfr. battery lifetime ≈ 1 week)

• Low processing delay: 10msec (cfr. synchronization with lip reading)

DSP challenges in hearing instruments• Dynamic range compression (cfr supra)

• Dereverberation: undo filtering (`echo-ing’) by room acoustics

• Feedback cancellation

• Noise reduction

Case Study: Hearing Instruments 7/12

Page 12: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 12

DSP Challenges: Feedback Cancellation• Problem statement: Loudspeaker signal is fed back into microphone,

then amplified and played back again

• Closed loop system may become unstable (howling)

• Similar to feedback problem in public address systems (for the musicians amongst you)

Case Study: Hearing Instruments 8/12

Model

F

-

Similar to echo cancellation in GSM handsets, Skype,…

but more difficult due to signal correlation

Page 13: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 13

DSP Challenges: Noise reduction

Multimicrophone ‘beamforming’, typically with 2 microphones, e.g. ‘directional’ front microphone and ‘omnidirectional’ back microphone

Case Study: Hearing Instruments 9/12

“filter-and-sum” the

microphone signals

Page 14: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 14

Binaural hearing: Binaural auditory cues• ITD (interaural time difference)

• ILD (interaural level difference)

• Binaural cues (ITD: f < 1500Hz, ILD: f > 2000Hz) used for• Sound localization• Noise reduction

=`Binaural unmasking’ (‘cocktail party’ effect)

0-5dB

Case Study: Hearing Instruments 10/12

ITD

ILDsignal

Page 15: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 15

Binaural hearing aids• Two hearing aids (L&R) with wireless link & cooperation• Opportunities:

• More signals (e.g. 2*2 microphones) • Better sensor spacing (17cm i.o. 1cm)

• Constraints: power/bandwith/delay of wireless link• ..10kBit/s: coordinate program settings, parameters,…• ..300kBits/s: exchange 1 or more (compressed) audio signals

• Challenges:• Improved localization through cue preservation• Improved noise reduction + benefit from binaural unmasking• Signal selection/filtering, audio coding, synchronisation, …

Case Study: Hearing Instruments 11/12

Page 16: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 16

Future: Multi-node noise reduction – sensor networks

Case Study: Hearing Instruments 12/12

Page 17: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 17

Overview

General speech communication set-up : - background ‘noise’ noise suppression, source separation

- far-end echoes acoustic echo cancellation - reverberation de-reverberation/deconvolution

Applications :• teleconferencing/teleclassing• hands-free telephony• hearing aids, etc..

Page 18: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 18

Overview : Lecture-2

Microphone Array Processing Spatial filtering - Beamforming

Fixed vs. adaptive beamforming

Example filter-and-sum beamformer :

Application: hearing aids

),(1 Y

),(2 Y

),(1 Y

)(1 F

)(2 F

)(mF),( mY

)(MF),( MY

md

),(1 Y

)(S

),( Zcosmd

Page 19: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 19

Overview : Lecture-3

Noise Reduction

`microphone_signal[k] = speech[k] + noise[k]’

• Single-microphone noise reduction – Spectral Subtraction Methods (spectral filtering)– Iterative methods based on speech modeling (Wiener & Kalman Filters)

• Multi-microphone noise reduction– Beamforming revisited– Optimal filtering approach : spectral+spatial filtering

Page 20: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 20

Overview : Lecture-4

Acoustic Echo Cancellation

Adaptive filtering problem:• non-stationary/wideband/… speech signals• non-stationary/long/… acoustic channels

Adaptive filtering algorithmsAEC ControlAEC Post-processing Stereo AEC

Page 21: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 21

Overview : Lecture-5

Acoustic Feedback Cancellation• Ex: Hearing aids• Ex: PA systems• correlation between filter input (`x ’) and near-end signal (‘ n ’)• fixes : noise injection, pitch shifting, notch filtering, ...

amplifier

Page 22: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 22

Overview : Lecture-6

Reverb & De-reverberation

` microphone_signal[k] = filter*speech[k] (+ noise[k]) ’

• Reverb = effect of acoustic channel in between speaker and microphone(s)

• Reverb has an impact on coding, speech recognition, etc.

• Single-microphone de-reverberation– Cepstrum techniques

• Multi-microphone de-reverberation:– Estimation of acoustic impulse responses– Inverse-filtering method– Matched filtering

Page 23: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 23

Overview : Lecture-7

Active Noise Control

• Solution based on `filtered-X LMS’• Application : active headsets/ear defenders

Page 24: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 24

Overview : Lecture-7bis

3D Audio & Loudspeaker Arrays

• Binaural synthesis …with headphones head related transfer functions (HRTF) …with 2+ loudspeakers (`sweet spot’) crosstalk cancellation

Page 25: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 25

Overview : Lecture-8

Case Study:

Signal Processing in Cochlear Implants

1Hr lecture by Cochlear LtD

To be scheduled

Page 26: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 26

Aims/Scope (revisited)

Aim is 2-fold :• Speech & audio per se • Basic signal processing theory/principles : Optimal filtering / Kalman filters (linear/nonlinear) here : speech enhancement other : automatic control, spectral estimation, ... Advanced adaptive filter algorithms here : acoustic echo cancellation other : digital communications, ... Filtered-X LMS here : 3D audio other : active noise/vibration control

Page 27: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 27

Lectures

Lectures: 7*2hrs + 1*1hr– PS: Time budget = (15hrs)*4 = 60 hrs

Course Material: Slides – Use version 2013-2014 ! – Download from DASP webpage

http://homes.esat.kuleuven.be/~dspuser/dasp/

Page 28: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 28

Prerequisites

• H197 Signals & Systems (JVDW)

• HJ09 Digital Signal Processing (I) (PW)

signal transforms, sampling, multi-rate, DFT, …

• HC63 DSP-CIS (MM)

filter design, filter banks, optimal & adaptive filters

Page 29: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 29

Literature

Literature (General) (available in DSP-CIS library) • Simon Haykin `Adaptive Filter Theory’ (Prentice Hall 1996)• P.P. Vaidyanathan `Multirate Systems and Filter Banks’ (Prentice Hall 1993)

Literature (specialized) (some available in DSP-CIS library) • S.L. Gay & J. Benesty `Acoustic Signal Processing for Telecommunication’ (Kluwer 2000)• M. Kahrs & K. Brandenburg (Eds) `Applications of Digital Signal Processing to Audio and Acoustics’ (Kluwer1998)• B. Gold & N. Morgan `Speech and Audio Signal Processing’ (Wiley 2000)

Page 30: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 30

Exercise Sessions/Project

Acoustic source localization – Direction-of-arrival estimation– Noise reduction– Echo cancellation– Simulated set-up

Direction-of-arrival θ

Page 31: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 31

• Runs over 4 weeks (non-consecutive)• Each week

– 1 PC/Matlab session (supervised, 2.5hrs)– 2 ‘Homework’ sesions (unsupervised, 2*2.5hrs)

PS: Time budget = 4*(2.5hrs+5hrs) = 30 hrs

• ‘Deliverables’ after week 2 & 4• Grading: based on deliverables, evaluated during sessions

• TAs: guiliano.bernardi@esat (English+Italian)

alexander.bertrand@esat (English+Dutch)

PS: groups of 2

Acoustic Source Localization Project

Page 32: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 32

Work Plan

– Week 1: Design Matlab simulation set-up– Week 2: Direction-of-arrival (DoA) estimation

*deliverable*

– Week 3: DoA estimation + noise reduction– Week 4: DoA estimation + echo cancellation

*deliverable*

Acoustic Source Localization Project

..be there !

Page 33: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 33

• Oral exam, with preparation time• Open book• Grading

7 for question-1 7 for question-2

+6 for project ___ = 20

Exam

Page 34: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 34

• Oral exam, with preparation time• Open book• Grading

7 for question-1 7 for question-2

+6 for question-3 (related to project work) ___ = 20

September Retake Exam

Page 35: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 35

Website

1) TOLEDO

2) http://homes.esat.kuleuven.be/~dspuser/dasp/

• Contact: guiliano.bernardi@esat • Slides (use `version 2013-2014’ !!)• Schedule• DSP-library• FAQs (send questions to marc.moonen@esat)

Page 36: Speech & Audio Processing - Part–II Digital Audio Signal Processing Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 36

Questions?

1) Ask teaching assistant (during exercises sessions)

2) E-mail questions to teaching assistant or marc.moonen@esat

3) Make appointment marc.moonen@esat ESAT Room 01.69