Noise reduction and binaural cue preservation of multi-microphone algorithms
Simon Doclo, Tim van den Bogaert, Marc Moonen, Jan Wouters
Dept. of Electrical Engineering (ESAT-SCD), KU Leuven, Belgium
Dept. of Neurosciences (ExpORL), KU Leuven, Belgium
Oldenburg, June 28 2007
Overview
• Problem statement
o Improve speech intelligibility + preserve spatial awareness
o Bilateral vs. binaural processing
• Binaural signal processing using multi-channel Wiener filter
o MWF: noise reduction and preservation of speech cues, noise cues are distorted
o Extension of MWF to preserve binaural cues of all components:
– MWFv: partial estimation of noise component
– MWF-ITF: extension with Interaural Transfer Function
o Physical and perceptual evaluation
• Reduce bandwidth requirements of wireless link
o Distributed binaural MWF
Problem statement
• Many hearing-impaired people are fitted with a hearing aid at both ears
o Signal processing to selectively enhance useful speech signal and improve speech intelligibility
o Signal processing to preserve directional hearing and spatial awareness
o Multiple microphones available: spectral + spatial processing
• Binaural auditory cues:
o Interaural Time Difference (ITD) – Interaural Level Difference (ILD)
o Binaural cues, in addition to spectral and temporal cues, play an important role in binaural noise reduction and sound localisation
o ITD: f < 1500Hz, ILD: f > 2000Hz
[Figure: illustration of the IPD/ITD and ILD cues]
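As a rough illustration of the two cues, the sketch below estimates a broadband ILD from the power ratio between the ears and an ITD from the peak of the cross-correlation. The function names, the toy signals and the 10-sample delay are assumptions made for this example; only the 20.48 kHz sampling rate is taken from the experimental setup of this talk.

```python
import numpy as np

def estimate_ild_db(left, right):
    """Interaural level difference in dB (power of left relative to right)."""
    return 10.0 * np.log10(np.mean(left ** 2) / np.mean(right ** 2))

def estimate_itd_s(left, right, fs):
    """Interaural time difference in seconds from the cross-correlation peak.

    A positive value means that the left signal lags the right signal.
    """
    xcorr = np.correlate(left, right, mode="full")
    lag = np.argmax(xcorr) - (len(right) - 1)   # index 0 is lag -(len(right) - 1)
    return lag / fs

# Toy example (hypothetical values): the source is on the right, so the left
# ear receives a delayed and attenuated copy of the right-ear signal.
fs = 20480                                      # sampling rate in Hz
rng = np.random.default_rng(0)
right = rng.standard_normal(2048)
delay = 10                                      # delay in samples
left = 0.5 * np.concatenate([np.zeros(delay), right[:-delay]])

ild = estimate_ild_db(left, right)              # close to -6 dB
itd = estimate_itd_s(left, right, fs)           # 10 samples at fs
```

In practice both cues are evaluated per frequency band, since the ITD is mainly used below 1500 Hz and the ILD above 2000 Hz.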
Problem statement
-bilateral/binaural Binaural processing
Bandwidth reduction
Conclusions
Bilateral vs. Binaural
• Bilateral system
o Independent left/right processing: preservation of binaural cues for localisation?
• Binaural system
o More microphones: better performance? Preservation of binaural cues?
o Need for a binaural link
• Bilateral system:
o Independent processing of left and right hearing aid
o Localisation cues are distorted
RMS error per loudspeaker when accumulating all responses of the different test conditions (NH = normal hearing, NO = hearing impaired without hearing aids, O = omnidirectional configuration, A = adaptive directional configuration)
[Van den Bogaert et al., 2006]
Distorted localisation cues also affect intelligibility, through the binaural hearing advantage
• Binaural system:
o Cooperation between left and right hearing aid (e.g. wireless link)
o Assumption: all microphone signals are available at the same time
• Objectives/requirements for binaural algorithm:
1. SNR improvement: noise reduction, limit speech distortion
2. Preservation of binaural cues (speech/noise) to exploit binaural hearing advantage
3. No assumption about position of speech source and microphones
[Van den Bogaert et al., 2006]
Configuration and signals
• Configuration: microphone array with M microphones at left and right hearing aid, communication between hearing aids
$$Y_{0,m}(\omega) = X_{0,m}(\omega) + V_{0,m}(\omega), \quad m = 0, \dots, M_0 - 1$$
with X_{0,m} the speech component and V_{0,m} the noise component in the m-th microphone signal
$$Z_0(\omega) = W_0^H(\omega)\, Y(\omega), \qquad Z_1(\omega) = W_1^H(\omega)\, Y(\omega)$$
• Use all microphone signals to compute output signal at both ears
Overview of cost functions
• Multi-channel Wiener filter (MWF): MMSE estimate of speech component in microphone signal at both ears
• Trade-off between noise reduction and speech distortion:
o Speech-distortion weighted multi-channel Wiener filter (SDW-MWF) [Doclo 2002, Spriet 2004]
• Binaural cue preservation of speech + noise:
o Partial estimation of noise component (MWFv) [Klasen 2005]
o Extension with ITD-ILD or Interaural Transfer Function (ITF) [Doclo 2005, Klasen 2006, Van den Bogaert 2007]
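Binaural multi-channel Wiener filter
• Binaural SDW-MWF: estimate of speech component in microphone signal at both ears (usually front microphone) + trade-off between noise reduction and speech distortion
$$J_{SDW}(W) = E\left\{ \left\| \begin{bmatrix} X_{0,r_0} - W_0^H X \\ X_{1,r_1} - W_1^H X \end{bmatrix} \right\|^2 + \mu \left\| \begin{bmatrix} W_0^H V \\ W_1^H V \end{bmatrix} \right\|^2 \right\}$$
with X_{0,r_0} and X_{1,r_1} the speech components in the front microphones; the first term measures speech distortion, the second term noise reduction
$$W_{SDW} = R^{-1} r, \qquad R = \begin{bmatrix} R_x + \mu R_v & 0_M \\ 0_M & R_x + \mu R_v \end{bmatrix}, \qquad r = \begin{bmatrix} r_{x0} \\ r_{x1} \end{bmatrix}, \qquad R_x = R_y - R_v$$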
o Depends on second-order statistics of speech and noise
o Estimate R_y during speech-dominated time-frequency segments and R_v during noise-dominated segments; this requires a robust voice activity detection (VAD) mechanism
o No assumptions about positions of microphones and sources
o Adaptive (LMS-based) algorithm available [Spriet 2004, Doclo 2007]
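The batch estimation procedure described above can be sketched for a single frequency bin as follows. This is a minimal illustration under stated assumptions, not the implementation used for the results in this talk: the synthetic data (one speech source, one directional noise source plus sensor noise), the helper names `crandn`, `estimate_covariance` and `sdw_mwf`, and all numerical values except μ = 5 are hypothetical, and a perfect VAD is simulated by generating the noise-only frames separately.

```python
import numpy as np

rng = np.random.default_rng(0)

def crandn(*shape):
    """Circular complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def estimate_covariance(frames):
    """Sample estimate of E{Y Y^H} from frames of shape (num_frames, M)."""
    return frames.T @ frames.conj() / frames.shape[0]

def sdw_mwf(R_y, R_v, ref, mu):
    """Solve (R_x + mu R_v) W = R_x e_ref, with R_x = R_y - R_v."""
    R_x = R_y - R_v
    return np.linalg.solve(R_x + mu * R_v, R_x[:, ref])

# Synthetic data for one frequency bin (all values hypothetical).
M, T, mu, ref = 4, 4000, 5.0, 0
A = crandn(M)                                # speech transfer functions
B = crandn(M)                                # transfer functions of one noise source

def noise_frames():
    """Directional noise source plus uncorrelated sensor noise."""
    return crandn(T, 1) * B + 0.3 * crandn(T, M)

X = crandn(T, 1) * A                         # speech component
V = noise_frames()                           # noise during speech activity
R_y = estimate_covariance(X + V)             # from speech-dominated segments
R_v = estimate_covariance(noise_frames())    # from noise-only segments

W = sdw_mwf(R_y, R_v, ref, mu)

def power(z):
    return np.mean(np.abs(z) ** 2)

# Z = W^H Y for every frame is computed as frames @ conj(W).
snr_in = 10 * np.log10(power(X[:, ref]) / power(V[:, ref]))
snr_out = 10 * np.log10(power(X @ W.conj()) / power(V @ W.conj()))
```

For the binaural filter, the same function would be called twice, once with the left and once with the right front microphone as reference.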
• Interpretation for single speech source:
o Spectral and spatial filtering operation
$$W_{SDW,0} = \frac{\Gamma_v^{-1} A}{A^H \Gamma_v^{-1} A} \cdot \frac{A^H \Gamma_v^{-1} A}{A^H \Gamma_v^{-1} A + \mu P_v / P_s} \cdot A^*_{0,r_0}$$
with A the acoustic transfer functions of the speech source, Γ_v the (spatial) coherence matrix of the noise and P_s, P_v the (spectral) powers of speech and noise; A^H Γ_v^{-1} A reflects the spatial separation between speech and noise sources, and P_s / P_v is the SNR
o Equivalent to superdirective beamformer (diffuse noise field) or delay-and-sum beamformer (spatially white noise field) + single-channel WF-based postfilter (spectral subtraction)
• Binaural cues (ITD-ILD):
o Perfectly preserves binaural cues of speech component
o Binaural cues of noise component → those of the speech component !!
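The statement about the binaural cues can be checked numerically. The sketch below assumes a single speech source, so that R_x has rank one, and uses exact (not estimated) correlation matrices; the random transfer functions, the microphone indices and the helper names are hypothetical. Because the left and right filters are then parallel, the interaural transfer function at the output equals the input ITF of the speech for every signal component, including the noise.

```python
import numpy as np

rng = np.random.default_rng(1)
M, r0, r1, mu = 4, 0, 2, 5.0     # hypothetical layout: mics 0-1 left, 2-3 right

def crandn(*shape):
    """Circular complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

A = crandn(M)                    # transfer functions of the speech source
B = crandn(M)                    # transfer functions of one noise source
R_x = np.outer(A, A.conj())      # speech correlation matrix (P_s = 1)
R_v = np.outer(B, B.conj()) + 0.1 * np.eye(M)

# Left and right SDW-MWF, each estimating the speech in its own front microphone.
W0 = np.linalg.solve(R_x + mu * R_v, R_x[:, r0])
W1 = np.linalg.solve(R_x + mu * R_v, R_x[:, r1])

def output_itf(c):
    """ITF at the output for a source with transfer functions c."""
    return np.vdot(W0, c) / np.vdot(W1, c)   # (W0^H c) / (W1^H c)

itf_speech_in = A[r0] / A[r1]
itf_noise_in = B[r0] / B[r1]
itf_speech_out = output_itf(A)   # equals itf_speech_in
itf_noise_out = output_itf(B)    # also equals itf_speech_in, not itf_noise_in
```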
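Partial noise estimation (MWFv)
• Partial estimation of noise component
o Estimate of sum of speech component and scaled noise component
$$J_{MWFv}(W) = E\left\{ \left\| \begin{bmatrix} X_{0,r_0} + \eta V_{0,r_0} - W_0^H Y \\ X_{1,r_1} + \eta V_{1,r_1} - W_1^H Y \end{bmatrix} \right\|^2 \right\}$$
$$J_{MWFv}(W) = E\left\{ \left\| \begin{bmatrix} X_{0,r_0} - W_0^H X \\ X_{1,r_1} - W_1^H X \end{bmatrix} \right\|^2 + \mu \left\| \begin{bmatrix} \eta V_{0,r_0} - W_0^H V \\ \eta V_{1,r_1} - W_1^H V \end{bmatrix} \right\|^2 \right\}$$
o Relationship with SDW-MWF: mix with reference microphone signals
$$Z_0 = \eta\, Y_{0,r_0} + (1 - \eta)\, Z_{SDW,0}, \qquad Z_1 = \eta\, Y_{1,r_1} + (1 - \eta)\, Z_{SDW,1}$$
– reduction of noise reduction performance
– works for multiple noise sources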
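The relationship with the SDW-MWF can be verified with a few lines of linear algebra. In the sketch below, the mixing parameter is written as η and the trade-off parameter as μ; the random correlation matrices and the value η = 0.2 are only illustrative. Setting the gradient of the weighted cost function to zero gives (R_x + μ R_v) W = R_x e + μ η R_v e, where e selects the reference microphone, and this solution coincides with the mix of the reference microphone and the SDW-MWF filter.

```python
import numpy as np

rng = np.random.default_rng(2)
M, ref, mu, eta = 4, 0, 5.0, 0.2        # hypothetical values for one ear

def random_covariance(rank):
    """Random Hermitian positive semidefinite matrix of the given rank."""
    G = rng.standard_normal((M, rank)) + 1j * rng.standard_normal((M, rank))
    return G @ G.conj().T

R_x = random_covariance(1)              # single speech source
R_v = random_covariance(M)              # full-rank noise (several sources)
e = np.zeros(M)
e[ref] = 1.0                            # selects the reference microphone

# SDW-MWF: estimate the speech component in the reference microphone.
W_sdw = np.linalg.solve(R_x + mu * R_v, R_x @ e)

# MWFv solved directly from its normal equations.
W_direct = np.linalg.solve(R_x + mu * R_v, R_x @ e + mu * eta * (R_v @ e))

# MWFv obtained by mixing the reference microphone with the SDW-MWF filter.
W_mixed = eta * e + (1.0 - eta) * W_sdw
```

Since the relation holds for any R_v, it does not rely on a single noise source, which matches the remark that the approach works for multiple noise sources.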
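Interaural Wiener filter (MWF-ITF)
• Extension of SDW-MWF with binaural cues
o Add term related to binaural cues of noise (and speech) component
o Possible cues: ITD, ILD, Interaural Transfer Function (ITF)
$$J_{tot}(W) = J_{SDW}(W) + \alpha J_{cue}^x(W) + \beta J_{cue}^v(W)$$
$$ITF_{out}^v = \frac{Z_{v0}}{Z_{v1}} = \frac{W_0^H V}{W_1^H V}$$
$$ITF_{in}^v = \frac{V_{0,r_0}}{V_{1,r_1}}, \quad \text{e.g.} \quad ITF_{in}^v = \frac{E\{V_{0,r_0} V_{1,r_1}^*\}}{E\{V_{1,r_1} V_{1,r_1}^*\}} = \frac{R_v(r_0, r_1)}{R_v(r_1, r_1)}$$
$$J_{tot}(W) = E\left\{ \left\| \begin{bmatrix} X_{0,r_0} - W_0^H X \\ X_{1,r_1} - W_1^H X \end{bmatrix} \right\|^2 + \mu \left\| \begin{bmatrix} W_0^H V \\ W_1^H V \end{bmatrix} \right\|^2 \right\} + \alpha\, E\left\{ \left| W_0^H X - ITF_{in}^x W_1^H X \right|^2 \right\} + \beta\, E\left\{ \left| W_0^H V - ITF_{in}^v W_1^H V \right|^2 \right\}$$
The α term enforces ITF preservation of the speech component, the β term ITF preservation of the noise component.
o Closed-form expression!
o Large β changes the direction of the speech → increase weight α
o Implicit assumption of single noise source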
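Because every term of the total cost function is quadratic in the stacked filter W = [W_0; W_1], the minimiser follows from one linear system. The sketch below is an illustration under assumptions rather than the authors' code: the weights are called `alpha` and `beta`, the input ITFs are computed from the correlation matrices as in the "e.g." expression, and the test scenario (one speech source, one noise source plus sensor noise) is hypothetical.

```python
import numpy as np

def mwf_itf(R_x, R_v, r0, r1, mu=1.0, alpha=0.0, beta=0.0):
    """Closed-form minimiser of J_SDW + alpha*J_ITF^x + beta*J_ITF^v."""
    M = R_x.shape[0]
    itf_x = R_x[r0, r1] / R_x[r1, r1]        # input ITF of the speech
    itf_v = R_v[r0, r1] / R_v[r1, r1]        # input ITF of the noise

    def itf_term(R, itf):
        # E|W0^H C - itf W1^H C|^2 written as a quadratic form in [W0; W1].
        return np.block([[R, -np.conj(itf) * R],
                         [-itf * R, abs(itf) ** 2 * R]])

    R_sdw = np.kron(np.eye(2), R_x + mu * R_v)   # block diagonal SDW-MWF part
    r = np.concatenate([R_x[:, r0], R_x[:, r1]])
    R_tot = R_sdw + alpha * itf_term(R_x, itf_x) + beta * itf_term(R_v, itf_v)
    W = np.linalg.solve(R_tot, r)
    return W[:M], W[M:]

# Hypothetical scenario: one speech source, one noise source plus sensor noise.
rng = np.random.default_rng(3)
M, r0, r1 = 4, 0, 2

def crandn(n):
    return (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)

A, B = crandn(M), crandn(M)
R_x = np.outer(A, A.conj())
R_v = np.outer(B, B.conj()) + 0.1 * np.eye(M)

def noise_itf_error(W0, W1):
    """Value of E|W0^H V - ITF_in^v W1^H V|^2 for the given filters."""
    itf_v = R_v[r0, r1] / R_v[r1, r1]
    d = W0 - np.conj(itf_v) * W1
    return float(np.real(np.vdot(d, R_v @ d)))

err_mwf = noise_itf_error(*mwf_itf(R_x, R_v, r0, r1, mu=5.0))
err_itf = noise_itf_error(*mwf_itf(R_x, R_v, r0, r1, mu=5.0, beta=10.0))
```

With `beta = 0` the function reduces to the binaural SDW-MWF; increasing `beta` lowers the noise ITF error at the expense of the other terms of the cost function.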
Simulation setup
• Identification of HRTFs:
o Binaural recordings on CORTEX MK2 artificial head
o 2 omni-directional microphones on each hearing aid (d=1 cm)
o Loudspeakers (LS) at -90°:15°:90° and 90°:30°:270°, 1 m from head
o Room reverberation: T60=140 ms (and T60=510 ms)
Experimental results
• Simulations:
o SxNy (speech source at angle x, noise source at angle y), SNR = 0 dB on left front microphone (broadband)
o fs = 20.48 kHz
• MWF algorithmic parameters:
o batch procedure, perfect VAD
o L=96, μ=5
o MWFv for different values of η, MWF-ITF for different weights α, β
• Physical evaluation:
o Speech = HINT, noise = babble noise
o Speech intelligibility: SNR
o Localisation: ITD / ILD
• Perceptual evaluation:
o Preliminary study with NH subjects
o Speech intelligibility: SRT
o Localisation: localise S and N
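Physical evaluation
• Performance measures:
o Intelligibility weighted SNR improvement (left/right)
$$\Delta SNR_L = \sum_i I_i\, \Delta SNR_{L,i}$$
with I_i the importance of the i-th frequency for speech intelligibility
o ILD error (speech/noise component), based on the power ratio
$$\Delta ILD_x = \sum_i \left| ILD_{out}^x(\omega_i) - ILD_{in}^x(\omega_i) \right|$$
o ITD error (speech/noise component), based on the phase of the cross-correlation
$$\Delta ITD_x = \sum_i I_i\, \Delta ITD_x(\omega_i), \qquad \Delta ITD_x(\omega_i) = \left| \angle E\{X_{0,r_0} X_{1,r_1}^*\} - \angle E\{Z_{x0} Z_{x1}^*\} \right|$$
where the weights I_i implement a low-pass filter at 1500 Hz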
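The three measures can be sketched per frequency band as below. The band importance weights, SNR values, powers and cross-correlations are made-up numbers chosen only to make the arithmetic easy to follow, and the function names are hypothetical. The weights are assumed to sum to one.

```python
import numpy as np

def weighted_snr_improvement(snr_in_db, snr_out_db, importance):
    """Sum over bands of I_i * (SNR_out,i - SNR_in,i), all in dB."""
    gain = np.asarray(snr_out_db) - np.asarray(snr_in_db)
    return float(np.sum(np.asarray(importance) * gain))

def ild_error_db(p_left_in, p_right_in, p_left_out, p_right_out):
    """Per-band |ILD_out - ILD_in|, with the ILD defined as a power ratio in dB."""
    ild_in = 10 * np.log10(np.asarray(p_left_in) / np.asarray(p_right_in))
    ild_out = 10 * np.log10(np.asarray(p_left_out) / np.asarray(p_right_out))
    return np.abs(ild_out - ild_in)

def itd_error_rad(cross_in, cross_out):
    """Per-band phase difference between input and output cross-correlations.

    Multiplying by the conjugate keeps the result wrapped to [0, pi].
    """
    return np.abs(np.angle(np.asarray(cross_out) * np.conj(np.asarray(cross_in))))

# Hypothetical four-band example.
importance = [0.2, 0.3, 0.3, 0.2]
snr_gain = weighted_snr_improvement([0, 0, 0, 0], [10, 12, 8, 6], importance)

# Left/right power ratio drops from 4 to 2: the ILD changes by 10*log10(2) dB.
ild_err = ild_error_db([4.0], [1.0], [2.0], [1.0])

# Interaural phase changes from 0.3 rad to 0.5 rad.
itd_err = itd_error_rad([np.exp(0.3j)], [np.exp(0.5j)])
```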
Physical evaluation: MWFv (S0N60)
[Figure: six panels, titled "auditec 60deg (μ=5, L=96, N=4)", with a horizontal axis from 0 to 1: SNR improvement left and right [dB], ILD error of speech and noise [dB], ITD error of speech and noise [rad]]
Perceptual evaluation
• Procedure:
o Headphone experiments, using measured HRTFs
o Filters are calculated off-line on VU speech-weighted noise as S and multitalker babble noise as N
o All stimuli presented at comfort level, 5 NH subjects (ongoing)
[Figure: block diagram with the labels speech, noise, HRTFx, HRTFv, Mic L/R, Binaural filter, G, Headphones]
• Speech intelligibility:
o Adaptive procedure to find 50% Speech Reception Threshold (SRT)
• Localisation:
o S and N components (telephone) are sent separately through filter
o Localise S and N in room where HRTFs were measured
o Level roving 6 dB, 3 repetitions per condition for each subject
Perceptual evaluation: MWFv
• Algorithms: unprocessed, state-of-the-art bilateral, MWF, MWFv (η=0.2)
• Conditions: S0N60, S45N315 and S90N270 (T60=510 ms)
[Figure: four localisation panels at 0 dB (unproc, classic, mwf0, mwf02), vertical axis from -90 to 90 in steps of 15, for the stimuli N270, N315, S0, S45, N60 and S90]
Perceptual evaluation: MWFv
• With state-of-the-art systems: preservation of binaural cues only within central angle of frontal hemisphere.
• Binaural MWF:
o preserves localisation cues for speech source
o preserves localisation cues for noise source(s) when a small fraction of the microphone signal is mixed in
o Recent SRT experiments (N=2) show no substantial SRT difference between η=0 and η=0.2
• Ongoing research:
o Perceptual evaluation (SRT and localisation) for MWF-ITF
Bandwidth constraints
• Binaural MWF:
o 2M microphone signals are transmitted over wireless link
• Reduce bandwidth requirement of wireless link:
o Transmit one signal from contralateral ear
– Front contralateral microphone signal
– Output of contralateral fixed (e.g. superdirective) beamformer
– Output of MWF using only M contralateral microphone signals
– Iterative distributed binaural MWF scheme
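The simplest of these options, transmitting only the front contralateral microphone signal, amounts to solving the MWF on a subset of the microphones. The sketch below compares the SDW-MWF cost for the left ear when using its own microphones only, its own microphones plus the contralateral front microphone, and all microphones. The microphone layout, the random scenario and the function names are assumptions for illustration; the iterative distributed scheme itself is not reproduced here.

```python
import numpy as np

def sdw_cost(W, R_x, R_v, ref, mu):
    """E|X_ref - W^H X|^2 + mu * E|W^H V|^2 for one ear."""
    e = np.zeros(len(W))
    e[ref] = 1.0
    d = e - W
    return float(np.real(np.vdot(d, R_x @ d) + mu * np.vdot(W, R_v @ W)))

def mwf_on_subset(R_x, R_v, ref, mics, mu):
    """SDW-MWF that only uses the microphones in `mics` (zero weight elsewhere)."""
    idx = np.ix_(mics, mics)
    w_sub = np.linalg.solve(R_x[idx] + mu * R_v[idx], R_x[mics, ref])
    W = np.zeros(R_x.shape[0], dtype=complex)
    W[mics] = w_sub
    return W

# Hypothetical layout: mics 0-1 on the left aid, 2-3 on the right aid,
# mic 0 is the left reference and mic 2 the contralateral front microphone.
rng = np.random.default_rng(4)
M, ref, mu = 4, 0, 5.0

def crandn(n):
    return (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)

A, B = crandn(M), crandn(M)
R_x = np.outer(A, A.conj())
R_v = np.outer(B, B.conj()) + 0.1 * np.eye(M)

subsets = {"bilateral": [0, 1], "MWF-front": [0, 1, 2], "MWF-full": [0, 1, 2, 3]}
costs = {name: sdw_cost(mwf_on_subset(R_x, R_v, ref, mics, mu), R_x, R_v, ref, mu)
         for name, mics in subsets.items()}
```

Adding signals can never increase the optimal cost, so the interesting question, addressed by the evaluation in this talk, is how much of the full binaural performance is retained with a single transmitted signal.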
Physical evaluation
[Figure: Performance comparison of MWF-based binaural algorithms. AI weighted SNR improvement (dB), axis from 8 to 22 dB, versus noise source(s) angle (°), for MWF-full, MWF-front, MWF-contra and MWF-iter]
Performance of the distributed binaural MWF (dB-MWF) is close to that of the full binaural MWF!
Contralateral directivity pattern
• T60=140 ms, S0N120, left HA, fullband
[Figure: four polar directivity patterns (0° to 330° in steps of 30°, levels in dB)]
– B-MWF: SNR=14.6 dB
– MWF-front: SNR=10.5 dB
– MWF-contra: SNR=14.2 dB
– dB-MWF: SNR=14.2 dB
Conclusions
• State-of-the-art signal processing in (bilateral) HAs: preservation of binaural cues only within central angle of frontal hemisphere
• Binaural MWF:
o Substantial noise reduction (MWF 4 3 > 2)
o Preservation of binaural speech cues
o Distortion of binaural noise cues
o No assumptions about positions of sources and microphones; requires VAD
• Compromise between noise reduction and binaural cue preservation can be achieved with extensions of MWF
o Mixing with microphone signals
o Interaural Transfer Function
• Reduction of bandwidth using distributed MWF