Nearfield Spherical Microphone Arrays
for speech enhancement and dereverberation
Etan Fisher
Supervisor:
Dr. Boaz Rafaely
Microphone Arrays
Spatial sound acquisition
Sound enhancement
Applications:
reverberation parameter estimation
dereverberation
video conferencing
Spheres
The sphere as a symmetrical, natural entity.
Spherical symmetry
Facilitates direct sound field analysis:
Spherical Fourier transform
Spherical harmonics
Photo by Aaron Logan
Nearfield Spherical Microphone Array
Generally, the farfield, plane wave assumption is made (Rafaely, Meyer & Elko).
In the nearfield, the spherical wave-front must be accounted for.
Examples:
Close-talk microphone
Nearfield music recording
Multiple speaker / video conferencing
Sound Pressure - Spherical Wave
Sound pressure on sphere r due to point source rp (spherical wave), from the solution to the wave equation (spherical coordinates):
$$p(k, r, \Omega) = a(k)\,\frac{e^{ik|\mathbf{r}-\mathbf{r}_p|}}{|\mathbf{r}-\mathbf{r}_p|} = a(k)\,ik \sum_{n=0}^{\infty}\sum_{m=-n}^{n} b_n(kr)\,h_n(kr_p)\,Y_n^{m*}(\Omega_p)\,Y_n^{m}(\Omega)$$
with $\Omega = (\theta, \phi)$ and $\Omega_p = (\theta_p, \phi_p)$.
Spherical harmonics:
$$Y_n^{m}(\theta, \phi) = \sqrt{\frac{(2n+1)}{4\pi}\,\frac{(n-m)!}{(n+m)!}}\;P_n^{m}(\cos\theta)\,e^{im\phi}$$
The spherical harmonics are orthogonal and complete.
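The spherical harmonic definition can be tried out numerically. The following is a minimal Python sketch using only the standard library; the function names `assoc_legendre`, `sph_harm_nm` and `power_sum` are illustrative, and the Condon-Shortley phase in $P_n^m$ is an assumed convention, since the slide does not state one. As a sanity check it uses the standard result that $\sum_m |Y_n^m(\theta,\phi)|^2 = (2n+1)/(4\pi)$ for every direction.

```python
import cmath
import math


def assoc_legendre(n, m, x):
    """Associated Legendre function P_n^m(x) for 0 <= m <= n.

    Includes the Condon-Shortley phase (-1)^m (an assumed convention).
    """
    # Start from P_m^m(x) = (-1)^m (2m-1)!! (1 - x^2)^(m/2).
    p_mm = 1.0
    for j in range(1, m + 1):
        p_mm *= -(2 * j - 1) * math.sqrt(1.0 - x * x)
    if n == m:
        return p_mm
    # P_{m+1}^m(x) = x (2m+1) P_m^m(x), then recur upwards in degree.
    p_prev, p_curr = p_mm, x * (2 * m + 1) * p_mm
    for l in range(m + 1, n):
        p_next = ((2 * l + 1) * x * p_curr - (l + m) * p_prev) / (l - m + 1)
        p_prev, p_curr = p_curr, p_next
    return p_curr


def sph_harm_nm(n, m, theta, phi):
    """Y_n^m(theta, phi), theta = polar angle, phi = azimuth (radians)."""
    if m < 0:
        # Y_n^{-m} = (-1)^m * conj(Y_n^m)
        return (-1) ** (-m) * sph_harm_nm(n, -m, theta, phi).conjugate()
    norm = math.sqrt((2 * n + 1) / (4 * math.pi)
                     * math.factorial(n - m) / math.factorial(n + m))
    return norm * assoc_legendre(n, m, math.cos(theta)) * cmath.exp(1j * m * phi)


def power_sum(n, theta, phi):
    """Sum over m of |Y_n^m|^2; should equal (2n+1)/(4*pi) for any direction."""
    return sum(abs(sph_harm_nm(n, m, theta, phi)) ** 2 for m in range(-n, n + 1))


print(power_sum(4, 0.7, 2.1), 9 / (4 * math.pi))
```

The upward recurrence is adequate for the low orders used in this talk (N up to 12); a production implementation would normally rely on a tested numerical library instead.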
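Sound Pressure - Spherical Wave
Sound pressure on sphere r due to point source rp:
$$p(k, r, \Omega) = a(k)\,ik \sum_{n=0}^{\infty}\sum_{m=-n}^{n} b_n(kr)\,h_n(kr_p)\,Y_n^{m*}(\Omega_p)\,Y_n^{m}(\Omega)$$
$h_n(kr)$ is the spherical Hankel function.
$b_n(kr)$ is the modal frequency function (Bessel):
$$b_n(kr) = \begin{cases} 4\pi\,j_n(kr) & \text{open sphere} \\ 4\pi\left(j_n(kr) - \dfrac{j_n'(ka)}{h_n'(ka)}\,h_n(kr)\right) & \text{rigid sphere} \end{cases}$$
where $a$ is the radius of the sphere.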
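The expansion can be checked against the closed-form spherical wave. The sketch below is an illustration under stated assumptions, not the talk's implementation: it takes $a(k) = 1$, the open-sphere $b_n = 4\pi j_n$, $h_n$ as the spherical Hankel function of the first kind (which pairs with $e^{+ik|\mathbf{r}-\mathbf{r}_p|}$), and it collapses the sum over $m$ with the Legendre addition theorem so that only the angle $\Theta$ between $\Omega$ and $\Omega_p$ is needed. All function names and the values k = 40, r = 0.1 m, rp = 0.3 m are hypothetical choices.

```python
import cmath
import math


def sph_jn(n, x):
    """Spherical Bessel function j_n(x) from its ascending power series.

    Intended for moderate arguments (x up to roughly 10).
    """
    double_factorial = 1.0
    for m in range(1, n + 1):
        double_factorial *= 2 * m + 1  # builds (2n+1)!!
    term, total = 1.0, 1.0
    for s in range(80):
        term *= -(x * x / 2.0) / ((s + 1) * (2 * n + 2 * s + 3))
        total += term
    return x ** n / double_factorial * total


def sph_hn1(n, x):
    """Spherical Hankel function of the first kind, exact finite sum."""
    total = 0j
    for m in range(n + 1):
        coeff = math.factorial(n + m) / (math.factorial(m) * math.factorial(n - m))
        total += coeff * (1j / (2.0 * x)) ** m
    return (-1j) ** (n + 1) * cmath.exp(1j * x) / x * total


def legendre(n_max, x):
    """List [P_0(x), ..., P_n_max(x)] by the three-term recurrence."""
    p = [1.0, x]
    for n in range(1, n_max):
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    return p[: n_max + 1]


def pressure_series(k, r, r_p, cos_theta, n_max=40):
    """i k sum_n b_n(kr) h_n(k r_p) (2n+1)/(4 pi) P_n(cos Theta), open sphere."""
    p_leg = legendre(n_max, cos_theta)
    total = 0j
    for n in range(n_max + 1):
        b_n = 4 * math.pi * sph_jn(n, k * r)  # open-sphere modal function
        total += b_n * sph_hn1(n, k * r_p) * (2 * n + 1) / (4 * math.pi) * p_leg[n]
    return 1j * k * total


def pressure_closed(k, r, r_p, cos_theta):
    """exp(i k d) / d with d the microphone-to-source distance."""
    d = math.sqrt(r * r + r_p * r_p - 2 * r * r_p * cos_theta)
    return cmath.exp(1j * k * d) / d


print(pressure_series(40.0, 0.1, 0.3, 0.3), pressure_closed(40.0, 0.1, 0.3, 0.3))
```

The series is only valid for a microphone radius smaller than the source distance (r < rp), and the truncation order `n_max` has to grow with k·rp.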
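Point Source Decomposition
Sound pressure on sphere r due to point source rp:
$$p(k, r, \Omega) = a(k)\,ik \sum_{n=0}^{\infty}\sum_{m=-n}^{n} b_n(kr)\,h_n(kr_p)\,Y_n^{m*}(\Omega_p)\,Y_n^{m}(\Omega)$$
Spherical Fourier transform:
$$p_{nm}(k, r) = \int_{\Omega \in S^2} p(k, r, \Omega)\,Y_n^{m*}(\Omega)\,d\Omega = a(k)\,ik\,b_n(kr)\,h_n(kr_p)\,Y_n^{m*}(\Omega_p)$$
Spatial filter: cancel the spherical wave-front, yielding unit amplitude at rp = r0.
$$w_{nm}(k, r_0) = \frac{p_{nm}(k, r)}{ik\,b_n(kr)\,h_n(kr_0)} = a(k)\,\frac{h_n(kr_p)}{h_n(kr_0)}\,Y_n^{m*}(\Omega_p)$$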
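After the spatial filter, the only dependence on source distance is the ratio $h_n(kr_p)/h_n(kr_0)$. The sketch below evaluates that ratio, assuming $h_n$ is the spherical Hankel function of the first kind; `radial_filter_gain` is a hypothetical helper name, and k = 10 with r0 = 0.2 m and rp = 0.4 m are illustrative values, not taken from the slides.

```python
import cmath
import math


def sph_hn1(n, x):
    """Spherical Hankel function of the first kind, exact finite sum."""
    total = 0j
    for m in range(n + 1):
        coeff = math.factorial(n + m) / (math.factorial(m) * math.factorial(n - m))
        total += coeff * (1j / (2.0 * x)) ** m
    return (-1j) ** (n + 1) * cmath.exp(1j * x) / x * total


def radial_filter_gain(n, k, r_p, r_0):
    """Order-n gain h_n(k r_p) / h_n(k r_0) seen by a source at distance r_p."""
    return sph_hn1(n, k * r_p) / sph_hn1(n, k * r_0)


k, r_0, r_p = 10.0, 0.2, 0.4  # hypothetical example values
for n in range(5):
    print(n, abs(radial_filter_gain(n, k, r_p, r_0)))
```

A source at the focus distance (rp = r0) passes with unit gain in every order. For order 0 the gain magnitude is exactly r0/rp, and each higher order attenuates a source beyond r0 more strongly than that.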
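Point Source Decomposition
Amplitude density (N is the array order):
$$w(k, \Omega_0) = \sum_{n=0}^{N}\sum_{m=-n}^{n} a(k)\,\frac{h_n(kr_p)}{h_n(kr_0)}\,Y_n^{m*}(\Omega_p)\,Y_n^{m}(\Omega_0)$$
Using the identity:
$$\sum_{m=-n}^{n} Y_n^{m*}(\Omega_p)\,Y_n^{m}(\Omega) = \frac{2n+1}{4\pi}\,P_n(\cos\Theta)$$
where Θ is the angle between Ω and Ωp, the amplitude density becomes (with Θ now taken between Ω0 and Ωp):
$$w(k, \Omega_0) = \sum_{n=0}^{N} a(k)\,\frac{h_n(kr_p)}{h_n(kr_0)}\,\frac{2n+1}{4\pi}\,P_n(\cos\Theta)$$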
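The Legendre form of the amplitude density is straightforward to evaluate. The following is a minimal sketch assuming $a(k) = 1$ and a spherical Hankel function of the first kind; `amplitude_density` is a hypothetical name, and the focus distance r0 = 0.2 m is an assumed value for illustration.

```python
import cmath
import math


def sph_hn1(n, x):
    """Spherical Hankel function of the first kind, exact finite sum."""
    total = 0j
    for m in range(n + 1):
        coeff = math.factorial(n + m) / (math.factorial(m) * math.factorial(n - m))
        total += coeff * (1j / (2.0 * x)) ** m
    return (-1j) ** (n + 1) * cmath.exp(1j * x) / x * total


def legendre(n_max, x):
    """List [P_0(x), ..., P_n_max(x)] by the three-term recurrence."""
    p = [1.0, x]
    for n in range(1, n_max):
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    return p[: n_max + 1]


def amplitude_density(k, r_p, r_0, cos_theta, order):
    """w(k, Omega_0) for a unit-amplitude point source at distance r_p.

    cos_theta is the cosine of the angle between the look direction
    Omega_0 and the source direction Omega_p.
    """
    p_leg = legendre(order, cos_theta)
    total = 0j
    for n in range(order + 1):
        gain = sph_hn1(n, k * r_p) / sph_hn1(n, k * r_0)
        total += gain * (2 * n + 1) / (4 * math.pi) * p_leg[n]
    return total


N, k, r_0 = 4, 40.0, 0.2  # r_0 is an assumed focus distance
on_focus = amplitude_density(k, r_0, r_0, 1.0, N)
farther = amplitude_density(k, 2 * r_0, r_0, 1.0, N)
off_axis = amplitude_density(k, r_0, r_0, 0.0, N)
print(abs(on_focus), abs(farther), abs(off_axis))
```

With the source at the focus point (rp = r0, Θ = 0) every radial gain is 1 and the sum reduces to $(N+1)^2/(4\pi)$; moving the source farther away or off axis lowers the magnitude, which is the radial and angular attenuation respectively.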
Radial Attenuation
N = 4; rA (array) = 0.1 m; three plots: k = kmax, k = kmax/4, k = kmax/10
kmax = N/rA = 40 m⁻¹
kmax = 2πfmax/343 (speed of sound 343 m/s)
fmax = 2184 Hz
r0 – Desired source location
rp – Interference location
Radial Attenuation – “Close Talk”
N = 2; rA (array) = 0.05 m; two plots: k = kmax, k = kmax/4
kmax = N/rA = 40 m⁻¹; fmax = 2184 Hz
Radial Attenuation – Large Array
N = 12; rA (array) = 0.3 m; one plot: k = kmax/4
kmax = N/rA = 40 m⁻¹; fmax = 2184 Hz
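The kmax and fmax values quoted for the three array configurations follow from kmax = N/rA and k = 2πf/c with c = 343 m/s. A short worked check, with hypothetical helper names:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, the value used on the slides


def max_wavenumber(order, array_radius):
    """Upper operating wavenumber k_max = N / r_A, in rad/m."""
    return order / array_radius


def max_frequency(k_max, c=SPEED_OF_SOUND):
    """Frequency in Hz at which k = 2*pi*f/c reaches k_max."""
    return k_max * c / (2.0 * math.pi)


# (order N, array radius r_A in metres) for the three configurations
for order, radius in [(4, 0.1), (2, 0.05), (12, 0.3)]:
    k_max = max_wavenumber(order, radius)
    print(order, radius, k_max, round(max_frequency(k_max)))
```

All three configurations keep the ratio N/rA fixed, so they share the same kmax and the same fmax.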
Normalized Beampattern
N = 4; rA (array) = 0.1 m; three plots: k = kmax, k = kmax/4, k = kmax/10
kmax = N/rA = 40 m⁻¹
kmax = 2πfmax/343 (speed of sound 343 m/s)
fmax = 2184 Hz
The natural radial attenuation has been cancelled by multiplying the array output by the distance.
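The normalization can be sketched as follows. This assumes that "the distance" means the source distance rp, that $a(k) = 1$, and that $h_n$ is the spherical Hankel function of the first kind; `normalized_response` is a hypothetical name and r0 = 0.2 m an assumed focus distance.

```python
import cmath
import math


def sph_hn1(n, x):
    """Spherical Hankel function of the first kind, exact finite sum."""
    total = 0j
    for m in range(n + 1):
        coeff = math.factorial(n + m) / (math.factorial(m) * math.factorial(n - m))
        total += coeff * (1j / (2.0 * x)) ** m
    return (-1j) ** (n + 1) * cmath.exp(1j * x) / x * total


def legendre(n_max, x):
    """List [P_0(x), ..., P_n_max(x)] by the three-term recurrence."""
    p = [1.0, x]
    for n in range(1, n_max):
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    return p[: n_max + 1]


def normalized_response(k, r_p, r_0, cos_theta, order):
    """r_p * |w(k, Omega_0)|: array output with the 1/r_p spreading removed."""
    p_leg = legendre(order, cos_theta)
    total = 0j
    for n in range(order + 1):
        gain = sph_hn1(n, k * r_p) / sph_hn1(n, k * r_0)
        total += gain * (2 * n + 1) / (4 * math.pi) * p_leg[n]
    return r_p * abs(total)


k, r_0 = 10.0, 0.2  # k = kmax/4; r_0 is an assumed focus distance
for r_p in (0.2, 0.5, 1.0, 2.0):
    print(r_p, normalized_response(k, r_p, r_0, 1.0, 0),
          normalized_response(k, r_p, r_0, 1.0, 4))
```

For an order-0 (omnidirectional) array the normalized response is the constant r0/(4π), so any remaining decay with rp at higher orders is attenuation contributed by the array processing rather than by natural spreading.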
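Directional Impulse Response
Amplitude density:
$$w(k, \Omega_0) = \sum_{n=0}^{N} a(k)\,\frac{h_n(kr_p)}{h_n(kr_0)}\,\frac{2n+1}{4\pi}\,P_n(\cos\Theta)$$
Impulse response at direction Ω0:
$$w(t) = \mathcal{F}^{-1}\{\,w(k, \Omega_0)\,\}$$
where $\mathcal{F}^{-1}$ is the ordinary inverse Fourier transform.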
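A discrete version of this inverse transform can be sketched by sampling $w(k, \Omega_0)$ on a frequency grid with k = 2πf/c and applying a naive inverse DFT. Everything below is an illustrative assumption rather than the talk's procedure: a single unit-amplitude point source ($a(k) = 1$), a first-kind spherical Hankel function paired with an $e^{-i\omega t}$ time convention, c = 343 m/s, and a hypothetical sampling rate of 34300 Hz chosen so that one sample corresponds to 1 cm of travel.

```python
import cmath
import math

C = 343.0  # speed of sound in m/s


def sph_hn1(n, x):
    """Spherical Hankel function of the first kind, exact finite sum."""
    total = 0j
    for m in range(n + 1):
        coeff = math.factorial(n + m) / (math.factorial(m) * math.factorial(n - m))
        total += coeff * (1j / (2.0 * x)) ** m
    return (-1j) ** (n + 1) * cmath.exp(1j * x) / x * total


def legendre(n_max, x):
    """List [P_0(x), ..., P_n_max(x)] by the three-term recurrence."""
    p = [1.0, x]
    for n in range(1, n_max):
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    return p[: n_max + 1]


def amplitude_density(k, r_p, r_0, cos_theta, order):
    """w(k, Omega_0) for a unit-amplitude point source at distance r_p."""
    p_leg = legendre(order, cos_theta)
    total = 0j
    for n in range(order + 1):
        if k == 0.0:
            gain = (r_0 / r_p) ** (n + 1)  # small-argument limit of the ratio
        else:
            gain = sph_hn1(n, k * r_p) / sph_hn1(n, k * r_0)
        total += gain * (2 * n + 1) / (4 * math.pi) * p_leg[n]
    return total


def directional_ir(r_p, r_0, cos_theta, order, fs, n_fft):
    """Real impulse response from a one-sided spectrum (n_fft must be even)."""
    half = n_fft // 2
    spectrum = [amplitude_density(2 * math.pi * j * fs / (n_fft * C),
                                  r_p, r_0, cos_theta, order)
                for j in range(half + 1)]
    ir = []
    for m in range(n_fft):
        # DC and Nyquist bins count once, all other bins twice (real signal).
        acc = spectrum[0].real + spectrum[half].real * (-1) ** m
        for j in range(1, half):
            acc += 2.0 * (spectrum[j] * cmath.exp(-2j * math.pi * j * m / n_fft)).real
        ir.append(acc / n_fft)
    return ir


ir0 = directional_ir(0.5, 0.2, 1.0, 0, 34300.0, 256)
peak0 = max(range(len(ir0)), key=lambda m: abs(ir0[m]))
print(peak0, ir0[peak0])
```

For order 0 the response is a single scaled impulse delayed by (rp − r0)/c, here 30 samples, which makes the sign convention easy to verify before moving to higher orders.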
Speech Dereverberation
Room IR Directional IR
{4 X 3 X 2}
N = 4
r = 0.1 m
r0 = 0.2 m
“Dry”
“Rev.”
“Derev.”
Music Dereverberation
Room IR Directional IR
{ 8 X 6 X 3 }
N = 4
r = 0.1 m
r0 = 1.9 m
“Dry”
“Rev.”
“Derev.”
Conclusions
Spherical wave pressure on a spherical microphone array in spherical coordinates.
Point source decomposition achieves radial attenuation as well as angular attenuation.
Directional impulse response (IR) vs. room IR.
Speech and music dereverberation.
Further work:
Develop optimal beamformer
Experimental study of array