Psychoacoustic Investigation on the Auralization of Spherical Microphone Array Processing With Wave Field Synthesis



Audio Engineering Society

Convention Paper
Presented at the 138th Convention, 2015 May 7–10, Warsaw, Poland

This Convention paper was selected based on a submitted abstract and 750-word precis that have been peer reviewed by at least two qualified anonymous reviewers. The complete manuscript was not peer reviewed. This convention paper has been reproduced from the author's advance manuscript without editing, corrections, or consideration by the Review Board. The AES takes no responsibility for the contents. Additional papers may be obtained by sending request and remittance to Audio Engineering Society, 60 East 42nd Street, New York, New York 10165-2520, USA; also see www.aes.org. All rights reserved. Reproduction of this paper, or any portion thereof, is not permitted without direct permission from the Journal of the Audio Engineering Society.

Psychoacoustic Investigation on the Auralization of Spherical Microphone Array Processing with Wave Field Synthesis

Gyan Vardhan Singh

    Correspondence should be addressed to Gyan Vardhan Singh ( [email protected] )

    ABSTRACT

In the present work we have investigated the perceptual effects induced by various errors and artifacts which arise when spherical microphone arrays are used on the recording side. For spatial audio it is very important to characterize the acoustic scene in three-dimensional space. In order to achieve this three-dimensional characterization of a sonic scene, spherical microphone arrays are employed. The use of these spherical arrays has some inherent issues because of measurement errors and by virtue of the mathematics involved in the processing. In this paper we analyze those issues on the recording side (the spherical microphone array) which influence the audio quality on the rendering side, and we present a psychoacoustic investigation to assess the extent to which the errors and artifacts produce a perceivable effect during auralization when the acoustic scene is reproduced using wave field synthesis.

1. INTRODUCTION
Sound reproduction techniques for virtual sound systems have been studied, developed and implemented in various ways and configurations. The acoustic auralization of sound fields in this work focuses on wave field analysis (WFA) with spherical microphone arrays, and on their auralization over a two-dimensional loudspeaker array following the principle of wave field synthesis (WFS). In order to obtain the characteristics of an acoustic scene, microphone arrays are employed: an array of microphones is spatially distributed, so it samples the sound not only in time but also in space. Hence the need to obtain temporal properties along with spatial properties for a successful spatial sound reproduction is answered by the use of microphone arrays [2][3][4].

As a spherical array can provide a complete three-dimensional sonic image of an acoustic scene,


an attempt has been made in this paper to look into the perceptual effects which come into play when various parameters on the microphone array side are changed. More importantly, we investigate the correspondence between the microphone array and the sound field renderer and analyze the extent of the effect induced by various errors and artifacts.

Any sound wave can be represented as a superposition of plane waves in the far field of its sources [5][6]. Also, a room can be characterized by its impulse responses, as it can be assumed to be linear time-invariant (LTI). Hence if we are able to capture the impulse responses of a room, then we can fully characterize the acoustic nature of that room, and in turn any acoustic event in that room could be reproduced simply with the help of the plane wave decomposed components of its room impulse responses.

In the process of plane wave decomposition the spherical aperture of a spherical microphone array is discretized, and because of this, limitations are imposed on the performance of the array. An ideal full-audio-spectrum wave field impact on the continuous aperture of a spherical microphone array is simulated and compared with the sampled array aperture under different degrees of errors in different categories. By this comparison we attempt to establish the extent to which a given error perceptually corrupts the reproduced sound field. We also try to determine the extent to which some degree of error remains perceptually insignificant, in other words the extent of error which can be tolerated. The spatial aliasing limit imposed by the rendering system is analyzed, and on that basis a base transform order is established for comparison.

These transform orders, also referred to as transform levels, come into play as a consequence of spherical array processing when the spherical wave equations are solved. Choosing higher transform orders or levels theoretically increases the sharpness of the directional attributes calculated with plane wave decomposition. But there exists a trade-off: it also amplifies various errors and artifacts and induces aliasing.

This research was conducted at TU Ilmenau, Germany as part of a master thesis [17], and the auralization was implemented in the WFS lab using 88 loudspeaker elements. Real room data was not used; instead, the free field acoustic impact on the spherical array was simulated. In the simulation some parts of the SOFiA sound field analysis toolbox are used [1].

2. SPHERICAL MICROPHONE ARRAY SIGNAL PROCESSING
The following section introduces the wave equation and the fundamental mathematical background behind spherical array processing. The discussion then goes into the details of spherical harmonic decomposition and plane wave decomposition.

2.1. Wave Equation
The wave equation in the spherical coordinate system is expressed as follows:

\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial p}{\partial r}\right) + \frac{1}{r^2\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\,\frac{\partial p}{\partial\vartheta}\right) + \frac{1}{r^2\sin^2\vartheta}\frac{\partial^2 p}{\partial\varphi^2} - \frac{1}{c^2}\frac{\partial^2 p}{\partial t^2} = 0   (1)

In this equation p is a function of (r, ϑ, ϕ, t). The solutions of wave equation 1 in the frequency domain are explained in [5] and are given in two forms as

p(r, \Omega, k) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l}\left(A_{lm}(k)\,j_l(kr) + B_{lm}(k)\,y_l(kr)\right)Y_l^m(\Omega)   (2)

p(r, \Omega, k) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l}\left(C_{lm}(k)\,h_l^{(1)}(kr) + D_{lm}(k)\,h_l^{(2)}(kr)\right)Y_l^m(\Omega)   (3)

The two solutions represent the interior and the exterior problem: equation 3 refers to the exterior problem and equation 2 to the interior problem.

2.2. Spherical harmonics
Y_l^m(\Omega) is the function known as the spherical harmonic of level (or order) l and mode m and is defined as

Y_l^m(\vartheta, \varphi) = \sqrt{\frac{2l+1}{4\pi}\,\frac{(l-m)!}{(l+m)!}}\,P_l^m(\cos\vartheta)\,e^{im\varphi}   (4)

In equation 4, P_l^m(\cos\vartheta) is the Legendre function of the first kind and i = \sqrt{-1}. Any function on a sphere can be represented by a combination of spherical harmonics [5]. The spherical harmonics define the angular components of the wave solution. The spherical harmonic for negative m can be written as

Y_l^{-m}(\Omega) = (-1)^m\,\overline{Y_l^m(\Omega)}, \quad m > 0   (5)

where \overline{Y_l^m(\Omega)} is the complex conjugate of Y_l^m(\Omega).

There are 2l + 1 different spherical harmonics for


each level l, as −l ≤ m ≤ l. A further property of the spherical harmonics is that they are orthonormal [5]:

\int_{S^2} Y_{l'}^{m'}(\Omega)\,\overline{Y_l^m(\Omega)}\,d\Omega = \delta_{l'l}\,\delta_{m'm}   (6)

Here \delta_{l'l} is the Kronecker delta, which is 1 for l = l' and 0 otherwise. The surface integral is defined as

\int_{S^2} d\Omega = \int_0^{2\pi} d\varphi \int_0^{\pi} \sin\vartheta\, d\vartheta   (7)

As stated above, any function on a sphere can be decomposed into a sum of spherical harmonics [5][18]:

f(\Omega) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} f_{lm}(k)\,Y_l^m(\Omega)   (8)

This expression can also be termed the inverse spherical Fourier transform (ISFT) [18]. As the spherical harmonic functions are orthonormal, we can obtain the spherical Fourier transform coefficients, given as

f_{lm}(k) = \int_{S^2} \overline{Y_l^m(\Omega)}\,f(\Omega)\,d\Omega   (9)

The derivation of this expression can be found in [18]. The importance of the expressions presented above is that with their help we obtain the spherical wave decomposition and, in turn, the plane wave decomposition.

2.3. Spherical harmonic decomposition
In the following sections we derive expressions for spherical harmonic decomposition and discuss various consequences which are encountered during this part of wave field analysis [19][20][13].

2.4. Interior and Exterior problem
To understand sound radiation we invoke two similar kinds of explanation. In [12] Zotter explains sound radiation as a soap bubble problem (Figure 1). We assume a free sound field and an ideal soap bubble which is large enough to enclose some sound source. When sound is produced by the source, the bubble surface vibrates according to the motion of the air: as the sound propagates through the medium it hits the bubble, and consequently the soap bubble vibrates with the air molecules. At respective observation points on

Fig. 1: Soap bubble model of acoustic radiation [12].

the sphere, the waveform of the vibrating sphere can be said to represent the radiated sound. In [5], Williams explains that the acoustic radiation from an instrument can be completely defined if we are able to acoustically map the motion of a continuous surface enclosing the sources. This kind of analysis of sound radiation is called the exterior problem. In a similar way, let us say that there are no sources inside the soap bubble (rather, it encloses the measurement set-up), but instead the sound radiation propagates from outside (i.e. the sources are outside) and hits the bubble from the exterior. Identifying the motion of the surface of the bubble is then sufficient to describe the acoustic radiation; this is called the interior problem.

In interior problem analysis (Figure 1) the sound sources are located outside the spherical volume, and estimating the acoustic effect on the surface of this volume is sufficient to characterize the sound in space. Going a bit further, we may say that in order to map this surface a spherical microphone array is used. Hence our spherical microphone array is enclosed by an imaginary volume, and at each observation point of the array we attempt to measure the acoustic effect invoked by the external sources.

In this paper the same analytical line is followed: the spherical array characterizes a listening room environment by measuring the impulse responses coming from different directions [2], which in turn gives the directional behavior of the sound when the auralization is done. The solution for the interior problem comes from equation 2. As the solution should be finite at all points within the measurement region


Fig. 2: Interior problem [8]

r ≤ b in figure 2, consider the properties of the Hankel function and the spherical Bessel function at r = 0, that is, at the origin: both the spherical Hankel function and the spherical Bessel function of the second kind are not finite there. Hence our solution contains only the first term of equation 2 and is given as [5]:

p(r, \Omega, k) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} A_{lm}(k)\,j_l(kr)\,Y_l^m(\Omega)   (10)

2.5. Spherical wave spectrum
Now, as defined in equation 10, if we can obtain the coefficients A_{lm}(k) then we can easily define the pressure field p(r, Ω, k). Exploiting the orthonormality of the spherical harmonics and the fact that any arbitrary function on a sphere can be expanded in terms of its spherical harmonics [18], and considering equation 10, we obtain

A_{lm}(k) = \frac{1}{j_l(kr)} \int_{S^2} p(r, \Omega, k)\,\overline{Y_l^m(\Omega)}\,d\Omega   (11)

The expression for A_{lm}(k) is also called the spherical wave spectrum, as it can be regarded as the spherical Fourier transform of p(r, \Omega, k) [5], also written as

P_{lm}(r, k) = \frac{1}{j_l(kr)} \int_{S^2} p(r, \Omega, k)\,\overline{Y_l^m(\Omega)}\,d\Omega   (12)

P_{lm}(r, k) describes the sound wave in frequency in terms of the wave number, or in k-space.

2.6. Spherical wave sound fields
Before we go to the next sections, let us analyze how to express a spherical wave at a point due to a given source. Refer to figure 3. We consider a point source, also termed a monopole, at the origin O. The pressure p(r, k) at a point P is given by the expression [5]

p(r, k) = -i\,p_0(k)\,c\,k\,Q_s\,\frac{e^{ikr}}{4\pi r}   (13)

Fig. 3: Geometrical description for the calculation of the pressure p(r, ϑ, ϕ, k) at point P for a source at Q

Here r is the length of the position vector r of point P, c is the speed of sound, and k is the wave number. Q_s represents the source strength [5]. If we want to calculate the pressure field at point P due to a source located at a point Q, this can be done by some geometrical manipulation of equation 13.

Assume the same monopole to be located at Q at distance r_s = |r_s| from the origin. For r_s = 0, the pressure at point P due to the source at Q reduces to the pressure at P due to the source at the origin O. The pressure p(r, Ω, k) at point P for a source at Q is therefore

p(r, \Omega, k) = -i\,p_0(k)\,c\,k\,Q_s\,\underbrace{\frac{e^{ik|\mathbf{r} - \mathbf{r}_s|}}{4\pi |\mathbf{r} - \mathbf{r}_s|}}_{X}   (14)

Here Ω ≡ (ϕ, ϑ). The significance of this equation is that we have derived an expression for the pressure field at a point on a sphere due to a source located at a position other than the origin. If we draw an analogy with the spherical microphone array, we can consider the array as a spherical surface and, at any point on that surface, describe the pressure field due to a source located at an arbitrary position Q. One more thing to be noted is that, as |r − r_s| depends on ϕ and ϑ,


the sound pressure in equation 14 also depends on ϕ and ϑ. Further, as derived in [5], the term X is equivalent to the Green's function G(r|r_s).

2.7. Spherical harmonic expansion of a plane wave

p(r, \Omega, k) = p_0(k)\,e^{i\,\mathbf{k}\cdot\mathbf{r}}   (15)

where p_0(k) is the magnitude of the plane wave, \mathbf{r} is the position vector (r, \Omega), and \mathbf{k} is the wave vector. Assuming p_0(k) = 1 for the purpose of the derivation and using equation 15 in 10, we get

e^{i\,\mathbf{k}\cdot\mathbf{r}} = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} A_{lm}(k)\,j_l(kr)\,Y_l^m(\Omega)   (16)

Here \mathbf{k} and \mathbf{r} are the wave vector and position vector respectively. We point out that the plane wave, described in the vector domain by the wave vector and position vector in equation 15, is expressed here in terms of the wave number k and the scalar distance r. Equation 16 can be further transformed as explained in [5] and is given as

e^{i\,\mathbf{k}\cdot\mathbf{r}} = 4\pi \sum_{l=0}^{\infty} i^l\,j_l(kr) \sum_{m=-l}^{l} Y_l^m(\Omega)\,\overline{Y_l^m(\Omega_0)}   (17)

Here Ω_0 ≡ (ϕ_0, ϑ_0) is the incidence direction of the plane wave, whereas Ω is the point where we want to observe the pressure field. From equations 16 and 17 we can conclude that

A_{lm} = 4\pi\, i^l\,\overline{Y_l^m(\Omega_0)}   (18)

and we observe from this that the spherical wave coefficients A_{lm} of a plane wave sound field do not depend on k, or on the frequency f of the wave.

In [6][8] equation 17 has been simulated for a plane wave sound field of 1 kHz. The simulation was run for different maximum values of the level l, and it was deduced that the plane wave field can be approximated exactly only within a bounded region around the origin, and that this region is bigger for higher values of l. If in equation 17 we replace the ∞ in the first summation by a maximum level l = L, then we can establish an approximate rule given by

d \approx \frac{L\,\lambda}{2\pi}   (19)

Here d is the radius of the region, L is the maximum level l, and λ is the wavelength of the plane wave. This proportionality states that the region within which we can effectively define the pressure field is proportional to the level l.

2.8. Mode strength
We now define an expression for the combination of Bessel and Hankel functions which appeared in the earlier sections while deriving the coefficients A_{lm} of the spherical harmonics. In the process of measuring sound fields using spherical microphone arrays, the interaction of the sound field with the array structure has to be taken into consideration [5][13][20]. We recall equations 10 and 11 and express them in a generalized form in order to associate them with different kinds of spherical microphone array structures. The equations are written as

s(r, \Omega, k) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} A_{lm}(k)\,b_l(kr)\,Y_l^m(\Omega)   (20)

A_{lm}(k) = \frac{1}{b_l(kr)} \int_{S^2} s(r, \Omega, k)\,\overline{Y_l^m(\Omega)}\,d\Omega   (21)

Here s(r, Ω, k) is the spherical microphone array response. The term b_l(kr) is called the mode strength. For different microphone array structures, the interaction of the sound field with the array is approximated using this term [5][21]. In a rigid sphere array the sensors are arranged on a solid sphere:

b_l(kr) = 4\pi\, i^l \left( j_l(kr) - \frac{j_l'(ka)}{h_l^{(2)\prime}(ka)}\, h_l^{(2)}(kr) \right), \quad \text{rigid sphere arrays}   (22)

Here j_l(kr) is the spherical Bessel function of the first kind, h_l^{(2)}(kr) and h_l^{(2)}(ka) are the spherical Hankel functions of the second kind, (·)' denotes the derivative, and a is the radius of the sphere, where r ≥ a. In figure 4 the mode strength b_l(kr) is plotted as a function of kr for different orders l; in the figure the order l is represented by the letter n.

The major advantage of using the rigid sphere configuration is improved numerical conditioning: in equation 21 the spherical coefficient A_{lm} contains a term 1/b_l, and b_l is zero for some kr in the open sphere configuration but not in the case of rigid spheres [13][20].
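Equation 22 can be evaluated directly with SciPy's spherical Bessel routines. The sketch below is a minimal illustration under our own naming, not code from the paper or from the SOFiA toolbox:

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn

def mode_strength_rigid(l, kr, ka):
    """Mode strength b_l(kr) of a rigid-sphere array, eq. (22), with
    observation radius r >= sphere radius a (i.e. kr >= ka)."""
    # spherical Hankel function of the second kind and its derivative
    h2 = lambda n, x: spherical_jn(n, x) - 1j * spherical_yn(n, x)
    dh2 = lambda n, x: (spherical_jn(n, x, derivative=True)
                        - 1j * spherical_yn(n, x, derivative=True))
    return 4 * np.pi * (1j ** l) * (
        spherical_jn(l, kr)
        - spherical_jn(l, ka, derivative=True) / dh2(l, ka) * h2(l, kr))

# microphones flush on the sphere surface: r = a, i.e. kr = ka
print(abs(mode_strength_rigid(l=3, kr=2.75, ka=2.75)))
```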


Fig. 4: Mode strength for a rigid sphere array [20]

2.9. Discretization of the Spherical Aperture and Spatial Aliasing
In practice we can sample a sphere only at a finite number of microphone positions. Hence the expression for the spherical coefficients A_{lm}(k), which is defined by an integral over the unit sphere in equation 21, needs to be translated into a finite summation. The approximation of such integrals by finite sums is known as quadrature, and the expression for A_{lm}(k) in terms of a finite summation is given as [6]:

A_{lm}(k) \approx \hat{A}_{lm}(k) = \frac{1}{b_l(kr)} \sum_{q=1}^{Q} w_q\, s(r, \Omega_q, k)\,\overline{Y_l^m(\Omega_q)}   (23)

where \hat{A}_{lm}(k) is the approximated spherical coefficient, Q is the number of microphone positions and w_q are the quadrature weights. The weights w_q are compensation factors used in the different quadrature schemes so as to approximate the sound field as closely as possible to the continuous aperture.

Spatial sampling requires the sound field to be limited in bandwidth, i.e. limited in harmonic order l, to avoid aliasing [20][22]. Hence, in order to avoid spatial aliasing, the following condition must hold [6]:

A_{lm}(k) = 0, \quad \text{where } l > L_{max}   (24)

Here L_{max} is the highest order of the spherical coefficients of the sound field. Equation 24 must be ensured when sampling the sphere, otherwise spatial aliasing will corrupt the coefficients at lower orders [22][20][6].

These quadratures allow us to perform sampling on the sphere with negligible or no aliasing as long as equation 24 holds. In this work we use the Lebedev grid, for which the required number of sampling points is

Q_{Lb} = \frac{4}{3}\,(L_{max} + 1)^2   (25)

Using the quadrature approach for the discretization of the sphere, we require a level-limited sound field in order to get aliasing-free sampling. For plane wave sound fields, however, the restriction to a maximum level L_{max} does not hold, as we can see from equations 16 and 17, which involve an infinite number of non-zero spherical coefficients A_{lm}(k). Hence some degree of spatial aliasing does occur. But the spherical Bessel functions j_l(kr) decay rapidly for kr > l; therefore the strength of the coefficients in equation 16 can be supposed to show a similar behavior for kr > l. Hence we say that the aliasing error can be ignored if the operating frequency of the microphone array satisfies kr ≤ L_{max}.


2.10. Plane wave decomposition
A sound field can be represented as a superposition of an infinite number of plane waves, with the assumption that they have magnitudes w(Ω_0, k) and arrive from all directions Ω_0. Integrating equation 27 over all incidence directions, we have the expression for the spherical Fourier coefficients f_{lm}(k):

f_{lm}(k) = 4\pi\, i^l\, b_l(kr) \int_{S^2} w(\Omega_0, k)\,Y_l^m(\Omega_0)\,d\Omega_0   (28)

The expression in equation 28 is termed the spherical Fourier transform of the amplitudes w(Ω_0, k), and we express it as w_{lm}(k):

w_{lm}(k) = f_{lm}(k)\,\frac{1}{4\pi\, i^l\, b_l(kr)}   (29)

To obtain the amplitude w_s(Ω_s, k) of a plane wave arriving from a direction Ω_s, we perform an inverse SFT of equation 29:

w_s(\Omega_s, k) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} f_{lm}(k)\,\frac{1}{4\pi\, i^l\, b_l(kr)}\,Y_l^m(\Omega_s)   (30)

w_s(Ω_s, k) is also called the directivity function and describes the decomposed plane wave for a particular direction Ω_s. Ω_s is also known as the steering direction of the microphone array and denotes the direction for which the plane wave decomposition is computed.

Further, if we use equation 26 in equation 30, we get the expression for the plane wave decomposition in terms of the spherical harmonic coefficients A_{lm}(k):

w_s(\Omega_s, k) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} \frac{1}{4\pi\, i^l}\,A_{lm}(k)\,Y_l^m(\Omega_s)   (31)
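A level-limited version of equation 31 follows directly from the coefficients estimated with the quadrature sketch above; as before, this is a hedged illustration with our own names, not the paper's implementation:

```python
import numpy as np
from scipy.special import sph_harm

def pwd_directivity(A, L, az_s, pol_s):
    """Directivity function w_s(Omega_s, k), eq. (31), truncated at level L.
    A maps (l, m) -> spherical harmonic coefficient A_lm(k), e.g. the output
    of the discrete_sft sketch in section 2.9; (az_s, pol_s) is the steering
    direction Omega_s."""
    w = 0j
    for l in range(L + 1):
        for m in range(-l, l + 1):
            w += A[(l, m)] * sph_harm(m, l, az_s, pol_s) / (4 * np.pi * (1j ** l))
    return w
```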

3. ERROR ANALYSIS
The performance of spherical microphone arrays is affected by errors and artifacts. These errors affect the plane wave decomposition of the impulse response data measured by spherical microphone arrays. In auralization, the impact of these errors degrades the quality of the spatial sound reproduction.

3.1. Measurement errors
The measurement errors evaluated in the listening test can be classified into two categories:

1. Sampling errors and artifacts: Due to the finite number of microphones imposed by the discretization of the sphere, spatial aliasing is observed. Moreover, inaccurate positioning of the microphone elements adds positioning errors.

2. Microphone noise: This is the error induced by the non-ideal characteristics of the microphones and the electronic noise of the microphone elements.

Fig. 5: Errors in spherical microphone array measurement

3.2. Description of the measurement error function
In this section we follow the framework given in [20] and describe the measurement errors mathematically, along with their contributions to the spherical harmonic coefficients.

For the analytical description we assume an arbitrary sound field captured by a rigid sphere microphone array. The frequency domain output of a single microphone element, which is considered to carry all the errors depicted in figure 5, is

s(r, \tilde{\Omega}_q, k) + e_q   (32)

where k is the wave number, r is the radius of the sphere, e_q is the microphone noise and \tilde{\Omega}_q is the microphone position including positioning errors. The spherical harmonic coefficients \hat{A}_{lm}(k) can be calculated using equation 23, which is explained in section 2.9. We get

\hat{A}_{lm}(k) = \frac{1}{b_l(kr)} \left( \sum_{q=1}^{Q} w_q\, s(r, \tilde{\Omega}_q, k)\,\overline{Y_l^m(\tilde{\Omega}_q)} + \sum_{q=1}^{Q} w_q\, e_q\,\overline{Y_l^m(\tilde{\Omega}_q)} \right)   (33)

In this equation Q is the number of microphones, w_q are the quadrature weights and b_l(kr) is the mode strength for the rigid sphere configuration (section 2.8). The correct microphone positions as defined by the sampling scheme are denoted by Ω_q. Now we express the sound field s(r, \tilde{Ω}_q, k) in terms of the correct spherical harmonic coefficients A_{l'm'}(k) using equation 20 in section 2.8 and substitute it into equation 33:

\hat{A}_{lm}(k) = \frac{1}{b_l(kr)} \sum_{l'=0}^{\infty}\sum_{m'=-l'}^{l'} A_{l'm'}(k)\,b_{l'}(kr)\, \underbrace{\sum_{q=1}^{Q} w_q\,Y_{l'}^{m'}(\tilde{\Omega}_q)\,\overline{Y_l^m(\tilde{\Omega}_q)}}_{X} + \frac{1}{b_l(kr)} \sum_{q=1}^{Q} w_q\, e_q\,\overline{Y_l^m(\tilde{\Omega}_q)}   (34)

The term X corresponds to the orthonormality condition of the spherical harmonics given in section 2.2. In [20] this term has been expanded to find the contributions of the aliasing error \epsilon_a and the positioning error \epsilon_\Omega, and it is expressed as

\sum_{q=1}^{Q} w_q\,Y_{l'}^{m'}(\tilde{\Omega}_q)\,\overline{Y_l^m(\tilde{\Omega}_q)} = \begin{cases} \delta_{l'l}\,\delta_{m'm} + \epsilon_\Omega(l',m',l,m), & \text{where } l, l' \le L_{max} \\ \epsilon_a(l',m',l,m) + \epsilon_\Omega(l',m',l,m), & \text{where } l \le L_{max} < l' \end{cases}   (35)

Here \delta_{l'l} and \delta_{m'm} are Kronecker deltas. The maximum level L_{max} is the highest level of the spherical harmonic coefficients A_{l'm'}(k) of the sound field which is sampled using Q microphone positions; the relation between L and Q for the Lebedev grid is given in equation 25, section 2.9. In the first part of equation 35 the level l' ≤ L_{max}, hence no aliasing error appears in that expression. Also, from the Kronecker deltas we see that if \epsilon_\Omega = 0 then \tilde{Ω}_q and Ω_q must be equal; hence \epsilon_\Omega represents the positioning error. In the lower part of equation 35 we consider l' > L_{max}, so spatial aliasing is present. Since l and l' are different, the term \delta_{l'l}\,\delta_{m'm} does not appear in this part. The aliasing error \epsilon_a is given as [20]:

\epsilon_a(l',m',l,m) = \sum_{q=1}^{Q} w_q\,Y_{l'}^{m'}(\Omega_q)\,\overline{Y_l^m(\Omega_q)}, \quad \text{where } l \le L_{max} < l'   (36)

The positioning error is obtained by subtracting equation 36 from equation 37 [20]:

\tilde{\epsilon}(l',m',l,m) = \sum_{q=1}^{Q} w_q\,Y_{l'}^{m'}(\tilde{\Omega}_q)\,\overline{Y_l^m(\tilde{\Omega}_q)}, \quad \text{where } l \le L_{max} < l'   (37)

\epsilon_\Omega(l',m',l,m) = \sum_{q=1}^{Q} w_q \left( Y_{l'}^{m'}(\tilde{\Omega}_q) - Y_{l'}^{m'}(\Omega_q) \right) \overline{Y_l^m(\tilde{\Omega}_q)}, \quad \text{where } l \le L_{max},\; l' \ge 0   (38)

Finally, if we use equation 35 in equation 34 and separate the summation over l', we get the expression for the spherical harmonic coefficients with all the errors [20]:

\hat{A}_{lm}(k) = \underbrace{\frac{1}{b_l(kr)} \sum_{l'=0}^{\infty}\sum_{m'=-l'}^{l'} A_{l'm'}(k)\,b_{l'}(kr)\,\delta_{l'l}\,\delta_{m'm}}_{A^{(s)}_{lm}(k)\;:\;\text{signal contribution}} + \underbrace{\frac{1}{b_l(kr)} \sum_{l'=0}^{\infty}\sum_{m'=-l'}^{l'} A_{l'm'}(k)\,b_{l'}(kr)\,\epsilon_\Omega(l',m',l,m)}_{A^{(\Omega)}_{lm}(k)\;:\;\text{positioning error}} + \underbrace{\frac{1}{b_l(kr)} \sum_{l'=0}^{\infty}\sum_{m'=-l'}^{l'} A_{l'm'}(k)\,b_{l'}(kr)\,\epsilon_a(l',m',l,m)}_{A^{(a)}_{lm}(k)\;:\;\text{aliasing error}} + \underbrace{\frac{1}{b_l(kr)} \sum_{q=1}^{Q} w_q\, e_q\,\overline{Y_l^m(\tilde{\Omega}_q)}}_{A^{(e)}_{lm}(k)\;:\;\text{microphone noise}}   (39)

In equation 39 the first term refers to the error-free contribution to the spherical harmonic coefficients \hat{A}_{lm}(k); as the Kronecker deltas equal one, this term simplifies to A_{lm}(k). All the other terms represent the errors. From the equation itself we see that the errors depend on the level l, on kr and on


the quadrature. Finally, we can obtain the expression for the plane wave decomposition by substituting equation 39 into equation 31, the expression for the directivity function of the plane wave decomposition. Each term A^{(\cdot)}_{lm}(k) in equation 39 yields the contribution of that particular error to the direction weights w_s:

w_s^{(\cdot)}(\Omega_s, k) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} \frac{1}{4\pi\, i^l}\,A^{(\cdot)}_{lm}(k)\,Y_l^m(\Omega_s)   (40)

where Ω_s is the steering direction of the spherical microphone array and A^{(\cdot)}_{lm}(k) can be any of the four different components in equation 39: A^{(s)}_{lm}(k), A^{(\Omega)}_{lm}(k), A^{(a)}_{lm}(k) or A^{(e)}_{lm}(k). In order to get the effective influence of the measurement errors on the results of the plane wave decomposition, we relate the error contributions in equation 39 to the corresponding signal contribution and compute relative error contributions by taking the ratio of the squared absolute values of the different errors with respect to the signal contribution [20]:

E_a(kr) = \frac{|w_s^{(a)}(\Omega_s, k)|^2}{|w_s^{(s)}(\Omega_s, k)|^2}, \quad E_\Omega(kr) = \frac{|w_s^{(\Omega)}(\Omega_s, k)|^2}{|w_s^{(s)}(\Omega_s, k)|^2}, \quad E_e(kr) = \frac{|w_s^{(e)}(\Omega_s, k)|^2}{|w_s^{(s)}(\Omega_s, k)|^2}   (41)

In equation 41, noise-to-signal ratios are calculated. Figure 6 shows the behaviour of the different errors (noise, positioning and aliasing) for different levels l.

On comparing various quadratures with respect to spatial aliasing, microphone noise and positioning error, the Lebedev quadrature is found to be more robust against these errors in general. Due to these characteristics we use the Lebedev grid together with a rigid sphere [20].

3.2.1. Microphone noise
It is important to note that the mode strengths (section 2.8) have quite low values at higher levels l for low values of kr; the factor 1/b_l(kr) then amplifies the spherical harmonic coefficients (equation 23) considerably. Therefore, in situations where microphone noise is present, the noise gets amplified significantly (figure 6). The increase in microphone error is more vigorous in the low kr range than in the high kr range.

Microphone noise also depends on the number of microphones used; it was shown in simulations in

Fig. 6: Errors in spherical microphone array measurement [20]

[20] that the higher the number of microphones, the better the robustness against noise. It is also seen that the influence of microphone noise is lowest when the maximum level l ≈ kr. For higher kr the mode strengths somewhat converge towards 0 dB, and hence theoretically the increase in error for higher kr should not be too significant. The quadratures used for the discretization of the sphere do not have any significant effect with regard to microphone noise; they all behave in a similar way. But as microphone noise has more effect at low kr, we can say that it limits the array performance at lower frequencies.

3.2.2. Spatial aliasing
The problem of spatial aliasing is quite complex in spherical microphone arrays. As a continuous aperture is not practically feasible, we discretize the sphere using quadratures, which gives us a relation between the number of microphones and the maximum level l. But this discretization of the sphere leads to the spatial aliasing problem. In [20][22] the sampling techniques for spherical microphone arrays and their effects on the plane wave decomposition are analyzed. Aliasing-free techniques for level-limited functions and solutions such as spatial anti-aliasing filters for aliasing reduction are proposed in [22].

Referring to figure 4, because of the nature of b_l(kr) the


magnitude of the spherical harmonic coefficients of the sound pressure becomes increasingly insignificant for l > kr, where r is the radius of the sphere. The aliasing error is expected to be almost negligible if the operating frequency range of the array satisfies kr ≤ L_{max}. In the figure, plots for different levels are depicted, and the spherical Bessel function curves for higher orders become more and more damped. Now if we look at the spherical harmonic coefficients A_{lm}(k) for each l: in cases where kr > l they are significantly low due to the behaviour of the spherical Bessel function, but coefficient values for kr < l would be present.

Although as l increases the spherical Bessel function curve settles closer and closer to the x-axis, if a full-spectrum sound wave is considered then, to some extent, for every kr < l coefficient values would be present, since l is not level limited.

We have the quadrature limitation Q_{Lb} = (4/3)(L_{max} + 1)^2, and we can only have a limited number of microphone positions. Hence plane wave fields in which coefficients for higher frequencies and higher l are present would need a higher number of microphone positions to be sampled successfully, but this is not possible because Q, the number of microphones, is limited. Hence the coefficients at higher values of l would be sampled erroneously; that means spatial aliasing would occur.

The Bessel functions for higher values of l are very low for values of kr higher than l, but components for kr < l would still be there; hence we put a limit on kr, that is kr ≤ L_{max}, to subdue this effect. The spherical coefficients of the plane wave in equation 16 are not level limited and should contain higher levels l, but this does not occur, as the expression for the plane wave given above contains Bessel functions which have very low values in cases where l > kr. Hence levels up to l ≈ kr would be defined in the spherical harmonic coefficients of the plane wave, and the others would be insignificant or non-existent. Hence A_{lm}(k) = 0 for l > L_{max} holds as long as the condition kr ≤ L_{max} is fulfilled.


Fig. 2: Spherical Bessel functions of the first kind j_l(x) (left) and the second kind y_l(x) (right) for order l ∈ {0, 3, 6}; x is the argument in the plots, x = kr [8]

\hat{A}_{lm}(k) = \frac{1}{b_l(kr)} \sum_{l'=0}^{\infty}\sum_{m'=-l'}^{l'} A_{l'm'}(k)\,b_{l'}(kr)\, \underbrace{\sum_{q=1}^{Q} w_q\,Y_{l'}^{m'}(\Omega_q)\,\overline{Y_l^m(\Omega_q)}}_{Z}   (45)

Now the term Z is an approximation of the orthonormality condition of the spherical harmonics; therefore we can say [22]

\sum_{q=1}^{Q} w_q\,Y_{l'}^{m'}(\Omega_q)\,\overline{Y_l^m(\Omega_q)} = \delta_{l'l}\,\delta_{m'm} + \epsilon_a(l',m',l,m), \quad \text{where } l \le L_{max} < l'   (46)

This is an approximation, hence we get an additional term \epsilon_a(l',m',l,m) which represents the aliasing error induced by the sampling. This aliasing error is the same term which appears as the first part of the second case in equation 35. As stated in section 3.2.1, microphone noise limits the performance of a spherical microphone array in the lower frequency region, i.e. at lower kr; in the case of spatial aliasing, the array performance is limited at higher frequencies, or for larger radii.

3.2.3. Positioning error
Positioning error comes into play when the microphone elements of a spherical microphone array are not placed at the correct positions defined by the sampling scheme (refer to section 2.9). The positioning error affects the directional correctness of the plane wave decomposition and finally corrupts the output. Because of positioning error, the required spacing between the microphone elements on the sphere gets disturbed and the measurement of the impulse responses is not correct. Figure 6 shows the behavior of the positioning error; as is the case for microphone noise, the positioning error also has a higher impact at lower kr values. In the perceptual evaluation we have simulated positioning errors for elevation and azimuth separately. It is clear from the figure that the positioning error is influenced by the transform order, or level l: as the levels are increased, the influence of the positioning error increases. The mathematical expression for the positioning error is given in section 3.2. The impact of the positioning error is minimal if we use kr ≈ L_{max}. It is concluded in [22] that the maximum robustness against microphone noise, and against positioning error as well, is obtained for a plane wave decomposition with L ≈ kr, where L is the value of the level l.


4. AURALIZATION
In order to auralize the sound field while keeping the spatial characteristics of the sound alive, a method based on WFS is applied in the present work. Impulse response based auralization is used, which allows us to keep the measurement and reproduction sites independent of each other [6][8][3].

Impulse response based auralization: In this approach the room acoustics are measured and analyzed, i.e. the impulse responses of the room are measured. For reproduction, the room characteristics obtained from the impulse response measurement are combined with a dry audio channel and then reproduced. In simpler words: suppose there is an acoustic event in a particular environment (say a concert hall), and this acoustic scene (an acoustic scene can be defined as the acoustic event along with the sonic effects induced by the environment) has to be recreated in another room. If we know the room impulse responses (RIR), it is possible to recreate the same acoustic scene by convolving the dry audio file with the directional responses obtained by plane wave decomposition of the RIRs of that environment. Refer to Figure 3.

Now considering the work in this paper: in place of real room measurements we have simulated the free field wave impact on the spherical microphone array, so the free field impulse responses used in this paper are generated by our simulation environment. Different errors and artifacts are added during the procedure of plane wave decomposition. The important fact to note here is that the processing of the plane wave decomposition remains the same whether we use real room data or free field data, and hence these results hold for auralization using real room acoustics.

In order to auralize the effect of these errors, the impact of a full-spectrum free field sound wave on a sampled spherical microphone array is simulated. The direction of propagation of the sound wave is Ω = (azimuth, elevation) = (ϕ, ϑ) = (0°, 90°), i.e. the source lies on the horizontal plane with no vertical elevation. Plane wave decomposition is done for 12 directions, including the direction of propagation. The simulation is done for the free field case. Figure 4 shows the 12 different directions for which the PWD of the spherical microphone array data is done. For the simulation, the spherical microphone array radius r is

    Fig. 3: Impulse response based auralization [3].

taken as 15 cm. Free field impulse responses were obtained for all the different cases for the 12 plane wave decomposed directions. These responses for the different directions, when convolved with the test audio signal, impose the behavior of the free field sound field on it. As a result we get a 12-channel audio file, where each channel represents a directional component of the test audio signal in space.

4.1. Aspects to be perceptually evaluated

The basic parameters for the simulation were designed keeping in mind the rendering tool. We simulate the plane wave decomposition for the different error cases. The base transform order, or level l, is taken as 3; this comes from the fact that our reproduction system has a spatial aliasing limit of 1000 Hz, and for a spherical array of radius r = 15 cm this gives a value

  • 8/18/2019 Psychoacoustic Investigation on the Auralization of Spherical Microphone Array Processing With Wave Field Synthesis

    13/22

    Singh Spherical array processing with WFS

Fig. 4: Depiction of the PWD for 12 directions

of kr ≈ 3. Now, for a given order L, b_l(kr) has the shape of a bandpass filter (except level zero, which behaves as a low-pass filter), with the peak around l ≈ kr. Hence, when a plane wave is approximated by a finite summation of order l = L in the spherical harmonics domain, the reconstructed amplitudes are expected to be attenuated at frequencies corresponding to kr > L [23]. Therefore, in line with this, we keep the base transform order at L = 3, considering the spatial aliasing limit of the WFS reproduction set-up.

In order to establish a mechanism for comparison, we simulated and auralized an ideal full-spectrum wave impact on a continuous-aperture spherical microphone array for order L = 3. The impulse responses of the ideal full-spectrum wave were used to produce our reference signal, against which all comparisons are made. For the evaluation, let us first define the questions which we want this listening test to answer.

1. The first question concerns spatial aliasing: even if our microphone array has a sufficient number of measurement positions on the sphere as defined in section 2.9, we still face spatial aliasing (refer to section 3.2.2), since full-spectrum audio is auralized and the limitation kr < l is not obeyed. We want to know the perceptual effect of this spatial aliasing.

We also investigate the perceptual effect of changing the transform order l for a fixed number of microphones. For these cases the number of microphones of the spherical array was fixed at Q = 302. The highest transform order auralized and perceptually analyzed is L = 6, for which, according to equation 25, Q ≈ 66 is sufficient. The parameter Q, i.e. the number of microphones, is 302 because, firstly, it is above the minimum number of microphones required by all conditions of L analyzed in the listening test, and secondly, with a minimum of 302 positions for our base transform level L = 3 there is no noticeable spatial aliasing. The number of microphones is taken as 302 because the first test we do is used to establish a minimum required number of microphones at which no aliasing is perceptually observed. For our base transform order L = 3 at 302 sampling points there was no aliasing, and this fact is substantiated by the listening test as well.

2. For microphone noise, we add white Gaussian noise to the frequency domain output of the sampled sphere and then continue with the process of plane wave decomposition. There are two questions for which we attempt to find answers:

• For a fixed transform level and a fixed number of microphone positions, what is the minimum microphone noise level at which the degradation of the overall audio quality becomes perceptually significant, or up to what value does it remain perceptually insignificant, when tested against the base transform level l = 3?

• The other question we investigate is the effect of the transform level l on the microphone noise. We change the transform order for the fixed noise level obtained in the above step and see how it impacts the different levels.

3. The last aspect which we check is positioning errors (refer to section 3.2.3 and equation 38). The


positioning errors are checked against varying transform orders L = 3, 4, 5, 6.

The positioning error is added to the quadrature values obtained from the Lebedev grid structure. A position error is an angular offset in azimuth and elevation which is added to Ω = (ϕ, ϑ). The positioning error values are normally distributed with a defined degree of standard deviation (SD); that is, we simulate the sound field for normally distributed error values with a particular level of standard deviation. Two types of positioning errors are separately evaluated for their perceptual effect on the auralized sound (a sketch of the perturbation follows below):

• Positioning error in azimuth: here the error is added only in the azimuth ϕ.

• Positioning error in elevation: here the error is added only in the elevation ϑ.

For both these errors, a maximum error value is investigated for the base transform order L = 3 which does not induce any overall degradation of the audio quality in comparison to the reference, and this error level is then investigated for the different transform orders.
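As referenced above, a minimal sketch of how such normally distributed angular offsets can be applied to the sampling grid (our own illustration; the function name, the seeded generator, and treating the SD values from table 4 as given are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility

def perturb_positions(az_q, pol_q, sd, which="azimuth"):
    """Add a normally distributed positioning error with standard
    deviation sd to the Lebedev sampling angles, in azimuth or in
    elevation only, mirroring the two error cases of the listening test."""
    noise = rng.normal(0.0, sd, size=np.shape(az_q))
    if which == "azimuth":
        return az_q + noise, pol_q
    return az_q, pol_q + noise

# example: perturb only the azimuth angles of 302 positions with SD = 0.15
az_err, pol_err = perturb_positions(np.zeros(302), np.full(302, np.pi / 2), 0.15)
```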

4.2. Processing
The flow diagram in figure 5 shows the sequence of processing done for the auralization of spherical microphone array data through the WFS system. The blocks "Positioning error" and "Microphone noise" in figure 5 are the addition of positioning error and noise: the noise is added in the second stage, to the pressure responses of the microphone elements, while the positioning error is added in the first stage, when the discretization is done and the quadratures are calculated.

4.3. Reproduction set-up
The spatial sound field reproduction in our study is realized using wave field synthesis (WFS). The wave field synthesis set-up consists of a loudspeaker array of 88 elements. The inter-element spacing between the loudspeakers is approximately 18 cm. The spatial aliasing limit up to which the WFS reproduction set-up can synthesize sound fields accurately is approximately 1000 Hz.

The reproduction is done in the WFS lab, shown in figure 6. The layout of the loudspeakers is in a horizontal plane. The height of this loudspeaker

Fig. 5: Processing chain for the auralization of spherical microphone array data

layout from the ground is approximately equal to head level when a test subject is in a sitting position. Hence the horizontal plane contains the listener in the middle, surrounded by this array of loudspeakers.

    5. LISTENING TEST

5.1. Structure of the listening test
The listening test was conducted using the MUSHRA (Multiple Stimuli with Hidden Reference and Anchor) methodology [7]. This methodology is used for the subjective evaluation of audio quality. The listener is presented with a reference, which is labeled as such in the test; then there is an anchor and a certain number of test samples which contain the hidden reference.

The reference is the audio track against which all other test samples are compared and graded. The reference is therefore the benchmark according to which all comparisons are made.

The anchor is the audio track which represents the worst case.


Spatial aliasing

I. Spatial Aliasing vs Number of Microphones (tracks: (a) castanet, (b) speech, (c) music; each track is presented under every condition)
Test Condition 1 (Reference): L=3, continuous aperture
Test Condition 2: L=3, Mic=302
Test Condition 3: L=3, Mic=194
Test Condition 4: L=3, Mic=86
Test Condition 5: L=3, Mic=26
Test Condition 6 (Anchor)

II. Spatial Aliasing vs Transform Order (L) (tracks: (d) castanet, (e) speech, (f) music)
Test Condition 1 (Reference): L=3, continuous aperture
Test Condition 2: L=3, Mic=302
Test Condition 3: L=4, Mic=302
Test Condition 4: L=5, Mic=302
Test Condition 5: L=6, Mic=302
Test Condition 6 (Anchor)

Table 2: Test conditions for the perceptual evaluation of spatial aliasing

Microphone noise

I. Perceptual analysis of different microphone noise levels (tracks: (g) castanet, (h) speech, (i) music)
Test Condition 1 (Reference): L=3, continuous aperture
Test Condition 2: L=3, Mic=302, Noise=-80 dB
Test Condition 3: L=3, Mic=302, Noise=-65 dB
Test Condition 4: L=3, Mic=302, Noise=-55 dB
Test Condition 5: L=3, Mic=302, Noise=-40 dB
Test Condition 6 (Anchor)

II. Microphone Noise vs Transform Order (tracks: (j) castanet, (k) speech, (l) music)
Test Condition 1 (Reference): L=3, continuous aperture
Test Condition 2: L=3, Mic=302, Noise=-80 dB
Test Condition 3: L=4, Mic=302, Noise=-80 dB
Test Condition 4: L=5, Mic=302, Noise=-80 dB
Test Condition 5: L=6, Mic=302, Noise=-80 dB
Test Condition 6 (Anchor)

Table 3: Test conditions for the perceptual evaluation of microphone noise

Positioning error

I. Positioning Error (Elevation) vs Transform Order (tracks: (m) castanet, (n) speech, (o) music)
Test Condition 1 (Reference): L=3, continuous aperture
Test Condition 2: L=3, Mic=302, SD=0.15
Test Condition 3: L=4, Mic=302, SD=0.15
Test Condition 4: L=5, Mic=302, SD=0.15
Test Condition 5: L=6, Mic=302, SD=0.15
Test Condition 6 (Anchor)

II. Positioning Error (Azimuth) vs Transform Order (tracks: (p) castanet, (q) speech, (r) music)
Test Condition 1 (Reference): L=3, continuous aperture
Test Condition 2: L=3, Mic=302, SD=0.15
Test Condition 3: L=4, Mic=302, SD=0.15
Test Condition 4: L=5, Mic=302, SD=0.15
Test Condition 5: L=6, Mic=302, SD=0.15
Test Condition 6 (Anchor)

Table 4: Test conditions for the perceptual evaluation of positioning error


The ages of the test subjects varied from 23 to 30 years, giving a mean age of 25.42. None of the test subjects had any kind of hearing impairment. All the test subjects were students of TU Ilmenau. 24% of the test subjects had some kind of listening test experience, but not with a spatial sound reproduction set-up. All the test subjects were given an introductory overview of spatial sound systems, and the listening test set-up was explained to them. This kind of orientation was felt to be important, as the listening test and the WFS lab at first always give the impression of three-dimensional surround sound auralization, whereas in reality our work is focused on the perceptual impact of various noises and artifacts. Hence introductory information and the main motive of the listening test were briefly explained to the listeners.

    6. EVALUATION

6.1. Test subject screening
Out of 21 participants, 14 were able to identify the hidden reference and the anchor, and 7 did not identify one or both of them. Hence the scores of 14 test subjects were considered valid and the other 7 were treated as outliers.

6.2. Statistics for the evaluation of the listening test
The distribution of the test data was checked, and it was found that the test data approximately follow a normal distribution for all six categories and all individual conditions. The basic statistical formulation in this part of our analysis is as follows. The mean \bar{X}_{tc} is the mean for a particular track t and a particular test condition c (refer to tables 2, 3, 4), and S_{tc} is the standard deviation:

\bar{X}_{tc} = \frac{1}{N} \sum_{z} x_{ztc}   (47)

S_{tc} = \sqrt{\frac{N \sum_{z} x_{ztc}^2 - \left(\sum_{z} x_{ztc}\right)^2}{N(N-1)}}   (48)

where t: track, c: test condition, z: index of the test subject, N: number of test subjects.

6.3. Definitions
Statistical significance (p-value): The statistical significance of a result is the probability that the observed relationship (e.g. between variables) or difference (e.g. between means) in a sample occurred by pure chance ("luck of the draw"), and that in the population from which the sample was drawn no such relationship or difference exists. The higher the p-value, the less we can believe that the observed relation between variables in the sample is a reliable indicator of the relation between the respective variables in the population.
Confidence interval: The confidence interval gives us information about the reliability of the calculated mean. It is defined as the range in which the mean would lie with a given probability if the test were repeated. Calculation of the confidence interval:

\left[\bar{X}_{tc} - \delta_{tc},\; \bar{X}_{tc} + \delta_{tc}\right], \quad \delta_{tc} = t_p \frac{S_{tc}}{\sqrt{N}}   (49)

The value of t_p is taken from the t-distribution table according to the number of test subjects N.

Analysis of variance (ANOVA): The purpose of the analysis of variance (ANOVA) is to test for significant differences between means; in ANOVA, the statistical significance between means is tested by comparing (i.e. analyzing) the variances. In order to establish that the data obtained for the different conditions show a perceptual difference, we further analyze the measured characteristics. From the 2-way ANOVA analysis we get three p-values (statistical significances, as explained above). If a p-value is near zero, the associated null hypothesis is in doubt: a sufficiently small p-value suggests that at least one column sample mean is significantly different from the other column sample means. Interpreting this for the test conditions used in our test: if the p-value is sufficiently small, there is some effect due to the conditions imposed by the transform order; if the p-value for the test conditions is zero, the effect of the transform order is significant.

6.4. Spatial aliasing vs number of microphones
We analyze the impact of a changing transform order on spatial aliasing. Figure 7 shows that the impact of spatial aliasing increases as we increase the transform order of the microphone array processing. In this case the number of sampling positions is fixed at 302. It is sufficiently proved in this case that for a transform order of L = 3, with spherical sampling at 302 positions, there is no perceptual


aliasing effect. It is important to note that the maximum number of required sampling positions according to the Lebedev grid structure is 66 for L = 6 (the highest transform order auralized). Theoretically, sphere discretization with 302 positions should be sufficient for an aliasing-free plane wave decomposition. We see from the plots that the speech signal is affected worst by the spatial aliasing artifacts. The behavior shown by music and castanets is almost similar, and the confidence intervals for music and castanets overlap. We perform a 2-way ANOVA for the spatial aliasing case and look at the statistical significance. We see from the table that the p-value for the transform order is zero or almost zero; this signifies that the main factor affecting our test cases is the transform order.

    Fig. 7: Aliasing vs Transform order

Looking at the second column, we see that the p-value for the test items is also quite close to zero. As we assume a significance level of 95%, any p-value below 0.05 is considered statistically significant. Hence we conclude that the test tracks also have an effect on the perceptual results of spatial aliasing.

    Table 5: 2-way ANOVA for Spatial Aliasing

6.5. Evaluation of the positioning error
The effect of the positioning error on the auralization of the plane wave decomposition is tested separately for (1) positioning error in elevation and (2) positioning error in azimuth. The confidence interval plots for the positioning errors are shown in figures 8 and 9.

Fig. 8: Analysis of positioning error in azimuth for all three test items

Comparing these two figures, we see an obvious pattern: in both error cases, increasing the transform order degrades the audio quality. The next interesting observation is that the slope, or extent, to which the error in elevation corrupts the audio quality is not the same as in azimuth. In the case of the elevation error, all three test items show a similar perceptual performance; the overlap of their confidence intervals further substantiates this conclusion. In azimuth, on the other hand, we see a different behavior: speech and music follow the same trend, as if elevation and azimuth had the same effect on them, but for the castanets the azimuth error does not seem to degrade the signal in the same way as for the other tracks. The confidence intervals do overlap, but only to a small extent. With the above discussion in mind, we are tempted to investigate this issue further. In order to establish whether the perceptual effects cast by these two error cases are similar or not, we first check for hidden significance among the different test items in each error case. In the case of the elevation error, if we look at the plots closely, the confidence intervals

    AES 138 th Convention, Warsaw, Poland, 2015 May 7–10Page 18 of 22

  • 8/18/2019 Psychoacoustic Investigation on the Auralization of Spherical Microphone Array Processing With Wave Field Synthesis

    19/22

    Singh Spherical array processing with WFS

    Fig. 9: Analysis of positioning error in elevation forall three test items

    among different test items i.e, music, speech, andcastanet overlap to a relatively high degree, hencewe can fairly conclude on the basis of condence in-terval plots that in elevation error condition 3,4, and5 (which corresponds to transform order 4,5 and 6)share a high degree of similarity in there corruptivebehaviour towards the auralization of plane wave de-composition. In the case of azimuth we need more

    evidence to establish the extent of impact. For az-imuth error we do a 2 way ANOVA analysis. In thisanalysis we compare effect of test items and the ef-fect of different conditions simultaneously.2-way ANOVA: In our test we assume the condenceinterval level of 95% hence, any p-value bigger than0.05 that is considered high.

    Table 6: 2-way ANOVA analysis for azimuth error

    The second p-value corresponds to the effect caused by the test items; it is 0.0006, also a very small value, which suggests that the different test items do have an impact on the overall test scenario for azimuth error. The third p-value corresponds to the interaction between test items and test conditions; as we can observe, this p-value is quite high, indicating that there is no significant interaction between the two factors. Performing a 2-way ANOVA on the elevation error yields the following values of statistical significance.

    Table 7: 2-way ANOVA analysis for elevation error

    The statistical significance values in the case of elevation suggest that only the test conditions influence the perceptual scores; the test items do not have any impact.

    6.6. Microphone noise
    In the perceptual analysis of noise we first conducted tests with different levels of microphone noise. The transform order was kept constant at L = 3 while the noise level was varied. Figure 10 gives the mean and confidence interval plots for the different noise levels.

    Fig. 10: Noise level

    It is observed from the figure that as the noise level is increased, the perceptual response goes down. For all test items the response to noise is broadly similar. One case stands out and shows perceptual performance equivalent to the reference: at a noise level of -80 dB the signal is perceptually indistinguishable from the reference.
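    A sketch of how such a noise floor can be imposed on the simulated microphone signals; defining the level relative to the mean signal power is an assumption made for illustration, since the exact reference level is not restated here.

    import numpy as np

    def add_mic_noise(signals, level_db, seed=0):
        # Add independent white Gaussian noise to each microphone channel.
        # signals: array of shape (channels, samples); level_db: noise power
        # relative to the mean signal power, e.g. -80 (relative definition assumed).
        rng = np.random.default_rng(seed)
        signal_power = np.mean(np.asarray(signals) ** 2)
        noise_power = signal_power * 10.0 ** (level_db / 10.0)
        noise = rng.normal(0.0, np.sqrt(noise_power), size=np.shape(signals))
        return signals + noise

    With level_db = -80 the added noise is 80 dB below the average signal power, which matches the case found indistinguishable from the reference.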

    Table 8: 2-way ANOVA analysis for noise levels

    Table 8 gives the values of the 2-way ANOVA test. From the p-values it is evident that the test items had no hidden significant influence; as expected, only the noise levels had a significant impact on the perceptual evaluation. In figure 11 we compare a noise level of -80 dB against transform orders varying from 3 to 6.

    Fig. 11: Noise vs transform order

    The plot shows all the test items and their perceptual degradation as the transform order is increased. The noise becomes strongly audible even when the transform order is changed from 3 to 4, whereas at a transform order of 3 the test items showed perceptual performance equivalent to the reference. A 2-way ANOVA test further substantiates the significant impact of the transform order on noise; the values in table 9 show the significance of the transform order for the perceptual evaluation of the different test items.

    Table 9: 2-way ANOVA analysis for noise levels vs. transform order
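    This steep degradation is consistent with the known noise amplification of the radial filters in spherical array processing (cf. Rafaely [20]); we offer this as a plausible explanation rather than a result of the test itself. The plane wave decomposition divides each order-l coefficient by the mode strength b_l(kr), whose magnitude falls off rapidly for l > kr, so the compensating filter gain, and with it the microphone self-noise, grows steeply once the maximum order L exceeds kr:

    \[
      G_l(kr) = \left|\frac{1}{b_l(kr)}\right| \gg 1 \qquad \text{for } l > kr .
    \]

    Raising L from 3 to 4 therefore adds a mode whose inverse-filter gain dominates over much of the audio band, matching the sharp drop seen in figure 11.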

    7. CONCLUSION
    Spherical microphone arrays were studied and analysed. Our aim was to simulate errors in a simulation environment and then design a listening test for perceptual evaluation, in order to establish whether a given parameter really has a perceptual effect, and if so to what extent. Three errors which affect the performance are evaluated in this work:

    • Spatial aliasing
    • Positioning error
    • Microphone noise

    A full-bandwidth wave field is sampled on the spherical microphone array and a plane wave decomposition into 12 directions is performed. We simulated many test cases and analysed them ourselves; after detailed analysis and auralization we designed the listening test. For all auralization purposes a transform order of L = 3 was selected as the base transform order. The WFS system has a spatial aliasing frequency of 1000 Hz. To obtain aliasing-free sampling, not only is a sufficient number of microphone positions required, but the product of wavenumber and sphere radius must satisfy kr ≈ L; this condition must be met on the spherical microphone array side, while on the rendering side the WFS spatial aliasing frequency should not be exceeded. The modal filters have the shape of bandpass filters, except for l = 0. Therefore, keeping L = 3 also restricts the bandwidth on the spherical microphone array side, which is why for L = 3 we do not see any significant corruption of the signal, specifically by spatial aliasing artifacts.
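    To make the kr ≈ L condition concrete: with k = 2πf/c, the order-limited array resolves the field essentially error-free only up to roughly f ≈ Lc/(2πr). As a purely illustrative example (the array radius r is not restated in this section), taking r = 5 cm, c = 343 m/s and L = 3:

    \[
      f_{\max} \approx \frac{L\,c}{2\pi r}
              = \frac{3 \times 343\ \mathrm{m/s}}{2\pi \times 0.05\ \mathrm{m}}
              \approx 3.3\ \mathrm{kHz},
    \]

    so under these assumed numbers the 1000 Hz WFS aliasing frequency, not the array, would be the binding limit, consistent with L = 3 showing no audible aliasing corruption.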

    Three different types of audio tracks were used: music, speech and castanets. The following conclusions were drawn on the basis of the listening test.

    1. Spatial aliasing is magnified as the transform order of the spherical microphone array processing is increased.

    2. The errors were evaluated for their perceptual identifiability as the transform order is increased. The test results show that microphone noise is amplified with increasing transform order.

    3. The simulated spherical microphone array was based on a Lebedev grid sampling scheme, and aliasing effects were observed even with far more than the required number of sampling positions. It is concluded that the number of microphone positions prescribed by the Lebedev grid calculation does not necessarily provide an aliasing-free impulse response measurement.

    4. The degradation of perceptual quality with increasing transform order is very steep.

    5. The effects of positioning errors in azimuth and elevation are also amplified with increasing transform order.

    6. The perceptual effect of azimuth error is found to be influenced by the audio tracks as well (substantiated by the 2-way ANOVA test).

    7. In all cases, speech was observed to be badly affected, and by all of the error categories alike.

    8. REFERENCES

    [1] B. Bernschütz, C. Pörschmann, S. Spors, and S. Weinzierl. SOFiA Sound Field Analysis Toolbox. In International Conference on Spatial Audio, Detmold, Germany, November 2011.

    [2] D. de Vries and E. M. Hulsebos. Auralization of room acoustics by wave field synthesis based on array measurements of impulse responses. In 12th European Signal Processing Conference (EUSIPCO), 2004.

    [3] E. Hulsebos, Auralization using wave field synthesis, Ph.D. dissertation, Delft University of Technology, 2004.

    [4] J. Sonke, Variable acoustics by wave field synthesis, Ph.D. dissertation, Delft University of Technology, 2000.

    [5] E. G. Williams, Fourier Acoustics: Sound Radiation and Nearfield Acoustical Holography. Academic Press, 1999.

    [6] F. Melchior, Investigations on spatial sound design based on measured room impulse responses, Ph.D. dissertation, TU Ilmenau, 2011.

    [7] ITU-R, Recommendation BS.1534-1: Method for the subjective assessment of intermediate quality level of coding systems, Tech. Rep., 2003.

    [8] O. Thiergart, Sound field analysis on the basis of a spherical microphone array for auralization applications, M.Sc. thesis, TU Ilmenau, 2007.

    [9] E. Skudrzyk, The Foundations of Acoustics: Basic Mathematics and Basic Acoustics. Springer-Verlag, 1971.

    [10] D. T. Blackstock, Fundamentals of Physical Acoustics. John Wiley, 2000.

    [11] J. Feldman, Solution of the wave equation by separation of variables, lecture notes, Department of Mathematics, University of British Columbia, January 2007.

    [12] F. Zotter, Analysis and Synthesis of Sound-Radiation with Spherical Arrays, Ph.D. dissertation, University of Music and Performing Arts Graz, 2009.

    [13] B. Rafaely. Plane-wave decomposition of the sound field on a sphere by spherical convolution. In Journal of the Acoustical Society of America, vol. 116, no. 4, pp. 2149-2157, 2004.

    [14] J. Meyer and T. Agnello. Spherical microphone array for spatial sound recording. In Audio Engineering Society Convention 115, October 2003.

    [15] M. A. Poletti. Three-dimensional surround sound systems based on spherical harmonics. In J. Audio Eng. Soc., vol. 53, no. 11, pp. 1004-1025, 2005.

    [16] R. Collins, Mathematical Methods for Physicists and Engineers, ser. Dover Books on Physics. Dover Publications, 1999.

    [17] G. V. Singh, Psychoacoustic Investigation on the Auralization of Spherical Microphone Array Data Using Wave Field Synthesis, M.Sc. thesis, TU Ilmenau, 2007.

    [18] J. Driscoll and D. Healy, Computing Fourier transforms and convolutions on the 2-sphere. Advances in Applied Mathematics, vol. 15, no. 2, pp. 202-250, 1994.

    [19] T. Abhayapala and D. B. Ward. Theory and design of high order sound field microphones using spherical microphone array. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 2, May 2002, pp. II-1949-II-1952.

    [20] B. Rafaely, Analysis and design of spherical microphone arrays. In IEEE Transactions on Speech and Audio Processing, vol. 13, no. 1, pp. 135-143, January 2005.

    [21] J. Meyer and G. Elko, A highly scalable spherical microphone array based on an orthonormal decomposition of the sound field. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 2, May 2002.

    [22] B. Rafaely, B. Weiss, and E. Bachmat, Spatial aliasing in spherical microphone arrays. In IEEE Transactions on Signal Processing, vol. 55, no. 3, pp. 1003-1010, March 2007.

    [23] A. Avni, J. Ahrens, M. Geier, S. Spors, H. Wierstorf, and B. Rafaely. Spatial perception of sound fields recorded by spherical microphone arrays with varying spatial resolution. In The Journal of the Acoustical Society of America, vol. 133, no. 5, pp. 2711-2721, 2013.
