Audio lab Understanding the soundscape concept: the role of sound recognition and source identification David Chesmore Audio Systems Laboratory Department

Audio lab

Understanding the soundscape concept:

the role of sound recognition and source identification

David Chesmore

Audio Systems Laboratory

Department of Electronics

University of York

Overview of Presentation

• Role of soundscape analysis

• Instrument for Soundscape Recognition, Identification and Evaluation (ISRIE)

• Soundscape description language

• Applications

• Conclusions

Role of Soundscape Analysis

• Potential applications:

– identifying relevant sound elements in a soundscape (e.g. high intensity sounds)

– determining positive and negative sounds

– biodiversity studies

– tranquil areas

– preserving important soundscapes

– planning and noise abatement studies

Soundscape Analysis Options

• Manual

– Advantage: subjective

– Disadvantages: time consuming, limited resources, subjective, very large storage requirements

• Automatic

– Advantages: objective (once trained), continuous analysis possible, much reduced data storage requirements

– Disadvantage: reliability of sound element classification

How to Automatically Classify Sounds?

• Major issues to address:

– separation and localisation of sounds in the soundscape (especially with multiple simultaneous sounds)

– classification of sounds depends on feature overlap and the number of elements

• The number of elements, localisation, etc. depend on the application

Instrument for Soundscape Recognition, Identification and Evaluation (ISRIE)

• ISRIE is a collaborative project between York, Southampton and Newcastle Universities

• 1 of 3 projects arising from EPSRC Noisy Futures Sandpit

• York - sound separation + sound classification

• Southampton - applications + interface with users

• Newcastle - sound localisation + arrays

Aim of ISRIE

• Aim is to produce an instrument capable of automatically identifying sounds in a soundscape by:

– separating sounds in 3-d

– localising sounds from the 3-d field

– classifying sounds into a restricted range of categories

Outline of ISRIE

[Block diagram: Sensor → ISRIE (Localisation + Separation → Classification). Outputs: location (alt, az); duration, SPL, LEQ; category.]

Sound Separation - Sensor

• B-format microphone as sensor

– Provides 3D directional information

– A coincident microphone array reduces convolutive separation problems to instantaneous.

– More compact and practical than multi-microphone solutions.

Outputs

W – omni-directional component

X – fig-8 response along x-axis

Y – fig-8 response along y-axis

Z – fig-8 response along z-axis
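The four outputs can be illustrated by encoding a single mono sample into "virtual" B-format. This is a minimal sketch, assuming the traditional first-order convention in which W is weighted by 1/√2 and azimuth/elevation are measured in degrees; the function name `encode_b_format` is hypothetical and not taken from the ISRIE project.

```python
import math

def encode_b_format(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample arriving from (azimuth, elevation) into W, X, Y, Z.

    Assumes the traditional B-format weighting (W scaled by 1/sqrt(2)).
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                # omni-directional component
    x = sample * math.cos(az) * math.cos(el)   # figure-of-eight along x-axis
    y = sample * math.sin(az) * math.cos(el)   # figure-of-eight along y-axis
    z = sample * math.sin(el)                  # figure-of-eight along z-axis
    return w, x, y, z

# A source straight ahead excites only W and X.
front = encode_b_format(1.0, 0.0, 0.0)
# A source directly to the side (azimuth 90 degrees) excites only W and Y.
side = encode_b_format(1.0, 90.0, 0.0)
```

Because all four components are scaled copies of the same sample, the mixture at a coincident array is instantaneous rather than convolutive, which is the property the slide highlights.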

Overview of Separation Method

• Use Coincident Microphone array

• Transform into Time-Frequency Domain

• Find Direction Of Arrival (DOA) vector for each Time-Frequency point.

• Filter sources based on known or estimated positions in 3D space
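The last two steps can be sketched for a handful of time-frequency points. This is an illustrative sketch only: it assumes the direction at each point is estimated from the real part of conj(W)·[X, Y, Z] (an intensity-style estimate) and that filtering is a binary mask on angular distance to a known target. The slides do not state which estimator or mask ISRIE uses, and the names `doa_vector`, `unit_vector` and `separate` are hypothetical.

```python
import math

def doa_vector(w, x, y, z):
    """Unit direction estimate for one time-frequency point (complex coefficients)."""
    dx = (w.conjugate() * x).real
    dy = (w.conjugate() * y).real
    dz = (w.conjugate() * z).real
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    if norm == 0.0:
        return None  # no energy, so no direction can be estimated
    return (dx / norm, dy / norm, dz / norm)

def unit_vector(az_deg, el_deg):
    """Unit vector pointing towards (azimuth, elevation) in degrees."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    return (math.cos(az) * math.cos(el), math.sin(az) * math.cos(el), math.sin(el))

def separate(points, target_az_deg, target_el_deg, max_angle_deg):
    """Keep W at points whose estimated direction lies near the target; zero the rest."""
    target = unit_vector(target_az_deg, target_el_deg)
    cos_limit = math.cos(math.radians(max_angle_deg))
    kept = []
    for w, x, y, z in points:
        d = doa_vector(w, x, y, z)
        near = d is not None and sum(a * b for a, b in zip(d, target)) >= cos_limit
        kept.append(w if near else 0j)
    return kept

# Two time-frequency points: one dominated by a source ahead, one by a source at 90 degrees.
s = complex(1.0, 2.0)
w = s / math.sqrt(2.0)
points = [(w, s, 0j, 0j), (w, 0j, s, 0j)]
masked = separate(points, target_az_deg=0.0, target_el_deg=0.0, max_angle_deg=20.0)
```

In practice the coefficients would come from the time-frequency transform of the four B-format channels, and the masked coefficients would be inverse-transformed to recover the separated source.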

Assumptions

• Approximately W-Disjoint Orthogonal

– Sparse in time-frequency domain, i.e. the power in any time-frequency window is attributed to one source.

• Sound sources are geographically spaced (sparse)

– Noise sources have unique Direction of Arrival (DOA).

The Dual Tree Complex Wavelet Transform (DT-CWT)

• Efficient filterbank structure

• Approximately shift invariant

[Figure panels: STFT separation; DT-CWT separation]

Separation results - speech

• 3 male speakers

• Recorded in the anechoic chamber at ISVR and mixed to virtual B-format, with known locations spaced around the microphone

Performance Measure

Speaker   SIR original (dB)   SIR separated (dB)   SIR gain (dB)   PSRM (dB)

1         0.17                12.14                12.32           0.94

2         2.96                12.30                15.27           0.88

3         -6.81               10.89                17.70           0.58

Source Estimation and Tracking

• Examples used known source locations. In many deployment scenarios, this is acceptable.

• More versatility could be provided by finding and tracking source locations

• Two approaches considered:

– 3D histogram approach

– Clustering using a plastic self-organising map

Results

• 2 Speakers – Directional Geodesic Histogram

Position of peaks at (0,0) and (10,20) degrees

Blur between peaks due to 2 sources only approximating the assumptions
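The idea of locating sources from histogram peaks can be sketched as follows. This is a simplified stand-in, not the geodesic histogram from the slide: it assumes direction estimates are given as (azimuth, elevation) pairs in degrees and bins them on a regular grid, ignoring azimuth wrap-around. The function names and the sample estimates are hypothetical.

```python
from collections import Counter

def direction_histogram(estimates, bin_deg=5.0):
    """Count direction estimates per (azimuth, elevation) bin centre."""
    counts = Counter()
    for az, el in estimates:
        key = (round(az / bin_deg) * bin_deg, round(el / bin_deg) * bin_deg)
        counts[key] += 1
    return counts

def strongest_directions(counts, n_sources):
    """Return the bin centres of the n_sources most populated bins."""
    return [direction for direction, _ in counts.most_common(n_sources)]

# Hypothetical per-point estimates scattered around two directions, plus one outlier.
estimates = [(1.0, -1.0), (0.5, 0.4), (-1.0, 2.0), (9.0, 21.0), (11.0, 19.0), (30.0, 40.0)]
peaks = strongest_directions(direction_histogram(estimates), n_sources=2)
```

Estimates that fall between the two peaks, as in the blur noted on the slide, would populate intermediate bins and make the peaks less distinct.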

Signal Classification

What features? TDSC

Which classifier? ANN – MLP, LVQ

Which Sounds?

ISRIE Sound Categories

Time-Domain Signal Coding

• Purely time-domain technique

• Successfully used for:

– Species recognition (birds, crickets, bats, wood-boring insects)

– Heart sound recognition

• Current applications:

– Environmental sound

– Vehicle recognition

Time-Domain Signal Coding

[Figure: waveform against time, with one epoch (the interval between successive zero crossings) marked]
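Epoch extraction can be sketched in a few lines. This is a minimal sketch, assuming the usual TDSC description of each epoch by its duration D (in samples) and shape S (the number of local minima in the rectified waveform within the epoch); the exact coding used in ISRIE is not given on the slides, and `tdsc_epochs` is a hypothetical name.

```python
def tdsc_epochs(samples):
    """Return (duration, shape) for each epoch between successive zero crossings.

    duration: number of samples in the epoch.
    shape: number of local minima of the rectified signal inside the epoch.
    The first and last epochs may be truncated by the ends of the recording.
    """
    epochs = []
    start = 0
    for i in range(1, len(samples) + 1):
        at_end = i == len(samples)
        # A sign change relative to the current epoch marks a zero crossing.
        if at_end or (samples[i] >= 0) != (samples[start] >= 0):
            segment = [abs(v) for v in samples[start:i]]
            shape = sum(
                1
                for k in range(1, len(segment) - 1)
                if segment[k] < segment[k - 1] and segment[k] < segment[k + 1]
            )
            epochs.append((len(segment), shape))
            start = i
    return epochs

# One positive epoch with a dip, one negative epoch, one short positive epoch.
coded = tdsc_epochs([1, 3, 2, 4, 1, -1, -2, -1, 2, 1])
```

Everything is computed directly from the sample values, with no transform to the frequency domain.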

Multiscale TDSC (MTDSC)

• New method of D-S data presentation

– Replaces S-matrix, A-matrix or D-matrix

• Multiscale

– Made from groups of epochs in powers of 2 (512, 256, etc.)

• Inspired by Wavelets

MTDSC

Level   Values over 1 frame (epochs)

1       S1(1) S1(2) S1(3) S1(4) S1(5) S1(6) S1(7) S1(8)

2       S2(1) S2(2) S2(3) S2(4)

3       S3(1) S3(2)

4       S4

Value in frame at level n (n = 4 shown):

$$S_n = \frac{1}{2^{n-1}} \sum_{j=1}^{2^{n-1}} S_1(j)$$
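The level structure can be sketched as a pyramid built from one frame of level-1 values. This is a minimal sketch, assuming each value at a higher level is the mean of the two adjacent values on the level below, so the top level is the mean over the whole frame. The function name `mtdsc_pyramid` and the input values are hypothetical.

```python
def mtdsc_pyramid(level1):
    """Build all levels from one frame of level-1 values by pairwise averaging.

    The frame length must be a power of 2 (8 in the diagram).
    """
    n = len(level1)
    if n == 0 or n & (n - 1):
        raise ValueError("frame length must be a power of 2")
    levels = [list(level1)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([(prev[i] + prev[i + 1]) / 2 for i in range(0, len(prev), 2)])
    return levels

# Eight hypothetical epoch shape values give four levels of 8, 4, 2 and 1 values.
pyramid = mtdsc_pyramid([1, 3, 5, 7, 2, 2, 4, 4])
```

Other statistics (for example variance) could replace the mean inside the list comprehension, at the cost of needing longer frames for stable estimates.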

MTDSC Example

Logarithmic chirp, 100 Hz to 24 kHz

Epoch frame length 2^m

MTDSC (cont)

• Currently use shape but will investigate:

– epoch duration (zero-crossing interval) only

– epoch duration and shape

– epoch duration, shape and energy

• Currently use the mean; variance and higher-order statistics can also be used for larger values of m (e.g. 9)

MTDSC Results (1)

[Block diagram: Audio → MTDSC data generation & stacking → 3-output LVQ network → outputs 1, 2, 3]

• Winning output determines result

• Overall network accuracy: 76%

• Some categories better than others

– Road, Rail – 93%
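The "winning output" rule can be sketched as nearest-prototype classification, which is how a trained LVQ network assigns a class. This is an illustrative sketch only: the prototype vectors, their two-dimensional feature space and the function name `lvq_classify` are hypothetical, and training of the prototypes is not shown.

```python
import math

def lvq_classify(features, prototypes):
    """Return the output number of the prototype closest to the feature vector.

    prototypes: list of (output_number, prototype_vector) pairs.
    """
    winner = min(prototypes, key=lambda item: math.dist(features, item[1]))
    return winner[0]

# Three hypothetical prototypes, one per network output.
prototypes = [(1, [0.0, 0.0]), (2, [1.0, 0.0]), (3, [0.0, 1.0])]
result = lvq_classify([0.9, 0.1], prototypes)
```

In the system on the slide the feature vector would be the stacked MTDSC data rather than a two-element list.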

MTDSC Results (2)

• 3 different Japanese cicada species used for biodiversity studies (2 common, 1 rare) in northern Japan

• 21 test files from field recordings including 1 with -6dB SNR

• Backpropagation MLP classifier

• 20 out of 21 test files correctly classified

– ~95% accuracy

Practical ISRIE

[Block diagram: Sensor → ISRIE (Localisation + Separation → Classification). Outputs: location (alt, az); duration, SPL, LEQ; category. User-supplied data: approximate location, required sound category.]

Restricting Location

[Figure: cone of acceptance around the target; signals arriving from outside the cone are rejected automatically]
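A cone-of-acceptance test reduces to comparing the angle between an estimated direction and the target direction against the cone's half-angle. The sketch below assumes directions are given as azimuth and elevation in degrees; the function names and the 30-degree half-angle are hypothetical choices for illustration.

```python
import math

def direction(az_deg, el_deg):
    """Unit vector pointing towards (azimuth, elevation) in degrees."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    return (math.cos(az) * math.cos(el), math.sin(az) * math.cos(el), math.sin(el))

def in_cone(source_az, source_el, target_az, target_el, half_angle_deg):
    """True if the source direction lies within half_angle_deg of the target direction."""
    a = direction(source_az, source_el)
    b = direction(target_az, target_el)
    cos_angle = sum(p * q for p, q in zip(a, b))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos_angle)) <= half_angle_deg

# A source at (10, 20) degrees is about 22 degrees from a target at (0, 0): accepted.
accepted = in_cone(10.0, 20.0, 0.0, 0.0, half_angle_deg=30.0)
# A source at 90 degrees azimuth is well outside the same cone: rejected.
rejected = in_cone(90.0, 0.0, 0.0, 0.0, half_angle_deg=30.0)
```

A narrower cone rejects more interfering sounds but demands a more accurate user-supplied target location.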

Further Automated Analysis

• At present, ISRIE only provides a classified sound element in a small range of categories

• Can we create a soundscape description language (SDL)?

• Needs to be flexible enough to accommodate manually and automatically generated soundscapes

• Take inspiration from speech recognition, natural language, bioacoustics (e.g. automated ID of insects, birds, bats, cetaceans)

sonotag = (L,,d,t,D,a,c,p,G)

where L = label

= direction of sound

d = estimated distance to sound

t = onset time

D = duration

a = received sound pressure level

c = classification (a = automatic, m = manual)

p = level of confidence in classification

G = geotag = G(ll,lo,al), where ll = latitude, lo = longitude, al = altitude

• Other possibilities exist
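One way to hold a sonotag in software is a small record type. This is a sketch under stated assumptions: the field names, the use of an (azimuth, elevation) pair for direction, the 0 to 1 range for confidence and the `as_tuple_string` helper are all hypothetical; the slides define only the tuple layout and do not give units.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Geotag:
    latitude: float
    longitude: float
    altitude: float

@dataclass(frozen=True)
class Sonotag:
    label: str
    direction: Optional[Tuple[float, float]]  # None when no direction is available
    distance: float        # estimated distance to sound
    onset: str             # onset time, kept as text
    duration: float
    level: float           # received sound pressure level
    classification: str    # "a" = automatic, "m" = manual
    confidence: float      # assumed to lie between 0 and 1
    geotag: Geotag

    def __post_init__(self):
        if self.classification not in ("a", "m"):
            raise ValueError("classification must be 'a' or 'm'")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")

    def as_tuple_string(self):
        """Render the tag in the comma-separated tuple layout."""
        num = lambda v: format(v, "g")
        direction = "" if self.direction is None else ",".join(num(v) for v in self.direction)
        g = self.geotag
        geo = "(" + ",".join(num(v) for v in (g.latitude, g.longitude, g.altitude)) + ")"
        fields = [self.label, direction, num(self.distance), self.onset, num(self.duration),
                  num(self.level), self.classification, num(self.confidence), geo]
        return "(" + ",".join(fields) + ")"

tag = Sonotag("plane", None, 100, "11:52.5", 5, 35, "a", 0.96, Geotag(53.914, -0.845, 10))
text = tag.as_tuple_string()
```

Leaving the direction empty reproduces the monaural layout, while supplying a pair yields the two extra numeric fields seen in 3-D tags.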

Example of Monaural Sonotags

18 s recording of O. viridulus at a nature reserve in Yorkshire in 2003

(O. viridulus,,1,11:45,2,50,a,0.99,(53.914,-0.845,10))

(O. viridulus,,1,11:50,1.5,50,a,0.99,(53.914,-0.845,10))

(plane,,100,11:52.5,5,35,a,0.96,(53.914,-0.845,10))

(Bird1,,100,12:02,5,41,a,0.99,(53.914,-0.845,10))

Example of 3-D Sonotags

(speaker2,0,0,1.5,14:00,5,43,a,0.96,(53.9,-0.9,10))

(speaker1,10,20,2,14:00,5,42,a,0.92,(53.9,-0.9,10))

Treat separated sounds as monaural recordings for classification

Applications (1)

• BS 4142 assessments

• PPG 24 assessments

• Noise nuisance applications

• Other acoustic consultancy problems

• Soundscape recordings

• Future noise policy

Applications (2)

• Biodiversity assessment, endangered species monitoring

• Alien invasive species (e.g. Cane Toad in Australia)

• Anthropogenic noise effects on animals

• Habitat fragmentation

• Tranquility studies

Conclusions

• ISRIE has been shown to be successful in separating and classifying urban sounds

– much work still to be done, especially in classification

• Automated soundscape description is possible but a flexible and formal framework is needed
