Audio lab
Understanding the soundscape concept:
the role of sound recognition and source identification
David Chesmore
Audio Systems Laboratory
Department of Electronics
University of York
Overview of Presentation
• Role of soundscape analysis
• Instrument for Soundscape Recognition, Identification and Evaluation (ISRIE)
• Soundscape description language
• Applications
• Conclusions
Role of Soundscape Analysis
• Potential applications:
  – identifying relevant sound elements in a soundscape (e.g. high intensity sounds)
  – determining positive and negative sounds
  – biodiversity studies
  – tranquil areas
  – preserving important soundscapes
  – planning and noise abatement studies
Soundscape Analysis Options
• Manual
  – Advantage: subjective
  – Disadvantages: time consuming, limited resources, subjective, very large storage requirements
• Automatic
  – Advantages: objective (once trained), continuous analysis possible, much reduced data storage requirements
  – Disadvantage: reliability of sound element classification
How to Automatically Classify Sounds?
• Major issues to address:
  – separation and localisation of sounds in the soundscape (especially with multiple simultaneous sounds)
  – classification of sounds depends on feature overlap and the number of elements
• Number of elements, localisation, etc. depend on the application
Instrument for Soundscape Recognition, Identification and Evaluation (ISRIE)
• ISRIE is a collaborative project between York, Southampton and Newcastle Universities
• One of three projects arising from the EPSRC Noisy Futures Sandpit
• York - sound separation + sound classification
• Southampton - applications + interface with users
• Newcastle - sound localisation + arrays
Aim of ISRIE
• Aim is to produce an instrument capable of automatically identifying sounds in a soundscape by:
  – separating sounds in 3-D
  – localising sounds from the 3-D field
  – classifying sounds into a restricted range of categories
Outline of ISRIE
[Block diagram: Sensor → Localisation + Separation → Classification, within ISRIE. Outputs: (alt, az) location; duration, SPL, LEQ; category.]
Sound Separation - Sensor
• B-format microphone as sensor
  – Provides 3D directional information
  – A coincident microphone array reduces the convolutive separation problem to an instantaneous one.
  – More compact and practical than multi-microphone solutions.
Outputs
W – omni-directional component
X – fig-8 response along x-axis
Y – fig-8 response along y-axis
Z – fig-8 response along z-axis
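The four outputs above can be illustrated by panning a mono signal into a virtual B-format signal. This is a minimal sketch, assuming the traditional convention in which W carries a gain of 1/√2, the x-axis points forward and azimuth is measured anticlockwise; the slides do not state which convention is used, and `encode_b_format` is a hypothetical helper name.

```python
import math

def encode_b_format(samples, azimuth_deg, elevation_deg):
    """Pan a mono signal to first-order B-format channels W, X, Y, Z.

    Assumption: W has gain 1/sqrt(2); X, Y, Z are figure-of-eight gains
    along the x, y and z axes for a source at the given direction.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    gains = {
        "W": 1.0 / math.sqrt(2.0),          # omni-directional component
        "X": math.cos(az) * math.cos(el),   # fig-8 along x-axis
        "Y": math.sin(az) * math.cos(el),   # fig-8 along y-axis
        "Z": math.sin(el),                  # fig-8 along z-axis
    }
    return {name: [g * s for s in samples] for name, g in gains.items()}

# A short 440 Hz tone placed directly to the left (azimuth 90 degrees).
tone = [math.sin(2 * math.pi * 440 * n / 48000) for n in range(480)]
bfmt = encode_b_format(tone, azimuth_deg=90.0, elevation_deg=0.0)
```

With the source on the y-axis, the Y channel equals the input and the X and Z channels are effectively zero.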
Overview of Separation Method
• Use a coincident microphone array
• Transform into Time-Frequency Domain
• Find Direction Of Arrival (DOA) vector for each Time-Frequency point.
• Filter sources based on known or estimated positions in 3D space
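The last two steps can be sketched for a single time-frequency coefficient. This is an illustrative sketch only: it assumes the direction of arrival is taken from the real part of conj(W) times each figure-of-eight channel, and that filtering is a hard keep-or-zero decision within an angular tolerance. The function names and the 15 degree tolerance are hypothetical, not taken from ISRIE.

```python
import math

def doa_from_bin(w, x, y, z):
    """Return (azimuth_deg, elevation_deg) for one complex time-frequency bin."""
    # Direction estimate: real part of conj(W) times each fig-8 channel.
    ix = (w.conjugate() * x).real
    iy = (w.conjugate() * y).real
    iz = (w.conjugate() * z).real
    az = math.degrees(math.atan2(iy, ix))
    el = math.degrees(math.atan2(iz, math.hypot(ix, iy)))
    return az, el

def angular_distance_deg(a, b):
    """Angle in degrees between two (azimuth, elevation) directions."""
    az1, el1, az2, el2 = map(math.radians, (*a, *b))
    c = (math.sin(el1) * math.sin(el2)
         + math.cos(el1) * math.cos(el2) * math.cos(az1 - az2))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

def mask_bin(w, x, y, z, target, tolerance_deg=15.0):
    """Keep the W coefficient if the bin's DOA is close to the target, else zero it."""
    if angular_distance_deg(doa_from_bin(w, x, y, z), target) <= tolerance_deg:
        return w
    return 0j

# One bin containing a single source at azimuth 30, elevation 10 degrees.
s = 0.8 + 0.3j
az, el = math.radians(30.0), math.radians(10.0)
w, x, y, z = (s / math.sqrt(2.0), s * math.cos(az) * math.cos(el),
              s * math.sin(az) * math.cos(el), s * math.sin(el))
estimate = doa_from_bin(w, x, y, z)
kept = mask_bin(w, x, y, z, target=(30.0, 10.0))
rejected = mask_bin(w, x, y, z, target=(-90.0, 0.0))
```

Applying `mask_bin` to every coefficient and inverting the transform would give one separated source per target direction.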
Assumptions
• Approximately W-Disjoint Orthogonal
  – Sparse in time-frequency domain, i.e. the power in any time-frequency window is attributed to one source.
• Sound sources are geographically spaced (sparse)
  – Noise sources have unique Direction of Arrival (DOA).
The Dual Tree Complex Wavelet Transform (DT-CWT)
• Efficient filterbank structure
• Approximately shift invariant
[Figure panels: STFT separation; DT-CWT separation]
Separation results - speech
• 3 male speakers
• Recorded in anechoic chamber, ISVR. Mixed to virtual B-format, known locations spaced around microphone
Performance measure
Speaker   SIR original (dB)   SIR separated (dB)   SIR gain (dB)   PSRM (dB)
1         0.17                12.14                12.32           0.94
2         2.96                12.30                15.27           0.88
3         -6.81               10.89                17.70           0.58
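As a worked illustration of the measure, the signal-to-interference ratio (SIR) can be computed from a target signal and the interference that accompanies it. The sketch below assumes SIR is the ratio of mean powers in decibels and that SIR gain is the separated SIR minus the original SIR; neither definition is spelled out on the slide, and the sample values are synthetic rather than the speech data.

```python
import math

def mean_power(samples):
    """Mean squared amplitude of a signal."""
    return sum(s * s for s in samples) / len(samples)

def sir_db(target, interference):
    """Signal-to-interference ratio in dB (ratio of mean powers)."""
    return 10.0 * math.log10(mean_power(target) / mean_power(interference))

# Synthetic example: interference is twice the target amplitude before
# separation and half the target amplitude afterwards.
target = [1.0, -1.0] * 50
interference_before = [2.0, -2.0] * 50
interference_after = [0.5, -0.5] * 50

sir_original = sir_db(target, interference_before)
sir_separated = sir_db(target, interference_after)
sir_gain = sir_separated - sir_original  # assumed definition of "SIR gain"
```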
Source Estimation and Tracking
• Examples used known source locations. In many deployment scenarios, this is acceptable.
• More versatility could be provided by finding source locations and tracking
• Two approaches considered
  – 3D histogram approach
  – Clustering using plastic self organising map
Results
• 2 Speakers – Directional Geodesic Histogram
Position of peaks at (0,0) and (10,20) degrees
Blur between peaks due to 2 sources only approximating the assumptions
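The idea of locating sources from histogram peaks can be sketched as follows. This simplified version bins direction estimates on a plain azimuth/elevation grid rather than the geodesic histogram named on the slide, and the estimates are synthetic values scattered around (0,0) and (10,20) degrees. The 5 degree bin width and the function names are assumptions for illustration.

```python
from collections import Counter

def doa_histogram(doas_deg, bin_width_deg=5.0):
    """Count direction estimates per (azimuth, elevation) grid cell."""
    counts = Counter()
    for az, el in doas_deg:
        cell = (round(az / bin_width_deg), round(el / bin_width_deg))
        counts[cell] += 1
    return counts

def strongest_directions(counts, n_sources, bin_width_deg=5.0):
    """Return the centres of the n most populated cells, strongest first."""
    return [(ia * bin_width_deg, ie * bin_width_deg)
            for (ia, ie), _ in counts.most_common(n_sources)]

# Synthetic per-bin DOA estimates: two clusters plus one stray value
# lying between them (the "blur" between peaks).
estimates = [(0.5, -0.8), (-1.0, 0.6), (0.2, 0.1), (1.1, -0.4),
             (10.3, 19.5), (9.6, 20.4), (10.9, 20.2),
             (5.0, 11.0)]
peaks = strongest_directions(doa_histogram(estimates), n_sources=2)
```

The two returned peaks could then serve as the estimated source positions for the separation filter.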
Signal Classification
• What features? TDSC
• Which classifier? ANN – MLP, LVQ
• Which sounds? ISRIE sound categories
Time-Domain Signal Coding
• Purely time-domain technique
• Successfully used for:
  – Species recognition (birds, crickets, bats, wood-boring insects)
  – Heart sound recognition
• Current applications
  – Environmental sound
  – Vehicle recognition
Time-Domain Signal Coding
[Figure: signal against time, with one epoch marked]
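A minimal sketch of splitting a signal into epochs, assuming an epoch is the run of samples between successive zero crossings, its duration is the number of samples, and its shape is the number of local minima in the rectified epoch. The shape definition is an assumption; the slides name shape without defining it.

```python
import math

def describe_epoch(epoch):
    """Return (duration, shape) for one epoch.

    Assumption: shape = number of strict local minima of |sample|.
    """
    mags = [abs(s) for s in epoch]
    minima = sum(1 for i in range(1, len(mags) - 1)
                 if mags[i] < mags[i - 1] and mags[i] < mags[i + 1])
    return len(epoch), minima

def tdsc_epochs(samples):
    """Split a signal at sign changes and describe each complete epoch."""
    out = []
    start = 0  # the first epoch starts at sample 0, so it may be partial
    for n in range(1, len(samples)):
        if (samples[n - 1] >= 0) != (samples[n] >= 0):  # sign change
            out.append(describe_epoch(samples[start:n]))
            start = n
    return out  # the trailing incomplete epoch is discarded

# 100 Hz sine sampled at 8 kHz, offset by half a sample so that no
# sample falls exactly on a zero crossing.
signal = [math.sin(2 * math.pi * (n + 0.5) / 80) for n in range(400)]
codes = tdsc_epochs(signal)
```

For this pure tone every epoch is a smooth half-cycle of 40 samples with no interior minima, so each code is `(40, 0)`.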
Multiscale TDSC (MTDSC)
• New method of D-S data presentation
  – Replaces S-matrix, A-matrix or D-matrix
• Multiscale
  – Made from groups of epochs in powers of 2 (512, 256, etc.)
• Inspired by wavelets
MTDSC
Level
1   S1(1) S1(2) S1(3) S1(4) S1(5) S1(6) S1(7) S1(8)
2   S2(1) S2(2) S2(3) S2(4)
3   S3(1) S3(2)
4   S4
(each row spans 1 frame of epochs)
Value in frame, n = 4:
$$S_n = \frac{1}{2^{n-1}} \sum_{j=1}^{2^{n-1}} S_1(j)$$
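The four-level layout above (8, 4, 2 and 1 values per frame) can be sketched as repeated pairwise averaging. Treating each value as the mean of two values from the previous level is an assumption, and `mtdsc_levels` and the sample shape values are hypothetical.

```python
def mtdsc_levels(level1):
    """Build the multiscale levels by averaging adjacent pairs.

    Assumption: each value at level n+1 is the mean of two neighbouring
    values at level n. The input length must be a power of 2.
    """
    n = len(level1)
    if n == 0 or n & (n - 1):
        raise ValueError("frame length must be a power of 2")
    levels = [list(level1)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([(prev[i] + prev[i + 1]) / 2
                       for i in range(0, len(prev), 2)])
    return levels

# Eight hypothetical level-1 shape values for one frame.
shapes = [0, 2, 1, 1, 3, 1, 0, 4]
pyramid = mtdsc_levels(shapes)
```

Here `pyramid[3]` holds the single level-4 value, which equals the mean of all eight level-1 values.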
MTDSC Example
Logarithmic chirp – 100 Hz to 24 kHz
Epoch frame length 2^m
MTDSC (cont)
• Currently use shape but will investigate:
  – epoch duration (zero-crossings interval) only
  – epoch duration and shape
  – epoch duration, shape and energy
• Also use mean; can also use variance and higher-order statistics for larger values of m (e.g. 9)
MTDSC Results (1)
[Diagram: Audio → MTDSC data generation & stacking → 3-output LVQ network → outputs 1, 2, 3]
• Winning output determines result
• Overall network accuracy: 76%
• Some categories better than others
  – Road, Rail – 93%
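The "winning output" rule can be illustrated with a toy learning vector quantisation (LVQ) classifier. This is a generic LVQ1 sketch with one prototype per class, not the network used in ISRIE: the two-dimensional feature vectors, the "other" label, the learning rate and the number of passes are all invented for illustration.

```python
def nearest(prototypes, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(range(len(prototypes)),
               key=lambda i: sum((p - v) ** 2
                                 for p, v in zip(prototypes[i][0], x)))

def train_lvq1(prototypes, data, rate=0.1, passes=20):
    """LVQ1: pull the winning prototype towards a sample of its own class,
    push it away from a sample of a different class."""
    for _ in range(passes):
        for x, label in data:
            i = nearest(prototypes, x)
            vec, proto_label = prototypes[i]
            sign = 1.0 if proto_label == label else -1.0
            prototypes[i][0] = [p + sign * rate * (v - p)
                                for p, v in zip(vec, x)]
    return prototypes

def classify(prototypes, x):
    """The winning (nearest) prototype determines the result."""
    return prototypes[nearest(prototypes, x)][1]

# Hypothetical training data for three output classes.
data = [([0.1, 0.2], "road"), ([0.2, 0.1], "road"),
        ([0.9, 0.8], "rail"), ([0.8, 0.9], "rail"),
        ([0.1, 0.9], "other"), ([0.2, 0.8], "other")]
prototypes = [[[0.0, 0.0], "road"], [[1.0, 1.0], "rail"], [[0.0, 1.0], "other"]]
train_lvq1(prototypes, data)
result = classify(prototypes, [0.85, 0.85])
```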
MTDSC Results (2)
• 3 different Japanese cicada species used for biodiversity studies (2 common, 1 rare) in northern Japan
• 21 test files from field recordings including 1 with -6dB SNR
• Backpropagation MLP classifier
• 20 out of 21 test files correctly classified
  – ~95% accuracy
Practical ISRIE
[Block diagram: Sensor → Localisation + Separation → Classification, within ISRIE. Outputs: (alt, az) location; duration, SPL, LEQ; category. User-supplied data: approximate location, required sound category.]
Restricting Location
[Figure: cone of acceptance around the target, with automatic rejection of signals]
Further Automated Analysis
• At present, ISRIE only provides a classified sound element in a small range of categories
• Can we create a soundscape description language (SDL)?
• Needs to be flexible enough to accommodate manually and automatically generated soundscapes
• Take inspiration from speech recognition, natural language, bioacoustics (e.g. automated ID of insects, birds, bats, cetaceans)
sonotag = (L,,d,t,D,a,c,p,G)
where L = label
= direction of sound
d = estimated distance to sound
t = onset time
D = duration
a = received sound pressure level
c = classification (a = automatic, m = manual)
p = level of confidence in classification
G = geotag = G(ll, lo, al), where ll = latitude, lo = longitude, al = altitude
• Other possibilities exist
Example of Monaural Sonotags
18 s recording of O. viridulus at a nature reserve in Yorkshire in 2003
(O. viridulus,,1,11:45,2,50,a,0.99,(53.914,-0.845,10))
(O. viridulus,,1,11:50,1.5,50,a,0.99,(53.914,-0.845,10))
(plane,,100,11:52.5,5,35,a,0.96,(53.914,-0.845,10))
(Bird1,,100,12:02,5,41,a,0.99,(53.914,-0.845,10))
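The tuples above can be held in a small record type so that they can be filtered or exported. This is a minimal sketch, assuming the direction field is left empty (`None`) for monaural recordings, onset time is kept as written, and confidence lies between 0 and 1. The class and field names are hypothetical, since the slides define only the tuple layout.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Sonotag:
    label: str                                # L
    direction: Optional[Tuple[float, float]]  # direction of sound; None if monaural
    distance: float                           # d, estimated distance to sound
    onset: str                                # t, onset time as written
    duration: float                           # D
    level: float                              # a, received sound pressure level
    classification: str                       # c: "a" = automatic, "m" = manual
    confidence: float                         # p, assumed to lie in [0, 1]
    geotag: Tuple[float, float, float]        # G = (latitude, longitude, altitude)

    def __post_init__(self):
        if self.classification not in ("a", "m"):
            raise ValueError("classification must be 'a' or 'm'")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")

# Two of the monaural examples, entered field by field.
tags = [
    Sonotag("O. viridulus", None, 1, "11:45", 2, 50, "a", 0.99,
            (53.914, -0.845, 10)),
    Sonotag("plane", None, 100, "11:52.5", 5, 35, "a", 0.96,
            (53.914, -0.845, 10)),
]
confident = [t.label for t in tags if t.confidence >= 0.99]
```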
Example of 3-D Sonotags
(speaker2,0,0,1.5,14:00,5,43,a,0.96,(53.9,-0.9,10))
(speaker1,10,20,2,14:00,5,42,a,0.92,(53.9,-0.9,10))
Treat separated sounds as monaural recordings for classification
Applications (1)
• BS 4142 assessments
• PPG 24 assessments
• Noise nuisance applications
• Other acoustic consultancy problems
• Soundscape recordings
• Future noise policy
Applications (2)
• Biodiversity assessment, endangered species monitoring
• Alien invasive species (e.g. Cane Toad in Australia)
• Anthropogenic noise effects on animals
• Habitat fragmentation
• Tranquility studies
Conclusions
• ISRIE has been shown to be successful in separating and classifying urban sounds
  – much work still to be done, especially in classification
• Automated soundscape description is possible but a flexible and formal framework is needed