63
1 Do-It-Yourself SemanticAudio Jörn Loviscach Fachhochschule Bielefeld, Germany (Bielefeld University of Applied Sciences) The p r es en t u men t styl e o f t h i s t u t o r ia l isd u e to t ec h nic alr eq u ir e me n ts o f t h e c on fe r e n c e . G ive n a c h oic e, I wo u l d h ave us ed h a n d ske tc hesinste ad .

DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

Embed Size (px)

Citation preview

Page 1: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

1

Do-It-Yourself

Semantic Audio

Jörn LoviscachFachhochschule Bielefeld, Germany

(Bielefeld University of Applied Sciences)

The presentument style of this tutorial

is due to technical requirements of the conference.

Given a choice, I would have used hand sketches instead.

Page 2: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

2

The Funnel Principle

��������������������������������Meaning

Feature Extraction

Dimension Reduction

Machine Learning

Visualization

Waveform

User Interfaces

Page 3: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

3

Some Applications:

Music Information Retrieval (1)

Find music similar to music that the listener likes

Known Files

New File

Classification?

Page 4: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

4

Some Applications:

Music Information Retrieval (2)

Find music that fits to walking/jogging

Slow Fast

Pick an

appropriate

tempo

Page 5: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

5

Some Applications:

Music Information Retrieval (3)

Extract the chorus of a song

Find the most

prominent

repeated part

Page 6: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

6

Some Applications:

Music Information Retrieval (4)

Segment radio archives: news, music, ads, etc.

Cluster temporal evolution

and classify those clusters

Page 7: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

7

Some Applications:

Forensics

Detect gunshots in surveillance recordings

Find and classify

acoustic events

Page 8: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

8

Some Applications:

Language Learning

Identify the accent of a speaker

Recognize phonemes

and classify their timbre

Page 9: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

9

Some Applications:

Music Making

Control digital musical instruments acoustically

Cool

Algorithm

Page 10: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

10

Objective of this Tutorial

Get going

• for free

• without C++ programming

Basic methods of

• Feature Extraction

• Machine Learning

Page 11: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

11

Agenda

• The software landscape

• Basic feature extraction:

– Sonic Visualizer

– jAudio and Excel

• Feature extraction and machine learning:

– jAudio and WEKA

– MIRtoolbox in MATLAB®

• Real-time applications:

– timbreID in Pure Data

Page 12: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

12

Agenda

• The software landscape

• Basic feature extraction:

– Sonic Visualizer

– jAudio and Excel

• Feature extraction and machine learning:

– jAudio and WEKA

– MIRtoolbox in MATLAB®

• Real-time applications:

– timbreID in Pure Data Questio

ns so fa

r?

Page 13: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

13

The Software Landscape:

Scope

��������������������������������Meaning

Feature Extraction

Dimension Reduction

Machine Learning

Visualization

Waveform

User Interfaces

Page 14: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

14

The Software Landscape:

Offline vs. Real Time

• Offline processing

Currently the typical mode

• Real-time processing

Applications:

– Score following & chord recognition for live music

– Live control of digital musical instruments

Page 15: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

15

The Software Landscape:

Packaging

Many shapes and forms …

Page 16: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

16

Feature Extraction

Machine Learing

User Interface

Offline

Real-time

C/C++/Java

MATLAB®

Pure Data, Max/MSP

Stand-alone

Web Service

Active Development

Web Address

MARSYAS � � � � � � � � � � � http://marsyas.info/

CLAM � � � � � � � � � � � http://clam-project.org/

openSMILE � � � � � � � � � � � http://opensmile.sourceforge.net/

LibXtract � � � � � � � � � � � http://sourceforge.net/projects/libxtract/

Aubio � � � � � � � � � � � http://aubio.org/

jAudio (part of jMIR) � � � � � � � � � � � http://sourceforge.net/projects/jaudio/

Music-to-Knowledge (M2K) � � � � � � � � � � � http://www.music-ir.org/evaluation/m2k/

Maate � � � � � � � � � � � http://lwn.net/2002/0321/a/maate.php3

MPEG-7 Audio Encoder � � � � � � � � � � � http://mpeg7audioenc.sourceforge.net/

MIRtoolbox � � � � � � � � � � � ht t ps:/ / www.jyu.f i/ hum/ lait okset / musiikki/ en/ research/ coe/ mat erials/ mirt oolbox/ mirt oolbox

MA Toolbox � � � � � � � � � � � http://www.ofai.at/~elias.pampalk/ma/

Computer Audition Toolbox (CATbox) � � � � � � � � � � � http://cosmal.ucsd.edu/cal/projects/CATbox/catbox.htm

MPEG-7 XM � � � � � � � � � � � http://mpeg7.doc.gold.ac.uk/mirror/index.html

PsySound3 � � � � � � � � � � � http://psysound.wikidot.com/

IPEM Toolbox � � � � � � � � � � � http://www.ipem.ugent.be/?q=node/27

Soundspotter � � � � � � � � � � � http://soundspotter.org/

timbreID � � � � � � � � � � � http://williambrent.conflations.com/pages/research.html

MuBu � � � � � � � � � � � http://imtr.ircam.fr/imtr/MuBu

Chuck � � � � � � � � � � � http://chuck.cs.princeton.edu/

Sonic Visualiser � � � � � � � � � � � http://www.sonicvisualiser.org/

Sonic Annotator � � � � � � � � � � � http://www.omras2.org/SonicAnnotator

Praat � � � � � � � � � � � http://www.fon.hum.uva.nl/praat/

EchoNest � � � � � � � � � � � http://developer.echonest.com/

MPEG-7 Audio Analyzer � � � � � � � � � � � http://mpeg7lld.nue.tu-berlin.de/

Page 17: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

17

Feature Extraction

Machine Learing

User Interface

Offline

Real-time

C/C++/Java

MATLAB®

Pure Data, Max/MSP

Stand-alone

Web Service

Active Development

Web Address

MARSYAS � � � � � � � � � � � http://marsyas.info/

CLAM � � � � � � � � � � � http://clam-project.org/

openSMILE � � � � � � � � � � � http://opensmile.sourceforge.net/

LibXtract � � � � � � � � � � � http://sourceforge.net/projects/libxtract/

Aubio � � � � � � � � � � � http://aubio.org/

jAudio (part of jMIR) � � � � � � � � � � � http://sourceforge.net/projects/jaudio/

Music-to-Knowledge (M2K) � � � � � � � � � � � http://www.music-ir.org/evaluation/m2k/

Maate � � � � � � � � � � � http://lwn.net/2002/0321/a/maate.php3

MPEG-7 Audio Encoder � � � � � � � � � � � http://mpeg7audioenc.sourceforge.net/

MIRtoolbox � � � � � � � � � � � ht t ps:/ / www.jyu.f i/ hum/ lait okset / musiikki/ en/ research/ coe/ mat erials/ mirt oolbox/ mirt oolbox

MA Toolbox � � � � � � � � � � � http://www.ofai.at/~elias.pampalk/ma/

Computer Audition Toolbox (CATbox) � � � � � � � � � � � http://cosmal.ucsd.edu/cal/projects/CATbox/catbox.htm

MPEG-7 XM � � � � � � � � � � � http://mpeg7.doc.gold.ac.uk/mirror/index.html

PsySound3 � � � � � � � � � � � http://psysound.wikidot.com/

IPEM Toolbox � � � � � � � � � � � http://www.ipem.ugent.be/?q=node/27

Soundspotter � � � � � � � � � � � http://soundspotter.org/

timbreID � � � � � � � � � � � http://williambrent.conflations.com/pages/research.html

MuBu � � � � � � � � � � � http://imtr.ircam.fr/imtr/MuBu

Chuck � � � � � � � � � � � http://chuck.cs.princeton.edu/

Sonic Visualiser � � � � � � � � � � � http://www.sonicvisualiser.org/

Sonic Annotator � � � � � � � � � � � http://www.omras2.org/SonicAnnotator

Praat � � � � � � � � � � � http://www.fon.hum.uva.nl/praat/

EchoNest � � � � � � � � � � � http://developer.echonest.com/

MPEG-7 Audio Analyzer � � � � � � � � � � � http://mpeg7lld.nue.tu-berlin.de/

Libraries for C/C++ or Java

Toolboxes for MATLAB®

Extensions for Pure Data and Max/MSP

Stand-alone solutions

Web Services

Page 18: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

18

Agenda

• The software landscape

• Basic feature extraction:

– Sonic Visualizer

– jAudio and Excel

• Feature extraction and machine learning:

– jAudio and WEKA

– MIRtoolbox in MATLAB®

• Real-time applications:

– timbreID in Pure Data Questio

ns so fa

r?

Page 19: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

19

Sonic Visualizer

��������������������������������Meaning

Feature Extraction

Dimension Reduction

Machine Learning

Visualization

Waveform

User Interfaces

Page 20: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

20

Sonic Visualiser

• Manual and automated markup

• Many feature extractors available;

install in C:\Program Files (x86)\Vamp Plugins

• Great for experiments with feature extraction

• Things to see and try:

– Details next to mouse pointer

– Draw musical notes

– Align timelines of two versions

of a recording (plug-in)

Page 21: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

21

Male/Female Segmentation

• Add new pane; add spectrogram

• Window: 32,768 samples; vert. axis logarithmic

• Add new time instants layer

• Add markers

• Plot type: segmentation

• Name markers (edit layer data)

• Edit markers if needed

• Export annotation layer

Page 22: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

22

Agenda

• The software landscape

• Basic feature extraction:

– Sonic Visualizer

– jAudio and Excel

• Feature extraction and machine learning:

– jAudio and WEKA

– MIRtoolbox in MATLAB®

• Real-time applications:

– timbreID in Pure Data Questio

ns so fa

r?

Page 23: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

23

jAudio and Excel

��������������������������������Meaning

Feature Extraction

Dimension Reduction

Machine Learning

Visualization

Waveform

User Interfaces

Page 24: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

24

jAudio: the Program

• Feature extractor

• Graphical user interface and command line

• Java-based

• Multi-threaded

• Batch processing (add multiple files at once!)

• Export e.g. as ACE (XML-based);

nice for Excel

Page 25: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

25

jAudio: Catches

• Install as admin

• Override standard heap size:

No double-click to start, rather

java -Xmx1024M -jar jAudio.jar

in the directory of the jar. (Batch file!)

• No ä or é in audio file names:

XML output broken

• XML and ARFF: cleartext. Huge files!

Export as few values as possible.

Page 26: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

26

Sorting Files by Loudness

• Set paths for output files

• Do not export standard deviation

(Alter Aggregators, click Save!)

• For each file, extract overall mean

of root mean square

• Import into Microsoft Excel

• Sort and plot

Page 27: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

27

Sorting Sounds by Brightness

• These are different ways

of measuring brightness:

– Number or rate of zero crossings

– Spectral centroid

– Spectral rolloff point

• jAudio: for each file, extract overall mean

• Import into Microsoft Excel

• Sort and/or plot (x = item number)

Page 28: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

28

Zero Crossings

• Number or rate of sign changes

• Related to frequency and noise content

• Independent of volume

• Issue: sensitive to noise and harmonics

t

Page 29: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

29

Spectral Centroid

• Mean frequency (center of mass)

of the power spectrum (linear or log freq.)

• Independent of volume (if √Power)

f

Power

Page 30: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

30

Spectral Rolloff Point

• Determine the frequency that divides

the audio power 85:15 (for instance)

• Independent of volume (if √Power)

• Fluctuating with empty spectral regions

f

Power

85% 15%

Page 31: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

31

Sorting Music by Tempo

• Demos with Sonic Visualiser:

– Note onsets

– Beat and bar tracker

• jAudio:

for each file, extract mean

of strongest beat

• Import into Microsoft Excel

• Sort and/or plot (x = item number)

Page 32: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

32

Strongest Beat

• Compute envelope

• Compute autocorrelation

• Return inverse of time lag

of maximum autocorrelation (except 0)

t

Envelope

Lag

Page 33: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

33

0.0

0.2

0.4

0.6

0.8

1.0

0 50 100 150 200 250

bpm

Ac

cu

mu

late

d V

alu

e

Strongest Beat

• Issue with ambiguity:

jAudio picks the maximum histogram bin

• Could improve that in Excel

by extracting the full histogram

Page 34: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

34

Agenda

• The software landscape

• Basic feature extraction:

– Sonic Visualizer

– jAudio and Excel

• Feature extraction and machine learning:

– jAudio and WEKA

– MIRtoolbox in MATLAB®

• Real-time applications:

– timbreID in Pure Data Questio

ns so fa

r?

Page 35: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

35

jAudio and WEKA

��������������������������������Meaning

Feature Extraction

Dimension Reduction

Machine Learning

Visualization

Waveform

User Interfaces

Page 36: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

36

WEKA: the Program

Huge collection of machine learning algorithms

• Clustering: unsupervised machine learning

• Classification: supervised machine learning

��

��

Page 37: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

37

WEKA: the Program

• Great for experiments

• ARFF: Plaintext file format for input data,

one of the two formats written by jAudio

• Java-based

• In RunWeka.ini:

maxheap=512m

Page 38: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

38

Clustering Sounds by Similarity

• Demo: vowel sounds

• jAudio: extract MFCC averages

• Export as ARFF (change file extension!)

• Import into WEKA Explorer: Preprocess

• Retain only the means of MFCCs 1…12

• Cluster:

– Store clusters for visualization

– Visualize cluster assignments

Page 39: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

39

MFCCs: Mel-Frequency

Cepstral Coefficients (1)

Rough idea of what the ear sends to the brain

• Step 1: Short-time spectrum

in perceived frequency scale

0

500

1000

1500

2000

2500

3000

3500

10 100 1000 10000

Hz

Me

l

1

6

11

16

21

Ba

rk Mel

Bark

Page 40: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

40

MFCCs: Mel-Frequency

Cepstral Coefficients (2)

• Step 2: Compute approximate perceived

loudness: log of power

• Intermediate result: spectrum as perceived

Hz

Watt

Mel

dB

Page 41: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

41

MFCCs: Mel-Frequency

Cepstral Coefficients (3)

• Step 3: Describe the overall shape

of this spectrum

• Do this through a mixture of cosine shapes

• MFCCs = the amounts of the different cosines

− 0.03 * + 0.52 *

− 0.12 * + 0.03 * + …

Mel

dB

Page 42: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

42

MFCCs: Mel-Frequency

Cepstral Coefficients (4)

• Demo with Sonic Visualizer

• MFCC 0 ist just the audio level:

Discard it to be independent of level

• Fine structure of spectrum is ignored

• What MFCCs are not designed to do:

– Tell different fundamental frequencies apart

– Distinguish harmonic/inharmonic/noise

• Demo: vocal percussion

Page 43: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

43

Clustering: k-Means

• Input: data points, number of clusters (guess)

• Pick random centers

for clusters

• Iterate:

– Assign each data point

to the nearest center

– New center = centroid

of all points assigned

• Output: classification and centers

Page 44: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

44

Clustering:

Expectation Maximization (EM)

• Input: data points, number of clusters (guess)

• Pick random centers/sizes

for clusters

• Iterate:

– Assign each data point

to the most probable center

– New center/size according

to points assigned

• Output: classification and centers

Page 45: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

45

Clustering: Caveats

• Metric structure ≈ perception?

• Are all data dimensions of the right scale?

– Weka: Visualize All

– Weka: Standardize, Math Expression, …

• Vital when combining different features

Page 46: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

46

Music Classification

• jAudio: extract MFCC averages

• Add to ARFF file:

– @ATTRIBUTE class {classical, jazz, pop, rock}

– Class of each file

• Import into WEKA Explorer

• Classify

• Visualize classifier errors

Page 47: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

47

Classification:

k Nearest Neighbors (kNN)

• Input:

– Classified exemplars

– The number k

– The item x to be classified

• Find the k exemplars

nearest to x

• Vote by majority

among them

?

Page 48: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

48

Classification:

Zoo of Methods

• Neural Networks

• Support-Vector Machines

• and dozens more

Page 49: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

49

Agenda

• The software landscape

• Basic feature extraction:

– Sonic Visualizer

– jAudio and Excel

• Feature extraction and machine learning:

– jAudio and WEKA

– MIRtoolbox in MATLAB®

• Real-time applications:

– timbreID in Pure Data Questio

ns so fa

r?

Page 50: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

50

MIRtoolbox in MATLAB®

��������������������������������Meaning

Feature Extraction

Dimension Reduction

Machine Learning

Visualization

Waveform

User Interfaces

Page 51: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

51

MIRtoolbox: the software

• All in one well-designed package,

great for experimentation:

– Low-level features

– Dimension reduction

– Machine Learning

• Requires MATLAB®, which is costly

• Slower than Java or C++,

even though intermediate results are reused

Page 52: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

52

Music Classification:

Improvements (1)

Not mean of MFCCs, but statistical model

• Ignoring time order:

– k-Means

– Gaussian Mixture Model (GMM)

• With time oder:

– Hidden Markov Model (HMM)

MFCC

space

Page 53: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

53

Music Classification:

Improvements (2)

��������������������������������Meaning

Feature Extraction

Dimension Reduction

Machine Learning

Visualization

Waveform

User Interfaces

Page 54: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

54

Music Classification:

Improvements (3)

Not mean of MFCCs, but statistical model

• Ignoring time order:

– k-Means

– Gaussian Mixture Model (GMM)

• With time oder:

– Hidden Markov Model (HMM)

MFCC

space

Page 55: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

55

Music Classification:

Improvements (4)

Not mean of MFCCs, but statistical model

• Ignoring time order:

– k-Means

– Gaussian Mixture Model (GMM)

• With time oder:

– Hidden Markov Model (HMM)

MFCC

space

Page 56: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

56

Music Classification:

Improvements (5)

Not mean of MFCCs, but statistical model

• Ignoring time order:

– k-Means

– Gaussian Mixture Model (GMM)

• With time oder:

– Hidden Markov Model (HMM)

MFCC

space

Page 57: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

57

MIRtoolbox in Action

• Classify audio files by music genre

• Training set, test set:

add prefixes to the files, e.g., p, r, j, c

• Extract features,

condense by GMM,

classify by Bayes

Page 58: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

58

Agenda

• The software landscape

• Basic feature extraction:

– Sonic Visualizer

– jAudio and Excel

• Feature extraction and machine learning:

– jAudio and WEKA

– MIRtoolbox in MATLAB®

• Real-time applications:

– timbreID in Pure Data Questio

ns so fa

r?

Page 59: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

59

timbreID in PureData

��������������������������������Meaning

Feature Extraction

Dimension Reduction

Machine Learning

Visualization

Waveform

User Interfaces

Page 60: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

60

timbreID in Action

• Low-level features

• k-NN classification

• Clustering of

training exemplars

• Generate/control MIDI data, audio signals, …

Page 61: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

61

Outlook

Page 62: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

62

Outlook:

Semantic Audio via the Internet

• EchoNest: Web Service for

Music Information Retrieval

• Collect data from the users

• Keep waveforms

(large, expensive, sensitive)

away from the end user

• Mashups of Web Services?

• Real time, too??

Page 63: DIY Semantic Audio - Jörn Loviscach Feature Extraction Machine Learing User Interface Offline Real-time C/C++/Java MATLAB® Pure Data, Max/MSP Stand-alone Web Service Active Development

63

Thank you!

www.j3L7h.de

Questio

ns?