Music Processing. Meinard Müller. Lecture: Audio Decomposition. International Audio Laboratories Erlangen.

2018 Mueller MP-AudioDecomp - Amazon S3

Page 1: 2018 Mueller MP-AudioDecomp - Amazon S3

Music Processing

Meinard Müller

Lecture

Audio Decomposition

International Audio Laboratories Erlangen

Page 2: 2018 Mueller MP-AudioDecomp - Amazon S3

Book: Fundamentals of Music Processing

Meinard Müller: Fundamentals of Music Processing. Audio, Analysis, Algorithms, Applications. 483 p., 249 illus., hardcover. ISBN: 978-3-319-21944-8. Springer, 2015.

Accompanying website: www.music-processing.de

Page 5: 2018 Mueller MP-AudioDecomp - Amazon S3

Chapter 8: Audio Decomposition

In the final Chapter 8 on audio decomposition, we present a challenging research direction that is closely related to source separation. Within this wide research area, we consider three subproblems: harmonic–percussive separation, main melody extraction, and score-informed audio decomposition. Within these scenarios, we discuss a number of key techniques including instantaneous frequency estimation, fundamental frequency (F0) estimation, spectrogram inversion, and nonnegative matrix factorization (NMF). Furthermore, we encounter a number of acoustic and musical properties of audio recordings that have been introduced and discussed in previous chapters, which rounds off the book.

8.1 Harmonic-Percussive Separation
8.2 Melody Extraction
8.3 NMF-Based Audio Decomposition
8.4 Further Notes

Page 6: 2018 Mueller MP-AudioDecomp - Amazon S3

Why is Music Processing Challenging?

Example: Chopin, Mazurka Op. 63 No. 3

Page 7: 2018 Mueller MP-AudioDecomp - Amazon S3

Why is Music Processing Challenging?

Waveform

Example: Chopin, Mazurka Op. 63 No. 3

[Figure: waveform; x-axis: Time (seconds), y-axis: Amplitude]

Page 8: 2018 Mueller MP-AudioDecomp - Amazon S3

Why is Music Processing Challenging?

Waveform / Spectrogram

Example: Chopin, Mazurka Op. 63 No. 3

[Figure: spectrogram; x-axis: Time (seconds), y-axis: Frequency (Hz)]

Page 9: 2018 Mueller MP-AudioDecomp - Amazon S3

Why is Music Processing Challenging?

Waveform / Spectrogram

Performance
– Tempo
– Dynamics
– Note deviations
– Sustain pedal

Example: Chopin, Mazurka Op. 63 No. 3

Page 10: 2018 Mueller MP-AudioDecomp - Amazon S3

Why is Music Processing Challenging?

Waveform / Spectrogram

Performance
– Tempo
– Dynamics
– Note deviations
– Sustain pedal

Polyphony
– Main melody
– Additional melody line
– Accompaniment

Example: Chopin, Mazurka Op. 63 No. 3

Page 11: 2018 Mueller MP-AudioDecomp - Amazon S3

Source Separation

Decomposition of audio stream into different sound sources

Central task in digital signal processing

“Cocktail party effect”

Page 12: 2018 Mueller MP-AudioDecomp - Amazon S3

Source Separation

Decomposition of audio stream into different sound sources

Central task in digital signal processing

“Cocktail party effect”

Several input signals

Sources are assumed to be statistically independent

Page 13: 2018 Mueller MP-AudioDecomp - Amazon S3

Source Separation (Music)

Main melody, accompaniment, drum track

Instrumental voices

Individual note events

Only mono or stereo

Sources are often highly dependent

Page 14: 2018 Mueller MP-AudioDecomp - Amazon S3

Harmonic-Percussive Decomposition

Mixture:

Page 15: 2018 Mueller MP-AudioDecomp - Amazon S3

Harmonic-Percussive Decomposition

Mixture:

Harmonic component: clearly harmonic sounds

Percussive component: clearly percussive sounds

Page 16: 2018 Mueller MP-AudioDecomp - Amazon S3

Harmonic-Percussive Decomposition

Mixture:

Harmonic component: clearly harmonic sounds

Percussive component: clearly percussive sounds

Residual component

Page 17: 2018 Mueller MP-AudioDecomp - Amazon S3

Harmonic-Percussive Decomposition

Mixture:

Harmonic component
• Clearly harmonic sounds of singing voice and accompaniment

Percussive component
• Drum hits
• Fricatives & plosives in singing voice

Residual component
• Noise-like sounds
• Vibrato/glissando sounds

Demo: https://www.audiolabs-erlangen.de/resources/2014-ISMIR-ExtHPSep/

Literature: [Driedger/Müller/Disch, ISMIR 2014]
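
A three-way split into harmonic, percussive, and residual parts can be sketched with median filtering of a magnitude spectrogram: filtering along time emphasizes horizontal (harmonic) structures, filtering along frequency emphasizes vertical (percussive) structures, and bins that are neither clearly one nor the other go to the residual. The slides do not spell out the algorithm, so the filter lengths, the separation factor `beta`, the toy spectrogram, and all function names below are assumptions made for illustration.

```python
import numpy as np

def median_filter(Y, length, axis):
    """Median-filter a 2-D array along one axis (odd length, edge padding)."""
    pad = length // 2
    pad_width = [(0, 0), (0, 0)]
    pad_width[axis] = (pad, pad)
    Yp = np.pad(Y, pad_width, mode="edge")
    out = np.empty(Y.shape, dtype=float)
    for i in range(Y.shape[axis]):
        window = np.take(Yp, np.arange(i, i + length), axis=axis)
        index = [slice(None), slice(None)]
        index[axis] = i
        out[tuple(index)] = np.median(window, axis=axis)
    return out

def hpr_masks(Y, len_time=5, len_freq=5, beta=2.0, eps=1e-10):
    """Binary masks for harmonic, percussive, and residual bins.

    Y is a magnitude spectrogram (rows: frequency, columns: time).
    beta > 1 controls how "clearly" harmonic or percussive a bin must be.
    """
    Y_h = median_filter(Y, len_time, axis=1)  # smooth along time
    Y_p = median_filter(Y, len_freq, axis=0)  # smooth along frequency
    M_h = Y_h / (Y_p + eps) > beta
    M_p = Y_p / (Y_h + eps) >= beta
    M_r = ~(M_h | M_p)                        # everything else
    return M_h, M_p, M_r

# Toy spectrogram (assumed data): one sustained tone and one click.
Y = np.full((16, 16), 0.01)
Y[4, :] = 1.0   # horizontal line: harmonic
Y[:, 8] = 1.0   # vertical line: percussive
M_h, M_p, M_r = hpr_masks(Y)
```

Multiplying each mask with the complex STFT of the mixture and inverting the result would give the three component signals. A larger `beta` moves more energy into the residual.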

Page 18: 2018 Mueller MP-AudioDecomp - Amazon S3

Singing Voice Extraction

Singing voice Accompaniment

Original Recording

Page 19: 2018 Mueller MP-AudioDecomp - Amazon S3

Singing Voice Extraction

Original recording → HPR → Harmonic component, Percussive component, Residual component

Further processing stages: MR (with F0 annotation), TR, SL

Estimate singing voice = Harmonic portion singing voice + Fricatives singing voice + Vibrato & formants singing voice

Estimate accompaniment = Harmonic portion accompaniment + Instrument onsets accompaniment + Diffuse instrument sounds accompaniment

[Figure: spectrograms of all intermediate signals; axes: Time, Frequency]

Page 20: 2018 Mueller MP-AudioDecomp - Amazon S3

Score-Informed Source Separation

Exploit musical score to support separation process

[Figure: three panels; axes: Time, Pitch]

Page 21: 2018 Mueller MP-AudioDecomp - Amazon S3

Parametric Model Approach

Rebuild spectrogram information

Steps: Estimate parameters, Render

[Figure: two spectrograms; x-axis: Time (seconds), y-axis: Frequency (Hz)]

Page 22: 2018 Mueller MP-AudioDecomp - Amazon S3

NMF (Nonnegative Matrix Factorization)

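$V \approx W \cdot H$ with $V \in \mathbb{R}_{\geq 0}^{N \times M}$, $W \in \mathbb{R}_{\geq 0}^{N \times K}$, $H \in \mathbb{R}_{\geq 0}^{K \times M}$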

Page 23: 2018 Mueller MP-AudioDecomp - Amazon S3

NMF (Nonnegative Matrix Factorization)

Magnitude Spectrogram (N × M) ≈ Templates (N × K) · Activations (K × M)

Templates: Pitch + Timbre (“How does it sound”)

Activations: Onset time + Duration (“When does it sound”)
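
The factorization can be learned with the standard multiplicative update rules for the Euclidean distance. The sketch below uses the conventional names `V` (magnitude spectrogram), `W` (templates), and `H` (activations); these names, the iteration count, and the toy matrix are assumptions, since the slides show the matrices only as images.

```python
import numpy as np

def nmf(V, K, n_iter=200, eps=1e-9, seed=0):
    """Factorize a nonnegative N x M matrix V into W (N x K) and H (K x M)."""
    rng = np.random.default_rng(seed)
    N, M = V.shape
    W = rng.random((N, K)) + eps   # random nonnegative templates
    H = rng.random((K, M)) + eps   # random nonnegative activations
    for _ in range(n_iter):
        # Multiplicative updates keep all entries nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy "spectrogram" built from two templates (assumed data).
W_true = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
H_true = np.array([[1.0, 1.0, 0.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0, 0.0, 0.0]])
V = W_true @ H_true

W, H = nmf(V, K=2)
error = np.linalg.norm(V - W @ H)   # Euclidean approximation error
```

Because every update is a multiplication by a nonnegative factor, no projection step is needed to enforce the constraints.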

Page 24: 2018 Mueller MP-AudioDecomp - Amazon S3

NMF-Decomposition

Random initialization

[Figure: initialized templates (Frequency vs. Note number) and initialized activations (Note number vs. Time)]

Page 25: 2018 Mueller MP-AudioDecomp - Amazon S3

NMF-Decomposition

Random initialization → No semantic meaning

[Figure: initialized and learnt templates (Frequency vs. Note number); initialized and learnt activations (Note number vs. Time)]

Page 26: 2018 Mueller MP-AudioDecomp - Amazon S3

NMF-Decomposition

Constrained initialization

[Figure: initialized templates (Frequency vs. Note number) and initialized activations (Note number vs. Time)]

Page 27: 2018 Mueller MP-AudioDecomp - Amazon S3

NMF-Decomposition

Constrained initialization

Template constraint for p=55

Activation constraints for p=55

[Figure: initialized templates (Frequency vs. Note number) and initialized activations (Note number vs. Time)]

Page 28: 2018 Mueller MP-AudioDecomp - Amazon S3

NMF-Decomposition

Constrained initialization → NMF as refinement

[Figure: initialized and learnt templates (Frequency vs. Note number); initialized and learnt activations (Note number vs. Time); panels labeled Org and Model]
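
Constrained initialization works because multiplicative updates never turn a zero entry into a nonzero one: whatever the score rules out at initialization stays ruled out, and NMF only refines the remaining entries. The sketch below illustrates this under stated assumptions: the score, the frequency and time grids, the tolerances, and the helper names are all hypothetical, and a random matrix stands in for a real magnitude spectrogram.

```python
import numpy as np

def pitch_to_freq(p):
    """Center frequency in Hz of MIDI pitch p (A4 = 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((p - 69) / 12.0)

def init_template(p, freqs, n_harmonics=8, tol=0.5):
    """Allow energy only within +/- tol semitones of the harmonics of pitch p."""
    w = np.zeros(len(freqs))
    for h in range(1, n_harmonics + 1):
        center = h * pitch_to_freq(p)
        lo = center * 2.0 ** (-tol / 12.0)
        hi = center * 2.0 ** (tol / 12.0)
        w[(freqs >= lo) & (freqs <= hi)] = 1.0
    return w

def init_activation(intervals, times, tol=0.2):
    """Allow activity only near the notated (onset, offset) intervals."""
    h = np.zeros(len(times))
    for onset, offset in intervals:
        h[(times >= onset - tol) & (times <= offset + tol)] = 1.0
    return h

# Hypothetical score: (MIDI pitch, onset in s, offset in s).
score = [(55, 0.5, 1.5), (60, 2.0, 3.0), (55, 3.5, 4.5)]
pitches = sorted({p for p, _, _ in score})
freqs = np.linspace(0.0, 2000.0, 201)   # assumed frequency grid (Hz)
times = np.arange(0.0, 5.0, 0.1)        # assumed frame times (s)

W0 = np.stack([init_template(p, freqs) for p in pitches], axis=1)
H0 = np.stack(
    [init_activation([(s, e) for q, s, e in score if q == p], times)
     for p in pitches],
    axis=0,
)

# Stand-in for a magnitude spectrogram (random, for illustration only).
rng = np.random.default_rng(0)
V = rng.random((len(freqs), len(times)))

W, H = W0.copy(), H0.copy()
eps = 1e-9
for _ in range(50):
    H *= (W.T @ V) / (W.T @ W @ H + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)
```

After the updates, each column of `W` still has energy only around the harmonics of its pitch, and each row of `H` is still zero outside the tolerance zone around its notated notes.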

Page 29: 2018 Mueller MP-AudioDecomp - Amazon S3

Score-Informed Audio Decomposition

[Figure: two spectrogram panels with zoomed excerpts (500 to 580 Hz; marked frequencies 523 Hz and 554 Hz); x-axis: Time (seconds), y-axis: Frequency (Hertz)]

Application: Audio editing
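
For editing, a single note or voice has to be pulled out of the mixture spectrogram. A common way to do this with an NMF model is soft masking: each component claims the share of every time-frequency bin that its template-activation product contributes to the full model. This is a sketch of that general technique, not necessarily the procedure behind the slide; the names `W`, `H`, `V`, and `soft_masks` and the random data are assumptions.

```python
import numpy as np

def soft_masks(W, H, eps=1e-12):
    """One mask per component k: (W[:, k] H[k, :]) / (W H), values in [0, 1]."""
    model = W @ H + eps
    return [np.outer(W[:, k], H[k, :]) / model for k in range(W.shape[1])]

# Random stand-ins for learnt templates, activations, and a spectrogram.
rng = np.random.default_rng(0)
W = rng.random((6, 3))
H = rng.random((3, 8))
V = rng.random((6, 8))

masks = soft_masks(W, H)
parts = [m * V for m in masks]   # component spectrograms
```

The component spectrograms add up to the original spectrogram, so a component can be modified (for example attenuated or shifted in pitch) and recombined with the untouched rest.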

Page 30: 2018 Mueller MP-AudioDecomp - Amazon S3

Informed Drum-Sound Decomposition

Demo: https://www.audiolabs-erlangen.de/resources/MIR/2016-IEEE-TASLP-DrumSeparation

Literature: [Dittmar/Müller, IEEE/ACM-TASLP 2016]

Remix:

Page 31: 2018 Mueller MP-AudioDecomp - Amazon S3

Audio Mosaicing

Source signal: Bees

Target signal: Beatles – Let it be

Mosaic signal: Let it Bee

Demo: https://www.audiolabs-erlangen.de/resources/MIR/2015-ISMIR-LetItBee

Literature: [Driedger/Müller, ISMIR 2015]

Page 32: 2018 Mueller MP-AudioDecomp - Amazon S3

NMF-Inspired Audio Mosaicing

Non-negative matrix factorization (NMF)

Non-negative matrix (fixed) ≈ Components (learned) · Activations (learned)

Proposed audio mosaicing approach

Target’s spectrogram (fixed) ≈ Source’s spectrogram (fixed) · Activations (learned) = Mosaic’s spectrogram

[Figure axes: Frequency vs. Time target (target and mosaic spectrograms); Frequency vs. Time source (source spectrogram); Time source vs. Time target (activations)]

Page 33: 2018 Mueller MP-AudioDecomp - Amazon S3

NMF-Inspired Audio Mosaicing

Spectrogram target ≈ Spectrogram source · Activation matrix = Spectrogram mosaic

[Figure axes: Frequency vs. Time target (target and mosaic spectrograms); Frequency vs. Time source (source spectrogram); Time source vs. Time target (activation matrix)]

Page 34: 2018 Mueller MP-AudioDecomp - Amazon S3

NMF-Inspired Audio Mosaicing

Spectrogram target ≈ Spectrogram source · Activation matrix = Spectrogram mosaic

[Figure axes: Frequency vs. Time target (target and mosaic spectrograms); Frequency vs. Time source (source spectrogram); Time source vs. Time target (activation matrix)]

Core idea: support the development of sparse diagonal activation structures

Iterative updates

Preserve temporal context
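
The idea can be sketched as NMF with the template matrix fixed to the source spectrogram, so that only the activations are learned, plus a step in every iteration that reinforces diagonal runs in the activation matrix. A diagonal run means that consecutive source frames are used for consecutive target frames, which preserves temporal context. This is a strongly simplified sketch: the kernel length, the order of the steps, the function names, and the toy data are assumptions, and the additional sparsity-promoting steps of the cited approach are omitted.

```python
import numpy as np

def diagonal_smooth(H, length=3):
    """Add to each entry its neighbors along the main-diagonal direction."""
    K, M = H.shape
    out = np.zeros_like(H)
    half = length // 2
    for j in range(-half, half + 1):
        # out[k, m] += H[k - j, m - j] wherever both indices are valid.
        src = H[max(0, -j):K - max(0, j), max(0, -j):M - max(0, j)]
        out[max(0, j):K - max(0, -j), max(0, j):M - max(0, -j)] += src
    return out

def mosaic_activations(V_target, W_source, n_iter=50, eps=1e-9, seed=0):
    """Learn activations H with V_target ~ W_source @ H; W_source stays fixed."""
    rng = np.random.default_rng(seed)
    K, M = W_source.shape[1], V_target.shape[1]
    H = rng.random((K, M)) + eps
    for _ in range(n_iter):
        H = diagonal_smooth(H)   # favor diagonal activation structures
        H *= (W_source.T @ V_target) / (W_source.T @ W_source @ H + eps)
    return H

# Toy data (assumed): the target replays source frames 2 to 5 in order.
rng = np.random.default_rng(1)
W_source = rng.random((12, 8))       # source spectrogram, one frame per column
W_before = W_source.copy()
V_target = W_source[:, 2:6].copy()   # target spectrogram

H = mosaic_activations(V_target, W_source)
V_mosaic = W_source @ H              # mosaic spectrogram
```

Only `H` changes during the iterations; the source spectrogram acts as a fixed dictionary whose frames are recombined to approximate the target.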

Page 37: 2018 Mueller MP-AudioDecomp - Amazon S3

Audio Mosaicing

Source signal: Whales

Target signal: Chic – Good times

Mosaic signal

Page 38: 2018 Mueller MP-AudioDecomp - Amazon S3

Audio Mosaicing

Source signal: Race car

Target signal: Adele – Rolling in the Deep

Mosaic signal

Page 39: 2018 Mueller MP-AudioDecomp - Amazon S3

Links

SiSEC: Signal Separation Evaluation Campaign
https://www.sisec17.audiolabs-erlangen.de/

MedleyDB: A Dataset of Multitrack Audio
http://steinhardt.nyu.edu/marl/research/medleydb

LibROSA (Python)
https://librosa.github.io/librosa/