13
Music Processing Meinard Müller Advanced Course Computer Science Saarland University and MPI Informatik [email protected] Summer Term 2009 Audio Structure Analysis Music Structure Analysis Music segmentation pitch content (e.g., melody, harmony) music texture (e.g., timbre, instrumentation, sound) rhythm 2 Detection of repeating sections, phrases, motives song structure (e.g., intro, versus, chorus) musical form (e.g., sonata, symphony, concerto) Detection of other hidden relationships Audio Structure Analysis Given: CD recording Goal: Automatic extraction of the repetitive structure (or of the musical form) Example: Brahms Hungarian Dance No. 5 (Ormandy) 3 Example: Brahms Hungarian Dance No. 5 (Ormandy) Audio Structure Analysis Dannenberg/Hu (ISMIR 2002) Peeters/Burthe/Rodet (ISMIR 2002) Cooper/Foote (ISMIR 2002) Goto (ICASSP 2003) Chai/Vercoe (ACM Multimedia 2003) 4 Lu/Wang/Zhang (ACM Multimedia 2004) Bartsch/Wakefield (IEEE Trans. Multimedia 2005) Goto (IEEE Trans. Audio 2006) Müller/Kurth (EURASIP 2007) Rhodes/Casey (ISMIR 2007) Peeters (ISMIR 2007) Audio Structure Analysis Audio features Cost measure and cost matrix self-similarity matrix 5 self-similarity matrix Path extraction (pairwise similarity of segments) Global structure (clustering, grouping) Audio = 12-dimensional normalized chroma vector Local cost measure Audio Structure Analysis 6 Local cost measure cost matrix quadratic self-similarity matrix

Audio Structure Analysis - Max Planck Societyresources.mpi-inf.mpg.de/departments/d4/teaching/ss2009/... · 2009-07-01 · Audio Structure Analysis Self-similarity matrix Similarity

  • Upload
    others

  • View
    14

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Audio Structure Analysis - Max Planck Societyresources.mpi-inf.mpg.de/departments/d4/teaching/ss2009/... · 2009-07-01 · Audio Structure Analysis Self-similarity matrix Similarity

Music Processing

Meinard Müller

Advanced Course Computer Science

Saarland University and MPI [email protected]

Summer Term 2009

Audio Structure Analysis

Music Structure Analysis

� Music segmentation

– pitch content (e.g., melody, harmony)

– music texture (e.g., timbre, instrumentation, sound)

– rhythm

2

� Detection of repeating sections, phrases, motives

– song structure (e.g., intro, versus, chorus)

– musical form (e.g., sonata, symphony, concerto)

� Detection of other hidden relationships

Audio Structure Analysis

Given: CD recording

Goal: Automatic extraction of the repetitive structure

(or of the musical form)

Example: Brahms Hungarian Dance No. 5 (Ormandy)

3

Example: Brahms Hungarian Dance No. 5 (Ormandy)

Audio Structure Analysis

� Dannenberg/Hu (ISMIR 2002)

� Peeters/Burthe/Rodet (ISMIR 2002)

� Cooper/Foote (ISMIR 2002)

� Goto (ICASSP 2003)

� Chai/Vercoe (ACM Multimedia 2003)

4

� Lu/Wang/Zhang (ACM Multimedia 2004)

� Bartsch/Wakefield (IEEE Trans. Multimedia 2005)

� Goto (IEEE Trans. Audio 2006)

� Müller/Kurth (EURASIP 2007)

� Rhodes/Casey (ISMIR 2007)

� Peeters (ISMIR 2007)

Audio Structure Analysis

� Audio features

� Cost measure and cost matrix

self-similarity matrix

5

self-similarity matrix

� Path extraction (pairwise similarity of segments)

� Global structure (clustering, grouping)

� Audio

� = 12-dimensional normalized chroma vector

� Local cost measure

Audio Structure Analysis

6

� Local cost measure

� cost matrix

quadratic self-similarity matrix

Page 2: Audio Structure Analysis - Max Planck Societyresources.mpi-inf.mpg.de/departments/d4/teaching/ss2009/... · 2009-07-01 · Audio Structure Analysis Self-similarity matrix Similarity

Audio Structure Analysis

Self-similarity matrix

7

Audio Structure Analysis

Self-similarity matrix

8

Audio Structure Analysis

Self-similarity matrix

9

Audio Structure Analysis

Self-similarity matrix

10

Audio Structure Analysis

Self-similarity matrix

11

Audio Structure Analysis

Self-similarity matrix

12

Page 3: Audio Structure Analysis - Max Planck Societyresources.mpi-inf.mpg.de/departments/d4/teaching/ss2009/... · 2009-07-01 · Audio Structure Analysis Self-similarity matrix Similarity

Audio Structure Analysis

Self-similarity matrix

13

Audio Structure Analysis

Self-similarity matrix

14

Audio Structure Analysis

Self-similarity matrix Similarity cluster

15

Matrix Enhancement

Challenge: Presence of musical variations

� Fragmented paths and gaps

� Paths of poor quality

16

Idea: Enhancement of path structure

� Regions of constant (low) cost

� Curved paths

Matrix Enhancement

Shostakovich Waltz 2, Jazz Suite No. 2 (Chailly)

17

Matrix Enhancement

Idea: Usage of contextual information (Foote 1999)

18

smoothing effect

� Comparison of entire sequences

� length of sequences

� enhanced cost matrix

Page 4: Audio Structure Analysis - Max Planck Societyresources.mpi-inf.mpg.de/departments/d4/teaching/ss2009/... · 2009-07-01 · Audio Structure Analysis Self-similarity matrix Similarity

Matrix Enhancement (Shostakovich)

19

Cost matrix

Matrix Enhancement (Shostakovich)

20

Enhanced cost matrix

Matrix Enhancement (Brahms)

21

Cost matrix

Matrix Enhancement (Brahms)

22

Enhanced cost matrix

Problem: Relative tempo differences are smoothed out

Matrix Enhancement

Idea: Smoothing along various directions

and minimizing over all directions

23

tempo changes of -30 to +40 percent

� th direction of smoothing

� enhanced cost matrix w.r.t.

� Usage of eight slope values

Matrix Enhancement

24

Page 5: Audio Structure Analysis - Max Planck Societyresources.mpi-inf.mpg.de/departments/d4/teaching/ss2009/... · 2009-07-01 · Audio Structure Analysis Self-similarity matrix Similarity

Matrix Enhancement

25

Cost matrix

Matrix Enhancement

26

Cost matrix with

Filtering along main diagonal

Matrix Enhancement

27

Cost matrix with

Filtering along 8 different directions and minimizing

Path Extraction

28

� Start with initial point

� Extend path in greedy fashion

� Remove path neighborhood

Path Extraction

29

Cost matrix

Path Extraction

30

Enhanced cost matrix

Page 6: Audio Structure Analysis - Max Planck Societyresources.mpi-inf.mpg.de/departments/d4/teaching/ss2009/... · 2009-07-01 · Audio Structure Analysis Self-similarity matrix Similarity

Path Extraction

31

Enhanced cost matrix

Path Extraction

32

Thresholded

Path Extraction

33

Thresholded , upper left

Path Extraction

34

Path removal

Path Extraction

35

Path removal

Path Extraction

36

Path removal

Page 7: Audio Structure Analysis - Max Planck Societyresources.mpi-inf.mpg.de/departments/d4/teaching/ss2009/... · 2009-07-01 · Audio Structure Analysis Self-similarity matrix Similarity

Path Extraction

37

Extracted paths

Path Extraction

38

Extracted paths after postprocessing

Global Structure

39

Global Structure

40

How can one derive the

global structure from

pairwise relations?

Global Structure

� Taks: Computation of similarity clusters

� Problem: Missing and inconsistent path relations

41

� Problem: Missing and inconsistent path relations

� Strategy: Approximate “transitive hull”

Global Structure

Path relations

42

Page 8: Audio Structure Analysis - Max Planck Societyresources.mpi-inf.mpg.de/departments/d4/teaching/ss2009/... · 2009-07-01 · Audio Structure Analysis Self-similarity matrix Similarity

Global Structure

Path relations

43

Global Structure

Path relations

44

Global Structure

Path relations

45

Global Structure

Path relations

46

Global Structure

Path relations

47

Final result

Ground truth

Transposition Invariance

Example: Zager & Evans “In The Year 2525”

48

Page 9: Audio Structure Analysis - Max Planck Societyresources.mpi-inf.mpg.de/departments/d4/teaching/ss2009/... · 2009-07-01 · Audio Structure Analysis Self-similarity matrix Similarity

Transposition Invariance

Goto (ICASSP 2003)

� Cyclically shift chroma vectors in one sequence

� Compare shifted sequence with original sequence

� Perform for each of the twelve shifts a separate structure analysis

49

� Combine the results

Transposition Invariance

Goto (ICASSP 2003)

� Cyclically shift chroma vectors in one sequence

� Compare shifted sequence with original sequence

� Perform for each of the twelve shifts a separate structure analysis

50

� Combine the results

Müller/Clausen (ISMIR 2007)

� Integrate all cyclic information in one

transposition-invariant self-similarity matrix

� Perform one joint structure analysis

Example: Zager & Evans “In The Year 2525”

Transposition Invariance

51

Original:

Example: Zager & Evans “In The Year 2525”

Transposition Invariance

52

Original: Shifted:

Transposition Invariance

53

Transposition Invariance

54

Page 10: Audio Structure Analysis - Max Planck Societyresources.mpi-inf.mpg.de/departments/d4/teaching/ss2009/... · 2009-07-01 · Audio Structure Analysis Self-similarity matrix Similarity

Transposition Invariance

55

Transposition Invariance

56

Transposition Invariance

57

Minimize over all twelve matrices

Transposition Invariance

58

Thresholded self-similarity matrix

Transposition Invariance

59

Path extraction

Transposition Invariance

60

Path extraction Computation of similarity clusters

Page 11: Audio Structure Analysis - Max Planck Societyresources.mpi-inf.mpg.de/departments/d4/teaching/ss2009/... · 2009-07-01 · Audio Structure Analysis Self-similarity matrix Similarity

Transposition Invariance

Stabilizing effect

61

Self-similarity matrix (thresholded)

Transposition Invariance

Stabilizing effect

62

Self-similarity matrix (thresholded)

Transposition Invariance

Stabilizing effect

63

Transposition-invariant self-similarity matrix (thresholded)

Transposition Invariance

64

Transposition-invariant matrix Minimizing shift index

Transposition Invariance

65

Transposition-invariant matrix Minimizing shift index

Transposition Invariance

66

Transposition-invariant matrix Minimizing shift index = 0

Page 12: Audio Structure Analysis - Max Planck Societyresources.mpi-inf.mpg.de/departments/d4/teaching/ss2009/... · 2009-07-01 · Audio Structure Analysis Self-similarity matrix Similarity

Transposition Invariance

67

Transposition-invariant matrix Minimizing shift index = 1

Transposition Invariance

68

Transposition-invariant matrix Minimizing shift index = 2

Transposition Invariance

69

Discrete structure suitable for indexing?

Serra/Gomez (ICASSP 2008): Used for Cover Song ID

Example: Beethoven “Tempest”

Transposition Invariance

70

Self-similarity matrix

Example: Beethoven “Tempest”

Transposition Invariance

71

Transposition-invariant self-similarity matrix

Conclusions: Audio Structure Analysis

� Timbre, dynamics, tempo

� Musical key cyclic chroma shifts

Challenge: Musical variations

72

� Musical key cyclic chroma shifts

� Major/minor

� Differences at note level / improvisations

Page 13: Audio Structure Analysis - Max Planck Societyresources.mpi-inf.mpg.de/departments/d4/teaching/ss2009/... · 2009-07-01 · Audio Structure Analysis Self-similarity matrix Similarity

Conclusions: Audio Structure Analysis

� Filtering techniques / contextual information– Cooper/Foote (ISMIR 2002)

– Müller/Kurth (ICASSP 2006)

Strategy: Matrix enhancement

73

� Transposition-invariant similarity matrices– Goto (ICASSP 2003)

– Müller/Clausen (ISMIR 2007)

� Higher-order similarity matrices– Peeters (ISMIR 2007)

Conclusions: Audio Structure Analysis

Challenge: Hierarchical structure of music

74

Rhodes/Casey (ISMIR 2007)

System: SmartMusicKiosk (Goto)

75

System: SyncPlayer/AudioStructure

76