Introduction of my research history: From instrument recognition to support of amateurs' music creation
Tetsuro Kitahara, Nihon University, Japan
[email protected] | http://www.kthrlab.jp/ | Twitter: @tetsurokitahara



Introduction of myself
● Name: Tetsuro Kitahara
● Living in: Tokyo, Japan
● Age: 37
● Position: Associate professor at Nihon University
● Favorites: Music, drinking alcohol, etc.

My research history
[Timeline figure, 2000-2016: as a student (until around 2007), audio signal processing, pattern recognition, and content-based MIR; from the PostDoc years (2007-2010) to now, automatic music generation, probabilistic modeling, and computer-human interaction]

Research when I was a student
● Instrument recognition for polyphonic music
● Content-based Music Information Retrieval

Instrument recognition
● Instrumentation is an important factor in MIR
● There were not many attempts at polyphonic instrument recognition at that time
● Typical framework: note detection -> feature extraction for each note

[Figure: note-level recognition example. Notes (B3, C4, D4, E4, F4, G4) between 1:1:000 and 2:1:000 are labeled one by one as VN (violin) or PF (piano). For each note, e.g. pitch C4 from 1:1:000 to 1:3:000, the harmonic structure is converted into a feature vector (x1 = 0.124, x2 = 0.635, ...) and a posteriori probabilities are computed (Piano: 99.0%, Violin: 0.6%, ...).]
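
As an illustration of this note-by-note framework, here is a minimal sketch in Python. It is not the feature set or classifier of the original work: the two-dimensional feature values, the Gaussian models, and the class priors are placeholder assumptions; only the Bayes-rule computation of a posteriori probabilities per detected note reflects the framework above.

import numpy as np

# Hypothetical per-instrument Gaussian models (mean, diagonal variance) over
# note-level features; the numbers are placeholders, not values from the study.
MODELS = {
    "piano":  (np.array([0.10, 0.60]), np.array([0.02, 0.05])),
    "violin": (np.array([0.40, 0.30]), np.array([0.03, 0.04])),
}
PRIORS = {"piano": 0.5, "violin": 0.5}

def log_gaussian(x, mean, var):
    """Log density of a diagonal Gaussian."""
    return float(-0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var))

def classify_note(feature_vector):
    """A posteriori probabilities P(instrument | features) for one detected note."""
    x = np.asarray(feature_vector, dtype=float)
    log_post = {name: np.log(PRIORS[name]) + log_gaussian(x, mean, var)
                for name, (mean, var) in MODELS.items()}
    m = max(log_post.values())
    unnorm = {k: np.exp(v - m) for k, v in log_post.items()}
    z = sum(unnorm.values())
    return {k: v / z for k, v in unnorm.items()}

# One detected note (e.g. C4, 1:1:000-1:3:000) represented by its feature vector.
print(classify_note([0.124, 0.635]))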

Instrument recognition
● Instrogram: a subsymbolic time-frequency representation of instrument existence
● p(ωi; t, f): the probability that the sound of instrument ωi with an F0 of f exists at time t

[Figure: Instrogram examples for piano and flute; horizontal axis: time [s]]
Formulation of Instrogram
● Instrument existence probability (IEP): p(ωi; t, f) = p(X; t, f) · p(ωi | X; t, f)
● Non-specific IEP p(X; t, f): estimated with PreFEst [Goto, 2000], which estimates the weight of a tone model for each semitone so that the observed spectrum ≈ w110 × (tone model for 110 Hz) + ... + w660 × (tone model for 660 Hz)
● Conditional IEP p(ωi | X; t, f): estimated with an HMM for each semitone over the time-frequency plane
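
A rough sketch of the weight-estimation idea behind the non-specific IEP: the observed spectrum is explained as a weighted sum of per-semitone harmonic tone models. PreFEst itself uses an EM algorithm over probabilistic tone models; the sketch below is only an illustration under assumed, simplified conditions (Gaussian-bump tone models, non-negative least squares instead of EM, made-up sampling rate and FFT size).

import numpy as np
from scipy.optimize import nnls

SR = 16000                                   # assumed sampling rate
N_FFT = 4096                                 # assumed FFT size
freqs = np.fft.rfftfreq(N_FFT, 1.0 / SR)

def tone_model(f0, n_harmonics=8, width=10.0):
    """Toy harmonic tone model: Gaussian bumps at multiples of f0 (1/h amplitudes)."""
    spec = np.zeros_like(freqs)
    for h in range(1, n_harmonics + 1):
        spec += (1.0 / h) * np.exp(-0.5 * ((freqs - h * f0) / width) ** 2)
    return spec / (np.linalg.norm(spec) + 1e-12)

# One tone model per semitone, here from 110 Hz (A2) up three octaves.
f0s = 110.0 * 2.0 ** (np.arange(0, 37) / 12.0)
A = np.stack([tone_model(f0) for f0 in f0s], axis=1)    # (n_bins, n_semitones)

def estimate_weights(observed_spectrum):
    """Non-negative weights w with observed ≈ Σ_f0 w_f0 · tone_model(f0)."""
    w, _ = nnls(A, observed_spectrum)
    return dict(zip(np.round(f0s, 1), w))

# Toy observed spectrum: a mixture of a 110 Hz tone and a ~660 Hz tone.
observed = 0.7 * tone_model(110.0) + 0.3 * tone_model(660.0)
weights = estimate_weights(observed)
print({f0: round(w, 2) for f0, w in weights.items() if w > 0.05})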

Demo video

Research when I was a PostDoc
● Chord voicing based on Bayesian networks (skip)
● BayesianBand
● OrpheusBB
● CrestMuseXML and CrestMuse Toolkit (skip)

BayesianBand
● A jam session system based on mutual prediction of the user and the system

Concept
● While the user plays a melody, the system predicts the next chords of the chord progression (C → ? → ? → ?) and determines them in real time, so that melody and chords musically match and sound pleasant

Implementation
● On each keystroke: infer the next chord with a Bayesian network that connects chord nodes and melody-note nodes, obtaining a likelihood for each candidate chord (C, Dm, Em, F, G, Am, Bm(-5))
● When changing the measure: get the latest inference result and generate MIDI data
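
A minimal sketch of this kind of next-chord inference. The transition and note-emission probabilities below are invented placeholders and the model is reduced to a naive-Bayes-style update; the actual system uses a Bayesian network whose parameters are not shown here.

CHORDS = ["C", "Dm", "Em", "F", "G", "Am", "Bm(-5)"]
CHORD_TONES = {
    "C": ["C", "E", "G"], "Dm": ["D", "F", "A"], "Em": ["E", "G", "B"],
    "F": ["F", "A", "C"], "G": ["G", "B", "D"], "Am": ["A", "C", "E"],
    "Bm(-5)": ["B", "D", "F"],
}
# Placeholder uniform chord-transition probabilities P(next chord | current chord).
TRANSITION = {c1: {c2: 1.0 / len(CHORDS) for c2 in CHORDS} for c1 in CHORDS}

def p_note_given_chord(note, chord):
    """Toy emission model: chord tones are four times as likely as other notes."""
    return 0.25 if note in CHORD_TONES[chord] else 0.0625   # sums to 1 over 7 notes

def infer_next_chord(current_chord, melody_notes_in_measure):
    """Posterior over the next chord given the melody notes played so far."""
    scores = {}
    for chord in CHORDS:
        p = TRANSITION[current_chord][chord]
        for note in melody_notes_in_measure:
            p *= p_note_given_chord(note, chord)
        scores[chord] = p
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

# Each keystroke updates the posterior; when the measure changes, the most
# likely chord is fixed and MIDI data would be generated for it.
posterior = infer_next_chord("C", ["E", "G", "A"])
print(max(posterior, key=posterior.get), posterior)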

Demo video

OrpheusBB

Human-in-the-loop
● Initial input → generate music → listen to the music
● Edit the melody → regenerate the backing
● Edit the backing → regenerate the melody
● Finish

Model (collaboration with Univ. of Tokyo)
● Allows users to edit outputs of the system
● Automatically regenerates the remaining part according to the part edited by the user

Demo
● Input lyrics → edit the melody → the chord is automatically changed
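
A minimal sketch of the edit-and-regenerate cycle described above, under purely hypothetical assumptions: melody and backing are plain note/chord lists, and the two regenerate_* functions are toy stand-ins for the probabilistic models actually used in OrpheusBB.

import random

# Toy mappings used only to make the placeholder regeneration functions runnable.
CHORD_FOR_NOTE = {"C": "C", "E": "C", "G": "C", "D": "G", "B": "G", "F": "F", "A": "F"}
CHORD_TONES = {"C": ["C", "E", "G"], "G": ["G", "B", "D"], "F": ["F", "A", "C"]}

def regenerate_backing(melody):
    """Re-derive one chord per melody note from the (possibly edited) melody."""
    return [CHORD_FOR_NOTE[n] for n in melody]

def regenerate_melody(backing):
    """Re-sample a melody note compatible with each (possibly edited) chord."""
    return [random.choice(CHORD_TONES[c]) for c in backing]

melody = ["C", "E", "G", "E"]
backing = regenerate_backing(melody)     # initial generation
melody[1] = "F"                          # the user edits the melody ...
backing = regenerate_backing(melody)     # ... and the backing is regenerated
backing[3] = "G"                         # the user edits the backing ...
melody = regenerate_melody(backing)      # ... and the melody is regenerated
print(melody, backing)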

Research at my current university
● Four-part harmonization using Bayes nets (skip)
● Humming-based composition support (skip)
● Melody editing using melodic outline
● Smart loop sequencer

Melody editing using melodic outline
● The melody is represented as a continuous curve (melodic outline)
● The user can edit the melody by redrawing this curve
[Figure: note sequence → pitch trajectory → melodic outline; the user redraws part of the outline (edited part)]

Procedure of melody editing
[Figure: overall procedure — the pitch trajectory of the melody is converted into a melodic outline, the user redraws the outline, and a new melody is generated from the edited outline]

How to extract the melodic outline
● Apply the Fourier transform to the pitch trajectory
● Extract the low-order coefficients and apply the inverse Fourier transform → melodic outline
● The high-order coefficients of the original melody are saved for later use

How to generate a melody from the edited outline
● Combine the low-order coefficients of the edited outline with the saved high-order coefficients of the original melody, and apply the inverse Fourier transform
● A hidden Markov model converts the resulting pitch trajectory into a note sequence (e.g. Do Mi Fa So Ti Ra So Do Re Mi)
● Demo
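
A small sketch of the Fourier-based outline manipulation described in these two slides. The cutoff N_LOW, the toy pitch trajectory, and the Gaussian "redrawing" are assumptions for illustration; the final HMM step that turns the rebuilt trajectory into a note sequence is not included.

import numpy as np

N_LOW = 8   # assumed number of low-order Fourier coefficients kept as the outline

def extract_outline(pitch_trajectory):
    """Melodic outline = inverse FFT of the low-order coefficients only.
    Also returns the high-order coefficients, saved for later use."""
    coeffs = np.fft.rfft(pitch_trajectory)
    low, high = coeffs.copy(), coeffs.copy()
    low[N_LOW:] = 0.0
    high[:N_LOW] = 0.0
    return np.fft.irfft(low, n=len(pitch_trajectory)), high

def rebuild_trajectory(edited_outline, saved_high_coeffs):
    """Low-order coefficients of the edited outline + saved high-order
    coefficients of the original melody, then inverse Fourier transform."""
    low = np.fft.rfft(edited_outline)
    low[N_LOW:] = 0.0
    return np.fft.irfft(low + saved_high_coeffs, n=len(edited_outline))

# Toy pitch trajectory in semitones (one value per short time frame).
t = np.linspace(0.0, 1.0, 128, endpoint=False)
trajectory = 60 + 5 * np.sin(2 * np.pi * t) + np.sign(np.sin(2 * np.pi * 8 * t))

outline, high = extract_outline(trajectory)
edited_outline = outline + 2.0 * np.exp(-((t - 0.5) ** 2) / 0.01)  # user redraws a region
new_trajectory = rebuild_trajectory(edited_outline, high)
# new_trajectory would then be quantized into a note sequence (the HMM step).
print(np.round(new_trajectory[:5], 2))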

Smart loop sequencer
● Automatically selects music loops from a collection
● Uses the degree of excitement as an input

Formulating with HMM
● Observed signal x = (x1, ..., xN): the degree of excitement at each measure
● Hidden state sn: tuple of loop IDs for the 5 parts at measure n ("0" for no loop)
● Estimate the most likely [s1, ..., sN] for the given x
● Simplify the formulation by independently considering sn,i = qn,i × s'n,i, where
  – qn,i: whether a loop is placed at measure n in part i
  – s'n,i: if so, which loop is placed there
● Emission probabilities:
  – P(xn | qn): the more loops are inserted, the higher the emitted xn
  – P(xn | s'n): the higher the degree of excitement annotated in the loops, the higher the emitted xn
[Figure: example arrangements — many parts (e.g. drums and sequence loops) for a high degree of excitement, few parts for a low degree of excitement]
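
As a rough illustration of the Viterbi decoding involved, here is a simplified sketch that only tracks how many parts are active at each measure (a stand-in for qn); the per-part loop choice s'n, the transition model, and the emission parameters are all placeholder assumptions, not those of the actual system.

import numpy as np

N_PARTS = 5
SIGMA = 0.15   # assumed emission noise on the excitement value

def viterbi_num_active_parts(excitement):
    """Most likely number of active parts per measure, given the excitement curve.
    Emission idea from the slides: the more loops are inserted, the higher xn."""
    states = np.arange(N_PARTS + 1)              # 0..5 active parts
    means = states / N_PARTS                     # expected excitement per state
    trans = np.exp(-np.abs(states[:, None] - states[None, :]))   # prefer small jumps
    log_trans = np.log(trans / trans.sum(axis=1, keepdims=True))

    def log_emit(x, s):
        return -0.5 * ((x - means[s]) / SIGMA) ** 2

    delta = np.array([log_emit(excitement[0], s) for s in states])
    backpointers = []
    for x in excitement[1:]:
        cand = delta[:, None] + log_trans        # cand[i, j]: come from i, go to j
        backpointers.append(np.argmax(cand, axis=0))
        delta = cand.max(axis=0) + np.array([log_emit(x, s) for s in states])
    path = [int(np.argmax(delta))]
    for bp in reversed(backpointers):
        path.append(int(bp[path[-1]]))
    return list(reversed(path))

# Toy excitement curve over 8 measures (0 = calm, 1 = most excited).
print(viterbi_num_active_parts([0.1, 0.2, 0.4, 0.6, 0.9, 0.9, 0.5, 0.2]))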

How to estimate the degree of excitement for each music loop
● Demo

Discussions
These works have two aspects:
● Automatic music generation
  – Generate music data from users' inputs
  – Users' inputs are usually incomplete
  – Typically based on probabilistic models
● Human-computer interaction
  – Allow users to input their intent easily
  – More abstract than specific music data
  – Details are hidden
  – Tradeoff between details and intuitiveness

Conclusion
● My research
  – MIR-related subjects (-2007)
    ● Musical instrument recognition
    ● MIR based on instrumentation similarity
  – Music generation (2007-)
    ● BayesianBand, melodic outline, smart loop sequencer, …
● Two aspects in my recent works
  – Automatic music generation
  – Human-computer interaction
● Research plan during this stay
  – Improvement of the melody generation model, ...