Drum transcription in polyphonic music using non-negative matrix factorization

Arnaud Moreau (2), Arthur Flexer (1, 2)

(1) Institute of Medical Cybernetics and Artificial Intelligence, Center for Brain Research, Medical University of Vienna, Austria

(2) The Austrian Research Institute for Artificial Intelligence, Freyung 6/6, A-1010 Vienna, Austria

Introduction

- Prerequisite for genre classification or beat/meter detection
- Transcription is more difficult in polyphonic music
- Source-separation-based system
- Extension of work by Helen and Virtanen from drum/non-drum classification and separation to full polyphonic drum transcription

Overview

[Figure: block diagram of the system. Blocks: input signal, drum samples, NMF separation, feature extraction, SVM classification, peak picking, transcription. Signals: spectrogram [X]_{f,t}, component spectra [S]_{f,c}, component gains [A]_{c,t}.]

- Input audio is divided into 5 sec excerpts
- Magnitude spectrogram representation (window size 2048, hop size 512)
- Non-negative matrix factorisation (NMF) algorithm gives source spectra and time-varying gains of c components
- c components classified by Support Vector Machine (SVM)
- Peak-picking algorithm
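The front end and the final peak-picking step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sample rate, the Hann window, and the simple thresholded local-maximum rule are assumptions, and the function names are hypothetical. Only the excerpt length (5 sec), window size (2048) and hop size (512) come from the poster.

```python
import numpy as np

def split_excerpts(signal, sample_rate, seconds=5.0):
    """Cut a mono signal into consecutive excerpts of `seconds` length."""
    n = int(round(seconds * sample_rate))
    return [signal[i:i + n] for i in range(0, len(signal), n)]

def magnitude_spectrogram(excerpt, window_size=2048, hop_size=512):
    """Return |STFT| with shape (window_size // 2 + 1, n_frames)."""
    excerpt = np.asarray(excerpt, dtype=float)
    window = np.hanning(window_size)  # window shape is an assumption
    if len(excerpt) < window_size:
        # Zero-pad short excerpts so that at least one frame exists.
        excerpt = np.pad(excerpt, (0, window_size - len(excerpt)))
    n_frames = 1 + (len(excerpt) - window_size) // hop_size
    frames = np.stack(
        [excerpt[k * hop_size:k * hop_size + window_size] * window
         for k in range(n_frames)],
        axis=1,
    )
    return np.abs(np.fft.rfft(frames, axis=0))

def pick_peaks(gain, threshold):
    """Indices of local maxima of a gain curve that exceed `threshold`.

    The poster does not specify its peak-picking rule; this is a stand-in.
    """
    g = np.asarray(gain, dtype=float)
    return [t for t in range(1, len(g) - 1)
            if g[t] > threshold and g[t] >= g[t - 1] and g[t] > g[t + 1]]
```

In such a pipeline, `pick_peaks` would be applied to the gain curve of each component that the classifier labels as a drum, and the peak indices converted to onset times via the hop size and sample rate.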

Results and Discussion

The algorithm is evaluated on 60 sec excerpts from 4 multi-channel recordings, which are labelled manually and contain a total of 1019 drum onsets.

Song                  Measure      BD       SD       HH     mean
Song 1, 242 onsets    Rp%       88.66    54.93    41.03    60.85
                      Rr%       93.33    63.64    98.97    85.31
                      Rh%       78.89     5.45   −43.30    13.68
Song 2, 224 onsets    Rp%       33.33    69.57    34.76    45.89
                      Rr%       13.75    69.05    93.33    58.71
                      Rh%      −11.25    35.71  −135.00   −36.85
Song 3, 206 onsets    Rp%       36.84    50.68    81.77    56.43
                      Rr%       20.51    88.24    99.25    69.33
                      Rh%      −10.26   −17.65    74.44    15.51
Song 4, 347 onsets    Rp%       80.00    31.25    76.63    62.63
                      Rr%       50.00     6.33    63.24    39.85
                      Rh%       37.50    −7.59    42.16    24.02
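The poster does not define Rp, Rr and Rh. One plausible reading, assumed here and not confirmed by the source, is precision, recall, and a hit rate penalised by false positives, (hits − false positives) / reference onsets, which would explain how Rh can fall below −100%. The function name and arguments below are hypothetical.

```python
def onset_scores(n_hits, n_false_positives, n_reference):
    """Return (Rp, Rr, Rh) in percent under the ASSUMED definitions:

    Rp = hits / (hits + false positives)        (precision)
    Rr = hits / reference onsets                (recall)
    Rh = (hits - false positives) / reference   (penalised hit rate)
    """
    n_detected = n_hits + n_false_positives
    precision = n_hits / n_detected if n_detected else 0.0
    recall = n_hits / n_reference
    penalised = (n_hits - n_false_positives) / n_reference
    return tuple(round(100.0 * v, 2) for v in (precision, recall, penalised))
```

With the illustrative counts 96 hits, 138 false positives and 97 reference onsets, the function returns 41.03, 98.97 and −43.30, which agrees with the HH column of Song 1. The poster reports no raw counts, so this agreement is suggestive rather than a confirmation of the definitions.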

- Most errors are already made at the classification stage, which harms the subsequent drum transcription
- Results not comparable: no publicly available data set
- Remaining research questions (among others):
  - What is the optimal feature subset?
  - What are the optimal thresholds for peak-picking?

Acknowledgement

Helmut Schonleitner of the cultural center AKKU (http://www.akku-steyr.at) provided the multichannel recordings that have been used to evaluate our algorithm. The Austrian Research Institute for Artificial Intelligence acknowledges support from the ministries BMUKK and BMVIT.

System

Features

spectral features                 temporal features
spectral centroid                 temporal centroid
spectral kurtosis                 temporal kurtosis
spectral skewness                 temporal skewness
spectral rolloff                  crest factor
spectral flatness                 peak time
spectral contrast                 peak fluctuation
noise likeness                    percussiveness
standard deviation                periodicity
10 MFCCs
20 dynamic MFCCs (mean+std)
20 dynamic ∆MFCCs (mean+std)
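To make the split between the two columns concrete, here is a minimal sketch of three of the listed features, using their common textbook definitions. The poster does not give its exact formulas, so these definitions and the function names are assumptions. Spectral features would be computed on a component's spectrum, temporal features on its gain curve.

```python
import numpy as np

def spectral_centroid(spectrum, freqs):
    """Magnitude-weighted mean frequency of a component spectrum."""
    s = np.asarray(spectrum, dtype=float)
    f = np.asarray(freqs, dtype=float)
    return float(np.sum(f * s) / np.sum(s))

def temporal_centroid(gain, times):
    """Gain-weighted mean time of a component gain curve."""
    g = np.asarray(gain, dtype=float)
    t = np.asarray(times, dtype=float)
    return float(np.sum(t * g) / np.sum(g))

def crest_factor(gain):
    """Peak value divided by RMS value; large for sparse, spiky gains."""
    g = np.asarray(gain, dtype=float)
    return float(np.max(np.abs(g)) / np.sqrt(np.mean(g ** 2)))
```

A percussive component typically has a gain curve made of short, isolated bursts, so its crest factor tends to be higher than that of a sustained harmonic component.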

The NMF algorithm

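One short-time spectrum vector x(t) is modelled as a sum of c components, each having a constant spectrum S_i and a time-varying gain A_i(t):

\[ x(t) \approx \sum_{i=1}^{c} S_i A_i(t) \quad \text{or} \quad X \approx SA. \]

The components are estimated using the update rules

\[ S \leftarrow S \odot \frac{(X \oslash SA)\, A^{T}}{\mathbf{1} A^{T}} \quad \text{and} \quad A \leftarrow A \odot \frac{S^{T} (X \oslash SA)}{S^{T} \mathbf{1}}, \]

where \odot and \oslash denote element-wise multiplication and division (written .* and ./ in matrix software), the fractions are element-wise, and \mathbf{1} is an all-ones matrix of the same size as X.

This is a suitable representation for drum instruments, because their spectra don't change over time.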
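A minimal NumPy sketch of such a factorisation is given below. It uses the multiplicative updates of Lee and Seung for the generalised Kullback-Leibler divergence, which is assumed here to be the cost the poster refers to. The random initialisation, the iteration count, the small constant `eps` and the function names are illustrative choices, not taken from the poster.

```python
import numpy as np

def kl_divergence(X, Y, eps=1e-12):
    """Generalised Kullback-Leibler divergence D(X || Y)."""
    return float(np.sum(X * np.log((X + eps) / (Y + eps)) - X + Y))

def nmf_kl(X, n_components, n_iter=200, seed=0, eps=1e-12):
    """Factorise non-negative X (freq x time) as S @ A.

    S has shape (freq, components): one fixed spectrum per component.
    A has shape (components, time): one gain curve per component.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n_freq, n_time = X.shape
    S = rng.random((n_freq, n_components)) + eps
    A = rng.random((n_components, n_time)) + eps
    ones = np.ones_like(X)
    for _ in range(n_iter):
        # Update spectra with the gains held fixed.
        S *= ((X / (S @ A + eps)) @ A.T) / (ones @ A.T + eps)
        # Update gains with the new spectra held fixed.
        A *= (S.T @ (X / (S @ A + eps))) / (S.T @ ones + eps)
    return S, A
```

Because both updates only multiply by non-negative factors, S and A stay non-negative throughout, and the divergence does not increase from one iteration to the next.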

The classifier

- One SVM for classes drum/non-drum, 2580 feature vectors
- One SVM for classes BD, SD, HH, 3145 feature vectors
- Implemented in WEKA (www.cs.waikato.ac.nz/ml/weka/)
- Training data: ENST-Drums (perso.enst.fr/~gillet/ENST-drums/) and various drum samples
- Cross-validation results inside training set: 86.28% and 92.94%
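One way the two classifiers could be combined is a cascade: each NMF component is first tested for drum/non-drum, and only drum components are passed to the BD/SD/HH classifier. The poster does not state how the two SVMs are chained, so the cascade is an assumption. In the sketch below, `is_drum` and `drum_class` are hypothetical callables standing in for the two trained SVMs; no WEKA API is used.

```python
def label_components(feature_vectors, is_drum, drum_class):
    """Label each component as 'non-drum' or as a drum class.

    is_drum:    callable, feature vector -> bool   (stand-in for SVM 1)
    drum_class: callable, feature vector -> str    (stand-in for SVM 2,
                expected to return 'BD', 'SD' or 'HH')
    """
    labels = []
    for features in feature_vectors:
        if is_drum(features):
            labels.append(drum_class(features))
        else:
            labels.append("non-drum")
    return labels
```

For example, with toy rules in place of trained models, `label_components([[0.9, 80.0], [0.1, 500.0]], lambda f: f[0] > 0.5, lambda f: "BD" if f[1] < 200 else "HH")` returns `['BD', 'non-drum']`.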

Selected References

S. Dixon. Onset detection revisited. In Proc. of the DAFx, pages 133–137, Montreal, Quebec, Canada, Sept. 18–20, 2006.

O. Gillet and G. Richard. Enst-drums: an extensive audio-visual database for drum signals processing. In Proceedings of the 7th International Conference on Music Information Retrieval, pages 156–159, Victoria, BC, Canada, October 2006.

M. Helen and T. Virtanen. Separation of drums from polyphonic music using non-negative matrix factorization and support vector machine. In Proc. of the 13th EUSIPCO, Antalya, Turkey, September 2005.

D. D. Lee and H. S. Seung. Algorithms for non-negative matrix factorization. In NIPS, pages 556–562, 2000.

J. Paulus and T. Virtanen. Drum transcription with non-negative spectrogram factorisation. In Proc. of the 13th EUSIPCO, Antalya, Turkey, September 2005.

K. Tanghe, S. Degroeve, and B. De Baets. An algorithm for detecting and labeling drum events in polyphonic music. In Proc. of the first MIREX, London, UK, September 11-15 2005.

C. Uhle, C. Dittmar, and T. Sporer. Extraction of drum tracks from polyphonic music using independent subspace analysis. In Proc. of the 4th ICA, Nara, Japan, April 2003.

a.moreau@gmx.net arthur.flexer@ofai.at
