23
Wireless Real-Time Drum Triggering A Design Project Report Presented to the School of Electrical and Computer Engineering of Cornell University In Partial Fulfillment of the Requirements for the Degree of Master of Engineering, Electrical Computer Engineering Submitted by Curran Sinha MEng Field Advisor: Bruce Land Degree Date: December 2017

Wireless Real-Time Drum Triggering · the rise of electronic music and the ease of producing music without a studio have pushed more drummers to think about exploring triggering

Embed Size (px)

Citation preview

WirelessReal-TimeDrumTriggering

ADesignProjectReport

PresentedtotheSchoolofElectricalandComputerEngineering

ofCornellUniversity

InPartialFulfillmentoftheRequirementsfortheDegreeof

MasterofEngineering,ElectricalComputerEngineering

SubmittedbyCurranSinha

MEngFieldAdvisor:BruceLandDegreeDate:December2017

2

Abstract

MasterofEngineeringProgram

SchoolofElectricalComputerEngineering

CornellUniversity

DesignProjectReport

ProjectTitle:WirelessReal-TimeDrumTriggering

Author:CurranSinha

Abstract:Withtheriseofelectronicmusicandhomeproductioninthemusicindustry,

drumtriggeringisgainingpopularityandisnowusedwidelybyprofessionalsand

amateursinliveandstudioenvironments.Drumtriggeringallowsamusiciantosample

varioussoundsusingmechanismstodetectadrumstrike,whichthentriggersamidi

sample.Typicalsetupsinvolvesensorsor“triggers”foreachdrumthatarewiredtoa

drummodule,whicheitherdirectlytriggerssamplesorgeneratesmididata.Thereareno

currentproductsthatallowforcheapwirelesstriggering,whichareportableandeasyto

setup.Thisdesignprojectaimstofillthatgapinthemarketbydevelopingasystemthat

usesjustonemicrophoneandamicrocontrollertodistinguishbetweenmultipledrums.

3

ExecutiveSummaryDrumtriggeringisacommontoolthatallowsmusicianstotriggerarbitrarysampleslikepercussioninstrumentsormelodictextures.Whetheritisbeingusinginaliveorstudiosetting,itaddsanotherdimensiontotheinstrumentbylayeringdifferentvoicesontopoftypicaldrumsetsounds.Eventhoughithasbeenaroundfordecades,inthepastfewyearstheriseofelectronicmusicandtheeaseofproducingmusicwithoutastudiohavepushedmoredrummerstothinkaboutexploringtriggering.Thestandardsystemfortriggeringrequiresaseparatepieceofhardwareforeachdrumthatneedstriggeringability,andallofthoseneedtobewiredtoacentraldrummoduletogeneratethesounds.Thesetupandconvenienceofthecurrenttechnologyislessthanideal.Newproductshavecomeoutthattrytofixthat,buttheyallarelackinginatleastonewaythatmakesthemundesirable.Eithertheycostmorethanacasualdrummerlookingtoexploretriggeringiswillingtopay,ortheydonothavethefunctionalitytotriggerdifferentdrums,ortheyareinconvenienttotransportandsetup.Thisdesignprojectaimstofixthat.Thisprojectisaproofofconceptforaproductthatusesjustamicrophoneandamicrocontrollertotriggermultipledrumsinrealtime.Thesystemissmallenoughtoeasilycarryaroundandthesetuprequiredisjustplacingthesystemnearthedrumsandtrainingeachdrum.ByimplementingthealgorithmthatperformsthetrainingandclassificationinMatlab,Iwasabletoprovethefeasibilityofthisdesign.Thealgorithmiseasilytransferabletoamicrocontrollertoexecutethesamealgorithminrealtimeandtriggersoundaccordingly.Therearequiteafewsignificantproblemswithtriggeringdrumsusingjustonemicrophone.Firstoff,tomaintainrealtimeresponse,theentiresystemmustdetectadrumstrikeandtriggertheaudiosamplewithin15milliseconds.Thatmeansonlyabout10millisecondsofasamplecanbeusedtoanalyze,andthatshortofasampledoesnotcontainalargeamountofinformationaboutthedrum.Aswell,onceastrikeisdetectedandasampleisrecorded,thefeaturesthatuniquelydistinguishthesamplefromotherdrumsamplesareusedinaK-nearestneighborsalgorithmtoallowtheusertoplayadrumandtriggeranysound.Thefinalresultsareextremelypromising.Givenatrainingsettotrainon,thealgorithmcorrectlyidentified93%ofthetestsamplesaccurately.Forversion1ofthedesign,thissystemhasplentyofroomtogrowandimprove,andhopefullywilleventuallyleadtoarealproduct.

4

1.Introduction1.1Motivation

Drumtriggeringisusedwidelybymusiciansallovertheworldtoaddelectronicand

sampledsoundstotheirplayingwhileusinganacousticdrumkit.Thisisnotjustafad.

Electronicmusicthatusesnon-acousticsoundsisgainingpopularityandmusiciansare

expectingthesesoundstoberecreatedinlivesettingsorstudiosettings.Aswell,

drummersarestartingtoexperimentwithcreatingfullsongsontheirown.Thisinvolves

triggeringthemelody,harmonyandbasslines,whilestillplayingthedrumpart.Ihave

beenplayingdrumsfor8yearsandatCornellI’mcurrentlypartofajazzcombo,jazzbig

band,andafunk/R&Bband,sowhenIstartedtothinkofafinaldesignproject,I

immediatelythoughtaboutdesigningsomethingrelatedtodrumsormusic.

Recently,aproductwasreleasedcalledSensoryPercussion(sunhou.se)thatusessensors

onadrumtodetectdifferenthitsdependingonwherethedrumisstruckandtriggeraudio

samplesaccordingly.Thesensorscandetectmanysounds(7+)likecenterofthehead,edge

ofthehead,rimofthedrum,rimshot(hittingtherimanddrumheadatthesametime)at

thecenterofthehead,rimshotattheedgeofthehead,andafewmore.Eachofthese

soundscanthenbeassignedtoasampleanditgivesthedrummernewsoundstocreate

musicwith.Alongwiththis,thecompanyhassoftwarethatallowsformoreadvanced

featureslikeblendingbetweentwodifferentsoundsifyouhitinbetweenregions.

Thisproductinspiredmebecauseitcombinedmyloveofdrummingwithhardwareand

signalprocessing.Irememberwhentheproductfirstcameout,thedrummingcommunity

wasextremelyexcitedandtheystillare,butthereareafewflawswiththesystem.Thefirst

issueisthatthesesensorsworkwithjustonedrumandiftheuserwantstotriggerwith

morethanonedrum,theyneedanothersensor.Thisislimitingbecausemostdrummers

areusedtocreatingbeatsusingseparatesoundsourcesandthusdifferentlimbs,which

wouldrequiremoresensors.Thisisrelatedtoanotherissue,whichisthatthesoftwareand

onesensorcosts$700.Eachadditionalsensoris$300.Formostmusicians,thisisan

expensivepartandcreatesahighbarriertoenterthemarket.Thelastissueisthatsetting

upthesystemcantakeawhilebecauseoneneedstoattachallthesensorsandthenwire

5

eachsensoruptoanaudiointerface.Forrecordingpurposesthisisfine,butifsomeone

wantstoquicklysetupthesesensorsitisinconvenient.Iusedthese“downsides”as

inspirationwhiledevelopingmyproject.

1.2.PreviousWork

Mymaingoalsforthisprojectweretodesignaportable,easytosetup,reliabledrum

triggeringmechanismthatdidnotcosttoomuchandhadnonoticeablelatency.Theinitial

thoughtIhadwastodothisusingonlyamicrophoneandamicrocontroller.Takingalook

atsomesimilarproductsonthemarket,IdiscoveredVersatrigger:awirelessdrum

triggeringmechanism.Despitemeetingafewofmyrequirementsofbeingwirelessand

havinglowlatency,thesetupprocessistediousandlaborious.Thistriggerneedstobe

installedinsidethedrumwhichmakeitnonportableandhardtosetup.Aswell,one

triggerandthehub,whichreceivesthedatafromthetrigger,costsaround$120.

Figure1.Versatrigger

AnotherproductthatwasrecentlyreleasediscalledtheElectronicAcousticModule(EAD)

fromYamaha.Theyuseasimilarprincipletothisdesignprojectbyusingjustoneunit,but

itdoesnottriggerdrumsseparately.Instead,itprocessesthesoundcomingfromthe

microphone(s)withcertainsoundeffectstoaddanotherlayerofsound.Thisisastepinthe

rightdirectionandconfirmsthatthereisgrowingneedforaproductliketheoneIam

developing.

6

Figure2:YamahaEAD

2.Implementation2.1PossibleSolutions

Tacklingthisproblemisnotsimplebecauseithasnotbeendonebeforeandcanvary

dependingontheenvironment.Oneapproachistousewirelesspiezosensorswitha

simpleradioprotocolsuchasZigbee,whichwouldtransferthesensorreadingstoacentral

hubthatgeneratedmididata.Thisisrelativelyeasytosetupandprobablywould

accomplishthejob,butthenissuesmightarisewithpricebecauseofvariousdifferentparts

andwithbatterylifebecauseradioprotocolstakeupadecentamountofpowerregardless

ofhowlittleinformationisbeingtransferred.Anotheroptionistousetwomicrophonesin

frontofthedrumsettogainnotonlyaudioinformation,butalsospatiallocalityofthe

drumset.Thisispromising,butabetterideawithregardstothatisjusttoplacetwo

microphonesnearthemiddleofthedrumsetandpointthemicrophonesinopposite

direction.Thiswouldhelpdeterminethedirectionthesoundiscomingfrom,whichcanbe

combinedwiththeaudioprocessingtodeterminethedrum.Thisideaseemedtomeetall

myrequirements,butIdecidedtostartwithjustonemicrophoneandrelyonaudio

processingtoseeifthatwouldbepossible.

2.2.SystemRequirements

Thecurrentdesignassumesanenvironmentconsistingofonlyadrumset,wheretheuseris

triggeringsoundsusingmultipledrums,butnotplayingcymbals.Inordertousetriggering

7

inaliveenvironment(bandperformance),atriggerperdrumisnecessarybecauseofall

thenoisefromotherinstruments.Thescopeofthisprojectiscateredmoretowards

drummerswhowanttoexplorenewsoundsandthepossibilitiesoftriggering.Aswell,the

systemisnotexpectedtoproduce100%accuracybecauseofthesimplicityandcheapprice

ofthedesign.Iwouldhopeforaccuracyabove80%though,wheremosthitsareregistered

andtriggerthecorrectsound,buttherewillbeafewmisplayedsounds.Moreiterationsof

thedesignwillhelpincreasetheaccuracythough.

Themainapproachtothisprojectistoconstantlysampleaudiofromthemicrophone,usea

simplepoweroramplitudethresholdtodetectthestartofahit,runtheaudiothroughan

FFT,sendthespectrumandtheactualsoundwaveintoaclassificationnetwork,andtrigger

anaudiosampleaccordingly.Thetrainingoftheclassificationnetworkwillhappenfirst

anditwillgenerateaclassificationmodelforthedrums.

2.3.Issues

Onemainissueisthefrequencyspectrumofadrumisnotlikeapianoortrumpetnote,

becausethereisnotalwaysaclearprimaryfrequencyandifthereistherearemanyother

harmonicsthatappear.Aswell,dependingonhowhardthedrumishit,thefrequency

spectrumcanchange.Onarelatednote,inordertomaintainrealtimeresponse,thetime

betweenthedrumhitandthemiditriggermustbelessthan15millisecondsbecausereal

timeaudioperceptionisbetween10-20milliseconds.Thismeanstheactualsamplethat

willbeusedcanonlybearound8-10millisecondsbecauseittakestimetoprocessthe

sampleandgenerateamiditrigger.Withsuchashortsample,thetoneofthedrumwillnot

becompletelypresent.Itinsteadwillbeacombinationofthetoneofthedrumandthe

impactofthestickhittingthehead.Acomparisonofafulldrumsoundwavetowhat10

millisecondsofadrumsoundisshowninFigure3and4.

8

Figure3.FullTomSoundWave

Figure4.10msTomSoundWave

2.28 2.285 2.29 2.295 2.3Time (ms) 104

-0.4

-0.3

-0.2

-0.1

0

0.1

0.2

0.3

0.4Am

plitu

de

Full (220 ms) High Tom Hit

22806 ms23026 ms

2.2805 2.2806 2.2807 2.2808 2.2809 2.281 2.2811 2.2812 2.2813 2.2814Time (ms) 104

-0.4

-0.3

-0.2

-0.1

0

0.1

0.2

0.3

0.4

Ampl

itude

10ms High Tom Hit

9

Anotherinterestingissueistheclassificationmechanism.Itneedstoberobustandableto

adapttovariousdifferentdrums,butitcannottakeuptoomuchspaceortaketoolong

computationally.

2.4.SoftwareDesign

Atitscurrentstate,myprojectismoreofaproofofconceptforarealtimesystemthat

provesthefeasibilityofusingjustonemicrophonetotriggermultipledrums.Thesystemis

implementedinMatlab,butforrealtimeexecutionitneedstoberunonamicrocontroller.

Itiseasytoportthesystemtoonebecauseallthestepsarerelativelybasiccomputations.

InthenextsectionsIwillelaborateoneachstepofthedesign.

2.4.1.AudioInput

Thesystemsamplesaudioat8khzbecausethemajorityofdrumfrequenciesarepresent

below1000kHz.FortheprototypingIwasdoingwithMatlab,IendedupusingaZoom

H4Nmicrophone,whichhastwostereomicrophones.Itsamplesas44100,butdown

samplingthatissimple.Fortheactualmicrophonecircuitrythatwouldbeconnectedtothe

microcontroller,ahighpassandlowpassfilterwillbeaddedalongwithanamplifier.The

lowandhighpasswillhelpfilteroutanynoisethatispresent,whiletheamplifierallows

theusertosetthegaindependingonwherethemicrophoneisplacedandhowloudthey

areplaying.Theaudiosamplesarestoredinacircularbuffertoallowthesystemtohave

somepreviousrecordingstotakewhenadrumstrikeitdetected.

2.4.2.DetectingDrumStrike

Inordertodetectthestartofadrumstrike,thesystemneedstodetectasuddenincreasein

amplitudeaboveacertainthreshold,butitmustnottriggeronsporadicnoisethatmightbe

presentinenvironment.Thebasicwaytodothisistogenerateanenvelopeofthesignal

thattriestooutlinetheamplitudeofthesignal.TherearetwowaysItriedtoapproachthis:

anexponentialmovingaverage(1poleIIR)andamovingaveragewindow(.Both

approachesareshownonasampledrumstrikeinFigureXXX.Fortheexponentialmoving

average,itusesthisformula:

10

𝐸 𝑖 + 1 = 𝛼 ∗ 𝑖𝑛𝑝𝑢𝑡 𝑖 + 1− 𝛼 ∗ 𝐸(𝑖)

Ann-windowmovingaverageusesthefollowequation:

𝐸 𝑖 = 1𝑛 𝑖𝑛𝑝𝑢𝑡 𝑖 )!!!

!!!

Theinput(i)totheformulascanbeeithertherawmicrophoneinputsortheabsolutevalue

ofthoseresults.Usingtheabsolutevaluesallowstheenvelopetorespondslightlyfaster,

butforthisspecificapplication,itdoesnotaffecttheresultsignificantly.Also,thealpha

termfortheexponentialmovingaveragecaneasilybeadjustedtochangetheresponse

time,whichcanbeusefulwhentuningthesystem.Intermsofimplementinganenvelope

onamicrocontroller,theexponentialmovingaverageismuchsimpler,soIchosetowork

withthat.

Figure4.DrumStrikewithFilteredSignals

9.7155 9.716 9.7165 9.717 9.7175 9.718 9.7185 9.719 9.7195 9.72 9.7205Time (samples) 104

-0.4

-0.3

-0.2

-0.1

0

0.1

0.2

0.3

0.4

0.5

Ampl

itude

Audio signal with Envelopes

Original SignalExponential w AbsExponential wout Abs5 pt Moving Avg

11

2.4.3.FeatureExtraction

Afteradrumstrikeisdetected,thesystemtakesnoteofthatspecificindexandsaves

around2millisecondsbeforethedetectionasthestartofthesample.Itthenproceedsto

collectsamplesfor~8moremilliseconds,whichat8khzis64samples.Thisproducesa

10mssampletoprocess.Theseoffsetswerechosenbyanalyzingdrumstrikesandfinding

thesmallestindexthatwasneededtoobtainthestartofthedrumhitformostsamples,and

thentherestisallocatedforafterthestriketogainasmuchinformationaspossibleabout

thedrum.

Next,thesignalisprocessedtoextracttheimportantfeatures.Alargeproblemwith

extractingfeaturesisthatthereisnodefinitionforwhatisimportantforaspecificdrum.A

usefulmethodtodeterminethisisbyvisualizingthefeaturesasaK-nearestneighbors

classificationalgorithm.Inordertodothat,Ionlyusethreefeaturesbecauseitiseasyto

graphthefeaturesforallthedrumsandseeifthereisaclearseparationbetweenthem.I

willtalkmoreaboutthetradeoffsbetweenthefeaturesintheclassificationsection.The

mostobviousfeaturetoextractisthefrequencyresponseofthedrumsinceabassdrum

clearlyhasalowerpitchthenasmalltomdrumanditcanbeappliedtodrumsnomatter

howbigorsmalltheyare.TodothisthesystemgeneratesthediscreteFouriertransformof

the10-millisecondsampleusingthefastFouriertransform.Theoutputisagraphthat

representstheamplitudesofeachfrequencyrange.Anotherfactortoconsideriswhether

thefilteredsignalortheoriginalsignalshouldbepassedintotheFFT.Theoriginalsignal

seemslikethebetteroptioncomparingFigure5and6becauseithaslargerpeaks,but

issuescanariseifthereissignificantnoisethataddsinsomefrequencycomponents.

However,sincethenoisewillshowupinfrequencybinsseparatedfromthemainsampleI

chosetousetheoriginalsignal.Itisimportanttonotethatrawamplitudedataisnotuseful

becausethesystemneedstofunctionregardlessofthevolumeofthedrumhit.Todealwith

this,thesystemusesfeaturesbasedaroundrelativedataliketheorderofpeaksorusing

amplituderatiosrelativetothehighestpeak.

12

Figure5.OriginalSignalResponse

Figure6.FilteredSignalResponse

0 0.001 0.002 0.003 0.004 0.005 0.006 0.007 0.008 0.009 0.01Time (ms)

-0.4

-0.2

0

0.2

0.4

0.6

Ampl

itude

Raw Siganl

0 500 1000 1500 2000 2500 3000 3500 4000 4500Frequency

0

0.05

0.1

0.15

0.2

0.25

0.3

Ampl

itude

Raw Siganl FFT

0 0.001 0.002 0.003 0.004 0.005 0.006 0.007 0.008 0.009 0.01Time (ms)

-0.4

-0.2

0

0.2

0.4

Ampl

itude

Filtered Siganl

0 500 1000 1500 2000 2500 3000 3500 4000 4500Frequency

0

0.05

0.1

0.15

0.2

0.25

Ampl

itude

Filtered Siganl FFT

13

Oncethefrequencyresponseisavailable,thereisalargevarietyoffeaturesthatcanbe

extracted.AfewthatIexploredincludedthehighestpeakinthefrequencygraph,the

numberoffrequencypeaks,thetimedifferencebetweenpeaksinthetimedomainandthe

ratioofthehighestpeaktothesecondhighestpeakinthetimedomain.Thesearejusta

smallselectionIthoughtofbyobservingthedifferentresponsesoffourdrumsthatcanbe

seeninFigure7.

Figure7.DifferentDrumResponses

2.4.4.TrainingandClassification

ThissystemclassifiesdrumsusingasimpleK-nearestneighborsclassificationalgorithm.

Initially,duringthetraining,theuserplayseachdrum10or20timesandthesystemsaves

thethreefeaturesforeachdrumhit.Then,whenanewsampleisobtained,thethree

0 0.001 0.002 0.003 0.004 0.005 0.006 0.007 0.008 0.009 0.01Time (ms)

-0.4

-0.2

0

0.2

0.4

Ampl

itude

High Tom

0 500 1000 1500 2000 2500 3000 3500 4000 4500Frequency

0

0.05

0.1

0.15

0.2

0.25

Ampl

itude

High Tom FFT

0 0.001 0.002 0.003 0.004 0.005 0.006 0.007 0.008 0.009 0.01Time (ms)

-1

-0.5

0

0.5

1

Ampl

itude

Low Tom

0 500 1000 1500 2000 2500 3000 3500 4000 4500Frequency

0

0.1

0.2

0.3

0.4

0.5Am

plitu

de

Low Tom FFT

0 0.001 0.002 0.003 0.004 0.005 0.006 0.007 0.008 0.009 0.01Time (ms)

-0.3

-0.2

-0.1

0

0.1

0.2

0.3

0.4

Ampl

itude

Snare Drum

0 500 1000 1500 2000 2500 3000 3500 4000 4500Frequency

0

0.05

0.1

0.15

Ampl

itude

Snare Drum FFT

0 0.001 0.002 0.003 0.004 0.005 0.006 0.007 0.008 0.009 0.01Time (ms)

-0.4

-0.2

0

0.2

0.4

Ampl

itude

Bass Drum

0 500 1000 1500 2000 2500 3000 3500 4000 4500Frequency

0

0.05

0.1

0.15

0.2

0.25

0.3

Ampl

itude

Bass Drum FFT

14

featuresarecalculatedandthenthedistancetoeachofthetrainingpointsiscalculated.

Thesampleisclassifiedasthemajorityofits“k”nearest(intermsofEuclidiandistance)

neighbors.Thisisafairlysimplemachine-learningalgorithm,butthathelpsintermsof

comprehendingthereasoningforitsclassifications.Thismakesiteasiertotunethe

algorithmsoitiscateredtowardsdrums.Whiletestingoutdifferentfeatures,Iwould

generategraphsofallofthetrainingpointsandthegoalwastoseparateeachdrumas

muchaspossible.Figure8showsoneofthefirstonesthatworkeddecently.Thefeatures

arethepeakfrequency,thenumberofpeaksinthetimedomain,andtheratiobetweenthe

maximumpeakandthelastpointinthetimedomain.Forthemostparteachdrumisina

specificregion,whichiswhatmatters.

Figure8.FirstWorkingKNNModel

15

Throughmoreanalysisofthedifferentdrums’timeandfrequencyresponses,andmusical

intuitionofwhattypicaldrumssoundslike,IgeneratedtheKNNgraphshowninFigure9.

ThisisoneofthebestIobtained.Itusesthepeakfrequency,thesecondhighestfrequency,

andtheratioofthemaxpeakandthelastpoint.Otherthanoneortwooutliers,thehigh

tomandbassdrumareextremelyseparated,andlowtomandsnaredrumareslightly

separated.Itriedtocompensateforthatbyusinganewfeaturethatwasfairlyspecific.The

featurewasthenumberoffrequencypeaksthatwereequalorgreaterthanhalfofthe

highestmagnitudepeak,whichrepresentsspectraldensityinaway.ThisisbecauseI

noticedthesnaredrumhadalotoffrequencynoiseandbecauseofthathadamuchbusier

frequencyspectrumdespitehavingasimilarfirstandsecondpeak.ThefinalKNNisshown

inFigure10.Thefinalsystemimplementstheclassificationusingthepeakfrequency,the

secondhighestfrequency,andthenumberoffrequencypeaksgreaterorequaltohalfthe

magnitudeofthehighestpeak.Manydrumstrikesmappedtothesamepoint,soIsized

eachpointaccordingtothenumberofinstancesateachpoint.Itisofcoursepossibleto

keepoptimizingforthetrainingset,butthatcaneasilyleadtooverfitting,whichIfeltIwas

startingtodo.Byanalyzingthespecificdrumsinthetrainingset,andtryingtothinkof

featuresthatwouldfitthatspecificsoundwavecanleadtoresultsthatdonotadaptwellto

otherdrums.

16

Figure9.SecondWorkingKNNModel

Figure10.FinalKNNModel

50

100

4

150

200

250

300

Peak

Fre

quen

cy

3

350

400

450

Num freq peak > (mag(peak freq)/2 2

Clustering for Different Drums

900800

second freq

7006005004001 3002001000

Hi tomLow tomBass DrumSnare

17

3.Results/DiscussionOverall,Iwasabletoobtaindecentresultsthatperformedwellwithagiventrainingand

testingset.ThesewereprocessedinMatlab(notinrealtime),butthewayeachaudio

signalisprocessedsimulatesa10mssampletoproveitsfeasibilityonamicrocontroller.

TrainingSet NumberofSamples

HighTom 60

LowTom 20

BassDrum 203

SnareDrum 91

Thenumberofsamplesherevariessomuchbecauseofthewaythemicrophoneis

positioned;certaindrumsarelouderthanothers,andsinceeachisusingthesame

threshold,notallofthemareregisteredasdrumstrikesdespitetherecordingcontaining

similaramountsofeachdrum.Thisneedstobeaddressedsomehow.Afterrunningthem

throughtheKNNalgorithmwithK=3,theresubstitutionloss,whichishowthetrainingset

performs(errorrate)onaKNNalgorithm,was0.058.Alow6%meansthatthedatahas

separatedthedrumsfairlywell.

TestingSet TotalSamples

Detected

TotalClassified

Correctly

Accuracy

HighTom 86 83 0.965

LowTom 37 32 0.865

BassDrum 252 248 0.984

SnareDrum 145 124 0.855

Average 520 487 0.937

Overall,thesenumbersshowthatthesystemworksforthemostpartandachieves

accuracybeyondwhatIexpected.Intermsofusageforexploringtriggers,thistypeof

18

accuracyforafirstversionisagreatstartingpoint,butstillleavesalotofroomfor

improvement.

Theseresultscanvaryquiteabitdependingonafewvariablesthatarenotsetinstone.

Themainthingsarethethresholdthatcanchangehowmanydrumstrikesaredetected,

thevalueofK,whichgreatlyaffectstheaccuracyifthemodelisnotwellfitted,andlastly

thefeaturesthatarechosentocreatetheKNNmodel.Whilethefeaturesforalldrumsare

similarattheircore,dependingontheenvironment,tuningofthedrums,volume,sticks

thatareused,andmanyotherfactors,thesefeaturesmaynotrepresentthoseparticular

drumsbest.Forthisreason,thisapplicationmightbemoresuitedforaneuralnetthat

acceptstheentirewaveformtoclassifythedrums.Aswell,computationallyspeaking,KNN

isintensivecomparetoasimpleneuralnetworkandoncethesystemmovestoa

microcontroller,thiscouldposeanissue.

4.ConclusionandFutureWorkOverall,thisprojectwasaninterestingandsuccessfulexplorationofwirelessdrum

triggeringusingamicrophone.Thegoalofdevelopingacheap,reliable,easytosetup

systemwasachievedintheory(Matlab)andIwouldhavelovedtoimplementthisonthe

microcontroller,butbecauseofalackofplanningandunforeseendifficultiesIwasunable

to.Theaccuracyofthesystemwasproventobehighenoughtouseasamusician,and

therearesignificantimprovementsthatcanbemade.

Theobviousnextstepistomovethisovertoamicrocontrollersoitcanactuallytrigger

soundsinrealtime.Ihavealreadystartedthatprocess.Ihavethefullmicrophonesetup,

connectedtoaPIC32thatalsoisconnectedtoaTeensy3.2fordebuggingpurposes.I

alreadyhavethatcodesamplingfromthemicrophone,performinganenvelopefilter,and

detectingandcapturing10millisecondsofadrumstrike.Thenextstepsinvolveputting

thatsamplethroughanFFTinfixedpointsoitcanexecuteitinafewmilliseconds,and

thenalsoimplementingsometypeofKNNorneuralnetworkonthemicrocontroller.This

19

isnottoocommonbecauseofspaceconstraints,butI’mconfidenttherewillbeenough

spacebecausetheaudioprocessingcodeisprettymemoryefficient.

Otherstepsincludeimplementingthesamefunctionalitywithtwomicrophones.Withthis

typeofspeciallocality,detectingwhichdrumisplayedbecomesmuchmoretrivialandcan

leadtoaccuraciesabove98%.Thatwouldbeexceptionalforevenprofessionalmusicians

andcouldbedevelopedintoaproduct.Asignificantamountofworkwouldneedtobe

dedicatedtochoosingtherightclassificationalgorithmthatcanworkforvarious

environments,isnottoocomputationallyintensive,andisimplementableona

microcontrollerwithspacerestraints.Theotherissueisoverlappingorsimultaneousdrum

hits.Thisisprobablysolvableusingtwomicrophonesandsomemorefrequencyanalysis.

Overall,thisdesignprojecthasshownmethatasimplesolutioncanachievethe

functionalityofanacceptedstandardinamuchcheaperway.Itaswellhasinspiredmeto

trytodevelopthisintoaproductbecausenowIamconfidentitwillworkandifIwould

lovetouseit,Ibelievetherewillbeotherdrummerswhowouldaswell.

20

Appendix

MatlabCode%Import samples hitom_train_o = importdata('longSamp/train/hitom.mp3'); lotom_train_o = importdata('longSamp/train/lotom.mp3'); bass_train_o = importdata('longSamp/train/bass.mp3'); snare_train_o = importdata('longSamp/train/snare1.mp3'); hitom_test_o = importdata('longSamp/test/hitom.mp3'); lotom_test_o = importdata('longSamp/test/lotom.mp3'); bass_test_o = importdata('longSamp/test/bass.mp3'); snare_test_o = importdata('longSamp/test/snare1.mp3'); all_test_o = importdata('longSamp/test/all.mp3'); Freq = 44100; Freq_d = 5; Fs = Freq/Freq_d; % 8.8kHz %downsample Samples hitom_train = downsample(hitom_train_o.data(:, 1), Freq_d); lotom_train = downsample(lotom_train_o.data(:, 1), Freq_d); bass_train = downsample(bass_train_o.data(:, 1), Freq_d); snare_train = downsample(snare_train_o.data(:, 1), Freq_d); hitom_test = downsample(hitom_test_o.data(:, 1), Freq_d); lotom_test = downsample(lotom_test_o.data(:, 1), Freq_d); bass_test = downsample(bass_test_o.data(:, 1), Freq_d); snare_test = downsample(snare_test_o.data(:, 1), Freq_d); all_test = downsample(all_test_o.data(:, 1), Freq_d); drum_names = {'hitom', 'lotom', 'bass', 'snare'}; drum_train = {hitom_train, lotom_train, bass_train, snare_train}; drum_test = {hitom_test, lotom_test, bass_test, snare_test}; % Set Constants tiny = 0.3; threshold = 0.23; j = 1; class = ""; knn_x = []; knn_y= []; counter = 0; total = 0; for j=1:length(drum_train) class = drum_names(j); testThing = drum_train{j}; % actual sample lengthH = round(length(testThing)); envelope = zeros(1, lengthH); % envelope1 = zeros(1, lengthH); %instantiate feature vectors numPeaks = []; peakRatio = []; peakFreq = []; secFreq = [];

21

numfPeaks = []; total = 0; % Create envelope Filter i = 1; while (i < (lengthH)) envelope(i+1) = (tiny*testThing(i)+ (1.0-tiny)*envelope(i)); % envelope(i+1) = (tiny*abs(testThing(i))+ (1.0-tiny)*envelope(i)); i = i+1; end i = 1; while (i <= (lengthH)) if (envelope(i) > threshold && counter <= 0) start = i - 15; last = i + 65; % sample = envelope(start:last); sample = testThing(start:last); % Time Based Features [peaks, loc] = findpeaks(sample, 'MinPeakDistance', 10); numPeaks = [numPeaks, length(peaks)]; [sortedPeaks, sortedPeakI] = sort(peaks, 'descend'); peek = (sample(end)/sortedPeaks(1)); peakRatio = [peakRatio, peek]; % Frequency Based Features L = length(sample); NFFT = 2^nextpow2(L); % Next power of 2 from length of myRecording Y = fft(sample,NFFT)/L; f = Fs/2*linspace(0,1,NFFT/2+1); max_fft = max(abs(Y(1:round(NFFT/2+1)))); [fpeaks, floc] = findpeaks(abs(Y(1:round(NFFT/2+1)))); [sortedfPeaks, sortedfPeakI] = sort(fpeaks, 'descend'); peakFreq = [peakFreq, f(floc(sortedfPeakI(1)))]; secFreq = [secFreq, f(floc(sortedfPeakI(2)))]; [nfpeaks, nfloc] = findpeaks(abs(Y(1:round(NFFT/2+1))), 'MinPeakHeight', max_fft/2); numfPeaks = [numfPeaks, length(nfpeaks)]; % graphing for debuggin % figure(); % % subplot(2,1,1) % plot ((1:length(sample))/Fs, sample); % subplot (2, 1, 2); % plot(f,2*abs(Y(1:NFFT/2+1))) % title(["thing " num2str(i)]); % end % add feature vector to KNN data knn_x = [knn_x; [peek, f(floc(sortedfPeakI(1))), f(floc(sortedfPeakI(2)))]]; knn_y = [knn_y, class]; total = total + 1; counter = 100; envelope(i) = 0; end if (counter > 0) counter = counter - 1; end i = i +1;

22

end disp (['total training for ' drum_names{j} ' = ' num2str(total)]); xx = peakRatio; yy = secFreq; zz = peakFreq; figure(70) % scatter3(xx, yy, zz, 60, 'filled'); %Engine [uxy, jnk, idx] = unique([xx.',yy.', zz.'],'rows'); szscale = histc(idx,unique(idx)); %Plot Scale of 25 and stars scatter3(uxy(:,1),uxy(:,2), uxy(:,3),'o', 'filled', 'sizedata',szscale*25) title("Clustering for Different Drums "); xlabel('Peak between max and last peak'); ylabel('Second Frequency peak '); zlabel('Peak Frequency '); legend({'Hi tom', 'Low tom', 'Bass Drum', 'Snare'}, 'Fontsize', 18); hold all; end %KNN Model mdl = fitcknn(knn_x, knn_y); mdl.NumNeighbors = 3; rloss = resubLoss(mdl); % Training for j=1:length(drum_test) class = drum_names(j); testThing = drum_test{j}; lengthH = round(length(testThing)); envelope = zeros(1, lengthH); numPeaks = []; peakRatio = []; peakFreq = []; secFreq = []; numfPeaks = []; correct = 0; total = 0; i = 1; while (i < (lengthH)) envelope(i+1) = (tiny*testThing(i)+ (1.0-tiny)*envelope(i)); % envelope(i+1) = (tiny*abs(testThing(i))+ (1.0-tiny)*envelope(i)); i = i+1; end i = 1; while (i <= (lengthH)) if (envelope(i) > threshold && counter <= 0) start = i - 15; last = i + 65; % sample = envelope(start:last); sample = testThing(start:last); [peaks, loc] = findpeaks(sample, 'MinPeakDistance', 10);

23

numPeaks = [numPeaks, length(peaks)]; [sortedPeaks, sortedPeakI] = sort(peaks, 'descend'); peek = (sample(end)/sortedPeaks(1)); peakRatio = [peakRatio, peek]; L = length(sample); NFFT = 256;%2^nextpow2(L); % Next power of 2 from length of myRecording Y = fft(sample,NFFT)/L; f = Fs/2*linspace(0,1,NFFT/2+1); max_fft = max(abs(Y(1:round(NFFT/2+1)))); [fpeaks, floc] = findpeaks(abs(Y(1:round(NFFT/2+1)))); [sortedfPeaks, sortedfPeakI] = sort(fpeaks, 'descend'); peakFreq = [peakFreq, f(floc(sortedfPeakI(1)))]; secFreq = [secFreq, f(floc(sortedfPeakI(2)))]; [nfpeaks, nfloc] = findpeaks(abs(Y(1:round(NFFT/2+1))), 'MinPeakHeight', max_fft/2); numfPeaks = [numfPeaks, length(nfpeaks)]; % if (i > (22600/(1/(Fs/1000))) && i < (23400/(1/(Fs/1000)))) % figure(); % % subplot(2,1,1) % plot ((1:length(sample))/Fs, sample); % subplot (2, 1, 2); % plot(f,2*abs(Y(1:NFFT/2+1))) % title(["thing " num2str(i)]); % end sampleData = [peek, f(floc(sortedfPeakI(1))), f(floc(sortedfPeakI(2)))]; drumClass = predict(mdl,sampleData); if (strcmp(drumClass{1},drum_names{j})) correct = correct +1; end total = total + 1; counter = 100; envelope(i) = 0; end if (counter > 0) counter = counter - 1; end i = i +1; end disp(['For ' drum_names{j} ' got ' num2str(correct) '/' num2str(total) '=' num2str(correct/total)]) end