Upload
phamnhan
View
216
Download
9
Embed Size (px)
Citation preview
WirelessReal-TimeDrumTriggering
ADesignProjectReport
PresentedtotheSchoolofElectricalandComputerEngineering
ofCornellUniversity
InPartialFulfillmentoftheRequirementsfortheDegreeof
MasterofEngineering,ElectricalComputerEngineering
SubmittedbyCurranSinha
MEngFieldAdvisor:BruceLandDegreeDate:December2017
2
Abstract
MasterofEngineeringProgram
SchoolofElectricalComputerEngineering
CornellUniversity
DesignProjectReport
ProjectTitle:WirelessReal-TimeDrumTriggering
Author:CurranSinha
Abstract:Withtheriseofelectronicmusicandhomeproductioninthemusicindustry,
drumtriggeringisgainingpopularityandisnowusedwidelybyprofessionalsand
amateursinliveandstudioenvironments.Drumtriggeringallowsamusiciantosample
varioussoundsusingmechanismstodetectadrumstrike,whichthentriggersamidi
sample.Typicalsetupsinvolvesensorsor“triggers”foreachdrumthatarewiredtoa
drummodule,whicheitherdirectlytriggerssamplesorgeneratesmididata.Thereareno
currentproductsthatallowforcheapwirelesstriggering,whichareportableandeasyto
setup.Thisdesignprojectaimstofillthatgapinthemarketbydevelopingasystemthat
usesjustonemicrophoneandamicrocontrollertodistinguishbetweenmultipledrums.
3
ExecutiveSummaryDrumtriggeringisacommontoolthatallowsmusicianstotriggerarbitrarysampleslikepercussioninstrumentsormelodictextures.Whetheritisbeingusinginaliveorstudiosetting,itaddsanotherdimensiontotheinstrumentbylayeringdifferentvoicesontopoftypicaldrumsetsounds.Eventhoughithasbeenaroundfordecades,inthepastfewyearstheriseofelectronicmusicandtheeaseofproducingmusicwithoutastudiohavepushedmoredrummerstothinkaboutexploringtriggering.Thestandardsystemfortriggeringrequiresaseparatepieceofhardwareforeachdrumthatneedstriggeringability,andallofthoseneedtobewiredtoacentraldrummoduletogeneratethesounds.Thesetupandconvenienceofthecurrenttechnologyislessthanideal.Newproductshavecomeoutthattrytofixthat,buttheyallarelackinginatleastonewaythatmakesthemundesirable.Eithertheycostmorethanacasualdrummerlookingtoexploretriggeringiswillingtopay,ortheydonothavethefunctionalitytotriggerdifferentdrums,ortheyareinconvenienttotransportandsetup.Thisdesignprojectaimstofixthat.Thisprojectisaproofofconceptforaproductthatusesjustamicrophoneandamicrocontrollertotriggermultipledrumsinrealtime.Thesystemissmallenoughtoeasilycarryaroundandthesetuprequiredisjustplacingthesystemnearthedrumsandtrainingeachdrum.ByimplementingthealgorithmthatperformsthetrainingandclassificationinMatlab,Iwasabletoprovethefeasibilityofthisdesign.Thealgorithmiseasilytransferabletoamicrocontrollertoexecutethesamealgorithminrealtimeandtriggersoundaccordingly.Therearequiteafewsignificantproblemswithtriggeringdrumsusingjustonemicrophone.Firstoff,tomaintainrealtimeresponse,theentiresystemmustdetectadrumstrikeandtriggertheaudiosamplewithin15milliseconds.Thatmeansonlyabout10millisecondsofasamplecanbeusedtoanalyze,andthatshortofasampledoesnotcontainalargeamountofinformationaboutthedrum.Aswell,onceastrikeisdetectedandasampleisrecorded,thefeaturesthatuniquelydistinguishthesamplefromotherdrumsamplesareusedinaK-nearestneighborsalgorithmtoallowtheusertoplayadrumandtriggeranysound.Thefinalresultsareextremelypromising.Givenatrainingsettotrainon,thealgorithmcorrectlyidentified93%ofthetestsamplesaccurately.Forversion1ofthedesign,thissystemhasplentyofroomtogrowandimprove,andhopefullywilleventuallyleadtoarealproduct.
4
1.Introduction1.1Motivation
Drumtriggeringisusedwidelybymusiciansallovertheworldtoaddelectronicand
sampledsoundstotheirplayingwhileusinganacousticdrumkit.Thisisnotjustafad.
Electronicmusicthatusesnon-acousticsoundsisgainingpopularityandmusiciansare
expectingthesesoundstoberecreatedinlivesettingsorstudiosettings.Aswell,
drummersarestartingtoexperimentwithcreatingfullsongsontheirown.Thisinvolves
triggeringthemelody,harmonyandbasslines,whilestillplayingthedrumpart.Ihave
beenplayingdrumsfor8yearsandatCornellI’mcurrentlypartofajazzcombo,jazzbig
band,andafunk/R&Bband,sowhenIstartedtothinkofafinaldesignproject,I
immediatelythoughtaboutdesigningsomethingrelatedtodrumsormusic.
Recently,aproductwasreleasedcalledSensoryPercussion(sunhou.se)thatusessensors
onadrumtodetectdifferenthitsdependingonwherethedrumisstruckandtriggeraudio
samplesaccordingly.Thesensorscandetectmanysounds(7+)likecenterofthehead,edge
ofthehead,rimofthedrum,rimshot(hittingtherimanddrumheadatthesametime)at
thecenterofthehead,rimshotattheedgeofthehead,andafewmore.Eachofthese
soundscanthenbeassignedtoasampleanditgivesthedrummernewsoundstocreate
musicwith.Alongwiththis,thecompanyhassoftwarethatallowsformoreadvanced
featureslikeblendingbetweentwodifferentsoundsifyouhitinbetweenregions.
Thisproductinspiredmebecauseitcombinedmyloveofdrummingwithhardwareand
signalprocessing.Irememberwhentheproductfirstcameout,thedrummingcommunity
wasextremelyexcitedandtheystillare,butthereareafewflawswiththesystem.Thefirst
issueisthatthesesensorsworkwithjustonedrumandiftheuserwantstotriggerwith
morethanonedrum,theyneedanothersensor.Thisislimitingbecausemostdrummers
areusedtocreatingbeatsusingseparatesoundsourcesandthusdifferentlimbs,which
wouldrequiremoresensors.Thisisrelatedtoanotherissue,whichisthatthesoftwareand
onesensorcosts$700.Eachadditionalsensoris$300.Formostmusicians,thisisan
expensivepartandcreatesahighbarriertoenterthemarket.Thelastissueisthatsetting
upthesystemcantakeawhilebecauseoneneedstoattachallthesensorsandthenwire
5
eachsensoruptoanaudiointerface.Forrecordingpurposesthisisfine,butifsomeone
wantstoquicklysetupthesesensorsitisinconvenient.Iusedthese“downsides”as
inspirationwhiledevelopingmyproject.
1.2.PreviousWork
Mymaingoalsforthisprojectweretodesignaportable,easytosetup,reliabledrum
triggeringmechanismthatdidnotcosttoomuchandhadnonoticeablelatency.Theinitial
thoughtIhadwastodothisusingonlyamicrophoneandamicrocontroller.Takingalook
atsomesimilarproductsonthemarket,IdiscoveredVersatrigger:awirelessdrum
triggeringmechanism.Despitemeetingafewofmyrequirementsofbeingwirelessand
havinglowlatency,thesetupprocessistediousandlaborious.Thistriggerneedstobe
installedinsidethedrumwhichmakeitnonportableandhardtosetup.Aswell,one
triggerandthehub,whichreceivesthedatafromthetrigger,costsaround$120.
Figure1.Versatrigger
AnotherproductthatwasrecentlyreleasediscalledtheElectronicAcousticModule(EAD)
fromYamaha.Theyuseasimilarprincipletothisdesignprojectbyusingjustoneunit,but
itdoesnottriggerdrumsseparately.Instead,itprocessesthesoundcomingfromthe
microphone(s)withcertainsoundeffectstoaddanotherlayerofsound.Thisisastepinthe
rightdirectionandconfirmsthatthereisgrowingneedforaproductliketheoneIam
developing.
6
Figure2:YamahaEAD
2.Implementation2.1PossibleSolutions
Tacklingthisproblemisnotsimplebecauseithasnotbeendonebeforeandcanvary
dependingontheenvironment.Oneapproachistousewirelesspiezosensorswitha
simpleradioprotocolsuchasZigbee,whichwouldtransferthesensorreadingstoacentral
hubthatgeneratedmididata.Thisisrelativelyeasytosetupandprobablywould
accomplishthejob,butthenissuesmightarisewithpricebecauseofvariousdifferentparts
andwithbatterylifebecauseradioprotocolstakeupadecentamountofpowerregardless
ofhowlittleinformationisbeingtransferred.Anotheroptionistousetwomicrophonesin
frontofthedrumsettogainnotonlyaudioinformation,butalsospatiallocalityofthe
drumset.Thisispromising,butabetterideawithregardstothatisjusttoplacetwo
microphonesnearthemiddleofthedrumsetandpointthemicrophonesinopposite
direction.Thiswouldhelpdeterminethedirectionthesoundiscomingfrom,whichcanbe
combinedwiththeaudioprocessingtodeterminethedrum.Thisideaseemedtomeetall
myrequirements,butIdecidedtostartwithjustonemicrophoneandrelyonaudio
processingtoseeifthatwouldbepossible.
2.2.SystemRequirements
Thecurrentdesignassumesanenvironmentconsistingofonlyadrumset,wheretheuseris
triggeringsoundsusingmultipledrums,butnotplayingcymbals.Inordertousetriggering
7
inaliveenvironment(bandperformance),atriggerperdrumisnecessarybecauseofall
thenoisefromotherinstruments.Thescopeofthisprojectiscateredmoretowards
drummerswhowanttoexplorenewsoundsandthepossibilitiesoftriggering.Aswell,the
systemisnotexpectedtoproduce100%accuracybecauseofthesimplicityandcheapprice
ofthedesign.Iwouldhopeforaccuracyabove80%though,wheremosthitsareregistered
andtriggerthecorrectsound,buttherewillbeafewmisplayedsounds.Moreiterationsof
thedesignwillhelpincreasetheaccuracythough.
Themainapproachtothisprojectistoconstantlysampleaudiofromthemicrophone,usea
simplepoweroramplitudethresholdtodetectthestartofahit,runtheaudiothroughan
FFT,sendthespectrumandtheactualsoundwaveintoaclassificationnetwork,andtrigger
anaudiosampleaccordingly.Thetrainingoftheclassificationnetworkwillhappenfirst
anditwillgenerateaclassificationmodelforthedrums.
2.3.Issues
Onemainissueisthefrequencyspectrumofadrumisnotlikeapianoortrumpetnote,
becausethereisnotalwaysaclearprimaryfrequencyandifthereistherearemanyother
harmonicsthatappear.Aswell,dependingonhowhardthedrumishit,thefrequency
spectrumcanchange.Onarelatednote,inordertomaintainrealtimeresponse,thetime
betweenthedrumhitandthemiditriggermustbelessthan15millisecondsbecausereal
timeaudioperceptionisbetween10-20milliseconds.Thismeanstheactualsamplethat
willbeusedcanonlybearound8-10millisecondsbecauseittakestimetoprocessthe
sampleandgenerateamiditrigger.Withsuchashortsample,thetoneofthedrumwillnot
becompletelypresent.Itinsteadwillbeacombinationofthetoneofthedrumandthe
impactofthestickhittingthehead.Acomparisonofafulldrumsoundwavetowhat10
millisecondsofadrumsoundisshowninFigure3and4.
8
Figure3.FullTomSoundWave
Figure4.10msTomSoundWave
2.28 2.285 2.29 2.295 2.3Time (ms) 104
-0.4
-0.3
-0.2
-0.1
0
0.1
0.2
0.3
0.4Am
plitu
de
Full (220 ms) High Tom Hit
22806 ms23026 ms
2.2805 2.2806 2.2807 2.2808 2.2809 2.281 2.2811 2.2812 2.2813 2.2814Time (ms) 104
-0.4
-0.3
-0.2
-0.1
0
0.1
0.2
0.3
0.4
Ampl
itude
10ms High Tom Hit
9
Anotherinterestingissueistheclassificationmechanism.Itneedstoberobustandableto
adapttovariousdifferentdrums,butitcannottakeuptoomuchspaceortaketoolong
computationally.
2.4.SoftwareDesign
Atitscurrentstate,myprojectismoreofaproofofconceptforarealtimesystemthat
provesthefeasibilityofusingjustonemicrophonetotriggermultipledrums.Thesystemis
implementedinMatlab,butforrealtimeexecutionitneedstoberunonamicrocontroller.
Itiseasytoportthesystemtoonebecauseallthestepsarerelativelybasiccomputations.
InthenextsectionsIwillelaborateoneachstepofthedesign.
2.4.1.AudioInput
Thesystemsamplesaudioat8khzbecausethemajorityofdrumfrequenciesarepresent
below1000kHz.FortheprototypingIwasdoingwithMatlab,IendedupusingaZoom
H4Nmicrophone,whichhastwostereomicrophones.Itsamplesas44100,butdown
samplingthatissimple.Fortheactualmicrophonecircuitrythatwouldbeconnectedtothe
microcontroller,ahighpassandlowpassfilterwillbeaddedalongwithanamplifier.The
lowandhighpasswillhelpfilteroutanynoisethatispresent,whiletheamplifierallows
theusertosetthegaindependingonwherethemicrophoneisplacedandhowloudthey
areplaying.Theaudiosamplesarestoredinacircularbuffertoallowthesystemtohave
somepreviousrecordingstotakewhenadrumstrikeitdetected.
2.4.2.DetectingDrumStrike
Inordertodetectthestartofadrumstrike,thesystemneedstodetectasuddenincreasein
amplitudeaboveacertainthreshold,butitmustnottriggeronsporadicnoisethatmightbe
presentinenvironment.Thebasicwaytodothisistogenerateanenvelopeofthesignal
thattriestooutlinetheamplitudeofthesignal.TherearetwowaysItriedtoapproachthis:
anexponentialmovingaverage(1poleIIR)andamovingaveragewindow(.Both
approachesareshownonasampledrumstrikeinFigureXXX.Fortheexponentialmoving
average,itusesthisformula:
10
𝐸 𝑖 + 1 = 𝛼 ∗ 𝑖𝑛𝑝𝑢𝑡 𝑖 + 1− 𝛼 ∗ 𝐸(𝑖)
Ann-windowmovingaverageusesthefollowequation:
𝐸 𝑖 = 1𝑛 𝑖𝑛𝑝𝑢𝑡 𝑖 )!!!
!!!
Theinput(i)totheformulascanbeeithertherawmicrophoneinputsortheabsolutevalue
ofthoseresults.Usingtheabsolutevaluesallowstheenvelopetorespondslightlyfaster,
butforthisspecificapplication,itdoesnotaffecttheresultsignificantly.Also,thealpha
termfortheexponentialmovingaveragecaneasilybeadjustedtochangetheresponse
time,whichcanbeusefulwhentuningthesystem.Intermsofimplementinganenvelope
onamicrocontroller,theexponentialmovingaverageismuchsimpler,soIchosetowork
withthat.
Figure4.DrumStrikewithFilteredSignals
9.7155 9.716 9.7165 9.717 9.7175 9.718 9.7185 9.719 9.7195 9.72 9.7205Time (samples) 104
-0.4
-0.3
-0.2
-0.1
0
0.1
0.2
0.3
0.4
0.5
Ampl
itude
Audio signal with Envelopes
Original SignalExponential w AbsExponential wout Abs5 pt Moving Avg
11
2.4.3.FeatureExtraction
Afteradrumstrikeisdetected,thesystemtakesnoteofthatspecificindexandsaves
around2millisecondsbeforethedetectionasthestartofthesample.Itthenproceedsto
collectsamplesfor~8moremilliseconds,whichat8khzis64samples.Thisproducesa
10mssampletoprocess.Theseoffsetswerechosenbyanalyzingdrumstrikesandfinding
thesmallestindexthatwasneededtoobtainthestartofthedrumhitformostsamples,and
thentherestisallocatedforafterthestriketogainasmuchinformationaspossibleabout
thedrum.
Next,thesignalisprocessedtoextracttheimportantfeatures.Alargeproblemwith
extractingfeaturesisthatthereisnodefinitionforwhatisimportantforaspecificdrum.A
usefulmethodtodeterminethisisbyvisualizingthefeaturesasaK-nearestneighbors
classificationalgorithm.Inordertodothat,Ionlyusethreefeaturesbecauseitiseasyto
graphthefeaturesforallthedrumsandseeifthereisaclearseparationbetweenthem.I
willtalkmoreaboutthetradeoffsbetweenthefeaturesintheclassificationsection.The
mostobviousfeaturetoextractisthefrequencyresponseofthedrumsinceabassdrum
clearlyhasalowerpitchthenasmalltomdrumanditcanbeappliedtodrumsnomatter
howbigorsmalltheyare.TodothisthesystemgeneratesthediscreteFouriertransformof
the10-millisecondsampleusingthefastFouriertransform.Theoutputisagraphthat
representstheamplitudesofeachfrequencyrange.Anotherfactortoconsideriswhether
thefilteredsignalortheoriginalsignalshouldbepassedintotheFFT.Theoriginalsignal
seemslikethebetteroptioncomparingFigure5and6becauseithaslargerpeaks,but
issuescanariseifthereissignificantnoisethataddsinsomefrequencycomponents.
However,sincethenoisewillshowupinfrequencybinsseparatedfromthemainsampleI
chosetousetheoriginalsignal.Itisimportanttonotethatrawamplitudedataisnotuseful
becausethesystemneedstofunctionregardlessofthevolumeofthedrumhit.Todealwith
this,thesystemusesfeaturesbasedaroundrelativedataliketheorderofpeaksorusing
amplituderatiosrelativetothehighestpeak.
12
Figure5.OriginalSignalResponse
Figure6.FilteredSignalResponse
0 0.001 0.002 0.003 0.004 0.005 0.006 0.007 0.008 0.009 0.01Time (ms)
-0.4
-0.2
0
0.2
0.4
0.6
Ampl
itude
Raw Siganl
0 500 1000 1500 2000 2500 3000 3500 4000 4500Frequency
0
0.05
0.1
0.15
0.2
0.25
0.3
Ampl
itude
Raw Siganl FFT
0 0.001 0.002 0.003 0.004 0.005 0.006 0.007 0.008 0.009 0.01Time (ms)
-0.4
-0.2
0
0.2
0.4
Ampl
itude
Filtered Siganl
0 500 1000 1500 2000 2500 3000 3500 4000 4500Frequency
0
0.05
0.1
0.15
0.2
0.25
Ampl
itude
Filtered Siganl FFT
13
Oncethefrequencyresponseisavailable,thereisalargevarietyoffeaturesthatcanbe
extracted.AfewthatIexploredincludedthehighestpeakinthefrequencygraph,the
numberoffrequencypeaks,thetimedifferencebetweenpeaksinthetimedomainandthe
ratioofthehighestpeaktothesecondhighestpeakinthetimedomain.Thesearejusta
smallselectionIthoughtofbyobservingthedifferentresponsesoffourdrumsthatcanbe
seeninFigure7.
Figure7.DifferentDrumResponses
2.4.4.TrainingandClassification
ThissystemclassifiesdrumsusingasimpleK-nearestneighborsclassificationalgorithm.
Initially,duringthetraining,theuserplayseachdrum10or20timesandthesystemsaves
thethreefeaturesforeachdrumhit.Then,whenanewsampleisobtained,thethree
0 0.001 0.002 0.003 0.004 0.005 0.006 0.007 0.008 0.009 0.01Time (ms)
-0.4
-0.2
0
0.2
0.4
Ampl
itude
High Tom
0 500 1000 1500 2000 2500 3000 3500 4000 4500Frequency
0
0.05
0.1
0.15
0.2
0.25
Ampl
itude
High Tom FFT
0 0.001 0.002 0.003 0.004 0.005 0.006 0.007 0.008 0.009 0.01Time (ms)
-1
-0.5
0
0.5
1
Ampl
itude
Low Tom
0 500 1000 1500 2000 2500 3000 3500 4000 4500Frequency
0
0.1
0.2
0.3
0.4
0.5Am
plitu
de
Low Tom FFT
0 0.001 0.002 0.003 0.004 0.005 0.006 0.007 0.008 0.009 0.01Time (ms)
-0.3
-0.2
-0.1
0
0.1
0.2
0.3
0.4
Ampl
itude
Snare Drum
0 500 1000 1500 2000 2500 3000 3500 4000 4500Frequency
0
0.05
0.1
0.15
Ampl
itude
Snare Drum FFT
0 0.001 0.002 0.003 0.004 0.005 0.006 0.007 0.008 0.009 0.01Time (ms)
-0.4
-0.2
0
0.2
0.4
Ampl
itude
Bass Drum
0 500 1000 1500 2000 2500 3000 3500 4000 4500Frequency
0
0.05
0.1
0.15
0.2
0.25
0.3
Ampl
itude
Bass Drum FFT
14
featuresarecalculatedandthenthedistancetoeachofthetrainingpointsiscalculated.
Thesampleisclassifiedasthemajorityofits“k”nearest(intermsofEuclidiandistance)
neighbors.Thisisafairlysimplemachine-learningalgorithm,butthathelpsintermsof
comprehendingthereasoningforitsclassifications.Thismakesiteasiertotunethe
algorithmsoitiscateredtowardsdrums.Whiletestingoutdifferentfeatures,Iwould
generategraphsofallofthetrainingpointsandthegoalwastoseparateeachdrumas
muchaspossible.Figure8showsoneofthefirstonesthatworkeddecently.Thefeatures
arethepeakfrequency,thenumberofpeaksinthetimedomain,andtheratiobetweenthe
maximumpeakandthelastpointinthetimedomain.Forthemostparteachdrumisina
specificregion,whichiswhatmatters.
Figure8.FirstWorkingKNNModel
15
Throughmoreanalysisofthedifferentdrums’timeandfrequencyresponses,andmusical
intuitionofwhattypicaldrumssoundslike,IgeneratedtheKNNgraphshowninFigure9.
ThisisoneofthebestIobtained.Itusesthepeakfrequency,thesecondhighestfrequency,
andtheratioofthemaxpeakandthelastpoint.Otherthanoneortwooutliers,thehigh
tomandbassdrumareextremelyseparated,andlowtomandsnaredrumareslightly
separated.Itriedtocompensateforthatbyusinganewfeaturethatwasfairlyspecific.The
featurewasthenumberoffrequencypeaksthatwereequalorgreaterthanhalfofthe
highestmagnitudepeak,whichrepresentsspectraldensityinaway.ThisisbecauseI
noticedthesnaredrumhadalotoffrequencynoiseandbecauseofthathadamuchbusier
frequencyspectrumdespitehavingasimilarfirstandsecondpeak.ThefinalKNNisshown
inFigure10.Thefinalsystemimplementstheclassificationusingthepeakfrequency,the
secondhighestfrequency,andthenumberoffrequencypeaksgreaterorequaltohalfthe
magnitudeofthehighestpeak.Manydrumstrikesmappedtothesamepoint,soIsized
eachpointaccordingtothenumberofinstancesateachpoint.Itisofcoursepossibleto
keepoptimizingforthetrainingset,butthatcaneasilyleadtooverfitting,whichIfeltIwas
startingtodo.Byanalyzingthespecificdrumsinthetrainingset,andtryingtothinkof
featuresthatwouldfitthatspecificsoundwavecanleadtoresultsthatdonotadaptwellto
otherdrums.
16
Figure9.SecondWorkingKNNModel
Figure10.FinalKNNModel
50
100
4
150
200
250
300
Peak
Fre
quen
cy
3
350
400
450
Num freq peak > (mag(peak freq)/2 2
Clustering for Different Drums
900800
second freq
7006005004001 3002001000
Hi tomLow tomBass DrumSnare
17
3.Results/DiscussionOverall,Iwasabletoobtaindecentresultsthatperformedwellwithagiventrainingand
testingset.ThesewereprocessedinMatlab(notinrealtime),butthewayeachaudio
signalisprocessedsimulatesa10mssampletoproveitsfeasibilityonamicrocontroller.
TrainingSet NumberofSamples
HighTom 60
LowTom 20
BassDrum 203
SnareDrum 91
Thenumberofsamplesherevariessomuchbecauseofthewaythemicrophoneis
positioned;certaindrumsarelouderthanothers,andsinceeachisusingthesame
threshold,notallofthemareregisteredasdrumstrikesdespitetherecordingcontaining
similaramountsofeachdrum.Thisneedstobeaddressedsomehow.Afterrunningthem
throughtheKNNalgorithmwithK=3,theresubstitutionloss,whichishowthetrainingset
performs(errorrate)onaKNNalgorithm,was0.058.Alow6%meansthatthedatahas
separatedthedrumsfairlywell.
TestingSet TotalSamples
Detected
TotalClassified
Correctly
Accuracy
HighTom 86 83 0.965
LowTom 37 32 0.865
BassDrum 252 248 0.984
SnareDrum 145 124 0.855
Average 520 487 0.937
Overall,thesenumbersshowthatthesystemworksforthemostpartandachieves
accuracybeyondwhatIexpected.Intermsofusageforexploringtriggers,thistypeof
18
accuracyforafirstversionisagreatstartingpoint,butstillleavesalotofroomfor
improvement.
Theseresultscanvaryquiteabitdependingonafewvariablesthatarenotsetinstone.
Themainthingsarethethresholdthatcanchangehowmanydrumstrikesaredetected,
thevalueofK,whichgreatlyaffectstheaccuracyifthemodelisnotwellfitted,andlastly
thefeaturesthatarechosentocreatetheKNNmodel.Whilethefeaturesforalldrumsare
similarattheircore,dependingontheenvironment,tuningofthedrums,volume,sticks
thatareused,andmanyotherfactors,thesefeaturesmaynotrepresentthoseparticular
drumsbest.Forthisreason,thisapplicationmightbemoresuitedforaneuralnetthat
acceptstheentirewaveformtoclassifythedrums.Aswell,computationallyspeaking,KNN
isintensivecomparetoasimpleneuralnetworkandoncethesystemmovestoa
microcontroller,thiscouldposeanissue.
4.ConclusionandFutureWorkOverall,thisprojectwasaninterestingandsuccessfulexplorationofwirelessdrum
triggeringusingamicrophone.Thegoalofdevelopingacheap,reliable,easytosetup
systemwasachievedintheory(Matlab)andIwouldhavelovedtoimplementthisonthe
microcontroller,butbecauseofalackofplanningandunforeseendifficultiesIwasunable
to.Theaccuracyofthesystemwasproventobehighenoughtouseasamusician,and
therearesignificantimprovementsthatcanbemade.
Theobviousnextstepistomovethisovertoamicrocontrollersoitcanactuallytrigger
soundsinrealtime.Ihavealreadystartedthatprocess.Ihavethefullmicrophonesetup,
connectedtoaPIC32thatalsoisconnectedtoaTeensy3.2fordebuggingpurposes.I
alreadyhavethatcodesamplingfromthemicrophone,performinganenvelopefilter,and
detectingandcapturing10millisecondsofadrumstrike.Thenextstepsinvolveputting
thatsamplethroughanFFTinfixedpointsoitcanexecuteitinafewmilliseconds,and
thenalsoimplementingsometypeofKNNorneuralnetworkonthemicrocontroller.This
19
isnottoocommonbecauseofspaceconstraints,butI’mconfidenttherewillbeenough
spacebecausetheaudioprocessingcodeisprettymemoryefficient.
Otherstepsincludeimplementingthesamefunctionalitywithtwomicrophones.Withthis
typeofspeciallocality,detectingwhichdrumisplayedbecomesmuchmoretrivialandcan
leadtoaccuraciesabove98%.Thatwouldbeexceptionalforevenprofessionalmusicians
andcouldbedevelopedintoaproduct.Asignificantamountofworkwouldneedtobe
dedicatedtochoosingtherightclassificationalgorithmthatcanworkforvarious
environments,isnottoocomputationallyintensive,andisimplementableona
microcontrollerwithspacerestraints.Theotherissueisoverlappingorsimultaneousdrum
hits.Thisisprobablysolvableusingtwomicrophonesandsomemorefrequencyanalysis.
Overall,thisdesignprojecthasshownmethatasimplesolutioncanachievethe
functionalityofanacceptedstandardinamuchcheaperway.Itaswellhasinspiredmeto
trytodevelopthisintoaproductbecausenowIamconfidentitwillworkandifIwould
lovetouseit,Ibelievetherewillbeotherdrummerswhowouldaswell.
20
Appendix
MatlabCode%Import samples hitom_train_o = importdata('longSamp/train/hitom.mp3'); lotom_train_o = importdata('longSamp/train/lotom.mp3'); bass_train_o = importdata('longSamp/train/bass.mp3'); snare_train_o = importdata('longSamp/train/snare1.mp3'); hitom_test_o = importdata('longSamp/test/hitom.mp3'); lotom_test_o = importdata('longSamp/test/lotom.mp3'); bass_test_o = importdata('longSamp/test/bass.mp3'); snare_test_o = importdata('longSamp/test/snare1.mp3'); all_test_o = importdata('longSamp/test/all.mp3'); Freq = 44100; Freq_d = 5; Fs = Freq/Freq_d; % 8.8kHz %downsample Samples hitom_train = downsample(hitom_train_o.data(:, 1), Freq_d); lotom_train = downsample(lotom_train_o.data(:, 1), Freq_d); bass_train = downsample(bass_train_o.data(:, 1), Freq_d); snare_train = downsample(snare_train_o.data(:, 1), Freq_d); hitom_test = downsample(hitom_test_o.data(:, 1), Freq_d); lotom_test = downsample(lotom_test_o.data(:, 1), Freq_d); bass_test = downsample(bass_test_o.data(:, 1), Freq_d); snare_test = downsample(snare_test_o.data(:, 1), Freq_d); all_test = downsample(all_test_o.data(:, 1), Freq_d); drum_names = {'hitom', 'lotom', 'bass', 'snare'}; drum_train = {hitom_train, lotom_train, bass_train, snare_train}; drum_test = {hitom_test, lotom_test, bass_test, snare_test}; % Set Constants tiny = 0.3; threshold = 0.23; j = 1; class = ""; knn_x = []; knn_y= []; counter = 0; total = 0; for j=1:length(drum_train) class = drum_names(j); testThing = drum_train{j}; % actual sample lengthH = round(length(testThing)); envelope = zeros(1, lengthH); % envelope1 = zeros(1, lengthH); %instantiate feature vectors numPeaks = []; peakRatio = []; peakFreq = []; secFreq = [];
21
numfPeaks = []; total = 0; % Create envelope Filter i = 1; while (i < (lengthH)) envelope(i+1) = (tiny*testThing(i)+ (1.0-tiny)*envelope(i)); % envelope(i+1) = (tiny*abs(testThing(i))+ (1.0-tiny)*envelope(i)); i = i+1; end i = 1; while (i <= (lengthH)) if (envelope(i) > threshold && counter <= 0) start = i - 15; last = i + 65; % sample = envelope(start:last); sample = testThing(start:last); % Time Based Features [peaks, loc] = findpeaks(sample, 'MinPeakDistance', 10); numPeaks = [numPeaks, length(peaks)]; [sortedPeaks, sortedPeakI] = sort(peaks, 'descend'); peek = (sample(end)/sortedPeaks(1)); peakRatio = [peakRatio, peek]; % Frequency Based Features L = length(sample); NFFT = 2^nextpow2(L); % Next power of 2 from length of myRecording Y = fft(sample,NFFT)/L; f = Fs/2*linspace(0,1,NFFT/2+1); max_fft = max(abs(Y(1:round(NFFT/2+1)))); [fpeaks, floc] = findpeaks(abs(Y(1:round(NFFT/2+1)))); [sortedfPeaks, sortedfPeakI] = sort(fpeaks, 'descend'); peakFreq = [peakFreq, f(floc(sortedfPeakI(1)))]; secFreq = [secFreq, f(floc(sortedfPeakI(2)))]; [nfpeaks, nfloc] = findpeaks(abs(Y(1:round(NFFT/2+1))), 'MinPeakHeight', max_fft/2); numfPeaks = [numfPeaks, length(nfpeaks)]; % graphing for debuggin % figure(); % % subplot(2,1,1) % plot ((1:length(sample))/Fs, sample); % subplot (2, 1, 2); % plot(f,2*abs(Y(1:NFFT/2+1))) % title(["thing " num2str(i)]); % end % add feature vector to KNN data knn_x = [knn_x; [peek, f(floc(sortedfPeakI(1))), f(floc(sortedfPeakI(2)))]]; knn_y = [knn_y, class]; total = total + 1; counter = 100; envelope(i) = 0; end if (counter > 0) counter = counter - 1; end i = i +1;
22
end disp (['total training for ' drum_names{j} ' = ' num2str(total)]); xx = peakRatio; yy = secFreq; zz = peakFreq; figure(70) % scatter3(xx, yy, zz, 60, 'filled'); %Engine [uxy, jnk, idx] = unique([xx.',yy.', zz.'],'rows'); szscale = histc(idx,unique(idx)); %Plot Scale of 25 and stars scatter3(uxy(:,1),uxy(:,2), uxy(:,3),'o', 'filled', 'sizedata',szscale*25) title("Clustering for Different Drums "); xlabel('Peak between max and last peak'); ylabel('Second Frequency peak '); zlabel('Peak Frequency '); legend({'Hi tom', 'Low tom', 'Bass Drum', 'Snare'}, 'Fontsize', 18); hold all; end %KNN Model mdl = fitcknn(knn_x, knn_y); mdl.NumNeighbors = 3; rloss = resubLoss(mdl); % Training for j=1:length(drum_test) class = drum_names(j); testThing = drum_test{j}; lengthH = round(length(testThing)); envelope = zeros(1, lengthH); numPeaks = []; peakRatio = []; peakFreq = []; secFreq = []; numfPeaks = []; correct = 0; total = 0; i = 1; while (i < (lengthH)) envelope(i+1) = (tiny*testThing(i)+ (1.0-tiny)*envelope(i)); % envelope(i+1) = (tiny*abs(testThing(i))+ (1.0-tiny)*envelope(i)); i = i+1; end i = 1; while (i <= (lengthH)) if (envelope(i) > threshold && counter <= 0) start = i - 15; last = i + 65; % sample = envelope(start:last); sample = testThing(start:last); [peaks, loc] = findpeaks(sample, 'MinPeakDistance', 10);
23
numPeaks = [numPeaks, length(peaks)]; [sortedPeaks, sortedPeakI] = sort(peaks, 'descend'); peek = (sample(end)/sortedPeaks(1)); peakRatio = [peakRatio, peek]; L = length(sample); NFFT = 256;%2^nextpow2(L); % Next power of 2 from length of myRecording Y = fft(sample,NFFT)/L; f = Fs/2*linspace(0,1,NFFT/2+1); max_fft = max(abs(Y(1:round(NFFT/2+1)))); [fpeaks, floc] = findpeaks(abs(Y(1:round(NFFT/2+1)))); [sortedfPeaks, sortedfPeakI] = sort(fpeaks, 'descend'); peakFreq = [peakFreq, f(floc(sortedfPeakI(1)))]; secFreq = [secFreq, f(floc(sortedfPeakI(2)))]; [nfpeaks, nfloc] = findpeaks(abs(Y(1:round(NFFT/2+1))), 'MinPeakHeight', max_fft/2); numfPeaks = [numfPeaks, length(nfpeaks)]; % if (i > (22600/(1/(Fs/1000))) && i < (23400/(1/(Fs/1000)))) % figure(); % % subplot(2,1,1) % plot ((1:length(sample))/Fs, sample); % subplot (2, 1, 2); % plot(f,2*abs(Y(1:NFFT/2+1))) % title(["thing " num2str(i)]); % end sampleData = [peek, f(floc(sortedfPeakI(1))), f(floc(sortedfPeakI(2)))]; drumClass = predict(mdl,sampleData); if (strcmp(drumClass{1},drum_names{j})) correct = correct +1; end total = total + 1; counter = 100; envelope(i) = 0; end if (counter > 0) counter = counter - 1; end i = i +1; end disp(['For ' drum_names{j} ' got ' num2str(correct) '/' num2str(total) '=' num2str(correct/total)]) end