10
Laser Conductor James Noraky and Scott Skirlo Introduction After a long week of research, most MIT graduate students like to unwind by playing video games. To feel less guilty about being sedentary all week, some, like the authors of this proposal, prefer to play active games. However, because of the low graduate stipends, most students cannot afford game consoles like the Wii or Xbox, yet alone the accessories necessary to play active games. To democratize active gameplay, we detail how graduate students with access to the 6.111 labkit, a NTSC video camera, a laser galvanometer, and neon green gloves can create a game called Laser Conductor. Laser Conductor is an interactive game where the player assumes the role of a conductor. With the gloves as the baton, the player brandishes different gestures as the music of a symphony plays. The player is directed by arrows drawn by the laser galvanometer, which indicates how the player should move their hands. The figure below shows an example of a gesture sequence drawn by the laser galvanometer that the player must imitate. The player’s gestures are then recognized and scored using visual input from the NTSC video camera. Laser Conductor: The laser galvanometer draws the current gesture as indicated by the square and the player moves his or her hand in the direction indicated by the arrows.

Laser Conductor - MITweb.mit.edu/6.111/www/f2016/projects/jnoraky_Project...Laser Conductor James Noraky and Scott Skirlo Introduction After a long week of research, most MIT graduate

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Laser Conductor - MITweb.mit.edu/6.111/www/f2016/projects/jnoraky_Project...Laser Conductor James Noraky and Scott Skirlo Introduction After a long week of research, most MIT graduate

LaserConductor

JamesNorakyandScottSkirlo

IntroductionAfteralongweekofresearch,mostMITgraduatestudentsliketounwindbyplayingvideogames.Tofeellessguiltyaboutbeingsedentaryallweek,some,liketheauthorsofthisproposal,prefertoplayactivegames.However,becauseofthelowgraduatestipends,moststudentscannotaffordgameconsolesliketheWiiorXbox,yetalonetheaccessoriesnecessarytoplayactivegames.Todemocratizeactivegameplay,wedetailhowgraduatestudentswithaccesstothe6.111labkit,aNTSCvideocamera,alasergalvanometer,andneongreenglovescancreateagamecalledLaserConductor.LaserConductorisaninteractivegamewheretheplayerassumestheroleofaconductor.Withtheglovesasthebaton,theplayerbrandishesdifferentgesturesasthemusicofasymphonyplays.Theplayerisdirectedbyarrowsdrawnbythelasergalvanometer,whichindicateshowtheplayershouldmovetheirhands.Thefigurebelowshowsanexampleofagesturesequencedrawnbythelasergalvanometerthattheplayermustimitate.Theplayer’sgesturesarethenrecognizedandscoredusingvisualinputfromtheNTSCvideocamera.

LaserConductor:Thelasergalvanometerdrawsthecurrentgestureasindicatedbythesquare

andtheplayermoveshisorherhandinthedirectionindicatedbythearrows.

Page 2: Laser Conductor - MITweb.mit.edu/6.111/www/f2016/projects/jnoraky_Project...Laser Conductor James Noraky and Scott Skirlo Introduction After a long week of research, most MIT graduate

Thisreportisorganizedasfollows.First,wewilldescribegesturerecognitionmodule.Then,wewilldescribethememorymodule.Thisisfollowedbythegamelogicfinitestatemachine.Thisreportconcludeswithadiscussionofthegalvanometercontrolmodule.

GestureRecognition

JamesNoraky

InLaserConductor,theusersplayasaconductorandmovetheirhandsindirectionsdrawnbythelasergalvanometer.Akeypartofthegameistorecognizetheuser’sgesturesandcomparethemagainsttheexpectedgestures.Inthissection,wedescribehowbasicgesturerecognitionisaccomplished.Thegesturerecognitionsubsystemtakesasinputimagesfromacameraandoutputs,ifany,thedirectionthateachhandismoving.Figure1showsthekeycomponentsofthissubsystem.

Figure1:Gesturerecognitionsystemwiththerelevantsignalsindicated

First,wewillbrieflydescribehowthecameraworks.Then,wedescribehowtheuser’shandsarefirstlocalized.Finally,wewilldescribehowthelocationsofthehandsaretranslatedintogestures.Thesegesturesarefedintothegamelogicfinitestatemachine.InterfacingwiththeCameraTointerfacethelabkitwiththecamera,weuseaNTSCcamera.ThiscameraoutputsananalogsignalwhichencodestheYCbCrcolorcomponentsandthecontrolsignalsf,v,andh.AsopposedtotherasterscanoperationofVGAmonitors,theframesofaNTSCcameraareinterlaced.Consequently,therearetwofieldsoflinesthatcorrespondtotheevenandoddlinesoftheimage.Thisissignaledbythefbit.SimilartoVGAmonitors,thevandhsignalsaretheverticalandhorizontalsyncsignals.Thecodethatconvertstheanalogsignalintoadigitalformandthatobtainsthef,v,andhsignalsareprovidedin[1].HandLocalizationTolocalizetheirhands,theuserswearneongreenglovesshowninFigure2.Thiscolorischosenbecauseitisuncommoninthelabareaandwouldminimizeanysourcesofpotentialinterference.Tolocalizethehands,wefirstapplyathresholdtotheimageobtainedfromtheNTSCcameratoobtainabinaryimagewhere‘1’representspixelsthatareneongreen.

Page 3: Laser Conductor - MITweb.mit.edu/6.111/www/f2016/projects/jnoraky_Project...Laser Conductor James Noraky and Scott Skirlo Introduction After a long week of research, most MIT graduate

TheNTSCcameraoutputspixelcolorintheYCbCrcolorspace,where10bitsareallocatedforeachcomponent.OneapproachtoidentifyingthegreenpixelswouldbetoconverttheYCbCrcolorintotheHSVcolorspace,whereathresholdcanbeappliedtothehuevalue.Furthermore,thisapproachisalsoappealingbecausethecodetoconvertYCbCrtoRGBandtoconvertRGBtoHSVarebothprovided[2].OurapproachinsteadfirstconvertstheYCbCrcolorofeachpixelintoRGB.Wethenthresholdthegreenvalueofeachpixeltoobtainthebinaryimage.BecausewearenotconvertingtoHSV,weminimizethelatencybetweenthetimeapixelisreadtowhenabinaryvalueisoutputted.ThisapproachiseffectiveinlocalizingtheglovesasshowninFigure2.Fromthisbinaryimage,wewillapproximatecenterofeachgloveandtrackitstrajectorytoidentifythecorrectgesture.

Figure2:PipelinethatconvertstheYCbCRintoabinaryimagethatlocalizesthegreengloves,fromwhichthecentroidsoftheglovesarethencomputed.

Itshouldbenotedthatapplyingathresholdbasedonaminimumvaluealone(G>Gthresh)isnoteffective.ThisapproachwouldproduceabinaryimagethatincludesboththegreenglovesandanybrightlightsourcesbecausewhitelighthassignificantvaluesforallthreeRGBcolorchannels.Here,werealizethatwemustalsoimposeanupperboundforthegreenchannel(G>Gmin&&G<Gmax),whichaccountsforthefactthatthegreenglovedoesnotreflectalloftheincidentlight.Withthisinsight,wewereabletorobustlylocalizethegloves.Inadditiontogreen,wefoundthatthisapproachisalsoeffectiveforidentifyingbluepixels.Onceweobtainabinaryimage,ourtaskistothenlocalizethepositionofthegloves.Becausetherewillbesomespuriousnoise,wecanconsiderimplementingmedianfiltersormorphologicalerosions.However,giventhelargesizeofthehandsinthebinaryimage,thecomputationofthecentroidisrobusttothesmallamountofnoise.Anotherdesignchoicewaswhethertosegmentthegloves.Thiscanbedoneusesomeintegralprojectionstofiguretheaxiswhichseparatestheimagesofthetwogloves.However,onereasonableassumptionistousethecenteroftheimagetoseparatethetwogloves.Thissimplificationleadstorobustcentroidtracking.

Page 4: Laser Conductor - MITweb.mit.edu/6.111/www/f2016/projects/jnoraky_Project...Laser Conductor James Noraky and Scott Skirlo Introduction After a long week of research, most MIT graduate

Tocomputethecentroidlocationinrealtime,wehaveasinput:abinarypixelobtainedafterapplyingthethreshold;fwhichdenotestheframenumber;vwhichisthevsyncsignal;andhwhichisthehsyncsignal.Usingthevandhsignal,wecancomputethexandycoordinateforeachpixel.Denotingpasthepixel,thecentroidlocationfortheleftgloveisthen:

𝑥"#$%, 𝑦"#$% =𝑝+,, ⋅ (𝑥, 𝑦)+01

∑𝑝+,,

Ananalogouscomputationismadefortherightglove.Sincewearecomputingthelocationinreal-time,thenumeratoranddenominatorwouldthereforebepartialsums.Tocomputethecentroid,weneedtodivide.GiventhesizeoftheNTSCcamera,wecanchooseanappropriatesizeregisterbitwidthforthepartialsumvariables.Thisdecisionisimportantbecausedividersintroducelatencytothecomputationofthecentroidcoordinate.Thelatencyofarestoringdividerisequaltothebitwidthofitsarguments[3].However,sincewesimplyneedtoappearreal-time,wedonotneedtobeveryprecisebecausetheplayercannotperceiveafewcyclesofdifferenceina27MHzclock.Tostartthedivider,wemakeuseofthefsignaltostartthedivisiononthetransitionfromanoddtoevenframe.Tofigureoutthissignal,itwashelpfultousethelogicanalyzertoseewhatsignalswouldbereliabletostartthedivision.Tocomputethecentroidlocationsforboththeleftandrightglove(denotedCX_L,CY_L,CX_R,CY_R),weuse4restoringdividersthatruninparallel.Becauseofthelatencyandpotentialfordifferenceamongthedividers,weneedtocomputeareadysignalthattakesintoaccountthestatusofeachdivider.Thisstepisimportantsothatspuriouscoordinatesarenotpassedintothegesturerecognitionsubsystem,whichwedescribenext. GestureRecognitionInthegame,thelasergalvanometerdrawsdifferentarrowswhichrepresentthetrajectoriesinwhichtheusermustmovetheirhands.Wewillrefertothesehandtrajectoriesasgestures.Inourimplementation,wesupport8directionsasshownbelow.

Page 5: Laser Conductor - MITweb.mit.edu/6.111/www/f2016/projects/jnoraky_Project...Laser Conductor James Noraky and Scott Skirlo Introduction After a long week of research, most MIT graduate

Figure3:ThedirectionssupportedbytheLaserConductorgame.Theblueregionrepresentsthemarginoferrorthatthecentroidsareallowedtostillbeclassifiedcorrectly.

Inthediagram,theblueregionsrepresentthemargininwhichthetrajectoriesneedtobeintobecorrectlycategorized.Ifthetrajectoriesdonotfallintheblueregion,thenitisclassifiedasanunknowngesture.Trajectoriesarecomputedbymaintainingthelast3centroidpositions.Anewcentroidpositionisloadedinwhenthereadysignalisassertedfromthehandlocalizationsubsystem.Fromthesethreecentroidpositions,wecancreate2vectorsbysubtractingthemostrecentcentroidsfromtheearliestcentroid.Wecandenotethesevectorsas(dx,dy).Then,wecheckcheckifthesevectorsarecontainedwithintheblueregionasshowninthediagram.Forexample,tocheckiftheglovesaremovingup,wehavetomakesurethethedxpositioniswithintheboundsandthedypositionkeepincreasing.Forthediagonaldirections,wemustcheckthedifferencebetweendxanddyaswellasthedirection.Werepeatthisprocessforboththeleftandrightglovetooutputagesture.Thisgestureisupdatedforeachnewcentroidposition.Thesegesturesarethenfedintothegamelogicfinitestatemachine.

FutureWorkIfwehadmoretime,anaturalextensionwouldbetodetectskincolordirectly.Ifthecamerasensorhasagoodcolorresponse,skincolorcanbedetecteddirectlywithouttheneedofgreengloves.Forexample,thewebcamofmostApplelaptopshavesuchcharacteristics.Todothis,wecanuseanotherNTSCcameraorinterfaceawebcamwiththelabkit.Tointerfaceawebcamwiththelabkit,wewilldescribehowlargeamountsofdatacanbetransferredusingtheFTDIUSB-to-FIFOchipwhenwedescribethemusicofthegame.Furthermore,wecanfurtherincreasetherobustnessofthecentroidtracking.Todothis,wewoulduseintegralprojectionstolocalizethecentroidofeachhandwithoutassumingthecentraldividingaxis.Coupledwithatrajectory“smoother,”wecanalsogetmorerobustresults.OnesuchapproachwouldinvolveaKalmanfilter.References[1]NTSCZBTSampleCode.http://web.mit.edu/6.111/www/f2016/tools/ntsc_zbtfix.zip[2]ConversionfromYCbCrtoRGB.http://web.mit.edu/6.111/www/f2011/tools/ycrcb2rgb.v[3]RestoringDivider.http://web.mit.edu/6.111/www/f2016/index.html

Page 6: Laser Conductor - MITweb.mit.edu/6.111/www/f2016/projects/jnoraky_Project...Laser Conductor James Noraky and Scott Skirlo Introduction After a long week of research, most MIT graduate

MusicManagerJamesNoraky

AkeypartoftheLaserConductorgameplayisthemusicsincetheuserisplayingasaconductor.Furthermore,iftheuserperformsagesturecorrectly,achimeshouldindicatethatthemoveiscorrect.Tohandlethevariousmusicandsoundsthatwillbeplayed,weimplementedamusicmanagerthatstoresandplaysbothasongandthechimeforacorrectmove.TheinputofthismoduleareflagsfromthegamelogicFSM:(1)aflagthatindicatesthestartofthegameandmusicand(2)aflagthatindicatesacorrectgestureandthestartofachime.Duetothedifferentdurationsofthesongandthechime,westorethesongintheZBTmemoryandwestorethechimesoundinBRAM.Boththesongandthechimeareuploadedandstoredonthelabkitafterithasbeenprogramed.LoadingtheSongontotheLabKitThearchitectureofthemusicmanagerissimilartotherecorderinLab5.Intherecorderlab,wetookthemicrophonedatafromtheAC97chip,appliedalow-passfilter,andstoredadecimatedversionofthefilteredsignal.OneapproachtoloaddataontothelabkitistoconnecttheheadphoneoutputofacomputertothemicrophoneinputofthelabkitandstorethedatalikeinLab5.ThebenefitsofthisapproachisthatitminimizesthecomplexityoftheZBTtiming.However,thismeansthatloadinginasongafterthelabkithasbeenprogrammedcantakeseveralminutes.Tominimizetheloadtime,weusetheFTDIUSB-to-FIFOchiptoloaddatafromacomputertothelabkit.Totransmitdatafromthecomputer,wemodifiedtheprovidedcode[1]sothatitcancompileontheOSXoperatingsystemandsendlargerdatapackets.WealsousedthesuppliedVerilogcodetoaccessthedataandstoreitinmemory.WethenwroteaMATLABscripttoconvertanyarbitraryMP3fileintoaformatthatcanbeloadedontotheFIFOusingourmodifiedC++code.Withthesemodifications,wecanloadseveralminutesworthofmusicinapproximately10seconds.Inourimplementation,wealsobypassthefilteringstepofthesoundinputandloadthefiltereddatadirectlyontothelabkit.Duringplayback,wewouldapplythesamefiltertointerpolatethesignalinto48kHz.Becauseofthehightransferrate,thetimingfortheZBTmemorybecomesimportant.TomakesuretheZBTmemoryisproperlyfunctioning,weusedtheprovidedtheramclockmodule[2]tocorrectlyusetheZBTmemory.Tofullyutilizethememory,wealsowroteasimplememorymanagertofullyutilize32outofthe36bitsofeachZBTmemoryrow.Bydoingthis,wecanstoremorethana5-minutesongintheZBTmemory.LoadingtheChimeontotheLabKit

Page 7: Laser Conductor - MITweb.mit.edu/6.111/www/f2016/projects/jnoraky_Project...Laser Conductor James Noraky and Scott Skirlo Introduction After a long week of research, most MIT graduate

WeusedasimilarapproachtoloadthechimeexceptthatinsteadofZBT,thechimeisstoredintheBRAM.Inplaybackmode,whenagestureisplayedcorrectly,thechimeisoverlaidwiththemusictoprovideaudiofeedbacktotheplayer.PuttingitAllTogetherInsteadoftherecordbuttonofLab5,wewiredthedifferentswitchestoloadinmusicforthesongandforthechime.Thisisperformedafterthelabkithasbeenprogramed.Whenboththesoundsareloaded,therecorderentersplaybackmodewhenthestartsignalisassertedbythegamelogicfinitestatemachine.Onepotentialquestionis:whydidwenotstorethedatainflash?WestoredthedatainZBTandBRAMforthepossibilityoftheuserbeingabletochangethemusicinsteadofusingsomepreprogrammedsong.Asafuturefeature,wecouldincludeasimplealgorithmthatconvertstheloadedmusicintothedifferentgesturesfordynamicgameplay,allowingourgametoaccommodatedifferentmusicaltastes.References[1]Datatransfer.http://web.mit.edu/6.111/www/f2016/tools/flash_IO.zip[2]Ramclockmodule.http://web.mit.edu/6.111/www/f2016/index.html

Page 8: Laser Conductor - MITweb.mit.edu/6.111/www/f2016/projects/jnoraky_Project...Laser Conductor James Noraky and Scott Skirlo Introduction After a long week of research, most MIT graduate

GameLogicModuleScottSkirlo

Fig.Schematicdiagramofsystem

Ifocusedondevelopingthegamelogicmodule,galvo_controlmoduleandaffiliatedmodulesassociatedwithgestureandvectorgraphicsstorage.Thetoughestlinkformeintheentireprojectwasfiguringoutbesthowtoflexiblyinterfaceshaperetrievalanddrawinginaflexibleway.OriginallyIenvisionedrewritingashapeRAMeverysingleframe.ThisRAMwouldcontainasetofcoordinatesforthegalvotostepthroughandthenumberofremainingstepsintheshapebeforecompletion.UponrealizingthatoneshapewasdrawnthegalvowouldgotothenextmemoryaddressintheRAMandproceedtoplotthecoordinatesforthatshape,goinguntilitreacheda"endoffile"numberindicatingthatithadcompletelygonethroughtheRAMframe.Originallythegamecontrollerwould'vedirectlyinterfacedwithagestureROM.AsIworkedthroughtheproblem,Irealizedthatflexiblywritingavariablenumberofshapes,ofdifferenttypesandcoordinateoffsetswouldbeverychallenging.BecauseofthisItriedtosimplifytheinformationbeingpassedbetweenthegameFSMandthegalvocontrollerasmuchaspossible,reducingthisonlytoashapetypeandcoordinateoffsettypes.Operationthenproceededasfollows.ForeverynewframethegameFSMwouldfillinabuffercontainingavariablenumberofshapetypesandoffsetsdependingontheuserinputsandthecurrentfetchedgesturesfromthegestureROM.Duringa"slave"phase,thegameFSMwouldthenrespondtothegalvocontrollerforrequestsforthenextshape.Uponassertionofthe"request_shape"flag,thegameFSMwouldreturnthecurrentshapeandoffsetset,andincrementtheshapebufferandgestureoffsetbuffers.Withthisinformation,thegalvocontrollerwouldthenrequesttheshapenumberandthefirstvertexnumberfromtheshapeROM.Uponreceivingthesecoordinatesandtheremainingvertexnumber,thegalvocontrollerwouldaddthesetothepointoffsetsandsetthex_posandy_posvaluestothis.Afterdoingthisthegalvo_controllerstartsatimertoallowtheshapetobedrawnandwaitforthegalvotomechanicallyrespondtothenewvoltages.Thiswaittimewasdeterminedbythedifferencebetweentheoriginalxandycoordinatesandthenewxandycoordinatestoallowthelaserline

Page 9: Laser Conductor - MITweb.mit.edu/6.111/www/f2016/projects/jnoraky_Project...Laser Conductor James Noraky and Scott Skirlo Introduction After a long week of research, most MIT graduate

"weight"tobeasuniformaspossiblefortheentirevectorgraphicfigure.Uponregisteringthatonlyonevertexisleftfromtheshapesbeingdrawn,thegalvocontrollerturnsthelaseroffandrequeststhenextshape,if"wait_shape_memory"hasn'tbeenasserted."Wait_shape_memory"isassertedbythegame_FSMwhenthefinalshapeintheshapebufferisreached.Upondoingthisthegame_FSMchecksit'sowntimertoseeifasufficienttimehasexpiredtoupdatethenextmove.Ifnot,itsimplyloadstheshapebufferwiththeoldshapes,andupdatestheoffsetcoordinatebuffersasnecessarybeforereturningtoslavemode.

GalvoControllerModuleScottSkirlo

Thegalvomirrorswerecontrolledby28-bitDACsfromtheuseroutput'softhelabkit.Theseinturnarefedtotwosetsofinputsandinvertinginputs.Tocreatethesecondinvertedsignal,theDACoutputisalsofedthroughanLF-741inaninvertingconfiguration.TheDACsareona0to5Vrange,sotheeffectivedifferentialinputvoltageoftheGalvocontrollerrangesfrom0to10V.Thereweresomesmallissueswiththeresistorsforsettinguptheopampintheinvertingconfiguration.When1kresistorswereused,wenoticedthattheretendedtobea"ringing"intheDACoutputwhichresultedinanotablefuzinessinthescreendrawing.Adjustingtheseto10kimprovedtheperformance.Theadditionoffilterinalowpassconfigurationendedupcausingoscillation.Originallyitwasplannedthat12bitserialDACswouldbeutilizedtodrivethescreen.However,inpracticethegalvocontrollermirrorshavesomuchinertiaandmechanicalnoise,andthefactthatthelaserspotsizeisalreadynotsonarrow,thatinpracticehavinghigherresolutionisnotusefulormeaningful.

Lasersource

Page 10: Laser Conductor - MITweb.mit.edu/6.111/www/f2016/projects/jnoraky_Project...Laser Conductor James Noraky and Scott Skirlo Introduction After a long week of research, most MIT graduate

Thelasersourcewasobtainedcheaplyonlineforafewdollars,andproduceda5mWredlaseroutput.Thisredlaserwasperfectlyeyesafeatthesepowerlevels(classI),sothatnoextraprecautionswouldbenecessaryforusingthesystem.Originallyweplannedondrawingatotalof7shapesatthesametime.Althoughwehadthecapacityforthisanddemonstratedit(seeFigbelow),forthesimplicityofgame-play,andtoallowthemostimportantarrowstobebrighter.Atlargedistances,6drawnarrowsarebarelyvisible,whereas2arequiteclearatthedistanceof1-10meters.Thereweresomeplanstoupdatetheresolutionofthegalvocontrollerto12bits.Thegalvohadcertaininterestingquirksassociatedwithit'soperation.