ECE 532 Design Project Group Report
Virtual Piano
Chi Wei
Hecheng Wang
April 9, 2012
Table of Contents
1 Overview
    1.1 Goals
    1.2 Background and motivation
    1.3 System overview
    1.4 IP and Hardware Descriptions
2 Outcome
    2.1 Review of initial concept
    2.2 Our Final Product
    2.3 Improvements for Future Development
3 Project Schedule
    3.1 Weekly Milestone Accomplishments
    3.2 Discussion and Evaluation
4 Description of the Blocks
    4.1 MicroBlaze (software)
        4.1.1 Software Overview
        4.1.2 Data Structures
        4.1.3 Detection Algorithm
        4.1.4 Sound generation
        4.1.5 Sound Playback Control
    4.2 Video_TO_RAM Custom Tracking Logic
        4.2.1 Overview of Operation
        4.2.2 Tracking Logic
        4.2.3 Finite State Machine
    4.3 MPMC Controller
    4.4 PLB Bus
    4.5 OPB bus
    4.6 OPB-PLB, PLB-OPB Bridge
    4.7 OPB_AC97 Controller
5 Description of Our Design Tree
References
Appendix
    Appendix A – tone waveforms
1 Overview
1.1 Goals
Any flat surface can become a virtual piano! The user merely plays as if on a real piano
and a video camera tracks the movement of the fingers. When a finger touches any point
on the Virtual Piano, the corresponding note is played back through the speakers. The user
could lay down a paper cut-out of the piano keys on the surface or remove this visual
reference for the truly wondrous experience of playing an invisible piano.
1.2 Background and motivation
Musical instruments have long been a major source of entertainment in our culture. With
all the advancements in DSP and electronics in recent years, generating music
electronically has never been easier. Yet many musicians still train with and prefer to play
the physical instruments. With our project, we hope to advance the creation of music
further into the digital realm while still drawing upon skills learned via traditional
instruments. We have picked the Piano as our choice of instrument to digitize.
The Piano is widely played by music students and musicians. It has already seen one level
of digitization: the keyboard. However, while the keyboard allows us to fiddle with the
digital sound output, it still relies on the physical input for playing. Our project takes the
piano to the next level by removing the physical presence of the keys and allowing the
piano to be played by recognizing the gestures of the hands of the player.
1.3 System overview
Below is our system level block diagram.
Figure 1 System Level Block Diagram
1.4 IP and Hardware Descriptions
The following table provides a summary of all IP cores and hardware devices used for this
project.
Table 1 Summary of IP cores and devices
IP/Hardware | Function | Author
video_to_ram (Tracking Logic) | Modified version of video_to_ram from the previous year’s laser pointer game project. Detects dots of green colour and keeps track of the positions of 4 different dots. | Group
MicroBlaze | Soft processor core for setting up video, hit detection and generation of sound | Xilinx, program implemented by group
dlmb/dlmb_ctrl | Data memory controller interfaced through Local Memory Bus (LMB) | Xilinx
ilmb/ilmb_ctrl | Instruction memory controller interfaced through LMB | Xilinx
PLB Bus | 2 PLB buses are used: (1) used by MicroBlaze to talk to the AC97 Controller (through OPB) and memory; (2) used between video_to_ram and memory | Xilinx
OPB Bus | Used to communicate with AC97 Controller | Xilinx
PLB2OPB Bridge / OPB2PLB Bridge | Establish communication channels between PLB and OPB buses | Xilinx
Debug module (mdm) | Enables XMD | Xilinx
DDR_SDRAM (mpmc) | Memory to store the color limit value and the coordinates of 4 green dots | Xilinx
opb_ac97 | Audio codec controller | Mike Wirthlin (ISU)
LM4550 | AC97 codec | National Semiconductor
Camera | Captures the video and finger movement | Sony
Video Daughter Card | Interface with camera | Analog Devices
UART | Transmit debug info using RS232 | Xilinx
clock_generator | Generate timing signals | Xilinx
led_debug_mux | Display the debugging value controlled by DIP switches | Xilinx
Speakers | Play audio |
2 Outcome
2.1 Review of initial concept
All of the primary design goals have been met. These include:
1. Ability to play virtual piano on any flat light-colored surface.
2. Simultaneous recognition of up to four fingers.*
3. Ability to play one tone at a time in the range of a single octave.
4. White piano keys only
5. Response time of < 0.25 seconds. In fact the response seems to be less than 0.1
second.
Some of the optional goals were also met:
1. The piano can be played on a surface of any color, light or dark. In fact, the
background color does not matter as long as it is not green. Furthermore,
lighting has minimal impact on the accuracy of the detection.
2. The range was increased to a full octave plus one note, 8 notes in total:
C4, D4, E4, F4, G4, A4, B4, C5.
3. The number of tones that can be produced simultaneously was increased to three.
*The original goal of 5-finger tracking was reduced to 4 fingers because the difference has
only a minimal impact on the overall performance of the system. The goal was revised after
consulting with the TA.
2.2 Our Final Product
The finished product, as of the writing of this report, detects and tracks the coordinates
of 4 green dots on the fingertips simultaneously. We did not extend tracking beyond
4 fingers because chord generation limits a chord to 3 tones. It would therefore be
pointless to track 10 fingers as initially proposed, since it would make no difference to
the system’s sound generation.
For the physical setup, users put green stickers on their fingertips to enable finger
tracking. A layout of keys is drawn on a white board as a reference to guide the user’s
playing. The system generates the corresponding tone when a downward finger movement is
detected in a specific region of the screen. Multiple fingers generate a chord, and the
response time is satisfactory with little lag. Overall, the project met our initial goals;
most of the features important to playing a virtual piano are implemented and fully tested.
Our physical system setup is as follows.
Green stickers need to be put on to play the virtual piano.
2.3 Improvements for Future Development
1. The 3-tone chord is not as smooth as the 2-tone chord. We suspect that the slow
floating-point operations on the MicroBlaze leave the AC97 FIFO buffer partly empty.
As a future improvement, an additional MicroBlaze could be dedicated to sound
generation. Alternatively, sound generation could be done in hardware to speed up the
process and eliminate the undesired characteristics of the current system.
2. The playback duration is currently fixed. To make the playing experience more
realistic, the playback length could vary depending on the actual position and movement
of the fingertips. Our detection module already detects both upward and downward finger
movement, so this extension is straightforward and only requires changes to sound
generation and playback control.
3. Black keys (half-tone regions and sounds) could be added to increase the number of
playable keys. Additional tones in general would also make the whole system more
appealing.
3 Project Schedule
3.1 Weekly Milestone Accomplishments
This is a summary of the comparison between the proposed milestones and the actual weekly
progress.
Table 2 Summary of proposed Milestones and actual weekly progress
Dates | Proposed Milestones | Actual Progress
Feb 8 | Create the base project and experiment with VGA decoder, AC'97 audio decoder and add MPMC to the system. | Constructed a base project with MPMC and DDR memory as well as opb_AC97 core to control the AC97 codec.
Feb 15 | Complete VGA_to_RAM, test video is captured and stored in RAM correctly. | Successfully implemented video_to_ram module to capture the video data and store it in SDRAM. Tested sound generation as well as storing audio PCM data in the memory for playback.
Feb 22 | (Reading week) | (Reading week)
Feb 29 | Build and test basic sound generator module and active region detection module. Demonstrate software detection algorithm functionality. | Implemented audio playback controller to play pre-stored audio tone samples from memory. Implemented hardware detection block to track a single red dot without simulation.
Mar 7 | Complete active region detection module and sound generator module. Complete software detection algorithm. | Implemented the software detection algorithm and simulated and debugged it in C. Simulation of single red dot tracking works; began building multipoint tracking.
Mar 14 | First integration attempt. Debug the entire system. | Integrated the software and hardware system. Added interrupt signal from the hardware and scaled down to 2-point tracking for debugging purposes.
Mar 21 | Major bugs should be fixed. | Major bugs fixed; as a result the basic functions of tracking two red dots and playing the corresponding tones are working.
Mar 28 | Optimization of software detection algorithm and hardware modules. Improve the sound. | Changed the color detection to green for better performance, increased the number of tracked dots to 3. Expanded the playable tones to 7.
Apr 4 | The system should work properly. Fix the remaining minor bugs. | Added additional tones for playing. Fixed the 4-dot tracking issue. Physical setup of the system was finalized.
Apr 11 | Final Demo | Final Demonstration
3.2 Discussion and Evaluation
Throughout the development of this project, we were able to follow the pace of the
proposed milestones fairly well. Although we intended the project to progress at an
even pace, the original plan turned out to be heavier at the back end: as the complexity
of the system increased, the time and effort required to debug it also increased
drastically. As expected, the majority of our time was spent verifying functionality and
debugging the issues that arose from system integration. Although we modified the
milestones each week, in most cases they were merely changed to include specific details
and to better reflect the actual progress of our work. During one week, however, we fell
behind schedule due to a bug in the video detection block. To remedy this, we worked extra
hard the following week and were able to catch up without further delays. Overall, the
original milestones were well set, but they could be improved by compressing some of the
work at the front end to leave more time for debugging.
4 Description of the Blocks
4.1 MicroBlaze (software)
4.1.1 Software Overview
The software component contains three modules: detection, sound generation and sound
playback control. Below is the program flow chart.
Figure 2 Software Core Flow Chart
The interrupt from the tracking logic is triggered when new coordinates are available. The
interrupt routine is kept as short as possible so that the main loop has enough time to
feed sufficient samples to the audio FIFO.
4.1.2 Data Structures
Two important data structures are created: the dot structure and the playback structure.
Their definitions are the following:
The dot_structure contains the position and state information of each finger. It is only
updated in the interrupt routine. The playback_structure contains all the information
needed for sound generation. We support up to three playback sources simultaneously, so
up to three tones can be played together to form a chord. A playback_structure can only
be marked as busy in the interrupt routine, when a key has been pressed. It can only
be cleared in the main loop, when the tone has finished playing.
Program Initialization:
- Video module setup
- Audio module setup
- Initialize program global data structures and parameters
- Enable MicroBlaze interrupt
Program main loop:
- Check if the audio INFIFO is full. If it is full, skip this iteration.
- Generate sound samples and feed the data into the audio FIFO.
- If no playback structure is occupied, clear the FIFO.
Interrupt Routine (entered on the interrupt signal; the main loop resumes afterwards):
- Read the updated coordinates from memory and validate the new coordinates.
- Update the four fingers’ states stored in the dot structures.
- Update a playback structure if a new event is triggered (a key is pressed).
In addition, some enum types are also created to better organize the program. The
dot_state type contains supported tracking states. The region type stores all supported key
regions. REGION_NON is the area outside of key regions.
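The structure and enum definitions themselves are not reproduced in this copy of the report. The sketch below shows one plausible shape for them, in Python rather than the project's C for brevity. Every field name, every DotState member and every Region member other than REGION_NON is an assumption made for illustration, not the authors' code.

```python
from dataclasses import dataclass
from enum import Enum, auto


class DotState(Enum):
    """Tracking states of one finger (member names are assumed)."""
    IDLE = auto()
    MOVING_DOWNWARD = auto()
    DONE_DOWNWARD = auto()   # a key press has been detected
    MOVING_UPWARD = auto()
    DONE_UPWARD = auto()


class Region(Enum):
    """Key regions; REGION_NON is the area outside all key regions."""
    REGION_NON = 0
    C4 = 1
    D4 = 2
    E4 = 3
    F4 = 4
    G4 = 5
    A4 = 6
    B4 = 7
    C5 = 8


@dataclass
class DotStructure:
    """Position and state of one finger; written only by the interrupt routine."""
    x: int = 0
    y: int = 0
    valid: bool = False
    state: DotState = DotState.IDLE
    region: Region = Region.REGION_NON


@dataclass
class PlaybackStructure:
    """One playback source; at most three tones sound at once."""
    busy: bool = False
    tone: Region = Region.REGION_NON
    position: int = 0  # samples already fed to the audio FIFO


NUM_DOTS = 4
NUM_PLAYBACK_SOURCES = 3
dots = [DotStructure() for _ in range(NUM_DOTS)]
playback = [PlaybackStructure() for _ in range(NUM_PLAYBACK_SOURCES)]


def claim_playback(tone):
    """Interrupt side: mark the first free source busy; return its index or None."""
    for i, src in enumerate(playback):
        if not src.busy:
            src.busy, src.tone, src.position = True, tone, 0
            return i
    return None  # all three sources occupied: the key press is dropped


def release_playback(index):
    """Main-loop side: clear a source once its tone has finished playing."""
    playback[index] = PlaybackStructure()
```

Splitting the two operations this way mirrors the rule stated above: only the interrupt routine sets the busy flag, and only the main loop clears it.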
4.1.3 Detection Algorithm
The detection algorithm is implemented as a finite state machine (FSM). The following is
the FSM state diagram.
Figure 3 Detection FSM
The FSM is only updated in the interrupt routine. After processing each frame, the
tracking logic writes the new coordinates to memory and sends an interrupt signal to
notify the software core. Done_Downward is the most important state in this project, as it
indicates that a key has been pressed. The threshold is the number of frames the finger
must keep moving in the same direction before the FSM enters the next state; it adjusts
the sensitivity of the detection. After a key is pressed, if the finger stays in the key
regions, it is held in the IDLE state until it moves out of the key regions. This covers
the case where the user rests the finger in the key regions after pressing a key. The user
has to move the finger out of the key regions to reset the state and then move downward
again to trigger another event. Otherwise the movement is treated as a zombie state of the
previous event and does not trigger a new event. As the diagram shows, the detection
algorithm detects not only key presses but also the movement of the fingers. This makes it
possible to develop the project further if additional features are needed. For example, a
piano sound usually lasts for a certain length and starts to fade out when the key is
released. This key-release state is already captured in the FSM.
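The behaviour described above can be sketched as a small per-finger state machine. This is an illustration under stated assumptions, not the project's FSM: only the IDLE and Done_Downward states are named in the text, so the other state names, the threshold value of 3 frames, and the convention that the Y coordinate grows as the finger moves downward are all hypothetical.

```python
THRESHOLD = 3  # assumed: frames of consistent motion needed for a transition


class FingerFSM:
    """Per-finger detector, updated once per frame from the interrupt routine."""

    def __init__(self):
        self.state = "IDLE"
        self.count = 0        # consecutive frames moving in the current direction
        self.blocked = False  # set after a press until the finger leaves the keys
        self.prev_y = None

    def update(self, y, in_key_region):
        """Process one frame; return True exactly when a key press is detected."""
        if not in_key_region:
            self.blocked = False  # leaving the key regions re-arms the finger
        dy = 0 if self.prev_y is None else y - self.prev_y
        self.prev_y = y

        # A DONE_* state lasts a single frame, then the FSM starts over.
        if self.state in ("DONE_DOWNWARD", "DONE_UPWARD"):
            self.state, self.count = "IDLE", 0

        direction = "DOWNWARD" if dy > 0 else "UPWARD" if dy < 0 else None
        if direction is None:
            self.state, self.count = "IDLE", 0
            return False

        moving = "MOVING_" + direction
        self.count = self.count + 1 if self.state == moving else 1
        self.state = moving
        if self.count < THRESHOLD:
            return False

        self.state, self.count = "DONE_" + direction, 0
        if direction == "DOWNWARD" and in_key_region and not self.blocked:
            self.blocked = True  # further presses ignored until the finger leaves
            return True
        return False
```

Raising THRESHOLD makes the detector less sensitive to jitter at the cost of a slower response, which is the trade-off the threshold parameter controls.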
4.1.4 Sound generation
Considering the speed of the MicroBlaze, basic sound samples (one period for each tone) are
pre-generated on a PC with a Python script. The AC'97 codec is set to a 44.1 kHz sampling
rate, so every tone has a fixed number of samples per period. To improve sound quality, we
also added harmonic frequencies and an envelope. In total we implemented 8 tones, from C4
to C5. The samples are stored in 16-bit Motorola (big-endian) format to be consistent with
the system setup. Below are the frequency and waveform for C4 (see Appendix A for the
remaining tones).
Figure 4 Piano Tone C4 Samples – 261Hz, 167 Samples
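The original generation script is not included in the report. A minimal sketch of how one period of a tone could be pre-generated is shown here. The harmonic amplitudes, the overall amplitude and the function names are assumptions; the sample count is taken as round(44100 / f), which gives the 100 samples listed for A4 in Appendix A, although the original script may have rounded differently for other tones.

```python
import math
import struct

SAMPLE_RATE = 44100  # Hz, the rate the AC'97 codec is set to

# Relative amplitudes of the fundamental and two harmonics (assumed values).
HARMONICS = (1.0, 0.5, 0.25)


def one_period(freq_hz, amplitude=0.8):
    """Return one period of a tone as a list of signed 16-bit integers."""
    n = round(SAMPLE_RATE / freq_hz)           # samples in one period
    norm = amplitude * 32767 / sum(HARMONICS)  # keeps every sample inside int16
    samples = []
    for i in range(n):
        phase = 2 * math.pi * i / n
        value = sum(a * math.sin((k + 1) * phase)
                    for k, a in enumerate(HARMONICS))
        samples.append(int(round(value * norm)))
    return samples


def to_big_endian(samples):
    """Pack samples as 16-bit big-endian (Motorola byte order) words."""
    return struct.pack(">%dh" % len(samples), *samples)
```

For example, `to_big_endian(one_period(440.0))` yields 200 bytes, that is 100 samples of two bytes each, ready to be embedded as a constant table in the software.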
To emulate the sound of piano, the envelope which we generated has the following shape.
Figure 5 Envelope Coefficients
Our system supports up to three tracks, so up to three tones can be played simultaneously
(the three tones may be the same tone). Because each tone’s samples are generated
separately, they are added together to form the combined sound before being fed into the
audio buffer. The details of combining sound are discussed in the Sound Playback Control
section.
4.1.5 Sound Playback Control
The main loop keeps checking whether any of the three playback sources is active. If at
least one playback source has data available, the active sources are combined and the
resulting sample is fed into the AC'97 input FIFO (INFIFO). Otherwise the INFIFO is
cleared and the loop waits for new tones. Below is the sound combination logic.
Figure 6 Sound Combination Logic
The playback of each tone is set to the same length. When all the samples are fed into the
buffer, the playback source will be marked free and made available to accept new tones.
When all three playback sources are occupied, pressing the key will not have any effect
and this event will be ignored and lost.
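The combination step can be illustrated with the following sketch. It assumes that each active source contributes its current period sample scaled by an envelope coefficient, and that the sum is clamped to the 16-bit range. The linear envelope, the clamping and all names are assumptions for illustration; the logic in Figure 6 may scale or limit the sum differently.

```python
INT16_MAX = 32767
INT16_MIN = -32768
TONE_LENGTH = 8  # samples per key press; tiny here, far longer in practice


class Source:
    """One of the three playback sources."""

    def __init__(self):
        self.busy = False
        self.period = []  # one pre-generated period of the tone
        self.pos = 0      # samples already played

    def start(self, period):
        self.busy, self.period, self.pos = True, period, 0


def envelope(pos):
    """Assumed linear decay from 1.0 towards 0.0 over the tone length."""
    return 1.0 - pos / TONE_LENGTH


def next_sample(sources):
    """Mix one output sample; return None when no source is active."""
    active = [s for s in sources if s.busy]
    if not active:
        return None  # the caller clears the FIFO and waits for new tones
    total = 0.0
    for s in active:
        total += s.period[s.pos % len(s.period)] * envelope(s.pos)
        s.pos += 1
        if s.pos >= TONE_LENGTH:
            s.busy = False  # tone finished: the source is free again
    return max(INT16_MIN, min(INT16_MAX, int(total)))
```

The per-sample floating-point multiply in this sketch is the kind of work that grows with the number of active sources, which is consistent with the report's observation that three-tone chords are harder to keep smooth than two-tone chords.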
4.2 Video_TO_RAM Custom Tracking Logic
4.2.1 Overview of Operation
This custom block is based on the video_to_ram block from the video demo project. It
performs two key functions, color detection and position tracking, built on the functions
already present in the original video_to_ram block. The original block decodes the
streaming video data, converts it from YCrCb to RGB format, writes the result into two
line buffers, and then writes the line buffers’ content to memory. See figure below.
Since we do not have a video output feature, we changed the block so that it no longer
writes the buffer content to memory. Instead, the custom tracking logic reads every pixel
from the line buffers, alternating between the two and reading a buffer only when the RGB
decoder is not writing to it. The RGB value of each pixel is then evaluated against the
content of the r_current_color_limits register, and the coordinates of the pixel are
saved if it is deemed valid.
4.2.2 Tracking Logic
The following diagram shows the detection logic used for color detection and tracking.
Figure 7 Video Processing Flow Chart
Figure 8 Tracking Logic Diagram
The color margin and position margin are decoded from the r_current_color_limits
register, which is read from memory location 0x45000000. It is encoded as follows:
Field | Not Used | Positional Margin | Color Margin
Bits | [31:16] | [15:8] | [7:0]
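As a worked example of this layout, the sketch below packs and unpacks the register value. It assumes that bit 0 is the least significant bit for this register; the function names are hypothetical.

```python
COLOR_LIMITS_ADDR = 0x45000000  # memory location given in the report


def decode_color_limits(word):
    """Split the 32-bit register into (positional_margin, color_margin)."""
    positional_margin = (word >> 8) & 0xFF  # bits [15:8]
    color_margin = word & 0xFF              # bits [7:0]
    return positional_margin, color_margin


def encode_color_limits(positional_margin, color_margin):
    """Build the register value; bits [31:16] are unused and left at zero."""
    return ((positional_margin & 0xFF) << 8) | (color_margin & 0xFF)
```

With this layout a positional margin of 20 and a color margin of 64 are stored as 0x00001440, the value the software would write to the address above.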
Each location register contains the frame number, pixel count, line count and a valid
bit. The frame number, which is incremented for every new frame, is used only for
debugging. The pixel count, which is incremented every time a new pixel is read from the
buffer for detection, is used as the X coordinate. The line count, which is incremented
for every new line of the incoming video stream, is used as the Y coordinate. They are
encoded as follows:
Field | Frame Number | Y Coordinate | Valid Bit | X Coordinate
Bits | [0:9] | [10:19] | [20] | [21:31]
The location registers are sorted by X coordinate and then written to memory, from the
smallest to the largest, at addresses 0x45000004, 0x45000008, 0x4500000c and 0x45000010.
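The location-register layout and the sorted write-out can be sketched as follows. Two assumptions are made and should be checked against the HDL: the bit ranges use the PLB convention in which bit 0 is the most significant bit, and the valid bit sits at bit 20, the one position left free by the other three fields. The function names are hypothetical.

```python
BASE_ADDR = 0x45000004  # first of the four coordinate slots


def pack_location(frame, y, valid, x):
    """Pack one location register (bit 0 taken as the most significant bit)."""
    return (((frame & 0x3FF) << 22)          # bits [0:9], frame number
            | ((y & 0x3FF) << 12)            # bits [10:19], Y coordinate
            | ((1 if valid else 0) << 11)    # bit [20], valid (assumed position)
            | (x & 0x7FF))                   # bits [21:31], X coordinate


def unpack_location(word):
    """Inverse of pack_location; returns the fields as a dict."""
    return {
        "frame": (word >> 22) & 0x3FF,
        "y": (word >> 12) & 0x3FF,
        "valid": bool((word >> 11) & 1),
        "x": word & 0x7FF,
    }


def memory_image(words):
    """Map the four registers, sorted by ascending X, to their addresses."""
    ordered = sorted(words, key=lambda w: w & 0x7FF)
    return {BASE_ADDR + 4 * i: w for i, w in enumerate(ordered)}
```

Sorting by X before the write means the software always finds the leftmost dot in the first slot, so it does not need to re-order the four fingers itself.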
4.2.3 Finite State Machine
Key FSM states are described in the following.
S_LINE_HOLDER: A single pixel is read out of the line buffer and the tracking logic
determines whether the location registers should be updated according to the color limit
register.
S_READ_REQUEST: Appropriate signals are asserted on the PLB in order to request a
4 byte read at the memory location of the color limits.
S_READ: The color limit register is updated with the content read from memory.
Figure 9 Tracking Logic FSM Diagram
S_WRITE_REQUEST_X: Appropriate signals are asserted on the PLB bus in order to
request a 4 byte write at the memory location for the dots’ coordinates. The content of the
location register X is also placed on the data bus.
S_FINAL: The FSM stalls here until a new frame is ready, i.e. the line count becomes 0. It
also sends an interrupt signal to the MicroBlaze to signal that the memory has been updated.
Essentially, the FSM processes an entire frame while it spins in S_LINE_HOLDER.
During this time, the location registers’ contents are updated accordingly. Once the whole
frame is done, the FSM performs the memory updates. It first reads a 32-bit word from
memory to update the color limit register, then writes the 4 sets of coordinates one by
one to memory.
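The per-frame order of operations described above can be summarised in a short sketch. The numbering of the write states from 0 to 3 is an assumption, since the text only names them S_WRITE_REQUEST_X, and any intermediate states shown in Figure 9 are omitted.

```python
NUM_DOTS = 4


def frame_state_sequence():
    """Yield the key states visited while handling one frame, in order."""
    yield "S_LINE_HOLDER"    # spins here while every pixel of the frame is examined
    yield "S_READ_REQUEST"   # request the 4-byte read of the color limits
    yield "S_READ"           # latch the color limit register
    for i in range(NUM_DOTS):
        yield "S_WRITE_REQUEST_%d" % i  # write location register i to memory
    yield "S_FINAL"          # raise the interrupt, wait for line count 0
```

Reading the color limits once per frame, after the pixels have been processed, means a new limit written by the software takes effect from the next frame.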
4.3 MPMC Controller
The MPMC (multi-port memory controller) is an IP block in the Xilinx IP library. It
allows up to 8 buses to be connected to the same memory through different ports. In this
project, we used a total of 2 ports in the MPMC:
Port 0: used by the MicroBlaze to write the hardware configuration and color limit and to
read the coordinates of the 4 fingertips.
Port 1: used by the video_to_ram custom tracking IP block to read the color limit and
write the coordinates to the memory.
4.4 PLB Bus
The Processor Local Bus (PLB) is obtained from the Xilinx IP catalog under the name
plb_v46. It connects an arbitrary number of PLB masters and slaves and is the main path
for moving data from one module to another. To avoid a bottleneck at the memory, we used
2 PLB buses in our system:
1. Used by the MicroBlaze to communicate with the DDR memory and opb_AC97.
2. Used between the video_to_ram tracking block and memory.
4.5 OPB bus
The On-Chip Peripheral Bus (OPB), obtained as a Xilinx IP, is used for communication
between the MicroBlaze and its peripherals. This bus was used because the AC97
controller block was built with an OPB interface.
4.6 OPB-PLB, PLB-OPB Bridge
The bridge enables communication between the PLB and OPB buses. It allows the
MicroBlaze, which sits on the PLB, to send commands to the AC97 controller on the OPB
bus and to write playback data to the codec.
4.7 OPB_AC97 Controller
The opb_AC97 (Wirthlin) is the audio codec controller. It is an OPB slave that provides a
register‐based interface for the codec. It was found as part of the labs for an ECE course
at Iowa State University, and was created by Prof. Mike Wirthlin of Brigham Young
University. The controller interacts with the audio codec using 5 ports:
SDATA_OUT: From opb_AC97 to codec. Serial input of data frames for codec
sampled on the falling edge of BIT_CLK.
BIT_CLK: From codec to opb_AC97. 12.288 MHz clock.
SDATA_IN: From codec to opb_AC97. Serial output of data frames for codec sampled
on the rising edge of BIT_CLK.
SYNC: From opb_AC97 to codec. Defines boundaries of data frames.
RESET#: From opb_AC97 to codec. Active low hardware reset.
5 Description of Our Design Tree
Table 3 Design Tree
Directory/File | Description
./_xps | Xilinx generated directory, contains options files for bitinit, libgen, simgen and platgen
./blkdiagram | Block diagram generated by XPS
./data/system.ucf | System constraint file; contains external pins’ assignment
./drivers | Includes the drivers for AC97
./etc | Option files for bitgen and downloading
./ub | The processor
./doc/Final Report.pdf | Group report of our project
./pcores/led_debug_mux_v1_00_a | Contains core files for led mux
./pcores/opb_ac97_v2_00_a | Contains core files for AC97
./pcores/video_to_ram_v1_00_a | Contains core files for custom tracking module
./sw/main.c | Software core, contains three modules: finger movement detection, sound generation, sound playback controller
./sw/main.h | Header file, contains constants and parameters needed by the software core
./README | Design tree documentation
References
[1] Video to RAM demo project, with core video_to_ram from Jeffrey Goeders, retrieved from Piazza.
[2] Latchezar Dimitrov, Jonathon Riley, Steven Doo, ECE532 Digital System Design, Laser Pointer Project. 2011.
[3] IBM Corp. (2007, May) IBM 128-Bit Processor Local Bus Architecture Specifications Version 4.7. [Online]. https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/3BBB27E5BCC165BA87256A2B0064FFB4/$file/PlbBus_as_01_pub.pdf
[4] AC97-LM4550, Texas Instruments. Literature Number: SNAS032E.
Appendix
Appendix A – Tone Samples in Waveform
Piano Tone D4 – 293Hz, 150 samples
Piano Tone E4 – 329Hz, 134 samples
Piano Tone F4 – 349Hz, 126 samples
Piano Tone G4 – 392Hz, 112 samples
Piano Tone A4 – 440Hz, 100 samples
Piano Tone B4 – 494Hz, 89 samples
Piano Tone C5 – 523Hz, 84 samples