Upload
tamma
View
50
Download
1
Tags:
Embed Size (px)
DESCRIPTION
Optical Music Recognition. Ichiro Fujinaga McGill University 2003. Content. Optical Music Recognition Levy Project Levy Sheet Music Collection Digital Workflow Management Gamera Guido / NoteAbility. Optical Music Recognition (OMR). - PowerPoint PPT Presentation
Citation preview
Optical Music Recognition
Ichiro Fujinaga
McGill University2003
Content
Optical Music Recognition
Levy Project Levy Sheet Music Collection
Digital Workflow Management
Gamera
Guido / NoteAbility
Optical Music Recognition (OMR)
Trainable open-source OMR system in development since 1984
Staff recognition and removal• Run-length coding• Projections
Lyric removal / classifier Stems and notehead removal Music symbol classifier Score reconstruction
Demo
OMR: Classifier
Connected-component analysis Feature extraction, e.g:
Width, height, aspect ratio Number of holes Central moments
k-nearest neighbor classifier Genetic algorithm
Overall Architecture for OMR
Staff removalSegmentation
Recognition
K-NN Classifier
Output
Symbol Name
Knowledge BaseFeature Vectors
OptimizationGenetic Algorithm
K-nn Classifier
BestWeight Vector
ImageFile
Off-line
Lester S. Levy Collection
Lester S. Levy Collection
North American sheet music (1780–1960)
Digitized 29,000 pieces including “The Star-Spangle Banner”
and “Yankee Doodle”
Database of: text index records images of music (8bit gray) lyrics (first lines of verse and chorus) color images of cover sheets (32bit)http://levysheetmusic.mse.jhu.edu
Reduce the manual intervention for large-scale digitization projects
Creation of data repository (text, image, sound) Optical Music Recognition (OMR) Gamera
XML-based metadata composer, lyricist, arranger, performer, artist, engraver,
lithographer, dedicatee, and publisher cross-references for various forms of names, pseudonyms authoritative versions of names and subject terms
Music and lyric search engines Analysis toolkit
Digital Workflow Management
The problem
Suitable OCR for lyrics not found Commercial OCR systems are often
inadequate for non-standard documents The market for specialized recognition of
historical documents is very small Researchers performing document
recognition often “re-invent” the basic image processing wheel
The solution
Provide easy to use tools to allow domain experts (people with specialized knowledge of a collection) to create custom recognition applications
Generalize OMR for structured documents
Introducing Gamera
Framework for creation of structured document recognition system
Designed for domain experts Image processing tools (filters, binarizations, etc.) Document segmentation and analysis Symbol segmentation and classification
• Feature extraction and selection• Classifier selection and combiners
Syntactical and semantic analysis
Generalized Algorithms and Methods for Enhancement and Restoration of Archives
Features of Gamera
Portability (Unix, Windows, Mac) Extensibility (Python and C++ plugins) Easy-to-use (experts and programmers) Open source Graphic User Interface Interactive / Batchable (scripts)
Graphic User Interface (wxWindows)
Architecture of Gamera
GAMERA Core (C++)
Scripting Environment (Python)
Plugins (Python)
Automatic Plugin Wrapper (Boost)
Plugins (C++)
Example of C++ Plugin
// Number of pixels in matrix#include “gamera.hh”#ifdef __area_wrap__#define NARGS 1#define ARG1_ONEBIT#endifusing namespace Gamera;template <class T>feature_t area(T &m) {return feature_t(m.nrows() * m.ncols());
}
Example of Python Plugin
// This filters a list of CC objectsimport gameradef filter_wide(ccs, max_width):tmp = []for x in ccs:
if x.ncols() > max_width:x.fill_matrix(0)
else:tmp.append(x)
return tmp
Gamera: Interface(screenshot in Linux)
Gamera: Interface(screenshot in Linux)
Histogram(screenshot in Linux)
Thresholding(screenshot in Linux)
Thresholding(screenshot in Linux)
Staff removal: Lute tablature
Classifier: Lute(screenshot in Linux)
Staff removal: Neums
Classifier: Neums(screenshot in Linux)
Greek example
GUIDO Music Notation FormatH. Hoos, K. Renz, J. Kilian
“A formal language for score-level representation”
Plain text: readable, platform independent Extensible and flexible Adequate representation NoteServer: Web/Windows GUIDO/XML NoteAbility (K. Hamel)
GUIDO: An example{ [ \beamsOff | \clef<"treble"> \key<"D"> f#*1/8. g*1/16 |a*1/4. d2*1/8 d*1/4. c#*1/8 |e1*1/2 _*1/4 f#*1/8. g*1/16 |c#2*1/4. b1*1/8 a*1/4. g*1/8 || e#*1/2 f#*1/4 f#*1/8. g*1/16 |a*1/4. d2*1/8 d*1/4. c#*1/8 |e1*1/2 _*1/4 f#*1/8 g |c#2*1/4. b1*1/8 a*1/4. c#*1/8 ],
…
Conclusions
Gamera allows rapid development of domain-specific document recognition applications
Domain experts can customize and control all aspects of the recognition process
Includes an easy-to-use interactive environment for experimentation
Beta version available on Linux OS X version in preparation
Projections
X-projections
Y-projections
back