Optical Music Recognition Ichiro Fujinaga McGill University 2003

Page 1: Optical Music Recognition

Optical Music Recognition

Ichiro Fujinaga

McGill University, 2003

Page 2: Optical Music Recognition

Content

Optical Music Recognition

Levy Project
• Levy Sheet Music Collection
• Digital Workflow Management

Gamera

Guido / NoteAbility

Page 3: Optical Music Recognition

Optical Music Recognition (OMR)

Trainable open-source OMR system in development since 1984

Staff recognition and removal
• Run-length coding
• Projections

Lyric removal / classifier
Stems and notehead removal
Music symbol classifier
Score reconstruction
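The run-length coding step can be illustrated with a small sketch. It is not the code of the OMR system described here: it assumes a binary image stored as a list of rows (1 = black, 0 = white) and assumes that the most common black and white vertical run lengths approximate staff-line thickness and staff-space height. The names `vertical_runs`, `estimate_staff_metrics`, and `toy_staff` are hypothetical.

```python
from collections import Counter


def vertical_runs(image, col):
    """Run-length code one column: a list of (pixel_value, run_length)."""
    runs = []
    current, length = image[0][col], 0
    for row in image:
        if row[col] == current:
            length += 1
        else:
            runs.append((current, length))
            current, length = row[col], 1
    runs.append((current, length))
    return runs


def estimate_staff_metrics(image):
    """Return (most common black run, most common white run) over all columns.

    Assumption: on a page dominated by staves, these approximate the
    staff-line thickness and the staff-space height.
    """
    black, white = Counter(), Counter()
    for col in range(len(image[0])):
        for value, length in vertical_runs(image, col):
            (black if value == 1 else white)[length] += 1
    return black.most_common(1)[0][0], white.most_common(1)[0][0]


def toy_staff(line_height=1, space_height=3, width=8, margin=2):
    """Synthetic five-line staff used only to exercise the sketch."""
    rows = [[0] * width for _ in range(margin)]
    for i in range(5):
        rows += [[1] * width for _ in range(line_height)]
        if i < 4:
            rows += [[0] * width for _ in range(space_height)]
    rows += [[0] * width for _ in range(margin)]
    return rows


line_height, space_height = estimate_staff_metrics(toy_staff())
```

On real scans the counts would be noisy, so a practical version would likely smooth the run-length histograms before picking the peaks.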

Demo

Page 4: Optical Music Recognition

OMR: Classifier

Connected-component analysis
Feature extraction, e.g.:
• Width, height, aspect ratio
• Number of holes
• Central moments

k-nearest neighbor classifier
Genetic algorithm
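A minimal sketch of connected-component analysis followed by feature extraction, assuming a binary image as a list of rows (1 = black) and 8-connectivity. It computes width, height, aspect ratio, and two second-order central moments; hole counting is omitted. The function names are hypothetical and do not come from the system described in the slides.

```python
from collections import deque


def connected_components(image):
    """Group 8-connected black pixels; return one list of (row, col) per component."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def extract_features(pixels):
    """Bounding-box features and second-order central moments of one component."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    cy, cx = sum(ys) / len(ys), sum(xs) / len(xs)
    return {
        "width": width,
        "height": height,
        "aspect_ratio": width / height,
        "mu20": sum((x - cx) ** 2 for x in xs),  # spread along x
        "mu02": sum((y - cy) ** 2 for y in ys),  # spread along y
    }


image = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
feature_vectors = [extract_features(cc) for cc in connected_components(image)]
```

Each dictionary plays the role of one feature vector; the 2×2 blob and the vertical bar in the toy image differ clearly in aspect ratio and in `mu02`.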

Page 5: Optical Music Recognition

Overall Architecture for OMR

[Diagram] Boxes:
• Image File
• Staff removal, Segmentation
• Recognition (k-NN Classifier)
• Output (Symbol Name)
• Knowledge Base (Feature Vectors)
• Optimization, off-line (Genetic Algorithm, k-NN Classifier)
• Best Weight Vector
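The pairing of a k-NN classifier with a genetic algorithm that produces a weight vector can be sketched as follows. This is an illustration under assumptions, not the system's implementation: the distance is a weighted squared Euclidean distance, fitness is leave-one-out accuracy on the knowledge base, and the GA uses truncation selection, one-point crossover, and Gaussian mutation. The labels, feature values, and function names are made up for the example.

```python
import random


def knn_classify(sample, training, weights, k=1):
    """Weighted k-NN; training is a list of (feature_vector, label) pairs."""
    def dist(a, b):
        return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))
    nearest = sorted(training, key=lambda item: dist(sample, item[0]))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def leave_one_out_accuracy(training, weights, k=1):
    """Fraction of training items classified correctly by all the others."""
    correct = 0
    for i, (vec, label) in enumerate(training):
        rest = training[:i] + training[i + 1:]
        correct += knn_classify(vec, rest, weights, k) == label
    return correct / len(training)


def optimize_weights(training, generations=20, pop_size=12, seed=0):
    """Off-line search for a feature weight vector with a small GA."""
    rng = random.Random(seed)
    n = len(training[0][0])
    # Start from uniform weights plus random candidates in [0, 1].
    population = [[1.0] * n] + [[rng.random() for _ in range(n)]
                                for _ in range(pop_size - 1)]

    def fitness(w):
        return leave_one_out_accuracy(training, w)

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # survivors are kept unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = a[:cut] + b[cut:]          # one-point crossover
            i = rng.randrange(n)               # mutate one weight, clamp to [0, 1]
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.2)))
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


# Toy knowledge base: feature 0 separates the classes, feature 1 is noise.
training = [
    ((1.0, 9.0), "notehead"), ((1.2, 1.0), "notehead"), ((0.9, 5.0), "notehead"),
    ((3.0, 8.5), "stem"), ((3.1, 1.5), "stem"), ((2.9, 5.2), "stem"),
]
best_weights = optimize_weights(training)
```

Because the survivors of each generation are carried over unchanged, the best fitness never drops below that of the uniform starting vector. At recognition time only `knn_classify` runs, using the stored weights.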

Page 6: Optical Music Recognition

Lester S. Levy Collection

Page 7: Optical Music Recognition

Lester S. Levy Collection

North American sheet music (1780–1960)

Digitized 29,000 pieces, including “The Star-Spangled Banner” and “Yankee Doodle”

Database of:
• text index records
• images of music (8-bit gray)
• lyrics (first lines of verse and chorus)
• color images of cover sheets (32-bit)

http://levysheetmusic.mse.jhu.edu

Page 8: Optical Music Recognition

Digital Workflow Management

Reduce the manual intervention for large-scale digitization projects

Creation of data repository (text, image, sound)
Optical Music Recognition (OMR)
Gamera
XML-based metadata
• composer, lyricist, arranger, performer, artist, engraver, lithographer, dedicatee, and publisher
• cross-references for various forms of names, pseudonyms
• authoritative versions of names and subject terms
Music and lyric search engines
Analysis toolkit
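As an illustration of XML-based metadata with roles, authoritative names, and cross-referenced variants, here is a small sketch using Python's standard `xml.etree.ElementTree`. The element and attribute names (`record`, `name`, `role`, `authoritative`, `variant`) are hypothetical and are not the project's actual schema; the name variant is illustrative only.

```python
import xml.etree.ElementTree as ET


def build_record(title, roles, variants):
    """Build one metadata record.

    roles maps a role (e.g. "lyricist") to an authoritative name;
    variants maps an authoritative name to its alternative forms.
    """
    record = ET.Element("record")
    ET.SubElement(record, "title").text = title
    for role, name in roles.items():
        person = ET.SubElement(record, "name", role=role)
        ET.SubElement(person, "authoritative").text = name
        for alt in variants.get(name, []):
            ET.SubElement(person, "variant").text = alt
    return record


record = build_record(
    "The Star-Spangled Banner",
    {"lyricist": "Key, Francis Scott"},
    {"Key, Francis Scott": ["F. S. Key"]},  # illustrative cross-reference
)
xml_text = ET.tostring(record, encoding="unicode")
```

A search engine could then match a query against both the authoritative form and every variant.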

Page 9: Optical Music Recognition

The problem

Suitable OCR for lyrics not found
Commercial OCR systems are often inadequate for non-standard documents
The market for specialized recognition of historical documents is very small
Researchers performing document recognition often “re-invent” the basic image processing wheel

Page 10: Optical Music Recognition

The solution

Provide easy-to-use tools that allow domain experts (people with specialized knowledge of a collection) to create custom recognition applications

Generalize OMR for structured documents

Page 11: Optical Music Recognition

Introducing Gamera

Framework for the creation of structured document recognition systems

Designed for domain experts
Image processing tools (filters, binarizations, etc.)
Document segmentation and analysis
Symbol segmentation and classification
• Feature extraction and selection
• Classifier selection and combiners
Syntactical and semantic analysis

Gamera: Generalized Algorithms and Methods for Enhancement and Restoration of Archives

Page 12: Optical Music Recognition

Features of Gamera

Portability (Unix, Windows, Mac)
Extensibility (Python and C++ plugins)
Easy-to-use (experts and programmers)
Open source
Graphic User Interface
Interactive / Batchable (scripts)

Page 13: Optical Music Recognition

Architecture of Gamera

[Diagram] Components:
• Graphic User Interface (wxWindows)
• Scripting Environment (Python)
• Plugins (Python)
• Plugins (C++)
• Automatic Plugin Wrapper (Boost)
• GAMERA Core (C++)

Page 14: Optical Music Recognition

Example of C++ Plugin

// Number of pixels in matrix
#include "gamera.hh"

#ifdef __area_wrap__
#define NARGS 1
#define ARG1_ONEBIT
#endif

using namespace Gamera;

template <class T>
feature_t area(T &m) {
  return feature_t(m.nrows() * m.ncols());
}

Page 15: Optical Music Recognition

Example of Python Plugin

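# This filters a list of CC (connected-component) objects:
# components wider than max_width are blanked out, the rest are returned.
import gamera

def filter_wide(ccs, max_width):
    tmp = []
    for x in ccs:
        if x.ncols() > max_width:
            x.fill_matrix(0)
        else:
            tmp.append(x)
    return tmp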

Page 16: Optical Music Recognition

Gamera: Interface (screenshot in Linux)

Page 17: Optical Music Recognition

Gamera: Interface (screenshot in Linux)

Page 18: Optical Music Recognition

Histogram (screenshot in Linux)

Page 19: Optical Music Recognition

Thresholding (screenshot in Linux)

Page 20: Optical Music Recognition

Thresholding (screenshot in Linux)
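The slides do not say which thresholding algorithm the screenshots show. As one possible illustration of going from a gray-level histogram to a binary image, the sketch below uses Otsu's method, a common global technique swapped in here as an assumption. The function names are hypothetical; the image is a list of rows of 8-bit gray values.

```python
def histogram(pixels, levels=256):
    """Count how many pixels have each gray level."""
    hist = [0] * levels
    for row in pixels:
        for value in row:
            hist[value] += 1
    return hist


def otsu_threshold(hist):
    """Return the level t that maximizes between-class variance.

    Pixels with value <= t form the dark class, the rest the light class.
    """
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t, count in enumerate(hist):
        w0 += count
        sum0 += t * count
        if w0 == 0:
            continue          # no dark pixels yet
        w1 = total - w0
        if w1 == 0:
            break             # no light pixels left
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(pixels, threshold):
    """Map dark pixels (ink) to 1 and light pixels (paper) to 0."""
    return [[1 if v <= threshold else 0 for v in row] for row in pixels]


gray = [[10, 200, 210],
        [12, 205, 200]]
t = otsu_threshold(histogram(gray))
binary = binarize(gray, t)
```

Historical documents with uneven staining often need a locally adaptive threshold instead, which is one reason an interactive tool for trying thresholds is useful.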

Page 21: Optical Music Recognition

Staff removal: Lute tablature

Page 22: Optical Music Recognition
Page 23: Optical Music Recognition

Classifier: Lute (screenshot in Linux)

Page 24: Optical Music Recognition

Staff removal: Neums

Page 25: Optical Music Recognition

Classifier: Neums (screenshot in Linux)

Page 26: Optical Music Recognition

Greek example

Page 27: Optical Music Recognition

GUIDO Music Notation Format
H. Hoos, K. Renz, J. Kilian

“A formal language for score-level representation”

Plain text: readable, platform independent
Extensible and flexible
Adequate representation
NoteServer: Web/Windows
GUIDO/XML
NoteAbility (K. Hamel)

Page 28: Optical Music Recognition

GUIDO: An example

{ [ \beamsOff | \clef<"treble"> \key<"D"> f#*1/8. g*1/16 |
a*1/4. d2*1/8 d*1/4. c#*1/8 |
e1*1/2 _*1/4 f#*1/8. g*1/16 |
c#2*1/4. b1*1/8 a*1/4. g*1/8 |
| e#*1/2 f#*1/4 f#*1/8. g*1/16 |
a*1/4. d2*1/8 d*1/4. c#*1/8 |
e1*1/2 _*1/4 f#*1/8 g |
c#2*1/4. b1*1/8 a*1/4. c#*1/8 ],
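To show how readable the plain-text notation is, here is a small sketch that parses GUIDO note and rest tokens such as `f#*1/8.` or `d2*1/8`. It is not a full GUIDO parser: it assumes space-separated tokens, assumes a default octave of 1 and a default duration of 1/4, assumes that an omitted octave or duration carries over from the previous event, and skips tags, brackets, and bar lines. The function name `parse_notes` is hypothetical.

```python
import re
from fractions import Fraction

# name (or "_" for a rest), optional accidentals, optional octave,
# optional "*num/den" duration, optional dots
NOTE = re.compile(r"^([a-h_])(#{1,2}|&{1,2})?(-?\d+)?(?:\*(\d+)/(\d+))?(\.*)$")


def parse_notes(text, octave=1, duration=Fraction(1, 4)):
    """Return a list of (pitch, duration); pitch is None for a rest."""
    events = []
    for token in text.split():
        match = NOTE.match(token)
        if match is None:
            continue  # bar lines, tags, brackets
        name, accidental, octave_text, num, den, dots = match.groups()
        if octave_text is not None:
            octave = int(octave_text)
        if num is not None:
            # Each dot adds half of the previously added value.
            duration = Fraction(int(num), int(den)) * (2 - Fraction(1, 2 ** len(dots)))
        pitch = None if name == "_" else (name + (accidental or ""), octave)
        events.append((pitch, duration))
    return events


events = parse_notes("f#*1/8. g*1/16 | a*1/4. d2*1/8 d*1/4. c#*1/8 |")
```

With these assumptions, the four events after the first bar line add up to one whole note, which fits a bar of four quarter notes.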

Page 29: Optical Music Recognition
Page 30: Optical Music Recognition

Conclusions

Gamera allows rapid development of domain-specific document recognition applications

Domain experts can customize and control all aspects of the recognition process

Includes an easy-to-use interactive environment for experimentation

Beta version available on Linux
OS X version in preparation

Page 31: Optical Music Recognition

Projections

X-projections

Y-projections
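A minimal sketch of the two projections on a binary image (list of rows, 1 = black). It assumes that the Y-projection is the count of black pixels in each row (the profile along the y-axis) and the X-projection the count in each column. The helper `staff_line_rows` and its 80% fill threshold are hypothetical choices for the example.

```python
def y_projection(image):
    """Black-pixel count for every row."""
    return [sum(row) for row in image]


def x_projection(image):
    """Black-pixel count for every column."""
    return [sum(col) for col in zip(*image)]


def staff_line_rows(image, fraction=0.8):
    """Rows whose black-pixel count reaches the given fraction of the width."""
    width = len(image[0])
    return [r for r, count in enumerate(y_projection(image))
            if count >= fraction * width]


image = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],  # staff line
    [0, 0, 1, 0, 0, 0],  # part of a symbol between the lines
    [1, 1, 1, 1, 1, 1],  # staff line
    [0, 0, 0, 0, 0, 0],
]
rows = staff_line_rows(image)
```

Peaks in the Y-projection mark candidate staff lines, while the X-projection shows where symbols add ink on top of the lines.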
