Math Literate Computers

Dorothea Blostein School of Computing, Queen’s University

CICM 2009

Math Literacy: The ability to read and write math notation.

In people, understanding precedes literacy. Computers are fairly literate, but with shallow understanding.

People learn to read before they learn to write. Computers are better at writing than reading.

Math literacy relates to literacy in other diagram notations: two-dimensional, domain-specific, natural languages.

Freedom to think with paper and pencil.

Computer support for typesetting, search, automated reasoning.

Goal: Smooth conversion between paper and electronic documents

Four Color Theorem, Appel and Haken, 1976

Math Notation - A Tool to Support Reasoning
• Evolved over centuries
• Additional notation is invented as needed
• Many dialects

Topics

Notational conventions map between information and ink.

Writing (Generation)

Reading (Recognition)

Difficult: create an aesthetically appealing diagram

A solved problem

Difficult. An active research area.

Difficult: handle symbol recognition errors and variable layout.

Writing (Generation)

Reading (Recognition)

Conventions geared toward generation

Conventions geared toward recognition

Many Diagrams Represent the Same Information

Same use of hard conventions

Varying use of Soft conventions

Recognition: All the diagrams lead to the same information

Generation: One path (chosen according to user preferences) from information to diagram

Hard conventions: how to encode information. Soft conventions: how to make it readable.

Topics

Sources of Information about Math Notation

Sample Documents: Math notation is defined by its use in society.

Introspection: By example. People use their judgment.

Written Descriptions:
• Geared toward manual typesetting: Chaundy, Barrett, Batey, The Printing of Mathematics, 1957. Wick, Rules for Typesetting Mathematics, 1965. Higham, Handbook of Writing for the Math. Sciences, 1993.
• Geared toward computational typesetting: Knuth, “Mathematical Typography,” Bulletin of the AMS, 1979.

Program Code: for recognizing and generating math notation.

Recognition Contests define datasets and evaluation metrics. Contests at ICDAR and GREC: Arc segmentation, symbol recognition, segmenting text and graphics, raster to vector conversion, signature verification, document binarization, page segmentation.

Statistics about Math Notation: An Example

Gather statistics from training data.

Almost matches human performance in labeling bounding boxes.

Spatial relations for pairs of bounding boxes.

Top labels: most likely, based on statistics.

Ambiguity due to unknown baseline

[Wang&Faure, ICPR 1988]
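As a rough illustration of gathering statistics on spatial relations between pairs of bounding boxes, here is a minimal Python sketch. It is not the method of the cited work: the single feature (vertical offset of the child relative to the parent's height), the five-bin quantization, the label names, and the names `Box`, `feature_bin`, `gather_statistics`, and `top_labels` are all assumptions made for illustration. The y-axis is assumed to grow upward.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    @property
    def height(self):
        return self.ymax - self.ymin

    @property
    def ycenter(self):
        return (self.ymin + self.ymax) / 2


def feature_bin(parent, child, n_bins=5):
    """Quantize the child's vertical offset, measured in parent heights.

    Assumes y grows upward, so a positive offset means the child sits higher.
    """
    offset = (child.ycenter - parent.ycenter) / parent.height
    clipped = max(-1.0, min(1.0, offset))
    return min(n_bins - 1, int((clipped + 1.0) / 2.0 * n_bins))


def gather_statistics(samples):
    """Count how often each relation label occurs in each feature bin.

    samples is an iterable of (parent_box, child_box, label) triples.
    """
    counts = defaultdict(Counter)
    for parent, child, label in samples:
        counts[feature_bin(parent, child)][label] += 1
    return counts


def top_labels(counts, parent, child, k=2):
    """Return up to k labels for a new pair, most frequent in training first."""
    ranked = counts[feature_bin(parent, child)].most_common(k)
    return [label for label, _ in ranked]
```

Returning several ranked labels rather than one reflects the ambiguity noted above: when the baseline is unknown, a later stage has to choose among the likely relations.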

Topics

Challenges in Math Recognition

Symbol recognition (C O 0 7 > S 5 / 1 l)

Several roles for symbols

Spatial relationships

Little redundancy

Handwritten notation is particularly difficult

Compilers easily handle math notation in programming languages.

2D math notation is harder:
– Noise causes errors in segmenting and identifying symbols.
– Can’t blame the user for mistakes.
– Hard to capture 2D relationships effectively in a string.

Evaluate/compare these approaches?

The choice of software architecture is difficult to make and defend.

Math-Recognition Approaches
• Procedurally-coded math syntax
• Coordinate grammar
• Projection profile cutting
• Stochastic grammars & HMMs
• Graph rewriting
• Tree rewriting

[Survey by Blostein and Grbavec, 1997]

Math-Recognition Approaches: Procedurally-coded math syntax

No explicit definition of math syntax.

Update code in response to recognition errors.

Can get good recognition performance.

Math-Recognition Approaches: Coordinate grammar

Apply a rule to a set of symbols: create subsets with syntactic subgoals.

A clear, well-structured representation of notational conventions.

[Anderson 1969; in Fu 77]

Attributes: xmin, ymin, xmax, ymax, xcenter. m encodes meaning.
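The idea of applying a rule to a set of symbols and creating subsets with syntactic subgoals can be sketched in a few lines of Python. This is a loose illustration, not Anderson's grammar: the two rules (fraction and horizontal concatenation), the symbol name `"bar"`, the output string format, and the names `Symbol` and `parse` are assumptions. The y-axis is assumed to grow upward, and the returned string plays the role of the attribute m.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    name: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    @property
    def xcenter(self):
        return (self.xmin + self.xmax) / 2


def parse(symbols):
    """Return m, a string encoding the meaning of a set of symbols."""
    if not symbols:
        raise ValueError("empty symbol set")
    if len(symbols) == 1:
        return symbols[0].name

    # Fraction rule: a bar that spans every other symbol partitions the set
    # into a numerator subgoal (above) and a denominator subgoal (below).
    bars = [s for s in symbols if s.name == "bar"]
    for bar in sorted(bars, key=lambda s: s.xmax - s.xmin, reverse=True):
        rest = [s for s in symbols if s is not bar]
        spanned = all(bar.xmin <= s.xcenter <= bar.xmax for s in rest)
        above = [s for s in rest if s.ymin >= bar.ymax]
        below = [s for s in rest if s.ymax <= bar.ymin]
        if spanned and above and below and len(above) + len(below) == len(rest):
            return f"({parse(above)})/({parse(below)})"

    # Concatenation rule: split at the first horizontal gap and parse
    # the left and right subsets as separate subgoals.
    ordered = sorted(symbols, key=lambda s: s.xmin)
    reach = ordered[0].xmax
    for i in range(1, len(ordered)):
        if ordered[i].xmin > reach:
            return f"{parse(ordered[:i])} {parse(ordered[i:])}"
        reach = max(reach, ordered[i].xmax)
    raise ValueError("no rule applies to this symbol set")
```

Each rule tests the coordinates of the whole set, then hands disjoint subsets to recursive calls, which is what makes the notational conventions explicit in one place.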

Math-Recognition Approaches: Projection profile cutting

[Figure labels: horizontal cut, vertical cut]

[Okamoto and Miao, 1992]

The order of cuts provides the tree-structure of the expression.

A simple and efficient technique. Can be applied prior to OCR.
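A minimal Python sketch of recursive projection profile cutting follows. It is a generic version of the technique, not the cited implementation: the image is assumed to be a list of rows of 0/1 ink values, cuts are made at every fully blank row or column, and the names `runs` and `xy_cut` and the tree format are assumptions.

```python
def runs(profile):
    """Return (start, end) index ranges where the projection profile is non-zero."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def xy_cut(img, top, bottom, left, right, vertical=True, tried_other=False):
    """Recursively cut the region rows [top, bottom) x columns [left, right).

    vertical=True projects ink onto the x-axis and cuts at blank columns;
    vertical=False projects onto the y-axis and cuts at blank rows.
    Returns None for a blank region, a leaf box (top, bottom, left, right),
    or a node ("V" or "H", [children]) recording the cut direction.
    """
    if vertical:
        profile = [sum(img[r][c] for r in range(top, bottom))
                   for c in range(left, right)]
        parts = [(top, bottom, left + a, left + b) for a, b in runs(profile)]
    else:
        profile = [sum(img[r][c] for c in range(left, right))
                   for r in range(top, bottom)]
        parts = [(top + a, top + b, left, right) for a, b in runs(profile)]

    if not parts:
        return None
    if len(parts) > 1:
        kind = "V" if vertical else "H"
        return (kind, [xy_cut(img, *p, vertical=not vertical) for p in parts])
    if not tried_other:
        # No cut in this direction: try the other direction once.
        return xy_cut(img, *parts[0], vertical=not vertical, tried_other=True)
    return parts[0]
```

The nesting of "V" and "H" nodes is the tree structure given by the order of cuts. Symbols whose projections overlap, such as a square root sign and its contents, never produce a blank row or column and so end up in a single leaf under this sketch.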

Special handling of overlapping symbols:

Math-Recognition Approaches: Stochastic grammars & HMMs

Hidden Markov Model [Kopec, Chou 1994]

An explicit image-generation model, to drive recognition.

Applied to yellow pages & music notation.

2D stochastic context-free grammar [Chou 1989]

Find the most likely parse of the image, without segmentation.

Math-Recognition Approaches: Graph rewriting

Rewrite rules replace one subgraph by another

PROGRES language: a mix of textual and visual notation

Write a graph schema to define the structure of valid graphs.

The PROGRES execution environment flags violations.
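To make the idea of a rewrite rule and a graph schema concrete, here is a small sketch in plain Python. It does not use PROGRES or its notation: the graph encoding (a dict of node labels plus a set of labeled edges), the single fraction rule, the `SCHEMA` table, and the names `apply_fraction_rule` and `violations` are all assumptions made for illustration.

```python
# Hypothetical schema: which outgoing edge labels each node label may have.
SCHEMA = {
    "bar": {"above", "below"},
    "fraction": {"numerator", "denominator"},
}


def apply_fraction_rule(nodes, edges):
    """Rewrite one matching subgraph in place; return True if the rule applied.

    Left-hand side:  bar -above-> n  and  bar -below-> d
    Right-hand side: fraction -numerator-> n  and  fraction -denominator-> d

    nodes maps node id to label; edges is a set of (source, relation, target).
    """
    for node, label in nodes.items():
        if label != "bar":
            continue
        above = [d for (s, r, d) in edges if s == node and r == "above"]
        below = [d for (s, r, d) in edges if s == node and r == "below"]
        if len(above) == 1 and len(below) == 1:
            num, den = above[0], below[0]
            edges -= {(node, "above", num), (node, "below", den)}
            edges |= {(node, "numerator", num), (node, "denominator", den)}
            nodes[node] = "fraction"
            return True
    return False


def violations(nodes, edges):
    """List the edges whose relation the schema does not allow for their source."""
    return [(s, r, d) for (s, r, d) in sorted(edges)
            if r not in SCHEMA.get(nodes[s], set())]
```

A recognizer in this style would apply such rules repeatedly until none matches, checking the schema after each step so that an ill-formed intermediate graph is reported rather than silently carried forward.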

[Figure labels: Build, Constrain, Parse]

[Blostein, Schürr, Software Practice and Experience, 1999]

Math-Recognition Approaches: Tree rewriting

Compiler-inspired approach, using tree rewriting [Zanibbi, Blostein, Cordy: ICPR 2002 and PAMI 2002]

Separate analysis of layout, lexical, syntactic, and semantic aspects.

Get partial results even if there are syntax errors.

Find linear structures in the input, and create a tree from them.

Operation of a compiler

Recognition of math notation

Topics

Goal: seamless transition between
- real world (stylus and paper)
- electronic world

Many paper documents are produced from electronic sources. Eventually include digitally-encoded contents?

Methods used in digital watermarking are relevant.

Electronic → Paper is more advanced than Paper → Electronic

Entering math expressions

• How much user time?

• How many residual errors?

• How much frustration?

Method 1: Use recognition software. Scan a document image or write on a data tablet.

Method 2: Enter information directly. Type the information (e.g. LaTeX) or use a structure-based editor.

User proofreads and corrects

Generate math notation

Recognition software

Information

User Frustration

People eventually feel comfortable with irritating interfaces.

The Argh is a unit of frustration. Kilarghs. Megarghs… Arghometers need to be developed.

Document recognition is frustrating because:

1. Users don’t like to correct errors made by the “stupid computer”. Better to correct errors they made themselves.

2. Users don’t like to think about the marks on the paper. They would rather think about the document contents.

3. Users don’t like unpredictable systems. Better to adapt themselves (even if inconvenient) to achieve predictability.

[Talk at ICDAR 2001]

Possible research directions

Precisely define math literacy tasks.

Use soft conventions in recognition.

Use statistics: know about likely versus unlikely expressions.

Exploit the advanced state of generation, to improve recognition.

Topics:
• Notational Conventions
• What is Math Notation, anyway?
• Math Recognition Approaches
• User Interface Issues

Conclusion

A group effort is required.