Large Vocabulary Off–Line Handwritten Word Recognition
[Reconnaissance Hors–Ligne de Mots Manuscrits dans un Lexique de Très Grande Dimension]
Alessandro L. Koerich
[ École de Technologie Supérieure ]
August 2002 2/60
The Early Beginning

In 1998, the baseline recognition system developed by A. El–Yacoubi at the SRTP had the following performance:

100–word vocabulary
– Accuracy: 95.89% of the words correctly recognized (4,481 out of 4,674)
– Speed: 2 sec/word

30,000–word vocabulary
– Accuracy: 73.70% of the words correctly recognized (3,445 out of 4,674)
– Speed: 8.2 min/word (26 days to recognize the whole test set)

Is it possible to overcome the accuracy and speed problems to make large vocabulary handwriting recognition feasible?
The Story and The Goal
In this presentation I will show how we have addressed the accuracy and speed problems to build an off–line handwriting recognition system with the following characteristics:
– Omniwriter
– Very–large vocabulary (80,000 words)
– Unconstrained handwriting
– Good recognition accuracy
– Good recognition speed
Presentation Outline
Introduction
Problem Statement
Thesis Contributions
– Word Recognition System Based on HMMs
  • Speed: Fast Two–Level HMM Decoding
– Verification Module Based on NNs
  • Accuracy: Post–Processing Word Hypotheses
Conclusion
Motivation
Current research has focused on relatively simple problems, such as:
– Isolated characters, digits, and digit strings
– Words in small and medium vocabularies
– Well–determined application domains
– Emphasis on recognition accuracy
Limitations
Recent Results in Handwriting Recognition
We can conclude that large vocabulary handwriting recognition is a challenging research subject.
Applications of LVHR
– Postal applications (addresses)
– Reading of handwritten notes
– Fax transcription
– Generic text recognition
– Information retrieval
– Reading of handwritten fields in forms
– Pen–pad devices
Why Is Handwriting Recognition Difficult?

General problems:
– Ambiguity of shapes
– Writing styles

Large vocabulary problems:
– Segmentation of words into characters
– Accuracy
– Recognition speed
[Figures: Recognition Time (sec/word) and Word Recognition Rate (%) as a function of Lexicon Size (0 to 80,000 words)]
The Complexity of Handwriting Recognition
The complexity of the recognition process in the case of unconstrained handwriting is:

C = O(T · N² · H^L · V)

– T: length of the observation sequence
– N: number of states of the character models
– H: number of models per character class
– L: number of characters per word
– V: vocabulary size

Assuming typical values of the parameters (T=30, L=10, N=10, V=80,000, H=2) and a standard Viterbi algorithm, this amounts to about 245 GFLOPS, but current personal computers can perform only between 1 GFLOPS and 3 GFLOPS!
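As a sanity check, the 245 GFLOPS figure can be reproduced directly from the slide's parameter values; this is a back-of-the-envelope operation count, not an exact FLOP measurement:

```python
# Back-of-the-envelope operation count for unconstrained-style decoding:
# C = T * N^2 * H^L * V, with the parameter values quoted on the slide.
T = 30        # length of the observation sequence
N = 10        # states per character HMM
H = 2         # models per character class (e.g. uppercase/lowercase)
L = 10        # characters per word
V = 80_000    # vocabulary size

ops = T * N**2 * H**L * V
print(f"{ops / 1e9:.1f} GFLOP")  # about 245.8 GFLOP
```

The H^L factor (all case combinations along a word) is what pushes the count from a few GFLOP to hundreds.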
Current Methods for LVHR [speed]

Lexicon pruning (prior to recognition):
– Application environment
– Word length and shape

Organization of the search space:
– Lexical tree vs. flat lexicon

Search strategy:
– Viterbi beam search
– A*
– Multi–pass

Most of these methods are not very efficient and/or they introduce errors that affect the recognition accuracy.
Current Methods for LVHR [accuracy]
Improvements in accuracy are associated with:
– The feature set
– The modeling of reference patterns
– More than one model for each character class
– The combination of different feature sets / classifiers

The complexity of the recognition process has been steadily increasing with the recognition accuracy.
The Challenge

To build an omniwriter, large vocabulary, off–line handwritten word recognition system (80,000 words), we have to account for two aspects that are in mutual conflict: recognition speed and recognition accuracy.

It is relatively easy to improve the recognition speed while trading away some accuracy, but it is much harder to improve the recognition speed while preserving (or even improving) the original accuracy.
Methodology: Speed
Build an LV word recognition system based on HMMs that generates a list of the N–best word hypotheses as well as the segmentation of those word hypotheses into characters.

Problem: current decoding algorithms are not efficient enough to deal with large vocabularies.
Solution: speed up the recognition process with a novel decoding strategy that reduces repeated computation and preserves the recognition accuracy.
Methodology: Accuracy
Build a verification module to post–process the N–best word hypotheses generated by the LVS, using a different recognition strategy that focuses on the weaknesses of the HMM approach.

Problem: the verification module should not cause delays in the recognition process.
Solution: model segmented characters with NNs and rescore the N–best word hypotheses based on the combination of character probabilities.
Methodology: Accuracy
Combine the LV word recognition system based on HMMs with the verification module based on NNs to improve the overall recognition accuracy.

Reject a word hypothesis based on the composite score provided by the recognition–verification system to further improve the recognition accuracy.
Word Recognition System Based on HMMs
– Uses some modules of a baseline recognition system
– Segmentation–recognition approach
– Lexicon–driven approach in which character HMMs are concatenated to build up words according to the lexicon
– Fast two–level HMM decoding algorithm to generate a list of the N–best word hypotheses as well as their definitive segmentation into characters
– Global recognition approach to account for unconstrained handwriting
[Figure: character HMMs for the word "PARIS", concatenating uppercase/lowercase models of 'P', 'A', 'R', 'I', 'S' with case–transition arcs (UU, LL, UL, LU)]
Word Recognition System Based on HMMs
[Figure: recognition pipeline — baseline recognition system → lexicon decoding → N–best word hypotheses → post–processing (verification, rejection) → best word hypothesis]

Example of an N–best list with log–likelihood scores:
CAUBEYRES    –22.350
COUBEYRAC    –23.063
CAMPAGNAC    –23.787
COURPIGNAC   –24.028
CAMBAYRAC    –24.093
COMPREGNAC   –24.097
CAMPAGNAN    –24.853

Best word hypothesis after post–processing: CAMPAGNAC  –31.434
Pre–Processing
[Figure: (a) original image; (b) baseline slant normalization; (c) character skew correction; (d) lowercase character area normalization; (e) final image after smoothing]
Segmentation & Feature Extraction
[Figure: loose segmentation; global features; contour transition histogram features; segmentation features]
Character Model and Training
– Characters are modeled by 10–state HMMs
– Observations are emitted along transitions
– The Baum–Welch algorithm is used to estimate the best parameter values of the character HMMs
Other Contributions
– Multiple Character Models
– Lexicon–Driven Level Building Algorithm
– Constrained Level Building Algorithm

These contributions are detailed in:
– the Ph.D. thesis
– Proc. of ICAPR 2001 [Koerich01]
– IJDAR [Koerich02]
Fast Two–Level HMM Decoding (1)
We have observed that there is a great amount of repeated computation during the decoding of the words in the lexicon:
– current algorithms decode an observation sequence in a time–synchronous fashion;
– the probability scores of a character (λl) within a word depend on the probability scores of the immediately preceding character (λl–1).
Fast Two–Level HMM Decoding (2)
During the recognition, is it possible to decode the character "a" only once, since it is always represented by the same character model?
Fast Two–Level HMM Decoding (3)
To solve this problem of repeated computation, we have proposed a new algorithm that breaks up the decoding of words into two levels:
– First level: character HMMs are decoded considering each possible entry and exit point in the trellis, and the results are stored in arrays.
– Second level: words are decoded by reusing the results of the first level; only character boundaries are decoded.
FTL HMM Decoding: First Level (4)

[Figure: for each character HMM ('A', 'B', ..., 'Z'), the likelihood P(s, e) is computed for every entry frame s and exit frame e — e.g. P(1,3), P(1,4), ..., P(5,7) — taking the maximum over the models of each class]

We end up with one array of likelihoods for each character HMM.
[Figure: one T×T array per character HMM, storing the likelihoods indexed by entry and exit frames, from (1,1) to (T,T)]
FTL HMM Decoding: Second Level (5)

[Figure: each word in the lexicon (e.g. "ABY") is decoded by looking up the stored likelihood arrays of its characters]

At the second level, only character boundaries are decoded:

δ̂l(t) = max 1≤s≤t [ δ̂l–1(s–1) + χ(s,t) ]

where χ(s,t) is the likelihood, stored at the first level, of the l–th character of the word spanning frames s to t. We do not need to decode HMM states, which is the most complex computation during the decoding (N²T).
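The two levels can be sketched as follows. This is a toy reconstruction, not the thesis implementation: the chi values are random stand-ins for real first-level HMM decoding, and the second level applies the boundary recurrence in the log domain:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 7                  # number of observation frames
alphabet = "ab"        # toy character set

# First level: for every character model, store chi[c][s, t], the best
# log-likelihood of that character spanning frames s..t. A real system
# would run Viterbi inside each character HMM; here we use stand-in values.
chi = {c: np.full((T + 1, T + 1), -np.inf) for c in alphabet}
for c in alphabet:
    for s in range(1, T + 1):
        for t in range(s, T + 1):
            chi[c][s, t] = -rng.uniform(0.5, 2.0) * (t - s + 1)

def decode_word(word):
    """Second level: delta[l][t] = max_s delta[l-1][s-1] + chi(s, t).
    Only character boundaries are searched, never HMM states."""
    delta = np.full((len(word) + 1, T + 1), -np.inf)
    delta[0, 0] = 0.0
    for l, c in enumerate(word, start=1):
        for t in range(l, T + 1):
            for s in range(1, t + 1):
                delta[l, t] = max(delta[l, t],
                                  delta[l - 1, s - 1] + chi[c][s, t])
    return delta[len(word), T]  # log score of the word over all T frames

# The chi arrays are shared by every word in the lexicon, so each character
# model is decoded once, no matter how many words contain it.
for w in ("ab", "ba", "abba"):
    print(w, round(decode_word(w), 3))
```

Because the second level touches only boundary pairs (order T² per character) instead of N²T state transitions per character per word, the per-word HMM computation is no longer repeated across the lexicon.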
Word Recognition System Based on HMMs
The output of the word recognition system based on HMMs is a list of the N–best word hypotheses, the segmentation of those word hypotheses into characters, and their likelihoods.
Performance of the Word Recognition System
– Two lexicons: 36,015 words and 85,092 words
– Test set: 4,674 unconstrained words (city names)
– Three platforms: SUN Ultra 1, SUN Enterprise 6000, and AMD Athlon 1.1 GHz
Performance of the Word Recognition System
[Figure: recognition time per word — conventional Viterbi (baseline): 8 minutes; FS tree: 20 sec]
Distributed Recognition Scheme
– We explore the potential for speeding up the recognition process via distributed processing
– The lexicon is split, partitioning the recognition task among several processors
– The speedup in the recognition process is proportional to the number of processors
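The splitting idea can be sketched as below. This is a hypothetical illustration: the real system distributes HMM decoding over processors, which here is replaced by a trivial scoring stand-in:

```python
from concurrent.futures import ThreadPoolExecutor

def score_chunk(chunk):
    # Stand-in for HMM decoding of every word in a lexicon chunk;
    # a real worker would run the full decoder on its sub-lexicon.
    return [(word, -float(len(word))) for word in chunk]

def recognize(lexicon, n_workers=4, n_best=3):
    # Split the lexicon into n_workers sub-lexicons, score them in
    # parallel, then merge the partial results into one ranked list.
    chunks = [lexicon[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        hyps = [h for part in pool.map(score_chunk, chunks) for h in part]
    return sorted(hyps, key=lambda h: h[1], reverse=True)[:n_best]

lexicon = ["PARIS", "CAMPAGNAC", "ABBECOURT", "ZONZA", "ABILLY"]
print(recognize(lexicon))
```

Since the per-chunk work is independent and CPU-bound, a real deployment would use separate processes or machines rather than threads; the merge step is cheap compared with decoding.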
Performance of the Distributed Recognition Scheme
[Figure: recognition time (sec/word, from about 22 down to 2) as a function of the number of threads/processors (1 to 10); 30,000–word vocabulary]
Summary of Speed Improvements
[Figure: word recognition rate (70–74%) vs. speedup (× BLN, up to 120×) for BLN, LDLBA, CLBA, CDCLBA, FS flat, FS tree, and DS; 30,000–word vocabulary]
Summary of Speed Improvements
Search Strategy         Computational Complexity     MFLOPS
Conventional Viterbi    T·N²·H·L·V                   4,800
FS Tree                 T²·(N²·M + L·V)              500
DS FS Tree              T²·(N²·M + L·V) / P          50

– V is the dominant variable
– T is quadratic but its magnitude is low
– The FS is advantageous while T < N²
Problems of the HMM Approach

– The features are not very discriminant at the character level (trade–off between segmentation and recognition)
– Local information is somewhat overlooked
– Conditional independence prevents an HMM from taking full advantage of the correlation that exists among the observations of a single character
Solution?

Motivation: the power of the HMM approach to segment words into characters, and an NN classifier to model isolated characters.

Proposal: integrate HMMs (segmentation + recognition) and NNs (verification) to overcome the problems of each particular method.
Recognition–Verification Scheme
Motivation & Facts
The segmentation of words into characters carried out by the HMM recognizer is fairly reliable:
– 8,005 out of 10,006 word images from the training dataset were correctly segmented (80%)

The difference in recognition rate between the TOP 1 and TOP 10 word hypotheses is greater than 15% (80,000–word vocabulary).
[Figure: word recognition rate (%) vs. TOP N best word hypotheses (N = 1 to 100); 80,000–word vocabulary]
Feature Extraction
– Horizontal and vertical projections (10 + 10)
– Top, bottom, right, and left profiles (10 + 10 + 10 + 10)
– Directional contour histogram over 6 zones (8 + 8 + 8 + 8 + 8 + 8)
Neural Network Classifier
– MLP classifier (108–90–26)
– Uppercase + lowercase metaclasses
– Estimates the Bayesian a posteriori probability of each character class
[Figure: MLP with 108 input units, one hidden layer, and 26 output units ('A' ... 'Z')]
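A minimal sketch of such a classifier, assuming the 108–90–26 topology from the slide with randomly initialized (untrained) weights, for illustration only; the softmax outputs play the role of the a posteriori probability estimates:

```python
import numpy as np

rng = np.random.default_rng(0)

class MLP:
    """Toy 108-90-26 network: 108 features in, 26 character classes out."""
    def __init__(self, n_in=108, n_hidden=90, n_out=26):
        self.W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_hidden, n_out))
        self.b2 = np.zeros(n_out)

    def posterior(self, x):
        h = np.tanh(x @ self.W1 + self.b1)   # hidden layer
        z = h @ self.W2 + self.b2            # output logits
        e = np.exp(z - z.max())              # numerically stable softmax
        return e / e.sum()                   # estimates of P(class | x)

net = MLP()
p = net.posterior(rng.normal(size=108))
print(p.shape, p.sum())
```

Trained with a cross-entropy criterion, the outputs of such a network converge toward the Bayesian a posteriori probabilities, which is what makes them usable as character scores for verification.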
Probability Estimation by NN

[Figure: original image → loose segmentation → final segmentation (from the word recognition system) → probability estimation → word hypothesis with its word probability score]
Combination of Character Scores
The NN score of a word hypothesis Ln is the product of the scaled posterior probabilities of its H segmented characters:

ΨNN = P(Ln | S) = ∏ h=1..H [ P(ch | xh) / P(ch) ]

where P(ch | xh) is the posterior probability estimated by the MLP for the h–th character and P(ch) is its prior probability.
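In the log domain, combining per-character scores amounts to summing scaled log-posteriors, as in this toy example (assumed notation: posteriors come from the NN, with uniform character priors):

```python
import math

def word_log_score(posteriors, priors):
    """log of prod_h P(c_h | x_h) / P(c_h) for one word hypothesis."""
    return sum(math.log(p / q) for p, q in zip(posteriors, priors))

# A 3-character hypothesis with uniform priors over 26 classes.
score = word_log_score([0.9, 0.8, 0.7], [1 / 26] * 3)
print(round(score, 3))
```

Dividing each posterior by the class prior converts the NN outputs into scaled likelihoods, which makes the product comparable across hypotheses of the same length.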
Combinations of HMM and NN Scores
Weighted rule: PCOMB = α log(PHMM) + β log(PNN)

where each score is normalized over the hypothesis list:

PNN = ΨNN(n) / Σ n=1..N ΨNN(n)        PHMM = ΨHMM(n) / Σ n=1..V ΨHMM(n)
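A sketch of the weighted rule (α, β and the scores below are illustrative stand-ins): each classifier's scores are normalized over the hypothesis list, combined in the log domain, and the list is re-ranked:

```python
import math

def combine(hmm_scores, nn_scores, alpha=0.5, beta=0.5):
    """P_COMB = alpha*log(P_HMM) + beta*log(P_NN), after normalizing
    each classifier's scores over the hypothesis list."""
    zh, zn = sum(hmm_scores), sum(nn_scores)
    return [alpha * math.log(h / zh) + beta * math.log(n / zn)
            for h, n in zip(hmm_scores, nn_scores)]

hmm = [0.5, 0.3, 0.2]   # stand-ins for HMM likelihood scores
nn = [0.2, 0.6, 0.2]    # stand-ins for NN posterior scores
scores = combine(hmm, nn)
print(scores.index(max(scores)))  # NN evidence promotes hypothesis 1
```

Normalizing before combining keeps the two classifiers' score ranges comparable, so α and β control their relative weight rather than compensating for scale differences.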
Reevaluation of the Word Recognition
– PC: AMD Athlon 1.1 GHz, 512 MB RAM
– 85,092–word vocabulary (French + US + Italian + Brazilian + Québec city names)
Performance of the Recognition+Verification Approach
Word recognition rate based on:
– the output of the word recognition system alone (HMM FS Tree)
– the output of the verification module alone (scores produced by the NN)
– the combination of recognition and verification (HMM + NN)
[Figure: word recognition rate (%) vs. lexicon size (10 to 80k words) for the word recognition system (HMM FS Tree), the verification module (NN/MLP), and the weighted recognition+verification combination (HMM+NN)]
Recognition + Verification
For a recognition task with an 80,000–word vocabulary and TOP 10 word hypotheses:
– the word recognition system based on HMMs takes 16 s to generate the list of the TOP 10 word hypotheses (preprocessing + segmentation + feature extraction + recognition)
– the verification process takes less than 110 ms (feature extraction + recognition + combination + re–ranking)

Therefore, the verification process takes less than 1% of the time required to generate the TOP 10 word hypothesis list.
Rejection
– We need only one word hypothesis at the end
– Is the word hypothesis provided by the recognizer trustworthy?
– We use the probability scores of the TOP N word hypotheses to establish a rejection criterion

Rejection rule: the TOP 1 hypothesis is accepted only if P*2 ≤ P*1 (1 + k)
Rejection of Word Hypotheses
[Figure: word recognition rate (%) vs. rejection rate (0–60%) for the baseline and baseline + verification; 80,000–word vocabulary]
Conclusion
We have built an omniwriter off–line handwritten word recognition system that deals efficiently with large and very–large vocabularies and unconstrained handwriting styles, and that runs on personal computers with acceptable performance.
BEFORE
– 30,000–word lexicon: accuracy 73.70%, speed 8.2 min/word
– 80,000–word lexicon: accuracy 68.65% (86%), speed 16.2 min/word

NOW
– 30,000–word lexicon: accuracy 73.70%, speed 20.2 sec/word (4.2 sec/word distributed)
– 80,000–word lexicon: accuracy 78.05% (95%), speed 63.2 sec/word
Summary of the Contributions (1)

Problem: how to improve the recognition speed while preserving the recognition accuracy
Idea: avoid the repeated computation of the probabilities of the same character HMMs given an observation sequence, by breaking up the computation of characters and words
Solution: the Fast Two–Level HMM Decoding Algorithm
Results: a speedup (24× to 120×) of the recognition process while maintaining exactly the same accuracy
Summary of the Contributions (2)

Problem: how to improve the recognition accuracy without affecting the recognition speed achieved by the FS
Idea: post–process the N–best word hypothesis list with a different recognition approach based on character recognition
Solution: verification of word hypotheses by a neural network classifier that estimates character probabilities and rescores the N–best word hypothesis list
Results:
– improvement of 6.7% (10,000 words) to 9.4% (80,000 words) in the word recognition rate
– the verification process (110 ms on a PC Athlon 1.1 GHz) takes less than 1% of the time required to generate a TOP 10 word hypothesis list for an 80,000–word vocabulary (16 s on the same PC)
Summary of the Contributions (3)

Problem: how to find out whether the word hypothesis ranked TOP 1 is trustworthy (how to reduce the error rate)
Idea: reject a word hypothesis based on the composite score provided by the recognition–verification system
Solution: a rejection mechanism based on the scores of the TOP 1 and TOP 2 word hypotheses
Results:
– improvement of about 27% (80,000 words) in the word recognition rate by rejecting 30% of the word hypotheses
– improvement of about 10% (80,000 words) in the word recognition rate, relative to the baseline recognition system alone, when rejecting 30% of the word hypotheses based on the composite score
Conclusion and Implications

We have overcome the accuracy and speed problems to make LVHR feasible:
– the Fast Two–Level HMM Decoding Algorithm has significantly sped up the recognition process without affecting the accuracy
– the integration of two different classifiers (HMMs and NNs) has significantly improved the recognition accuracy without affecting the recognition speed

The proposed LV recognition system is one of the first to deal with:
– unconstrained handwriting
– a very–large vocabulary
– omniwriter recognition
Conclusion and Implications

The results obtained are still far from meeting the throughput requirements of many applications. However, they are very promising and may stimulate further research on large vocabulary recognition.
Future Work

SPEED:
– use of heuristics in the fast two–level HMM decoding algorithm to further speed up the recognition process
– application of the FTL HMM decoding in other domains, such as speech recognition

ACCURACY:
– improvement of the duration model of isolated characters
– verification of undersegmented and oversegmented characters
– automatic selection of the N–best word hypotheses
Thank you for your attention !
THE END
Multiple Character Models
How can multiple character models be integrated into the lexical tree search?
Multiple Character Models

We use the maximum approximation to select, at each level, only the most likely model.

Advantage: linear growth, H(L–LS)V
Disadvantage: suboptimal
The Recognition Process
[Figure: the level–building recognition process for the lexicon entry "ABBECOURT": at each level, 10–state uppercase/lowercase character HMMs ('A'/'a', 'B'/'b', ..., 'T'/'t') are matched against observation frames 1..T, recording at every frame t the best score Pt*(l), back–pointer Bt*(l), and word Wt*(l); the final word hypotheses (ABBECOURT, ABILLY, ABIMES, ..., ERVILLE, YVIGNAC, ZONZA) are ranked from PT*(9)]
A Paradigm for HR
The Complexity of Handwriting Recognition
Assuming typical values of the parameters (T=30, L=10, M=10, V=80,000, H=2) and a standard Viterbi algorithm:
– Generic recognition process: C = O(T·M²·L·V) = 2.4 GFLOPS
– With two writing styles: C = O(H·T·M²·L·V) = 4.8 GFLOPS
– Mixture of writing styles: C = O(H^L·T·M²·V) = 245 GFLOPS
Fast Two–Level Algorithm (1)
Fast Two–Level Algorithm (2)
Fast Two–Level Algorithm (3)
Why Handwriting Recognition ?
– To facilitate the transfer of information between people and machines
– An easy way of interacting with a computer
– To automate the processing of a great number of handwritten documents
Requirements
– Focus on the weaknesses of the word recognition system (HMMs)
– Better discrimination of similar shapes
– Results should be systematically integrable with the HMMs
– The verification should not cause delays in the recognition process