

International Journal of Engineering Sciences, 2(1) January 2013, Pages: 14-17

TI Journals

International Journal of Engineering Sciences www.waprogramming.com

ISSN 2306-6474

* Corresponding author. Email address: [email protected]

A Brief Review and Survey of Segmentation for Character Recognition

Rohini B. Kharate *1, Dr. S.M. Jagade 2, Sushilkumar N. Holambe 3
1 P.G. Student (ME-E&TC), TPCT’s College of Engineering, Osmanabad (M.S.) India.
2 Principal, TPCT’s College of Engineering, Osmanabad (M.S.) India.
3 TPCT’s College of Engineering, Osmanabad, India.

ARTICLE INFO

Keywords: Image Segmentation, Euclidean Distance Metric, Pattern, Feature Extraction, Recognition

General Terms: Character recognition, gray-scale character recognition, multistage graph search, recognition-based segmentation, Training, Learning rule, Distorted patterns

ABSTRACT

In the proposed methodology, character segmentation regions are determined using projection profiles and topographic features extracted from gray-scale images. A nonlinear character segmentation path is then found in each character segmentation region using a multi-stage graph search algorithm. Finally, a recognition-based segmentation method is adopted to confirm the nonlinear character segmentation paths and the recognition results. Experiments with various kinds of printed documents indicate that the proposed methodology is very effective for the segmentation and recognition of touching and overlapping characters. This paper is an approach to developing a method that obtains optimized results using easily available resources. Segmentation helps to extract common features among the distinct handwriting styles of different people.

© 2013 Int. j. eng. sci. All rights reserved for TI Journals.

1. Introduction

Character recognition, also known as OCR (Optical Character Recognition), is an area within pattern recognition. Optical Character Recognition deals with the automatic recognition of different characters in a document image, leading to clear and unambiguous recognition, analysis and understanding of the document content. The task of recognition can be broadly separated into two categories: machine-printed data and handwritten data. Machine-printed characters consist of the font chosen by the user; they are unique and uniform. Handwritten characters, in contrast, are non-uniform: their size and shape depend on the writer and on the pen used. The handwriting of the same writer may also vary depending on the situation in which he or she is writing. It is therefore very difficult to design software capable of identifying the characters with great accuracy.

Basically, identification of characters is one type of pattern recognition process. Variation in handwriting leads to great difficulty in identifying the character patterns. Different writing styles distort the patterns away from the standard patterns used to train the network, giving false results. A strong generalization method is required to identify the distorted patterns. The Multiscale Training Technique (MST) is used in many places to solve the generalization problem [1, 2, 3]. The results of MST depend largely on the resolution of the character images. Image resolution and training speed have to be optimized to achieve the highest percentage of accuracy.

Work has been done on identifying characters in Devanagari script by combining multiple feature extraction techniques such as intersection, shadow features, chain code histograms and straight line fitting [4]. Another approach to feature extraction is to calculate only twelve directional feature inputs depending on the gradients, where the features of the handwritten characters are the directions of the pixels with respect to their neighboring pixels [5]. Hybrid methods are also applied to recognize handwritten characters. One such method is a prototype learning/matching method that can be combined with support vector machines (SVM) in pattern recognition [6]. K-nearest neighbor methods can also be used to recognize the patterns [7]. In the K-nearest neighbor method, a pattern is classified by looking at the k nearest patterns, i.e. those having the least Euclidean distance to it. Artificial neural nets (perceptron learning) are trained and then used to identify the characters [8, 9, 10]. Obtaining 100% accuracy, however, is still a challenge for many such nets.

In this paper, a concept for recognizing handwritten character patterns, called the Row-wise Segmentation Technique (RST), has been developed and implemented. RST helps to a great extent in minimizing pattern recognition errors caused by different handwriting styles. In this method, the input pattern matrix is segmented row-wise into different groups. The target pattern is also grouped, where each group is the numeric equivalent of the alphabetical position of each English letter. Each input segment is fully interconnected with each target group. The number of target groups is equal to the number of rows in the input matrix. The overall program is divided into two parts, training and testing. Training requires the net to read the segmented input patterns. Testing requires the net to read any test character pattern, read the produced target samples, count the majority of the samples, and find the numeric equivalent of the majority sample to identify the character.
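The row-wise voting idea described above can be illustrated with a minimal sketch. It is not the authors' trained network: the per-row decision is replaced here by a hypothetical stand-in (the nearest prototype row by Hamming distance), and the function name `row_vote` and the prototype dictionary are assumptions made only for illustration. What the sketch does keep from the description is that each row yields one target sample (the position of a letter in the alphabet) and the majority of the samples decides the character.

```python
from collections import Counter


def row_vote(pattern, prototypes):
    """Classify a binary pattern by a majority vote over its rows.

    pattern: list of rows, each a list of 0/1 pixels.
    prototypes: dict mapping an uppercase letter to a pattern of the same shape.
    Returns (winning letter, list of per-row votes as alphabet positions).
    """
    votes = []
    for r, row in enumerate(pattern):
        # Stand-in for the trained net (assumption): pick the letter whose
        # r-th prototype row differs from this row in the fewest pixels.
        best = min(
            prototypes,
            key=lambda letter: sum(a != b for a, b in zip(row, prototypes[letter][r])),
        )
        # Numeric equivalent of the letter: A -> 1, B -> 2, ...
        votes.append(ord(best) - ord("A") + 1)
    winner, _ = Counter(votes).most_common(1)[0]
    return chr(winner - 1 + ord("A")), votes


# Tiny 3 x 3 prototypes, for illustration only.
PROTOTYPES = {
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}
# A distorted "L": the last row looks more like the last row of "T".
distorted_l = [[1, 0, 0], [1, 0, 0], [0, 1, 0]]
letter, votes = row_vote(distorted_l, PROTOTYPES)
```

In this toy case two of the three rows vote for the letter at position 12, so the distorted pattern is still identified as "L" even though one row is misread.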


2. Character Recognition & Research

Document image analysis and understanding transforms the information in a paper document into an electronic format, i.e. text, without manual keyboard entry. Designing a system that gives high recognition accuracy regardless of the quality of the input document and of variations in character font style is still a challenging and interesting task. Optical Character Recognition (OCR) is the process of translating images of handwritten or printed text into a format understood by machines. Much of the world's literature, history and other information exists in hard-copy documents. OCR systems convert the text on paper into electronic form.

A segmentation-based OCR system can be divided into preprocessing of the input, segmentation, feature extraction and classification. For document analysis we have to use a segmentation-based OCR system. Preprocessing could be skipped if the document were noise free, but in practice this is not possible because scanning introduces noise into the document image. Segmentation is the decomposition of an image into sub-images. Segmentation depends on local decisions regarding shape similarity as well as global decisions regarding the surrounding context.

In the late 1960s and 1970s, researchers observed that segmentation caused more errors in reading characters, whether hand- or machine-printed. In the 1980s, researchers gave OCR a new dimension by extending it to less constrained documents [1]. Several authors have surveyed segmentation [2] [3] [4] [5] [6] [7] or document analysis [8] [9]; more details can be found in [10]. Many techniques have been developed for Latin script. For Devanagari, work has been done by Veena Bansal and R.M.K. Sinha [11][12] and by U. Garain and B. B. Chaudhuri [15], but progress still lags behind. More research on noise-free, error-free and distortion-free segmentation techniques is required to achieve high recognition accuracy in Devanagari document processing and OCR.
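As a rough illustration of segmentation as "decomposition of an image into sub-images", the sketch below splits a binary text-line image at empty columns of its vertical projection profile, the kind of profile the abstract mentions for finding segmentation regions. It is a simplified sketch under stated assumptions: the image is a list of rows with ink pixels equal to 1, the function names are hypothetical, and touching or overlapping characters (which the paper handles with a nonlinear path search) are not separated by this method.

```python
def vertical_profile(img):
    """Count ink pixels (value 1) in each column of a binary image."""
    return [sum(col) for col in zip(*img)]


def split_columns(img):
    """Return (start, end) column ranges, end exclusive, of consecutive ink columns."""
    profile = vertical_profile(img)
    segments, start = [], None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                      # a run of ink columns begins
        elif count == 0 and start is not None:
            segments.append((start, x))    # an empty column closes the run
            start = None
    if start is not None:                  # run reaching the right border
        segments.append((start, len(profile)))
    return segments


line = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 1, 1],
]
segments = split_columns(line)
```

Each returned range can then be cropped out as one candidate character sub-image and passed on to feature extraction.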
The extensive applications of Handwritten Character Recognition (HCR), such as recognizing the characters on bank checks and car plates, have driven the development of various new HCR systems, such as optical character recognition (OCR) systems. Many pattern recognition techniques exist, including template matching, neural networks, syntactical analysis, wavelet theory, hidden Markov models, Bayesian theory and minimum distance classifiers. These techniques have been explored to develop robust HCR systems for different languages such as English (Numeral) [1-3], Farsi [4], Chinese (Kanji) [5,6], Hangul (Korean) scripts [7], Arabic script [8] and also for some Indian languages like Devanagari [9], Bengali [10], Telugu [11-13] and Gujarati [14]. HCR is an area of pattern recognition that has been the subject of considerable research during the last few decades. Machine simulation of human functions has been a very challenging research field since the advent of digital computers [15]. The ultimate objective of any HCR system is to simulate human reading capabilities so that the computer can read, understand, edit and work with text as humans do.

English is used all over the world for communication, and also in many Indian offices such as railways, passport, income tax, sales tax, defense, and public sector undertakings such as banks, insurance, courts, economic centers and educational institutions. Many works on handwritten English character recognition have been published, but achieving minimum training time together with high recognition accuracy is still an open problem. It is therefore of great importance to develop an automatic handwritten character recognition system for the English language. In this paper, efforts have been made to develop such a system with high recognition accuracy and minimum classification time. Handwritten character recognition is a challenging problem in the pattern recognition area.

One work is due to Kumar & Singh [13], who proposed a Zernike moments based approach for English handwritten character recognition. Another work on English handwritten character recognition [14] used 64-dimensional chain code features, but no standard OCR is yet available for the task. An excellent survey of the area is given in [15]. For recognition of handwritten numerals, Ramakrishnan et al. [16] used an independent component analysis technique for feature extraction from numeral images. Another study considered a strategy combining the decisions of multiple classifiers. In all three of these studies, very small sets of samples were considered. In an attempt to develop a bilingual handwritten numeral recognition system, Lehal and Bhatt [17] used a set of global and local features derived from the right and left projection profiles of the numeral images for recognition of handwritten numerals of Devanagari and Roman scripts.

3. The OCR Data Set Creation

We collected handwritten characters from 7000 people of different age groups (3 to 75 years); the datasheet used for collection is shown in Figure 1. We visited schools, high schools, colleges, government offices and adult education schools for the data collection. During preprocessing we considered the distortion in the images caused by the writer's pen and writing quality. We performed preprocessing operations to rectify distorted images and improve image quality, ensuring better-quality edges in the subsequent edge determination step. To remove noise and diminish spurious points introduced by uneven writing surfaces, we used a filtering operation. We performed smoothing, sharpening, thresholding and contrast adjustment using filtering operations [22][23]. We applied skew normalization, slant normalization, size normalization and curve smoothing to remove all types of variation introduced during writing and to obtain standardized data [24]. We then performed a thinning operation to obtain better features [25].
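Two of the preprocessing steps named above, thresholding and size normalization, can be sketched as follows. The paper does not state which algorithms or parameters were used, so everything specific here is an assumption: 8-bit gray values with dark ink on a light background, a fixed global threshold of 128, cropping to the ink bounding box, and nearest-neighbor resampling to a square grid. The function names are hypothetical.

```python
def binarize(gray, threshold=128):
    """Global thresholding: pixels darker than the threshold become ink (1)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def crop_to_ink(img):
    """Cut away empty border rows and columns around the character."""
    rows = [r for r, row in enumerate(img) if any(row)]
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    if not rows:                       # blank image: nothing to crop
        return img
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]


def resize_nearest(img, size):
    """Size normalization to size x size by nearest-neighbor sampling."""
    h, w = len(img), len(img[0])
    return [
        [img[r * h // size][c * w // size] for c in range(size)]
        for r in range(size)
    ]


gray = [
    [255, 255, 255, 255],
    [255, 10, 20, 255],
    [255, 30, 255, 255],
    [255, 255, 255, 255],
]
normalized = resize_nearest(crop_to_ink(binarize(gray)), 4)
```

Skew and slant normalization, smoothing and thinning are left out of the sketch; they would be applied in the same chain before the features are computed.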


Figure 1. Character Set written by Different persons

4. Feature Extraction

We extracted features from our dataset. The feature set consists of the local intensity distribution of gradients computed on a 5 × 5 grid using 16 quantized gradient orientations [18,19], giving a feature vector of size 5 × 5 × 16 = 400. The original binary image is converted to a gray image using a Gaussian filter. The number of blocks is initially 9 × 9 and is downsampled to 5 × 5. To obtain the 16 direction levels, a Gaussian filter and a Roberts filter are applied to the character image to obtain a gradient image. The arc tangent of the gradient is quantized into 16 directions, and the strength of the gradient is accumulated in each direction in each block.

5. Euclidean Distance-Based K-NN Classification

In KNN classification, training patterns are plotted in d-dimensional space, where d is the number of features present. These patterns are plotted according to their observed feature values and are labeled according to their known class. An unlabelled test pattern is plotted within the same space and is classified according to the most frequently occurring class among its K most similar training patterns, its nearest neighbors. The most common similarity measure for KNN classification is the Euclidean distance metric, defined between feature vectors as:

$$\mathrm{euc}(\vec{x}, \vec{y}) = \sqrt{\sum_{i=1}^{f} (x_i - y_i)^2} \qquad (1)$$

where f represents the number of features. Smaller distance values represent greater similarity [20, 21].

6. Results

We used 20,000 handwritten sample data files. We used 1200 for testing. We organized the data into classes, with a separate class assigned to each character (i.e. vowels, consonants without modifiers, consonants with modifiers(%)). Our results are given in Table 1. We computed the accuracy for each individual English character; the table shows the average accuracy. For vowels we obtain high accuracy with low rejection and error, because they are all unique. For consonants, rejection and error are higher.
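The feature extraction of Section 4 and the classifier of Section 5 can be sketched together as below. This is a simplified illustration, not the authors' implementation: the Gaussian filtering and the 9 × 9 to 5 × 5 downsampling are omitted (gradient strength is accumulated directly on a 5 × 5 grid), the image is assumed to be a list of rows of numbers with at least two rows and two columns, and the function names and default `k=3` are assumptions. The distance function follows Equation (1).

```python
import math
from collections import Counter


def direction_features(img, grid=5, n_dirs=16):
    """Accumulate gradient strength per direction bin in each grid block."""
    h, w = len(img), len(img[0])
    feats = [0.0] * (grid * grid * n_dirs)      # 5 * 5 * 16 = 400 by default
    for r in range(h - 1):
        for c in range(w - 1):
            # Roberts cross operator: differences along the two diagonals.
            gx = img[r + 1][c + 1] - img[r][c]
            gy = img[r + 1][c] - img[r][c + 1]
            strength = math.hypot(gx, gy)
            if strength == 0:
                continue
            # Arc tangent of the gradient, quantized into n_dirs directions.
            angle = math.atan2(gy, gx) % (2 * math.pi)
            d = int(angle / (2 * math.pi) * n_dirs) % n_dirs
            block = (r * grid // (h - 1)) * grid + (c * grid // (w - 1))
            feats[block * n_dirs + d] += strength
    return feats


def euclidean(x, y):
    """Equation (1): Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def knn_classify(test, training, k=3):
    """training: list of (feature_vector, label); majority label of k nearest."""
    nearest = sorted(training, key=lambda item: euclidean(test, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy two-feature training set, for illustration only.
TRAINING = [([0, 0], "A"), ([0, 1], "A"), ([5, 5], "B"), ([6, 5], "B"), ([5, 6], "B")]
label = knn_classify([1, 0], TRAINING, k=3)
```

In practice each training and test image would first be passed through `direction_features`, and the resulting 400-element vectors would take the place of the toy two-element vectors used here.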


Table 1. Average Result Of Our Dataset.

                                 Accuracy   Rejection   Error
vowels                           98%        3%          2%
consonants without modifiers     97.50%     6%          5%
consonants with modifiers(%)     94%        7%          9%

References

[1] J. Schuermann, Reading Machines, Proc. 6th Int. Conf. on Pattern Recognition, Munich, 1982.
[2] L.D. Harmon, Automatic Recognition of Print and Script, Proceedings of the IEEE, vol. 60, no. 10, pp. 1165-1177, Oct. 1972.
[3] G. Dimauro, S. Impedovo and G. Pirlo, From Character to Cursive Script Recognition: Future Trends in Scientific Research, Proc. 11th Int. Conf. on Pattern Recognition, vol. II, page 516, Aug. 1992.
[5] C.E. Dunn and P.S.P. Wang, Character Segmenting Techniques for Handwritten Text - A Survey, Proc. 11th Int. Conf. on Pattern Recognition, vol. II, page 577, August 1992.
[6] E. Lecolinet and O. Baret, Cursive Word Recognition: Methods and Strategies, Fundamentals in Handwriting Recognition, S. Impedovo (Ed.), NATO ASI Series F: Computer and Systems Sciences, vol. 124, Springer Verlag, 1994, pages 235-263.
[7] G. Lorette and Y. Lecourtier, Is Recognition and Interpretation of Handwritten Text: a Scene Analysis Problem? Pre-Proceedings IWFHR III, Buffalo, page 184, May 1993.
[8] C.C. Tappert, C.Y. Suen and T. Wakahara, The State of the Art in On-line Handwriting Recognition, IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 12, no. 8, page 787, Aug. 1990.
[9] D.G. Elliman and I.T. Lancaster, A Review of Segmentation and Contextual Analysis Techniques for Text Recognition, Pattern Recognition, vol. 23, no. 3/4, pp. 337-346, 1990.
[10] H. Fujisawa, Y. Nakano and K. Kurino, Segmentation methods for character recognition: from segmentation to document structure analysis, Proceedings of the IEEE, vol. 80, no. 7, pp. 1079-1092, July 1992.
[11] R. G. Casey and E. Lecolinet, "A survey of methods and strategies in character segmentation," IEEE Trans. Pattern Anal. Machine Intell., vol. 18, July 1996.
[12] Veena Bansal and R.M.K. Sinha, Partitioning and Searching Dictionary for Correction of Optically-Read Devanagari Character Strings, in Proceedings - Fifth International Conference on Document Analysis and Recognition, IEEE
[13] Kumar S. and Singh C. (2005) In Proc. Intl. Conf. on Cognition and Recognition, 514-520.
[14] Sharma N., Pal U., Kimura F. and Pal S. (2006) In Proc. Indian Conference on Computer Vision Graphics and Image Processing, 805-816.
[15] Pal U. and Chaudhuri B.B. (2004) Pattern Recognition, 37, 1887-1899.
[16] Ramakrishnan K.R., Srinivasan S.H. and Bhagavathy S. (1999) Proc. of the 5th ICDAR, 414-417.
[17] Lehal G.S. and Nivedan Bhatt (2001) Advances in Multimodal Interfaces - ICMI 2001, Tan T., Shi Y. and Gao W. (2000) (Editors), LNCS, 1948, 442-449.
[18] Wakabayashi T., Tsuruoka S., Kimura F. and Miyake Y. (1995) System and Computers in Japan, 26 (8), 35-44.
[19] Kimura F., Miyake Y., Shridhar M. (1994) Proc. of 4th IWFHR.
[20] Cover T.M. and Hart P. E. (1967) IEEE Trans. Inform. Theory, IT-13, 21-27.
[21] Dasarathy B. V. (1991) IEEE Computer Society Press, New York.
[22] Lee J. S. (1983) Digital Computer Vision, Graphics and Image Processing, 24, 255-269.
[23] Haralick R. M. and Shapiro L. G. (1992) Computer and Robot Vision, 1, Addison Wesley Publishing.
[24] Guerfali W. and Plamondon R. (1993) Pattern Recognition, 26 (3), 418-431.
[25] Lam L., Lee S. W. and Suen C. Y. (1992) IEEE Trans. Pattern Analysis and Machine Intelligence, 14, 869-885.