
Proceedings of the 2009 12th International Conference on Computer and Information Technology (ICCIT 2009), 21-23 December 2009, Dhaka, Bangladesh

OSDT: Outer Shape Detection Technique for Recognition of Bangla Optical Character

Md. Allmran, Jobed Hossain, Tanay Dey, Bijan Kumar Debroy, Ahsan Habib Abir
Dept. of Computer Science & Engineering, Khulna University of Engineering & Technology, Khulna, Bangladesh
[email protected], [email protected], [email protected], [email protected], [email protected]

Abstract

Optical Character Recognition is one of the challenging fields in recognition of printed Bangla text. The main difficulties are that there are no precise techniques or algorithms for separating lines, words, and characters from printed Bangla text and for efficiently recognizing these separated characters. In this paper, we introduce new methods to separate lines, words, and characters from printed Bangla text. We also propose the Outer Shape Detection Technique (OSDT), a new technique to recognize each character based on its outer shape, which is unique. To accomplish this, the proposed technique scans each character from both its left side and its right side. Finally, our experimental results are compared with some prevalent techniques, and they show that our approach performs smoothly across different font sizes and numbers of characters.

Keywords: Optical Character Recognition, Filtering, Segmentation, Outer Shape Detection Technique.

I. INTRODUCTION

OCR means Optical Character Recognition. A Bangla OCR system converts printed Bangla text into an electronic version [1]. Recognition of printed Bangla text is a subject of interest for us and is also very important for Bangla language processing [2]. Bangla OCR has many potential applications, such as a reading aid for the blind (OCR and speech synthesis), automatic text entry into the computer, desktop publication, library cataloging and ledgering, and automatic reading for sorting of postal mail, bank cheques, and other documents [1]. However, there are no precise methods or algorithms for segmentation of lines, words, and characters from printed Bangla text. Another problem is the complexity of the shape of each character. Although some limited research has been done in this field, it is not sufficient to segment printed Bangla text properly. In our technique, to recognize the shape of a character we use the outer shape of every character, which is unique. For the segmentation, some new methods have been adopted to improve performance. There were a few problems in segmenting characters from a word when the characters overlap with one another and have no minimum gap between them; for those special cases, we used dedicated algorithms to separate them. The remaining part of the paper is organized as follows: related work is briefly discussed in Section II. In Section III, we discuss our proposed Bangla optical character recognition process in detail. In Section IV, we measure our performance through simulation and compare it with prevailing techniques. Finally, some conclusions are drawn in Section V.

II. RELATED WORK

Bangla is one of the richest languages of the world, so analysts and scientists all over the world are trying to computerize the Bangla language for further processing convenience. The work in [3] uses superimposed matrices; others use neural networks [2], [4], [5]. The main problem of the neural network technique is that the characters first need to be trained with the system to get an effective output. Suppose we consider a Bangla word: if we give this word as input to software that uses a neural network, it cannot recognize the characters. First we have to train each of its characters with the system; only then, if we use these characters as input, can the software successfully recognize them. However many times we run the software, the characters need to be trained first every time; otherwise it cannot recognize them at all. In contrast, our software does not need to train the characters; rather, it depends on the unique shape of every character. The Bangla OCR Apona Pathak [6] uses a neural network and faces the same problem described above. The Bangla OCR of BRAC [7] uses the Hidden Markov Model technique, which also incorporates character training in its recognition process. It is quite surprising that the 'Apona Pathak' software does not work properly even on the examples given on its own site [6].

III. PROPOSED BANGLA OPTICAL RECOGNITION TECHNIQUE

Throughout our recognition process we followed this process sequence:

Fig. 1. Process sequence block diagram: Input → Image acquisition and filtering → Gray scale and binary conversion → Segmentation → Output

A. Image Acquisition & Image Filtering

We started the whole process by getting an image document containing Bangla text. This image may be stored in any format, e.g., JPG, JPEG, or BMP. To remove garbage from the image we used image filtering.

B. Gray Scale Conversion and Binary Conversion

The image is converted into a gray scale image for our processing convenience. Then we convert the gray scale image into a binary-formatted image by considering each pixel as 1 (black) or 0 (white). When we get a pixel, it is considered binary 1 if black and binary 0 if white. As a result we get a full image structured with 1s and 0s.
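As an illustration of this step, here is a minimal C# sketch (the paper's implementation language), assuming the image has already been decoded into an RGB array; the luminance weights and the fixed threshold of 128 are our own assumptions, since the paper does not state how the black/white decision is made.

```csharp
using System;

// Sketch of the gray scale and binary conversion step, assuming the image has
// already been decoded into an RGB byte array rgb[y, x, channel]. The luminance
// weights and the fixed threshold of 128 are illustrative assumptions.
static class BinarizationSketch
{
    public static int[,] ToBinary(byte[,,] rgb)
    {
        int height = rgb.GetLength(0);
        int width = rgb.GetLength(1);
        var binary = new int[height, width];

        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                // Weighted gray scale conversion from the R, G, B channels.
                double gray = 0.299 * rgb[y, x, 0] + 0.587 * rgb[y, x, 1] + 0.114 * rgb[y, x, 2];

                // Dark pixels become 1 (black) and light pixels become 0 (white),
                // matching the 1/0 convention described above.
                binary[y, x] = gray < 128 ? 1 : 0;
            }
        }
        return binary;
    }
}
```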

C. Segmentation of Text Image

To segment the whole binary image containing Bangla characters into separate characters, we first separated each line. Then we separated each word from each line. Finally, the characters belonging to each word were separated. Each of these steps is described below.

C.1 Line Segmentation

In our system we first separated the lines from the binary image. We considered the binary image as a two-dimensional array; from this array we got the starting and ending binary data (i.e., the 1s) of each line. We scanned each pixel horizontally until a horizontal line containing all zero (white) pixels was found; then we assumed there was a separation between two lines. From Figure 2 we see that there is a separation between the lower boundary of the first line and the upper boundary of the second line. Sometimes some words may sit slightly above or below the base level of a line. Our technique also considers this case and handles it successfully: in Figure 2, the two words "evwo" and "Zviv" sit slightly above and below, respectively, the upper and lower base levels of the line. From the binary image we detected that distortion of a word and resolved it. In all of these cases the base line helps to identify the average word position in a line and also to detect special characters inside a word, such as "w" and "z".

Figure 2: Segmentation of lines from the text
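The row-scanning idea described above can be sketched as follows, assuming the page is available as the 0/1 matrix produced in the previous step; the helper name and the (start row, end row) return format are our own choices, not the paper's.

```csharp
using System;
using System.Collections.Generic;

// Sketch of line segmentation: a text line is a maximal run of rows that
// contain at least one black (1) pixel; an all-zero row marks a separation.
static class LineSegmentationSketch
{
    public static List<(int StartRow, int EndRow)> SegmentLines(int[,] binary)
    {
        int height = binary.GetLength(0);
        int width = binary.GetLength(1);
        var lines = new List<(int, int)>();
        int start = -1;                      // -1 means "not currently inside a line"

        for (int y = 0; y < height; y++)
        {
            bool rowHasInk = false;
            for (int x = 0; x < width; x++)
            {
                if (binary[y, x] == 1) { rowHasInk = true; break; }
            }

            if (rowHasInk && start == -1)
                start = y;                   // first inked row: a line begins here
            else if (!rowHasInk && start != -1)
            {
                lines.Add((start, y - 1));   // all-zero row: the line ended above it
                start = -1;
            }
        }
        if (start != -1)
            lines.Add((start, height - 1));  // line running to the bottom of the image
        return lines;
    }
}
```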

C.2 Word Segmentation

There must be some number of columns containing only zeros between two words of a particular line. So we scanned each pixel vertically until a vertical line containing all zero (white) pixels was found. If we count a certain number (which may vary with the font size) of consecutive vertical lines containing all zero (white) pixels, this specifies the end of a word.

Figure 3: Segmentation of words from a line (showing the inter-character gaps)

In Figure 3 there are several consecutive vertical lines containing only zero (white) pixels between the words "KvZj" and "eo". After the segmentation of each word, we store the width of each word in a separate array for further processing.

Figure 4: Classification of words considering the "matra" (full "matra", half "matra", and no "matra"), with the inter-character gaps marked
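The word segmentation step described above can be sketched in the same style; the gap threshold minGap stands for the font-size-dependent count of consecutive blank columns mentioned in the text, and its default value here is only illustrative.

```csharp
using System;
using System.Collections.Generic;

// Sketch of word segmentation inside one text line: a word ends when at least
// `minGap` consecutive all-white columns are seen. The value of minGap is
// font-size dependent, as noted in the text; 3 is only an illustrative default.
static class WordSegmentationSketch
{
    public static List<(int StartCol, int EndCol)> SegmentWords(
        int[,] binary, int lineStartRow, int lineEndRow, int minGap = 3)
    {
        int width = binary.GetLength(1);
        var words = new List<(int, int)>();
        int wordStart = -1, blankRun = 0, lastInkCol = -1;

        for (int x = 0; x < width; x++)
        {
            bool columnHasInk = false;
            for (int y = lineStartRow; y <= lineEndRow; y++)
            {
                if (binary[y, x] == 1) { columnHasInk = true; break; }
            }

            if (columnHasInk)
            {
                if (wordStart == -1) wordStart = x;  // a new word begins
                lastInkCol = x;
                blankRun = 0;
            }
            else if (wordStart != -1 && ++blankRun >= minGap)
            {
                words.Add((wordStart, lastInkCol));  // gap wide enough: word is complete
                wordStart = -1;
                blankRun = 0;
            }
        }
        if (wordStart != -1)
            words.Add((wordStart, lastInkCol));      // word running to the end of the line
        return words;
    }
}
```

The width of each word (EndCol - StartCol + 1) can then be stored in the separate array mentioned above.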

C.3 Character Segmentation

To segment characters from a word, we tried to find and remove the 'matra' from the word. But some words may have only a half 'matra' and some may have no 'matra' at all, as shown in Figure 4. So we first started scanning horizontally from the beginning of each word in each line. After completing the scan of the first row of a word, we calculate the number of 1s in that row. If the number of 1s in a horizontal row is greater than a certain percentage of the word width (the percentage depends on the font size used) and some further conditions are met, our algorithm decides whether the word has a full 'matra', a half 'matra', or no 'matra'. We used special segmentation techniques for particular cases, based for example on which row is currently being scanned and how many rows in total will be scanned for a word.

Generally, there is at least one vertical column containing only white pixels (i.e., all data in that column are zero) between two characters of a word. We scanned each pixel (binary information) vertically until a vertical line containing all zeros was found. If we found such a vertical line, we considered it a separation between two characters in the word.
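A hedged sketch of this character segmentation step is given below; the fraction of the word width used to detect the 'matra' and the number of top rows inspected are illustrative assumptions, since the paper only says "a certain percentage" that depends on the font size.

```csharp
using System;
using System.Collections.Generic;

// Sketch of character segmentation within one word region of the binary matrix.
// The matra (head line) is detected as a top row whose count of 1s exceeds a
// percentage of the word width; both the 0.6 ratio and the number of top rows
// inspected are illustrative assumptions, not values taken from the paper.
static class CharacterSegmentationSketch
{
    public static List<(int StartCol, int EndCol)> SegmentCharacters(
        int[,] binary, int top, int bottom, int left, int right,
        double matraRatio = 0.6, int rowsToCheck = 3)
    {
        int wordWidth = right - left + 1;

        // 1. Remove the matra: blank any of the first few rows that are mostly black.
        for (int y = top; y <= Math.Min(bottom, top + rowsToCheck - 1); y++)
        {
            int ones = 0;
            for (int x = left; x <= right; x++) ones += binary[y, x];
            if (ones > matraRatio * wordWidth)
                for (int x = left; x <= right; x++) binary[y, x] = 0;
        }

        // 2. With the matra gone, characters are separated by all-zero columns.
        var chars = new List<(int, int)>();
        int start = -1;
        for (int x = left; x <= right; x++)
        {
            bool columnHasInk = false;
            for (int y = top; y <= bottom; y++)
                if (binary[y, x] == 1) { columnHasInk = true; break; }

            if (columnHasInk && start == -1) start = x;
            else if (!columnHasInk && start != -1) { chars.Add((start, x - 1)); start = -1; }
        }
        if (start != -1) chars.Add((start, right));
        return chars;
    }
}
```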

C.4 Character Shape Detection using the Outer Shape Detection Technique (OSDT)

We know that Bangla characters are complex. At the beginning of our analysis, we tried to find similarities among characters. After a long analysis, we found that the shape of each character is unique except for a few. By this analysis, we expressed each character with a few turns such as left, right, vertical, etc. In our recognition technique, we tried to detect these turns and then made a final decision to determine which character it was. First, we stored the binary format of the image in a two-dimensional array. From Figure 5(a) and (b) we see that many consecutive 1s make up a character, "e". We made two layers of a binary-formatted character: one is the outer layer and the other is the inner layer. In Figure 5 we can easily observe these two layers. The 1s on the outermost side construct the outer shape of the character.

Figure 5: (a) Binary representation of a character (b) Inner and outer shape of a character

The 1s on the innermost side construct the inner shape of the character. The 1s lying between the outer and inner shapes are set to zero, applying some conditions, for the OSDT computation. We use the outer shape to recognize a character, so the proposed technique is named the "Outer Shape Detection Technique".
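One simple way to realise the outer layer described above is to keep, for every row, only the leftmost and rightmost black pixels; the sketch below does exactly that. The exact conditions the authors apply are not given in the paper, so this is an assumption.

```csharp
using System;

// Sketch of isolating the outer shape of one character: for every row, only the
// leftmost and rightmost black pixels are kept and everything in between is set
// to zero. This is one simple realisation of the "outer layer" described above.
static class OuterShapeSketch
{
    public static int[,] ExtractOuterShape(int[,] glyph)
    {
        int height = glyph.GetLength(0);
        int width = glyph.GetLength(1);
        var outer = new int[height, width];

        for (int y = 0; y < height; y++)
        {
            int first = -1, last = -1;
            for (int x = 0; x < width; x++)
            {
                if (glyph[y, x] == 1)
                {
                    if (first == -1) first = x;  // leftmost 1 of this row
                    last = x;                    // rightmost 1 seen so far
                }
            }
            if (first != -1)
            {
                outer[y, first] = 1;             // keep only the two boundary pixels
                outer[y, last] = 1;
            }
        }
        return outer;
    }
}
```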

C.4.1 Scanning Procedure

With our Outer Shape Detection Technique, we scanned the left outer shape of each character horizontally from left to right to make one decision, and we also scanned the right outer shape from right to left to make another decision. After checking these two decisions we get the final decision. Some characters and their shapes are given in Table 1. Here we use the symbolic letters 'L', 'R', 'V', and 'S', where 'L' means left turn, 'R' means right turn, 'V' means vertical line, and 'S' means a straight vertical line that is longer than 'V'.

Table 1: Some characters with their corresponding shapes in OSDT

Character   Scanning from left to right   Scanning from right to left
e           LR                            S
h           RLR                           S
(the remaining rows, for further Bangla characters, are not legible in the source)

The starting and ending indexes of each character are stored in the character segmentation step. First we removed unnecessary binary data and resized each character. To scan from left to right we started from the first index, e.g., (0, 0), of the resized 2D array of each character. We scanned horizontally until a black pixel (binary 1) of that row was found. Then we moved to the first index of the next row. We again started our process from the last scanned index of the previous row and continued until a black pixel of the current row was found. By this process we scanned all the 1s of the binary data of each character and expressed the decision with some letters, i.e., 'LR', 'RV', etc.


Let i be the row index and j be the column index of the array. If the index of the first 1 is [1, 4], i.e., i = 1 and j = 4, and the next index of a black pixel (i.e., a 1) is [2, 3], i.e., i = 2 and j = 3, we observe that j decreases row by row, so we take the decision that it is a left turn (L). Otherwise, if j increases, we decide that it is a right turn (R). From Figure 6(a) we get a consecutive left turn (L) and right turn (R) while scanning from left to right, so here the decision is LR. In the case of scanning from right to left, we started our scanning from the index in the top right corner. From this index we scanned the binary data of the character horizontally until a black pixel was found.
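The turn decisions can be sketched as follows: for each row we take the boundary column of the outer shape and compare it with the previous row's boundary. Collapsing repeated letters into one and omitting the V-versus-S length rule are simplifications of ours; the paper does not spell out these details.

```csharp
using System;
using System.Text;

// Sketch of the OSDT turn decision for one scanning direction. For each row we
// take the boundary column (leftmost 1 when scanning left to right, rightmost 1
// when scanning right to left) and compare it with the previous row: a decrease
// is a left turn (L), an increase is a right turn (R), no change is vertical (V).
static class TurnSequenceSketch
{
    public static string ScanDecision(int[,] glyph, bool leftToRight)
    {
        int height = glyph.GetLength(0);
        int width = glyph.GetLength(1);
        var decision = new StringBuilder();
        int previous = -1;

        for (int y = 0; y < height; y++)
        {
            // Find the boundary black pixel of this row for the chosen direction.
            int boundary = -1;
            if (leftToRight)
            {
                for (int x = 0; x < width; x++)
                    if (glyph[y, x] == 1) { boundary = x; break; }
            }
            else
            {
                for (int x = width - 1; x >= 0; x--)
                    if (glyph[y, x] == 1) { boundary = x; break; }
            }
            if (boundary == -1) continue;        // empty row: nothing to compare

            if (previous != -1)
            {
                char turn = boundary < previous ? 'L'
                          : boundary > previous ? 'R'
                          : 'V';
                // Record a letter only when the kind of turn changes.
                if (decision.Length == 0 || decision[decision.Length - 1] != turn)
                    decision.Append(turn);
            }
            previous = boundary;
        }
        return decision.ToString();
    }
}
```

With the V-versus-S length rule added, the two calls ScanDecision(outer, true) and ScanDecision(outer, false) would reproduce the 'LR' and 'S' entries of Table 1 for the character "e".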

Figure 6: (a) Left to right scanning (b) Right to left scanning (c) Complete shape of the character

Figure 7: (a) Some special characters in the Bangla language (b) A special character "w" with a main character "K" (c) After separating the "w" we get the main character "K"

Then we went to the next row, starting from its rightmost index, and continued our process until a 1 was found. By this process we scanned all the 1s of the binary data of the character and took a decision based on S, V, etc. From Figure 6(b) we took the decision that it was a straight (S) line, so here the decision was S. One may wonder why scanning from right to left is needed at all. To detect a character accurately we need this right-to-left scanning as well as the left-to-right scanning: for example, if we closely examine the two Bangla characters "e" and "K", we can observe that the shapes of the left sides of both characters are the same, and the only difference is on the right side. So scanning horizontally only from left to right will give the same decision for both "e" and "K". To identify both characters uniquely and precisely we need a further step, i.e., scanning from the right side to the left side. After scanning from right to left we get another decision for each character. Similarly, the characters "e" and "h" have the same decision for right-to-left scanning but totally different decisions for left-to-right scanning: the right-to-left decision for both "e" and "h" is 'S', while their final left-to-right decisions are 'LR' and 'RLR' respectively. From Figure 6(c) we see the complete shape of the character, so we can say that the character (from Table 1) is "e". We check these two decisions precisely to recognize each character. We also considered some special characters, like those shown in Figure 7(a). These characters always appear with an alphabet within a word, and for these cases we applied our own algorithm to segment them properly. For example, in Figure 7(b) we can easily see an alphabet "K" with a "w". By scanning from left to right we get the shape of the "w", which is unique. After detecting the "w" we remove it and continue our process with the alphabet "K". To detect such characters we used different algorithms.
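A hedged sketch of this special-character handling is shown below; it reuses the ScanDecision helper from the earlier sketch, and both the karCode string and the assumption that the dependent sign occupies the leading columns of the glyph are ours, not details given in the paper.

```csharp
using System;

// Sketch of the special-character handling described above: if the left-to-right
// scan of the glyph matches the known code of a dependent sign such as the
// "w"-kar, the columns it occupies are blanked and OSDT is applied again to the
// remaining main character. The karCode value and the leading-columns assumption
// are illustrative only.
static class SpecialCharacterSketch
{
    public static void RemoveLeadingKar(int[,] glyph, string karCode, int karWidth)
    {
        // Re-use the left-to-right turn decision from the scanning sketch.
        string leftScan = TurnSequenceSketch.ScanDecision(glyph, leftToRight: true);
        if (!leftScan.StartsWith(karCode))
            return;                                  // no dependent sign detected

        // Blank the columns belonging to the sign so only the main character remains.
        for (int y = 0; y < glyph.GetLength(0); y++)
            for (int x = 0; x < Math.Min(karWidth, glyph.GetLength(1)); x++)
                glyph[y, x] = 0;
    }
}
```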

To describe the entire process in a nutshell, the algorithm is stated below.

D. Algorithm

The following algorithm illustrates the procedure discussed above:
1. Image acquisition
2. Gray scale conversion and binary conversion
3. Line splitting from the binary data of the text image
4. Word splitting from each line
   - Try to delete the horizontal line ("matra") from a word, considering the relevant conditions
5. Character splitting from each word
   - "Akar (v)" splitting if it is closely coupled with a character
6. Mapping of character positions inside the binary data of the text
7. Resizing of each character from the main mapping of characters inside the binary data of the text
8. Shape detection
   - Find the "matra", "rashee-kar (w)", or other special character closely coupled with the main character
   - Remove the special character and send the main character to step 9
9. First left and then right scanning of each character using OSDT
   - If the shape of the currently scanned character matches the shape of other characters, then apply vertical scanning from top to bottom and then from bottom to top
10. Repeat steps 3 to 9 until the shapes of all characters in the text are detected
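To tie the ten steps together, the following driver sketch wires up the helper methods sketched in the earlier subsections; LoadRgb, ExtractGlyph, and the two-entry stand-in for Table 1 are placeholders of ours for parts the paper describes only in prose.

```csharp
using System;
using System.Collections.Generic;

// End-to-end sketch of the ten-step pipeline above, combining the helper classes
// sketched earlier. LoadRgb, ExtractGlyph, and the shape table contents are
// placeholders; a full implementation would decode the image file, crop and
// resize each glyph, and carry the complete Table 1.
static class OsdtPipelineSketch
{
    // A tiny stand-in for Table 1: (left-to-right code, right-to-left code) -> character.
    static readonly Dictionary<(string, string), string> ShapeTable =
        new Dictionary<(string, string), string>
        {
            { ("LR", "S"), "e" },
            { ("RLR", "S"), "h" },
        };

    public static string Recognize(string imagePath)
    {
        byte[,,] rgb = LoadRgb(imagePath);                  // steps 1-2: acquisition and filtering (placeholder)
        int[,] binary = BinarizationSketch.ToBinary(rgb);
        var text = new System.Text.StringBuilder();

        foreach (var line in LineSegmentationSketch.SegmentLines(binary))                                   // step 3
        {
            foreach (var word in WordSegmentationSketch.SegmentWords(binary, line.StartRow, line.EndRow))   // step 4
            {
                foreach (var ch in CharacterSegmentationSketch.SegmentCharacters(                           // steps 5-6
                             binary, line.StartRow, line.EndRow, word.StartCol, word.EndCol))
                {
                    int[,] glyph = ExtractGlyph(binary, line.StartRow, line.EndRow, ch.StartCol, ch.EndCol); // step 7
                    int[,] outer = OuterShapeSketch.ExtractOuterShape(glyph);                                // step 8
                    string leftCode = TurnSequenceSketch.ScanDecision(outer, true);                          // step 9
                    string rightCode = TurnSequenceSketch.ScanDecision(outer, false);
                    text.Append(ShapeTable.TryGetValue((leftCode, rightCode), out var c) ? c : "?");
                }
            }
            text.Append('\n');                                                                               // step 10 is the loop itself
        }
        return text.ToString();
    }

    // Placeholders: real implementations would decode the image file and crop/resize the glyph.
    static byte[,,] LoadRgb(string path) => throw new NotImplementedException();
    static int[,] ExtractGlyph(int[,] b, int top, int bottom, int left, int right) => throw new NotImplementedException();
}
```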

IV. PERFORMANCE SIMULATION

To perform the simulation we implemented our algorithm in C# .NET. The simulation parameters are shown in Table 2.

Table 2: Simulation parameters

Parameter            Value
Platform             C# .NET
Input file format    JPG, JPEG, BMP
Output file format   .doc, .docx, .txt
Font name            SutonnyMJ
No. of characters    50 to 300
Font size            20pt, 28pt, 36pt

B. Results

Our system has been tested with several image files with different font sizes and different numbers of characters in the image. The system accuracy is calculated as [4]:

Accuracy = (total number of recognized characters in the document / total number of characters in the document) x 100%    (1)
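For example, for the first document in Table 3 below (font size 36, 200 characters in the document, 197 of them recognized), equation (1) gives Accuracy = (197 / 200) x 100% = 98.50%.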

For different image files we got different performance. The summary of the simulation results is shown in Table 3.

Table 3: Summary of simulation results

Font size   Total number of characters   Number of recognized characters   Execution time (s)   Accuracy
            in the document              in the document
36          200                          197                               2.91                 98.50%
36          250                          240                               3.53                 96.00%
36          300                          299                               4.11                 99.66%
20          200                          143                               3.10                 71.50%
20          300                          190                               4.20                 63.33%
28          250                          215                               3.77                 86.00%

C. Comparison with other Bangla OCR Techniques

To compare the performance with other prevalent Bangla OCR software available on the web, we ran the simulations with a common form of input for the different techniques. Figure 8 is the corresponding graph of total execution time over different numbers of characters inputted. Figure 8 shows that our technique takes the shortest time to produce the output. As our proposed OSDT uses the outer shape of a character to detect each character uniquely, it incurs less cost because it incorporates a less complex technique. On the other hand, OCR software such as "Apona Pathak" and "BRAC OCR" integrate neural network and hidden Markov model techniques respectively, which are complex methods, so they incur immense execution time compared to OSDT.

Figure 8: Total execution time (seconds) over different numbers of characters inputted, comparing OSDT, BRAC OCR, and Apona Pathak

Figure 9: Accuracy of the techniques over different font sizes, comparing OSDT, BRAC OCR, and Apona Pathak

Figure 9 represents the accuracy of the different techniques over different font sizes. For the accuracy calculation we used equation (1). Table 3 shows the accuracy results at different font sizes for our technique. Most OCR techniques depend on the size of the font throughout the recognition process. In contrast, our technique depends on the outer shape of every character, which is unique whether the character is small or large, so our technique maintains high accuracy compared to the other techniques over the different font sizes.

Figure 10: Error rate (%) observed over different numbers of characters inputted

Figure 10 shows the error rate (%) over the number of characters inputted to our OSDT technique and the other compared techniques. For the error calculation we used the following equation:

error rate = 100% - Accuracy    (2)

Our technique is not perfect for recognition of all characters, but the other techniques we obtained for the simulations were full of bugs, and none of them could recognize most of the Bangla characters efficiently. As the accuracy of our technique is high compared to the other techniques (Figure 9), it consequently incurs a low error rate (%). From Figure 10 it is easily perceivable that our technique yields a low error rate compared to the other techniques, which is desired for any Bangla OCR. From Figures 9 and 10, it is easily perceivable that our OSDT performs efficiently.

REFERENCES

[1] B. B. Chaudhuri and U. Pal, "A Complete Bangla OCR System", Pattern Recognition, Vol. 31, 1998.
[2] Md. Al Mehedi Hasan, Md. Abdul Alim, Md. Wahedul Islam, "A New Approach to Bangla Text Extraction and Recognition from Textual Image", Proceedings of the 8th International Conference on Computer and Information Technology (ICCIT), December 2005, Bangladesh.
[3] Ahmed Shah Mashiyat, Ahmed Shah Mehadi, Kamrul Hasan Talukder, "Bangla Off-line Handwritten Character Recognition Using Superimposed Matrices", Proceedings of the 7th International Conference on Computer and Information Technology (ICCIT), December 2004, Bangladesh.
[4] Amin Ahsan Ali, Syed Monowar Hossain, "Optical Bangla Digit Recognition Using Backpropagation Neural Networks", CS 407 Project Work & Project Report, Department of Computer Science, University of Dhaka, Bangladesh.
[5] Gil Cohen, Barak Hermesh, "ISLAB OCR Project Using a Neural Network" [Online]. Available: http://www.cs.technion.ac.il/labs/Isl/ProjectlPrjects/donelNN/OCRIocrcodezip
[6] Alamgir Mohammed, "Bangla OCR Apona Pathak" [Online]. Available: http://www.apona-bd.com/bangla-ocr/2.html
[7] Md. Abul Hasnat, S. M. Murtoza Habib and Mumit Khan, "Segmentation Free Bangla OCR Using HMM: Training and Recognition", Proc. of the 1st International Conference on Digital Communications and Computer Applications (DCCA 2007), Irbid, Jordan, 2007.
[8] U. Bhattacharya, B. B. Chaudhuri, "A Majority Voting Scheme for Multiresolution Recognition of Handprinted Numerals", Proceedings of the Seventh International Conference on Document Analysis and Recognition, 3-6 August 2003.
[9] A. M. Shoeb Shatil and Mumit Khan, "Minimally Segmenting High Performance Bangla OCR Using Kohonen Network", Proc. of the 9th International Conference on Computer and Information Technology (ICCIT 2006), Dhaka, Bangladesh, December 2006.
