An Analysis of Off-line and On-line Approaches in Urdu Character Recognition

Naila Habib Khan, Awais Adnan and Sadia Basar

Institute of Management Sciences, Peshawar, Pakistan

Abstract: This article presents a detailed analysis of offline and online character recognition systems for Urdu script published from 2002 to 2012. The analysis is based on the Methodology, Text Type, Font, Recognition Level, Sample and Accuracy Level of each Urdu script recognition system. The paper attempts to cover various aspects of offline and online character recognition systems in order to give wide exposure to this research topic, with special emphasis on Urdu script. Generally, character recognition is the capability of a computer system to comprehend printed or handwritten text from sources such as documents, books, reports and photographs, or directly from digital touch screens. In an offline character recognition system, an image containing printed text is captured by a scanner. When the text is captured in real time through a digital device, for example a touch screen or a digital pen, the process is referred to as online character recognition.

Keywords: Online, Offline, OCR, Urdu.

1 Introduction

Urdu is spoken by 490 million people around the world, which makes it the 4th most widely spoken and understood language in the world. It is the official language of Pakistan and of five Indian states. It is also widely spoken and understood in countries such as Afghanistan, United Arab Emirates, Saudi Arabia, Bangladesh, United Kingdom, United States, South Africa, Botswana, Bahrain, Canada, Germany, Fiji, Guyana, Malawi, India, Nepal, Qatar, Mauritius, Oman, Zambia, Norway and Thailand.

Urdu is often confused with Hindi. The two languages are closely related and share the same background; the primary difference between them is the written script. In Pakistan the language is written in Arabic script and is named “Urdu”, whereas in India it is written in Devanagari script and is called “Hindi”. In India, Urdu is widely spoken and understood in many cities, namely Delhi, Muzaffarnagar, Najibabad, Rampur, Roorkee, Bareilly, Meerut, Lucknow, Azamgarh, Bijnor, Deoband, Saharanpur, Moradabad, Aligarh, Allahabad, Gorakhpur, Agra, Bidar, Ajmer, Kanpur, Badaun, Bhopal, Hyderabad, Aurangabad, Bengaluru, Kolkata, Mysore, Patna, Gulbarga, Nanded, and Ahmedabad. India also publishes 405 daily Urdu newspapers. In Bangladesh, Urdu is used as a language for communication, but it is referred to as “Behari”.

Urdu developed under the strong influence of the Arabic, Persian and Turkish languages almost 900 years ago. It shares the same script as Arabic, Persian, Turkish, Pashto and Kashmiri. Learning Urdu is highly beneficial because it helps the learner read the Persian and Arabic alphabets, since Urdu script is 90% similar to these scripts [1]. Owing to the significance of Urdu script, a number of researchers have focused on Optical Character Recognition systems that can convert ancient Urdu literature to digital format. In this paper we analyze various offline and online character recognition systems for Urdu. The studies are analyzed on the basis of Methodology, Text Type, Font, Recognition Level, Sample and Accuracy Level. Methodology covers the major machine learning techniques and algorithms implemented to develop a recognition system. Handwritten and typewritten text are the two major text types used with any character recognition system. The font can be of any style, Nastaliq being the most famous font for Urdu script. Recognition Level indicates whether a segmentation-based or a segmentation-free approach has been used. Finally, Sample and Accuracy Level describe the dataset and the overall success rate of the character recognition system, respectively.

2 Types of Character Recognition Systems

Figure 1 below shows that character recognition can be divided into two types: online and offline character recognition.

Figure 1 Character Recognition and Online Character Recognition

Computational Science and Systems Engineering

ISBN: 978-1-61804-362-7 280


2.1 Online Character Recognition System

Online character recognition refers to the real-time recognition of characters: in online systems, characters are recognized as they are written. These systems rely on the concept of digital ink, in which sensors track the movements of the pen tip, for example pen-up and pen-down events. Online character recognition relieves us from the task of locating the position of each character. Such systems are available in PDAs, handheld PCs and some of the latest touch-screen mobile phones.
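
The pen-up/pen-down idea can be illustrated with a minimal sketch. It assumes a hypothetical digitizer that reports each sample as a position plus a pen state; the `PenSample` type and the `split_into_strokes` helper are illustrative names and are not taken from any of the surveyed systems.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PenSample:
    """One reading from a hypothetical digitizer: position plus pen state."""
    x: float
    y: float
    pen_down: bool


def split_into_strokes(samples: List[PenSample]) -> List[List[Tuple[float, float]]]:
    """Group consecutive pen-down samples into strokes.

    A stroke starts when the pen touches the surface and ends when it lifts.
    """
    strokes: List[List[Tuple[float, float]]] = []
    current: List[Tuple[float, float]] = []
    for sample in samples:
        if sample.pen_down:
            current.append((sample.x, sample.y))
        elif current:
            # Pen lifted: close the stroke that was being drawn.
            strokes.append(current)
            current = []
    if current:
        # The recording ended while the pen was still down.
        strokes.append(current)
    return strokes


samples = [
    PenSample(0, 0, True), PenSample(1, 1, True),
    PenSample(1, 1, False),
    PenSample(5, 0, True), PenSample(5, 1, True), PenSample(6, 2, True),
]
strokes = split_into_strokes(samples)
print(len(strokes), "strokes")
```

Because every stroke arrives already separated and ordered in time, a recognizer built on such data does not have to search an image for the position of the writing.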

2.2 Offline Character Recognition System

Offline character recognition involves the automatic conversion of handwritten or typewritten text on scanned paper into letter codes that can be used inside the computer. Offline character recognition is a more complex process than online character recognition, because the characters must first be located and then extracted for recognition.
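
One simple way to picture the "locate, then extract" step is connected-component labelling on a binarized image. The sketch below is only an illustration under that assumption: it treats the page as a list of rows with 1 for ink and 0 for background, and the function names are hypothetical. It is not the method of any particular system in this survey.

```python
from collections import deque
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), all inclusive


def find_components(image: List[List[int]]) -> List[Box]:
    """Return bounding boxes of 4-connected ink regions (pixels equal to 1)."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes: List[Box] = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search over the ink region that contains (r, c).
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def extract_region(image: List[List[int]], box: Box) -> List[List[int]]:
    """Cut the sub-image covered by a bounding box."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


page = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1, 0],
]
boxes = find_components(page)
first_shape = extract_region(page, boxes[0])
```

In a cursive script such as Urdu, one component found this way is typically a ligature or a secondary mark rather than a single character, so a practical system would still need further segmentation or a segmentation-free recognizer.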

3 Types of Text for Recognition

3.1 Printed Text Recognition

Printed text recognition refers to the recognition of text that is computer generated. Printed text can appear in different fonts and sizes; the text can be in any computer-generated font, for example Times New Roman, Arial, Calibri or Courier. A printed character recognition system is simpler than a handwritten character recognition system.

3.2 Handwritten Text Recognition

Handwritten text recognition refers to the recognition of any text that has been written by hand. Handwritten text is more difficult to recognize than typewritten text, because handwritten characters vary from person to person and also with the mood of the writer. Developing a character recognition system for handwritten text is therefore considered a difficult task.

4 Comparison of Offline Character Recognition Systems

For offline character recognition systems, a detailed study covering the years 2002 to 2012 has been conducted. A total of 14 research papers have been taken into consideration. For each research paper the attributes considered are year of publication, title, type of text (handwritten/printed), font utilized, methodology applied, level of recognition (character/word), sample data taken and the overall accuracy of the developed system. All the data is organized in a table whose columns represent the attributes mentioned above. Each row of the table represents an individual research paper, and the rows are arranged in ascending order of year (see Table 1).

4.1 Analysis of Recognition Levels Used for Offline Character Recognition Systems

As Urdu is written in a cursive script, character level recognition is a challenging task. Character level recognition systems require an intensive segmentation procedure, and segmentation produces numerous errors and may disfigure the shape of the characters. Husain [2] used ligature-based recognition with a segmentation-free approach. Pal and Sarkar [3] implemented a segmentation-based approach, since their recognition system aimed at recognizing individual characters. This approach led to several segmentation errors, which contributed 0.7% to the total error rate of the proposed recognition system. The systems developed in [4-7] could recognize only isolated Urdu characters. Ahmad, et al. [8] used segmentation-based recognition; each word was first stretched horizontally so that segmentation errors could be minimized. Javed, et al. [9] proposed a segmentation-free approach for recognizing Urdu Nastaliq script, which is highly cursive and overlapping.

Expressed as percentages of the offline character recognition studies surveyed, 57% of researchers opted to work with recognition at character level and 43% opted for word level recognition (see Figure 2).
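
These shares can be reproduced from the Recognition Level column of Table 1. The sketch below is a minimal illustration: the `levels` list is transcribed by hand from the 14 rows of Table 1 in table order, and `percentage_shares` is a hypothetical helper name.

```python
from collections import Counter

# Recognition level of each Table 1 row, in table order
# (Ch = character level, W = word/ligature level).
levels = ["W", "W", "Ch", "W", "Ch", "Ch", "Ch",
          "Ch", "Ch", "W", "W", "W", "Ch", "Ch"]


def percentage_shares(labels):
    """Map each label to its share of the list, rounded to a whole percent."""
    counts = Counter(labels)
    total = len(labels)
    return {label: round(100 * n / total) for label, n in counts.items()}


level_shares = percentage_shares(levels)
print(level_shares)
```

With 8 character level and 6 word level systems out of 14, the rounded shares come to 57% and 43%, matching the values reported in Figure 2.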

Figure 2 Analysis of Recognition Levels of Data Used for Offline Character Recognition Systems (Character: 57%; Word: 43%)


Table 1 Analysis of Offline Character Recognition Systems

| Year | Title | P/H* | Font | Methodology Applied | Recognition Level** | Test Data | Accuracy |
|---|---|---|---|---|---|---|---|
| 2002 | A Multi-Tier Holistic Approach For Urdu Nastaliq Recognition [2] | P | NNQ | Holistic Approach; Feed Forward Back Propagation Neural Network | W | 200 | 100% |
| 2002 | Ligature Based Optical Character Recognition Of Urdu- Nastaleeq Font [10] | P | NQ | Template Matching | W | Type Written Nastaliq Script | Reasonable |
| 2003 | Recognition Of Printed Urdu Script [3] | P | NK & NQ | Water Reservoir Principle | Ch | 3050 | 97.8% |
| 2005 | English, Devnagari And Urdu Text Identification [11] | P | AF | Water Reservoir Principle; Binary Tree Classifier | W | 3210 | 98.09% |
| 2007 | Urdu Nastaleeq Optical Character Recognition [8] | P | NQ | Neural Network | Ch | Old And New Written Scripts | 93.4% |
| 2007 | OCR For Printed Urdu Script Using Feed Forward Neural Network [4] | P | AR | Feed Forward Neural Network | Ch | 72pt Font Size | 98.03% |
| 2009 | Urdu Compound Character Recognition Using Feed Forward Neural Networks [12] | P | AR | Segmentation Phase Based On Pixels Strength; Feed Forward Neural Network | Ch | 56 Classes Of Characters, Each Having 100 Samples | 70% |
| 2009 | Optical Character Recognition System For Urdu (Naskh Font) Using Pattern Matching Technique [5] | P | NK | Pattern Matching Technique | Ch | Urdu Text Having Different Fonts Sizes | 89% |
| 2009 | A Finite State Model For Urdu Nastalique Optical Character Recognition [13] | P | NQ | Finite State OCR Modeled Using Finite Automata | Ch | Nastaliq Having Same Font Size | Encouraging |
| 2010 | Segmentation Free Nastalique Urdu OCR [9] | P | NNQ | Global Transformational Features; Hidden Markov Model | W | 3655 | 92% |
| 2010 | Font Size Independent OCR For Noori Nastaleeq [14] | P | NNQ | Font Size Normalization; X-Height Calculation; Outline Capture; Chain Code Algorithm | W | Wide Variety of Font Sizes | 94% - 97% (Font Sizes 24, 28, 32); 93% - 97% (Font Sizes 40, 44, 48, 52) |
| 2012 | An Efficient Method For Urdu Language Text Search In Image Based Urdu Text [15] | H | -- | Template Matching Technique; Correlation Algorithm | W | 2, 3, 4 and 5 Character Ligatures | 100%, 87% and 78% for 5-4, 3 and 2 Char Ligature |
| 2012 | Recognition Of Segmented Arabic/Urdu Characters Using Pixel Values As Their Features [16] | P | -- | Pixel Value Feature Vectors; Neural Network | Ch | 30 Mixed Arabic/Urdu Alphabets Used For Making 53 Classes | 95% |
| 2012 | Recognition Of Offline Handwritten Isolated Urdu Character [7] | H | -- | Moment Invariant Technique; Primary And Secondary Component Separation; SVM For Classification | Ch | 36800 | 93.59% |

* P: Printed; H: Handwritten

Font: AF: Any font; AR: Arial; NK: Naskh; NQ: Nastaliq; NNQ: Noori Nastaliq; --: Not Specified

** Ch: Character; W: Word


4.2 Analysis of Handwritten and Printed Text Utilization for Offline Character Recognition Systems

Pathan, et al. [7] state that an inadequate amount of research has been directed towards Urdu handwritten character recognition. Handwritten text can be used with both online and offline character recognition systems. When handwritten text is used with an offline recognition system, the system is usually referred to as an offline handwriting recognition system. Such a system converts an image of handwritten text into codes that are understood by the computer and by text-processing applications. Offline handwriting recognition involves scanning a handwritten document or form that an individual wrote in the past.

The main issue with handwritten text is that it differs from one individual to another. Handwritten text is affected not only by the writer's mood but also by the material on which, and the instrument with which, it is written. Different types of pen have different writing tips. For example, a pen with a smaller tip produces thin handwritten characters, while a pen with a larger tip produces thicker characters. Fountain pens, ballpoint pens and marker pens may also affect an individual's handwriting because of the differences in their inks and tips.

Printed text, on the other hand, is much simpler to handle: only the different fonts and sizes of text have to be dealt with. This simplicity is the primary reason that most researchers have opted to work with printed text for Urdu character recognition systems.

Figure 3 Analysis of Handwritten and Printed Text Utilization for

Offline Character Recognition System

Examining the research papers, it is found that 86% of the research has been performed on printed text and 7% on handwritten text, with the remaining 7% covering both (see Figure 3). Khan, et al. [15] and Pathan, et al. [7] utilized handwritten text for recognition purposes. Owing to complexity and segmentation issues, handwritten Urdu character recognition is lagging behind. Urdu needs a robust character recognition system that is capable of converting both handwritten and printed text into a computer-recognizable form.

4.3 Analysis of Font Types Used For Printed Text in Offline Character Recognition Systems

There are several calligraphic styles for writing Arabic script; Naskh, Nastaliq, Kufi, Deevani, Sulus and Riqah are a few of them. Naskh and Nastaliq are the most famous writing styles used for Urdu script, and both are written from right to left. The Nastaliq writing style for Urdu is a highly cursive, diagonal, context-sensitive and non-monotonic writing system [17]. Nastaliq is essentially a fusion of the Naskh and Taliq writing styles, and it is a beautiful and artistic style of writing Urdu script. Because of the complexities associated with Nastaliq writing, developing an efficient character recognition system for Urdu is a highly challenging task.

Figure 4 Analysis of Font Types Used For Printed Text in Offline

Character Recognition Systems

Summing up, 22% Nastaliq, 21% Noori Nastaliq and 7% both (Nastaliq and Naskh) give a total of 50%. This result indicates that Nastaliq and its variations, such as Noori Nastaliq, are the most popular choice in printed Urdu offline character recognition systems (see Figure 4). A further 29% of the systems did not declare the fonts utilized or are completely font independent.
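
The font shares can be checked against the Font column of Table 1. The sketch below is an illustration only: the `fonts` list is transcribed by hand from Table 1 in table order, and grouping "AF" (any font) with "--" (not specified) under "Other" is an assumption made here to mirror the categories of Figure 4.

```python
from collections import Counter

# Font code of each Table 1 row, in table order.
fonts = ["NNQ", "NQ", "NK & NQ", "AF", "NQ", "AR", "AR",
         "NK", "NQ", "NNQ", "NNQ", "--", "--", "--"]


def group_font(code):
    """Assumed grouping: font-independent and unspecified systems count as Other."""
    return "Other" if code in ("AF", "--") else code


font_counts = Counter(group_font(code) for code in fonts)
font_shares = {font: 100 * n / len(fonts) for font, n in font_counts.items()}

# Systems that use Nastaliq in some form: NQ, NNQ and the combined NK & NQ entry.
nastaliq_family = (font_shares["NQ"] + font_shares["NNQ"]
                   + font_shares["NK & NQ"])

for font, share in font_shares.items():
    print(f"{font}: {share:.1f}%")
print(f"Nastaliq family: {nastaliq_family:.1f}%")
```

Nastaliq and Noori Nastaliq each account for 3 of the 14 systems, about 21.4% apiece, so the 22% and 21% reported in the text presumably reflect rounding chosen so that the chart sums to 100%. The combined Nastaliq share of 7 out of 14 systems is 50% either way.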

Figure 3 chart values: Handwritten 7%; Printed 86%; Both (Handwritten & Printed) 7%.

Figure 4 chart values: Nastaliq 22%; Noori Nastaliq 21%; Naskh 7%; Both (Nastaliq and Naskh) 7%; Arial 14%; Other 29%.


Table 2 Comparison of Online Character Recognition Systems

| Year | Title | Methodology Applied | Recognition Level** | Test Data | Accuracy |
|---|---|---|---|---|---|
| 2005 | Urdu Online Handwriting Recognition [18] | Analytical Approach; Slant Analysis; Tree Based Dictionary Searching Method | Ch | 39 Urdu Characters; 10 Numerals; 200 Two Character Words | 93% (Isolated Characters); 93% (Numerals); 78% (Two Character Words) |
| 2007 | Online Urdu Character Recognition System [19] | Feature Vector Extraction; Back Propagation Neural Network | W | 240 Ligatures with Combination of 6 Diacritics | Base Ligatures 93%; Secondary Strokes 98% |
| 2009 | Urdu Qaeda: Recognition System For Isolated Urdu Characters [6] | Feature Extraction; Linear Classifier | Ch | Four Samples of Each Character Were Taken From Two Participants | 92.8% for Fluent Urdu Users; 31% for Non-Native User |
| 2010 | HMM and fuzzy logic: A hybrid approach for online Urdu script-based languages character recognition [20] | Hybrid Classifier: HMM and Fuzzy Logic | W | 1800 Ligatures | 87.6% for Nastaliq; 74.1% for Naskh |
| 2012 | Fuzzy Based Preprocessing Using Fusion Of Online And Offline Trait For Online Urdu Script Based Languages Character Recognition [21] | Fuzzy Logic Based Preprocessing; Primary Baseline Extraction; Local Baseline Extraction | W | 1800 Ligatures for Nasta'liq Script; 1000 for Naskh Style | 74.3% for Nasta'liq; 60.7% for Naskh |

** Ch: Character; W: Word

5 Comparison of Online Character Recognition Systems

For online character recognition systems, a total of 5 research papers published from 2005 to 2012 have been taken into account. The data has been organized in a table for better analysis and understanding (see Table 2). For each research paper the following attributes are considered: year of publication, title, methodology, level of recognition (character/word), sample data and the accuracy of the developed system. Table 2 has fewer columns than Table 1. Online character recognition systems deal only with handwritten text, so the Text Type (handwritten/printed) column has been omitted. The type of font is also rarely of concern in online character recognition, since Urdu is mostly written in the Nastaliq calligraphic style.

5.1 Analysis of Recognition Levels Used for Online Character Recognition Systems

In online Urdu recognition, the share of word level recognition is higher than that of character level recognition. Malik and Khan [18] and Shahzad, et al. [6] found it easier to work with recognition at character level, while [19] and [20, 21] opted to work with combinations of ligatures and diacritics instead of individual characters. The analysis of online Urdu recognition systems shows that 25% of the research has been directed towards character level recognition and 75% towards word/ligature level recognition (see Figure 5).

Figure 5 Analysis of Recognition Levels Used For Online Character Recognition Systems (Character: 25%; Word: 75%)


6 Results and Discussion

In this article, a comparative analysis has been carried out to determine how much research has been done on online and on offline character recognition systems. The results clearly show that more research has been performed on offline character recognition systems. The primary reason is the real-time complexity associated with online character recognition systems. In addition, Urdu is a multi-stroke language, which creates complexities and issues in online recognition systems. Of all the research papers listed in this article, 26% address online character recognition systems and 74% address offline character recognition systems (see Figure 6).
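
These two shares follow from the paper counts stated earlier, namely 14 offline papers (Section 4) and 5 online papers (Section 5). As a worked check, assuming the percentages are simple proportions of those 19 papers:

```latex
\[
\text{online share} = \frac{5}{14 + 5} = \frac{5}{19} \approx 26\%,
\qquad
\text{offline share} = \frac{14}{14 + 5} = \frac{14}{19} \approx 74\%.
\]
```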

Figure 6 Comparative Analysis of Online and Offline Character

Recognition Systems

7 Conclusion

Only 26% of the research has been directed towards online Urdu character recognition systems, while a greater share of 74% has focused on offline character recognition systems (see Figure 6).

For Urdu offline character recognition systems, 57% of researchers opted to work with recognition at character level, while 43% opted for ligature or word level recognition (see Figure 2). Word and character level recognition have both been widely and almost equally explored.

For handwritten and printed text the results are 7% and 86% respectively (see Figure 3). The assumption that can be drawn is that the complexity of handwritten text has held researchers back. The complex nature of handwritten text is due to its high dependence on the mood of the writer, the type of pen and the writing surface in use.

Nastaliq and its variations are the favorite font among researchers, though a few authors have opted to work with the Arial and Naskh fonts. 29% of researchers developed font-independent systems or did not specify the type of font utilized at all (see Figure 4).

For Urdu online character recognition systems, 75% of the work has been aimed at word level recognition and 25% at character level recognition (see Figure 5).

8 References

[1] A. Bharath and S. Madhvanath, "Online handwriting recognition for Indic scripts," in Guide to OCR for Indic Scripts, Springer, 2010, pp. 209-234.

[2] S. A. Husain, "A Multi-tier Holistic approach for Urdu Nastaliq Recognition," Multi Topic Conference, Abstracts 2002, p. 84, 2002.

[3] U. Pal and A. Sarkar, "Recognition of Printed Urdu Script," presented at the Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 2, 2003.

[4] I. Shamsher, Z. Ahmad, J. K. Orakzai, and A. Adnan, "OCR For Printed Urdu Script Using Feed Forward Neural Network," World Academy of Science, Engineering and Technology, 2007.

[5] T. Nawaz, S. A. H. S. Naqvi, H. ur Rehman, and A. Faiz, "Optical character recognition system for urdu (naskh font) using pattern matching technique," International Journal of Image Processing (IJIP), vol. 3, p. 92, 2009.

[6] N. Shahzad, B. Paulson, and T. Hammond, "Urdu Qaeda: Recognition System for Isolated Urdu Characters," in IUI 2009 Workshop on Sketch Recognition, Sanibel Island, Florida, 2009.

[7] I. K. Pathan, A. A. Ali, and R. R.J., "Recognition of Offline Handwritten Isolated Urdu Character," Advances In Computational Research, vol. 4, pp. 117-121, 2012.

[8] Z. Ahmad, J. K. Orakzai, I. Shamsher, and A. Adnan, "Urdu Nastaleeq Optical Character Recognition," World Academy Of Science, Engineering And Technology, pp. 249-252, 2007.

[9] S. T. Javed, S. Hussain, A. Maqbool, S. Asloob, S. Jamil, and H. Moin, "Segmentation free nastalique urdu ocr," World Academy of Science, Engineering and Technology, vol. 46, pp. 456-461, 2010.

[10] Z. A. Shah, "Ligature Based Optical Character Recognition of Urdu- Nastaleeq Font," INMIC, 2002.

[11] S. Chanda and U. Pal, "English, Devnagari and Urdu Text Identification," in Proceedings of the International Conference on Cognition and Recognition, 2005, pp. 538-546.

[12] Z. Ahmad, J. K. Orakzai, and I. Shamsher, "Urdu Compound Character Recognition Using Feed Forward Neural Networks," in ICCSIT, 2009, pp. 457-462.

[13] S. A. S. S.-u. Haque and M. K. Pathan, "A finite state model for urdu nastalique optical character recognition," IJCSNS, vol. 9, p. 116, 2009.

[14] Q. u. A. Akram, S. Hussain, and Z. Habib, "Font Size Independent OCR for Noori Nastaleeq," in Proceedings of Graduate Colloquium on Computer Sciences (GCCS), NUCES Lahore, 2010.

[15] K. Khan, M. Siddique, M. Aamir, and R. Khan, "An Efficient Method for Urdu Language Text Search in Image Based Urdu Text," IJCSI International Journal of Computer Science Issues, vol. 9, March 2012.

[16] S. Zaman, W. Slany, and F. Sahito, "Recognition of Segmented Arabic/Urdu Characters Using Pixel Values as their Features," ICCIT, 2012.

[17] S. A. Sattar, S. Haque, M. K. Pathan, and Q. Gee, "Implementation Challenges for Nastaliq Character Recognition," Communications in Computer and Information Science, Volume 20, pp. 279-285, 2009.

[18] S. Malik and S. A. Khan, "Urdu Online Handwriting Recognition," Emerging Technologies, Proceedings of the IEEE Symposium, vol. 17, 2005.

[19] S. A. Husain, A. Sajjad, and F. Anwar, "Online Urdu Character Recognition System," in MVA, 2007, pp. 98-101.

[20] M. I. Razzak, F. Anwar, S. A. Husain, A. Belaid, and M. Sher, "HMM and fuzzy logic: A hybrid approach for online Urdu script-based languages' character recognition," Know.-Based Syst., vol. 23, pp. 914-923, 2010.

[21] M. I. Razzak, S. A. Husain, A. A. Mirza, and A. Belaid, "Fuzzy Based Preprocessing Using fusion Of Online And Offline Trait For Online Urdu Script Based Languages Character Recognition," International Journal Of Innovative Computing, Information And Control, vol. 8, pp. 3149-3161, 2012.
