FRINGE MAP BASED CHARACTER SEGMENTATION
R.SINDHU1
P.HEMA KUMAR2
1 P.G. Student, Department of Electronics and Communication Engineering, Swarnandhra Institute of Engineering and Technology, Narasapur
2 Assistant Professor, Department of Electronics and Communication Engineering, Swarnandhra Institute of Engineering and Technology, Narasapur, [email protected]
Abstract
Accurate segmentation of text lines from printed or handwritten documents is an
important task in any document processing system. It becomes a challenging and
complex problem for several reasons. Situations arise when the text from neighboring
lines overlaps the white space area or touches the text of the current line. Complications
also arise when, due to varying skew, text lines curve along the page in varied trajectories.
These situations are beyond the scope of common algorithms developed for printed documents.
We propose an approach based on fringe maps to generate segmenting paths between
adjacent text lines.
1. Introduction
OCR stands for Optical Character Recognition. It is
a widespread technology for recognizing text inside images, such as
scanned documents and photos. OCR technology is used to convert
virtually any kind of image containing written text (typed, handwritten
or printed) into machine-readable text data.
OCR technology became popular in the early 1990s during
attempts to digitize historic newspapers. Since then the technology
has undergone several improvements. Nowadays, solutions deliver
near-perfect OCR accuracy. Advanced methods like Zonal OCR are used
to automate complex document-based workflows.
Probably the most well-known use case for OCR is converting
printed paper documents into machine-readable text documents. Once
a scanned paper document has gone through OCR processing, the text of
the document can be edited with word processors like Microsoft Word
or Google Docs. Before OCR technology was available, the only option
for digitizing printed paper documents was to retype the text manually.
International Journal of Research
Volume VIII, Issue VI, JUNE/2019
ISSN NO:2236-6124
Page No:2295
Not only was this massively time consuming, it also introduced inaccuracies
and typing errors.
2.1 OPTICAL CHARACTER RECOGNITION (OCR):
Optical Character Recognition (OCR) is software
that converts printed text and images into a digitized form that can
be manipulated by a machine. Unlike the human brain, which can
very easily recognize the text or characters in an image,
machines are not intelligent enough to perceive the information
available in an image. Therefore, a large number of research efforts have
attempted to transform a document image into a
format that a machine can understand. OCR is a complex problem because
of the variety of languages, fonts and styles in which text can be
written, and the complex rules of languages.
2.2 GENERATIONS OF OCR:
The generations of OCR are
First generation OCR systems
Second generation OCR systems
Third generation OCR systems
Fourth generation OCR systems (Today’s generation).
2.3 Process of OCR:
The process of OCR is a composite activity comprising several
phases, which are shown in Figure 2.3.1.
Figure 2.3.1: Preprocessing of OCR
2.4 SEGMENTATION:
Segmentation is an integral part of any text-based recognition
system. The segmentation phase basically includes three steps: line
segmentation, word segmentation and character segmentation.
Methods                               Disadvantages
Histogram approach                    It causes some loss on the text line area.
Header line and base line approach    Header lines are rows with maximum number of black pixel and base lines are rows with minimum number of black pixel.
Hough transform approach              It requires a relatively large amount of memory and a long computation time.
Smearing method                       The consecutive black pixels along the horizontal directions are smeared.
Grouping approach                     It does not work when most of the edge pixels are isolated.
Gradient based approach               Gradient magnitude and orientation of each pixel are explicitly used to group the pixels.
2.5 METHODS OF CHARACTER SEGMENTATION
1. Water reservoir method
2. Split profile method
3. Segmentation model
4. Line and Zone Separation
5. Syllable segmentation
6. Water drop fall algorithm
6.1 Types of Drop fall Algorithm
6.2 Top-Left Drop Fall
6.3 Top-Right Drop Fall
6.4 Bottom-Left Drop Fall
6.5 Bottom-Right Drop Fall
2.6 FRINGE MAP METHOD:
To follow this method, it is necessary to understand the concept of
fringe maps, which are related to distance transforms [18]
for binary images. In a fringe map, each pixel is represented by a
fringe number. Every black pixel has a fringe number of zero. White
pixels have a fringe number that is a positive integer equal to the
distance from the nearest black pixel using an L2 metric. In
other words, a white pixel with fringe number x satisfies the following:
1) It is x pixels away from its nearest black pixel.
2) It is surrounded by at least (x-1) white pixels in all directions.
3. OPTICAL CHARACTER RECOGNITION (OCR)
3.1 PROJECTION PROFILE METHOD:
The projection profile method is of two types:
1. Horizontal projection profile method
2. Vertical projection profile method.
3.2 Horizontal projection profile method:
The horizontal projection profile method calculates the
sum of all white pixels in every row. The resulting histogram
of the image is used for line segmentation.
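A minimal sketch of line segmentation with a horizontal projection profile is given here. It assumes the image is a list of rows in which text pixels are stored as 1 and background pixels as 0 (an assumption; the paper counts white pixels, which matches an inverted binary image where the text is white). The helper names `horizontal_profile` and `split_runs` are hypothetical.

```python
def horizontal_profile(image):
    """Return the number of text pixels (value 1) in each row."""
    return [sum(row) for row in image]


def split_runs(profile):
    """Return (start, end) pairs, end exclusive, of consecutive non-zero entries."""
    runs = []
    start = None
    for index, count in enumerate(profile):
        if count > 0 and start is None:
            start = index                    # a text line begins here
        elif count == 0 and start is not None:
            runs.append((start, index))      # an empty row closes the line
            start = None
    if start is not None:
        runs.append((start, len(profile)))   # line that reaches the last row
    return runs


# Assumed toy image: 1 = text pixel, 0 = background.
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
profile = horizontal_profile(image)
lines = split_runs(profile)
print(profile)  # [0, 3, 2, 0, 3, 0]
print(lines)    # [(1, 3), (4, 5)]
```

Each pair marks the first row of a text line and the row just after its last row. The sketch only works when at least one completely empty row separates adjacent lines, which is the limitation that motivates the fringe map method.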
3.3 Vertical projection profile method:
The vertical projection profile method calculates the sum of all white pixels in every column and plots the
histogram of the computed white pixel counts, which is used for word and character segmentation.
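The vertical profile can be sketched in the same spirit for a single text line. The code below assumes text pixels are stored as 1, treats every run of non-empty columns as one character, and groups characters into words with a gap threshold `word_gap`. The threshold and the function names are hypothetical and are not taken from the paper.

```python
def vertical_profile(line_image):
    """Return the number of text pixels (value 1) in each column."""
    return [sum(column) for column in zip(*line_image)]


def column_runs(profile):
    """Return (start, end) pairs, end exclusive, of consecutive non-empty columns."""
    runs, start = [], None
    for index, count in enumerate(profile + [0]):  # trailing 0 closes the last run
        if count > 0 and start is None:
            start = index
        elif count == 0 and start is not None:
            runs.append((start, index))
            start = None
    return runs


def group_into_words(runs, word_gap):
    """Group character runs into words; a gap of word_gap or more columns starts a new word."""
    words = []
    for run in runs:
        if words and run[0] - words[-1][-1][1] < word_gap:
            words[-1].append(run)   # small gap: same word
        else:
            words.append([run])     # first run or large gap: new word
    return words


# Assumed toy text line: 1 = text pixel, 0 = background.
line_image = [
    [1, 1, 0, 1, 0, 0, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 0, 0, 1],
]
profile = vertical_profile(line_image)
characters = column_runs(profile)
words = group_into_words(characters, word_gap=3)
print(profile)     # [2, 1, 0, 2, 0, 0, 0, 1, 2]
print(characters)  # [(0, 2), (3, 4), (7, 9)]
print(words)       # [[(0, 2), (3, 4)], [(7, 9)]]
```

Characters that touch each other produce no empty column between them, so this sketch would report them as a single character.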
PROPOSED METHOD
Text line segmentation in printed or handwritten document
images is one of the crucial tasks in a document image analysis (DIA) system.
The performance of a DIA system depends critically on the accuracy of
text line segmentation. Errors at this stage cannot be easily overcome
or undone by the following stages that perform character recognition
or post-processing. Text line segmentation is usually seen as a simple
or solved problem in an ideal situation, where documents contain
distinct white space between lines. However, for historical documents or
Indic scripts this is not always true. Most challenges encountered in
such nonconventional text line segmentation arise when:
1. Adjacent text lines are very close or touch each other.
2. The angle between text lines varies.
3. Text lines are curvilinear.
4. The skew directions vary.
Figure 3.3.1: An example of documents with touching, overlapping
and skewed text lines
Figure 3.3.1 shows examples of documents with touching, overlapping and
skewed text. Touching and overlapping of text lines occur in printed or
handwritten documents because of narrow regions of white space
between adjacent text lines. Documents in most Indian scripts, such as Telugu,
Kannada, Tamil, Bangla and Malayalam, suffer from
overlapping and touching lines due to vowel modifiers and consonant
modifiers. An example is shown in Figure 3.3.2 for Telugu text, where these
are called Matras and Vottus respectively.
The appearance of skewed lines in the text image makes the
problem more complex. The problem is compounded if the lines in a
text image are skewed with different orientations. Such lines are called
multi-skewed lines.
Figure 3.3.2: Telugu script showing the consonant modifiers and vowel modifiers of a
character.
Amongst the traditional text line segmentation
algorithms, the projection profile method [18, 20, 25] is suitable for
clearly separated lines, but it cannot handle text lines with overlapping,
touching and different skew angles. Connected component based
methods [18, 24] and the smearing method [18, 25] also do not perform well
for these documents.
The proposed text line segmentation method in this paper is
based on fringe maps [19]. Its objective is to find a segmenting path
between two adjacent lines to separate them. Our motivation is to use
an approach that is based upon using white spaces to guide the
segmentation method, much as a human reader follows the gaps and
views the lines between the text.
FRINGE MAP METHOD:
To follow this method, it is necessary to understand the concept of
fringe maps, which are related to distance transforms [27]
for binary images. In a fringe map [18, 26], each pixel is represented
by a fringe number. Every black pixel has a fringe number of zero.
White pixels have a fringe number that is a positive integer equal to the
distance from the nearest black pixel using an L2 metric. In other words,
a white pixel with fringe number x satisfies the following:
1) It is x pixels away from its nearest black pixel.
2) It is surrounded by at least (x-1) white pixels in all directions.
For us the second point is also important. We assume the inputs to
be normal binary images of text documents where the printing or
writing is dark and the background is light. Black pixels (the writing) are
represented as '0' and white pixels as '-1'. To generate a fringe map
for the input binary image, we start by examining each neighbor of each
black pixel and write a one in each neighbor that is a white
pixel (value -1). Horizontal, vertical, and diagonal neighbors are
examined. Having done this, we examine each neighbor of each pixel with
a one in it and write twos in the neighbors that are -1. We continue
growing fringes until there are no more pixels with value -1. An example
fringe map is shown in Figure 3.3.3. The concept of fringe
distances was originally proposed and used by
Brown [19] to recognize characters in images. Here we have a
novel application, since we use the fringe maps to segment
text lines.
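The fringe growing procedure described above can be sketched as a layer-by-layer expansion. The code assumes the input is a list of rows with 0 for black pixels and -1 for white pixels, as in the text; the function name `fringe_map` is hypothetical. Because diagonal neighbors count as one step, the numbers this growth produces equal the chessboard (Chebyshev) distance to the nearest black pixel.

```python
def fringe_map(image):
    """Grow fringes outward from black pixels (0) into white pixels (-1)."""
    rows, cols = len(image), len(image[0])
    fringe = [row[:] for row in image]  # copy so the input is left unchanged
    frontier = [(i, j) for i in range(rows) for j in range(cols)
                if fringe[i][j] == 0]
    level = 0
    while frontier:
        level += 1
        next_frontier = []
        for i, j in frontier:
            # Examine horizontal, vertical, and diagonal neighbors.
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and fringe[ni][nj] == -1):
                        fringe[ni][nj] = level   # first fringe to reach this pixel
                        next_frontier.append((ni, nj))
        frontier = next_frontier
    return fringe


# Assumed toy image with two black pixels in opposite corners.
image = [
    [0, -1, -1, -1, -1],
    [-1, -1, -1, -1, -1],
    [-1, -1, -1, -1, -1],
    [-1, -1, -1, -1, 0],
]
result = fringe_map(image)
for row in result:
    print(row)
```

If the image contains no black pixel, nothing is grown and every pixel keeps the value -1; a caller would need to handle that case separately.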
Figure 3.3.3: An example fringe map of a Telugu image and the Peak
Fringe Number (in circle) between two consecutive zeros in the vertical
direction.
4.1 IMPLEMENTATION:
The proposed algorithm for text line segmentation of printed or
handwritten document images deals with the following challenges:
(1) Components of neighboring text lines may touch or overlap as shown in Fig
(2) Text lines that appear in the document may not have a simple linear
skew and could be multi-skewed.
The proposed algorithm segments lines in three stages, as shown
in Figure 4.1.1. The first stage generates a fringe map for the given input
binary image. In the second stage, Peak Fringe Numbers (PFNs) are
located in the fringe map, a filtering operation on the PFNs is
performed, and the PFNs between text lines are determined. In the last
stage, a segmenting path between lines is generated by joining the
PFNs. Here it is important to note that each background pixel in the
fringe map is represented by a fringe number that is a positive
integer, while black (object) pixels have a fringe number of zero. In the
next step, we find the PFN between two zeros
(black pixels) in the vertical direction in the fringe map. A PFN with value
x is surrounded by at least (x-1) white pixels. PFNs are white pixels that
may lie between the lines or in the white space inside a character. We
are interested in determining the PFNs between the lines and need
to reject the latter. In the last step, we generate a segmenting path
between two adjacent lines by joining PFNs. With this overview
of the processing, we now give more details.
Figure 4.1.1: Proposed text line segmentation method block diagram.
4.2 Generating Fringe Map:
A fringe map is generated for the given input binary image. In a
fringe map, each pixel is represented by a fringe number. A pixel with
a fringe number x is surrounded by at least (x-1) white pixels in all
directions. We use maximum fringe numbers to find white pixels
between lines.
Definition of PFN:
In a fringe map, a Peak Fringe Number (PFN) is defined as the
white pixel with the maximum fringe value (a number) between two
consecutive black pixels in the direction of interest. An example PFN is shown
in Figure 3.3.3, where the direction of interest is vertical.
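A minimal sketch of this definition for the vertical direction follows. It assumes the fringe map is a list of rows in which black pixels hold 0, considers only white runs that are bounded by a black pixel above and below in the same column, and takes the topmost pixel when several share the maximum (the tie-breaking rule is an assumption). The name `locate_pfns` is hypothetical.

```python
def locate_pfns(fringe):
    """Return (row, col, value) for the peak of every vertical white run
    that lies between two black pixels (fringe value 0) in the same column."""
    rows, cols = len(fringe), len(fringe[0])
    pfns = []
    for j in range(cols):
        last_black = None  # row of the most recent black pixel in this column
        for i in range(rows):
            if fringe[i][j] != 0:
                continue
            if last_black is not None and i - last_black > 1:
                run = range(last_black + 1, i)
                # max() returns the first (topmost) maximum on ties.
                peak_row = max(run, key=lambda r: fringe[r][j])
                pfns.append((peak_row, j, fringe[peak_row][j]))
            last_black = i
    return pfns


# Assumed toy fringe map: two black rows with a small black blob at (2, 3).
fringe = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [2, 2, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
pfns = locate_pfns(fringe)
print(pfns)
# [(2, 0, 2), (2, 1, 2), (1, 2, 1), (1, 3, 1), (3, 3, 1)]
```

The low-valued PFNs in the last two columns come from narrow gaps next to the black blob, which is the kind of PFN that the filtering step is meant to reject.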
Locate and Determine the PFNs between lines:
In the following we assume that the default direction for PFNs
is vertical. We scan the fringe map along columns and locate the
PFNs. The PFNs may lie inside a connected
component or outside it (between the lines), as
shown in Figure 5. We are interested only in the PFNs outside the
connected components, through which we can find a segmenting path
between lines. Generally, we observe that the values of PFNs
inside the connected components are lower than the values of PFNs
outside them. We can therefore use this observation to
separate the PFNs and keep only those that we need.
A simple threshold-based method can separate the required
PFNs. Here we use the arithmetic mean of all the PFNs as the
threshold T. We observe that, as expected, the PFNs greater than
or equal to T are usually found between the text lines. The PFNs
retained in this way are called filtered peak
fringe numbers (FPFNs). Figure 4.3.1 shows these FPFNs, whose fringe
number is greater than or equal to T. FPFNs play an important role in
generating the segmenting path between lines.
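The mean-threshold filtering can be sketched in a few lines. The PFNs are assumed to be stored as (row, column, value) tuples, which is a representation chosen for this example and not one prescribed by the paper; `filter_pfns` is a hypothetical name.

```python
def filter_pfns(pfns):
    """Keep PFNs whose value is at least the arithmetic mean T of all PFN values."""
    if not pfns:
        return [], 0.0
    threshold = sum(value for _, _, value in pfns) / len(pfns)
    filtered = [p for p in pfns if p[2] >= threshold]
    return filtered, threshold


# Assumed PFNs as (row, col, value): two between lines, three in narrow gaps.
pfns = [(2, 0, 2), (2, 1, 2), (1, 2, 1), (1, 3, 1), (3, 3, 1)]
fpfns, T = filter_pfns(pfns)
print(T)      # 1.4
print(fpfns)  # [(2, 0, 2), (2, 1, 2)]
```

Here T = (2 + 2 + 1 + 1 + 1) / 5 = 1.4, so only the two PFNs with value 2 survive as FPFNs.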
4.3 Text Line Segment Path Generation:
A text line segmenting path is now required that
separates adjacent text lines and is generated by joining the FPFNs. The
filtering process may leave gaps, so the FPFNs cannot always be joined
directly. We therefore analyze the FPFNs further and find a value
M1 that is greater than or equal to the arithmetic mean of the
FPFNs. Tentative segmentation paths can then be made by joining all
those FPFNs that are greater than or equal to M1. However, this is not so simple,
since discontinuities may arise.
Figure 4.3.1: Regions and filtered PFNs
The following procedure explains the
search for these FPFNs and how they are joined with those in a tentative
segmentation path. We scan the fringe map along columns from left to
right, search for FPFNs with fringe value greater than or equal to
M1, and build a partial segment path. The end points
of this partial segment path are called Marked
Segment Points (MSPs). Consider a point at position (i,j) in the
current column j, up to which we have already constructed a partial path.
Beyond this point we do not immediately find any points that could be
added to the path straight away. However, there is another FPFN in the
neighborhood where the path may continue.
The issue is to find a local connection between these points.
The points that need to be joined to the path are found by
searching with relaxed constraints on the fringe values, using the
two cases given below. Consider the point p at
(i,j). We search for the FPFN nearest to it in a square window of size N × N,
where N = 2 × M1, with boundaries: Top as (i - M1), Bottom as (i +
M1), Left as (j+1) and Right as (j + 2×M1).
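The window search can be sketched as follows. The FPFNs are assumed to be stored as (row, column, value) tuples, M1 is assumed to be an integer, the window boundaries are treated as inclusive, and "nearest" is taken to mean the smallest squared Euclidean distance; the paper does not state the distance measure, so these are all assumptions, as is the name `nearest_fpfn_forward`.

```python
def nearest_fpfn_forward(fpfns, i, j, m1):
    """Return the FPFN nearest to (i, j) inside the forward window, or None."""
    top, bottom = i - m1, i + m1          # rows covered by the window
    left, right = j + 1, j + 2 * m1       # columns covered by the window
    candidates = [
        (r, c, v) for r, c, v in fpfns
        if top <= r <= bottom and left <= c <= right
    ]
    if not candidates:
        return None
    # Squared Euclidean distance; the distance measure is an assumption.
    return min(candidates, key=lambda p: (p[0] - i) ** 2 + (p[1] - j) ** 2)


# Assumed FPFNs as (row, col, value).
fpfns = [(5, 3, 4), (9, 2, 4), (4, 6, 5), (5, 12, 4)]
print(nearest_fpfn_forward(fpfns, i=5, j=1, m1=3))   # (5, 3, 4)
print(nearest_fpfn_forward(fpfns, i=5, j=7, m1=3))   # (5, 12, 4)
print(nearest_fpfn_forward(fpfns, i=5, j=13, m1=3))  # None
```

For the first call the window covers rows 2 to 8 and columns 2 to 7, so (9, 2, 4) and (5, 12, 4) fall outside it and (5, 3, 4) is closer than (4, 6, 5). A result of None means that no FPFN lies in the window.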
Now p is an FPFN greater than or equal to M1, and we set this
FPFN as an MSP. For the columns from j+1 onward, we add
points to the segmenting path, called segment points, according to the following cases:
Case 1: An FPFN is found at (m,n). We set this FPFN point as an
MSP and then set the local maximum fringe number as a segment
point in each column of the window from the (j+1)th column up to the
(n-1)th column.
Case 2: In the extreme condition that no FPFN is found in the
window, we set a local maximum fringe number less than or equal
to M1 as a segment point in each column of the window from the
(j+1)th column to the (j + M1)th column.
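The two forward cases can be sketched as a gap-filling helper. The code assumes an integer M1, a fringe map stored as a list of rows, window rows clipped to the image, and the topmost pixel chosen when several share the column maximum. The names `column_peak` and `bridge_forward` and these choices are assumptions made for illustration, not details given in the paper.

```python
def column_peak(fringe, col, top, bottom, limit=None):
    """Row in [top, bottom] of column col with the largest fringe value,
    optionally restricted to values <= limit. Returns None if no row qualifies."""
    rows = range(max(top, 0), min(bottom, len(fringe) - 1) + 1)
    if limit is not None:
        rows = [r for r in rows if fringe[r][col] <= limit]
    return max(rows, key=lambda r: fringe[r][col]) if rows else None


def bridge_forward(fringe, i, j, m1, target=None):
    """Segment points for the columns after the MSP at (i, j).
    target is the (row, col) of the nearest FPFN in the window, or None."""
    top, bottom = i - m1, i + m1
    width = len(fringe[0])
    if target is not None:   # Case 1: fill columns j+1 .. n-1
        columns, limit = range(j + 1, target[1]), None
    else:                    # Case 2: fill columns j+1 .. j+M1, values <= M1
        columns, limit = range(j + 1, min(j + m1, width - 1) + 1), m1
    points = []
    for col in columns:
        row = column_peak(fringe, col, top, bottom, limit)
        if row is not None:
            points.append((row, col))
    return points


# Assumed toy fringe map: a black blob at (2, 3) interrupts the gap between lines.
fringe = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [2, 2, 1, 0, 1, 2],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
print(bridge_forward(fringe, i=2, j=1, m1=2, target=(2, 5)))  # Case 1
# [(1, 2), (1, 3), (1, 4)]
print(bridge_forward(fringe, i=2, j=1, m1=2))                 # Case 2
# [(1, 2), (1, 3)]
```

In Case 1 the path leaves the MSP at (2, 1), passes above the blob through row 1 and reaches the FPFN at (2, 5). A fuller implementation would probably prefer the row closest to the current path on ties rather than the topmost one.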
The above procedure results in partial segmenting
paths between lines. To join the resulting partial segmenting paths,
we look backwards from right to left. We scan the fringe map along
columns from right to left and search for the MSPs to join the partial
paths, such that for a point p at the ith row and jth column:
If p(i, j) is an MSP, then we set that MSP as a segment point.
Then we search for the FPFN, MSP or segment point nearest to it in the
window of size N×N, where N = 2×M1, with boundaries: Top as
(i - M1), Bottom as (i + M1), Right as (j-1) and Left as (j -
2×M1). Again two cases may arise:
Case 1: An FPFN, MSP or segment point is found at (m,n). We
set the local maximum fringe number as a segment point in each column of
the window from the (j-1)th column to the (n+1)th column, and if the point at (m,n)
is an FPFN, we set that FPFN point as an MSP.
Case 2: No FPFN, MSP or segment point is found in the
window. We set the local maximum fringe number less than or equal to M1
as a segment point in each column of the window from the (j-1)th column to the (j
- 2×M1 - 1)th column and set the (j - 2×M1)th column point as an MSP. The
result of segment path generation is shown in Fig 4.6.
CONCLUSION
The projection profile method is used for line, word and
character segmentation. For Telugu script, however, some of the lines are
not segmented properly, so we use the fringe map method to
segment the lines of Telugu script accurately. The projection profile
method is applied to different languages, and we find the accuracy of
line, word and character segmentation for each of them.
FUTURE SCOPE
The future scope of our project is to extract the segmented lines rather than
only displaying them, and to apply the projection profile method to them for
character segmentation. With the projection profile method, some
characters that are connected together are not segmented in
different languages. Such characters can be segmented properly by
using the water drop fall method.
REFERENCES
1. V. K. Koppula and A. Negi, “Using fringe maps for text line segmentation
in printed or handwritten document images,” in 2010 Second Vaagdevi
International Conference on Information Technology for Real World
Problems (VCON’10), Warangal, India, Dec 2010, pp. 83–88.
2. R. L. Brown, “The fringe distance measure: an easily calculated image
distance measure with recognition results comparable to Gaussian blurring,”
IEEE Trans. Syst., Man, Cybern., vol. 24, no. 1, pp. 111–115, 1994.
3. A.V.S. Venkat Rao, N., “Canonical syllable segmentation of Telugu
document images,” TENCON 2008 - IEEE Region 10 Conference, pp. 1-5,
Nov 2008.
4. Y. K. Chen, “Segmentation of handwritten connected numeral string using
background and foreground analysis,” Proc. 15th ICPR, pp. 598-601, 2000.
5. Fujisawa, “Segmentation methods for character recognition from
segmentation to document structure analysis,” Proceedings of the IEEE,
vol. 80, pp. 1079-1092, 1992.
6. Rafael C. Gonzalez and Richard E. Woods, “Digital Image Processing,”
second edition, Pearson Education, ISBN: 81-7808-629-8, Image
segmentation and extraction.