COLLEGE OF ENGINEERING AND COMPUTER SCIENCE
DEPARTMENT OF COMPUTER SCIENCE
Project Report
DIAGRAM DRAWING USING SHAPE RECOGNITION
Yajun Wang
Supervisor: Eric McCreath
COMP8780 IHCC Project S2 2010 u4582110 Yajun Wang
Abstract
Sketch recognition is the automated recognition of hand-drawn input made with an electronic
stylus. It can recognise geometric primitives and other gestures defined by developers
or users. Hand-drawn sketch recognition is increasingly applied in a
variety of fields, such as front ends for computer-aided design systems, gestural
interfaces, alternative input for keyboard-less appliances, and automatic correction or
understanding of diagrams for immediate educational feedback.
This project aims to explore and develop current sketch recognition techniques and to
implement a system for drawing simple diagrams through Java GUI programming. The
system recognises a shape as soon as the user finishes a single-stroke drawing and allows
text to be edited in some regular shapes. For the shape recognition part, the author
conducted two tests on the system: the accuracy of corner finding and the accuracy of
shape recognition. The corner-finding test shows that the algorithm developed in this
system is competitive with other corner-finding algorithms, while the shape recognition
test shows that some aspects of the system still need to be improved in
future work.
Contents
Abstract
Contents
1. Introduction
1.1 Objectives
1.2 Scope and Limitations
1.3 Structure of This Report
2. Background
Previous Work
3. Project Overview
3.1 Requirements of “DDUSR”
3.2 Schedule
3.3 System Design and Modelling
4. Used Notation
5. Implementation
5.1 Pre-process
5.1.1 Resample
5.1.2 Remove Tails
5.2 Corners Finding
5.2.1 Stage One
5.2.2 Stage Two
5.3 Recognition
5.3.1 Open or Close Shape Identification
5.3.2 Turning Angle Function
5.3.3 Shapes Testing
6. Testing and Evaluation
7. Conclusion
References
1. Introduction
Sketch recognition is the automated recognition of hand-drawn input made with an electronic
stylus. Hand-drawn sketch recognition is increasingly applied in a variety
of fields, such as front ends for computer-aided design systems (for data-flow
diagrams, UML diagrams, electronic circuit diagrams and other engineering design
diagrams), automatic correction or understanding of diagrams for immediate
educational feedback (for example, a child draws a shape and the system immediately
reports which shape was drawn and what it should have looked like),
alternative input for small keyboard-less devices (such as Palm Pilots), and gestural
interfaces (Hammond et al., 2008).
1.1 Objectives
The objective of this project is to implement a program for drawing simple diagrams
through Java GUI programming, named “DDUSR” (“Diagram Drawing Using
Shape Recognition”). The program takes user input from a mouse or a pen
device and recognises simple shapes (such as arrows, circles, ellipses, squares,
rectangles, triangles, lines and curves). Each shape is converted to a smooth,
regular shape immediately, while the user is interacting with the program.
More specifically, this application is developed for quickly drawing data-flow diagrams.
Users can move, resize and edit the refined shapes and
combine them into a data-flow diagram. The tool may also be used to construct
other similar diagrams.
This project also aims to explore and develop current free-sketch recognition
techniques and algorithms, and to explore heuristics for determining how
they can best be applied to this particular situation. Comparisons between this
application and other existing applications are also provided.
Figure 1.1 shows an example diagram drawn by DDUSR.
Figure 1.1 An Example Diagram (Drawn by DDUSR)
1.2 Scope and Limitations
This application reacts immediately once the user finishes a single stroke. One
limitation of the application is therefore that it does not support multiple-stroke
drawing. For example, users may want to draw a rectangle with three strokes, as shown
in Figure 1.2.
Figure 1.2 Multiple-stroke drawing vs. single-stroke drawing
Based on the requirements of this project (recognising and drawing simple shapes) and
the time limitation, the final program can recognise lines, arrows,
circles, ellipses, rectangles, squares, parallelograms, quadratic curves, cubic curves,
polygons and polylines. The application could be developed further to recognise and
draw more shapes (such as spirals, helices and arbitrarily complex shapes); this has
been achieved in some current sketch recognition programs (Hammond et al., 2008).
The application also provides all the necessary functions for drawing, moving,
resizing, editing, deleting and copying shapes (including changing the drawing and
background colour of shapes, and the text font, size and colour) and for saving the
current diagram. However, many improvements could still be made to efficiency and ease
of use, for example shape union and separation, which would help when moving multiple
shapes at once.
The application has only been tested with Windows 7 on the author’s own laptop and with
Ubuntu 9.10 on computers in the ANU CSIT Labs. All of these test environments have
fast processors and ample memory. In these environments the application does not slow
down even with more than fifty shapes in the drawing panel. Therefore, performance and
scalability were not considered during testing.
1.3 Structure of This Report
The rest of this report is structured as follows.
Chapter 2 provides background on the area of this project. This includes an
overview of sketch recognition and related techniques, and a description of some
similar systems developed by other researchers.
Chapter 3 outlines the requirements for the application by detailing the essential
features, as well as features that were desirable but not essential. This chapter also
gives the initially planned schedule and the procedure actually followed, to show
whether the project stayed on schedule, and
describes some difficulties the author met during the project. Finally, it illustrates
the overall design of the application, which is divided into two parts: the core
recognition system design and the user interface design.
Chapter 4 lists the notation used in Chapter 5, to avoid ambiguity in the symbols
used. Chapter 5 covers how the program was implemented,
focusing on the core sketch recognition system. Chapter 6 reports the
testing and evaluation of the system.
The final chapter discusses the conclusions drawn from the project and the results of
the program, as well as the challenges faced during the project. It also lists some
ideas for future work.
2. Background
Sketch recognition research lies at the crossroads of artificial intelligence and
human-computer interaction. Sketch recognition has become an increasingly popular
form of human-computer interaction due to the increasing use of Tablet PCs. Its
techniques have generally fallen into two camps: gesture-based and free-sketch
(Hammond et al., 2008).
First, gesture-based techniques can provide high accuracy, but require users to learn
how to draw each shape in a particular manner, because the order and direction of the
stroke and the number of strokes are key factors for recognition. These techniques are
used, for example, in the Palm Pilot’s Graffiti. The performance of gesture-based
recognition depends on drawing-style features, such as the start and end direction of
the stroke, the drawing speed of the stroke, and the total rotation of the stroke.
Second, free-sketch recognition allows users to draw shapes as they would naturally,
but most current techniques have low accuracy or require significant domain-level
tweaking to make them usable. Geometric-based, feature-based and vision-based
techniques are commonly used in free-sketch recognition to recognise shapes. In fact,
the majority of current systems attempting free-sketch recognition sit somewhere
between gesture-based and free-sketch systems, abandoning some constraints while
keeping others because of the drawbacks of free-sketch recognition (Hammond et al., 2008).
Previous Work
The idea of interacting with computers via pen-based input began in 1964 with the
seminal work of Ivan Sutherland’s Sketchpad system. This system was able to define
relationships to an existing object diagram. It constructed a diagram based on a model
of the design process. The locations of the points and the lines of the diagram modelled
the variables of a design, and the geometric constraints applied to the points and
lines of the diagram modelled the design constraints that confined the values of the
design variables (Sutherland, 1964). In 1991, Dean Rubine proposed a gesture
recognition toolkit, GRANDMA, which allowed single-stroke gestures to be learned
and later recognised through the use of a linear classifier. Rubine proposed thirteen
features that could be used to describe any single-stroke shape and also provided two
techniques for rejecting bad gestures (Rubine, 1991). Rubine's work was later extended
by other researchers. However, these systems use feature-based techniques,
which require extensive training. Furthermore, because of the features chosen, these
systems required that strokes be drawn in the same manner each time they were drawn.
For example, a circle drawn in an anti-clockwise manner would not be the same as a
circle drawn in a clockwise manner.
Due to the drawbacks of feature-based techniques, research moved towards more
geometric-based techniques. In 2001, Sezgin et al. proposed a system that was
composed of an approximation stage, a recognition stage and a beautification stage.
The system used a novel method to detect corners in drawn strokes: finding the points
of highest curvature along with the points of lowest speed. In 2006, Kim and Kim
presented new curvature metrics for corner finding. The metrics, local convexity (the
sum of all the curvatures of the same sign within a window) and local monotonicity
(which examines decreasing curvatures of the same sign around a point), measure the
curvature in the same direction at a point. Kim and Kim also introduced a different
measure for the curvature at a point. The system first resamples the stroke points. Since
the distance between any two adjacent points becomes constant after resampling, a point’s
curvature does not need to take account of changes in the path length. Therefore, the
curvature at each point is equal to the direction change (i.e. turning angle) at that point.
In 2008, Paulson and Hammond proposed a system, PaleoSketch, which uses
many of the concepts from Sezgin et al. (2001) and other previous work. It
improves on these systems by recognising a larger number of primitive shapes
and complex shapes consisting of lines and curves, while still maintaining high
recognition accuracy.
There is also other research in different domains of the sketch recognition field,
such as dominant points in curves (Teh & Chin, 1989), corner finding in polylines
(Wolin & Hammond, 2008), and partial matching of polygons by translation and rotation of
turning angles (McCreath, 2008). All three of these techniques are adopted in this
project and are described later in this report.
Some tools, such as LADDER (Hammond & Davis, 2005), have been developed
to allow users to describe higher-level symbols as a combination of lower-level
primitives matching certain geometric constraints. They are usually implemented for
drawing particular kinds of diagram, such as UML diagrams, electronic circuit
diagrams, flow charts, mechanical engineering diagrams and military course of action
diagrams, by using a shape definition language. Others have attempted to improve
upper-level recognition by using context; however, the use of context typically requires
domain knowledge.
The goal of this project is to learn from previous work and develop a
system that recognises free-hand sketches with the highest possible accuracy by trying
and developing generalised (possibly new or modified) techniques. In addition, this
system is developed specifically for quickly drawing data-flow diagrams.
3. Project Overview
This chapter first outlines the essential requirements for the system, as well as features
that were desirable but not essential. It then gives the initially planned schedule and
the procedure actually followed, to show whether the
project stayed on schedule, and describes some difficulties the author
met during the project. Finally, it illustrates the overall design of the system, which
is divided into two parts: the core recognition system design and the user interface design.
3.1 Requirements of “DDUSR”
3.1.1 Essential Requirements
1. The program should react immediately after the user finishes a one-stroke drawing. The
shapes that should be recognised are arrows, circles, squares, ellipses, rectangles and a
special shape type for data-flow diagrams, DataStores.
2. The program should allow moving, resizing and editing shapes.
3. The program should allow text to be entered in some regular shapes, such as circles,
ellipses, rectangles and squares.
3.1.2 Desirable Features
1. The program can recognise more shapes than those listed in the essential requirements.
2. The program can have some additional functions, such as a colour chooser, text font and
size selection, diagram saving, and shape copying, deleting and recalling.
3.2 Schedule
3.2.1 Initial Plan
Week 1       Look for supervisor and determine the topic
Week 2       Search and read materials related to the topic
             Learn Java programming
Week 3       Decide which method/algorithm will be used in the program
             Learn Java programming
Week 4       Prepare for initial presentation
             Read more materials and discuss with supervisor and other people
Week 5       Start programming
             - draw and recognise basic shapes (square and circle) using mouse
Week 6 - 8   Programming
             - recognise other shapes
Week 9 - 10  Programming
             - improve processing accuracy and add colour selection
Week 11      Debugging
Week 12      Complete application
             Write report
Week 13      Write report and submission
Week 14      Prepare for final presentation
3.2.2 Actual Procedure
As the author was new to Java, learning Java in fact ran throughout the
whole lifecycle of this project. For the programming part, implementing the corner
finding algorithm occupied the majority of the project lifecycle, due to several
attempts with various algorithms and settings of various constant factors. In the
text editing part, the shape layer problem took some time to fix because it required
a large number of changes to the mouse event handling. Overall, the
project was conducted step by step within the time constraint, although slightly more
time than planned was spent on learning throughout the project lifecycle.
3.3 System Design and Modelling
3.3.1 Core Recognition System Design
Figure 3.1 Core Recognition System Modelling (Drawn by DDUSR)
As shown in Figure 3.1, the final points used in the recognition process are those
identified as “corners”. Therefore, the main focus of this application is finding the
correct corners of the original shape (raw shape). This is discussed in Section 5.2.
3.3.2 User Interface Design
The guiding principle of the user interface design is to keep it as simple as possible,
because the main purpose of this program is to let users (probably software designers)
draw diagrams quickly, so that they can rapidly “brainstorm”
their ideas when designing their systems. The shapes in these diagrams can also be
edited easily and quickly. The other aim is to let users draw design diagrams that
consist of simple shapes, such as data-flow diagrams. The user interface consists of a
menu bar, which provides some commonly used functions, and a white drawing panel, which
displays the refined shapes. A screenshot of the user interface is shown in Figure 3.2.
Figure 3.2 User Interface with Some Drawn Shapes
When users are drawing a shape, they are in fact interacting with the Glass Pane
rather than the Content Pane (see The Java Tutorials for more details about the Glass
Pane and Content Pane of JFrame), but the original shapes that they draw are
displayed on the Content Pane. The Glass Pane acts as a “transparent screen”
between the users and the “display screen” (i.e. the Content Pane). Users “draw” on
the Glass Pane using an electronic pen or mouse, but all the contents (i.e. shapes) are
shown on the “display screen”, the Content Pane, which lies under the Glass Pane. The
purpose of using the Glass Pane is to catch the mouse events and then deliver them to
the corresponding components. Without the Glass Pane, a text pane inside a shape
would catch the mouse event and activate the “Edit Text” mode. The shape would then
lose its “move” function, because the panel that contains the shape could not catch
the mouse event.
After the original shape is passed to the core recognition system, the recognition system
returns a JComponent containing the refined shape, which is added to the Content Pane.
This means that the shapes, with their own text panes (if any), are actually drawn in
their own JComponents rather than in the Content Pane. This avoids the overlapping
problem that arises when the system draws shapes and adds text panes in the Content
Pane directly.
Users can double-click a shape to activate its text pane. Only some supported
shapes have a text pane: circles, ellipses, rectangles, squares and DataStores (a
special shape type for drawing data-flow diagrams). In “Edit Text”
mode, the user can easily change focus between shapes with text panes by
single-clicking their text panes, because the Glass Pane is invisible in “Edit Text”
mode. If the user clicks an area without a text pane, the application automatically
makes the Glass Pane visible again and returns to “Common” mode. The user can then
draw new shapes, and move or resize existing shapes.
The inner mouse event passing process is shown in Figure 3.3.
Figure 3.3 Mouse Event Passing Process (Drawn by DDUSR)
4. Used Notation
This chapter lists some notations used in the following chapter.
1. A ← B: append element B to the end of list A.
2. | A |: the absolute value of A.
3. Distance $|p_a, p_b| = \sqrt{(x_a - x_b)^2 + (y_a - y_b)^2}$: the distance between $p_a$ and $p_b$.
4. line $\overline{p_a, p_b}$: the straight line between $p_a$ and $p_b$.
5. Path Length $(p_a, p_b) = \sum_{i=a}^{b-1} \sqrt{(x_i - x_{i+1})^2 + (y_i - y_{i+1})^2}$: the path length between $p_a$ and $p_b$.
6. Perimeter of Shape $= \sum_{i=0}^{N-1} \sqrt{(x_i - x_{i+1})^2 + (y_i - y_{i+1})^2}$.
7. directed line $\overrightarrow{p_a p_b}$: a vector from $p_a$ to $p_b$.
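The distance and path length notations translate directly into small helper methods. The following is a minimal Java sketch, assuming that a point is stored as a `double[]{x, y}` pair; the class name `StrokeMath` and its method names are hypothetical and are not taken from DDUSR.

```java
import java.util.List;

/** Illustrative helpers for notations 3 and 5; all names are hypothetical. */
final class StrokeMath {
    private StrokeMath() {}

    /** Euclidean distance |pa, pb| between two points given as {x, y}. */
    static double distance(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    /** Path length along the stroke from index a to index b (a <= b). */
    static double pathLength(List<double[]> points, int a, int b) {
        double sum = 0.0;
        for (int i = a; i < b; i++) {
            sum += distance(points.get(i), points.get(i + 1));
        }
        return sum;
    }
}
```

For a stroke through (0, 0), (3, 4) and (3, 10), the distance between the first two points is 5 and the path length over the whole stroke is 11.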
5. Implementation
5.1 Pre-process
Pre-processing is vitally important for the later recognition processes. Without
good pre-processing, the system cannot recognise shapes accurately, because the
original (raw) input contains noise, which has a considerable impact on the quality of
the recognition results.
5.1.1 Resample
The first step of pre-processing is resampling the points of a shape. This process is
depicted in Figure 5.1. Resampling involves two key steps: first, decide the
inter-spacing distance, which is the distance between two adjacent points after
resampling; second, relocate the points using a resampling algorithm and the chosen
inter-spacing distance.
Figure 5.1 Original Points of Shape vs. Resampled Points of Shape
In DDUSR, the approach of Wolin and Hammond (2008) is used for the inter-spacing distance.
Points are resampled based on the diagonal length of the shape’s bounding box.
To accommodate shapes of different sizes, the inter-spacing distance S is equal
to the diagonal divided by a constant factor, which is set to 40 in DDUSR. This constant
factor is the same as in Wolin and Hammond’s article (2008), where it was determined
empirically. A lower value of this constant over-smooths the shape,
whereas a higher value retains too much noise.
Once the inter-spacing distance S has been calculated, the original points of the shape
can be resampled. The approach of Wobbrock et al. (2007) is used. First, an empty
ArrayList newp of Point is created to store the new resampled points. The first point in
the original point set, $points_0$, is then appended to newp. A distance holder D is
initialised to 0. For each point $p_i$ from the second to the last point, d is computed
as the distance between $p_{i-1}$ and $p_i$. When D + d is greater than or equal to S,
the position of the new resampled point is calculated by interpolating along the segment
from $p_{i-1}$ to $p_i$, and D is reset to 0; otherwise,
D is updated to D + d.
The pseudo code of this algorithm is as follows:
Resample (Shape s, C = 40)
    ptopleft = the top-left point of the bounding box of Shape s
    pbottomright = the bottom-right point of the bounding box of Shape s
    Ldiagonal = distance between ptopleft and pbottomright
    inter-spacing distance S = Ldiagonal / C
    D = 0
    resampled points newp ← points[0]
    FOR i from 1 to (points.size - 1)
        d = distance between points[i-1] and points[i]
        IF D + d >= S
            delta = (S - D) / d
            p = points[i-1] + delta * (points[i] - points[i-1])
            newp ← p
            insert p into points at index i (p is the previous point in the next iteration)
            D = 0
        ELSE
            D = D + d
    return newp
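The resampling procedure can be sketched in Java as follows. This is an illustration written for this report rather than DDUSR's actual source: the class name `ResampleSketch`, the `double[]{x, y}` point representation and the `divisor` parameter (40 in the text) are assumptions. The sketch works on a copy of the input list so that the caller's stroke is not modified.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of stroke resampling; names are hypothetical, not DDUSR's. */
final class ResampleSketch {
    private ResampleSketch() {}

    static List<double[]> resample(List<double[]> points, double divisor) {
        List<double[]> out = new ArrayList<>();
        if (points.isEmpty()) return out;

        // Diagonal of the bounding box determines the inter-spacing distance S.
        double minX = Double.MAX_VALUE, minY = Double.MAX_VALUE;
        double maxX = -Double.MAX_VALUE, maxY = -Double.MAX_VALUE;
        for (double[] p : points) {
            minX = Math.min(minX, p[0]); maxX = Math.max(maxX, p[0]);
            minY = Math.min(minY, p[1]); maxY = Math.max(maxY, p[1]);
        }
        double spacing = Math.hypot(maxX - minX, maxY - minY) / divisor;

        List<double[]> work = new ArrayList<>(points); // copy: points are inserted below
        out.add(work.get(0));
        if (spacing <= 0.0) return out; // degenerate stroke (all points identical)

        double accumulated = 0.0; // the distance holder D
        for (int i = 1; i < work.size(); i++) {
            double[] prev = work.get(i - 1);
            double[] cur = work.get(i);
            double d = Math.hypot(cur[0] - prev[0], cur[1] - prev[1]);
            if (d > 0.0 && accumulated + d >= spacing) {
                double t = (spacing - accumulated) / d;
                double[] q = { prev[0] + t * (cur[0] - prev[0]),
                               prev[1] + t * (cur[1] - prev[1]) };
                out.add(q);
                work.add(i, q); // q becomes the previous point in the next iteration
                accumulated = 0.0;
            } else {
                accumulated += d;
            }
        }
        return out;
    }
}
```

Calling `ResampleSketch.resample(stroke, 40.0)` on a stroke made of the two points (0, 0) and (40, 0) yields points spaced about one unit apart along the segment. Because of floating-point rounding, the final point of the stroke may or may not be emitted.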
5.1.2 Remove Tails
Unintended “tails” at the beginning and end of a shape tend to contain a great deal of
noise, which can be a significant problem for accurate recognition. Therefore, these
tails must be removed before the shape is sent to further processing. To determine
whether a tail is present, the first 15% and the last 5% of the shape points are
analysed. Because DDUSR is required to recognise arrows, the number of points
considered at the end of the stroke has to be kept small. The ranges of points to be
considered were determined experimentally by the author. The algorithm then finds the
point within each section (the first 15% and the last 5%) that has the highest
curvature (i.e. the absolute value of the turned angle divided by the distance between
the two adjacent points). If that curvature is higher than a threshold, 0.5, then the
stroke of the shape is broken at that point and the part outside it is removed as the
“tail”. This step is not performed on shapes with a low number of points, 5,
or with too small a shape perimeter, 70 (Paulson & Hammond, 2008).
The pseudo code of this algorithm is as follows:
RemoveTail (points)
    s = points.size * 0.15
    e = points.size * 0.95
    max = -1
    FOR i from 1 to s
        d = distance between points[i-1] and points[i]
        angle = turning angle at points[i]
        curvature = | angle / d |
        IF max < curvature
            max = curvature
            sMax = i
    IF max < 0.5
        sMax = 0
    max = -1
    FOR i from e to (points.size - 2)
        d = distance between points[i-1] and points[i]
        angle = turning angle at points[i]
        curvature = | angle / d |
        IF max < curvature
            max = curvature
            eMax = i
    IF max < 0.5
        eMax = points.size - 1
    IF sMax == 0 and eMax == points.size - 1
        return points
    FOR i from sMax to eMax
        newp ← points[i]
    return newp
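The tail removal step can be sketched in Java as follows. The thresholds (15%, 5%, 0.5, 5 and 70) come from the text; everything else is an assumption made for illustration, including the class name `TailRemovalSketch`, the `double[]{x, y}` point representation, the definition of the turning angle as the signed angle between the incoming and outgoing segments, and the reading of the point-count limit as "5 points or fewer".

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of tail removal; class and method names are hypothetical. */
final class TailRemovalSketch {
    static final double CURVATURE_THRESHOLD = 0.5; // from the text
    static final int MIN_POINTS = 5;               // from the text
    static final double MIN_PERIMETER = 70.0;      // from the text

    private TailRemovalSketch() {}

    static double distance(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    /** Signed angle in radians between the segments a->b and b->c (assumed definition). */
    static double turningAngle(double[] a, double[] b, double[] c) {
        double ux = b[0] - a[0], uy = b[1] - a[1];
        double vx = c[0] - b[0], vy = c[1] - b[1];
        return Math.atan2(ux * vy - uy * vx, ux * vx + uy * vy);
    }

    /** Index with the highest curvature in [from, to], or fallback if below the threshold. */
    static int sharpestIndex(List<double[]> pts, int from, int to, int fallback) {
        double max = -1.0;
        int best = fallback;
        for (int i = Math.max(from, 1); i <= Math.min(to, pts.size() - 2); i++) {
            double d = distance(pts.get(i - 1), pts.get(i));
            if (d == 0.0) continue; // skip duplicate points to avoid dividing by zero
            double angle = turningAngle(pts.get(i - 1), pts.get(i), pts.get(i + 1));
            double curvature = Math.abs(angle / d);
            if (curvature > max) {
                max = curvature;
                best = i;
            }
        }
        return max < CURVATURE_THRESHOLD ? fallback : best;
    }

    static List<double[]> removeTails(List<double[]> pts) {
        int n = pts.size();
        double perimeter = 0.0;
        for (int i = 1; i < n; i++) perimeter += distance(pts.get(i - 1), pts.get(i));
        if (n <= MIN_POINTS || perimeter < MIN_PERIMETER) return new ArrayList<>(pts);

        int start = sharpestIndex(pts, 1, (int) (n * 0.15), 0);
        int end = sharpestIndex(pts, (int) (n * 0.95), n - 2, n - 1);
        return new ArrayList<>(pts.subList(start, end + 1));
    }
}
```

For a stroke that begins with a short diagonal hook from (3, 3) to (0, 0) and then runs along the x-axis, `removeTails` returns the stroke starting at (0, 0), since the sharp turn at that point exceeds the curvature threshold.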
5.2 Corners Finding
Finding corners is the core concept of this application. The corner finding algorithm
in DDUSR is developed from the algorithm presented by Wolin & Hammond (2008),
named “ShortStraw”. ShortStraw aims to provide a simple and effective
way of finding corners. After resampling the points of a shape, it uses a bottom-up
approach to detect corners by calculating the distance (i.e. the “straw”) between the
endpoints of a constant support region around each point and taking the points with
the minimum “straw” as corner candidates. The corner finding algorithm of this project
shares its main concept with ShortStraw, but has been developed to be more complex and
more accurate. The simplicity of ShortStraw is therefore reduced, but its
drawbacks are addressed, such as insensitivity to obtuse angles and
redundant corners identified in small shapes or slowly drawn shapes. All
the constant factors were determined experimentally by the author of this report.
5.2.1 Stage One
This stage follows ShortStraw’s main concept, i.e. it uses both a bottom-up and a
top-down approach. The bottom-up approach attempts to build corners from
primitive information, whereas the top-down approach looks at higher-level patterns to
determine possible insertion or deletion of corners.
5.2.1.1 Bottom-Up
DDUSR finds corners in a stroke based on the length of the chord. The chord for a point
$p_i$ is computed as:
$\mathrm{chord}_i = \mathrm{Distance}\,|p_{i-k}, p_{i+k}|$
where k is a variable support region determined through the method for dynamic chord
lengths presented by Teh and Chin (1989). This is an algorithm for determining the
support region for each point on a digital curve. The procedure does not require any
input parameter. It first detects the support region for each point based on its local
properties, and then calculates measures of relative significance (i.e. curvature) of each
point. This overcomes one of the disadvantages of ShortStraw, its insensitivity to obtuse
angles.
The following is the pseudo code of the algorithm for computing the chord for a point
with a variable support region:
ComputeChord (points)
    FOR i from 2 to (points.size - 3)
        chd1 = -1
        FOR k from 1 to (points.size - i - 1)
            IF (i - k) >= 0
                chd2 = distance between points[i-k] and points[i+k]
                d2 = perpendicular distance of points[i] to the line through points[i-k] and points[i+k]
                IF k != 1
                    IF chd1 >= chd2
                        BREAK
                    IF d1/chd1 >= d2/chd2
                        BREAK
                d1 = d2
                chd1 = chd2
        chord ← chd1
    return chord
To find the initial corner set, all the chords are first computed from the third point to the
antepenultimate point. Then a threshold md is set equal to the median chord in the
array that stores all calculated chords. For each $\mathrm{chord}_i$, if $\mathrm{chord}_i$ is below the threshold
md, then the corresponding point is a corner candidate. Finally, the starting point
and the ending point are stored in the ArrayList of corners at the first and last
positions respectively.
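The candidate selection described above can be sketched in Java as follows. The sketch takes the paragraph literally (every point whose chord is below the median becomes a candidate) and assumes that `chords[j]` belongs to point index `j + 2`, because chords start at the third point. The class name `CornerCandidateSketch` and the averaging of the two middle values for an even number of chords are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch of initial corner selection by median chord; names are hypothetical. */
final class CornerCandidateSketch {
    private CornerCandidateSketch() {}

    /** Median of the values; the even-length rule (mean of the middle pair) is assumed. */
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    /** Returns point indices of corner candidates, including both stroke endpoints. */
    static List<Integer> cornerCandidates(double[] chords, int pointCount) {
        List<Integer> corners = new ArrayList<>();
        corners.add(0); // the starting point is always kept
        if (chords.length > 0) {
            double md = median(chords);
            for (int j = 0; j < chords.length; j++) {
                if (chords[j] < md) {
                    corners.add(j + 2); // chords[j] belongs to point index j + 2
                }
            }
        }
        corners.add(pointCount - 1); // the ending point is always kept
        return corners;
    }
}
```

With nine points and chords {10, 10, 4, 10, 10}, the median is 10 and only the third chord falls below it, so the candidates are the point indices 0, 4 and 8.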
5.2.1.2 Top-Down
After the initial set of corners is selected by taking the shortest chords, some
higher-level processing is executed to find missed corners and remove false corners. First,
DDUSR checks whether each consecutive pair of corners passes a line test. Two
consecutive corners, $p_a$ and $p_b$, pass the line test if the Euclidean distance and the
path length between them are nearly equal. This equality is expressed through the
ratio:
$t = \dfrac{\mathrm{Distance}\,|p_a, p_b|}{\mathrm{Path\ Length}\,(p_a, p_b)}$
where 0 < t ≤ 1, since the Euclidean distance between the two points can never be
greater than the path length between them.
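The ratio t can be computed directly from the stroke points. The following is a minimal Java sketch, assuming points stored as `double[]{x, y}`; the class name `LineTestSketch`, the method names and the convention of returning 1 for a zero-length path are assumptions made for illustration.

```java
import java.util.List;

/** Sketch of the line test ratio t; names are hypothetical. */
final class LineTestSketch {
    private LineTestSketch() {}

    /** Ratio of the Euclidean distance to the path length between indices a and b. */
    static double straightness(List<double[]> pts, int a, int b) {
        double chord = Math.hypot(pts.get(b)[0] - pts.get(a)[0], pts.get(b)[1] - pts.get(a)[1]);
        double path = 0.0;
        for (int i = a; i < b; i++) {
            path += Math.hypot(pts.get(i + 1)[0] - pts.get(i)[0],
                               pts.get(i + 1)[1] - pts.get(i)[1]);
        }
        return path == 0.0 ? 1.0 : chord / path; // assumed: coincident points count as straight
    }

    /** True if the segment between a and b is straight enough for the given threshold. */
    static boolean passesLineTest(List<double[]> pts, int a, int b, double threshold) {
        return straightness(pts, a, b) >= threshold;
    }
}
```

For the stroke (0, 0), (3, 0), (3, 4), the distance between the endpoints is 5 and the path length is 7, so t is 5/7 and the pair fails a line test with a threshold of 0.99.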
For the insertion of missed corners, a threshold T1 is set to 0.99. If T1 is not high
enough, the algorithm may miss correct corners, as shown in Figure 5.2.
Figure 5.2 False Corner Found
If t between two consecutive corners, $p_a$ and $p_b$, is less than T1, DDUSR loops over the
original shape points between $p_a$ and $p_b$. For each point $p_c$ between them, it checks
whether the absolute value of the turning angle (see 5.3.2 for details about the turning
angle function) between line $\overline{p_a, p_c}$ and line $\overline{p_c, p_b}$ is greater than 0.09π; this check
avoids adding unnecessary corner candidates. For the points that pass the check, it
computes the perpendicular distance from $p_c$ to line $\overline{p_a, p_b}$. The point with the
maximum perpendicular distance to line $\overline{p_a, p_b}$ is taken as the missed corner
candidate. After a new corner is inserted, the program loops over the corners from the
beginning again. The insertion process stops only when no further missed corner can be
found. With this method, DDUSR becomes sensitive to obtuse angles.
For the deletion of false corners, a threshold T2 is set to 0.95. For each corner
candidate pc from the second to the penultimate, if t between pc-1 and pc+1 is no less
than T2 and the absolute value of the turning angle at pc is less than 0.13π, then pc is
a false corner and is removed. If at least one corner has been removed during an
iteration, the loop starts again from the beginning. The deletion process stops only
when no more false corners can be removed.
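The deletion pass above can be sketched as a loop that repeats until an iteration removes nothing. The names are hypothetical, and the sketch assumes that corners are stored as indices into the list of stroke points and that the turning angle at a corner is measured between the chords to its neighbouring corners; the report does not state either detail.

```java
import java.awt.geom.Point2D;
import java.util.ArrayList;
import java.util.List;

/** Sketch of the false-corner deletion pass; names and data layout are assumptions. */
final class FalseCornerRemoval {
    static final double T2 = 0.95;
    static final double MAX_TURN = 0.13 * Math.PI;

    /** Chord length divided by path length between stroke indices a and b. */
    static double ratio(List<Point2D> pts, int a, int b) {
        double path = 0.0;
        for (int i = a; i < b; i++) {
            path += pts.get(i).distance(pts.get(i + 1));
        }
        return path == 0.0 ? 1.0 : pts.get(a).distance(pts.get(b)) / path;
    }

    /** Unsigned angle between direction a->c and direction c->b, in [0, pi]. */
    static double absTurn(Point2D a, Point2D c, Point2D b) {
        double v1x = c.getX() - a.getX(), v1y = c.getY() - a.getY();
        double v2x = b.getX() - c.getX(), v2y = b.getY() - c.getY();
        return Math.abs(Math.atan2(v1x * v2y - v1y * v2x, v1x * v2x + v1y * v2y));
    }

    /** Returns a copy of the corner indices with false corners removed. */
    static List<Integer> prune(List<Point2D> pts, List<Integer> corners) {
        List<Integer> out = new ArrayList<>(corners);
        boolean removed = true;
        while (removed) {              // repeat until a full pass removes nothing
            removed = false;
            for (int k = 1; k < out.size() - 1; k++) {
                int prev = out.get(k - 1), cur = out.get(k), next = out.get(k + 1);
                if (ratio(pts, prev, next) >= T2
                        && absTurn(pts.get(prev), pts.get(cur), pts.get(next)) < MAX_TURN) {
                    out.remove(k);     // k is an int, so this removes by position
                    removed = true;
                    k--;               // re-examine the element that shifted into slot k
                }
            }
        }
        return out;
    }
}
```

With a straight stroke and corner indices [0, 2, 4], the middle corner is dropped and [0, 4] remains; with an L-shaped stroke the corner at the bend is kept.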
A further deletion step removes false corners that lie too close together, which
happens especially in small shapes or in shapes drawn slowly. For each corner candidate
pi from the second to the last, if the distance between the two consecutive corners
pi-1 and pi is less than 30, DDUSR finds which of pi-1 and pi has the smaller turning
angle, and if this smaller angle is less than 0.5π, it removes the corresponding corner
candidate. A special case arises when pi is either the second or the last corner
candidate: if the turning angle at pi is less than 0.5π, then pi is removed. However,
the point is removed regardless of its angle if the distance between the two
consecutive corners pi-1 and pi is less than 5% of the perimeter of the shape. After a
corner candidate is removed, the loop starts again from the beginning. The deletion
process stops only when no more false corners can be removed.
5.2.2 Stage Two
Stage two is actually a curve property identification. If any corner can be added to a
shape after stage one, the shape is recognised as a curve.
For each corner candidate pi from the second to the last, DDUSR finds the original
shape point p, between pi-1 and pi, with the maximum perpendicular distance to line
(pi-1, pi). If this perpendicular distance is greater than a constant factor, 11, or
the ratio of the perpendicular distance to the distance between pi-1 and pi is greater
than 0.14, then p is a missed corner and is added. In addition, the curve property of
the shape/stroke is set to true. Figure 5.3 shows an example of a corner finding result.
Figure 5.3 An example of corner finding result
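The stage two check for one pair of consecutive corners can be sketched as follows. The class and method names are hypothetical; the constants 11 and 0.14 are the ones quoted above, and `Line2D.ptLineDist` is the standard Java method for perpendicular distance.

```java
import java.awt.geom.Line2D;
import java.awt.geom.Point2D;
import java.util.List;

/** Sketch of the stage two curve check between corners a and b; names are hypothetical. */
final class CurveCheck {
    static final double MAX_DIST = 11.0;   // absolute distance threshold from the text
    static final double MAX_RATIO = 0.14;  // distance-to-chord ratio threshold from the text

    /** Index of the point to add as a missed corner, or -1 if the segment is straight enough. */
    static int curvePoint(List<Point2D> pts, int a, int b) {
        Point2D pa = pts.get(a), pb = pts.get(b);
        int best = -1;
        double bestDist = 0.0;
        for (int i = a + 1; i < b; i++) {
            Point2D p = pts.get(i);
            double d = Line2D.ptLineDist(pa.getX(), pa.getY(), pb.getX(), pb.getY(),
                                         p.getX(), p.getY());
            if (d > bestDist) {
                bestDist = d;
                best = i;
            }
        }
        if (best < 0) {
            return -1;
        }
        double chord = pa.distance(pb);
        boolean farEnough = bestDist > MAX_DIST
                || (chord > 0.0 && bestDist / chord > MAX_RATIO);
        return farEnough ? best : -1;
    }
}
```

A shape would then be tagged as a curve as soon as this check returns a point for any pair of consecutive corners.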
5.3 Recognition
5.3.1 Open or Closed Shape Identification
In DDUSR, a simple test determines whether a shape is open or closed. If the ratio:
$$ r = \frac{\text{Distance}\,|p_{\text{first}}, p_{\text{last}}|}{\text{Perimeter of Shape}} $$
is less than a constant factor, 0.16, which is determined empirically, then the shape is
regarded as a closed shape; otherwise, it is regarded as an open shape (Paulson &
Hammond, 2008).
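A minimal sketch of this test is given below. The names are hypothetical, and the sketch assumes that the "perimeter" is the total length of the drawn stroke, which the report does not state explicitly.

```java
import java.awt.geom.Point2D;
import java.util.List;

/** Sketch of the open/closed test; names are hypothetical. */
final class OpenClosedTest {
    static final double CLOSED_RATIO = 0.16; // empirical constant quoted in the text

    /** True if the gap between the first and last point is small relative to the stroke length. */
    static boolean isClosed(List<Point2D> pts) {
        double perimeter = 0.0;
        for (int i = 1; i < pts.size(); i++) {
            perimeter += pts.get(i - 1).distance(pts.get(i));
        }
        if (perimeter == 0.0) {
            return false; // empty or single-point stroke: treated as open in this sketch
        }
        double gap = pts.get(0).distance(pts.get(pts.size() - 1));
        return gap / perimeter < CLOSED_RATIO;
    }
}
```

For a square stroke that ends one unit short of its starting point the ratio is about 0.03, so the shape is closed; for a straight line the ratio is 1, so the shape is open.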
5.3.2 Turning Angle Function
The turning angle function provides a simple approach to shape recognition (McCreath,
2008). For a point pa which is not one of the endpoints, the turning angle is the angle
through which the directed line from pa-1 to pa rotates to reach the directed line from
pa to pa+1. When the direction of rotation is clockwise, the turning angle is positive;
otherwise, it is negative. However, instead of using the accumulated sum of turning
angles from the starting point to the point p as a sign at p, DDUSR uses the current
turning angle at p as a sign. For the starting point p0 (i.e. the first point), a line
parallel to the x-axis is first drawn from p0, and then the angle between this line and
line (p0, p1) is computed. When line (p0, p1) is above the drawn line, the angle is
positive; otherwise, the angle is negative. This angle is regarded as the turning angle
at p0.
Figure 4.2 (a) Turning Angle (Non-endpoint) (b) Turning Angle (starting point)
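The signed turning angle at a non-endpoint can be sketched with `Math.atan2` applied to the cross and dot products of the incoming and outgoing directions. The names are hypothetical, and the sign convention assumes Java screen coordinates, where y grows downward, so that a turn that looks clockwise on screen comes out positive.

```java
/** Sketch of the signed turning angle at a non-endpoint; names are hypothetical. */
final class TurningAngle {
    /**
     * Angle in radians, in [-pi, pi], by which the direction a->b rotates to reach b->c.
     * With y growing downward (screen coordinates), a clockwise turn is positive.
     */
    static double at(double ax, double ay, double bx, double by, double cx, double cy) {
        double v1x = bx - ax, v1y = by - ay;   // incoming direction
        double v2x = cx - bx, v2y = cy - by;   // outgoing direction
        double cross = v1x * v2y - v1y * v2x;
        double dot = v1x * v2x + v1y * v2y;
        return Math.atan2(cross, dot);
    }
}
```

Moving right and then down the screen, as in `TurningAngle.at(0, 0, 1, 0, 1, 1)`, gives a quarter turn of +π/2; moving right and then up gives −π/2.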
5.3.3 Shapes Testing
DDUSR uses the curve property, the open property, the total sum of turning angles at
all points except the two endpoints, and the average turning angle to identify shape
types. This is a simple algorithm, but it can still recognise the defined shape types
accurately. All the constant factors were determined empirically by the author of this
report.
5.3.3.1 Line
If a shape:
1) only has two corners (i.e. two endpoints), and
2) is identified as an open shape,
it is a line. DDUSR presents a line from the starting point to the end point of the
original shape. Figure 5.4 shows an example of “Line” recognition.
Figure 5.4 An example of “Line” Recognition
5.3.3.2 Arrow
A shape is recognised as an arrow if:
1) it is regarded as an open shape, and
2) it has 4, 5 or 6 corners, and
3) it has a corner pi at which the absolute value of the turning angle is greater than
0.6π, and
4) it has a corner pi+1 at which the absolute value of the turning angle is greater than
0.88π, and
5) the distance between pi and pi+2 is less than 30.
In DDUSR, an arrow is actually drawn as a polyline which has five points. DDUSR takes
pi-1 as the first point and pi as the second and fourth points, then computes the
locations of the third and fifth points using trigonometric functions. Figure 5.5
shows an example of “Arrow” recognition.
Figure 5.5 An example of “Arrow” Recognition
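The five arrow conditions can be sketched as a single predicate over the detected corners. The names are hypothetical, and the sketch assumes that the turning angle at a corner is measured between the chords to its neighbouring corners, which the report does not spell out.

```java
import java.awt.geom.Point2D;
import java.util.List;

/** Sketch of the arrow test over detected corners; names are hypothetical. */
final class ArrowTest {
    /** Unsigned angle between direction a->c and direction c->b, in [0, pi]. */
    static double absTurn(Point2D a, Point2D c, Point2D b) {
        double v1x = c.getX() - a.getX(), v1y = c.getY() - a.getY();
        double v2x = b.getX() - c.getX(), v2y = b.getY() - c.getY();
        return Math.abs(Math.atan2(v1x * v2y - v1y * v2x, v1x * v2x + v1y * v2y));
    }

    static boolean isArrow(List<Point2D> corners, boolean open) {
        int n = corners.size();
        if (!open || n < 4 || n > 6) {
            return false;                       // conditions 1 and 2
        }
        for (int i = 1; i + 2 < n; i++) {
            double turnAtI = absTurn(corners.get(i - 1), corners.get(i), corners.get(i + 1));
            double turnAtNext = absTurn(corners.get(i), corners.get(i + 1), corners.get(i + 2));
            boolean tipReturns = corners.get(i).distance(corners.get(i + 2)) < 30;
            if (turnAtI > 0.6 * Math.PI && turnAtNext > 0.88 * Math.PI && tipReturns) {
                return true;                    // conditions 3, 4 and 5 hold at corner i
            }
        }
        return false;
    }
}
```

A stroke whose corners are (0,0), (100,0), (85,-10), (100,0), (85,10) passes: the pen reaches the tip, draws one barb, returns to the tip and draws the other barb.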
COMP8780 IHCC Project S2 2010 u4582110 Yajun Wang
- 25 -
5.3.3.3 DataStore
DataStore is a special shape for drawing data-flow diagrams. A shape is recognised as
a DataStore, if:
1) the number of corners is equal to 4, and
2) the sum of total turning angles except the angles at the endpoints is greater than
2.8 and less than 3.5 (i.e. in the range of (0.9π, 1.1π)), and
3) the average turning angle is greater than 1.3 and less than 1.8 (i.e. in the range
of (0.4π, 0.6π)), and
4) the absolute value of the slope of line (p0, p1) is less than 1, and
5) the turning angles at the second and third points have the same sign (+ or –),
and their sum is greater than 0.8π and less than 1.2π.
Like an arrow, a DataStore is drawn as a polyline, which consists of four points. Figure 5.6
shows an example of “DataStore” recognition.
Figure 5.6 An example of “DataStore” Recognition
5.3.3.4 Rectangle or Square
To be recognised as a rectangle or a square, a shape should satisfy:
1) the number of corners is 5 or 6, and
2) the shape is regarded as a closed shape, and
3) the sum of the turning angles at all corners except endpoints is greater than 4 and
less than 7 (i.e. in the range of (1.3π, 2.3π)), and
4) the average turning angle is greater than 1.3 and less than 2 (i.e. in the range of
(0.4π, 0.64π)), and
5) among all corners except endpoints, there are at least 3 right angles, where a
corner is a right angle if its turning angle is in the range of (0.4π, 0.6π).
If, in addition, the shape has two equal adjacent sides (the ratio of these two sides
should be less than 0.8), it is recognised as a square; otherwise, it is recognised as
a rectangle. Figure 5.7 shows an example of “Square” and “Rectangle” recognition.
Figure 5.7 An example of “Square” and “Rectangle” Recognition
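The conditions shared by squares and rectangles can be sketched as below. The names are hypothetical; the sketch assumes that the turning angles at the non-endpoint corners are already available and that their absolute values are what is summed, since the report does not say how shapes drawn counter-clockwise are handled. The final square-versus-rectangle decision is left out.

```java
/** Sketch of the shared rectangle/square conditions; names and inputs are assumptions. */
final class RectangleTest {
    /**
     * @param cornerCount number of detected corners, endpoints included
     * @param closed      result of the open/closed test
     * @param innerTurns  turning angles (radians) at the corners that are not endpoints
     */
    static boolean looksRectangular(int cornerCount, boolean closed, double[] innerTurns) {
        if (!closed || (cornerCount != 5 && cornerCount != 6) || innerTurns.length == 0) {
            return false;
        }
        double sum = 0.0;
        int rightAngles = 0;
        for (double t : innerTurns) {
            double a = Math.abs(t);            // assumption: magnitudes are summed
            sum += a;
            if (a > 0.4 * Math.PI && a < 0.6 * Math.PI) {
                rightAngles++;
            }
        }
        double average = sum / innerTurns.length;
        return sum > 4 && sum < 7 && average > 1.3 && average < 2 && rightAngles >= 3;
    }
}
```

Three quarter turns at the inner corners of a five-corner closed stroke give a sum of about 4.71 and an average of about 1.57, so the stroke passes.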
5.3.3.5 Parallelogram
In DDUSR, a parallelogram is actually drawn as a polygon. To be a parallelogram, a
shape should satisfy the following conditions:
1) the number of corners is 5, and
2) the shape is identified as a closed shape, and
3) the sum of the turning angles at all corners except endpoints is greater than 4 and
less than 7 (i.e. in the range of (1.3π, 2.3π)), and
4) the average turning angle is greater than 1.3 and less than 2 (i.e. in the range of
(0.4π, 0.64π)), and
5) the sum of the turning angles at the 2nd and the 3rd corners is greater than 0.8π
and less than 1.2π, and
6) the ratio of the turning angles at the 2nd and the 4th corners is greater than 0.9.
Figure 5.8 shows an example of “Parallelogram” recognition.
Figure 5.8 An example of “Parallelogram” Recognition
5.3.3.6 Polygon
Polygon is used for all closed, non-curve shapes. Figure 5.9 shows an example of
“Polygon” recognition.
Figure 5.9 An example of “Polygon” Recognition
5.3.3.7 Ellipse or Circle
To be recognised as an ellipse or a circle, a shape should satisfy the following
conditions:
1) the number of corners is more than 5, and
2) the shape is marked as a closed shape, and
3) the sum of the turning angles at all corners except endpoints is greater than 4 and
less than 8.5 (i.e. in the range of (1.3π, 2.7π)), and
4) the average turning angle is greater than 0.4 and less than 2 (i.e. in the range of
(0.13π, 0.64π)).
DDUSR then computes the longest distance between corners, which is taken as the long
axis of the ellipse, and works out the length of the short axis from the ellipse
perimeter formula. If the ratio of the short axis to the long axis is greater than 0.9,
the shape is recognised as a circle; otherwise, it is regarded as an ellipse. Figure
5.10 shows an example of “Circle” and “Ellipse” recognition.
Figure 5.10 An example of “Circle” and “Ellipse” Recognition
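The report does not say which ellipse perimeter formula is used. As an illustration only, the sketch below assumes the common approximation P ≈ 2π·sqrt((a² + b²)/2), where a and b are the semi-axes, and solves it for b. The class and method names are hypothetical.

```java
/** Sketch of the short-axis estimate; the perimeter formula is an assumption. */
final class EllipseAxes {
    /**
     * Solves P = 2*pi*sqrt((a^2 + b^2) / 2) for b, where a = longAxis / 2,
     * and returns the full short axis 2b (0 if the inputs are inconsistent).
     */
    static double shortAxis(double perimeter, double longAxis) {
        double a = longAxis / 2.0;
        double m = perimeter / (2.0 * Math.PI);
        double bSquared = 2.0 * m * m - a * a;
        return bSquared > 0.0 ? 2.0 * Math.sqrt(bSquared) : 0.0;
    }

    /** Circle if the short axis is more than 0.9 of the long axis, as in the text. */
    static boolean isCircle(double perimeter, double longAxis) {
        return shortAxis(perimeter, longAxis) / longAxis > 0.9;
    }
}
```

For a circle of diameter 100 the perimeter is 100π, the estimated short axis is 100 and the ratio is 1; for a stroke with a long axis of 200 and an estimated short axis of 100 the ratio is 0.5, so it is treated as an ellipse.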
5.3.3.8 CubicCurve
A shape is recognised as a CubicCurve if the following conditions are met:
1) the number of corners is greater than 3 and less than 12, and
2) the shape is tagged as an open and curve shape, and
3) the sum of the turning angles at all corners except endpoints is less than 1.5,
and
4) the average turning angle is less than 0.6, and
5) from the 3rd corner onwards, at least one corner has a turning angle whose sign
(+ or –) differs from that of the 2nd corner.
Figure 5.11 shows an example of “CubicCurve” recognition.
Figure 5.11 An example of “CubicCurve” Recognition
5.3.3.9 QuadCurve
In DDUSR, all arcs are simply represented as a QuadCurve, rather than as part of a
circle or ellipse as in other applications. To be recognised as a QuadCurve, a shape
should satisfy the following requirements:
1) the number of corners is in the range of [3, 9], and
2) the shape is tagged as an open and curve shape, and
3) the sum of the turning angles at all corners except endpoints is less than 1.5,
and
4) the average turning angle is less than 1.5.
Figure 5.12 shows an example of “QuadCurve” recognition.
Figure 5.12 An example of “QuadCurve” Recognition
5.3.3.10 Polyline
A shape which does not pass any of the above shape tests is recognised as a polyline.
Figure 5.13 shows an example of “Polyline” recognition.
Figure 5.13 An example of “Polyline” Recognition
6. Testing and Evaluation
The system was tested in two ways. The first counts the corners the system finds
correctly; the second counts the shapes the system recognises correctly. Both sets of
data were collected from 600 shapes drawn by 5 different users. Each user drew the 12
types of shapes (as shown in Figure 4.2) which can be recognised by the system, drawing
each shape type 10 times with different sizes. The polygons and polylines were drawn
arbitrarily, with no restriction on how complex or simple they should be. In the first
test, corners found in a shape are counted as correct only when they can be used for
further recognition. In the second test, top-correct interpretations are not
considered; that is, the test only considers whether shape types are correctly
recognised, not the precision of the presented shapes. The results are shown in
Table 6.1 below.
Table 6.1 Testing Result
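| Shape Type    | Correct Corners | Total Corners | Correct Shapes | Total Shapes |
| Arrow         | 252             | 252           | 48             | 50           |
| Line          | 100             | 100           | 50             | 50           |
| Circle        | 594             | 594           | 39             | 50           |
| Ellipse       | 583             | 583           | 41             | 50           |
| Square        | 275             | 275           | 50             | 50           |
| Rectangle     | 275             | 275           | 50             | 50           |
| Parallelogram | 250             | 250           | 50             | 50           |
| DataStore     | 200             | 200           | 50             | 50           |
| QuadCurve     | 269             | 269           | 48             | 50           |
| CubicCurve    | 287             | 287           | 44             | 50           |
| Polygon       | 608             | 623           | 47             | 50           |
| Polyline      | 589             | 614           | 50             | 50           |
| Total         | 4282            | 4322          | 567            | 600          |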
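The overall rates implied by the totals in Table 6.1 follow by simple division; the lowest per-shape recognition rates, for Circle and Ellipse, are worked out the same way:

$$ \text{corner accuracy} = \frac{4282}{4322} \approx 0.991, \qquad \text{shape accuracy} = \frac{567}{600} = 0.945 $$

$$ \text{Circle: } \frac{39}{50} = 0.78, \qquad \text{Ellipse: } \frac{41}{50} = 0.82 $$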
Due to time limitations, the author did not implement other existing algorithms for
comparison. However, in theory, the corner finding algorithm finds corners with high
accuracy in both small and large shapes. It is competitive with other current corner
finding algorithms, such as ShortStraw. Wolin and Hammond (2008) report that
ShortStraw has a high accuracy, 0.979, while the algorithms developed by Sezgin et al.
(2001) and Kim & Kim (2006) achieve 0.824 and 0.790 respectively. In DDUSR, the
corner finding algorithm is developed from ShortStraw and overcomes some of its
disadvantages. Therefore, in theory, the corner finding algorithm developed in DDUSR
would have relatively high accuracy even under the same testing environment as the
other algorithms.
For shape recognition, there is still room for improvement in DDUSR. Paleo, developed
by Paulson and Hammond (2008), is reported to achieve very high accuracies: 99.89%
correct recognition and 98.56% top-correct interpretation. In DDUSR, the recognition
algorithm is still at a simple and rough stage. For example, curve property
identification is one aspect to improve, because a circle or ellipse is sometimes
recognised as a (non-curved) polygon. Other property determination approaches could
also be implemented alongside the turning angle function.
7. Conclusion
This report first introduced the aims, scope and limitations of the project. It then
gave the background of sketch recognition and the requirements, design and modelling
of the project. After that, it reported the detailed implementation of the project and
the testing results.
To sum up, the project was conducted step by step on schedule, and the application is
executable and usable. Although there is some room for improvement in the system,
DDUSR can still be used when users need to quickly draw diagrams which contain simple
shapes, or when users “brainstorm” their designs.
Some further work can be conducted to improve this program in the future:
1. Improve the recognition algorithm, for example curve property identification (this
might require modifying the corner finding algorithm as well) and other property
determination approaches.
2. Identify more shapes to be recognised using geometric characteristics or other
features.
3. Add more useful functions for ease of use and customisation, such as shape union,
separation, and shape outline styles.
4. Support multiple-stroke drawing by determining the time interval between two
consecutively drawn strokes, or by other approaches.
References
Hammond, Tracy, Brian Eoff, Brandon Paulson, Aaron Wolin, Katie Dahmen, Joshua
Johnston & Pankaj Rajan. Free-Sketch Recognition: Putting the CHI in Sketching. CHI
2008, April 5 – April 10 2008, Florence, Italy.
Hammond, Tracy & R. Davis. LADDER, A Sketching Language for User Interface
Developers. Computers & Graphics. 2005. 29, 4, 518-532.
Kim, D.H. and Kim, M.J. A Curvature Estimation for Pen Input Segmentation in
Sketch-based Modeling. Computer-Aided Design. 2006. 238-248.
McCreath, Eric. Partial Matching of Planar Polygons Under Translation and Rotation.
Proceedings of the 20th Annual Canadian Conference on Computational Geometry.
August 13-15, 2008 McGill University, Montreal, Quebec.
Oracle. The Java™ Tutorials. 2010. http://download.oracle.com/javase/tutorial.
Paulson, Brandon & Tracy Hammond. PaleoSketch: Accurate Primitive Sketch
Recognition and Beautification. Proceedings of the 13th international conference on
Intelligent user interfaces. 2008. ACM, New York, USA. 1 – 10.
Teh, C.H. & R.T. Chin. On The Detection of Dominant Points on Digital Curves. IEEE
Trans. Pattern Anal. Mach. Intell. 1989. 11, 8, 859–872.
Wobbrock, J.O., A.D. Wilson & Y. Li. Gestures Without Libraries, Toolkits or Training:
A $1 Recognizer for User Interface Prototypes. UIST’07: Proceedings of the 20th
Annual ACM Symposium on User Interface Software and Technology. 2007. ACM,
New York, USA. 159–168.
Wolin, Aaron & Tracy Hammond. ShortStraw: A Simple and Effective Corner Finder
for Polylines. EUROGRAPHICS 5th Annual Workshop on Sketch-Based Interfaces and
Modeling. 2008. 33 – 40.
Rubine, D. Specifying Gestures by Example. Proc. of the 18th Annual Conference on
Computer Graphics and Interactive Techniques, ACM Press (1991), 329-337.
Sezgin, T.M., Stahovich, T. and Davis, R. Sketch Based Interfaces: Early Processing
for Sketch Understanding. Proc. of the 2001 Workshop on Perceptive User Interfaces,
ACM Press (2001), 1-8.
Sutherland, I.E. Sketch Pad: A Man-Machine Graphical Communication System. Proc.
of the SHARE Design Automation Workshop, ACM Press (1964), 6.329-6.346.