
COLLEGE OF ENGINEERING AND COMPUTER SCIENCE

DEPARTMENT OF COMPUTER SCIENCE

Project Report

DIAGRAM DRAWING USING SHAPE RECOGNITION

Yajun Wang


Supervisor: Eric McCreath

COMP8780 IHCC Project S2 2010 u4582110 Yajun Wang


Abstract

Sketch recognition is the automated recognition of hand-drawn input made with an electronic stylus. It can recognise geometric primitives and other gestures defined by developers or users. Hand-drawn sketch recognition has been applied increasingly in a variety of fields, such as front ends for computer-aided design systems, gestural interfaces, alternative inputs for keyboard-less appliances, and automatic correction or understanding of diagrams for immediate educational feedback.

This project aims to explore and develop current sketch recognition techniques and to implement a system for drawing simple diagrams through Java GUI programming. The system recognises a shape as soon as the user finishes a single-stroke drawing, and it allows text to be edited inside some regular shapes. For the shape recognition part, the author conducted two tests on this system: the accuracy of corner finding and the accuracy of shape recognition. The corner finding test shows that the algorithm developed in this system is competitive with other corner finding algorithms, while the shape recognition test demonstrates that some aspects of the system still need to be improved in future work.



Contents

Abstract
Contents
1. Introduction
    1.1 Objectives
    1.2 Scope and Limitations
    1.3 Structure of This Report
2. Background
    Previous Work
3. Project Overview
    3.1 Requirements of “DDUSR”
    3.2 Schedule
    3.3 System Design and Modelling
4. Used Notation
5. Implementation
    5.1 Pre-process
        5.1.1 Resample
        5.1.2 Remove Tails
    5.2 Corners Finding
        5.2.1 Stage One
        5.2.2 Stage Two
    5.3 Recognition
        5.3.1 Open or Close Shape Identification
        5.3.2 Turning Angle Function
        5.3.3 Shapes Testing
6. Testing and Evaluation
7. Conclusion
References



1. Introduction

Sketch recognition is the automated recognition of hand-drawn input made with an electronic stylus. Hand-drawn sketch recognition has been applied increasingly in a variety of fields, such as front ends for computer-aided design systems (for example data-flow diagrams, UML diagrams, electronic circuit diagrams and other engineering design diagrams), automatic correction or understanding of diagrams for immediate educational feedback (for example, children draw a shape and the system immediately tells them what shape they have drawn and what it should look like), alternative inputs for small keyboard-less devices (such as Palm Pilots), and gestural interfaces (Hammond et al., 2008).

1.1 Objectives

The objective of this project is to implement a program for drawing simple diagrams through Java GUI programming, named “DDUSR” (“Diagram Drawing Using Shape Recognition”). The program takes user input from a mouse or a pen device and recognises simple shapes (such as arrows, circles, ellipses, squares, rectangles, triangles, lines and curves). These shapes are converted to smooth, regular shapes immediately while the user is interacting with the program.

More specifically, this application is developed for quickly drawing data-flow diagrams when users need them. Users can move, resize and edit the refined shapes and combine them into a data-flow diagram. The tool may also be used to construct other similar diagrams.

This project also aims to explore current free-sketch recognition techniques and algorithms, to develop them further in the implementation, and to explore heuristics for determining how the program can best be applied to the particular situation. Comparisons of this application with other existing applications are also provided.

Figure 1.1 shows an example diagram drawn by DDUSR.

Figure 1.1 An Example Diagram (Drawn by DDUSR)

1.2 Scope and Limitations

This application reacts immediately once the user finishes a single-stroke drawing. One limitation of this application is therefore that it does not support multiple-stroke drawing. For example, users may want to draw a rectangle with three strokes, as shown in Figure 1.2.

Figure 1.2 Multiple-stroke drawing vs. single-stroke drawing


Based on the requirements of this project (recognising and drawing simple shapes) and the time limitation, the final program can recognise lines, arrows, circles, ellipses, rectangles, squares, parallelograms, quadratic curves, cubic curves, polygons and polylines. The application could be developed further to recognise and draw more shapes (such as spirals, helixes and arbitrarily complex shapes); this has been achieved in some current sketch recognition programs (Hammond et al., 2008).

This application also provides all of the necessary functions for drawing, moving, resizing, editing, deleting and copying shapes (including changing the drawing and background colour of shapes, and the text font, size and colour) and for saving the current diagram. However, many improvements could still be made to efficiency and ease of use. These include union and separation of shapes, which would be helpful for moving multiple shapes at once.

The application has only been tested with Windows 7 on the author’s own laptop and Ubuntu 9.10 on computers in the ANU CSIT labs. Both testing environments have fast processors and ample memory. Under those running environments, the application does not slow down even with more than fifty shapes in the drawing panel. Therefore, the performance and scalability of this application were not considered during testing.

1.3 Structure of This Report

The rest of this report is structured as follows.

Chapter 2 provides background on the area of this project. This includes an overview of sketch recognition and related techniques, and a description of some similar systems developed by other researchers.

Chapter 3 outlines the requirements for the application by detailing the features that needed to be developed, as well as features that were desirable but not essential. This chapter also gives the initially planned schedule of this project and the procedure actually followed during implementation, to show whether the project was carried out on schedule. It also describes some difficulties that the author met during this project. Finally, it illustrates the overall design of this application, which is divided into two parts: the core recognition system design and the user interface design.

Chapter 4 lists the notation used in Chapter 5 to avoid ambiguity in the symbols used. Chapter 5 covers how the program was implemented, focusing on the core sketch recognition system. Chapter 6 reports the testing and evaluation of this system.

Chapter 7 then discusses the conclusions drawn from the project and the results of the program, as well as the challenges faced during the project. Some ideas for future work on this project are also listed in that chapter.



2. Background

Sketch recognition research lies at the crossroads of artificial intelligence and human-computer interaction. Sketch recognition has become an increasingly popular form of human-computer interaction due to the increasing use of Tablet PCs. Its techniques have generally fallen into two camps: gesture-based and free-sketch (Hammond et al., 2008).

First, gesture-based techniques can provide high accuracy, but require users to learn how to draw each shape in a particular manner, because the order and direction of the stroke and the number of strokes are key factors for recognition. These techniques are used, for example, in the Palm Pilot’s Graffiti. Gesture-based recognition relies on drawing-style features, such as the start and end direction of the stroke, the drawing speed of the stroke, and the total rotation of the stroke. Second, free-sketch recognition allows users to draw shapes as they would naturally, but most current techniques have low accuracy or require significant domain-level tweaking to make them usable. Geometric-based, feature-based and vision-based techniques are commonly used in free-sketch recognition to recognise shapes. In fact, the majority of current systems attempting free-sketch recognition sit somewhere between the gesture-based and free-sketch approaches, abandoning some constraints while keeping others because of the drawbacks of free-sketch recognition (Hammond et al., 2008).

Previous Work

The idea of interacting with computers via pen-based input began in 1964 with the seminal work of Ivan Sutherland’s Sketchpad system. This system was able to define relationships to an existing object diagram. It constructed a diagram based on a model of the design process. The locations of the points and lines of the diagram modelled the variables of a design, and the geometric constraints applied to the points and lines of the diagram modelled the design constraints that confined the values of the design variables (Sutherland, 1964). In 1991, Dean Rubine proposed a gesture recognition toolkit, GRANDMA, which allowed single-stroke gestures to be learned and later recognised through the use of a linear classifier. Rubine proposed thirteen features that could be used to describe any single-stroke shape and also provided two techniques for rejecting bad gestures (Rubine, 1991). Rubine’s work was later extended by other researchers. However, these systems use feature-based techniques, which require extensive training. Furthermore, because of the features chosen, these systems required that strokes be drawn in the same manner each time. For example, a circle drawn in an anti-clockwise manner would not be the same as a circle drawn in a clockwise manner.

Due to the drawbacks of feature-based techniques, research has trended towards more geometric-based techniques. In 2001, Sezgin et al. proposed a system composed of an approximation stage, a recognition stage and a beautification stage. The system used a novel method to detect corners in drawn strokes: finding the points of highest curvature along with the points of lowest speed. In 2006, Kim and Kim presented new curvature metrics for corner finding. The metrics, local convexity (the sum of all the curvatures of the same sign within a window) and local monotonicity (which examines decreasing curvatures of the same sign around a point), measure the curvature in the same direction at a point. Kim and Kim also introduced a different measure for the curvature at a point. Their system first resamples the stroke points. Since the distance between any two adjacent points becomes constant after resampling, a point’s curvature does not need to take account of changes in path length. Therefore, the curvature at each point is equal to the direction change (i.e. turning angle) at that point. In 2008, Paulson and Hammond proposed PaleoSketch, a system that uses many of the concepts from Sezgin et al. (2001) and other previous work. It improves on these systems by recognising a larger number of primitive shapes, as well as complex shapes that consist of lines and curves, while still maintaining high recognition accuracy.

There is also other research in different areas of the sketch recognition field, such as dominant points in curves (Teh & Chin, 1989), corner finding in polylines (Wolin & Hammond, 2008), and partial matching of polygons by translation and rotation of turning angles (McCreath, 2008). All three of these techniques are adopted in this project and will be illustrated later in this report.

So far, some tools, such as LADDER (Hammond & Davis, 2005), have been developed to allow users to describe higher-level symbols as a combination of lower-level primitives matching certain geometric constraints. They are usually implemented for drawing particular kinds of design diagrams, such as UML diagrams, electronic circuit diagrams, flow charts, mechanical engineering diagrams and military course-of-action diagrams, by using shape definition languages. Others have attempted to improve upper-level recognition by using context; however, the use of context typically requires domain knowledge.

The goal of this project is to learn from previous work and to develop a system that recognises free-hand sketches with as high an accuracy as possible by trying out and developing generalised (possibly new or modified) techniques. In addition, this system is specifically developed for quickly drawing data-flow diagrams.



3. Project Overview

This chapter first outlines the essential requirements for this system, as well as features that were desirable but not essential. The initially planned schedule and the procedure actually followed during implementation are also given in this chapter to show whether the project was carried out on schedule. It also describes some difficulties that the author met during this project. Finally, the overall design of this system, which is divided into two parts (the core recognition system design and the user interface design), is illustrated.

3.1 Requirements of “DDUSR”

3.1.1 Essential Requirements

1. The program should react immediately after the user finishes a one-stroke drawing. The shapes that should be recognised are: arrows, circles, squares, ellipses, rectangles and a special shape type for data-flow diagrams, the DataStore.

2. The program should allow moving, resizing and editing shapes.

3. The program should allow text to be entered in some regular shapes, such as circles, ellipses, rectangles and squares.

3.1.2 Desirable Features

1. The program can recognise more shapes than those listed in the essential requirements.

2. The program can offer additional functions, such as a colour chooser, text font and size selection, diagram saving, and shape copying, deleting and recalling.



3.2 Schedule

3.2.1 Initial Plan

Week          Tasks
Week 1        Look for supervisor and determine the topic
Week 2        Search and read materials related to the topic; learn Java programming
Week 3        Decide which method/algorithm will be used in the program; learn Java programming
Week 4        Prepare for initial presentation; read more materials and discuss with supervisor and other people
Week 5        Start programming: draw and recognise basic shapes (square and circle) using mouse
Week 6 - 8    Programming: recognise other shapes
Week 9 - 10   Programming: improve processing accuracy and add colour selection
Week 11       Debugging
Week 12       Complete application; write report
Week 13       Write report and submission
Week 14       Prepare for final presentation

3.2.2 Actual Procedure

As the author is new to Java, learning Java actually ran throughout the whole lifecycle of this project. For the programming part, implementing the corner finding algorithm occupied the majority of the project lifecycle because of several attempts with various algorithms and settings of various constant factors. In the text editing part, the shape layer problem took some time to fix because it required a large number of changes to the mouse event handling. Overall, this project was conducted step by step under the time constraint, although the learning period took somewhat more time than planned across the whole project lifecycle.

3.3 System Design and Modelling

3.3.1 Core Recognition System Design

Figure 3.1 Core Recognition System Modelling (Drawn by DDUSR)

As shown in Figure 3.1, the final points used in the recognition process are those identified as “corners”. Therefore, the main focus of this application is finding the correct corners of the original (raw) shape. This is discussed in Section 5.2.

3.3.2 User Interface Design

The guiding principle of the user interface design is to keep it as simple as possible, because the main purpose of this program is to let users (probably software designers) draw diagrams quickly, so that they can rapidly “brainstorm” their ideas while designing their systems. The shapes in these diagrams can also be edited easily and quickly. The other aim is to let users draw design diagrams that consist of simple shapes, such as data-flow diagrams. The user interface consists of a menu bar, which provides some commonly used functions, and a white drawing panel, which displays the refined shapes. A screenshot of the user interface is shown in Figure 3.2.

Figure 3.2 User Interface with Some Drawn Shapes

When users are drawing a shape, they are actually interacting with the Glass Pane rather than the Content Pane (see The Java™ Tutorials for more details about the Glass Pane and Content Pane of JFrame), but the original shapes that they draw are displayed on the Content Pane. The Glass Pane acts as a “transparent screen” that sits between the user and the “display screen” (i.e. the Content Pane). Users “draw” on the Glass Pane using an electronic pen or mouse, but all the contents (i.e. shapes) are shown on the “display screen”, the Content Pane, which lies under the Glass Pane. The purpose of using the Glass Pane is to catch the mouse events and then deliver them to the corresponding components. Without the Glass Pane, whenever a shape contains a text pane, the text pane would catch the mouse event and activate the “editing text” mode. The shape would then lose its “move” function, because the panel that contains the shape could not catch the mouse event.

After the original shape is passed to the core recognition system, the recognition system returns a JComponent containing the refined shape, which is added to the Content Pane. This means that the shapes, together with their own text panes (if any), are actually drawn in their own JComponents rather than directly on the Content Pane. This avoids the overlapping problem that arises when the system draws shapes and adds text panes to the Content Pane directly.

Users can double-click a shape to activate its text pane. Only some supported shapes have a text pane: circles, ellipses, rectangles, squares and DataStores (a special shape type for drawing data-flow diagrams). In “Edit Text” mode, the user can easily change focus between shapes with text panes by single-clicking the text panes, because the Glass Pane is invisible in “Edit Text” mode. If the user clicks an area without a text pane, the application automatically makes the Glass Pane visible again and returns to “Common” mode. The user can then draw new shapes, and move or resize existing shapes.

The inner mouse event passing process is shown in Figure 3.3.

Figure 3.3 Mouse Event Passing Process (Drawn by DDUSR)



4. Used Notation

This chapter lists some notations used in the following chapter.

1. A ← B: append element B to the end of list A.

2. | A |: the absolute value of A.

3. Distance $|p_a, p_b| = \sqrt{(x_a - x_b)^2 + (y_a - y_b)^2}$: the distance between $p_a$ and $p_b$.

4. line $\overline{p_a, p_b}$: the straight line between $p_a$ and $p_b$.

5. Path Length $(p_a, p_b) = \sum_{i=a}^{b-1} \sqrt{(x_i - x_{i+1})^2 + (y_i - y_{i+1})^2}$: the path length between $p_a$ and $p_b$.

6. Perimeter of Shape $= \sum_{i=0}^{N-1} \sqrt{(x_i - x_{i+1})^2 + (y_i - y_{i+1})^2}$

7. directed line $\overrightarrow{p_a p_b}$: a vector from $p_a$ to $p_b$.
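The distance, path length and perimeter definitions above translate almost directly into code. The following is a minimal sketch only: the class name `SketchGeometry` and the choice of representing a point as a `{x, y}` array are assumptions made for illustration and are not taken from DDUSR.

```java
import java.util.List;

/** Hypothetical helper class; every point is a {x, y} array. */
final class SketchGeometry {

    /** Notation 3: Euclidean distance between two points. */
    static double distance(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    /** Notation 5: path length along the stroke from index a to index b (a <= b). */
    static double pathLength(List<double[]> pts, int a, int b) {
        double sum = 0.0;
        for (int i = a; i < b; i++) {
            sum += distance(pts.get(i), pts.get(i + 1));
        }
        return sum;
    }

    /** Notation 6: perimeter, i.e. the path length over the whole stroke. */
    static double perimeter(List<double[]> pts) {
        return pathLength(pts, 0, pts.size() - 1);
    }
}
```

Note that the perimeter here follows the stroke as drawn and does not add a closing segment from the last point back to the first.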



5. Implementation

5.1 Pre-process

Pre-processing is vitally important for the subsequent recognition processes. Without good pre-processing, the system cannot recognise shapes accurately, because the original (raw) input contains noise, which has a considerable impact on the quality of the recognition results.

5.1.1 Resample

The first step of pre-processing is resampling the points of a shape. This process is depicted in Figure 5.1. Resampling involves two key aspects: first, deciding the inter-spacing distance, which is the distance between two adjacent points after resampling; second, relocating the points using a resampling algorithm together with the chosen inter-spacing distance.

Figure 5.1 Original Points of Shape VS Resampled Points of Shape


In DDUSR, the inter-spacing distance is determined using the approach of Wolin and Hammond (2008). Points are resampled based on the diagonal length of the shape’s bounding box. In order to accommodate shapes of different sizes, the inter-spacing distance S is equal to the diagonal divided by a constant factor, which is set to 40 in DDUSR. This constant factor is the same as in Wolin and Hammond’s article (2008), where it was determined empirically. A lower value of this constant causes over-smoothed shapes, whereas a higher value results in too much noise.

Once the inter-spacing distance S has been calculated, the original points of the shape can be resampled using the approach of Wobbrock et al. (2007). First, an empty ArrayList newp of Point is created to store the resampled points. The first point in the original point set, points_0, is appended to newp, and a distance holder D is initialised to 0. For each point p_i from the second to the last point, d, the distance between p_{i-1} and p_i, is computed. If D + d is at least S, a new resampled point is placed on the segment from p_{i-1} to p_i at distance S - D from p_{i-1} by linear interpolation, the new point becomes the starting point for the next step, and D is reset to 0; otherwise, D is updated to D + d.

The pseudo code of this algorithm is as follows:

    Resample (Shape s, C)
        p_topleft = the top-left point of the bounding box of Shape s
        p_bottomright = the bottom-right point of the bounding box of Shape s
        L_diagonal = distance between p_topleft and p_bottomright
        inter-spacing distance S = L_diagonal / C        (C = 40 in DDUSR)
        D = 0
        resampled points newp ← points_0
        FOR i from 1 to (points.size - 1)
            d = distance between points_{i-1} and points_i
            IF D + d >= S
                delta = (S - D) / d
                p = points_{i-1} + delta * (points_i - points_{i-1})
                newp ← p
                insert p into points at index i        (p is the starting point of the next step)
                D = 0
            ELSE
                D = D + d
        return newp
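The resampling procedure can be sketched in Java as follows. This is an illustrative sketch rather than the DDUSR source: the class name `StrokeResampler`, the `{x, y}` array representation of a point and the guard clauses are assumptions, and the new point is inserted into a working copy of the stroke (as in Wobbrock et al.'s description) so that the caller's list is left untouched.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper class; every point is a {x, y} array. */
final class StrokeResampler {

    static double distance(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    /** Inter-spacing distance S: bounding-box diagonal divided by c (40 in the report). */
    static double spacing(List<double[]> pts, double c) {
        double minX = Double.POSITIVE_INFINITY, minY = Double.POSITIVE_INFINITY;
        double maxX = Double.NEGATIVE_INFINITY, maxY = Double.NEGATIVE_INFINITY;
        for (double[] p : pts) {
            minX = Math.min(minX, p[0]);
            maxX = Math.max(maxX, p[0]);
            minY = Math.min(minY, p[1]);
            maxY = Math.max(maxY, p[1]);
        }
        return Math.hypot(maxX - minX, maxY - minY) / c;
    }

    /** Returns points spaced s apart along the stroke, starting with the first point. */
    static List<double[]> resample(List<double[]> input, double s) {
        if (s <= 0) {
            throw new IllegalArgumentException("spacing must be positive");
        }
        List<double[]> out = new ArrayList<>();
        if (input.isEmpty()) {
            return out;
        }
        List<double[]> pts = new ArrayList<>(input); // working copy, grows as points are inserted
        out.add(pts.get(0));
        double acc = 0.0;                            // the distance holder D
        for (int i = 1; i < pts.size(); i++) {
            double[] prev = pts.get(i - 1);
            double[] cur = pts.get(i);
            double d = distance(prev, cur);
            if (d > 0 && acc + d >= s) {
                double delta = (s - acc) / d;
                double[] q = {
                    prev[0] + delta * (cur[0] - prev[0]),
                    prev[1] + delta * (cur[1] - prev[1])
                };
                out.add(q);
                pts.add(i, q); // q becomes the "previous" point of the next iteration
                acc = 0.0;
            } else {
                acc += d;
            }
        }
        return out;
    }
}
```

A typical call would be `StrokeResampler.resample(points, StrokeResampler.spacing(points, 40))`. A final stretch of the stroke shorter than S produces no extra point, so the last raw point is not necessarily part of the result.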

5.1.2 Remove Tails

Unintended “tails” at the beginning and end of a shape tend to contain a great deal of noise, which can be a significant problem for accurate recognition. Therefore, these tails need to be removed before the shape is sent on for further processing. To determine whether a tail is present, the first 15% and the last 5% of the shape points are analysed. Because DDUSR is required to recognise arrows, the number of points considered at the end of the stroke has to be kept small. The ranges of points to be considered were determined experimentally by the author. The algorithm then finds the point within each section (the first 15% and the last 5%) that has the highest curvature (i.e. the absolute value of the turning angle at the point divided by the distance between the point and its preceding point). If that curvature is higher than a threshold, 0.5, then the stroke is broken at that point and the part beyond it is removed as the “tail”. This is not performed on shapes with a low number of points, 5, or with too small a perimeter, 70 (Paulson & Hammond, 2008).

The pseudo code of this algorithm is as follows:

    RemoveTail (points)
        s = points.size * 0.15
        e = points.size * 0.95
        max = -1
        FOR i from 1 to s
            d = distance between points_{i-1} and points_i
            angle = turning angle at points_i
            curvature = | angle / d |
            IF max < curvature
                max = curvature
                sMax = i
        IF max < 0.5
            sMax = 0
        max = -1
        FOR i from e to (points.size - 2)
            d = distance between points_{i-1} and points_i
            angle = turning angle at points_i
            curvature = | angle / d |
            IF max < curvature
                max = curvature
                eMax = i
        IF max < 0.5
            eMax = points.size - 1
        IF sMax == 0 and eMax == points.size - 1
            return points
        FOR i from sMax to eMax
            newp ← points_i
        return newp
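A possible Java rendering of the tail removal step is sketched below. Several details are assumptions rather than facts from the report: the class name `TailRemover`, the `{x, y}` array representation of a point, the turning angle being the signed angle in radians between the incoming and outgoing segments at a point, and the reading of the limits 5 and 70 as "fewer than 5 points" and "perimeter below 70".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper class; every point is a {x, y} array. */
final class TailRemover {
    static final double CURVATURE_THRESHOLD = 0.5; // threshold from the report
    static final int MIN_POINTS = 5;               // assumed reading of "low number of points, 5"
    static final double MIN_PERIMETER = 70.0;      // assumed reading of "too small perimeter, 70"

    static double distance(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    /** Assumed definition: signed angle in radians between segments a->b and b->c. */
    static double turningAngle(double[] a, double[] b, double[] c) {
        double ux = b[0] - a[0], uy = b[1] - a[1];
        double vx = c[0] - b[0], vy = c[1] - b[1];
        return Math.atan2(ux * vy - uy * vx, ux * vx + uy * vy);
    }

    /** Index of the sharpest point in [from, to], or fallback if none exceeds the threshold. */
    static int sharpest(List<double[]> p, int from, int to, int fallback) {
        double max = -1.0;
        int best = fallback;
        for (int i = Math.max(1, from); i <= Math.min(to, p.size() - 2); i++) {
            double d = distance(p.get(i - 1), p.get(i));
            if (d == 0.0) {
                continue; // skip duplicate points to avoid dividing by zero
            }
            double curvature = Math.abs(turningAngle(p.get(i - 1), p.get(i), p.get(i + 1)) / d);
            if (curvature > max) {
                max = curvature;
                best = i;
            }
        }
        return max < CURVATURE_THRESHOLD ? fallback : best;
    }

    /** Cuts the stroke at the sharpest point of the first 15% and of the last 5%. */
    static List<double[]> removeTails(List<double[]> p) {
        int n = p.size();
        double perimeter = 0.0;
        for (int i = 1; i < n; i++) {
            perimeter += distance(p.get(i - 1), p.get(i));
        }
        if (n < MIN_POINTS || perimeter < MIN_PERIMETER) {
            return p; // too small to analyse
        }
        int start = sharpest(p, 1, (int) (n * 0.15), 0);
        int end = sharpest(p, (int) (n * 0.95), n - 2, n - 1);
        return new ArrayList<>(p.subList(start, end + 1));
    }
}
```

With these assumptions, a short hook before an otherwise straight stroke is cut off at the point where the hook meets the stroke, while a stroke without sharp turns near its ends is returned with all of its points.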

5.2 Corners Finding

Finding corners is the core concept of this application. The corner finding algorithm in DDUSR is developed from the algorithm presented by Wolin & Hammond (2008), named “ShortStraw”. ShortStraw aims to be a simple and effective way of finding corners. After resampling the points of a shape, it uses a bottom-up approach to detect corners by calculating the distance (i.e. the “straw”) between the endpoints of a constant support region around each point and taking the points with the shortest “straws” as corner candidates. The main concept of the corner finding algorithm in this project is the same as that of ShortStraw, but it has been developed to be more complex and more accurate. As a result, some of the simplicity of ShortStraw is lost, but its drawbacks are addressed, such as its insensitivity to obtuse angles and the redundant corners it identifies in small or slowly drawn shapes. All the constant factors were determined experimentally by the author of this report.

5.2.1 Stage One

This stage follows ShortStraw’s main concept, i.e. using both a bottom-up and a top-down approach. The bottom-up approach attempts to build corners from primitive information, whereas the top-down approach looks at higher-level patterns to determine possible insertion or deletion of corners.

5.2.1.1 Bottom-Up

DDUSR finds corners in a stroke based on the length of the chord. The chord for a point p_i is computed as:

$\mathrm{chord}_i = \mathrm{Distance}\,|p_{i-k}, p_{i+k}|$

where k is a varying support region determined through the method for dynamic chord lengths presented by Teh and Chin (1989). It is an algorithm for determining the support region of each point on a digital curve, and the procedure does not require any input parameter. It first detects the support region for each point based on its local properties, and then calculates a measure of relative significance (i.e. curvature) for each point. This overcomes one of the disadvantages of ShortStraw: its insensitivity to obtuse angles.

The following is the pseudo code of the algorithm for computing the chord for a point in a varying support region:

    ComputeChord (points)
        FOR i from 2 to (points.size - 3)
            chd1 = -1
            FOR k from 1 to (points.size - i - 1)
                IF (i - k) >= 0
                    chd2 = distance between points_{i-k} and points_{i+k}
                    d2 = perpendicular distance of p_i to the line p_{i-k}, p_{i+k}
                    IF k != 1
                        IF chd1 >= chd2
                            BREAK
                        IF d1/chd1 >= d2/chd2
                            BREAK
                    d1 = d2
                    chd1 = chd2
            chord ← chd1
        return chord

To find the initial corner set, all the chords are first computed from the third point to the antepenultimate point. A threshold md is then set equal to the median chord in the array that stores all calculated chords. For each chord_i, if chord_i is below the threshold md, then the corresponding point is a corner candidate. Finally, the starting point and the ending point are stored at the beginning and end of the ArrayList of corners respectively.
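The chord computation and the median threshold can be sketched in Java as below. This is a direct transcription of the ComputeChord pseudo code under a few assumptions: the class name `ChordCorners`, the `{x, y}` array representation of a point, corners being returned as indices into the point list, and the upper median being used when the number of chords is even.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper class; every point is a {x, y} array. */
final class ChordCorners {

    static double distance(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    /** Perpendicular distance from p to the line through a and b. */
    static double perpDistance(double[] p, double[] a, double[] b) {
        double len = distance(a, b);
        if (len == 0.0) {
            return distance(p, a); // degenerate line: fall back to point distance
        }
        double cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]);
        return Math.abs(cross) / len;
    }

    /** chords[j] is the chord of the point with index j + 2. */
    static double[] computeChords(List<double[]> p) {
        int n = p.size();
        double[] chords = new double[Math.max(0, n - 4)];
        for (int i = 2; i <= n - 3; i++) {
            double chd1 = -1.0;
            double d1 = 0.0;
            // Grow the support region k while both neighbours exist.
            for (int k = 1; i - k >= 0 && i + k <= n - 1; k++) {
                double chd2 = distance(p.get(i - k), p.get(i + k));
                double d2 = perpDistance(p.get(i), p.get(i - k), p.get(i + k));
                // Stop once the chord stops growing or the deviation ratio stops growing.
                if (k != 1 && (chd1 >= chd2 || d1 / chd1 >= d2 / chd2)) {
                    break;
                }
                d1 = d2;
                chd1 = chd2;
            }
            chords[i - 2] = chd1;
        }
        return chords;
    }

    /** Indices of the initial corner candidates, including both end points. */
    static List<Integer> initialCorners(List<double[]> p) {
        List<Integer> corners = new ArrayList<>();
        if (p.isEmpty()) {
            return corners;
        }
        corners.add(0);
        double[] chords = computeChords(p);
        if (chords.length > 0) {
            double[] sorted = chords.clone();
            Arrays.sort(sorted);
            double median = sorted[sorted.length / 2]; // upper median (assumption)
            for (int j = 0; j < chords.length; j++) {
                if (chords[j] < median) {
                    corners.add(j + 2);
                }
            }
        }
        if (p.size() > 1) {
            corners.add(p.size() - 1);
        }
        return corners;
    }
}
```

On a perfectly straight, evenly spaced stroke every chord stops at k = 1, so no interior point falls below the median and only the two end points are returned. Results on real hand-drawn input depend on the noise in the stroke and are not illustrated here.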

5.2.1.2 Top-Down

After the initial set of corners is selected by taking the shortest chords, some higher-level processing is executed to find missed corners and remove false corners. First, DDUSR checks whether each consecutive pair of corners passes a line test. Two consecutive corners, p_a and p_b, pass the line test if the Euclidean distance and the path length between them are approximately equal. This equality is represented by the ratio:

$t = \frac{\mathrm{Distance}\,|p_a, p_b|}{\mathrm{Path\ Length}\,(p_a, p_b)}$

where 0 < t ≤ 1, since the Euclidean distance between two points can never exceed the path length between them.

For the insertion of missed corners, a threshold T1 is set to 0.99. If T1 is not high enough, the algorithm may miss correct corners, as shown in Figure 5.2.

Figure 5.2 False Corner Found

If t between two consecutive corners, p_a and p_b, is less than T1, then the algorithm loops over the points between p_a and p_b in the original shape points. For each point p_c between p_a and p_b, if the absolute value of the turning angle (see 5.3.2 for details about the turning angle function) between line p_a, p_c and line p_c, p_b is greater than 0.09π, the perpendicular distance from p_c to line p_a, p_b is computed. Checking the turning angle here avoids adding unnecessary corner candidates. The point between p_a and p_b with the maximum perpendicular distance to line p_a, p_b is taken as the missed corner candidate. After a new corner is inserted, the program loops over the corners from the beginning again. This insertion process repeats until no more missed corners can be found. With this method, DDUSR becomes sensitive to obtuse angles.
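The line test and the insertion loop can be sketched in Java as follows. The thresholds 0.99 and 0.09π come from the text; everything else is an assumption made for illustration: the class name `MissedCornerInserter`, the `{x, y}` array representation of a point, corners stored as ascending indices into the point list, and the turning angle taken as the signed angle in radians between the two segments.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper class; every point is a {x, y} array, corners are point indices. */
final class MissedCornerInserter {
    static final double T1 = 0.99;                 // line-test threshold from the report
    static final double MIN_TURN = 0.09 * Math.PI; // turning-angle threshold from the report

    static double distance(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    static double perpDistance(double[] p, double[] a, double[] b) {
        double len = distance(a, b);
        if (len == 0.0) {
            return distance(p, a);
        }
        double cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]);
        return Math.abs(cross) / len;
    }

    /** Assumed definition: signed angle in radians between segments a->b and b->c. */
    static double turningAngle(double[] a, double[] b, double[] c) {
        double ux = b[0] - a[0], uy = b[1] - a[1];
        double vx = c[0] - b[0], vy = c[1] - b[1];
        return Math.atan2(ux * vy - uy * vx, ux * vx + uy * vy);
    }

    /** Ratio t: straight-line distance divided by path length between indices a and b. */
    static double lineRatio(List<double[]> p, int a, int b) {
        double path = 0.0;
        for (int i = a; i < b; i++) {
            path += distance(p.get(i), p.get(i + 1));
        }
        return distance(p.get(a), p.get(b)) / path;
    }

    /** Inserts missed corners until every consecutive pair passes the line test or has no candidate. */
    static List<Integer> insertMissed(List<double[]> p, List<Integer> corners) {
        List<Integer> result = new ArrayList<>(corners);
        boolean inserted = true;
        while (inserted) {
            inserted = false; // restart from the first pair after every insertion
            for (int j = 0; j + 1 < result.size() && !inserted; j++) {
                int a = result.get(j);
                int b = result.get(j + 1);
                if (b - a < 2 || lineRatio(p, a, b) >= T1) {
                    continue; // nothing in between, or the pair passes the line test
                }
                int best = -1;
                double bestDist = 0.0;
                for (int c = a + 1; c < b; c++) {
                    double turn = Math.abs(turningAngle(p.get(a), p.get(c), p.get(b)));
                    if (turn <= MIN_TURN) {
                        continue;
                    }
                    double dist = perpDistance(p.get(c), p.get(a), p.get(b));
                    if (dist > bestDist) {
                        bestDist = dist;
                        best = c;
                    }
                }
                if (best >= 0) {
                    result.add(j + 1, best);
                    inserted = true;
                }
            }
        }
        return result;
    }
}
```

The loop terminates because every insertion adds a new index strictly between two existing corners, and the number of such indices is finite.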

For the deletion of false corners, a threshold T2 is set to 0.95. For each point p_c from the second to the penultimate corner candidate, if t between p_{c-1} and p_{c+1} is no less than T2 and the absolute value of the turning angle at p_c is less than 0.13π, then p_c is a false corner and is removed. After an iteration is finished, if at least one corner was removed during that iteration, the loop starts from the beginning again. This deletion process repeats until no more false corners can be removed.
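The deletion rule can be sketched in Java as follows. The thresholds 0.95 and 0.13π come from the text. The class name `FalseCornerRemover`, the `{x, y}` array representation of a point, corners stored as ascending point indices, and the interpretation of "the turning angle at p_c" as the angle between the segments joining p_c to its neighbouring corners are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper class; every point is a {x, y} array, corners are point indices. */
final class FalseCornerRemover {
    static final double T2 = 0.95;                 // line-test threshold from the report
    static final double MAX_TURN = 0.13 * Math.PI; // turning-angle threshold from the report

    static double distance(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    /** Assumed definition: signed angle in radians between segments a->b and b->c. */
    static double turningAngle(double[] a, double[] b, double[] c) {
        double ux = b[0] - a[0], uy = b[1] - a[1];
        double vx = c[0] - b[0], vy = c[1] - b[1];
        return Math.atan2(ux * vy - uy * vx, ux * vx + uy * vy);
    }

    /** Ratio t: straight-line distance divided by path length between indices a and b. */
    static double lineRatio(List<double[]> p, int a, int b) {
        double path = 0.0;
        for (int i = a; i < b; i++) {
            path += distance(p.get(i), p.get(i + 1));
        }
        return distance(p.get(a), p.get(b)) / path;
    }

    /** Repeats full passes over the corner list until a pass removes nothing. */
    static List<Integer> removeFalse(List<double[]> p, List<Integer> corners) {
        List<Integer> result = new ArrayList<>(corners);
        boolean removed = true;
        while (removed) {
            removed = false;
            for (int j = 1; j + 1 < result.size(); j++) {
                int a = result.get(j - 1);
                int c = result.get(j);
                int b = result.get(j + 1);
                double turn = Math.abs(turningAngle(p.get(a), p.get(c), p.get(b)));
                if (lineRatio(p, a, b) >= T2 && turn < MAX_TURN) {
                    result.remove(j); // removes by position, not by value
                    removed = true;
                    j--;              // re-examine the corner that moved into position j
                }
            }
        }
        return result;
    }
}
```

The first and last corners are never examined, so the end points of the stroke always survive this step.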

A further deletion step removes false corners that are too close together, which occurs especially in small shapes or shapes drawn slowly. For each point p_i from the second to the last corner candidate, if the distance between the two consecutive corners p_{i-1} and p_i is less than 30, then the smaller of the turning angles at p_{i-1} and p_i is found, and if this smaller turning angle is less than 0.5π, the corresponding corner candidate is removed. A special case arises when p_i is either the second or the last corner candidate: here, p_i is removed if the turning angle at p_i is less than 0.5π. However, the point is removed regardless of its turning angle if the distance between the two consecutive corners p_{i-1} and p_i is less than 5% of the perimeter of the shape. After a corner candidate is removed, the loop starts from the beginning again. This deletion process repeats until no more false corners can be removed.

5.2.2 Stage Two

Stage two is essentially curve property identification. If any corner can still be added to a shape after stage one, the shape is recognised as a curve.

For each corner candidate pi from the second to the last, find the original shape point p between pi-1 and pi with the maximum perpendicular distance to line $\overline{p_{i-1} p_i}$. If this perpendicular distance is greater than a constant, 11, or the ratio of the perpendicular distance to the distance between pi-1 and pi is greater than 0.14, then p is a missed corner and is added. In addition, the curve property of the shape/stroke is set to true. Figure 5.3 shows an example of a corner finding result.

Figure 5.3 An example of corner finding result
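Stage two can be sketched in Java as below. This is a hedged illustration only: it assumes corners are indices into the stroke points and that each original corner pair is examined once in a single pass (the report does not say whether newly created segments are examined again). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch of stage two: add corners on bulging segments and flag the stroke as a curve. */
final class CurveStage {
    static final double MAX_DEVIATION = 11.0;       // constant factor from the text
    static final double MAX_DEVIATION_RATIO = 0.14; // ratio threshold from the text

    static double dist(double[] p, double[] q) {
        return Math.hypot(q[0] - p[0], q[1] - p[1]);
    }

    /** Perpendicular distance from p to the line through a and b. */
    static double perpDistance(double[] p, double[] a, double[] b) {
        double len = dist(a, b);
        if (len == 0.0) return dist(p, a);
        return Math.abs((b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])) / len;
    }

    /** Adds missed corners to the mutable list in place; returns the curve property. */
    static boolean addCurveCorners(double[][] pts, List<Integer> corners) {
        boolean curve = false;
        int k = 0;
        while (k + 1 < corners.size()) {
            int a = corners.get(k), b = corners.get(k + 1);
            int best = -1;
            double bestDist = 0.0;
            for (int c = a + 1; c < b; c++) {
                double d = perpDistance(pts[c], pts[a], pts[b]);
                if (d > bestDist) { bestDist = d; best = c; }
            }
            double chord = dist(pts[a], pts[b]);
            boolean bulges = bestDist > MAX_DEVIATION
                    || (chord > 0.0 && bestDist / chord > MAX_DEVIATION_RATIO);
            if (best >= 0 && bulges) {
                corners.add(k + 1, best);
                curve = true;
                k += 2; // skip the corner just added (single-pass assumption)
            } else {
                k++;
            }
        }
        return curve;
    }

    public static void main(String[] args) {
        // Semicircular arc of radius 50 sampled at 21 points.
        double[][] arc = new double[21][];
        for (int i = 0; i <= 20; i++) {
            double angle = Math.PI * i / 20;
            arc[i] = new double[] {50 * Math.cos(angle), 50 * Math.sin(angle)};
        }
        List<Integer> corners = new ArrayList<>(Arrays.asList(0, 20));
        System.out.println(addCurveCorners(arc, corners) + " " + corners);
    }
}
```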

5.3 Recognition

5.3.1 Open or Closed Shape Identification

In DDUSR, a simple test determines whether a shape is open or closed. If the ratio

$$r = \frac{\mathrm{Distance}\,|p_{\mathrm{first}}, p_{\mathrm{last}}|}{\mathrm{Perimeter\ of\ Shape}}$$

is less than an empirically determined constant, 0.16, the shape is regarded as a closed shape; otherwise, it is regarded as an open shape (Paulson & Hammond, 2008).
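The open or closed test reduces to a few lines of code. The Java sketch below is illustrative only: it assumes the perimeter is the total length of the drawn stroke, and the class and method names are hypothetical.

```java
/** Sketch of the open/closed test based on the ratio r. */
final class OpenClosedTest {
    static final double CLOSED_RATIO = 0.16; // empirical constant from the text

    /** Total length of the drawn stroke, used here as the perimeter (assumption). */
    static double strokeLength(double[][] pts) {
        double sum = 0.0;
        for (int i = 0; i + 1 < pts.length; i++) {
            sum += Math.hypot(pts[i + 1][0] - pts[i][0], pts[i + 1][1] - pts[i][1]);
        }
        return sum;
    }

    /** A stroke is closed when the gap between its endpoints is small relative to its length. */
    static boolean isClosed(double[][] pts) {
        double length = strokeLength(pts);
        if (length == 0.0) {
            return false; // degenerate stroke; treated as open here (assumption)
        }
        double[] first = pts[0];
        double[] last = pts[pts.length - 1];
        double gap = Math.hypot(last[0] - first[0], last[1] - first[1]);
        return gap / length < CLOSED_RATIO;
    }

    public static void main(String[] args) {
        double[][] box = {{0, 0}, {10, 0}, {10, 10}, {0, 10}, {0, 1}};
        double[][] line = {{0, 0}, {10, 0}};
        System.out.println("box closed:  " + isClosed(box));
        System.out.println("line closed: " + isClosed(line));
    }
}
```

For the box, the endpoint gap is 1 and the stroke length is 39, so r is about 0.026 and the shape is closed; for the straight line r = 1, so it is open.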

5.3.2 Turning Angle Function

The turning angle function provides a simple approach to shape recognition (McCreath, 2008). For a point pa that is not one of the endpoints, the turning angle is the angle through which the directed line $\overrightarrow{p_{a-1} p_a}$ rotates to reach the directed line $\overrightarrow{p_a p_{a+1}}$. When the rotation is clockwise, the turning angle is positive; otherwise, it is negative. However, instead of using the accumulated sum of turning angles from the starting point to a point p as the sign at p, DDUSR uses the current turning angle at p. For the starting point p0 (i.e. the first point), a line parallel to the x-axis is first drawn from p0, and the angle between this line and line $\overline{p_0 p_1}$ is computed. When line $\overline{p_0 p_1}$ is above the drawn line, the angle is positive; otherwise, it is negative. This angle is regarded as the turning angle at p0.

Figure 4.2 (a) Turning Angle (Non-endpoint) (b) Turning Angle (starting point)
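The turning angle can be computed from the cross and dot products of the two direction vectors. The Java sketch below is one possible reading of the definition, not DDUSR's code. It assumes screen coordinates with the y-axis pointing downwards, as in Java 2D, so that a visually clockwise turn is positive; it also assumes that the reference line at p0 points in the positive x direction. The class and method names are hypothetical.

```java
/** Sketch of the turning angle function in screen coordinates (y grows downwards). */
final class TurningAngle {

    /** Turning angle at interior point a, in (-pi, pi]; positive for a clockwise turn on screen. */
    static double at(double[][] pts, int a) {
        double ux = pts[a][0] - pts[a - 1][0];
        double uy = pts[a][1] - pts[a - 1][1];
        double vx = pts[a + 1][0] - pts[a][0];
        double vy = pts[a + 1][1] - pts[a][1];
        double cross = ux * vy - uy * vx;
        double dot = ux * vx + uy * vy;
        return Math.atan2(cross, dot);
    }

    /** Angle at the starting point p0; positive when p1 lies above p0 on screen. */
    static double atStart(double[][] pts) {
        double dx = pts[1][0] - pts[0][0];
        double up = pts[0][1] - pts[1][1]; // screen y is inverted, so "above" means smaller y
        return Math.atan2(up, dx);
    }

    public static void main(String[] args) {
        double[][] rightThenDown = {{0, 0}, {1, 0}, {1, 1}};
        double[][] rightThenUp = {{0, 0}, {1, 0}, {1, -1}};
        System.out.println(at(rightThenDown, 1) / Math.PI + " pi");
        System.out.println(at(rightThenUp, 1) / Math.PI + " pi");
        System.out.println(atStart(new double[][] {{0, 0}, {1, -1}}) / Math.PI + " pi");
    }
}
```

In a coordinate system with the y-axis pointing upwards, the sign of both results would be reversed.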

5.3.3 Shapes Testing

DDUSR uses the curve property, the open property, the total sum of turning angles at all points except the two endpoints, and the average turning angle to identify shape types. The algorithm is simple but can still recognise the defined shape types accurately. All constant factors were determined empirically by the author of this report.

5.3.3.1 Line

If a shape:

1) has only two corners (i.e. the two endpoints), and

2) is identified as an open shape,

it is a line. DDUSR presents a line from the starting point to the end point of the original shape. Figure 5.4 shows an example of “Line” recognition.

Figure 5.4 An example of “Line” Recognition

5.3.3.2 Arrow

A shape is recognised as an arrow if it:

1) is regarded as an open shape, and

2) has 4, 5 or 6 corners, and

3) has a corner pi at which the absolute value of the turning angle is greater than 0.6π, and

4) has a corner pi+1 at which the absolute value of the turning angle is greater than 0.88π, and

5) has a distance between pi and pi+2 of less than 30.

In DDUSR, an arrow is drawn as a polyline with five points. DDUSR takes pi-1 as the first point and pi as the second and fourth points, then computes the locations of the third and fifth points using trigonometric functions. Figure 5.5 shows an example of “Arrow” recognition.

Figure 5.5 An example of “Arrow” Recognition



5.3.3.3 DataStore

DataStore is a special shape for drawing data-flow diagrams. A shape is recognised as a DataStore if:

1) the number of corners is equal to 4, and

2) the sum of the turning angles, excluding the angles at the endpoints, is greater than 2.8 and less than 3.5 (i.e. in the range (0.9π, 1.1π)), and

3) the average turning angle is greater than 1.3 and less than 1.8 (i.e. in the range (0.4π, 0.6π)), and

4) the absolute value of the slope of line $\overline{p_0 p_1}$ is less than 1, and

5) the turning angles at the second and third points have the same sign (+ or –), and their sum is greater than 0.8π and less than 1.2π.

Like an arrow, a DataStore is drawn as a polyline, in this case one consisting of four points. Figure 5.6 shows an example of “DataStore” recognition.

Figure 5.6 An example of “DataStore” Recognition
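The five DataStore conditions can be combined into a single predicate. The Java sketch below is an interpretation, not the DDUSR implementation: it takes the four corner points directly, and it assumes that the sum and the average in conditions 2 and 3 are taken over the absolute values of the two interior turning angles and that the sum in condition 5 is compared by magnitude. All names are hypothetical.

```java
/** Sketch of the DataStore test on four corner points given as {x, y} pairs. */
final class DataStoreTest {

    /** Signed angle from direction a->c to direction c->b, in (-pi, pi]. */
    static double turn(double[] a, double[] c, double[] b) {
        double ux = c[0] - a[0], uy = c[1] - a[1];
        double vx = b[0] - c[0], vy = b[1] - c[1];
        return Math.atan2(ux * vy - uy * vx, ux * vx + uy * vy);
    }

    static boolean isDataStore(double[][] corners) {
        if (corners.length != 4) return false;                  // condition 1
        double t1 = turn(corners[0], corners[1], corners[2]);   // angle at the second corner
        double t2 = turn(corners[1], corners[2], corners[3]);   // angle at the third corner
        double sum = Math.abs(t1) + Math.abs(t2);               // assumption: absolute values
        double avg = sum / 2.0;
        double dx = corners[1][0] - corners[0][0];
        double dy = corners[1][1] - corners[0][1];
        boolean flatFirstSide = Math.abs(dy) < Math.abs(dx);    // |slope| < 1 without dividing
        boolean sameSign = t1 * t2 > 0.0;
        double pairSum = Math.abs(t1 + t2);                     // assumption: compared by magnitude
        return sum > 2.8 && sum < 3.5                           // condition 2
                && avg > 1.3 && avg < 1.8                       // condition 3
                && flatFirstSide                                // condition 4
                && sameSign                                     // condition 5
                && pairSum > 0.8 * Math.PI && pairSum < 1.2 * Math.PI;
    }

    public static void main(String[] args) {
        // Top side drawn right to left, then the left side, then the bottom side.
        double[][] store = {{100, 0}, {0, 0}, {0, 50}, {100, 50}};
        double[][] zigzag = {{0, 0}, {100, 0}, {100, 50}, {200, 50}};
        System.out.println("store:  " + isDataStore(store));
        System.out.println("zigzag: " + isDataStore(zigzag));
    }
}
```

The zigzag fails condition 5 because its two turns have opposite signs, even though each turn is a right angle.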

5.3.3.4 Rectangle or Square

A rectangle or a square should satisfy:

1) the number of corners is 5 or 6, and

2) the shape is regarded as a closed shape, and

3) the sum of the turning angles at all corners except the endpoints is greater than 4 and less than 7 (i.e. approximately in the range (1.3π, 2.2π)), and

5) the average turning angle is greater than 1.3 and less than 2 (i.e. in the range (0.4π, 0.64π)), and

6) among all corners except the endpoints, there are at least 3 right angles, where a corner counts as a right angle if its turning angle is in the range (0.4π, 0.6π).

7) If, in addition, the shape has two equal adjacent sides (the ratio of these two sides should be less than 0.8), it is recognised as a square; otherwise, it is recognised as a rectangle.

Figure 5.7 shows an example of “Square” and “Rectangle” recognition.

Figure 5.7 An example of “Square” and “Rectangle” Recognition
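The shared rectangle and square conditions (corner count, closedness, angle sum, average angle and right-angle count) can be sketched in Java as follows. This is a hedged illustration, not DDUSR's code: it assumes the corner list includes both endpoints, that the open or closed flag has already been computed, and that angle magnitudes are used. The final square-versus-rectangle decision is left out because the side ratio is not fully specified in the text. All names are hypothetical.

```java
/** Sketch of the conditions shared by "Rectangle" and "Square". */
final class RectangleTest {

    /** Signed angle from direction a->c to direction c->b, in (-pi, pi]. */
    static double turn(double[] a, double[] c, double[] b) {
        double ux = c[0] - a[0], uy = c[1] - a[1];
        double vx = b[0] - c[0], vy = b[1] - c[1];
        return Math.atan2(ux * vy - uy * vx, ux * vx + uy * vy);
    }

    /** corners includes both endpoints; closed is the result of the open/closed test. */
    static boolean isRectangleLike(double[][] corners, boolean closed) {
        int n = corners.length;
        if (!closed || (n != 5 && n != 6)) return false;
        double sum = 0.0;
        int rightAngles = 0;
        for (int i = 1; i < n - 1; i++) {
            double t = Math.abs(turn(corners[i - 1], corners[i], corners[i + 1]));
            sum += t;
            if (t > 0.4 * Math.PI && t < 0.6 * Math.PI) rightAngles++;
        }
        double avg = sum / (n - 2); // average over the interior corners
        return sum > 4.0 && sum < 7.0 && avg > 1.3 && avg < 2.0 && rightAngles >= 3;
    }

    public static void main(String[] args) {
        // Rectangle stroke that ends just short of its starting point.
        double[][] rect = {{0, 0}, {100, 0}, {100, 50}, {0, 50}, {0, 2}};
        double[][] triangle = {{0, 0}, {100, 0}, {50, 80}, {2, 2}};
        System.out.println("rect:     " + isRectangleLike(rect, true));
        System.out.println("triangle: " + isRectangleLike(triangle, true));
    }
}
```

With five corner candidates there are three interior corners, so an ideal rectangle has an angle sum of 1.5π (about 4.71) and an average of 0.5π, both inside the stated ranges.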

5.3.3.5 Parallelogram

In DDUSR, a parallelogram is drawn as a polygon. To be recognised as a parallelogram, a shape should satisfy the following conditions:

1) the number of corners is 5, and

2) it is identified as a closed shape, and

3) the sum of the turning angles at all corners except the endpoints is greater than 4 and less than 7 (i.e. approximately in the range (1.3π, 2.2π)), and

5) the sum of the turning angles at the 2nd and the 3rd corners is greater than 0.8π and less than 1.2π, and

6) the ratio of the turning angles at the 2nd and the 4th corners is greater than 0.9.



Figure 5.8 shows an example of “Parallelogram” recognition.

Figure 5.8 An example of “Parallelogram” Recognition

5.3.3.6 Polygon

The polygon type covers all other closed, non-curve shapes. Figure 5.9 shows an example of “Polygon” recognition.

Figure 5.9 An example of “Polygon” Recognition

5.3.3.7 Ellipse or Circle

An ellipse or a circle should satisfy the following conditions:

1) the number of corners is more than 5, and

2) the shape is marked as a closed shape, and

3) the sum of the turning angles at all corners except the endpoints is greater than 4 and less than 8.5 (i.e. in the range (1.3π, 2.7π)), and

5) the longest distance between corners is taken as the long axis of the ellipse, and the length of the short axis is then worked out from the ellipse perimeter formula.

If the ratio of the short axis to the long axis is greater than 0.9, the shape is recognised as a circle; otherwise, it is regarded as an ellipse. Figure 5.10 shows an example of “Circle” and “Ellipse” recognition.

Figure 5.10 An example of “Circle” and “Ellipse” Recognition
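The report does not state which ellipse perimeter formula is used to recover the short axis. The Java sketch below therefore assumes the simple approximation P ≈ π(A + B)/2 for full axis lengths A and B, which gives B ≈ 2P/π − A; a more accurate approximation could be substituted without changing the structure. The class and method names are hypothetical.

```java
/** Sketch of the circle/ellipse decision from the perimeter and the long axis. */
final class EllipseAxes {
    static final double CIRCLE_RATIO = 0.9; // threshold from the text

    /** Longest distance between any two corner points, taken as the long axis. */
    static double longAxis(double[][] corners) {
        double longest = 0.0;
        for (int i = 0; i < corners.length; i++) {
            for (int j = i + 1; j < corners.length; j++) {
                double d = Math.hypot(corners[j][0] - corners[i][0],
                                      corners[j][1] - corners[i][1]);
                longest = Math.max(longest, d);
            }
        }
        return longest;
    }

    /** Short axis B from P = pi * (A + B) / 2, an assumed approximation. */
    static double shortAxis(double perimeter, double longAxis) {
        return Math.max(0.0, 2.0 * perimeter / Math.PI - longAxis);
    }

    /** True when the short-to-long axis ratio exceeds the circle threshold. */
    static boolean isCircle(double perimeter, double longAxis) {
        return longAxis > 0.0 && shortAxis(perimeter, longAxis) / longAxis > CIRCLE_RATIO;
    }

    public static void main(String[] args) {
        double circlePerimeter = Math.PI * 100;  // circle of diameter 100
        double ellipsePerimeter = Math.PI * 75;  // axes 100 and 50 under the assumed formula
        System.out.println("circle:  " + isCircle(circlePerimeter, 100));
        System.out.println("ellipse: " + isCircle(ellipsePerimeter, 100));
    }
}
```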

5.3.3.8 CubicCurve

A shape is recognised as a CubicCurve if the following conditions are met:

1) the number of corners is greater than 3 and less than 12, and

2) the shape is tagged as an open and curve shape, and

3) the sum of the turning angles at all corners except the endpoints is less than 1.5, and

4) the average turning angle is less than 0.6, and

5) from the 3rd corner onwards, at least one corner has a turning angle whose sign (+ or –) differs from that at the 2nd corner.

Figure 5.11 shows an example of “CubicCurve” recognition.



Figure 5.11 An example of “CubicCurve” Recognition

5.3.3.9 QuadCurve

In DDUSR, all arcs are simply represented as a QuadCurve, rather than as part of a circle or an ellipse as in other applications. A shape is recognised as a QuadCurve if the following conditions are met:

1) the number of corners is in the range [3, 9], and

2) the shape is tagged as an open and curve shape, and

3) the sum of the turning angles at all corners except the endpoints is less than 1.5, and

4) the average turning angle is less than 1.5.

Figure 5.12 shows an example of “QuadCurve” recognition.

Figure 5.12 An example of “QuadCurve” Recognition

5.3.3.10 Polyline

A shape that does not pass any of the above shape tests is recognised as a polyline. Figure 5.13 shows an example of “Polyline” recognition.



Figure 5.13 An example of “Polyline” Recognition



6. Testing and Evaluation

The system was tested in two ways: the first counted the corners the system found correctly, and the second counted the shapes it recognised correctly. Both sets of data were collected from 600 shapes drawn by 5 different users. Each user drew the 12 types of shapes that the system can recognise (as shown in Figure 4.2), 10 times per shape type and with different sizes. The polygons and polylines were drawn arbitrarily, with no restriction on how complex or simple they were. In the first test, corners found in a shape were counted as correct only when they could be used for further recognition. In the second test, top-correct interpretations were not considered; that is, the test considered only whether the shape type was correctly recognised, not the precision of the presented shape. The results are shown in Table 6.1 below.

Table 6.1 Testing Result

Shape          Correct Corners  Total Corners  Correct Shapes  Total Shapes
Arrow          252              252            48              50
Line           100              100            50              50
Circle         594              594            39              50
Ellipse        583              583            41              50
Square         275              275            50              50
Rectangle      275              275            50              50
Parallelogram  250              250            50              50
DataStore      200              200            50              50
QuadCurve      269              269            48              50
CubicCurve     287              287            44              50
Polygon        608              623            47              50
Polyline       589              614            50              50
Total          4282             4322           567             600
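The overall accuracies implied by the totals in Table 6.1 can be checked with a short calculation. The Java sketch below only restates arithmetic on the table's own figures; the class and method names are hypothetical.

```java
/** Sketch: overall accuracies derived from the totals in Table 6.1. */
final class AccuracySummary {

    /** Fraction of correct items, as a value between 0 and 1. */
    static double accuracy(int correct, int total) {
        return (double) correct / total;
    }

    public static void main(String[] args) {
        double corners = accuracy(4282, 4322); // correct corners / total corners
        double shapes = accuracy(567, 600);    // correct shapes / total shapes
        double circles = accuracy(39, 50);     // lowest per-shape recognition rate in the table
        System.out.printf("corner finding:    %.2f%%%n", 100 * corners);
        System.out.printf("shape recognition: %.2f%%%n", 100 * shapes);
        System.out.printf("circle only:       %.2f%%%n", 100 * circles);
    }
}
```

The totals correspond to about 99.07% of corners found correctly and 94.5% of shapes recognised correctly, with circles (78%) and ellipses (82%) accounting for most of the recognition errors.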



Due to time limitations, the author did not implement earlier algorithms for comparison. In the test above, the corner finding algorithm found 4282 of 4322 corners correctly (about 99.1%), in both small and large shapes. This suggests that it is competitive with other current corner finding algorithms, such as ShortStraw. Wolin and Hammond (2008) report that ShortStraw has a high accuracy, 0.979, while the algorithms developed by Sezgin et al. (2001) and Kim & Kim (2006) achieve 0.824 and 0.790 respectively. The corner finding algorithm in DDUSR is developed from ShortStraw and overcomes some of its disadvantages. Therefore, in theory, it should achieve relatively high accuracy even under the same testing environment as the other algorithms.

For shape recognition, there is still room for improvement in DDUSR, which recognised 567 of the 600 test shapes correctly (94.5%). PaleoSketch, developed by Paulson and Hammond (2008), is reported to achieve very high accuracies: 99.89% correct recognition and 98.56% top-correct interpretation. The recognition algorithm in DDUSR is still at a simple and rough stage. For example, curve property identification could be improved, because a circle or an ellipse is sometimes recognised as a (non-curved) polygon. Other property determination approaches could also be implemented alongside the turning angle function.



7. Conclusion

This report first introduced the aims, scope and limitations of the project. It then gave the background of sketch recognition and the requirements, design and modelling of the project. Finally, it described the detailed implementation of the project and the testing results.

To sum up, the project was conducted step by step on schedule, and the application is executable and usable. Although there is room for improvement in the system, DDUSR can already be used when users need to quickly draw diagrams containing simple shapes, or when they “brainstorm” their designs.

Further work could improve the program in the following ways:

1. Improve the recognition algorithm, for example the curve property identification (which might also require modifying the corner finding algorithm) and other property determination approaches.

2. Identify more shapes to be recognised, using geometric characteristics or other features.

3. Add more functions for ease of use and customisation, such as shape union, shape separation, and shape outline styles.

4. Support multiple-stroke drawing, by measuring the time interval between two consecutively drawn strokes or by other approaches.



References

Hammond, Tracy, Brian Eoff, Brandon Paulson, Aaron Wolin, Katie Dahmen, Joshua Johnston & Pankaj Rajan. Free-Sketch Recognition: Putting the CHI in Sketching. CHI 2008, April 5 – April 10 2008, Florence, Italy.

Hammond, Tracy & R. Davis. LADDER, A Sketching Language for User Interface

Developers. Computers & Graphics. 2005. 29, 4, 518-532.

Kim, D.H. and Kim, M.J. A Curvature Estimation for Pen Input Segmentation in

Sketch-based Modeling. Computer-Aided Design. 2006. 238-248.

McCreath, Eric. Partial Matching of Planar Polygons Under Translation and Rotation.

Proceedings of the 20th Annual Canadian Conference on Computational Geometry.

August 13-15, 2008 McGill University, Montreal, Quebec.

Oracle. The Java™ Tutorials. 2010. http://download.oracle.com/javase/tutorial.

Paulson, Brandon & Tracy Hammond. PaleoSketch: Accurate Primitive Sketch

Recognition and Beautification. Proceedings of the 13th international conference on

Intelligent user interfaces. 2008. ACM, New York, USA. 1 – 10.

Teh, C.H. & R.T. Chin. On The Detection of Dominant Points on Digital Curves. IEEE Trans. Pattern Anal. Mach. Intell. 1989. 11, 8, 859–872.

Wobbrock, J.O., A.D. Wilson & Y. Li. Gestures Without Libraries, Toolkits or Training:

A $1 Recognizer for User Interface Prototypes. UIST’07: Proceedings of the 20th

Annual ACM Symposium on User Interface Software and Technology. 2007. ACM,

New York, USA. 159–168.

Wolin, Aaron & Tracy Hammond. ShortStraw: A Simple and Effective Corner Finder

for Polylines. EUROGRAPHICS 5th Annual Workshop on Sketch-Based Interfaces and

Modeling. 2008. 33 – 40.




Rubine, D. Specifying Gestures by Example. Proc. of the 18th Annual Conference on

Computer Graphics and Interactive Techniques, ACM Press (1991), 329-337.

Sezgin, T.M., Stahovich, T. and Davis, R. Sketch Based Interfaces: Early Processing

for Sketch Understanding. Proc. of the 2001 Workshop on Perceptive User Interfaces,

ACM Press (2001), 1-8.

Sutherland, I.E. Sketch Pad: A Man-Machine Graphical Communication System. Proc.

of the SHARE Design Automation Workshop, ACM Press (1964), 6.329-6.346.
