
JOURNAL OF INFORMATION SYSTEMS & OPERATIONS MANAGEMENT


EDUCATIONAL ROBOTS

Gabriela Bajenaru 1*

Ileana Vucicovici 2

Horea Caramizaru 3

Gabriel Ionescu 4

Costin-Anton Boiangiu 5

ABSTRACT

This paper describes a prototype designed to automate the movement of a robot among objects with certain features. It is a first step in the possible development of an educational robot: a platform on which students can develop more advanced methods for the automatic guidance of a car in traffic, for the detection and recognition of important objects (signs, both on boards and on the road, traffic lights, pedestrians, other cars, etc.), and for decisions based on the significance of objects and their position. The benefit is that students work with a real robot, not a simulation, with many of the necessary modules already implemented and ready to use.

KEYWORDS: education, robot, computer vision, object detection, cascade training, cascade classifier

1. INTRODUCTION

This paper aims to give students a starting point for building a real-life guidance method for a car in an open-space environment, one that moves depending on the positions of nearby objects. A purely software solution cannot be ideal, because the physical processes that cannot be simulated are precisely the ones that generate the imperfections leading to misclassification of the environment. Even recordings of the environment are not sufficient, as the vehicle expects the environment to change with its own movement.

The robot must offer a starting point and a general interface that can be used for developing complex environment-interpretation algorithms and guidance strategies. The basic modules for movement and recognition must be implemented, so that a student who wants to focus on a single algorithm can use the pre-implemented methods for the rest of the tasks. The robot carries either a single web camera on its front, with decisions taken strictly from what it sees at a given moment in time, or

stereoscopic cameras used for 3D environment building. Detection is done in real time, using an algorithm based on Viola-Jones [1].

1* Corresponding author. Engineer, [email protected], "Politehnica" University of Bucharest, 060042 Bucharest, Romania
2 Engineer, [email protected], "Politehnica" University of Bucharest, 060042 Bucharest, Romania
3 Engineer, [email protected], "Politehnica" University of Bucharest, 060042 Bucharest, Romania
4 Engineer, [email protected], "Politehnica" University of Bucharest, 060042 Bucharest, Romania
5 Associate Professor PhD Eng., [email protected], "Politehnica" University of Bucharest, 060042 Bucharest, Romania

2. THE CONSTRUCTION OF THE ROBOT

Hardware

The robot consists of two parts: a mechanical part (a two-level chassis, two stepper motors connected to hardware drivers, and an actuator) and an electronic part controlled by a BeagleBoard (an embedded device with 1 GB of RAM and an ARM Cortex-A8 processor, capable of running a Linux operating system), two video cameras and an extra Arduino Mega board.

Figure 1. Robot photos

Software

The recognition-control-command process starts from the video cameras, which transmit images to the BeagleBoard. The images are processed with OpenCV and then, depending on the result, a command is transmitted to the Arduino board over a USB connection, through a pipe on a fixed port. The Arduino reads the information at a fixed interval and executes one of two commands: translation or rotation.
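To make this command path concrete, the following minimal sketch shows how the BeagleBoard side might write commands to the Arduino. The pyserial library, the one-byte protocol ('T' for translation, 'R' for rotation), the device path and the baud rate are all illustrative assumptions, not details taken from the paper.

# Minimal sketch of the BeagleBoard -> Arduino command path.
# Assumed (not from the paper): pyserial, a one-byte protocol
# ('T' = translate, 'R' = rotate), the device path and baud rate.
import serial

def send_command(port, command):
    # Write a single-character movement command to the Arduino.
    assert command in ("T", "R"), "hypothetical protocol: T or R only"
    port.write(command.encode("ascii"))

# The Arduino enumerates as a USB-serial device on the BeagleBoard.
with serial.Serial("/dev/ttyUSB0", 9600, timeout=1) as arduino:
    send_command(arduino, "T")  # translate (move forward)
    send_command(arduino, "R")  # rotate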

3. DETECTION OF AN OBJECT IN OPENCV

Object detection is realized with a tool implemented in OpenCV. The initial algorithm was proposed by Paul Viola and Michael Jones and later improved by Rainer Lienhart.


First, a classifier (actually a cascade of boosted classifiers working with Haar features) is trained with a few hundred images, named positive examples, that are scaled to the same size (30x30), together with a number of arbitrary images of the same size constituting negative examples.

After the classifier is trained, it can be applied to a region of interest in an input image. The classifier outputs "1" if the region is likely to contain the object, and "0" otherwise. To search for the object in the entire image, the search window is moved across the image, checking every location with the classifier.

The classifier is specially designed so that it can be easily resized in order to find objects of interest at different sizes, which is more efficient than resizing the image itself. Thus, to find an object of unknown size, the scanning procedure is performed repeatedly at different scales. In the following paragraphs, every step of the detection algorithm is explained in detail.
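In modern OpenCV this multi-scale scan is wrapped by the CascadeClassifier interface; a minimal usage sketch follows, in which the cascade file name and input image are placeholders for any trained classifier and frame.

# Multi-scale sliding-window detection with a trained cascade.
# "cone_cascade.xml" and "frame.jpg" are placeholders.
import cv2

cascade = cv2.CascadeClassifier("cone_cascade.xml")
image = cv2.imread("frame.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# scaleFactor=1.1 grows the search window by 10% per scale,
# matching the repeated scanning procedure described above.
for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=3):
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)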

Features

Figure 2. Simple features used for classification

The object detection procedure classifies images based on the values of simple features. There are many motivations for using features instead of raw pixels. The most important is that features can encode domain knowledge that would be difficult to learn from a limited quantity of training data. In addition, there is a critical performance reason: a feature-based system operates much faster than a pixel-based one.

The features used are similar to Haar basis functions. More precisely, three kinds of features are used. The value of a two-rectangle feature is the difference between the sums of the pixels within two rectangular regions. The regions have the same size and shape and are horizontally or vertically adjacent (Fig. 2).


A three-rectangle feature computes the sum of the pixels in the two outer rectangles and then the difference between this sum and the sum of the central rectangle.

Finally, a four-rectangle feature computes the difference between two diagonal pairs of rectangles.
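Such features can be evaluated in constant time using the integral image, in which every rectangle sum costs only four array lookups. The sketch below computes a two-rectangle feature this way; the rectangle coordinates are arbitrary examples.

# Evaluating a two-rectangle Haar feature via the integral image.
import cv2

def rect_sum(ii, x, y, w, h):
    # Sum of pixels in the rectangle with top-left (x, y) and size w x h;
    # cv2.integral pads the integral image with a leading row and column.
    return int(ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x])

gray = cv2.imread("frame.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder image
ii = cv2.integral(gray)

# Two-rectangle feature: difference between two horizontally adjacent
# regions of identical size (coordinates chosen arbitrarily).
feature_value = rect_sum(ii, 10, 10, 8, 16) - rect_sum(ii, 18, 10, 8, 16)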

Learning Classification Functions

Given a set of features and a set of positive and negative training images, a vast number of machine-learning approaches could be used. In the Viola-Jones system, a variant of AdaBoost is used both to select features and to train the classifier.

In its original form, the AdaBoost learning algorithm is used to boost the performance of a simple learning algorithm, by combining several weak classification functions into a stronger classifier. In boosting terminology, the simple learning algorithm is called a weak learner. For example, a perceptron learning algorithm searches over the set of possible perceptrons and returns the perceptron with the lowest classification error. After the first learning round, the weights of the examples are re-evaluated in order to emphasize those that were misclassified by the weak classifier.

A weak classifier is restricted to a set of classification functions, each depending on a single feature. For this reason, the weak learner is designed to select the single rectangle feature that best separates the positive from the negative examples. For each feature, the weak learner determines the optimal threshold, so that the number of misclassified examples is minimized.
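Such a single-feature weak classifier is essentially a decision stump: one feature value compared against a threshold, with a polarity. A brute-force sketch of choosing the best threshold and polarity for one feature under example weights is given below; the function name and interface are ours, for illustration only.

# Weak classifier = decision stump on one Haar feature. Given weighted
# examples, pick the threshold/polarity minimizing the weighted error.
import numpy as np

def best_stump(values, labels, weights):
    # values: feature value per example; labels in {+1, -1}.
    best = (np.inf, None, None)  # (error, threshold, polarity)
    for threshold in values:
        for polarity in (+1, -1):
            # Predict positive when polarity * value < polarity * threshold.
            pred = np.where(polarity * values < polarity * threshold, 1, -1)
            error = weights[pred != labels].sum()
            if error < best[0]:
                best = (error, threshold, polarity)
    return best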

Construction of the Cascade

This section describes an algorithm for constructing a cascade of classifiers that achieves high detection performance while reducing computation time. The key insight is that smaller, and therefore more efficient, boosted classifiers can be built which reject many negative sub-windows while detecting almost all positive ones. Simple classifiers are used to reject the majority of sub-windows before more complex classifiers are called upon to achieve a low false positive rate.

Cascade stages are built by training classifiers with AdaBoost. Starting from a strong classifier with two features, an effective object filter can be obtained by adjusting the threshold of the strong classifier to minimize false negatives. The initial AdaBoost threshold is designed to produce a low error rate on the training data. A lower threshold yields a higher detection rate, but also a higher false positive rate.


The detection performance of a two-feature classifier is far from acceptable for an object detection system. However, the classifier can significantly reduce the number of sub-windows that require further processing, with just a few operations:

1. Evaluate the rectangle features;
2. Compute the weak classifier for every feature;
3. Combine the weak classifiers.

The complete form of the detection process is a degenerate decision tree, called a "cascade" (Fig. 3). A positive result from the first classifier triggers the evaluation of a second classifier, which has also been adjusted to achieve high detection rates. A positive result from the second classifier triggers a third, and so on. A negative result at any point leads to the immediate rejection of the sub-window.

Figure 3. The cascade methods

The cascade structure reflects the fact that within any single image an overwhelming majority of sub-windows are negative. The cascade therefore attempts to reject as many negatives as possible in the earliest stages. A positive instance triggers the evaluation of every classifier in the cascade, but this is a rare event; the sketch below summarizes this early-rejection behaviour.
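In code, assuming each stage exposes a boosted score and an adjusted threshold (hypothetical names, not an OpenCV API), the cascade reads:

# Cascade evaluation as a degenerate decision tree: a sub-window
# survives only while every stage accepts it; the first negative
# answer rejects it immediately.
def cascade_accepts(stages, window):
    for stage in stages:
        # stage.score and stage.threshold are hypothetical names for
        # the boosted stage sum and its adjusted threshold.
        if stage.score(window) < stage.threshold:
            return False  # early rejection: most negatives stop here
    return True  # rare event: every stage passed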

Cascade Training

The overall training process involves two types of trade-offs. In most cases, classifiers with more features achieve higher detection rates and fewer false positives; at the same time, they require more computation time. In principle, an optimization framework could be defined that trades off:

- the number of classifier stages,
- the number of features in every stage,
- the threshold of every stage.


Each layer of the cascade is trained with AdaBoost. The number of features used for training is increased until the target detection rate and false positive rate are met for that level; the rates are determined by testing the current detector on a validation set. If the overall false positive target is not yet met, another layer is added to the cascade. The negative set for training the following layers is obtained by collecting all the false detections produced by running the current detector on a set of images that do not contain instances of the object to be detected.
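Schematically, this layer-adding loop can be written as follows; train_layer and mine_negatives are abstract caller-supplied callables standing for the boosting and bootstrapping steps described above, not OpenCV functions.

# Schematic cascade training loop, paraphrasing the text above.
def train_cascade(train_layer, mine_negatives, target_fp_rate, layer_fp_rate):
    cascade = []
    negatives = mine_negatives(cascade)  # initial negative set
    overall_fp = 1.0
    while overall_fp > target_fp_rate:
        # Grow one boosted layer until it meets the per-level hit-rate
        # and false-alarm targets on a validation set.
        cascade.append(train_layer(negatives))
        overall_fp *= layer_fp_rate
        # Negatives for the next layer are the false detections of the
        # current cascade on images that do not contain the object.
        negatives = mine_negatives(cascade)
    return cascade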

4. CREATING EXAMPLES AND TRAINING THE DETECTION SYSTEM

OpenCV HaarTraining

The OpenCV library offers a very interesting face detection demonstration. The same programs (functions) used to build the face classifier can create classifiers for any object; thus, with HaarTraining we can create our own object classifiers. However, we could not reproduce how the OpenCV developers performed the HaarTraining for face detection, because they never published details such as the type of images and the parameters used for training.

The objective of this work is to use the face detection machinery for detecting other objects, such as a simple cone, striped cones or a tennis ball, and to analyze which positive and negative images are ideal for a correct training.

Positive Pictures

Figure 4. Images used for collecting positive samples


We need to collect positive images that contain only the objects of interest; for example, we wanted to detect a simple white cone and a cone with a pattern. The detection is based on shape, not color. In [2] it is mentioned that face detection training used 5000 positive frontal-face samples and another 5000 derived from 1000 distorted pictures. We describe how to increase the number of samples in a later section.

Some of the positive pictures were obtained by filming the objects of interest against different backgrounds, under different lighting conditions and from different angles and perspectives. We then saved the frames containing the objects in JPG format and cropped them so that they show only the object.

Negative Pictures - Background

We also need to collect negative images, which do not contain the objects of interest. They compose the backgrounds onto which positive pictures, representing only an object, can be pasted. For example, Kuranov et al. [2] state that they used 3000 negative images when training face detection.

We found at [14] sets of negative pictures containing about 3500 images. This collection, however, was built for eye detection and includes faces in some of the pictures. We therefore eliminated all suspect images, resulting in 2000 pictures, and then added the background images containing the objects of interest (taken from the filming).

Cutting Positive Pictures

In order to collect positive images, we had to crop many images manually. We modified some of them: we mirrored them and changed the contrast, brightness and lighting.


Figure 5. Examples of positive samples

Creating Samples for Training

Samples can be created with the utility tool "createsamples.exe". This section describes the functionality of this tool.

Creating Samples for Training – from One Picture

The first function of "createsamples.exe" creates training samples from one cropped image, by applying distortions. This function (cvhaartraining.cpp - cvCreateTrainingSamples) is launched with the options -img, -bg and -vec:

-img <one_positive_image>
-bg <collection_file_of_negatives>
-vec <name_of_output_file_containing_generated_samples>

For example, this command:

createsamples.exe -img cone1\6.jpg -num 200 -vec cone1\ss6.vec -bg cone1\negatives.txt -maxxangle 0.6 -maxyangle 0.1 -maxzangle 0.3 -maxidev 100 -bgcolor 0 -bgthresh 0 -w 18 -h 24

will generate 200 distorted samples from our 18x24 picture, rotating the image around the x, y and z axes, with a maximal intensity deviation of 100 and with black (bgcolor 0) treated as transparent, pasting the object onto the negative pictures listed in the "negatives.txt" file.

A collection of positive pictures can be used as input, so that many more pictures are generated: a vast quantity of samples will be produced, with the object rotated, scaled and translated into different zones of the negative pictures.


Figure 6. Artificial positive samples

Creating Samples for Training – from More Than One Picture

The second function creates training samples from more than one image, without applying distortions. It is launched with the options -info and -vec (cvhaartraining.cpp - cvCreateTestSamples):

createsamples.exe -info samples.txt -vec samples.vec -w 18 -h 24

This generates samples without applying distortions, so the function may be regarded as a file format conversion.

Each line of the info file describes one picture and the zones in which the objects are placed, with (x, y) the coordinates of the upper-left corner of each object, relative to the image origin (0, 0):

[filename] [# of objects] [[x y width height] ... [... 2nd object] ...]

Img/img1.jpg 1 140 100 45 45
Img/img2.jpg 2 100 200 50 50 50 30 25 25

The -num option is used only to restrict the number of generated samples, not to increase the number of distortions applied to the samples.

Creating Training Samples

In order to create a full set of samples for a proper training, the .vec files generated by the first createsamples function (from one image) must be combined with those produced by the second function (from more than one image). For this we created a special program capable of merging several .vec files: it concatenates all the .vec files listed in <collection_file_of_vecs> into a single .vec file <output_vec_name> with the dimensions w, h; a sketch of such a merger is given below.
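Our merger is not distributed with OpenCV, but its logic is simple. The sketch below assumes the commonly documented .vec layout (a 12-byte header holding an int32 sample count, an int32 sample size and two unused int16 fields, followed by each sample as one zero byte plus sample-size int16 pixel values); verify this against your OpenCV version before relying on it.

# Sketch of a .vec merger under the layout assumption stated above.
import struct
import sys

def merge_vec(input_paths, output_path):
    total, sample_size, bodies = 0, None, []
    for path in input_paths:
        with open(path, "rb") as f:
            count, size, _, _ = struct.unpack("<iihh", f.read(12))
            if sample_size is None:
                sample_size = size
            assert size == sample_size, "all .vec files must share the same w*h"
            bodies.append(f.read())  # count * (1 + 2 * size) bytes
            total += count
    with open(output_path, "wb") as f:
        f.write(struct.pack("<iihh", total, sample_size, 0, 0))
        for body in bodies:
            f.write(body)

if __name__ == "__main__":
    merge_vec(sys.argv[1:-1], sys.argv[-1])  # usage: merge.py in1.vec in2.vec out.vec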

The Training

Now we can train our classifier on our own images, using the HaarTraining utility tool. We created samples of size 18x24 (matching the rectangular shape that surrounds the cone) and of size 20x20 (matching the square shape surrounding a ball), with nsplits = 2, nstages = 20, minhitrate = 0.9999 (default: 0.995), maxfalsealarm = 0.5 and weighttrimming = 0.95.

The "-nonsym" option is used when the object class has no vertical (left-right) symmetry.

"-mode ALL" uses the extended set of Haar-like features. The default setting is BASIC, which uses only upright features, while ALL uses the full set of upright and 45-degree rotated features.

"-mem 1024" is the memory available, in MB, for precalculation. The default value is 200 MB and the maximum is 2 GB, because a 32-bit CPU has a 4 GB address-space limit (2^32 = 4 GB), of which only 2 GB is usable on Windows.
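Putting the options above together, a full training invocation for the 18x24 cone classifier might look as follows; the sample counts and file names are illustrative, not taken from the paper:

haartraining.exe -data cone_cascade -vec samples.vec -bg negatives.txt -npos 2000 -nneg 2000 -nstages 20 -nsplits 2 -minhitrate 0.9999 -maxfalsealarm 0.5 -weighttrimming 0.95 -mem 1024 -nonsym -mode ALL -w 18 -h 24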

Testing and Analyzing the Performances

During detection, a sliding window is moved, pixel by pixel, over the image at every scale. Starting from the original scale, the features are enlarged by 10% or 20% per step until the scanning window exceeds the dimensions of the image.

"Hits" is the number of correct detections. "Missed" is the number of missed detections, or false negatives (the object exists, but the detector failed to find it). "False" is the number of false alarms, or false positives (the object does not exist, but the detector reported it as real).

Figure 7. Testing the detection of the learned objects

5. THE DEPTH PERCEIVED IN STEREOSCOPIC IMAGES

People and animals able to focus both eyes on a single object are capable of stereoscopic sight, which is fundamental for depth perception. The principle lies in presenting an image from two slightly different angles [12], which the brain fuses into a single 3D image.

The Focal Length – Pixel Relation

This is the relation between the focal length and the pixel size (f/p); it connects the image formed on the camera sensor to the real scene in the external environment. It is expressed in pixels:

[f] / [p] = [pixels]

where f stands for the focal length and p for the size of each pixel.

Figure 8. The real object AB is projected in the image CD

AB – reference line, of size d1
CD – the projection of the line AB on the lens, of size d2
O – optical center
d – horizontal distance between O and AB
f – horizontal distance between O and CD
n – number of pixels covered by the projection
p – pixel size

From the similar triangles OAB and OCD:

d2 / d1 = f / d

and, since the projection spans n pixels, d2 = n * p, so:

n * p / d1 = f / d  =>  f / p = n * d / d1

The measurement units are: [f] / [p] = [n] * [d] / [d1] = [pixels] * [millimeters] / [millimeters] = [pixels].

Considering xp the measure in pixels (n) and xm the projection in millimeters (d1):

f / p = xp * d / xm  =>  xp = (f / p) * (xm / d),  xm = xp * (p / f) * d

Depth Estimation

- We consider two cameras sharing a common viewing zone;
- We consider an object of size O in that zone;
- L stands for the size, in cm, of the object's projection in the left image and R in the right image;
- D is the given distance between the two cameras;
- Depth calculation is possible only when the two images are overlapped;
- The image is mirrored;
- The right image is overlapped onto the left image.

Figure 9. Using two images to calculate the depth estimation

D – distance between the two cameras

O – object size

L – left projection

R – right projection

d – distance between focal point and object

f – distance between focal point and camera plane

Triangle ABO is similar to triangle EFO, so:

a / f = D / d

and, because a = n * p:

d = (f / p) * (D / n)
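In code, once f/p is calibrated, the depth estimate is a one-liner; the sketch below assumes the disparity n is measured in pixels between the matched projections in the left and right images.

# Depth from stereo disparity, d = (f / p) * (D / n), as derived above.
def depth_mm(f_over_p, baseline_mm, disparity_px):
    # f_over_p: focal length / pixel size, in pixels (calibrated);
    # baseline_mm: distance D between the two cameras, in millimeters;
    # disparity_px: horizontal shift n of the object between the images.
    if disparity_px <= 0:
        raise ValueError("the object must be visible in both images")
    return f_over_p * baseline_mm / disparity_px

# Example: f/p = 700 pixels, cameras 120 mm apart, disparity 35 pixels:
print(depth_mm(700.0, 120.0, 35.0))  # -> 2400.0 mm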

6. ROBOT MOVEMENT

We analyze the possible cases for decision making. There are five possible cases when (at least) two cones are detected; a decision sketch follows the list.

- The robot sees two cones and is placed in the middle between them (on the X direction) => the robot will go straight ahead, no matter what the distance is.
- The robot sees two cones and is placed on their right side (on the X direction) and the left cone is closer than the right one => the robot will first go straight ahead, then to the left.
- The robot sees two cones and is placed on their right side (on the X direction) and the left cone is farther away than the right one => the robot will first go to the left and then to the right.
- The robot sees two cones and is placed on their left side (on the X direction) and the left cone is closer than the right one => the robot will first go to the right and then to the left.
- The robot sees two cones and is placed on their left side (on the X direction) and the left cone is farther away than the right one => the robot will first go straight ahead, then to the right.
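In the sketch below, each detected cone is reported as an (x_offset, distance) pair, with x_offset positive to the robot's right; these field conventions are our illustrative assumptions, not part of the paper.

# The five decision cases above as one function.
def plan(left_cone, right_cone):
    (lx, ld), (rx, rd) = left_cone, right_cone
    if lx < 0 < rx:  # robot is between the cones
        return ["straight"]
    if rx < 0:       # both cones to the robot's left: robot on their right side
        return ["straight", "left"] if ld < rd else ["left", "right"]
    # both cones to the robot's right: robot on their left side
    return ["right", "left"] if ld < rd else ["straight", "right"]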


Figure 10. The possible scenarios

In order to test the robot, the cone detection and the calculation of the distance from the robot (camera) to the first pair of cones, we performed multiple experiments in a large hall.

Figure 11. Testing the robot in the real environment

7. MOBILE PLATFORM FOR ENVIRONMENT ANALYSIS

The next stage is to give the robot the perception of a 3D environment that it is able to analyze. We aim to obtain a large 3D panorama built from all the pictures provided by the 3D sensor.

3D sensors have evolved at such a pace that they have become affordable to the general public, and these improvements can be used in robotics as well for obtaining 3D data about the environment. At this moment there are studies that focus on optimizing the acquisition of 3D images using time-of-flight cameras [9] and on 3D reconstruction [10]. As opposed to current approaches, which focus on 3D reconstruction using newly proposed algorithms and solutions, this study has the goal of implementing a fast high-level algorithm that uses well-established techniques for image analysis (e.g. feature detection [11]) and, based on the results, generates a 3D panorama from the input data. Another goal of this project is the movement of the robot using input from the 3D sensor.

Requirements for the Project

No human intervention is needed; this is achieved by sending movement commands to a mobile platform with the following types of components:

1. A wireless receiver/transmitter.
2. A device capable of capturing data from the environment.
3. A powerful device that can run an operating system, in order to provide more complex capabilities.
4. Electric motors for moving the platform.
5. An interface between the wireless receiver and the electric motors.

Principles

The basic principle lies in capturing 2D and 3D information about the environment and matching two successive data frames. For example, given N sets of images, any pair (k, k+1), with k < N, that has common distinctive elements, meaning elements that can be identified in both images, should provide information about the position of image set k+1 relative to set k. After N images, the algorithm should output a result similar to a 3D panorama. Further processing can be done in order to obtain isolated 3D objects or textures from the environment.

The frames have to be pre-processed to synchronize the 2D image with the 3D depth points. Cameras often introduce distortions into the captured images, which can be problematic when a certain level of accuracy is needed, but this problem can be solved by calibrating the camera and pre-processing each frame.

The Adopted Hardware Solution

The approach chosen for the hardware solution starts with analyzing the mechanical components. An issue in choosing the electric motors is estimating the weight of the entire platform once all the components are assembled. After a careful estimation of the final weight of the robotic platform, the best solution proved to be the "Dagu Wild Thumper 6WD", because it features six electric motors with a stall torque of roughly 11 kg·cm and a large surface for installing all the other devices [3].

The second component is the device that commands the electric parts. One of the most popular platforms of recent years is Arduino; devices using it span from simple input/output microcontrollers to more complex solutions that perform time-critical operations. Most devices based on this platform communicate through a USB cable, usually translating the data from the USB protocol to RS232 with an FTDI chip, which reduces the complexity of implementing a protocol for controlling the electric motors. This device is the "Wild Thumper Controller". The main electrical characteristics of the motor controller are [2]: dual 15 A fuse-protected H-bridges; motors commanded via PWM; controlled by an ATmega168 IC with the Arduino bootloader.

One of the recently developed devices is the Microsoft Kinect. It does not need to be monitored continuously, thus saving processing power. The Kinect delivers 2D pictures and 3D depth information in a synchronized manner. The 2D pictures can vary in size from 320x240 and 640x480 up to 1280x1024, but the 3D depth information is limited to a resolution of 640x480 pixels. The depth data has a range of 11 bits, therefore the distance from the camera to a point in the field of view is quantized into 2048 steps.

We chose to use a computer with considerable processing power, but in a small form factor. Currently, most Linux-based operating systems support various processor architectures (ARM, x86, etc.), but because x86 is still the most widespread, it is the best choice for this exercise.

The chosen PC was the Alix3D3, a small-form-factor PC built around a 533 MHz x86 processor with various peripherals, which can easily run a modern Linux distribution. Another advantage of this device is the embedded Wi-Fi antenna, which allows us to connect it to a wireless network easily.

All the heavy processing will be done by a more powerful PC, the server, which will actually generate the panorama.

The Communication Model

Figure 12. The communication chart

The Alix3D3 receives commands from the server and sends commands to the motor driver, which interprets the signals and passes them on to the motors. The communication between the server and the Alix3D3 is carried over a Wi-Fi network.

Figure 13. Detailed communication model

Software Solution

This project integrates three different programs running on three devices, with the following roles (a sketch of the relay role follows the list):

1. On the server: visualize the data, stitch the 3D panorama, monitor the movement and perform 3D analysis of the latest frame of data received.
2. On the Alix3D3: receive movement commands, interpret them and send them to the motor driver; act as an interface for the Kinect and send its data back to the server.
3. On the motor driver: receive movement commands from the Alix3D3, monitor the electric motors, and command the electric motors.
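As an illustration of the second role, here is a minimal relay sketch for the Alix3D3, assuming a plain TCP line protocol from the server and a serial link to the motor driver; the port number, device path and protocol are assumptions, not specified in the paper.

# Sketch of the Alix3D3 relay: TCP commands in, serial commands out.
import socket
import serial

def relay(listen_port=5000):
    driver = serial.Serial("/dev/ttyUSB0", 115200, timeout=1)  # assumed path
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("", listen_port))
    srv.listen(1)
    conn, _ = srv.accept()  # the server connects over the Wi-Fi network
    with conn, driver:
        buf = b""
        while True:
            data = conn.recv(1024)
            if not data:
                break  # server disconnected
            buf += data
            while b"\n" in buf:  # forward one command per line
                line, buf = buf.split(b"\n", 1)
                driver.write(line + b"\n")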

3D Stitching Process

This algorithm is based on overlapping photographs, aiming to obtain a large-resolution panorama. The process has three steps (a feature-matching sketch follows the list):

1. Image analysis: searching the image for distinctive features.
2. Image calibration: reducing the differences between an ideal lens and the actual camera lens, in order to eliminate distortions, exposure differences, vignetting and chromatic aberrations.
3. Final picture reconstruction: using the features obtained in the first step and running a feature-matching algorithm. The purpose of this step is obtaining a panorama that does not expose the seams where the images have been stitched together; the algorithm implemented in this project does this by rotating the data sets in 3D space [4].
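As an illustration of step 1 and of the feature matching used in step 3, the sketch below detects and matches features between two overlapping frames; ORB with brute-force Hamming matching is our choice for the example, as the paper does not prescribe a specific detector.

# Feature detection and matching between two overlapping frames.
import cv2

img1 = cv2.imread("frame_k.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder names
img2 = cv2.imread("frame_k1.jpg", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

# The best matches constrain the position of frame k+1 relative to
# frame k, which the stitcher then uses to align the data sets.
good = matches[:50]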

Results

The stitching process is one of the most complex parts of this work; it has to be done with much attention and careful calibration.

Figure 14. 3D image stitching result

At this moment, more work is needed on this step of the process, as it is not very accurate. This is visible in Fig. 14, where the two images are overlapped on the right side of the picture using feature point detection [13]. Additional processing of the intermediate data [6][7][8], aiming to find a better alignment, may be used to increase the accuracy of the process.

8. PROBLEMS ENCOUNTERED AND FUTURE IMPROVEMENTS

Object Detection

The training method based on positive and negative images does not produce the best possible detection, but given the detection performance for a specified object using the generated XML file, the objects in a video frame can be detected with O(n) complexity, where n stands for the number of detected objects.

The training itself is neither optimal nor efficient, the training time depending on the complexity of the object to be detected: the more complex and harder to detect the object is, the larger the training image set needs to be. This is where improvements can be made, for example a smaller number of pictures, all different from each other, covering a bigger number of possibilities. An object's appearance is affected by the lighting, the texture, its shadows, the viewing angle and other variations caused by the quality of the video camera.

One solution for increasing the accuracy is a 3D transformation based on a frame sequence (a morphable model). In this way, an estimated 3D mesh is created from one or more pictures of the object (depending on the resources and the level of detail), so the object can be detected from many angles. Lighting, shadows, reflections over the texture and perspective changes that modify the 2D projection of the object can then be synthesized.

Programming and Robot Movement

The robot has a BeagleBoard installed, with a 1 GHz processor, and supports a Linux-type operating system onto which almost any library can be loaded. Students will be able to operate the robot wirelessly; it will be located in the laboratory and connected to a nearby server, but the working environment will also be limited.

On the other hand, the robot is too big and heavy, therefore it has a very high power consumption (three 12 V batteries), and it is not very stable, manifesting issues with the front wheels. The engine is too powerful and offers no possibility of adjusting the movement or the turning angles. Also, if the battery drops below half of its capacity, serious errors may occur.

9. CONCLUSIONS

In the future, this could be the base of a more complex robot, built up by the students exploring it. Home assignments will make the robot learn to detect more complex features, like vehicles, roads, people and traffic lights, in real time.

The next step in developing this project is planning the path that the robot will explore. For example, when the scanning algorithm is started, the robot will scan the room from its origin and, based on the information gathered about the environment, it will be able to perform different tasks, such as constructing a path for uniformly cleaning the room [15].

With the help of the students, the robot will be able to guide itself and act correctly in traffic. This could help research into the creation of an autonomous vehicle. If this idea proves useful, a more complex flying educational robot can be built for analyzing the ground [5].

ACKNOWLEDGEMENTS

The authors want to express their appreciation to Daniel Rosner and Mihai Zaharescu for their great ideas, support and assistance with this paper.

REFERENCES

[1] P. Viola, M. Jones, "Robust Real-Time Object Detection", Vancouver, Canada, 2001.
[2] A. Kuranov, R. Lienhart, V. Pisarevsky, "An Empirical Analysis of Boosting Algorithms for Rapid Objects with Extended Set of Haar Type Characteristics", Intel Technical Report, 2002.
[3] http://en.wikipedia.org/wiki/Transmission_Control_Protocol, (2012). Transmission Control Protocol – Wikipedia, The Free Encyclopedia. Accessed on: 2012-10-01.
[4] https://www.sparkfun.com/products/11057, (2012). Wild Thumper Controller Board – SparkFun Electronics. Accessed on: 2012-10-01.
[5] H. S. Lupescu, C.-A. Boiangiu, "Eagle Eye - Romania 3D Map Feature Extraction", Journal of Information Systems & Operations Management, Vol. 8, No. 2, December 2014, pp. 363-373.
[6] http://en.wikipedia.org/wiki/Image_stitching, (2012). Image Stitching – Wikipedia, The Free Encyclopedia. Accessed on: 2012-10-01.
[7] http://en.wikipedia.org/wiki/Wi-Fi, (2012). Wi-Fi – Wikipedia, The Free Encyclopedia. Accessed on: 2012-10-01.
[8] Boiangiu, C.A., Spataru, A.C., Dvornic, A.I., Bucur, I. (2008). "Usual Scenarios and Suitable Approaches Used in Automatic Merge of Scanned Images", International Journal of Computers, pp. 340-349, ISSN: 1998-4308.
[9] Boiangiu, C.A., Bucur, I., Spataru, A.C. (2008). "Statistical Approaches Used in Automatic Merge of Scanned Images", Annals of DAAAM for 2008 & Proceedings of the 19th International DAAAM Symposium, pp. 0129-0130, Trnava, Slovakia, October 22-25, 2008.
[10] Boiangiu, C.A., Spataru, A.C., Dvornic, A.I., Bucur, I. (2008). "Merge Techniques for Large Multiple-Pass Scanned Images", Proceedings of the 1st WSEAS Int. Conf. on Visualisation, Imaging and Simulation (VIS '08), WSEAS Press, pp. 72-76, Bucharest, Romania, November 7-9, 2008.
[11] Castaneda, V., Mateus, D., Navab, N., "Stereo Time-of-Flight", Computer Aided Medical Procedures (CAMP), Technische Universitat Munchen (TUM), Germany.
[12] Cui, Y., Schuon, S., Chan, D., Thrun, S., Theobalt, C. (2010). "3D Shape Scanning with a Time-of-Flight Camera", Computer Vision and Pattern Recognition (CVPR), San Francisco, pp. 1173-1180.
[13] http://opencv.willowgarage.com/documentation/cpp/feature_detection.html, (2012). Feature Detection – OpenCV v2.1 Documentation. Accessed on: 2012-10-01.
[14] http://face.urtho.net/. Accessed on: 2012-01-01.
[15] G. Ionescu, C.-A. Boiangiu, "Mobile Platform for Environment Analysis", Annals of DAAAM for 2012, Proceedings of the 23rd International DAAAM Symposium, Austria, 2012, pp. 0587-0590.