JOURNAL OF INFORMATION SYSTEMS & OPERATIONS MANAGEMENT
EDUCATIONAL ROBOTS
Gabriela Bajenaru 1*
Ileana Vucicovici 2
Horea Caramizaru 3
Gabriel Ionescu 4
Costin-Anton Boiangiu 5
ABSTRACT
This paper describes a prototype designed to automate the movement of a robot among
objects with certain features. It is a first step in the possible development of an
educational robot: a platform on which students can develop more advanced methods for
the automatic guidance of a car in traffic, for the detection and recognition of important
objects (signs, both on boards and on the road, traffic lights, pedestrians, other cars, etc.),
and for decision making based on the significance of objects and their position. The
benefit is that students work with a real robot, not a simulation, with many of the
necessary modules already implemented and ready to use.
KEYWORDS: education, robot, computer vision, object detection, cascade training,
cascade classifier
1. INTRODUCTION
This paper aims to give students a starting point for building a real-life guidance method
for a car moving in an open-space environment, where decisions depend on the positions
of nearby objects. A purely simulated solution cannot be ideal, because the physical
processes that cannot be simulated are exactly the ones that generate the imperfections
leading to misclassification of the environment. Even recordings of the environment are
not sufficient, as the vehicle expects the environment to change with its own movement.
The robot must offer a starting point and a general interface that can be used for
developing complex environment-interpretation algorithms and guidance strategies. The
basic modules for movement and recognition must be implemented, so that a student who
wants to focus on a single algorithm can use the pre-implemented methods for the
remaining tasks. The robot carries either a single web camera on its front, in which case
decisions are taken strictly from what it sees at a given moment, or
1* Corresponding author. Engineer, [email protected], ”Politehnica” University of Bucharest, 060042 Bucharest, Romania
2 Engineer, [email protected], ”Politehnica” University of Bucharest, 060042 Bucharest, Romania
3 Engineer, [email protected], ”Politehnica” University of Bucharest, 060042 Bucharest, Romania
4 Engineer, [email protected], ”Politehnica” University of Bucharest, 060042 Bucharest, Romania
5 Associate Professor PhD Eng., [email protected], ”Politehnica” University of Bucharest, 060042 Bucharest, Romania
stereoscopic cameras used for 3D environment building. Detection is done in real-time
using an algorithm based on Viola-Jones [1].
2. THE CONSTRUCTION OF THE ROBOT
Hardware
The robot consists of two parts: a mechanical part (a two-level chassis, two stepper
motors connected to hardware drivers, and an actuator) and an electronic part controlled
by a BeagleBoard (an embedded device with 1 GB of RAM and an ARM Cortex-A8
processor, capable of running a Linux operating system), two video cameras and an
additional Arduino Mega board.
Figure 1. Robot photos
Software
The recognition-control-command process starts from the video cameras, which transmit
images to the BeagleBoard. The images are processed with OpenCV and, depending on
the result, a command is transmitted to the Arduino board over a USB connection through
a pipe on a fixed port. The Arduino reads the information at a fixed interval and executes
one of two commands: translation or rotation.
3. DETECTION OF AN OBJECT IN OPENCV
Object detection is realized with a tool implemented in OpenCV. The initial algorithm was
proposed by Paul Viola and Michael Jones and improved by Rainer Lienhart.
First, a classifier (actually a cascade of boosted classifiers working with Haar features) is
trained on a few hundred images, called positive examples, that are scaled to the same
dimension (30x30), together with a number of arbitrary images of the same size
constituting the negative examples.
After the classifier is trained, it can be applied to a region of interest in an input image.
The classifier outputs “1” if the region is likely to contain the object and “0” otherwise.
To search for the object in the entire image, the search window is moved across the
image, and every location is verified with the classifier.
The classifier is specially designed to be easily resized, so that it can find objects at
different dimensions; this is more efficient than resizing the image itself. Thus, in order to
find an object of unknown size, the scanning procedure must be repeated at different
scales. In the following paragraphs, every step of the detection algorithm is explained in
detail.
Features
Figure 2. Simple features used for classification
The object detection procedure classifies image regions based on the values of simple
features. There are several motivations for using features instead of raw pixels.
The most important is that features can encode domain knowledge that is difficult to learn
from a limited quantity of training data. In addition, there is a critical performance reason:
a feature-based system operates much faster than a pixel-based one.
The features used are similar to Haar basis functions. More exactly, three types of
features are used. The value of a two-rectangle feature is the difference between the sums
of the pixels in two rectangular areas. The areas have the same size and shape and are
horizontally or vertically adjacent (Fig. 2).
A three-rectangle feature computes the sum of the pixels in the two outer rectangles and
subtracts from it the sum of the central rectangle.
Finally, a four-rectangle feature computes the difference between two diagonal pairs of
rectangles.
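To illustrate how such features are evaluated efficiently, the sketch below computes rectangle sums with an integral image (summed-area table), the standard device behind Viola-Jones; the function names and the toy image are ours, not OpenCV internals.

```python
# Sketch: integral-image evaluation of a two-rectangle Haar feature.
# Names are illustrative, not OpenCV internals.

def integral_image(img):
    """Summed-area table: ii[y][x] = sum of img[0..y-1][0..x-1]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row = 0
        for x in range(w):
            row += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row
    return ii

def rect_sum(ii, x, y, w, h):
    """Pixel sum over the rectangle with top-left corner (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def two_rect_feature(ii, x, y, w, h):
    """Two-rectangle feature: left-half sum minus right-half sum."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)

img = [[1, 1, 0, 0]] * 4                  # bright left half, dark right half
ii = integral_image(img)
print(two_rect_feature(ii, 0, 0, 4, 4))   # -> 8
```

With the table in place, any rectangle sum costs four lookups, which is what makes scanning thousands of sub-windows per image affordable.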
Learning Classification Functions
Given a set of features and a set of positive and negative training images, a vast number
of approaches can be used for automatic learning.
In the Viola-Jones system, a variant of AdaBoost is used both for feature selection and for
training the classifier.
In its original form, the AdaBoost learning algorithm is used to boost the performance of
a simple classification algorithm. This is achieved by combining several weak
classification functions into a stronger classifier. In boosting terminology, the simple
learning algorithm is called a weak learner.
For example, a perceptron learning algorithm searches over the set of perceptrons and
returns the perceptron with the smallest classification error. After the first learning round,
the weights of the examples are re-evaluated in order to emphasize those that were
misclassified by the weak classifier.
Each weak classifier is restricted to a set of classification functions that depend on a
single feature. The weak learner is therefore designed to select the single rectangular
feature that best separates the positive from the negative examples. For each feature, the
weak learner determines the optimal threshold that minimizes the number of
misclassifications.
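As a minimal illustration of such a single-feature weak classifier, the decision-stump sketch below tries every candidate threshold and both polarities, keeping the combination with the smallest weighted error; the function name and data are illustrative, not the actual OpenCV implementation.

```python
# Sketch: training a decision stump (weak classifier) on one feature.
# values[i]: the feature value of sample i; labels[i]: +1 or -1;
# weights[i]: the AdaBoost weight of sample i.

def train_stump(values, labels, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = (float("inf"), None, None)
    for thr in sorted(set(values)):
        for polarity in (+1, -1):
            # classify as positive when polarity*value < polarity*threshold
            err = sum(w for v, l, w in zip(values, labels, weights)
                      if (polarity * v < polarity * thr) != (l == +1))
            if err < best[0]:
                best = (err, thr, polarity)
    return best

# toy data: the feature separates the classes perfectly at value 2
err, thr, pol = train_stump([1.0, 2.0, 3.0, 4.0],
                            [-1, -1, +1, +1],
                            [0.25, 0.25, 0.25, 0.25])
print(err, thr, pol)   # -> 0 2.0 -1
```

After each boosting round, the sample weights would be re-evaluated to emphasize the misclassified examples, exactly as described above.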
Construction of the Cascade
This section describes an algorithm for constructing a cascade of classifiers that reaches
high detection performance while reducing computation time. The key insight is that
smaller, and therefore more efficient, boosted classifiers can be built that reject many
negative sub-windows while detecting almost all positive ones. Simple classifiers are
used to reject the majority of sub-windows before more complex classifiers are called to
achieve a low false-positive rate.
Cascade stages are built by training classifiers with AdaBoost.
Starting from a strong classifier with two features, an efficient object filter can be
obtained by adjusting the threshold of the strong classifier to minimize false negatives.
The initial AdaBoost threshold is designed to produce a low error rate over the training
data; a lower threshold yields a higher detection rate, but also a higher false-positive rate.
The detection performance of a two-feature classifier is far from acceptable for an object
detection system. However, such a classifier can significantly reduce the number of sub-
windows that require further processing, using just a few operations:
1. Evaluate the rectangle features;
2. Calculate the weak classifier for every feature;
3. Combine weak classifiers.
The complete detection process takes the form of a degenerate decision tree, called a
“cascade” (Fig. 3).
A positive result from the first classifier triggers the evaluation of a second classifier,
also adjusted for a high detection rate.
A positive result from the second classifier triggers the third, and so on.
A negative result at any stage leads to the immediate rejection of the sub-window.
Figure 3. The cascade methods
The cascade structure reflects the fact that the overwhelming majority of sub-windows in
any single image are negative. Therefore, the cascade attempts to reject as many
negatives as possible in the earliest stages. A positive instance triggers the evaluation of
every classifier in the cascade, but this is a rare event.
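The early-rejection behavior can be sketched as follows; the stage structure, scoring functions and thresholds are toy values, not a trained cascade.

```python
# Sketch of cascade evaluation: a sub-window is tested stage by stage,
# and a single negative verdict rejects it immediately.
# Each stage is modeled as (list of (score_fn, weight), stage_threshold).

def run_cascade(stages, window):
    for weak_classifiers, threshold in stages:
        stage_score = sum(w * fn(window) for fn, w in weak_classifiers)
        if stage_score < threshold:
            return False     # rejected early: most negatives exit here
    return True              # survived every stage: candidate detection

# hypothetical stages: stage 1 is cheap (one test), stage 2 is stricter
stages = [
    ([(lambda win: 1.0 if sum(win) > 10 else 0.0, 1.0)], 0.5),
    ([(lambda win: 1.0 if max(win) < 9 else 0.0, 1.0),
      (lambda win: 1.0 if min(win) > 0 else 0.0, 1.0)], 1.5),
]

print(run_cascade(stages, [3, 3, 3, 3]))   # passes both stages -> True
print(run_cascade(stages, [1, 1, 1, 1]))   # fails the cheap stage -> False
```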
Cascade Training
The general training process involves two kinds of trade-offs. In most cases, classifiers
with more features reach higher detection rates and fewer false positives, but they also
require more computation time.
An optimization framework can therefore be defined that balances:
The number of classification stages,
The number of features in every stage,
The threshold of every stage.
Each layer of the cascade is trained with AdaBoost. Features are added until the target
detection and false-positive rates are met for that layer; the rates are determined by
testing the current detector on a validation set. If the overall false-positive rate target is
not yet met, another layer is added to the cascade. The negative training set for
subsequent layers is obtained by collecting all the false detections produced by the
current detector on a set of images that contain no instances of the target object.
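The layer-by-layer loop just described can be summarized in pseudocode; the helper names are placeholders for the corresponding training and evaluation steps, not actual OpenCV functions.

```
# Pseudocode sketch of cascade training with bootstrapped negatives.
stages = []
overall_fp = 1.0
while overall_fp > TARGET_FP_RATE:
    # add features to the new stage until it meets the per-stage
    # hit rate and false-positive rate, measured on a validation set
    stage = train_adaboost_stage(positives, negatives,
                                 MAX_STAGE_FP, MIN_HIT_RATE)
    stages.append(stage)
    overall_fp *= measure_fp_rate(stages, validation_set)
    # bootstrap: the next stage trains only on the false detections the
    # current cascade still makes on images without the target object
    negatives = collect_false_detections(stages, background_images)
```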
4. CREATING EXAMPLES AND TRAINING THE DETECTION SYSTEM
OpenCV HaarTraining
The OpenCV library offers a very interesting face detection demonstration. The same
programs (functions) used to build the face detection classifier can create classifiers for
any object; thus, with HaarTraining we can create our own object classifiers.
However, we could not reproduce exactly how the OpenCV developers performed the
HaarTraining for face detection, because they never published details such as the type of
images and the parameters used for training.
The objective of this work is to use the face detection functions for detecting other
objects, such as a simple cone, striped cones or a tennis ball, and to analyze which
positive and negative images are best suited for a correct training.
Positive Pictures
Figure 4. Images used for collecting positive samples
We need to collect positive images that contain only the objects of interest; for example,
we wanted to detect a simple white cone and a cone with a pattern. The detection is based
on shape, not color. In [2] it is mentioned that face detection training used 5000 positive
frontal-face samples plus another 5000 derived from 1000 distorted pictures. We describe
how to increase the number of samples in a later section.
Some of the positive pictures were obtained by filming the objects of interest against
different backgrounds, under different lighting conditions and from different angles and
perspectives. We then saved the frames containing the objects in JPEG format and
cropped them so that only the object remains.
Negative Pictures - Background
We also need to collect negative images, which do not contain the objects of interest.
They serve as backgrounds onto which positive pictures representing only an object can
be pasted. For example, Kuranov et al. [2] stated that they used 3000 negative images
when training face detection.
We found at [14] sets of negative pictures totaling about 3500 images. However, this
collection was built for eye detection and includes faces in some of the pictures.
Therefore, we eliminated all the suspect images, leaving 2000 pictures, and then added
the background images containing the objects of interest (obtained from the filming).
Cutting Positive Pictures
In order to collect positive images, we had to crop many images manually. We modified
some of them by mirroring them and changing the contrast, brightness and lighting.
Figure 5. Examples of positive samples
Creating Samples for Training
We can create samples using the utility tool “createsamples.exe”. This section describes
the functionality this tool offers for creating samples.
Creating Samples for Training – from One Picture
The first functionality of “createsamples.exe” creates training samples from one cropped
image by applying distortions. This function (cvhaartraining.cpp –
cvCreateTrainingSamples#) is launched with the options -img, -bg and -vec.
-img <one_positive_image>
-bg <collection_file_of_negatives>
-vec <name_of_output_file_containing_generated_samples>
For example, this command:
createsamples.exe -img cone1\6.jpg -num 200 -vec cone1\ss6.vec -bg cone1\negatives.txt -maxxangle 0.6 -maxyangle 0.1 -maxzangle 0.3 -maxidev 100 -bgcolor 0 -bgthresh 0 -w 18 -h 24
generates 200 distorted samples from our 18x24 picture, rotating the image around the
x, y and z axes, treating black (bgcolor 0 with threshold 0) as transparent background
and allowing an intensity deviation of up to 100. The negative pictures listed in the
“negatives.txt” file are used as backgrounds.
Using a collection of positive pictures, even more samples can be generated: a vast
quantity of pictures is produced with the object rotated, scaled or translated into many
zones of the negative pictures.
Figure 6. Artificial positive samples
Creating Samples for Training – from More Than One Picture
The second function creates training samples from more than one image, without
applying distortions. It is launched with the options -info and -vec (cvhaartraining.cpp –
cvCreateTestSamples#).
createsamples.exe -info samples.txt -vec samples.vec -w 18 -h 24
This generates samples without applying distortions; the function may be regarded as a
file format conversion function. Each line of the description file has the form:
[filename] [# of objects] [[x y width height]...[...2nd object]...]
The description file lists all the pictures and, for each, the zones where objects are
placed: (x, y) is the upper-left corner of the object's bounding box, with the image origin
at (0, 0), followed by the box width and height:
Img/img1.jpg 1 140 100 45 45
Img/img2.jpg 2 100 200 50 50 50 30 25 25
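A small sketch of reading this description format into per-image bounding boxes; the function name is ours, and only the two example lines above are assumed.

```python
# Sketch: parse createsamples -info description lines into
# {filename: [(x, y, width, height), ...]}.

def parse_annotations(lines):
    annotations = {}
    for line in lines:
        parts = line.split()
        filename, count = parts[0], int(parts[1])
        coords = list(map(int, parts[2:]))
        # each object occupies four consecutive integers
        annotations[filename] = [tuple(coords[i * 4:i * 4 + 4])
                                 for i in range(count)]
    return annotations

sample = ["Img/img1.jpg 1 140 100 45 45",
          "Img/img2.jpg 2 100 200 50 50 50 30 25 25"]
print(parse_annotations(sample))
```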
The option -num here only restricts the number of generated samples; it does not increase
the number of distortions applied to them.
Creating Training Samples
To create a full sample set for accurate training, the .vec files generated by the first
createsamples function (from one image) and the second function (from more than one
image) must be combined. For this we created a special program capable of merging
several .vec files: it writes all the .vec data from <collection_file_of_vecs> into a single
.vec file <output_vec_name>, with the dimensions w and h.
The Training
Now we can train our classifier on our own images, using the HaarTraining utility.
We created samples of size 18x24 (matching the rectangular shape that surrounds the
cone) and of size 20x20 (matching the square shape surrounding a ball),
with nsplit = 2, nstages = 20, minhitrate = 0.9999 (default: 0.995), maxfalsealarm = 0.5
and weighttrimming = 0.95.
The “-nonsym” option is used when the object class has no vertical (left-right)
symmetry.
“-mode ALL” uses the extended set of Haar-like features. The default setting, BASIC,
uses only the upright features, while ALL uses the full feature set, including the
45-degree rotated features.
“-mem 1024” is the memory, in MB, available for pre-calculation. The default value
is 200 MB; the maximum is 2 GB, because 32-bit CPUs have a 4 GB address limit
(2^32 = 4 GB), which becomes 2 GB on Windows.
Testing and Analyzing the Performances
During detection, a sliding window is moved, pixel by pixel, over the image at every
scale. Starting from the original training size, the window is enlarged by 10% or 20% per
step until it no longer fits inside the image.
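The scale progression can be sketched as follows, assuming the window simply stops growing once it no longer fits in the image; the 18x24 window matches the cone samples above, and the image size is illustrative.

```python
# Sketch: enumerate the window sizes visited by a multi-scale scan.

def scan_scales(win_w, win_h, img_w, img_h, factor=1.2):
    """Return (width, height) for every scale that fits in the image."""
    w, h = float(win_w), float(win_h)
    scales = []
    while w <= img_w and h <= img_h:
        scales.append((int(w), int(h)))
        w *= factor          # grow the window by 20% per step
        h *= factor
    return scales

print(scan_scales(18, 24, 180, 240))   # 13 scales, from (18, 24) upward
```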
“Hits” is the number of correct detections. “Missed” is the number of lost detections,
i.e. false negatives (the object exists, but the detector missed it).
“False” is the number of false alarms, i.e. false positives (the object does not exist, but
the detector reported it).
Figure 7. Testing the detection of the learned objects
5. THE DEPTH PERCEIVED IN STEREOSCOPIC IMAGES
People and animals that can focus both eyes on a single object are capable of
stereoscopic sight, which is fundamental for depth perception. The principle consists in
presenting a scene from two slightly different angles [12], which the brain fuses into a
single 3D image.
The Focal Length – Pixel Relation
This is the relation between focal length and pixel size (f/p); it links the image formed on
the camera sensor to the real scene in the external environment.
It can be obtained as: [f] / [p] = [pixels]
Where:
f stands for the focal length,
p is the size of each pixel.
Figure 8. The real object AB is projected in the image CD
AB – reference line, of length d1
CD – the projection of the line AB on the lens, of length d2
O – optical center
d – horizontal distance between O and AB
f – horizontal distance between O and CD
n – number of pixels
p – pixel size
Since the projection CD spans n pixels, its length is d2 = n·p. From the similar triangles:

d1 / d = d2 / f = (n·p) / f  =>  f / p = (n·d) / d1

The measurement units are: [f] / [p] = [n] * [d] / [d1] = [pixels] * [millimeters] /
[millimeters] = [pixels]
Considering xp the measure in pixels (n) and xm the projection in millimeters (d1), the
relation becomes:

f / p = (xp·d) / xm  =>  xm = (xp·p·d) / f ,  d = (f·xm) / (xp·p)
Depth Estimation
We consider two cameras sharing a common zone;
We consider an object of size O inside that zone;
L is the size (in cm) of the object's projection in the left image and R in the right image;
D is the given distance between the two cameras;
Depth calculation is possible only when the two images are overlapped;
The image is mirrored;
The right image is overlapping the left image.
Figure 9. Using two images to calculate the depth estimation
D – distance between the two cameras
O – object size
L – left projection
R – right projection
d – distance between focal point and object
f – distance between focal point and camera plane
Triangle ABO is similar to triangle EFO, therefore D / d = a / f. Because a = n·p:

D / d = (n·p) / f  =>  d = (f·D) / (n·p)
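The depth relation d = f·D / (n·p) derived above can be wrapped in a small helper; the calibration numbers below are purely illustrative, not the robot's actual camera parameters.

```python
# Sketch of stereo depth from disparity: d = f * D / (n * p), where
# n is the disparity in pixels between the two projections.

def depth_from_disparity(f_mm, baseline_mm, disparity_px, pixel_mm):
    """Distance from the focal point to the object, in millimeters."""
    return f_mm * baseline_mm / (disparity_px * pixel_mm)

# illustrative values: f = 4 mm, cameras 100 mm apart,
# 16 px disparity, 0.125 mm pixel size
print(depth_from_disparity(4.0, 100.0, 16, 0.125))   # -> 200.0 mm
```

Note how depth is inversely proportional to disparity: nearby objects shift more between the two images than distant ones.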
6. ROBOT MOVEMENT
We analyze the possible cases for decision making. There are five possible cases when
(at least) two cones are detected:
The robot sees two cones and is placed in the middle of them (on the direction X) =>
the robot will go straight on, no matter what the distance is.
The robot sees two cones and is placed on their right side (on the direction X) and the
left cone is closer than the right one => first the robot will go straight on, then to the
left.
The robot sees two cones and is placed on their right side (on the direction X) and the
left cone is farther away than the right one => first the robot will go to the left and
then to the right.
The robot sees two cones and is placed on their left side (on the direction X) and the
left cone is closer than the right one => first the robot will go to the right and then to
the left.
The robot sees two cones and is placed on their left side (on the direction X) and the
left cone is farther than the right one => first the robot will go straight on, then to the
right.
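The five cases above can be sketched as a small decision function; the position labels and command names are illustrative, not the robot's actual command set.

```python
# Sketch of the five-case decision table: given the robot's lateral
# position relative to the cone pair and the two cone distances,
# return the sequence of movement commands.

def decide(position, left_dist, right_dist):
    if position == "middle":
        return ["straight"]                 # case 1
    if position == "right":
        if left_dist < right_dist:
            return ["straight", "left"]     # case 2
        return ["left", "right"]            # case 3
    if position == "left":
        if left_dist < right_dist:
            return ["right", "left"]        # case 4
        return ["straight", "right"]        # case 5
    raise ValueError("unknown position")

print(decide("middle", 1.0, 1.0))   # -> ['straight']
print(decide("right", 0.8, 1.2))    # -> ['straight', 'left']
```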
Figure 10. The possible scenarios
In order to test the robot, the cone detection and the calculation of the distance from the
robot (camera) to the first pair of cones, we performed multiple experiments in a large hall.
Figure 11. Testing the robot in the real environment
7. MOBILE PLATFORM FOR ENVIRONMENT ANALYSIS
The next stage is to give the robot a 3D perception of the environment that it can analyze.
We aim to obtain a large 3D panorama assembled from all the pictures sent by the 3D
sensor.
3D sensors have evolved at such a pace that they have become affordable to the general
public, and these improvements can be exploited in robotics for acquiring 3D data about
the environment.
At this moment, there are studies focused on optimizing the acquisition of 3D images
using time-of-flight cameras [9] and on 3D reconstruction [10]. As opposed to these
approaches, which concentrate on 3D reconstruction with newly proposed algorithms,
this study has the goal of implementing a fast high-level algorithm that uses well-
established image analysis techniques (e.g. feature detection [11]) and, based on their
results, generates a 3D panorama from the input data. Another goal of the project is
moving the robot using input from the 3D sensor.
Requirements for the Project
The platform must operate without human intervention; this is made possible by sending
movement commands to a mobile platform with the following types of components:
1. A wireless receiver/transmitter.
2. A device that is capable of capturing data from the environment.
3. A powerful device that can run an operating system in order to provide more
complex capabilities.
4. Electric motors for moving the platform.
5. Interface between the wireless receiver and the electric motors.
Principles
The basic principle consists in capturing 2D and 3D information about the environment
and scanning two successive data frames. Given N sets of images, any pair (k, k+1), with
k < N, that has common distinctive elements (elements identifiable in both images)
should provide information about the position of image set k+1 relative to k. After N
images, the algorithm should output a result similar to a 3D panorama. Further
processing can then extract isolated 3D objects or textures from the environment.
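Under the simplifying assumption that each frame pair contributes only a relative offset, the composition of per-frame placements can be sketched with a 2D-translation toy; the estimator here is a stand-in for the feature-matching step, not a real registration algorithm.

```python
# Sketch: place every frame in panorama coordinates by composing the
# relative offsets estimated between consecutive pairs (k, k+1).

def build_panorama_offsets(frames, estimate_offset):
    """Absolute offset of each frame, with frame 0 at (0, 0)."""
    offsets = [(0.0, 0.0)]
    for k in range(len(frames) - 1):
        dx, dy = estimate_offset(frames[k], frames[k + 1])
        px, py = offsets[-1]
        offsets.append((px + dx, py + dy))
    return offsets

# toy "estimator": frames are just known positions here
frames = [(0, 0), (2, 1), (5, 1)]
est = lambda a, b: (b[0] - a[0], b[1] - a[1])
print(build_panorama_offsets(frames, est))
# -> [(0.0, 0.0), (2.0, 1.0), (5.0, 1.0)]
```

In the real pipeline the relative transform would be a full 3D pose recovered from matched features, but the chaining structure is the same.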
The frames have to be pre-processed to synchronize the 2D image with the 3D depth
points. Cameras often introduce distortions into the captured images, which is
problematic when a certain level of accuracy is needed; this can be solved by calibrating
the camera and pre-processing each frame.
The Adopted Hardware Solution
The approach chosen for the hardware solution starts with the mechanical components.
One issue in choosing the electric motors is estimating the weight of the entire platform
once all the components are assembled. After a careful estimation of the final weight of
the robotic platform, the best solution was the “Dagu Wild Thumper 6WD”, because it
features 6 electric motors with a stall torque of roughly 11 kg·cm and offers a large
surface for installing all the other devices [3].
The second component is the device that commands the electric parts. One of the most
popular platforms in recent years is the Arduino platform; devices using it span from
simple input/output microcontrollers to more complex solutions performing time-critical
operations. Because most Arduino-based devices communicate through a USB cable,
usually translating USB to RS232 with an FTDI chip, the complexity of implementing a
protocol for controlling the electric motors is reduced. The device chosen is the “Wild
Thumper Controller”. The main electrical characteristics of the motor controller are [4]:
dual 15 A fuse-protected H-bridges, motor command via PWM, and control by an
ATmega168 IC with the Arduino bootloader.
One of the recently developed devices is the Microsoft Kinect. It does not need to be
monitored continuously, thus saving processing power, and it delivers 2D pictures and
3D depth information in a synchronized manner. The 2D pictures can vary in size from
320x240 and 640x480 up to 1280x1024, but the 3D depth information has a limited
resolution of 640x480 pixels. The depth data is limited to a range of 11 bits, therefore the
distance from the camera to a point in the field of view is quantized into 2048 steps.
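The 11-bit quantization can be illustrated with a small helper; note that the real Kinect raw-to-distance mapping is nonlinear, so this sketch only shows the value range, not actual distances.

```python
# Sketch: the 11-bit raw depth value occupies 2048 quantization steps.

RAW_STEPS = 2 ** 11    # 2048 distinguishable depth levels

def raw_to_fraction(raw):
    """Map an 11-bit raw reading to a 0..1 fraction of the value range."""
    if not 0 <= raw < RAW_STEPS:
        raise ValueError("raw depth must fit in 11 bits")
    return raw / (RAW_STEPS - 1)

print(RAW_STEPS)               # -> 2048
print(raw_to_fraction(2047))   # -> 1.0
```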
We chose to use a computer with considerable processing power, which also had to be a
small-form-factor PC. Currently, most Linux-based operating systems support various
processor architectures (arm, x86 etc.), but because x86 is still the most wide-spread, it is
the best choice for this exercise.
The chosen PC was the Alix3D3. This small-form-factor PC has a 533 MHz x86
processor with various peripherals and can easily run a modern Linux distribution.
Another advantage of this device is the embedded Wi-Fi antenna, which allows us to
connect it to a wireless network easily.
All the processing will be done by a more powerful PC that will basically generate the
panorama.
The Communication Model
Figure 12. The communication chart
The Alix 3D3 will receive commands from the server and will send commands to the
motor driver, which will interpret the signals and pass them on to the motors. The
communication between the Server and the Alix3D3 will be made through a Wi-Fi
network.
Figure 13. Detailed communication model
Software Solution
There are 3 different programs running on three devices that this project integrates. These
programs have the following roles:
1. On the server: visualize the data, 3D panorama stitching, movement-monitoring and
3D analysis of the latest frame of data received.
2. On the Alix3D3: receive movement commands, interpret and send movement
commands to the motor driver; it acts as an interface for the Kinect, sends data back
to the server.
3. On the motor driver: receive movement commands from the Alix3D3, monitor the
electric motors, and command the electric motors.
3D Stitching Process
This algorithm is based on overlapping photographs, aiming to obtain a large-resolution
panorama. The process has three steps:
1. Image analysis: the image is searched for distinctive features.
2. Image calibration: the differences between an ideal lens and the actual camera lens
are reduced, in order to eliminate distortions, exposure differences, vignetting or
chromatic aberrations.
3. Final picture reconstruction: a feature-matching algorithm is run on the features
obtained in the first step. The goal is a panorama that does not expose the seams
where the images were stitched together; the algorithm implemented in this project
achieves this by rotating the data sets in 3D space [4].
Results
The stitching process is one of the most complex parts of this work; it must be done with
great care and careful calibration.
Figure 14. 3D image stitching result
At this moment, more work is needed on this step of the process, as it is not very
accurate. This is shown in Fig. 14, where the two images are overlapped on the right side
of the picture using feature point detection [13]. Also, some additional processing of the
intermediate data [6][7][8], aiming to find a better alignment may be used to increase the
accuracy of the process.
8. PROBLEMS ENCOUNTERED AND FUTURE IMPROVEMENTS
Object Detection
The training method based on positive and negative images does not yield the best
possible detection; still, given the performance obtained when detecting a specified object
with the generated XML file, objects can be detected in a video frame with O(n)
complexity, where n is the number of detected objects.
The training itself is neither optimal nor efficient: the training time depends on the
complexity of the object to be detected, and the more complex and harder to detect the
object is, the larger the training set must be. Improvements are possible here, for
example using a smaller number of pictures, all different from each other, that cover a
larger number of possibilities. An object's appearance is affected by lighting, texture,
shadows, viewing angle and other variations introduced by the quality of the video camera.
One solution for increasing accuracy is a 3D transformation based on a frame sequence
(a morphable model). In this approach, an approximate 3D mesh is created from one or
more pictures of the face (depending on the available resources and desired detail), so
the face can be detected from many angles. Lighting, shadows, reflections over the
texture and perspective changes that modify the 2D projection of the object can then be
simulated on the model.
Programming and Robot Movement
The robot has a BeagleBoard installed, with a 1 GHz processor, that runs a Linux-type
operating system on which almost any library can be installed. Students will be able to
operate the robot wirelessly; it will be located in the laboratory and connected to a
nearby server, so the operating environment will be limited.
On the other hand, the robot is too big and heavy, so it has a very high power
consumption (three 12 V batteries), and it is not very stable, showing issues with the
front wheels. The motors are too powerful and offer no way of adjusting the movement
or the turning angles. Also, if the battery drops below half of its charge, serious errors
may occur.
9. CONCLUSIONS
In the future, this could be the base of a more complex robot, extended by the students
working on it. Homework assignments will make the robot learn to detect more complex
features, such as vehicles, roads, people and traffic lights, in real time.
The next step in developing this project is the planning of the path that the robot will
explore. For example, when the scanning algorithm is started, the robot will scan the
room from its origin and, based on the gathered information about the environment, it
will be able to perform different tasks, such as constructing a path for uniformly cleaning
the room [15].
With the help of the students, the robot will be able to guide itself and act correctly in
traffic. This could help research in the creation of an autonomous vehicle.
If this idea proves to be useful, a more complex flying educational robot can be built for
analyzing the ground [5].
ACKNOWLEDGEMENTS
The authors want to show their appreciation to Daniel Rosner and Mihai Zaharescu for
their great ideas, support and assistance with this paper.
REFERENCES
[1] P. Viola and M. Jones, Robust Real-Time Object Detection, Vancouver, Canada,
2001
[2] Alexander Kuranov, Rainer Lienhart, Vadim Pisarevsky, An Empirical Analysis
of Boosting Algorithms for Rapid Objects with Extended Set of Haar Type
Characteristics, Intel technical Report 2002
[3] http://en.wikipedia.org/wiki/Transmission_Control_Protocol, (2012).
Transmission Control Protocol – Wikipedia, The Free Encyclopedia, Accessed
on: 2012-10-01
[4] https://www.sparkfun.com/products/11057, (2012). Wild Thumper Controller
Board – SparkFun Electronics, Accessed on: 2012-10-01
[5] Horia Stefan Lupescu, Costin-Anton Boiangiu – “Eagle Eye - Romania 3d Map
Feature Extraction” - , The Proceedings of Journal ISOM Vol. 8 No. 2 /
December 2014 (Journal of Information Systems, Operations Management), pp.
363-373
[6] http://en.wikipedia.org/wiki/Image_stitching, (2012). Image Stitching –
Wikipedia, The Free Encyclopedia, Accessed on: 2012-10-01
[7] http://en.wikipedia.org/wiki/Wi-Fi, (2012). Wi-Fi – Wikipedia, The Free
Encyclopedia, Accessed on: 2012-10-01
[8] Boiangiu, C.A., Spataru, A.C., Dvornic, A.I. & Bucur, I. (2008). “Usual
Scenarios and Suitable Approaches Used in Automatic Merge of Scanned
Images”, International Journal of Computers, pp. 340-349, ISSN: 1998-4308
[9] Boiangiu, C.A., Bucur, I. & Spataru, A.C., (2008). “Statistical Approaches Used
in Automatic Merge of Scanned Images”. Annals of DAAAM for 2008 &
Proceedings of the 19th International DAAAM Symposium, pp. 0129–0130,
Trnava, Slovakia, October 22-25, 2008
[10] Boiangiu, C.A., Spataru, A.C., Dvornic, A.I. & Bucur, I (2008). “Merge
Techniques for Large Multiple-Pass Scanned Images”. Proceedings of the 1st
WSEAS Int. Conf. on Visualisation, Imaging and Simulation (VIS ‘08),
WSEAS Press, pp. 72 – 76, Bucharest, Romania, November 7-9, 2008
[11] Castaneda, V., Mateus, D., Navab, N., Stereo Time-of-Flight, Computer Aided
Medical Procedures (CAMP), Technische Universität München (TUM),
Germany
[12] Cui, Y., Schuon, S., Chan, D., Thrun, S., Theobalt, C., (2010). “3D Shape
Scanning with a Time-of-Flight Camera”, Computer Vision and Pattern
Recognition (CVPR), San Francisco, pp. 1173-1180
[13] http://opencv.willowgarage.com/documentation/cpp/feature_detection.html,
(2012). Feature detection – OpenCV v2.1 Documentation, Accessed on: 2012-
10-01.
[14] http://face.urtho.net/, Accessed on: 2012-01-01.
[15] Gabriel Ionescu, Costin-Anton Boiangiu – “Mobile Platform for Environment
Analysis”, Annals of DAAAM for 2012, Proceedings of the 23rd International
DAAAM Symposium, Austria 2012, pp. 0587–0590