SEMINAR REPORT 2012
COMMUNICATION ROBOT SYSTEM BASED ON THE HANDSHAKING ACTION
DEPARTMENT OF E&I, VJEC, CHEMPERI
CHAPTER 1
INTRODUCTION
Robots have become indispensable in industrial production, and expectations are high for intelligent robot systems that work cooperatively with human beings in daily life, medical treatment, and welfare. Human-robot interaction is essential for the operation of robots by people: anyone can operate a robot with ease by giving it commands through gestures, just as people communicate with gestures among themselves. An intelligent manipulator system using gesture recognition has been developed, and omnidirectional images have been used for robot control based on hand gestures. A communication robot system based on stereo vision and voice instructions has also been developed, and a control algorithm for a service robot performing a hand-over task has been proposed. This report discusses human-robot interaction based on the handshaking action. We developed a communication robot, HAKUEN, composed of a multimedia robot with a stereo camera, a wheeled mobile robot, and a PC with a microphone. HAKUEN approaches the operator and holds out its hand according to voice commands. It detects the operator's face based on the pixel values of the flesh tint in the color image, and uses stereo disparity to calculate the distance between the robot and the operator. The effectiveness of the system is demonstrated by several experimental results.
CHAPTER 2
LITERATURE SURVEY
According to the UNECE 2004 World Robotics survey, worldwide investment in industrial robots rose 19% in 2003, and in the first half of 2004 orders for robots were up another 18% to the highest level ever recorded. Worldwide growth over 2004-2007 was forecast at an average annual rate of about 7%, with over 600,000 household robots in use and several million expected in the next few years. From this press release we can see that household (service) robots are becoming popular, which gives researchers more interest in making service robots user-friendly in a social context. Speech Recognition (SR) technology gives researchers the opportunity to add Natural Language (NL) communication with robots in a natural way. Thus the promise of robots that behave more like humans (at least from the perception-response point of view) is starting to become a reality [28]. Brooks' research [5] is an example of humanoid robot development that raised several research issues; among them, an important one is to develop machines that have human-like perception.
CHAPTER 3
ABOUT ROBOT
The term robot generally connotes some anthropomorphic (human-like) appearance; consider robot arms for welding. The main goal of robotics is to build robot workers smart enough to replace humans in labor or in any kind of dangerous task that could be harmful to people. The idea of a robot made of mechanical parts came from science fiction. Three classical films, Metropolis (1926), The Day the Earth Stood Still (1951), and Forbidden Planet (1956), cemented the connotation that robots were mechanical in origin, ignoring the biological origins in Capek's play. To work as a replacement for humans, a robot needs some intelligence in order to function autonomously, and AI (Artificial Intelligence) gives us the means to meet this requirement. Three paradigms are followed in AI robotics, depending on the problem: Hierarchical, Reactive, and Hybrid deliberative/reactive. Applying the right paradigm makes problem solving easier. An overview of the three paradigms, in terms of the three commonly accepted robotic primitives, is given in Figure 3.1. In our project we follow the Hybrid reactive paradigm.
Fig 3.1: Three paradigms: a) Hierarchical b) Reactive c) Hybrid reactive
CHAPTER 4
ROBOT CONSTRUCTION
We developed the communication robot HAKUEN, shown in Figure 4.1. The system is composed of a multimedia robot with a stereo camera, a wheeled mobile robot, and a PC with a microphone. HAKUEN has two arms, each with six degrees of freedom, and the head of the multimedia robot has two degrees of freedom. Several LEDs are mounted around the robot's eyes, and the base is a two-wheeled mobile robot. When the operator gives a voice command, HAKUEN approaches and holds out its hand toward the operator; it moves according to the operator's voice commands. We implemented four motion functions for HAKUEN, described below.
(1) Face tracking function
HAKUEN moves its head in order to follow the operator's face motion. We call this motion the "face tracking function". The operator's face is detected based on the pixel values of the flesh tint in the color image.
(2) Handshaking function
HAKUEN holds out its right hand toward the operator so that the operator can shake the robot's hand. We call this motion the "handshaking function".
(3) Voice recognition function
HAKUEN moves according to the operator's voice commands. We call this motion the "voice recognition function". We use voice recognition software (ViaVoice, IBM), controlled through an ActiveX program, to recognize the voice commands.
(4) Approach function
We consider the suitable distance range between HAKUEN and the operator to be 0.6 m to 1.2 m. The robot approaches the operator and maintains this distance. We call this motion the "approach function".
Our assistive robot system is shown in Figure 4.1. It is composed of the manipulator, a PC, a microphone, and stereo vision hardware. The manipulator used here has six degrees of freedom and a mechanical hand. Since the system has to recognize the position and posture of the hand in real time, we use dedicated stereo vision hardware. The operator gives hand gestures to the manipulator conversationally: for example, when the operator points at an object with the forefinger and gives a voice instruction indicating the target, the manipulator picks up the object and hands it over to the operator.
Fig 4.1: The HAKUEN robot
CHAPTER 5
FACE TRACKING FUNCTION
First, HAKUEN has to detect the human face in the color image. The face is detected based on the pixel values of the flesh tint. The color image is digitized as 24-bit RGB (Red, Green, Blue) pixel values, so each RGB element has 8 bits, or 256 levels of brightness. However, RGB values are easily influenced by lighting. Therefore, we use the HLS (hue, lightness, saturation) color specification system in order to detect the human face more robustly. Each element of the HLS system is calculated from the RGB pixel values. To detect the face, we transform the color image into a binary image based on threshold values in the HLS system; the threshold values for the flesh tint were determined through experiment.
The face detection system operates in two stages: it first applies a set of neural network-based filters to an image, and then uses an arbitrator to combine their outputs. The filters examine each location in the image at several scales, looking for locations that might contain a face; the arbitrator then merges detections from individual filters and eliminates overlapping detections.

The first component of the system is a filter that receives as input a 20x20 pixel region of the image and generates an output ranging from 1 to -1, signifying the presence or absence of a face, respectively. To detect faces anywhere in the input, the filter is applied at every location in the image. To detect faces larger than the window size, the input image is repeatedly reduced in size (by subsampling), and the filter is applied at each size. The filter must have some invariance to position and scale, and the amount of invariance determines the number of scales and positions at which it must be applied. For the work presented here, the filter is applied at every pixel position, and the image is scaled down by a factor of 1.2 for each step in the pyramid. First, a preprocessing step, adapted from [21], is applied to a window of the image; the window is then passed through a neural network, which decides whether the window contains a face. The preprocessing first attempts to equalize the intensity values across the window. A function that varies linearly across the window is fitted to the intensity values in an oval region inside the window. Pixels outside the oval may represent the background, so those intensity values are ignored in computing the lighting variation across the face. The linear function approximates the overall brightness of each part of the window and can be subtracted from the window to compensate for a variety of lighting conditions. Then histogram equalization is performed, which non-linearly maps the
intensity values to expand the range of intensities in the window. The histogram is computed for pixels inside an oval region in the window. This compensates for differences in camera input gains, as well as improving contrast in some cases. The preprocessed window is then passed through a neural network. The network has retinal connections to its input layer. There are three types of hidden units: 4 which look at 10x10 pixel subregions, 16 which look at 5x5 pixel subregions, and 6 which look at overlapping 20x5 pixel horizontal stripes. Each of these types was chosen to allow the hidden units to detect local features that might be important for face detection. In particular, the horizontal stripes allow the hidden units to detect features such as mouths or pairs of eyes, while the hidden units with square receptive fields might detect features such as individual eyes, the nose, or corners of the mouth. Although a single hidden unit can be shown for each subregion of the input, these units can be replicated; for the experiments described later, networks with two and three sets of these hidden units are used. Similar input connection patterns are commonly used in speech and character recognition tasks [10, 24]. The network has a single, real-valued output, which indicates whether or not the window contains a face.
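The pyramid construction described above can be sketched as follows. Only the sequence of level sizes is computed here; the 20x20 window size and 1.2 scale factor come from the text, while everything else is an illustrative simplification.

```python
def pyramid_sizes(width, height, window=20, scale=1.2):
    """Return the (width, height) of each level of the image pyramid,
    shrinking by `scale` per step until the image no longer fits the
    detector's 20x20 window."""
    sizes = []
    w, h = width, height
    while w >= window and h >= window:
        sizes.append((w, h))
        w, h = int(w / scale), int(h / scale)
    return sizes
```

The 20x20 filter would then be slid over every pixel position of every level, so that a face larger than the window at full resolution becomes window-sized at some coarser level.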
Fig 4.1: Detection of the flesh tint, case 1: (a) saturation threshold constant, S = 40-255 (b) saturation threshold adjusted, S = 70-255
Fig 4.2: Detection of the flesh tint, case 2: (a) saturation threshold constant, S = 40-255 (b) saturation threshold adjusted, S = 20-255
5.2 STAGE TWO: MERGING OVERLAPPING DETECTIONS AND
ARBITRATION
The raw output from a single network will contain a number of false detections. In this section, we present two strategies to improve the reliability of the detector: merging overlapping detections from a single network, and arbitrating among multiple networks.

5.2.1 Merging Overlapping Detections
Note that in Fig. 3, most faces are detected at multiple nearby positions or scales, while false detections occur with less consistency. This observation leads to a heuristic which can eliminate many false detections. For each location and scale, the number of detections within a specified neighborhood of that location can be counted. If the number is above a threshold, then that location is classified as a face. The centroid of the nearby detections defines the location of the detection result, thereby collapsing multiple detections. In the experiments, this heuristic is referred to as thresholding.

If a particular location is correctly identified as a face, then all other detection locations which overlap it are likely to be errors and can therefore be eliminated. Based on the above heuristic regarding nearby detections, we preserve the location with the higher number of detections within a small neighborhood, and eliminate locations with fewer detections. In the discussion of the experiments, this heuristic is called overlap elimination. There are relatively few cases in which this heuristic fails; one such case is illustrated by the left two faces in Fig. 3B, where one face partially occludes another.

The implementation of these two heuristics is illustrated in Fig. 6. Each detection at a particular location and scale is marked in an image pyramid, labelled the output pyramid. Then each location in the pyramid is replaced by the number of detections in a specified neighborhood of that location. This has the effect of spreading out the detections; the neighborhood extends an equal number of pixels in the dimensions of scale and position. A threshold is applied to these values, and the centroids (in both position and scale) of all above-threshold regions are computed. All detections contributing to a centroid are collapsed down to a single point. Each centroid is then examined in order, starting from the ones with the highest number of detections within the specified neighborhood. If any other centroid locations represent a face overlapping the current centroid, they are removed from the output pyramid. All remaining centroid locations constitute the final detection result. In the face detection work described in [3], similar observations about the nature of the outputs were made, resulting in the development of similar heuristics.
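A position-only sketch of the thresholding heuristic (ignoring the scale dimension for brevity) might look like this. The neighborhood radius and count threshold are illustrative assumptions, not values from the paper.

```python
def merge_detections(detections, radius=2, threshold=2):
    """Collapse nearby (x, y) detections: count neighbors within `radius`
    (Chebyshev distance), keep clusters whose count reaches `threshold`,
    and replace each surviving cluster with its centroid."""
    def near(a, b):
        return max(abs(a[0] - b[0]), abs(a[1] - b[1])) <= radius

    results = []
    used = set()
    for i, d in enumerate(detections):
        if i in used:
            continue
        cluster = [j for j, e in enumerate(detections) if near(d, e)]
        if len(cluster) >= threshold:
            xs = [detections[j][0] for j in cluster]
            ys = [detections[j][1] for j in cluster]
            results.append((sum(xs) / len(xs), sum(ys) / len(ys)))
            used.update(cluster)
    return results
```

An isolated detection (count 1) falls below the threshold and is discarded as a likely false positive, while a pair of adjacent detections collapses to one centroid.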
5.2.2 Arbitration among Multiple Networks
To further reduce the number of false positives, we can apply multiple networks, and arbitrate
between their outputs to produce the final decision. Each network is trained in a similar manner, but with
random initial weights, random initial nonface images, and permutations of the order of presentation of the
scenery images. As will be seen in the next section, the detection and false positive rates of the individual
networks will be quite close. However, because of different training conditions and because of self-
selection of negative training examples, the networks will have different biases and will make different
errors.. Each detection at a particular position and scale is recorded in an image pyramid, as was done with
the previous heuristics. One way to combine two such pyramids is by ANDing them. This strategy signals
a detection only if both networks detect a face at precisely the same scale and position. Due to the
different biases of the individual networks, they will rarely agree on a false detection of a face. This
allows ANDing to eliminate most false detections. Unfortunately, this heuristic can decrease the detection
rate because a face detected by only one network will be thrown out. However, we will see later that
individual networks can all detect roughly the same set of faces, so that the number of faces lost
due to ANDing is small.
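Modelling each output pyramid as a set of (scale, x, y) detections, the ANDing and ORing combinations reduce to set operations; this is a simplification for illustration, not the paper's implementation.

```python
def and_pyramids(pyr_a, pyr_b):
    """AND two detection pyramids: keep only locations flagged by both
    networks at exactly the same scale and position."""
    return pyr_a & pyr_b

def or_pyramids(pyr_a, pyr_b):
    """OR two detection pyramids: a location counts if either network flags it."""
    return pyr_a | pyr_b
```

ANDing trades recall for precision (a face seen by only one network is lost), while ORing does the opposite.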
Similar heuristics, such as ORing the outputs of two networks or voting among three networks, were also tried. Each of these arbitration methods can be applied before or after the thresholding and overlap elimination heuristics. If applied afterwards, we combine the centroid locations rather than the actual detection locations, and require them to be within some neighborhood of one another rather than precisely aligned.
Arbitration strategies such as ANDing, ORing, or voting seem intuitively reasonable, but perhaps there are less obvious heuristics that could perform better. To test this hypothesis, a separate neural network was applied to arbitrate among multiple detection networks. For a location of interest, the arbitration network examines a small neighborhood surrounding that location in the output pyramid of each individual network. For each pyramid, the number of detections in a 3x3 pixel region at each of three scales around the location of interest is counted, resulting in three numbers for each detector; these are fed to the arbitration network, as shown in Fig. 8. The arbitration network is trained to produce a positive output for a given set of inputs only if that location contains a face, and a negative output for locations without a face.

When HAKUEN looks down, the total saturation value in the color image increases.
Therefore, the threshold values of saturation and value are adjusted automatically based on the total values of saturation and value in the color image. Examples of flesh-tint detection using image processing are shown in Figure 4.1 and Figure 4.2. Figure 4.1(a) is the case of a constant saturation threshold when HAKUEN looks down: since the color of the floor is similar to the flesh tint, the floor area is also detected as flesh tint. In Figure 4.1(b), the flesh-tint area is detected correctly, because the thresholds of saturation and value are adjusted automatically based on their totals in the color image. After the flesh-tint area is detected, the human face is recognized by considering the maximum area and the circularity of the flesh-tint regions. We set the threshold value of the circularity to 0.1.
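The circularity test on candidate flesh-tint regions could be sketched as below. The report does not state its exact circularity formula, so the standard definition 4*pi*A/P^2 (1.0 for a perfect circle) is shown here as an assumption; the 0.1 threshold is the report's value.

```python
import math

def circularity(area, perimeter):
    """Standard circularity measure: 4*pi*A / P^2 (assumed formula).
    Equals 1.0 for a perfect circle and approaches 0 for elongated shapes."""
    return 4.0 * math.pi * area / (perimeter ** 2)

def is_face_candidate(area, perimeter, threshold=0.1):
    """Accept a flesh-tint region as a face candidate if it is round enough."""
    return circularity(area, perimeter) >= threshold
```

A round region (e.g. a face) passes easily, while a long thin region such as an arm or a floor strip is rejected.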
5.3 EXPERIMENT OF THE FACE TRACKING FUNCTION
The face tracking function makes HAKUEN move its head so as to keep the operator's face at the center of the image. The relative locations of the operator and HAKUEN are shown in Figure 5.1. The number of operators was five, and each operator gave the voice command thirty times. The system detected the face in all cases, and the average face detection time was 18.16 s.

Fig 5.1: Experiment on the face tracking function
CHAPTER 6
DISTANCE DETECTION USING STEREO IMAGE
Since HAKUEN has two cameras, we use the disparity of the stereo image pair to detect the distance between HAKUEN and the operator. As shown in Figure 6.1, the disparity is the difference between the target object's position in the right image and its position in the left image. When the positions of the two cameras are fixed, the disparity changes according to the distance L between the object and the camera, so we can obtain the distance L between the robot and the operator from the disparity.

Fig 6.1: Disparity between the two images

As the figure shows, the disparity decreases as the distance between the camera and the object increases. The disparity is the difference, in pixels, between the center positions of the human face in the right and left images. The relation between the disparity and the distance to the object was obtained through experiment.
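The disparity-to-distance relation can be sketched with the standard pinhole stereo formula L = f * B / d. The focal length (in pixels) and baseline below are hypothetical values; the report instead calibrated the relation experimentally.

```python
def distance_from_disparity(disparity_px, focal_px=500.0, baseline_m=0.1):
    """Pinhole stereo relation L = f * B / d: distance is inversely
    proportional to the measured disparity (both cameras fixed)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px
```

Doubling the disparity halves the estimated distance, matching the observation that disparity decreases as the object moves away.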
6.1 EXPERIMENTS OF THE APPROACH FUNCTION

The approach function makes HAKUEN approach the operator and keep a suitable distance (0.6 m to 1.2 m). We defined five cases of initial distance (1.5 m, 2.0 m, 2.5 m, 3.0 m, 3.5 m) between HAKUEN and the operator. Each case of the experiment was run thirty times, and the average success rate was 88.33%. An example of the approach function is shown in Figure 6.2.
(a) initial state (b) approach function
Fig 6.2: Experiment on the approach function
CHAPTER 7
ROBOT HANDSHAKING ACTION
7.1 DETECTION OF THE HAND
First, the system has to detect the hand area in the image of the workspace. The hand area is detected based on the RGB pixel values of the flesh tint in the colour image. Since RGB values are easily influenced by lighting, we use the hue of the flesh tint to reduce this influence: the flesh-tint area is detected roughly in the colour image using the hue value, and the noise is removed using the RGB values.
Fig 7.1: Robot system
After the hand area is detected using the RGB values and the hue value of the colour image, we determine the center position of the hand, called the CP, in order to trace the hand. Since the size of a human fist is approximately equal to a sphere with a radius of 40 mm, the system searches for the center of the sphere with the maximum density of flesh-tint pixels. The center of this sphere is regarded as the CP of the hand. Once the CP is detected, the hand is traced by tracking the CP.
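A 2-D sketch of the CP search might look like the following: the pixel whose surrounding window contains the most flesh-tint pixels is taken as the CP. The square window is a stand-in for the 40 mm sphere, and the pixel radius would in practice be derived from the camera calibration; both are assumptions here.

```python
def find_cp(mask, radius=4):
    """Locate the hand center (CP): scan a 0/1 flesh-tint mask and return
    the (x, y) whose square window of the given radius holds the most
    flesh-tint pixels (a density-maximum search)."""
    best, best_count = None, -1
    h, w = len(mask), len(mask[0])
    for y in range(h):
        for x in range(w):
            count = sum(
                mask[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            )
            if count > best_count:
                best, best_count = (x, y), count
    return best
```

Once the CP is found in one frame, tracking can restrict this search to a neighborhood of the previous CP.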
7.2 RECOGNITION OF THE HAND GESTURE
As shown in Figure 7.2, we define several instructions using hand configurations, and make the manipulator move in accordance with these hand-gesture instructions. For example, when the operator opens the hand upwards (Inst. 2), the manipulator delivers the object.

Inst.1 Grasp Inst.2 Deliver the object Inst.3 Approach Inst.4 Stand by
Fig 7.2.Instructions of hand gestures
We define three characteristic dimensions (A, B, and C) of the hand in order to recognize the hand gesture rapidly. As shown in Figure 7.3, the hand gestures are divided into branches based on conditions on these lengths. Length A is the distance from the CP to the tip of the forefinger, length B is the maximum width of the hand block, and length C is the maximum width of the finger block. For example, if length A is less than 60 mm, we consider that the operator has closed the hand, and the gesture means Instruction 1; if length A is more than 60 mm, we calculate length B. Because we use the three characteristic dimensions rather than the whole hand configuration, the gesture is determined rapidly.
Fig 7.3: General flow of the recognition of the hand gesture
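The decision tree over the three dimensions can be sketched as below. Only the first branch (A < 60 mm means a closed fist, Instruction 1) is stated in the report; the remaining branches and their thresholds are hypothetical placeholders.

```python
def classify_gesture(a_mm, b_mm, c_mm, b_thresh=80, c_thresh=25):
    """Decision-tree gesture classifier over the three hand dimensions:
    A (CP to forefinger tip), B (hand-block width), C (finger-block width).
    Returns the instruction number 1-4."""
    if a_mm < 60:
        return 1        # closed fist: Inst.1 Grasp (threshold from the report)
    if b_mm >= b_thresh:
        return 2        # wide open hand: Inst.2 Deliver the object (assumed)
    if c_mm >= c_thresh:
        return 3        # Inst.3 Approach (assumed)
    return 4            # Inst.4 Stand by (assumed)
```

Because only three scalar comparisons are needed, the classification is fast compared with matching the whole hand configuration.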
7.3 EXPERIMENTS OF THE HAND SHAKING FUNCTION
The handshaking function makes HAKUEN hold out its right hand toward the operator when it is at a suitable distance (0.6 m to 1.2 m) from the operator. We defined four cases of distance (0.4 m, 0.8 m, 1.0 m, 1.4 m) and three cases of angle (-20°, 0°, 20°) between HAKUEN and the operator. Each case of the experiment was run thirty times. The percentage of success is shown in the table; the average success rate was 97.63%.

Fig 7.4: Experiment on the handshaking function
CHAPTER 8
VOICE INSTRUCTIONS
Speech Recognition technology promises to change the way we interact with machines (robots, computers, etc.) in the future. The technology is maturing day by day, and scientists are still working hard to overcome its remaining limitations. Nowadays it is being introduced into many important areas of the social context: in aerospace, where the training and operational demands on the crew have increased significantly with the proliferation of technology [27], and in the operating theater, as a surgeon's aid to control lights, cameras, pumps, and equipment by simple voice commands [1]. Speech recognition is the process of converting an acoustic signal, captured by a microphone or a telephone, into a set of words [8]. There are two important parts to speech recognition: i) recognizing the series of sounds, and ii) identifying the words from the sounds. The recognition technique also depends on many parameters: speaking mode, speaking style, speaker enrollment, vocabulary size, language model, perplexity, transducer, etc. [8]. There are two speaking modes: one word at a time (isolated-word speech) and continuous speech. Depending on speaker enrollment, speech recognition systems can also be divided into speaker-dependent and speaker-independent systems: in a speaker-dependent system the user needs to train the system before using it, whereas a speaker-independent system can identify any speaker's speech. Vocabulary size and the language model are also important: language models, or artificial grammars, are used to constrain the word combinations in a series of words or sounds, and the vocabulary should be kept to a suitable size.

The system does not determine the position of the target object by image processing when many objects lie on the table; instead, it recognizes the configuration and colour of the target object from the voice instruction. For example, when the operator gives the voice instruction "Take the red ball" to the manipulator, the position of the red ball is determined in the workspace. The operator gives voice commands to the system in order to move the HAKUEN.

The voice commands are composed of simple words. We define six voice commands, which are shown in Table 8.1. For example, when the operator gives the voice command "a ku shu", the HAKUEN approaches the operator and holds out its right hand to shake hands with the operator. We use voice recognition software (ViaVoice, IBM) in order to recognize the voice commands.
Table 8.1: Voice commands
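Dispatching a recognized command string to a motion function can be sketched as a simple lookup table. Only "a ku shu" (handshake) and "i do o" (move) appear in the report; the remaining mapping entries, the handler names, and the fallback behaviour are hypothetical.

```python
# Hypothetical command table; the report defines six commands in total.
COMMANDS = {
    "a ku shu": "handshake",  # approach and hold out the right hand
    "i do o": "approach",     # approach the operator and stop
}

def dispatch(recognized_text):
    """Map the recognizer's output string to a motion function name,
    falling back to standing by for unrecognized commands."""
    return COMMANDS.get(recognized_text.strip().lower(), "stand_by")
```

The recognizer (ViaVoice in the report) would feed its output string into `dispatch`, and the returned name would select the motion function to execute.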
8.1 EXPERIMENTS OF THE VOICE RECOGNITION FUNCTION

The voice recognition function makes HAKUEN move according to the voice commands. We defined six voice commands. Each voice command was given forty times in the experiment, and the number of operators was five. The average recognition rate of the voice commands was 92.8%.
CHAPTER 9
TOTAL EXPERIMENT OF THE SYSTEM

We carried out a total experiment in order to clarify the effectiveness of the system. First, the operator gives the voice command "i do o" to the HAKUEN; HAKUEN then approaches the operator and stops at a suitable position in front of the operator. Next, the operator gives the voice command "a ku shu", and HAKUEN holds out its right hand toward the operator. We defined four cases of initial distance (1.5 m, 2.0 m, 2.5 m, 3.0 m) between HAKUEN and the operator. Each case of the experiment was run thirty times, and the average success rate was 81.67%.
Fig 9.1: Total experiment
CONCLUSION

Human-robot interaction is an important, attractive, and challenging research area. The popularity of service robots gives researchers more incentive to work on user interfaces that make robots user-friendly in a social context, and Speech Recognition (SR) technology offers the opportunity to add Natural Language (NL) communication with robots in a natural way.

In this report, we presented the communication robot HAKUEN, based on image processing and voice recognition. The system has four motion functions: face tracking, handshaking, voice recognition, and approach. The average success rate of the total experiment was 81.67%. In future work, many more functions must be defined for the practical application of the system.
REFERENCES

[1] N. Yamasaki and Y. Anzai, "Active Interface for Human-Robot Interaction", Proc. of the IEEE Int. Conf. on Robotics and Automation, pp. 3103-3109, 1995.
[2] N. Kawarazaki, N. Kashiwagi, I. Hoya and K. Nishihara, "Manipulator Work System Using Gesture Instructions", Journal of Robotics and Mechatronics, Vol. 14, No. 5, pp. 506-513.
[3] N. Kawarazaki, Y. Suzuki, Y. Takashima, K. Nishihara and T. Yoshidome, "Robot Control System Using Omnidirectional Image", Proc. of Japan-China Conference on Mechatronics 2005, pp. 97-98.
[4] N. Kawarazaki, K. Kawashima, T. Yoshidome and K. Nishihara, "Communication Robot System Based on Stereo Vision and Voice Instructions", Proc. of China-Japan Conference on Mechatronics 2007, pp. 23-25.
[5] A. Agah and K. Tanie, "Human Interaction with a Service Robot: Mobile-Manipulator Handing Over an Object to a Human", Proc. of the IEEE Int. Conf. on Robotics and Automation, pp. 575-580.
[6] John C. Russ, The Image Processing Handbook, CRC Press in cooperation with IEEE Press, 1999.