Shintani, M. et al.
Paper:
Object Grasping Instructions to Support Robot by Laser Beam One Drag Operations
Momonosuke Shintani∗, Yuta Fukui∗, Kosuke Morioka∗, Kenji Ishihata∗,
Satoshi Iwaki∗, Tetsushi Ikeda∗, and Tim C. Luth∗∗
∗Hiroshima City University
3-4-1 Ozukahigashi, Asaminami, Hiroshima, Hiroshima 731-3194, Japan
E-mail: {shintani, fukui, morioka, ishihata, iwaki, ikeda}@robotics.info.hiroshima-cu.ac.jp
∗∗Technical University of Munich (TUM)
Boltzmannstrasse 15, Garching 85748, Germany
E-mail: [email protected]
[Received January 20, 2021; accepted May 7, 2021]
We propose a system in which users can intuitively in-
struct the robot gripper’s positions and attitudes sim-
ply by tracing the object’s grasp part surface with
one stroke (one drag) of the laser beam. The pro-
posed system makes use of the “real world clicker
(RWC)” we have developed earlier, a system capable
of obtaining with high accuracy the three-dimensional
coordinate values of laser spots on a real object by
mouse-operating the time-of-flight (TOF) laser sensor
installed on the pan-tilt actuator. The grasping point
is specified as the centroid of the grasp part’s plane re-
gion by the laser drag trajectory. The gripper attitude
is specified by selecting the left and right drag modes
that correspond to the PC mouse’s left and right click
buttons. By doing so, we realize a grasping instruction
interface where users can take into account various
physical conditions for the objects, environments, and
grippers. We experimentally evaluated the proposed
system by measuring the grasping instruction time of
multiple test subjects for various daily use items.
Keywords: laser distance sensor, pan-tilt actuator, real
world clicker, gripper attitude, grasping
1. Introduction
Calls for practical applications of robots for life and
nursing care support have recently grown louder. As the
real work of robots is to grasp and carry diverse objects in
our daily living spaces, it is extremely difficult at present
to completely automate their work of stably grasping un-
registered and atypical objects. Therefore, it is crucial
to develop a system for instructing robots on their motions
by making active use of the intelligence and abilities that
care receivers retain, that is, an instruction interface.
From the abovementioned perspectives, we
have developed an interface where various instructions
to a robot can be generated by clicking real objects by
Fig. 1. Real world clicker and support robot.
mouse-operating the time-of-flight (TOF) laser sensor in-
stalled on the pan-tilt actuator (Fig. 1) [1–4]. Users can
click real objects by operating the laser beam directions
with the PC mouse (to obtain the three-dimensional coor-
dinate values of the laser spots). The developed interface
enables drag-and-drop operations between a PC icon and
a real object or between two real objects. For example,
it enables a support robot to throw away trash by drag-
ging and dropping trash into a real trash box. However,
conventional systems, which are capable of instructing
the gripper position at the robot arm end but not its at-
titude, often fail to grasp objects. Therefore, in this study,
we propose a system capable of intuitively instructing not
only the gripper position but also its attitude by expanding
the conventional real world clicker (RWC).
756 Journal of Robotics and Mechatronics Vol.33 No.4, 2021
https://doi.org/10.20965/jrm.2021.p0756
© Fuji Technology Press Ltd. Creative Commons CC BY-ND: This is an Open Access article distributed under the terms of
the Creative Commons Attribution-NoDerivatives 4.0 International License (http://creativecommons.org/licenses/by-nd/4.0/).
2. Related Studies
2.1. Current Grasping Systems by Robot Hands
Studies on grasping objects by robot hands have been
available for a very long time, and their approaches are
roughly classified into automated and manual (instruc-
tions). The differences between typical conventional stud-
ies and our study can be summarized as follows.
We first refer to the differences between fully auto-
mated technologies such as artificial intelligence (AI) and
the system proposed in this study. In factories, ware-
houses, and other relatively well-organized environments,
automated grasping using AI, such as deep learning, has
seen a certain degree of success and is highly expected to
be put into practical use in the future [5–12]. However,
this technology has the following disadvantages: learning
takes a long time; it is difficult to deal with unregistered
objects; it is difficult to distinguish objects from the
background; it cannot take into account friction, weight,
etc.; and it cannot accommodate a grasping strategy that
anticipates how the objects will be handled after grasping them.
Such disadvantages seem to have limited its application
scope. Besides AI, there are numerous approaches avail-
able to solve the planning of robot hand motion based on
the path-planning scheme [13]. They have the same dis-
advantages as mentioned above, and their applications to
many different types of shapes, such as those of daily use
products, are still in the developing stage.
Next, we refer to the differences between the conven-
tional laser-pointing instruction systems and our study.
Because robot instruction systems using laser pointers are
very useful as intuitive and affordable object specifying
means, numerous studies have been conducted on such
robot instruction systems for many years [14–16]. However,
these systems use ordinary laser pointers with no laser-beam
distance-measuring function and need a separate camera
to measure the 3D coordinate values of the laser spots,
making their accuracy far lower than that of the
TOF laser measurements. Moreover, as users hold the laser
pointer in their hands, they cannot accurately re-irradiate
the laser onto a point they have once indicated, making it
impossible to specify objects by re-irradiation of the laser.
We now refer to the differences between conventional
teleoperation systems and the system proposed in this
study. The system where users operate the robot arm
by viewing the camera images installed on a teleoperator
robot on the display screen at hand (called a teleoperation
system) has been used as a remote operation system in
dangerous places such as nuclear power plants and seas
for many years [17–19]. Such systems are best suited to
operating objects that are too remote to view directly. When
applying a teleoperation system in our households, users
would need to go to the trouble of first moving the robot
body itself to a target object and then operating it while
closely observing it on the screen at hand, even if the
object is close by and nearly within reach. In contrast,
the system proposed in this paper has the advantage of
providing extremely intuitive instructions while the user
visually observes real objects in a relatively small room.
In addition, users can concentrate solely on directing the
laser at the objects of interest without paying attention
to the presence of the robot. In other words, the proposed
system is specialized for use in a relatively narrow space
where the entire room is visible.
In addition to the above-mentioned studies, there are
some studies available on grasping objects based on the
shared autonomy concept [20, 21], which incorporates
both human instructions and machine learning. The ob-
jects that can be handled with this approach are limited to
those with relatively simple shapes, and these approaches
require the user to select menus and objects while viewing
the PC screen, among other procedures. This approach lacks
the intuitiveness of the proposed system, in which the user
operates while looking directly at objects of many different
shapes with the naked eye.
2.2. Overview of Conventional Real-World Clicker
Systems and Their Problems
With the aim of realizing early practical applications
of support robots, we have deliberately chosen not to
develop fully automated support robots but to take a manual
(semiautomated) approach, making the most of the intelligence
and abilities that care receivers retain together with the
robot's high-accuracy measurement and control technology.
Under this strategy, the most crucial task is to give robots
their motion instructions simply and reliably, that is,
to research and develop an intuitive instruction interface.
From the above-mentioned perspective, we have devel-
oped an intuitive interface (real world clicker) capable of
generating various object operation commands by operat-
ing with a PC mouse the high-accuracy TOF laser distance
sensor installed on the pan-tilt actuator and by clicking a
real object while viewing it (Fig. 1). The real-world click
here refers to measuring the 3D positions on the world co-
ordinate system of laser spots from the pan angles, tilt an-
gles, and laser beam lengths; its accuracy has a resolution
of approximately 5 mm for approximately 5 m ahead. A
user real-world-clicks a target object’s grasping point and
sends its coordinate values to the robot. Then, the robot
automatically moves closer to the object, grasps it with its
arm hand (an open/close-type gripper), and carries it to
the user. The conventional systems often fail to grasp ob-
jects because they can instruct the gripper’s position at the
robot arm end but cannot instruct its attitude. Therefore,
in this paper, we propose a system capable of instructing
both the gripper position and attitude intuitively, without
much difficulty.
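As a concrete illustration, the real-world click described above amounts to a spherical-to-Cartesian conversion from the pan angle, tilt angle, and measured beam length. The sketch below is ours, not the authors' code, and assumes a particular axis convention (pan about the vertical axis, tilt elevating from the horizontal plane), which may differ from the actual RWC hardware:

```python
import math

def laser_spot_xyz(pan_rad: float, tilt_rad: float, range_m: float):
    """Convert a pan/tilt/range measurement into a 3D laser-spot
    position in the pan-tilt (world) frame Sigma_PT.
    Assumed convention: pan rotates about the vertical z-axis,
    tilt elevates the beam from the horizontal xy-plane."""
    horiz = range_m * math.cos(tilt_rad)   # projection onto the horizontal plane
    return (horiz * math.cos(pan_rad),     # x
            horiz * math.sin(pan_rad),     # y
            range_m * math.sin(tilt_rad))  # z
```

For example, a spot measured straight ahead (pan = tilt = 0) at 5 m lies at (5, 0, 0) under this convention.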
3. Proposed System
3.1. This Study’s Approach
When grasping an object, humans can instantaneously
determine where and how to grasp it, comprehensively
considering not only its physical properties, such as size,
shape, attitude, weight, friction, and stiffness, but also
Fig. 2. Experiment system components and coordinate systems.
its surroundings and post-grasping operations. In other
words, humans notice where to grasp (grasp part) from the
object’s entire shape and determine the desirable grasping
points within the grasp part and the grasping attitudes. In
this study, based on the above-mentioned excellent human
abilities, we discuss how to instruct grasping operations to
robots as simply as possible, making use of the drag op-
eration function possessed by RWC. In other words, we
continuously measure the laser spot positions by moving
the PC mouse while keeping its button pressed, that is, we
acquire the laser beam trajectories on the grasp part sur-
face and use them to determine the grasping points and
attitudes. To achieve minimum difficulty, our basic ap-
proach is to realize the above-mentioned operations using
only one drag (single stroke of the brush). We further con-
sider how to specify the desirable direction from which to
approach an object, taking into account the environment
in which it is placed and its grasp part shape, by selective
use of the PC mouse’s left/right click buttons.
3.2. Definitions of System’s Coordinate Systems
Figure 2 shows the components of the system consid-
ered in this study and their respective coordinate systems.
The world coordinate system Σ_PT of the entire system
is fixed to the base of the RWC. The robot coordinate
system Σ_R is attached to the mobile robot, and it simultaneously
serves as the base coordinate system of the robot manipulator.
A two-finger open/close gripper is installed at the manipulator
tip as a robotic hand. With the center of the gripper's grasp
part set as the origin, the approach direction along the wrist's
extended line is denoted by z_G, and the coordinate axis
perpendicular to z_G on the plane formed by the two fingers is
denoted by x_G. The gripper coordinate system Σ_G is then
completed by y_G = z_G × x_G. The grasping object coordinate
system Σ_Obj is attached to the object as the target when the
gripper grasps it. The gripper coordinate system Σ_G is brought
toward Σ_Obj along the z_G direction, and when the two
coordinate systems coincide, the gripper is closed to execute
the grasp.
3.3. How to Determine Grasping Points
Users one-drag (in a single stroke) the RWC
laser over the object surface while viewing the neighborhood
of the grasp part of their own choice with the naked
Fig. 3. Relations between gripper coordinate system and
grasping object.
eye. The point group data obtained by the drag are
plane-approximated by the least-squares method, or other
methods, after removing noise and outliers, and all the
data are projected onto the said plane. Herein, we call the
laser-dragged trajectory the grasp part's contour curve;
the vector from the dragging start point to the end point,
the start-end point vector; and the closed region bounded by
the start-end point vector and the grasp part's contour curve,
the grasp part's plane region (Fig. 3). The centroid of the
grasp part's plane region is set as the object grasping point
Gp, and it serves as the origin of the grasping object
coordinate system Σ_Obj.
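The procedure of this section (plane fit, projection, centroid of the enclosed region) can be sketched as follows. This is an illustrative implementation, not the authors' code: it assumes an SVD-based least-squares plane fit and a shoelace (polygon) area centroid, and it omits the noise/outlier removal step:

```python
import numpy as np

def grasp_point(points: np.ndarray) -> np.ndarray:
    """Compute the grasping point Gp from an (N, 3) array of dragged
    laser-spot positions: fit a least-squares plane by SVD, project the
    points onto it, close the trajectory with the start-end segment, and
    return the area centroid of the enclosed polygon in 3D coordinates.
    Assumes a non-degenerate (non-collinear) trajectory."""
    c = points.mean(axis=0)
    # Right singular vectors: vt[0], vt[1] span the best-fit plane,
    # vt[2] is its normal.
    _, _, vt = np.linalg.svd(points - c)
    u, v = vt[0], vt[1]
    # 2D coordinates of the projected points within the plane.
    x = (points - c) @ u
    y = (points - c) @ v
    # Shoelace formula over the closed polygon; the wrap-around edge
    # (last point back to first) is the start-end segment that closes
    # the grasp part's plane region.
    xn, yn = np.roll(x, -1), np.roll(y, -1)
    cross = x * yn - xn * y
    area = cross.sum() / 2.0
    cx = ((x + xn) * cross).sum() / (6.0 * area)
    cy = ((y + yn) * cross).sum() / (6.0 * area)
    return c + cx * u + cy * v  # Gp back in 3D world coordinates
```

For a drag that traces, e.g., an L-shaped outline, the returned Gp is the area centroid of the closed region, not merely the mean of the measured spots.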
3.4. How to Determine Grasping Attitude
3.4.1. Select Directions to Approach Objects Using
Left/Right Click Buttons
This study assumes that with RWC arranged in the
neighborhood of the user’s head, the user’s visual line al-
most corresponds to the laser beam line. Then, we pro-
pose the following two methods for the user to select with
the PC mouse’s left/right click buttons the directions from
which the gripper approaches objects as viewed by the
user.
When the user wants to grasp an object from its front
direction as viewed by the user, the user can bring the
gripper toward the object along the laser's travel direction,
aligning the laser beam line's travel direction almost exactly
with the gripper's approach direction z_G by dragging the
laser with the PC mouse's right button; we call this the
right drag mode. When the user finds it easier to grasp the
object at its sides, i.e., from the left, right, upper, or
lower direction rather than from the front, the start-end
point vector is aligned with the gripper's approach direction
z_G by dragging the laser with the left button of the PC
mouse held down; we call this the left drag mode.
3.4.2. Algorithm to Derive the Grasping Object
Coordinate System Attitude
In correspondence with the above-mentioned two modes,
we describe below the algorithm to calculate the rotation
matrix ^PT R_Obj = (x_Obj  y_Obj  z_Obj), which represents the
attitude of the grasping object coordinate system Σ_Obj as
viewed from the pan-tilt coordinate system Σ_PT. Figs. 4
Fig. 4. Calculation process for grasping position and atti-
tude (right drag).
Fig. 5. Calculation process for grasping position and atti-
tude (left drag).
and 5 show the schematics of the calculation processes.
The start and end points of the point group data obtained
by dragging are denoted by S and E, and the origin
of Σ_PT by O_PT. To obtain ^PT R_Obj, we first construct a
provisional rotation matrix R_Obj = (x_Obj  y_Obj  z_Obj) by
each of the above-mentioned methods.

In the right drag mode, the gripper's approach direction
z_G is aligned with the laser beam emitting direction:
z_Obj is the vector from O_PT to Gp ((i) of Fig. 4). To make
the gripper's open/close plane parallel to the grasp part's
plane region, x_Obj is the vector from S to E, and then
y_Obj = z_Obj × x_Obj ((ii) of Fig. 4).

In the left drag mode, to align the gripper's approach
direction z_G with the start-end point vector, z_Obj is the
vector from S to E ((i) of Fig. 5). To make the gripper's
open/close plane parallel to the grasp part's plane region,
x_Obj is the vector from S to Gp, and then
y_Obj = z_Obj × x_Obj ((ii) of Fig. 5).

In both methods, normalizing y_Obj = y_Obj/||y_Obj|| and
z_Obj = z_Obj/||z_Obj|| and recomputing x_Obj = y_Obj × z_Obj
makes ^PT R_Obj = (x_Obj  y_Obj  z_Obj) an orthonormal
coordinate system, which yields the desired ^PT R_Obj
((iii) of Figs. 4 and 5).
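The two attitude constructions and the final orthonormalization can be condensed into a short sketch, using the same definitions of S, E, Gp, and O_PT as above (an illustration of the algorithm, not the authors' implementation):

```python
import numpy as np

def object_attitude(S, E, Gp, O_PT=np.zeros(3), right_drag=True):
    """Build the rotation matrix ^PT R_Obj = (x_Obj y_Obj z_Obj) from
    the drag start point S, end point E, and grasping point Gp (all
    expressed in the pan-tilt frame), following the right/left drag
    rules of Section 3.4.2."""
    S, E, Gp = (np.asarray(p, dtype=float) for p in (S, E, Gp))
    if right_drag:
        z = Gp - O_PT  # approach along the laser beam emitting direction
        x = E - S      # open/close plane parallel to the plane region
    else:
        z = E - S      # approach along the start-end point vector
        x = Gp - S
    y = np.cross(z, x)
    # Orthonormalization: normalize y and z, then recompute x = y × z.
    y /= np.linalg.norm(y)
    z /= np.linalg.norm(z)
    x = np.cross(y, z)
    return np.column_stack([x, y, z])  # columns are x_Obj, y_Obj, z_Obj
```

The returned matrix is orthonormal and right-handed by construction, since y and z are normalized and mutually perpendicular, and x = y × z completes the frame.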
3.5. Specific Examples of Basic Instruction Strategy
and Drags
3.5.1. Example of Basic Instruction Strategy
We first present specific examples of various instruc-
tion strategies, taking familiar daily use products as ex-
amples. Basically, we determine desirable grasping points
and attitudes by considering the object’s physical proper-
ties, gripper, external environment, etc. Then, we select
Fig. 6. Grasp PET bottle from left side (left drag).
Fig. 7. Grasp PET bottle from front (right drag).
Fig. 8. Grasp PET bottle from upper part (left drag).
a drag mode to realize them and specify an appropriate
grasp part’s plane region using one drag. Fig. 6 shows an
example of the drag trajectory when trying to grasp a PET
bottle from the horizontal and left directions. As the start-
end point vector is almost horizontal and headed from left
to right, and Gp is positioned inside the PET bottle, we
can expect that it can be grasped with the left drag mode.
Fig. 7 shows an example of the drag trajectory when try-
ing to grasp a PET bottle from the user’s front. As Gp
is positioned inside the PET bottle, as shown in Fig. 6,
we can expect it to be grasped with the right drag mode.
Fig. 8 shows an example of the drag trajectory when try-
ing to grasp a PET bottle from the upper direction. As the
start-end point vector is not vertical but slightly inclined
to the left and Gp is positioned in the neighborhood of
the right end of the cap, we can expect it to be grasped
slightly inclined rather than vertically. Next, we present a
specific example of differences in accordance with the at-
titudes (directions and angles) of the same object. Fig. 9
Fig. 9. Example of differences in instruction strategy in accordance with the directions of the object.
Fig. 10. Example of an actually measured drag trajectory successful in giving instructions to enable grasping (left: sequence
photographs; middle: 3D data; right: mapping data on XY -plane).
Fig. 11. Example of an actually measured drag trajectory that failed in giving instructions to enable grasping (left: sequence
photographs; middle: 3D data; right: mapping data on XZ-plane).
shows a rectangular rod-like object placed on a wooden
base in different directions. As the ends of the rod-like
object protrude from the base into the open space, it is
advantageous to make them the grasp parts, because if
the object is grasped at its midsection, the robot hand is
expected to collide with the base, making it difficult to
grasp the rod. Therefore, we expect that dragging with
the left drag mode should enable the robot hand to grasp
with the expected attitude, as shown on the left and right
sides of Fig. 9.
3.5.2. Examples of Actually Measured Drag
Trajectories
Here, we present examples of the actually measured
drag trajectories. Taking a rectangular parallelepiped
shaped sponge as an example, Figs. 10 and 11 illustrate
the actual drag trajectories, coordinate values, and hand
attitudes instructed in the calculation processes described
in Section 3.4. More specifically, they show the sequence
photographs of the moving trajectories of the laser spot
and their corresponding drag trajectory points (3D coor-
dinate values in Σ_PT in Fig. 2 and their projections on
two-dimensional planes) in graphs. The drag start and end
points are also specified.
Figure 10 shows a successful example of instructions
for grasping an object in the left drag mode. We can
judge from the hand’s maximum opening width and the
object’s size that we should better aim horizontally at the
thinner part of the rectangular parallelepiped and horizon-
tally grasp it. Thus, we have almost horizontally dragged
over the L-shaped surface to generate an almost grasp
part’s plane region. The mapping graph at the right end of
Fig. 10 represents the characteristics of the L-shape. To
grasp the object, the hand approaches it at a slant along a
straight line from the start point S to the end point E.
Figure 11 shows a failure case of instructions to grasp
Table 1. Principal specifications of experiment system.
the object with the left drag mode, the same as in Fig. 10.
From the rightmost mapping graph on the XZ plane, we
can see the L-shape of the object's upper part. We attempted
to grasp the object as if hanging down from above the
rectangular parallelepiped but failed to clutch
the whole object because the length of the object's top face
exceeds the hand's maximum opening width.
In this section, we presented specific examples of
instruction strategies and drag trajectories for various
familiar objects. These examples indicate that, in the
proposed system, users first need to picture in their minds
the robot hand positions and attitudes that should enable a
generally stable grasp, taking into account not only the
shape, size, attitude, etc. of the object to be grasped, but
also the size of the robot hand in use. Then, they need to
perform the laser drag, including the selection of the
right/left drag mode, so as to obtain point groups that
assure the above-mentioned stable grasp. As described above,
the proposed system requires a degree of human consideration,
skill, and experience.
From another perspective, as users themselves can con-
sider every condition of the objects and the robot hand,
they can provide detailed grasping instructions that will
meet such conditions.
3.6. Robot Control System
Based on the object grasping point Gp obtained in
Section 3.3 and the rotation matrix ^PT R_Obj obtained in
Section 3.4, here we propose an object grasping control
system using the mobile robot, manipulator, and gripper
shown in Fig. 2. We assume that the robot arm's forward
kinematics and inverse kinematics are given by Eqs. (1)
and (2), respectively:

  ^R T_G = [ ^R R_G(q)  ^R p_G(q) ; 0  1 ],  . . . . . . (1)

  q = IK(^R R_G, ^R p_G),  . . . . . . . . . (2)

where q denotes the robot arm's joint angle vector, ^R R_G
is the rotation matrix representing the gripper attitude as
viewed from the robot coordinate system Σ_R, ^R p_G is the
vector representing the gripper position, ^R T_G is the
homogeneous transformation matrix, and IK is the inverse
kinematics function that outputs the joint angle vector
corresponding to the specified gripper position and attitude.
When controlling the mobile robot, it is moved into the
range where the gripper at the tip of the robot arm can
reach the target object. More specifically, the system
calculates the maximum distance between the origins of Σ_R
and Σ_G, max(||^R p_G(q)||), and measures the robot's
self-position ^PT p_R on Σ_PT to control its movement so
that the measured position satisfies Eq. (3). Here, we
assume that the target object's position ^PT r_Obj is the
coordinate value of the point Gp.

  max(||^R p_G(q)||) > ||^PT r_Obj − ^PT p_R||  . . . . (3)
To calculate the 3D coordinate value ^R r_Obj of the
target object on Σ_R, we first obtain the homogeneous
transformation matrix ^PT T_R, which represents the robot's
position and attitude on Σ_PT, from the self-position
measurement, and then apply Eq. (4):

  ( ^R r_Obj ; 1 ) = ^PT T_R^{-1} ( ^PT r_Obj ; 1 ).  . . . . (4)

By setting ^R p_G = ^R r_Obj and ^R R_G = ^PT R_R^{-1} ^PT R_Obj
in Eq. (2), where ^PT R_R is the rotation part of ^PT T_R,
q is obtained to determine the joint angles of the robot arm.
The open/close gripper at the robot hand end begins to close
after reaching its target position and attitude ^R p_G and ^R R_G,
and it stops at the moment the gripper's motor load exceeds
a fixed value, so that it grasps the object.
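Equations (3) and (4) and the substitution into Eq. (2) can be summarized in a short control-flow sketch (illustrative only; the IK function and the base motion controller are robot-specific and omitted here):

```python
import numpy as np

def plan_grasp(T_PT_R, r_obj_PT, R_obj_PT, reach_max):
    """Sketch of the Section 3.6 control flow (not the authors' code).
    T_PT_R: 4x4 pose of the robot base in the pan-tilt frame, from
    self-localization; r_obj_PT: grasping point Gp; R_obj_PT: object
    attitude ^PT R_Obj; reach_max: the arm's maximum reach
    max(||^R p_G(q)||)."""
    # Eq. (3): the base must be close enough for the arm to reach Gp.
    p_R = T_PT_R[:3, 3]
    if np.linalg.norm(r_obj_PT - p_R) >= reach_max:
        return None  # keep moving the base closer before arm planning
    # Eq. (4): express the target point in the robot frame.
    T_inv = np.linalg.inv(T_PT_R)
    r_obj_R = (T_inv @ np.append(r_obj_PT, 1.0))[:3]
    # Gripper targets for Eq. (2): ^R p_G = ^R r_Obj and
    # ^R R_G = ^PT R_R^{-1} ^PT R_Obj.
    R_G = T_inv[:3, :3] @ R_obj_PT
    return r_obj_R, R_G  # pass to q = IK(R_G, p_G)
```

If the check of Eq. (3) fails, the base keeps driving toward the object; once it passes, the returned targets are handed to the inverse kinematics function of Eq. (2).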
4. Evaluation Experiments
4.1. Overview of Entire Experiments
The purpose of the experiments is to verify the validity
of the proposed instruction system and evaluate its ini-
tial performance as a support robot system for grasping
daily use items. First, in the preliminary experiments, the
system developers conducted trial instruction and grasp-
ing experiments by themselves to grasp numerous famil-
iar daily use products. Next, we selected typical objects
based on the results of the preliminary experiments and
conducted instruction time evaluation experiments using
multiple test subjects. Table 1 presents the principal spec-
ifications of the main equipment in the experimental sys-
tem, and Fig. 12 shows the candidate grasping target ob-
jects. Fig. 13 shows the positional relations among the
user, real-world clicker, PC mouse, support robot, and
Fig. 12. Grasping the target daily use items in the preliminary experiments and laser drag trajectories (A–F: right drag; G–L: left drag).
Fig. 13. Appearance of experiment environment as viewed
from user’s viewpoint.
grasping object as viewed from the user’s viewpoint. On
the other hand, Fig. 1 shows the user, with the robot and
object behind him, operating the camera images installed
on the pan-tilt actuator through the PC’s viewing window.
It should be noted that in these evaluation experiments, the
user operates the system by directly viewing the object
with the naked eye, as shown in Fig. 13, without using the
PC’s viewing window. The grasping object was placed on
a wooden table approximately 22 cm high and approxi-
mately 50 cm away from the mobile robot. We took into
account the mobile robot’s position and attitude control
accuracy in deciding the above-mentioned distance be-
tween the robot and the object as the shortest distance
where the instruction-focused experiments can be imple-
mented, as aimed in this study. We also considered the
operable range of the 5-DOF manipulator in determining
the above-mentioned object installation height. We de-
fined it as successful grasping when the closed gripper
vertically lifted the grasped object to a height of approxi-
mately 40 cm.
4.2. Preliminary Experiments by System Developer
Himself
4.2.1. Experiment Overview
We tried the system’s instruction and grasping opera-
tions with all 12 types of objects, as shown in Fig. 12.
Fig. 12 illustrates, schematically superposed on their pho-
tographs, the laser dragging start and end points and tra-
jectories with the right drag mode (laser travelling direc-
tion) from A to F, and with the left drag mode (dragging
in the start-end point vector direction) from G to L. As
a result, we succeeded in grasping all objects, except L.
Below, we describe our thinking process and the
instruction strategy we decided on for each object.
4.2.2. Results and Consideration
For A (coffee cup), we easily succeeded in giving in-
structions with the right drag mode to grasp it at its body
face, avoiding the complex handle part, to ensure a stable
grasping. We attempted to grasp it at its handle part with-
out success because the handle part has a small surface
area, where it is difficult to measure the contour curve of
the grasp part stably in the depth direction.
For B (cup-and-ball), C (PET bottle), D (toilet paper),
and E (stuffed toy), roughly assuming them as objects
with vertical cylindrical shapes like that of A, we easily
succeeded in giving instructions to grasp them with the
right drag mode. As for B, we secured the grasp part’s
plane region with a large effective area by aiming at the
handle part with a relatively large diameter. As for E, we
dragged around the toy’s neck based on our judgment that
it should be easily and stably grasped there.
As for F (plastic plate) (right drag), unlike objects A–
E, it protrudes from the base, and we visually judged that
the object diameter exceeded the gripper’s opening width.
Therefore, we adopted the strategy of grasping it at its edge
from the laser traveling direction. Noticing that the plate
edge was roundish, we secured an almost vertical grasp
part's plane region so that the x_G z_G plane could be
rotated to be almost vertical.
As for G (sponge) (left drag), taking the gripper’s open-
ing width into account, we adopted the strategy to aim
horizontally at the thinner side face of the rectangular
parallelepiped. Noticing the sponge’s high flexibility, we
found it possible to grasp it crushed in the right drag mode
with the same drag trajectory.
As for H (gummed tape) (left drag), we determined that
it was possible to insert the gripper into the cylindrical
hollow because the hollow diameter was larger than the
finger of the gripper. Therefore, we secured the grasp
part’s plane region by taking advantage of the remaining
tape thickness.
As for I (measuring tape), which we had decided to
grasp with the left drag mode aiming at its thinner face,
we succeeded in grasping it with the right drag mode. As
its size was smaller than the gripper’s opening width, it
could be completely grabbed at its grasping point Gp.
As for J (wrapping container) (left drag), noting that it
protruded horizontally from the base, in the same manner
as for F, we adopted the strategy of horizontally approach-
ing it from its length direction. Thus, we secured a hori-
zontal grasp part’s plane region by intentionally bulging
the drag trajectory on the top surface of the wrapping
box. In addition, we succeeded in grasping it using the
L-shaped trajectory by left-dragging the tip sideways of
the wrapping container, as shown in Fig. 9.
As for K (food bag) (left drag), aiming to grab it from
above, we instructed to drag vertically downward from the
dragging start point to the end point.
L (game controller) (left drag) is the only item that we
did not succeed in grasping. This is attributable to the fact
that the gripper tip came into contact with the base, thus
activating the emergency stop of the robot. In such a case,
we think that the problem could be solved by installing a
force sensor on the gripper to provide mechanical flexibil-
ity to any contact with its external environment, as well as
by altering the position of the object grasping point Gp.
In the above-mentioned experiments, the proposed sys-
tem succeeded in grasping many different types of objects
almost exactly as instructed by the user, thus proving its
basic validity. We should bear in mind, however, that the
above-mentioned experiments were no more than prelim-
inary experiments that the system developer, who is fa-
miliar with the developed system, conducted on a trial-
and-error basis. To determine the extent to which general
users can use the developed system, in the next section
we conducted experiments with multiple test subjects to
verify its usability.
4.3. Instruction Time Evaluation Experiments by
Plural Subjects
4.3.1. Experiment Method
Based on the results of the preliminary experiment in
Section 4.2, we selected A, C, F, G, J, and K as the
experiment objects because they are characteristic objects
for which a user may hesitate in selecting the desirable
drag mode. In this experiment, we asked eight test sub-
jects (males in their 20s), who had never used the system
before, to give grasping instructions on it and measured
their instruction time and grasping success rates to evalu-
ate the usability and performance of the grasping support
system. Here, instruction time refers to the total time re-
quired for a test subject to select a drag mode while view-
ing an object, to plan its dragging start and end points in
his mind, and to finish one drag operation with the PC
mouse based on the plan. In the experiments, the test sub-
Fig. 14. Grasping objects and drag trajectories in instruction
time evaluation experiments by plural test subjects (arranged
from upper left to lower right in trial order).
jects performed two sets of experiments: practice and real.
First, a person in charge of the experiments verbally ex-
plained to the test subjects the experimental system char-
acteristics and the outline instruction methods. Next, to
familiarize them with the system and instruction methods,
the test subjects received hands-on guidance for the drag-
ging patterns, as shown in Fig. 14, and practiced instruc-
tion work by right-dragging A, C, and F and left-dragging
G, J, and K. Subsequently, in the real experiments, each
test subject selected a drag mode by himself and gave in-
structions to grasp C, A, F, G, K, and J, in that order.
4.3.2. Experimental Results and Consideration
Figure 15 shows the measurement results for all the
test subjects’ grasping instruction times. It is a box-and-
whisker diagram of data for respective objects in practice
and real experiments, where the mean values (× marks)
are also indicated. In Fig. 15, we treated one point, a practice trial with A (coffee cup), as an outlier (○ mark) for the following reason. When interviewed about his behavior in that trial, the test subject responded, "I repeatedly practiced hitting the object accurately with the laser on a trial-and-error basis, regardless of the instruction operations and without caring about the operation time." We therefore judged that he had deviated significantly from the procedure instructed by the experiment planner and that this data point should be excluded from the instruction time evaluation and treated as an outlier. The overall grasping
success rate was 78.1%. Figure 16 shows the ratios of the
drag modes selected by the test subjects in the real ex-
periment and the number of successes and failures. We
describe below the considerations we have acquired from
the experimental results.
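The box-and-whisker treatment above, including the exclusion of the anomalous practice point, can be sketched as follows. This is only an illustrative computation, not the authors' analysis code: the function name, the sample data, and the use of the common 1.5×IQR (Tukey) rule for flagging outliers are our own assumptions.

```python
import numpy as np

def boxplot_stats(times):
    """Quartiles, mean, and Tukey outliers (1.5 * IQR rule) for one object's instruction times."""
    q1, q2, q3 = np.percentile(times, [25, 50, 75])
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Points outside the whisker range are flagged as outliers
    outliers = [t for t in times if t < lo or t > hi]
    return {"q1": q1, "median": q2, "q3": q3,
            "mean": float(np.mean(times)), "outliers": outliers}

# Hypothetical instruction times (seconds) for eight subjects on one object;
# one subject practiced aiming repeatedly, inflating his recorded time
times = [6.2, 7.0, 7.5, 8.1, 8.4, 9.0, 9.6, 30.0]
stats = boxplot_stats(times)
print(stats["outliers"])  # the 30.0 s trial is flagged, analogous to the excluded practice point
```

With real data, one would compute these statistics per object and per condition (practice vs. real) to reproduce a figure of the kind shown in Fig. 15.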
The success rate of grasping C (PET bottle) in the real
experiment was 87.5%. Six test subjects selected the right
drag mode to provide grasping instructions, although in-
structions could be easily given with either the right or
left drag mode. This seems attributable to the fact that as
the PET bottle grasping was planned immediately before
Journal of Robotics and Mechatronics Vol.33 No.4, 2021 763
Shintani, M. et al.
Fig. 15. Measurement results of instruction time to grasp objects by eight subjects.
Fig. 16. Drag modes selected at real experiment and success
or failure.
the experiments began, they may have been greatly influ-
enced by their successes experienced at the immediately
prior practices. As the mean instruction time to grasp C in
the real experiments was the shortest and its interquartile
range was the narrowest among all objects, we consider
that the users were able to plan object grasping strategies
and drag the laser without hesitation.
The success rate of grasping A (coffee cup) in the real
experiment was 62.5%. Five test subjects were able to give instructions easily and grasp A with the right drag mode, as shown for A in Fig. 14, in the same way as for C. Never-
theless, among the 6 types of objects, only the instruction
time for A increased in the real experiment compared to
that in the practice experiment. This is attributable to the
fact that three test subjects aggressively tried to grasp A at
the grasp part, which was considered difficult to grasp. In
other words, as the grasp part was nearly semi-ring shaped
with a maximum width as narrow as approximately 1 cm,
it would take a very long time to irradiate the laser accurately onto such a narrow part. In fact, their instructions did not turn out as intended, and all of their grasping attempts failed.
The success rate of grasping F (plastic plate) in the
real experiment was 75%. We expected that the grasp-
ing method and strategy used by the subjects would be
either to grasp it at its edge from above by using the drag
trajectory shown in Fig. 14 or to grasp it sideways at the
part protruding from the base with the right drag mode.
However, more right drags were used to grasp it. Moreover, as the area where the laser could be dragged was much smaller than that of A, the grasping instruction time increased. In addition, the fact that F showed the widest interquartile range among the grasping objects seems to confirm the difficulty of providing grasping instructions for it.
The success rate of grasping G (sponge) in the real ex-
periment was 100%. As it has a simple shape, flexibility,
and moderate friction, it was expected to be the easiest
object to grasp, even for beginners. Compared with the practice experiment, the grasping instruction time in the real experiment showed wider maximum-to-minimum and interquartile ranges. This seems
attributable to the fact that the instructions to grasp G, us-
ing its corners, were different from those for C, A, and
F. Further, it was so easy to plan its grasping strategy by
either drag mode that the test subjects rather hesitated to
select the drag mode, which took a long time as a result. Still, as the grasping instruction time was relatively short compared with that for the other objects, its degree of difficulty appears to be as low as expected.
The success rate of grasping K (food bag) in the real ex-
periment was 87.5%. Out of the eight test subjects, seven
subjects selected the left drag mode to grasp it. As it has no simple convex shape and only side faces of small area, it seems to have been difficult to plan any strategy other than the left drag. Because the subjects selected the left drag without hesitation in the real experiment, the mean and median turned out to be lower there.
The success rate of grasping J (wrapping container) in
the real experiment was 75%. As it was the sixth grasping trial and the subjects had gradually become familiar with the system, some dared to grasp it in a different way from the one they had used in the practice experiment. As a result, they selected the right drag mode relatively more often than the left drag mode. The relatively short grasping instruction time also seems attributable to this familiarization effect.
From the above considerations for each object, we can draw the following overall observations.
In the case of objects with geometrically simple shapes,
where a sufficiently wide grasp part’s plane region can be
generated, it is relatively easy with either drag mode to
give instructions to grasp them, and the grasping success
rates are high.
In the case of objects with complex shapes, however,
we need to consider in detail how to generate a grasp
part’s plane region in accordance with the grasp part’s
shape. Thus, efficient use of the system requires sufficient
familiarization.
To minimize the number of instructions in this study, we adopted a one-drag approach and limited it to making the x_G z_G plane of the gripper coordinate system Σ_G parallel to the grasp part's plane, so the two planes cannot be set at right angles to each other merely by selecting the left or right drag mode. In the future, therefore, we may need to consider increasing the number of selectable modes by adding further geometrical meaning to the one-drag trajectory.
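The parallel-plane constraint described above can be sketched numerically. The following is our own illustration of the geometry, not the authors' implementation; the function name and the choice of which gripper axis aligns with the plane normal are assumptions made for the example.

```python
import numpy as np

def gripper_frame(normal, drag_dir):
    """Build a gripper rotation matrix whose xG-zG plane is parallel to the grasp plane.

    normal: unit normal of the fitted grasp part's plane.
    drag_dir: dominant direction of the one-drag trajectory.
    Columns are the gripper axes xG, yG, zG expressed in the world frame.
    """
    n = np.asarray(normal, dtype=float)
    n /= np.linalg.norm(n)
    d = np.asarray(drag_dir, dtype=float)
    x = d - n * (d @ n)            # project drag direction into the plane -> xG
    x /= np.linalg.norm(x)
    y = n                           # yG along the plane normal keeps xG-zG parallel to the plane
    z = np.cross(x, y)              # zG completes a right-handed frame
    return np.column_stack([x, y, z])

# Hypothetical horizontal grasp plane with a slightly tilted drag direction
R = gripper_frame([0, 0, 1], [1, 0, 0.2])
print(R)
```

Because the frame is built this way, only rotations within the plane are reachable, which mirrors the limitation that the x_G z_G plane and the grasp plane cannot be set at right angles by mode selection alone.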
The causes for grasping failures are roughly classified
into inaccuracies in instructions and low accuracy in robot
control. The former is mainly attributable to the fact that
if the grasp part is relatively small, the reliability of the
point group data obtained by real-world drag is degraded,
making it difficult to calculate a numerically stable grasp
part’s plane region. The latter seems mainly attributable
to the experimental environment where we had to use the
mobile robot’s degrees of freedom, which is lower in ac-
curacy than the 5 DOF of the robot arm in order to realize
arbitrary attitudes of the gripper.
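The instability of the grasp part's plane region for small or distant grasp parts can be illustrated with a minimal plane-fit sketch. This is our own illustration under assumed names and data, not the paper's implementation: the plane is estimated from the dragged laser points by SVD, and the candidate grasping point Gp is taken as their centroid. With few, noisy points, the smallest singular value is no longer well separated from the others, which is exactly when the fitted plane becomes numerically unstable.

```python
import numpy as np

def fit_grasp_plane(points):
    """Least-squares plane through 3D laser-spot points (N x 3).

    Returns the centroid (candidate grasping point Gp), the unit plane
    normal, and a stability ratio of the two smallest singular values.
    """
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    _, s, vt = np.linalg.svd(pts - centroid)
    normal = vt[-1]                 # direction of least variance = plane normal
    # A large ratio means the plane is well defined; near 1 means it is ambiguous
    stability = s[-2] / s[-1] if s[-1] > 0 else np.inf
    return centroid, normal, stability

# Hypothetical drag trajectory on a roughly horizontal surface, ~0.5 m away
pts = [[0.0, 0.0, 0.50], [0.1, 0.0, 0.50], [0.1, 0.1, 0.51], [0.0, 0.1, 0.49]]
gp, n, st = fit_grasp_plane(pts)
print(gp, n)
```

A system of this kind could reject instructions whose stability ratio falls below a threshold and ask the user to re-drag, rather than attempting a grasp from an ill-defined plane.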
We also found that the smaller the grasp part and the
more distant the object, the more difficult it is to secure a
stable grasp part’s plane region. In the future, therefore,
we may need to consider a system where, for example,
only grasping points are specified by RWC while grasping
attitudes are instructed by some other means.
5. Conclusion
We have proposed an object-grasping instruction sys-
tem capable of intuitively instructing not only object
grasping points but also grasping attitudes with one laser
drag by expanding our conventional system using RWC.
We have proved the basic validity of the proposed system
through instruction and grasping experiments using many
different types of daily use items. We also evaluated the
proposed system’s usability as an instruction system and
the experimental system’s grasping performance through
experiments with multiple test subjects. As a major re-
sult, the proposed system achieved an overall success rate
of over 70% in the experiments where eight test subjects
attempted to grasp 6 types of objects, as shown in Fig. 14.
In the future, we will proceed with more detailed evalu-
ation experiments and engage in solving the issues with
the current system as described in Section 4.3.2. Building on this development, we expect to put the proposed system into practical use as soon as possible as an object-grasping technology for nursing care and life support robots, mainly aimed at persons with lower-limb mobility impairments.
Acknowledgements
This study was supported in part by the JSPS Grants-in-Aid
for Scientific Research JP18K12151 and in part by the Suzuken
Memorial Foundation.
Name: Momonosuke Shintani
Affiliation: Hiroshima City University
Address: 3-4-1 Ozukahigashi, Asaminami, Hiroshima, Hiroshima 731-3194, Japan
Brief Biographical History:
2020 Graduated from Hiroshima City University
Name: Yuta Fukui
Affiliation: Hiroshima City University
Address: 3-4-1 Ozukahigashi, Asaminami, Hiroshima, Hiroshima 731-3194, Japan
Brief Biographical History:
2019 Graduated from Hiroshima City University
2019-2021 Master's Course Student, Hiroshima City University
Name: Kosuke Morioka
Affiliation: Hiroshima City University
Address: 3-4-1 Ozukahigashi, Asaminami, Hiroshima, Hiroshima 731-3194, Japan
Brief Biographical History:
2019 Graduated from Hiroshima City University
Name: Kenji Ishihata
Affiliation: Hiroshima City University
Address: 3-4-1 Ozukahigashi, Asaminami, Hiroshima, Hiroshima 731-3194, Japan
Brief Biographical History:
2018 Graduated from Hiroshima City University
2018-2020 Master's Course Student, Hiroshima City University
Name: Satoshi Iwaki
Affiliation: Hiroshima City University
Address: 3-4-1 Ozukahigashi, Asaminami, Hiroshima, Hiroshima 731-3194, Japan
Brief Biographical History:
1984 Received M.E. from Hokkaido University
1984- Nippon Telegraph and Telephone Corp.
2007- Professor, Graduate School of Informatics, Hiroshima City University
Membership in Academic Societies:
• The Japan Society of Mechanical Engineers (JSME)
• The Society of Instrument and Control Engineers (SICE)
• The Robotics Society of Japan (RSJ)
• The Institute of Electrical and Electronics Engineers (IEEE)
Name: Tetsushi Ikeda
Affiliation: Hiroshima City University
Address: 3-4-1 Ozukahigashi, Asaminami, Hiroshima, Hiroshima 731-3194, Japan
Brief Biographical History:
1997 Received M.E. from Kyoto University
1997- Mitsubishi Electric Corp.
2016- Lecturer, Graduate School of Informatics, Hiroshima City University
Membership in Academic Societies:
• The Institute of Electronics, Information, and Communication Engineers (IEICE)
• The Society of Instrument and Control Engineers (SICE)
Name: Tim C. Luth
Affiliation: Technical University of Munich (TUM)
Address: Boltzmannstrasse 15, Garching 85748, Germany
Brief Biographical History:
1989 Received Diploma in Electrical Engineering from Technical University of Darmstadt
2005- Professor and Chair, Micro Technology and Medical Device Engineering, Technical University of Munich (TUM)
2016- Vice-Dean, Mechanical Engineering School, Technical University of Munich (TUM)
Main Works:
• "A 3D-printed functioning anatomical human middle ear model," Hearing Research, Vol.340, pp. 204-213, 2016.
• "A planning system of the implant size and position for minimally-invasive closure of the left atrial appendage," Proc. of the 6th IEEE Int. Conf. on Biomedical Robotics and Biomechatronics (BioRob), pp. 293-298, 2016.
• "G-Code Generation for a New Printing Process Based on 3D Plastic Polymer Droplet Generation," Proc. of the ASME 2013 Int. Mechanical Engineering Congress and Exposition, Vol.2A, doi: 10.1115/IMECE2013-63152, 2013.
Membership in Academic Societies:
• The Institute of Electrical and Electronics Engineers (IEEE)
• The American Society of Mechanical Engineers (ASME)