
Fast Robot-Obstacles Distance Evaluation in the Depth Space

Submitted BACHELOR'S THESIS by

Fabian Hirt

born 13.06.1991
residing at:
Connollystr. 3/0T24, 80809 München
Tel.: 0176 83538794

Lehrstuhl für STEUERUNGS- und REGELUNGSTECHNIK
Technische Universität München

Univ.-Prof. Dr.-Ing./Univ. Tokio Martin Buss

Supervisors: M. Sc. Matteo Saveriano, Prof. Dongheui Lee
Start: 02.03.2015
Intermediate report: 21.05.2015
Submission: 13.08.2015


Abstract

This thesis describes a robot-obstacle distance evaluation algorithm which operates directly on the image provided by a depth sensor, i.e. in the Depth Space. The algorithm's main features are that the entire visible part of the robot is accurately surveyed and that the different robot links are considered separately when computing the minimum robot-obstacle distance.

Several design parameters, introduced mainly to improve the algorithm's time performance, are described and evaluated. Possible implementations of the algorithm with the respective design parameters are given for both a CPU and a GPU. Performance results of the CPU and GPU implementations, in terms of accuracy and computation time, are provided and compared for different parameter values. The GPU implementation clearly outperforms the CPU implementation, especially with respect to execution time. With well-chosen parameter values, the GPU implementation reaches update rates of 1000 Hz while simultaneously providing accurate results.

Zusammenfassung

This thesis describes an algorithm that computes the distance between a robot and arbitrarily shaped obstacles. It uses the information provided by a depth sensor, and all operations are performed directly on the image plane of that sensor. The main features of the algorithm are that the entire visible part of the robot is searched for points with minimum distance to the obstacles, and that each robot link is considered separately.

Several design parameters of the algorithm are introduced, described and evaluated; they serve mainly to improve the execution speed of the algorithm. Possible implementations, both on a CPU and on a GPU, are discussed. The performance results of the implementations are compared with respect to the different design parameters, with execution speed and accuracy of the results as quality measures. The GPU implementation outperforms the CPU implementation, above all with respect to execution speed. With suitable parameters, the GPU implementation can reach repetition rates of 1000 Hz while still providing accurate results.


For Mama and Papa


Contents

1 Introduction
  1.1 Structural Overview
  1.2 Problem Statement
  1.3 Related Work

2 Distance Evaluation in the Depth Space
  2.1 Fundamentals
    2.1.1 Depth Space
    2.1.2 Distance Evaluation in the Depth Space
  2.2 Distance Evaluation Algorithms
    2.2.1 Filtering
    2.2.2 Overview of the Distance Evaluation Algorithm
    2.2.3 Region of Surveillance
    2.2.4 Lattice of Robot Points
    2.2.5 Minimum / Mean Distance Vector
    2.2.6 Pseudo-code

3 Results
  3.1 Implementation
    3.1.1 URDF-filter
    3.1.2 CPU - algorithm
      Lattice of Robot Points
      Search Pattern
      Object Points Skipping
      Mean Distance Vector
    3.1.3 GPU - algorithm
      General Framework
      Lattice of Robot Points
      Region of Surveillance and Search Pattern
  3.2 Experimental Results
      CPU - Algorithm
      GPU - Algorithm
  3.3 Discussion

4 Conclusion

List of Figures

Bibliography


Nomenclature

${}^{C}P$        Point $P$ in the Cartesian coordinate system

$P'$             Cartesian point ${}^{C}P$ projected into the image plane

${}^{D}P$        Depth Space representation of ${}^{C}P$

$\bar{P}'$       Discretized point (pixel) in the image plane

${}^{D}\bar{P}'$ Discretized point in the Depth Space

$f_x$            Camera's focal length in x-direction in pixel units

$f_y$            Camera's focal length in y-direction in pixel units

$c_x$            Pixel location of the optical center's x component

$c_y$            Pixel location of the optical center's y component

${}^{D}S$        Region of surveillance in the Depth Space


Chapter 1

Introduction

The rising popularity of robotics creates the need to deal with collision avoidance in a fast, robust and accurate way. In many robot applications it has to be ensured that the robot does not harm anybody or damage objects in its workspace [HASDLH08, BB06, SL14b]. The more collaborative and extensive the specific robot tasks are, the more important a reliable collision avoidance algorithm becomes. While there are many different implementations of such algorithms, they all comprise some sort of distance evaluation algorithm. The quality of the distance evaluation greatly influences the quality of the collision avoidance, since the collision avoidance's most valuable input is the actual distance between the considered agents or objects. Therefore, the distance evaluation's performance should exceed, or at least fulfill, the collision avoidance's desired performance requirements, e.g. concerning accuracy and computation time.

Visual sensors are one of the best choices for obtaining the relevant spatial scene information: they are applicable in normal robot workspace scenarios, the observed objects need not be modified or equipped with markers in order to be observable, and visual sensors are highly developed. Especially for distance evaluation tasks, the scene's 3D information is very relevant. Whereas the 3D information given by a stereo camera first needs to be processed in order to be exploited easily, depth sensors like the Microsoft Kinect® provide the relevant spatial information in a more accessible form [PS13, DM12, Iko14, Zha12].

A usual approach to using the spatial scene information for distance evaluation algorithms is to transform the data given by the visual sensors into point clouds in a Cartesian coordinate system [PSCM13, KHB+14, SL13]. However, operating directly on the data as it is available, namely as a depth image from a depth sensor, increases the performance, since no transformation is needed [FKDLK14, FKDLK12]. In addition, for an accurate representation, the transformation into a Cartesian coordinate system needs additional information which is not provided directly by the depth sensor. For example, occluded points, which are represented only indirectly in the depth sensor image, need to be modeled explicitly in a Cartesian space.

Given these advantages, this thesis describes a fast, robust and accurate distance evaluation algorithm which operates directly in the Depth Space. The algorithm distinguishes between the individual robot parts, so the minimum distance to every robot link is evaluated. For collision avoidance algorithms this is valuable information: in general, because of a robot's redundant degrees of freedom, there are many trajectories the robot can follow to evade an obstacle. Avoiding one obstacle with respect to one specific link could therefore place another robot link close to the same or another obstacle. Having the distance information with respect to all controllable robot links thus enables obstacle avoidance control algorithms to avoid the obstacles with respect to all robot links.

The main contribution of this thesis is to provide and compare different implementations of the general distance evaluation algorithm described in [FKDLK14]. Several design parameters are introduced, and the computation time as well as the accuracy are compared for those parameters. It is shown that specific parameter sets produce a distance evaluation algorithm which is both fast and accurate, making it suitable for real-time capable applications.

1.1 Structural Overview

After briefly presenting the problem statement, a short summary of related work is given. Then, fundamental information on the Depth Space, the transformation between the Cartesian Space and the Depth Space, and the basics of how to calculate distances in the Depth Space are provided. Afterwards, the different parts of the distance evaluation algorithm are introduced and described; a pseudo-code then illustrates the connection between the different parts.
The next chapter begins with more details about the filter which is used to preprocess the depth image and focuses afterwards on how to implement the previously described parts of the algorithm on a CPU. The same is then done for the implementation details on the GPU. This is followed by presenting and comparing the experimental results of the algorithm: first the results of the CPU algorithm are considered, then those of the GPU algorithm. The chapter closes with a reflection on the results and their validity.
The last chapter summarizes the main parts of the thesis and finally gives a short outlook.

1.2 Problem Statement

A robot-obstacle distance evaluation algorithm based on [FKDLK14] has to be implemented. Instead of approximating the robot using geometric primitives, a realistic 3D mesh (URDF model) is used. In this way, the closest point on the robot's surface and the robot-obstacle distance are estimated with higher accuracy. To speed up the procedure and make its application real-time capable, parallel programming techniques on the GPU are used.

1.3 Related Work

Due to its significance, there are many scientific articles about robot-obstacle distance evaluation. [FKDLK14] gives a good overview of distance calculations in different coordinate spaces. Since the approach in this thesis is strongly based on the approach given in that paper, it is explained here in more detail. The most intuitive and easiest coordinate space to use is a Cartesian coordinate system, especially if the spatial location data of the robot and the obstacles is already available. In many applications, however, the spatial location of the objects first has to be acquired using visual sensors. In [FKDLK14], the visual capturing of the scene is done by an RGB-D sensor (Microsoft Kinect®). Thus, in order to obtain the Cartesian coordinates of the respective objects, a transformation of the image depth data has to be performed. It is mentioned that in this way only the object point which is nearest to the depth sensor, on the line between the sensor and the object point, is transformed into Cartesian coordinates. Therefore, without further enhancement, the distance towards occluded object points cannot be assessed properly.

Another possibility is to transfer the image depth data into the robot's Configuration Space, which describes the robot posture with its generalized coordinates. Transferring an object point into the robot's Configuration Space requires calculating all possible robot postures in which the robot would be in contact with this object point. Although there is a valid distance representation between two postures in the Configuration Space, it is too costly to transform the object points into that space, especially if the dimension of the Configuration Space is high due to a high number of robot degrees of freedom.

Thus, [FKDLK14] proposed to perform the distance evaluation directly in the Depth Space of the depth sensor. The distance evaluation is done between arbitrary obstacles and a control point, which can represent a real object or a virtual point. There are two main cases to distinguish: the control point lies in front of the considered obstacle's foremost part (i.e. the control point's depth is smaller than the depth of the obstacle), or the control point lies behind the obstacle's foremost part. The former case is depicted in Figure 1.1 for the object in the bottom right corner, the latter case for the object in the upper part of the image. As depicted, the minimum distance from the control point to an object with a greater depth can be calculated as the Cartesian distance to the foremost plane of the object's frustum, which is the depth image pixel's projection into Cartesian coordinates.


The frustum of an object is therefore composed of its foremost surface and of all the points which this surface occludes with respect to the current camera frame. When calculating the minimum distance to an object with smaller depth, the side plane of the object's frustum has to be considered. Since there is no information on whether the points inside the object's frustum represent objects or free space, for safety reasons it is assumed that all points in the object's frustum are part of the obstacle itself. Thus, due to geometric considerations, the closest point on the object with respect to the control point lies on the object's frustum plane facing the control point. The exact location of this point on the frustum can be calculated analytically, but it can also be approximated as the point on the frustum's side plane which has the same depth as the control point. The minimum distance between the points is then again computed as the Cartesian distance of the points re-projected into the Cartesian Space. In [FKDLK14] this approach is called an approach in the depth space, but as the relevant points are projected back into the Cartesian Space for calculating the distance, the naming may cause confusion.

Figure 1.1: All pixels in the image plane create a frustum. This frustum has to be considered when performing a minimum distance evaluation. Figure adapted from Flacco et al. [FKDLK14].

Regarding collision avoidance, [FKDLK14] argues that a hybrid agglomeration of distance information is the most beneficial input for collision avoidance algorithms. This means that the minimum distance's magnitude should be provided, together with a mean distance vector.


The mean distance vector is the average of all distance vectors from the considered control point to all obstacle points which lie within a chosen region of surveillance. If, instead of a mean distance vector, a minimum distance vector is sufficient (minimum distance model), a faster distance evaluation algorithm is presented, which reduces the computational burden by dynamically shrinking the region of surveillance around a control point.
Using one control point, [FKDLK14] shows that the hybrid model's distance information is less sensitive to noise and exhibits more continuous behavior than the minimum distance model, whose information was computed once using the Cartesian and once the Depth Space. Fast, reactive and accurate experimental collision avoidance results were presented, using a KUKA LWR manipulator with an artificial potential field method. The distance to obstacles was not monitored for all robot points, but for a selection of control points placed on the robot, where the end-effector was given particular importance.

However, many collision avoidance and distance evaluation tasks are performed using point clouds in the Cartesian Space [PSCM13, SL13]. In [KHB+14], a point cloud approach is presented which is real-time capable by using parallel GPU processing. Further real-time capable obstacle detection and avoidance implementations are shown in [PB13, BB06, Kha85]. Additional information about the obstacles is incorporated in [KH07].


Chapter 2

Distance Evaluation in the Depth Space

2.1 Fundamentals

2.1.1 Depth Space

The Depth Space is a 2.5D coordinate space. It is the natural representation of a picture taken by a depth sensor. Such a sensor produces an image of a scene where the value of each pixel encodes the depth of the object which is projected into this pixel, as seen in Figure 2.1.

Figure 2.1: The manipulator and the obstacle are projected into the image plane of the depth sensor, performing a transformation into the Depth Space. Lighter areas in the image plane represent objects nearer to the image plane than darker areas. The gray area is not visible in the image plane. Figure adapted from Schiavi et al. [SBF09].

Given a Cartesian point ${}^C P$ with coordinates $({}^C P_x, {}^C P_y, {}^C P_z)^T$ with respect to a Cartesian coordinate frame, and a depth sensor whose image plane is perpendicular to the coordinate frame's z-axis, the projection onto the image plane of the depth sensor is given by:

$$
P' = \begin{pmatrix} P'_x \\ P'_y \\ 1 \end{pmatrix} = \frac{1}{{}^C P_z} \begin{pmatrix} f s_x & f s_\theta & c_x \\ 0 & f s_y & c_y \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} {}^C P_x \\ {}^C P_y \\ {}^C P_z \end{pmatrix} \qquad (2.1)
$$

where $x'$ and $y'$ are the pixel coordinates of the projected point, $f$ is the focal length of the camera, $s_x$ and $s_y$ are the dimensions of a pixel, $s_\theta$ is the pixel's shear factor, and $c_x$ and $c_y$ are the pixel coordinates of the camera's optical center [MSKS03, p. 55]. The projection is depicted in Figure 2.2. Note that these pixel coordinates are continuous; a discretized model is described later.

Figure 2.2: The point ${}^C P$ is projected to $P'$ in the image plane with pixel coordinates $P'_x$ and $P'_y$. The projection of the optical center $O$ into the image plane is $(c_x, c_y)$.

The matrix

$$
K = \begin{pmatrix} f s_x & f s_\theta & c_x \\ 0 & f s_y & c_y \\ 0 & 0 & 1 \end{pmatrix} \overset{\text{rectangular pixels}}{=} \begin{pmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix}
$$

is known as the camera's intrinsic matrix. The pixel's shear factor $s_\theta$ vanishes for rectangular pixels, which is the case for a usual camera. The products $f s_x$ and $f s_y$ can be seen as the direction-dependent focal lengths expressed in pixel units and will be denoted as $f_x$ and $f_y$ throughout this thesis. Thus, equation 2.1 can be rewritten as

$$
P' = \begin{pmatrix} P'_x \\ P'_y \\ 1 \end{pmatrix} = \frac{1}{{}^C P_z} \begin{pmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} {}^C P_x \\ {}^C P_y \\ {}^C P_z \end{pmatrix} \qquad (2.2)
$$


The Depth Space representation of point CP is then according to [FKDLK12]:

$$
{}^D P = \begin{pmatrix} P'_x \\ P'_y \\ {}^C P_z \end{pmatrix} \qquad (2.3)
$$

Therefore, in the Depth Space, two coordinates (the horizontal and vertical pixel coordinates $x'$ and $y'$) represent a Cartesian point's projection onto the image plane, and the third coordinate represents the positive distance between the sensor frame and the projected object. As a consequence, only the object point which is closest to the sensor on the virtual line connecting the camera's optical center $O$ and the point $P'$ is projected into the Depth Space, and there exists only one valid value of ${}^C P_z$ for each $(P'_x, P'_y)$ combination. All points on that virtual line with a higher ${}^C P_z$ value are occluded by the foremost point and are part of the Gray Area. The Gray Area is thus a region of uncertainty, since no information about this region can be stored in the Depth Space representation. As can be seen in Figure 2.3, for each object point in the depth image there exists one ray of occluded points; together these rays form the entire Gray Area.

Figure 2.3: A colored depth image and the related Gray Area for a typical human-robot interaction scene. Figure adapted from Flacco et al. [FKDLK14].

As typical depth sensors are not able to extract the information in the image plane continuously, a discretized form of the Depth Space has to be developed. In the following it is assumed that the camera discretizes the information in the image plane such that pixel $(P_x, P_y)$ incorporates the information of the rectangle whose corner points are $(P_x, P_y)$, $(P_x+1, P_y)$, $(P_x+1, P_y+1)$ and $(P_x, P_y+1)$.


Consequently, a point $P'$ in the image plane is represented by its discretized counterpart $\bar{P}'$ through [FKDLK14]:

$$
\bar{P}' = \begin{pmatrix} \bar{P}'_x \\ \bar{P}'_y \\ 1 \end{pmatrix} = \begin{pmatrix} \mathrm{trunc}(P'_x) \\ \mathrm{trunc}(P'_y) \\ 1 \end{pmatrix} \qquad (2.4)
$$

and the corresponding Depth Space representation is:

$$
{}^D \bar{P}' = \begin{pmatrix} \bar{P}'_x \\ \bar{P}'_y \\ {}^C P_z \end{pmatrix} = \begin{pmatrix} \mathrm{trunc}(P'_x) \\ \mathrm{trunc}(P'_y) \\ {}^C P_z \end{pmatrix} \qquad (2.5)
$$

Figure 2.4 shows the relation between an object point $P$ and its discrete pixel representation.

Figure 2.4: Cartesian point ${}^C P$ of an object and its discretized projection into the Depth Space, as well as the Gray Area frustum of the discretized point in the Depth Space. Figure adapted from Flacco et al. [FKDLK14].

Since a point in the discretized Depth Space is represented by a pixel with finite extent, the points which are occluded by this pixel no longer lie on a single line, but form an infinite frustum of a pyramid. As depicted in Figure 1.1, the top base is formed by the back-projection of the pixel into the Cartesian Space with a z-value equal to its stored depth value. The Gray Area is consequently the union of all frustums.
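To make the projection and discretization steps concrete, the following minimal C++ sketch implements equations 2.2, 2.4 and 2.5. The struct and function names as well as the example intrinsic values are assumptions made for this illustration, not part of the thesis implementation.

```cpp
#include <cmath>
#include <cstdio>

// Camera intrinsics in pixel units (example values only, roughly Kinect-like).
struct Intrinsics { double fx, fy, cx, cy; };

struct CartesianPoint { double x, y, z; };        // ^C P
struct DepthPixel     { int px, py; double d; };  // discretized Depth Space point

// Project a Cartesian point into the image plane (eq. 2.2) and
// discretize it by truncation (eq. 2.4, 2.5).
DepthPixel projectToDepthSpace(const CartesianPoint& P, const Intrinsics& K) {
    const double xPrime = K.fx * P.x / P.z + K.cx;  // continuous pixel coordinates
    const double yPrime = K.fy * P.y / P.z + K.cy;
    DepthPixel out;
    out.px = static_cast<int>(std::trunc(xPrime));
    out.py = static_cast<int>(std::trunc(yPrime));
    out.d  = P.z;                                   // the depth value is kept unchanged
    return out;
}

int main() {
    Intrinsics K{525.0, 525.0, 319.5, 239.5};
    CartesianPoint P{0.3, -0.1, 1.5};
    DepthPixel dp = projectToDepthSpace(P, K);
    std::printf("pixel = (%d, %d), depth = %.3f m\n", dp.px, dp.py, dp.d);
    return 0;
}
```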


2.1.2 Distance Evaluation in the Depth Space

This section explains how a distance evaluation can be performed between two objects which are expressed as points in the Depth Space. First, the continuous Depth Space description is used. Second, it is described how the distance evaluation changes when the points are expressed in the discretized Depth Space.

Continuous Depth Space

A direct distance measure in the Depth Space can be created by taking the square root of the squared differences of the respective coordinate values. Given two points in the Depth Space ${}^D O = (O'_x, O'_y, d_O)^T$ and ${}^D P = (P'_x, P'_y, d_P)^T$, the distance directly in the Depth Space would be:

$$
D_D({}^D O, {}^D P) = \sqrt{(O'_x - P'_x)^2 + (O'_y - P'_y)^2 + (d_O - d_P)^2}
$$

However, this distance measure depends on the intrinsic parameters of the used depth sensor and will thus be different for different depth sensors. To eliminate this dependency, one has to back-project the points into the Cartesian Space and perform the distance evaluation there. Referring to equations 2.1 and 2.5, the back-projection of a point ${}^D P = (P'_x, P'_y, d_P)^T$ from the Depth Space into the Cartesian Space is given by:

$$
{}^C P_x = \frac{(P'_x - c_x)\, d_P}{f_x}, \qquad
{}^C P_y = \frac{(P'_y - c_y)\, d_P}{f_y}, \qquad
{}^C P_z = d_P \qquad (2.6)
$$

The Euclidean distance between two points represented in the Depth Space, ${}^D O$ and ${}^D P$, is therefore given by:

$$
D({}^D O, {}^D P) = \sqrt{v_x^2 + v_y^2 + v_z^2} \qquad (2.7)
$$

with

$$
v_x = \frac{(P'_x - c_x)\, d_P - (O'_x - c_x)\, d_O}{f_x}, \qquad
v_y = \frac{(P'_y - c_y)\, d_P - (O'_y - c_y)\, d_O}{f_y}, \qquad
v_z = d_P - d_O \qquad (2.8)
$$
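Assuming known camera intrinsics, equations 2.6 to 2.8 translate directly into code. The following C++ sketch (type and function names are illustrative assumptions) back-projects two Depth Space points and returns their Cartesian distance:

```cpp
#include <cmath>
#include <cstdio>

// Camera intrinsics in pixel units (example values only).
struct Intrinsics { double fx, fy, cx, cy; };

// A point in the Depth Space: continuous pixel coordinates and depth in meters.
struct DepthPoint { double x, y, d; };

// Euclidean distance between two Depth Space points after back-projection
// into the Cartesian Space (equations 2.6 - 2.8).
double cartesianDistance(const DepthPoint& P, const DepthPoint& O, const Intrinsics& K) {
    const double vx = ((P.x - K.cx) * P.d - (O.x - K.cx) * O.d) / K.fx;
    const double vy = ((P.y - K.cy) * P.d - (O.y - K.cy) * O.d) / K.fy;
    const double vz = P.d - O.d;
    return std::sqrt(vx * vx + vy * vy + vz * vz);
}

int main() {
    Intrinsics K{525.0, 525.0, 319.5, 239.5};
    DepthPoint robotPoint{400.0, 250.0, 1.2};   // hypothetical robot point
    DepthPoint objectPoint{420.0, 260.0, 1.5};  // hypothetical obstacle point
    std::printf("distance = %.3f m\n", cartesianDistance(robotPoint, objectPoint, K));
    return 0;
}
```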

However, this distance evaluation is only valid for two given points.


If one is interested in the minimum distance between two entire objects, which are represented by points in the Depth Space, one also has to consider the Gray Area created by these points. Especially if the distance evaluation algorithm is to be used for collision avoidance, a reasonable safety assumption is that all occluded points are potential obstacles, or objects in general. Therefore, if there are points in the Gray Area created by one object which are nearer to another object than the visible points in the depth image, then the distance evaluation algorithm should return these minimal points and distances.

As mentioned earlier, the Gray Areas of points in the continuous Depth Space are lines. Thus, the problem of finding the minimum distance between two objects results in finding the minimum distance between a set of lines. Let us denote the points on the lines $l_i$ which are closest together by ${}^m P_i$. The lines composing the Gray Area have in common that they diverge from each other with increasing depth. This is due to the fact that they all have to intersect the optical center of the camera while simultaneously intersecting different points in the image plane. This can also be seen when considering the lines in Figure 1.1 going through the corners of the pixels, or in Figure 2.5. Therefore, the minimum distance between any of those lines must lie as near to the image plane as possible. Hence, considering the minimum distance between two lines $l_1$ and $l_2$, either ${}^m P_1$ or ${}^m P_2$, or both, must coincide with one of the starting points of the lines, depending on their depth values $d_i$: if $d_2 > d_1$, then ${}^m P_2$ coincides with ${}^C P_2$; if $d_2 < d_1$, then ${}^m P_1$ coincides with ${}^C P_1$; if $d_2 = d_1$, then the minimum distance between the lines is the direct connection between their starting points. The starting points of the lines are directly obtained by back-projection from the Depth Space representation ${}^D P_i$ to the Cartesian Space ${}^C P_i$. Thus, after identifying which point has the greater depth value $d_i$, one endpoint of the segment which represents the minimum distance is directly obtained. The point on the other line must be calculated by taking into account that this minimum distance segment is perpendicular to it. Figure 2.5 illustrates this relation.
Instead of calculating the missing point ${}^m P_i$ exactly, one can a priori approximate it as the point on $l_i$ which has the same depth value as the already obtained point on the other line. This approach is used in [FKDLK14] on a similar problem, producing only negligible errors. In conclusion, if one is interested in the distance between two objects which are (partly) represented by the points ${}^D O$ and ${}^D P$, one has to set the depths $d_O$ and $d_P$ equal to each other and use equations 2.7 and 2.8 with

$$
d_O = d_P \qquad (2.9)
$$

Discretized Depth Space

If the points are represented in the discretized Depth Space, which is the usual case for depth sensors, one additionally has to consider the finite extent of a pixel.


Figure 2.5: Cartesian points ${}^C P_i$ and their projections into the image plane $P'_i$. The minimum distance relative to the middle line $l_2$ is obtained by creating perpendicular lines (blue) intersecting the starting points ${}^C P_i$ of the Gray Area lines. This minimum distance problem can be approximated by taking the green lines as minimum distances, which are obtained by choosing points ${}^m\hat{P}_i$ on the lines with the same depth $d_j$ as their respective ${}^C P_j$ counterpart.

This changes the above derivation for the continuous case in such a way that the minimum distance between two objects, represented as pixels in the image plane, results in finding the minimum distance between a set of pyramid frustums. As one can easily see, e.g. in Figure 1.1, the points which are closest together on different frustums lie on their side planes facing each other. We assume again that pixel $(P_x, P_y)$ incorporates the information of the rectangle in the image plane whose corner points are $(P_x, P_y)$, $(P_x + 1, P_y)$, $(P_x + 1, P_y + 1)$ and $(P_x, P_y + 1)$.

The projection onto the image plane of a point ${}^C P$ which lies on the frustum side planes generated by pixel $P'$ is denoted as $\hat{P}'$. Given two object pixels $P'$ and $O'$ in the image plane, the points $\hat{P}' = (\hat{P}'_x, \hat{P}'_y, 1)^T$ and $\hat{O}' = (\hat{O}'_x, \hat{O}'_y, 1)^T$, which are the projections of points lying on the frustum sides facing each other, can be determined by:

$$
\hat{P}'_x = \begin{cases} P'_x + 1 & ,\ P'_x < O'_x \\ P'_x & ,\ \text{otherwise} \end{cases}
\qquad
\hat{O}'_x = \begin{cases} O'_x + 1 & ,\ P'_x > O'_x \\ O'_x & ,\ \text{otherwise} \end{cases}
$$

$$
\hat{P}'_y = \begin{cases} P'_y + 1 & ,\ P'_y < O'_y \\ P'_y & ,\ \text{otherwise} \end{cases}
\qquad
\hat{O}'_y = \begin{cases} O'_y + 1 & ,\ P'_y > O'_y \\ O'_y & ,\ \text{otherwise} \end{cases}
\qquad (2.10)
$$

Thus, the x and y coordinates of the frustum points which have the minimum distance among them are now known; only the depth information about these points is still unknown. The problem of finding the minimum distance points' depth values is again a problem of finding the minimum distance between lines, whose solution is described above for the continuous case. Thus, the minimum distance between two objects represented by the points ${}^D O'$ and ${}^D P'$ in the Depth Space is given as:

$$
D({}^D O', {}^D P') = \sqrt{v_x^2 + v_y^2 + v_z^2} \qquad (2.11)
$$

with

$$
v_x = \frac{(\hat{P}'_x - c_x)\, d_P - (\hat{O}'_x - c_x)\, d_O}{f_x}, \qquad
v_y = \frac{(\hat{P}'_y - c_y)\, d_P - (\hat{O}'_y - c_y)\, d_O}{f_y}, \qquad
v_z = d_P - d_O \qquad (2.12)
$$

The same approximation as in equation 2.9 for the continuous case changes equation 2.12 into

$$
v_x = \frac{(\hat{P}'_x - c_x)\, d_P - (\hat{O}'_x - c_x)\, d_P}{f_x}, \qquad
v_y = \frac{(\hat{P}'_y - c_y)\, d_P - (\hat{O}'_y - c_y)\, d_P}{f_y}, \qquad
v_z = 0 \qquad (2.13)
$$

However, considering that the implemented algorithm evaluates the distance between a known robot model and arbitrary objects, a slight modification is made.


Instead of using equation 2.9 in every case to approximate the depth values for the minimum distance points, a case differentiation is introduced depending on the depth values of the respective robot point $d_R$ and object point $d_O$:

$$
d_O = \begin{cases} d_R & ,\ d_R \ge d_O \\ d_O & ,\ \text{otherwise} \end{cases} \qquad (2.14)
$$

This means that the closest object point depth value $d_O$ is approximated with the robot point depth value $d_R$ if, and only if, the object point lies further in front than the robot point. This makes sense if the robot's shape is known and regular, with similar scales in each dimension. Otherwise, if equation 2.9 were used in every case, the algorithm would produce a vanishingly small distance between the robot and an object whenever their projections into the image plane happen to be very close together, even if the object is far behind the robot. The reason why equation 2.9 should only be applied if the object is in front of the robot is that one still wants to account for the uncertainty concerning arbitrary object shapes; applying it in this case therefore acts as a safety measure.
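Putting equations 2.10 to 2.14 together, a minimal C++ sketch of the discretized robot-object distance could look as follows; the structure names and helper layout are assumptions made for illustration, not the thesis code.

```cpp
#include <cmath>

// Example intrinsics; names are illustrative assumptions.
struct Intrinsics { double fx, fy, cx, cy; };

// Discretized Depth Space point: pixel coordinates plus depth in meters.
struct DepthPixel { int px, py; double d; };

// Minimum robot-object distance in the discretized Depth Space.
// Implements the frustum-side selection of eq. 2.10, the depth approximation
// with the case differentiation of eq. 2.14, and the distance of eq. 2.11 / 2.12.
double robotObjectDistance(const DepthPixel& R, const DepthPixel& O, const Intrinsics& K) {
    // Select the frustum side planes facing each other (eq. 2.10).
    const double rx = (R.px < O.px) ? R.px + 1 : R.px;
    const double ox = (R.px > O.px) ? O.px + 1 : O.px;
    const double ry = (R.py < O.py) ? R.py + 1 : R.py;
    const double oy = (R.py > O.py) ? O.py + 1 : O.py;

    // Case differentiation of eq. 2.14: only an object point in front of the
    // robot point has its depth replaced by the robot point's depth.
    const double dR = R.d;
    const double dO = (dR >= O.d) ? dR : O.d;

    // Back-project and compute the Cartesian distance (eq. 2.11 / 2.12).
    const double vx = ((rx - K.cx) * dR - (ox - K.cx) * dO) / K.fx;
    const double vy = ((ry - K.cy) * dR - (oy - K.cy) * dO) / K.fy;
    const double vz = dR - dO;
    return std::sqrt(vx * vx + vy * vy + vz * vz);
}
```

Note that, with the case differentiation of equation 2.14, the depth component $v_z$ vanishes exactly when the object point lies in front of the robot point.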

2.2 Distance Evaluation Algorithms

The design of the distance evaluation algorithm in this thesis is strongly based on [FKDLK14]. However, modifications are made, especially in terms of distinguishing between different robot links during the distance evaluation and in choosing the points on the robot for which the minimum distance shall be observed. This chapter first describes how the inputs to the distance evaluation algorithm are created and afterwards explains the main concepts of the distance evaluation algorithms, which were implemented on both a CPU and a GPU (Graphics Processing Unit).

2.2.1 Filtering

One of the main motives of this thesis is to develop a robot-obstacle distance evaluation algorithm which takes into account the exact shape of the robot. Thus, a model of the robot has to be supplied. In our case, the robot is described in the Unified Robot Description Format (URDF) [URD15], which is basically an XML description of the robot's links and joints. The other necessary input is a representation of the scene containing the robot, its workspace, and the objects to which the distances shall be evaluated. The necessary condition possible input devices must fulfill is that they capture the 3D information of the scene. There are many possible input devices to choose from, but the most straightforward ones are optical sensors.


Whereas it is possible to reconstruct the 3D information of a scene with one moving camera or two stationary cameras through feature detection and feature matching [SK14, KSLW14], a simpler way to obtain the 3D information of a scene is to use a depth sensor like the Microsoft Kinect®. As a depth sensor directly outputs the spatial scene information in the Depth Space, it is perfectly suited as input for the distance evaluation algorithm described in this thesis. In [Zha12], a more detailed explanation of the Microsoft Kinect® depth sensor is given.

Figure 2.6: The URDF robot model which is used in the algorithm. The table is part of the robot model.

However, since the depth sensor does not distinguish between points which are part of the robot and points which are part of obstacles, the depth image of the depth sensor first has to be processed to be suitable for the distance evaluation algorithm. Thus, a filter is used in order to segment the points into robot points and obstacle points. The filter used for this task in this thesis is an advancement of [Blo15]; an alternative method, producing point cloud data, is described in [RRC11].
The URDF-filter uses the URDF model of the robot (Figure 2.6) to create a renderable mesh, which is then rendered into an image. By comparing this image with the depth image from the depth sensor, it is possible to identify which points are part of the robot and which are not.

Page 27: Fast Robot-Obstacles Distance Evaluation in the Depth Space

2.2. DISTANCE EVALUATION ALGORITHMS 23

The output of the filter is therefore one depth image which contains only the points on the robot, in the following referred to as the robot depth image, and one which contains all other points belonging to possible obstacles or to the background, in the following referred to as the filtered depth image. Since the URDF model is a tree-like model, the different robot links can be distinguished when creating the renderable mesh and also at the rendering step itself, producing an additional image in which the information is saved to which link a specific robot point belongs. This image will be referred to as the linkID image. Thus, the different robot links can be distinguished in the distance evaluation algorithm. Figure 2.7 shows the images which are created by the filter, as well as the input. Some of the images in this figure are stenciled with an enlarged silhouette of the robot; this plays a part in the GPU implementation of the algorithm and can be neglected at this point.

2.2.2 Overview of the Distance Evaluation Algorithm

The aim of the distance evaluation algorithm is to output the minimum distance between each robot link and arbitrary objects in its workspace, or at least in a specific region of surveillance, taking into account only the robot depth image, the filtered depth image and the linkID image. Referring to equations 2.11 and 2.13, this could be done by computing, for every point on each robot link, the distance to all points present in the filtered depth image, and then comparing the obtained distances to identify their minimum. However, this means a lot of computational work, since many robot-object distances have to be computed. In the implementation of this thesis, all input images have dimensions of 480 x 640 pixels, so even if only one robot point is considered, 480 * 640 = 307200 robot-object distances have to be computed.
Hence, it makes sense to decrease the number of object pixels which have to be considered for each robot point and, additionally, to reduce the number of robot points to which the distances are computed. The first measure is accomplished by introducing a region of surveillance, the second by creating a lattice of robot points.

2.2.3 Region of Surveillance

Regarding collision avoidance, obstacles which are closer to the robot have a higher significance than obstacles further away. Therefore it is especially important for a distance evaluation algorithm to calculate the distances in a specified region around the robot fast and accurately. As mentioned above, the more object points one has to consider, the longer the computation lasts; therefore, a compromise must be made on the number of object points to consider. In [FKDLK14], a cube centered at the Cartesian robot point is proposed as the region of surveillance, which means that for this robot point only the objects within this cubic region are considered for the distance evaluation.


Figure 2.7: The URDF filter. Top row, from the left: (1,1) original depth image within the robot-shaped region of surveillance, (1,2) normal representation of the rendered robot model within the robot-shaped region of surveillance, (1,3) OpenGL depth values corresponding to the rendered robot model. Middle row, from the left: (2,1) filtered original depth image; all points in red are identified as robot points, yellow points are not considered in the algorithm (not in the region of surveillance or erroneous values of the depth sensor), (2,2) red values: rendered robot's real depth, green values: not part of the robot, (2,3) original depth image. Bottom row, from the left: (3,1) values in red are identifiers stating to which robot link each point belongs; no distinction is visible since the values range only from 0-7 within a visible range of 0-255; (3,2) and (3,3): not used.


Two regions of surveillance are depicted in Figure 2.8 for two robot points.

Figure 2.8: Two cubic regions of surveillance centered at the Cartesian robot points ${}^C P_1$ and ${}^C P_2$. Only objects which lie within these regions are considered for the distance evaluation.

As the algorithm operates directly in the Depth Space, the region of surveillance must also be transferred into the Depth Space. In [FKDLK14] it is proposed to project the front side of the cube into the image plane, and to calculate the related region of surveillance by adding and subtracting the dimensions of this projected rectangle to and from the robot point $P'$ in the image plane. Hence, considering a cube centered at ${}^C P = ({}^C P_x, {}^C P_y, d_P)^T$ with side length $2L$, the side lengths $x_s$ and $y_s$ of the projected front plane are:

$$
x_s = \frac{2 L f_x}{d_P - L}, \qquad y_s = \frac{2 L f_y}{d_P - L} \qquad (2.15)
$$

The region of surveillance expressed in the Depth Space, ${}^D S$, is then, referring to [FKDLK14], given as:

$$
{}^D S = \left[ P'_x - \frac{x_s}{2},\ P'_x + \frac{x_s}{2} \right] \times \left[ P'_y - \frac{y_s}{2},\ P'_y + \frac{y_s}{2} \right] \times \left[ d_P - L,\ d_P + L \right] \qquad (2.16)
$$
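As a rough numerical illustration (the focal length is an assumed, Kinect-like value and not a figure from this thesis): with $f_x = f_y = 525$ pixels, a robot point at depth $d_P = 1.5\,$m and a cube half side length $L = 0.3\,$m, equation 2.15 gives $x_s = y_s = \frac{2 \cdot 0.3 \cdot 525}{1.5 - 0.3} = 262.5$ pixels, so the projected front face spans roughly $262 \times 262$ pixels around $P'$.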


However, this is not entirely correct; it is a valid approximation only if the cube lies near the optical axis of the depth sensor, or if the side length of the cube is very small compared to the depth of the robot point. Otherwise, this region of surveillance is too small, as can be seen in Figure 2.9. The figure shows the projection of a cubic region of surveillance into the image plane: the complete region of surveillance is not covered by adding $\frac{x_s}{2}$ (magenta lines) and $\frac{y_s}{2}$ (orange lines) to $P'$ (the white cross).

Therefore, to consider the entire cubic region of surveillance in the depth image, a different approach has to be used. First, the Cartesian representation of the upper left corner of the cubic region of surveillance's front side, ${}^C P_{ULF} = ({}^C P_x - L,\ {}^C P_y - L,\ {}^C P_z - L)^T$, is projected into the image plane using equation 2.1:

$$
P'_{ULF} = \begin{pmatrix} P'_{ULF,x} \\ P'_{ULF,y} \\ 1 \end{pmatrix} = \frac{1}{{}^C P_z - L} \begin{pmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} {}^C P_x - L \\ {}^C P_y - L \\ {}^C P_z - L \end{pmatrix} \qquad (2.17)
$$

Afterwards, the projected point $P'_{ULF}$ is compared with the location of the sensor's optical center $(c_x, c_y)$. If $P'_{ULF}$ lies to the right of $(c_x, c_y)$, then the left side of the cube in the Cartesian Space is visible from the optical center, which means that the projected front side of the cube does not occlude this side in the image plane. Therefore, the upper left corner of the cube's back side has to be projected into the image plane:

$$
P'_{ULB} = \begin{pmatrix} P'_{ULB,x} \\ P'_{ULB,y} \\ 1 \end{pmatrix} = \frac{1}{{}^C P_z + L} \begin{pmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} {}^C P_x - L \\ {}^C P_y - L \\ {}^C P_z + L \end{pmatrix} \qquad (2.18)
$$

The left border of the region of surveillance in the Depth Space ${}^D S$ is therefore given by $P'_{ULB,x}$, since it is the leftmost part in the image plane created by the cubic region of surveillance in the Cartesian coordinate frame. This procedure must be carried out analogously by comparing the y-component of $P'_{ULF}$ with the sensor's optical center $(c_x, c_y)$, as well as by doing the same with the projection of the bottom right corner of the cube's front side. To clarify the procedure, it is summarized in Algorithm 1.

The resulting region of surveillance in the Depth Space will in most cases be slightly bigger than the original one in the Cartesian coordinate frame, but it certainly contains all points which are also contained in the Cartesian one. Therefore, it is sufficient to consider, for one robot point ${}^D P'$ in the Depth Space, only the object points ${}^D O'$ which are contained in ${}^D S$.


Algorithm 1 Region of Surveillance

 1: function DS = getRegionOfSurveillance(DP′, L)
 2:     P′_ULF ← proj(CP_ULF)
 3:     if P′_ULF,x > cx then
 4:         P′_ULB,x ← proj(CP_ULB)
 5:         DS_min,x ← P′_ULB,x
 6:     else
 7:         DS_min,x ← P′_ULF,x
 8:     end
 9:     if P′_ULF,y > cy then
10:         P′_ULB,y ← proj(CP_ULB)
11:         DS_min,y ← P′_ULB,y
12:     else
13:         DS_min,y ← P′_ULF,y
14:     end
15:     P′_BRF ← proj(CP_BRF)
16:     if P′_BRF,x < cx then
17:         P′_BRB,x ← proj(CP_BRB)
18:         DS_max,x ← P′_BRB,x
19:     else
20:         DS_max,x ← P′_BRF,x
21:     end
22:     if P′_BRF,y < cy then
23:         P′_BRB,y ← proj(CP_BRB)
24:         DS_max,y ← P′_BRB,y
25:     else
26:         DS_max,y ← P′_BRF,y
27:     end
28:     DS ← [DS_min,x, DS_max,x] × [DS_min,y, DS_max,y] × [dP − L, dP + L]
29:     return DS

Subscript ULF means Upper Left corner of the cube's Front side, ULB: Upper Left corner of the cube's Back side, BRF: Bottom Right corner of the cube's Front side, and BRB: Bottom Right corner of the cube's Back side.
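A compact C++ sketch of Algorithm 1 is given below. The intrinsics struct, the projection helpers and all names are assumptions written from the description above, under the pinhole model of equation 2.1; they are not the thesis' actual code.

```cpp
// Example intrinsics; names are illustrative assumptions.
struct Intrinsics { double fx, fy, cx, cy; };

// Axis-aligned region of surveillance in the Depth Space:
// a pixel rectangle plus a depth interval.
struct RegionOfSurveillance {
    double xMin, xMax;   // pixel bounds in x
    double yMin, yMax;   // pixel bounds in y
    double dMin, dMax;   // depth bounds in meters
};

// Pinhole projection of a single coordinate of a Cartesian point.
static double projectX(double px, double pz, const Intrinsics& K) { return K.fx * px / pz + K.cx; }
static double projectY(double py, double pz, const Intrinsics& K) { return K.fy * py / pz + K.cy; }

// Sketch of Algorithm 1: bound the cube [CP - L, CP + L]^3 in the image plane.
// Per border, take whichever of the front or back corner projects further outward.
RegionOfSurveillance getRegionOfSurveillance(double CPx, double CPy, double CPz,
                                             double L, const Intrinsics& K) {
    RegionOfSurveillance S;
    // Left border: upper-left front corner, or upper-left back corner if the
    // cube's left side is visible from the optical center.
    const double xULF = projectX(CPx - L, CPz - L, K);
    S.xMin = (xULF > K.cx) ? projectX(CPx - L, CPz + L, K) : xULF;
    // Top border.
    const double yULF = projectY(CPy - L, CPz - L, K);
    S.yMin = (yULF > K.cy) ? projectY(CPy - L, CPz + L, K) : yULF;
    // Right border: bottom-right front corner, or bottom-right back corner.
    const double xBRF = projectX(CPx + L, CPz - L, K);
    S.xMax = (xBRF < K.cx) ? projectX(CPx + L, CPz + L, K) : xBRF;
    // Bottom border.
    const double yBRF = projectY(CPy + L, CPz - L, K);
    S.yMax = (yBRF < K.cy) ? projectY(CPy + L, CPz + L, K) : yBRF;
    // Depth interval of the cube.
    S.dMin = CPz - L;
    S.dMax = CPz + L;
    return S;
}
```

The depth interval is taken directly from the cube, while each pixel border mirrors the four if-statements of Algorithm 1.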


Figure 2.9: The figure shows the projection of a cubic region of surveillance into the image plane. The solid red lines are the projection of the foremost side of the cube; the white lines correspond to the center plane, i.e. the plane where the center point ${}^C P$ is located. The green lines are the projection of the cube's back side. The orange lines represent the dimensions $y_s$ and $\frac{y_s}{2}$ calculated in equation 2.15, which are the y dimension of the projected cube's front side; the magenta lines represent $x_s$ and $\frac{x_s}{2}$, respectively. The long orange and magenta lines are exactly twice as long as the short orange and magenta lines, and they are centered at $P'$, the projection of ${}^C P$. As one can see, if one adds and subtracts $\frac{x_s}{2}$ and $\frac{y_s}{2}$ to and from $P'$ (the white cross), the region which is created will not contain the entire projection of the cube.

2.2.4 Lattice of Robot Points

The number of robot points for which the distances to object points are monitored greatly influences the computation time of the algorithm, since for every robot point the distances to every object point within its region of surveillance must be calculated. The main idea behind creating a lattice of robot points is to reduce the number of robot points whose robot-object distances must be calculated. However, it must be ensured that the points selected by the lattice represent the actual shape of the robot closely enough to produce an accurate distance evaluation. Thus, the lattice must cover the entire robot, but must be very fine in locations around the closest robot-obstacle distance points.

In the following it is assumed that the robot's shape is rather smooth and does not have strong discontinuities. Thus, robot points which are close together in the image plane are also close together in the Cartesian Space.


The lattice's requirement to cover the entire robot is thus transferred into covering all robot points in the robot depth image, and the requirement to be fine enough in the Cartesian Space transfers into being fine enough in the image plane.

To fulfill both requirements simultaneously and reduce the number of robot points considerably, the following procedure is proposed. First, an equally spaced coarse lattice of robot points is created in the robot depth image. Afterwards, the distances of these robot lattice points to all objects in their respective regions of surveillance are calculated, and the robot lattice point which is closest to an object is identified. To simplify the phrasing, whenever this thesis speaks of the closest lattice point, the lattice point is meant which lies closer to the closest object than all other lattice points. In the neighborhood of the closest lattice point, an equally spaced, finer lattice of robot points is created. Thereafter, the distances of these newly created robot lattice points to the objects in their regions of surveillance are calculated again. After the closest robot lattice point has again been identified, the lattice is refined in the neighborhood of this lattice point. This is continued until the spacing between two lattice points reaches a threshold. Then, one can either take the last robot lattice point obtained by this algorithm as the point which has the minimum distance to the objects in the scene, or one can refine the lattice once more with no spacing in between the lattice points, meaning that every robot point near the last robot lattice point is taken into consideration to calculate the distances to the objects one last time. A more detailed insight into this procedure is given in the implementation chapter of this thesis.

2.2.5 Minimum / Mean Distance Vector

With applications in collision avoidance in mind, it is helpful not only to know the minimum distance between a robot link and an obstacle, but also to have information about the relative position between the respective points on the robot and the obstacles. The most straightforward way of obtaining this information is to save the locations of the minimum distance points during the distance evaluation and afterwards compute a vector $V_{min}$ between those points. This vector is called the Minimum Distance Vector. As shown in [FKDLK14], this vector shows strongly oscillating behavior, especially if multiple objects are similarly close to the robot. Also, when using a collision avoidance control approach which moves the manipulator in the opposite direction of the Minimum Distance Vector, the whole dynamic behavior of the manipulator can become oscillatory. This becomes clear if one considers a scenario where one obstacle pushes the manipulator in the opposite direction of the Minimum Distance Vector, and another obstacle located in that direction pushes the manipulator back.
Therefore, a Mean Distance Vector $V_{mean}$ can be utilized, which considers all object points within the region of surveillance of the robot point which is closest to the obstacles.


It is simply the mean of all distance vectors between this robot point and the object points; $V_{mean}$ thus expresses the mean direction to all obstacles within the region of surveillance. If only the Mean Distance Vector is taken into consideration for collision avoidance, obstacles which are very close to the manipulator do not carry particular significance, which may lead to collisions. Therefore, [FKDLK14] proposed a hybrid model, where the minimum distance influences the intensity of the collision avoidance control reaction, whereas the Mean Distance Vector influences the control reaction's direction.
However, when the minimum distance information including $V_{min}$ is sufficient, one can save computational cost in the distance evaluation algorithm: when calculating the distances from one robot point to all objects in the region of surveillance, and a temporary minimum distance $D_{temp}$ is found, the region of surveillance can be shrunk to a cube with side length $2 D_{temp}$, since any object point outside this cube must have a greater distance to the robot point than $D_{temp}$.
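The aggregation of the distance information can be sketched in C++ as follows; the vector type and function names are assumptions for illustration. The sketch computes the minimum distance, the Minimum Distance Vector and the Mean Distance Vector from the set of robot-to-object distance vectors inside the region of surveillance.

```cpp
#include <cmath>
#include <limits>
#include <vector>

// Simple 3D vector; all types and names here are illustrative assumptions.
struct Vec3 { double x, y, z; };

struct DistanceInfo {
    double dMin;   // minimum robot-obstacle distance
    Vec3   vMin;   // Minimum Distance Vector
    Vec3   vMean;  // Mean Distance Vector
};

// Given the Cartesian distance vectors from one robot point to all object
// points inside its region of surveillance, compute dMin, vMin and vMean.
DistanceInfo aggregateDistances(const std::vector<Vec3>& distanceVectors) {
    DistanceInfo info{std::numeric_limits<double>::infinity(), {0, 0, 0}, {0, 0, 0}};
    for (const Vec3& v : distanceVectors) {
        const double d = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
        if (d < info.dMin) { info.dMin = d; info.vMin = v; }
        info.vMean.x += v.x; info.vMean.y += v.y; info.vMean.z += v.z;
    }
    if (!distanceVectors.empty()) {
        const double n = static_cast<double>(distanceVectors.size());
        info.vMean.x /= n; info.vMean.y /= n; info.vMean.z /= n;
    }
    // Note: if only dMin and vMin are needed, the surveillance cube can in
    // addition be shrunk to side length 2 * dMin whenever dMin decreases.
    return info;
}
```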

2.2.6 Pseudo-code

To summarize the above-mentioned steps of the distance evaluation algorithm, pseudo-code is presented in Algorithm 2. Please note that this algorithm outputs the minimum distance and the Minimum Distance Vector for one robot link. To calculate the distances for all robot links, this algorithm has to be repeated for every robot link, or a more interleaved algorithm has to be used, which would be too confusing to state in such a brief outline. Please also note that to produce a Mean Distance Vector, some additional steps and storage variables have to be introduced into Algorithm 2.


Algorithm 2 Distance Evaluation

 1: function [Dmin, Vmin] = getMinDistance(img, L)
 2:     [IrD, IfD, IID] ← filter(img)
 3:     Dmin ← ∞
 4:     PminR ← −1
 5:     repeat
 6:         [PRL] ← createLatticePoints(IrD, IID, PminR)
 7:         for every robot lattice point PRL do
 8:             DS ← getRegionOfSurveillance(PRL, L)
 9:             for every object point PO in DS do
10:                 [D, V] ← getDistanceBetween(PRL, PO, IfD)
11:                 if D < Dmin then
12:                     Dmin ← D
13:                     L ← Dmin
14:                     Vmin ← V
15:                     PminR ← PRL
16:                 end if
17:             end for
18:         end for
19:     until lattice spacing < threshold

Symbol explanation: IrD = robot depth image; IfD = filtered depth image; IID = linkID image; PRL = robot lattice point; PminR = minimal distance robot point; Dmin = minimum distance; Vmin = Minimum Distance Vector; DS = region of surveillance; L = region of surveillance length; img = input image of the depth sensor.
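The coarse-to-fine structure of Algorithm 2 can also be expressed as a small C++ skeleton. The callbacks, the halving of the lattice spacing and all parameter names are assumptions chosen for this sketch; they stand in for the concrete lattice-creation and distance routines described in Chapter 3.

```cpp
#include <functional>
#include <limits>
#include <vector>

// Illustrative skeleton of the coarse-to-fine search in Algorithm 2.
struct Pixel { int x, y; };

struct SearchResult {
    double dMin = std::numeric_limits<double>::infinity();
    Pixel  closestRobotPixel{-1, -1};
};

// latticeAt(spacing, center): robot lattice points for the given spacing,
//   restricted to the neighborhood of 'center' (ignored on the first pass).
// minDistanceFrom(p, regionLength): minimum robot-obstacle distance for robot
//   pixel p, considering only object pixels inside its region of surveillance.
SearchResult coarseToFineSearch(
    const std::function<std::vector<Pixel>(int, Pixel)>& latticeAt,
    const std::function<double(Pixel, double)>& minDistanceFrom,
    int initialSpacing, int finalSpacing, double initialRegionLength) {
    SearchResult best;
    double L = initialRegionLength;
    // Halve the lattice spacing per pass, consistent with the circle-radius
    // halving described in section 3.1.2, until the threshold is reached.
    for (int spacing = initialSpacing; spacing >= finalSpacing; spacing /= 2) {
        for (const Pixel& p : latticeAt(spacing, best.closestRobotPixel)) {
            const double d = minDistanceFrom(p, L);
            if (d < best.dMin) {
                best.dMin = d;
                best.closestRobotPixel = p;
                L = d;  // shrink the region of surveillance to the best distance so far
            }
        }
    }
    return best;
}
```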


Chapter 3

Results

3.1 Implementation

This section provides more details about the implementation of the URDF-filter and the distance evaluation algorithms. The URDF-filter operates mostly on a GPU, whereas there are two versions of the distance evaluation algorithm: the first is implemented on a CPU, the second on a GPU to reduce the computation time. In order to put the results of this thesis into context, the hardware specifications are briefly listed in Table 3.1.
The URDF-filter as well as the distance evaluation algorithms are embedded into a ROS (Robot Operating System) environment, which provides some auxiliary functions, especially in terms of communication between processes.

Table 3.1: Brief hardware specification of the computer used for this thesis.

CPU     Intel® Core™ i7-4790K Processor
        Cores: 4
        Frequency: 4 - 4.4 GHz
GPU     GeForce GTX 660
        CUDA cores: 960
        Memory: 2 GB
        Base Clock: 980 MHz
        Boost Clock: 1033 MHz
Memory  Team-Elite-1600
        Size: 16 GB
        Frequency: 1.6 GHz

3.1.1 URDF-filter

As already mentioned, the URDF-filter which is used in this thesis is an advancement of [Blo15]. Therefore, only a general overview and the main modifications that were made are described here.

The URDF-filter uses the OpenGL framework. Hence, the filter operation consists of an OpenGL rendering pipeline [SSKLK13, WLH07]. In the first step, the URDF model of the robot is transformed into a mesh, which is further transformed into an array of vertices. These vertices are uploaded onto the GPU, and a vertex shader computes the locations of these vertices depending on the current viewpoint and robot pose. Afterwards, the vertices are clipped, which basically means that all vertices which do not fit into the current viewport are discarded, and rasterized to create pixels inside the viewport. These pixels must then undergo certain tests: a depth test to check whether the pixel can be seen or is occluded by the same pixel coming from a different vertex, and a stencil test which is used to draw pixels only in predefined regions. All pixels which have passed both tests are processed by a fragment shader, which determines the color of the pixel. The result so far is similar to the image one would get by taking a picture of the robot.

The filtering of the robot out of the original depth image takes place in the fragment shader. First, the depth image is uploaded to the GPU so that it is accessible by the fragment shader. Then the fragment shader compares the depth value of the pixel which it receives from the previous stages of the rendering pipeline with the depth value stored in the original depth image. If the depth value of the original depth image is smaller than the depth value of the rendered model, the object represented by this pixel in the original depth image lies between the robot and the depth sensor; thus it must be considered as an object pixel, and this pixel of the depth image is copied to the filtered depth image. If the pixel of the depth image has the same depth value as the one produced by the rendering pipeline, then the pixel in the depth image represents the robot, and a replacement value is entered into the filtered depth image at this pixel location. If all measurements, models and calibrations were perfect, the third case, i.e. the depth value of the depth image being greater than the one produced by rendering the model, would not happen, since this would mean that one can look through the robot. However, imperfections occur, and therefore if the third case arises, a replacement value is also copied to the filtered depth image.

Additionally, cases in which the depth value of the depth image is only marginally smaller than the rendered one are also considered to be caused by imperfections and are therefore replaced in the filtered depth image, too. Otherwise, one would obtain several points which are very close to the robot, and the distance evaluation algorithm would return those as the closest obstacles, although they are very likely part of the robot itself. Moreover, the model of the robot is slightly enlarged at the rendering step in order to remedy discrepancies in the x and y dimensions of the depth image as well.
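The actual comparison runs in a GLSL fragment shader; the following C++ sketch only mirrors the per-pixel decision logic on the CPU. The sentinel value, the tolerance and all names are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <vector>

// CPU-side sketch of the per-pixel decision made in the URDF-filter's
// fragment shader. Not the thesis' actual shader code.
constexpr float kNoDepth   = 0.0f;    // replacement value for "no obstacle here"
constexpr float kTolerance = 0.02f;   // assumed depth band (meters) treated as "robot"

struct FilterOutput {
    std::vector<float> robotDepth;     // depth of pixels identified as robot
    std::vector<float> filteredDepth;  // depth of remaining obstacle/background pixels
};

FilterOutput filterRobot(const std::vector<float>& sensorDepth,
                         const std::vector<float>& renderedRobotDepth) {
    FilterOutput out;
    out.robotDepth.assign(sensorDepth.size(), kNoDepth);
    out.filteredDepth.assign(sensorDepth.size(), kNoDepth);
    for (std::size_t i = 0; i < sensorDepth.size(); ++i) {
        const float measured = sensorDepth[i];
        const float rendered = renderedRobotDepth[i];
        if (rendered == kNoDepth) {
            // No robot rendered here: everything measured is a potential obstacle.
            out.filteredDepth[i] = measured;
        } else if (measured < rendered - kTolerance) {
            // Measured point clearly in front of the robot: obstacle pixel.
            out.filteredDepth[i] = measured;
        } else {
            // Measured depth matches the rendered robot within the tolerance,
            // or lies behind it (imperfection): treat as a robot pixel and
            // leave the replacement value in the filtered depth image.
            out.robotDepth[i] = rendered;
        }
    }
    return out;
}
```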


The mesh of each robot link is accompanied by a link identifier value, which is also passed to the fragment shader. Accordingly, the fragment shader knows which pixel belongs to which robot link and puts a corresponding value into the linkID image at this pixel location.

The stencil test of the rendering pipeline provides the opportunity to create an additional region of surveillance very easily. Before rendering the actual robot model, one can render an enlarged version of the robot model and consider only the pixels in the original depth image which lie within this enlarged hull of the robot. Figure 2.7 shows only the pixels within a hull which is 6 times larger in every dimension than the robot itself. However, this region of surveillance is strongly dependent on the robot shape, resulting in high variations in the vicinity of thick and thin robot parts. Hence, the cubic region of surveillance produces more stable results and is therefore more accurate. Nevertheless, this method can be used if computation time is more important than accuracy.

3.1.2 CPU - algorithm

A general overview of the distance evaluation algorithm is given in Algorithm 2. This section provides more details about its individual steps. Since section 2.2.3 already gave detailed information on how to obtain a region of surveillance in the Depth Space, we focus here on the other parts of the algorithm.

Lattice of Robot Points

The reason for creating a lattice of robot points is to reduce the number of robot points which have to be considered in the distance evaluation algorithm, as stated in section 2.2.4. An equally spaced lattice, which is refined around the closest point found so far, is generated in the CPU distance evaluation algorithm as follows: first, all pixels in the robot depth image are scanned until a robot point is found (the robot depth image consists of the depth values of the robot, or 0 if a pixel does not belong to the robot). This point is saved into an array of lattice points and a circle is drawn around it, which prohibits adding any further point from within this circle to the array of lattice points. The search then continues, and as soon as a robot point is found which does not lie within any of the previously drawn circles, it is saved in the array of lattice points and a circle is drawn around it as well. This process is repeated until there are no more robot points outside the drawn circles. The result of this step is illustrated in sub-figure 3.1a. Then, for all stored robot lattice points, the distances to all objects within their regions of surveillance are calculated as described in section 2.1.2 and section 2.2.3. During this computation, the region of surveillance can be shrunk to the current minimum distance, even if a Mean Distance Vector should be outputted in the end, since this distance evaluation serves only to determine the closest lattice point - obstacle point pair.
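
The initial lattice creation step can be sketched as follows in C++. The row-major image layout, the value 0 for non-robot pixels and the function and variable names are assumptions for illustration; the separate handling of the individual robot links, described further below, is omitted.

```cpp
#include <vector>

struct PixelPoint { int x; int y; };

// Initial lattice creation: scan the robot depth image (row-major, 0 where a
// pixel does not belong to the robot) and accept a pixel as lattice point only
// if it is not closer than 'radius' pixels to any previously chosen point.
std::vector<PixelPoint> createLattice(const std::vector<float>& robotDepth,
                                      int width, int height, int radius)
{
    std::vector<PixelPoint> lattice;
    const int r2 = radius * radius;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (robotDepth[y * width + x] == 0.0f) continue;   // not a robot pixel
            bool insideExistingCircle = false;
            for (const PixelPoint& p : lattice) {
                const int dx = x - p.x;
                const int dy = y - p.y;
                if (dx * dx + dy * dy < r2) { insideExistingCircle = true; break; }
            }
            if (!insideExistingCircle) lattice.push_back({x, y});
        }
    }
    return lattice;
}
```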

(a) Lattice creation, step 1 (b) Lattice creation, step 2

Figure 3.1: Creation of the lattice of robot points. The image shown is based on the original depth image from the depth sensor. The robot is surrounded by a yellow border for better visual segmentation, and the box object is bordered in blue. Both the robot and the box are standing on a table, which is not considered an obstacle.
(a) The numbers indicate the order in which the lattice points are chosen. Each lattice point creates a circle in which no other lattice point is allowed.
(b) The lattice is refined in the neighborhood of the lattice point that was closest to the object in the previous step. The former circle is depicted in pink; the refined lattice points and their surrounding circles are colored green.

After the closest lattice point is identified, new lattice points are chosen from the depth image and new circles are drawn around them. Since we want to refine the lattice, the circles now have half of the previous radius. The region from which the new lattice points are chosen is restricted to the previous, bigger circle of the closest lattice point, which keeps the number of lattice points small. The result of this step is depicted in sub-figure 3.1b.

After the distances between these lattice points and the obstacles within their regions of surveillance have been computed again, the closest lattice point is identified and the lattice refinement process is repeated until the inter-lattice-point spacing falls below a threshold. The implementation in this thesis tested values from 1 pixel to 3 pixels as the final lattice point spacing.

Once the threshold is reached, the distances between the objects and those robot points which lie within a radius of the final lattice point spacing around the current closest lattice point are computed to obtain the final minimum distance.

In the description of this lattice creation and refinement algorithm, it was not mentioned that the robot consists of several links. However, since one of the main advantages of this algorithm is its capability of calculating the distance information separately for the different robot links, the lattice creation and refinement process also distinguishes between the different robot links. Accordingly, the algorithm creates the respective lattice points and circles for every robot link. This only slightly increases the computation time, because the number of lattice points remains almost the same. Only at the links' connections can the lattice points of different links be closer together than the current radius of the drawn circles. If one removed this relaxation of the spacing policy, the minimum distance information would become less accurate. This can be seen by considering a scenario where the object is near a link connection: one link would be allowed to place a lattice point near the connection, whereas the other link would not, since a lattice point of the first link has already drawn a circle there.

Search Pattern

During the distance evaluation between the robot lattice points and the objects within their regions of surveillance, the region of surveillance can be shrunk to the minimum distance found so far. This gives rise to the idea of starting the search for object points within the region of surveillance with object pixels which are close to the robot lattice pixel, because if an object point is found there, the region of surveillance can be shrunk dramatically. This assumption is of course only valid if small pixel distances also imply small Euclidean distances between the respective points.

Hence, three different search patterns are implemented in this thesis. The first one is a normal for loop, where all pixels within the region of surveillance are accessed by first iterating through the pixels' columns and then through the pixels' rows. The second one is an oscillating pattern, which accesses the object pixels in a way that is best described by referring to figure 3.2b. The third one is a spiral search pattern, where the object pixels are accessed in the order of a spiral which begins at the robot lattice point. All three search patterns are illustrated in figure 3.2.
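
As a hedged illustration of the underlying idea, the following C++ sketch produces a nearest-first access order by sorting all pixel offsets of a square window by their distance to the center. This is not the spiral index computation used in the thesis implementation, which generates the spiral on the fly, but it yields the same kind of ordering.

```cpp
#include <algorithm>
#include <vector>

struct Offset { int dx; int dy; };

// Nearest-first access order: all pixel offsets inside a square window of
// half-width 'halfWidth' around the robot lattice pixel, sorted by their
// squared distance to the centre. Testing nearby object pixels first lets the
// region of surveillance shrink early once a close object point is found.
std::vector<Offset> nearestFirstOrder(int halfWidth)
{
    std::vector<Offset> order;
    for (int dy = -halfWidth; dy <= halfWidth; ++dy)
        for (int dx = -halfWidth; dx <= halfWidth; ++dx)
            order.push_back({dx, dy});
    std::sort(order.begin(), order.end(), [](const Offset& a, const Offset& b) {
        return a.dx * a.dx + a.dy * a.dy < b.dx * b.dx + b.dy * b.dy;
    });
    return order;
}
```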

Object Points Skipping

In addition to the design parameters introduced in section 2.2, object point skipping is another design parameter to improve the algorithm's computational time. It will also be referred to as the lattice of object points. Instead of calculating the distances between the robot lattice points and all object points within their regions of surveillance, several object points are now skipped. In other words, only the distances between robot lattice points and object lattice points are calculated.

(a) For loop (b) Oscillating (c) Spiral

Figure 3.2: Illustration of the search patterns in the image plane. The red area represents a part of the robot, the yellow part is the border of the robot (the depth sensor often produces erroneous values at object borders), the light grey part is the background of the scene (objects which are far away), and the black part is a box-shaped object close to the robot. The current robot or lattice point is shown as a blue star. The numbers in (b) express the order in which the pixels are accessed.

The spacing between the object lattice points, denoted as stepX and stepY in this thesis, can be varied to produce more or less accurate results. Figure 3.3 illustrates a lattice of object points.
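
A minimal C++ sketch of the skipping loop is given below. The row-major image layout, the value 0 for "no object" and the distance callback (standing in for the Depth Space distance of section 2.1.2) are illustrative assumptions; shrinking of the region of surveillance is omitted for brevity.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Object point skipping inside a region of surveillance [x0,x1] x [y0,y1]:
// only every stepX-th column and stepY-th row of the (filtered) depth image is
// visited, i.e. distances are evaluated on a lattice of object points. A depth
// of 0 marks "no object"; 'distance' is a placeholder for the Depth Space
// distance between the object pixel and the robot lattice point.
float minDistanceWithSkipping(const std::vector<float>& objectDepth, int width,
                              int x0, int y0, int x1, int y1,
                              int stepX, int stepY,
                              int rx, int ry, float rDepth,
                              float (*distance)(int, int, float, int, int, float))
{
    float minDist = std::numeric_limits<float>::max();
    for (int y = y0; y <= y1; y += stepY) {
        for (int x = x0; x <= x1; x += stepX) {
            const float d = objectDepth[y * width + x];
            if (d == 0.0f) continue;                      // no object at this pixel
            minDist = std::min(minDist, distance(x, y, d, rx, ry, rDepth));
        }
    }
    return minDist;
}
```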

Mean Distance Vector

This section briefly describes the changes which have to be applied to the algorithm presented in Algorithm 2 to produce a Mean Distance Vector instead of a Minimum Distance Vector. The creation of the lattice of robot points and the distance evaluation between the lattice points and object points remain the same. The only modifications occur during the last step of the algorithm, i.e. in the stage where the distances between a final set of robot points and the respective objects have to be calculated.

For each robot point in the set, a storage vector vs = (0, 0, 0)^T is initialized. During the distance evaluation between one robot point and the object points in its region of surveillance, the number of accessed object points is tracked. Additionally, all distance vectors between the robot point and the object points are normalized and added to vs. As soon as all object points in the region of surveillance have been accessed, vs is divided by the number of accessed object points, yielding a normalized Mean Distance Vector for this robot point. It is important that the region of surveillance stays constant during this distance evaluation, as opposed to the pure minimum distance evaluation algorithm. Note that for every robot link, only the Mean Distance Vector belonging to the closest robot point is returned.
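
A minimal C++ sketch of this accumulation, assuming the Cartesian distance vectors for one robot point have already been collected, could look as follows; the container type and function name are illustrative assumptions.

```cpp
#include <array>
#include <cmath>
#include <vector>

using Vec3 = std::array<float, 3>;

// Mean Distance Vector for one robot point, assuming 'distanceVectors' holds
// the Cartesian distance vectors from this robot point to every object point
// inside its (fixed) region of surveillance.
Vec3 meanDistanceVector(const std::vector<Vec3>& distanceVectors)
{
    Vec3 vs = {0.0f, 0.0f, 0.0f};            // storage vector v_s
    int count = 0;                           // number of accessed object points
    for (const Vec3& v : distanceVectors) {
        const float norm = std::sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]);
        if (norm == 0.0f) continue;          // skip degenerate vectors
        for (int i = 0; i < 3; ++i) vs[i] += v[i] / norm;   // add the normalized vector
        ++count;
    }
    if (count > 0)
        for (int i = 0; i < 3; ++i) vs[i] /= static_cast<float>(count);
    return vs;
}
```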

Figure 3.3: Illustration of the lattice of object points in the image plane. The meaning of the colors red, yellow, gray and black is explained in figure 3.2. Distances are only computed between the robot point (star) and the object lattice points (green diamonds). The blue arrows indicate the search pattern (for loop) used to access the object pixels. In this illustration, stepX is 3 and stepY is 2.

3.1.3 GPU - algorithm

The algorithm implemented on the GPU closely follows the general design of the algorithm developed on the CPU. After a short introduction to the general framework in which the GPU algorithm runs, this section focuses on the differences between the GPU and the CPU algorithms.

General Framework

OpenGL provides general purpose computing on GPUs (GPGPU) through compute shaders. These shaders resemble the ones of the general OpenGL pipeline in their design; however, their inputs and outputs can be customized more freely, and one does not have to draw geometric primitives in order to run the computation.

The biggest advantage of delegating computational tasks to the GPU is that the computation can be performed in parallel there [Ebe14]. A single sequential line of code processing is called an invocation in OpenGL. Invocations are grouped into local work groups. The invocations within a local work group are able to share memory and to synchronize with each other. The local work groups are further arranged into global work groups. The total amount of parallel code executions is therefore given by #globalWorkGroups × size(localWorkGroup), where the symbol # indicates "the number of". Generally, each invocation should process a different part of the input, otherwise the parallelization would not make sense. Therefore, each invocation possesses information about its location within the global and local work groups. Depending on this location, different parts of the input can be processed and the associated outputs can be returned. One important aspect to consider is that the order in which the invocations are executed is not deterministic.
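
As an illustration, the following C++ sketch shows how the global work group count could be derived from the depth image resolution so that each invocation processes one pixel. The helper function and the local work group size of 16 x 16 are assumptions made for illustration and are not taken from the thesis implementation.

```cpp
#include <cstdint>

struct Dispatch { std::uint32_t groupsX; std::uint32_t groupsY; };

// Derive the global work group count from the depth image resolution for a
// fixed local work group size, so that each invocation processes one pixel.
// Invocations that fall outside the image bounds simply return early in the
// shader. Total invocations = (groupsX * groupsY) * (localSizeX * localSizeY).
Dispatch computeDispatch(std::uint32_t imageWidth, std::uint32_t imageHeight,
                         std::uint32_t localSizeX = 16, std::uint32_t localSizeY = 16)
{
    const std::uint32_t gx = (imageWidth  + localSizeX - 1) / localSizeX;  // round up
    const std::uint32_t gy = (imageHeight + localSizeY - 1) / localSizeY;
    return {gx, gy};
}
```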

Lattice of Robot Points

The implementation of the lattice point creation used in the CPU algorithm is based on a sequential procedure: the lattice points are created one after another, each followed by drawing the respective circle of forbidden region. If one wants to transfer this algorithm directly to the GPU, one has to implement multiple mutual exclusions and data locks, which dramatically increases the computation time of the algorithm. One solution to this problem is to let the CPU handle the lattice point creation and to calculate the distances on the GPU afterwards. This method requires some GPU-CPU communication, but the already implemented lattice creation algorithm for the CPU can be reused. Nonetheless, a second solution to this problem is proposed in this thesis, which exploits a different algorithm operating directly on the GPU.

First, the robot depth image is divided into quadratic tiles. For each tile, a local work group is dispatched (executed). Within this work group, the invocations check whether their location corresponds to a robot point. If it does, they save their location into an output variable. There is only one output variable per tile and per robot link, which means that for every robot link only one robot point per tile is returned, forming one point of the robot lattice. Since the order of execution among the invocations is not deterministic, the location which ends up in the output variable is the one written by the last finishing invocation within a work group. Consequently, the produced lattice of robot points is not equally spaced. To remedy this drawback, each invocation is associated with a quality value, which is derived from the Manhattan distance between the invocation's location and the center of the tile to which it belongs: the smaller this distance, the higher the quality value. An invocation is then only allowed to put its location into the output variable if its quality value is higher than the quality value belonging to the already stored location. Thus, the returned locations are close to the tiles' centers and produce a more equally spaced mesh. As shown in figure 3.4, the produced lattice is indeed almost equally spaced. The lattice refinement process is then similar to the procedure stated above: the region around the closest lattice point is divided into smaller tiles, and for each tile only one robot lattice point is allowed to be created. Again, locations near the new tiles' centers are preferred. However, one disadvantage compared to the lattice refinement on the CPU remains: the tile sizes must be hard-coded in the compute shaders, leading to less flexibility.
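
The tile-based selection can be emulated on the CPU by the following C++ sketch. The real implementation runs as an OpenGL compute shader with one local work group per tile and one output variable per tile and robot link; the sequential emulation below, the zero-depth convention and the concrete quality definition are illustrative assumptions.

```cpp
#include <cstdlib>
#include <vector>

struct LatticePoint { int x; int y; bool valid; };

// CPU emulation of the tile-based lattice creation: for each quadratic tile,
// the robot pixel with the smallest Manhattan distance to the tile centre wins
// (highest quality value), yielding an almost equally spaced lattice. The
// per-link output variables of the real compute shader are omitted.
std::vector<LatticePoint> tileLattice(const std::vector<float>& robotDepth,
                                      int width, int height, int tileSize)
{
    const int tilesX = (width  + tileSize - 1) / tileSize;
    const int tilesY = (height + tileSize - 1) / tileSize;
    std::vector<LatticePoint> lattice(tilesX * tilesY, {0, 0, false});
    std::vector<int> bestQuality(tilesX * tilesY, -1);

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (robotDepth[y * width + x] == 0.0f) continue;   // not a robot pixel
            const int tx = x / tileSize, ty = y / tileSize;
            const int cx = tx * tileSize + tileSize / 2;       // tile centre
            const int cy = ty * tileSize + tileSize / 2;
            // Higher quality = smaller Manhattan distance to the tile centre.
            const int quality = 2 * tileSize - (std::abs(x - cx) + std::abs(y - cy));
            const int idx = ty * tilesX + tx;
            if (quality > bestQuality[idx]) {
                bestQuality[idx] = quality;
                lattice[idx] = {x, y, true};
            }
        }
    }
    return lattice;
}
```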

Figure 3.4: Illustration of the lattice created by the GPU lattice generation algorithm. The red points are the lattice points from the first lattice creation step, the green points are the lattice points of the refinement step for each robot link. Note that the points have been enlarged for better visual perception.

Region of Surveillance and Search Pattern

The dimensions of the region of surveillance DS in the Depth Space depend on the depth of the robot point. Accordingly, one can only construct the region of surveillance after having obtained the depth of the robot point. Moreover, the number of object points within the region of surveillance of one robot point is not identical to the number of object points within the region of surveillance of another robot point, although the side length of the region in the Cartesian Space is equal. Consequently, the number of robot point - object point distances which have to be computed varies. In the GPU implementation of the distance evaluation algorithm, each invocation computes exactly one robot-object distance. Unfortunately, when using a compute shader, the total number of invocations must be known before its execution [JK15, Ope15]. Therefore, the region of surveillance for every robot or lattice point is the entire depth image. However, referring to section 3.1.1, one can use an enlarged robot model as a mask during the filtering step and produce only object points within this mask region. As a result, many of the robot-obstacle distance evaluation invocations return very quickly, since their locations do not contain object points.

There is no predefined search pattern implemented in the GPU algorithm, since the order of the invocations' execution is not deterministic. Therefore, the robot-object distances are computed in a random order.

3.2 Experimental Results

The algorithms were tested with the scene depicted in figure 3.1. For the accuracy evaluation, the results obtained from a brute force algorithm were set as ground truth. This brute force algorithm computes the distance from every robot point to every obstacle point and takes the minimum distance afterwards. The time this algorithm needs is shown in figure 3.5. Note that in all following figures, the area of surveillance length is the side length 2L of the cubic region of surveillance in the Cartesian Space. As one can clearly see, the computation time, as well as the number of calculated robot-object distances, increases with increasing area of surveillance length until an upper bound is reached. The upper bound is reached when the respective area of surveillance length forces the algorithm to search the entire image for object points. This upper bound depends on the depth values of the robot points, since DS depends on the robot points' depths.
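
For reference, the brute force ground truth computation can be sketched as follows in C++; the data structures and the distance callback (standing in for the Depth Space distance) are illustrative assumptions.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

struct DepthPixel { int x; int y; float depth; };

// Brute force reference: evaluate the distance from every robot pixel to every
// object pixel and keep the overall minimum. 'depthSpaceDistance' stands for
// the Depth Space distance computation and is a placeholder here.
float bruteForceMinDistance(const std::vector<DepthPixel>& robotPixels,
                            const std::vector<DepthPixel>& objectPixels,
                            float (*depthSpaceDistance)(const DepthPixel&, const DepthPixel&))
{
    float minDist = std::numeric_limits<float>::max();
    for (const DepthPixel& r : robotPixels)
        for (const DepthPixel& o : objectPixels)
            minDist = std::min(minDist, depthSpaceDistance(r, o));
    return minDist;
}
```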

CPU - Algorithm

First, the effects of the search pattern on the computation time and on the number of calculated robot-object point distances are examined. As one can see in figure 3.6a, the spiral search pattern takes the longest, whereas the normal for loop is the fastest. This is surprising, since the idea behind the spiral search pattern was to reduce the computation time by shrinking the region of surveillance for the robot points as fast as possible. But, as one can see in figure 3.6b, the number of calculated robot point - object point distances is almost the same for each search pattern. The reason for this is that in the observed scene, the object is not very close to the robot. Hence, the spiral search pattern accesses the object points (which are located towards the border of the region of surveillance) last. Additionally, the computational overhead for calculating the spiral search's order indices increases the computation time.

Figure 3.5: The figure shows the time needed for evaluating the distances with a brute force algorithm. The number of calculated robot-object distances is shown on the right y-axis. The different lines correspond to different search patterns.

In figure 3.7 one can see how the number of skipped object pixels affects the total number of visited object points. Clearly, the more points are skipped, the fewer have to be considered. However, figure 3.7b shows that above a certain bound, increasing stepX and stepY does not lead to fewer calculated robot-object distances. Figure 3.8 displays the execution time corresponding to those measurements. One can see an obvious correspondence between the visited object points and the execution time in these figures. To explain the possible reasons for the upper bound, it has to be mentioned that the object pixel skipping is only applied when determining the closest robot lattice points. When the final minimum distance is calculated between the objects and the robot points around the closest lattice point, no pixels are skipped. The upper bound is therefore the time the algorithm needs to create the lattice points and to calculate the final distances in the last step. The decrease in accuracy due to object skipping turns out to be very small. Referring to figure 3.9, even if one skips 32 object pixels in each dimension, the error in the minimum distance is less than 2 mm. However, this behavior is strongly dependent on the observed scene. If there are small objects in the depth image, it is possible that they are skipped completely by this algorithm. Or, if an object's extent is predominantly in the Cartesian z-dimension, the depth values of neighboring object pixels differ by a significant amount. Thus, if only one object lattice point falls onto the object, its value is not representative of the entire object, leading to less accurate distance information.

(a) Time (b) Object points

Figure 3.6: Influence of the search pattern on the computation time and the number of visited object points. The bars in the figures show the standard deviation.

When creating and refining the lattice of robot points, there are two adjustable design parameters. The first parameter is the spacing of the initially created lattice of robot points. As the lattice is constructed by drawing circles around the selected lattice points, this parameter is referred to as the initial grid radius. The bigger the initial grid radius, the fewer lattice points are created in the first step. The second adjustable parameter is the final grid radius, which is the threshold value for the lattice refinement process. This means that the lattice is refined as long as the radius of the circles drawn around the lattice points is bigger than the final grid radius. In the final step of the distance evaluation algorithm, the distances between the obstacles and the robot points within the final grid radius are calculated without skipping pixels to maintain the accuracy. This can be seen in figure 3.10b: the higher the threshold parameter final grid radius, the more accurate the output. There is no clear trend visible in the dependence of the accuracy on the initial grid radius, as figure 3.10a shows. Figure 3.11 provides the dependencies between the execution time, the final grid radius and the initial grid radius. Concerning computational time, there is no big difference if one chooses the initial grid radius in a range from 40 to 100 pixels. However, the final grid radius has a higher influence on the computation time, and should be chosen to be 2 in this case to maintain the accuracy.

(a) step 1 - 4 (b) step 8 - 32

Figure 3.7: Influence of object point skipping on the number of visited object points. step = 1 means that no object points are skipped during the distance evaluation, step = 4 means that 3 pixels in x and 3 pixels in y direction are skipped. The bars in the figures show the standard deviation.

Until now, the values for the computation time displayed in the figures were mean values averaged over many different parametrizations. To get an idea of how fast this algorithm can be, figure 3.12a shows the minimum execution time measured for the fastest parametrization of the same scene. Figure 3.12b again shows the accuracy depending on the spacing of the lattice of object points. Additionally, the errors obtained by using the fastest parametrization are added in this figure. One can see that the errors of the fastest parametrization are above average, but the minimum distance information is still acceptable.

(a) step 1 - 4 (b) step 8 - 32

Figure 3.8: Influence of object point skipping on the execution time. step = 1 means that no object points are skipped during the distance evaluation, step = 4 means that 3 pixels in x and 3 pixels in y direction are skipped. The bars in the figures show the standard deviation.

Figure 3.9: Mean values of the differences between the results obtained from the CPU distance evaluation algorithm and the brute force algorithm. In addition to the minimum distance, both algorithms return the location of the closest robot and object point for each link. The differences between those values are averaged and shown here with the respective standard deviation. The x-axis is the sum of stepX and stepY.

(a) Initial grid radius (b) Final grid radius

Figure 3.10: Influence of the initial grid radius and the final grid radius on the accuracy.

(a) Initial and final grid radius (b) Standard deviation

Figure 3.11: Influence of the initial grid radius (spacing of the first robot point lattice) in combination with the final lattice spacing on the computation time, as well as the standard deviation of the measurements.

(a) Computation time and object points (b) Accuracy

Figure 3.12: (a) shows the minimum execution time obtained during the measurements.
(b) shows the associated error values for the parametrization used to obtain the minimum execution time (dotted lines). For comparison, the mean values of all measurements are displayed as well (solid lines).

GPU - Algorithm

The results for the GPU algorithm are mainly distinguished by whether the handling of the lattice of robot points is done on the CPU (CPU Rasterization) or entirely on the GPU (GPU Rasterization). Figure 3.13 depicts the computation time as well as the accuracy for object point skipping. One can see that the GPU rasterization method is much faster than the CPU rasterization method, but this time efficiency comes with a loss in accuracy, especially if more than 32 object pixels are skipped in both directions. Figure 3.14 illustrates the same issue for the accuracy of the closest robot and object point locations.

(a) Computation time (b) Accuracy

Figure 3.13: Influence of object point skipping on computation time and accuracy.

Figure 3.15 shows the influence of the initial grid radius and the initial tile size on the computation time. Only two tile sizes were tested, since the tile size has to be hard-coded in the algorithm. In any case, the timing difference between the initial tile sizes is negligible. The timing differences concerning the initial radius of the CPU rasterization fluctuate more. As one can see in figure 3.16b, the fastest initial radius parametrization is also the least accurate one if only the more accurate data is considered, i.e. where object skipping above 32 pixels is not allowed. Figure 3.16a shows that the initial tile size in the GPU rasterization algorithm does not affect the accuracy.

Figure 3.14: Influence of object point skipping on the accuracy of the robot and object point locations.

To get an impression of the general conflict between computation time and accuracy, figure 3.17 shows all measurements grouped together by their parametrizations. Parametrizations where stepX and stepY are greater than 32 are discarded, since these parametrizations lead to a great loss in accuracy, as shown in figure 3.13b. The remaining 2200 parameter sets have been ordered by ascending computation time. As one can see, the accuracy increases with increasing computation time. However, the local fluctuations in the minimum distance error are high compared to the general increase in accuracy. Hence, there are many parameter sets which are equally accurate but have large timing differences. Figure 3.18 basically zooms into the leftmost region of figure 3.17, showing the 30 fastest parametrization sets. To output minimum distance information for all robot links which deviates by no more than 5 mm from the ground truth value, the algorithm needs around 2.2 ms for the scene we are capturing. This is much faster than the fastest execution of the pure CPU algorithm, which needed around 70 ms for a region of surveillance length of 0.8 m, but the CPU algorithm is slightly more accurate for that specific parametrization.

Figure 3.15: Influence of the initial grid radius (CPU Rasterization) and the tile size (GPU Rasterization) on the computation time.

All the measurements presented so far came solely from one static robot-obstacle scenario. However, to be able to make statements about the general timing performance of the GPU algorithm, a worst case scenario is constructed. This worst case scenario reflects the situation where the entire filtered depth image consists of object points and the robot is located behind those points, so that no values are filtered out of the depth image. Additionally, the robot covers the entire image plane, too. Thus, the entire image has to be filled with a lattice of robot points, and each lattice point has to calculate the distances to every object point in the entire image. However, the timing results provided in figure 3.19 are not much worse than the ones obtained from the observed scene. Since there is no ground truth for measuring the accuracy in the worst case, the 30 parameter sets which were determined to be the fastest in the worst case have been applied to the previous scene to produce the accuracy plot in this figure.

3.3 Discussion

The distance evaluation problem in the Depth Space exhibits inherently parallel characteristics: the distance between two points can be calculated independently of any other information. Thus, it is not surprising that, as shown in section 3.2, parallel computing techniques outperform sequential code execution. The obtained results for the distance evaluation algorithm are very promising. With an execution time of 1 ms - 10 ms, the algorithm can be repeated 100 - 1000 times per second.

(a) Tile size accuracy (b) Initial radius accuracy

Figure 3.16: Influence of the initial tile size and the initial radius on the accuracy. Since the accuracy decreases strongly for object pixel skipping steps above 32 pixels, two lines are depicted: the blue line takes the mean over all measurements, whereas the orange line only considers those measurements where the object pixel skipping steps were less than 32.

However, this time does not include the filtering step, which in its current state is the bottleneck with respect to faster execution times. Within the above mentioned time range, an accuracy of 5 mm deviation can easily be reached. In the current setup, a higher accuracy of the distance evaluation algorithm is not necessary, because other parts of the setup have a more important influence on the accuracy. The bottleneck concerning accuracy is again the URDF-filter. In order to eliminate alignment errors and erroneous measurements of the Kinect depth sensor, the robot model was enlarged slightly. The distance evaluation algorithm uses this enlarged model as a representation of the real robot and can thus not be more accurate than the magnitude of the re-sizing, which is generally more than 5 mm. However, it is possible to improve the accuracy of the filter if one keeps track of the objects in the scene. Erroneous measurements of the Kinect can then be detected and deleted if the measured depth value does not occur in the vicinity of the objects from the previous time step or in the direction of their movement.

Figure 3.17: All parameter sets where stepX and stepY were not greater than 32 are displayed with their accuracy and execution time.

Although a worst case scenario has been presented in this thesis, universally valid statements about the performance of the presented algorithms cannot be made, since too few robot-obstacle configurations have been tested. However, especially for scenarios similar to the one presented in this thesis, the results can give a general understanding of the relations between the different parameters, the computational time and the accuracy.

Further limitations on the validity of the results are due to considering the results of a brute force algorithm as ground truth. More meaningful statements could therefore be made if the results of the algorithm were compared with real world measurements. However, statements concerning the computational time of the algorithm are rather unaffected by this. Comparing the results with real world data instead of the results of the brute force algorithm would mostly assess the quality of the filter.

Figure 3.18: The 30 fastest parametrization sets under the constraint that stepX and stepY <= 32. The computation time, as well as the accuracy, is shown for those parametrization sets.

Figure 3.19: Worst case scenario. The error values do not come from the worst case scenario, but from the captured robot-obstacle scene, using the parametrizations determined by the worst case scenario.

Chapter 4

Conclusion

Starting from a general distance evaluation algorithm design in the Depth Space, several design parameters and extensions have been introduced, mostly to improve the algorithm's computation time. For this purpose, a lattice of robot points, a lattice of object points and a region of surveillance are created to reduce the number of robot-obstacle distances which have to be computed.

It has been shown how the design parameters and extensions can be implemented in a sequential and a parallel fashion, suited for running on either a CPU or a GPU. The most challenging part has been to create an equally spaced lattice of robot points which refines itself around the closest robot point. On the CPU this was solved by picking points from the robot depth image that do not lie within circles drawn dynamically around the already chosen lattice points. On the GPU, the robot depth image is divided into tiles, and points close to the tile centers are picked as robot lattice points.

Furthermore, the influences of the design parameters have been studied and performance results have been presented for one robot-obstacle configuration. The computational time and the deviation of the results from those of a brute force algorithm have been used as performance measures. It has been shown that the parallel implementation on the GPU clearly outperforms the implementation on the CPU. The most important parameter affecting the computational time is the number of object pixels which are skipped during the distance computation. However, if this parameter is kept below a scene-dependent threshold value, the results of the distance evaluation remain accurate.

The overall results of the GPU distance algorithm are very promising. Execution frequencies of 100 - 1000 Hz are in theory possible while simultaneously keeping the results accurate. The validity of these results needs to be further verified by considering different robot-obstacle configurations, as well as by comparing the obtained distance results with real world data.

Future work might address the usage of this algorithm in robot-obstacle collision avoidance tasks, where its ability to output the minimum distance for every robot link can be analyzed and evaluated.

List of Figures

1.1 Object pixel’s frustum . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2.1 Depth sensor and Depth space . . . . . . . . . . . . . . . . . . . . . 13
2.2 Projection into the image plane . . . . . . . . . . . . . . . . . . . . 14
2.3 Depth Image and Gray Area . . . . . . . . . . . . . . . . . . . . . . 15
2.4 Pixel discretization . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.5 Distances between Lines . . . . . . . . . . . . . . . . . . . . . . . . 19
2.6 KUKA LWR model . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.7 URDF filter screenshot . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.8 Region of Surveillance . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.9 Approximated Region of Surveillance . . . . . . . . . . . . . . . . . 28

3.1 Lattice of robot points creation . . . . . . . . . . . . . . . . . . . . . 36
3.2 Search Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3 Object Skipping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.4 GPU Robot Lattice . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.5 Brute Force algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.6 Search pattern - time . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.7 Object skipping - visited points . . . . . . . . . . . . . . . . . . . . . 45
3.8 Object skipping - time . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.9 Object pixel skipping - accuracy . . . . . . . . . . . . . . . . . . . . 47
3.10 Initial and final grid radius - accuracy . . . . . . . . . . . . . . . . . 48
3.11 Initial and final grid radius - time . . . . . . . . . . . . . . . . . . . 49
3.12 Fastest Parametrization Time - Accuracy . . . . . . . . . . . . . . . 50
3.13 GPU: Object skipping . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.14 GPU: Accuracy for robot and object points . . . . . . . . . . . . . . 52
3.15 GPU: Initial Radius - Tile size . . . . . . . . . . . . . . . . . . . . . 53
3.16 GPU: Tile size / Initial radius accuracy . . . . . . . . . . . . . . . . 54
3.17 GPU: Time for all parameter sets . . . . . . . . . . . . . . . . . . . 55
3.18 GPU: Fastest 30 parameter sets . . . . . . . . . . . . . . . . . . . . 56
3.19 GPU: Worst case . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

Bibliography

[BB06] L. Balan and G. M. Bone. Real-time 3d collision avoidance method for safe human and robot coexistence. In Intelligent Robots and Systems, 2006 IEEE/RSJ International Conference on, pages 276–282, Oct 2006.

[Blo15] Nico Blodow. Realtime urdf filter, May 2015. Available from: https://github.com/blodow/realtime_urdf_filter.

[DM12] P. Dzitac and A. M. Mazid. A depth sensor to control pick-and-place robots for fruit packaging. In Control Automation Robotics Vision (ICARCV), 2012 12th International Conference on, pages 949–954, Dec 2012.

[Ebe14] David H. Eberly. GPGPU Programming for Games and Science. A. K. Peters, Ltd., 1st edition, 2014.

[FKDLK12] F. Flacco, T. Kroger, A. De Luca, and O. Khatib. A depth space approach to human-robot collision avoidance. In Robotics and Automation (ICRA), 2012 IEEE International Conference on, pages 338–345, May 2012.

[FKDLK14] Fabrizio Flacco, Torsten Kroeger, Alessandro De Luca, and Oussama Khatib. A depth space approach for evaluating distance to objects. Journal of Intelligent and Robotic Systems, pages 1–16, 2014.

[HASDLH08] S. Haddadin, A. Albu-Schaeffer, A. De Luca, and G. Hirzinger. Collision detection and reaction: A contribution to safe physical human-robot interaction. In Intelligent Robots and Systems, 2008. IROS 2008. IEEE/RSJ International Conference on, pages 3356–3363, Sept 2008.

[Iko14] N. Ikoma. On gpgpu parallel implementation of hands and arms motion estimation of a car driver with depth image sensor by particle filter. In World Automation Congress (WAC), 2014, pages 246–251, Aug 2014.

[JK15] John Kessenich, Dave Baldwin, and Randi Rost. The OpenGL Shading Language, Jan 2015. Available from: https://www.opengl.org/registry/doc/GLSLangSpec.4.50.pdf.

[KH07] S. Kuhn and D. Henrich. Fast vision-based minimum distance determination between known and unknown objects. In Intelligent Robots and Systems, 2007. IROS 2007. IEEE/RSJ International Conference on, pages 2186–2191, Oct 2007.

[Kha85] O. Khatib. Real-time obstacle avoidance for manipulators and mobile robots. In Robotics and Automation. Proceedings. 1985 IEEE International Conference on, volume 2, pages 500–505, Mar 1985.

[KHB+14] K. B. Kaldestad, S. Haddadin, R. Belder, G. Hovland, and D. A. Anisi. Collision avoidance with potential fields based on parallel processing of 3d-point cloud data on the gpu. In Robotics and Automation (ICRA), 2014 IEEE International Conference on, pages 3250–3257, May 2014.

[KSLW14] Hao-Yuan Kuo, Hong-Ren Su, Shang-Hong Lai, and Chin-Chia Wu. 3d object detection and pose estimation from depth image for robotic bin picking. In Automation Science and Engineering (CASE), 2014 IEEE International Conference on, pages 1264–1269, Aug 2014.

[MSdPPF00] B. Martinez-Salvador, A. P. del Pobil, and M. Perez-Francisco. A hierarchy of detail for fast collision detection. In Intelligent Robots and Systems, 2000. (IROS 2000). Proceedings. 2000 IEEE/RSJ International Conference on, volume 1, pages 745–750, 2000.

[MSKS03] Yi Ma, Stefano Soatto, Jana Kosecka, and S. Shankar Sastry. An Invitation to 3-D Vision: From Images to Geometric Models. Springer Verlag, 2003.

[Ope15] OpenGL reference pages, Aug 2015. Available from: https://www.opengl.org/sdk/docs/man4/.

[PB13] B. Peasley and S. Birchfield. Real-time obstacle detection and avoidance in the presence of specular surfaces using an active 3d sensor. In Robot Vision (WORV), 2013 IEEE Workshop on, pages 197–202, Jan 2013.

[PS13] C. Pillajo and J. E. Sierra. Human machine interface hmi using kinect sensor to control a scara robot. In Communications and Computing (COLCOM), 2013 IEEE Colombian Conference on, pages 1–5, May 2013.

[PSCM13] Jia Pan, I. A. Sucan, S. Chitta, and D. Manocha. Real-time collision detection and distance computation on point cloud sensor data. In Robotics and Automation (ICRA), 2013 IEEE International Conference on, pages 3593–3599, May 2013.

[RRC11] P. Rakprayoon, M. Ruchanurucks, and A. Coundoul. Kinect-based obstacle detection for manipulator. In System Integration (SII), 2011 IEEE/SICE International Symposium on, pages 68–73, Dec 2011.

[SBF09] R. Schiavi, A. Bicchi, and F. Flacco. Integration of active and passive compliance control for safe human-robot coexistence. In Robotics and Automation, 2009. ICRA '09. IEEE International Conference on, pages 259–264, May 2009.

[SK14] M. V. Sotnikova and M. V. Korovkin. Object recognition and distance evaluation using single camera. In Computer Technologies in Physical and Engineering Applications (ICCTPEA), 2014 International Conference on, pages 174–174, June 2014.

[SL13] M. Saveriano and Dongheui Lee. Point cloud based dynamical system modulation for reactive avoidance of convex and concave obstacles. In Intelligent Robots and Systems (IROS), 2013 IEEE/RSJ International Conference on, pages 5380–5387, Nov 2013.

[SL14a] M. Saveriano and Dongheui Lee. Distance based dynamical system modulation for reactive avoidance of moving obstacles. In Robotics and Automation (ICRA), 2014 IEEE International Conference on, pages 5618–5623, May 2014.

[SL14b] M. Saveriano and Dongheui Lee. Safe motion generation and online reshaping using dynamical systems. In Ubiquitous Robots and Ambient Intelligence (URAI), 2014 11th International Conference on, pages 45–45, Nov 2014.

[SSKLK13] Dave Shreiner, Graham Sellers, John M. Kessenich, and Bill M. Licea-Kane. OpenGL Programming Guide: The Official Guide to Learning OpenGL, Version 4.3. Addison-Wesley Professional, 8th edition, 2013.

[STD09] Giovanna Sansoni, Marco Trebeschi, and Franco Docchio. State-of-the-art and applications of 3d imaging sensors in industry, cultural heritage, medicine, and criminal investigation. Sensors, 9(1):568, 2009.

[SW14] Robert Sedgewick and Kevin Wayne. Algorithmen - Algorithmen und Datenstrukturen (4. aktualisierte Auflage). Pearson Studium, 2014.

[URD15] Xml robot description format (urdf), August 2015. Available from: http://wiki.ros.org/urdf/XML/model.

[WLH07] Richard Wright, Benjamin Lipchak, and Nicholas Haemel. OpenGL® SuperBible: Comprehensive Tutorial and Reference, Fourth Edition. Addison-Wesley Professional, fourth edition, 2007.

[Zha12] Zhengyou Zhang. Microsoft kinect sensor and its effect. MultiMedia, IEEE, 19(2):4–10, Feb 2012.

License

This work is licensed under the Creative Commons Attribution 3.0 Germany License. To view a copy of this license, visit http://creativecommons.org or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California 94105, USA.