Robot Computer Vision
Dr. Mohammad Iqbal
Based on slides by Dr. John (Jizhong) Xiao, The City College of New York
Pengantar Robotika – Universitas Gunadarma
Introduction
• What is computer vision?
  – Cameras can be thought of as an array of individual sensors
  – Computer vision attempts to use known constraints (whether from physics or known environmental structure of our world) to extract relevant information from the sensor values and this known world structure
• We will only deal with geometric issues of computer vision
  – i.e. not optics or illumination issues
• We will discuss two primary functions:
  1. Grouping/segmenting objects within an image
  2. Extracting position and orientation information from the image
Machine Vision and Robots
• Machine Vision: acquisition of image data, followed by processing and interpretation of that data for industrial applications such as inspection, measurement, etc.
  – Accurate object positioning
  – Maintaining relative position
  – Object measurement
  – Object recognition
  – Object registration
• Visual Servoing
Types of Machine Vision
• Inspection
• Identification
• Robot guidance
Types of Machine Vision
• 2D systems
  – Measure dimensions
  – Verify the presence of a component
  – Verify features and colors
  – Check printing and codes
  – Detect defects
• 3D systems
  – Reconstruction and inspection of complex shapes
Steps of Machine Vision
• Image acquisition
• Image processing and analysis
• Image interpretation
Example Machine Vision System
COGNEX
• 33rd year
• Public company (CGNX on NASDAQ)
• 750,000+ systems installed
• Worldwide leader in machine vision
Example Machine Vision System
Behind Cognex ID tools
• State-of-the-art technology:
  – Object location
  – Feature detection
  – Object recognition
  – Image processing
• Intellectual property
  – 193 patents
[Figures: PatMax Object Location; Cognex ID Technology]
Example Machine Vision System
COGNEX Products
Vision Sensors
• Single Perspective Camera
• Multiple Perspective Cameras (Stereo Camera Pair)
• Laser Scanner
• Omnidirectional Camera
• Structured Light Sensor
Vision Sensors
Single Perspective Camera
Vision Sensors
Multiple Perspective Cameras (Stereo Camera)
Vision Sensors
Laser Scanner (Aqsense)
Vision Sensors
Omnidirectional Camera
Vision Sensors
Structured Light
Machine Vision Mechanism: Visual Servoing
• A vision-guided robot has a closed control loop system.
Machine Vision Mechanism: Visual Servoing
• Camera configurations:
  – End-effector mounted
  – Fixed
Machine Vision Mechanism
• Example: ABB Integrated Vision
Machine Vision Mechanism
• Motoman MotoSight
Going Deeper into Machine Vision
1. Prepare the imaging system: camera (board/webcam), lens, scene.
2. Prepare vision programming: language, open-source vision framework (OpenCV, SimpleCV, ...)
3. Integrate the controller
4. Integrate vision programming with mechatronics programming
Going Deeper into Machine Vision
Choosing a Controller
1. Microcontroller:
   • Arduino
   • certain other types
2. Mini PC:
   • BeagleBone
   • Raspberry Pi
3. Desktop PC / laptop
4. Smartphone / tablet: Android
Camera Coordinate Frame
• The camera consists of a lens (focal length λ) and an image plane (where the pixel array is physically located)
• The image plane is an array of pixels of dimension $N_{rows} \times N_{columns}$
  – (u,v) are used to parameterize the image plane
• First define the camera coordinates:
  – Definition: the center of projection is the origin of the camera frame, located λ behind the image plane
    • The x and y axes of the camera frame are parallel to the image plane
  – Definition: the optical axis is the line that is collinear with the z coordinate of the origin of the camera frame
  – Definition: the principal point is the intersection point of the image plane and the optical axis
Camera Coordinate Frame
• Thus, any point on the image plane will have coordinates (u, v, λ)
• Perspective projection:
  – Let P be a point in the world with coordinates (x, y, z)
  – Let p be the projection of P onto the image plane with coordinates (u, v, λ)
  – The points P, p, and the origin of the camera frame are collinear, thus:
$$k \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} u \\ v \\ \lambda \end{bmatrix} \qquad\Longrightarrow\qquad u = \frac{\lambda x}{z}, \quad v = \frac{\lambda y}{z}$$
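To make the projection relations concrete, here is a minimal Python sketch; the function name `project` and the sample point are illustrative, not from the slides:

```python
# Minimal sketch of perspective projection. `lam` is the focal length λ,
# and P = (x, y, z) is a point expressed in camera coordinates.
def project(P, lam):
    """Project a 3D point in the camera frame onto the image plane."""
    x, y, z = P
    u = lam * x / z   # u = λx / z
    v = lam * y / z   # v = λy / z
    return u, v

print(project((0.2, -0.1, 2.0), lam=0.05))  # -> (0.005, -0.0025)
```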
Segmentation
• Dividing an image into separate components
• Numerous ways to do this; one is to choose thresholds
  – If a pixel value is above a threshold, it is in one group, and if it is below the threshold it is in another
Definitions
1. Histogram: for an 8-bit grayscale image (pixel values from 0–255), the histogram H(z) is a count of the number of occurrences of the value z
   • Note that: $0 \le H(z) \le N_{rows} \times N_{columns}$
   • And: $\sum_{z=0}^{255} H(z) = N_{rows} \times N_{columns}$
2. Probability that a pixel will have value z: $P(z) = \dfrac{H(z)}{N_{rows} \times N_{columns}}$
3. Mean value of the grayscale image: $\mu = \sum_{z=0}^{255} z\,P(z)$
4. Variance of a grayscale image: $\sigma^2 = \sum_{z=0}^{255} (z - \mu)^2\,P(z)$
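A minimal NumPy sketch of these definitions, using a random placeholder image (all variable names are illustrative):

```python
import numpy as np

# `img` stands in for an 8-bit grayscale image (values 0-255).
img = np.random.randint(0, 256, size=(480, 640), dtype=np.uint8)

H = np.bincount(img.ravel(), minlength=256)   # histogram H(z)
P = H / img.size                              # P(z) = H(z) / (Nrows * Ncolumns)
z = np.arange(256)
mu = np.sum(z * P)                            # mean of the image
sigma2 = np.sum((z - mu) ** 2 * P)            # variance of the image
```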
Definitions
• Suppose that the image is composed of a number of objects
• Instead of computing the mean for the whole image, we can compute the mean for each object
  – First, construct individual histograms for each object and for the background
  – Notation: $H_i$ is the histogram for the ith object, i = 0 is the background
  – Then the mean for the ith object is:

$$\mu_i = \sum_{z=0}^{255} z\,P_i(z) = \frac{\sum_{z=0}^{255} z\,H_i(z)}{\sum_{z=0}^{255} H_i(z)}$$

  – Called the conditional mean
  – The conditional variance is:

$$\sigma_i^2 = \sum_{z=0}^{255} (z - \mu_i)^2\,P_i(z) = \frac{\sum_{z=0}^{255} (z - \mu_i)^2\,H_i(z)}{\sum_{z=0}^{255} H_i(z)}$$
Threshold Selection
• Call $z_t$ the threshold
• We will first divide the image into two groups based upon the threshold:
  – If a pixel value $z > z_t$, that pixel belongs to group 1, otherwise group 0
• First we rewrite the conditional means and variances for each group
  – Definition: $q_i(z_t)$ is the probability that a pixel will belong to group i given the threshold $z_t$
  – Of course, $q_0(z_t) + q_1(z_t) = 1$

$$q_0(z_t) = \sum_{z=0}^{z_t} \frac{H(z)}{N_{rows} \times N_{columns}}, \qquad q_1(z_t) = \sum_{z=z_t+1}^{255} \frac{H(z)}{N_{rows} \times N_{columns}}$$
Threshold Selection
• Now let's rewrite the conditional means:

$$\mu_i = \sum_{z=0}^{255} z\,P_i(z) = \frac{\sum_{z=0}^{255} z\,H_i(z)/(N_{rows} \times N_{columns})}{\sum_{z=0}^{255} H_i(z)/(N_{rows} \times N_{columns})}$$

• Using the separation of the two groups by the threshold, we can write:

$$H_0(z) = \begin{cases} N_{rows} \times N_{columns}\,P(z) & z \le z_t \\ 0 & \text{else} \end{cases} \qquad H_1(z) = \begin{cases} N_{rows} \times N_{columns}\,P(z) & z > z_t \\ 0 & \text{else} \end{cases}$$

• Combining with the above conditional means:

$$\mu_0(z_t) = \frac{\sum_{z=0}^{z_t} z\,H(z)/(N_{rows} \times N_{columns})}{\sum_{z=0}^{z_t} H(z)/(N_{rows} \times N_{columns})} = \frac{\sum_{z=0}^{z_t} z\,P(z)}{q_0(z_t)}$$

$$\mu_1(z_t) = \frac{\sum_{z=z_t+1}^{255} z\,H(z)/(N_{rows} \times N_{columns})}{\sum_{z=z_t+1}^{255} H(z)/(N_{rows} \times N_{columns})} = \frac{\sum_{z=z_t+1}^{255} z\,P(z)}{q_1(z_t)}$$
Threshold Selection
• Similarly, the conditional variances are:

$$\sigma_0^2(z_t) = \frac{\sum_{z=0}^{z_t} \left( z - \mu_0(z_t) \right)^2 P(z)}{q_0(z_t)}, \qquad \sigma_1^2(z_t) = \frac{\sum_{z=z_t+1}^{255} \left( z - \mu_1(z_t) \right)^2 P(z)}{q_1(z_t)}$$

• OK, now we have the histograms, conditional means, and conditional variances for each group
  – but these are based upon some threshold $z_t$… how do we determine $z_t$?
• Intuitively, if we have a good choice for $z_t$, the variances will be small
  – i.e. the values of the pixels in a given group will be close to the group mean
  – Thus a good choice for $z_t$ is a value that minimizes all group variances
Threshold Selection
• Definition: the within-group variance is defined as:

$$\sigma_w^2(z_t) = q_0(z_t)\,\sigma_0^2(z_t) + q_1(z_t)\,\sigma_1^2(z_t)$$

  – This is a weighted average of the variances of the two groups
    • Weighted with the probability of a pixel being in that group
  – Choose $z_t$ to minimize $\sigma_w^2(z_t)$
  – To do this, iterate over all values of $z_t$ and choose the value that minimizes the within-group variance (see the sketch below)
  – Note that this can be computationally expensive and there are alternative approaches that are faster
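Scanning all candidate thresholds to minimize the within-group variance is essentially Otsu's method. Below is a minimal NumPy sketch of that exhaustive scan; the function name is illustrative and no attempt is made at the faster alternatives the slide mentions:

```python
import numpy as np

def within_group_threshold(img):
    """Pick the threshold z_t that minimizes the within-group variance."""
    H = np.bincount(img.ravel(), minlength=256)
    P = H / img.size
    z = np.arange(256)
    best_zt, best_var = 0, np.inf
    for zt in range(255):
        q0, q1 = P[:zt + 1].sum(), P[zt + 1:].sum()
        if q0 == 0 or q1 == 0:
            continue  # one group is empty: skip this threshold
        mu0 = (z[:zt + 1] * P[:zt + 1]).sum() / q0
        mu1 = (z[zt + 1:] * P[zt + 1:]).sum() / q1
        var0 = ((z[:zt + 1] - mu0) ** 2 * P[:zt + 1]).sum() / q0
        var1 = ((z[zt + 1:] - mu1) ** 2 * P[zt + 1:]).sum() / q1
        sw = q0 * var0 + q1 * var1  # within-group variance σ_w²(z_t)
        if sw < best_var:
            best_var, best_zt = sw, zt
    return best_zt
```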
Connected Components
• Now we have separated the image into two groups: object(s) and background
• It is possible that there are multiple objects in the image… how do we identify individual objects?
• First, we define connected pixels:
  – Consider a pixel with coordinates (r,c)
  – This pixel will have nearest neighbors (r−1,c), (r+1,c), (r,c−1), (r,c+1)
  – Two pixels are 4-connected if, for a given pixel, another pixel is one of its four nearest neighbors
  – If you include the diagonal pixels, we can define 8-connected in the same way
• A connected component is a set of pixels such that for any two pixels in the set, there is a connected path between them
Connected Components
• We want to assign a unique identifier to each set of connected components
  – To identify individual objects in the image, perform the following algorithm (a sketch follows below):
    1. Raster-scan the image from left to right and top to bottom
       • For a given pixel with coordinates (r,c), if the pixels immediately above and immediately to the left are background, then this is a new object: assign a new label
       • If either of the pixels directly above or directly to the left has received an assignment, take the minimum of these two as the label for pixel (r,c)
       • If both the pixel directly above and the pixel directly to the left have been assigned a value, note an equivalence
    2. Raster-scan again, this time noting the equivalence
       • Replace each label with the minimum of the label's equivalence
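A minimal Python sketch of this two-pass algorithm, assuming a binary NumPy image with nonzero foreground; the name `two_pass_label` and the union-find equivalence table are illustrative choices:

```python
import numpy as np

def two_pass_label(binary):
    """Two-pass 4-connected component labeling with an equivalence table."""
    rows, cols = binary.shape
    labels = np.zeros((rows, cols), dtype=int)
    parent = {}                      # union-find equivalence table

    def find(l):                     # resolve a label to its representative
        while parent[l] != l:
            l = parent[l]
        return l

    next_label = 1
    # Pass 1: provisional labels, recording up/left equivalences.
    for r in range(rows):
        for c in range(cols):
            if not binary[r, c]:
                continue
            up = labels[r - 1, c] if r > 0 else 0
            left = labels[r, c - 1] if c > 0 else 0
            neighbors = [l for l in (up, left) if l]
            if not neighbors:                    # new object: new label
                labels[r, c] = parent[next_label] = next_label
                next_label += 1
            else:
                labels[r, c] = min(neighbors)    # take the minimum label
                if len(neighbors) == 2:          # both assigned: equivalence
                    a, b = find(neighbors[0]), find(neighbors[1])
                    parent[max(a, b)] = min(a, b)
    # Pass 2: replace each label with the minimum of its equivalence class.
    for r in range(rows):
        for c in range(cols):
            if labels[r, c]:
                labels[r, c] = find(labels[r, c])
    return labels
```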
Connected Components
[Figure: (a) initial thresholded image, (b) assignment after first raster scan, (c) assignment after second raster scan, (d) final component assignment]
Indicator Function
• Once the image is separated into individual objects, it is very useful to construct an indicator function as a mapping to describe whether a pixel belongs to a particular object
  – Thus there are $N_{objects}$ indicator functions that are each the size of the original image

$$I_i(r,c) = \begin{cases} 1 & \text{pixel } (r,c) \text{ is in component } i \\ 0 & \text{else} \end{cases}$$
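Given a label image such as the output of the two-pass sketch above, each indicator function reduces to one comparison (illustrative):

```python
# Indicator function for component i, as a 0/1 array the size of the image.
# `labels` is a label image (e.g. from two_pass_label); `i` is a component id.
I_i = (labels == i).astype(int)
```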
Position and Orientation
• Finally, we get to the point that we can extract useful information from the world
• We use the indicator function to extract information about each object
  – Since we have a planar image, all we can get is position and orientation
• Intermediate step: moments
• Definition: the moment of the kth object (group) in the image plane is defined as:

$$m_{ij}(k) = \sum_{r=1}^{N_{rows}} \sum_{c=1}^{N_{columns}} r^i c^j\, I_k(r,c)$$

• Note: $m_{00}$ is the number of pixels in a given object
Position and Orientation
• The order of the moment is defined as i + j
• First-order moments are very useful in computing the centroid of an object:

$$m_{10}(k) = \sum_{r=1}^{N_{rows}} \sum_{c=1}^{N_{columns}} r\, I_k(r,c), \qquad m_{01}(k) = \sum_{r=1}^{N_{rows}} \sum_{c=1}^{N_{columns}} c\, I_k(r,c)$$
Position and Orientation
• Now we can use the moments of a group to directly get information about the object it represents
Centroids
• Define the centroid as the point at which, if all mass were at this point, the first moment would not change
  – Notation: $(\bar r, \bar c)$
  – From this definition, we can say:

$$\sum_{r=1}^{N_{rows}} \sum_{c=1}^{N_{columns}} \bar r_k\, I_k(r,c) = \sum_{r=1}^{N_{rows}} \sum_{c=1}^{N_{columns}} r\, I_k(r,c), \qquad \sum_{r=1}^{N_{rows}} \sum_{c=1}^{N_{columns}} \bar c_k\, I_k(r,c) = \sum_{r=1}^{N_{rows}} \sum_{c=1}^{N_{columns}} c\, I_k(r,c)$$

$$\bar r_k\, m_{00}(k) = m_{10}(k), \qquad \bar c_k\, m_{00}(k) = m_{01}(k)$$

$$\bar r_k = \frac{m_{10}(k)}{m_{00}(k)}, \qquad \bar c_k = \frac{m_{01}(k)}{m_{00}(k)}$$
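A minimal NumPy sketch of the centroid computed from an indicator function (the function name is illustrative; `I_k` is a 0/1 mask as defined above):

```python
import numpy as np

def centroid(I_k):
    """Centroid (r̄, c̄) of a component from its indicator function."""
    rows, cols = np.nonzero(I_k)    # coordinates of the component's pixels
    m00 = I_k.sum()                 # m00 = number of pixels in the object
    r_bar = rows.sum() / m00        # m10 / m00
    c_bar = cols.sum() / m00        # m01 / m00
    return r_bar, c_bar
```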
Central Moments
• Calculate the moment with respect to the object's center of mass
• Called the central moments

$$C_{ij}(k) = \sum_{r=1}^{N_{rows}} \sum_{c=1}^{N_{columns}} (r - \bar r_k)^i (c - \bar c_k)^j\, I_k(r,c)$$

• Thus the moments are now invariant to translation of the object
Object Orientation
• Define the line that minimizes the second moment as the orientation of the object
  – This second moment is:

$$L = \sum_{r=1}^{N_{rows}} \sum_{c=1}^{N_{columns}} d^2(r,c)\, I(r,c)$$

  – where d(r,c) is the distance from pixel (r,c) to the line
• Now we require two parameters to define a line, ρ and θ:

$$x\cos\theta + y\sin\theta - \rho = 0$$

• So (cos θ, sin θ) is the unit normal to the line and ρ is the minimum distance from the line to the origin
• Using this parameterization, the distance from the line to a point (r,c) is:

$$d(r,c) = r\cos\theta + c\sin\theta - \rho$$
Object Orientation
• Now suppose that L* is the minimum value of L, given as:

$$L^* = \min_{\rho,\theta} \sum_{r=1}^{N_{rows}} \sum_{c=1}^{N_{columns}} \left( r\cos\theta + c\sin\theta - \rho \right)^2 I(r,c)$$

• To find this minimum, we can take partial derivatives with respect to ρ and θ and set them to zero
  – First with respect to ρ:

$$\begin{aligned}
\frac{\partial L}{\partial \rho} &= \frac{\partial}{\partial \rho} \sum_{r=1}^{N_{rows}} \sum_{c=1}^{N_{columns}} (r\cos\theta + c\sin\theta - \rho)^2\, I(r,c) \\
&= \sum_{r=1}^{N_{rows}} \sum_{c=1}^{N_{columns}} \frac{\partial}{\partial \rho}\left( r^2\cos^2\theta + c^2\sin^2\theta + \rho^2 + 2rc\sin\theta\cos\theta - 2r\rho\cos\theta - 2c\rho\sin\theta \right) I(r,c) \\
&= -2\cos\theta \sum_{r}\sum_{c} r\, I(r,c) - 2\sin\theta \sum_{r}\sum_{c} c\, I(r,c) + 2\rho \sum_{r}\sum_{c} I(r,c) \\
&= -2\, m_{00} \left( \bar r\cos\theta + \bar c\sin\theta - \rho \right)
\end{aligned}$$
Object Orientation
• Setting this to zero gives the following:

$$\bar r\cos\theta + \bar c\sin\theta - \rho = 0$$

  – since we are assuming that an object will have at least one pixel, so $m_{00} \neq 0$
  – But this is just saying that the line must pass through the centroid
  – We use this to simplify the remaining equations
• Define new coordinates:

$$r' = r - \bar r, \qquad c' = c - \bar c$$

  – Since the line that minimizes L passes through the centroid, it also passes through the point r′ = 0, c′ = 0
  – So we can write:

$$r'\cos\theta + c'\sin\theta = (r - \bar r)\cos\theta + (c - \bar c)\sin\theta = 0$$
Object Orientation
• Now take the partial derivative with respect to θ
  – First some simplifications:

$$\begin{aligned}
L &= \sum_{r=1}^{N_{rows}} \sum_{c=1}^{N_{columns}} \left( r'\cos\theta + c'\sin\theta \right)^2 I(r,c) \\
&= \cos^2\theta \sum_{r}\sum_{c} r'^2\, I(r,c) + 2\sin\theta\cos\theta \sum_{r}\sum_{c} r'c'\, I(r,c) + \sin^2\theta \sum_{r}\sum_{c} c'^2\, I(r,c)
\end{aligned}$$

  – The three sums are exactly the second central moments:

$$\sum_{r}\sum_{c} (r - \bar r)^2\, I(r,c) = C_{20}, \qquad \sum_{r}\sum_{c} (r - \bar r)(c - \bar c)\, I(r,c) = C_{11}, \qquad \sum_{r}\sum_{c} (c - \bar c)^2\, I(r,c) = C_{02}$$

  – so:

$$L = C_{20}\cos^2\theta + 2\,C_{11}\sin\theta\cos\theta + C_{02}\sin^2\theta$$
Object Orientation
• Further simplification (using the double-angle identities):

$$\begin{aligned}
L &= C_{20}\cos^2\theta + 2\,C_{11}\sin\theta\cos\theta + C_{02}\sin^2\theta \\
&= C_{20}\left( \tfrac{1}{2} + \tfrac{1}{2}\cos 2\theta \right) + C_{11}\sin 2\theta + C_{02}\left( \tfrac{1}{2} - \tfrac{1}{2}\cos 2\theta \right) \\
&= \tfrac{1}{2}\left( C_{20} + C_{02} \right) + \tfrac{1}{2}\left( C_{20} - C_{02} \right)\cos 2\theta + C_{11}\sin 2\theta
\end{aligned}$$

• Take the partial derivative with respect to θ:

$$\frac{\partial L}{\partial \theta} = \frac{\partial}{\partial \theta}\left[ \tfrac{1}{2}\left( C_{20} + C_{02} \right) + \tfrac{1}{2}\left( C_{20} - C_{02} \right)\cos 2\theta + C_{11}\sin 2\theta \right] = -\left( C_{20} - C_{02} \right)\sin 2\theta + 2\,C_{11}\cos 2\theta$$
Object Orientation
• Set this to zero:

$$0 = -\left( C_{20} - C_{02} \right)\sin 2\theta + 2\,C_{11}\cos 2\theta \qquad\Longrightarrow\qquad \tan 2\theta = \frac{2\,C_{11}}{C_{20} - C_{02}}$$

• Finally, the orientation is defined by the following:

$$\theta = \frac{1}{2}\tan^{-1}\!\left( \frac{2\,C_{11}}{C_{20} - C_{02}} \right)$$
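Putting the central moments and the orientation formula together, a minimal NumPy sketch (the function name is illustrative; `arctan2` is used as a common quadrant-safe variant of the arctangent above, which also handles $C_{20} = C_{02}$):

```python
import numpy as np

def orientation(I_k):
    """Orientation θ of a component from its second central moments."""
    r, c = np.nonzero(I_k)
    r_bar, c_bar = r.mean(), c.mean()      # centroid (r̄, c̄)
    rp, cp = r - r_bar, c - c_bar          # shifted coordinates r', c'
    C20 = np.sum(rp ** 2)                  # second central moments
    C11 = np.sum(rp * cp)
    C02 = np.sum(cp ** 2)
    # θ = ½ tan⁻¹( 2 C11 / (C20 − C02) )
    return 0.5 * np.arctan2(2 * C11, C20 - C02)
```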
Camera Calibration
• Now that we know the position and orientation of objects in the camera coordinate frame, we want to use these in a robotic manipulation task
  – i.e. we want to convert from camera coordinates to world coordinates
  – We said that we do not have enough information to extract the world coordinates from a 2D image, but if we know the camera's position and orientation relative to the world frame, we can write:

$$x^w = R_c^w\, x^c + o_c^w$$
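A minimal NumPy sketch of applying this rigid transform; the rotation and translation below are placeholders standing in for a real calibration result:

```python
import numpy as np

# Assumed calibration results: rotation R_c^w of the camera frame in the
# world frame, and the camera origin o_c^w expressed in world coordinates.
R_wc = np.eye(3)                   # placeholder rotation
o_wc = np.array([0.5, 0.0, 1.0])   # placeholder translation

x_c = np.array([0.1, -0.2, 2.0])   # a point in camera coordinates
x_w = R_wc @ x_c + o_wc            # the same point in world coordinates
```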
Other Aspects of Vision
• Feature detection: a way of abstracting information from images
• Example: edge detection
  – Detect abrupt changes in an image
  – i.e. use (estimated) image derivatives (with finite differences)
    • Qualitatively equivalent to spatial high-pass filtering
  – Estimate derivatives using finite differences:

$$\frac{\partial h}{\partial x} \approx h_{i+1,j} - h_{i-1,j}$$

    • where $h_{i,j}$ are the pixel values at position (i,j)
    • This is equivalent to a convolution, where the kernel is:

$$\begin{bmatrix} 0 & 0 & 0 \\ -1 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$

    • This will have a strong positive response to a vertical edge that is positive on one side and negative on the other
    • But this will also give a strong response in the presence of noise!
    • How to alleviate that?… First smooth the image with a Gaussian filter
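A minimal SciPy sketch of this pipeline: Gaussian smoothing first, then convolution with the central-difference kernel (the input image and the σ value are placeholders):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, convolve

img = np.random.rand(480, 640)             # placeholder grayscale image

smoothed = gaussian_filter(img, sigma=1.5)  # suppress noise before differentiating
kx = np.array([[0, 0, 0],
               [-1, 0, 1],
               [0, 0, 0]], dtype=float)     # central-difference kernel
dx = convolve(smoothed, kx)                 # estimated ∂h/∂x
```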
Other Aspects of Vision
• The Hough transform
  – Once we have an image that has its features extracted, we may want to know information about the features
    • In the case that the features are edges, maybe we want the positions and angles of the edges
  – Hough transform, simplest form: represent a line by a parameterization (for example ρ and θ, as we have already done), then transform each pixel in the image to the (ρ,θ) space
    • i.e. any line can be represented as a point in (ρ,θ)
  – There can be an infinite number of lines through any point
  – For example, all the lines that go through a point $(x_0, y_0)$ must obey the following:

$$\rho(\theta) = x_0\cos\theta + y_0\sin\theta$$

  – This gives sinusoids in the Hough space for every point in the image
  – The points where two sinusoids intersect represent two points that can lie on the same line
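A minimal NumPy sketch of the voting scheme: every edge pixel traces its sinusoid through a discretized (ρ, θ) accumulator. The bin counts and function name are illustrative:

```python
import numpy as np

def hough_lines(edges, n_theta=180, n_rho=200):
    """Accumulate (rho, theta) votes for every edge pixel."""
    rows, cols = edges.shape
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    rho_max = np.hypot(rows, cols)          # largest possible |rho|
    acc = np.zeros((n_rho, n_theta), dtype=int)
    ys, xs = np.nonzero(edges)
    for x, y in zip(xs, ys):
        rho = x * np.cos(thetas) + y * np.sin(thetas)  # sinusoid for (x0, y0)
        bins = ((rho + rho_max) / (2 * rho_max) * (n_rho - 1)).astype(int)
        acc[bins, np.arange(n_theta)] += 1             # one vote per theta
    return acc, thetas
```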
Other Aspects of Vision
• The Hough transform
  – Thus the Hough transform breaks down into finding the points of highest intersection in the Hough space
  – But how to do that?… it may be unclear how many to select
Other Aspects of Vision
• Optical flow
  – A measure of the movement of features in a visual field
  – e.g. elementary motion detector
  – Creates a vector field that locally describes motion
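For completeness, a dense optical-flow sketch using OpenCV (one of the open-source frameworks mentioned earlier); the frame filenames are placeholders and the numeric parameters are the common tutorial defaults:

```python
import cv2

# Two consecutive grayscale frames (placeholder filenames).
prev_gray = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)
next_gray = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

# Dense flow: one (dx, dy) vector per pixel, locally describing motion.
flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)
```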
Thank you!