COMP 4180: Intelligent Mobile Robotics Computer Vision

COMP 4180: Intelligent Mobile Robotics

Computer Vision

Jacky Baltes
Autonomous Agents Lab
University of Manitoba
Winnipeg, Canada R3T 2N2

Email: [email protected]
http://www.cs.umanitoba.ca/~jacky
http://aalab.cs.umanitoba.ca

11-9-26

Computer Vision

● Introduction
● Digital images
● Edge detection
  ● Convolution
  ● Sobel Edge Detector
● Thresholding
● Morphological Operators
  ● Erosion and Dilation
  ● Open and Close
● Hough transform
  ● Detect lines
  ● General Hough transform

Digital Images

● Computer graphics: model -> image
● Computer vision: image -> model
● Two-dimensional array of picture elements (pixels)
● Captured through a sensor
  ● camera, scanner, mobile phone
  ● structured lighting

Levels of Abstraction

● Lowest level:
  ● Pre-processing, noise removal, edge detection
● Middle level:
  ● Segmentation into regions
● High level:
  ● Object recognition, motion analysis
  ● World model

Optical Illusions

Image Format

Most common format is RGB format

24 bit 8-8-8: 0..255 for each channel

16-bit 5-6-5: more emphasis on green

255,107,217,214,105,215,212,104,212,214,107,215
218,107,215,218,113,218,224,115,219,230,114,216
230,...
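
To make the two layouts concrete, here is a minimal sketch of converting between 24-bit 8-8-8 and 16-bit 5-6-5 pixels. The function names and the bit order (red in the high bits) are assumptions for illustration; real devices may order the channels differently.

```python
def rgb888_to_rgb565(r, g, b):
    """Pack 8-bit R, G, B into 16 bits: 5 bits red, 6 bits green, 5 bits blue."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def rgb565_to_rgb888(pixel):
    """Unpack a 5-6-5 pixel back to three 8-bit channels (lossy)."""
    r5 = (pixel >> 11) & 0x1F
    g6 = (pixel >> 5) & 0x3F
    b5 = pixel & 0x1F
    # Copy the high bits into the low bits so that full scale maps to 255.
    return ((r5 << 3) | (r5 >> 2), (g6 << 2) | (g6 >> 4), (b5 << 3) | (b5 >> 2))


packed = rgb888_to_rgb565(218, 107, 215)
print(hex(packed), rgb565_to_rgb888(packed))
```

Green keeps six bits and therefore loses less precision than red and blue, which is the "more emphasis on green" noted above.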

Human Eye Characteristics

● An RGB image can be broken down into three greyscale images (decompose into channels)

Red

Green

Blue


● Blur Green Channel and Recompose

Red

Green

Blue


● Blur Blue Channel and Recompose

Red

Green

Blue

Which Square is Brighter?

Angle of the Lines

What is wrong with this picture?

● Apart from being flipped upside down

Edge Detection

● Colours are very susceptible to lighting
● An important pre-processing step is edge detection
● How do we find an edge?
  ● Sharp contrast in the image
  ● Derivative of the image function I(i,j)
  ● Approximate derivative with I(i+1,j) - I(i-1,j)
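
The approximation above can be sketched for a single image row as follows. The function name and the choice to leave the two border pixels at 0 are assumptions for illustration.

```python
def horizontal_derivative(row):
    """Central difference I(i+1) - I(i-1) along one row of grey values.

    The first and last pixel have only one neighbour, so they are left
    at 0 here (an assumption, other border policies are possible).
    """
    out = [0] * len(row)
    for i in range(1, len(row) - 1):
        out[i] = row[i + 1] - row[i - 1]
    return out


row = [10, 10, 10, 200, 200, 200]   # dark region followed by a bright region
print(horizontal_derivative(row))   # large values mark the edge
```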

Convolution

● Edge Detection and many other image pre-processing steps can be implemented as a convolution

● A convolution mask is a matrix that is applied to each pixel in the image

● Specifies weights of the neighbors

Derivative:

 0  0  0
-1  0 +1
 0  0  0

Convolution

● What is the output for the row [ 0, 255, 0, 0, ... ]?
  ● -255?
● Use divisor and offset to normalize the result of the convolution to 0 .. 255
● How to deal with colour images?
  ● Convert to grey scale
  ● Handle each channel separately
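
A minimal sketch of applying a 3x3 mask with a divisor and an offset is shown below. The function name, the border policy (border pixels are copied unchanged), and the divisor of 2 used for the derivative mask are assumptions; the slides only give a divisor and offset for the Sobel mask.

```python
def convolve3x3(image, mask, divisor=1, offset=0):
    """Apply a 3x3 mask to a grey-scale image (a list of rows of ints).

    The mask is applied as written on the slides (no flipping).
    Each result is sum // divisor + offset, clamped to 0..255.
    Border pixels are copied unchanged (an assumption).
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    acc += mask[dy + 1][dx + 1] * image[y + dy][x + dx]
            out[y][x] = max(0, min(255, acc // divisor + offset))
    return out


DERIVATIVE = [[0, 0, 0], [-1, 0, 1], [0, 0, 0]]
image = [[0, 0, 255, 0, 0] for _ in range(3)]
# Raw responses range from -255 to +255; divisor 2 and offset 128 map them to 0..255.
print(convolve3x3(image, DERIVATIVE, divisor=2, offset=128)[1])
```

With this normalization a value of 128 means "no change", values above 128 mean intensity rising to the right, and values below 128 mean intensity falling.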

Sobel Edge Detection

● To reduce noise, average over several rows
  ● Weigh rows differently
  ● Divisor = 8, Offset = 128

-1  0 +1
-2  0 +2
-1  0 +1

Sobel Edge Detection

● Use separate convolution matrices for horizontal and vertical direction

Horizontal:

-1  0 +1
-2  0 +2
-1  0 +1

Vertical:

-1 -2 -1
 0  0  0
+1 +2 +1
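
The two masks can be combined into a single edge strength per pixel. The sketch below uses the gradient magnitude sqrt(gx^2 + gy^2), which is a common choice but an assumption here, since the slides do not say how the two responses are merged. The function names are hypothetical, and the divisor of 8 is borrowed from the previous slide.

```python
import math

SOBEL_HORIZONTAL = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_VERTICAL = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply_mask(image, mask, y, x):
    """Weighted sum of the 3x3 neighbourhood around (y, x)."""
    return sum(mask[dy + 1][dx + 1] * image[y + dy][x + dx]
               for dy in (-1, 0, 1) for dx in (-1, 0, 1))


def sobel_magnitude(image, divisor=8):
    """Edge strength per pixel; border pixels are set to 0 (an assumption)."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = apply_mask(image, SOBEL_HORIZONTAL, y, x)
            gy = apply_mask(image, SOBEL_VERTICAL, y, x)
            out[y][x] = min(255, int(math.hypot(gx, gy)) // divisor)
    return out


step = [[0, 0, 255, 255] for _ in range(4)]   # dark left half, bright right half
print(sobel_magnitude(step)[1])
```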

Convolution

● Many other filters can be implemented efficiently as convolution

● One problem: what to do at the borders
● What does the following filter do?

+1 +1 +1
+1 +1 +1
+1 +1 +1

Blurring (Simple)

● Blurring is used to reduce noise in the image

+1 +1 +1
+1 +1 +1
+1 +1 +1
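
A minimal sketch of this blur follows. The slide does not state a divisor; dividing by 9 (the sum of the weights) is assumed here so that the overall brightness is preserved. The function name and the border handling are also assumptions.

```python
def box_blur(image):
    """Replace each interior pixel by the average of its 3x3 neighbourhood.

    Border pixels are copied unchanged (an assumption).
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            total = sum(image[y + dy][x + dx]
                        for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = total // 9   # assumed divisor: sum of the mask weights
    return out


noisy = [[100, 100, 100],
         [100, 190, 100],   # a single noisy pixel
         [100, 100, 100]]
print(box_blur(noisy)[1][1])   # the outlier is pulled towards its neighbours
```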

Template Matching Convolution

● Find Specific features in the image

 0 +1  0
+1 +1 +1
 0  0  0

Divisor = 4

Segmentation

[Figure: image analysis pipeline. Stages in order: image; pre-processing and image enhancement; segmentation ("What are the objects to be analyzed?"); binary operations; morphological operations and feature extraction; classification and matching; data]

Segmentation

● Full segmentation: Individual objects are separated from the background and given individual ID numbers (labels).

● Partial segmentation: The amount of data is reduced (usually by separating objects from background) to speed up the further processing.

● Segmentation is often the most difficult problem to solve in the process; there is no universal solution!

● The problem can be made much easier if it is solved in cooperation with the designer of the imaging system (choice of sensors, illumination, background, etc.).

Three Types of Segmentation

● Classification – Based on some similarity measure between pixel values. The simplest form is thresholding.

● Edge-based – Search for edges in the image. They are then used as borders between regions

● Region-based – Region growing, merge & split

Common idea: search for discontinuities and/or similarities in the image

Thresholding (Global and Local)

● Global: based on some kind of histogram: grey-level, edge, feature etc.
  ● Lighting conditions are extremely important, and it will only work under very controlled circumstances.

● Fixed thresholds: the same value is used in the whole image

● Local (or dynamic thresholding): depends on the position in the image. The image is divided into overlapping sections which are thresholded one by one.

Classical Automatic Thresholding Algorithm

1. Select an initial estimate for T
2. Segment the image using T. This produces 2 groups: G1, pixels with value > T, and G2, pixels with value ≤ T
3. Compute µ1 and µ2, the average pixel values of G1 and G2
4. New threshold: T = 1/2 (µ1 + µ2)
5. Repeat steps 2 to 4 until T stabilizes.

Very easy + very fast
Assumptions: normal dist. + low noise
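
The five steps can be sketched as below. The function name, the use of the mean grey value as the initial estimate, and the stopping tolerance `eps` are assumptions for illustration. Pixels equal to T are placed in G2 so that every pixel belongs to one group.

```python
def iterative_threshold(pixels, initial=None, eps=0.5):
    """Iterative threshold selection on a flat list of grey values."""
    # Step 1: initial estimate (assumed default: the mean grey value).
    t = initial if initial is not None else sum(pixels) / len(pixels)
    while True:
        # Step 2: split the pixels into two groups using T.
        g1 = [p for p in pixels if p > t]
        g2 = [p for p in pixels if p <= t]
        if not g1 or not g2:
            return t            # one group is empty, nothing left to balance
        # Step 3: mean of each group.
        mu1 = sum(g1) / len(g1)
        mu2 = sum(g2) / len(g2)
        # Step 4: new threshold halfway between the two means.
        new_t = 0.5 * (mu1 + mu2)
        # Step 5: stop once T no longer changes noticeably.
        if abs(new_t - t) < eps:
            return new_t
        t = new_t


pixels = [10, 12, 14, 200, 205, 210]   # dark background, bright object
print(iterative_threshold(pixels))
```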

Optimal Thresholding

● Based on the shape of the current image histogram. Search for valleys, Gaussian distributions etc.

[Figure: histograms of the background, the foreground, and both combined (the real histogram), with the candidate optimal threshold marked]

Histograms

To love… …and to hate

Thresholding and illumination

● Solutions:
  ● Calibration of the imaging system
  ● Percentile filter with very large mask
  ● Morphological operators

MR non-uniformity

[Figure panels: median filtering, thresholding]

More thresholding

● Can also be used on other kinds of histogram: grey-level, edge, feature etc.

Multivariate data (⇒ see next lectures)

● Problems:
  ● Only considers the grey-level pixel value, so it can leave “holes” in segmented objects.
    – Solution: post-processing with morphological operators
  ● Requires strong assumptions to be efficient
  ● Local thresholding is better ⇒ see region growing techniques

Edge-based Segmentation

Based on finding discontinuities (local variations of image intensity)

1. Apply an edge detector, e.g. a gradient operator (Sobel) or a second derivative (Laplace)
2. Threshold the edge image to get a binary image
3. Depending on the type of edge detector:
   Link edges together to close shapes (using edge direction, for example)
   Remove spurious edges

Gradient based procedure

Sobel

Sobel

Zero-crossing based procedure

LoG

Laplacian of Gaussian

Edge-based Segmentation: examples

Prewitt: needs edge linking
Canny: needs “cleaning”

Region based segmentation

● Work by extending some region based on local similarities between pixels
  ● region growing (bottom-up method)
  ● region splitting and merging (top-down method)

Bottom-up: from data to representation
Top-down: from model to data
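
As an illustration of the bottom-up variant, here is a minimal region growing sketch. The similarity test (grey value within `tolerance` of the seed value), the 4-connected neighbourhood, and the function name are assumptions; other similarity measures fit the same structure.

```python
from collections import deque


def region_grow(image, seed, tolerance):
    """Grow a region from seed = (row, col).

    A 4-connected neighbour joins the region when its grey value differs
    from the seed value by at most `tolerance`. Returns a set of (row, col).
    """
    h, w = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and (ny, nx) not in region:
                if abs(image[ny][nx] - seed_value) <= tolerance:
                    region.add((ny, nx))
                    queue.append((ny, nx))
    return region


image = [[10, 12, 200],
         [11, 13, 210],
         [205, 198, 202]]
print(sorted(region_grow(image, (0, 0), tolerance=10)))
```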

The Hough transform

● A method for finding global relationships between pixels.
  Example: We want to find straight lines in an image
  ● 1. Apply an edge enhancing filter (e.g. Laplace)
  ● 2. Set a threshold for what filter response is considered a true "edge pixel"
  ● 3. Extract the pixels that are on a straight line using the Hough transform

[Figure panels: original image, edge enhanced image, thresholded edge image]

Finding Lines

● Edge detection is the first step in image processing

● Finding lines or line segments is often the next step

● Local methods
  ● Chaining
  ● Morphological operators
● Global methods
  ● Hough transform

Input Image of Taipei

Output of Edge Detection

Thresholding

Edge Growing

Edge Growing

● Adding edge pixels may
  ● Combine lines that are not the same
    – Red and green examples
  ● Leave lines broken up into non-continuous segments
    – Blue examples

● Combine only edge pixels with the same gradient direction

● Fill holes in the edge pixels

Morphological Operators

● Refers to shape and structure of an object
● In computer vision, we use a structural element
● Common elements

Dilation

● Increase the size of an object by
  ● Adding the structural element at every position of the image where the origin of the structural element lands on an object pixel
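
A minimal sketch of dilation on a binary image (rows of 0/1 values) follows. The structural element is given as a list of (row, column) offsets from its origin; the cross-shaped element and the function name are assumptions for illustration.

```python
# Cross-shaped structural element, origin at (0, 0).
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def dilate(image, element=CROSS):
    """Stamp the structural element onto the output at every object pixel."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if image[y][x]:
                for dy, dx in element:
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:   # ignore parts outside the image
                        out[ny][nx] = 1
    return out


dot = [[0, 0, 0],
       [0, 1, 0],
       [0, 0, 0]]
for row in dilate(dot):
    print(row)   # the single pixel grows into the shape of the element
```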

Dilation

● Output

Erosion

● Keep a pixel only where the whole structural element, placed with its origin on that pixel, fits inside the object in the input image
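
A minimal sketch of erosion on a binary image (rows of 0/1 values) follows. The 3x3 square element, the function name, and the rule that positions outside the image count as background are assumptions for illustration.

```python
# 3x3 square structural element, origin at the centre.
SQUARE = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def erode(image, element=SQUARE):
    """Set an output pixel only where every element position covers an object pixel."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            fits = all(0 <= y + dy < h and 0 <= x + dx < w and image[y + dy][x + dx]
                       for dy, dx in element)
            out[y][x] = 1 if fits else 0
    return out


block = [[0, 0, 0, 0, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 0, 0, 0, 0]]
for row in erode(block):
    print(row)   # only the centre of the 3x3 block survives
```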

Opening and Closing

● Opening is erosion followed by dilation
  ● Erosion (remove small artefacts)
  ● Dilation (grow to original size)
  ● Can be used to remove artefacts that are smaller than a minimum size
● Closing is dilation followed by erosion
  ● Dilation (fill in gaps)
  ● Erosion (reduce size)
  ● Fill in holes of a maximum size
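
Both operators can be sketched as compositions of dilation and erosion. The helper names, the 3x3 square element, and treating positions outside the image as background are assumptions for illustration.

```python
SQUARE = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]   # origin at the centre


def dilate(image, element=SQUARE):
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if image[y][x]:
                for dy, dx in element:
                    if 0 <= y + dy < h and 0 <= x + dx < w:
                        out[y + dy][x + dx] = 1
    return out


def erode(image, element=SQUARE):
    h, w = len(image), len(image[0])
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w and image[y + dy][x + dx]
                     for dy, dx in element))
             for x in range(w)] for y in range(h)]


def opening(image):
    """Erosion then dilation: removes artefacts smaller than the element."""
    return dilate(erode(image))


def closing(image):
    """Dilation then erosion: fills holes and gaps smaller than the element."""
    return erode(dilate(image))


# A 3x3 object with a one-pixel hole in the middle of a 7x7 image.
ring = [[1 if 2 <= y <= 4 and 2 <= x <= 4 else 0 for x in range(7)] for y in range(7)]
ring[3][3] = 0
print(closing(ring)[3])   # the hole at column 3 is filled
```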

Border Detection

● By using a structural element with all neighbours and eroding the image, you can remove all boundary points of an object

● Taking the difference between the original image and the eroded image yields all boundary pixels
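
A minimal sketch of this idea on a binary image follows. The function names and the rule that positions outside the image count as background are assumptions for illustration.

```python
SQUARE = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]   # all 8 neighbours plus the origin


def erode(image, element=SQUARE):
    h, w = len(image), len(image[0])
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w and image[y + dy][x + dx]
                     for dy, dx in element))
             for x in range(w)] for y in range(h)]


def boundary(image):
    """Original minus eroded image: object pixels that erosion removed."""
    eroded = erode(image)
    return [[image[y][x] - eroded[y][x] for x in range(len(image[0]))]
            for y in range(len(image))]


block = [[0, 0, 0, 0, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 0, 0, 0, 0]]
for row in boundary(block):
    print(row)   # a ring of boundary pixels around an empty centre
```

The subtraction never goes negative because the element contains its own origin, so the eroded image is always a subset of the original.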

Example: Close

Example: Open

Border Detection

The Hough transform

Finding straight lines:

1. consider a pixel in position (x_i, y_i)
2. equation of a straight line: y_i = a x_i + b
3. set b = -a x_i + y_i and draw this (single) line in "ab-space"
4. consider the next pixel with position (x_j, y_j) and draw the line b = -a x_j + y_j in "ab-space" (also called parameter space). The point (a', b') where the two lines intersect represents the line y = a'x + b' in "xy-space", which goes through both (x_i, y_i) and (x_j, y_j).
5. draw the line in ab-space corresponding to each pixel in xy-space.
6. divide ab-space into accumulator cells and find the most common (a', b'), which gives the line connecting the largest number of pixels
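
The voting procedure can be sketched as follows. The function name, the restriction to a small set of integer candidate slopes, and the use of exact integer (a, b) pairs as accumulator cells are simplifying assumptions for illustration.

```python
from collections import Counter


def hough_lines_ab(points, slopes):
    """Vote in ab-space: each pixel (x, y) votes for every cell on b = -a*x + y."""
    votes = Counter()
    for x, y in points:
        for a in slopes:
            b = y - a * x          # step 3: b = -a*x + y
            votes[(a, b)] += 1     # one vote for accumulator cell (a, b)
    return votes


# Four edge pixels on y = 2x + 1 and one outlier.
points = [(0, 1), (1, 3), (2, 5), (3, 7), (1, 0)]
votes = hough_lines_ab(points, slopes=range(-3, 4))
(a, b), count = votes.most_common(1)[0]
print(a, b, count)   # the cell with the most votes gives the dominant line
```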

The Hough transform

[Figure: xy-space (axes x, y) next to ab- or parameter space (axes a, b)]

The Hough transform

● In reality we have a problem with y=ax+b because a reaches infinity for vertical lines. Use a bounded parameterization instead, typically the normal form $\rho = x\cos\theta + y\sin\theta$.
● It is common to use "filters" for finding the intersection: "butterfly filters"
● Different variations of the Hough transform can also be used for finding other shapes of the form g(v,c)=0, where v is a vector of coordinates and c is a vector of coefficients.
● Possible to find any kind of simple shape, e.g. a circle (3D parameter space): $(x - c_1)^2 + (y - c_2)^2 = c_3^2$
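
Because the slope a is unbounded for vertical lines, implementations commonly vote in a bounded parameter space instead. The sketch below assumes the normal form rho = x cos(theta) + y sin(theta); the function name, the number of angle steps, and rounding rho to whole pixels are assumptions for illustration.

```python
import math
from collections import Counter


def hough_lines_polar(points, theta_steps=180):
    """Vote in (theta, rho) space; theta index k stands for the angle pi * k / theta_steps."""
    votes = Counter()
    for x, y in points:
        for k in range(theta_steps):
            theta = math.pi * k / theta_steps
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(k, rho)] += 1
    return votes


# Four edge pixels on the vertical line x = 3, which y = ax + b cannot represent.
points = [(3, 0), (3, 1), (3, 2), (3, 5)]
votes = hough_lines_polar(points)
print(votes[(0, 3)])   # votes for theta = 0, rho = 3, i.e. the line x = 3
```

With cells this coarse, neighbouring angles can collect the same number of votes, so a practical detector would look for local maxima rather than a single best cell.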

The Hough transform

Conclusions

► The segmentation procedure
  ● Pre-processing
  ● Segmentation
  ● Post-processing
  ● ⇒ Like any IP procedure
► There exists NO universal segmentation method
► Evaluation of segmentation performance is important

References

● Les Kitchen (lecture slides)
  ● http://www.cs.mu.oz.au/480/lec_intro_part.pdf
● Lucia Ballerini (lecture slides)
  ● http://www.cb.uu.se/~lucia/
● Andrew Moore (lecture slides, CMU)
  ● http://www2.cs.cmu.edu/~awm/tutorials/constraint05.pdf
