

Computer Vision

Filename: eie426-computer-vision-0809.ppt


Contents

- Perception generally
- Image formation
- Color vision
- Edge detection
- Image segmentation
- Visual attention
- 2D → 3D
- Object recognition

Perception generally

Stimulus (percept) S, World W

S = g(W)

E.g., g = “graphics."

Can we do vision as inverse graphics?

W = g⁻¹(S)

Problem: massive ambiguity!


Missing depth information!

Better approaches

Bayesian inference of world configurations:

P(W|S) = P(S|W) P(W) / P(S) = α P(S|W) P(W)

where P(S|W) is the "graphics" model and P(W) is prior knowledge.
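As a toy numerical illustration (all values made up), the posterior over a small discrete set of candidate world configurations follows directly from the rule above; two configurations that render to the same image are disambiguated by the prior:

# Toy posterior P(W|S) = alpha * P(S|W) * P(W) over hypothetical world configurations.
likelihood = {"cube": 0.7, "flat_drawing": 0.7, "noise": 0.01}  # P(S|W): the "graphics" model
prior      = {"cube": 0.6, "flat_drawing": 0.3, "noise": 0.1}   # P(W): prior knowledge

unnormalised = {w: likelihood[w] * prior[w] for w in prior}
alpha = 1.0 / sum(unnormalised.values())                     # alpha = 1 / P(S)
posterior = {w: alpha * v for w, v in unnormalised.items()}
print(posterior)  # the cube wins, even though "cube" and "flat_drawing" explain the image equally well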

Better still: no need to recover exact scene!

Just extract the information needed for navigation, manipulation, and recognition/identification.


Vision “subsystems”


Vision requires combining multiple cues

Image formation

P is a point in the scene, with coordinates (X, Y, Z).

P’ is its image on the image plane, with coordinates (x, y, z).

x = -fX/Z, y = -fY/Z (by similar triangles)

Scale/distance is indeterminate!
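A minimal sketch of this projection (arbitrary illustrative values); the second call shows the scale/distance ambiguity: doubling an object's size and distance leaves its image unchanged.

# Pinhole projection x = -f*X/Z, y = -f*Y/Z (illustrative values only).
def project(X, Y, Z, f):
    """Image-plane coordinates of the scene point (X, Y, Z)."""
    return -f * X / Z, -f * Y / Z

print(project(1.0, 2.0, 10.0, f=0.05))   # (-0.005, -0.01)
print(project(2.0, 4.0, 20.0, f=0.05))   # the same image point: scale/distance is indeterminate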


Lens systems

1/S1 + 1/S2 = 1/f (the thin-lens equation)

S1: the distance from the lens to the object; S2: the distance from the lens to the image;
f : the focal length of the lens
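Assuming the reconstruction above is the standard thin-lens equation, a direct transcription (illustrative values, in metres):

# Thin lens: 1/S1 + 1/S2 = 1/f, solved for the image distance S2.
def image_distance(S1, f):
    """Distance behind the lens at which an object at distance S1 is in focus."""
    return 1.0 / (1.0 / f - 1.0 / S1)

print(image_distance(S1=2.0, f=0.05))     # ~0.0513: slightly beyond the focal length
print(image_distance(S1=1000.0, f=0.05))  # ~0.0500: distant objects focus at roughly f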

Images


Images (cont.)

I(x, y, t) is the intensity at (x, y) at time t.

A CCD camera has about 4,000,000 pixels; the human eyes have about 240,000,000 receptors.


Color vision

Intensity varies with frequency → an infinite-dimensional signal.

The human eye has three types of color-sensitive cells; each integrates the signal → a 3-element intensity vector.


Color vision (cont.)


s(i, j) = [ r(i, j), g(i, j), b(i, j) ], a 3-element vector at each pixel (i, j)
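A sketch of how three sensor types reduce the (in principle infinite-dimensional) spectral signal to a 3-element vector per pixel; the Gaussian sensitivity curves below are made up, not real cone responses.

# Three made-up sensitivity curves integrate one incoming spectrum into a 3-vector.
import numpy as np

wavelengths = np.linspace(400, 700, 301)                  # nm
def sensitivity(peak_nm, width_nm=40.0):
    return np.exp(-0.5 * ((wavelengths - peak_nm) / width_nm) ** 2)

S = np.stack([sensitivity(600), sensitivity(550), sensitivity(450)])  # "r", "g", "b" sensors
spectrum = np.exp(-0.5 * ((wavelengths - 580) / 30.0) ** 2)           # light arriving at pixel (i, j)

s_ij = S @ spectrum   # the 3-element vector [r(i,j), g(i,j), b(i,j)] (up to the wavelength step)
print(s_ij)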

Edge detection

Edges are straight lines or curves in the image plane across which there are "significant" changes in image brightness.

The goal of edge detection is to abstract away from the messy, multi-megabyte image towards a more compact, abstract representation.


Edge detection (cont.)

Edges in the image correspond to discontinuities in the scene:

1) depth discontinuities

2) surface orientation discontinuities

3) reflectance (surface marking) discontinuities

4) illumination discontinuities (shadows, etc.)



Edge detection (cont.)

Sobel operator

Ly mask:             Lx mask:
[  1   2   1 ]       [ -1   0   1 ]
[  0   0   0 ]  and  [ -2   0   2 ]
[ -1  -2  -1 ]       [ -1   0   1 ]

Amplitude:   L = sqrt(Lx^2 + Ly^2)

Orientation: θ = arctan(Ly / Lx)

Other operators: Roberts (2x2), Prewitt (3x3), Isotropic (3x3)

The center of each mask is placed at the origin (the image pixel to be processed).
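A minimal Sobel edge-detection sketch (Python/NumPy/SciPy), assuming a grayscale float image; it follows the masks and the amplitude/orientation formulas above.

# Sobel gradient amplitude and orientation for a grayscale image `img`.
import numpy as np
from scipy.signal import convolve2d

def sobel_edges(img):
    kx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)    # Lx mask
    ky = np.array([[ 1,  2,  1],
                   [ 0,  0,  0],
                   [-1, -2, -1]], dtype=float)  # Ly mask
    # convolve2d mirrors the masks (a sign flip only); the amplitude is unaffected.
    lx = convolve2d(img, kx, mode='same', boundary='symm')
    ly = convolve2d(img, ky, mode='same', boundary='symm')
    amplitude = np.hypot(lx, ly)        # sqrt(Lx^2 + Ly^2)
    orientation = np.arctan2(ly, lx)    # arctan(Ly / Lx), full-range version
    return amplitude, orientation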

Edge detection (cont.)


A color picture of a steam engine. The Sobel operator applied to that image.

Edge detection: application 1


An edge-extraction-based method to produce pen-and-ink style drawings from photos.


Edge detection: application 2

Leaf (vein pattern) characterization

Image segmentation

In computer vision, segmentation refers to the process of partitioning a digital image into multiple segments (sets of pixels).

The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze.

Image segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images. More precisely, image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain visual characteristics.


Image segmentation (cont.)


Image segmentation: the quadtree-partition-based split-and-merge algorithm

(1) Split into four disjoint quadrants any region Ri for which P(Ri) = FALSE.

(2) Merge any adjacent regions Ri and Rk for which P(Ri ∪ Rk) = TRUE.

(3) Stop when no further merging or splitting is possible.

P(Ri) = TRUE if all pixels in Ri have the same intensity or are uniform in some measure.
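A minimal sketch of the split phase, assuming a square 2^n × 2^n grayscale NumPy array and an intensity-range uniformity predicate (both assumptions, not from the slides); the merge step is only indicated in a comment.

# Quadtree split phase of split-and-merge segmentation.
import numpy as np

def predicate(region, threshold=10):
    """P(R): TRUE if the region is uniform (intensity range within threshold)."""
    return region.max() - region.min() <= threshold

def split(img, labels, x, y, size, next_label, threshold=10):
    """Recursively split the square region at (x, y) while P(R) is FALSE."""
    region = img[y:y + size, x:x + size]
    if size == 1 or predicate(region, threshold):
        labels[y:y + size, x:x + size] = next_label[0]
        next_label[0] += 1
        return
    half = size // 2
    for dy in (0, half):
        for dx in (0, half):
            split(img, labels, x + dx, y + dy, half, next_label, threshold)

def quadtree_split(img, threshold=10):
    labels = np.zeros(img.shape, dtype=int)
    split(img, labels, 0, 0, img.shape[0], [0], threshold)
    # Merge step (not implemented here): union adjacent regions Ri, Rk whenever
    # P(Ri ∪ Rk) is TRUE, and repeat until no further merge is possible.
    return labels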


Image segmentation: the quadtree partition based split-and-merge algorithm (cont.)


Visual attention

Attention is the cognitive process of selectively concentrating on one aspect of the environment while ignoring other things.

The attention mechanism of the human visual system has been applied in machine vision systems to sample data nonuniformly and to use computational resources efficiently.


Visual attention (cont.)

The visual attention mechanism may have at least the following basic components:

(1) the selection of a region of interest in the visual field (see the sketch below);
(2) the selection of feature dimensions and values of interest;
(3) the control of information flow through the network of neurons that constitutes the visual system; and
(4) the shifting from one selected region to the next in time.
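As an illustration of component (1) only, a simple local-contrast saliency map can select a region of interest; this is one common approximation, not the mechanism described in these slides, and the window size is arbitrary.

# Pick the most salient window of a grayscale image using local contrast.
import numpy as np
from scipy.ndimage import uniform_filter

def most_salient_region(img, window=32):
    """Return the (x, y) centre of the window with the highest local contrast."""
    local_mean = uniform_filter(img.astype(float), size=window)
    saliency = np.abs(img - local_mean)              # centre-surround difference
    score = uniform_filter(saliency, size=window)    # aggregate saliency per window
    y, x = np.unravel_index(np.argmax(score), score.shape)
    return x, y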


Attention-driven object extraction


The more attention an object/region attracts, the higher priority it has.

Attention-driven object extraction (cont.)


Objects 1, 2, …, background

Motion


• The rate of apparent motion can tell us something about distance: a nearer object has a larger apparent motion.

• Object tracking

Motion Estimation


Stereo


The nearest point of the pyramid is shifted to the left in the right image and to the right in the left image.

Disparity (the x difference between the two images) → depth

Disparity and depth


For a scene point P = (X, Y, Z) viewed by a left camera and a right camera with focal length f, separated by a baseline b, the projections onto the image plane are xL and xR:

xL / f = X / Z, so xL = f X / Z

xR / f = (X - b) / Z, so xR = f (X - b) / Z

Disparity: d = xL - xR = b f / Z

Depth: Z = b f / d

Disparity and depth (cont.)

Depth is inversely proportional to disparity.
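A direct transcription of Z = bf/d; the baseline and focal length below are hypothetical, and only the disparities are taken from the example on the following slides.

# Depth from stereo disparity: Z = b * f / d.
def depth_from_disparity(disparity_px, baseline_m, focal_length_px):
    """Depth in metres, given disparity and focal length in pixels and baseline in metres."""
    return baseline_m * focal_length_px / disparity_px

# Hypothetical b = 0.1 m and f = 700 px; the larger disparity gives the smaller depth.
print(depth_from_disparity(73, 0.1, 700))  # nearer object
print(depth_from_disparity(60, 0.1, 700))  # farther object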

Example: Electronic eyes for the blind


Example: Electronic eyes for the blind (cont.)


Setup: a nearer object and a farther object are viewed by a left camera and a right camera; pixels in the left and right captured images are matched to calculate the disparities.

Example: Electronic eyes for the blind (cont.)


Left: x = 549; right: x = 476; ∆ = 73.

Left: x = 333; right: x = 273; ∆ = 60.

(The larger disparity corresponds to the nearer object.)

Texture

Texture: a spatially repeating pattern on a surface that can be sensed visually.

Examples: the pattern of windows on a building, the stitches on a sweater, the spots on a leopard's skin, the grass on a lawn, etc.


Edge and vertex types


• “+” and “-” labels represent convex and concave edges, respectively. These are associated with surface normal discontinuities wherein both surfaces that meet along the edge are visible.

• A single arrow ("→" or "←") represents an occluding convex edge. As one moves in the direction of the arrow, the (visible) surface is to the right.

• A twin arrow ("↠" or "↞") represents a limb. Here, the surface curves smoothly around to occlude itself. As one moves in the direction of the twin arrow, the (visible) surface lies to the right.

Object recognition

Simple idea:

- extract 3-D shapes from image

- match against “shape library”

Problems:

- extracting curved surfaces from image

- representing shape of extracted object

- representing shape and variability of library object classes

- improper segmentation, occlusion

- unknown illumination, shadows, markings, noise, complexity, etc.

Approaches:

- index into library by measuring invariant properties of objects

- alignment of image feature with projected library object feature

- match image against multiple stored views (aspects) of library object

- machine learning methods based on image statistics
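As an illustration of the approach of matching an image against multiple stored views of library objects, a minimal nearest-neighbour matcher over hypothetical feature vectors (the feature extraction itself is assumed).

# Match a query feature vector against a library of stored object views.
import numpy as np

library = {                                       # hypothetical features per (object, view)
    ("mug", "view_0"):  np.array([0.9, 0.1, 0.3]),
    ("mug", "view_1"):  np.array([0.8, 0.2, 0.4]),
    ("book", "view_0"): np.array([0.1, 0.9, 0.7]),
}

def recognise(query):
    """Return the (object, view) whose stored features are nearest to the query."""
    return min(library, key=lambda k: np.linalg.norm(library[k] - query))

print(recognise(np.array([0.88, 0.12, 0.32])))    # -> ('mug', 'view_0')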


Biometric identification

Criminal investigations and access control for restricted facilities require the ability to identify unique individuals.


Biometric systems for person authentication/verification: signature, face, fingerprint, retinal image, iris image, voice signal, hand veins (the bluish area).

Content-based image retrieval

The application of computer vision to the image retrieval problem, that is, the problem of searching for digital images in large databases.

“Content-based” means that the search will analyze the actual contents of the image. The term ‘content’ in this context might refer to colors, shapes, textures, or any other information that can be derived from the image itself. Without the ability to examine image content, searches must rely on metadata such as captions or keywords, which may be laborious or expensive to produce.
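A minimal sketch of retrieval where "content" is colour only: each image is summarised by a normalised colour histogram and the database is ranked by histogram distance (the database layout and bin count are arbitrary assumptions).

# Colour-histogram retrieval over a dict of H x W x 3 uint8 RGB images.
import numpy as np

def colour_histogram(img, bins=8):
    """Joint RGB histogram of an image, normalised to sum to 1."""
    hist, _ = np.histogramdd(img.reshape(-1, 3), bins=(bins,) * 3, range=[(0, 256)] * 3)
    return hist.ravel() / hist.sum()

def retrieve(query_img, database, top_k=5):
    """Return the names of the top_k database images most similar to the query."""
    q = colour_histogram(query_img)
    distances = {name: np.abs(q - colour_histogram(img)).sum() for name, img in database.items()}
    return sorted(distances, key=distances.get)[:top_k]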


Content-based image retrieval (cont.)

http://labs.systemone.at/retrievr/?sketchName=2009-03-26-01-22-37-828150.3#sketchName=2009-03-26-01-23-35-358087.4


Handwritten digit recognition

3-nearest-neighbor = 2.4% error

400-300-10 unit MLP (a neural network approach) = 1.6% error

LeNet: 768-192-30-10 unit MLP = 0.9% error
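For comparison, a hedged sketch of a 3-nearest-neighbour digit classifier using scikit-learn's small built-in 8×8 digits dataset (not the dataset behind the error rates quoted above, so the numbers will differ).

# 3-nearest-neighbour digit recognition on scikit-learn's toy digits dataset.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
print("3-NN error rate:", 1.0 - knn.score(X_test, y_test))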


Summary

Vision is hard -- noise, ambiguity, complexity

Prior knowledge is essential to constrain the problem

Need to combine multiple cues: motion, contour, shading, texture, stereo

“Library” object representation: shape vs. aspects

Image/object matching: features, lines, regions, etc.
