November 13, 2014 Computer Vision Lecture 17: Object Recognition I

1

Today we will move on to…

Object Recognition


2

Pattern and Object Recognition

• Pattern recognition is used for region and object classification, and represents an important building block of complex machine vision processes.

• No recognition is possible without knowledge.

• Two kinds of knowledge are required: specific knowledge about the objects being processed, and hierarchically higher, more general knowledge about object classes.


3

Statistical Pattern Recognition

• Object recognition is based on assigning classes to objects.

• The device that does these assignments is called the classifier.

• The number of classes is usually known beforehand, and typically can be derived from the problem specification.

• The classifier does not determine the class from the object itself; rather, it uses sensed object properties called patterns.


4

Statistical Pattern Recognition


5


• For statistical pattern recognition, quantitative descriptions of objects’ characteristics (features or patterns) are used.

• The set of all possible patterns forms the pattern space or feature space.

• The classes form clusters in the feature space, which can be separated by discrimination hyper-surfaces.
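To make the idea of clusters and separating hyper-surfaces concrete, here is a minimal sketch of a nearest-class-mean classifier. It is only an illustration and is not taken from the lecture: the function names `fit_class_means` and `nearest_mean_class` and the toy 2D patterns are assumptions. With this rule, the boundary between two classes is the hyperplane halfway between their mean patterns.

```python
import numpy as np

def fit_class_means(patterns, labels):
    """Compute one mean pattern (cluster center) per class."""
    classes = np.unique(labels)
    means = np.stack([patterns[labels == c].mean(axis=0) for c in classes])
    return classes, means

def nearest_mean_class(pattern, classes, means):
    """Assign the class whose mean pattern is closest in feature space."""
    distances = np.linalg.norm(means - pattern, axis=1)
    return classes[np.argmin(distances)]

# Toy 2D feature space with two well-separated clusters (illustrative data).
patterns = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]])
labels = np.array([0, 0, 1, 1])
classes, means = fit_class_means(patterns, labels)
predicted = nearest_mean_class(np.array([1.0, 0.0]), classes, means)
```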


8

Object Recognition

How can we devise an algorithm that recognizes certain everyday objects?

Problems:

• The same object looks different from different perspectives.

• Changes in illumination create different images of the same object.

• Objects can appear at different positions in the visual field (image).

• Objects can be partially occluded.

• Objects are usually embedded in a scene.


9

Object Recognition

We are going to discuss an example of view-based object recognition.

The presented algorithm (Blanz, Schölkopf, Bülthoff, Burges, Vapnik & Vetter, 1996) tackles some of the problems that we mentioned:

• It learns what each object in its database looks like from different perspectives.

• It recognizes objects at any position in an image.

• To some extent, the algorithm can compensate for changes in illumination.

• However, it would perform very poorly for objects that are partially occluded or embedded in a complex scene.


10

The Set of Objects

The algorithm learns to recognize 25 different chairs:

It is shown each chair from 25 different viewing angles.


11

The Algorithm


12

The Algorithm

For learning each view of each chair, the algorithm performs the following steps:

• Centering the object within the image.

• Detecting edges in four different directions.

• Downsampling (and thereby smoothing) the resulting five images (the image without edge detection and the four edge images).

• Low-pass filtering each of the five images in four different directions.


13

The Algorithm

For classifying a new image of a chair (determining which of the 25 known chairs is shown), the algorithm carries out the following steps:

• In the new image, centering the object, detecting edges, downsampling and low-pass filtering as done for the database images.

• Computing the difference (distance) of the representation of the new image to all representations of the 25 × 25 views stored in the database.

• Determining the chair with the smallest average distance of its 25 views to the new image (“winner chair”).
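The last two steps can be sketched as follows. This is a minimal sketch, assuming the stored representations are already available as one NumPy array of shape (chairs, views, features); the function name `winner_chair` and the tiny toy database are illustrative assumptions, not part of the original algorithm description.

```python
import numpy as np

def winner_chair(new_view, database):
    """Return the index of the chair whose stored views are closest on average.

    database: array of shape (n_chairs, n_views, n_features)
    new_view: array of shape (n_features,)
    """
    # Euclidean distance from the new view to every stored view.
    distances = np.linalg.norm(database - new_view, axis=2)  # (n_chairs, n_views)
    # Average over the views of each chair, then pick the smallest average.
    return int(np.argmin(distances.mean(axis=1)))

# Toy database: 4 "chairs", 3 views each, 5 features (illustrative sizes only;
# the lecture uses 25 chairs, 25 views and 6400 features).
database = np.zeros((4, 3, 5))
for chair in range(4):
    database[chair] += chair          # every view of chair c is the constant vector c
new_view = np.full(5, 2.2)            # lies closest to the views of chair 2
winner = winner_chair(new_view, database)
```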


14

The Algorithm

Centering the object within the image:

Binarize the image:

[Figure: binarized example image on a 5 × 5 pixel grid, with x and y axes labeled 1 to 5]

Compute the center of gravity:

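$$\bar{x} = \frac{1 + 2 + 2 + 3 + 4 + 4}{6} = \frac{16}{6} \approx 2.67$$

$$\bar{y} = \frac{1 + 1 + 2 + 2 + 2 + 3}{6} = \frac{11}{6} \approx 1.83$$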

Finally, shift the image content so that the center of gravity coincides with the center of the image.
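The three centering steps (binarize, compute the center of gravity, shift) can be sketched as below. This is a minimal sketch under stated assumptions: the function name `center_object` and the `threshold` parameter are hypothetical, object pixels are assumed to be brighter than the threshold, and the shift is rounded to whole pixels.

```python
import numpy as np

def center_object(image, threshold=0.5):
    """Shift the image so the object's center of gravity lies at the image center."""
    # Binarize: assumed convention is that object pixels exceed the threshold.
    binary = image > threshold
    ys, xs = np.nonzero(binary)
    if xs.size == 0:
        return image.copy()           # no object pixels found, nothing to center
    # Center of gravity = mean x and mean y coordinate of the object pixels.
    cx, cy = xs.mean(), ys.mean()
    height, width = image.shape
    dx = int(round((width - 1) / 2 - cx))
    dy = int(round((height - 1) / 2 - cy))
    # np.roll wraps content around the border, which is harmless only when
    # the background is uniform (as with objects on a plain background).
    return np.roll(image, shift=(dy, dx), axis=(0, 1))

# A single object pixel in the top-left corner of a 5 x 5 image.
img = np.zeros((5, 5))
img[0, 0] = 1.0
centered = center_object(img)
```

In this toy case the single object pixel ends up at the central position of the grid.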


15

Object Recognition

Detecting edges in the image:

• Use a convolution filter for edge detection.

• For example, a Sobel or Canny filter would serve this purpose.

• Use the filter to detect edges in four different orientations.

• Store the resulting four images r1, …, r4 separately.
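One possible way to obtain four orientation-specific edge images is sketched below. The slides do not specify the kernels, so the Sobel kernels and their two diagonal variants are assumptions, as are the names `filter_same` and `edge_images` and the ordering of r2 to r4 (only r1 as the vertical-edge image is taken from the lecture).

```python
import numpy as np

# Assumed kernels: Sobel for vertical and horizontal edges plus diagonal variants.
EDGE_KERNELS = [
    np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float),   # r1: vertical edges
    np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float),   # r2: horizontal edges
    np.array([[0, 1, 2], [-1, 0, 1], [-2, -1, 0]], dtype=float),   # r3: one diagonal
    np.array([[-2, -1, 0], [-1, 0, 1], [0, 1, 2]], dtype=float),   # r4: other diagonal
]

def filter_same(image, kernel):
    """Slide the kernel over the image (cross-correlation), same-size output.

    Border pixels are replicated so that the image frame itself does not
    produce spurious edge responses.
    """
    kh, kw = kernel.shape
    rows, cols = image.shape
    padded = np.pad(image.astype(float), ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros((rows, cols))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + rows, j:j + cols]
    return out

def edge_images(image):
    """Return four edge-strength images, one per orientation."""
    return [np.abs(filter_same(image, k)) for k in EDGE_KERNELS]

# Test image with a single vertical step edge.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
r1, r2, r3, r4 = edge_images(img)
```

Taking the absolute value makes the response independent of edge polarity, which is a design choice of this sketch rather than something stated in the lecture.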


16

Object Recognition

Downsampling the image from 256 × 256 to 16 × 16 pixels:

• In order to keep as much of the original information as possible, use a Gaussian averaging filter that is slightly larger than 16 × 16.

• Place the Gaussian filter successively at 16 × 16 positions throughout the original image.

• Use each resulting value as the brightness value for one pixel in the downsampled image.
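The downsampling step can be sketched as follows. The lecture only says the Gaussian window is "slightly larger" than 16 × 16, so the window size of 20 pixels and the value of `sigma` are assumptions, as are the function names; the sketch also assumes a square input image.

```python
import numpy as np

def gaussian_window(size, sigma):
    """Normalized 2D Gaussian weights (they sum to 1)."""
    axis = np.arange(size) - (size - 1) / 2
    g = np.exp(-(axis ** 2) / (2 * sigma ** 2))
    weights = np.outer(g, g)
    return weights / weights.sum()

def downsample(image, out_size=16, window=20, sigma=5.0):
    """Reduce a square image to out_size x out_size by Gaussian-weighted averaging."""
    step = image.shape[0] // out_size              # 256 // 16 = 16 pixels per output pixel
    margin = (window - step) // 2                  # how far the window overlaps its neighbors
    padded = np.pad(image.astype(float), margin, mode="edge")
    weights = gaussian_window(window, sigma)
    out = np.empty((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            patch = padded[i * step:i * step + window, j * step:j * step + window]
            out[i, j] = (patch * weights).sum()    # one brightness value per position
    return out

small = downsample(np.ones((256, 256)))
```

Because the weights are normalized, a uniformly bright image stays uniformly bright after downsampling.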


17

Object Recognition

Low-pass filtering the image:

• Use the following four convolution filters:

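$$k_1 = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad k_2 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \quad k_3 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \quad k_4 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$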

• Apply each filter to each of the images r0, …, r4.

• For example, when you apply k1 to r1 (vertical edges), the resulting image will contain its highest values in regions where the original image contains parallel vertical edges.
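The filtering step can be sketched with the four kernels given above. This is a minimal sketch: the helper names `filter_same` and `build_representation` are hypothetical, and zero padding at the image border is an assumption that the slides do not address.

```python
import numpy as np

# The four 3 x 3 low-pass kernels k1..k4 from the lecture.
KERNELS = [
    np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]], dtype=float),   # k1
    np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]], dtype=float),   # k2
    np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]], dtype=float),   # k3
    np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float),   # k4
]

def filter_same(image, kernel):
    """Apply a small kernel with zero padding, same-size output.

    All four kernels are unchanged by a 180-degree rotation, so
    correlation and convolution give identical results here.
    """
    kh, kw = kernel.shape
    rows, cols = image.shape
    padded = np.pad(image.astype(float), ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros((rows, cols))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + rows, j:j + cols]
    return out

def build_representation(r_images):
    """From the five images r0..r4, build all 25 images (5 originals + 5 x 4 filtered)."""
    result = list(r_images)
    for r in r_images:
        for k in KERNELS:
            result.append(filter_same(r, k))
    return result

# Two adjacent (parallel) vertical lines respond more strongly to k1 than one line.
two_lines = np.zeros((16, 16))
two_lines[:, 5] = 1.0
two_lines[:, 6] = 1.0
one_line = np.zeros((16, 16))
one_line[:, 5] = 1.0
response_two = filter_same(two_lines, KERNELS[0]).max()
response_one = filter_same(one_line, KERNELS[0]).max()
```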


18

Object Recognition

Computing the difference between two views:

• For each view, we have computed 25 images (r0, …, r4 and their convolutions with k1, …, k4).

• Each image contains 16 × 16 brightness values.

• Therefore, the two views to be compared, va and vb, can be represented as 6400-dimensional vectors (25 × 16 × 16 = 6400).

• The distance (difference) d between the two views can then be computed as the length of their difference vector: d = || va – vb ||
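As a small check of the arithmetic, the sketch below flattens 25 images of 16 × 16 pixels into one vector and computes the Euclidean distance between two views. The function names `view_vector` and `view_distance` are illustrative assumptions, and the all-ones and all-zeros views are toy data.

```python
import numpy as np

def view_vector(images):
    """Concatenate the 25 images of one view into a single feature vector."""
    return np.concatenate([np.asarray(im, dtype=float).ravel() for im in images])

def view_distance(va, vb):
    """Euclidean length of the difference vector, d = ||va - vb||."""
    return float(np.linalg.norm(va - vb))

# Two toy views: one all ones, one all zeros.
va = view_vector([np.ones((16, 16)) for _ in range(25)])
vb = view_vector([np.zeros((16, 16)) for _ in range(25)])
d = view_distance(va, vb)
```

For these toy views every one of the 6400 components differs by 1, so d is the square root of 6400.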


19

Results

• Classification error: 4.7%

• If no edge detection is performed, the error increases to 21%.

• We should keep in mind that this algorithm was only tested on computer models of chairs shown in front of a white background.

• The algorithm would fail for real-world images.

• The algorithm would require components for image segmentation and completion of occluded parts.
