INTRODUCTION TO IMAGE PROCESSING (COMPUTER VISION)
Revision: 1.5, dated: April 20, 2006
Tomas Svoboda
Czech Technical University, Faculty of Electrical Engineering
Center for Machine Perception, Prague, Czech Republic
http://cmp.felk.cvut.cz/~svoboda
Course E383ZS (image processing part): Admin
• Course homepage: http://cmp.felk.cvut.cz/cmp/courses/EZS
• Both lectures and computer exercises will be taught at Karlovo namesti.
  • Lectures: Thu 8:30-10:00, G102a
  • Computer exercises: Fri 12:45-14:15, K132
• Grades (image processing part 12/24):
  • Final exam: 9/12
  • Assignments: 3/12
• Contact: [email protected]
Why digital image processing?
• Digital images are ubiquitous.
  • digital photographs
  • digital video
  • digital TV broadcasting
  • on-line resources on the web
• Firmware for acquisition devices.
• Cleaning noise.
• Software for digital photo labs (digilabs).
• ...
What will we learn: a few examples
Intensity corrections
Poorly exposed snapshot. Correction by histogram manipulation.
What will we learn: a few examples
Homomorphic filtering
Original image. Filtered image. Note improved sharpness and balanced tonality.
What will we learn: a few examples
Image restoration
Blurred image. Image restored by inverse filtering.
What will we learn: a few examples
Correction of perspective distortion
Distorted image. Correction by collineation.
HUMAN VISION vs. COMPUTER VISION
Vision allows humans to perceive and understand the world surrounding
them.
Computer vision aims to imitate the effect of human vision by
perceiving an image electronically and understanding its content using
computers.
Digital image = the input (understood intuitively), e.g., on the retina or
captured by a TV camera. Image function f(x, y), f(x, y, t), or a
matrix (after digitization).
EXAMPLES OF INPUT IMAGES
SEVERAL DISCIPLINES INDUCED
Digital image processing – 2D static world, no image interpretation
involved (rather independent of an application domain), signal
processing techniques.
Image analysis – 2D world, image interpretation involved, i.e. image
interpretation constitutes the crucial step.
Computer vision – the most general problem formulations, 3D world,
interpretations, potentially dynamic (i.e., an image sequence is needed),
ill-posed tasks, very ambitious.
LOW LEVEL vs. HIGH LEVEL PROCESSING
Low level = image processing
• Image data are not interpreted, i.e. semantics is not explored.
• Signal processing methods, e.g., 2D Fourier transformation.
• Same methods for a wide class of problems.
Images → Images
High level = image understanding, computer vision
• Interpretation to a specific application domain.
• Complex, artificial intelligence techniques, feedback.
• A tough problem. Often needs to be simplified.
Images → Description
ROLE OF INTERPRETATION, SEMANTICS
Interpretation: Observation → Model
Syntax → Semantics
Examples:
• Looking out of the window → {rains, does not rain}.
• An apple on the conveyer belt → {class 1, class 2, class 3}.
• Traffic scene → seeking number plate of a car.
Theoretical background: mathematical logic, theory of formal languages.
WHY IS COMPUTER VISION HARD? 6 REASONS
1. Loss of information in the 3D → 2D projection due to the perspective
transformation (mathematical abstraction = pinhole model).
2. Measured brightness is given by complicated image formation physics.
Radiance (≈ brightness) depends on the intensity and positions of the light
sources, observer position, local surface geometry, and albedo. The inverse
task is ill-posed.
3. Inherent presence of noise: every real-world measurement is corrupted
by noise.
4. A lot of data. A4 sheet, 300 dpi, 8 bits per pixel = 8.5 Mbytes.
Non-interlaced video 512 × 768, RGB (24 bits) = 225 Mbits/second.
5. Interpretation needed (as discussed above).
6. Local window vs. need for a global view.
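The data volumes quoted above can be checked with a few lines of arithmetic. This is only a sketch: the A4 dimensions (210 mm × 297 mm), the frame rate of 25 frames per second, and the function names are assumptions, not values given on the slide.

```python
MM_PER_INCH = 25.4

def scan_bytes(width_mm, height_mm, dpi, bits_per_pixel):
    """Uncompressed size in bytes of a sheet scanned at the given dpi."""
    cols = round(width_mm / MM_PER_INCH * dpi)
    rows = round(height_mm / MM_PER_INCH * dpi)
    return cols * rows * bits_per_pixel // 8

def video_bits_per_second(width, height, bits_per_pixel, fps):
    """Uncompressed bit rate of a non-interlaced video stream."""
    return width * height * bits_per_pixel * fps

# Assumed: A4 = 210 mm x 297 mm; 25 frames per second.
a4 = scan_bytes(210, 297, 300, 8)
video = video_bits_per_second(512, 768, 24, 25)

print(f"A4 scan: {a4 / 1e6:.1f} million bytes")
print(f"Video:   {video / 2**20:.0f} Mbit/s (1 Mbit = 2**20 bits)")
```

Under these assumptions the scan comes out at roughly 8.7 million bytes, the same order as the 8.5 Mbytes quoted, and the video rate is 225 Mbit/s when a megabit is taken as 2^20 bits.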
OBJECT RECOGNITION: HIERARCHY OF REPRESENTATIONS
[Diagram: Object or scene → 2D image → Digital image → Image with features
(regions, edgels, scale, orientation, texture) → Objects.
Stages: from objects to images, from images to features, from features to
objects, understanding objects.]
IMAGE
Image - understood intuitively; image on the retina, captured by a TV
camera.
Image function f(x, y), f(x, y, t). Outcome of the perspective projection.
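[Figure: perspective projection of a point P(x, y, z) in the 3D scene onto the
image plane at distance f; scene axes X, Y, Z; image coordinates x', y'.]
$$x' = \frac{x f}{z} , \quad y' = \frac{y f}{z} .$$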
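The projection equations can be tried out numerically. The sketch below is a direct transcription of x' = x f / z, y' = y f / z; the function name and the example points are made up for illustration.

```python
def project(point, f):
    """Pinhole projection of a 3D point (x, y, z) to image coordinates (x', y')."""
    x, y, z = point
    if z == 0:
        raise ValueError("z = 0: the point cannot be projected")
    return (x * f / z, y * f / z)

# Two different scene points on the same ray through the projection centre.
near = project((1.0, 2.0, 4.0), f=2.0)
far = project((2.0, 4.0, 8.0), f=2.0)

print(near, far)  # both points land on the same image point
```

Both points map to the same image coordinates, which illustrates the loss of depth information in the 3D → 2D projection.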
IMAGE FUNCTION, 2-DIMENSIONAL SIGNAL
Monochromatic static image f(x, y), where
(x, y) are coordinates in a plane with domain
R = {(x, y), 1 ≤ x ≤ xm, 1 ≤ y ≤ yn} ;
f is image function value (≈ brightness, density of a transparent object,
distance to observer, etc.)
(Natural) 2D images:
A thin specimen in an optical microscope, an image of a character on paper,
a fingerprint, a single slice from a tomograph, etc.
IMAGE FUNCTION → DIGITAL IMAGE
From continuous to discrete space.
• Sampling of the image domain. Selection of discrete points.
• Quantization of the image range. Selection of discrete values.
• Usual representation = matrix. f(x, y) → f(r, c).
• Pixel = Picture element.
[Figure: image sampled at 50 × 50 pixels and quantized to 32 values;
horizontal axis: columns (1 to 50), vertical axis: rows (1 to 50).]
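The two steps, sampling of the domain and quantization of the range, can be sketched as follows. The continuous image `scene`, the unit-square domain, and the value range [0, 1] are assumptions chosen for illustration; only the 50 × 50 grid and the 32 levels come from the figure.

```python
import math

def sample(f, rows, cols):
    """Sample a continuous image f(x, y) on [0, 1) x [0, 1) at pixel centres."""
    return [[f((c + 0.5) / cols, (r + 0.5) / rows) for c in range(cols)]
            for r in range(rows)]

def quantize(image, levels):
    """Map values in [0, 1] to integer levels 0 .. levels - 1."""
    return [[min(int(v * levels), levels - 1) for v in row] for row in image]

def scene(x, y):
    # Hypothetical continuous image with values in [0, 1].
    return 0.5 + 0.5 * math.sin(2 * math.pi * x) * math.cos(2 * math.pi * y)

# f(x, y) -> f(r, c): a 50 x 50 matrix with 32 discrete values.
digital = quantize(sample(scene, 50, 50), 32)
print(len(digital), len(digital[0]), min(map(min, digital)), max(map(max, digital)))
```

The result is a plain matrix indexed by row and column, which is the usual representation of a digital image.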
SAMPLING
Two involved problems:
1. Arrangement of the sampling points.
[Figure: two arrangements of sampling points, panels (a) and (b).]
2. Distance between sampling points (Shannon sampling theorem).
Sampling vs. Quantization
• dots per inch [dpi]
• frames per second [fps]
• 24-bit color
• 256 gray levels
Sampling is usually described as
• frequency or (frame) rate
• spacing
• density
Quantization is usually described by
• the number of bits (bytes) per sample
• the number of discrete values
SAMPLING EXAMPLE 1
Original 256 × 256. Sampled 128 × 128.
SAMPLING EXAMPLE 2
Original 256 × 256. Sampled 64 × 64.
SAMPLING EXAMPLE 3
Original 256 × 256. Sampled 32 × 32.
QUANTIZATION EXAMPLE 1
Original 256 levels. Quantized to 64 levels.
QUANTIZATION EXAMPLE 2
Original 256 levels. Quantized to 16 levels.
QUANTIZATION EXAMPLE 3
Original 256 levels. Quantized to 4 levels.
QUANTIZATION EXAMPLE 4
Original 256 levels. Quantized to 2 levels.
Resolution
Resolution is the ability to distinguish between details. It is not the number of pixels.
Both images have the same number of pixels but different resolution.
Resolution is more closely related to what we can reconstruct from the image.
DISTANCE
Function D is called the distance iff
D(p, q) ≥ 0, particularly D(p, p) = 0 (identity).
D(p, q) = D(q, p) (symmetry).
D(p, r) ≤ D(p, q) + D(q, r) (triangle inequality).
Several definitions of distance in a square raster
Euclidean distance (as the crow flies)
$$D_E((x, y), (h, k)) = \sqrt{(x - h)^2 + (y - k)^2} .$$
City block distance (also called Manhattan distance)
$$D_4((x, y), (h, k)) = |x - h| + |y - k| .$$
Chessboard distance
$$D_8((x, y), (h, k)) = \max\{ |x - h|, |y - k| \} .$$
[Figure: the distances D_E, D_4 and D_8 illustrated on a raster with
columns 0 to 4 and rows 0 to 2.]
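A minimal sketch of the three distances, written directly from the definitions above. The function names and the example points are chosen for illustration only.

```python
import math

def d_euclid(p, q):
    """Euclidean distance D_E."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def d4(p, q):
    """City block (Manhattan) distance D_4."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d8(p, q):
    """Chessboard distance D_8."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

p, q = (0, 0), (4, 2)  # example points, not taken from the slide
print(d_euclid(p, q), d4(p, q), d8(p, q))
```

For any pair of raster points the three values are ordered D_8 ≤ D_E ≤ D_4.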
4-neighbourhood and 8-neighbourhood
A set consisting of the pixel itself and its neighbours at distance 1,
measured by D4 for the 4-neighbourhood and by D8 for the 8-neighbourhood.
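The definition translates into a small helper. This is a sketch under the assumption that pixels are (row, column) pairs and that image borders are ignored; the function name is hypothetical.

```python
def neighbourhood(pixel, connectivity=4):
    """Pixel itself plus all raster points at distance 1 (D4 or D8)."""
    if connectivity not in (4, 8):
        raise ValueError("connectivity must be 4 or 8")
    r, c = pixel
    cells = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if connectivity == 4:
                d = abs(dr) + abs(dc)        # city block distance D4
            else:
                d = max(abs(dr), abs(dc))    # chessboard distance D8
            if d <= 1:
                cells.append((r + dr, c + dc))
    return cells

print(len(neighbourhood((5, 5), 4)), len(neighbourhood((5, 5), 8)))
```

The 4-neighbourhood contains 5 points and the 8-neighbourhood 9 points, the centre pixel included.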
CROSSING LINES PARADOX
BINARY IMAGE & RELATION ‘being contiguous’
black ∼ objects & white ∼ background
Neighbouring pixels are contiguous.
Two pixels are contiguous if there is a path between them consisting of
neighbouring pixels.
REGION = compact set
Relation ‘being contiguous’ is reflexive, symmetric, and transitive, i.e. this is
an equivalence relation.
Thus it decomposes the set of ‘object’ pixels into equivalence classes =
regions.
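The decomposition into equivalence classes can be computed by flood filling. The sketch below uses a breadth-first search, which is one common way to do it and not necessarily the method used in the course. The convention that object pixels have value 1, and the example image, are assumptions.

```python
from collections import deque

def label_regions(image, connectivity=4):
    """Give every contiguous set of object pixels (value 1) its own label."""
    if connectivity == 4:
        steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    else:
        steps = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                 if (dr, dc) != (0, 0)]
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]   # 0 = background / unlabelled
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or labels[r][c]:
                continue
            count += 1                            # start a new region
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:                          # follow paths of neighbours
                cr, cc = queue.popleft()
                for dr, dc in steps:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and image[nr][nc] == 1 and not labels[nr][nc]):
                        labels[nr][nc] = count
                        queue.append((nr, nc))
    return labels, count

image = [
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
_, regions4 = label_regions(image, connectivity=4)
_, regions8 = label_regions(image, connectivity=8)
print(regions4, regions8)
```

The same binary image splits into three regions under the 4-neighbourhood but forms a single region under the 8-neighbourhood, so the choice of neighbourhood changes which pixels count as contiguous.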
A FEW CONCEPTS
Boundary (of a region) vs. edge vs. edgel.
Inner and outer boundary.
Convex hull, lakes, bays.
Histogram of Image Intensities aka Image Histogram
The histogram of image intensities, once normalized by the number of pixels,
is an estimate of the probability density of the intensity values.
[Figure: image histogram of intensities; horizontal axis: intensity 0 to 250,
vertical axis: pixel count 0 to 3500.]
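Counting intensities and normalizing the counts can be sketched as follows. The tiny 4-level image is made up for illustration; a real gray-scale image would use 256 levels.

```python
def histogram(image, levels=256):
    """Count how many pixels take each intensity 0 .. levels - 1."""
    counts = [0] * levels
    for row in image:
        for value in row:
            counts[value] += 1
    return counts

image = [[0, 0, 1],
         [2, 1, 0]]                      # hypothetical image with 4 gray levels
h = histogram(image, levels=4)
density = [n / sum(h) for n in h]        # normalized: estimate of the density

print(h)
print(density)
```

The normalized counts sum to 1, which is what allows the histogram to be read as an estimate of a probability density.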
Histogram Equalization
The Aim:
• Increase contrast for a human observer.
• Normalize intensities, e.g., for automatic image comparison.
[Figure: original histogram and histogram after equalization; horizontal
axes: intensity 0 to 250, vertical axes: pixel count 0 to 3500.]
Increased contrast after histogram equalization
[Figure: original image and the image after equalization.]
Histogram equalization: derivation
Input: histogram H(p) of the image with gray levels p ∈ [p_0, p_k].
Aim: find a monotonic pixel brightness transformation q = T(p), such that
the desired output histogram G(q) is uniform over the whole output
brightness scale q ∈ [q_0, q_k].
The monotonicity of the transformation implies:
$$\sum_{i=0}^{k} G(q_i) = \sum_{i=0}^{k} H(p_i) .$$
Equalized histogram ≈ uniform density, where N^2 is the total number of
pixels (the value of both sums above):
$$G(q) = \frac{N^2}{q_k - q_0} .$$
Histogram equalization: derivation II
The exactly uniform histogram may be obtained only in continuous space.
$$\int_{q_0}^{q} G(s)\, ds = \int_{p_0}^{p} H(s)\, ds .$$
$$\int_{q_0}^{q} \frac{N^2}{q_k - q_0}\, ds = \int_{p_0}^{p} H(s)\, ds .$$
$$\frac{N^2 (q - q_0)}{q_k - q_0} = \int_{p_0}^{p} H(s)\, ds .$$
$$N^2 (q - q_0) = (q_k - q_0) \int_{p_0}^{p} H(s)\, ds .$$
$$q = T(p) = \frac{q_k - q_0}{N^2} \int_{p_0}^{p} H(s)\, ds + q_0 .$$
Histogram equalization: derivation III
Continuous space: distribution function
$$q = T(p) = \frac{q_k - q_0}{N^2} \int_{p_0}^{p} H(s)\, ds + q_0 .$$
Discrete space: cumulative histogram
$$q = T(p) = \frac{q_k - q_0}{N^2} \sum_{i=p_0}^{p} H(i) + q_0 .$$
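The discrete formula can be turned into a lookup table. The sketch below assumes p_0 = q_0 = 0 and q_k = levels − 1, rounds the result to the nearest integer level, and uses a made-up 8-level image; none of these choices comes from the slides.

```python
def equalize(image, levels=256):
    """Histogram equalization via the cumulative histogram (assumed q0 = p0 = 0)."""
    hist = [0] * levels
    for row in image:
        for p in row:
            hist[p] += 1
    total = sum(hist)            # number of pixels, the N^2 of the derivation
    q0, qk = 0, levels - 1
    lut = []                     # lut[p] = T(p)
    cumulative = 0
    for count in hist:
        cumulative += count      # sum of H(i) for i = p0 .. p
        lut.append(round((qk - q0) * cumulative / total + q0))
    return [[lut[p] for p in row] for row in image]

dark = [[0, 1, 1, 2],
        [1, 2, 2, 3]]            # hypothetical image using only levels 0..3 of 8
bright = equalize(dark, levels=8)
print(bright)
```

The input occupies only the lowest four of eight levels, while the output spreads over levels 1 to 7. The transformation is monotonic, so the ordering of intensities is preserved.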
More intensity transformations: original
[Figure: histogram of intensities of the original image; intensity axis 0 to
250, pixel count 0 to 5000.]
More intensity transformations: brightness, q = p + const
[Figure: histogram after the brightness change; intensity axis 0 to 250,
pixel count 0 to 5000.]
More intensity transformations: contrast, q = p * const
[Figure: histogram after the contrast change; intensity axis 0 to 250,
pixel count 0 to 5000.]
More intensity transformations: gamma correction, $q = p^{\gamma}$
[Figure: histogram after gamma correction; intensity axis 0 to 250,
pixel count 0 to 4000.]
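The brightness, contrast, and gamma transformations are all point operations: each output pixel depends only on the corresponding input pixel. A minimal sketch follows. The clipping to 0..255, the rounding, and the normalization to [0, 1] before applying the exponent are assumptions added to keep results in the 8-bit range; the slides give only q = p + const, q = p * const, and q = p^γ.

```python
def clip(value, lo=0, hi=255):
    """Keep a value inside the valid intensity range."""
    return max(lo, min(hi, value))

def brightness(p, const):
    return clip(p + const)                       # q = p + const

def contrast(p, const):
    return clip(round(p * const))                # q = p * const

def gamma_correct(p, gamma, max_level=255):
    # Assumption: intensities are normalized to [0, 1] before exponentiation.
    return clip(round(max_level * (p / max_level) ** gamma), 0, max_level)

def apply_pointwise(image, transform, *args):
    """Apply a per-pixel transformation to every pixel of the image."""
    return [[transform(p, *args) for p in row] for row in image]

image = [[0, 64, 100], [200, 250, 255]]          # made-up 8-bit values
print(apply_pointwise(image, brightness, 20))
print(apply_pointwise(image, contrast, 1.5))
print(apply_pointwise(image, gamma_correct, 0.5))
```

With these conventions a gamma below 1 brightens the mid-tones while leaving 0 and 255 fixed, whereas brightness and contrast changes can saturate at the ends of the range.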
More intensity transformations: histogram equalization, $q = \sum_{i=p_0}^{p} H(i)$
[Figure: histogram after equalization; intensity axis 0 to 250,
pixel count 0 to 6000.]