Image Fundamentals

Digital Image Processing: Digital imaging fundamentals

Massimo Fierro

Department of Computer & Information Communications Engineering, Daegu Catholic University

Daegu, Korea

March 04, 2011

Image size

An image is a multi-dimensional object:

- 2 spatial dimensions: width and height
- 1 color dimension: the number of color channels

Width and height
Width and height are measured in pixels: bigger size = more imaging space.

Color channels
The number of color channels is dimensionless. Common representations:

- Red, Green, Blue (display)
- Cyan, Magenta, Yellow, K (black), for printing
- YCbCr = luminance, blue-difference chroma, red-difference chroma (compression)
- N channels (multi-spectral imaging)
- ...

M. Fierro, Daegu Catholic University Digital Image Processing, Digital imaging fundamentals 2/49



Spatial resolution

Resolution is not size
Resolution is a measure of the smallest “visible” detail: higher resolution = more detail.

Unit of measure
Resolution only makes sense if you have a unit of measure (e.g. mm).

Common units of resolution

- Dots per inch (DPI), common in printing
- Lines per inch (LPI), common in device characterization









Spatial resolution examples

Resolution of a camera sensor
Canon APS-C sensor (22.3 x 14.9 mm)

- Sensor A: 3456 x 2304 (8 Mpixel)
- Sensor B: 3888 x 2592 (10.1 Mpixel)

Images in print
35 mm size sensor (36 x 24 mm)

     72 DPI           300 DPI
A    722 x 542 mm     173 x 130 mm
B    1372 x 914 mm    329 x 219 mm

Table: Maximum print size according to DPI
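The relation between pixel count, DPI and print size can be sketched in a few lines. This is a minimal illustration, assuming the usual definition (print length = pixels / DPI, converted from inches to millimetres); the function name `print_size_mm` is made up for this example. With the 3888 x 2592 pixel count it reproduces row B of the table.

```python
MM_PER_INCH = 25.4

def print_size_mm(width_px, height_px, dpi):
    """Return the (width, height) in millimetres of a print at the given DPI."""
    return (width_px / dpi * MM_PER_INCH, height_px / dpi * MM_PER_INCH)

w72, h72 = print_size_mm(3888, 2592, 72)
w300, h300 = print_size_mm(3888, 2592, 300)
print(round(w72), round(h72))    # 1372 914
print(round(w300), round(h300))  # 329 219
```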




Changing spatial resolution

Figure: Same image at 1250, 300, 150, 72 dpi (not true on slides)

Intensity resolution (color depth)

Definition
Intensity resolution = number of distinguishable “colors”

Common color depths

- 1 bit = binary (e.g. black and white)
- 8 bits = 256 levels (e.g. gray-scale)
- 3 x 8 bits = 16M colors (256 levels each for R, G and B)




Changing intensity resolution

Figure: panels (a) and (b)
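One simple way to lower the intensity resolution of an 8-bit image is uniform requantization. The sketch below is only an illustration under that assumption (the slides do not say which method produced the figure); `reduce_depth` is a hypothetical helper name.

```python
import numpy as np

def reduce_depth(img, bits):
    """Requantize an 8-bit image to 2**bits gray levels (values stay in 0..255)."""
    step = 256 // (2 ** bits)      # width of each quantization bin
    return (img // step) * step    # snap every pixel to the bottom of its bin

gray = np.arange(256, dtype=np.uint8).reshape(16, 16)  # synthetic ramp image
binary = reduce_depth(gray, 1)
four_bit = reduce_depth(gray, 4)
print(np.unique(binary))          # [  0 128]
print(len(np.unique(four_bit)))   # 16
```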


Image interpolation

Problem
We have an image and we want to make it bigger (i.e. we want to increase its width and height).

Missing data
Where do we get the missing data?




Different interpolation methods

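Nearest neighbor
Use the intensity value of the nearest pixel in the original image.

Bilinear interpolation
Use information from the four nearest neighbors:

$$F(x', y') = (1 - b)\,[(1 - a)\,F(x, y) + a\,F(x + 1, y)] + b\,[(1 - a)\,F(x, y + 1) + a\,F(x + 1, y + 1)]$$

where $(x, y)$ is the pixel of the original image with $x \le x' < x + 1$ and $y \le y' < y + 1$, and $a = x' - x$, $b = y' - y$ are the fractional offsets.

Bicubic interpolation
Use information from the sixteen nearest neighbors:

$$I(x, y) = \sum_{i=0}^{3} \sum_{j=0}^{3} a_{ij}\, x^i y^j$$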
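The nearest-neighbor and bilinear rules can be sketched as follows. This is a minimal illustration, assuming a single-channel NumPy array indexed as `img[y, x]`, non-negative sample coordinates and an image of at least 2 x 2 pixels; the function names are made up for the example, and bicubic interpolation is left out for brevity.

```python
import numpy as np

def nearest(img, xp, yp):
    """Nearest-neighbor sample of img at real-valued (xp, yp)."""
    h, w = img.shape
    x = min(max(int(round(xp)), 0), w - 1)
    y = min(max(int(round(yp)), 0), h - 1)
    return img[y, x]

def bilinear(img, xp, yp):
    """Bilinear sample: weighted mix of the four pixels around (xp, yp)."""
    h, w = img.shape
    x = min(int(np.floor(xp)), w - 2)   # corner pixel of the 2x2 cell
    y = min(int(np.floor(yp)), h - 2)
    a, b = xp - x, yp - y               # fractional offsets
    top = (1 - a) * img[y, x] + a * img[y, x + 1]
    bottom = (1 - a) * img[y + 1, x] + a * img[y + 1, x + 1]
    return (1 - b) * top + b * bottom

img = np.array([[0.0, 10.0],
                [20.0, 30.0]])
print(bilinear(img, 0.5, 0.5))  # 15.0
print(nearest(img, 0.2, 0.9))   # 20.0
```

To enlarge an image, each output pixel is mapped back to a real-valued position in the original and sampled with one of these functions.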







Figure: (c) 4x4 grid; (d) 4x4 and 12x12 grid; (e) Nearest neighbor; (f) Bilinear; (g) Bicubic


Interpolation examples

Figure: (h) 64x64; (i) 128x128 nearest neighbor; (j) 128x128 bilinear; (k) 128x128 bicubic




Pixel neighbors

4-neighborhood
Given a pixel $p = I(x, y)$, the 4-neighbors are

$$N_4(p) = \{I(x + 1, y),\ I(x - 1, y),\ I(x, y + 1),\ I(x, y - 1)\}$$

Note
If p is on the border of the image, some of the neighbors will be outside of the image.


Pixel neighbors

Diagonal-neighborhood
Given a pixel $p = I(x, y)$, the diagonal-neighbors are

$$N_D(p) = \{I(x + 1, y + 1),\ I(x + 1, y - 1),\ I(x - 1, y + 1),\ I(x - 1, y - 1)\}$$



Pixel neighbors

8-neighborhood
The 8-neighborhood is given by

$$N_8(p) = N_4(p) \cup N_D(p)$$
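The three neighborhoods are easy to enumerate in code. The sketch below works with pixel coordinates rather than intensities (an assumption made for convenience) and adds a hypothetical `inside` helper to discard neighbors that fall outside the image, as happens for border pixels.

```python
def n4(x, y):
    """Coordinates of the 4-neighbors of (x, y)."""
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def nd(x, y):
    """Coordinates of the diagonal neighbors of (x, y)."""
    return [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]

def n8(x, y):
    """Coordinates of the 8-neighbors: union of 4- and diagonal neighbors."""
    return n4(x, y) + nd(x, y)

def inside(coords, width, height):
    """Drop coordinates that fall outside a width x height image."""
    return [(x, y) for x, y in coords if 0 <= x < width and 0 <= y < height]

print(len(n8(1, 1)))            # 8
print(inside(n4(0, 0), 3, 3))   # [(1, 0), (0, 1)]
```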



Adjacency

4-adjacency
Given two pixels $p \neq 0$, $q \neq 0$, they are 4-adjacent if

$$q \in N_4(p) \text{ or } p \in N_4(q)$$

8-adjacency
Given two pixels $p \neq 0$, $q \neq 0$, they are 8-adjacent if

$$q \in N_8(p) \text{ or } p \in N_8(q)$$

m-adjacency
Given two pixels $p \neq 0$, $q \neq 0$, they are m-adjacent if

- $q \in N_4(p)$, or
- $q \in N_D(p)$ and all pixels in $N_4(p) \cap N_4(q)$ are zero
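The m-adjacency rule is the least intuitive of the three, so a small check may help. This is a sketch assuming a binary NumPy image indexed as `img[y, x]` and pixels given as `(x, y)` coordinate pairs; all function names are illustrative.

```python
import numpy as np

def n4(x, y):
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(x, y):
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

def is_set(img, x, y):
    """True if (x, y) lies inside the image and is non-zero."""
    h, w = img.shape
    return 0 <= x < w and 0 <= y < h and img[y, x] != 0

def m_adjacent(img, p, q):
    """m-adjacency: 4-adjacent, or diagonal with no shared non-zero 4-neighbor."""
    if not (is_set(img, *p) and is_set(img, *q)):
        return False
    if q in n4(*p):
        return True
    shared = n4(*p) & n4(*q)
    return q in nd(*p) and not any(is_set(img, x, y) for x, y in shared)

img = np.array([[0, 1, 1],
                [0, 1, 0],
                [0, 0, 1]])
print(m_adjacent(img, (1, 1), (2, 2)))  # True
print(m_adjacent(img, (1, 1), (2, 0)))  # False
```

In the second call the two pixels are diagonal neighbors, but they share the non-zero 4-neighbor at (1, 0), so the diagonal link is rejected; this is how m-adjacency removes the ambiguity of 8-adjacency.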


Paths (curves)

Definition
A path from a pixel $p = (x, y)$ to a pixel $q = (s, t)$ is a sequence of pixels $\{(x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n)\}$ such that $(x_0, y_0) = (x, y)$, $(x_n, y_n) = (s, t)$, and each pixel is adjacent to the one before it.

Closed path
A path is closed if $(x_0, y_0) = (x_n, y_n)$, i.e. $(x, y) = (s, t)$.

4-, 8- and m-paths

- A 4-path is computed using 4-adjacency
- An 8-path is computed using 8-adjacency
- An m-path is computed using m-adjacency







Connected components

Connection
Let S be a subset of pixels in the image. Two pixels p, q are connected in S if there is a path between them made entirely of pixels in S.

Connected component
For any pixel $p \in S$, the set of pixels connected to it is called a connected component of S.

Connected set
If in S there is just one connected component, then S is a connected set or region.
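Connected components can be found by flood-filling from each unlabeled pixel. The breadth-first sketch below is one common approach, not necessarily the one used to produce the figures; it assumes S is given as the non-zero pixels of a NumPy array, and the adjacency is passed in as a list of coordinate offsets.

```python
from collections import deque
import numpy as np

def label_components(mask, offsets):
    """Label connected components of non-zero pixels by breadth-first search."""
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=int)
    count = 0
    for sy in range(h):
        for sx in range(w):
            if mask[sy, sx] == 0 or labels[sy, sx] != 0:
                continue                      # background or already labeled
            count += 1
            labels[sy, sx] = count
            queue = deque([(sx, sy)])
            while queue:
                x, y = queue.popleft()
                for dx, dy in offsets:
                    nx, ny = x + dx, y + dy
                    if 0 <= nx < w and 0 <= ny < h and mask[ny, nx] and not labels[ny, nx]:
                        labels[ny, nx] = count
                        queue.append((nx, ny))
    return labels, count

N4 = [(1, 0), (-1, 0), (0, 1), (0, -1)]
N8 = N4 + [(1, 1), (1, -1), (-1, 1), (-1, -1)]

mask = np.array([[1, 1, 0],
                 [0, 0, 1],
                 [0, 0, 1]])
print(label_components(mask, N4)[1])  # 2
print(label_components(mask, N8)[1])  # 1
```

The same set of pixels forms two components under 4-adjacency and a single region under 8-adjacency.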

















Examples

Figure: (l) Set S1; (m) Connected components in S1; (n) Set S2; (o) Region in S2

Relationships between regions

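Adjacent regions
Two regions $R_i$, $R_j$ in the same image are adjacent if $R_i \cup R_j$ is itself a region (i.e. a connected set).

Disjoint regions
Two regions $R_i$, $R_j$ in the same image are disjoint if they are not adjacent.

Choice of adjacency
The choice of the adjacency method is extremely important: two regions that touch only diagonally are adjacent under 8-adjacency but disjoint under 4-adjacency.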

















Distance measures

Definition
A function $D(p, q)$, where p and q are points in an image, is called a distance function or distance metric if all the following conditions hold:

- $D(p, q) \ge 0$, with $D(p, q) = 0$ if and only if $p = q$
- $D(p, q) = D(q, p)$
- $D(p, z) \le D(p, q) + D(q, z)$

Euclidean distance
Given $p = (p_x, p_y)$ and $q = (q_x, q_y)$, the Euclidean distance is

$$D_e(p, q) = \sqrt{(p_x - q_x)^2 + (p_y - q_y)^2}$$




Distance measures

D4 distance (alias “Manhattan” or “city block”)
Given $p = (p_x, p_y)$ and $q = (q_x, q_y)$, the Manhattan distance is

$$D_4(p, q) = |p_x - q_x| + |p_y - q_y|$$

D8 distance (alias “chessboard”)
Given $p = (p_x, p_y)$ and $q = (q_x, q_y)$, the chessboard distance is

$$D_8(p, q) = \max(|p_x - q_x|,\ |p_y - q_y|)$$
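The three distance measures differ only in how they combine the coordinate differences. A minimal sketch, with points given as `(x, y)` tuples and function names chosen for this example:

```python
import math

def d_euclid(p, q):
    """Euclidean distance between two points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def d4(p, q):
    """City-block (Manhattan) distance."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d8(p, q):
    """Chessboard distance."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

p, q = (0, 0), (3, 4)
print(d_euclid(p, q), d4(p, q), d8(p, q))  # 5.0 7 4
```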




Matrix vs Array (point by point) operations

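Given two matrices

$$a = \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix} \quad \text{and} \quad b = \begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix}$$

Array product
The array product of a and b is

$$ab = \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix} \begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix} = \begin{bmatrix} a_{1,1} b_{1,1} & a_{1,2} b_{1,2} \\ a_{2,1} b_{2,1} & a_{2,2} b_{2,2} \end{bmatrix}$$

Matrix product
The matrix product of a and b is

$$ab = \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix} \begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix} = \begin{bmatrix} a_{1,1} b_{1,1} + a_{1,2} b_{2,1} & a_{1,1} b_{1,2} + a_{1,2} b_{2,2} \\ a_{2,1} b_{1,1} + a_{2,2} b_{2,1} & a_{2,1} b_{1,2} + a_{2,2} b_{2,2} \end{bmatrix}$$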
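In NumPy the two products are spelled differently, which makes the distinction easy to try out. The example below is just an illustration with arbitrary 2 x 2 values: `*` is the array (point by point) product and `@` is the matrix product.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

array_product = a * b    # element by element
matrix_product = a @ b   # rows of a times columns of b

print(array_product.tolist())   # [[5, 12], [21, 32]]
print(matrix_product.tolist())  # [[19, 22], [43, 50]]
```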




Linear vs Non-linear operations

Consider an operator H such that, given the image $f(x, y)$ as input, it produces the image $g(x, y)$ as output

$$H[f(x, y)] = g(x, y)$$

Linear operations
H is said to be a linear operator if it satisfies the additivity and homogeneity properties, i.e. for any constants $a_i$, $a_j$ and images $f_i$, $f_j$

$$H[a_i f_i(x, y) + a_j f_j(x, y)] = a_i H[f_i(x, y)] + a_j H[f_j(x, y)] = a_i g_i(x, y) + a_j g_j(x, y)$$

Non-linear operations
H is said to be a non-linear operator if this property is not satisfied.
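A quick numerical check can expose a non-linear operator. The sketch below tests the linearity condition on one pair of small sample images with arbitrary constants; the sum of all pixels passes, while the maximum fails. Passing on one sample does not prove linearity, but a single failure does disprove it.

```python
import numpy as np

f1 = np.array([[0.0, 1.0], [2.0, 3.0]])
f2 = np.array([[3.0, 2.0], [1.0, 0.0]])
a1, a2 = 2.0, -3.0

def is_linear_on_sample(H):
    """Check H[a1*f1 + a2*f2] == a1*H[f1] + a2*H[f2] on the sample images."""
    return bool(np.isclose(H(a1 * f1 + a2 * f2), a1 * H(f1) + a2 * H(f2)))

print(is_linear_on_sample(np.sum))  # True
print(is_linear_on_sample(np.max))  # False
```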










Arithmetic Operations

Definition
Arithmetic operations between images are array operations and they can be written as

$$s(x, y) = f(x, y) + g(x, y)$$

$$d(x, y) = f(x, y) - g(x, y)$$

$$p(x, y) = f(x, y) \times g(x, y)$$

$$v(x, y) = f(x, y) \div g(x, y)$$




Frequent use of image summation: image denoising

Assumptions
Suppose that we have a clean image $f(x, y)$ and some uncorrelated, zero-mean noise $\eta(x, y)$; then $g(x, y) = f(x, y) + \eta(x, y)$ is a noise-corrupted image.

Average
If we have multiple noisy images $g_i(x, y)$ of the same scene, we can estimate $f(x, y)$ by using the average operation

$$\bar{g}(x, y) = \frac{1}{K} \sum_{i=1}^{K} g_i(x, y)$$

Then

$$E\{\bar{g}(x, y)\} = f(x, y)$$

$$\sigma^2_{\bar{g}(x, y)} = \frac{1}{K}\, \sigma^2_{\eta(x, y)}$$
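The variance reduction by a factor K can be observed on synthetic data. The simulation below is a sketch under stated assumptions: a constant clean image, Gaussian zero-mean noise with standard deviation 10, and K = 25 observations, all chosen arbitrarily for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.full((64, 64), 100.0)     # clean image (constant, for simplicity)
sigma, K = 10.0, 25

# K noisy observations g_i = f + eta_i, with zero-mean uncorrelated noise
noisy = f + rng.normal(0.0, sigma, size=(K, 64, 64))
g_bar = noisy.mean(axis=0)       # pixel-wise average of the K images

print(noisy[0].std())   # close to sigma = 10
print(g_bar.std())      # close to sigma / sqrt(K) = 2
```

The standard deviation drops by a factor of about sqrt(K), which corresponds to the variance dropping by a factor K.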




Image averaging example


Frequent use of image subtraction: visibility enhancement


Frequent use of image multiplication (division): shading correction

Description
Sometimes the lighting in a scene may be non-uniform, or the camera lens may exhibit vignetting. This can be easily corrected if we can model the shading pattern.
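One common way to formalize this is a multiplicative model, which is an assumption of this sketch rather than something the slide states: the observed image is g = f × h, where h is the shading pattern, so the correction is the array division f = g / h. The scene and the shading pattern below are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.uniform(50, 200, size=(32, 32))   # hypothetical clean scene
ramp = np.linspace(0.4, 1.0, 32)
h = np.outer(np.ones(32), ramp)           # assumed shading: darker on the left
g = f * h                                 # observed, shaded image

f_hat = g / h                             # correction by array division
print(np.allclose(f_hat, f))              # True
```

In practice h has to be estimated, for example from an image of a uniformly lit flat target, and it must be non-zero everywhere for the division to be defined.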


Note on data range

Data range problem
The usual data range for 8-bit images is [0, 255]. If we perform arithmetic on such images, we may end up with values outside that range (e.g. 15 − 200 = −185).

Data range solution
After performing arithmetic you should do the following

$$f_m = f - \min(f)$$

$$f_s = K \left[ \frac{f_m}{\max(f_m)} \right]$$

where K is the maximum of the target range (K = 255 for 8-bit images), so that $f_s$ spans [0, K].
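The two rescaling steps translate directly into code. The sketch below assumes K = 255 and adds a guard for constant images (where max(f_m) is zero), which the slide does not discuss; `rescale` is a name chosen for this example.

```python
import numpy as np

def rescale(f, K=255):
    """Map f linearly onto [0, K]: f_m = f - min(f), f_s = K * f_m / max(f_m)."""
    f = f.astype(np.float64)
    fm = f - f.min()
    peak = fm.max()
    if peak == 0:                 # constant image: avoid division by zero
        return np.zeros_like(fm)
    return K * fm / peak

a = np.array([[15, 40]], dtype=np.uint8)
b = np.array([[200, 10]], dtype=np.uint8)
diff = a.astype(np.int16) - b.astype(np.int16)   # signed type keeps -185 and 30
print(rescale(diff).round().astype(np.uint8))    # [[  0 255]]
```

Note that the subtraction is done in a signed type; subtracting the `uint8` arrays directly would wrap around instead of producing negative values.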


Note on data precision

Data precision problem
Especially when performing division, you might want better precision than is available with integer operations.

Data precision solution

1. Convert the data to floating-point
2. Perform arithmetic
3. Convert the data back to integer
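The three steps above can be sketched with NumPy as follows. The rounding and clipping in the last step are choices made for this example (the slide only says to convert back), and the pixel values are arbitrary.

```python
import numpy as np

f = np.array([10, 200, 255], dtype=np.uint8)
g = np.array([4, 3, 2], dtype=np.uint8)

integer_result = f // g                               # integer division truncates
v = f.astype(np.float64) / g.astype(np.float64)       # steps 1 and 2
back = np.clip(np.rint(v), 0, 255).astype(np.uint8)   # step 3, with rounding

print(integer_result.tolist())  # [2, 66, 127]
print(back.tolist())            # [2, 67, 128]
```

`np.rint` rounds halves to the nearest even integer, which is why 2.5 becomes 2 while 127.5 becomes 128.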


Sets

Definition
A set is a collection of elements. It is specified either by listing all of its elements or by stating the property that characterizes them, e.g.

$$A = \{1, 2, 4, 7, 10, 52\}$$

$$A = \{w \mid w = 2k + 1\}$$

Set membership
If an element a is in set A, then we write

$$a \in A$$

Similarly, if a is not in set A, then we write

$$a \notin A$$

The empty set is denoted by the symbol $\emptyset$.




Set operations

Union and intersection
The union of sets A and B is

$$W = A \cup B = \{w \mid w \in A \text{ or } w \in B\}$$

The intersection of sets A and B is

$$W = A \cap B = \{w \mid w \in A \text{ and } w \in B\}$$

Complement and difference
The complement of a set A is

$$A^C = \{w \mid w \notin A\}$$

The difference between two sets A and B is

$$A - B = \{w \mid w \in A,\ w \notin B\} = A \cap B^C$$




Logical operations


Spatial operations

Spatial (image space) operations are divided into three kinds:

- Single pixel
- Neighborhood
- Geometric spatial transforms

Single pixel operations
Also known as pixel-by-pixel or pixel-wise operations, they are performed on every pixel, transforming it according to a function

$$O(x, y) = T(I(x, y))$$

where $O(x, y)$ is the new intensity and $I(x, y)$ the original intensity value.
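A classic single pixel operation is the image negative. The sketch below assumes an 8-bit image and uses T(z) = 255 − z purely as an example of a point transform.

```python
import numpy as np

def negative(img):
    """Point operation T(z) = 255 - z applied to every pixel of an 8-bit image."""
    return 255 - img

img = np.array([[0, 64],
                [128, 255]], dtype=np.uint8)
print(negative(img).tolist())  # [[255, 191], [127, 0]]
```

The output at each position depends only on the input at the same position, which is what distinguishes this class from neighborhood operations.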




Spatial operations

Neighborhood operations
Also known as local operations, they are performed on a “sliding window” (i.e. an area that is centered on one pixel at a time)

$$O(x, y) = T(W(x, y))$$

where $W(x, y)$ is the set of pixel intensities in the m × n window centered on pixel $(x, y)$, m is the window width and n is the window height.
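A local mean (box) filter is perhaps the simplest neighborhood operation: T is the average of the window. The sketch below assumes odd window sizes and handles borders by replicating edge pixels, which is one choice among several; the explicit loops favor clarity over speed.

```python
import numpy as np

def mean_filter(img, m=3, n=3):
    """Replace each pixel by the mean of the m x n window centered on it.

    m is the window width and n the window height (both assumed odd).
    Borders are handled by replicating the edge pixels.
    """
    pad_x, pad_y = m // 2, n // 2
    padded = np.pad(img.astype(np.float64),
                    ((pad_y, pad_y), (pad_x, pad_x)), mode="edge")
    out = np.empty(img.shape, dtype=np.float64)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            out[y, x] = padded[y:y + n, x:x + m].mean()
    return out

img = np.zeros((5, 5))
img[2, 2] = 9.0                 # single bright pixel
print(mean_filter(img)[2, 2])   # 1.0
```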


Neighborhood operation example


Spatial operations

Geometric spatial transformations
Geometric transformations consist of two steps

1. Spatial transformation of coordinates
2. Intensity interpolation

Transformation of coordinates
The transformation of coordinates can be expressed as

$$(x, y) = T\{(v, w)\}$$

where $(v, w)$ are the original coordinates and $(x, y)$ the transformed ones.

Example: $(x, y) = \left( \frac{v}{2}, \frac{w}{2} \right)$ down-scales the image by a factor of 2.




Geometric spatial transformations

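Affine transform
The affine transform is a kind of “generic” transform that can perform the following operations at the same time

- Translation
- Rotation
- Scaling
- Shearing

The affine transform has the following form

$$\begin{bmatrix} x & y & 1 \end{bmatrix} = \begin{bmatrix} v & w & 1 \end{bmatrix} T = \begin{bmatrix} v & w & 1 \end{bmatrix} \begin{bmatrix} t_{1,1} & t_{1,2} & 0 \\ t_{2,1} & t_{2,2} & 0 \\ t_{3,1} & t_{3,2} & 1 \end{bmatrix}$$

where $x = v\,t_{1,1} + w\,t_{2,1} + t_{3,1}$ and $y = v\,t_{1,2} + w\,t_{2,2} + t_{3,2}$.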
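The row-vector form of the affine transform maps directly onto a matrix product. The sketch below applies it to a list of points; the scaling and translation matrices are arbitrary examples, and `affine` is a name chosen for the illustration. With row vectors, composing transforms as `A @ B` applies A first and then B.

```python
import numpy as np

def affine(points_vw, T):
    """Apply [x y 1] = [v w 1] T to an array of (v, w) points."""
    pts = np.asarray(points_vw, dtype=np.float64)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    return (homogeneous @ T)[:, :2]

scale = np.array([[2.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0],
                  [0.0, 0.0, 1.0]])
translate = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [5.0, -1.0, 1.0]])

T = scale @ translate   # scale by 2 first, then shift by (5, -1)
print(affine([(1, 1), (0, 3)], T).tolist())  # [[7.0, 1.0], [5.0, 5.0]]
```

To transform a whole image, these coordinates would then be combined with an intensity interpolation step.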




Affine transform usage


Image registration

Definition
Image registration is the process of computing the transformation (usually affine) that relates two images of the same scene. It is used in panorama stitching, super-resolution, etc.


Vector and matrix operations

A pixel with multiple values is a vector
Imagine a pixel with red, green and blue values: we can write it as

$$\mathbf{z} = \begin{bmatrix} r \\ g \\ b \end{bmatrix} = \begin{bmatrix} z_1 \\ z_2 \\ z_3 \end{bmatrix}$$

An image with vector valued pixels is a 3D matrix
If our image has width M and height N, then we can write it as an M × N × 3 matrix.

Distance not only in image space
If we want, we can also compute distances in color space, or in a mix of image and color space. E.g. the generic formula for the Euclidean distance (vector norm) is

$$\|\mathbf{b} - \mathbf{a}\| = D(\mathbf{a}, \mathbf{b}) = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + \cdots + (a_n - b_n)^2}$$







Linear transformations
When using matrix notation it is very easy to express linear transformations

$$\mathbf{w} = A(\mathbf{b} - \mathbf{a})$$

where A has size m × n, a and b are column vectors of size n × 1, and w is a column vector of size m × 1.

Image as a vector
Another useful representation of an image is that of an MN × 1 vector (assuming one color channel), where each row of the image is a group of N elements of the vector. Many linear transformations can then be expressed as

$$\mathbf{g} = H\mathbf{f} + \mathbf{n}$$

where H has size MN × MN and represents the linear process, f is an MN × 1 vector representing the input image, n is an MN × 1 vector representing noise and g is the output image in vector form.





Space transforms

Going into outer space...
There are many different spaces in which image analysis can be performed. The best known is the Fourier space, which lies in the frequency domain instead of the spatial domain.

Forward T(u, v) transforms
A particular class of 2D linear transforms, denoted T(u, v), can be expressed as

$$T(u, v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, r(x, y, u, v)$$

where $f(x, y)$ is the input image, $r(x, y, u, v)$ is called the forward transformation kernel and $0 \le u \le M - 1$, $0 \le v \le N - 1$. u, v are called the transform variables.


Space transforms

Inverse T(u, v) transforms

$$f(x, y) = \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} T(u, v)\, s(x, y, u, v)$$

where $s(x, y, u, v)$ is called the inverse transformation kernel and $0 \le x \le M - 1$, $0 \le y \le N - 1$.


Separable and symmetric transforms

Separable kernel
A kernel is said to be separable if

$$r(x, y, u, v) = r_1(x, u)\, r_2(y, v)$$

Symmetric kernel
A separable kernel is said to be symmetric if

$$r(x, y, u, v) = r_1(x, u)\, r_1(y, v)$$


Fourier Transform

Forward transform

$$r(x, y, u, v) = e^{-i 2\pi \left( \frac{ux}{M} + \frac{vy}{N} \right)}$$

Reverse transform

$$s(x, y, u, v) = \frac{1}{MN}\, e^{i 2\pi \left( \frac{ux}{M} + \frac{vy}{N} \right)}$$

i or j
The book (written by an engineer) uses the letter j to indicate $\sqrt{-1}$, but I will use the letter i, because it is traditional in mathematics.

Other transforms
Many other transforms can be expressed in this form, such as the Walsh, Hadamard, Haar and discrete cosine transforms.
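The kernel formulation can be checked numerically against a library FFT. The sketch below builds the Fourier forward kernel explicitly as a 4D array (practical only for tiny images), applies the generic forward and inverse sums with `np.einsum`, and compares with `np.fft.fft2`, which uses the same sign and normalization convention. The image size and content are arbitrary.

```python
import numpy as np

def forward_kernel(M, N):
    """r(x, y, u, v) = exp(-i 2 pi (u x / M + v y / N)), indexed [x, y, u, v]."""
    x = np.arange(M).reshape(M, 1, 1, 1)
    y = np.arange(N).reshape(1, N, 1, 1)
    u = np.arange(M).reshape(1, 1, M, 1)
    v = np.arange(N).reshape(1, 1, 1, N)
    return np.exp(-2j * np.pi * (u * x / M + v * y / N))

M, N = 4, 5
f = np.random.default_rng(0).random((M, N))
r = forward_kernel(M, N)
s = np.conj(r) / (M * N)                   # inverse kernel

T = np.einsum("xy,xyuv->uv", f, r)         # T(u, v) = sum over x, y of f * r
f_back = np.einsum("uv,xyuv->xy", T, s)    # f(x, y) = sum over u, v of T * s

print(np.allclose(T, np.fft.fft2(f)))      # True
print(np.allclose(f_back, f))              # True
```

The Fourier kernel is also separable and symmetric, which is what allows fast implementations to process rows and columns independently.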


Probabilistic methods

Probability of an intensity value in an image

$$p(z_k) = \frac{n_k}{MN}$$

where $z_k$ is the k-th intensity level in the image, $n_k$ is the number of pixels with intensity $z_k$ and MN is the total number of pixels.

Sum of probabilities

$$\sum_{k=1}^{L} p(z_k) = 1$$

where L is the number of possible levels.

Image average (mean): first moment

$$m = \sum_{k=1}^{L} z_k\, p(z_k)$$


Probabilistic methods

Image variance: second moment

$$\sigma^2 = \sum_{k=1}^{L} (z_k - m)^2\, p(z_k)$$

n-th statistical moment

$$\mu_n(z) = \sum_{k=1}^{L} (z_k - m)^n\, p(z_k)$$
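These quantities can all be computed from the image histogram. The sketch below uses a tiny image with L = 4 assumed levels; note that it indexes the levels from 0 to L − 1 (the natural choice for array indices) rather than from 1 to L as in the formulas. The helper name `central_moment` is made up for the example.

```python
import numpy as np

img = np.array([[0, 0, 1],
                [1, 1, 3]], dtype=np.uint8)
L = 4                                        # assumed number of levels: 0..3

z = np.arange(L)                             # intensity levels z_k
n = np.bincount(img.ravel(), minlength=L)    # n_k: pixel count per level
p = n / img.size                             # p(z_k) = n_k / (M N)

m = np.sum(z * p)                            # mean (first moment)
var = np.sum((z - m) ** 2 * p)               # variance (second moment about m)

def central_moment(order):
    """n-th statistical moment about the mean."""
    return np.sum((z - m) ** order * p)

print(p.sum(), m, var)                       # all three are 1 up to rounding
```

The histogram-based mean and variance agree with `img.mean()` and `img.var()` computed directly on the pixels.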

