Advanced Artificial Intelligence Creative Design | Chung-Ang University | Narration: Prof. Jaesung Lee
Digital Image Processing
Presenter: Nguyen The Vi
Good afternoon, Professor and everyone. Today I am delighted to be here to talk about Digital Image Processing (DIP).
A digital image is a representation of a two-dimensional image as a finite set of digital values, called picture elements or pixels. Pixel values typically represent gray levels, colors, heights, opacities, etc. We should also remember that digitization implies that a digital image is an approximation of a real scene.
Advanced AI | Chung-Ang University | Presenter: Nguyen The Vi
What is a digital image?
Image processing is a method to convert an image into digital form and perform some operations on it, in order to get an enhanced image or to extract some useful information from it. It is a type of signal processing in which the input is an image, such as a video frame or photograph, and the output may be an image or characteristics associated with that image.
What is digital image processing?
Digital image processing focuses on two major tasks: improvement of pictorial information for human interpretation, and processing of image data for storage, transmission, and representation for autonomous machine perception.
What is digital image processing?
Among the most common uses of digital image processing techniques are improving image quality, removing noise, etc. Major uses of imaging based on X-rays include medicine and astronomical observations. X-rays are among the oldest sources of radiation used for imaging. The best-known use of X-rays is medical diagnostics, but they are also used extensively in industry and other areas, such as astronomy.
Examples
Another major area of visual processing is remote sensing, which usually involves several bands in the visible and infrared regions of the spectrum.
Examples
Another application is artistic effects in movies, which are used to make images more visually appealing and to add special effects to create composite images.
Examples
Other applications of digital image processing in the visual spectrum include automated counting and, in law enforcement, the reading of serial numbers for the purpose of tracking and identifying bills.
Examples
Another application of DIP is face recognition and gesture recognition.
Examples
Now we will discuss the key stages in DIP.
Key stages in DIP
The first stage is image acquisition, which could be as simple as being given an image that is already in digital form.
Key stages in DIP
The second stage is Image Enhancement. This is the process of manipulating an image so that the result is more suitable than the original for a specific application. The word specific is important here, because it establishes at the outset that enhancement techniques are problem oriented. Thus, for example, a method that is quite useful for enhancing X-ray images may not be the best approach for enhancing satellite images taken in the infrared band of the electromagnetic spectrum.
Key stages in DIP
The objective of image enhancement is to process the image (e.g. contrast improvement, image sharpening, etc.) so that it is better suited for further processing or analysis. Image enhancement methods are based on subjective image quality criteria, meaning that no objective mathematical criteria are used for optimizing processing results.
Image enhancement
There are several methods for solving the enhancement problem, including point processing, spatial filtering, and image colouring.
Image enhancement
Contrast enhancements improve the perceptibility of objects in the scene by enhancing the brightness difference between objects and their backgrounds. Contrast enhancements are typically performed as a contrast stretch followed by a tonal enhancement, although both could be performed in one step. Most contrast enhancement methods make use of the gray-level histogram.
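A simple contrast stretch can be sketched as follows. This is an illustrative NumPy helper, not from the slides; the function name and the output range [0, 255] are assumptions:

```python
import numpy as np

def contrast_stretch(image, out_min=0, out_max=255):
    """Linearly map the image's gray-level range onto [out_min, out_max]."""
    lo, hi = float(image.min()), float(image.max())
    if hi == lo:                      # flat image: nothing to stretch
        return np.full_like(image, out_min)
    stretched = (image.astype(float) - lo) / (hi - lo)
    return np.round(stretched * (out_max - out_min) + out_min).astype(np.uint8)

# A low-contrast patch whose gray levels span only 100..130
dim = np.array([[100, 110], [120, 130]], dtype=np.uint8)
bright = contrast_stretch(dim)        # now spans the full 0..255 range
```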
Image enhancement
Most contrast enhancement methods make use of the gray-level histogram, created by counting the number of times each gray-level value occurs in the image, then dividing by the total number of pixels in the image to create a distribution of the percentage of each gray level in the image. The gray-level histogram describes the statistical distribution of the gray levels in the image but contains no spatial information about the image.
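The normalized gray-level histogram described above can be computed in a few lines. This is a sketch using NumPy; the helper name is illustrative:

```python
import numpy as np

def gray_level_histogram(image, levels=256):
    """Count how often each gray level occurs, then divide by the total
    number of pixels to get the distribution of gray-level percentages."""
    hist = np.bincount(image.ravel(), minlength=levels).astype(float)
    return hist / image.size

# A tiny 2x4 "image" with gray levels 0..3
img = np.array([[0, 0, 1, 2],
                [1, 1, 3, 0]], dtype=np.uint8)
h = gray_level_histogram(img, levels=4)
# h sums to 1; note that h carries no spatial information about the image
```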
Image enhancement
Spatial filtering refers to image operators that change the gray value at any pixel (x,y) depending on the pixels in a square neighborhood centered at (x,y), using a fixed matrix of the same size. The matrix is called a filter, mask, kernel, or window.
Spatial filtering
The concept of filtering has its roots in the use of the Fourier transform for signal processing in the so-called frequency domain. The term spatial filtering refers to filtering operations that are performed directly on the pixels of an image. The process consists simply of moving the filter mask from point to point in the image.
Spatial filtering
The mechanism of spatial filtering consists of moving the filter mask from pixel to pixel in an image. At each pixel (x,y), the response of the filter is calculated using a predefined relationship (linear or nonlinear).
Spatial filtering
We consider linear spatial filtering, which we call convolution. It is the process that consists of moving the filter mask from pixel to pixel in an image. At each pixel (x,y), the response is given by a sum of products of the filter coefficients and the corresponding image pixels in the area spanned by the filter mask.
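The sum-of-products mechanism can be sketched directly. This is an illustrative NumPy implementation; zero padding at the image borders is an assumption the slides do not specify:

```python
import numpy as np

def filter2d(image, mask):
    """Slide the mask over the image; at each pixel, the response is the
    sum of products of the mask coefficients and the pixels under the mask.
    Borders are handled by zero padding."""
    m, n = mask.shape
    pad_y, pad_x = m // 2, n // 2
    padded = np.pad(image.astype(float), ((pad_y, pad_y), (pad_x, pad_x)))
    out = np.zeros(image.shape, dtype=float)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            region = padded[y:y + m, x:x + n]
            out[y, x] = np.sum(region * mask)
    return out

avg = np.ones((3, 3)) / 9.0            # 3x3 averaging (smoothing) mask
img = np.zeros((5, 5)); img[2, 2] = 9.0  # a single bright pixel
smoothed = filter2d(img, avg)          # spreads it over a 3x3 area
```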
Spatial filtering
Next, we consider nonlinear spatial filtering. The operation also consists of moving the filter mask from pixel to pixel in an image. The filtering operation is based conditionally on the values of the pixels in the neighborhood, and it does not explicitly use coefficients in the sum-of-products manner. For example, noise reduction can be achieved effectively with a nonlinear filter whose basic function is to compute the median gray-level value in the neighborhood in which the filter is located.
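A median filter along these lines might look like this. It is a sketch; handling edges by replicating the border pixels is an assumption:

```python
import numpy as np

def median_filter(image, size=3):
    """Replace each pixel with the median gray level of its size x size
    neighbourhood (edge pixels are handled by border replication)."""
    pad = size // 2
    padded = np.pad(image, pad, mode='edge')
    out = np.empty_like(image)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            out[y, x] = np.median(padded[y:y + size, x:x + size])
    return out

# A flat gray patch corrupted by one "salt" pixel
img = np.full((5, 5), 10, dtype=np.uint8)
img[2, 2] = 255
clean = median_filter(img)   # the outlier is replaced by the median, 10
```

Unlike the averaging mask above, the median does not smear the outlier across its neighbours; it removes it outright, which is why median filtering works so well on impulse noise.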
Spatial filtering
The third stage is Image Restoration, which is the operation of taking a corrupt/noisy image and estimating the clean, original image. Corruption may come in many forms, such as motion blur, noise, and camera misfocus. Image restoration is performed by reversing the process that blurred the image; this is done by imaging a point source and using the point-source image, called the Point Spread Function (PSF), to restore the image information lost to the blurring process.
Image Restoration
Restoration models the degradation and applies the inverse process in order to recover the original image. The principal goal of restoration techniques is to improve an image in some predefined sense. Although there are areas of overlap, image enhancement is largely a subjective process, while restoration is for the most part an objective process.
Image Restoration
The problem is how to estimate the degradation function. This problem can be tackled by building a mathematical model of the degradation, as shown in the figure above, and reproducing the degradation process on a known image. In the degradation model for blurring an image, the image is blurred using different kinds of filters and an additive noise. The image can be degraded using salt-and-pepper noise and Gaussian noise.
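The two noise models mentioned can be sketched as follows. These are illustrative NumPy helpers; the noise parameters (sigma, amount) are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(image, sigma=10.0):
    """Additive zero-mean Gaussian noise, clipped back to [0, 255]."""
    noisy = image.astype(float) + rng.normal(0.0, sigma, image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_salt_and_pepper(image, amount=0.1):
    """Flip a random fraction of pixels to white (salt) or black (pepper)."""
    noisy = image.copy()
    coins = rng.random(image.shape)
    noisy[coins < amount / 2] = 0            # pepper
    noisy[coins > 1 - amount / 2] = 255      # salt
    return noisy

img = np.full((64, 64), 128, dtype=np.uint8)   # a known, flat test image
g = add_gaussian_noise(img)
sp = add_salt_and_pepper(img)
```

Degrading a known image like this lets you compare a restoration filter's output directly against the ground truth.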
Image Restoration
Restoration is obtained by processing the degraded image with restoration filters. In this process, noise and blur are removed and we obtain an estimate of the original image.
Image Restoration
The fourth stage is Morphological Processing, which provides tools for extracting image components that are useful in the representation and description of shape. Morphological image processing is a collection of non-linear operations related to the shape or morphology of features in an image.
Key stages in DIP
Morphological techniques probe an image with a small shape or template called a structuring element. The structuring element is positioned at all possible locations in the image and compared with the corresponding neighbourhood of pixels. Some operations test whether the element "fits" within the neighbourhood, while others test whether it "hits" or intersects the neighbourhood. A morphological operation on a binary image creates a new binary image in which a pixel has a non-zero value only if the test is successful at that location in the input image.
Morphological image processing
When a structuring element is placed in a binary image, each of its pixels is associated with the corresponding pixel of the neighbourhood under the structuring element. The structuring element is said to fit the image if, for each of its pixels set to 1, the corresponding image pixel is also 1. Similarly, a structuring element is said to hit, or intersect, an image if, for at least one of its pixels set to 1, the corresponding image pixel is also 1. Zero-valued pixels of the structuring element are ignored, i.e. they indicate points where the corresponding image value is irrelevant.
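The "fit" test can be turned into a small binary erosion routine. This is a sketch; zero padding outside the image is an assumption:

```python
import numpy as np

def erode(image, se):
    """Binary erosion: the output pixel is 1 only where the structuring
    element 'fits', i.e. every 1-pixel of the SE lies over a 1 in the image."""
    m, n = se.shape
    pad_y, pad_x = m // 2, n // 2
    padded = np.pad(image, ((pad_y, pad_y), (pad_x, pad_x)))
    out = np.zeros_like(image)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            region = padded[y:y + m, x:x + n]
            if np.all(region[se == 1] == 1):   # the "fit" test
                out[y, x] = 1
    return out

se = np.ones((3, 3), dtype=int)          # 3x3 square structuring element
img = np.zeros((5, 5), dtype=int)
img[1:4, 1:4] = 1                        # a 3x3 white square
eroded = erode(img, se)                  # shrinks to the single centre pixel
```

Replacing `np.all` with `np.any` in the test would give the "hit" condition, i.e. dilation-style behaviour.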
Morphological image processing
The fifth stage is Segmentation, whose procedures partition an image into its constituent parts or objects. In this step, we segment the image, separating the background from the foreground.
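A minimal way to separate background from foreground is global thresholding. This is an illustrative sketch; the threshold value is an assumption:

```python
import numpy as np

def threshold_segment(image, t):
    """Label each pixel foreground (1) or background (0) by comparing
    its gray level with the threshold t."""
    return (image > t).astype(np.uint8)

# Dark background (level 20) with a brighter object (level 200) in the middle
img = np.full((6, 6), 20, dtype=np.uint8)
img[2:4, 2:4] = 200
mask = threshold_segment(img, t=100)   # 1 exactly where the object is
```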
Segmentation
Let’s understand image segmentation using a simple example. Consider the image above on the left-hand side. There’s only one object here – a dog. We can build a straightforward cat-dog classifier model and predict that there’s a dog in the given image. But what if we have both a cat and a dog in a single image? We can train a multi-label classifier in that instance. Now, there’s another caveat – we won’t know the location of either animal/object in the image.
Segmentation
We can divide or partition the image into various parts called segments. It’s not a great idea to process the entire image at the same time as there will be regions in the image which do not contain any information. By dividing the image into segments, we can make use of the important segments for processing the image. That, in a nutshell, is how image segmentation works. An image is a collection or set of different pixels. We group together the pixels that have similar attributes using image segmentation. Take a moment to go through the below visual (it’ll give you a practical idea of image segmentation).
So how does image segmentation work?
We can broadly divide image segmentation techniques into two types. Consider the above images. Can you identify the difference between these two? Both the images are using image segmentation to identify and locate the people present. In image 1, every pixel belongs to a particular class (either background or person). Also, all the pixels belonging to a particular class are represented by the same color (background as black and person as pink). This is an example of semantic segmentation.
The Different Types of Image Segmentation
Image 2 has also assigned a particular class to each pixel of the image. However, different objects of the same class have different colors (Person 1 as red, Person 2 as green, background as black, etc.). This is an example of instance segmentation. Let me quickly summarize what we’ve learned: if there are 5 people in an image, semantic segmentation will classify all the people as a single class. Instance segmentation, on the other hand, will identify each of these people individually.
The Different Types of Image Segmentation
The sixth stage is Representation and Description, which almost always follows the output of a segmentation stage. That output usually is raw pixel data, constituting either the boundary of a region (i.e., the set of pixels separating one image region from another) or all the points in the region itself.
Representation and Description
The result of segmentation is a set of regions. These regions then have to be represented and described. There are two main ways of representing a region: by its external characteristics (its boundary), focusing on shape, or by its internal characteristics (its internal pixels), focusing on color, texture, etc. The next step is description. For example, a region may be represented by its boundary, and its boundary described by features such as length and regularity. Features should be insensitive to translation, rotation, and scaling. Boundary and regional descriptors are often used together.
Representation and Description
In order to represent a boundary, it is useful to compact the raw data (the list of boundary pixels). A chain code is a list of segments with defined length and direction; there are 4-directional and 8-directional chain codes. It may be useful to downsample the data before computing the chain code, to reduce the code dimension and to remove small detail along the boundary.
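A 4-directional chain code can be computed from an ordered list of boundary pixels as follows. This is a sketch; the direction numbering (0 = right, 1 = up, 2 = left, 3 = down) is one common convention:

```python
# 4-directional chain code: 0 = right, 1 = up, 2 = left, 3 = down,
# expressed as (row delta, column delta) moves between adjacent pixels
DIRECTIONS = {(0, 1): 0, (-1, 0): 1, (0, -1): 2, (1, 0): 3}

def chain_code(boundary):
    """Encode an ordered list of boundary pixels (row, col) as the
    sequence of 4-directional moves between consecutive pixels."""
    code = []
    for (r0, c0), (r1, c1) in zip(boundary, boundary[1:]):
        code.append(DIRECTIONS[(r1 - r0, c1 - c0)])
    return code

# Clockwise walk around a unit square of pixels, back to the start
square = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
code = chain_code(square)   # [0, 3, 2, 1]
```

Note how the code compacts the boundary: four pixel coordinates become four single digits, and the representation is translation-invariant.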
Representation and Description
The seventh stage is Recognition, which is the process that assigns a label (e.g., "vehicle") to an object based on its descriptors.
Key stages in DIP
Object recognition is a general term describing a collection of related computer vision tasks that involve identifying objects in digital photographs. Image classification involves predicting the class of one object in an image. Object localization refers to identifying the location of one or more objects in an image and drawing a bounding box around their extent. Object detection combines these two tasks, localizing and classifying one or more objects in an image. When a user or practitioner refers to "object recognition", they often mean "object detection".
Object recognition
As such, we can distinguish between these three computer vision tasks: Image Classification (predict the type or class of an object in an image), Object Localization (locate the presence of objects in an image and indicate their location with a bounding box), and Object Detection (locate the presence of objects with bounding boxes and predict the types or classes of the located objects in an image).
Object recognition
The R-CNN family of methods refers to R-CNN, which may stand for "Regions with CNN Features" or "Region-Based Convolutional Neural Network," developed by Ross Girshick et al. It includes the techniques R-CNN, Fast R-CNN, and Faster R-CNN, designed and demonstrated for object localization and object recognition. Let's take a closer look at the highlights of each of these techniques in turn.
Object recognition
R-CNN was described in the 2014 paper by Ross Girshick et al. from UC Berkeley titled "Rich feature hierarchies for accurate object detection and semantic segmentation." It may have been one of the first large and successful applications of convolutional neural networks to the problem of object localization, detection, and segmentation. The approach was demonstrated on benchmark datasets, achieving then state-of-the-art results on the VOC-2012 dataset and the 200-class ILSVRC-2013 object detection dataset.
Object recognition
Given the great success of R-CNN, Ross Girshick, then at Microsoft Research, proposed an extension to address its speed issues in a 2015 paper titled "Fast R-CNN." Fast R-CNN is proposed as a single model, instead of a pipeline, that learns and outputs regions and classifications directly. The architecture of the model takes the photograph and a set of region proposals as input, which are passed through a deep convolutional neural network. A pre-trained CNN, such as VGG-16, is used for feature extraction. The end of the deep CNN is a custom layer called the Region of Interest Pooling layer, or RoI Pooling, that extracts features specific to a given input candidate region.
Object recognition
The output of the CNN is then interpreted by a fully connected layer, and the model bifurcates into two outputs: one for the class prediction via a softmax layer, and another with a linear output for the bounding box. This process is repeated multiple times for each region of interest in a given image.
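Conceptually, the bifurcated head can be sketched in plain NumPy. This is only an illustration of the two-output idea, not the actual Fast R-CNN code; the feature dimension, class count, and weight matrices are all made up:

```python
import numpy as np

def two_head_output(features, w_cls, w_box):
    """Conceptual sketch of Fast R-CNN's two heads: one shared feature
    vector feeds a softmax over classes and a linear bounding-box
    regressor (x, y, w, h). Weights here are random, illustrative only."""
    logits = features @ w_cls
    exp = np.exp(logits - logits.max())
    class_probs = exp / exp.sum()          # softmax classification head
    box = features @ w_box                 # linear bounding-box head
    return class_probs, box

rng = np.random.default_rng(0)
feat = rng.normal(size=8)                  # a pooled RoI feature vector
probs, box = two_head_output(feat,
                             rng.normal(size=(8, 3)),   # 3 classes
                             rng.normal(size=(8, 4)))   # 4 box coordinates
```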
Object recognition
Faster R-CNN: the model architecture was further improved, for both speed of training and detection, by Shaoqing Ren et al. at Microsoft Research in the 2016 paper titled "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks." The architecture was designed to both propose and refine region proposals as part of the training process, via a component referred to as a Region Proposal Network, or RPN. These regions are then used in concert with a Fast R-CNN model in a single model design. These improvements both reduce the number of region proposals and accelerate the test-time operation of the model to near real-time, with then state-of-the-art performance.
Object recognition
Image Compression, as the name implies, deals with techniques for reducing the storage required to save an image, or the bandwidth required to transmit it. Although storage technology has improved significantly over the past decade, the same cannot be said for transmission capacity.
Key stages in DIP
Color Image Processing is an area that has been gaining in importance because of the significant increase in the use of digital images over the Internet.
Key stages in DIP
Questions and answers