FUSION OF SATELLITE IMAGES
WITH GUIDED FILTER
A PROJECT REPORT
Submitted by
NATHIYA V N
Register No: 13MCO13
in partial fulfillment for the requirement of award of the degree
of
MASTER OF ENGINEERING
in
COMMUNICATION SYSTEMS
Department of Electronics and Communication Engineering
KUMARAGURU COLLEGE OF TECHNOLOGY
(An autonomous institution affiliated to Anna University, Chennai)
COIMBATORE-641049
ANNA UNIVERSITY: CHENNAI 600 025
APRIL 2016
BONAFIDE CERTIFICATE
Certified that this project report titled “FUSION OF SATELLITE IMAGES WITH
GUIDED FILTER” is the bonafide work of NATHIYA.V.N [Reg. No. 13MCO13] who
carried out the research under my supervision. Certified further that, to the best of my knowledge, the work reported herein does not form part of any other project or dissertation on the basis of which a degree or award was conferred on an earlier occasion on this or any other candidate.
The candidate with Register No. 13MCO13 was examined by us in the
project viva-voce examination held on ............................
INTERNAL EXAMINER EXTERNAL EXAMINER
SIGNATURE
Ms. S.NAGARATHINAM
PROJECT SUPERVISOR
Department of ECE
Kumaraguru College of Technology
Coimbatore-641 049
SIGNATURE
Dr. A.VASUKI
HEAD OF THE DEPARTMENT
Department of ECE
Kumaraguru College of Technology
Coimbatore-641 049
ACKNOWLEDGEMENT
First, I would like to express my praise and gratitude to the Lord, who has
showered his grace and blessings enabling me to complete this project in an excellent
manner.
I express my sincere thanks to the management of Kumaraguru College of
Technology and Joint Correspondent Shri Shankar Vanavarayar for his kind
support and for providing necessary facilities to carry out the work.
I would like to express my sincere thanks to our beloved Principal
Dr.R.S.Kumar Ph.D., Kumaraguru College of Technology, who encouraged me with
his valuable thoughts.
I would like to thank Dr.A.Vasuki Ph.D., Head of the Department, Electronics
and Communication Engineering, for her kind support and for providing necessary
facilities to carry out the project work.
In particular, I wish to express my everlasting gratitude to the Project Coordinator Dr.M.Alagumeenaakshi Ph.D., Asst. Professor-III, Department of Electronics and Communication Engineering, for her support throughout the course of this project work.
I am greatly privileged to express my heartfelt thanks to my project guide Ms.S.Nagarathinam M.E., (Ph.D)., Asst. Professor-III, Department of Electronics and Communication Engineering, for her expert counselling and guidance, which carried this project to a great deal of success. I also wish to convey my deep sense of gratitude to all teaching and non-teaching staff of the ECE Department for their help and cooperation.
Finally, I thank my parents and my family members for giving me the moral
support and abundant blessings in all of my activities and my dear friends who helped
me to endure my difficult times with their unfailing support and warm wishes.
ABSTRACT
In remote sensing, image fusion is one of the fastest-growing techniques for integrating two images. Remote sensing satellites often provide a panchromatic (PAN) image with high spatial resolution and a multispectral (MS) image with high spectral resolution. Image fusion refers to combining two or more images into a single image that carries more information than the source images. An effective and fast image fusion method is proposed for creating a fused image from multiple images. A multispectral image is bundled with a higher-resolution panchromatic image to produce a pan-sharpened image, in which the colour bands are merged with the high-resolution black-and-white imagery using the guided filter and a weighted averaging fusion technique. The guided filter is one of the recent filtering methods used in image processing. The proposed technique analyses the performance of the guided filter and makes use of spatial consistency for fusion of the base and detail layers. The fused image is enhanced using edge preservation. The proposed method is based on intensity variations in the base and detail layers of the input images. Since the various edge detection methods do not produce good results, PSO segmentation is used for saliency map construction. The source images used are from the QuickBird and Ikonos sensors.
TABLE OF CONTENTS
CHAPTER NO.    TITLE    PAGE NO.
ABSTRACT iv
LIST OF FIGURES vii
LIST OF TABLES viii
LIST OF ABBREVIATIONS ix
1 INTRODUCTION 1
1.1 Overview 1
1.2 Satellite Remote Sensing 1
1.3 Types of Satellite Images 3
1.4 Types of Resolution 4
1.5 Image Fusion 5
1.6 Image Fusion Algorithms 6
1.6.1 Spatial Domain Fusion Techniques 6
1.6.2 Transform Domain Fusion Techniques 9
2 LITERATURE SURVEY 12
3 METHODOLOGY 18
3.1 Introduction 18
3.2 Guided Image Filtering 18
3.3 Smoothing Operators 20
3.3.1 Average Filter 20
3.3.2 Gaussian Filter 21
3.4 Various Image Segmentation Techniques 21
3.4.1 Sobel Operator 22
3.4.2 Robert’s Cross Operator 22
3.4.3 Prewitt’s Operator 23
3.4.4 Canny Operator 24
3.4.5 Laplacian of Gaussian 24
3.4.6 PSO Segmentation 25
3.5 Weighted Averaging 27
3.6 Block Diagram 27
3.6.1 Image Decomposition 28
3.6.2 Weight Map Construction 28
3.6.3 Image Reconstruction 29
3.7 Parameters used for Comparison 29
4 SIMULATION RESULTS 32
4.1 Results 32
4.1.1 Input Images of Ikonos Sensor 32
4.1.2 Input Images of Quickbird Sensor 33
4.1.3 Comparison of Fusion Algorithms 34
4.1.4 Comparison of various Segmentation Techniques 38
4.1.5 Comparison of Smoothing Operators 42
5 CONCLUSION 45
REFERENCES 46
LIST OF PUBLICATIONS 48
LIST OF FIGURES
FIGURE NO.    CAPTION    PAGE NO.
1.1 Spectrum of Objects 2
1.2 Advantage of image fusion 5
1.3 Principal Component Analysis 7
1.4 IHS 8
1.5 Laplacian Pyramid 9
1.6 Discrete Wavelet Transform 11
3.1 Window Choice 20
3.2 Masks used by Sobel Operator 22
3.3 Masks used by Robert Operator 23
3.4 Masks used by Prewitt Operator 23
3.5 Masks used by Canny Operator 24
3.6 Laplacian Filter 25
3.7 Block Diagram 27
4.1 Panchromatic and Multispectral Image of Ikonos Sensor 33
4.2 Panchromatic and Multispectral Image of Quickbird Sensor 34
4.3 Fused result of Weighted Average Fusion 35
4.4 Fused result of Laplacian Pyramid Fusion 35
4.5 Fused result of Principal Component Analysis Fusion 36
4.6 Fused result of Discrete Wavelet Transform Fusion 36
4.7 Fused Images using LoG Operator 38
4.8 Fused Images using Canny Operator 38
4.9 Fused Images using Sobel Operator 39
4.10 Fused Images using Prewitt Operator 39
4.11 Fused Images using Roberts Operator 40
4.12 Fused Images using PSO Segmentation 40
4.13 Fused Images using average smoothing operator 42
4.14 Fused Images using disk smoothing operator 43
4.15 Fused Images using Gaussian smoothing operator 43
LIST OF TABLES
TABLE NO.    CAPTION    PAGE NO.
4.1 Spectral Resolution of Ikonos Sensor 32
4.2 Spectral Resolution of Quickbird Sensor 34
4.3 Performance Metrics of Fusion Algorithms of Input set 1 37
4.4 Performance Metrics of Fusion Algorithms of Input set 2 37
4.5 Performance Metrics of Quickbird image 41
4.6 Performance Metrics of Ikonos image 41
4.7 Comparison Table of Quickbird image 44
4.8 Comparison Table of Ikonos image 44
LIST OF ABBREVIATIONS
RADAR Radio Detection and Ranging
PAN Panchromatic Image
MS Multispectral Image
RGB Red Green Blue
PCA Principal Component Analysis
IHS Intensity Hue Saturation
DCT Discrete Cosine Transform
DWT Discrete Wavelet Transform
MRF Markov Random Field
WLS Weighted Least Square
TV Total Variation
LoG Laplacian of Gaussian
PSO Particle Swarm Optimization
MSE Mean Square Error
PSNR Peak Signal to Noise Ratio
SSIM Structural Similarity Index Measure
NK Normalized Cross Correlation
MD Maximum Difference
NAE Normalized Absolute Error
CHAPTER 1
INTRODUCTION
1.1 OVERVIEW
Remote sensing is the process of acquiring data or information about objects and
substances not in direct contact with the sensor, by gathering its inputs using
electromagnetic radiation or acoustical waves that emanate from the targets of interest.
Remote sensing makes it possible to collect data on dangerous or inaccessible areas.
Remote sensing also replaces costly and slow data collection on the ground, ensuring in
the process that areas or objects are not disturbed. The sun is a source of energy or
radiation, which provides a very convenient source of energy for remote sensing. The
sun's energy is either reflected, as it is for visible wavelengths, or absorbed and then
reemitted, as it is for thermal infrared wavelengths. There are two main types of remote
sensing: Passive remote sensing and Active remote sensing.
Passive remote sensing - Remote sensing systems which measure energy that is naturally available are called passive sensors. The sensor records energy that is reflected (such as visible wavelengths from the sun) or emitted (thermal infrared) from the source. Examples of passive remote sensors include film photography, infrared sensors, and radiometers.
Active remote sensing - Active sensors emit energy in order to scan objects or areas, and then detect and measure the radiation that is reflected or backscattered from the target. RADAR is an example of active remote sensing, where the time delay between emission and return is measured, establishing the location, height, speed and direction of an object.
1.2 SATELLITE REMOTE SENSING
Remote sensing images are acquired by earth observation satellites. These remote
sensing satellites are equipped with sensors looking down to the earth. Orbital platforms
collect and transmit data from different parts of the electromagnetic spectrum, which in
conjunction with larger scale aerial or ground-based sensing and analysis provides
researchers with enough information to monitor trends. Satellite sensors record the
intensity of electromagnetic radiation (sunlight) reflected from the earth at different
wavelengths. Energy that is not reflected by an object is absorbed. Each object has its
own unique 'spectrum' as shown in Fig.1.1.
Fig.1.1 Spectrum of Objects
Remote sensing relies on the fact that particular features of the landscape such as bush,
crop, salt-affected land and water reflect light differently in different wavelengths. Grass
looks green, for example, because it reflects green light and absorbs other visible
wavelengths. This can be seen as a peak in the green band in the reflectance spectrum for
green grass above. The spectrum also shows that grass reflects even more strongly in the
infrared part of the spectrum. This can't be detected by the human eye but it can be
detected by an infrared sensor. Instruments mounted on satellites detect and record the
energy that has been reflected. The detectors are sensitive to particular ranges of
wavelengths, called 'bands'. The satellite systems are characterised by the bands at which
they measure the reflected energy. The Landsat TM satellite, for example, has bands at the blue, green and red wavelengths in the visible part of the spectrum, three bands in the near and mid infrared part of the spectrum, and one band in the thermal infrared part of the spectrum. The satellite detectors measure the
intensity of the reflected energy and record it.
1.3 TYPES OF SATELLITE IMAGES
• Panchromatic image
• Multispectral image
• Hyperspectral image
Panchromatic image
A panchromatic image consists of only one band. Thus, a panchromatic image
may be similarly interpreted as a black-and-white aerial photograph of the area.
Panchromatic image is usually displayed as a gray scale image, i.e. the displayed
brightness of a particular pixel is proportional to the pixel digital number which is related
to the intensity of solar radiation reflected by the targets in the pixel and detected by the
detector. The spectral information or "colour" of the targets is lost. IKONOS Pan, QuickBird Pan, SPOT Pan and LANDSAT ETM+ Pan are examples of panchromatic sensors.
Multispectral image
A multispectral image is one that captures image data at specific frequencies across
the electromagnetic spectrum. A multispectral image consists of several bands of data.
For visual display, each band of the image may be displayed one band at a time as a gray
scale image, or in combination of three bands at a time as a colour composite image. It is
a multilayer image which contains both the brightness and spectral information of the
targets being observed. Multispectral data sets are usually composed of about 5 to 10 bands of relatively large bandwidths (70-400 nm). Landsat TM, MSS, SPOT HRV-XS, Ikonos MS and QuickBird MS are examples of multispectral imaging sensors.
Hyperspectral Image
It acquires images in about a hundred or more contiguous spectral bands. The
precise spectral information contained in a hyperspectral image enables better
characterisation and identification of targets. Hyperspectral images are of high spectral
resolution compared to panchromatic and multispectral images. Hyperspectral data sets
are generally composed of about 100 to 200 spectral bands of relatively narrow
bandwidths (5-10 nm). The Hyperion is an example of a hyperspectral sensor.
Colour Composite Images
In displaying a colour composite image, three primary colours RGB (red, green and
blue) are used. When these three colours are combined in various proportions, they
produce different colours in the visible spectrum. Associating each spectral band (not
necessarily a visible band) to a separate primary colour produces a colour composite
image.
1.4 TYPES OF RESOLUTION
In remote sensing the term resolution is used to represent the resolving power,
which includes not only the capability to identify the presence of two objects, but also
their properties. In qualitative terms, resolution is the amount of detail that can be
observed in an image. Four types of resolutions are defined for the remote sensing
systems.
i. Spatial Resolution
Spatial resolution is a measure of the area or size of the smallest dimension on the Earth’s
surface over which an independent measurement can be made by the sensor. It is
expressed by the size of the pixel on the ground in meters.
ii. Radiometric Resolution
Radiometric Resolution refers to the smallest change in intensity level that can be
detected by the sensing system. The intrinsic radiometric resolution of a sensing system
depends on the signal to noise ratio of the detector.
iii. Spectral resolution
It is used to describe the ability of a sensor to distinguish between wavelength intervals in
the electromagnetic spectrum (bands). The finer the spectral resolution, the narrower the
wavelength range for a particular channel or band.
iv. Temporal resolution
Temporal resolution is a measure of how often data are obtained for the same area. The
temporal resolution specifies the revisiting frequency of a satellite sensor for a specific
location.
1.5 IMAGE FUSION
In remote sensing applications, the increasing availability of space borne sensors
gives a motivation for different image fusion algorithms. Image fusion is an effective
technique to integrate spatial and spectral information of the PAN and MS images.
Through the remote sensing image fusion technique, we can not only overcome the limitation of information obtained from an individual sensor but also achieve a better observation.
Image fusion has been used in many application areas. In remote sensing multi-sensor
fusion is used to achieve high spatial and spectral resolutions by combining images from
two sensors, one of which has high spatial resolution and the other one high spectral
resolution.
Several situations in image processing require high spatial and high spectral
resolution in a single image. Most of the available equipment is not capable of providing
such data convincingly. Image fusion techniques allow the integration of different
information sources. The fused image can have complementary spatial and spectral
resolution characteristics. However, the standard image fusion techniques can distort the
spectral information of the multispectral data while merging. Advantage of image fusion
is given in Fig.1.2.
Fig.1.2 Advantage of image fusion
Image fusion techniques can improve the quality and increase the application of
these data. At the receiver station, the panchromatic image is merged with the
multispectral data to convey more information. The images used in image fusion should
already be registered. Many methods exist to perform image fusion. The very basic one is
the high pass filtering technique. Later techniques are based on Discrete Wavelet
Transform, uniform rational filter bank, and Laplacian pyramid.
A multispectral image can be bundled with a higher-resolution, panchromatic
image. This allows the user to combine the two in a process called pan-sharpening, which
merges the color bands with the high-resolution black-and-white imagery. As a result,
remote sensing satellites often provide panchromatic (PAN) image with high spatial
resolution and multispectral (MS) image with high spectral resolution.
1.6 IMAGE FUSION ALGORITHMS
The objective of image fusion algorithms is to make full use of spatial and spectral
information in the Panchromatic and Multispectral images respectively, in order to reduce
the potential colour distortion and provide clear image information.
There are two major techniques available:
• Spatial domain fusion
• Transform domain fusion
1.6.1 SPATIAL DOMAIN FUSION TECHNIQUES
Simple average
It is a fact that regions of images that are in focus tend to be of higher pixel
intensity. Thus this algorithm is a simple way of obtaining an output image with all
regions in focus. The value of the pixel P (i, j) of each image is taken and added. This
sum is then divided by 2 to obtain the average. The average value is assigned to the corresponding pixel of the output image, as given in equation (1.1). This is repeated for all pixel values.
K(i, j) = [X(i, j) + Y(i, j)] / 2    (1.1)

where X(i, j) and Y(i, j) are the two input images.
Select maximum
The greater the pixel value, the more in focus the image. Thus this algorithm chooses the in-focus regions from each input image by choosing the greatest value for each pixel, resulting in a highly focused output. The value of the pixel P(i, j) of each image is taken and compared with the others, and the greatest pixel value is assigned to the corresponding output pixel:

F(i, j) = max{A(i, j), B(i, j)}    (1.2)

where A and B are the input images and F is the fused image.
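As an illustrative aside (the report's experiments were carried out in MATLAB, though this snippet is not taken from them, and the file names are hypothetical placeholders), the two pixel-wise rules above can be sketched as:

```matlab
% Minimal sketch of the simple average and select-maximum rules.
% A and B are assumed to be registered grayscale images of equal size.
A = im2double(imread('source1.png'));   % placeholder file name
B = im2double(imread('source2.png'));   % placeholder file name

K = (A + B) / 2;    % simple average, Eq. (1.1)
F = max(A, B);      % select maximum, Eq. (1.2), applied per pixel
```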
Principal Component Analysis (PCA)
Principal component analysis is a method in which a number of correlated variables are transformed into a number of uncorrelated variables called principal components. The
information flow diagram of PCA-based image fusion algorithm is shown in Fig.1.3.
Fig 1.3: Principal Component Analysis
I1(x, y) and I2(x, y) are the two input images to be fused. Column vectors are produced from the input image matrices, and the covariance matrix of the two column vectors is computed. The eigenvalues and eigenvectors of the covariance matrix are then computed. The column vector corresponding to the larger eigenvalue is normalized by dividing each element by the mean of the eigenvector. The normalized eigenvector values act as the weight values, which are respectively multiplied with each pixel of the input images. The fused image matrix is the sum of the two scaled matrices.
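A minimal MATLAB sketch of these steps is given below. It is illustrative rather than the report's own code, and it normalizes the eigenvector by its sum (a common variant of the normalization described above) so that the two weights add up to one:

```matlab
% Illustrative PCA fusion sketch; I1 and I2 are assumed registered
% grayscale images, and the file names are placeholders.
I1 = im2double(imread('source1.png'));
I2 = im2double(imread('source2.png'));

C = cov([I1(:), I2(:)]);         % covariance of the two column vectors
[V, D] = eig(C);                 % eigenvectors and eigenvalues of C
[~, idx] = max(diag(D));         % index of the larger eigenvalue
w = V(:, idx) / sum(V(:, idx));  % normalized weights (sum to one)

F = w(1) * I1 + w(2) * I2;       % fused image: sum of the two scaled images
```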
Weighted Average Method
In this method the resultant fused image is obtained by taking the weighted average intensity of corresponding pixels from both input images:

F(i, j) = W1·A(i, j) + W2·B(i, j)    (1.3)

where A(i, j) and B(i, j) are the input images, F(i, j) is the fused image, and W1, W2 are the weight factors.
The IHS-Based fusion
The IHS pan sharpening technique is the oldest known data fusion method and one
of the simplest. In this technique the following steps are performed, as described in Fig.1.4:
1. The low resolution MS imagery is co-registered to the same area as the high resolution PAN imagery and re-sampled to the same resolution as the PAN imagery.
2. The three re-sampled bands of the MS imagery, which represent the RGB space, are
transformed into IHS components.
3. The PAN imagery is histogram matched to the ‘I’ component. This is done in order to
compensate for the spectral differences between the two images, which occurred due to
different sensors or different acquisition dates and angles.
4. The intensity component of MS imagery is replaced by the histogram matched PAN
imagery. The RGB of the new merged MS imagery is obtained by computing a reverse
IHS to RGB transform.
Fig 1.4: IHS
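A hedged MATLAB sketch of these four steps follows. It uses the HSV colour space as a stand-in for a true IHS transform (an assumption made for brevity), assumes the MS image has already been co-registered and re-sampled to the PAN resolution, and uses the toolbox function imhistmatch for the histogram matching of step 3:

```matlab
% IHS-like pan-sharpening sketch (HSV used in place of IHS).
MS  = im2double(imread('ms_resampled.png'));  % RGB MS, re-sampled (step 1)
PAN = im2double(imread('pan.png'));           % co-registered panchromatic

hsv = rgb2hsv(MS);                           % step 2: RGB -> H, S, V
hsv(:,:,3) = imhistmatch(PAN, hsv(:,:,3));   % steps 3-4: match PAN to 'I', replace it
F = hsv2rgb(hsv);                            % reverse transform back to RGB
```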
1.6.2 TRANSFORM DOMAIN FUSION TECHNIQUES
In frequency domain methods, the image is first transformed into the frequency domain by applying DCT- or DWT-based fusion methods, and the image is further enhanced by altering its frequency components.
Laplacian Pyramid Fusion Method
The basic idea behind the Laplacian pyramid is to perform a pyramid decomposition on every source image, then integrate these decompositions to form a composite representation, and finally reconstruct the fused image by performing an inverse pyramid transform. The various steps used in the Laplacian pyramid based fusion method are as follows:
1. The first step is to construct a pyramid for each source image.
2. Then the fusion is implemented at each level of the pyramid using a feature selection
decision method.
3. The feature selection method selects the most significant pattern from the source image
and copies it to the composite pyramid.
4. Finally, fused image is obtained by performing an inverse pyramid transform.
Fig 1.5: Laplacian Pyramid
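A minimal one-level MATLAB sketch of these steps follows; the single pyramid level and the maximum-absolute-value selection rule are illustrative assumptions, not the report's exact settings:

```matlab
% One-level Laplacian pyramid fusion sketch for grayscale doubles A, B.
function F = lp_fuse(A, B)
    % Step 1: build one pyramid level per source image
    A1 = impyramid(A, 'reduce');
    B1 = impyramid(B, 'reduce');
    LA = A - imresize(impyramid(A1, 'expand'), size(A));  % Laplacian level of A
    LB = B - imresize(impyramid(B1, 'expand'), size(B));  % Laplacian level of B
    % Steps 2-3: fuse each level (average the coarse level, keep the
    % larger-magnitude detail coefficient at each pixel)
    base = (A1 + B1) / 2;
    mask = abs(LB) > abs(LA);
    detail = LA;  detail(mask) = LB(mask);
    % Step 4: inverse pyramid transform (expand and add the details back)
    F = imresize(impyramid(base, 'expand'), size(A)) + detail;
end
```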
Discrete Cosine Transform (DCT)
Spatial domain image fusion methods are complicated and time consuming, and are difficult to apply to real-time images. Fusion approaches applied in the DCT domain are very efficient when the source images are coded in Joint Photographic Experts Group (JPEG) format or when the fused image will be saved or transmitted in
JPEG format. To perform the JPEG coding, an image is first subdivided into blocks of
8x8 pixels. The Discrete Cosine Transform (DCT) is then performed on every block. This
generates 64 coefficients which are then quantized to reduce their magnitude. The
coefficients are then reordered into a one-dimensional array in a zigzag manner before
further entropy encoding takes place. The compression is achieved in two stages: the first during quantization and the second during the entropy coding procedure. JPEG
decoding is the reverse process of encoding.
Discrete Wavelet Transform (DWT)
The wavelet transform decomposes the image into low-low, low-high, high-low
and high-high spatial frequency bands at different scales. The LL band contains the
approximation coefficients whereas the other bands contain directional information due
to spatial orientation. LH band contains the horizontal detail coefficients. HL band
contains the vertical detail coefficients. HH contains the diagonal detail coefficients; the higher absolute values of the wavelet coefficients correspond to salient features such as edges or lines. Fig.1.6 shows Discrete Wavelet Transform (DWT) based image fusion. The wavelets-based approach performs the following tasks:
1. It is a multi-scale (multi-resolution) approach well suited to managing the different image resolutions, and is useful in a number of image processing applications including image fusion.
2. The discrete wavelet transform (DWT) allows the image to be decomposed into different kinds of coefficients while preserving the image information.
3. Such coefficients coming from different images can be appropriately combined to obtain new coefficients, so that the information in the original images is collected appropriately.
4. After the coefficients are merged, the final fused image is obtained by applying the inverse discrete wavelet transform (IDWT), where the information in the merged coefficients is also preserved.
Fig 1.6: Discrete Wavelet Transform
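As a sketch of tasks 2–4, a one-level DWT fusion can be written in MATLAB as follows (Wavelet Toolbox assumed; the 'db1' wavelet, the averaging rule for the approximation band, and the maximum-absolute-value rule for the detail bands are illustrative choices, and A and B are assumed registered grayscale doubles):

```matlab
% One-level DWT fusion sketch for registered grayscale doubles A and B.
[cA1, cH1, cV1, cD1] = dwt2(A, 'db1');   % decompose source A
[cA2, cH2, cV2, cD2] = dwt2(B, 'db1');   % decompose source B

pick = @(x, y) x .* (abs(x) >= abs(y)) + y .* (abs(x) < abs(y));  % max-abs rule
cA = (cA1 + cA2) / 2;                    % average the approximation bands

% merge the detail bands and apply the inverse transform (IDWT)
F = idwt2(cA, pick(cH1, cH2), pick(cV1, cV2), pick(cD1, cD2), 'db1');
```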
CHAPTER 2
LITERATURE SURVEY
An image fusion approach based on markov random fields
M. Xu, H. Chen, and P. Varshney
Markov random field (MRF) models are powerful tools to model image
characteristics accurately and have been successfully applied to a large number of image
processing applications. In this paper the problem of fusion of remote sensing images
based on MRF models is investigated. Fusion algorithm under maximum a posteriori
criterion is used. It is applicable to both multi-scale decomposition (MD)- based image
fusion and non-MD-based image fusion. It is provided to demonstrate the fusion
performance improvement.
Here, the image fusion problem is based on a statistical model. This approach is
applicable for both non-MD- and MD-based fusion approaches. When the raw source
images are directly used for fusion without pre-processing, the fused image can also be
modelled as an MRF, and then, the fusion result can be obtained by incorporating a priori
Gibbs distribution of the fused image. Visual inspection and quantitative performance
evaluation both demonstrate that the employment of the MRF model in the fusion
approaches resulted in a better fusion performance than the traditional fusion approaches.
A simple relationship between each source image and the true scene is assumed, i.e., a source image either contributes to the fused image or does not. If this results in a mismatch between the fusion model and the real image data set, the coefficient in the data model is allowed to take any real value, which may increase the accuracy of the fusion algorithm. The noise in the source images is assumed to be Gaussian.
Guided image filtering
K. He, J. Sun, and X. Tang
In this paper, the authors propose a novel explicit image filter called the guided filter.
Derived from a local linear model, the guided filter computes the filtering output by
considering the content of a guidance image, which can be the input image itself or
another different image. The guided filter can be used as an edge-preserving smoothing
operator like the popular bilateral filter, but it has better behaviours near edges. The
guided filter is also a more generic concept beyond smoothing: It can transfer the
structures of the guidance image to the filtering output, enabling new filtering
applications like de-hazing and guided feathering. Moreover, the guided filter naturally
has a fast and non-approximate linear time algorithm, regardless of the kernel size and
the intensity range. Currently, it is one of the fastest edge-preserving filters. Experiments
show that the guided filter is both effective and efficient in a great variety of computer
vision and computer graphics applications, including edge-aware smoothing, detail
enhancement, compression, image matting/feathering, de-hazing, joint up-sampling, etc.
Generalized random walks for fusion of multi-exposure images
R. Shen, I. Cheng, J. Shi, and A. Basu
A single captured image of a real-world scene is usually insufficient to reveal all the details due to under- or over-exposed regions. This problem can be solved by first capturing images of the same scene under different exposure settings and then combining them into a single image using image fusion techniques. The aim is to achieve an
optimal balance between two quality measures, i.e., local contrast and color consistency,
while combining the scene details revealed under different exposures. A generalized
random walks framework is proposed to calculate a globally optimal solution subject to
the two quality measures by probability estimation. Experiments demonstrate that this
algorithm generates high-quality images at low computational cost. Experimental results
demonstrated that this probabilistic fusion produces good results, in which contrast is
enhanced and details are preserved with high computational efficiency. Compared to
other fusion methods this algorithm produces images with comparable or even better
qualities.
Adaptive multi-focus image fusion using a wavelet based statistical sharpness
measure
J. Tian and L. Chen
Multi-focus image fusion combines a set of images that are captured from the same scene but with different focuses to produce a sharper image. Based on the observation that the marginal distribution of the wavelet coefficients differs for images with different focus levels, a new statistical sharpness measure of the degree of image blur is proposed. It is evaluated using a locally adaptive Laplacian mixture model. The proposed sharpness measure is then exploited to perform adaptive image fusion in the wavelet domain. The proposed approach could be further extended to the redundant or complex wavelet domains.
Edge-preserving decompositions for multi-scale tone and detail manipulation
Z. Farbman, R. Fattal, D. Lischinski, and R. Szeliski
The authors note that many recent computational photography techniques decompose an image into a piecewise smooth base layer, containing large scale variations in intensity, and a residual detail layer capturing the smaller scale details in the image. It is important to control the spatial scale of the extracted details, and it is desirable to manipulate details at multiple scales while avoiding visual artifacts. A new way to construct edge-preserving multi-scale image decompositions is given. Current base-detail decomposition
techniques, based on the bilateral filter, are limited in their ability to extract detail at
arbitrary scales. The weighted least squares optimization framework, which is
particularly well suited for progressive coarsening of images and for multi-scale detail
extraction is described. After describing this operator, it is compared with bilateral filter
and other schemes.
Multi-scale contrast manipulation is a valuable digital darkroom technique.
Currently it is possible to sharpen images (which may be viewed as increasing the local
contrast of the finest scale details), as well as to adjust the global contrast. This does not
suffer from some of the drawbacks of bilateral filtering and other previous approaches.
In future work, the smoothness coefficients for the WLS formulation could be enhanced further by improving the ability to preserve edge color. While manually adjusting the saturation alleviates the problem, a more principled solution is needed.
Image fusion: Advances in the state of the art
A. A. Goshtasby and S. Nikolov
The authors describe image fusion as the process of combining information from two or more images of a scene into a single composite image that is more informative and more suitable for visual perception or computer processing. The
objective in image fusion is to reduce uncertainty and minimize redundancy in the output
while maximizing relevant information particular to an application or task. There are
several benefits in using image fusion: wider spatial and temporal coverage, decreased
uncertainty, improved reliability, and increased robustness of system performance. A
single sensor cannot produce a complete representation of a scene. Visible images
provide spectral and spatial details, and if a target has the same color and spatial
characteristics as its background, it cannot be distinguished from the background. If
visible images are fused with thermal images, a target that is warmer or colder than its
background can be easily identified, even when its color and spatial details are similar to
those of its background. Fused images can provide information that sometimes cannot be
observed in the individual input images. Successful image fusion significantly reduces
the amount of data to be viewed or processed without significantly reducing the amount
of relevant information.
Multifocus image fusion using the non-subsampled contourlet transform
Q. Zhang and B. Guo
A novel image fusion algorithm based on the non-subsampled contourlet transform
(NSCT) is proposed in this paper, which aims at solving the fusion problem of multifocus
images. Based on the directional vector normal, a ‘selecting’ scheme combined with the
‘averaging’ scheme is presented for the low-pass sub-band coefficients. Based on the
directional band limited contrast and the directional vector standard deviation, a selection
principle is put forward for the band-pass directional sub-band coefficients. It not only
extracts more important visual information from source images, but also effectively
avoids the introduction of artificial information. It significantly outperforms the
traditional discrete wavelet transform-based and the discrete wavelet frame transform-
based image fusion methods in terms of both visual quality and objective evaluation,
especially when the source images are not perfectly registered. The NSCT is more
suitable for image fusion because of many advantages such as multi-scale, localization,
multi-direction, and shift-invariance. Several sets of multi-focus images have been used
to evaluate the performance of the proposed fusion algorithm. The NSCT-based fusion algorithm performs well in some cases. However, the improved performance is at the cost
of increasing computational complexity and memory during the fusion process. In some
cases such as image coding and image compression, where redundancy is a major issue,
the higher redundancy of the NSCT may also limit its applications.
A total variation-based algorithm for pixel level image fusion
M. Kumar and S. Dass
In this paper, a total variation (TV) based approach is proposed for pixel-level
fusion to fuse images acquired using multiple sensors. Fusion is an inverse problem and a
locally affine model is used as the forward model. A TV semi norm based approach in
conjunction with principal component analysis is used iteratively to estimate the fused
image. The feasibility of the algorithm is demonstrated on images from computed
tomography and magnetic resonance imaging as well as visible-band and infrared
sensors. It has been applied to several different types of datasets; future work should focus on analysing the algorithm's performance with additional datasets.
Image Fusion with Guided Filtering
Shutao Li, Xudong Kang, Jianwen Hu
A fast and effective image fusion method is proposed for creating a highly
informative fused image through merging multiple images. The proposed method is
based on a two-scale decomposition of an image into a base layer containing large scale
variations in intensity, and a detail layer capturing small scale details. A novel guided
filtering-based weighted average technique is proposed to make full use of spatial
consistency for fusion of the base and detail layers. Experimental results demonstrate that
the proposed method can obtain state-of-the-art performance for fusion of multispectral,
multifocus, multimodal, and multiexposure images. The proposed method utilizes the
average filter to get the two-scale representations, which is simple and effective. The
guided filter is used in a novel way to make full use of the strong correlations between
neighborhood pixels for weight optimization. Experiments show that the proposed
method can well preserve the original and complementary information of multiple input
images.
CHAPTER 3
METHODOLOGY
3.1 INTRODUCTION
In this project an explicit image filter called the guided filter is used. The guided filter computes the filtering output by considering the content of a guidance image, which can be the input image itself or another different image. The guided filter can be used as an edge preserving smoothing operator like the popular bilateral filter, but has better behaviour near edges. The guided filter is also a more generic concept beyond smoothing: it can transfer the structures of the guidance image to the filtering output,
enabling new filtering applications like de-hazing and guided feathering. Moreover, the
guided filter naturally has a fast and non-approximate linear time algorithm, regardless of
the kernel size and the intensity range.
Currently it is one of the fastest edge preserving filters. Experiments show that the
guided filter is both effective and efficient in a great variety of computer vision and
computer graphics applications including edge aware smoothing, detail enhancement,
HDR compression, image matting /feathering, de-hazing, joint up-sampling, etc.
3.2 GUIDED IMAGE FILTERING
The guided filter assumes that the filtering output O is a linear transformation of the guidance image I in a local window ω_k centered at pixel k:
O_i = a_k I_i + b_k,  ∀ i ∈ ω_k    (3.1)

where ω_k is a square window of size (2r+1) × (2r+1).
The linear coefficients a_k and b_k are constant in ω_k and can be estimated by minimizing the squared difference between the output image O and the input image P:
E(a_k, b_k) = Σ_{i∈ω_k} [ (a_k I_i + b_k − P_i)² + ε a_k² ]    (3.2)
where ε is a regularization parameter given by the user.
The coefficients 𝑎𝑘 and 𝑏𝑘 can be directly solved by linear regression equation as
follows:
a_k = [ (1/|ω|) Σ_{i∈ω_k} I_i P_i − μ_k P̄_k ] / (σ_k² + ε)    (3.3)

b_k = P̄_k − a_k μ_k    (3.4)

where μ_k and σ_k² are the mean and variance of I in ω_k respectively, |ω| is the number of pixels in ω_k, and P̄_k is the mean of P in ω_k. Next, the output image can be calculated
according to (3.1). As shown in Fig. 3.1, all local windows ω_k centered at a pixel k near pixel i will contain pixel i, so the value of O_i in (3.1) will change when it is computed in different windows ω_k. To solve this problem, all the possible values of the coefficients a_k and b_k are first averaged. Then, the filtering output is estimated as follows:
O_i = ā_i I_i + b̄_i    (3.5)

where ā_i = (1/|ω|) Σ_{k∈ω_i} a_k and b̄_i = (1/|ω|) Σ_{k∈ω_i} b_k. In this report, G_{r,ε}(P, I) is used to represent the guided filtering operation, where r and ε are the parameters which decide the filter size and blur degree of the guided filter, respectively, and P and I are the input image and guidance image, respectively.
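Equations (3.1)–(3.5) translate almost line by line into MATLAB. The sketch below is a direct, unoptimized transcription (window means are computed with imfilter; He et al.'s O(N) box-filter trick is omitted for clarity):

```matlab
% Grayscale guided filter following Eqs. (3.1)-(3.5).
% I: guidance image, P: input image (both double), r: window radius,
% epsr: the regularization parameter epsilon.
function O = guidedfilt(I, P, r, epsr)
    w = ones(2*r + 1) / (2*r + 1)^2;          % averaging kernel over omega_k
    boxmean = @(X) imfilter(X, w, 'replicate');
    muI  = boxmean(I);   muP = boxmean(P);    % window means of I and P
    varI = boxmean(I .* I) - muI.^2;          % variance of I in each window
    covIP = boxmean(I .* P) - muI .* muP;     % covariance of I and P
    a = covIP ./ (varI + epsr);               % Eq. (3.3)
    b = muP - a .* muI;                       % Eq. (3.4)
    O = boxmean(a) .* I + boxmean(b);         % Eq. (3.5), with averaged a, b
end
```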
Furthermore, when the input is a color image, the filtering output can be obtained by conducting the guided filtering on the red, green, and blue channels of the input image, respectively. When the guidance image I is a color image, the guided filter is extended by the following steps.
Fig. 3.1 Window Choice.
First, equation (3.1) is rewritten as follows:
O_i = a_kᵀ I_i + b_k,  ∀ i ∈ ω_k    (3.6)

where a_k is a 3 × 1 coefficient vector and I_i is a 3 × 1 color vector. Then, similar to (3.3)–(3.5), the output of guided filtering can be calculated as follows:

a_k = (Σ_k + εU)⁻¹ [ (1/|ω|) Σ_{i∈ω_k} I_i P_i − μ_k P̄_k ]    (3.7)

b_k = P̄_k − a_kᵀ μ_k    (3.8)

O_i = ā_iᵀ I_i + b̄_i    (3.9)
where Σ_k is the 3×3 covariance matrix of I in ω_k, and U is the 3 × 3 identity matrix.
3.3 SMOOTHING OPERATORS
Smoothing operators are used to smoothen and decompose the source images.
Here average and Gaussian filter are used for decomposition.
3.3.1 AVERAGE FILTER
Replace each pixel by the average of its neighboring pixels. Assume a 3x3
neighborhood. The filtering is done as
I(i, j) = (p0 + p1 + p2 + p3 + p4 + p5 + p6 + p7 + p8) / 9    (3.10)
In general a filter applies a function over the values of a small neighborhood of
pixels to compute the result. The size of the filter is equal to the size of the neighborhood:
3×3, 5×5, 7×7, ..., 21×21, and so on. The shape of the filter region is not necessarily square; it can be a rectangle or a circle.
3.3.2 GAUSSIAN FILTER
It is used to blur images and remove noise and detail.
The Gaussian function is:

G(x, y) = (1 / (2πσ²)) e^(−(x² + y²) / (2σ²))    (3.11)
Where, σ is the standard deviation. The distribution is assumed to have a mean of 0.
The effect of Gaussian smoothing is to blur an image, in a similar fashion to
the mean filter. The degree of smoothing is determined by the standard deviation of the
Gaussian. (Larger standard deviation Gaussians, of course, require larger convolution kernels in order to be accurately represented.) The Gaussian outputs a 'weighted average' of each pixel's neighborhood, with the average weighted more towards the value of the central pixels. This is in contrast to the mean filter's uniformly weighted average.
Because of this, a Gaussian provides gentler smoothing and preserves edges better than a
similarly sized mean filter.
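For reference, both smoothing operators are available directly through MATLAB's Image Processing Toolbox; the kernel sizes and σ below are illustrative values, and I stands for an assumed grayscale source image:

```matlab
% Average and Gaussian smoothing of an assumed grayscale image I.
h_avg   = fspecial('average', 3);          % 3x3 mean filter, Eq. (3.10)
h_gauss = fspecial('gaussian', 7, 1.5);    % 7x7 Gaussian kernel, sigma = 1.5
B_avg   = imfilter(I, h_avg,   'replicate');
B_gauss = imfilter(I, h_gauss, 'replicate');
```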
3.4 VARIOUS IMAGE SEGMENTATION TECHNIQUES
Image segmentation is typically used to locate objects and boundaries (lines,
curves, etc.) in images. Edge detection techniques have therefore been used as the base of
segmentation technique. Edge detection refers to the process of identifying and locating
sharp discontinuities in an image. Edge detection significantly reduces the amount of data
and filters out useless information, while preserving the important structural properties in
an image. The discontinuities are abrupt changes in pixel intensity which characterize
boundaries of objects in a scene.
3.4.1 SOBEL OPERATOR
The operator consists of a pair of 3×3 convolution kernels as shown in Fig.3.2.
One kernel is simply the other rotated by 90°.
Fig.3.2: Masks used by Sobel Operator (Gx and Gy)
These kernels are designed to respond maximally to edges running vertically and
horizontally relative to the pixel grid, one kernel for each of the two perpendicular
orientations. The kernels can be applied separately to the input image, to produce
separate measurements of the gradient component in each orientation(𝐺𝑥, 𝐺𝑦). These can
then be combined together to find the absolute magnitude of the gradient. The gradient
magnitude is given by:
|G| = √(Gx² + Gy²)    (3.12)
Typically, an approximate magnitude is computed using:
|G| = |Gx| + |Gy|    (3.13)
which is much faster to compute. The Sobel operator is easy to implement in the spatial domain and has a smoothing effect on noise; it can provide more accurate edge direction information, but it will also detect many false edges with coarse edge widths.
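Since the kernel figure is easiest to read in code form, the sketch below spells out the standard Sobel kernels and Eqs. (3.12)–(3.13); I is an assumed grayscale image in double format:

```matlab
% Sobel gradient magnitude per Eqs. (3.12)-(3.13).
Kx = [-1 0 1; -2 0 2; -1 0 1];   % standard Sobel kernel (Gx direction)
Ky = Kx';                        % the same kernel rotated by 90 deg (Gy)
Gx = conv2(I, Kx, 'same');
Gy = conv2(I, Ky, 'same');
G       = sqrt(Gx.^2 + Gy.^2);   % exact magnitude, Eq. (3.12)
Gapprox = abs(Gx) + abs(Gy);     % faster approximation, Eq. (3.13)
```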
3.4.2 ROBERT’S CROSS OPERATOR
The Roberts Cross operator performs a simple, quick to compute, 2-D spatial
gradient measurement on an image. Pixel values at each point in the output represent the
estimated absolute magnitude of the spatial gradient of the input image at that point. The
operator consists of a pair of 2×2 convolution kernels as shown in Fig.3.3. One kernel is
simply the other rotated by 90°. This is very similar to the Sobel operator.
Fig.3.3: Masks used by Roberts Operator (Gx and Gy)
These kernels are designed to respond maximally to edges running at 45° to the
pixel grid, one kernel for each of the two perpendicular orientations. The kernels can be
applied separately to the input image, to produce separate measurements of the gradient
component in each orientation(𝐺𝑥, 𝐺𝑦). These can then be combined together to find the
absolute magnitude of the gradient. The gradient magnitude is given by:
|G| = √(Gx² + Gy²)    (3.14)
Typically, an approximate magnitude is computed using:
|G| = |Gx| + |Gy|    (3.15)
which is much faster to compute.
3.4.3 PREWITT’S OPERATOR
Prewitt operator is similar to the Sobel operator and is used for detecting vertical
and horizontal edges in images. The operator consists of a pair of 3×3 convolution
kernels as shown in Fig.3.4.
Fig.3.4: Masks used by Prewitt Operator (Gx and Gy)
3.4.4 CANNY OPERATOR
The steps involved in the Canny edge detection algorithm are as follows:
1. Apply a Gaussian filter to smooth the image in order to remove noise.
2. Find the intensity gradients of the image. The edges should be marked where the gradients of the image have large magnitudes. The masks used for finding the gradients are given in Fig.3.5.
Fig.3.5: Masks used by Canny Operator (Gx and Gy)
3. Apply non-maximum suppression to get rid of spurious responses to edge detection. Non-maximum suppression is applied to the gradient magnitude to trace along the edge direction and suppress those pixel values that are not considered edges, thus thinning the edges.
4. Apply double thresholding to determine potential edges.
5. Track edge by hysteresis: Finalize the detection of edges by suppressing all the other
edges that are weak and not connected to strong edges.
3.4.5 LAPLACIAN OF GAUSSIAN
Laplacian filters are derivative filters used to find areas of rapid change (edges) in
images. Since derivative filters are very sensitive to noise, it is common to smooth the
image (e.g., using a Gaussian filter) before applying the Laplacian. This process is called
the Laplacian of Gaussian (LoG) operation.
The Laplacian L(x, y) of an image with pixel intensity values I(x, y) is given by:

L(x, y) = ∂²I/∂x² + ∂²I/∂y²    (3.16)
There are different ways to find an approximate discrete convolution kernel that
approximates the effect of the Laplacian.
Two commonly used kernels are shown in Fig.3.6.

Fig.3.6 Laplacian filter
This is called a positive Laplacian because the central peak is positive. It is just as appropriate to reverse the signs of the elements to get a negative Laplacian. To include a smoothing Gaussian filter, the Laplacian and Gaussian functions are combined to obtain a single equation:
LoG(x, y) = −(1 / (πσ⁴)) [1 − (x² + y²)/(2σ²)] e^(−(x² + y²)/(2σ²))    (3.17)
The LoG operator takes the second derivative of the image. Where the image is basically
uniform, the LoG will give zero. Wherever a change occurs, the LoG will give a positive
response on the darker side and a negative response on the lighter side. At a sharp edge
between two regions, the response will be
• zero away from the edge
• positive just to one side
• negative just to the other side
• zero at some point in between on the edge itself
If the original image is filtered with a simple Laplacian, the resulting output is rather noisy. Using a larger σ for the Gaussian will reduce the noise, but the sharpening effect will also be reduced; the choice of σ is therefore a compromise.
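All five operators discussed so far are also exposed through the Image Processing Toolbox edge() function, which makes a side-by-side comparison straightforward; thresholds are left at their defaults here (an assumption), and I is an assumed grayscale image:

```matlab
% Edge maps from the five classical operators, default thresholds.
E_sobel   = edge(I, 'sobel');
E_prewitt = edge(I, 'prewitt');
E_roberts = edge(I, 'roberts');
E_canny   = edge(I, 'canny');
E_log     = edge(I, 'log');      % Laplacian of Gaussian
```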
3.4.6 PSO (PARTICLE SWARM OPTIMIZATION) SEGMENTATION
The process of dividing an image into multiple segments is called image segmentation. Particle swarm optimization belongs to the class of swarm intelligence techniques that are used to solve optimization problems. PSO simulates the behaviour of bird flocking: a group of birds randomly searches for food in an area, and there is only one piece of food in the area being searched. The birds do not know where the food is, but they know how far away the food is in each iteration, so the best way to find the food is to follow the bird which is nearest to the food. Flocking behaviour is the behaviour exhibited when a group of birds, called a flock, are foraging.
Each particle in PSO [13,14] is updated by following two "best" values:
pbest - Each particle keeps track of its coordinates in the solution space which are associated with the best solution (fitness) it has achieved so far. This value is called the personal best, pbest.
gbest - The best among all the pbest values is tracked by the PSO. This is called the global best, gbest.
Each particle modifies its position with the help of:
• the current positions,
• the current velocities,
• the distance between the current position and pbest,
• the distance between the current position and the gbest.
After finding the two best values, the particle updates its velocity and positions with the
following equation,
v[] = v[] + c1 · rand() · (pbest[] − present[]) + c2 · rand() · (gbest[] − present[])    (3.18)

present[] = present[] + v[]    (3.19)
where v[] is the particle velocity and present[] is the current particle (solution); rand() is a random number in (0, 1), and c1, c2 are learning factors, usually c1 = c2 = 2.
ALGORITHM
The Algorithm of PSO Segmentation is sequenced as,
Step 1: Read the input image to be segmented.
Step 2: Select PSO method to be applied on that image with a particular threshold level.
Step 3: For each particle in the population update particle’s fitness in the search space and
update particle’s best in the search space, then move the particle in the population.
Step 4: For each particle, if swarm gets better then extend the swarm/particle life.
Step 5: For each particle, if swarm is not improving its performance then reduce the
swarm life.
Step 6: The swarm is considered for next iteration.
Step 7: The failed swarms are deleted.
Step 8: Reset threshold counter.
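To make the update rules (3.18)–(3.19) concrete, the hedged sketch below applies them to single-threshold segmentation. The between-class-variance fitness (Otsu's criterion) and all the swarm settings are illustrative assumptions, not necessarily the report's exact choices; I is assumed to be a grayscale image with values 0–255.

```matlab
% PSO search for a single segmentation threshold, Eqs. (3.18)-(3.19).
function t = pso_threshold(I)
    counts = imhist(uint8(I));              % 256-bin histogram of I
    p = counts / sum(counts);               % normalized histogram
    fitness = @(th) between_class_var(p, round(th));
    n = 20;  c1 = 2;  c2 = 2;               % swarm size and learning factors
    x = 1 + 253 * rand(n, 1);               % particle positions (thresholds)
    v = zeros(n, 1);                        % particle velocities
    pbest = x;  pval = arrayfun(fitness, x);
    [~, g] = max(pval);  gbest = pbest(g);
    for iter = 1:50
        v = v + c1*rand(n,1).*(pbest - x) + c2*rand(n,1).*(gbest - x); % (3.18)
        x = min(max(x + v, 1), 254);                                   % (3.19)
        f = arrayfun(fitness, x);
        upd = f > pval;                     % particles that improved
        pbest(upd) = x(upd);  pval(upd) = f(upd);
        [~, g] = max(pval);  gbest = pbest(g);
    end
    t = round(gbest);
end

% Otsu-style fitness: between-class variance of the two segments.
function s = between_class_var(p, t)
    w0 = sum(p(1:t));  w1 = 1 - w0;
    if w0 == 0 || w1 == 0, s = 0; return; end
    m0 = sum((0:t-1)' .* p(1:t))     / w0;  % mean of segment below t
    m1 = sum((t:255)' .* p(t+1:256)) / w1;  % mean of segment above t
    s = w0 * w1 * (m1 - m0)^2;
end
```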
3.5 WEIGHTED AVERAGING
An average in which each quantity to be averaged is assigned a weight. These
weightings determine the relative importance of each quantity in the average; a weighting is equivalent to having that many like items with the same value involved in the average.
3.6 BLOCK DIAGRAM
Fig.3.7 Block Diagram
[Block diagram: the source images I1 and I2 are decomposed by a smoothing operator into base layers B1, B2 and detail layers D1, D2; segmentation and saliency comparison produce weight maps P1 and P2, which guided filtering refines; weighted averaging then yields the fused base layer B and fused detail layer D, which are combined into the fused image F.]
The sequence of steps is described below, as shown in Fig.3.7.
3.6.1 IMAGE DECOMPOSITION
The base layers are obtained by Gaussian filtering the source images:
B_n = I_n * Z    (3.20)

where B_n are the base layers, Z is the Gaussian filter kernel, and I_n are the source images.
The detail layers are obtained by subtracting the base layers from source images.
D_n = I_n − B_n    (3.21)
3.6.2 WEIGHT MAP CONSTRUCTION
PSO segmentation is applied to source images to get the Saliency map.
S_n = PSO_seg(I_n)    (3.22)

where S_n is the saliency map obtained by applying PSO segmentation to I_n.
The weight map is obtained by comparison of the saliency maps:
P_n^k = 1 if S_n^k = max(S_1^k, S_2^k, …, S_N^k), and 0 otherwise    (3.23)

where N is the number of source images and S_n^k is the saliency value of pixel k in the nth image.
Saliency Comparison refers to assigning similar weights if two adjacent pixels have
similar brightness or color.
Refined weight map is constructed by using guided filter.
• Computes the filtering output by considering the guidance image.
• Guidance image can be the input image itself or another different image.
Guided image filtering is performed on each weight map P_n with the corresponding source image I_n serving as the guidance image:
W_n^B = G_{r1,ε1}(P_n, I_n)    (3.24)

W_n^D = G_{r2,ε2}(P_n, I_n)    (3.25)
3.6.3 IMAGE RECONSTRUCTION
The base and detail layers of different source images and the refined weight maps are
fused together by weighted averaging.
B′ = Σ_{n=1}^{N} W_n^B · B_n    (3.26)

D′ = Σ_{n=1}^{N} W_n^D · D_n    (3.27)
The final fused image is obtained by combining B’ and D’.
F = B′ + D′    (3.28)
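Putting (3.20)–(3.28) together for two grayscale sources gives the compact sketch below. It is illustrative only: the toolbox function imguidedfilter stands in for G_{r,ε}, the saliency comparison is reduced to a per-pixel comparison of precomputed saliency maps S1 and S2, and the filter parameter values are assumptions rather than the report's exact settings.

```matlab
% Two-source fusion pipeline following Eqs. (3.20)-(3.28).
% I1, I2: registered grayscale sources; S1, S2: their saliency maps
% (e.g. from PSO segmentation), all assumed to be precomputed doubles.
Z  = fspecial('gaussian', 7, 1.5);                % smoothing operator (assumed size)
B1 = imfilter(I1, Z, 'replicate');  D1 = I1 - B1; % Eqs. (3.20)-(3.21)
B2 = imfilter(I2, Z, 'replicate');  D2 = I2 - B2;

P1 = double(S1 >= S2);  P2 = 1 - P1;              % weight maps, Eq. (3.23)

% refined weight maps, Eqs. (3.24)-(3.25); parameter values are assumed
W1B = imguidedfilter(P1, I1, 'NeighborhoodSize', 45, 'DegreeOfSmoothing', 0.3);
W2B = imguidedfilter(P2, I2, 'NeighborhoodSize', 45, 'DegreeOfSmoothing', 0.3);
W1D = imguidedfilter(P1, I1, 'NeighborhoodSize', 7,  'DegreeOfSmoothing', 1e-6);
W2D = imguidedfilter(P2, I2, 'NeighborhoodSize', 7,  'DegreeOfSmoothing', 1e-6);

B = W1B.*B1 + W2B.*B2;                            % Eq. (3.26), weighted averaging
D = W1D.*D1 + W2D.*D2;                            % Eq. (3.27)
F = B + D;                                        % Eq. (3.28)
```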
3.7 PARAMETERS USED FOR COMPARISON
MEAN SQUARED ERROR (MSE)
Mean square error between the reference image and the fused image is given by
MSE = (1/mn) Σ_{i=1}^{m} Σ_{j=1}^{n} (A_ij − B_ij)²    (3.29)
where:
• A_ij – the reference image,
• B_ij – the fused image to be assessed,
• i – pixel row index,
• j – pixel column index,
• m, n – number of rows and columns.
PEAK SIGNAL TO NOISE RATIO (PSNR)
PSNR is the ratio between the maximum possible power of the signal and the power of the corrupting noise that distorts the image. The PSNR measure is given by:
PSNR = 10 log10(255² / MSE)    (3.30)
STRUCTURAL SIMILARITY INDEX MEASURE(SSIM)
The Structural similarity index is a measure of structural information change in the
fused image such as luminance and contrast. The SSIM index is a decimal value between
0 and 1. The equation for SSIM is
SSIM = [(2 μ_x μ_y + c1)(2 σ_xy + c2)] / [(μ_x² + μ_y² + c1)(σ_x² + σ_y² + c2)]    (3.31)
where
μ_x – the average of x
μ_y – the average of y
σ_x² – the variance of x
σ_y² – the variance of y
σ_xy – the covariance of x and y
c1, c2 – two variables to stabilize the division with a weak denominator
NORMALIZED CROSS CORRELATION(NK)
Normalized cross correlation is used to find the similarity between the fused image and the registered image, and is given by the following equation:
NK = Σ_{i=1}^{m} Σ_{j=1}^{n} (A_ij · B_ij) / Σ_{i=1}^{m} Σ_{j=1}^{n} A_ij²    (3.32)
MAXIMUM DIFFERENCE(MD)
Maximum difference is the largest absolute difference between corresponding pixels of the two images. A large value of maximum difference means that the image is poor in quality.

MD = max(|A_ij − B_ij|)    (3.33)
NORMALIZED ABSOLUTE ERROR (NAE)
A large value of normalized absolute error means that the image is of poor quality. NAE is defined as follows:

NAE = Σ_{i=1}^{m} Σ_{j=1}^{n} |A_ij − B_ij| / Σ_{i=1}^{m} Σ_{j=1}^{n} A_ij    (3.34)
CHAPTER 4
SIMULATION RESULTS
The results of the guided filter weighted averaging based image fusion method on MS and PAN images are presented in this chapter. The performance measures used to analyse the results are Mean Square Error (MSE), Peak Signal to Noise Ratio (PSNR), Structural Similarity Index (SSIM), Normalized Cross Correlation (NK), Maximum Difference (MD) and Normalized Absolute Error (NAE). The images were analysed and processed in MATLAB. IKONOS and QuickBird satellite images are used here.
4.1 RESULTS
4.1.1 INPUT IMAGES OF IKONOS SENSOR
IKONOS is a commercial earth observation satellite, and was the first to collect
publicly available high resolution imagery at 1 and 4 meter resolution. It offers
multispectral (MS) and panchromatic (PAN) imagery. The IKONOS launch was called
“one of the most significant developments in the history of the space age”. The spatial resolution of the sensor is 0.8 m for panchromatic imagery (1 m PAN products) and 4 m for multispectral imagery (4 m MS). The spectral resolution of the Ikonos sensor is given in Table.4.1.
Table.4.1. Spectral Resolution of Ikonos Sensor
Band    Wavelength Region (μm)
1       0.45–0.52 (blue)
2       0.52–0.60 (green)
3       0.63–0.69 (red)
4       0.76–0.90 (near IR)
PAN     0.45–0.90 (PAN)
The input images (panchromatic and multispectral) from the Ikonos sensor are shown in Fig.4.1.
(a) (b)
Fig 4.1 Panchromatic and Multispectral Image of Ikonos Sensor
4.1.2 INPUT IMAGES OF QUICKBIRD SENSOR
QuickBird was a high resolution commercial earth observation satellite, owned by
Digital Globe. QuickBird used Ball Aerospace's Global Imaging System 2000 (BGIS
2000). The satellite was initially expected to collect imagery at 1 meter resolution. It collected panchromatic (black and white) imagery at 61 centimeter resolution and multispectral imagery at 2.44 m (at 450 km altitude) to 1.63 m (at 300 km) resolution, as the orbit altitude was lowered towards the end of the mission life. At this resolution, details such as buildings and other infrastructure are easily visible. The imagery can be used in mapping applications such as Google Earth and Google Maps. The spectral resolution of the QuickBird sensor is given in Table.4.2.
Table.4.2. Spectral Resolution of Quickbird Sensor
Band    Wavelength Region (nm)
1       430–545 (blue)
2       466–620 (green)
3       590–710 (red)
4       715–918 (near IR)
PAN     405–1053 (PAN)
The input images (panchromatic and multispectral) from the Quickbird sensor are shown in Fig.4.2.
(a) (b)
Fig 4.2 Panchromatic and Multispectral Image of Quickbird Sensor
4.1.3 COMPARISON OF FUSION ALGORITHMS
The various fusion algorithms such as Weighted Average, Laplacian Pyramid
transform, Principal Component Analysis and Discrete Wavelet Transform are performed
and their metrics have been evaluated. The fused results are shown in the following figures. The smoothing operator used is the average filter and the edge detection method is the Laplacian of Gaussian.
(a) (b)
Fig.4.3 Fused result of Weighted Average Fusion
(a) (b)
Fig.4.4 Fused result of Laplacian Pyramid Fusion
(a) (b)
Fig.4.5 Fused result of Principal Component Analysis Fusion
(a) (b)
Fig.4.6 Fused result of Discrete Wavelet Transform Fusion
From the fused results, the fusion is clear in weighted averaging, whereas DWT and LPT provide only brightness and the PCA result is too dark.
PERFORMANCE METRICS
The performance of the fusion methods is given in Table.4.3 and Table.4.4.
Table.4.3.Performance Metrics of Fusion Algorithms of Input set 1
METRICS PSNR MD NK NAE MSE SSIM
Weighted average 18.7739 175 0.9386 0.4192 862.3716 0.4599
Laplacian Pyramid
Transform
9.7281 254.02 0.0058 0.9862 6922.6 0.0018
PCA 12.4176 207 0.2963 0.7488 3726.6 0.2912
DWT 9.7279 254 0.0058 0.9868 6922.9 0.0017
Table.4.4.Performance Metrics of Fusion Algorithms of Input set 2
METRICS PSNR MD NK NAE MSE SSIM
Weighted average 17.7332 132 0.8181 0.2241 1095.9 0.7403
Laplacian Pyramid
Transform
5.5342 254.06 0.0040 0.9961 18183 7.7617e-05
PCA 7.2653 164 0.2035 0.8342 12205 0.2466
DWT 5.5344 254.02 0.0040 0.9961 18182 7.6097e-05
4.1.4 COMPARISON OF VARIOUS SEGMENTATION TECHNIQUES
The various segmentation methods, namely Sobel, Prewitt, Roberts, Canny, LoG and PSO segmentation, are applied at the segmentation block and compared using various performance metrics. The average filter is used at the smoothing operator block. The fused results of both sets of input images for these methods are shown in the figures below.
(a) (b)
Fig 4.7 Fused Images using LoG operator
(a) (b)
Fig 4.8 Fused Images using Canny operator
(a) (b)
Fig 4.9 Fused Images using Sobel operator
(a) (b)
Fig 4.10 Fused Images using Prewitt operator
Fig 4.11 Fused Images using Roberts operator
Fig 4.12 Fused Images using PSO Segmentation
From the fused images above, the PSO segmentation based fused image provides better information than the other edge detection methods such as LoG, Roberts, Canny, Prewitt and Sobel.
PERFORMANCE METRICS
The Performance metrics such as Mean Square Error(MSE), Peak Signal to Noise
Ratio(PSNR), Normalized Cross Correlation(NK), Maximum Difference(MD),
Normalized Absolute Error(NAE) and Structural Similarity Index Measure are calculated
for both Quickbird and Ikonos sensor images and are tabulated as shown in Table.4.5 and
Table.4.6.
Table.4.5. Performance Metrics of Quickbird image

Method              MSE         PSNR     NK      MD   NAE     SSIM
LoG                 861.5980    18.7778  0.9387  175  0.4191  0.4600
PSO segmentation    366.4901    22.4902  1.0542  151  0.1905  0.6195
Canny               1.3768e+03  16.7420  0.8745  187  0.5185  0.1583
Sobel               687.0460    19.7609  1.0585  185  0.3355  0.2342
Prewitt             1.3773e+03  16.7405  0.8743  188  0.5186  0.1583
Roberts             1.3774e+03  16.7403  0.8743  187  0.5186  0.1583

Table.4.6. Performance Metrics of Ikonos image

Method              MSE         PSNR     NK      MD   NAE     SSIM
LoG                 1.0961e+03  17.7322  0.8181  132  0.2242  0.7402
PSO segmentation    541.6869    20.7933  1.0675  131  0.1493  0.7436
Canny               2.9450e+03  13.4400  0.7660  247  0.3404  0.0522
Sobel               2.2685e+03  14.5735  0.9932  218  0.3994  0.0731
Prewitt             2.9478e+03  13.4358  0.7655  247  0.3406  0.0521
Roberts             2.9480e+03  13.4355  0.7655  247  0.3407  0.0521
INFERENCE
From Table.4.5 and Table.4.6 it is clear that PSO segmentation performs the best compared to the other methods.
4.1.5 COMPARISON OF SMOOTHING OPERATORS
Smoothing operators such as the average, Gaussian and disk filters are used at the smoothing operator block and compared using the metrics. As PSO segmentation performed best in the above results, it is used as the segmentation method in this process. The results are as follows.
(a) (b)
Fig 4.13 Fused Images using average smoothing operator
The disk smoothing operator is a circular averaging filter; it filters the image based on the given radius value. The result of the disk smoothing filter is given in Fig.4.14.
(a) (b)
Fig 4.14 Fused Images using disk smoothing operator
(a) (b)
Fig 4.15 Fused Images using Gaussian smoothing operator
Gaussian smoothing provides a clearer image than average and disk smoothing.
PERFORMANCE METRICS
The Performance metrics such as Mean Square Error(MSE), Peak Signal to Noise
Ratio(PSNR), Normalized Cross Correlation(NK), Maximum Difference(MD),
Normalized Absolute Error(NAE) and Structural Similarity Index Measure(SSIM) are
calculated for various smoothing operators and are tabulated as shown in Table.4.7 and
Table.4.8.
Table.4.7. Comparison Table of Quickbird image

Method              MSE       PSNR     MD   NAE     SSIM
Average smoothing   366.4901  22.4902  151  0.1905  0.6195
Disk smoothing      125.8396  27.1326  119  0.0935  0.7493
Gaussian smoothing  46.7074   31.4370  109  0.0464  0.8963

Table.4.8. Comparison Table of Ikonos image

Method              MSE       PSNR     MD   NAE     SSIM
Average smoothing   541.6869  20.7933  131  0.1493  0.7436
Disk smoothing      478.7569  21.3297  164  0.1378  0.7883
Gaussian smoothing  246.6784  24.2095  122  0.0906  0.9039

INFERENCE

From Table.4.7 and Table.4.8 it is clear that the Gaussian smoothing operator is best suited for this image fusion process.
CHAPTER 5
CONCLUSION
The fusion of panchromatic and multispectral images is carried out using the guided filtering and weighted averaging technique. Gaussian filtering provides better intensity-based decomposition of the images than the average filtering process. PSO segmentation based edge detection detects the edges more clearly than the other methods, and the segmentation is further used to improve the fusion. QuickBird and Ikonos satellite images are used. The performance of the technique is analysed using performance metrics such as Mean Square Error, Peak Signal to Noise Ratio, Normalized Cross Correlation, Maximum Difference, Normalized Absolute Error and Structural Similarity Index Measure. The image details are preserved using this technique.
In future, the fusion architecture can be implemented on FPGA with low power,
reduced area and high performance. This fusion process can also be further extended by integrating multispectral and hyperspectral images.
REFERENCES
1. M. Xu, H. Chen, and P. Varshney, “An image fusion approach based on Markov random fields,” IEEE Trans. Geosci. Remote Sens., vol. 49, no. 12, pp. 5116–5127, Dec. 2011.
2. J. Liang, Y. He, D. Liu, and X. Zeng, “Image fusion using higher order singular
value decomposition,” IEEE Trans. Image Process., vol. 21, no. 5, pp. 2898–2909,
May 2012.
3. Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, “Image quality assessment:
From error visibility to structural similarity,” IEEE Trans. Image Process., vol. 13,
no. 4, pp. 600–612, Apr. 2004.
4. Z. Liu, E. Blasch, Z. Xue, J. Zhao, R. Laganiere, and W. Wu, “Objective
assessment of multiresolution image fusion algorithms for context enhancement in
night vision: A comparative study,” IEEE Trans. Pattern Anal. Mach. Intell., vol.
34, no. 1, pp. 94–109, Jan. 2012.
5. R. Shen, I. Cheng, J. Shi, and A. Basu, “Generalized random walks for fusion of
multi-exposure images,” IEEE Trans. Image Process., vol. 20, no. 12, pp. 3634–
3646, Dec. 2011.
6. M. Kumar and S. Dass, “A total variation-based algorithm for pixel level image
fusion,” IEEE Trans. Image Process., vol. 18, no. 9, pp. 2137–2143, Sep. 2009.
7. Z. Wang and A. Bovik, “A universal image quality index,” IEEE Signal Process.
Letters, vol. 9, no. 3, pp. 81–84, Mar. 2002.
8. D. Looney and D. Mandic, “Multiscale image fusion using complex extensions of
EMD,” IEEE Trans. Signal Process., vol. 57, no. 4, pp. 1626–1630, Apr. 2009.
9. K. He, J. Sun, and X. Tang, “Guided image filtering,” in Proc. Eur. Conf. Comput. Vis., Heraklion, Greece, Sep. 2010, pp. 1–14.
10. D. Socolinsky and L. Wolff, “Multispectral image visualization through first-order
fusion,” IEEE Trans. Image Process., vol. 11, no. 8, pp. 923–931, Aug. 2002.
11. J. Tian and L. Chen, “Adaptive multi-focus image fusion using a wavelet-based statistical sharpness measure,” Signal Process., vol. 92, no. 9, pp. 2137–2146, Sep. 2012.
12. Z. Farbman, R. Fattal, D. Lischinski, and R. Szeliski, “Edge-preserving
decompositions for multi-scale tone and detail manipulation,” ACM Trans.
Graph., vol. 27, no. 3, pp. 67-1–67-10, Aug. 2008.
13. A. A. Goshtasby and S. Nikolov, “Image fusion: Advances in the state of the art,”
Inf. Fusion, vol. 8, no. 2, pp. 114–118, Apr. 2007.
14. Q. Zhang and B. Guo, “Multifocus image fusion using the non-subsampled
contourlet transform,” Signal Process. , vol. 89, no. 7, pp. 1334–1346, Jul. 2009.
15. S. Li, X. Kang, and J. Hu, “Image fusion with guided filtering,” IEEE Trans. Image Process., vol. 22, no. 7, pp. 2864–2875, Jul. 2013.
LIST OF PUBLICATIONS
Conferences
Presented a paper titled “PSO Segmentation Based Satellite Image Fusion along with Guided Filter” in the 5th National Conference on “Communication, Information & Telematics” - CITEL 2016, 30-31 Mar 2016, Kumaraguru College of Technology, Coimbatore.

Presented a paper titled “Satellite Image Fusion with PSO Segmentation and Guided Filter” in the International Conference on Engineering Digital Green Era - EDGE 2016, 17-19 Mar 2016, Rajalakshmi Engineering College, Chennai.

Presented a paper titled “Fusion of Satellite Images with Guided Filter” in the IEEE Sponsored 3rd International Conference on Electronics and Communication Systems (ICECS), 25-26 Feb 2016, Karpagam College of Engineering, Coimbatore.