FUSION OF SATELLITE IMAGES
WITH GUIDED FILTER
A PROJECT REPORT
Submitted by
NATHIYA V N
Register No: 13MCO13
in partial fulfillment for the requirement of award of the degree
of
MASTER OF ENGINEERING
in
COMMUNICATION SYSTEMS
Department of Electronics and Communication Engineering
KUMARAGURU COLLEGE OF TECHNOLOGY
(An autonomous institution affiliated to Anna University, Chennai)
COIMBATORE-641049
ANNA UNIVERSITY: CHENNAI 600 025
APRIL 2016
BONAFIDE CERTIFICATE
Certified that this project report titled “FUSION OF SATELLITE IMAGES WITH
GUIDED FILTER” is the bonafide work of NATHIYA.V.N [Reg. No. 13MCO13] who
carried out the research under my supervision. Certified further that, to the best of my knowledge, the work reported herein does not form part of any other project or dissertation on the basis of which a degree or award was conferred on an earlier occasion on this or any other candidate.
The candidate with Register No. 13MCO13 was examined by us in the
project viva-voce examination held on ............................
INTERNAL EXAMINER EXTERNAL EXAMINER
SIGNATURE
Ms. S.NAGARATHINAM
PROJECT SUPERVISOR
Department of ECE
Kumaraguru College of Technology
Coimbatore-641 049
SIGNATURE
Dr. A.VASUKI
HEAD OF THE DEPARTMENT
Department of ECE
Kumaraguru College of Technology
Coimbatore-641 049
ACKNOWLEDGEMENT
First, I would like to express my praise and gratitude to the Lord, who has
showered his grace and blessings enabling me to complete this project in an excellent
manner.
I express my sincere thanks to the management of Kumaraguru College of
Technology and Joint Correspondent Shri Shankar Vanavarayar for his kind
support and for providing necessary facilities to carry out the work.
I would like to express my sincere thanks to our beloved Principal
Dr.R.S.Kumar Ph.D., Kumaraguru College of Technology, who encouraged me with
his valuable thoughts.
I would like to thank Dr.A.Vasuki Ph.D., Head of the Department, Electronics
and Communication Engineering, for her kind support and for providing necessary
facilities to carry out the project work.
In particular, I wish to express my everlasting gratitude to the Project Coordinator Dr.M.Alagumeenaakshi Ph.D., Asst. Professor-III, Department of Electronics and Communication Engineering, for her support throughout the course of this project work.
I am greatly privileged to express my heartfelt thanks to my project guide Ms.S.Nagarathinam M.E., (Ph.D)., Asst. Professor-III, Department of Electronics and Communication Engineering, for her expert counselling and guidance, which carried this project to a great deal of success. I also wish to convey my deep sense of gratitude to all teaching and non-teaching staff of the ECE Department for their help and cooperation.
Finally, I thank my parents and my family members for giving me the moral
support and abundant blessings in all of my activities and my dear friends who helped
me to endure my difficult times with their unfailing support and warm wishes.
ABSTRACT
In remote sensing, image fusion is one of the fastest-growing techniques for integrating two images. Remote sensing satellites often provide a panchromatic (PAN) image with high spatial resolution and a multispectral (MS) image with high spectral resolution. Image fusion refers to combining two or more images into a single image that carries more information than the source images. An effective and fast image fusion method is proposed for creating a fused image from multiple images. A multispectral image is bundled with a higher-resolution panchromatic image to produce a pan-sharpened image, in which the colour bands are merged with the high-resolution black-and-white imagery using the guided filter and a weighted averaging fusion technique. The guided filter is one of the recent filtering methods used in image processing. The proposed technique analyses the performance of the guided filter and makes use of spatial consistency for fusion of the base and detail layers. The fused image is enhanced using edge preservation. The proposed method is based on intensity variations in the base and detail layers of the input images. Since the various edge detection methods do not produce good results, PSO segmentation is used for saliency map construction. The source images used are from the QuickBird and Ikonos sensors.
TABLE OF CONTENTS
CHAPTER NO.    TITLE    PAGE NO.
ABSTRACT iv
LIST OF FIGURES vii
LIST OF TABLES viii
LIST OF ABBREVIATIONS ix
1 INTRODUCTION 1
1.1 Overview 1
1.2 Satellite Remote Sensing 1
1.3 Types of Satellite Images 3
1.4 Types of Resolution 4
1.5 Image Fusion 5
1.6 Image Fusion Algorithms 6
1.6.1 Spatial Domain Fusion Techniques 6
1.6.2 Transform Domain Fusion Techniques 9
2 LITERATURE SURVEY 12
3 METHODOLOGY 18
3.1 Introduction 18
3.2 Guided Image Filtering 18
3.3 Smoothing Operators 20
3.3.1 Average Filter 20
3.3.2 Gaussian Filter 21
3.4 Various Image Segmentation Techniques 21
3.4.1 Sobel Operator 22
3.4.2 Robert’s Cross Operator 22
3.4.3 Prewitt’s Operator 23
3.4.4 Canny Operator 24
3.4.5 Laplacian of Gaussian 24
3.4.6 PSO Segmentation 25
3.5 Weighted Averaging 27
3.6 Block Diagram 27
3.6.1 Image Decomposition 28
3.6.2 Weight Map Construction 28
3.6.3 Image Reconstruction 29
3.7 Parameters used for Comparison 29
4 SIMULATION RESULTS 32
4.1 Results 32
4.1.1 Input Images of Ikonos Sensor 32
4.1.2 Input Images of Quickbird Sensor 33
4.1.3 Comparison of Fusion Algorithms 34
4.1.4 Comparison of various Segmentation Techniques 38
4.1.5 Comparison of Smoothing Operators 42
5 CONCLUSION 45
REFERENCES 46
LIST OF PUBLICATIONS 48
LIST OF FIGURES
FIGURE NO.    CAPTION    PAGE NO.
1.1 Spectrum of Objects 2
1.2 Advantage of image fusion 5
1.3 Principal Component Analysis 7
1.4 IHS 8
1.5 Laplacian Pyramid 9
1.6 Discrete Wavelet Transform 11
3.1 Window Choice 20
3.2 Masks used by Sobel Operator 22
3.3 Masks used by Robert Operator 23
3.4 Masks used by Prewitt Operator 23
3.5 Masks used by Canny Operator 24
3.6 Laplacian Filter 25
3.7 Block Diagram 27
4.1 Panchromatic and Multispectral Image of Ikonos Sensor 33
4.2 Panchromatic and Multispectral Image of Quickbird Sensor 34
4.3 Fused result of Weighted Average Fusion 35
4.4 Fused result of Laplacian Pyramid Fusion 35
4.5 Fused result of Principal Component Analysis Fusion 36
4.6 Fused result of Discrete Wavelet Transform Fusion 36
4.7 Fused Images using LoG Operator 38
4.8 Fused Images using Canny Operator 38
4.9 Fused Images using Sobel Operator 39
4.10 Fused Images using Prewitt Operator 39
4.11 Fused Images using Roberts Operator 40
4.12 Fused Images using PSO Segmentation 40
4.13 Fused Images using average smoothing operator 42
4.14 Fused Images using disk smoothing operator 43
4.15 Fused Images using Gaussian smoothing operator 43
LIST OF TABLES
TABLE NO.    CAPTION    PAGE NO.
4.1 Spectral Resolution of Ikonos Sensor 32
4.2 Spectral Resolution of Quickbird Sensor 34
4.3 Performance Metrics of Fusion Algorithms of Input set 1 37
4.4 Performance Metrics of Fusion Algorithms of Input set 2 37
4.5 Performance Metrics of Quickbird image 41
4.6 Performance Metrics of Ikonos image 41
4.7 Comparison Table of Quickbird image 44
4.8 Comparison Table of Ikonos image 44
LIST OF ABBREVIATIONS
RADAR Radio Detection and Ranging
PAN Panchromatic Image
MS Multispectral Image
RGB Red Green Blue
PCA Principal Component Analysis
IHS Intensity Hue Saturation
DCT Discrete Cosine Transform
DWT Discrete Wavelet Transform
MRF Markov Random Field
WLS Weighted Least Square
TV Total Variation
LoG Laplacian of Gaussian
PSO Particle Swarm Optimization
MSE Mean Square Error
PSNR Peak Signal to Noise Ratio
SSIM Structural Similarity Index Measure
NK Normalized Cross Correlation
MD Maximum Difference
NAE Normalized Absolute Error
CHAPTER 1
INTRODUCTION
1.1 OVERVIEW
Remote sensing is the process of acquiring data or information about objects and
substances not in direct contact with the sensor, by gathering its inputs using
electromagnetic radiation or acoustical waves that emanate from the targets of interest.
Remote sensing makes it possible to collect data on dangerous or inaccessible areas.
Remote sensing also replaces costly and slow data collection on the ground, ensuring in
the process that areas or objects are not disturbed. The sun is a source of energy or
radiation, which provides a very convenient source of energy for remote sensing. The
sun's energy is either reflected, as it is for visible wavelengths, or absorbed and then
reemitted, as it is for thermal infrared wavelengths. There are two main types of remote
sensing: Passive remote sensing and Active remote sensing.
Passive remote sensing - Remote sensing systems which measure energy that is naturally available are called passive sensors. The sensor records energy that is reflected (such as visible wavelengths from the sun) or emitted (thermal infrared) from the source. Examples of passive remote sensors include film photography, infrared sensors, and radiometers.
Active remote sensing - Active sensors emit energy in order to scan objects or areas, and then detect and measure the radiation that is reflected or backscattered from the target. RADAR is an example of active remote sensing, where the time delay between emission and return is measured, establishing the location, height, speed and direction of an object.
1.2 SATELLITE REMOTE SENSING
Remote sensing images are acquired by earth observation satellites. These remote
sensing satellites are equipped with sensors looking down to the earth. Orbital platforms
collect and transmit data from different parts of the electromagnetic spectrum, which in
conjunction with larger scale aerial or ground-based sensing and analysis provides
researchers with enough information to monitor trends. Satellite sensors record the
intensity of electromagnetic radiation (sunlight) reflected from the earth at different
wavelengths. Energy that is not reflected by an object is absorbed. Each object has its
own unique 'spectrum' as shown in Fig.1.1.
Fig.1.1 Spectrum of Objects
Remote sensing relies on the fact that particular features of the landscape such as bush,
crop, salt-affected land and water reflect light differently in different wavelengths. Grass
looks green, for example, because it reflects green light and absorbs other visible
wavelengths. This can be seen as a peak in the green band in the reflectance spectrum for
green grass above. The spectrum also shows that grass reflects even more strongly in the
infrared part of the spectrum. This can't be detected by the human eye but it can be
detected by an infrared sensor. Instruments mounted on satellites detect and record the
energy that has been reflected. The detectors are sensitive to particular ranges of
wavelengths, called 'bands'. The satellite systems are characterised by the bands at which
they measure the reflected energy. The Landsat TM satellite, for example, has bands at the blue, green and red wavelengths in the visible part of the spectrum, three bands in the near and mid infrared part of the spectrum, and one band in the thermal infrared part of the spectrum. The satellite detectors measure the
intensity of the reflected energy and record it.
1.3 TYPES OF SATELLITE IMAGES
• Panchromatic image
• Multispectral image
• Hyperspectral image
Panchromatic image
A panchromatic image consists of only one band. Thus, a panchromatic image
may be similarly interpreted as a black-and-white aerial photograph of the area.
Panchromatic image is usually displayed as a gray scale image, i.e. the displayed
brightness of a particular pixel is proportional to the pixel digital number which is related
to the intensity of solar radiation reflected by the targets in the pixel and detected by the
detector. The spectral information or "colour" of the targets is lost. IKONOS Pan, QuickBird Pan, SPOT Pan and LANDSAT ETM+ Pan are examples of panchromatic sensors.
Multispectral image
A multispectral image is one that captures image data at specific frequencies across
the electromagnetic spectrum. A multispectral image consists of several bands of data.
For visual display, each band of the image may be displayed one band at a time as a gray
scale image, or in combination of three bands at a time as a colour composite image. It is
a multilayer image which contains both the brightness and spectral information of the
targets being observed. Multispectral data sets are usually composed of about 5 to 10 bands of relatively large bandwidths (70-400 nm). Landsat TM, MSS, SPOT HRV-XS, Ikonos MS and QuickBird MS are examples of multispectral imaging sensors.
Hyperspectral Image
It acquires images in about a hundred or more contiguous spectral bands. The
precise spectral information contained in a hyperspectral image enables better
characterisation and identification of targets. Hyperspectral images are of high spectral
resolution compared to panchromatic and multispectral images. Hyperspectral data sets
are generally composed of about 100 to 200 spectral bands of relatively narrow
bandwidths (5-10 nm). The Hyperion is an example of a hyperspectral sensor.
Colour Composite Images
In displaying a colour composite image, three primary colours RGB (red, green and
blue) are used. When these three colours are combined in various proportions, they
produce different colours in the visible spectrum. Associating each spectral band (not
necessarily a visible band) to a separate primary colour produces a colour composite
image.
1.4 TYPES OF RESOLUTION
In remote sensing the term resolution is used to represent the resolving power,
which includes not only the capability to identify the presence of two objects, but also
their properties. In qualitative terms, resolution is the amount of detail that can be
observed in an image. Four types of resolutions are defined for the remote sensing
systems.
i. Spatial Resolution
Spatial resolution is a measure of the area or size of the smallest dimension on the Earth’s
surface over which an independent measurement can be made by the sensor. It is
expressed by the size of the pixel on the ground in meters.
ii. Radiometric Resolution
Radiometric Resolution refers to the smallest change in intensity level that can be
detected by the sensing system. The intrinsic radiometric resolution of a sensing system
depends on the signal to noise ratio of the detector.
iii. Spectral resolution
It is used to describe the ability of a sensor to distinguish between wavelength intervals in
the electromagnetic spectrum (bands). The finer the spectral resolution, the narrower the
wavelength range for a particular channel or band.
iv. Temporal resolution
Temporal resolution is a measure of how often data are obtained for the same area. The
temporal resolution specifies the revisiting frequency of a satellite sensor for a specific
location.
1.5 IMAGE FUSION
In remote sensing applications, the increasing availability of space borne sensors
gives a motivation for different image fusion algorithms. Image fusion is an effective
technique to integrate spatial and spectral information of the PAN and MS images.
Through the remote sensing image fusion technique, we can not only overcome the limitation of information obtained from an individual sensor but also achieve a better observation.
Image fusion has been used in many application areas. In remote sensing multi-sensor
fusion is used to achieve high spatial and spectral resolutions by combining images from
two sensors, one of which has high spatial resolution and the other one high spectral
resolution.
Several situations in image processing require high spatial and high spectral
resolution in a single image. Most of the available equipment is not capable of providing
such data convincingly. Image fusion techniques allow the integration of different
information sources. The fused image can have complementary spatial and spectral
resolution characteristics. However, the standard image fusion techniques can distort the
spectral information of the multispectral data while merging. Advantage of image fusion
is given in Fig.1.2.
Fig.1.2 Advantage of image fusion
Image fusion techniques can improve the quality and increase the application of
these data. At the receiver station, the panchromatic image is merged with the
multispectral data to convey more information. The images used in image fusion should
already be registered. Many methods exist to perform image fusion. The very basic one is
the high pass filtering technique. Later techniques are based on Discrete Wavelet
Transform, uniform rational filter bank, and Laplacian pyramid.
A multispectral image can be bundled with a higher-resolution, panchromatic
image. This allows the user to combine the two in a process called pan-sharpening, which
merges the color bands with the high-resolution black-and-white imagery. As a result,
remote sensing satellites often provide panchromatic (PAN) image with high spatial
resolution and multispectral (MS) image with high spectral resolution.
1.6 IMAGE FUSION ALGORITHMS
The objective of image fusion algorithms is to make full use of spatial and spectral
information in the Panchromatic and Multispectral images respectively, in order to reduce
the potential colour distortion and provide clear image information.
There are two major techniques available:
• Spatial domain fusion
• Transform domain fusion
1.6.1 SPATIAL DOMAIN FUSION TECHNIQUES
Simple average
It is a fact that regions of images that are in focus tend to be of higher pixel
intensity. Thus this algorithm is a simple way of obtaining an output image with all
regions in focus. The value of the pixel P (i, j) of each image is taken and added. This
sum is then divided by 2 to obtain the average. The average value is assigned to the corresponding pixel of the output image, as given in equation (1.1). This is repeated for all pixel values.
K(i, j) = [X(i, j) + Y(i, j)] / 2    (1.1)

where X(i, j) and Y(i, j) are the two input images.
Select maximum
The greater the pixel value, the more in focus the image. Thus this algorithm chooses the in-focus regions from each input image by choosing the greatest value for each pixel, resulting in a highly focused output. The value of the pixel P(i, j) of each image is taken and compared with the others, and the greatest pixel value is assigned to the corresponding output pixel:

F(i, j) = max{A(i, j), B(i, j)}    (1.2)

where A and B are the input images and F is the fused image.
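As an illustrative aside (the report's experiments were carried out in MATLAB, though this snippet is not taken from them, and the file names are hypothetical placeholders), the two pixel-wise rules above can be sketched as:

```matlab
% Minimal sketch of the simple average and select-maximum rules.
% A and B are assumed to be registered grayscale images of equal size.
A = im2double(imread('source1.png'));   % placeholder file name
B = im2double(imread('source2.png'));   % placeholder file name

K = (A + B) / 2;    % simple average, Eq. (1.1)
F = max(A, B);      % select maximum, Eq. (1.2), applied per pixel
```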
Principal Component Analysis (PCA)
Principal component analysis is a method in which a number of correlated variables are transformed into a number of uncorrelated variables called principal components. The
information flow diagram of PCA-based image fusion algorithm is shown in Fig.1.3.
Fig 1.3: Principal Component Analysis
I1(x, y) and I2(x, y) are the two input images to be fused. Column vectors are produced from the input image matrices, and the covariance matrix of the two column vectors is computed. The eigenvalues and eigenvectors of the covariance matrix are then computed. The column vector corresponding to the larger eigenvalue is normalized by dividing each element by the mean of the eigenvector. The normalized eigenvector values act as the weight values, which are respectively multiplied with each pixel of the input images. The fused image matrix is the sum of the two scaled matrices.
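A minimal MATLAB sketch of these steps is given below. It is illustrative rather than the report's own code, and it normalizes the eigenvector by its sum (a common variant of the normalization described above) so that the two weights add up to one:

```matlab
% Illustrative PCA fusion sketch; I1 and I2 are assumed registered
% grayscale images, and the file names are placeholders.
I1 = im2double(imread('source1.png'));
I2 = im2double(imread('source2.png'));

C = cov([I1(:), I2(:)]);         % covariance of the two column vectors
[V, D] = eig(C);                 % eigenvectors and eigenvalues of C
[~, idx] = max(diag(D));         % index of the larger eigenvalue
w = V(:, idx) / sum(V(:, idx));  % normalized weights (sum to one)

F = w(1) * I1 + w(2) * I2;       % fused image: sum of the two scaled images
```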
Weighted Average Method
In this method the resultant fused image is obtained by taking the weighted average intensity of corresponding pixels from both input images:

F(i, j) = W1·A(i, j) + W2·B(i, j)    (1.3)

where A(i, j) and B(i, j) are the input images, F(i, j) is the fused image, and W1, W2 are the weight factors.
The IHS-Based fusion
The IHS pan sharpening technique is the oldest known data fusion method and one
of the simplest. In this technique the following steps are performed, as described in Fig.1.4:
1. The low resolution MS imagery is co-registered to the same area as the high resolution PAN imagery and re-sampled to the same resolution as the PAN imagery.
2. The three re-sampled bands of the MS imagery, which represent the RGB space, are
transformed into IHS components.
3. The PAN imagery is histogram matched to the ‘I’ component. This is done in order to
compensate for the spectral differences between the two images, which occurred due to
different sensors or different acquisition dates and angles.
4. The intensity component of MS imagery is replaced by the histogram matched PAN
imagery. The RGB of the new merged MS imagery is obtained by computing a reverse
IHS to RGB transform.
Fig 1.4: IHS
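A hedged MATLAB sketch of these four steps follows. It uses the HSV colour space as a stand-in for a true IHS transform (an assumption made for brevity), assumes the MS image has already been co-registered and re-sampled to the PAN resolution, and uses the toolbox function imhistmatch for the histogram matching of step 3:

```matlab
% IHS-like pan-sharpening sketch (HSV used in place of IHS).
MS  = im2double(imread('ms_resampled.png'));  % RGB MS, re-sampled (step 1)
PAN = im2double(imread('pan.png'));           % co-registered panchromatic

hsv = rgb2hsv(MS);                           % step 2: RGB -> H, S, V
hsv(:,:,3) = imhistmatch(PAN, hsv(:,:,3));   % steps 3-4: match PAN to 'I', replace it
F = hsv2rgb(hsv);                            % reverse transform back to RGB
```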
1.6.2 TRANSFORM DOMAIN FUSION TECHNIQUES
In frequency domain methods, the image is first transformed into the frequency domain by applying DCT- or DWT-based fusion methods, and the image is further enhanced by altering its frequency components.
Laplacian Pyramid Fusion Method
The basic idea behind the Laplacian pyramid is to perform a pyramid decomposition on every source image, then integrate these decompositions to form a composite representation, and finally reconstruct the fused image by performing an inverse pyramid transform. The various steps used in the Laplacian pyramid based fusion method are as follows:
1. The first step is to construct a pyramid for each source image.
2. Then the fusion is implemented at each level of the pyramid using a feature selection
decision method.
3. The feature selection method selects the most significant pattern from the source image
and copies it to the composite pyramid.
4. Finally, fused image is obtained by performing an inverse pyramid transform.
Fig 1.5: Laplacian Pyramid
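A minimal one-level MATLAB sketch of these steps follows; the single pyramid level and the maximum-absolute-value selection rule are illustrative assumptions, not the report's exact settings:

```matlab
% One-level Laplacian pyramid fusion sketch for grayscale doubles A, B.
function F = lp_fuse(A, B)
    % Step 1: build one pyramid level per source image
    A1 = impyramid(A, 'reduce');
    B1 = impyramid(B, 'reduce');
    LA = A - imresize(impyramid(A1, 'expand'), size(A));  % Laplacian level of A
    LB = B - imresize(impyramid(B1, 'expand'), size(B));  % Laplacian level of B
    % Steps 2-3: fuse each level (average the coarse level, keep the
    % larger-magnitude detail coefficient at each pixel)
    base = (A1 + B1) / 2;
    mask = abs(LB) > abs(LA);
    detail = LA;  detail(mask) = LB(mask);
    % Step 4: inverse pyramid transform (expand and add the details back)
    F = imresize(impyramid(base, 'expand'), size(A)) + detail;
end
```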
Discrete Cosine Transform (DCT)
Spatial domain image fusion methods are complicated and time consuming, and are difficult to apply to real-time images. Fusion approaches applied in the DCT domain are very efficient when the source images are coded in Joint Photographic Experts Group (JPEG) format or when the fused image will be saved or transmitted in
JPEG format. To perform the JPEG coding, an image is first subdivided into blocks of
8x8 pixels. The Discrete Cosine Transform (DCT) is then performed on every block. This
generates 64 coefficients which are then quantized to reduce their magnitude. The
coefficients are then reordered into a one-dimensional array in a zigzag manner before
further entropy encoding takes place. The compression is achieved in two stages: the first during quantization and the second during the entropy coding procedure. JPEG
decoding is the reverse process of encoding.
Discrete Wavelet Transform (DWT)
The wavelet transform decomposes the image into low-low, low-high, high-low
and high-high spatial frequency bands at different scales. The LL band contains the
approximation coefficients whereas the other bands contain directional information due
to spatial orientation. LH band contains the horizontal detail coefficients. HL band
contains the vertical detail coefficients. HH contains the diagonal detail coefficients; the higher absolute values of the wavelet coefficients correspond to salient features such as edges or lines. Fig.1.6 shows Discrete Wavelet Transform (DWT) based image fusion. The wavelets-based approach performs the following tasks:
1. It is a multi-scale (multi-resolution) approach well suited to managing the different image resolutions, and is useful in a number of image processing applications including image fusion.
2. The discrete wavelet transform (DWT) allows the image to be decomposed into different kinds of coefficients while preserving the image information.
3. Such coefficients coming from different images can be appropriately combined to obtain new coefficients, so that the information in the original images is collected appropriately.
4. After the coefficients are merged, the final fused image is obtained by applying the inverse discrete wavelet transform (IDWT), where the information in the merged coefficients is also preserved.
Fig 1.6: Discrete Wavelet Transform
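As a sketch of tasks 2–4, a one-level DWT fusion can be written in MATLAB as follows (Wavelet Toolbox assumed; the 'db1' wavelet, the averaging rule for the approximation band, and the maximum-absolute-value rule for the detail bands are illustrative choices, and A and B are assumed registered grayscale doubles):

```matlab
% One-level DWT fusion sketch for registered grayscale doubles A and B.
[cA1, cH1, cV1, cD1] = dwt2(A, 'db1');   % decompose source A
[cA2, cH2, cV2, cD2] = dwt2(B, 'db1');   % decompose source B

pick = @(x, y) x .* (abs(x) >= abs(y)) + y .* (abs(x) < abs(y));  % max-abs rule
cA = (cA1 + cA2) / 2;                    % average the approximation bands

% merge the detail bands and apply the inverse transform (IDWT)
F = idwt2(cA, pick(cH1, cH2), pick(cV1, cV2), pick(cD1, cD2), 'db1');
```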
CHAPTER 2
LITERATURE SURVEY
An image fusion approach based on markov random fields
M. Xu, H. Chen, and P. Varshney
Markov random field (MRF) models are powerful tools to model image
characteristics accurately and have been successfully applied to a large number of image
processing applications. In this paper the problem of fusion of remote sensing images
based on MRF models is investigated. Fusion algorithm under maximum a posteriori
criterion is used. It is applicable to both multi-scale decomposition (MD)- based image
fusion and non-MD-based image fusion. It is provided to demonstrate the fusion
performance improvement.
Here, the image fusion problem is based on a statistical model. This approach is
applicable for both non-MD- and MD-based fusion approaches. When the raw source
images are directly used for fusion without pre-processing, the fused image can also be
modelled as an MRF, and then, the fusion result can be obtained by incorporating a priori
Gibbs distribution of the fused image. Visual inspection and quantitative performance
evaluation both demonstrate that the employment of the MRF model in the fusion
approaches resulted in a better fusion performance than the traditional fusion approaches.
A simple relationship between each source image and the true scene is assumed, i.e., a source image either contributes to the fused image or does not. If this results in a mismatch between the fusion model and the real image data set, the coefficient in the data model is allowed to take any real value, which may increase the accuracy of the fusion algorithm. The noise in the source images is assumed to be Gaussian.
Guided image filtering
K. He, J. Sun, and X. Tang
In this paper, the authors propose a novel explicit image filter called the guided filter.
Derived from a local linear model, the guided filter computes the filtering output by
considering the content of a guidance image, which can be the input image itself or
another different image. The guided filter can be used as an edge-preserving smoothing
operator like the popular bilateral filter, but it has better behaviours near edges. The
guided filter is also a more generic concept beyond smoothing: It can transfer the
structures of the guidance image to the filtering output, enabling new filtering
applications like de-hazing and guided feathering. Moreover, the guided filter naturally
has a fast and non-approximate linear time algorithm, regardless of the kernel size and
the intensity range. Currently, it is one of the fastest edge-preserving filters. Experiments
show that the guided filter is both effective and efficient in a great variety of computer
vision and computer graphics applications, including edge-aware smoothing, detail
enhancement, compression, image matting/feathering, de-hazing, joint up-sampling, etc.
Generalized random walks for fusion of multi-exposure images
R. Shen, I. Cheng, J. Shi, and A. Basu
A single captured image of a real-world scene is usually insufficient to reveal all the details due to under- or over-exposed regions. This problem can be solved by first capturing images of the same scene under different exposure settings and then combining them into a single image using image fusion techniques. The aim is to achieve an
optimal balance between two quality measures, i.e., local contrast and color consistency,
while combining the scene details revealed under different exposures. A generalized
random walks framework is proposed to calculate a globally optimal solution subject to
the two quality measures by probability estimation. Experiments demonstrate that this
algorithm generates high-quality images at low computational cost. Experimental results
demonstrated that this probabilistic fusion produces good results, in which contrast is
enhanced and details are preserved with high computational efficiency. Compared to
other fusion methods this algorithm produces images with comparable or even better
qualities.
Adaptive multi-focus image fusion using a wavelet based statistical sharpness
measure
J. Tian and L. Chen
Multi-focus image fusion combines a set of images that are captured from the same scene but with different focuses to produce a sharper image. Based on the observation that the marginal distribution of the wavelet coefficients differs for images with different focus levels, a new statistical sharpness measure of the degree of image blur is proposed. It is evaluated using a locally adaptive Laplacian mixture model. The proposed sharpness measure is then exploited to perform adaptive image fusion in the wavelet domain. The proposed approach could be further extended to the redundant or complex wavelet domains.
Edge-preserving decompositions for multi-scale tone and detail manipulation
Z. Farbman, R. Fattal, D. Lischinski, and R. Szeliski
The authors note that many recent computational photography techniques decompose an image into a piecewise smooth base layer, containing large scale variations in intensity, and a residual detail layer capturing the smaller scale details in the image. It is important to control the spatial scale of the extracted details, and it is desirable to manipulate details at multiple scales while avoiding visual artifacts. A new way to construct edge-preserving multi-scale image decompositions is given. Current base-detail decomposition
techniques, based on the bilateral filter, are limited in their ability to extract detail at
arbitrary scales. The weighted least squares optimization framework, which is
particularly well suited for progressive coarsening of images and for multi-scale detail
extraction is described. After describing this operator, it is compared with bilateral filter
and other schemes.
Multi-scale contrast manipulation is a valuable digital darkroom technique.
Currently it is possible to sharpen images (which may be viewed as increasing the local
contrast of the finest scale details), as well as to adjust the global contrast. This does not
suffer from some of the drawbacks of bilateral filtering and other previous approaches.
In future work, the smoothness coefficients for the WLS formulation could be enhanced further by improving the ability to preserve edge color. While manually adjusting the saturation alleviates the problem, a more principled solution is needed.
Image fusion: Advances in the state of the art
A. A. Goshtasby and S. Nikolov
The authors describe image fusion as the process of combining information from two or more images of a scene into a single composite image that is more informative and more suitable for visual perception or computer processing. The
objective in image fusion is to reduce uncertainty and minimize redundancy in the output
while maximizing relevant information particular to an application or task. There are
several benefits in using image fusion: wider spatial and temporal coverage, decreased
uncertainty, improved reliability, and increased robustness of system performance. A
single sensor cannot produce a complete representation of a scene. Visible images
provide spectral and spatial details, and if a target has the same color and spatial
characteristics as its background, it cannot be distinguished from the background. If
visible images are fused with thermal images, a target that is warmer or colder than its
background can be easily identified, even when its color and spatial details are similar to
those of its background. Fused images can provide information that sometimes cannot be
observed in the individual input images. Successful image fusion significantly reduces
the amount of data to be viewed or processed without significantly reducing the amount
of relevant information.
Multifocus image fusion using the non-subsampled contourlet transform
Q. Zhang and B. Guo
A novel image fusion algorithm based on the non-subsampled contourlet transform
(NSCT) is proposed in this paper, which aims at solving the fusion problem of multifocus
images. Based on the directional vector normal, a ‘selecting’ scheme combined with the
‘averaging’ scheme is presented for the low-pass sub-band coefficients. Based on the
directional band limited contrast and the directional vector standard deviation, a selection
principle is put forward for the band-pass directional sub-band coefficients. It not only
extracts more important visual information from source images, but also effectively
avoids the introduction of artificial information. It significantly outperforms the
traditional discrete wavelet transform-based and the discrete wavelet frame transform-
based image fusion methods in terms of both visual quality and objective evaluation,
especially when the source images are not perfectly registered. The NSCT is more
suitable for image fusion because of many advantages such as multi-scale, localization,
multi-direction, and shift-invariance. Several sets of multi-focus images have been used
to evaluate the performance of the proposed fusion algorithm. The NSCT-based fusion algorithm performs well in some cases. However, the improved performance is at the cost
of increasing computational complexity and memory during the fusion process. In some
cases such as image coding and image compression, where redundancy is a major issue,
the higher redundancy of the NSCT may also limit its applications.
A total variation-based algorithm for pixel level image fusion
M. Kumar and S. Dass
In this paper, a total variation (TV) based approach is proposed for pixel-level
fusion to fuse images acquired using multiple sensors. Fusion is an inverse problem and a
locally affine model is used as the forward model. A TV semi norm based approach in
conjunction with principal component analysis is used iteratively to estimate the fused
image. The feasibility of the algorithm is demonstrated on images from computed
tomography and magnetic resonance imaging as well as visible-band and infrared
sensors. It has been applied to several different types of datasets; future work should focus on analysing the algorithm's performance with additional datasets.
Image Fusion with Guided Filtering
Shutao Li, Xudong Kang, Jianwen Hu
A fast and effective image fusion method is proposed for creating a highly
informative fused image through merging multiple images. The proposed method is
based on a two-scale decomposition of an image into a base layer containing large scale
variations in intensity, and a detail layer capturing small scale details. A novel guided
filtering-based weighted average technique is proposed to make full use of spatial
consistency for fusion of the base and detail layers. Experimental results demonstrate that
the proposed method can obtain state-of-the-art performance for fusion of multispectral,
multifocus, multimodal, and multiexposure images. The proposed method utilizes the
average filter to get the two-scale representations, which is simple and effective. The
guided filter is used in a novel way to make full use of the strong correlations between
neighborhood pixels for weight optimization. Experiments show that the proposed
method can well preserve the original and complementary information of multiple input
images.
CHAPTER 3
METHODOLOGY
3.1 INTRODUCTION
In this project an explicit image filter called the guided filter is used. The guided filter computes the filtering output by considering the content of a guidance image, which can be the input image itself or another different image. The guided filter can be used as an edge preserving smoothing operator like the popular bilateral filter, but has better behaviour near edges. The guided filter is also a more generic concept beyond smoothing: it can transfer the structures of the guidance image to the filtering output,
enabling new filtering applications like de-hazing and guided feathering. Moreover, the
guided filter naturally has a fast and non-approximate linear time algorithm, regardless of
the kernel size and the intensity range.
Currently it is one of the fastest edge preserving filters. Experiments show that the
guided filter is both effective and efficient in a great variety of computer vision and
computer graphics applications including edge aware smoothing, detail enhancement,
HDR compression, image matting /feathering, de-hazing, joint up-sampling, etc.
3.2 GUIDED IMAGE FILTERING
The guided filter assumes that the filtering output O is a linear transformation of the guidance image I in a local window ω_k centered at pixel k:
O_i = a_k I_i + b_k,  ∀ i ∈ ω_k    (3.1)

where ω_k is a square window of size (2r+1) × (2r+1).
The linear coefficients a_k and b_k are constant in ω_k and can be estimated by minimizing the squared difference between the output image O and the input image P:
E(a_k, b_k) = Σ_{i∈ω_k} [ (a_k I_i + b_k − P_i)² + ε a_k² ]    (3.2)
where ε is a regularization parameter given by the user.
The coefficients 𝑎𝑘 and 𝑏𝑘 can be directly solved by linear regression equation as
follows:
a_k = [ (1/|ω|) Σ_{i∈ω_k} I_i P_i − μ_k P̄_k ] / (σ_k² + ε)    (3.3)

b_k = P̄_k − a_k μ_k    (3.4)

where μ_k and σ_k² are the mean and variance of I in ω_k respectively, |ω| is the number of pixels in ω_k, and P̄_k is the mean of P in ω_k. Next, the output image can be calculated
according to (3.1). As shown in Fig. 3.1, all local windows ω_k centered at a pixel k near pixel i will contain pixel i, so the value of O_i in (3.1) will change when it is computed in different windows ω_k. To solve this problem, all the possible values of the coefficients a_k and b_k are first averaged. Then, the filtering output is estimated as follows:
O_i = ā_i I_i + b̄_i    (3.5)

where ā_i = (1/|ω|) Σ_{k∈ω_i} a_k and b̄_i = (1/|ω|) Σ_{k∈ω_i} b_k. In this report, G_{r,ε}(P, I) is used to represent the guided filtering operation, where r and ε are the parameters which decide the filter size and blur degree of the guided filter, respectively, and P and I are the input image and guidance image, respectively.
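Equations (3.1)–(3.5) translate almost line by line into MATLAB. The sketch below is a direct, unoptimized transcription (window means are computed with imfilter; He et al.'s O(N) box-filter trick is omitted for clarity):

```matlab
% Grayscale guided filter following Eqs. (3.1)-(3.5).
% I: guidance image, P: input image (both double), r: window radius,
% epsr: the regularization parameter epsilon.
function O = guidedfilt(I, P, r, epsr)
    w = ones(2*r + 1) / (2*r + 1)^2;          % averaging kernel over omega_k
    boxmean = @(X) imfilter(X, w, 'replicate');
    muI  = boxmean(I);   muP = boxmean(P);    % window means of I and P
    varI = boxmean(I .* I) - muI.^2;          % variance of I in each window
    covIP = boxmean(I .* P) - muI .* muP;     % covariance of I and P
    a = covIP ./ (varI + epsr);               % Eq. (3.3)
    b = muP - a .* muI;                       % Eq. (3.4)
    O = boxmean(a) .* I + boxmean(b);         % Eq. (3.5), with averaged a, b
end
```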
Furthermore, when the input is a color image, the filtering output can be obtained by conducting the guided filtering on the red, green, and blue channels of the input image, respectively. When the guidance image I is a color image, the guided filter is extended by the following steps.
Fig. 3.1 Window Choice.
First, equation (3.1) is rewritten as follows:
O_i = a_kᵀ I_i + b_k,  ∀ i ∈ ω_k    (3.6)

where a_k is a 3 × 1 coefficient vector and I_i is a 3 × 1 color vector. Then, similar to (3.3)–(3.5), the output of guided filtering can be calculated as follows:

a_k = (Σ_k + εU)⁻¹ [ (1/|ω|) Σ_{i∈ω_k} I_i P_i − μ_k P̄_k ]    (3.7)

b_k = P̄_k − a_kᵀ μ_k    (3.8)

O_i = ā_iᵀ I_i + b̄_i    (3.9)
where Σ_k is the 3×3 covariance matrix of I in ω_k, and U is the 3 × 3 identity matrix.
3.3 SMOOTHING OPERATORS
Smoothing operators are used to smoothen and decompose the source images.
Here average and Gaussian filter are used for decomposition.
3.3.1 AVERAGE FILTER
Replace each pixel by the average of its neighboring pixels. Assume a 3x3
neighborhood. The filtering is done as
I(i, j) = (p0 + p1 + p2 + p3 + p4 + p5 + p6 + p7 + p8) / 9    (3.10)
In general a filter applies a function over the values of a small neighborhood of
pixels to compute the result. The size of the filter is equal to the size of the neighborhood:
3×3, 5×5, 7×7, ..., 21×21, and so on. The shape of the filter region is not necessarily square; it can be a rectangle or a circle.
3.3.2 GAUSSIAN FILTER
It is used to blur images and remove noise and detail.
The Gaussian function is:

G(x, y) = (1 / (2πσ²)) e^(−(x² + y²) / (2σ²))    (3.11)
Where, σ is the standard deviation. The distribution is assumed to have a mean of 0.
The effect of Gaussian smoothing is to blur an image, in a similar fashion to
the mean filter. The degree of smoothing is determined by the standard deviation of the
Gaussian. (Larger standard deviation Gaussians, of course, require larger convolution kernels in order to be accurately represented.) The Gaussian outputs a 'weighted average' of each pixel's neighborhood, with the average weighted more towards the value of the central pixels. This is in contrast to the mean filter's uniformly weighted average.
Because of this, a Gaussian provides gentler smoothing and preserves edges better than a
similarly sized mean filter.
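For reference, both smoothing operators are available directly through MATLAB's Image Processing Toolbox; the kernel sizes and σ below are illustrative values, and I stands for an assumed grayscale source image:

```matlab
% Average and Gaussian smoothing of an assumed grayscale image I.
h_avg   = fspecial('average', 3);          % 3x3 mean filter, Eq. (3.10)
h_gauss = fspecial('gaussian', 7, 1.5);    % 7x7 Gaussian kernel, sigma = 1.5
B_avg   = imfilter(I, h_avg,   'replicate');
B_gauss = imfilter(I, h_gauss, 'replicate');
```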
3.4 VARIOUS IMAGE SEGMENTATION TECHNIQUES
Image segmentation is typically used to locate objects and boundaries (lines,
curves, etc.) in images. Edge detection techniques have therefore been used as the base of
segmentation technique. Edge detection refers to the process of identifying and locating
sharp discontinuities in an image. Edge detection significantly reduces the amount of data
and filters out useless information, while preserving the important structural properties in
an image. The discontinuities are abrupt changes in pixel intensity which characterize
boundaries of objects in a scene.
3.4.1 SOBEL OPERATOR
The operator consists of a pair of 3×3 convolution kernels as shown in Fig.3.2.
One kernel is simply the other rotated by 90°.
Fig.3.2: Masks used by Sobel Operator (Gx and Gy)
These kernels are designed to respond maximally to edges running vertically and
horizontally relative to the pixel grid, one kernel for each of the two perpendicular
orientations. The kernels can be applied separately to the input image, to produce
separate measurements of the gradient component in each orientation(𝐺𝑥, 𝐺𝑦). These can
then be combined together to find the absolute magnitude of the gradient. The gradient
magnitude is given by:
|G| = √(Gx² + Gy²)    (3.12)
Typically, an approximate magnitude is computed using:
|G| = |Gx| + |Gy|    (3.13)
which is much faster to compute. The Sobel operator is easy to implement in the spatial domain and has a smoothing effect on noise; it can provide more accurate edge direction information, but it will also detect many false edges with coarse edge widths.
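Since the kernel figure is easiest to read in code form, the sketch below spells out the standard Sobel kernels and Eqs. (3.12)–(3.13); I is an assumed grayscale image in double format:

```matlab
% Sobel gradient magnitude per Eqs. (3.12)-(3.13).
Kx = [-1 0 1; -2 0 2; -1 0 1];   % standard Sobel kernel (Gx direction)
Ky = Kx';                        % the same kernel rotated by 90 deg (Gy)
Gx = conv2(I, Kx, 'same');
Gy = conv2(I, Ky, 'same');
G       = sqrt(Gx.^2 + Gy.^2);   % exact magnitude, Eq. (3.12)
Gapprox = abs(Gx) + abs(Gy);     % faster approximation, Eq. (3.13)
```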
3.4.2 ROBERT’S CROSS OPERATOR
The Roberts Cross operator performs a simple, quick to compute, 2-D spatial
gradient measurement on an image. Pixel values at each point in the output represent the
estimated absolute magnitude of the spatial gradient of the input image at that point. The
operator consists of a pair of 2×2 convolution kernels as shown in Fig.3.3. One kernel is
simply the other rotated by 90°. This is very similar to the Sobel operator.
Fig.3.3: Masks used by Roberts Operator (Gx and Gy)
These kernels are designed to respond maximally to edges running at 45° to the
pixel grid, one kernel for each of the two perpendicular orientations. The kernels can be
applied separately to the input image, to produce separate measurements of the gradient
component in each orientation(𝐺𝑥, 𝐺𝑦). These can then be combined together to find the
absolute magnitude of the gradient. The gradient magnitude is given by:
|G| = √(Gx² + Gy²)    (3.14)
Typically, an approximate magnitude is computed using:
|G| = |Gx| + |Gy|    (3.15)
which is much faster to compute.
3.4.3 PREWITT’S OPERATOR
Prewitt operator is similar to the Sobel operator and is used for detecting vertical
and horizontal edges in images. The operator consists of a pair of 3×3 convolution
kernels as shown in Fig.3.4.
Fig.3.4: Masks used by Prewitt Operator (Gx and Gy)
3.4.4 CANNY OPERATOR
The steps involved in the Canny edge detection algorithm are as follows:
1. Apply a Gaussian filter to smooth the image in order to remove noise.
2. Find the intensity gradients of the image. The edges should be marked where the gradients of the image have large magnitudes. The masks used for finding the gradients are given in Fig.3.5.
Fig.3.5: Masks used by Canny Operator (Gx and Gy)
3. Apply non-maximum suppression to get rid of spurious responses to edge detection. Non-maximum suppression is applied to the gradient magnitude to trace along the edge direction and suppress those pixel values that are not considered edges, thus thinning the edges.
4. Apply double thresholding to determine potential edges.
5. Track edge by hysteresis: Finalize the detection of edges by suppressing all the other
edges that are weak and not connected to strong edges.
3.4.5 LAPLACIAN OF GAUSSIAN
Laplacian filters are derivative filters used to find areas of rapid change (edges) in
images. Since derivative filters are very sensitive to noise, it is common to smooth the
image (e.g., using a Gaussian filter) before applying the Laplacian. This process is called
the Laplacian of Gaussian (LoG) operation.
The Laplacian L(x, y) of an image with pixel intensity values I(x, y) is given by:

L(x, y) = ∂²I/∂x² + ∂²I/∂y²    (3.16)
There are different ways to find an approximate discrete convolution kernel that
approximates the effect of the Laplacian.
Two commonly used kernels are shown in Fig.3.6.

Fig.3.6 Laplacian filter
This is called a positive Laplacian because the central peak is positive. It is just as appropriate to reverse the signs of the elements to get a negative Laplacian. To include a smoothing Gaussian filter, the Laplacian and Gaussian functions are combined to obtain a single equation:
LoG(x, y) = −(1 / (πσ⁴)) [1 − (x² + y²)/(2σ²)] e^(−(x² + y²)/(2σ²))    (3.17)
The LoG operator takes the second derivative of the image. Where the image is basically
uniform, the LoG will give zero. Wherever a change occurs, the LoG will give a positive
response on the darker side and a negative response on the lighter side. At a sharp edge
between two regions, the response will be
• zero away from the edge
• positive just to one side
• negative just to the other side
• zero at some point in between on the edge itself
If the original image is filtered with a simple Laplacian, the resulting output is rather noisy. Using a larger σ for the Gaussian will reduce the noise, but the sharpening effect will also be reduced; the choice of σ is therefore a compromise.
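All five operators discussed so far are also exposed through the Image Processing Toolbox edge() function, which makes a side-by-side comparison straightforward; thresholds are left at their defaults here (an assumption), and I is an assumed grayscale image:

```matlab
% Edge maps from the five classical operators, default thresholds.
E_sobel   = edge(I, 'sobel');
E_prewitt = edge(I, 'prewitt');
E_roberts = edge(I, 'roberts');
E_canny   = edge(I, 'canny');
E_log     = edge(I, 'log');      % Laplacian of Gaussian
```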
3.4.6 PSO (PARTICLE SWARM OPTIMIZATION) SEGMENTATION
The process of dividing an image into multiple segments is called image segmentation. Particle swarm optimization belongs to the class of swarm intelligence techniques that are used to solve optimization problems. PSO simulates the behaviour of bird flocking: a group of birds randomly searches for food in an area, and there is only one piece of food in the area being searched. The birds do not know where the food is, but they know how far away the food is in each iteration, so the best way to find the food is to follow the bird which is nearest to the food. Flocking behaviour is the behaviour exhibited when a group of birds, called a flock, are foraging.
Each particle in PSO [13,14] is updated by following two "best" values:
pbest - Each particle keeps track of its coordinates in the solution space which are associated with the best solution (fitness) it has achieved so far. This value is called the personal best, pbest.
gbest - The best among all the pbest values is tracked by the PSO. This is called the global best, gbest.
Each particle modifies its position with the help of:
• the current positions,
• the current velocities,
• the distance between the current position and pbest,
• the distance between the current position and the gbest.
After finding the two best values, the particle updates its velocity and positions with the
following equation,
v[] = v[] + c1 · rand() · (pbest[] − present[]) + c2 · rand() · (gbest[] − present[])    (3.18)

present[] = present[] + v[]    (3.19)
where v[] is the particle velocity and present[] is the current particle (solution); rand() is a random number in (0, 1), and c1, c2 are learning factors, usually c1 = c2 = 2.
ALGORITHM
The Algorithm of PSO Segmentation is sequenced as,
Step 1: Read the input image to be segmented.
Step 2: Select PSO method to be applied on that image with a particular threshold level.
Step 3: For each particle in the population update particle’s fitness in the search space and
update particle’s best in the search space, then move the particle in the population.
Step 4: For each particle, if swarm gets better then extend the swarm/particle life.
Step 5: For each particle, if swarm is not improving its performance then reduce the
swarm life.
Step 6: The swarm is considered for next iteration.
Step 7: The failed swarms are deleted.
Step 8: Reset threshold counter.
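To make the update rules (3.18)–(3.19) concrete, the hedged sketch below applies them to single-threshold segmentation. The between-class-variance fitness (Otsu's criterion) and all the swarm settings are illustrative assumptions, not necessarily the report's exact choices; I is assumed to be a grayscale image with values 0–255.

```matlab
% PSO search for a single segmentation threshold, Eqs. (3.18)-(3.19).
function t = pso_threshold(I)
    counts = imhist(uint8(I));              % 256-bin histogram of I
    p = counts / sum(counts);               % normalized histogram
    fitness = @(th) between_class_var(p, round(th));
    n = 20;  c1 = 2;  c2 = 2;               % swarm size and learning factors
    x = 1 + 253 * rand(n, 1);               % particle positions (thresholds)
    v = zeros(n, 1);                        % particle velocities
    pbest = x;  pval = arrayfun(fitness, x);
    [~, g] = max(pval);  gbest = pbest(g);
    for iter = 1:50
        v = v + c1*rand(n,1).*(pbest - x) + c2*rand(n,1).*(gbest - x); % (3.18)
        x = min(max(x + v, 1), 254);                                   % (3.19)
        f = arrayfun(fitness, x);
        upd = f > pval;                     % particles that improved
        pbest(upd) = x(upd);  pval(upd) = f(upd);
        [~, g] = max(pval);  gbest = pbest(g);
    end
    t = round(gbest);
end

% Otsu-style fitness: between-class variance of the two segments.
function s = between_class_var(p, t)
    w0 = sum(p(1:t));  w1 = 1 - w0;
    if w0 == 0 || w1 == 0, s = 0; return; end
    m0 = sum((0:t-1)' .* p(1:t))     / w0;  % mean of segment below t
    m1 = sum((t:255)' .* p(t+1:256)) / w1;  % mean of segment above t
    s = w0 * w1 * (m1 - m0)^2;
end
```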
3.5 WEIGHTED AVERAGING
An average in which each quantity to be averaged is assigned a weight. These
weightings determine the relative importance of each quantity in the average; a weighting is equivalent to having that many like items with the same value involved in the average.
3.6 BLOCK DIAGRAM
Fig.3.7 Block Diagram
[Block diagram: the source images I1 and I2 are decomposed by a smoothing operator into base layers B1, B2 and detail layers D1, D2; segmentation and saliency comparison produce weight maps P1 and P2, which guided filtering refines; weighted averaging then yields the fused base layer B and fused detail layer D, which are combined into the fused image F.]
The sequence of steps is described below, as shown in Fig.3.7.
3.6.1 IMAGE DECOMPOSITION
The base layers are obtained by Gaussian filtering the source images:
B_n = I_n * Z    (3.20)

where B_n are the base layers, Z is the Gaussian filter kernel, and I_n are the source images.
The detail layers are obtained by subtracting the base layers from source images.
D_n = I_n − B_n    (3.21)
3.6.2 WEIGHT MAP CONSTRUCTION
PSO segmentation is applied to source images to get the Saliency map.
S_n = PSO_seg(I_n)    (3.22)

where S_n is the saliency map obtained by applying PSO segmentation to I_n.
The weight map is obtained by comparison of the saliency maps:
P_n^k = 1 if S_n^k = max(S_1^k, S_2^k, …, S_N^k), and 0 otherwise    (3.23)

where N is the number of source images and S_n^k is the saliency value of pixel k in the nth image.
Saliency Comparison refers to assigning similar weights if two adjacent pixels have
similar brightness or color.
Refined weight map is constructed by using guided filter.
• Computes the filtering output by considering the guidance image.
• Guidance image can be the input image itself or another different image.
Guided image filtering is performed on each weight map P_n with the corresponding source image I_n serving as the guidance image:
W_n^B = G_{r1,ε1}(P_n, I_n)    (3.24)

W_n^D = G_{r2,ε2}(P_n, I_n)    (3.25)
3.6.3 IMAGE RECONSTRUCTION
The base and detail layers of different source images and the refined weight maps are
fused together by weighted averaging.
B′ = Σ_{n=1}^{N} W_n^B · B_n    (3.26)

D′ = Σ_{n=1}^{N} W_n^D · D_n    (3.27)
The final fused image is obtained by combining B’ and D’.
F = B′ + D′    (3.28)
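Putting (3.20)–(3.28) together for two grayscale sources gives the compact sketch below. It is illustrative only: the toolbox function imguidedfilter stands in for G_{r,ε}, the saliency comparison is reduced to a per-pixel comparison of precomputed saliency maps S1 and S2, and the filter parameter values are assumptions rather than the report's exact settings.

```matlab
% Two-source fusion pipeline following Eqs. (3.20)-(3.28).
% I1, I2: registered grayscale sources; S1, S2: their saliency maps
% (e.g. from PSO segmentation), all assumed to be precomputed doubles.
Z  = fspecial('gaussian', 7, 1.5);                % smoothing operator (assumed size)
B1 = imfilter(I1, Z, 'replicate');  D1 = I1 - B1; % Eqs. (3.20)-(3.21)
B2 = imfilter(I2, Z, 'replicate');  D2 = I2 - B2;

P1 = double(S1 >= S2);  P2 = 1 - P1;              % weight maps, Eq. (3.23)

% refined weight maps, Eqs. (3.24)-(3.25); parameter values are assumed
W1B = imguidedfilter(P1, I1, 'NeighborhoodSize', 45, 'DegreeOfSmoothing', 0.3);
W2B = imguidedfilter(P2, I2, 'NeighborhoodSize', 45, 'DegreeOfSmoothing', 0.3);
W1D = imguidedfilter(P1, I1, 'NeighborhoodSize', 7,  'DegreeOfSmoothing', 1e-6);
W2D = imguidedfilter(P2, I2, 'NeighborhoodSize', 7,  'DegreeOfSmoothing', 1e-6);

B = W1B.*B1 + W2B.*B2;                            % Eq. (3.26), weighted averaging
D = W1D.*D1 + W2D.*D2;                            % Eq. (3.27)
F = B + D;                                        % Eq. (3.28)
```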
3.7 PARAMETERS USED FOR COMPARISON
MEAN SQUARED ERROR (MSE)
Mean square error between the reference image and the fused image is given by
MSE = (1/mn) Σ_{i=1}^{m} Σ_{j=1}^{n} (A_ij − B_ij)²    (3.29)
where:
• A_ij – the reference image,
• B_ij – the fused image to be assessed,
• i – pixel row index,
• j – pixel column index,
• m, n – number of rows and columns.
PEAK SIGNAL TO NOISE RATIO (PSNR)
PSNR is the ratio between the maximum possible power of the signal and the power of the corrupting noise that distorts the image. The PSNR measure is given by:
PSNR = 10 log10(255² / MSE)    (3.30)
STRUCTURAL SIMILARITY INDEX MEASURE(SSIM)
The Structural similarity index is a measure of structural information change in the
fused image such as luminance and contrast. The SSIM index is a decimal value between
0 and 1. The equation for SSIM is
SSIM = [(2 μ_x μ_y + c1)(2 σ_xy + c2)] / [(μ_x² + μ_y² + c1)(σ_x² + σ_y² + c2)]    (3.31)
where
μ_x – the average of x
μ_y – the average of y
σ_x² – the variance of x
σ_y² – the variance of y
σ_xy – the covariance of x and y
c1, c2 – two variables to stabilize the division with a weak denominator
NORMALIZED CROSS CORRELATION(NK)
Normalized cross correlation is used to find the similarity between the fused image and the registered image, and is given by the following equation:
NK = Σ_{i=1}^{m} Σ_{j=1}^{n} (A_ij · B_ij) / Σ_{i=1}^{m} Σ_{j=1}^{n} A_ij²    (3.32)
MAXIMUM DIFFERENCE(MD)
Maximum difference is the largest absolute difference between corresponding pixels of the two images. A large value of maximum difference means that the image is poor in quality.

MD = max(|A_ij − B_ij|)    (3.33)
NORMALIZED ABSOLUTE ERROR (NAE)
A large value of normalized absolute error means that the image is of poor quality. NAE is defined as follows:

NAE = Σ_{i=1}^{m} Σ_{j=1}^{n} |A_ij − B_ij| / Σ_{i=1}^{m} Σ_{j=1}^{n} A_ij    (3.34)
CHAPTER 4
SIMULATION RESULTS
The results of the guided filter weighted averaging based image fusion method on MS and PAN images are presented in this chapter. The performance measures used to analyse the results are Mean Square Error (MSE), Peak Signal to Noise Ratio (PSNR), Structural Similarity Index (SSIM), Normalized Cross Correlation (NK), Maximum Difference (MD) and Normalized Absolute Error (NAE). The images were analysed and processed in MATLAB. IKONOS and QuickBird satellite images are used here.
4.1 RESULTS
4.1.1 INPUT IMAGES OF IKONOS SENSOR
IKONOS is a commercial earth observation satellite, and was the first to collect
publicly available high resolution imagery at 1 and 4 meter resolution. It offers
multispectral (MS) and panchromatic (PAN) imagery. The IKONOS launch was called
“one of the most significant developments in the history of the space age”. The spatial resolution of the sensor is 0.8 m for panchromatic imagery (1 m PAN products) and 4 m for multispectral imagery (4 m MS). The spectral resolution of the Ikonos sensor is given in Table.4.1.
Table.4.1. Spectral Resolution of Ikonos Sensor
Band    Wavelength Region (μm)
1       0.45–0.52 (blue)
2       0.52–0.60 (green)
3       0.63–0.69 (red)
4       0.76–0.90 (near IR)
PAN     0.45–0.90 (PAN)
The input images (panchromatic and multispectral) from the Ikonos sensor are shown in Fig.4.1.
(a) (b)
Fig 4.1 Panchromatic and Multispectral Image of Ikonos Sensor
4.1.2 INPUT IMAGES OF QUICKBIRD SENSOR
QuickBird was a high resolution commercial earth observation satellite, owned by
Digital Globe. QuickBird used Ball Aerospace's Global Imaging System 2000 (BGIS
2000). The satellite was initially expected to collect imagery at 1 meter resolution. It collected panchromatic (black and white) imagery at 61 centimeter resolution and multispectral imagery at 2.44 m (at 450 km altitude) to 1.63 m (at 300 km) resolution, as the orbit altitude was lowered towards the end of the mission life. At this resolution, details such as buildings and other infrastructure are easily visible. The imagery can be used in mapping applications such as Google Earth and Google Maps. The spectral resolution of the QuickBird sensor is given in Table.4.2.
Table.4.2. Spectral Resolution of Quickbird Sensor
Band    Wavelength Region (nm)
1       430–545 (blue)
2       466–620 (green)
3       590–710 (red)
4       715–918 (near IR)
PAN     405–1053 (PAN)
The input images (panchromatic and multispectral) from the Quickbird sensor are shown in Fig.4.2.
(a) (b)
Fig 4.2 Panchromatic and Multispectral Image of Quickbird Sensor
4.1.3 COMPARISON OF FUSION ALGORITHMS
The various fusion algorithms such as Weighted Average, Laplacian Pyramid
transform, Principal Component Analysis and Discrete Wavelet Transform are performed
and their metrics have been evaluated. The fused results are shown in the following figures. The smoothing operator used is the average filter and the edge detection method is the Laplacian of Gaussian.
(a) (b)
Fig.4.3 Fused result of Weighted Average Fusion
(a) (b)
Fig.4.4 Fused result of Laplacian Pyramid Fusion
(a) (b)
Fig.4.5 Fused result of Principal Component Analysis Fusion
(a) (b)
Fig.4.6 Fused result of Discrete Wavelet Transform Fusion
From the fused results, the fusion is clear in weighted averaging, whereas DWT and LPT provide only brightness and the PCA result is too dark.
PERFORMANCE METRICS
The performance of the fusion methods is given in Table.4.3 and Table.4.4.
Table.4.3.Performance Metrics of Fusion Algorithms of Input set 1
METRICS PSNR MD NK NAE MSE SSIM
Weighted average 18.7739 175 0.9386 0.4192 862.3716 0.4599
Laplacian Pyramid
Transform
9.7281 254.02 0.0058 0.9862 6922.6 0.0018
PCA 12.4176 207 0.2963 0.7488 3726.6 0.2912
DWT 9.7279 254 0.0058 0.9868 6922.9 0.0017
Table.4.4.Performance Metrics of Fusion Algorithms of Input set 2
METRICS PSNR MD NK NAE MSE SSIM
Weighted average 17.7332 132 0.8181 0.2241 1095.9 0.7403
Laplacian Pyramid
Transform
5.5342 254.06 0.0040 0.9961 18183 7.7617e-05
PCA 7.2653 164 0.2035 0.8342 12205 0.2466
DWT 5.5344 254.02 0.0040 0.9961 18182 7.6097e-05
4.1.4 COMPARISON OF VARIOUS SEGMENTATION TECHNIQUES
The various segmentation methods, namely Sobel, Prewitt, Roberts, Canny, LoG and PSO segmentation, are applied at the segmentation block and compared using various performance metrics. The average filter is used at the smoothing operator block. The fused results of both sets of input images for these methods are shown in the figures below.
(a) (b)
Fig 4.7 Fused Images using LoG operator
(a) (b)
Fig 4.8 Fused Images using Canny operator
(a) (b)
Fig 4.9 Fused Images using Sobel operator
(a) (b)
Fig 4.10 Fused Images using Prewitt operator
Fig 4.11 Fused Images using Roberts operator
Fig 4.12 Fused Images using PSO Segmentation
From the fused images above, the PSO segmentation based fused image provides better information than the other edge detection methods such as LoG, Roberts, Canny, Prewitt and Sobel.
PERFORMANCE METRICS
The Performance metrics such as Mean Square Error(MSE), Peak Signal to Noise
Ratio(PSNR), Normalized Cross Correlation(NK), Maximum Difference(MD),
Normalized Absolute Error(NAE) and Structural Similarity Index Measure are calculated
for both Quickbird and Ikonos sensor images and are tabulated as shown in Table.4.5 and
Table.4.6.
Table.4.5. Performance Metrics of Quickbird image

Method              MSE         PSNR     NK      MD   NAE     SSIM
LoG                 861.5980    18.7778  0.9387  175  0.4191  0.4600
PSO segmentation    366.4901    22.4902  1.0542  151  0.1905  0.6195
Canny               1.3768e+03  16.7420  0.8745  187  0.5185  0.1583
Sobel               687.0460    19.7609  1.0585  185  0.3355  0.2342
Prewitt             1.3773e+03  16.7405  0.8743  188  0.5186  0.1583
Roberts             1.3774e+03  16.7403  0.8743  187  0.5186  0.1583

Table.4.6. Performance Metrics of Ikonos image

Method              MSE         PSNR     NK      MD   NAE     SSIM
LoG                 1.0961e+03  17.7322  0.8181  132  0.2242  0.7402
PSO segmentation    541.6869    20.7933  1.0675  131  0.1493  0.7436
Canny               2.9450e+03  13.4400  0.7660  247  0.3404  0.0522
Sobel               2.2685e+03  14.5735  0.9932  218  0.3994  0.0731
Prewitt             2.9478e+03  13.4358  0.7655  247  0.3406  0.0521
Roberts             2.9480e+03  13.4355  0.7655  247  0.3407  0.0521
INFERENCE
From Table.4.5 and Table.4.6 it is clear that PSO segmentation performs the best compared to the other methods.
4.1.5 COMPARISON OF SMOOTHING OPERATORS
Smoothing operators such as the average, Gaussian and disk filters are used at the smoothing operator block and compared using the metrics. As PSO segmentation performed best in the above results, it is used as the segmentation method in this process. The results are as follows.
(a) (b)
Fig 4.13 Fused Images using average smoothing operator
The disk smoothing operator is a circular averaging filter; it filters the image based on the given radius value. The result of the disk smoothing filter is given in Fig.4.14.
(a) (b)
Fig 4.14 Fused Images using disk smoothing operator
(a) (b)
Fig 4.15 Fused Images using Gaussian smoothing operator
Gaussian smoothing provides a clearer image than average and disk smoothing.
PERFORMANCE METRICS
The Performance metrics such as Mean Square Error(MSE), Peak Signal to Noise
Ratio(PSNR), Normalized Cross Correlation(NK), Maximum Difference(MD),
Normalized Absolute Error(NAE) and Structural Similarity Index Measure(SSIM) are
calculated for various smoothing operators and are tabulated as shown in Table.4.7 and
Table.4.8.
Table.4.7. Comparison Table of Quickbird image

Method              MSE       PSNR     MD   NAE     SSIM
Average smoothing   366.4901  22.4902  151  0.1905  0.6195
Disk smoothing      125.8396  27.1326  119  0.0935  0.7493
Gaussian smoothing  46.7074   31.4370  109  0.0464  0.8963

Table.4.8. Comparison Table of Ikonos image

Method              MSE       PSNR     MD   NAE     SSIM
Average smoothing   541.6869  20.7933  131  0.1493  0.7436
Disk smoothing      478.7569  21.3297  164  0.1378  0.7883
Gaussian smoothing  246.6784  24.2095  122  0.0906  0.9039

INFERENCE

From Table.4.7 and Table.4.8 it is clear that the Gaussian smoothing operator is best suited for this image fusion process.
CHAPTER 5
CONCLUSION
The fusion of panchromatic and multispectral images is carried out using the guided filtering and weighted averaging technique. Gaussian filtering provides better intensity-based decomposition of the images than the average filtering process. PSO segmentation based edge detection detects the edges more clearly than the other methods, and the segmentation is further used to improve the fusion. QuickBird and Ikonos satellite images are used. The performance of the technique is analysed using performance metrics such as Mean Square Error, Peak Signal to Noise Ratio, Normalized Cross Correlation, Maximum Difference, Normalized Absolute Error and Structural Similarity Index Measure. The image details are preserved using this technique.
In future, the fusion architecture can be implemented on FPGA with low power,
reduced area and high performance. This fusion process can also be further extended by integrating multispectral and hyperspectral images.
REFERENCES
1. M. Xu, H. Chen, and P. Varshney, “An image fusion approach based on Markov random fields,” IEEE Trans. Geosci. Remote Sens., vol. 49, no. 12, pp. 5116–5127, Dec. 2011.
2. J. Liang, Y. He, D. Liu, and X. Zeng, “Image fusion using higher order singular
value decomposition,” IEEE Trans. Image Process., vol. 21, no. 5, pp. 2898–2909,
May 2012.
3. Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, “Image quality assessment:
From error visibility to structural similarity,” IEEE Trans. Image Process., vol. 13,
no. 4, pp. 600–612, Apr. 2004.
4. Z. Liu, E. Blasch, Z. Xue, J. Zhao, R. Laganiere, and W. Wu, “Objective
assessment of multiresolution image fusion algorithms for context enhancement in
night vision: A comparative study,” IEEE Trans. Pattern Anal. Mach. Intell., vol.
34, no. 1, pp. 94–109, Jan. 2012.
5. R. Shen, I. Cheng, J. Shi, and A. Basu, “Generalized random walks for fusion of
multi-exposure images,” IEEE Trans. Image Process., vol. 20, no. 12, pp. 3634–
3646, Dec. 2011.
6. M. Kumar and S. Dass, “A total variation-based algorithm for pixel level image
fusion,” IEEE Trans. Image Process., vol. 18, no. 9, pp. 2137–2143, Sep. 2009.
7. Z. Wang and A. Bovik, “A universal image quality index,” IEEE Signal Process.
Letters, vol. 9, no. 3, pp. 81–84, Mar. 2002.
8. D. Looney and D. Mandic, “Multiscale image fusion using complex extensions of
EMD,” IEEE Trans. Signal Process., vol. 57, no. 4, pp. 1626–1630, Apr. 2009.
9. K. He, J. Sun, and X. Tang, “Guided image filtering,” in Proc. Eur. Conf. Comput. Vis., Heraklion, Greece, Sep. 2010, pp. 1–14.
10. D. Socolinsky and L. Wolff, “Multispectral image visualization through first-order
fusion,” IEEE Trans. Image Process., vol. 11, no. 8, pp. 923–931, Aug. 2002.
11. J. Tian and L. Chen, “Adaptive multi-focus image fusion using a wavelet-based statistical sharpness measure,” Signal Process., vol. 92, no. 9, pp. 2137–2146, Sep. 2012.
12. Z. Farbman, R. Fattal, D. Lischinski, and R. Szeliski, “Edge-preserving
decompositions for multi-scale tone and detail manipulation,” ACM Trans.
Graph., vol. 27, no. 3, pp. 67-1–67-10, Aug. 2008.
13. A. A. Goshtasby and S. Nikolov, “Image fusion: Advances in the state of the art,”
Inf. Fusion, vol. 8, no. 2, pp. 114–118, Apr. 2007.
14. Q. Zhang and B. Guo, “Multifocus image fusion using the non-subsampled
contourlet transform,” Signal Process. , vol. 89, no. 7, pp. 1334–1346, Jul. 2009.
15. S. Li, X. Kang, and J. Hu, “Image fusion with guided filtering,” IEEE Trans. Image Process., vol. 22, no. 7, pp. 2864–2875, Jul. 2013.
LIST OF PUBLICATIONS
Conferences
Presented a paper titled “PSO Segmentation Based Satellite Image Fusion along with Guided Filter” in the 5th National Conference on “Communication, Information & Telematics” - CITEL 2016, 30-31 Mar 2016, Kumaraguru College of Technology, Coimbatore.

Presented a paper titled “Satellite Image Fusion with PSO Segmentation and Guided Filter” in the International Conference on Engineering Digital Green Era - EDGE 2016, 17-19 Mar 2016, Rajalakshmi Engineering College, Chennai.

Presented a paper titled “Fusion of Satellite Images with Guided Filter” in the IEEE Sponsored 3rd International Conference on Electronics and Communication Systems (ICECS), 25-26 Feb 2016, Karpagam College of Engineering, Coimbatore.