
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print),

ISSN 0976 – 6375(Online) Volume 1, Number 2, Sep - Oct (2010), © IAEME


A NOVEL APPROACH FOR SATELLITE IMAGERY STORAGE BY CLASSIFYING THE NON-DUPLICATE REGIONS

Cyju Varghese

Computer Science Department

Karunya University, India

E-Mail: [email protected]

John Blesswin

Computer Science Department



Navitha Varghese

Computer Science Department



Sonia Singha

Computer Science Department



ABSTRACT

Satellites capture thousands of images every day, and these images need to be classified properly. In this paper, we address the problem of replacing existing images with newly captured ones. We provide a new solution that stores only the part of the image that does not already exist in the database. Although satellite images have been classified in the past using various techniques, researchers continue to look for alternative strategies for satellite image classification so that the most appropriate technique can be selected for the feature extraction task at hand. We propose an efficient approach whose algorithm adopts robust kernel principal component analysis (KPCA) features to reduce the dimensionality of the image. For image clustering, we use the Fuzzy N-Means algorithm. Finally, the data are stored in the database according to their class by a support vector machine classifier. The proposed scheme thus improves the storage efficiency of satellite images in the database, saves time, and makes the correction of satellite images more efficient.

International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print), ISSN 0976 – 6375(Online), Volume 1, Number 2, Sep - Oct (2010), pp. 147-159

© IAEME, http://www.iaeme.com/ijcet.html

Index Terms- Compression, Duplicate Detection, Feature Extraction, Image Clustering,

Satellite Image

1. INTRODUCTION

Satellite images play an important role in many applications, especially in capturing images of the earth for environmental studies and homeland security. ‘Geo’ is a commonly used satellite for capturing earth images. Thousands of images are transmitted every day to the ‘digital globe’ database. The topography of the earth changes every day, so frequently updating the images in the database is tedious. In current applications, the stored image is replaced in its entirety. Instead of updating the whole image, this paper detects the duplicate and non-duplicate blocks in the captured image and updates only the non-duplicate blocks in the corresponding image in the database. The approach makes use of a duplication detection algorithm.

To avoid storing the same image content twice, a duplication detection approach needs to be applied. Traditional approaches to duplication detection of image objects normally partition images into several blocks. These detection methods are designed specifically to separate duplicate from non-duplicate image regions. They can detect duplication when the locations of the extracted objects are invariant to scaling, translation, or rotation.

The traditional techniques used in duplication detection include the discrete wavelet transform (DWT), principal component analysis (PCA), and the Fourier-Mellin transform (FMT). These techniques are restricted to linear features. Duplicate detection involves dividing the image into overlapping blocks, extracting features from each block, and detecting blocks with similar features. Depending on the type of duplication, various measures and mechanisms can be adopted to counter it.
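The block-division step described above can be illustrated with a minimal sketch. The function names, the 4 x 4 toy image, and the use of mean intensity as a feature are assumptions made for illustration only; they stand in for the DWT, PCA, FMT, or KPCA features discussed in the text.

```python
def overlapping_blocks(image, size):
    """Yield (row, col, block) for every size x size window, sliding one pixel at a time."""
    height, width = len(image), len(image[0])
    for r in range(height - size + 1):
        for c in range(width - size + 1):
            block = [row[c:c + size] for row in image[r:r + size]]
            yield r, c, block


def mean_feature(block):
    """Hypothetical stand-in feature: the mean intensity of the block."""
    values = [v for row in block for v in row]
    return sum(values) / len(values)


# Toy 4x4 grayscale image (values are illustrative only).
image = [
    [10, 10, 50, 50],
    [10, 10, 50, 50],
    [90, 90, 10, 10],
    [90, 90, 10, 10],
]

blocks = list(overlapping_blocks(image, 2))
features = {(r, c): mean_feature(b) for r, c, b in blocks}

# Blocks at (0, 0) and (2, 2) share the same feature value,
# so they would be candidates for a duplicated region.
```

An H x W image yields (H - s + 1)(W - s + 1) overlapping blocks of side s, which is why later stages sort the features rather than compare every pair of blocks.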

A discrete wavelet transform (DWT) [3][4] maps the time-domain signal f(t) into a real-valued time-frequency domain, where the signal is described by its wavelet coefficients. Five-scale signal decomposition is performed to ensure that all disturbance features in both high and low frequencies are extracted. Thus, the output of the wavelet transform consists of five decomposed scale signals with different levels of resolution. Principal component analysis (PCA) [5] in signal processing can be described as a transform of a given set of input vectors (variables) of the same length, arranged as n-dimensional vectors. The FMT [6][7] is a global transform that is applied to all pixels in the same way. The Fourier-Mellin transform provides translation, scaling, and rotation invariance. To achieve these properties, the image is divided into overlapping blocks, and the Fourier transform is applied to each block to obtain features.

KPCA [1][2] is preferred over the other techniques because it performs non-linear feature extraction, whereas the other techniques extract linear features. It can detect duplication even if a portion of an image has been rotated in any direction. Quantitative analyses indicate that the KPCA-based feature performs very well under additive noise and lossy JPEG compression. The method uses a global geometric transformation and a labeling technique to identify such duplication. Experiments with a good number of natural images show very promising results compared with the other conventional approaches. KPCA extracts more useful features than linear PCA, and the initial mapping to a high-dimensional space provides smoother dimensionality reduction than standard PCA. It does not require nonlinear optimization, only the solution of an eigenvalue problem. Signal reconstruction is unnecessary for tampering detection; however, KPCA is computationally more expensive than linear PCA.

2. PROPOSED SCHEME

A satellite image is the input to this application. Before the image is stored in the database, its size is reduced so that it requires less memory. Kernel principal component analysis is used to reduce the dimensionality of the image. Kernel PCA is a non-linear feature extractor that is used here to detect duplicate and non-duplicate regions in a satellite image. In Kernel PCA, two important concerns are the selection of the kernel function and the computation of the Gram matrix. Gaussian kernels can capture nonlinearity in the data and can simulate the behavior of other kernels. The Gram matrix can be found from the following equation:

$$K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right) \tag{1}$$

where K is the Gaussian kernel, written here in its standard form, and σ is the kernel parameter, whose value is very important.
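As a hedged illustration of the Gram matrix computation, the sketch below assumes the standard Gaussian kernel exp(-||xi - xj||^2 / (2σ^2)) for Eq. (1). The function names, the sample vectors, and the value of sigma are hypothetical.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel between two feature vectors of equal length."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def gram_matrix(samples, sigma):
    """Gram matrix K with K[i][j] = kernel(samples[i], samples[j])."""
    return [[gaussian_kernel(xi, xj, sigma) for xj in samples] for xi in samples]


# Three toy feature vectors (assumed values, for illustration only).
samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = gram_matrix(samples, sigma=1.0)
```

The matrix is symmetric with ones on the diagonal. A smaller sigma makes the off-diagonal entries decay faster, which is one reason the choice of the kernel parameter matters.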

Figure 1 Flowchart of proposed scheme

To compute the principal components, the following steps are carried out:

1. Construct one training matrix and one testing matrix.

2. Compute the Gram matrix for the training matrix.

3. Center the training Gram matrix.

4. Diagonalize the centered matrix and compute its eigenvalues and eigenvectors.

5. Construct the test Gram matrix.

6. Center the test Gram matrix.

7. Compute the projection of all vectors onto the eigenvectors.
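The seven steps can be sketched in plain Python as follows. This is a simplified illustration, not the authors' implementation: it assumes a Gaussian kernel, it extracts only the leading principal component, and it replaces a full eigendecomposition with power iteration. All function names and data values are hypothetical.

```python
import math


def rbf(x, y, sigma=1.0):
    """Gaussian kernel (assumed kernel choice)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def center_train(K):
    """Step 3: center the (symmetric) training Gram matrix in feature space."""
    n = len(K)
    row = [sum(r) / n for r in K]
    total = sum(row) / n
    return [[K[i][j] - row[i] - row[j] + total for j in range(n)] for i in range(n)]


def center_test(K_test, K_train):
    """Step 6: center the test Gram matrix using the training statistics."""
    n = len(K_train)
    col = [sum(K_train[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(col) / n
    centered = []
    for r in K_test:
        r_mean = sum(r) / n
        centered.append([r[j] - r_mean - col[j] + total for j in range(n)])
    return centered


def leading_eigenpair(A, iterations=500):
    """Step 4, simplified: power iteration for the largest eigenvalue and its eigenvector."""
    n = len(A)
    v = [float(i + 1) for i in range(n)]  # fixed, non-constant start vector
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(A[i][j] * v[j] for j in range(n)) for i in range(n))
    return eigenvalue, v


def project(K_centered, eigenvalue, eigenvector):
    """Step 7: project samples onto the normalized leading component."""
    alpha = [x / math.sqrt(eigenvalue) for x in eigenvector]
    return [sum(a * k for a, k in zip(alpha, row)) for row in K_centered]


# Step 1: toy training and testing matrices (two well-separated groups).
train = [[0.0, 0.0], [0.1, 0.0], [2.0, 2.0], [2.1, 2.0]]
test = [[0.05, 0.0], [2.05, 2.0]]

K_train = [[rbf(a, b) for b in train] for a in train]   # step 2
Kc_train = center_train(K_train)                        # step 3
lam, vec = leading_eigenpair(Kc_train)                  # step 4
K_test = [[rbf(t, b) for b in train] for t in test]     # step 5
Kc_test = center_test(K_test, K_train)                  # step 6
train_proj = project(Kc_train, lam, vec)                # step 7
test_proj = project(Kc_test, lam, vec)
```

In this toy example the two test vectors fall on opposite sides of the leading component, reflecting the two groups in the training data. A practical implementation would keep several components and use a library eigensolver.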

Extracting image features from the compressed images is the most important step, and it has a great impact on retrieval performance.

A. Satellite Image Clustering

The fuzzy algorithm exploits the concept of points having significant membership in multiple classes. The points situated in the overlapping regions of different clusters are first identified and excluded from consideration during clustering. Thereafter, these points are given class labels by a support vector machine classifier trained on the remaining points. The well-known Fuzzy N-Means algorithm and some recently proposed genetic clustering schemes are used in the process. The image is divided into a number of blocks, and each block can have the same or different kinds of features. Clustering is performed to group blocks with the same kind of features, which gives an additional advantage for duplication detection.

B. Satellite Image Segmentation

Using KPCA, the image is divided into a number of blocks, each of which can have the same or different kinds of features. Image segmentation is performed here to group features of the same kind, which gives an additional advantage for duplication detection. Image segmentation is the basis of image analysis and understanding. It amounts to classifying the set of pixels of an image, so clustering analysis applies to it naturally.

We use the Fuzzy N-Means algorithm for image segmentation. Fuzzy N-Means is an improved version of fuzzy C-means. An outlier test is also performed to improve segmentation performance. The algorithm has two levels: the internal level calculates the new centroids and updates the fuzzy membership (subjection-level) matrix, and the external level judges whether the algorithm has converged to within the given threshold. After the iteration finishes, the generated fuzzy membership matrix gives the degree to which each pixel belongs to each cluster center, and the category of the pixel is determined by the size of its entries in the matrix [8].

Algorithm:

Input: Test image

Output: Segmented image

Step 1: Initialize the parameter ⌡ and perform normalization.

Step 2: For k = 1, ..., N:

Perform the outlier test on ||xk − vi||, where ui(k) = 2

Update the centroid by: vi(new) = vi(old) + […] (x − vi(old))

Step 3: Termination test: || […] − […] || > Є

The problem of segmenting an image into different clusters is handled iteratively [10] by means of a single parameter. The outlier test is performed to improve the cluster validity index. After the cluster centers are found, the fuzzy membership value can be measured at any point. Groups of clusters with similar features are thus obtained by this algorithm.
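For orientation, the sketch below implements the standard fuzzy C-means update on one-dimensional pixel values. It is not the Fuzzy N-Means variant with the outlier test described above, whose exact update rule is not reproduced here; the fuzzifier m, the tolerance, the initial centers, and the pixel values are all assumptions.

```python
def fuzzy_c_means(values, centers, m=2.0, tol=1e-6, max_iter=100):
    """Standard fuzzy C-means on 1-D values; returns (centers, memberships[i][k])."""
    centers = list(centers)
    for _ in range(max_iter):
        # Membership u_ik of pixel k in cluster i, from relative distances.
        memberships = [[0.0] * len(values) for _ in centers]
        for k, x in enumerate(values):
            dists = [abs(x - v) for v in centers]
            if 0.0 in dists:
                # Pixel coincides with a center: full membership there.
                hit = dists.index(0.0)
                for i in range(len(centers)):
                    memberships[i][k] = 1.0 if i == hit else 0.0
                continue
            for i, d_i in enumerate(dists):
                total = sum((d_i / d_j) ** (2.0 / (m - 1.0)) for d_j in dists)
                memberships[i][k] = 1.0 / total
        # Centroid update: weighted mean with weights u_ik ** m.
        new_centers = []
        for i in range(len(centers)):
            weights = [u ** m for u in memberships[i]]
            new_centers.append(sum(w * x for w, x in zip(weights, values)) / sum(weights))
        # Termination test: stop when no centroid moves more than tol.
        shift = max(abs(a - b) for a, b in zip(new_centers, centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


# Toy pixel intensities forming a dark group and a bright group.
pixels = [10, 12, 11, 200, 202, 201]
centers, memberships = fuzzy_c_means(pixels, centers=[0.0, 255.0])

# Each pixel is assigned to the cluster in which its membership is largest.
labels = [max(range(len(centers)), key=lambda i: memberships[i][k])
          for k in range(len(pixels))]
```

The inner computation corresponds to the internal level (membership and centroid updates) and the shift check to the external level (convergence test).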

Image segmentation means that the image is represented as a set of physically meaningful connected areas [9]. Generally, we achieve segmentation by analyzing different image characteristics with the Fuzzy N-Means clustering algorithm.

Table 1 List of Symbols

Symbol    Meaning

ui        Fuzzy membership value

vi        Centroid value

xk        Pixel value




C. Duplication Detection of Satellite Image

The satellite image database contains previously captured images of the real world, which are used as training data. Every day, thousands of images are captured by satellite. To update the database, duplication detection has to be performed before an image is stored. At present, each new image is stored in the database by replacing the previously captured one. This process is time-consuming and requires additional memory. This paper proposes a new approach that updates the existing image with only the identified non-duplicate blocks.

A duplication detection algorithm is proposed to find the duplicate and non-duplicate blocks in the images. The input to this algorithm is a test image block.

The duplication detection steps are as follows:

Algorithm:

Input: Test image of N pixels.

Output: Non duplicate block of image.

Step 1: Initialize block processing parameters:

b: Number of pixels per block,

Q: Number of quantization bins,

Rth: Number of neighboring rows to search in the lexicographically sorted matrix,

Dth: Minimum offset-magnitude threshold

∊: Fraction of the ignored variance along the principal axes or the fraction of the

ignored local variance of the wavelet coefficients

M: Number of training samples for the KPCA.

Step 2: Apply KPCA to each block of b pixels and compute a transform vector of length L, which is equal to (M, Nt2) for the KPCA-based features with dimension reduction.

Step 3: Construct a data matrix, Mdata, of size Nb × L, whose rows contain the component-wise quantized features, i.e., ⌊ai/Q⌋.

Step 4: Apply lexicographic sorting to the rows of this matrix to obtain a new matrix S. Let si be the i-th row of S, which represents the i-th block with center coordinates (xi, yi).




Step 5: For every row si from S, select a number of adjacent rows, sj, such that |i − j| < Rth

and place all pairs of coordinates (xi, yi) and (xj , yj) for j = 0, 1, ..., (Rth − 1) onto

a list Pin.

Step 6: Eliminate all pairs of points, whose offset-magnitude, Dof , is less than Dth.

Construct a set, OF, of various offsets (m, n) and offset-frequencies (fm,n) for all

elements in Pin.

Step 7: Create a refined list of point-pairs, Pout, from Pin by the algorithm or by using

manual threshold, fth.

The proposed duplication detection algorithm has several parameters that must be selected and justified before use: the block size (b), the number of quantization bins (Q), the block-similarity threshold (Rth), the minimum offset-magnitude threshold (Dth), the offset-frequency threshold (fth), and the fraction of ignored variance (∊). The selection of Q depends on the feature variations. The selection of Rth depends on how well lexicographic sorting arranges similar vectors (blocks) in the sorted matrix S. The parameter Dth is used to avoid false detections.
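Steps 3 to 6 of the algorithm, and the roles of Q, Rth, and Dth, can be sketched as follows. The function name and the toy feature vectors are hypothetical, and the sketch adds one assumption that the steps do not state: neighbouring rows are paired only when their quantized features are identical.

```python
from collections import Counter


def find_duplicate_pairs(features, centers, Q, R_th, D_th):
    """Return candidate duplicate block pairs and the frequency of each offset."""
    # Step 3: component-wise quantization, floor(a_i / Q).
    rows = [([int(a // Q) for a in f], c) for f, c in zip(features, centers)]
    # Step 4: lexicographic sort of the quantized feature rows.
    rows.sort(key=lambda r: r[0])
    pairs = []
    # Step 5: compare each row only with rows j such that |i - j| < R_th.
    for i in range(len(rows)):
        for j in range(i + 1, min(i + R_th, len(rows))):
            if rows[i][0] != rows[j][0]:
                continue  # assumption: keep only identical quantized rows
            (xi, yi), (xj, yj) = rows[i][1], rows[j][1]
            # Step 6: drop pairs whose offset magnitude is below D_th.
            if ((xi - xj) ** 2 + (yi - yj) ** 2) ** 0.5 < D_th:
                continue
            pairs.append(((xi, yi), (xj, yj)))
    # Normalize offsets so that (m, n) and (-m, -n) count as the same shift.
    offsets = Counter()
    for (xi, yi), (xj, yj) in pairs:
        m, n = xi - xj, yi - yj
        if m < 0 or (m == 0 and n < 0):
            m, n = -m, -n
        offsets[(m, n)] += 1
    return pairs, offsets


# Toy block features and block-center coordinates (illustrative values).
features = [[10.0, 20.0], [300.0, 40.0], [12.0, 25.0], [500.0, 90.0]]
centers = [(0, 0), (0, 40), (30, 40), (60, 60)]

pairs, offsets = find_duplicate_pairs(features, centers, Q=16, R_th=2, D_th=16)
```

After sorting, similar blocks sit in adjacent rows, so only a window of Rth rows is searched instead of all pairs. Step 7 would then keep only the offsets whose frequency exceeds fth.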

D. Categorization of Satellite Image

Manually classifying satellite images from the identified blocks is a tedious process. To perform this task, the computer uses numerical "signatures" for each training class. Each pixel in the image is compared with these signatures and labeled as the class it most closely resembles. Hence, supervised classifiers require the user to decide which classes exist in the image and then to define training areas for these classes. An SVM not only gives the best classification performance (e.g., accuracy) on the training data but also leaves much room for the correct classification of future data [11].

Once the KPCA algorithm has detected a few duplicates whose similarity scores exceed the threshold, we have positive examples, namely the identified duplicate blocks in D, and negative examples, namely the remaining non-duplicate blocks in N.

Table 2 Some Samples of the Test High-Resolution Satellite Image Database

Algorithm:

Input: Duplicate regions D, non-duplicate regions N, and the original image I

Output: Updated image

Step 1: Train classifier C1 using D and N.

Step 2: Classify a non-duplicate region in N to the corresponding class label C in the database.

Step 3: Repeat Step 2 until all the non-duplicate blocks in N are inserted into the original image I.

The duplicate blocks in D and the non-duplicate blocks in N are used to train the SVM classifier, which identifies where each non-duplicate block belongs in the already stored image I, thereby updating the image. The satellite images are thus stored efficiently in the database. The proposed scheme works as follows: the captured image is compressed using kernel principal component analysis (KPCA) and its features are extracted. The extracted features are clustered with the Fuzzy N-Means algorithm so that the duplicate detection algorithm can be performed efficiently.

Duplication detection is performed by comparing the captured image with the stored image, which yields the duplicate and non-duplicate blocks. The missing part of the image stored in the database is then updated with the non-duplicate blocks.
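The update idea can be illustrated with a deliberately simplified sketch. It assumes non-overlapping blocks and uses exact pixel equality as the duplicate test, which stands in for the KPCA-based similarity and SVM placement used in the paper; the function name and image values are hypothetical.

```python
def update_stored_image(stored, captured, block):
    """Copy into the stored image only the blocks of the captured image that differ."""
    updated = [row[:] for row in stored]
    changed = []
    for r in range(0, len(stored), block):
        for c in range(0, len(stored[0]), block):
            old = [row[c:c + block] for row in stored[r:r + block]]
            new = [row[c:c + block] for row in captured[r:r + block]]
            if old != new:  # non-duplicate block (assumed test: exact equality)
                for dr, new_row in enumerate(new):
                    updated[r + dr][c:c + block] = new_row
                changed.append((r, c))
    return updated, changed


# Toy 4x4 images: only the bottom-right 2x2 block has changed.
stored = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
captured = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 9, 9],
    [3, 3, 9, 9],
]

updated, changed = update_stored_image(stored, captured, block=2)
```

Here one block out of four is written, so only a quarter of the image data needs to be stored for this update.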

Existing image in database    Captured image    Result after applying the algorithm    Stored image    MSE    PSNR (dB)

[image]                       [image]           [image]                                [image]         0      33.56

Table 2 Detection Accuracy for JPEG Dataset

Intra-dataset average precision (P%) and recall (R%) for JPEG

Features    P (%)    R (%)

KPCA        73.19    40.27

As shown in Table 2, the KPCA-based feature obtains the best recall (40.27%) and a medium precision (73.19%) on JPEG images in the compressed and noisy domain.
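The paper does not spell out how P and R are computed. The sketch below assumes the standard definitions, with precision as the fraction of detected blocks that are truly duplicates and recall as the fraction of true duplicate blocks that are detected. The block coordinates are invented for illustration.

```python
def precision_recall(detected, actual):
    """Precision and recall (in percent) of detected duplicate blocks against ground truth."""
    detected, actual = set(detected), set(actual)
    true_positives = len(detected & actual)
    precision = 100.0 * true_positives / len(detected) if detected else 0.0
    recall = 100.0 * true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical block positions: 4 detected, 6 truly duplicated, 3 in common.
detected_blocks = {(0, 0), (0, 8), (8, 8), (16, 0)}
actual_blocks = {(0, 0), (0, 8), (8, 0), (8, 8), (16, 16), (24, 24)}

p, r = precision_recall(detected_blocks, actual_blocks)
```

A detector with high precision and lower recall, as in Table 2, reports few false duplicates but misses some true ones.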

In our experiment, KPCA is performed on JPEG satellite images. It can also be performed on BMP and SNR images. Recall varies in a roughly sigmoid fashion as the JPEG parameter increases.

3. EXPERIMENTAL RESULTS

Experiments on satellite images were carried out with four objectives, and more than 100 satellite JPEG images were tested. Samples of the tested satellite images are given in Table 1. The first objective is the implementation of KPCA, which reduces the dimensionality of the original image. The image is resized to 256 x 256 before the proposed duplicate detection method is applied, and the features are extracted using KPCA. The second objective is clustering the extracted features of the compressed image: the Fuzzy N-Means clustering algorithm groups similar features, and the clustered information is used to identify the duplicate and non-duplicate blocks of the image. The third objective is duplication detection; the non-duplicate blocks of the image are shown in a different color. The first set of experiments uses parameters that were fixed empirically to b = 64, Q = 256, Rth = 50, Dth = 16, �=0, =1. The fourth objective is classification: the identified duplicate (D) and non-duplicate (N) blocks are used to train the SVM classifier, which inserts the blocks into the database.

In our scheme, the peak signal-to-noise ratio (PSNR) is used to evaluate the quality of the updated image. Similarly, we use the mean square error (MSE) to measure the difference between the updated image and the captured image. The quality of the updated image is assessed from two points of view. First, to the human visual system the updated image is almost indistinguishable from the original image. Second, the PSNR values between the updated images and the original images range from 32 to 34.5 dB. Moreover, all MSEs are equal to zero when the image is updated exactly.
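The two quality measures can be computed as in the following sketch, which assumes the usual definitions for 8-bit grayscale images (peak value 255). The function names and the 2 x 2 sample images are hypothetical.

```python
import math


def mse(a, b):
    """Mean square error between two equally sized grayscale images."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    error = mse(a, b)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / error)


# Toy images differing by a few gray levels (illustrative values).
original = [[100, 100], [100, 100]]
updated = [[101, 99], [100, 102]]

error = mse(original, updated)
quality = psnr(original, updated)
```

Under these definitions a lower MSE always corresponds to a higher PSNR, so the two measures can be read together when judging an updated image.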

4. CONCLUSION

This paper has presented a method for detecting duplicated regions in satellite images. An automatic duplication detection technique has been proposed. The technique reduces false detections and eliminates an important threshold parameter. Although its time cost is high, the method performs well. A clustering method is then applied, and finally a classification method is used to store the non-duplicate regions of the image in the database.

REFERENCES

[1] M. K. Bashar, K. Noda, N. Ohnishi, and K. Mori, “Exploring Duplicated Regions in Natural Images,” IEEE Transactions on Image Processing, vol. 1, pp. 1-40, March 2010.

[2] M. Turk and A. Pentland, “Eigenfaces for recognition,” Journal of Cognitive Neuroscience, vol. 3, no. 1, 1991.

[3] G. Li, Q. Wu, D. Tu, and S. Sun, “A Sorted Neighborhood Approach for Detecting Duplicated Regions in Image Forgeries based on DWT and SVD,” in Proceedings of IEEE International Conference on Multimedia and Expo, Beijing, China, July 2-5, 2007, pp. 1750-1753.

[4] W. Luo, J. Huang, and G. Qiu, “Robust Detection of Region Duplication Forgery in Digital Image,” in Proceedings of the 18th International Conference on Pattern Recognition, vol. 4, 2006, pp. 746-749.

[5] C. Popescu and H. Farid, “Exposing Digital Forgeries by Detecting Duplicated Image Regions,” Technical Report TR2004-515, Dartmouth College, Computer Science, 2004.

[6] S. Bayram, T. Sencar, and N. Memon, “An efficient and robust method for detecting copy-move forgery,” in Proceedings of ICASSP 2009.

[7] H. Huang, W. Guo, and Y. Zhang, “Detection of Copy-Move Forgery in Digital Images Using SIFT Algorithm,” in Proceedings of IEEE Pacific-Asia Workshop on Computational Intelligence and Industrial Application, vol. 2, pp. 272-276, 2008.

[8] S. W. Lee, Y. S. Kim, and Z. Bien, “A Nonsupervised Learning Framework of Human Behavior Patterns Based on Sequential Actions,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 4, April 2010.

[9] Z. Bien and M.-G. Chun, “A Fuzzy Petri Net Model,” Handbook of Fuzzy Computation, C2.4, IOP Publishing Ltd., 1998.

[10] T. Tajima et al., “Development of a Marketing System for Recognizing Customer Buying Behavior Sensor,” J. Japan Soc. for Fuzzy Theory and Intelligent Informatics, vol. 20, no. 5, pp. 18-22, Apr. 2007.

[11] W. Su, J. Wang, and F. H. Lochovsky, “Record Matching over Query Results from Multiple Web Databases,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 4, April 2010.

[12] R. Baeza-Yates and B. Ribeiro-Neto, “Modern Information Retrieval,” ACM Press, 1999.

Cyju Elizabeth Varghese received the B.E. degree in Computer Science and Engineering from CSI Institute of Technology, Thovalai, India, in 2001 and has been working since. She is currently pursuing an M.Tech in Computer Science and Engineering at Karunya University, Coimbatore. Her research interests include web mining and database-related areas.

John Blesswin received the B.Tech degree in Information Technology from Karunya University, Coimbatore, India, in 2009, passing the B.Tech examination with a gold medal. He is pursuing an M.Tech in Computer Science and Engineering at Karunya University. His research interests include visual cryptography, visual secret sharing schemes, image hiding, and information retrieval.

Navitha Varghese received the B.Tech degree in Computer Science and Engineering from Model Engineering College, Ernakulam, India, in 2009. She is currently pursuing an M.Tech in Computer Science at Karunya University, Coimbatore. Her research interests include web mining and web technology.

Sonia Singha received the B.Tech degree in Computer Science and Engineering from Calcutta Institute of Technology, Kolkata, India, in 2009. She is currently pursuing an M.Tech in Computer Science at Karunya University, Coimbatore. Her research interests include data mining and image processing.
