JUNE 25, 2012

TOPIC: BAND COMBINATION AND BAND RATIOS AND PRINCIPAL COMPONENT ANALYSIS (PCA)

Submitted To:

Sir Amir Mehmood

Subject:

Remote Sensing II

Group: Girls_lll

Members: Roll No’s

Atiqa Ijaz Khan 03

Rafia Naheed 09

Syeda Rbiya Mahmood 14

Rabia Zahoor 28

Table of contents

Band Ratios
  Definitions
  Formula
  Advantage
Band Combination
Principal Component Analysis
  History
  Objective
  Definitions
  Mathematical Analysis
  Important Terms in PCA
  Outline of PCA
  Advantages
  Disadvantages
  Summary
  Expert Opinion
References

Band ratios

Definitions:

“Band ratioing means dividing the pixels in one band by the corresponding pixels in a second band.”

“Ratio images are enhancements resulting from the division of DN values in one spectral band by the corresponding values in another band.”

“To generate a band ratio image, the digital value of the band in which a specific material has its highest reflectance is divided by the corresponding digital value of the band in which it has its lowest reflectance.”

The reason for this is twofold:

1. Differences between the spectral reflectance curves of surface types can be brought out.

2. Illumination, and consequently radiance, may vary across a scene, but the ratio between two bands will be nearly the same for an illuminated and a non-illuminated area of the same surface type.
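
The second point can be illustrated with a small sketch. The DN values below are hypothetical, and the example assumes that shading scales both bands by the same factor (no additive atmospheric contribution), which is only approximately true for real images:

```python
import numpy as np

# Hypothetical digital numbers (DN) for one surface type in two bands.
# First value: sunlit pixel; second value: shaded pixel.
# Assumption: shade halves the signal in both bands equally.
band_a = np.array([120.0, 60.0])
band_b = np.array([40.0, 20.0])

# Band ratio: pixel-by-pixel division of one band by the other.
ratio = band_a / band_b

print(ratio)  # [3. 3.] -> same ratio in sun and in shade
```

The brightness of the two pixels differs by a factor of two in each band, yet the ratio is identical, which is why a ratio image suppresses illumination differences.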

Formula for Number of Possible Ratios:

The number of possible ratios that can be developed from n bands of data is:

“n (n-1)”

Example:

For the 6 non-thermal bands of Landsat TM or ETM+ data there are 6(6-1), or 30, possible ratios (15 original and 15 reciprocal).
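
As a quick check of the formula, the ratios can simply be enumerated. This is a minimal sketch using the six non-thermal band numbers from the example above; the variable names are illustrative only:

```python
from itertools import permutations

bands = [1, 2, 3, 4, 5, 7]  # the six non-thermal TM/ETM+ bands
n = len(bands)

# Every ordered pair (numerator, denominator) is one possible ratio.
ratios = list(permutations(bands, 2))

# A ratio and its reciprocal use the same unordered pair of bands.
unique_pairs = {frozenset(pair) for pair in ratios}

print(len(ratios))        # 30 = n * (n - 1)
print(len(unique_pairs))  # 15 original ratios, each with one reciprocal
```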

Example on Vegetation:

The near-infrared-to-red ratio for healthy vegetation is normally very high. That for stressed vegetation is typically lower (the near-infrared reflectance decreases and the red reflectance increases). A near-infrared-to-red (or red-to-near-infrared) ratioed image can therefore be useful for differentiating between areas of stressed and non-stressed vegetation. This type of ratio has been employed extensively in vegetation indices aimed at quantifying greenness and biomass.
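
A minimal sketch of this idea follows. The reflectance values are hypothetical, chosen only to mimic the behavior described above. Besides the simple near-infrared-to-red ratio, the sketch also computes the normalized difference vegetation index (NDVI), a widely used index that is not discussed in this report and is included here only as an example of a ratio-based vegetation index:

```python
import numpy as np

# Hypothetical reflectances: [healthy vegetation, stressed vegetation].
nir = np.array([0.50, 0.30])  # near-infrared (TM/ETM+ band 4)
red = np.array([0.05, 0.10])  # red (TM/ETM+ band 3)

# Simple ratio: near-infrared divided by red.
simple_ratio = nir / red

# NDVI: difference of the two bands divided by their sum.
ndvi = (nir - red) / (nir + red)

print(simple_ratio)  # healthy pixel has the higher ratio
print(ndvi)          # healthy pixel has the higher index
```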

Advantage:

1. The major advantage of ratio images is that they convey the spectral or color

characteristics of the image features, regardless of variations in the scene illumination

conditions.

2. Ratioed images are often useful for discriminating subtle spectral variations in a scene that are masked by the brightness variations in images from the individual spectral bands or in standard color composites.

Which bands to ratio?

The answer depends on the purpose for creating the ratio. There is a physical basis why each

of the different band combinations works, but it is partially a trial and error process. These

sorts of band ratios come in a couple of different “flavors”.

ETM+ as an Example:

Landsat TM and ETM+ multispectral images are composed of seven different bands, each representing a different portion of the electromagnetic spectrum (ETM+ adds an eighth, panchromatic band). In order to work with Landsat band combinations (RGB composites of three bands), we must first understand the specifications of each band.

Band 1: (0.45-0.52 µm, blue-green)

This short wavelength of light penetrates water better than the other bands, and it is often the band of choice for monitoring aquatic ecosystems (mapping sediment in water, coral reef habitats, etc.). Unfortunately, this is the “noisiest” of the Landsat bands since it is the most susceptible to atmospheric scatter.

Band 2: (0.52-0.60 µm, green)

This has similar qualities to band 1 but not as extreme. The band was selected because it

matches the wavelength for the green we see when looking at vegetation.

Band 3: (0.63-0.69 µm, red)

Since vegetation absorbs nearly all red light (it is sometimes called the chlorophyll absorption

band) this band can be useful for distinguishing between vegetation and soil and in

monitoring vegetation health.

Band 4: (0.76-0.90 µm, near infrared)

Since water absorbs nearly all light at this wavelength water bodies appear very dark. This

contrasts with bright reflectance for soil and vegetation so it is a good band for defining the

water/land interface.

Band 5: (1.55-1.75 µm, mid-infrared)

This band is very sensitive to moisture and is therefore used to monitor vegetation and soil

moisture. It is also good at differentiating between clouds and snow.

Band 6: (10.40-12.50 µm, thermal infrared)

This is a thermal band, which means it can be used to measure surface temperature. Band 6 is primarily used for geological applications, but it is sometimes used to measure plant heat stress. It is also used to differentiate clouds from bright soils, as clouds tend to be very cold. The spatial resolution of band 6 (60 m) is half as fine as that of the other multispectral bands (30 m).

Band 7: (2.08-2.35 µm, mid-infrared)

This band is also used for vegetation moisture although generally band 5 is preferred for that

application, as well as for soil and geology mapping.

Band combination

Introduction:

Effective display of an image is critical for the effective practice of remote sensing. Band combination is the term that remote sensing practitioners use to refer to the assignment of colors to represent brightness in different regions of the spectrum. A key constraint for the display of any multispectral image is that human vision perceives differences in the color of surfaces through our eyes' ability to detect differences in brightness in three additive primaries: blue, green, and red.

Need of Band Combinations:

A single-band remote sensing image may not be sufficient to extract the desired information, while handling many separate bands is inconvenient. Multiple bands may be combined to generate one or more transformed bands by applying different mathematical operations. Multiple-band combination can enhance the features of interest to the analyst. Band combination includes addition, subtraction, ratioing, principal component analysis, and so on.

Mathematical treatment:

Band combination is, in general, a combination of multiband (e.g. multispectral) images. It is defined as the output of multiband functions or operations. In a normal band combination, the same operation is carried out on each pixel in the image. The output of the operation is an image of new pixels generated by some mathematical combination of the pixel values of the various bands of an input image.
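
The idea that the same operation is applied to every pixel can be sketched with NumPy arrays, where arithmetic between two arrays is carried out element by element. The two small "bands" below are hypothetical values, not real image data:

```python
import numpy as np

# Hypothetical 2 x 2 images for two bands of the same scene.
band_1 = np.array([[10.0, 20.0],
                   [30.0, 40.0]])
band_2 = np.array([[5.0, 10.0],
                   [10.0, 20.0]])

# Each operation is applied to every pair of corresponding pixels,
# and each result is a new single-band image of the same size.
band_sum = band_1 + band_2     # addition
band_diff = band_1 - band_2    # subtraction
band_ratio = band_1 / band_2   # ratioing

print(band_ratio)
```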

ETM+ as an Example:

Common Landsat Band Combinations

Individual bands can be composited in a Red, Green, and Blue (RGB) combination in order to visualize the data in color. Many different combinations can be made, and each has its own advantages and disadvantages. Here are some commonly used Landsat RGB band combinations (color composites):

3, 2, 1 RGB: This color composite is as close to true color as we can get with a Landsat ETM image. It is also useful for studying aquatic habitats. The downside of this set of bands is that they tend to produce a hazy image.

4, 3, 2 RGB: This has similar qualities to the image with bands 3, 2, 1; however, since it includes the near-infrared channel (band 4), land/water boundaries are clearer and different types of vegetation are more apparent. This was a popular band combination for Landsat MSS data, since that sensor did not have a mid-infrared band.

4, 5, 3 RGB: This is crisper than the previous two images because the two shortest

wavelength bands (bands 1 and 2) are not included. Different vegetation types can be more

clearly defined and the land/water interface is very clear. Variations in moisture content are

evident with this set of bands. This is probably the most common band combination for

Landsat imagery.

7, 4, 2 RGB: This has similar properties to the 4, 5, 3 band combination, with the biggest difference being that vegetation is green. This is the band combination that was selected for the global Landsat mosaic created for NASA.

5, 4, 1 RGB: This band combination has similar properties to the 7, 4, 2 combination; however, it is better suited to visualizing agricultural vegetation.
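
Building such a composite amounts to choosing three bands and stacking them as the red, green, and blue display channels. The sketch below assumes the bands are already loaded as 2-D arrays (random numbers stand in for real data here), and the helper names `stretch` and `composite` are hypothetical, not taken from any particular software package:

```python
import numpy as np

def stretch(band):
    """Linearly rescale a band to the 0..1 range for display."""
    band = band.astype(float)
    lo, hi = band.min(), band.max()
    if hi == lo:
        return np.zeros_like(band)
    return (band - lo) / (hi - lo)

def composite(bands, rgb):
    """Stack three bands (given by band number) into a rows x cols x 3 array."""
    return np.dstack([stretch(bands[b]) for b in rgb])

# Stand-in data: one random 4 x 5 image per band number.
rng = np.random.default_rng(0)
bands = {b: rng.integers(0, 256, size=(4, 5)) for b in (1, 2, 3, 4, 5, 7)}

true_color = composite(bands, (3, 2, 1))   # 3, 2, 1 RGB
false_color = composite(bands, (4, 3, 2))  # 4, 3, 2 RGB

print(true_color.shape)  # (4, 5, 3)
```

The simple minimum-maximum stretch is only one possible display scaling; other contrast stretches would work equally well in its place.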

Principal Component Analysis (PCA)

History:

Principal component analysis was first proposed in 1933 by Hotelling in order to solve the problem of decorrelating the statistical dependency between variables in multivariate statistical data derived from exam scores (Hotelling, 1933). Since then, PCA has become a widely used tool in statistical analysis for measuring the relationships between correlated variables, and it has also found applications in signal processing and pattern recognition, where it is often referred to as the Karhunen-Loeve transform (Therrien, 1989).

Objectives of Principal Component Analysis:

1. To discover or to reduce the dimensionality of the data set.

2. To identify new meaningful underlying variables.

Definition of Principal Components Analysis (PCA):

“PCA is a method in which the original data are transformed into a new coordinate system, which condenses the information found in the original inter-correlated variables into a few uncorrelated variables, called principal components.”

“In any principal components rotation, the first component or dimension accounts for the

maximum proportion of the variance of the original image, and subsequent components

account for maximum proportion of the remaining variance.”

Mathematical background on principal component analysis:

Eigenanalysis:

The mathematical technique used in PCA is called eigenanalysis:

It solves for the eigenvalues and eigenvectors of a square symmetric matrix of sums of squares and cross products. The eigenvector associated with the largest eigenvalue has the same direction as the first principal component. The eigenvector associated with the second largest eigenvalue determines the direction of the second principal component. The sum of the eigenvalues equals the trace of the square matrix, and the maximum number of eigenvectors equals the number of rows (or columns) of this matrix.
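
These properties can be checked numerically. The sketch below uses NumPy's `numpy.linalg.eigh`, which is intended for symmetric matrices and returns eigenvalues in ascending order; the 2 x 2 matrix is an arbitrary illustrative example, not data from this report:

```python
import numpy as np

# An arbitrary square symmetric matrix (illustrative values only).
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])

# eigh returns eigenvalues in ascending order; eigenvectors are the columns.
eigenvalues, eigenvectors = np.linalg.eigh(A)

# Reorder so that the largest eigenvalue (first principal direction) comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

print(eigenvalues.sum(), np.trace(A))  # the two values agree
print(eigenvectors.shape[1])           # number of eigenvectors = number of rows
```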

Important Terms in Principal Component Analysis (PCA):

1. Factor analysis: The search for the “factors” (i.e. band combinations) that contain the

most information.

2. Original Data: The set of brightness values for n bands and m pixels.

3. PCA: A linear method of factor analysis that uses the mathematical concepts of eigenvalues and eigenvectors. It amounts to a “rotation” of the coordinate axes to identify the Principal Components.

4. Principal Component: An optimum linear combination of band values comprising a new data layer (or image).

5. Co-variance: A measure of the redundancy of two bands (i and j), computed by averaging the product of the mean-subtracted values of the two bands over all the pixels (m).

6. Correlation: Co-variance normalized by the standard deviations (the square roots of the variances) of the two bands.

7. Redundant bands: Bands with a correlation coefficient (CC) of 1 contain the same information.

8. Correlation Matrix: A square symmetric matrix containing the correlation coefficients between every pair of bands. It contains statistical information about the data.

9. Eigenvector: The set of weights applied to band values to obtain the PC. Eigenvectors

represent the orientation of the principal axes of each component (as an angle). Eigenvectors

are standardized so that the squares of the elements sum to one. Therefore, an eigenvector

loading reflects the relative importance of a variable within a principal component but does

not reflect the value of the component itself. Eigenvectors show the directions of the axes of a fitted ellipsoid.

10. Eigenvalue: A measure of the variance in a PC. Eigenvalues represent the length of each successive component axis; therefore, the greater the eigenvalue, the more "important" the component is for explaining the variation in the dataset. The percentage of the total variance explained by each eigenvalue is often more useful, as it gives the relative contribution of each eigenvalue to explaining the variance in the dataset. Eigenvalues show the significance of the corresponding axis: the larger the eigenvalue, the greater the separation between the mapped data. For high-dimensional data, only a few of the eigenvalues are significant.

11. Axes Rotation: Multiplying the original data matrix by a matrix of eigenvectors is

equivalent to rotating the data to a new coordinate system.

12. Standardized PC: The principal components calculated using the correlation matrix.

13. Unstandardized PC: The principal components calculated using covariance matrix.
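
The co-variance and correlation matrices from the list above can be computed in a few lines. This is a minimal sketch on synthetic data (random numbers standing in for band values); it divides by m - 1, the sample convention that `numpy.cov` uses by default, which is an assumption rather than something specified in this report:

```python
import numpy as np

# Synthetic data: n = 3 bands (rows), m = 100 pixels (columns).
rng = np.random.default_rng(1)
D = rng.normal(size=(3, 100))
D[1] = 0.8 * D[0] + 0.2 * D[1]  # make band 2 partly redundant with band 1

m = D.shape[1]
centered = D - D.mean(axis=1, keepdims=True)  # subtract each band's mean

# Co-variance matrix: averaged products of mean-subtracted band values.
cov = centered @ centered.T / (m - 1)

# Correlation matrix: co-variances divided by the two standard deviations.
std = np.sqrt(np.diag(cov))
corr = cov / np.outer(std, std)

print(np.round(corr, 2))
```

Eigenanalysis of `corr` would give standardized PCs, and eigenanalysis of `cov` would give unstandardized PCs.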

Outline of Principal Component Analysis:

1. Start with an image data set including the reflectance values from n bands with m pixels. This non-square n × m matrix will be called D.

2. This data set may contain “redundancies”, i.e. bands whose reflectances correlate perfectly with another band. It may also contain noise.

Noise: Our definition of noise is signal that does not correlate at all between bands.

3. Subtract the mean from each band and compute the variances and co-variances between each pair of bands. Place these values into an n × n square matrix (say A). It is symmetric. Normalize each co-variance by the square roots of the variances of the two bands involved to form the correlation matrix. This is a useful matrix to study, and it forms the basis of PCA.

Note: At this point, one could just delete bands that correlate well with other bands. This

action would reduce the size of the data set. The PCA method below is more objective and

systematic.

4. Find the eigenvalues and eigenvectors of the matrix A by solving this formula:

$A V = \lambda V$ (1)

where $\lambda$ is an eigenvalue and V is an eigenvector.

Eigen: The word “eigen” means that these quantities are characteristic of the correlation matrix (say A). They reveal its hidden properties.

Typically there will be n different solutions to (1), so there will be n paired eigenvalues and eigenvectors (i.e. $\lambda_i$ and $V_i$). The eigenvalues will be real (because A is symmetric) and non-negative (because A is a co-variance or correlation matrix). We usually list these eigenvalues and eigenvectors in order of decreasing eigenvalue. That is, the first eigenvector corresponds to the largest eigenvalue.

5. The eigenvectors have a dimension equal to n, i.e. the number of original bands. The first

eigenvector represents a synthetic spectrum containing the largest variance across the scene.

6. The eigenvectors are orthogonal to each other, for example $V_1 \cdot V_2 = 0$. They are normally scaled so that their length is unity, that is $|V_i| = 1$.

With these two properties, multiplying the original dataset by an eigenvector rotates the reflectance vector for each pixel:

$PC_1 = V_1^{T} D$ (2)

where $PC_1$ is the first Principal Component. It is a vector with m components representing a brightness value for each pixel, i.e. it is a new single band image. Its pixel values are linear combinations of the original band values for that pixel. The weights are given by the components of the first eigenvector. Because of our ordering of eigenvalues, this image contains the most “information” of any single image.

The first eigenvalue is proportional to the brightness variance in the first PC.

To obtain the other Principal Components, we repeat (2) with the other eigenvectors so that

$PC_i = V_i^{T} D, \quad i = 1, \dots, n$ (3)

And the PC data layers can be stacked to form a new “data cube”. If desired, only the first

few PCs can be kept, reducing the size of the dataset. For example, if the original dataset had

200 bands (i.e. n = 200), you could keep only the ten PCs with the largest eigenvalues. This

dataset is only 1/20 of the original size. The number of pixels is unchanged.

7. A remarkable property of the new data cube is that the band values are completely uncorrelated. There is no more redundancy! (Actually, the uncorrelated signal in the original scene, i.e. the noise, is pushed into the higher-numbered PCs.) Another property is that the bands are ordered by their “information content” (i.e. variance).
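
The seven steps above can be sketched end to end as follows. This is an illustrative sketch on synthetic data, not the procedure of any specific software package. The function name `pca` and its `standardized` flag are hypothetical, and the sketch projects the mean-subtracted data, which only shifts each PC image by a constant compared with projecting D itself:

```python
import numpy as np

def pca(D, standardized=False):
    """PCA of an n x m matrix D (n bands as rows, m pixels as columns)."""
    m = D.shape[1]
    # Step 3: subtract the mean of each band.
    X = D - D.mean(axis=1, keepdims=True)
    if standardized:
        # Dividing by each band's standard deviation makes A the correlation matrix.
        X = X / X.std(axis=1, ddof=1, keepdims=True)
    A = X @ X.T / (m - 1)  # n x n symmetric matrix
    # Step 4: eigenvalues and eigenvectors of A, largest eigenvalue first.
    eigenvalues, V = np.linalg.eigh(A)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues, V = eigenvalues[order], V[:, order]
    # Step 6: row i of pcs is the i-th principal component image (m values).
    pcs = V.T @ X
    return eigenvalues, V, pcs

# Synthetic scene: bands 1 and 2 are highly redundant, band 3 is independent.
rng = np.random.default_rng(42)
m = 500
base = rng.normal(size=m)
D = np.vstack([
    base + 0.1 * rng.normal(size=m),
    2.0 * base + 0.1 * rng.normal(size=m),
    0.5 * rng.normal(size=m),
])

eigenvalues, V, pcs = pca(D)
explained = eigenvalues / eigenvalues.sum()  # share of total variance per PC
pc_cov = np.cov(pcs)                         # co-variance matrix of the PC bands
reduced = pcs[:2]                            # keep only the first two PCs

print(np.round(explained, 3))
print(np.round(pc_cov, 6))  # off-diagonal entries are numerically zero
```

The co-variance matrix of the PC bands comes out diagonal, with the eigenvalues on the diagonal in decreasing order, which corresponds to the two properties described in step 7.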

Advantage:

The primary advantage of PCA is that it generates orthogonal (uncorrelated) components that together represent 100% of the variance present in the original dataset.

Disadvantage:

A disadvantage of the PC representation is that one can no longer identify spectral signatures

of objects. A “pixel profile” in the new data cube is not a spectral signature (i.e. reflectance

plotted against wavelength).

Summary of Principal Components Analysis (PCA):

PCA is a technique that transforms the original vector image data into a smaller set of uncorrelated variables.

These variables represent most of the image information and are easier to interpret.

Principal components are derived such that the first PC accounts for much of the variation in the original data. The second (orthogonal to the first) accounts for most of the remaining variation.

PCA is useful for reducing the dimensionality (number of bands) used for analysis.

The minimum noise fraction (MNF) method can be used with hyperspectral data for noise reduction.

Expert Opinion about PCA:

1. Mather (2003) states that PCA is a standard method for deriving reduced data or minimizing information redundancy in the original image.

2. Zumsprekel and Prinz (2000) state that PCA reduces the dimensionality of the dataset while retaining as much information as possible.

3. A more detailed treatment of PCA can be found in Jensen (1996), Mather (2003), and Gibson and Power (2000). Principal Component Analysis was run on bands 1, 2, 3, 4, 5, and 7. The results were six PCA bands, which were visually assessed using RGB band combinations.

4. Zumsprekel and Prinz (2000) and Rajesh (2008) state that the first PC (PC-1) combines the total albedo difference of all original TM bands, the second PC (PC-2) emphasizes the spectral differences between the visible spectrum (VIS) and the infrared spectrum (IR), and the third PC (PC-3) illustrates albedo variations within the IR spectrum. They further state that the higher principal components (PC-4 to PC-6) may contain important lithological information but are often increasingly loaded with noise effects. The first 3 PC bands have been selected in this study, because they best highlight the lithological features.

References

PDF-FORMATS:

1. A Hybrid Image Classification Approach for the Systematic Analysis of Land Cover (LC) Changes in the Niger Delta Region.

2. A Tutorial on Principal Component Analysis.

3. Evaluating Principal Components Analysis for Identifying Optimal Bands Using Wetland Hyperspectral Measurements from the Great Lakes, USA.

4. Identifying Hydrocarbon Leakage Induced Anomalies Using Landsat-7/ETM+ Data Processing Techniques in the West Slope of Songliao Basin, China.

5. Land-Cover Classification Using ASTER Multi-Band Combinations Based on Wavelet Fusion and SOM Neural Network.

BOOKS:

1. Digital Remote Sensing, by Prithvish Nag and M. Kudrat.

2. Introduction to Remote Sensing, by James Campbell and Randolph Wynne.

3. Jensen, J.R. (1996). Introductory Digital Image Processing: A Remote Sensing Perspective. Second edition. Prentice Hall.

4. Lillesand, T. M. and R. W. Kiefer (2002). Remote Sensing and Image Interpretation. Fifth edition. New York, John Wiley & Sons.

5. Textbook of Remote Sensing and Geographical Information System, by Kali Charan Sahu.