Classification & Analysis of Digital MSS Data Lecture 8 Summer Session 09 August 2011


Lecture 9 - GEOG 372 Remote Sensing

Tips for the Final Exam
- Make sure your answers are clear, without convoluted language.
- Read questions carefully: are you answering the entire question?
- Be thorough! Synthesize material you have learned in this class.
- Think about applications of remote sensing science: tie in your lab experiences with what you have learned from the book/lecture.

Likely format:
- 14 multiple choice, 2 points each = 28
- 10 short answer, 4 points (9) or 6 points (1) each = 42
  - You will have choices for most: 1A or 1B, 2A or 2B, etc.
- 2 or 3 essay questions, total of 50 points
  - You will probably have choices here, too.
  - Longer, more challenging, more comprehensive.

Extra Credit Assignment Lab worth 3%, due 18 August before final exam

Today, we are going to start discussing methods used to analyze and classify satellite imagery collected by spaceborne MSS sensors

While image interpretation is a very important component in this process, with multi-band MSS data it is somewhat limiting

For example, here we have an image that was created from Landsat ETM data

Recall Landsat ETM has 6 different bands or channels in the visible/RIR

It is impossible to display all these bands on a single color image; you can only display 3 channels at a time

So here we have an image from bands 1-3

Made from the blue, green, and red channels

We have 6 bands, so why not use them?

Here we have an example of the digital numbers from the different land surfaces

Question: how do you systematically process, analyze, and display the information that is available from these 4, 6, or perhaps more different channels?

This leads us to today's topic: image processing and analysis

This is a topic of great interest in remote sensing, and if you are interested in learning more, I suggest that you consider taking GEOG 472, which covers this topic in great detail

Goals of image processing are three-fold

WOB: Identify and quantify

Goals for image processing
- Identify: features, single or multiple characteristics, areas of change, etc.
- Quantify: spatial extent of features; magnitude of features (levels; e.g. fire severity, or extent of burn)
- Analyze: derive meaningful information

Essentially there are three goals... The first is to IDENTIFY!

Image Processing: 3 Primary Tasks
- Identify and map a specific feature of interest on the imagery, e.g. identify deforested areas
- Create a map with multiple categories or levels, e.g. create a land cover map
- Create maps that represent different levels of a surface/atmosphere characteristic, e.g. estimate net primary production in oceanic regions
  - A map of different levels of the same characteristic, e.g. percent tree cover 0-100%


Single Characteristic
Multiple Categories
Different levels of a single characteristic

Here we have three examples of these primary goals in image processing. Specifically, these are examples of image classification, which is today's focus

Image Classification
The process of automatically dividing all pixels within a digital remote sensing image into:
- Land or surface-cover categories
- Information themes or quantification of specific surface characteristics

Image classification is a very broad area of research in remote sensing

Perhaps one of the most intensive areas of research in the field over the past 30 years

In this lecture, I want to briefly review some of the basic approaches to image classification because, in many cases, what you are going to see is an image product that is based on a classification approach

From: http://www.fes.uwaterloo.ca/crs/geog376.f2001/ImageAnalysis/ImageAnalysis.html#ImageProcessingSteps

Example

Here we have another land cover map that was based on analysis of Landsat imagery

In this case, you see we have a different set of categories

The selection of categories depends on the information requirements of the user. For example, in a broader study, this entire area might be termed: urban complex (scale is important!)

1: Evergreen Needleleaf Forests; 2: Evergreen Broadleaf Forests; 3: Deciduous Needleleaf Forests; 4: Deciduous Broadleaf Forests; 5: Mixed Forests; 6: Woodlands; 7: Wooded Grasslands/Shrubs; 8: Closed Bushlands or Shrublands; 9: Open Shrublands; 10: Grasses; 11: Croplands; 12: Bare; 13: Mosses and Lichens

http://www.geog.umd.edu/landcover/8km-map.html

There have even been approaches to use satellite imagery to produce global land cover products

Here we have a 13 category landcover map that was produced from AVHRR imagery at an 8 km resolution by Ruth DeFries and John Townshend, and colleagues here at the University of Maryland

Radar image classification

This is a radar image classification, differentiating varying kinds of vegetation (the dominant land cover type in this scene)

Cropland Probability 0-100%
Pittman et al. (2010)

The same characteristic measured at different levels: what is the probability that any given pixel is predominantly cropland? Because it is so difficult to pin down this class (and others!), probability is often a good way of expressing a land cover type, since the user can decide an acceptable threshold for their purpose.
- Determined by running dozens of classification algorithms and calculating what % of the time a pixel was labeled crop!

This is different from percent cropland, which describes what percentage of a given pixel is cropland (where we want to be!)

SeaWiFS (Sea-viewing Wide Field-of-view Sensor) image classification: chlorophyll concentration in the Gulf of Mexico

Here we have an example of chlorophyll concentration mapped in the Gulf of Mexico from SeaWiFS, another example of levels of the same characteristic

This information product was made by applying an algorithm to the different bands of SeaWiFS

Image Slicing and Thresholding
- Thresholding of digital values, i.e. % reflectance or DN
- Thresholding of transformed values, e.g. NDVI, NBR, etc.

Let's start our foray into image classification by looking at the simplest method: image thresholding and slicing.

Image slicing is like density slicing

Image classification based on average data values in a single channel is a risky undertaking

Here we have the digital numbers from a variety of land cover types from the Alaskan ETM imagery that we have looked at.

Based on these data, it appears that one might be able to come up with a simple land cover classification based on the digital numbers in a single channel

Anything less than 20 is water
Anything > 20 but < 25 is a burn
Spruce > 35 but < 42
Soil > 45, but < 52
Gravel > 55, and

40 = Land
< 40 = water

When we examine the histogram in this channel, we can see that the distribution of points for land does not intersect that from water

Therefore, in this case, we can use a simple threshold to create a map that has two categories: land vs. water
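A minimal Python sketch of this two-category threshold may help make it concrete. It is not from the original slides: the sample DN values are made up, the function name is hypothetical, and it assumes a DN of exactly 40 counts as land.

```python
# Threshold taken from the land/water example; everything else is illustrative.
WATER_THRESHOLD = 40


def threshold_land_water(dn_values, threshold=WATER_THRESHOLD):
    """Label each DN as 'water' (below threshold) or 'land' (at or above it)."""
    return ["water" if dn < threshold else "land" for dn in dn_values]


sample_dns = [12, 18, 39, 40, 47, 63]  # made-up single-band pixel values
labels = threshold_land_water(sample_dns)
print(labels)
```

This only works because the land and water histograms do not overlap in this channel; with overlapping histograms, any single cut-off would mislabel some pixels.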

Note: simple thresholding is rarely used with MSS data unless you have a very distinct separation of features and a limited number of categories.

For example, here thresholding would work if one wants to simply separate water from land

Two-step level slicing or thresholding
Step 1: Estimate the range of values of a given surface characteristic on a single band, e.g. vegetation on Landsat 7 ETM+ Band 4
Step 2: Create discrete levels of the characteristic (slice up the histogram)

Draw a histogram during step 1

Slice it up in step 2

Example of 2-step level slice
With AVHRR data, greenness can be estimated from the Normalized Difference Vegetation Index (NDVI)

Greenness = (Near IR - Red) / (Near IR + Red)

This greenness map was created by level slicing NDVI values

This is an AVHRR product produced for the state of Alaska Fire Service based on calculating and displaying the NDVI

In this case, NDVI was calculated from the red and NIR channel of AVHRR

Then the values of NDVI were divided into 11 different categories
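As a rough sketch of the two steps just described (compute NDVI, then slice it into levels), here is a small Python example. The pixel values, the function names, and the choice of 11 equal-width slices over the range 0 to 1 are assumptions for illustration; the slides do not say how the 11 category boundaries were actually chosen.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: treat an all-zero pixel as NDVI 0
    return (nir - red) / total


def level_slice(value, low, high, n_levels):
    """Assign value to one of n_levels equal-width slices of [low, high]."""
    if value <= low:
        return 0
    if value >= high:
        return n_levels - 1
    width = (high - low) / n_levels
    return int((value - low) / width)


# Made-up (near IR, red) pairs for three pixels.
pixels = [(80, 20), (50, 50), (20, 80)]
levels = [level_slice(ndvi(nir, red), 0.0, 1.0, 11) for nir, red in pixels]
print(levels)
```

Negative NDVI values fall into the lowest slice here, which is one simple way to handle non-vegetated surfaces.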

This works, but whether or not NDVI accurately provides information on the amount of green biomass remains to be seen, i.e. NDVI is a secondary measure of the earth's characteristics, not a direct representation of surface characteristics. Recall that it's a relative indicator, not an absolute measure. Using a density slice to cut up the exact histogram to get different levels of biomass would be more difficult!

Image Classification
We have seen the limitations of density slicing, or single-band classifications

Let's look at how exactly multiple bands of information are combined to perform a multiband classification

There are two general categories of image classification: supervised and unsupervised (defined on the next page)

Challenge in remote sensing: how does one capture the information content that is available in the different channels of the digital image?

Here we have three different color composites, each of which gives us unique information that the others fail to differentiate.

It would be optimal to use as many bands' worth of information as we can, so how do we capture the information content that is available in the different channels of the digital image?

Obviously, when you have multiple bands of imagery, you are not limited to thresholding using a single channel of information

Here we have digital number values from different land surface covers from the Alaskan scene from the previous Landsat image

Obviously the different land surface types have different reflectances, and therefore radiances, so they have different DNs in the different channels

You can see here that Band 4 is useful for differentiating Spruce from Aspen

When one plots out the average digital numbers from two channels, you begin to see that features that have a similar DN in one channel may in fact have a dramatically different DN in another channel

Examples

Spruce and Soil

Aspen and gravel

All of these have DNs that are near one another in Band 4, but are dramatically different in Band 3

Lillesand and Kiefer Figure 7-39

If you find this interesting, read up on the tasseled cap transformation

In actuality, when you plot out the digital numbers for two different channels for different land cover types, you don't get single points; you get a number of points that represent the range of values in the two channels

Note: in this example, band 4 is on the x-axis and band 3 is on the y-axis

We have 6 different land cover categories represented

In theory and practice, you can actually create an n-dimensional scatter plot of points

In this example, we can see that even though each land cover type has a spread of values in the two channels, there is still separation in the points from the different channels

Thus, it is possible to use the values from the two (or more) channels to discriminate the different land cover categories.

A number of image processing approaches have been developed to do just this

Image Classification
The process of automatically dividing all pixels within a digital remote sensing image into discrete categories

Supervised vs. unsupervised

There are two general categories of image classification: supervised and unsupervised

This can be done based on spectral characteristics, temporal characteristics, textural characteristics, or a combination of any of these.

Today, we are focusing on spectrally-based image classification

Supervised vs. Unsupervised Classification
Supervised classification: a procedure where the analyst guides or supervises the classification process by specifying numerical descriptors of the land cover types of interest

Unsupervised classification: the computer is allowed to aggregate groups of pixels into like clusters based upon different classification algorithms

Supervised classification typically uses something called training sets or areas

WRITE ON BOARD

Training areas
- Specified by the analyst to represent the land cover categories of interest
- Used to compile a numerical interpretation key that describes the spectral attributes of the areas of interest
- Each pixel in the scene is compared to the training sets, and then assigned to one of the categories


Multiband Classification Approaches
- Minimum distance classifiers*
- Parallelepiped classifiers*
- Maximum likelihood classifiers*
- Decision trees*
- Neural networks

*covered in class (know for exam)

Minimum Distance Classifiers
Step 1: calculate the average value for each training area in each band
[Figure: scatter plot of training pixels f and c, with each class average marked +]

Let's say f = forest, and c = corn

The first step is to compute the average of the different observations (pixels) that represent each class, e.g., for each training area

Note: here we have two different bands, but in reality you can have multiple bands, depending on the sensor you are using

Minimum Distance Classifiers
Step 2: for each unclassified pixel, calculate the distance to the average for each training area
The unclassified pixel is placed in the group to which it is closest
[Figure: the same scatter plot of f and c training pixels and class averages (+), with the unclassified pixel marked *]

After estimating the distance to each training set category, you assign your pixel to the category whose average is the minimum distance away

Minimum Distance Classifiers
Lillesand and Kiefer Figure 7-40

A minimum distance classifier operates by determining which training set center has the minimum distance to the unclassified pixel.

In this example, the minimum distance from pixel 1 is to type C, corn

Pixel 2 is closer to type s, the sand category
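The two steps of the minimum distance classifier can be sketched in a few lines of Python. This is an illustrative sketch only: the training pixels are made-up two-band DN pairs, the function names are hypothetical, and straight-line (Euclidean) distance is assumed as the distance measure.

```python
import math


def class_means(training):
    """Step 1: average value of each training area in each band."""
    means = {}
    for label, pixels in training.items():
        n_bands = len(pixels[0])
        means[label] = tuple(
            sum(p[b] for p in pixels) / len(pixels) for b in range(n_bands)
        )
    return means


def classify_min_distance(pixel, means):
    """Step 2: assign the pixel to the class whose mean is closest."""
    return min(means, key=lambda label: math.dist(pixel, means[label]))


# Made-up (band 3, band 4) training pixels for two classes.
training = {
    "forest": [(20, 60), (22, 64), (24, 62)],
    "corn": [(40, 30), (42, 34), (44, 32)],
}
means = class_means(training)
print(classify_min_distance((25, 55), means))
print(classify_min_distance((38, 36), means))
```

Only the class means are stored, which is why the method is cheap to compute and also why it cannot tell a tightly clustered class from a highly variable one.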

Advantages of MDC: simple, computationally efficient

Disadvantage: does not recognize different degrees of variance in spectral signatures

For example, the urban class has a high degree of variability, so Pixel 2 could very well be an urban pixel

Advantages/Disadvantages of Minimum Distance Classifiers
Advantages
- Simple and computationally efficient
Disadvantages
- Does not factor in the fact that some categories have a large variance, e.g. pixel #2 on the last slide ended up in sand, but could have been urban!

Parallelepiped Classifiers
Step 1: define the range of values in each training area and use these ranges to construct an n-dimensional box (a parallelepiped) around each class
[Figure: scatter plot of f and c training pixels]

In parallelepiped classifiers, you construct an n-dimensional box around the pixels within each category of interest

You use the n-dimensional space defined by the parallelepiped to define the different categories

Lillesand and Kiefer Figure 7-41

In a parallelepiped classifier, a pixel falls into a category if it falls within the n-dimensional box; otherwise it is unclassified

You can see that a problem with the PP is that there can be overlap between categories
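To make the box idea and the overlap problem concrete, here is a small Python sketch. The training pixels and function names are hypothetical, and the boxes are simply the per-band minimum and maximum of each training area; returning every matching class (rather than picking one) is a choice made here to expose the overlap.

```python
def build_boxes(training):
    """Per class, record the (min, max) range of the training pixels in each band."""
    boxes = {}
    for label, pixels in training.items():
        n_bands = len(pixels[0])
        boxes[label] = [
            (min(p[b] for p in pixels), max(p[b] for p in pixels))
            for b in range(n_bands)
        ]
    return boxes


def classify_parallelepiped(pixel, boxes):
    """Return every class whose box contains the pixel (empty list = unclassified)."""
    return [
        label
        for label, ranges in boxes.items()
        if all(lo <= v <= hi for v, (lo, hi) in zip(pixel, ranges))
    ]


# Made-up two-band training pixels; the two boxes overlap slightly.
training = {
    "forest": [(20, 60), (24, 64), (30, 58)],
    "corn": [(28, 50), (40, 62), (44, 40)],
}
boxes = build_boxes(training)
print(classify_parallelepiped((22, 61), boxes))  # inside one box
print(classify_parallelepiped((29, 60), boxes))  # inside both boxes
print(classify_parallelepiped((10, 10), boxes))  # inside no box
```

A pixel that lands in two boxes needs some tie-breaking rule, which is the gap that stepped decision boundaries try to close.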

To overcome this, stepped decision region boundaries are used

Lillesand and Kiefer Figure 7-41
Fix the overlapping regions with a parallelepiped classifier with a stepped decision region boundary

Here we have an example of a parallelepiped classifier with a stepped decision region boundary; this can improve your results

Maximum likelihood classifiers
Based on a probability function derived from a statistical distribution of reflectance values

Lillesand and Kiefer Figure 7-46
Plots of DN values create a histogram, which usually fits a certain statistical distribution

When you make a plot of the number of pixels with a given DN value, you are creating a histogram of the distribution of these DN values

Most curves fit a certain statistical distribution

There are statistical functions or equations which describe the distribution of data

One of the most common, displayed in this figure, is the normal or Gaussian distribution

This is the so-called bell shaped curve, aka normal distribution

Studies have shown that DN values from most MSS systems have a normal distribution

Lillesand and Kiefer Figure 7-43
A 3-dimensional normal curve fit to the data values from an example of digital values from the two channels of the Landsat scene

For example, here we have a 3-dimensional normal curve fit to the data values from our example of digital values from the two channels of the Landsat scene

Normal distributions are used in statistical algorithms
One can use the normal distribution to define the probability of a pixel being within a certain class

The operator then defines the levels of probability that are acceptable for classification of a given pixel

Steps for maximum likelihood classifier
- Determine the n-dimensional curve for a particular feature
- Fit it to a normal distribution
- Use statistical algorithms to describe them
- Define the levels of probability acceptable for classification of a given pixel

Fortunately, ENVI has all of these embedded, just like for minimum distance and parallelepiped.

Lillesand and Kiefer Figure 7-44
Maximum likelihood classifiers
- The equal probability contours that were constructed around the different training areas are used to classify the images
- Max likelihood classifier selects the category with highest probability for a pixel

For example, here we have the equiprobability contours that were constructed around the different training areas from our example.

These equiprobability contours are used to classify the image

You can see that in some cases, the equiprobability contours overlap between different categories based on the training sets.

What the maximum likelihood classifier does is select the category that has the highest probability for that pixel.
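A simplified Python sketch of this idea follows. It is not the ENVI implementation: the training pixels and function names are hypothetical, and to stay short it treats the bands as independent (one normal curve per band) instead of fitting the full n-dimensional curve with band-to-band covariance.

```python
import math


def fit_gaussian(pixels):
    """Per-band (mean, variance) of one training area."""
    n = len(pixels)
    stats = []
    for b in range(len(pixels[0])):
        values = [p[b] for p in pixels]
        mean = sum(values) / n
        var = sum((v - mean) ** 2 for v in values) / (n - 1)
        stats.append((mean, var))
    return stats


def log_likelihood(pixel, stats):
    """Log of the normal density, treating bands as independent (a simplification)."""
    total = 0.0
    for value, (mean, var) in zip(pixel, stats):
        total += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
    return total


def classify_max_likelihood(pixel, class_stats):
    """Pick the class under which the pixel is most probable."""
    return max(class_stats, key=lambda c: log_likelihood(pixel, class_stats[c]))


# Made-up training areas: sand is tightly clustered, urban is highly variable.
training = {
    "sand": [(50, 50), (51, 49), (49, 51), (50, 50)],
    "urban": [(60, 70), (90, 50), (50, 90), (80, 85), (70, 55)],
}
class_stats = {label: fit_gaussian(px) for label, px in training.items()}
print(classify_max_likelihood((57, 57), class_stats))
print(classify_max_likelihood((50.5, 50), class_stats))
```

In this made-up data the pixel (57, 57) is nearer to the sand mean than to the urban mean, yet it is far more probable under the wide urban distribution, which is the behavior the minimum distance classifier lacks.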

Unsupervised classification
- Lack of a priori information on what types of land or vegetation cover exist within a region
- BUT: it may be difficult to interpret the computer generated classes

Unsupervised Classification
Lillesand and Kiefer Figure 7-51
Allow the computer to identify clusters based on different classification procedures

In many cases, you may not have a priori information on what types of land or vegetation cover exist within a region, and you can let the computer identify unique spectral classes for you

For example, here we have three clusters that exist within a Landsat image of a forested region

The problem with unsupervised classification approaches is that you may not know what the classes identified by the computer represent
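The slides do not name a specific clustering procedure, so the following Python sketch uses k-means purely as one common example of how a computer can find spectral clusters without training areas. The pixel values, the starting centers, and the function name are all assumptions for illustration.

```python
import math


def k_means(pixels, centers, n_iter=10):
    """Group pixels into len(centers) clusters by repeated nearest-center assignment."""
    centers = list(centers)
    assignment = []
    for _ in range(n_iter):
        # Assign every pixel to its nearest current cluster center.
        assignment = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in pixels
        ]
        # Move each center to the mean of the pixels assigned to it.
        for k in range(len(centers)):
            members = [p for p, a in zip(pixels, assignment) if a == k]
            if members:  # leave an empty cluster's center where it is
                centers[k] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignment, centers


# Made-up two-band pixels that form three spectral groups.
pixels = [
    (10, 12), (11, 10), (12, 11),
    (40, 42), (41, 40), (42, 41),
    (70, 15), (72, 14), (71, 16),
]
assignment, centers = k_means(pixels, centers=[(0, 0), (50, 50), (80, 0)])
print(assignment)
print(centers)
```

The output is only cluster numbers 0, 1, and 2; nothing in the result says which cluster is forest or water, so the analyst still has to interpret each one.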

In many cases, you use a hybrid approach

WRITE ON BOARD

Hybrid Classification Approach
- Perform an unsupervised classification to create a number of land cover categories within the area of interest
- Carry out field surveys to identify the land cover type represented by different unsupervised clusters
- Use a supervised approach to combine unsupervised clusters into similar land cover categories

Sources of Uncertainty in Image Classification
- Non-representative training areas
- High variability in the spectral signatures for a land cover class
- Mixed land cover within the pixel area

Mixed pixels
In many cases, the IFOV of a sensor will include multiple land cover categories, i.e., a mixed pixel

Mixed pixels contribute to classification errors

There are very few pure pixels out there, at least when we are talking about moderate to coarse resolution.

What happens in a mixed pixel is that the digital number for the mixed pixel will be a combination of the values of the varying reflectance characteristics represented at the sub-pixel level. Subpixel heterogeneity is a huge problem for me in my work because I am looking at agriculture globally, i.e. I need to use a coarse resolution sensor to get the temporal resolution I need, but lots of croplands are very small and fragmented by forests and other changeable land cover types.

[Figure: scatter plot of f, c, d, and m pixels]

Question: How do different algorithms treat mixed pixels?

In some cases, mixed pixels are close enough to a specific category, which leads to misclassifications

The problem with mixed pixels is that sometimes the values fall between the categories that contain the classes that comprise the mixed pixel, and therefore are not classified as those categories
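A small numeric sketch can show how a mixed pixel ends up in the wrong category. It assumes, as a simplification not stated in the slides, that the mixed DN is the area-weighted average of its components; the class signatures and function names are made up.

```python
import math


def mix(signatures, fractions):
    """DN of a mixed pixel as the area-weighted average of its components."""
    n_bands = len(next(iter(signatures.values())))
    return tuple(
        sum(fractions[name] * signatures[name][b] for name in fractions)
        for b in range(n_bands)
    )


def nearest_class(pixel, signatures):
    """Minimum distance assignment to the closest class signature."""
    return min(signatures, key=lambda name: math.dist(pixel, signatures[name]))


# Made-up two-band mean signatures for three land cover classes.
signatures = {"forest": (20, 80), "crop": (60, 40), "shrub": (42, 58)}

# A pixel that is half forest and half crop on the ground.
mixed = mix(signatures, {"forest": 0.5, "crop": 0.5})
print(mixed, nearest_class(mixed, signatures))
```

In this made-up case the half-forest, half-crop pixel sits between its two true components and is labeled shrub, a class that is not present in the pixel at all.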

However, in many cases, they fall very close to the other categories themselves, and can represent an error in classification

Decision Tree Classifier
Decision tree classifiers use a simple set of rules to divide pixels into different land cover types

Most land cover products from AVHRR time series data are derived using decision tree classifiers


http://www.geog.umd.edu/landcover/8km-map.html

Earlier I showed a global land cover map that had a number of different categories

This map was created from AVHRR data using a decision tree classifier

Classification logic
[Figure: classification logic, panels a), b), c)]

This is the logic that was used to create this land cover map.

- Within each of these steps, a variety of different characteristics (spatial, temporal, textural, spectral) can be picked out and grouped according to different algorithms

The decision to separate vegetated from non-vegetated is based on max NDVI

If the max NDVI is less than 0.155, it is sorted into non-veg; if > 0.155, into vegetation
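The first rule of this tree is simple enough to write out as a Python sketch. The NDVI time series below are made up, the function name is hypothetical, and a maximum NDVI of exactly 0.155 is assumed to fall on the non-vegetated side because the notes do not say which way it goes.

```python
NDVI_VEG_THRESHOLD = 0.155  # threshold quoted in the notes above


def split_vegetation(ndvi_series, threshold=NDVI_VEG_THRESHOLD):
    """First rule of the tree: vegetated vs. non-vegetated from the maximum NDVI."""
    return "vegetation" if max(ndvi_series) > threshold else "non-vegetated"


# Made-up NDVI time series for two pixels over a year.
bare_pixel = [0.05, 0.08, 0.12, 0.10, 0.06]
forest_pixel = [0.20, 0.45, 0.70, 0.60, 0.25]
print(split_vegetation(bare_pixel))
print(split_vegetation(forest_pixel))
```

Each later rule in the tree would take only the pixels that passed an earlier rule and split them again on another characteristic.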

Additional steps are used to classify it into forest versus woodland

Further steps are used to classify into tall versus short vegetation

In this approach, you can identify training sets in the same fashion as you do with multiple channel classifiers

The computer then can develop a decision tree to divide the data into different classes

Now, here we have an example of a land cover product generated for the North American boreal forest region

Note that there are more categories (17 total), but a very similar approach has been used, i.e., a decision tree classifier

Here we have an additional AVHRR product generated for Alaska using a decision tree classifier and NDVI composite data

This map contains 23 land cover categories

Again, this map was generated through the analysis of time series Landsat imagery so there is a temporal component to classification as well!

www.eomf.ou.edu
They used time-series VIs to look for # of cycles (croppings) per year to map agricultural intensification

This is an example of a temporally based classifier for mapping purposes, where they looked at a year's worth of VI data (every 8 days) to try to see how many different vegetation cycles (green up and brown down) were present. This is a characteristic which is unique to agriculture, i.e. forests have at most one cycle, less if they are evergreen. Where they saw multiple cycles in a single year, they knew they were looking at agriculture, and intense agriculture at that.

Accuracy assessment & Validation
- It is necessary to provide information about the accuracy of a given mapping approach
- Different applications require different levels of accuracy
- For land cover classifications, no global map has ever exceeded 70% accuracy
- And what is accuracy in the context of remote sensing, anyway? (Keep this question in mind)
  - Relative to other maps?
  - Relative to the ground?
- There are efforts to standardize validation approaches
  http://landval.gsfc.nasa.gov/pdf/GlobalLandCoverValidation.pdf