8/3/2019 CE 32110 Unsupervised Classification 2010
UNSUPERVISED CLASSIFICATION
Unsupervised Classification
In contrast to supervised classification, unsupervised classification requires only a minimal amount of initial input from the analyst.
It is a process in which numerical operations search for natural groupings of the spectral properties of pixels, as examined in multispectral feature space.
The user allows the computer to select the class means and covariance matrices to be used in the classification.
Once the data are classified, the analyst attempts to assign these natural or spectral classes to the information classes of interest.
This may not be easy.
Some of the clusters may be meaningless because they represent mixed classes of earth surface materials.
This ambiguity is resolved by the analyst, who must understand the spectral characteristics of the terrain in order to classify clusters into information classes.
UNSUPERVISED CLASSIFICATION
Clustering is one of the most important tasks in data mining and knowledge discovery.
It attempts to find subsets within a given data set that are similar enough to warrant further analysis.
It organizes a set of objects into groups (or clusters) such that objects in the same group are similar to each other and different from those in other groups.
These groups or clusters should have meaning in the context of a particular problem.
In clustering, two important and fundamental tasks are the definition of
a) the proximity (similarity function) between two data objects, and
b) the overall optimization search strategy, i.e. how to find the best overall grouping according to an optimization criterion.
Clustering, commonly known as unsupervised classification, does not need any training data and is especially useful when the user has limited knowledge about the data.
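As an illustration of the proximity measure in (a), the sketch below computes the distance between two pixels in feature space. Euclidean distance is an assumption made here; the slides do not name a specific measure, and the function name `euclidean_distance` is hypothetical.

```python
import math

def euclidean_distance(p, q):
    """Euclidean distance between two pixels given as sequences of band values."""
    if len(p) != len(q):
        raise ValueError("pixels must have the same number of bands")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Two pixels measured in two bands (values chosen for illustration).
d = euclidean_distance((10, 10), (20, 20))
print(round(d, 2))  # 14.14
```

A smaller distance means the two pixels are spectrally more similar; other measures could be substituted without changing the rest of the procedure.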
Clustering algorithms partition data into a certain number of clusters (groups, subsets, or categories).
There is no universally agreed upon definition of a cluster.
Most researchers describe a cluster by considering
a) the internal homogeneity and
b) the external separation,
i.e. patterns in the same cluster should be similar to each other, while patterns in different clusters should not.
Both the similarity and the dissimilarity should be examinable in a clear and meaningful way.
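One possible way to make homogeneity and separation examinable is sketched below: the mean distance of pixels to their own cluster centre (spread) is compared with the distance between cluster centres. The choice of these two quantities, the helper names and the sample values are all assumptions for illustration, not a definition taken from the slides.

```python
import math

def centroid(points):
    """Mean vector of a non-empty list of equal-length points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def mean_spread(points):
    """Average distance of the points to their own centroid (internal homogeneity)."""
    c = centroid(points)
    return sum(math.dist(p, c) for p in points) / len(points)

# Made-up pixel values for two well separated clusters.
cluster_a = [(10, 10), (20, 20), (12, 14)]
cluster_b = [(60, 70), (64, 66), (62, 74)]

# External separation: distance between the two cluster centroids.
separation = math.dist(centroid(cluster_a), centroid(cluster_b))
print(mean_spread(cluster_a), mean_spread(cluster_b), separation)
```

A grouping looks reasonable under this reading when both spreads are small relative to the separation.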
STEPS IN CLUSTER ANALYSIS
Data to cluster: data acquisition, preparation and cleaning.
Variables to use: selection of relevant variables for performing the clustering procedure. Irrelevant or masking variables should be excluded as far as possible.
A proximity measure: designing a proper proximity measure.
The clustering procedure.
Number of clusters: even no cluster is a possible outcome.
Replication.
Testing and interpretation.
Unsupervised Classification
Clustering algorithms used for the unsupervised classification of remotely sensed data generally vary according to the efficiency with which the clustering takes place.
A conceptually simple, though not necessarily efficient, clustering algorithm known as CLUSTER is used below to demonstrate the fundamental logic of unsupervised classification.
This algorithm operates in a two-pass mode.
In the first pass, the algorithm sequentially builds class clusters.
In the second pass, a minimum-distance classifier is applied to the whole data set on a pixel-by-pixel basis, where each pixel is assigned to one of the mean vectors created in pass 1.
CLUSTER Algorithm
Pass 1: Cluster Building
During the first pass, the analyst may be required to supply four types of information:
(i) R, the radius of the cluster,
(ii) C, a distance parameter for merging clusters,
(iii) N, the number of pixels to be evaluated between each merging of the clusters, and
(iv) Cmax, the maximum number of clusters to be identified by the algorithm.
To start building cluster centres, the first pixel of the image is taken as the cluster centre of the first class.
Then the second pixel is taken up, and its membership of the first cluster is tested by computing the distance between this pixel and the cluster centre of class 1.
If the distance between the pixel and the cluster centre of class 1 is less than or equal to R, then this pixel belongs to class 1.
Class 1 now has two points within its cluster, and its cluster centre is updated to the average value of both pixels.
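The first two steps can be traced with a short sketch. It uses Pixel 1 (10, 10), Pixel 2 (20, 20) and R = 15 as they appear in the accompanying figures; Euclidean distance and the running-mean update are assumptions, since the slides only say "distance" and "average".

```python
import math

R = 15.0                 # cluster radius (value read from the figure)

centre = (10.0, 10.0)    # Pixel 1 becomes the centre of class 1
count = 1                # number of pixels in class 1 so far
pixel2 = (20.0, 20.0)

d = math.dist(pixel2, centre)   # sqrt(200), about 14.14
if d <= R:
    # Pixel 2 joins class 1; the centre becomes the mean of its member pixels.
    centre = tuple((c * count + v) / (count + 1) for c, v in zip(centre, pixel2))
    count += 1

print(centre, count)  # (15.0, 15.0) 2
```

Because 14.14 is within R = 15, Pixel 2 joins class 1 and the centre moves to (15, 15).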
[Figure: Pixel 1 (10, 10), Pixel 2 (20, 20) and Pixel 3 (30, 20) plotted in feature space. x-axis: Band 1 brightness values; y-axis: Band 2 brightness values.]
[Figure: Pixel 1 (10, 10) and Pixel 2 (20, 20) plotted in feature space. x-axis: Band 1 brightness values; y-axis: Band 2 brightness values.]
[Figure: Cluster #1 after the 1st iteration, centred at (15, 15), with R = 15. x-axis: Band 4 brightness values; y-axis: Band 5 brightness values.]
CLUSTER Algorithm
Now the third pixel is taken up for examination.
If the distance between this pixel and the cluster centre of class 1 is less than or equal to R, then the pixel belongs to class 1, and the cluster centre of class 1 is adjusted by taking the average values of all three pixels.
If the distance of the third pixel exceeds R, then this pixel does not belong to class 1; it becomes the cluster centre of a new class, i.e. class 2.
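The assign-or-create rule can be sketched as a small function and run on the three pixels from the figures. Two points are assumptions rather than statements from the slides: distance is Euclidean, and when several clusters exist a pixel is tested against the nearest centre. The name `assign_pixel` and the dictionary layout are hypothetical.

```python
import math

def assign_pixel(pixel, clusters, radius):
    """Add pixel to the nearest cluster within radius, else start a new cluster.

    clusters is a list of dicts with keys 'centre' (tuple) and 'count' (int).
    Returns the index of the cluster that received the pixel.
    """
    best, best_d = None, None
    for idx, cl in enumerate(clusters):
        d = math.dist(pixel, cl["centre"])
        if best_d is None or d < best_d:
            best, best_d = idx, d
    if best is not None and best_d <= radius:
        cl = clusters[best]
        n = cl["count"]
        # Running mean: equals the average of all member pixels seen so far.
        cl["centre"] = tuple((c * n + v) / (n + 1) for c, v in zip(cl["centre"], pixel))
        cl["count"] = n + 1
        return best
    clusters.append({"centre": tuple(float(v) for v in pixel), "count": 1})
    return len(clusters) - 1

clusters = []
for px in [(10, 10), (20, 20), (30, 20)]:
    assign_pixel(px, clusters, radius=15)

print([c["centre"] for c in clusters])  # [(15.0, 15.0), (30.0, 20.0)]
```

Pixel 3 lies about 15.81 from the class 1 centre (15, 15), which exceeds R = 15, so it seeds class 2.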
This process of building clusters continues until N pixels have been examined for their membership of the clusters of the different classes.
At this point, the cluster building process stops temporarily and the distances between class clusters are examined for their separability.
The class clusters identified so far are checked so that the cluster centres of all classes are separated by at least a minimum distance C.
Clusters whose centres lie at a distance less than C are merged, as they are taken to belong to the same cluster.
The new cluster centre of a merged cluster is found by taking the weighted average of the old cluster centres being merged.
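A minimal sketch of the merging step follows. The slides do not state which weights are used in the weighted average; weighting each centre by its pixel count is an assumption here, as are Euclidean distance, the function name `merge_close_clusters` and the sample values.

```python
import math

def merge_close_clusters(clusters, min_separation):
    """Merge pairs of clusters whose centres are closer than min_separation.

    clusters is a list of dicts with keys 'centre' and 'count'.
    The scan restarts after every merge because the merged centre has moved.
    """
    clusters = [dict(c) for c in clusters]  # work on a copy
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                if math.dist(a["centre"], b["centre"]) < min_separation:
                    n = a["count"] + b["count"]
                    # Weighted average, weights = pixel counts (assumption).
                    centre = tuple((x * a["count"] + y * b["count"]) / n
                                   for x, y in zip(a["centre"], b["centre"]))
                    clusters[i] = {"centre": centre, "count": n}
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters

example = [
    {"centre": (15.0, 15.0), "count": 2},
    {"centre": (30.0, 20.0), "count": 1},
    {"centre": (80.0, 90.0), "count": 4},
]
result = merge_close_clusters(example, min_separation=20)
print(result)
```

With C = 20, the first two centres (about 15.81 apart) are merged into one cluster of three pixels, while the third cluster is left unchanged.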
Once the cluster centres have been checked for proper separability, cluster building resumes from the point where it had stopped.
The centres of the identified clusters tend to move during the initial phase; as more points are examined, their positions start to stabilize before converging to fixed positions.
This process of cluster building continues until the maximum number of cluster centres (Cmax) has been identified or the end of the image is encountered.
Finally, the separability of each cluster is checked before proceeding to Pass 2.
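Putting the pieces of Pass 1 together, the loop might look like the sketch below. It is one literal reading of the description, not the reference implementation: Euclidean distance, nearest-centre assignment, count-weighted merging and stopping as soon as Cmax centres exist are all assumptions, and the parameter names `radius` (R), `min_separation` (C), `check_every` (N) and `max_clusters` (Cmax) are hypothetical.

```python
import math

def build_clusters(pixels, radius, min_separation, check_every, max_clusters):
    """Sequentially build cluster centres (Pass 1). Returns (centres, counts)."""
    centres, counts = [], []

    def merge():
        # Merge any two centres closer than min_separation; rescan after each merge.
        changed = True
        while changed:
            changed = False
            for i in range(len(centres)):
                for j in range(i + 1, len(centres)):
                    if math.dist(centres[i], centres[j]) < min_separation:
                        n = counts[i] + counts[j]
                        centres[i] = tuple((a * counts[i] + b * counts[j]) / n
                                           for a, b in zip(centres[i], centres[j]))
                        counts[i] = n
                        del centres[j]
                        del counts[j]
                        changed = True
                        break
                if changed:
                    break

    for k, px in enumerate(pixels, start=1):
        idx = None
        if centres:
            idx = min(range(len(centres)), key=lambda i: math.dist(px, centres[i]))
        if idx is not None and math.dist(px, centres[idx]) <= radius:
            n = counts[idx]
            centres[idx] = tuple((c * n + v) / (n + 1) for c, v in zip(centres[idx], px))
            counts[idx] = n + 1
        else:
            centres.append(tuple(float(v) for v in px))
            counts.append(1)
        if k % check_every == 0:
            merge()                      # pause every N pixels to check separability
        if len(centres) >= max_clusters:
            break                        # Cmax centres identified
    merge()                              # final separability check before Pass 2
    return centres, counts

pixels = [(10, 10), (20, 20), (30, 20), (80, 90), (82, 88), (12, 11)]
centres, counts = build_clusters(pixels, radius=15, min_separation=10,
                                 check_every=3, max_clusters=10)
print(centres, counts)
```

In practice `pixels` would be the image read in scan order; the six made-up values here are only enough to show centres being created and updated.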
[Figure: positions of the centres of Cluster #1 and Cluster #2 at the beginning and at the ending of Pass 1. x-axis: Band 4 brightness values; y-axis: Band 5 brightness values.]
Unsupervised Classification
Pass 2: Classification of Image
Having identified the cluster centres of all the classes, the classification of the image starts.
Each pixel is assigned a class membership using a minimum-distance-to-means classifier.
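Pass 2 can be sketched as a nearest-centre lookup applied to every pixel. Euclidean distance is assumed, and the function name `classify_image`, the nested-list image layout and the sample values are hypothetical.

```python
import math

def classify_image(image, centres):
    """Label each pixel with the index of its nearest cluster centre.

    image is a list of rows; each row is a list of pixel tuples (band values).
    """
    return [
        [min(range(len(centres)), key=lambda i: math.dist(px, centres[i])) for px in row]
        for row in image
    ]

centres = [(15.0, 15.0), (30.0, 20.0)]   # cluster means from Pass 1
image = [
    [(10, 10), (20, 20)],
    [(30, 20), (28, 22)],
]
print(classify_image(image, centres))  # [[0, 0], [1, 1]]
```

Unlike Pass 1, no radius is applied here: every pixel receives the label of the closest mean, however far away it is.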
When the whole image has been classified, the analyst examines the classified image.
Since the classes that have been identified are spectral classes and not informational classes, the analyst now has to convert the spectral classes into informational classes.
In this process of conversion, two or more spectral classes may combine to yield a single information class.
This process is rather tedious, cumbersome, and complex, and hence requires a great amount of expertise on the part of the analyst in merging many spectral classes into one informational class.
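Once the analyst has decided which spectral classes belong together, the relabelling itself is a simple lookup, as in the sketch below. The class names and the mapping are invented for illustration; in practice they come from the analyst's knowledge of the terrain.

```python
# Hypothetical analyst decisions: spectral class index -> information class.
spectral_to_info = {0: "water", 1: "forest", 2: "forest", 3: "urban"}

def relabel(class_map, lookup):
    """Replace each spectral class index in a 2-D class map by its information class."""
    return [[lookup[c] for c in row] for row in class_map]

class_map = [[0, 1], [2, 3]]   # output of Pass 2 (made-up values)
print(relabel(class_map, spectral_to_info))
# [['water', 'forest'], ['forest', 'urban']]
```

Here spectral classes 1 and 2 are merged into the single information class "forest".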