Object-Oriented classification used to determine landscape utilization
of woodland caribou (Rangifer tarandus caribou) in Gaff Topsails, NL.
Jonathan W. Strickland
Dept. of Environment and Conservation Newfoundland and Labrador,
Wildlife Division, Corner Brook, NL.
Abstract: To better understand wildlife and the way it interacts with the landscape, spatial ecologists often
perform land cover classifications. Here we use LANDSAT 7 ETM+ data and several classification techniques, including a
non-traditional Object-Oriented classification, to classify the landscape based on observations made in the field. Field data was
collected along 51 spiral transects based on a modified Fibonacci sequence. The field data was used to train an Object-Oriented
classification, in which the landscape is first segmented into ‘objects’ of similar properties and each object is then classified
using supervised classification techniques. Overall accuracy of the Object-Oriented classification was 38.3%, owing to
considerable confusion between bog, barren, and shrub. Unsupervised Isocluster classification showed an overall accuracy of 57.6%.
As a comparison, publicly available Earth Observation for Sustainable Development of Forests (EOSD) data was re-classified to
mimic our classification and an accuracy assessment was performed. EOSD overall accuracy was 27.2% using our ground truth
data. Low accuracy was believed to be due to low overall spatial accuracy in the dataset and a low Minimum Mapping Unit
(MMU):grain size (i.e. spatial resolution) ratio. Fragmentation analysis showed no correlation (R² = 0.052) between occupancy
level and degree of fragmentation for woodland caribou in Gaff Topsails, NL.
Introduction:
To better understand the woodland caribou of Newfoundland, it is important to observe the behavior of
many individuals in order to detect patterns at the sub-population and population scale. One such behavior is habitat
selection and avoidance within the animal’s home range. To begin studying this interaction, GPS
telemetry data and LANDSAT 7 ETM+ satellite imagery were used. Several classification techniques were
used to describe land cover features in areas of high and low caribou occupancy, as defined by the GPS
telemetry data. Data was collected by Paul W. Saunders, Dept. of Environment and Conservation, NL,
Wildlife Division, Corner Brook, NL, as part of an unpublished M.Sc. project (Saunders, 2010).
Methods and Results:
Field data Collection:
Ground truth data was collected at 51 sample locations based on 8 possible High and Low occupancy
sites for each GPS-collared individual in the study area. A 3.25 km modified Fibonacci spiral was created
at each site and used as a predefined route for field data collection (Figure 1). Fibonacci spirals were
used to avoid error that may be introduced by directionality and trend (Fortin and Dale,
2005).
Figure 1: Modified Fibonacci spiral used as predefined route for ground truth data collection.
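The spiral route can be illustrated with a minimal sketch. It assumes a plain Fibonacci spiral built from quarter-circle arcs and scaled to a 3.25 km path; the specific modification used in the field design is not described here, so the function name, the number of arcs, and the starting heading are all hypothetical.

```python
import math

def fibonacci_spiral(total_length_m=3250.0, n_arcs=8, points_per_arc=16):
    """Return (x, y) waypoints, in metres, along a spiral of quarter-circle arcs."""
    # Fibonacci radii 1, 1, 2, 3, 5, ... (unitless until scaled below).
    fib = [1, 1]
    while len(fib) < n_arcs:
        fib.append(fib[-1] + fib[-2])
    fib = fib[:n_arcs]
    # Each quarter arc has length (pi / 2) * radius; scale so the arcs sum to the target.
    scale = total_length_m / (math.pi / 2 * sum(fib))

    x, y, heading = 0.0, 0.0, 0.0        # assumed start: site centre, heading east
    points = [(x, y)]
    for f in fib:
        r = f * scale
        # The arc centre lies one radius to the left of the direction of travel.
        cx = x - r * math.sin(heading)
        cy = y + r * math.cos(heading)
        start = heading - math.pi / 2    # angle from the centre back to the current point
        for k in range(1, points_per_arc + 1):
            a = start + (math.pi / 2) * k / points_per_arc
            points.append((cx + r * math.cos(a), cy + r * math.sin(a)))
        x, y = points[-1]
        heading += math.pi / 2           # quarter turn, counter-clockwise
    return points

def path_length(points):
    """Length of the polyline through the waypoints, in metres."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

route = fibonacci_spiral()
```

Because the route keeps turning, each transect samples the landscape in every compass direction, which is the property the design relies on to limit directional bias.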
Each transect was hiked by a field biologist, who marked all habitat transition sites (boundaries)
with a handheld GPS unit and recorded notes on the habitat type starting at each boundary. Habitat
types included the 17 classes listed below, following Saunders (2010):
1 – Rock Barren
2 – Soil Barren
3 – Organic Bog
4 – Treed Bog
5 – Wet Bog
6 – Agriculture
7 – Residential
8 – Right-of-way
9 – Cleared Land
10 – Forested
11 – Cutover
12 – Shrub
13 – Grasses
14 – Stream
15 – River
16 – Pond
17 – Lake
Ground truth data was collected using handheld GPS units and converted to ESRI shapefiles for use in
this study. Each waypoint was snapped to its corresponding transect spiral. Data points were then
filtered by snapping distance to ensure that any points not recorded on or directly adjacent to
a transect spiral were excluded from the analysis. All transect spirals were then converted to
‘routes’ using ESRI ArcGIS 9.3.1. Routes were then updated with line events using the point locations
collected in the field. Ground truth information was then entered manually for each corresponding
route segment. Finally, all transect routes were merged into one file. 75% of the complete dataset
(n = 38) was then randomly selected using Excel and merged for use as training areas for classification
analysis. The remaining 25% of the data was reserved for classification error checking.
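The random split was done in Excel; the sketch below shows an equivalent, reproducible split in Python. The transect IDs, the seed, and the function name are hypothetical and only illustrate how 51 transects divide into 38 training and 13 validation routes.

```python
import random

def split_transects(transect_ids, train_fraction=0.75, seed=2010):
    """Randomly split transect IDs into training and held-out validation sets."""
    rng = random.Random(seed)            # fixed seed so the split can be reproduced
    n_train = round(train_fraction * len(transect_ids))
    train = sorted(rng.sample(transect_ids, n_train))
    held_out = sorted(set(transect_ids) - set(train))
    return train, held_out

transects = list(range(1, 52))           # hypothetical IDs for the 51 transects
training, validation = split_transects(transects)
```

Splitting by whole transect, rather than by individual segment, keeps every segment of a validation route out of the training data.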
Segmentation and Classification:
To begin studying woodland caribou behavior in relation to the physical environment, a
habitat classification was produced using ortho-rectified LANDSAT 7 ETM+ data. Classification was
completed using an Object-Oriented approach. In this non-traditional method of classification, the
image is first segmented into objects, where an object is a region of interest with associated spatial,
spectral (brightness and color), and texture characteristics that describe the region (ENVI EX Tutorial).
Once created, objects are classified using supervised classification. Training data for the supervised
classification process was developed from ground truth data collected along 38 transect routes.
Traditional remote sensing classification techniques are pixel-based, meaning that the spectral information
in each pixel is used to classify imagery. In the search for more accurate classifiers there has been a
growing recognition that so-called ‘per pixel’ classifiers have inherent limitations and that a parcel-based
approach can often lead to more accurate classification (Devereux et al., 2004). Lewinski (2006) found
that the tools of Object-Oriented classification not only enabled the identification of twice as many classes
as the pixel-based approach, but also provided a high-accuracy classification.
Ortho-rectified LANDSAT 7 ETM+ satellite imagery was downloaded from Geogratis (geogratis.cgdi.gc.ca).
The image was then clipped to the study area, defined by a minimum bounding rectangle
containing all transect spirals. Using the Optimum Index Factor (OIF), LANDSAT bands 1, 4 and 5 were
found to have the highest value, meaning the combination contains a high amount of "information" (e.g.
high standard deviation) with little "duplication" (e.g. low correlation between the bands) (van der
Meer, 2006), making it the optimum combination for image classification. Bands 1,
4, and 5 were combined in a composite image using the ESRI ArcGIS 9.3.1 ‘composite bands’ tool.
Fox et al. (2002) demonstrated that pan-sharpening makes it possible to map smaller landscape features,
thereby avoiding some of the mixed pixel problem experienced with 30-meter imagery. Pan-sharpening
was completed using the ‘Create Pan-sharpened raster dataset’ tool in ArcGIS 9.3.1, increasing the
spatial resolution from 30 m to 15 m.
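The band selection step can be sketched as follows. The Optimum Index Factor is commonly defined as the sum of the standard deviations of three bands divided by the sum of the absolute pairwise correlation coefficients between them. The pixel values and helper names below are made up for illustration and are not the study's data.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length pixel lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def optimum_index_factor(bands, combo):
    """OIF = sum of band standard deviations / sum of |pairwise correlations|."""
    spread = sum(pstdev(bands[b]) for b in combo)
    overlap = sum(abs(pearson(bands[i], bands[j]))
                  for i, j in combinations(combo, 2))
    return spread / overlap

def rank_band_triples(bands):
    """Score every three-band combination, highest OIF first."""
    scores = {c: optimum_index_factor(bands, c)
              for c in combinations(sorted(bands), 3)}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy pixel values keyed by band number; real input would be flattened rasters.
toy_bands = {1: [1, 2, 3, 4], 4: [2, 4, 6, 8], 5: [4, 3, 2, 1], 7: [1, 4, 2, 3]}
ranking = rank_band_triples(toy_bands)
```

A high score rewards bands with a wide spread of values that are weakly correlated with each other, which matches the "information" versus "duplication" reasoning given for bands 1, 4 and 5.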
Image segmentation was completed using the ‘feature extraction’ module of the ENVI EX remote
sensing package. Feature extraction is a module for extracting information on spatial, spectral, and
texture characteristics. The module segments an input image using two parameters, the ‘scale level’ and
the ‘merging level’, each a value between 0 and 100. For this analysis a scale level of 1 was selected to
maximize the number of features detected. The merging level was set to 10, the lowest
level of merging that could be used while avoiding long computation times and computer crashes. Low
feature merging allows objects to be further grouped during the classification process.
The feature extraction module in ENVI has a built-in classification tool. Once the image is segmented, the
user can create a supervised classification using training data in ESRI shapefile format. This
tool, however, performed poorly and produced large error zones in the image where classes were either not
classified or classified incorrectly. To avoid this issue, the segmented image was exported to an ESRI
polygon shapefile. Using ArcGIS, all transect routes were overlaid on the segmentation image. Transect
routes were then queried to display only segments of a single site class (e.g. forest). All polygons that
intersected one of these line segments, but did not overlap a boundary, were then selected and
exported to a separate file. This process was repeated for each land cover type used in the classification.
These separate files then represented training areas for the associated classes.
Training areas were imported into IDRISI Andes 15 for classification analysis. The ‘MAKESIG’ tool was
used to create a signature file for each polygon file. A maximum likelihood classification was then
completed using the ‘MAXLIKE’ tool (Figure 2). The ‘MAXLIKE’ procedure should be used when training
sites are known to be strong (i.e. well defined with a large sample size) (Eastman, 2006). Following
data investigation, a total of 5 classes were used for the supervised classification (Table 1). Ground truth
classes not used in the classification analysis were rejected on the basis of low sample size or small spatial
extent.
Table 1: Class names and associated site class numbers used for the LANDSAT 7 ETM+ supervised classification.
Class   Name         Site class number(s)
1       Forest       10
2       Shrub        12
3       Bog/Barren   1, 3, 4, 5
4       Water        14, 15, 16
5       Other        N/A
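The grouping in Table 1 amounts to a lookup from field site class numbers to training classes. The sketch below encodes it; the names are hypothetical, and returning None for site classes that were rejected from the analysis is an assumption about how unused classes would be handled.

```python
# Site class numbers follow the 17-class field scheme; groups follow Table 1.
SITE_CLASS_TO_TRAINING_CLASS = {
    10: "Forest",
    12: "Shrub",
    1: "Bog/Barren", 3: "Bog/Barren", 4: "Bog/Barren", 5: "Bog/Barren",
    14: "Water", 15: "Water", 16: "Water",
}

def training_class(site_class):
    """Return the Table 1 class for a field site class, or None if it was not used."""
    return SITE_CLASS_TO_TRAINING_CLASS.get(site_class)
```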
Figure 2: Cloud cover (region of red circle) in the LANDSAT 7 ETM+ satellite imagery classification using 5 land cover
classes for the region of Gaff Topsails, NL.
Classification results were converted to a vector file using the ‘Raster to Feature’ tool found in the ArcGIS
9.3.1 spatial analysis toolset. In an attempt to improve the accuracy of the classified image, ancillary data
was added. Datasets included 1:50,000 map sheet roads, water bodies (to better define streams and
rivers), and cutovers recorded in the provincial forest inventory GIS layer (Figure 3). Ancillary data was
added to the classification using the ArcGIS 9.3.1 ‘Update’ tool. Road and stream layers were buffered
to reflect the average feature width in the study area.
Figure 3: Land Cover Classification using LANDSAT 7 ETM+ data and an Object-Oriented classification process. Note cloud
cover classified as water in bottom right of image.
Classification accuracy was calculated using cross-tabulation analysis. Data from 13 transect routes (25%
of the original dataset) was used to calculate accuracy. The ‘Union’ tool in ArcGIS 9.3.1 was used to
unite the classification layer with the 13 transect routes. The attribute table was then exported to Excel,
where conditional statements were used to count the occurrences of each correct or
incorrect classification. The cross-tabulation analysis showed that the overall
accuracy of the classification was 38.3%. Addition of ancillary data did not improve classification
accuracy. Further analysis revealed that the extremely low accuracy was partially due to confusion
between bog, barren, and shrub. When these classes were removed from the accuracy assessment, the
value increased to 81.2%.
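The cross-tabulation step can be sketched in a few lines. This is an illustration rather than the Excel workflow itself: the sample pairs are made up, and dropping samples by their reference (ground truth) class is an assumption about how classes were removed from the assessment.

```python
from collections import Counter

def cross_tabulate(pairs):
    """Count (reference, mapped) class pairs, i.e. the cells of the error matrix."""
    return Counter(pairs)

def overall_accuracy(pairs, exclude=()):
    """Share of samples whose mapped class equals the reference class."""
    # Assumption: a sample is dropped when its reference class is excluded.
    kept = [(ref, mapped) for ref, mapped in pairs if ref not in exclude]
    if not kept:
        raise ValueError("no samples left after exclusion")
    table = cross_tabulate(kept)
    correct = sum(n for (ref, mapped), n in table.items() if ref == mapped)
    return correct / len(kept)

# Made-up (reference, mapped) pairs, for illustration only.
samples = [
    ("Forest", "Forest"), ("Forest", "Forest"), ("Water", "Water"),
    ("Shrub", "Bog/Barren"), ("Bog/Barren", "Shrub"), ("Shrub", "Shrub"),
    ("Bog/Barren", "Bog/Barren"), ("Forest", "Shrub"),
]
all_classes = overall_accuracy(samples)
without_confused = overall_accuracy(samples, exclude={"Shrub", "Bog/Barren"})
```

Comparing the two figures shows how much of the error is concentrated in the classes that are confused with one another.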
Given the extremely low classification accuracy of the Object-Oriented supervised classification,
other classification methods were considered. Classification accuracy did not increase significantly with
other supervised methods; however, isocluster unsupervised classification showed some improvement.
LANDSAT 7 ETM+ bands 1, 4 and 5 were clustered into 20 unknown classes. Ground truth data was then
used to determine what each class represented. The image was re-classed using IDRISI, grouping the 20
classes into 5 land cover types (Table 2; Figure 4).
Table 2: Class numbers and associated land cover types used as the classification scheme for the isocluster unsupervised
classification.
Class number   Land Cover Type
1              Forest
2              Shrub
3              Bog
4              Water
5              Unclassified
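One way to decide what each unsupervised cluster represents is a majority vote among the ground truth observations that fall inside it. The sketch below assumes that rule, which is not stated in the text; the observations, the toy raster, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

def label_clusters(observations):
    """Map each cluster ID to the ground truth class seen most often inside it."""
    votes = defaultdict(Counter)
    for cluster_id, land_cover in observations:
        votes[cluster_id][land_cover] += 1
    return {cid: counts.most_common(1)[0][0] for cid, counts in votes.items()}

def reclass(cluster_raster, lookup):
    """Replace cluster IDs with land cover types; unseen clusters stay unclassified."""
    return [[lookup.get(c, "Unclassified") for c in row] for row in cluster_raster]

# Made-up (cluster ID, ground truth class) pairs, for illustration only.
observations = [(1, "Forest"), (1, "Forest"), (1, "Shrub"),
                (2, "Bog"), (3, "Water"), (3, "Water")]
lookup = label_clusters(observations)
land_cover = reclass([[1, 2], [3, 4]], lookup)
```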
A cross-tabulation analysis using the same 13 transect routes revealed a classification accuracy of
57.6% for the isocluster unsupervised method. To compare this accuracy with that of
other freely available land cover products, the Earth Observation for Sustainable Development of
Forests (EOSD) product was considered. EOSD classification accuracy was measured by reclassing the
dataset to contain the same 5 classes listed in Table 2. Accuracy was then measured using
the same methods and validation data previously described. Overall classification accuracy for the EOSD
dataset was 27.2%. Low classification accuracy in all datasets is believed to be
largely due to the fine scale of the field data and the low spatial accuracy of all data used (Table 3). Transect
route segments have a mean length of 119.3 m and a standard deviation of 128.6 m, meaning that in
several instances a segment may be shorter than the spatial error.
Figure 4: Land Cover classification using LANDSAT 7 ETM+ satellite imagery and an Unsupervised Isocluster classification
process.
Table 3: Spatial accuracy of data and methods used in determining overall classification accuracy.
Source of spatial error              Spatial accuracy (+/-)
LANDSAT 7 satellite imagery          20-30 m
Handheld GPS for field data          5 m
Point snapping process (methods)     3 m
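The spatial accuracy figures in the table above can be combined into a rough error budget and compared with transect segment lengths. The sketch below assumes either independent errors (root-sum-of-squares) or a worst case in which they add linearly; neither rule is stated in the text, and the segment lengths passed in the example are made up.

```python
import math

# Positional uncertainties from the table above, in metres
# (the upper bound of the 20-30 m range is used for the imagery).
ERRORS_M = {"imagery": 30.0, "handheld_gps": 5.0, "point_snapping": 3.0}

def combined_error(errors, independent=True):
    """Combine error sources by root-sum-of-squares, or linearly as a worst case."""
    values = list(errors.values())
    if independent:
        return math.sqrt(sum(v * v for v in values))
    return sum(values)

def segments_at_risk(segment_lengths_m, error_m):
    """Segments no longer than the combined positional error."""
    return [s for s in segment_lengths_m if s <= error_m]

worst_case = combined_error(ERRORS_M, independent=False)
short_segments = segments_at_risk([10.0, 25.0, 119.3, 400.0], worst_case)
```

Under either rule the combined error is a few tens of metres, so any transect segment of similar length cannot be reliably matched to the pixels it crosses.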
Classification accuracy may also be lowered in this analysis by an improper MMU:grain ratio. MMU,
or minimum mapping unit, refers to the smallest area in the extent that will be mapped as a discrete
unit. Grain refers to the smallest resolvable element in the extent (i.e. spatial resolution) (Fassnacht et
al., 2006). In this analysis the MMU was 10 m, while the grain was initially 30 m and was reduced to 15 m through
pan-sharpening. Smaller MMUs may result in large within-patch variability, making patches of
interest difficult to discern and reducing classification accuracies compared to maps where this
variability has been removed (Fassnacht et al., 2006).
The highest accuracy was obtained when the unsupervised isocluster classification was compared to
the EOSD product at 200 random points sampled throughout the image, where overall accuracy was
measured to be 70.5%.
Fragmentation Analysis:
Upon completion of an appropriate land cover classification, a multitude of questions may be asked.
One question selected in this analysis is whether woodland caribou select habitat based on its level of
fragmentation. To approach this question, a comparison was made between the number of land cover
polygons found in High vs. Low occupancy sites. To calculate the number of land cover polygons, a circle
was created around each of the 38 transect routes used. To create uniform circles, a line was first
created from ‘Start’ to ‘End’ on each route. The midpoint of each line was then calculated. Since most
transects are a standard size (3.2 km long), a standard-sized circle (r = 1.145 km) was created with its
centroid at the midpoint of the created line (Figure 5). The vector classification was then clipped to the
extent of each individual circle and exported to individual files. The number of polygons in each circle
was then calculated and summary statistics were computed for high and low occupancy areas (Table 4).
Figure 5: Method used for creation of study circles surrounding transect routes.
Table 4: Habitat fragmentation for high and low occupancy sites of woodland caribou during calving in Gaff
Topsails, NL.
                      High Occupancy   Low Occupancy
SUM of features       4232             4463
Avg. # of features    223              235
Standard deviation    53.79            76.25
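The summary statistics in the fragmentation table above come from per-circle polygon counts grouped by occupancy level. A minimal sketch of that calculation follows; the counts are placeholders rather than the study's data, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

def fragmentation_summary(polygon_counts):
    """Sum, mean, and sample standard deviation of polygons per study circle."""
    return {
        "sum": sum(polygon_counts),
        "mean": mean(polygon_counts),
        "sd": stdev(polygon_counts),     # sample standard deviation (n - 1)
    }

# Placeholder polygon counts per study circle; NOT the study's data.
counts = {"High": [180, 240, 210, 265], "Low": [150, 300, 220, 270]}
summary = {level: fragmentation_summary(c) for level, c in counts.items()}
```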
Fragmentation data was tested for spatial autocorrelation using the Moran’s I tool in ArcGIS 9.3.1.
The data proved to be autocorrelated, with a Moran’s Index of 0.91, a Z score of 1.84 standard deviations, and a
significance level of 0.10. Geographically weighted regression was used to investigate the correlation
between occupancy level (i.e. High or Low occupancy) and degree of fragmentation. Using a
default bandwidth of 66676.9 m, the regression gave R² = 0.052, indicating no meaningful correlation between
occupancy level and degree of fragmentation in the study area at this scale.
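For readers unfamiliar with the statistic, global Moran's I can be computed directly from a list of values and a spatial weight matrix. The sketch below uses simple binary neighbour weights for sites arranged in a row, which is an assumption made for illustration; it does not reproduce the weighting scheme of the ArcGIS tool, and the helper names are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weight matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    total_weight = sum(weights[i][j] for i, j in pairs)
    cross = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    squared = sum(d * d for d in dev)
    return (n / total_weight) * (cross / squared)

def chain_weights(n):
    """Binary weights for n sites in a row: 1 for immediate neighbours, else 0."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]

trend = morans_i([1, 2, 3, 4], chain_weights(4))         # similar neighbours
alternating = morans_i([1, -1, 1, -1], chain_weights(4))  # dissimilar neighbours
```

Values near +1 indicate that neighbouring sites carry similar values, values near -1 that they alternate, and values near 0 that there is no spatial pattern.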
Discussion:
To better understand our wildlife species and protect them for future generations, it is
important to understand how animals interact with their environment. One step towards this
understanding is to classify land cover features throughout an animal’s home range and then use
regression analysis to determine the type of habitat the animal requires. This study, however, presents
some of the challenges that exist in producing an accurate classification, along with some solutions.
Field data collection:
As part of an unpublished graduate project by Paul W. Saunders, land cover information was collected
throughout the area of Gaff Topsails, NL. Data was collected along modified Fibonacci spirals
to decrease the bias introduced by the directionality of features in the study area, where land
cover features tend to run in a northeast to southwest direction. Land cover boundaries were recorded
along the transect spirals using handheld GPS. During data processing, all points were snapped to lines
to facilitate the creation of transect routes. This step, however, introduced an unnecessary error into the
classification analysis and should be avoided in future studies. It is recommended that transect lines
instead be snapped to points to better reflect data collection methods.
Segmentation and Classification:
Over the past two decades, there has been an explosion in the use of maps derived from remote sensing
(particularly those from Landsat) (Cohen and Goward, 2004). Here we consider a non-traditional method
of classification using an Object-Oriented approach. Classification was completed using a segmentation
process followed by supervised classification. Ancillary data was added to the classification results in an
attempt to increase classification accuracy. Care should be taken, however, to use ancillary data at the
finest scale available in order to minimize classification error caused by map generalization.
Classification accuracy was determined to be extremely low for the Object-Oriented approach, leading
to the use of other classification types, including an unsupervised isocluster method. Classification accuracy,
however, remained low for all methods. Low classification accuracy may be caused by a number of
factors. The two factors most relevant to this study are the low spatial accuracy of the processed
dataset and a low MMU:grain size ratio. To improve classification accuracy,
recommendations include collecting ground truth data with a larger MMU and/or using a satellite
dataset with a finer grain size (spatial resolution). Existing field data may be reclassified to imitate a
larger MMU.
Classification accuracy was highest when the classification was compared to EOSD data using random points,
suggesting that the classification performs much ‘better’ at a coarse spatial scale than at the fine scale
of field data collection.
Fragmentation:
To begin understanding the landscape geometry of woodland caribou habitat, a comparison was made
between the degree of fragmentation in high and low occupancy sites. After the number of
features at each site was calculated, a Moran’s I calculation revealed spatial autocorrelation in the data. Since classical
statistics could no longer be used, geographically weighted regression was used, and it showed
no significant correlation between level of occupancy and degree of fragmentation. Woodland caribou
in this study do not appear to select habitat based on how fragmented an area is. Habitat
selection may, however, be based on which feature types are present rather than how many features are
present. Further investigation, including regression analysis, is required to better explore this
topic.
Conclusion:
When studying animal behavior in relation to the environment, land cover classification is an
extremely useful tool. A number of considerations should be made, however, to minimize the error
associated with data sources and calculation methods. When using field-based observations as
classification training data, it is important to make observations in a systematic fashion that does not
introduce error associated with the directionality of land cover features. When collecting field data it is also
important to record observations at the scale at which the satellite imagery will be studied. MMUs should be
designed to mimic the spatial resolution of the image to improve classification accuracy. During
post-classification analysis, it is important to remain mindful of any spatial autocorrelation,
which eliminates the usefulness of classical statistics. Finally, it is important to overcome the challenges and
limitations of the classification process in order to better understand our wildlife and the landscape
it interacts with.
References:
Cohen, W.B. and Goward, S.N. 2004. Landsat’s role in ecological applications of remote sensing. BioScience,
Vol. 54, pp. 535-545.
Devereux, B.J., Amable, G.S., and Costa Posada, C. 2004. An Efficient Image Segmentation Algorithm for
Landscape Analysis. International Journal of Applied Earth Observation and Geoinformation, Vol.
6, pp. 47-61.
Eastman, J.R. 2006. IDRISI Andes, Guide to GIS and Image Processing, chapter 16: Classification of
Remotely Sensed Imagery. Clark Labs, Clark University, MA, USA.
ENVI EX Tutorial: Feature Extraction with Supervised Classification. ITT Visual Information Solutions (ITT
VIS).
Fassnacht, K.S., Cohen, W.B., and Spies, T.A. 2006. Key issues in making and using satellite-based maps
in ecology: A primer. Forest Ecology and Management, Vol. 222, pp. 167-181.
Fortin, M. and Dale, M.R.T. 2005. Spatial Analysis, A Guide for Ecologists. Cambridge University Press,
Cambridge, UK, 365 pages.
Fox, L., Garrett, M.L., Heasty, R., and Torres, E. 2002. Classifying Wildlife Habitat with Pan-Sharpened
Landsat 7 imagery. Proceedings: ISPRS Commission I Mid-Term Symposium in conjunction with
Pecora 15/ Land Satellite Information IV Conference, 10-15 Nov. 2002, Denver, CO, USA.
Lewinski, S. 2006. Object-Oriented classification of Landsat ETM+ satellite image. Journal of Water and
Land Development. No. 10, pp. 91-106.
Saunders, P.W. 2010. Delineation of Landcover Features in Areas Utilized or Avoided by Female Caribou
during calving and Post-Calving Using Publically Available Spatial Datasets. Unpublished M.Sc.
Dissertation, The Manchester Metropolitan University.
Van der Meer, F. 2006. Remote Sensing and GIS techniques applied to geological survey. [Internet]
Faculty of Geo-Information Science and Earth Observation of the University of Twente. Available
at: <http://www.itc.nl/ilwis/Applications/application14.asp> [Accessed March 29, 2010].