Object-Oriented classification used to determine landscape utilization of woodland caribou (Rangifer tarandus caribou) in Gaff Topsails, NL


Jonathan W. Strickland

Dept. of Environment and Conservation Newfoundland and Labrador,

Wildlife Division, Corner Brook, NL.

Abstract: In order to better understand our wildlife and the way they interact with the landscape, spatial ecologists often

perform land cover classifications. Here we use LANDSAT 7 ETM+ data and several classification techniques, including a

non-traditional Object-Oriented classification, to classify the landscape based on observations made in the field. Field data was

collected along 51 spiral transects based on a modified Fibonacci sequence. Field data was used to train an Object-Oriented

classification where the landscape is first segmented into ‘objects’ of similar properties and then classified by defining each

object according to supervised classification techniques. Overall accuracy of the Object-Oriented classification was 38.3%, largely due to

confusion between the bog, barren, and shrub classes. Unsupervised Isocluster classification showed an overall accuracy of 57.6%.

As a comparison, publicly available Earth Observation for Sustainable Development of Forests (EOSD) data was re-classified to

mimic our classification and an accuracy assessment was performed. EOSD overall accuracy was 27.2% using our ground truth

data. Low accuracy was believed to be due to an overall low spatial accuracy in the dataset and a low Minimum Mapping Unit

(MMU): Grain size (i.e. spatial resolution) ratio. Fragmentation analysis showed no correlation (R² = 0.052) between occupancy

level and degree of fragmentation for woodland caribou in Gaff Topsails, NL.

Introduction:

To better understand woodland caribou of Newfoundland it is important to observe the behavior of

many individuals to detect patterns on a sub-population/population scale. One such behavior is habitat

selection and avoidance within the animal’s home range. To begin studying this interaction, GPS

telemetry data and LANDSAT 7 ETM+ satellite imagery were used. Several classification techniques were

used to describe land cover features in areas of high and low caribou occupancy defined by GPS

telemetry data. Data was collected by Paul W. Saunders, Dept. of Environment and Conservation, NL,

Wildlife Division, Corner Brook, NL, as part of an unpublished M.Sc. project (Saunders, 2010).

Methods and Results:

Field data Collection:

Ground truth data was collected at 51 sample locations based on 8 possible High and Low occupancy

sites for each GPS collared individual in the study area. A 3.25 km modified Fibonacci spiral was created

at each site and used as a predefined route for field data collection (Figure 1). Fibonacci spirals were

used as a means of avoiding error that may be introduced by directionality and trend (Fortin and Dale,

2005).

Figure 1: Modified Fibonacci spiral used as predefined route for ground truth data collection.
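
As an illustration only, the sketch below builds an unmodified Fibonacci spiral from quarter-circle arcs and scales it so that the route length matches the 3.25 km transects. The modification applied in the field design is not described in this report, so the number of arcs, the vertex density, and the function names are assumptions rather than the method actually used.

```python
import math


def fibonacci_spiral(n_arcs=8, total_length_m=3250.0, pts_per_arc=16):
    """Return (x, y) vertices of a Fibonacci spiral scaled to a target length.

    n_arcs and pts_per_arc are illustrative choices, not values from the study.
    """
    # Fibonacci radii 1, 1, 2, 3, 5, ... one radius per quarter-circle arc
    radii = [1, 1]
    while len(radii) < n_arcs:
        radii.append(radii[-1] + radii[-2])
    radii = radii[:n_arcs]

    # Each arc is a quarter circle, so the unscaled length is (pi / 2) * sum(radii)
    scale = total_length_m / (math.pi / 2 * sum(radii))

    x, y, heading = 0.0, 0.0, 0.0
    points = [(x, y)]
    for radius in radii:
        r = radius * scale
        # The arc centre lies one radius to the left of the current heading
        cx = x + r * math.cos(heading + math.pi / 2)
        cy = y + r * math.sin(heading + math.pi / 2)
        start = heading - math.pi / 2
        for k in range(1, pts_per_arc + 1):
            phi = start + (math.pi / 2) * k / pts_per_arc
            points.append((cx + r * math.cos(phi), cy + r * math.sin(phi)))
        x, y = points[-1]
        heading += math.pi / 2  # a quarter turn per arc keeps the tangent continuous
    return points


def polyline_length(points):
    """Length of the route as the sum of straight segments between vertices."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


route = fibonacci_spiral()
```

Because the arcs are approximated by straight segments, `polyline_length(route)` falls slightly short of the requested 3250 m; a denser `pts_per_arc` narrows the gap.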

Each transect was hiked by a field biologist where all habitat transition sites (boundaries) were marked

with a handheld GPS unit and notes were recorded on habitat type starting at each boundary. Habitat

types included the 17 classes listed below, following Saunders (2010):

1 – Rock Barren

2 – Soil Barren

3 – Organic Bog

4 – Treed Bog

5 – Wet Bog

6 – Agriculture

7 – Residential

8 – Right-of-way

9 – Cleared Land

10 – Forested

11 – Cutover

12 – Shrub

13 – Grasses

14 – Stream

15 – River

16 – Pond

17 – Lake

Ground truth data was collected using hand-held GPS units and converted to ESRI shapefiles for use in

this study. Each waypoint was snapped to its corresponding transect spiral. Data points were then

filtered by snapping distance to ensure that any points that were not recorded on or directly adjacent to

a transect spiral would not be included in the analysis. All transect spirals were then converted to

‘routes’ using ESRI ArcGIS 9.3.1. Routes were then updated with line events using point locations

collected in the field. Ground truth information was then entered manually for each corresponding

route segment. Finally, all transect routes were merged into one file. Seventy-five percent of the complete dataset was

then randomly selected (n = 38) using Excel and merged to be used as training areas for classification

analysis. The remaining 25% of the data was preserved for classification error checking.
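
The random selection was made in Excel; an equivalent 75%/25% split can be sketched in Python as follows. The transect identifiers 1 to 51, the fixed seed, and the function name are hypothetical and serve only to make the example reproducible.

```python
import random


def split_transects(transect_ids, train_fraction=0.75, seed=0):
    """Randomly split transect IDs into training and validation lists."""
    ids = list(transect_ids)
    rng = random.Random(seed)  # fixed seed so the split can be repeated
    n_train = int(train_fraction * len(ids))  # 0.75 * 51 = 38.25, truncated to 38
    train = set(rng.sample(ids, n_train))
    validation = [t for t in ids if t not in train]
    return sorted(train), validation


training_routes, validation_routes = split_transects(range(1, 52))
```

With 51 transects this yields 38 training routes and 13 validation routes, matching the counts used in the analysis.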

Segmentation and Classification:

In order to begin studying woodland caribou behavior in relation to their physical environment, a

habitat classification was produced using Ortho-rectified LANDSAT 7 ETM+ data. Classification was

completed using an Object-Oriented approach. In this non-traditional method of classification, the

image is first segmented into objects where an object is a region of interest with associated spatial,

spectral (brightness and color), and texture characteristics that describe the region (ENVI EX Tutorial).

Once created, objects are classified using supervised classification. Training data used by the supervised

classification process was developed using ground truth data collected from 38 transect routes.

Traditional remote sensing classification techniques are pixel-based, meaning that spectral information

in each pixel is used to classify imagery. In the search for more accurate classifiers there has been a

growing recognition that so called ‘per pixel’ classifiers have inherent limitations and a parcel-based

approach can often lead to more accurate classification (Devereux et al., 2004). Lewinski (2006) found

that the tools of Object-Oriented classification not only enabled the identification of twice as many classes

as those of the pixel-based approach, but also provided a high-accuracy classification.

Ortho-rectified LANDSAT 7 ETM+ satellite imagery was downloaded via Geogratis (geogratis.cgdi.gc.ca).

The image was then clipped to only include the study area defined by a minimum bounding rectangle

containing all transect spirals. Using the Optimum Index Factor (OIF), LANDSAT bands 1, 4 and 5 were

found to have the highest value, meaning the combination contains a high amount of "information" (i.e.

high standard deviation) with little "duplication" (i.e. low correlation between the bands) (van der

Meer, 2006), making it the optimum combination to use for image classification. Bands 1,

4, and 5 were combined in a composite image using the ESRI ArcGIS 9.3.1 ‘Composite Bands’ tool. By

pan-sharpening the image, Fox et al. (2002) demonstrated the ability to map smaller landscape features,

thereby avoiding some of the mixed pixel problem experienced with 30-meter imagery. Pan sharpening

was completed using the ‘Create Pan-sharpened raster dataset’ tool in ArcGIS 9.3.1, increasing the

spatial resolution from 30m to 15m.
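
The Optimum Index Factor for a three-band combination is commonly computed as the sum of the band standard deviations divided by the sum of the absolute pairwise correlation coefficients. The following is a minimal sketch of that ranking; the pixel values and function names are made up for illustration and are not taken from the Gaff Topsails imagery.

```python
import math
from itertools import combinations


def std(values):
    """Population standard deviation of one band's pixel values."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def correlation(a, b):
    """Pearson correlation between two bands (fails if a band is constant)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)
    return cov / (std(a) * std(b))


def oif(bands, combo):
    """OIF = sum of standard deviations / sum of |pairwise correlations|."""
    spread = sum(std(bands[b]) for b in combo)
    redundancy = sum(abs(correlation(bands[i], bands[j]))
                     for i, j in combinations(combo, 2))
    return spread / redundancy


def best_combination(bands, size=3):
    """Band combination with the highest OIF."""
    return max(combinations(sorted(bands), size), key=lambda c: oif(bands, c))


# Hypothetical pixel samples keyed by band number; band 2 duplicates band 1
toy_bands = {
    1: [10, 20, 30, 40, 50, 60],
    2: [10, 12, 14, 16, 18, 20],
    4: [30, 10, 50, 20, 60, 40],
    5: [60, 20, 10, 50, 40, 30],
}
print(best_combination(toy_bands))
```

In the toy data, band 2 is perfectly correlated with band 1 and has little spread, so combinations that include it are penalised, which mirrors the "information" versus "duplication" reasoning in the text.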

Image segmentation was completed using the ‘feature extraction’ module of the ENVI EX remote

sensing package. Feature extraction is a module for extracting information on spatial, spectral, and

texture characteristics. The module segments an input image using two parameters, the ‘scale level’ and

‘merging level’, each being a value between 0 and 100. For this analysis a scale level of 1 was selected to

maximize the number of features detected. The merging level was set to 10, the lowest

level of merging that could be used while avoiding long computation times and computer crashes. Low

feature merging allows objects to be further grouped during the classification process.

The feature extraction module in ENVI has a built-in classification tool. Once the image is segmented, the

user has the ability to create a supervised classification using training data in ESRI shapefile format. This

tool, however, performed poorly and produced large error zones in the image where classes were either not

classified or classified incorrectly. To avoid this issue, the segmented image was exported to an ESRI

polygon shapefile. Using ArcGIS, all transect routes were overlaid on the segmentation image. Transect

routes were then queried to only display segments of a single site class (e.g. forest). All polygons that

intersected one of these line segments but did not overlap a boundary were then selected and

exported to a separate file. This process was repeated for each land cover type used in the classification.

These separate files now represented training areas for the associated classes.

Training areas were imported into IDRISI Andes 15 for classification analysis. The ‘MAKESIG’ tool was

used to create a signature file for each polygon file. A maximum likelihood classification was then

completed using the ‘MAXLIKE’ tool (Figure 2). When training sites are known to be strong (i.e. well

defined with a large sample size), the ‘MAXLIKE’ procedure should be used (Eastman, 2006). Following

data investigation, a total of 5 classes were used for the supervised classification (Table 1). Ground truth

classes not used in the classification analysis were rejected on the basis of low sample size or small spatial

extent.
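
For readers unfamiliar with maximum likelihood classification, the sketch below shows the general idea: each class is modelled as a multivariate normal distribution estimated from its training pixels, and a pixel is assigned to the class under which it is most likely. It assumes equal prior probabilities, uses invented three-band values, and is not a reproduction of the IDRISI ‘MAKESIG’ and ‘MAXLIKE’ tools.

```python
import math


def fit_signature(samples):
    """Mean vector and Cholesky factor of the covariance of training pixels."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[k] for s in samples) / n for k in range(d)]
    cov = [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
            for j in range(d)] for i in range(d)]
    # Cholesky decomposition cov = chol * chol^T (needs a positive definite cov)
    chol = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            acc = cov[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(acc) if i == j else acc / chol[j][j]
    return mean, chol


def log_likelihood(pixel, signature):
    """Gaussian log-likelihood of a pixel under one class signature."""
    mean, chol = signature
    d = len(mean)
    diff = [pixel[k] - mean[k] for k in range(d)]
    # Forward substitution solves chol * y = diff; y.y is the Mahalanobis distance
    y = []
    for i in range(d):
        y.append((diff[i] - sum(chol[i][k] * y[k] for k in range(i))) / chol[i][i])
    mahalanobis = sum(v * v for v in y)
    log_det = 2 * sum(math.log(chol[i][i]) for i in range(d))
    return -0.5 * (mahalanobis + log_det + d * math.log(2 * math.pi))


def classify(pixel, signatures):
    """Assign the pixel to the class with the highest likelihood (equal priors)."""
    return max(signatures, key=lambda name: log_likelihood(pixel, signatures[name]))


# Hypothetical training pixels (bands 1, 4, 5) for two classes
training = {
    "Forest": [(30, 80, 40), (32, 85, 43), (28, 78, 45), (31, 82, 38), (29, 84, 41)],
    "Water": [(20, 10, 5), (22, 12, 6), (19, 11, 8), (21, 9, 7), (23, 13, 4)],
}
signatures = {name: fit_signature(pixels) for name, pixels in training.items()}
print(classify((30, 80, 42), signatures))
```

Each class needs more training pixels than bands, and those pixels must not be collinear, otherwise the covariance matrix is singular and the decomposition fails. This is one reason classes with few samples are usually dropped.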

Table 1: Class names and associated site class numbers used for LANDSAT 7 ETM+ supervised classification.

Class number   Class name   Site class number(s)
1              Forest       10
2              Shrub        12
3              Bog/Barren   1, 3, 4, 5
4              Water        14, 15, 16
5              Other        N/A

Figure 2: Cloud cover (region of red circle) in LANDSAT 7 ETM+ satellite imagery classification using 5 Land Cover

classes for the region of Gaff Topsails, NL.

Classification results were converted to a vector file using the ‘Raster to Feature’ tool found in ArcGIS

9.3.1 spatial analysis toolset. In an attempt to improve accuracy of the classified image, ancillary data

was added. Datasets included 1:50,000 map sheet roads, water bodies (to better define streams and

rivers), and cutovers recorded in the provincial forest inventory GIS layer (Figure 3). Ancillary data was

added to the classification using the ArcGIS 9.3.1 ‘Update’ tool. Road and Stream layers were buffered

to reflect average feature width in the study area.

Figure 3: Land Cover Classification using LANDSAT 7 ETM+ data and an Object-Oriented classification process. Note cloud cover classified as water in bottom right of image.

Classification accuracy was calculated using cross tabulation analysis. Data from 13 transect routes (25%

of the original dataset) was used to calculate accuracy. The ‘Union’ tool from ArcGIS 9.3.1 was used to

unite the classification layer with the 13 transect routes. The attribute table was then exported to Excel

where conditional statements were used to determine the number of occurrences of each correct or

incorrect classification. Upon completion of cross tabulation analysis it was determined the overall

accuracy of the classification was 38.3%. Addition of ancillary data did not improve classification

accuracy. Further analysis revealed that the extremely low accuracy was partially due to confusion

between bog, barren, and shrub. When these classes were removed from the accuracy assessment the

value increased to 81.2%.
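
The cross tabulation described above was carried out with conditional statements in Excel. The same bookkeeping can be sketched in Python as shown below. The labels are invented, and the way confused classes are removed (dropping every record whose reference or mapped class is excluded) is an assumption, since the exact removal rule is not stated.

```python
from collections import Counter


def cross_tabulate(reference, predicted):
    """Count occurrences of each (reference class, mapped class) pair."""
    return Counter(zip(reference, predicted))


def overall_accuracy(table, exclude=()):
    """Share of records on the diagonal, optionally ignoring some classes."""
    kept = {pair: count for pair, count in table.items()
            if pair[0] not in exclude and pair[1] not in exclude}
    total = sum(kept.values())
    correct = sum(count for (ref, pred), count in kept.items() if ref == pred)
    return correct / total


# Hypothetical validation records: ground truth versus classified map
reference = ["Forest", "Forest", "Shrub", "Bog/Barren", "Water", "Bog/Barren"]
predicted = ["Forest", "Shrub", "Bog/Barren", "Shrub", "Water", "Bog/Barren"]

table = cross_tabulate(reference, predicted)
print(overall_accuracy(table))
print(overall_accuracy(table, exclude={"Shrub", "Bog/Barren"}))
```

Reporting both figures side by side makes it clear how much of the error comes from the confused classes rather than from the classifier as a whole.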

Once aware of extremely low classification accuracy using Object-Oriented supervised classification,

other classification methods were considered. Classification accuracy did not significantly increase using

other supervised methods; however, isocluster unsupervised classification showed some improvement.

LANDSAT 7 ETM+ bands 1, 4 and 5 were clustered into 20 unknown classes. Ground truth data was then

used to determine what each class represented. The image was re-classed using IDRISI where 20 classes

were grouped into 5 land cover types (Table 2, Figure 4).

Table 2: Class numbers and associated Land Cover types used as classification scheme for isocluster unsupervised

classification.

Class number Land Cover Type

1 Forest

2 Shrub

3 Bog

4 Water

5 Unclassified
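
The rule used to assign each of the 20 clusters to a land cover type is not stated. One plausible approach, shown here purely as an assumption, is a majority vote: each cluster takes the land cover type most often observed in the ground truth records that fall inside it, and clusters with no ground truth remain unclassified. All identifiers and values below are hypothetical.

```python
from collections import Counter, defaultdict


def label_clusters(cluster_ids, ground_truth):
    """Map each cluster ID to its most frequent ground truth land cover type."""
    votes = defaultdict(Counter)
    for cluster, cover in zip(cluster_ids, ground_truth):
        votes[cluster][cover] += 1
    return {cluster: counts.most_common(1)[0][0] for cluster, counts in votes.items()}


def reclass(cluster_raster, lookup, unlabeled="Unclassified"):
    """Replace cluster IDs in a raster (list of rows) with land cover types."""
    return [[lookup.get(cell, unlabeled) for cell in row] for row in cluster_raster]


# Hypothetical ground truth samples: cluster ID under each field observation
sample_clusters = [1, 1, 1, 2, 2, 3]
sample_cover = ["Forest", "Forest", "Shrub", "Water", "Water", "Bog"]

lookup = label_clusters(sample_clusters, sample_cover)
print(reclass([[1, 2], [3, 4]], lookup))
```

Cluster 4 has no ground truth in this toy example, so it falls into the ‘Unclassified’ type, analogous to class 5 in Table 2.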

A cross tabulation analysis using the same 13 transect routes previously mentioned revealed the

classification accuracy for the isocluster unsupervised method to be 57.6%. To compare classification

accuracy to other freely available land cover products, the Earth Observation for Sustainable Development of

Forests (EOSD) product was considered. EOSD classification accuracy was measured by reclassing the

dataset to contain the same 5 classes listed in Table 2. Accuracy was then measured using

the same methods and validation data previously described. Overall classification accuracy for the EOSD

dataset was 27.2%. Low classification accuracy in all datasets measured is believed to be

largely due to the fine scale of field data and low spatial accuracy of all data used (Table 3). Transect

route segments have a mean length of 119.3 m and a standard deviation of 128.6 m, meaning transect

segments may be shorter than the spatial error in several instances.

Figure 4: Land Cover classification using LANDSAT 7 ETM+ satellite imagery and an Unsupervised Isocluster classification process.

Table 3: Spatial accuracy of data and methods used in determining overall classification accuracy.

Source of Spatial error Spatial accuracy (+/-)

LANDSAT 7 satellite imagery 20-30 m

Handheld GPS for field data 5 m

Point Snapping process (methods) 3 m

Classification accuracy may also be lowered in this analysis due to an improper MMU: Grain ratio. MMU,

or minimum mapping unit, refers to the smallest area in the extent that will be mapped as a discrete

unit. Grain refers to the smallest resolvable element in the extent (i.e. spatial resolution) (Fassnacht et

al., 2006). In this analysis the MMU is 10 m, while the grain is initially 30 m and is reduced to 15 m through

the use of pan-sharpening. Smaller MMUs may result in large within-patch variability, making patches of

interest difficult to discern and reducing classification accuracies compared to maps where this

variability has been removed (Fassnacht et al., 2006).

The highest classification accuracy was obtained by comparing the unsupervised isocluster classification to

the EOSD product at 200 random points sampled throughout the image, where overall accuracy was

measured to be 70.5%.

Fragmentation Analysis:

Upon completion of an appropriate land cover classification, a multitude of questions may be asked.

One question selected in this analysis is whether woodland caribou are selecting habitat based on a level of

fragmentation. To approach this question a comparison was made between the number of land cover

polygons found in High vs. Low occupancy sites. To calculate the number of land cover polygons, a circle

was created around each of the 38 transect routes used. To create uniform circles, a line was first

created from ‘Start’ to ‘End’ on each line. The midpoint of each line was then calculated. Since most

transects are a standard size (3.2 km long), a standard sized circle (r = 1.145 km) was created with its

centroid at the midpoint of the created line (Figure 5). The vector classification was then clipped to the

extents of each individual circle and exported to individual files. The number of polygons in each circle

was then calculated and summary statistics were computed for high and low occupancy areas (Table 4).

Figure 5: Method used for creation of study circles surrounding transect routes.

Table 4: Habitat fragmentation for high and low occupancy sites of woodland caribou, during calving in Gaff

Topsails, NL.

                     High Occupancy   Low Occupancy
SUM of features      4232             4463
Avg. # of features   223              235
Standard deviation   53.79            76.25

Fragmentation data was tested for spatial autocorrelation using the Moran’s I tool found in ArcGIS 9.3.1.

Data proved to be autocorrelated, with a Moran’s Index = 0.91, Z score = 1.84 standard deviations, and a

significance level = 0.10. Geographically weighted regression was used to investigate the correlation

between occupancy level (i.e. High occupancy or Low occupancy) and degree of fragmentation. Using a

default bandwidth of 66676.9 m, results showed an R² = 0.052, meaning there is essentially no correlation between

occupancy level and degree of fragmentation in the study area at this scale.
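
To make the autocorrelation test above more concrete, global Moran’s I can be computed as I = (n / W) × Σ w_ij (x_i − x̄)(x_j − x̄) / Σ (x_i − x̄)², where w_ij is a spatial weight and W is the sum of all weights. The sketch below assumes inverse-distance weights and uses hypothetical site coordinates and polygon counts. The weighting scheme used in the ArcGIS run is not reported here, so the output is not expected to reproduce the published index.

```python
import math


def morans_i(points, values):
    """Global Moran's I with inverse-distance weights (assumed weighting).

    points: list of (x, y) site coordinates; coincident points are not handled.
    values: one observation per site, e.g. polygons counted in each circle.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    weighted_cross = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a site is not its own neighbour
            w = 1.0 / math.dist(points[i], points[j])
            weight_sum += w
            weighted_cross += w * dev[i] * dev[j]

    variance_term = sum(d * d for d in dev)
    return (n / weight_sum) * (weighted_cross / variance_term)


# Hypothetical sites along a line, 1 km apart
sites = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
clustered = [100, 110, 105, 300, 310, 305]    # similar values sit together
alternating = [100, 300, 100, 300, 100, 300]  # neighbours differ strongly

print(morans_i(sites, clustered))
print(morans_i(sites, alternating))
```

Values above zero indicate that nearby sites tend to have similar polygon counts, while values below zero indicate that neighbours tend to differ. Significance testing (the Z score reported above) is a separate step that is not covered by this sketch.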

Discussion:

In order to better understand our wildlife species and protect them for future generations, it is

important for us to understand how animals interact with their environment. One step towards this

understanding is to classify land cover features throughout an animal’s home range to determine the

type of habitat an organism requires through the use of regression analysis. This study, however, presents

some challenges that exist in producing an accurate classification, along with some solutions.

Field data collection:

As part of an unpublished graduate project by Paul W. Saunders, land cover information was collected

throughout the area of Gaff Topsails, NL. Data was collected through the use of modified Fibonacci spirals

to decrease bias introduced by the directionality of features throughout the study area, where land

cover features tend to run in a northeast to southwest direction. Land cover boundaries were recorded

throughout transect spirals using handheld GPS. During data processing all points were snapped to lines

to facilitate the creation of transect routes. This step, however, introduced unnecessary error into the

classification analysis and should be avoided in future studies. It is recommended that transect lines be

snapped to points to better reflect data collection methods.

Segmentation and Classification:

Over the past two decades, there has been an explosion in the use of maps derived from remote sensing

(particularly those from Landsat) (Cohen and Goward, 2004). Here we consider a non-traditional method

of classification using an Object-Oriented approach. Classification was completed using a segmentation

process followed by supervised classification. Ancillary data was added to the classification results in an

attempt to increase classification accuracy. Care should be taken, however, to use data at the

finest scale available in order to minimize classification error caused by map generalization of ancillary

data.

Classification accuracy was determined to be extremely low for the Object-Oriented approach, leading

to the use of other classification types including an unsupervised isocluster method. Classification accuracy,

however, remained low in all methods. Low classification accuracy may be caused by a number of

factors. The two factors most relevant to this study are the low spatial accuracy of the processed

dataset and a low MMU: Grain Size ratio. To improve classification accuracy,

recommendations include collecting ground truth data with a larger MMU and/or using a satellite

dataset with a finer grain size or spatial resolution. Existing field data may be reclassified to imitate a

larger MMU.

Classification accuracy was highest when the classification was compared to EOSD data using random points,

suggesting that the classification performs much ‘better’ at a coarse spatial scale than at the fine scale

of field data collection.

Fragmentation:

To begin understanding the landscape geometry of woodland caribou habitat, a comparison was made

between the degree of fragmentation in high and low occupancy sites. After calculating the number of

features at each site, a Moran’s I calculation revealed spatial autocorrelation in the data. Because classical

statistics could no longer be used, geographically weighted regression was applied and showed

no significant correlation between level of occupancy and degree of fragmentation. Woodland caribou

do not appear to select habitat based on how fragmented an area is in this study. Habitat

selection, however, may be based on what feature type is present rather than how many features are

present. Further investigation including regression analysis is required in order to better explore this

topic.

Conclusion:

When studying animal behavior in relation to the environment, land cover classification is an

extremely useful tool. A number of considerations should be made, however, in order to minimize error

associated with data sources and calculation methods. When using field based observations as

classification training data, it is important to make observations in a systematic fashion that does not

introduce error associated with directionality of land cover features. When collecting field data it is also

important to record observations at the scale the satellite imagery will be studied. MMUs should be

designed to mimic the spatial resolution of the image to improve classification accuracy. During post-

classification analysis, it is important to remain mindful of spatial autocorrelation that may exist,

eliminating the usefulness of classical statistics. Finally, it is important for us to overcome challenges and

limitations of the classification process, in order to better understand our wildlife and the landscape

they interact with.

References:

Cohen, W.B. and Goward, S.N. 2004. Landsat’s role in ecological applications of remote sensing. BioScience,

Vol. 54, pp. 535-545.

Devereux B.J., Amable G.S., and Costa Posada C. 2004. An Efficient Image Segmentation Algorithm for

Landscape Analysis. International Journal of Applied Earth Observation and Geoinformation, Vol.

6, pp. 47-61.

Eastman, J.R. 2006. IDRISI Andes, Guide to GIS and Image Processing, chapter 16: Classification of

Remotely Sensed Imagery. Clark Labs, Clark University, MA, USA.

ENVI EX Tutorial: Feature Extraction with Supervised Classification. ITT Visual Information Solutions (ITT

VIS).

Fassnacht, K.S., Cohen, W.B., and Spies, T.A. 2006. Key issues in making and using satellite-based maps

in ecology: A primer. Forest Ecology and Management, Vol. 222, pp. 167-181.

Fortin, M. and Dale M.R.T. 2005. Spatial Analysis, A Guide for Ecologists. Cambridge University Press,

Cambridge, UK, 365 pages.

Fox L., Garrett M.L, Heasty R, and Torres E. 2002. Classifying Wildlife Habitat with Pan-Sharpened

Landsat 7 imagery. Proceedings: ISPRS Commission I Mid-Term Symposium in conjunction with

Pecora 15/ Land Satellite Information IV Conference 10-15 Nov. 2002, Denver, CO, USA.

Lewinski, S. 2006. Object-Oriented classification of Landsat ETM+ satellite image. Journal of Water and

Land Development. No. 10, pp. 91-106.

Saunders, P.W. 2010. Delineation of Landcover Features in Areas Utilized or Avoided by Female Caribou

during calving and Post-Calving Using Publically Available Spatial Datasets. Unpublished M.Sc.

Dissertation, The Manchester Metropolitan University.

Van der Meer, F. 2006. Remote Sensing and GIS techniques applied to geological survey. [Internet]

Faculty of Geo-Information Science and Earth Observation of the University of Twente. Available

at: <http://www.itc.nl/ilwis/Applications/application14.asp> [Accessed March 29, 2010].
