Hidden Markov Models for crop recognition in remote sensing image sequences

Pattern Recognition Letters 32 (2011) 19–26


Paula Beatriz Cerqueira Leite a, Raul Queiroz Feitosa a, Antônio Roberto Formaggio b, Gilson Alexandre Ostwald Pedro da Costa a,*, Kian Pakzad c, Ieda Del’Arco Sanches b

a Pontifícia Universidade Católica do Rio de Janeiro (PUC-Rio), Brazil
b Instituto Nacional de Pesquisas Espaciais (INPE), Divisão de Sensoriamento Remoto, Brazil
c Leibniz Universität Hannover, Institut für Photogrammetrie und Geoinformation (IPI), Germany

Article info

Article history: Available online 20 February 2010

Keywords: Hidden Markov Models; Crop recognition; Remote sensing

0167-8655/$ - see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.patrec.2010.02.008

* Corresponding author. Address: Pontifícia Universidade Católica do Rio de Janeiro, Departamento de Engenharia Elétrica, Rua Marquês de São Vicente, 225, Ed. Cardeal Leme, 4. andar, sala 401, 22453-900 Rio de Janeiro, RJ, Brazil. Tel.: +55 21 35271626; fax: +55 21 35271232.

E-mail address: [email protected] (G.A.O.P. da Costa).

Abstract

This work proposes a Hidden Markov Model (HMM) based technique to classify agricultural crops. The method uses HMMs to relate the varying spectral response along the crop cycle to plant phenology, for different crop classes, and recognizes different agricultural crops by analyzing their spectral profiles over a sequence of images. The method assigns each image segment to the crop class whose corresponding HMM delivers the highest probability of emitting the observed sequence of spectral values. Experimental analysis was conducted on a set of 12 co-registered and radiometrically corrected LANDSAT images of a region in southeast Brazil, of approximately 124,100 ha, acquired between 2002 and 2004. Reference data was provided by visual classification, validated through extensive field work. The HMM-based method achieved 93% average class accuracy in the identification of the correct crop, being, respectively, 10% and 26% superior to multi-date and single-date alternative approaches applied to the same data set.

© 2010 Elsevier B.V. All rights reserved.

1. Introduction

Given the social and economic importance of agriculture worldwide, the timely availability of precise information about agricultural activities is highly relevant for a number of strategic decisions. With accurate information about the status of different crops it is possible to develop commercial plans, to regulate internal stocks of agricultural products, to make decisions on subsidies and to draw up strategies for the negotiation of agricultural commodities in financial markets.

Remote sensing data have been increasingly applied over the last three decades to assess agricultural yield, production and crop condition (Wiegand et al., 1979; Ren et al., 2008).

With a focus on crop classification, Jeon and Landgrebe (1992) proposed a multi-date decision fusion classifier called the joint likelihood decision fusion multitemporal classifier (TP-LIK). Their method relies on the outcome of single-date classifiers, selecting, for a given image object, the class with the highest likelihood of producing the observed single-date classifications.

Considerable efforts have been devoted to monitoring crop phenology from remote-sensing imagery for assessing crop health and for crop yield forecasting (Zhang et al., 2003; Shanahan et al., 2001; Reed and Brown, 2005; Ahla et al., 2006; Törmä et al., 2007). However, in spite of the numerous studies on automatic land cover classification methods, only a few works in the literature explore phenological models to aid the classification process. A rare example is presented in Aurdal et al. (2005), where HMMs of phenological cycles are used in the classification of different forest types in southern Norway. The main differences between this work and the one presented in Aurdal et al. (2005) lie first in the target application (agriculture versus forest) and second in the modeling strategy, as will be made clear later in this text. Moreover, the performances achieved in the two works are remarkably different.

This work proposes a Hidden Markov Model (HMM) based technique for the classification of agricultural crops, exploring the information in temporal image sequences. Instead of relying on single-date images, the method identifies different agricultural crops by analyzing the crop-specific temporal profiles of spectral features over a sequence of satellite images.

Specifically, a general procedure for the HMM-based classifier design is presented and applied to crop recognition in a sequence of medium resolution satellite images of an area in the Southeast region of Brazil where sugarcane, soybean and corn are the dominant crops.

Moreover, this work investigates the ability of HMMs to identify the phenological stages of distinct agricultural crops.


The remainder of this work is organized as follows. Section 2 characterizes the phenological cycles of the crops considered in the experimental analysis. Section 3 presents a short description of the Hidden Markov Model concept. The proposed methodology is presented in Section 4 and a performance analysis is presented in Section 5, followed by final comments and conclusions.

Fig. 1. Example of a Hidden Markov Model (Si: stages; vk: observed spectral vectors; aij: stage transition probability; bik: spectral vector emission probability).

2. Crops and phenological cycles

The temporal variations of foliar area, phytomass volume and soil coverage in a given area are determined by the planting and harvesting dates, and by the particular cycles of the different crops grown in it. Knowledge of these peculiarities provides the basis for understanding the spectral behavior of different crop types in a certain period of the year.

Although the crop types described below are exemplified by the ones present in a particular study area, located in São Paulo State, Brazil (Section 5.1.1), it should be noted that the general HMM-based model, introduced in Section 4, can be applied to different crops and phenological cycles.

2.1. Semi-perennial crops

Sugarcane (SC) is the most important long cycle crop in the study area, cultivated mainly for ethanol production. Cultivation basically follows two cycles: one of 12 months ("1-year" sugarcane) and another of 18 months ("one-year-and-half" sugarcane). The one-year-and-half sugarcane is planted between January and March and the 1-year sugarcane between October and November. It is important to highlight that each sugarcane crop can be harvested during five or six consecutive agricultural cycles. For this reason the cycle is named "semi-perennial".

For areas where this crop has been recently planted, the green mass of one-year-and-half sugarcane starts to completely cover the soil in October, when there is more heat and rainfall. New areas of 1-year sugarcane should have full green coverage in April and May, and then the green phytomass tends to increase its foliar area until the next harvesting period.

Each year the harvesting period starts in April and ends in November; therefore, in a single satellite image it is possible to find straw from the harvested crop, recently planted sugarcane, as well as sugarcane in the Growth phase and in the Adult phase. It is also possible to find exposed soil where the agricultural area is being prepared for planting.

2.2. Short cycle crops

Soybean (SB) and corn (CO) are regarded as "annual crops" or "short cycle crops" since they can complete their phenological cycles in 110–140 days. They are generally planted at the end of October or the beginning of November, germinating around 10 days after being planted, and fully covering the soil surface around 60 days after germination. Subsequently, these crops reach the peak of green phytomass and then begin the grain filling process, when the quantity of green leaves starts to diminish while the quantity of yellow leaves increases. The leaves then dry out and fall, exposing the soil background again until the harvesting period.

2.3. Pasture

Pasture (PS) presents different phenological and spectral dynamics from the crops mentioned above. Its dynamics depend on the types of soil management used by cattlemen; generally, however, pastures are drier and scarcer between April and September. Regrowth starts at the beginning of the rainy season, with a rapid increase of the foliar area index, and the green vegetative vigor is sustained from November to March.

2.4. Other classes

Although not actually a crop, riparian forest (RF) was also considered in this work. Other classes of land cover are present in the study area: urban areas, roads, forest and water bodies. They appear as a few large segments that practically do not change throughout the image sequence; hence they were not treated in this work.

3. Hidden Markov Models

In this section, we introduce the basic principles of the Hidden Markov Model (HMM) (Bunkle and Caelli, 2001). The description that follows differs somewhat from seminal texts on HMMs, as we favor a terminology closely related to the target application. An HMM represents a doubly embedded stochastic process. In our HMM the observable variables (vi) are regarded as vectors of spectral values emitted by non-observable phenological stages (Si), following particular probabilistic functions, whereby the stage sequence is a first order Markov chain.

An HMM is illustrated in Fig. 1. N is the number of stages in the model (individual stages are denoted as S = {S1, ..., SN} and the stage at time t as qt) and M is the number of distinct values that an observable variable may have per stage (individual values are denoted as V = {v1, ..., vM}). A basic HMM consists of three sets of parameters:

(a) the spectral vector emission probabilities: bjk stands for the probability that the value vk is observed in stage Sj, i.e.

$$b_{jk} = P[v_k \text{ at } t \mid q_t = S_j], \quad 1 \le j \le N \text{ and } 1 \le k \le M$$

(b) the stage transition probabilities: aij is the probability that the system is in stage Sj at the subsequent time instant, given that its current stage is Si, i.e.

$$a_{ij} = P[q_{t+1} = S_j \mid q_t = S_i], \quad 1 \le i, j \le N$$

(c) the initial probability distribution: πi stands for the probability of the system being in a given stage Si at the initial time instant, i.e.

$$\pi_i = P[q_1 = S_i], \quad 1 \le i \le N$$

Fig. 2. HMM used in this work (a) for sugarcane, soybean and corn and (b) for pasture and riparian forest (PP = Prepared Soil, GR = Growth, AD = Adult phase and PH = Post-Harvesting).

If aij > 0, then stage Si can be followed by stage Sj. If aij = 0, thistransition will not occur.
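As a minimal sketch of how these parameter sets might be held in code, the following Python fragment stores an initial distribution and a transition matrix for a four-stage model and lists the transitions that are allowed (aij > 0). The numeric values, the names `PI`, `A`, `is_stochastic` and `allowed_transitions`, and the cyclic PP, GR, AD, PH topology are illustrative assumptions, not parameters taken from this work.

```python
# Illustrative stage labels; every numeric value below is hypothetical.
STAGES = ["PP", "GR", "AD", "PH"]

# Initial distribution: PI[i] = P[q_1 = S_i] (assumed values).
PI = [0.7, 0.1, 0.1, 0.1]

# Transition matrix: A[i][j] = P[q_{t+1} = S_j | q_t = S_i] (assumed values).
A = [
    [0.6, 0.4, 0.0, 0.0],  # PP -> PP, GR
    [0.0, 0.7, 0.3, 0.0],  # GR -> GR, AD
    [0.0, 0.0, 0.8, 0.2],  # AD -> AD, PH
    [0.3, 0.0, 0.0, 0.7],  # PH -> PP, PH
]


def is_stochastic(rows, tol=1e-9):
    """Return True if every row is a valid probability distribution."""
    return all(
        min(row) >= 0.0 and abs(sum(row) - 1.0) <= tol for row in rows
    )


def allowed_transitions(matrix, stages):
    """List the (from, to) stage pairs whose transition probability is > 0."""
    return [
        (stages[i], stages[j])
        for i, row in enumerate(matrix)
        for j, a_ij in enumerate(row)
        if a_ij > 0.0
    ]
```

With these assumed values, `allowed_transitions(A, STAGES)` contains `("PP", "GR")` but not `("PP", "AD")`, which mirrors the statement that a zero entry forbids the corresponding transition.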

Once reliable estimates of these three parameter sets are available, two basic problems can be solved by using well known algorithms (see Rabiner (1989) for details):

• Problem 1: Given a crop model λ and a sequence of observed spectral vectors V, compute the probability that the observed spectral vector sequence was produced by this model.

• Problem 2: Given a model and a sequence of observed spectral vectors, determine the most likely stage sequence.

The algorithms that solve these problems cannot be thoroughly explained in a few words. Nevertheless, we briefly describe the idea underlying the solution of both problems. Consider a particular stage sequence Q = q1, q2, ..., qT of length T. The probability that the crop represented by an HMM λ followed the stage sequence Q and produced the observed vector sequence V is given by:

$$P(V, Q \mid \lambda) = \pi_{q_1} b_{q_1}(v_1)\, a_{q_1 q_2} b_{q_2}(v_2) \cdots b_{q_{T-1}}(v_{T-1})\, a_{q_{T-1} q_T} b_{q_T}(v_T) \qquad (1)$$

This is simply the product of the individual stage transition probabilities and the emission probabilities of the observed spectral vectors, weighted by the probability of being at stage q1 at the first point in time. The solution of problem 1 is given by the summation of the expression above over all possible stage sequences, formally:

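$$P(V \mid \lambda) = \sum_{\text{all } Q} \pi_{q_1} b_{q_1}(v_1)\, a_{q_1 q_2} b_{q_2}(v_2) \cdots b_{q_{T-1}}(v_{T-1})\, a_{q_{T-1} q_T} b_{q_T}(v_T) \qquad (2)$$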

In the classification step the multitemporal segment is assigned to the crop model that delivers the highest probability of having emitted the observed vector sequence.
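The sum over all stage sequences and the resulting classification rule can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the authors' prototype: observations are integer indices into a discrete emission table `B[i][k]`, and the function names are hypothetical. The brute-force version enumerates every stage sequence, while the forward recursion (Rabiner, 1989) reaches the same value with far less work.

```python
from itertools import product


def joint_probability(pi, A, B, obs, path):
    """P(V, Q | model) for one stage sequence `path` (product form)."""
    p = pi[path[0]] * B[path[0]][obs[0]]
    for t in range(1, len(obs)):
        p *= A[path[t - 1]][path[t]] * B[path[t]][obs[t]]
    return p


def likelihood_brute_force(pi, A, B, obs):
    """P(V | model) by summing over all N**T stage sequences."""
    n = len(pi)
    return sum(
        joint_probability(pi, A, B, obs, path)
        for path in product(range(n), repeat=len(obs))
    )


def likelihood_forward(pi, A, B, obs):
    """P(V | model) via the forward recursion, O(N^2 T) operations."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for t in range(1, len(obs)):
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
            for j in range(n)
        ]
    return sum(alpha)


def classify(models, obs):
    """Return the name of the model with the highest likelihood.

    `models` maps a class name to a (pi, A, B) tuple.
    """
    scores = {
        name: likelihood_forward(pi, A, B, obs)
        for name, (pi, A, B) in models.items()
    }
    return max(scores, key=scores.get)
```

For sequences of realistic length the products underflow, so a practical implementation would work with scaled or logarithmic quantities.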

Problem 2 searches for the most probable stage sequence Qopt, given a model λ and a sequence V of observed spectral vectors. The solution corresponds to the sequence that maximizes Eq. (1). This means that:

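$$Q_{\text{opt}} = \arg\max_{Q} \left[ \pi_{q_1} b_{q_1}(v_1)\, a_{q_1 q_2} b_{q_2}(v_2) \cdots b_{q_{T-1}}(v_{T-1})\, a_{q_{T-1} q_T} b_{q_T}(v_T) \right] \qquad (3)$$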

Computing these expressions directly from Eqs. (2) and (3) is computationally intensive. Efficient algorithms are presented in many textbooks on pattern recognition and in particular in the seminal HMM paper (Rabiner, 1989).
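One such efficient algorithm for problem 2 is the Viterbi algorithm described by Rabiner (1989). The sketch below is a plain Python illustration for discrete observation indices; the function name and the argument layout (`pi`, `A`, `B` as nested lists) are assumptions made for this example.

```python
def viterbi(pi, A, B, obs):
    """Return (most probable stage sequence, its joint probability)."""
    n = len(pi)
    # delta[i]: probability of the best partial path ending in stage i.
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    backpointers = []
    for t in range(1, len(obs)):
        new_delta, pointers = [], []
        for j in range(n):
            # Best predecessor of stage j at this step.
            best_i = max(range(n), key=lambda i: delta[i] * A[i][j])
            pointers.append(best_i)
            new_delta.append(delta[best_i] * A[best_i][j] * B[j][obs[t]])
        delta = new_delta
        backpointers.append(pointers)
    # Backtrack from the best final stage.
    last = max(range(n), key=lambda i: delta[i])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, delta[last]
```

The recursion keeps only the best partial path per stage, so its cost grows linearly with the sequence length instead of exponentially.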

4. Methodology

4.1. General model description

In this work, a specific HMM is defined for each crop class. Phenological stages correspond to the HMM stages, and the observable variables are the vectors comprising the digital numbers registered by the orbital sensor in each spectral band, plus the NDVI (computed from the spectral bands).
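As an illustration of how such an observation vector might be assembled, the sketch below assumes the standard NDVI definition, (NIR - Red) / (NIR + Red), with Landsat TM/ETM+ band 3 as red and band 4 as near-infrared. The function names and the handling of a zero denominator are assumptions for this example.

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index from red and NIR values."""
    total = nir + red
    if total == 0:
        return 0.0  # assumed convention when both bands are zero
    return (nir - red) / total


def feature_vector(bands):
    """Build a 7-element vector: bands 1-5 and 7, followed by the NDVI.

    `bands` maps a Landsat band number to the value observed for it.
    """
    spectral = [bands[b] for b in (1, 2, 3, 4, 5, 7)]
    return spectral + [ndvi(red=bands[3], nir=bands[4])]
```

For example, a red value of 0.1 and a near-infrared value of 0.5 give an NDVI of 0.4 / 0.6, roughly 0.67, which is typical of dense green vegetation.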

The basic HMM shown in Fig. 2a was chosen to model the temporal behavior of sugarcane, soybean and corn. The arrows illustrate how the stages are temporally related. According to plant phenology, the symbols PP, GR, AD and PH correspond, respectively, to the stages Prepared Soil, Growth phase, Adult phase and Post-Harvesting.

For pasture and riparian forest there is no significant phenological change during the period covered by the images in the available data set. This can be explained by the fact that these land covers are neither sown nor harvested and have been present in the target areas for a long time. Thus a specific HMM is devised for these classes having a single stage AD, which corresponds to the Adult phase (Fig. 2b).

Even though pasture and riparian forest are actually not crops, the term "crop" will be used hereafter to designate all five classes to be recognized in our particular application, namely: sugarcane, soybean, corn, pasture and riparian forest.

4.2. Fitting the model to the application

The problem considered in this work deviates in a number of ways from the basic HMM description presented in the preceding section.

First, the spectral vector emission probabilities bjk depend on seasonal effects that cannot be fully compensated in the image pre-processing phase. Second, the initial probability distribution (πi) is not constant along the year (see Section 2). This happens because each crop class has preferential months for sowing, conditioned mostly by the climate and by characteristics of the particular crop. In these months, crops are likely to be in the initial phenological stages. As time passes this probability decreases, while the probability of a crop being in the growth or adult stages increases. So the initial probabilities will vary according to those periods. Third, the basic HMM described in Section 3 assumes that the spectral vectors are emitted (actually available) at a constant time rate. In most real applications not all images in a sequence acquired at a constant frequency are usable, mostly due to clouds over the target geographical area. Moreover, the basic model shown in Fig. 2 may also change for a larger interval between two consecutive images in the data set. For instance, a transition from PP to AD may become possible in these cases.

Therefore, an HMM for our problem has to consider distinct spectral vector emission probabilities, initial stage probabilities, and stage transition probabilities for each pair of consecutive images in the available dataset (Leite et al., 2008).
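A forward recursion that accepts date-dependent parameters could look like the sketch below. It is an illustration only: it assumes that the initial log-probabilities for the first date of the sequence, one transition log-matrix per pair of consecutive images, and the log emission densities of the observed vector at each date have already been computed. The names `safe_log`, `logsumexp` and `log_likelihood` are hypothetical, and the log domain is a choice made here for numerical stability.

```python
import math


def safe_log(p):
    """Natural log that maps a zero probability to minus infinity."""
    return math.log(p) if p > 0 else -math.inf


def logsumexp(values):
    """Stable log(sum(exp(v))) for a list of log-domain values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(log_pi, log_A_per_step, log_b_per_date):
    """Log-probability of an observed sequence under one crop model.

    log_pi[j]            : initial log-probability of stage j at date 0
    log_A_per_step[t]    : transition log-matrix between dates t and t + 1
    log_b_per_date[t][j] : log emission density of the vector observed
                           at date t, evaluated for stage j
    """
    n = len(log_pi)
    alpha = [log_pi[j] + log_b_per_date[0][j] for j in range(n)]
    for t in range(1, len(log_b_per_date)):
        log_A = log_A_per_step[t - 1]
        alpha = [
            logsumexp([alpha[i] + log_A[i][j] for i in range(n)])
            + log_b_per_date[t][j]
            for j in range(n)
        ]
    return logsumexp(alpha)
```

Because each step reads its own transition matrix, an unusually long gap between two usable images can be given a matrix that permits, for instance, a direct PP to AD transition.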

These characteristics resulted in a model that is in many ways different from the one presented in Aurdal et al. (2005), as anticipated in the introduction of this paper.

Gaussian distributions are assumed for the spectral vector emission probabilities. Hence, the emission probability density of a vector vk consisting of the spectral bands and NDVI is given by:

$$b_{jk} = \frac{1}{(2\pi)^{d/2} |\Sigma_j|^{1/2}} \exp\left[ -\frac{(v_k - \mu_j)^{T} \Sigma_j^{-1} (v_k - \mu_j)}{2} \right] \qquad (4)$$

where μj and Σj denote, respectively, the mean vector and the covariance matrix for stage j, and d is the dimension of vk. Since different crop types generally have different spectral characteristics even if they are in the same phenological stage, it is necessary to estimate both the mean vector and the covariance matrix for each stage of each crop class. Therefore, the specific means and covariance matrices of the phenological stages are part of each HMM crop model.
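For numerical work it is often convenient to evaluate Eq. (4) in logarithmic form, since products of many small densities underflow quickly. With μ_j and Σ_j the mean vector and covariance matrix of stage j and d the dimension of v_k, taking the natural logarithm of the Gaussian density gives the expression below. This is a standard rearrangement rather than a formula stated in this work:

$$\ln b_{jk} = -\frac{d}{2}\ln(2\pi) - \frac{1}{2}\ln|\Sigma_j| - \frac{1}{2}(v_k - \mu_j)^{T}\Sigma_j^{-1}(v_k - \mu_j)$$

The first two terms depend only on the stage, so they could be computed once per stage and date and reused for every segment.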

4.3. Estimating parameter values for each model

Parameter estimation is performed for each crop type separately. The leave-one-out (Brovelli et al., 2006) cross validation strategy was used, in order to cope with possibly scarce available samples in view of the large number of parameters to estimate.

Stage transition probabilities aij are estimated for each pair of consecutive images in the data set. Only samples of the crop class being modeled are considered in the next steps. First, an accumulator matrix is created with rows and columns corresponding to stages in the earlier and in the later date, respectively. Next, the element at row i and column j is incremented for each sample that is in stage Si at the earlier date and in stage Sj at the later date. Finally, the accumulator matrix is normalized by dividing each of its elements by the sum across the corresponding row. The result is the estimate of the transition probability matrix.

The procedure to estimate the initial probability distribution πi is similar. Again, only samples of the crop class being modeled at that date are considered. An accumulator vector is created having one element per stage. The ith element is incremented for each training sample at stage Si. Finally, the accumulator vector is divided by the sum of its elements. The result is taken as the estimate of the initial stage probability.
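The two counting procedures above can be sketched in a few lines of Python. Stages are represented by integer indices, the function names are hypothetical, and leaving a row of zeros when a stage has no training samples at the earlier date is an assumption, since the text does not say how that case is handled.

```python
def estimate_transitions(stage_pairs, n_stages):
    """Row-normalized accumulator matrix for one pair of consecutive dates.

    `stage_pairs` holds one (earlier_stage, later_stage) tuple per
    training sample of the crop class being modeled.
    """
    counts = [[0] * n_stages for _ in range(n_stages)]
    for i, j in stage_pairs:
        counts[i][j] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption: a stage never seen at the earlier date keeps zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def estimate_initial(stages, n_stages):
    """Normalized accumulator vector of stage frequencies at one date."""
    if not stages:
        raise ValueError("at least one training sample is required")
    counts = [0] * n_stages
    for i in stages:
        counts[i] += 1
    total = sum(counts)
    return [c / total for c in counts]
```

For example, the pairs `[(0, 0), (0, 1), (0, 1), (1, 1)]` with two stages yield the rows `[1/3, 2/3]` and `[0, 1]`.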

As mentioned before, the problem of estimating emission probabilities turns into that of estimating mean vectors and covariance matrices for each image. The computation of mean vectors poses no difficulty. However, the estimation of the covariance matrix may be a problem if training patterns are scarce. The sample covariance (Johnson and Wichern, 1998) is conventionally used as the estimate of the covariance matrix. If the number of training patterns does not exceed the dimension of the feature space, the sample covariance is non-invertible, and the value of the Gaussian pdf (see Eq. (4)) is undetermined. Real applications often do not meet this requirement if each training segment provides just a single training pattern. This problem can be circumvented in the following way. Instead of taking an image segment as a single training pattern, we consider each individual pixel within the image segment (see Section 5.1.5) as a training pattern for the computation of the sample covariance. This procedure increases the number of training patterns, assures a non-singular sample covariance and provides a more accurate estimate. The classification itself is performed strictly segmentwise, whereby each segment being classified is described by a single vector given by the average spectral values of the pixels it encloses. The parameters associated with the distribution of the segments in each class can be estimated by regarding each segment as a random sample of size k, where k is the number of pixels in the segment, drawn from a multivariate normal distribution of pixels with mean vector μ and covariance matrix Σ. It follows from a well known property of Gaussian distributions (see e.g., Johnson and Wichern, 1998) that the corresponding average spectral vector of segments belonging to the same crop and stage is also a Gaussian random variable with mean equal to μ and covariance matrix equal to (1/k)Σ. Thus, we can estimate the parameters of the distribution of the segments through the pixelwise mean estimate and the pixelwise covariance matrix estimate divided by the number of pixels enclosed by the segments.
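A minimal sketch of this estimation step is shown below, assuming that `pixels` is the pooled list of pixel vectors from the training segments of one crop and stage and that `k` is the pixel count of the segment to be classified. The function names are hypothetical, and the unbiased (n - 1) normalization is an assumption about how the sample covariance is computed.

```python
def mean_vector(pixels):
    """Component-wise mean of a list of equally long pixel vectors."""
    n, d = len(pixels), len(pixels[0])
    return [sum(p[c] for p in pixels) / n for c in range(d)]


def sample_covariance(pixels):
    """Unbiased sample covariance matrix of the pixel vectors."""
    n, d = len(pixels), len(pixels[0])
    if n < 2:
        raise ValueError("at least two pixels are required")
    mu = mean_vector(pixels)
    cov = [[0.0] * d for _ in range(d)]
    for p in pixels:
        for a in range(d):
            for b in range(d):
                cov[a][b] += (p[a] - mu[a]) * (p[b] - mu[b])
    return [[v / (n - 1) for v in row] for row in cov]


def segment_mean_distribution(pixels, k):
    """Mean and covariance of the average vector of a k-pixel segment.

    Uses the property that the mean of k independent Gaussian pixels
    has the pixelwise mean and the pixelwise covariance divided by k.
    """
    mu = mean_vector(pixels)
    cov = sample_covariance(pixels)
    return mu, [[v / k for v in row] for row in cov]
```

The division by k relies on the pixels of a segment being independent draws; spatially correlated pixels would make the true covariance of the segment mean larger than this estimate.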

4.4. Classification

Segment based classification has been increasingly advocated in the last decade as advantageous over traditional pixel based classification (Blaschke and Strobl, 2001). In a multitemporal framework this issue becomes even more important. In fact, the proposed HMM-based method can be applied to pixel-wise as well as segment-wise classification, as long as the objects being classified have been accurately located in all images of the multitemporal sequence. Proper image registration is therefore a crucial requirement in this multitemporal framework. If pixels are the objects being classified, an error of about 1 pixel may be enough to cause a complete object misalignment in the multitemporal image sequence, so that the spectral values at the same image coordinate at different dates actually refer to distinct objects. Segments are usually characterized by the mean spectral value of the pixels they enclose. Consequently, their spectral description is less sensitive to inaccuracies in the image registration phase. For these reasons the experimental analysis reported in Section 5 is conducted segment-wise.

Once the HMMs have been established and their parameters estimated, the classification of an image segment is done in the following way. The segment is represented at each date by a spectral vector comprising its average spectral values and NDVI observed at that date. The classifier computes for each model the probability that the corresponding crop class emits the observed sequence of spectral vectors. The segment is assigned to the class whose model delivers the highest emission probability. A detailed description of how emission probabilities are computed for an HMM (problem 1) can be found in Rabiner (1989). Another algorithm described in that work is used to infer, for each model, the most probable stage at each point in time (problem 2).

5. Performance analysis

5.1. Data set

5.1.1. Study area

The study area corresponds to three municipalities in the State of São Paulo, Brazil: Ipuã, Guará and São Joaquim da Barra (inside a geographical region defined by the following coordinates: 20°16′30″S to 20°40′00″S; 47°37′36″W to 48°13′50″W), covering an area of 124,100 ha (Fig. 3). Agriculture is the main activity in this area. The most common crops found in the region are sugarcane, soybeans and corn. Furthermore, the region has a flat to gently undulating relief, a tropical climate with dry winter, an annual mean temperature of 22.9 °C and an annual mean precipitation of 1480 mm.

5.1.2. Image sequence

The dataset was composed of a total of 12 Landsat scenes, from orbit/point WRS 220/74, from either the TM/Landsat-5 or the ETM+/Landsat-7 sensor systems (Sanches, 2004). The images were acquired at different dates from 2002 to 2004 (Fig. 4). Bands 1–5 and 7 from those images were used in this work.

5.1.3. Image pre-processing

Geometric correction of the Landsat data was performed using 13 ground control points. The nearest neighbor resampling method was chosen because it better preserves the radiometry of the original images (Mather, 1993).

Atmospheric correction was performed through the dark-object subtraction technique proposed by Chavez (1988).

To account for differences in the solar angles and spread-spectrum effect, the multi-date group of images was radiometrically normalized. In this work, this process was done according to the methodology proposed by Gürtler (2003).

In order to correctly represent the different objects to be classified and their conditions at the image acquisition times, grayscale values were converted to reflectance values, which have a well defined physical meaning. This conversion was based on the methodology proposed in Luiz et al. (2003).


Fig. 3. Study area in State of São Paulo, Brazil.

Fig. 4. Images available in the data set with respective acquisition dates (month/year).

1 The term image object denotes a portion of a reference segment, as mentioned in Section 5.1.5.


5.1.4. Image segmentation

The segmentation procedure consisted of six main steps:

(a) first, the bands from all images in the temporal sequence (six bands for each image) were stacked, forming a single multi-date image with 72 bands;

(b) a spatial Gaussian low pass filter was applied to each band;

(c) the gradient of each band was then computed using the Sobel operator;

(d) next, the maximum value of the gradient magnitude across all bands was computed, resulting in a single, two-dimensional matrix;

(e) all local minima in the gradient matrix whose depth was lower than an empirically defined threshold d were suppressed by applying the h-minima transformation (Soille, 2003);

(f) finally, the Watershed Algorithm (Vincent and Soille, 1991) was applied to the result of the previous step.

It is worth mentioning that by segmenting the image set as a single image the resulting segment borders are consistent across all images. Assuming that the images were properly registered, step (a) of the segmentation procedure described above guarantees that each segment corresponds to the same geographical region at all dates. This is crucial for the multitemporal classification that follows. As far as the proposed multitemporal analysis is concerned, steps (b)–(f) could be replaced by other segmentation strategies as long as the outcome is reasonably consistent with visual perception.
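Two of the steps above, the per-band Sobel gradient and the maximum of the gradient magnitude across bands, can be sketched in plain Python as follows. This is an illustration for small nested-list images only: the function names and the replication of border pixels are assumptions, and the smoothing, h-minima and watershed steps are not covered here.

```python
import math


def sobel_magnitude(band):
    """Sobel gradient magnitude of a 2-D band given as a list of rows.

    Border pixels are handled by replicating the nearest valid pixel
    (an assumption; the text does not specify border handling).
    """
    rows, cols = len(band), len(band[0])

    def px(r, c):
        r = min(max(r, 0), rows - 1)
        c = min(max(c, 0), cols - 1)
        return band[r][c]

    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            gx = (px(r - 1, c + 1) + 2 * px(r, c + 1) + px(r + 1, c + 1)) - (
                px(r - 1, c - 1) + 2 * px(r, c - 1) + px(r + 1, c - 1)
            )
            gy = (px(r + 1, c - 1) + 2 * px(r + 1, c) + px(r + 1, c + 1)) - (
                px(r - 1, c - 1) + 2 * px(r - 1, c) + px(r - 1, c + 1)
            )
            out[r][c] = math.hypot(gx, gy)
    return out


def max_gradient(bands):
    """Pixel-wise maximum of the gradient magnitude over all bands."""
    grads = [sobel_magnitude(b) for b in bands]
    rows, cols = len(grads[0]), len(grads[0][0])
    return [
        [max(g[r][c] for g in grads) for c in range(cols)]
        for r in range(rows)
    ]
```

Taking the maximum over all 72 stacked bands means that an edge visible at any single date or band contributes to the final gradient matrix, which is what makes the resulting segment borders consistent across dates.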

5.1.5. Reference data

A total of 316 reference image locations were selected in the study area and two experts classified them visually in each image, determining the crop class and the corresponding phenological stage. First, the two human experts worked individually, each interpreting all 316 reference segments (centered on the bright dots in Fig. 3). Then, both experts worked together to reach a consensus classification: the individual results were compared and, when they differed, the experts examined the multi-date image sequence again and decided on a final classification. Information gathered during two field campaigns, which took place in March 2003 and August 2003, was used by the experts in the classification process.

The segments enclosing each of the 316 image locations at each date were used to build the training and testing sets. The segments of each date were grouped according to crop and stage. Approximately half of the segments in each group were randomly selected for training and the remaining segments were used for testing. The procedure was repeated 100 times, each time with a distinct random selection of training and test segments. The results reported henceforth are averages computed across all experiment repetitions.
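The repeated random split can be sketched as follows. The function names, the `evaluate` callback (assumed to train on one list of segments and return an accuracy measured on the other) and the fixed seed are hypothetical choices for this illustration.

```python
import random


def split_half(groups, rng):
    """Randomly assign about half of each group to training.

    `groups` maps a (crop, stage) key to the list of segment ids
    in that group.
    """
    train, test = [], []
    for key in sorted(groups):
        members = list(groups[key])
        rng.shuffle(members)
        half = len(members) // 2
        train.extend(members[:half])
        test.extend(members[half:])
    return train, test


def repeated_evaluation(groups, evaluate, repetitions=100, seed=0):
    """Average the score returned by `evaluate` over random splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repetitions):
        train, test = split_half(groups, rng)
        scores.append(evaluate(train, test))
    return sum(scores) / len(scores)
```

Splitting inside each crop and stage group keeps every group represented in both sets, which a single global shuffle would not guarantee for small groups.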

5.2. Experiment results

A software prototype implementing the proposed method was built for validation. Experiments were conducted to assess the model’s ability to recognize crop classes in sequences of image objects¹ collected from our data set. Each sequence consisted of a set of corresponding image objects, in a set of temporally adjacent images, with all objects in the set belonging to the same crop class. A sequence starts with the earliest and ends with the latest occurrence of a specific crop in the particular region covered by the corresponding image object. Unless stated otherwise, the results reported hereafter refer to experiments performed on a total of 386 sequences of diverse lengths collected according to the aforementioned criteria.

Table 1
Optimum set of n spectral features and corresponding classification performance.

Number of spectral features   Optimum set of spectral features   Average class accuracy (%)
1                             NDVI                               79
2                             4, NDVI                            80
3                             2, 7, NDVI                         81
4                             2, 3, 7, NDVI                      83
5                             1, 2, 4, 5, 7                      84
6                             1, 2, 3, 4, 5, 7                   85
7                             All                                83

Table 2
Crop classification accuracy.

Crops                        Rates (%)
                             HMM   TP-LIK   Single-date
Soybeans (SB)                94    90       71
Corn (CO)                    73    34       61
Sugarcane (SC)               94    88       64
Pasture (PS)                 92    80       66
Riparian forest (RF)         72    66       61
Overall accuracy             90    82       66
Average class accuracy (%)   85    72       66

Table 3
Stage classification accuracy.

Phenological stages      Rates (%)
Post-Harvesting (PH)     94
Prepared Soil (PP)       81
Growth phase (GR)        55
Adult phase (AD)         95
Overall accuracy         87
Average class accuracy   81

Table 4
Stage classification confusion matrix.

Phenological stages   PH    AD    GR    PP
PH                    107   2     1     3

5.2.1. Optimum feature set

The first experiment aimed at determining the set of n out of the seven available features (six image bands plus the NDVI) that resulted in the best classification, for n varying from 1 to 7.

The optimum set of n spectral features was determined by exhaustive search, for n varying between 1 and 7. Table 1 summarizes the results. The best individual feature was the NDVI. This is significant considering the approximation implied by the normality assumption. Indeed, if bands 3 and 4 may plausibly be modeled by a Gaussian, the NDVI will presumably deviate from that assumption. So, the results in Table 1 suggest that a refinement involving a more accurate model for the NDVI distribution might bring an important performance improvement. However, the investigation of this possibility is left for future work.
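The remark on normality can be made concrete with the usual definition of the NDVI as a normalized difference of the near-infrared and red responses, which for Landsat TM/ETM+ are bands 4 and 3. The sketch below assumes non-negative inputs; the function name and the zero-denominator convention are choices made for this illustration only.

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index from red and near-infrared values.

    For Landsat TM/ETM+, red is band 3 and near infrared is band 4.
    Returning 0.0 for a zero denominator is a convention of this sketch.
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


dense_vegetation = ndvi(red=0.05, nir=0.45)  # high NIR, low red: close to 0.8
bare_soil = ndvi(red=0.20, nir=0.25)         # similar responses: close to 0.11
```

Since the index is a ratio of band values and is confined to the interval [-1, 1] for non-negative inputs, it cannot be exactly Gaussian even when the bands themselves are, which is the deviation the paragraph above alludes to.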

With the NDVI alone, the HMM method achieves approximately 79% average class accuracy. By adding the remaining six spectral features to the NDVI the performance increases about 6%. It is worth noting that the highest performance reported in Table 1 was achieved with six features not including the NDVI. These features (bands 1, 2, 3, 4, 5 and 7) were the ones used in all the following experiments.
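The exhaustive search over feature subsets can be sketched as follows. With seven features there are only 2^7 − 1 = 127 non-empty subsets, so trying all of them is feasible. The names `FEATURES`, `best_subsets` and `toy_score` are hypothetical; in the experiment, the score of a subset would be the average class accuracy obtained with it, whereas the toy score below is invented purely so that the sketch runs.

```python
from itertools import combinations

FEATURES = ["band1", "band2", "band3", "band4", "band5", "band7", "NDVI"]


def best_subsets(features, score):
    """For each subset size n, return (best subset, its score)."""
    best = {}
    for n in range(1, len(features) + 1):
        # Evaluate every subset of size n and keep the highest-scoring one.
        winner = max(combinations(features, n), key=score)
        best[n] = (winner, score(winner))
    return best


# Invented stand-in for "train, classify and measure accuracy".
TOY_WEIGHTS = {"NDVI": 3.0, "band4": 2.0}


def toy_score(subset):
    return sum(TOY_WEIGHTS.get(name, 1.0) for name in subset)


result = best_subsets(FEATURES, toy_score)
n_subsets = sum(1 for n in range(1, 8) for _ in combinations(FEATURES, n))
```

The cost grows exponentially with the number of features, so this strategy is only practical for small feature sets such as the one considered here.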

5.2.2. Crop recognition

The second experiment assessed the performance of the HMM model for crop recognition. Similar experiments were carried out using two other classification approaches, in order to provide a basis for performance comparison. The first alternative approach was the joint likelihood decision fusion multitemporal classifier (TP-LIK) proposed in (Jeon and Landgrebe, 1992). The TP-LIK method distinguishes between information classes and local classes. In our analysis, information classes correspond to the five crop types of our dataset, while the local classes are jointly defined by the phenological stages and the crop types, yielding a total of 14 local classes². At each date, a maximum likelihood (ML) classifier delivers the local class label for each image object based only on its spectral appearance at that date. Gaussian distributions identical to the vector emission probability densities (Eq. (1)) were assumed. Given the local labels assigned to the image objects of the sequence being classified, the TP-LIK method assigns the sequence to the information class (crop) having the highest probability of generating that label sequence. In our experiment with TP-LIK it was assumed that the crop classes were equiprobable.

A second alternative to the HMM-based method was also considered, with the purpose of assessing the performance gain brought by the use of multi-date in relation to single-date images. The single-date classifier was simulated by applying the same HMM algorithm described so far to all unit-length sequences of our dataset. The results obtained for these three classification approaches are shown in Table 2.

² SB, CO and SC have four stages while PS and RF have just one stage, yielding a total of 14 local classes.

Both multi-date methods performed notably better than the single-date counterpart in terms of average per-class and global accuracies. These results clearly indicate the advantage of using multiple images for crop recognition.

Regarding the multi-date approaches, the HMM-based method outperformed TP-LIK by 13% and 8%, respectively, in average class accuracy and overall accuracy. Both methods performed well for most classes, with corn and riparian forest presenting the lowest performance. In fact, these are the classes with the fewest samples in our database, approximately 4.5% of all samples in each case. Consequently, the model parameters related to corn and riparian forest were poorly estimated in comparison to those of the other crop types.
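The decision rule evaluated in this experiment, assigning a sequence to the crop whose HMM gives it the highest likelihood, can be sketched with the forward algorithm in the log domain. This is a simplified illustration rather than the prototype's code: it assumes a single scalar feature per date with univariate Gaussian emissions (the paper uses vector-valued emissions, Eq. (1)), and every model parameter below is invented.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def safe_log(p):
    return math.log(p) if p > 0 else -math.inf


def logsumexp(values):
    top = max(values)
    if top == -math.inf:
        return -math.inf
    return top + math.log(sum(math.exp(v - top) for v in values))


def log_likelihood(obs, model):
    """Forward algorithm: log P(obs | model)."""
    start, trans = model["start"], model["trans"]
    means, variances = model["means"], model["vars"]
    n = len(start)
    alpha = [safe_log(start[i]) + log_gauss(obs[0], means[i], variances[i])
             for i in range(n)]
    for x in obs[1:]:
        alpha = [logsumexp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
                 + log_gauss(x, means[j], variances[j])
                 for j in range(n)]
    return logsumexp(alpha)


def classify(obs, models):
    """Return the name of the model with the highest likelihood for obs."""
    return max(models, key=lambda name: log_likelihood(obs, models[name]))


# Invented toy models: a two-stage annual crop and a one-stage pasture.
toy_models = {
    "annual_crop": {"start": [1.0, 0.0], "trans": [[0.5, 0.5], [0.0, 1.0]],
                    "means": [0.2, 0.8], "vars": [0.01, 0.01]},
    "pasture": {"start": [1.0], "trans": [[1.0]],
                "means": [0.5], "vars": [0.01]},
}
label_a = classify([0.2, 0.25, 0.75, 0.8], toy_models)
label_b = classify([0.5, 0.52, 0.48, 0.5], toy_models)
```

A single-date classifier in the sense used above corresponds to calling `classify` on a sequence of length one, in which case only the initial-state and emission terms contribute.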

5.2.3. Phenological stage recognition

Although it was not the main interest in this work, a further experiment was conducted, with the aim of assessing the ability of the proposed method to identify the phenological stage of each image object at each date of the image sequences. Only correctly recognized sequences of corn, soybean and sugarcane were used in this experiment. Once the crop class is known, the phenological stage at each date is given by the stage of the corresponding model with the highest emission probability of the observed spectral features.
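The per-date stage assignment described above can be sketched as an argmax over the emission densities of the stages of the recognized crop. As before, this is a minimal sketch under assumptions: a single scalar feature with univariate Gaussian emissions is used for brevity, and the stage parameters are invented.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def stage_per_date(obs, stage_params):
    """Label each observation with the stage of highest emission density.

    stage_params: {stage_name: (mean, variance)} for the recognized crop.
    """
    return [max(stage_params, key=lambda s: log_gauss(x, *stage_params[s]))
            for x in obs]


# Invented emission parameters for one crop (one scalar feature).
toy_stages = {
    "PP": (0.15, 0.01),  # Prepared Soil
    "GR": (0.45, 0.01),  # Growth phase
    "AD": (0.80, 0.01),  # Adult phase
    "PH": (0.25, 0.01),  # Post-Harvesting
}
stages = stage_per_date([0.14, 0.50, 0.82, 0.27], toy_stages)
```

Each date is labeled independently of its neighbors in this rule, so a value that falls between two stage means is assigned to whichever stage happens to be closer.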

Table 4 (continued)

Phenological stages   PH    AD    GR    PP
AD                    16    216   18    17
GR                    3     27    104   58
PP                    2     9     32    877

Table 5
Accuracy for sequences of varying lengths.

Sequence length   Rates (%) per crop         Global accuracy (%)   Average class accuracy (%)
                  SB   CO   SC   PS   RF
1                 68   42   66   58   50     63                    57
2                 77   43   82   67   66     76                    67
3                 79   42   90   69   73     83                    71
4                 82   44   95   72   72     87                    73
5                 83   38   97   76   72     89                    73
6                 84   39   98   81   72     91                    75
7                 87   13   98   79   74     91                    70
8                 92   –    99   83   76     93                    69
9                 92   –    99   84   76     93                    88
10                91   –    99   84   76     93                    87
11                93   –    99   83   82     93                    89
12                92   –    99   80   81     93                    88

Table 3 shows the recognition rates. All phenological stages were reasonably well identified, with the exception of the Growth phase. This can be explained by the temporal evolution of the crops throughout the phenological cycle. During the Prepared Soil, Adult and Post-Harvesting phases, there are small changes in the crop’s spectral response. The spectral response during the Growth phase, however, changes continuously from that of Prepared Soil to that of the Adult phase; it may therefore be close to the response of either of these two stages, or somewhere in between, which can lead to misclassification.

The confusion matrix in Table 4 confirms this interpretation. It shows that the Growth phase was often misclassified as Adult phase and Prepared Soil. The varying spectral appearance during the Growth phase could not be properly modeled by a single Gaussian distribution. This reasoning suggests that the Growth phase should be more accurately modeled upon the temporal derivative instead of the absolute values of the spectral response.
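The two summary measures used throughout this section can be recomputed from the counts of Table 4. The sketch below assumes that each row of the matrix holds the samples of one reference stage and each column one assigned stage; the rows are entered in the order in which they are printed, and the function names are illustrative.

```python
# Counts as printed in Table 4, one row per stage.
confusion = [
    [107, 2, 1, 3],
    [16, 216, 18, 17],
    [3, 27, 104, 58],
    [2, 9, 32, 877],
]


def overall_accuracy(matrix):
    """Correctly classified samples divided by all samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def average_class_accuracy(matrix):
    """Mean of the per-class rates (diagonal count over row total)."""
    per_class = [row[i] / sum(row) for i, row in enumerate(matrix)]
    return sum(per_class) / len(per_class)


print(round(100 * overall_accuracy(confusion), 1))        # 87.4
print(round(100 * average_class_accuracy(confusion), 1))  # 81.3
```

Both values agree, after rounding, with the overall accuracy (87) and the average class accuracy (81) listed in Table 3. The overall figure is dominated by the largest class, whereas the average class accuracy weights every stage equally, which is why the weak Growth phase lowers it more.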

5.2.4. The influence of sequence length

This final experiment had the objective of assessing the influence of the sequence length on the accuracy of the proposed HMM-based classification method. Sequences of lengths varying between 1 and 12 were considered. Each sequence encompassed a single crop type at all dates.

The results shown in Table 5 highlight the benefits of using multi-date images for crop recognition. Simply using two images instead of one improves the average class accuracy by 10% and the global accuracy by 13%. From there on, the inclusion of one additional image almost always brings a performance gain. The reader must bear in mind that, in a real application, each new image added to the sequence implies collecting more training samples. According to Table 5, this may be worth the additional effort.

6. Conclusion

This work proposed and evaluated the use of Hidden Markov Models for crop classification. The experimental evaluation, based on a sequence of 12 Landsat images for five crop types, indicated a clear superiority of the HMM-based method over a single-date approach as well as over an alternative multi-date classification approach. Besides delivering the highest performance among the alternatives considered in this analysis, the HMM method was also more stable, in the sense that the per-class recognition rates were more similar across crop classes.

The HMM approach also performed well in recognizing phenological stages. The exception was the Growth phase, which was frequently confused with Prepared Soil and Adult phase. This observation suggests that the vectors used to characterize the Growth phase should take into account not only the absolute spectral values but also their variation through time.
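One simple way to encode the variation through time mentioned above is to append first differences to each feature vector. The sketch below is a hypothetical illustration of that idea, not something evaluated in this work; the function name and the choice of a zero difference at the first date are assumptions.

```python
def add_temporal_differences(sequence):
    """Append to each feature vector its difference from the previous date.

    sequence: list of feature vectors (lists of numbers), one per date.
    The first date has no predecessor, so its differences are set to 0.0.
    """
    augmented = []
    previous = None
    for current in sequence:
        if previous is None:
            delta = [0.0] * len(current)
        else:
            delta = [c - p for c, p in zip(current, previous)]
        augmented.append(list(current) + delta)
        previous = current
    return augmented


# Toy one-feature sequence (invented values) over three dates.
augmented = add_temporal_differences([[2], [5], [9]])
```

When acquisition dates are irregularly spaced, dividing each difference by the time gap between the two images would be a natural variant of the same idea.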

For this work, only sequences of data associated with one crop type were considered. An extension of this method to handle sequences containing samples of more than one crop type is a further issue, worth investigating as a continuation of this work.

Our experiments also encourage a future investigation of more accurate feature probability distribution models, especially for the NDVI.

References

Ahl, D.E., Gower, S.T., Burrows, S.N., Shabanov, N.V., Myneni, R.B., Knyazikhin, Y., 2006. Monitoring spring canopy phenology of a deciduous broadleaf forest using MODIS. Remote Sens. Environ. 104 (1), 88–95.

Aurdal, L., Huseby, R.B., Vikhamar, D., Eikvil, L., Solberg, A., Solberg, R., 2005. Use of hidden Markov models and phenology for multitemporal satellite image classification: Applications to mountain vegetation classification. In: Third International Workshop on the Analysis of Multi-Temporal Remote Sensing Images: Multi Temp 2005, pp. 220–224.

Blaschke, T., Strobl, J., 2001. What’s wrong with pixels? Some recent developments interfacing remote sensing and GIS. GIS – Zeitschrift für Geoinformationssysteme 14 (6), 12–17.

Brovelli, M.A., Crespi, M., Fratarcangeli, F., Giannone, F., Realini, E., 2006. Accuracy assessment of high resolution satellite imagery by leave-one-out method. In: Proceedings of the 7th International Symposium on Spatial Accuracy Assessment in Natural Resources and Environmental Sciences, pp. 533–542.

Bunke, H., Caelli, T., 2001. Hidden Markov Models – Applications in Computer Vision. World Scientific.

Chavez Jr., P.S., 1988. An improved dark-object subtraction technique for atmospheric scattering correction of multispectral data. Remote Sens. Environ. 24 (9), 459–479.

Gürtler, S., 2003. Estimativa da área agrícola a partir de sensoriamento remoto e banco de pixels amostrais. Master Dissertation, INPE, São José dos Campos, Brazil.

Jeon, B., Landgrebe, D.A., 1992. Classification with spatio-temporal interpixel class dependency contexts. IEEE Trans. Geosci. Remote Sens. 37 (3), 1227–1233.

Johnson, R.A., Wichern, D.W., 1998. Applied Multivariate Statistical Analysis. Prentice Hall, New Jersey.

Leite, P.B.C., Feitosa, R.Q., Formaggio, A.R., Costa, G.A.O.P., Pakzad, K., Sanches, I.D., 2008. Crop type recognition based on hidden Markov models of plant phenology. In: SIBGRAPI 2008 XXI Brazilian Symposium on Computer Graphics and Image Processing, pp. 12–15.

Luiz, A.J.B., Gürtler, S., Gleriani, J.M., Epiphanio, J.C.N., Campos, R.C., 2003. Reflectância a partir do número digital de imagens ETM. In: XI Simpósio Brasileiro de Sensoriamento Remoto, pp. 2071–2078.

Mather, P.M., 1993. Computer Processing of Remotely-Sensed Images: An Introduction. John Wiley and Sons, Chichester.

Rabiner, L.R., 1989. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77 (2), 257–286.

Reed, B.C., Brown, J.F., 2005. Trend analysis of time-series phenology derived from satellite data. In: Third International Workshop on the Analysis of Multi-Temporal Remote Sensing Images: Multi Temp 2005, pp. 166–168.

Ren, J., Chen, Z., Zhou, Q., Tang, H., 2008. Regional yield estimation for winter wheat with MODIS-NDVI data in Shandong, China. Internat. J. Appl. Earth Obs. Geoinform. 10 (4), 403–413.

Sanches, I.D., 2004. Sensoriamento remoto para o levantamento espectro-temporal e estimativa de áreas de culturas agrícolas. Master Dissertation, INPE, São José dos Campos, Brazil.

Shanahan, J.F., Schepers, J.S., Francis, D.D., Varvel, G.E., Wilhelm, W.W., Tringe, J.M., Schlemmer, M.R., Major, D.J., 2001. Use of remote-sensing imagery to estimate corn grain yield. Agron. J. 93 (3), 583–589.

Soille, P., 2003. Morphological Image Analysis: Principles and Applications, second ed. Springer Verlag, Berlin.

Törmä, M., Rankinen, K., Härmä, P., 2007. Using phenological information derived from MODIS-data to aid nutrient modeling. In: IEEE 2007 International Geoscience and Remote Sensing Symposium, pp. 2298–2301.

Vincent, L., Soille, P., 1991. Watersheds in digital spaces: An efficient algorithm based on immersion simulations. IEEE Trans. Pattern Anal. Machine Intell. 13 (6), 583–598.

Wiegand, C.L., Richardson, A.J., Kanemasu, E.T., 1979. Leaf area index estimates for wheat from Landsat and their implications for evapotranspiration and crop modeling. Agron. J. 71, 336–342.

Zhang, X., Friedl, M.A., Schaaf, C.B., Strahler, A.H., Hodges, J.C.F., Gao, F., Reed, B.C., Huete, A., 2003. Monitoring vegetation phenology using MODIS. Remote Sens. Environ. 84, 471–475.
