
Research Article

Discriminant WSRC for Large-Scale Plant Species Recognition

Shanwen Zhang,1,2 Chuanlei Zhang,1 Yihai Zhu,2 and Zhuhong You1

1Department of Information Engineering, Xijing University, Xi'an 710123, China
2Tableau Software, Seattle, WA 98103, USA

Correspondence should be addressed to Chuanlei Zhang; [email protected]

Received 3 June 2017; Revised 3 November 2017; Accepted 15 November 2017; Published 25 December 2017

Academic Editor: Carlos M. Travieso-González

Copyright © 2017 Shanwen Zhang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

In sparse representation based classification (SRC) and weighted SRC (WSRC), it is time-consuming to solve the global sparse representation problem. A discriminant WSRC (DWSRC) is proposed for large-scale plant species recognition, comprising two stages. Firstly, several subdictionaries are constructed by dividing the dataset into several similar classes, and a subdictionary is chosen by the maximum similarity between the test sample and the typical sample of each similar class. Secondly, the weighted sparse representation of the test image is calculated with respect to the chosen subdictionary, and the leaf category is then assigned through the minimum reconstruction error. Different from traditional SRC and its improved approaches, we sparsely represent the test sample on a subdictionary whose base elements are the training samples of the selected similar class, instead of using a generic overcomplete dictionary over the entire training set. Thus, the complexity of solving the sparse representation problem is reduced. Moreover, DWSRC adapts to newly added leaf species without rebuilding the dictionary. Experimental results on the ICL plant leaf database show that the method has low computational complexity, achieves a high recognition rate, and can be clearly interpreted.

1. Introduction

Automatic plant species identification via computer vision and image processing techniques is very useful and important for biodiversity conservation. Plant species can be recognized by their flowers, leaves, fruits, barks, roots, and stems [1–3]. The leaf is often used for plant species identification. A leaf image can be characterized by its color, texture, and shape. In particular, leaf shape contains many classifying features, such as leaf tip, blade, leaf base, teeth, veins, apex, margin, leafstalk, insertion point, and contour. The availability and universality of relevant technologies, such as digital cameras, smartphones, the Internet of Things (IoT), remote access to databases, and many effective algorithms and techniques in image processing and machine learning, make leaf-based automatic plant species identification a reality. Mata-Montero and Carranza-Rojas [4] summed up recent efforts in plant species recognition using computer vision and machine learning techniques, as well as the main technical issues in efficiently identifying biodiversity. Wäldchen and Mäder [5] reviewed 120 peer-reviewed studies from 2005 to 2015 and compared the existing methods of plant species identification based on the classification accuracy achieved on publicly available datasets. Munisami et al. [6] proposed a plant recognition method by extracting many features, such as leaf length, width, area, perimeter, hull perimeter, hull area, color histogram, and centroid-based radial distance. Chaki et al. [7] proposed a method of characterizing and recognizing plant leaves using a combination of texture and shape features; the extracted features are invariant to translation, rotation, and scaling. Different from the existing studies that target simple leaves, Zhao et al. [8] proposed a plant identification method that accurately recognizes both simple and compound leaves. de Souza et al. [9] presented a plant identification method combining a shape descriptor, a feature selector, and a classification model; the extracted shape descriptor is invariant to rotation, translation, scale, and the start point of the edge centroid distance sequences. Liu and Kan [10] extracted many texture and shape features for leaf classification, including Gabor filters, local binary patterns, Hu moment invariants, gray level cooccurrence matrices, and Fourier descriptors

Hindawi Computational Intelligence and Neuroscience, Volume 2017, Article ID 9581292, 10 pages. https://doi.org/10.1155/2017/9581292


and applied deep belief networks with dropout to classify plant species. Vijayalakshmi and Mohan [11] extracted gray level cooccurrence matrix (GLCM) and local binary pattern (LBP) features for leaf classification. Pradeep Kumar et al. [12] presented a leaf recognition system using orthogonal moments as shape descriptors and histogram of oriented gradients (HOG) and Gabor features as texture descriptors.

Although many leaf-based plant species recognition methods have been proposed, this research remains challenging. The reasons are that (1) plant leaves are various, complex, and changeable with the seasons; a leaf image can be taken at an arbitrary position, and the relative poses of the petiole and blade often vary across leaf images; and (2) in nature, leaf shapes have large intraclass difference (as shown in Figures 1(a) and 1(b)) and high interclass similarity (as shown in Figure 1(c)).

Recently, sparse representation based classification (SRC) has received significant attention and has been proven advantageous for various applications in signal processing, computer vision, and pattern recognition [13, 14]. SRC is robust to occlusion, illumination, and noise, and has been applied to plant species identification with good recognition results. Hsiao et al. [15] proposed two SRC-based leaf image recognition frameworks for plant species identification and experimentally compared them; in the frameworks, an overcomplete dictionary is learned for sparsely representing the training images of each species. Jin et al. [16] proposed a plant species identification method using sparse representation (SR) of leaf tooth features, in which four leaf tooth features are extracted and the plant species is recognized by SRC. In practice, however, the existing SRC and WSRC would fail in leaf-based large-scale plant species recognition, because it is time-consuming to solve the l1-norm minimization problem over a dictionary composed of all training data across all classes. Several modified local SRC algorithms have been proposed to reduce the computational cost of SRC [17–19]; in these methods, the SR problem is solved for each test sample in its local neighborhood set, but the constructed dictionary is strongly dependent on the test sample. Elhamifar and Vidal [20] proposed a structured SR for face recognition, seeking the SR of a test sample using the minimum number of groups from the dictionary. The task of WSRC is to measure the significance of each training sample in representing the test samples; it can preserve the locality and similarity between the test sample and its neighboring training samples while looking for the sparse linear representation [21]. In classical SRC, WSRC, and their modified approaches, all training samples are employed to build the overcomplete dictionary so as to achieve good sparse performance [13, 14]. However, it is time-consuming to solve the SR problem through the overcomplete dictionary, especially in large-scale image recognition tasks. To improve the performance of SRC, many small-size dictionary schemes have been developed. Wright et al. [22] manually selected a few training samples to construct the dictionary. Mairal et al. [23] learned a separate dictionary for each class. Zhang et al. [24] presented a dictionary learning model to improve the SR performance, in which the test sample is recognized by only the reconstruction error of

each class-specific subdictionary instead of the features from all classes in the common subdictionary. Sun et al. [25] presented a dictionary learning model for image classification by learning a class-specific subdictionary for each class, where the subdictionary is shared by all classes.

Inspired by the recent progress in plant species identification and WSRC, we propose a discriminant weighted SRC (DWSRC) for large-scale plant species identification. In DWSRC, we calculate the sparse representation of the test sample using the minimum number of training samples, instead of employing all the training data.

The main contributions of this paper are as follows:
(1) In DWSRC, instead of constructing the l1-norm minimization problem over all of the training samples, we solve a similar problem that involves the minimum number of training samples. Thus, the computational cost of the large-scale plant species recognition task is significantly reduced.
(2) The side-effect of noisy data and outliers on the classification decision can be eliminated by selecting the similar class to sparsely represent the test sample.
(3) DWSRC integrates sparsity with both the local and global structure information of the training data in a unified framework, and pays more attention to the training data that are similar to the test sample when representing it.
(4) DWSRC integrates data structure information and discriminative information.

The rest of this paper is organized as follows. Section 2 briefly reviews SRC, weighted SRC (WSRC), and the overcomplete dictionary. Section 3 presents the discriminant weighted SRC (DWSRC) approach for large-scale plant species identification. Section 4 reports the experimental results on a popular leaf image database. Finally, Section 5 concludes the paper.

2. Related Work

2.1. SRC and WSRC. The goal of SRC and WSRC is to sparsely represent the test sample with the smallest number of nonzero coefficients in terms of the overcomplete dictionary, and to assign the test sample to the class with the minimum reconstruction error. Suppose there are n training samples X = [x1, x2, …, xn] ∈ R^(m×n) belonging to C classes, and y ∈ R^m is a test sample. The task of SRC is to solve the following l1-norm minimization problem:

a′ = arg min_a ‖y − Aa‖₂² + μ‖a‖₁,  (1)

where A is the overcomplete dictionary built from all of the training samples, a = [a1, a2, …, an]^T is the SR coefficient vector, ai (i = 1, 2, …, n) is the ith SR coefficient, associated with xi, and μ > 0 is a scalar regularization parameter.
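Problem (1) has no closed-form solution, but proximal-gradient methods solve it efficiently. The following is an illustrative Python/NumPy sketch using iterative soft-thresholding (ISTA), not the toolbox solver the paper uses; all names are our own.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_l1(A, y, mu, n_iter=500):
    """Minimize ||y - A a||_2^2 + mu * ||a||_1 by iterative soft-thresholding."""
    # Step size 1/L, where L is the Lipschitz constant of the gradient 2 A^T A.
    L = 2.0 * np.linalg.norm(A, 2) ** 2
    a = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ a - y)            # gradient of the quadratic term
        a = soft_threshold(a - grad / L, mu / L)  # proximal (shrinkage) step
    return a

# Toy check: y is built from two columns of A, so the recovered a is sparse.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
a_true = np.zeros(10)
a_true[[2, 7]] = [1.0, -0.5]
y = A @ a_true
a_hat = ista_l1(A, y, mu=0.01)
```

With a small μ and noiseless y, the recovered coefficients concentrate on the two columns that generated y.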

The weights of the training samples are often calculated from the distance or similarity between the test sample and each training sample. Similar to SRC, WSRC solves a weighted l1-norm minimization problem:

a = arg min_a ‖y − Aa‖₂² + μ‖Wa‖₁,  (2)

where W is a block-diagonal weight matrix.
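Since W is diagonal with strictly positive weights, the weighted problem (2) reduces to the unweighted form (1) by a change of variables, so any standard l1-norm solver can be reused; a short derivation:

```latex
% Let b = Wa and A' = AW^{-1}. Then
\|y - A'b\|_2^2 + \mu \|b\|_1
  = \|y - A W^{-1} W a\|_2^2 + \mu \|Wa\|_1
  = \|y - A a\|_2^2 + \mu \|Wa\|_1 ,
% so solving the standard problem in b and setting a = W^{-1} b
% recovers the WSRC solution of (2).
```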


Figure 1: Plant leaf examples. (a) 20 leaves of Aesculus; (b) 20 leaves of Crape myrtle; (c) leaves of 10 different species.

The minimization problems (1) and (2) can be solved by the sparse subspace clustering algorithm [26]. Ideally, the obtained SR coefficients are sparse, and thus the test sample can easily be assigned to its class [27]. Then y is assigned to the class with the minimum reconstruction error among all of the classes, which is defined as follows:

identity(y) = arg min_{c=1,2,…,C} ‖y − A_c a_c‖₂,  (3)

where a_c is the coefficient subvector corresponding to the cth training class in a.

2.2. Dictionary of SRC. SRC and WSRC aim to find a sparse representation of the input data in the form of a linear combination of basic elements. These elements are called atoms, and together they compose a dictionary. Atoms in the dictionary are not required to be orthogonal, and they may form an overcomplete spanning set. Although SRC, WSRC, and their improved methods have been widely used in various practical applications, how to construct the dictionary is still a challenging issue in image processing. Many schemes have been proposed to efficiently learn the overcomplete dictionary by enforcing discriminant criteria, and they achieve impressive performance on image classification, but most of the existing modified SRC methods are complex and difficult to implement [23, 24]. Moreover, most of these methods assume that the training samples of a particular class approximately

form a linear basis for any test sample belonging to that class [18, 22–25]. But in nature there are many interclass-similar leaves, as in Figure 1(c). In particular, it is time-consuming to implement these methods on a large-scale database.
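The class-assignment rule (3) from Section 2.1 can be sketched as follows, assuming the coefficient vector a has already been computed by some l1 solver; the names and the `labels` array are illustrative, not part of the paper's code.

```python
import numpy as np

def src_classify(A, y, a, labels):
    """Assign y to the class with the minimum class-wise reconstruction error (3).

    A      : (m, n) dictionary whose columns are training samples
    a      : (n,) sparse coefficient vector for y
    labels : (n,) class label of each column of A
    """
    classes = np.unique(labels)
    errors = []
    for c in classes:
        a_c = np.where(labels == c, a, 0.0)       # keep only class-c coefficients
        errors.append(np.linalg.norm(y - A @ a_c))
    return classes[int(np.argmin(errors))]
```

For example, if y is reconstructed exactly by columns of class 0, the class-0 residual is zero and class 0 is returned.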

3. Discriminant WSRC for Large-Scale Plant Species Recognition

Based on global and local SR, several WSRC and group SRC algorithms have been proposed to suppress the small nonzero coefficients. Similar to SRC, these approaches assume that the samples from a single class lie on a linear subspace, but the class label of a test sample is unknown. It is therefore important to choose the "optimal" training samples to sparsely represent the test sample. To improve the performance of SRC, a discriminant WSRC (DWSRC) method is proposed for large-scale plant species recognition. Given n training leaf images x1, x2, …, xn and m representative leaf images with distinctly different shapes, such as Linear, Oblong, Elliptic, and Ovate, where t_j (j = 1, 2, …, m) is the jth representative image, the proposed DWSRC is introduced in this section.

3.1. Subdictionary Construction. In nature there are more than 400,000 plant species. Although leaves vary in size, shape, color, and texture, we can divide


Figure 2: Plant leaf examples. (a) Simple and complex; (b) single and compound; (c) 10 kinds of leaf shapes: Linear, Oblong, Elliptic, Ovate, Obovate, Orbicular, Cordate, Obcordate, Reniform, and Deltoid.

all leaves into several similar classes by different leaf shape representation criteria, such as (1) simple and complex; (2) single and compound; (3) Linear, Oblong, Elliptic, Ovate, Obovate, Orbicular, Cordate, Obcordate, Reniform, Deltoid, and others, as shown in Figure 2.

The Gaussian kernel distance can capture the nonlinear information within the dataset to measure the distance or similarity between samples. Suppose there are two leaf images x and y; their similarity is defined by the Gaussian kernel distance

s(x, y) = exp(−‖x − y‖₂² / β²),  (4)

where β is the Gaussian kernel width, which controls the influence of large distances between x and y, and ‖x − y‖ is the Euclidean distance between x and y.
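The similarity (4) is straightforward to compute directly on gray-image matrices; a minimal Python/NumPy sketch (β is supplied by the caller; the function name is ours):

```python
import numpy as np

def gaussian_similarity(x, y, beta):
    """Similarity s(x, y) = exp(-||x - y||_2^2 / beta^2) between two gray images."""
    d2 = np.sum((x.astype(float) - y.astype(float)) ** 2)  # squared Euclidean distance
    return np.exp(-d2 / beta ** 2)
```

Identical images give s = 1, and the similarity decays toward 0 as the distance grows.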

In this section, every original leaf image is transformed into a gray image or binary image. Since any gray or binary image is a 2D matrix, ‖x − y‖ is the Euclidean distance between the two matrices x and y.

Given n training leaf images x1, x2, …, xn, in order to divide these images into m similar classes, we find m representative leaf images with distinctly different shapes, such as Linear, Oblong, Elliptic, and Ovate, where t_j (j = 1, 2, …, m) is the jth representative image. Then we set a threshold T. If s(x_i, t_j) ≥ T, x_i is assigned to the jth similar class; if s(x_i, t_j) < T for every t_j (j = 1, 2, …, m), x_i belongs to none of the m given similar classes and is assigned to the (m + 1)th similar class. Thus the n leaf images are roughly divided into m + 1 similar classes.
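The assignment rule above can be sketched as follows (an illustrative Python sketch; the `similarity` argument stands in for (4), and we resolve ties by sending each image to its most similar representative, an assumption the paper does not spell out):

```python
import numpy as np

def partition_by_similarity(images, reps, T, similarity):
    """Split images into m + 1 similar classes given m representative images.

    An image joins the class of its most similar representative when that
    similarity reaches the threshold T; otherwise it falls into class m,
    the extra (m + 1)th class. Returns one list of images per class.
    """
    m = len(reps)
    classes = [[] for _ in range(m + 1)]
    for x in images:
        sims = [similarity(x, t) for t in reps]
        j = int(np.argmax(sims))
        classes[j if sims[j] >= T else m].append(x)
    return classes
```

Images close to a representative land in that representative's class; an outlier far from every representative lands in the extra class.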

We concatenate each leaf image of a similar class into a one-dimensional vector and construct m + 1 subdictionaries for the following WSRC algorithm from all these vectors, namely A_j ∈ R^(n×n_j) (j = 1, 2, …, m + 1), where n is the dimensionality of the leaf image vector, n_j is the number of leaves in the jth similar class, and each column of A_j is a vector corresponding to a leaf image.

3.2. Discriminant WSRC. SRC can ensure sparsity, but it loses the local information of the training data. WSRC can preserve the sparsity and the similarity between the test sample and its neighboring training data, but it ignores the prior global structure information of the training data. We propose a discriminant WSRC (DWSRC) scheme to improve the classification performance of WSRC by integrating the sparsity and both the local and global structure information of the training data into a unified framework.

Given a test leaf image y and a threshold T1, we calculate by (4) the similarity s(y, t_j) (j = 1, 2, …, m) between y and the jth representative leaf image t_j. If s(y, t_r) > T1, we choose the rth similar class as the candidate similar class with the maximum similarity for classifying y, and select the corresponding rth subdictionary A_r as the candidate subdictionary. Similar to the general WSRC, DWSRC solves the following weighted l1-norm minimization problem:

a′ = arg min_a ‖y − A_r a‖₂² + μ‖W_r a‖₁.  (5)


where W_r ∈ R^(n_r×n_r) is a weighted diagonal matrix, n_r is the number of leaf images in the rth similar class, and the diagonal elements w_{r1}, w_{r2}, …, w_{rn_r} of W_r are the Gaussian kernel distances between y and the candidate samples, defined as follows:

w_{ri} = exp(‖y − x_{ri}‖₂² / β₁²),  (6)

where x_{ri} is the ith vector of the candidate similar class and β₁ is the Gaussian kernel width that adjusts the weight decay speed.

In (5), W_r penalizes the distance between y and each training sample of the candidate similar class; its objective is to avoid selecting training samples that are far from the test sample to represent it. Because the sample numbers of the different classes in the candidate similar class are different, after obtaining the SR coefficient vector a we compute the average reconstruction error of each class in the candidate similar class as follows:

m_c(y) = (1/n_c) ‖y − A_r a_c‖₂,  (7)

where a_c is the coefficient subvector associated with the cth training class in a, and n_c is the number of vectors of the cth class in the candidate similar class.

The test image is then labeled with the class that attains the minimum normalized reconstruction error m_c(y).
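Equations (6) and (7) can be sketched in Python/NumPy as follows (an illustrative sketch, not the paper's MATLAB code; the coefficient vector a is assumed to come from a weighted l1 solver, and `labels` marks the class of each column of A_r). Note the positive exponent in (6): far-away samples receive large weights and are therefore penalized by the weighted l1 term in (5).

```python
import numpy as np

def dwsrc_weights(y, A_r, beta1):
    """Diagonal weights (6): w_ri = exp(||y - x_ri||_2^2 / beta1^2)."""
    d2 = np.sum((A_r - y[:, None]) ** 2, axis=0)  # squared distance to each column
    return np.exp(d2 / beta1 ** 2)

def dwsrc_classify(y, A_r, a, labels):
    """Label y by the minimum average reconstruction error (7) over classes."""
    classes = np.unique(labels)
    errs = []
    for c in classes:
        a_c = np.where(labels == c, a, 0.0)   # coefficients of class c only
        n_c = int(np.sum(labels == c))
        errs.append(np.linalg.norm(y - A_r @ a_c) / n_c)
    return classes[int(np.argmin(errs))]
```

The nearest column gets weight 1 only at zero distance; every other column's weight grows with its distance from y.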

In the classical WSRC [21], if the diagonal elements w1, w2, …, wn of the weighted diagonal matrix W are defined as follows:

w_i = s(x_i, y), if s(x_i, y) > T; w_i = 0, otherwise,  (8)

then WSRC reduces to DWSRC.

DWSRC can generate more discriminative sparse codes, which represent the test sample more robustly by combining both linearity and locality information, improving recognition performance. DWSRC is a constrained LASSO problem [21]. Its flowchart is shown in Figure 3.

3.2.1. Feasibility Analysis. In classical SRC, the computational complexity of obtaining the SR coefficients is about O(k²n) per test sample [14, 15, 17], where k is the number of nonzero entries in the reconstruction coefficients. Ideally, all of the nonzero SR coefficients are associated with the columns of the overcomplete dictionary from a single object class, and the test sample can easily be assigned to that class. In practice, however, noise, outliers, modeling error, and many intraclass-dissimilar leaves in the training leaf image set may lead to many small nonzero SR coefficients associated with multiple species classes; the computational complexity of obtaining the SR coefficients then approaches O(n³) per test sample [17]. In DWSRC, all n training samples are split into m similar classes that are used to build m subdictionaries, and a candidate subdictionary is chosen for WSRC. Although the number of training samples in the candidate similar class is less than the total number of training samples, the dictionary completeness may not be destroyed, because the candidate training samples that are similar to the test sample are retained while the noise, outliers, and irregular samples that are dissimilar to the test sample are excluded. The SR coefficients obtained by DWSRC might not be as sparse as those achieved by classical SRC and WSRC, but the SR coefficients of the relevant training samples still have greater magnitudes, because all training samples of a candidate similar class generally play an important role in sparsely representing the test sample.

If we consider the SR coefficients of all similar classes except the candidate similar class to be zero, DWSRC can also be viewed as operating on the whole database; thus DWSRC is much sparser than SRC and WSRC. On the whole, in DWSRC, Gaussian kernel distance information is imposed on WSRC to suppress the small nonzero coefficients, so the number of small nonzero SR coefficients of DWSRC should be significantly reduced. The number of samples in the candidate similar class is about n/m. Compared with SRC and WSRC, the computational complexity of DWSRC to obtain the SR coefficients is about O(k₁²n/m) per test sample, where k₁ is the number of nonzero entries in the reconstruction coefficients obtained by DWSRC. In general, k₁ is much less than k and n, so the computational complexity of DWSRC is considerably lower than that of SRC and WSRC.

From the above analysis, it follows that DWSRC is more suitable for large-scale database classification problems.
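To make the comparison concrete, a back-of-the-envelope operation count with n = 16846 (the ICL database size used later) and m = 21 similar classes; the sparsity levels k and k1 below are purely assumed for illustration, not values reported by the paper.

```python
# Rough comparison of the per-test-sample costs quoted above:
# O(k^2 n) for SRC/WSRC versus O(k1^2 n / m) for DWSRC.
n, m = 16846, 21       # training images and similar classes (from the paper)
k, k1 = 50, 20         # assumed nonzero-coefficient counts, with k1 < k

cost_src = k ** 2 * n            # classical SRC / WSRC
cost_dwsrc = k1 ** 2 * n // m    # DWSRC on one subdictionary of ~n/m samples
speedup = cost_src / cost_dwsrc  # over two orders of magnitude here
```

Even with these modest assumed sparsity levels, the subdictionary formulation cuts the estimated cost by more than a factor of 100.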

4. Experiments and Results

In this section, we investigate the performance of the proposed DWSRC method for plant species recognition and compare it with four state-of-the-art approaches: plant species recognition using WSRC [21], Leaf Margin Sequences (LMS) [28], Manifold–Manifold Distance (MMD) [29], and Centroid Contour Distance (CCD) [30]. In the WSRC-based experiments, the parameter selection, data processing, and recognition process are similar to those in DWSRC, while in the LMS, MMD, and CCD based experiments they follow the three respective references [28–30].

The important parameters in DWSRC include the scalar regularization parameter μ, the two thresholds T and T1, and the two Gaussian kernel widths β and β₁ in (4) and (6). Because μ, β, and β₁ are not sensitive to the recognition results [13, 21, 31], μ in (5) is set to 0.001 empirically, and the two Gaussian kernel widths are set as

β = (d_max(x_i, t_j) + d_min(x_i, t_j)) / 2,
β₁ = (d_max(y, x_{ri}) + d_min(y, x_{ri})) / 2,  (9)

where d_max(x_i, t_j) and d_min(x_i, t_j) are the maximum and minimum Euclidean distances between x_i and t_j, and d_max(y, x_{ri}) and d_min(y, x_{ri}) are the maximum and minimum Euclidean distances between y and x_{ri}.
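The width rule (9), the midpoint of the extreme pairwise distances, can be sketched in a few lines (an illustrative Python helper; the name is ours):

```python
import numpy as np

def kernel_width(samples, refs):
    """Kernel width per (9): the average of the maximum and minimum
    Euclidean distances between the samples and the reference images."""
    d = [np.linalg.norm(x - t) for x in samples for t in refs]
    return (max(d) + min(d)) / 2.0
```

For instance, if the pairwise distances range from 1 to 3, the width is 2.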

All experiments are performed on a platform with a 3.0 GHz CPU and 4.0 GB RAM, using MATLAB 2011b. The MATLAB SR toolbox 1.9 (https://sites.google.com/site/sparsereptool) is used to solve the weighted l1-norm minimization problem.

Figure 3: The flowchart of the proposed method. The original database is divided into similar classes 1, 2, …, m + 1, and a subdictionary is built for each. For a test leaf image, the similar class and subdictionary with the maximum similarity max_j s(y, t_j) are chosen; the weighted l1-norm minimization problem is constructed and solved on the chosen similar class and subdictionary; and the test sample y is classified into the class with the minimum reconstruction error m_c(y).

4.1. Data Preparation. All plant species recognition experiments with DWSRC and the four competing methods are conducted on the ICL Leaf database (https://pan.baidu.com/share/link?shareid=2093145911&uk=1395063007), which was collected by the Intelligent Computing Laboratory (ICL) of the Chinese Academy of Sciences. The database contains 16846 plant leaf images from 220 plant species, with different numbers of leaf images per species and different sizes, from a minimum of 29 × 21 to a maximum of 124 × 107 pixels. Some examples are shown in Figure 4.

FromFigure 4we find that each leaf image in ICL is singlewith simple background and without obscureness twist andnoise so we ignore a lot of complex preprocessing such asfiltering and enhancement But from Figure 4(a) it can beseen that the original leaf images in ICL database are orientedat a random angle and have various shapes colors andsizes Some leaf images contain footstalks and some kinds of

leaves are similar to each other Because the sizes of the orig-inal color leaf images are different from each other we firstlytransform each original image to grayscale image and geo-metrically scale it to the same size by bilinear interpolationalgorithm From Figure 4(b) it is found that 30 cottonplant leaves have large intraclass difference with low-intensitywhite backgrounds To improve recognition rate several pre-processing steps prior to species recognition are employed asfollows(1) Convert each color leaf image into grayscale image(2) Separate the background from the leaf image by asmall threshold (we set as 30) that is the pixel value less than30 is set as 0 under gray level of 255(3) Reduce noise by a median filter of radius 10(4) Cut footstalks off through contour information todetect narrow parallel running chains and remove the longestchain [6 30](5) In order to align all leaf images extract the angle fromthe image by which the major axis of the leaf is oriented withrespect to the vertical line and then rotate the image by theangle so that the major axis is aligned with the vertical line

Computational Intelligence and Neuroscience 7

Figure 4: Leaf image samples of the ICL database. (a) Leaf image samples of different species; (b) 30 leaf images of the cotton plant (species no. 157).

finally, remove the white bounding rectangle to superimpose the leaf over a homogeneous background.
(6) Extract the leaf contour by the Canny edge detection algorithm [30].
(7) Crop each contour to the size that contains only the extents of the leaf, and then resize it to a square of 32 × 32 pixels by the bilinear interpolation algorithm.

Figure 5 shows a leaf preprocessing example obtained by taking the above steps.
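Steps (1), (2), and (5) of the pipeline above can be sketched in a few lines of NumPy. This is only an illustrative sketch (the function names are ours, not the paper's); the median filtering, footstalk removal, contour extraction, and bilinear resizing of steps (3), (4), (6), and (7) are omitted.

```python
import numpy as np

def to_gray(rgb):
    # Step (1): convert an H x W x 3 color image to grayscale (luminance weights).
    return rgb @ np.array([0.299, 0.587, 0.114])

def remove_background(gray, thresh=30):
    # Step (2): under a gray level of 255, pixels below the threshold become 0.
    out = gray.copy()
    out[out < thresh] = 0
    return out

def major_axis_angle(mask):
    # Step (5): orientation (radians) of the leaf's major axis, computed from
    # second-order central moments of the foreground pixel coordinates.
    ys, xs = np.nonzero(mask)
    xs = xs - xs.mean()
    ys = ys - ys.mean()
    mu20, mu02, mu11 = (xs * xs).mean(), (ys * ys).mean(), (xs * ys).mean()
    return 0.5 * np.arctan2(2 * mu11, mu20 - mu02)
```

Rotating the image by the complement of the returned angle would then align the major axis with the vertical line, as step (5) requires.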

4.2. Subdictionary Construction. From Figure 4(a) we see that there are innumerable, various plant leaves, but these leaves can be clustered into several similar classes by a leaf shape description or similarity measure, such as Acerose, Linear, Gladiate, Ensiform, Oblanceolate, Ovate, Oval, Obovate, and Elliptic. The number of classes can be determined by the K-means and K-fusion clustering algorithms [32]. We referred to two websites, http://theseedsite.co.uk/leafshapes.html and https://en.wikipedia.org/wiki/Glossary_of_leaf_morphology, and four references [33–36], and defined the class number as 21. In fact, in nature we can choose 20 kinds of typical leaves, as shown in Figure 6, and find the similar leaves of each typical leaf from the ICL database; we then obtain 21 similar classes.

Suppose t_i (i = 1, 2, ..., 20) is the ith representative leaf image, x_j (j = 1, 2, ..., 16846) is the jth leaf image of the ICL database, and the similarity s(t_i, x_j) between t_i and x_j is calculated by (4). Figure 7 shows the similarities between a given dentate-shaped leaf image and each leaf image of the ICL database. In order to clearly show the visual process of selecting the candidate training leaf images, Figure 7 only shows partial results, that is, about 6300 similarities.

From Figure 7 we can observe that only a few leaf images are similar to the given test leaf image. We set the threshold T = 0.5; that is, if s(t_i, x_j) ≥ 0.5, x_j is classified into the ith similar class. After most of the leaf images of the ICL database have been divided into the 20 similar classes, the remaining images are clustered into the 21st similar class. Finally, we build 21 subdictionaries by stacking all preprocessed contours of each similar class as the columns of a matrix, respectively.


Figure 5: The leaf preprocessing example. Stages shown: original image, gray image, cut footstalks, aligned image, binary image, contour.

Figure 6: 21 kinds of typical leaves. (a) 20 different kinds of typical leaves: Line, Round, Dentate, Polygon, Crenate, Lobed, Palmatifid, Lanceolate, Spatulate, Deltoid, Ovate, Obcordate, Cordate, Reniform, Hastate, Subulate, Peltate, Flabellate, Obovate, and Pinnatifid; (b) others.

Figure 7: Partial similarities between the dentate leaf image and each image of the ICL database (horizontal axis: training sample number, 0–6000; vertical axis: similarity, 0–1).

4.3. Classifying Plant Species by DWSRC. Given a new test leaf image y, we calculate the similarity between y and each typical leaf t_i (i = 1, 2, ..., 20). We set a second threshold T_1 = 0.1; that is, if s(y, t_j) < 0.1 for every j = 1, 2, ..., 20, we select the 21st similar class as the candidate similar class and A_21 as the candidate subdictionary. Otherwise, we select the candidate similar class and candidate subdictionary with the maximum similarity s(y, t_j) (j = 1, 2, ..., 20).
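The selection rule just described can be sketched as follows (a hedged NumPy sketch; the helper names `gaussian_similarity` and `select_candidate` are ours, and the similarity is the Gaussian kernel of (4) with an assumed width beta):

```python
import numpy as np

def gaussian_similarity(x, y, beta=1.0):
    # Similarity of two equal-size gray images, as in eq. (4).
    return float(np.exp(-np.linalg.norm(x - y) ** 2 / beta ** 2))

def select_candidate(y, typical, T1=0.1, beta=1.0):
    """Return the 0-based index of the candidate similar class for test image y.

    typical holds the 20 representative images t_1, ..., t_20; when every
    similarity falls below T1, the extra 21st class (index 20) is chosen."""
    sims = [gaussian_similarity(y, t, beta) for t in typical]
    best = int(np.argmax(sims))
    return best if sims[best] >= T1 else len(typical)
```

For the Figure 8 example below, where s(t_4, y) = 0.4269 is the maximum and exceeds T_1, the function would return index 3, that is, the fourth similar class.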

Figure 8: The similarity between y and each typical leaf t_i (i = 1, 2, ..., 20), where y is a cotton leaf (horizontal axis: typical leaf image number, 0–20; vertical axis: similarity, 0–0.5).

Figure 8 shows a similarity example in which the test image is a cotton plant leaf image. From Figure 8, the maximum similarity is s(t_4, y) = 0.4269. Then we choose the fourth similar class and the fourth subdictionary A_4 to construct DWSRC for classifying the test image. We calculate the SR coefficients


Table 1: Average recognition rates, standard deviations, and running times of WSRC, LMS, MMD, CCD, and DWSRC.

Method                WSRC           LMS            MMD            CCD            DWSRC
Recognition rate (%)  88.27 ± 1.89   82.73 ± 1.73   86.51 ± 1.46   81.62 ± 2.53   91.12 ± 1.25
Running time (s)      12.18          8.42           9.36           7.23           4.37

Table 2: Average recognition rates, standard deviations, and running times of WSRC, LMS, MMD, CCD, and DWSRC.

Method                WSRC           LMS            MMD            CCD            DWSRC
Recognition rate (%)  86.56 ± 2.12   82.50 ± 2.06   85.43 ± 1.52   81.17 ± 2.56   90.64 ± 1.42
Running time (s)      14.32          9.53           10.16          6.47           5.16

and recognize the test leaf by the minimum reconstruction error.

Figures 7 and 8 only illustrate the visual process of dividing the whole dataset into several similar classes and constructing the subdictionary for the SRC algorithm, where the training set consists of all images of the ICL database and the test set is any leaf image.

4.4. Experimental Results. To illustrate how DWSRC and the four competitive methods work, two kinds of experiments are conducted.
(1) From the ICL database, we randomly select 1100 images from the 220 species, with 5 images per plant species, as the test set, and the rest as the training set. The training set is divided into 21 similar classes to build the subdictionaries, and each test leaf is recognized by DWSRC, where the typical samples are shown in Figure 6(a). The random training/test partition is repeated 50 times. The average recognition results over the 50 runs of the five methods are shown in Table 1.
(2) From the ICL database, we randomly select 2000 images from the 16,846 plant leaf images as the test set and the rest as the training set, and repeat the training and testing procedure 50 times. Each time, the training set is also divided into 21 similar classes to build the subdictionaries, and each test leaf is recognized by DWSRC, where the typical samples are again those shown in Figure 6(a). The average recognition results over the 50 runs of the five methods are shown in Table 2.

In Tables 1 and 2, the best recognition performance is that of DWSRC. The main reason might be that DWSRC employs only the candidate similar class to solve the SR problem and recognize the test samples, and it makes use of the similarity relationship between the test sample and each training sample. So the computational cost is significantly reduced, and the side-effects of outliers, noise, and modeling error on the recognition performance are greatly suppressed. The recognition rate of WSRC is higher than those of LMS, MMD, and CCD, but its computational time is the longest; most of the time is taken to solve the l1-norm minimization problem. Since LMS, MMD, and CCD need to extract classifying features from each leaf image, which is time-consuming, they have lower recognition rates than WSRC and DWSRC. Because real-world leaves of the same species differ from each other, it is difficult to extract the "optimal" features from each leaf image.

5. Conclusions

For the large-scale plant species recognition problem, we proposed a new SRC approach, discriminant weighted SRC (DWSRC), which aims to exploit the locality and similarity of the original dataset and the test samples in sparsely representing the test samples. It makes use of the prior structure information of the dataset and the distances between the test sample and the training samples to construct the SR problem. To represent a test sample, we not only consider the original data structure of the training samples but also take into account the locality and the similarity between the test sample and each training sample. The SR coefficients of the test sample maintain a sparse structure and more discriminative information. DWSRC is essentially a generalization of WSRC. Unlike WSRC, we construct several subdictionaries prior to SRC by dividing all training samples into several similar classes, and then choose a candidate subdictionary for the given test sample to implement WSRC. Thus, the computational cost of solving the l1-norm minimization problem is greatly reduced. In this sense, DWSRC is well suited to large-scale plant species recognition, as it can improve the plant species recognition rate and reduce the computational cost. Recently, deep learning has led to a series of breakthroughs in many pattern recognition research areas; we will study deep learning for leaf-based species recognition in future work.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is supported by the China National Natural Science Foundation under Grant no. 61473237. This work is partially supported by the Tianjin Research Program of Application Foundation and Advanced Technology (no. 14JCYBJC42500) and the Tianjin Science and Technology Correspondent Project (no. 16JCTPJC47300). The authors would like to thank the ICL Leaf database team for their constructive advice.

References

[1] M.-E. Nilsback and A. Zisserman, "Automated flower classification over a large number of classes," in Proceedings of the 6th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP '08), pp. 722–729, December 2008.
[2] M. Seeland, M. Rzanny, N. Alaqraa, J. Wäldchen, P. Mäder, and Y. Zhang, "Plant species classification using flower images - a comparative study of local feature representations," PLoS ONE, vol. 12, no. 2, p. e0170629, 2017.
[3] N. Jamil, N. A. C. Hussin, S. Nordin, and K. Awang, "Automatic plant identification: is shape the key feature?" Procedia Computer Science, vol. 76, pp. 436–442, 2015.
[4] E. Mata-Montero and J. Carranza-Rojas, "Automated plant species identification: challenges and opportunities," IFIP Advances in Information and Communication Technology, vol. 481, pp. 26–36, 2016.
[5] J. Wäldchen and P. Mäder, "Plant species identification using computer vision techniques: a systematic literature review," Archives of Computational Methods in Engineering, pp. 1–37, 2017.
[6] T. Munisami, M. Ramsurn, S. Kishnah, and S. Pudaruth, "Plant leaf recognition using shape features and colour histogram with k-nearest neighbour classifiers," Procedia Computer Science, vol. 58, pp. 740–747, 2015.
[7] J. Chaki, R. Parekh, and S. Bhattacharya, "Plant leaf recognition using texture and shape features with neural classifiers," Pattern Recognition Letters, vol. 58, pp. 61–68, 2015.
[8] C. Zhao, S. S. F. Chan, W.-K. Cham, and L. M. Chu, "Plant identification using leaf shapes - a pattern counting approach," Pattern Recognition, vol. 48, no. 10, pp. 3203–3215, 2015.
[9] M. M. S. de Souza, F. N. S. Medeiros, G. L. B. Ramalho, I. C. de Paula Jr., and I. N. S. Oliveira, "Evolutionary optimization of a multiscale descriptor for leaf shape analysis," Expert Systems with Applications, vol. 63, pp. 375–385, 2016.
[10] N. Liu and J.-M. Kan, "Improved deep belief networks and multi-feature fusion for leaf identification," Neurocomputing, vol. 216, pp. 460–467, 2016.
[11] B. VijayaLakshmi and V. Mohan, "Kernel-based PSO and FRVM: an automatic plant leaf type detection using texture, shape, and color features," Computers and Electronics in Agriculture, vol. 125, pp. 99–112, 2016.
[12] T. P. Kumar, M. V. P. Reddy, and P. K. Bora, "Leaf identification using shape and texture features," Advances in Intelligent Systems and Computing, vol. 460, pp. 531–541, 2017.
[13] Z. Zhang, Y. Xu, J. Yang, X. Li, and D. Zhang, "A survey of sparse representation: algorithms and applications," IEEE Access, vol. 3, pp. 490–530, 2015.
[14] H. Cheng, "Sparse representation, modeling and learning in visual recognition: theory, algorithms and applications," Advances in Computer Vision & Pattern Recognition, vol. 6, pp. 1408–1425.
[15] J.-K. Hsiao, L.-W. Kang, C.-L. Chang, and C.-Y. Lin, "Learning-based leaf image recognition frameworks," Studies in Computational Intelligence, vol. 591, pp. 77–91, 2015.
[16] T. Jin, X. Hou, P. Li, and F. Zhou, "A novel method of automatic plant species identification using sparse representation of leaf tooth features," PLoS ONE, vol. 10, no. 10, Article ID e0139482, 2015.
[17] C.-G. Li, J. Guo, and H.-G. Zhang, "Local sparse representation based classification," in Proceedings of the 20th International Conference on Pattern Recognition (ICPR '10), pp. 649–652, 2010.
[18] J.-X. Mi and J.-X. Liu, "Face recognition using sparse representation-based classification on k-nearest subspace," PLoS ONE, vol. 8, no. 3, Article ID e59430, 2013.
[19] Q. Liu, "Kernel local sparse representation based classifier," Neural Processing Letters, vol. 43, no. 1, pp. 85–95, 2016.
[20] E. Elhamifar and R. Vidal, "Robust classification using structured sparse representation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 1873–1879, June 2011.
[21] Z. Fan, M. Ni, Q. Zhu, and E. Liu, "Weighted sparse representation for face recognition," Neurocomputing, vol. 151, no. 1, pp. 304–309, 2015.
[22] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210–227, 2009.
[23] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, "Discriminative learned dictionaries for local image analysis," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–8, Anchorage, Alaska, USA, June 2008.
[24] J. Zhang, D. Zhao, F. Jiang, and W. Gao, "Structural group sparse representation for image compressive sensing recovery," in Proceedings of the 2013 Data Compression Conference (DCC 2013), pp. 331–340, USA, March 2013.
[25] Y. Sun, Q. Liu, J. Tang, and D. Tao, "Learning discriminative dictionary for group sparse representation," IEEE Transactions on Image Processing, vol. 23, no. 9, pp. 3816–3828, 2014.
[26] S. S. Chen, D. L. Donoho, and M. A. Saunders, "Atomic decomposition by basis pursuit," SIAM Review, vol. 43, no. 1, pp. 129–159, 2001.
[27] X. Tang, G. Feng, and J. Cai, "Weighted group sparse representation for undersampled face recognition," Neurocomputing, vol. 145, pp. 402–415, 2014.
[28] G. Cerutti, L. Tougne, D. Coquin, and A. Vacavant, "Leaf margins as sequences: a structural approach to leaf identification," Pattern Recognition Letters, vol. 49, pp. 177–184, 2014.
[29] J.-X. Du, M.-W. Shao, C.-M. Zhai, J. Wang, Y. Tang, and C. L. P. Chen, "Recognition of leaf image set based on manifold-manifold distance," Neurocomputing, vol. 188, pp. 131–138, 2016.
[30] A. Hasim, Y. Herdiyeni, and S. Douady, "Leaf shape recognition using centroid contour distance," in Proceedings of the International Seminar on Science of Complex Natural Systems (ISS-CNS 2015), Indonesia, October 2015.
[31] B. Li, C. Wang, and D.-S. Huang, "Supervised feature extraction based on orthogonal discriminant projection," Neurocomputing, vol. 73, no. 1–3, pp. 191–196, 2009.
[32] C. C. Aggarwal and C. K. Reddy, Data Clustering: Algorithms and Applications, Chapman & Hall/CRC, 2014.
[33] http://theseedsite.co.uk/leafshapes.html.
[34] https://en.wikipedia.org/wiki/Glossary_of_leaf_morphology.
[35] H. Tsukaya, "Mechanism of leaf-shape determination," Annual Review of Plant Biology, vol. 57, pp. 477–496, 2006.
[36] J. Dkhar and A. Pareek, "What determines a leaf's shape?" EvoDevo, vol. 5, no. 1, article no. 47, 2014.




and applied deep belief networks with dropout to classify plant species. VijayaLakshmi and Mohan [11] extracted gray-level cooccurrence matrix (GLCM) and local binary pattern (LBP) features for leaf classification. Pradeep Kumar et al. [12] presented a leaf recognition system using orthogonal moments as shape descriptors, and histogram of oriented gradients (HOG) and Gabor features as texture descriptors.

Although many leaf-based plant species recognition methods have been proposed, this research still remains challenging. The reasons are that (1) plant leaves are highly various, complex, and changeable with the seasons; a leaf image can be taken at an arbitrary position, and the relative poses of the petiole and blade often vary across leaf images; and (2) in nature, leaf shapes have large intraclass differences (as shown in Figures 1(a) and 1(b)) and high interclass similarity (as shown in Figure 1(c)).

Recently, sparse representation based classification (SRC) has received significant attention and has been proven advantageous for various applications in signal processing, computer vision, and pattern recognition [13, 14]. SRC is robust to occlusion, illumination, and noise, and has been applied to plant species identification with good recognition results. Hsiao et al. [15] proposed two SRC-based leaf image recognition frameworks for plant species identification and compared them experimentally. In the frameworks, an overcomplete dictionary is learned for sparsely representing the training images of each species. Jin et al. [16] proposed a plant species identification method using sparse representation (SR) of leaf tooth features; four leaf tooth features are extracted, and the plant species is recognized by SRC. In practice, however, the existing SRC and WSRC would fail in leaf-based large-scale plant species recognition, because it is time-consuming to solve the l1-norm minimization problem over a dictionary composed of all training data across all classes. Several modified local SRC algorithms have been proposed to reduce the computational cost of SRC [17–19]. In these methods, the SR problem is solved for each test sample in its local neighborhood set; however, the constructed dictionary depends strongly on the test sample. Elhamifar and Vidal [20] proposed a structured SR for face recognition; they look for the SR of a test sample using the minimum number of groups from the dictionary. The task of WSRC is to measure the significance of each training sample in representing the test samples; it can preserve the locality and similarity between the test sample and its neighboring training samples while looking for the sparse linear representation [21]. In classical SRC, WSRC, and their modified approaches, to achieve good sparse performance, all training samples are employed to build the overcomplete dictionary [13, 14]. However, it is time-consuming to solve the SR problem through the overcomplete dictionary, especially in large-scale image recognition tasks. To improve the performance of SRC, many small-size dictionary schemes have been developed. Wright et al. [22] manually selected a few training samples to construct the dictionary. Mairal et al. [23] learned a separate dictionary for each class. Zhang et al. [24] presented a dictionary learning model to improve the SR performance; the test sample is recognized by only the reconstruction error of

each class-specific subdictionary, instead of the features from all classes in the common subdictionary. Sun et al. [25] presented a dictionary learning model for image classification by learning a class-specific subdictionary for each class, where the subdictionary is shared by all classes.

Inspired by the recent progress in plant species identification and WSRC, we propose a discriminant weighted SRC (DWSRC) for large-scale plant species identification. In DWSRC, we calculate the sparse representation of the test sample using a minimum number of training samples, instead of employing all the training data.

The main contributions of this paper are listed as follows.
(1) In DWSRC, instead of constructing the l1-norm minimization problem over all of the training samples, we solve a similar problem that involves a minimum number of training samples. Thus, the computational cost of dealing with the large-scale plant species recognition task is significantly reduced.
(2) The side-effect of noisy data and outliers on the classification decision can be eliminated by selecting the similar class used to sparsely represent the test sample.
(3) DWSRC integrates the sparsity and both the local and global structure information of the training data into a unified framework, and pays more attention to the training data that are similar to the test sample when representing it.
(4) DWSRC integrates data structure information and discriminative information.

The rest of this paper is organized as follows. In Section 2, we briefly review SRC, weighted SRC (WSRC), and the overcomplete dictionary. Section 3 presents the discriminant weighted SRC (DWSRC) approach for large-scale plant species identification. Section 4 reports the experimental results on a popular leaf image database. Finally, Section 5 concludes the paper.

2. Related Work

2.1. SRC and WSRC. The goal of SRC and WSRC is to sparsely represent the test sample by the smallest number of nonzero coefficients in terms of the overcomplete dictionary, and to assign the test sample to the class with the minimum reconstruction error. Suppose there are n training samples X = [x_1, x_2, ..., x_n] \in R^{m \times n} belonging to C classes, and y \in R^m is a test sample. The task of SRC is to solve the following l1-norm minimization problem:

a' = \arg\min_a \|y - Aa\|_2^2 + \mu \|a\|_1,  (1)

where A is an overcomplete dictionary built from all of the training samples, a = [a_1, a_2, ..., a_n]^T is an SR coefficient vector, a_i (i = 1, 2, ..., n) is the ith SR coefficient associated with x_i, and \mu > 0 is a scalar regularization parameter.

The weights of the training samples are often calculated from the distance or similarity between the test sample and each training sample. Similar to SRC, WSRC solves a weighted l1-norm minimization problem:

a = \arg\min_a \|y - Aa\|_2^2 + \mu \|Wa\|_1,  (2)

where W is a block-diagonal weight matrix.


Figure 1: Plant leaf examples. (a) 20 leaves of Aesculus; (b) 20 leaves of crape myrtle; (c) 10 leaves of different species.

The minimization problems (1) and (2) can be solved by the sparse subspace clustering algorithm [26]. Ideally, the obtained SR coefficients are sparse, and thus the test sample can easily be assigned to its class [27]. Then y is assigned to the class with the minimum reconstruction error among all of the classes, which is defined as follows:

identity(y) = \arg\min_{c=1,2,...,C} \|y - A_c a_c\|_2,  (3)

where a_c is the coefficient subvector corresponding to the cth training class in a.

2.2. Dictionary of SRC. SRC and WSRC aim to find a sparse representation of the input data in the form of a linear combination of basic elements. These elements are called atoms, and they compose a dictionary. Atoms in the dictionary are not required to be orthogonal, and they may form an overcomplete spanning set. Although SRC, WSRC, and their improved methods have been widely used in various practical applications, how to construct the dictionary is still a challenging issue in image processing. Many schemes have been proposed to efficiently learn the overcomplete dictionary by enforcing discriminant criteria, and they achieve impressive performance on image classification, but most of the existing modified SRC methods are complex and difficult to implement [23, 24]. Moreover, most of these methods assume that the training samples of a particular class approximately form a linear basis for any test sample belonging to that class [18, 22–25]. But in nature there are a lot of interclass-similar leaves, as in Figure 1(c). In particular, it is time-consuming to implement these methods on a large-scale database.
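For concreteness, the decision rule of (3) can be sketched as follows, with the per-class subvector a_c realized by zeroing the coefficients of all other classes (a minimal NumPy sketch; the name `src_identity` and the label-vector interface are ours, and the sparse coefficient vector a is assumed to have been obtained by an l1 solver):

```python
import numpy as np

def src_identity(y, A, labels, a):
    # Eq. (3): keep only the coefficient subvector a_c of each class c
    # and assign y to the class with the smallest reconstruction error.
    classes = np.unique(labels)
    errors = [np.linalg.norm(y - A @ np.where(labels == c, a, 0.0))
              for c in classes]
    return classes[int(np.argmin(errors))]
```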

3. Discriminant WSRC for Large-Scale Plant Species Recognition

Based on global and local SR, several WSRC and group SRC algorithms have been proposed to suppress the small nonzero coefficients. Similar to SRC, these approaches assume that the samples from a single class lie on a linear subspace, but we do not know the class label of a test sample. In any case, it is important to choose the "optimal" training samples to sparsely represent the test sample. To improve the performance of SRC, a discriminant WSRC (DWSRC) method is proposed for large-scale plant species recognition: given n training leaf images x_1, x_2, ..., x_n and m representative leaf images with very different shapes, such as Linear, Oblong, Elliptic, and Ovate, where t_j (j = 1, 2, ..., m) is the jth representative image, DWSRC is introduced in this section.

3.1. Subdictionary Construction. In nature there are more than 400,000 kinds of plant species. Although leaves vary in size, shape, color, and texture, we can divide


Figure 2: Plant leaf examples. (a) Simple and complex; (b) single and compound; (c) 10 kinds of leaf shapes: Linear, Oblong, Elliptic, Ovate, Obovate, Orbicular, Cordate, Obcordate, Reniform, and Deltoid.

all leaves into several similar classes by different leaf shape representation criteria, such as (1) Simple and Complex; (2) Single and Compound; (3) Linear, Oblong, Elliptic, Ovate, Obovate, Orbicular, Cordate, Obcordate, Reniform, Deltoid, and others, as shown in Figure 2.

The Gaussian kernel distance can capture the nonlinear information within the dataset to measure the distance or similarity between samples. Suppose there are two leaf images x, y; their similarity is defined by the Gaussian kernel distance

s(x, y) = \exp(-\|x - y\|_2^2 / \beta^2),  (4)

where \beta is the Gaussian kernel width, which controls the influence of an excessive distance between x and y, and \|x - y\| is the Euclidean distance between x and y.

In this section, any original leaf image is transformed into a gray or binary image. In fact, any gray or binary image is a 2-D matrix; thus \|x - y\| is the Euclidean distance between the two matrices x and y.

Given n training leaf images x_1, x_2, ..., x_n, in order to divide these images into m similar classes, we find m representative leaf images with very different shapes, such as Linear, Oblong, Elliptic, and Ovate, where t_j (j = 1, 2, ..., m) is the jth representative image. Then we set a threshold T. If s(x_i, t_j) ≥ T, x_i is assigned to the jth similar class; if s(x_i, t_j) < T for every t_j (j = 1, 2, ..., m), x_i does not belong to any of the m given similar classes and is assigned to the (m+1)th similar class. The n leaf images are thus roughly divided into m+1 similar classes.

We concatenate each leaf image of the m+1 similar classes into a one-dimensional vector and construct m+1 subdictionaries for the following WSRC algorithm from these vectors, namely A_j \in R^{n \times n_j} (j = 1, 2, ..., m+1), where n is the dimensionality of the leaf image vector, n_j is the number of leaves of the jth similar class, and each column of A_j is the vector corresponding to one leaf image.
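The partition and stacking just described might look as follows in NumPy (an illustrative sketch under our own naming; similarities follow (4) with an assumed width beta):

```python
import numpy as np

def build_subdictionaries(images, reps, T=0.5, beta=1.0):
    """Partition training images into m+1 similar classes and stack each
    class's flattened images as the columns of one subdictionary A_j.

    images: equal-size 2-D arrays; reps: the m representative images t_j."""
    m = len(reps)
    groups = [[] for _ in range(m + 1)]
    for x in images:
        sims = [np.exp(-np.linalg.norm(x - t) ** 2 / beta ** 2) for t in reps]
        j = int(np.argmax(sims))
        # images below the threshold for every t_j fall into the (m+1)th class
        groups[j if sims[j] >= T else m].append(x.ravel())
    return [np.column_stack(g) if g else None for g in groups]
```

With T = 0.5 and the 20 typical leaves of Figure 6(a) as `reps`, this mirrors the 21-subdictionary construction used in Section 4.2.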

3.2. Discriminant WSRC. SRC ensures sparsity, but it loses the local information of the training data. WSRC preserves the sparsity and the similarity between the test sample and its neighboring training data, but it ignores the prior global structure information of the training data. We propose a discriminant WSRC (DWSRC) scheme to improve the classification performance of WSRC by integrating the sparsity and both the local and global structure information of the training data into a unified framework.

Given a test leaf image y and a threshold T_1, by (4) we calculate the similarity s(y, t_j) (j = 1, 2, ..., m) between y and the jth representative leaf image t_j. If s(y, t_r) > T_1, we choose the rth similar class, the one with the maximum similarity, as the candidate similar class for classifying y, and we select the corresponding rth subdictionary A_r as the candidate subdictionary. Similar to the general WSRC, DWSRC solves the following weighted l1-norm minimization problem:

a' = \arg\min_a \|y - A_r a\|_2^2 + \mu \|W_r a\|_1,  (5)


where W_r \in R^{n_r \times n_r} is a weighted diagonal matrix, n_r is the number of leaf images of the rth similar class, and the diagonal elements w_{r1}, w_{r2}, ..., w_{rn_r} of W_r are the Gaussian kernel distances between y and the candidate samples, defined as follows:

w_{ri} = \exp(\|y - x_{ri}\|_2^2 / \beta_1^2),  (6)

where x_{ri} is the ith vector of the candidate similar class and \beta_1 is the Gaussian kernel width, which adjusts the weight decay speed.

In (5)119882119903 can penalize the distance between 119910 and eachtraining sample of the candidate similar class Its objectiveis to avoid selecting the training samples that are far fromthe test sample to represent the test sample Because thesample numbers of the different similar class are differentafter obtaining the SR coefficient vector 119886 we compute theaverage reconstruction error of each class in the candidatesimilar class as follows

m_c(y) = (1/n_c) ‖y − A_r a_c‖₂,    (7)

where a_c is the coefficient subvector associated with the cth training class in a, and n_c is the vector number of the cth class in the candidate similar class.

Then the test image is assigned to the class with the minimum normalized reconstruction error m_c(y).
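The two stages above — solving (5) and classifying by the normalized residual (7) — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper solves (5) with a MATLAB SR toolbox, while here a plain ISTA (iterative soft-thresholding) loop stands in for the weighted ℓ1 solver, and all function names are hypothetical.

```python
import numpy as np

def solve_dwsrc(y, A_r, w, mu=1e-3, n_iter=500):
    """Approximate a' = argmin_a ||y - A_r a||_2^2 + mu * ||W_r a||_1
    via ISTA, where W_r = diag(w) as in (5) and (6)."""
    Lf = 2.0 * np.linalg.norm(A_r, 2) ** 2        # Lipschitz constant of the gradient
    a = np.zeros(A_r.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * A_r.T @ (A_r @ a - y)        # gradient of the quadratic term
        z = a - grad / Lf
        a = np.sign(z) * np.maximum(np.abs(z) - mu * w / Lf, 0.0)  # weighted shrinkage
    return a

def classify(y, A_r, a, labels):
    """Label y by the minimum normalized reconstruction error m_c(y) of (7).
    `labels[i]` gives the class of the ith column of the subdictionary A_r."""
    best, best_err = None, np.inf
    for c in np.unique(labels):
        a_c = np.where(labels == c, a, 0.0)       # keep only class-c coefficients
        err = np.linalg.norm(y - A_r @ a_c) / np.sum(labels == c)
        if err < best_err:
            best, best_err = c, err
    return best
```

With a small μ, the recovered coefficients concentrate on the columns most correlated with the test sample, so the per-class normalized residual separates the classes.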

In classical WSRC [21], if the diagonal elements w_1, w_2, ..., w_n of the weighted diagonal matrix W are defined as

w_i = s(x_i, y) if s(x_i, y) > T, and w_i = 0 otherwise,    (8)

then WSRC coincides with DWSRC.

DWSRC can generate more discriminative sparse codes, which represent the test sample more robustly by combining both linearity and locality information to improve recognition performance. DWSRC is a constrained LASSO problem [21]. Its flowchart is shown in Figure 3.
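For reference, the classical weight rule (8) amounts to a one-line thresholding step. A hedged sketch (the helper name is ours; `sim` is any similarity function in the spirit of (4)):

```python
import numpy as np

def wsrc_weights(X, y, sim, T):
    """Diagonal entries of W per (8): keep s(x_i, y) when it exceeds T, else 0."""
    s = np.array([sim(x, y) for x in X])
    return np.where(s > T, s, 0.0)
```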

3.2.1. Feasibility Analysis. In classical SRC, the computational complexity of obtaining the SR coefficients is about O(k²n) per test sample [14, 15, 17], where k is the number of nonzero entries in the reconstruction coefficients. Ideally, all of the nonzero SR coefficients will be associated with the columns of the overcomplete dictionary from a single object class, and the test sample can easily be assigned to that class. In practice, however, noise, outliers, modeling error, and many intraclass-dissimilar leaves in the training leaf image set may lead to many small nonzero SR coefficients associated with multiple species classes; the computational complexity of obtaining the SR coefficients then approaches O(n³) per test sample [17]. In DWSRC, all n training samples are split into m similar classes that are used to build m subdictionaries, and a candidate subdictionary is chosen for WSRC. Although the number of training samples in the candidate similar class is less than the number of all training samples, the dictionary completeness may not be destroyed, because the candidate training samples that are similar to the test sample are retained, while the noise, outliers, and irregular samples that are dissimilar to the test sample are excluded. The SR coefficients obtained by DWSRC might not be as sparse as those achieved by classical SRC and WSRC, but the SR coefficients of the relevant training samples still have larger magnitudes, because all training samples of a candidate similar class generally play an important role in sparsely representing the test sample.

If we regard the SR coefficients of all similar classes except the candidate similar class as zeros, DWSRC can also be viewed as conducted on the whole database; in this sense DWSRC is much sparser than SRC and WSRC. On the whole, in DWSRC, Gaussian kernel distance information is imposed on WSRC to suppress the small nonzero coefficients, so the number of small nonzero SR coefficients in DWSRC should be significantly reduced. The sample number in the candidate similar class is about n/m. Compared with SRC and WSRC, the computational complexity of DWSRC for obtaining the SR coefficients is about O(k₁²n/m) per test sample, where k₁ is the number of nonzero entries in the reconstruction coefficients of DWSRC. In general, k₁ is much less than k and n, so the computational complexity of DWSRC is much lower than that of SRC and WSRC.

The above analysis indicates that DWSRC is more suitable for large-scale database classification problems.

4. Experiments and Results

In this section we investigate the performance of the proposed DWSRC method for plant species recognition and compare it with four state-of-the-art approaches: plant species recognition using WSRC [21], Leaf Margin Sequences (LMS) [28], Manifold–Manifold Distance (MMD) [29], and Centroid Contour Distance (CCD) [30]. In the WSRC-based experiments, the parameter selection, data processing, and recognition process are similar to those in DWSRC, while in the LMS-, MMD-, and CCD-based experiments they follow the three references [28–30], respectively.

The important parameters in DWSRC include the scalar regularization parameter μ, the two thresholds T and T₁, and the two Gaussian kernel widths β and β₁ in (4) and (6). Among them, since μ, β, and β₁ are not sensitive to the recognition results [13, 21, 31], μ in (5) is set to 0.001 empirically, and the two Gaussian kernel widths are set as

β = (d_max(x_i, t_j) + d_min(x_i, t_j)) / 2,
β₁ = (d_max(y, x_{ri}) + d_min(y, x_{ri})) / 2,    (9)

where d_max(x_i, t_j) and d_min(x_i, t_j) are the maximum and minimum Euclidean distances between x_i and t_j, and d_max(y, x_{ri}) and d_min(y, x_{ri}) are the maximum and minimum Euclidean distances between y and x_{ri}.
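The width rule (9) is straightforward once the pairwise Euclidean distances are in hand. A small sketch with illustrative names and toy data (not from the paper):

```python
import numpy as np

def kernel_width(dists):
    """Midpoint of the extreme distances, as in (9)."""
    return (np.max(dists) + np.min(dists)) / 2.0

# Toy candidate samples x_ri and a test vector y
X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
y = np.array([0.0, 0.0])
d = np.linalg.norm(X - y, axis=1)   # distances 0, 5, 10
beta1 = kernel_width(d)             # (10 + 0) / 2 = 5.0
```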

All experiments are performed on a platform with a 3.0 GHz CPU and 4.0 GB RAM using MATLAB 2011b. The MATLAB SR Toolbox 1.9 (https://sites.google.com/site/sparsereptool) is used to solve the weighted ℓ1-norm minimization problem.

[Figure 3: The flowchart of the proposed method — divide the original database into similar classes 1, ..., m + 1 and build one subdictionary per class; for a test leaf image, choose the similar class and subdictionary with the maximum similarity max_j s(y, t_j); construct and solve the weighted ℓ1-norm minimization problem on the chosen subdictionary; classify the test sample y into the class with the minimum reconstruction error.]

4.1. Data Preparation. All plant species recognition experiments with DWSRC and the four competing methods are conducted on the ICL Leaf database (https://pan.baidu.com/share/link?shareid=2093145911&uk=1395063007), which was collected by the Intelligent Computing Laboratory (ICL) of the Chinese Academy of Sciences. The database contains 16,846 plant leaf images from 220 plant species, with different numbers of leaf images per species and sizes ranging from a minimum of 29 × 21 to a maximum of 124 × 107. Some examples are shown in Figure 4.

From Figure 4 we find that each leaf image in ICL is single, with a simple background and without obscureness, twist, or noise, so we omit complex preprocessing such as filtering and enhancement. But from Figure 4(a) it can be seen that the original leaf images in the ICL database are oriented at random angles and have various shapes, colors, and sizes; some leaf images contain footstalks, and some kinds of leaves are similar to each other. Because the sizes of the original color leaf images differ, we first transform each original image into a grayscale image and geometrically scale it to the same size by the bilinear interpolation algorithm. From Figure 4(b) it is found that the 30 cotton plant leaves have large intraclass differences with low-intensity white backgrounds. To improve the recognition rate, several preprocessing steps are applied prior to species recognition, as follows.

(1) Convert each color leaf image into a grayscale image.
(2) Separate the background from the leaf image by a small threshold (we set it as 30); that is, a pixel value less than 30 is set to 0 under a gray level of 255.
(3) Reduce noise by a median filter of radius 10.
(4) Cut footstalks off through contour information, detecting narrow parallel running chains and removing the longest chain [6, 30].
(5) In order to align all leaf images, extract the angle by which the major axis of the leaf is oriented with respect to the vertical line, rotate the image by that angle so that the major axis is aligned with the vertical line, and finally remove the white bounding rectangle to superimpose the leaf over a homogeneous background.
(6) Extract the leaf contour by the Canny edge algorithm [30].
(7) Crop each contour to contain only the extents of the leaf, and then resize it to a square size of 32 × 32 by the bilinear interpolation algorithm.

[Figure 4: Leaf image samples of the ICL database: (a) leaf image samples of different species; (b) 30 leaf images of the cotton plant.]
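Steps (2) and (7) are easy to mirror in code. The sketch below uses helper names of our own choosing and omits the paper's median filtering, footstalk removal, and Canny edge extraction; it shows only the background threshold and a hand-rolled bilinear resize:

```python
import numpy as np

def suppress_background(gray, thresh=30):
    """Step (2): zero out background pixels below the gray-level threshold."""
    out = gray.copy()
    out[out < thresh] = 0
    return out

def bilinear_resize(img, size=(32, 32)):
    """Step (7): rescale a 2D grayscale image with bilinear interpolation."""
    h, w = img.shape
    H, W = size
    rows = np.linspace(0, h - 1, H)               # target rows in source coords
    cols = np.linspace(0, w - 1, W)
    r0 = np.floor(rows).astype(int); r1 = np.minimum(r0 + 1, h - 1)
    c0 = np.floor(cols).astype(int); c1 = np.minimum(c0 + 1, w - 1)
    fr = (rows - r0)[:, None]; fc = (cols - c0)[None, :]
    top = img[np.ix_(r0, c0)] * (1 - fc) + img[np.ix_(r0, c1)] * fc
    bot = img[np.ix_(r1, c0)] * (1 - fc) + img[np.ix_(r1, c1)] * fc
    return top * (1 - fr) + bot * fr
```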

Figure 5 shows a leaf preprocessing example obtained by the above steps.

4.2. Subdictionary Construction. From Figure 4(a) we see that there are innumerable various plant leaves, but these leaves can be clustered into several similar classes by leaf shape description or similarity measure, such as Acerose, Linear, Gladiate, Ensiform, Oblanceolate, Ovate, Oval, Obovate, and Elliptic. The number of classes can be determined by K-means clustering and K-fusion clustering algorithms [32]. We referred to two websites, http://theseedsite.co.uk/leafshapes.html and https://en.wikipedia.org/wiki/Glossary_of_leaf_morphology, and four references [33–36], and defined the class number as 21. In fact, in nature we can choose 20 kinds of typical leaves, as shown in Figure 6, and find the similar leaves of each typical leaf in the ICL database; we then obtain 21 similar classes.

Suppose t_i (i = 1, 2, ..., 20) is the ith representative leaf image, x_j (j = 1, 2, ..., 16846) is the jth leaf image of the ICL database, and the similarity s(t_i, x_j) between t_i and x_j is calculated by (4). Figure 7 shows the similarities between a given dentate-shape leaf image and each leaf image of the ICL database. In order to clearly show the visual graph of selecting the candidate training leaf images, Figure 7 only shows partial results, that is, about 6300 similarities.

From Figure 7 we can observe that only a few leaf images are similar to the given test leaf image. We set the threshold T = 0.5; that is, if s(t_i, x_j) ≥ 0.5, x_j is classified into the ith similar class. After most of the leaf images of the ICL database are divided into the 20 similar classes, the remaining images are clustered into the 21st similar class. Finally, we build 21 subdictionaries by stacking all preprocessed contours of each similar class as the columns of a matrix, respectively.
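The partition just described can be sketched as follows. The function name is hypothetical; `sim` stands for the Gaussian similarity of (4), T = 0.5, and images matching no typical leaf fall into the extra class:

```python
import numpy as np

def build_subdictionaries(images, typicals, sim, T=0.5):
    """Assign each image to the most similar typical leaf if the similarity
    clears T; leftovers form the extra (m+1)th similar class. Returns one
    subdictionary (columns = vectorized images) per class, or None if empty."""
    m = len(typicals)
    groups = [[] for _ in range(m + 1)]
    for x in images:
        sims = [sim(x, t) for t in typicals]
        j = int(np.argmax(sims))
        if sims[j] >= T:
            groups[j].append(x.ravel())      # into the jth similar class
        else:
            groups[m].append(x.ravel())      # remaining images: class m + 1
    return [np.column_stack(g) if g else None for g in groups]
```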


[Figure 5: The leaf preprocessing example — original image, gray image, cut footstalks, aligned image, binary image, contour.]

[Figure 6: 21 kinds of typical leaves: (a) 20 different kinds of typical leaves (Line, Round, Dentate, Polygon, Crenate, Lobed, Palmatifid, Lanceolate, Spatulate, Deltoid, Ovate, Obcordate, Cordate, Reniform, Hastate, Subulate, Peltate, Flabellate, Obovate, Pinnatifid); (b) others.]

[Figure 7: Partial similarities between the dentate leaf image and each image of the ICL database (similarity versus training sample number).]

4.3. Classifying Plant Species by DWSRC. Given a new test leaf image y, we calculate the similarity between y and each typical leaf t_i (i = 1, 2, ..., 20). We set the second threshold T₁ = 0.1; that is, if s(y, t_j) < 0.1 for every j = 1, 2, ..., 20, we select the 21st similar class as the candidate similar class and A₂₁ as the candidate subdictionary. Otherwise, we select the candidate similar class and candidate subdictionary with the maximum similarity s(y, t_j) (j = 1, 2, ..., 20).
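This selection rule is a few lines of code. A sketch with assumed names: `sim` is the similarity of (4), and the returned index m stands for the fallback 21st class:

```python
import numpy as np

def choose_candidate(y, typicals, sim, T1=0.1):
    """Pick the candidate similar class for test image y: the typical leaf
    with maximum similarity, or the fallback class (index m) when every
    similarity falls below the threshold T1."""
    sims = np.array([sim(y, t) for t in typicals])   # s(y, t_j), j = 1..m
    if sims.max() < T1:
        return len(typicals)                          # the (m+1)th similar class
    return int(np.argmax(sims))
```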

[Figure 8: The similarity between y and each typical leaf t_i (i = 1, 2, ..., 20), where y is a cotton leaf.]

Figure 8 is a similarity example where the test image is a cotton plant leaf image. From Figure 8, the maximum similarity is s(t₄, y) = 0.4269. Then we choose the fourth similar class and the fourth subdictionary A₄ to construct DWSRC for classifying the test image. We calculate the SR coefficients


Table 1: Average recognition rates (%), standard deviations, and running time of WSRC, LMS, MMD, CCD, and DWSRC.

Method             WSRC           LMS            MMD            CCD            DWSRC
Recognition rate   88.27 ± 1.89   82.73 ± 1.73   86.51 ± 1.46   81.62 ± 2.53   91.12 ± 1.25
Running time (s)   12.18          8.42           9.36           7.23           4.37

Table 2: Average recognition rates (%), standard deviations, and running time of WSRC, LMS, MMD, CCD, and DWSRC.

Method             WSRC           LMS            MMD            CCD            DWSRC
Recognition rate   86.56 ± 2.12   82.50 ± 2.06   85.43 ± 1.52   81.17 ± 2.56   90.64 ± 1.42
Running time (s)   14.32          9.53           10.16          6.47           5.16

and recognize the test leaf by the minimum reconstruction error.

Figures 7 and 8 only illustrate the visual process of dividing the whole dataset into several similar classes and constructing the subdictionary for the SRC algorithm, where the training set is all images of the ICL database and the test set is any leaf image.

4.4. Experimental Results. To illustrate how DWSRC and the four competing methods work, two kinds of experiments are conducted.

(1) From the ICL database we randomly select 1100 images from the 220 species, with 5 images per plant species, as the test set and the rest as the training set. The training set is divided into 21 similar classes to build the subdictionaries, and each test leaf is recognized by DWSRC, where the typical samples are shown in Figure 6(a). The random training/test partition is repeated 50 times. The average recognition results of the five methods over the 50 experiments are shown in Table 1.

(2) From the ICL database we randomly select 2000 images from the 16,846 plant leaf images as the test set and the rest as the training set, and repeat the training and testing procedure 50 times. In each run the training set is also divided into 21 similar classes to build the subdictionaries, and each test leaf is recognized by DWSRC, where the typical samples are also shown in Figure 6(a). The average recognition results of the five methods over the 50 experiments are shown in Table 2.

In Tables 1 and 2 the bold fonts highlight the best recognition performance. From the two tables we can see that the recognition performance of DWSRC is the best. The main reason might be that DWSRC employs only the candidate similar class to solve the SR problem and recognize the test samples, and it makes use of the similarity relationship between the test sample and each training sample. So the computational cost is significantly reduced, and the side-effects of outliers, noise, and modeling error on the recognition performance are greatly suppressed. The recognition rate of WSRC is higher than those of LMS, MMD, and CCD, but its computational time is the longest; most of the time is taken to solve the ℓ1-norm minimization problem. Since LMS, MMD, and CCD need to extract the classifying features from each leaf image, which is time-consuming, they have lower recognition rates than WSRC and DWSRC. Because real-world leaves of the same species differ from each other, it is difficult to extract the "optimal" features from each leaf image.

5. Conclusions

For the large-scale plant species recognition problem, we proposed a new SRC approach, discriminant weighted SRC (DWSRC), which aims to exploit the locality and similarity of the original dataset and the test samples in sparsely representing the test samples. It makes use of the prior structure information of the dataset and the distances between the test sample and the training samples to construct the SR problem. To represent a test sample, we not only consider the original data structure of the training samples but also take into account the locality and the similarity between the test sample and each training sample. The SR coefficients of the test sample maintain a sparse structure and carry more discriminative information. DWSRC is essentially a generalization of WSRC: unlike WSRC, we construct several subdictionaries prior to SRC by dividing all training samples into several similar classes, and then choose a candidate subdictionary for a given test sample on which to implement WSRC. Thus the computational cost of solving the ℓ1-norm minimization problem is greatly reduced. In this sense, DWSRC is well suited to large-scale plant species recognition: it can improve the plant species recognition rate while reducing the computational cost. Recently, deep learning has led to a series of breakthroughs in many pattern recognition research areas; we will study deep learning for leaf-based species recognition in future work.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is supported by the China National Natural Science Foundation under Grant no. 61473237. This work is partially supported by the Tianjin Research Program of Application Foundation and Advanced Technology no. 14JCYBJC42500 and Tianjin Science and Technology Correspondent Project no. 16JCTPJC47300. The authors would like to thank the ICL Leaf database for their constructive advice.

References

[1] M.-E. Nilsback and A. Zisserman, "Automated flower classification over a large number of classes," in Proceedings of the 6th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP '08), pp. 722–729, December 2008.


[2] M. Seeland, M. Rzanny, N. Alaqraa, J. Waldchen, P. Mader, and Y. Zhang, "Plant species classification using flower images—A comparative study of local feature representations," PLoS ONE, vol. 12, no. 2, p. e0170629, 2017.

[3] N. Jamil, N. A. C. Hussin, S. Nordin, and K. Awang, "Automatic Plant Identification: Is Shape the Key Feature?" Procedia Computer Science, vol. 76, pp. 436–442, 2015.

[4] E. Mata-Montero and J. Carranza-Rojas, "Automated plant species identification: Challenges and opportunities," IFIP Advances in Information and Communication Technology, vol. 481, pp. 26–36, 2016.

[5] J. Waldchen and P. Mader, "Plant Species Identification Using Computer Vision Techniques: A Systematic Literature Review," Archives of Computational Methods in Engineering, pp. 1–37, 2017.

[6] T. Munisami, M. Ramsurn, S. Kishnah, and S. Pudaruth, "Plant Leaf Recognition Using Shape Features and Colour Histogram with K-nearest Neighbour Classifiers," Procedia Computer Science, vol. 58, pp. 740–747, 2015.

[7] J. Chaki, R. Parekh, and S. Bhattacharya, "Plant leaf recognition using texture and shape features with neural classifiers," Pattern Recognition Letters, vol. 58, pp. 61–68, 2015.

[8] C. Zhao, S. S. F. Chan, W.-K. Cham, and L. M. Chu, "Plant identification using leaf shapes—A pattern counting approach," Pattern Recognition, vol. 48, no. 10, pp. 3203–3215, 2015.

[9] M. M. S. de Souza, F. N. S. Medeiros, G. L. B. Ramalho, I. C. de Paula Jr., and I. N. S. Oliveira, "Evolutionary optimization of a multiscale descriptor for leaf shape analysis," Expert Systems with Applications, vol. 63, pp. 375–385, 2016.

[10] N. Liu and J.-M. Kan, "Improved deep belief networks and multi-feature fusion for leaf identification," Neurocomputing, vol. 216, pp. 460–467, 2016.

[11] B. VijayaLakshmi and V. Mohan, "Kernel-based PSO and FRVM: An automatic plant leaf type detection using texture, shape, and color features," Computers and Electronics in Agriculture, vol. 125, pp. 99–112, 2016.

[12] T. P. Kumar, M. V. P. Reddy, and P. K. Bora, "Leaf identification using shape and texture features," Advances in Intelligent Systems and Computing, vol. 460, pp. 531–541, 2017.

[13] Z. Zhang, Y. Xu, J. Yang, X. Li, and D. Zhang, "A survey of sparse representation: algorithms and applications," IEEE Access, vol. 3, pp. 490–530, 2015.

[14] H. Cheng, "Sparse Representation, Modeling and Learning in Visual Recognition: Theory, Algorithms and Applications," Advances in Computer Vision & Pattern Recognition, vol. 6, pp. 1408–1425.

[15] J.-K. Hsiao, L.-W. Kang, C.-L. Chang, and C.-Y. Lin, "Learning-based leaf image recognition frameworks," Studies in Computational Intelligence, vol. 591, pp. 77–91, 2015.

[16] T. Jin, X. Hou, P. Li, and F. Zhou, "A novel method of automatic plant species identification using sparse representation of leaf tooth features," PLoS ONE, vol. 10, no. 10, Article ID e0139482, 2015.

[17] C.-G. Li, J. Guo, and H.-G. Zhang, "Local sparse representation based classification," in Proceedings of the 20th International Conference on Pattern Recognition (ICPR '10), pp. 649–652, 2010.

[18] J.-X. Mi and J.-X. Liu, "Face Recognition Using Sparse Representation-Based Classification on K-Nearest Subspace," PLoS ONE, vol. 8, no. 3, Article ID e59430, 2013.

[19] Q. Liu, "Kernel Local Sparse Representation Based Classifier," Neural Processing Letters, vol. 43, no. 1, pp. 85–95, 2016.

[20] E. Elhamifar and R. Vidal, "Robust classification using structured sparse representation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 1873–1879, June 2011.

[21] Z. Fan, M. Ni, Q. Zhu, and E. Liu, "Weighted sparse representation for face recognition," Neurocomputing, vol. 151, no. 1, pp. 304–309, 2015.

[22] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210–227, 2009.

[23] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, "Discriminative learned dictionaries for local image analysis," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–8, Anchorage, Alaska, USA, June 2008.

[24] J. Zhang, D. Zhao, F. Jiang, and W. Gao, "Structural group sparse representation for image compressive sensing recovery," in Proceedings of the 2013 Data Compression Conference (DCC 2013), pp. 331–340, USA, March 2013.

[25] Y. Sun, Q. Liu, J. Tang, and D. Tao, "Learning discriminative dictionary for group sparse representation," IEEE Transactions on Image Processing, vol. 23, no. 9, pp. 3816–3828, 2014.

[26] S. S. Chen, D. L. Donoho, and M. A. Saunders, "Atomic decomposition by basis pursuit," SIAM Review, vol. 43, no. 1, pp. 129–159, 2001.

[27] X. Tang, G. Feng, and J. Cai, "Weighted group sparse representation for undersampled face recognition," Neurocomputing, vol. 145, pp. 402–415, 2014.

[28] G. Cerutti, L. Tougne, D. Coquin, and A. Vacavant, "Leaf margins as sequences: A structural approach to leaf identification," Pattern Recognition Letters, vol. 49, pp. 177–184, 2014.

[29] J.-X. Du, M.-W. Shao, C.-M. Zhai, J. Wang, Y. Tang, and C. L. P. Chen, "Recognition of leaf image set based on manifold-manifold distance," Neurocomputing, vol. 188, pp. 131–138, 2016.

[30] A. Hasim, Y. Herdiyeni, and S. Douady, "Leaf Shape Recognition using Centroid Contour Distance," in Proceedings of the International Seminar on Science of Complex Natural Systems (ISS-CNS 2015), Indonesia, October 2015.

[31] B. Li, C. Wang, and D.-S. Huang, "Supervised feature extraction based on orthogonal discriminant projection," Neurocomputing, vol. 73, no. 1-3, pp. 191–196, 2009.

[32] C. C. Aggarwal and C. K. Reddy, Data Clustering: Algorithms and Applications, Chapman & Hall/CRC, 2014.

[33] http://theseedsite.co.uk/leafshapes.html

[34] https://en.wikipedia.org/wiki/Glossary_of_leaf_morphology

[35] H. Tsukaya, "Mechanism of leaf-shape determination," Annual Review of Plant Biology, vol. 57, pp. 477–496, 2006.

[36] J. Dkhar and A. Pareek, "What determines a leaf's shape?" EvoDevo, vol. 5, no. 1, article no. 47, 2014.




[Figure 1: Plant leaf examples: (a) 20 leaves of Aesculus; (b) 20 leaves of Crape myrtle; (c) 10 leaves of different species.]

The minimization problems of (1) and (2) can be solved by the sparse subspace clustering algorithm [26]. Ideally, the obtained SR coefficients are sparse, and thus the test sample can easily be assigned to its class [27]. Then y is assigned to the class with the minimum reconstruction error among all of the classes, which is defined as follows:

identity(y) = argmin_{c=1,2,...,C} ‖y − A_c a_c‖₂,    (3)

where a_c is the coefficient subvector corresponding to the cth training class in a.

2.2. Dictionary of SRC. SRC and WSRC aim to find a sparse representation of the input data in the form of a linear combination of basic elements. These elements are called atoms, and together they compose a dictionary. Atoms in the dictionary are not required to be orthogonal, and they may form an overcomplete spanning set. Although SRC, WSRC, and their improved methods have been widely used in various practical applications, how to construct the dictionary is still a challenging issue in image processing. Many schemes have been proposed to efficiently learn the overcomplete dictionary by enforcing some discriminant criteria, and they achieve impressive performance on image classification, but most of the existing modified SRC methods are complex and difficult to implement [23, 24]. Moreover, most of these methods assume that the training samples of a particular class approximately form a linear basis for any test sample belonging to that class [18, 22–25]. But in nature there are many interclass-similar leaves, such as those in Figure 1(c). In particular, it is time-consuming to implement these methods on a large-scale database.

3. Discriminant WSRC for Large-Scale Plant Species Recognition

Based on global and local SR, several WSRC and group SRC algorithms have been proposed to suppress the small nonzero coefficients. Similar to SRC, these approaches assume that the samples from a single class lie on a linear subspace, but the class label of a test sample is unknown. It is therefore important to choose the "optimal" training samples to sparsely represent the test sample. To improve the performance of SRC, a discriminant WSRC (DWSRC) method is proposed for large-scale plant species recognition: given n training leaf images x_1, x_2, ..., x_n and m representative leaf images with largely different shapes, such as Linear, Oblong, Elliptic, and Ovate, where t_j (j = 1, 2, ..., m) is the jth representative image, the corresponding WSRC scheme is introduced in this section.

3.1. Subdictionary Construction. In nature there are more than 400,000 kinds of plant species. Although leaves vary in size, shape, color, and texture, we can divide


[Figure 2: Plant leaf examples: (a) simple and complex; (b) single and compound; (c) 10 kinds of leaf shapes (Linear, Oblong, Elliptic, Ovate, Obovate, Orbicular, Cordate, Obcordate, Reniform, Deltoid).]

all leaves into several similar classes by different leaf shape representation criteria, such as (1) simple and complex; (2) single and compound; (3) Linear, Oblong, Elliptic, Ovate, Obovate, Orbicular, Cordate, Obcordate, Reniform, Deltoid, and others, as shown in Figure 2.

The Gaussian kernel distance can capture the nonlinear information within the dataset to measure the distance or similarity between samples. Suppose there are two leaf images x and y; their similarity is defined by the Gaussian kernel distance

s(x, y) = exp(−‖x − y‖₂² / (2β²)),    (4)

where β is the Gaussian kernel width that controls the excessive distance between x and y, and ‖x − y‖ is the Euclidean distance between x and y.

In this section, any original leaf image is transformed into a gray image or binary image. In fact, any gray or binary image is a 2D matrix; thus ‖x − y‖ is the Euclidean distance between the two matrices x and y.
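As a concrete sketch (our own helper, not the authors' code), the similarity of (4) on two equal-size gray images is:

```python
import numpy as np

def similarity(x, y, beta):
    """Gaussian kernel similarity of (4); x and y are 2D gray images of the
    same size, and ||x - y|| is the Euclidean (Frobenius) distance."""
    d2 = np.sum((x.astype(float) - y.astype(float)) ** 2)
    return float(np.exp(-d2 / (2.0 * beta ** 2)))
```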

Given n training leaf images x_1, x_2, ..., x_n, in order to divide these images into m similar classes, we find m representative leaf images with largely different shapes, such as Linear, Oblong, Elliptic, and Ovate, where t_j (j = 1, 2, ..., m) is the jth representative image. Then we set up a threshold T. If s(x_i, t_j) ≥ T, x_i is assigned to the jth similar class; if s(x_i, t_j) < T for every t_j (j = 1, 2, ..., m), x_i does not belong to the m given similar classes and is assigned to the (m + 1)th similar class. The n leaf images are thus roughly divided into m + 1 similar classes.

We concatenate each leaf image of the m + 1 similar classes into a one-dimensional vector and construct m + 1 subdictionaries for the following WSRC algorithm from all these vectors, namely, A_j ∈ R^{n × n_j} (j = 1, 2, ..., m + 1), where n is the dimensionality of the leaf image vector, n_j is the leaf number of the jth similar class, and each column of A_j is the vector corresponding to one leaf image.
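Vectorizing a class into its subdictionary is a single stacking step; for instance (toy data, assuming row-major flattening):

```python
import numpy as np

# Each leaf image of a similar class becomes one column of its
# subdictionary A_j (rows = the n pixels, columns = the n_j class members).
imgs = [np.arange(4).reshape(2, 2), np.ones((2, 2))]   # two toy 2x2 "leaf images"
A_j = np.column_stack([im.ravel() for im in imgs])
print(A_j.shape)  # (4, 2)
```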

32 Discriminant WSRC SRC can ensure sparsity whileit loses the local information of the training data WSRCcan preserve the sparsity and the similarity between thetest sample and its neighbor training data while it ignoresthe prior global structure information of the training dataWe proposed a discriminant WSRC (DWSRC) scheme toimprove the classifying performance of WSRC by integratingthe sparsity and both locality and global structure informa-tion of the training data into a unified framework

Given a test leaf image 119910 and a threshold 1198791 by (4) wecalculate the similarity 119904(119910 119905119895) (119895 = 1 2 119898) between 119910and the jth representative leaf image 119905119895 (119895 = 1 2 119898) If119904(119910 119905119903) gt 1198791 choose the 119903th similar class as the candidatesimilar class with the maximum similarity for classifying119910 and select the corresponding 119903th subdictionary 119860119903 asthe candidate subdictionary Similar to the general WSRCDWSRC solves the followingweighted 1198971-normminimizationproblem

a' = argmin_a ||y − A_r a||_2^2 + μ ||W_r a||_1, (5)

Computational Intelligence and Neuroscience 5

where W_r ∈ R^(n_r×n_r) is a weighted diagonal matrix, n_r is the number of leaf images in the rth similar class, and the diagonal elements w_r1, w_r2, ..., w_rn_r of W_r are the Gaussian kernel distances between y and the candidate samples, defined as follows:

w_ri = exp(||y − x_ri||_2^2 / β1^2), (6)

where x_ri is the ith vector of the candidate similar class and β1 is the Gaussian kernel width that adjusts the weight decay speed.
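Problem (5) with the weights of (6) is a weighted lasso, which admits simple first-order solvers. The sketch below uses iterative soft-thresholding (ISTA) with per-coordinate thresholds t·μ·w_i; it is a plain stand-in for the SR toolbox the paper uses, under the assumption that the weights grow with the distance to y so that far-away training samples are penalized more.

```python
import numpy as np

def candidate_weights(y, A, beta1):
    """Diagonal of W_r, eq. (6): w_i = exp(||y - x_i||^2 / beta1^2), so
    columns of A that are far from the test sample y get a heavier l1 penalty."""
    d2 = np.sum((A - y[:, None]) ** 2, axis=0)
    return np.exp(d2 / beta1 ** 2)

def weighted_l1_ista(y, A, w, mu=1e-3, n_iter=500):
    """Solve min_a ||y - A a||_2^2 + mu * ||W a||_1 by iterative
    soft-thresholding (ISTA), with W = diag(w)."""
    L = 2 * np.linalg.norm(A, 2) ** 2 + 1e-12   # Lipschitz constant of the gradient
    t = 1.0 / L
    a = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = 2 * A.T @ (A @ a - y)            # gradient of the data-fit term
        z = a - t * grad
        a = np.sign(z) * np.maximum(np.abs(z) - t * mu * w, 0.0)
    return a
```

With A the identity and uniform weights, the solver reduces to coordinate-wise soft-thresholding of y, which is easy to verify by hand.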

In (5), W_r penalizes the distance between y and each training sample of the candidate similar class. Its objective is to avoid selecting training samples that are far from the test sample when representing it. Because the sample numbers of the different similar classes differ, after obtaining the SR coefficient vector a we compute the average reconstruction error of each class in the candidate similar class as follows:

m_c(y) = (1/n_c) ||y − A_r a_c||_2, (7)

where a_c is the coefficient subvector of a associated with the cth training class and n_c is the number of vectors of the cth class in the candidate similar class.

Then the test image is assigned to the class with the minimum normalized reconstruction error m_c(y).
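The classification rule of (7), a minimal sketch assuming a per-column label list for the candidate subdictionary (the helper name `classify_by_residual` is ours):

```python
import numpy as np

def classify_by_residual(y, A_r, a, labels):
    """Label y by the class within the candidate similar class whose coefficient
    subvector best reconstructs it: m_c(y) = (1/n_c) * ||y - A_r a_c||_2."""
    errors = {}
    for c in set(labels):
        mask = np.array([l == c for l in labels])
        a_c = np.where(mask, a, 0.0)            # keep only class-c coefficients
        errors[c] = np.linalg.norm(y - A_r @ a_c) / mask.sum()
    return min(errors, key=errors.get), errors
```

Zeroing all coefficients except those of one class and measuring the residual is the same class-wise reconstruction test used in classical SRC, here normalized by the class size n_c.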

In classical WSRC [21], if the diagonal elements w_1, w_2, ..., w_n of the weighted diagonal matrix W are defined as

w_i = s(x_i, y) if s(x_i, y) > T, and w_i = 0 otherwise, (8)

then WSRC reduces to DWSRC. DWSRC can generate more discriminative sparse codes, which represent the test sample more robustly by combining linearity and locality information to improve recognition performance. DWSRC is a constrained LASSO problem [21]. Its flowchart is shown in Figure 3.

3.2.1. Feasibility Analysis. In classical SRC, the computational complexity of obtaining the SR coefficients is about O(k^2 n) per test sample [14, 15, 17], where k is the number of nonzero entries in the reconstruction coefficients. Ideally, all of the nonzero SR coefficients are associated with the columns of the overcomplete dictionary from a single object class, and the test sample can easily be assigned to that class. In practice, however, noise, outliers, modeling error, and many intraclass-dissimilar leaves in the training set lead to many small nonzero SR coefficients associated with multiple species classes, and the computational complexity of obtaining the SR coefficients approaches O(n^3) per test sample [17]. In DWSRC, all n training samples are split into m similar classes that are used to build m subdictionaries, and one candidate subdictionary is chosen for WSRC. Although the number of training samples in the candidate similar class is smaller than the total number of training samples, the dictionary completeness may not be destroyed, because the candidate training samples that are similar to the test sample are retained while the noise, outliers, and irregular samples that are dissimilar to it are excluded. The SR coefficients obtained by DWSRC might not be as sparse as those achieved by classical SRC and WSRC, but the SR coefficients of the relevant training samples still have larger magnitudes, because all training samples of a candidate similar class generally play an important role in sparsely representing the test sample.

If we take the SR coefficients of all similar classes except the candidate similar class to be zero, DWSRC can also be viewed as operating on the whole database; in this view, DWSRC is much sparser than SRC and WSRC. On the whole, in DWSRC, Gaussian kernel distance information is imposed on WSRC to suppress the small nonzero coefficients, so the number of small nonzero SR coefficients of DWSRC should be significantly reduced. The number of samples in the candidate similar class is about n/m. Compared with SRC and WSRC, the computational complexity of DWSRC to obtain the SR coefficients is about O(k1^2 n/m) per test sample, where k1 is the number of nonzero entries in the reconstruction coefficients of DWSRC. In general, k1 is much smaller than k and n, so the computational complexity of DWSRC is far lower than that of SRC and WSRC.
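A back-of-envelope illustration of the quoted costs, using the ICL sizes from Section 4 and assumed sparsity levels k and k1 (the values 100 and 20 are hypothetical, not reported in the paper):

```python
# Rough per-test-sample operation counts for SRC vs. DWSRC.
n, m = 16846, 21          # ICL training size and number of similar classes
k, k1 = 100, 20           # assumed nonzero coefficient counts (illustrative)

cost_src = k ** 2 * n            # O(k^2 n) for classical SRC
cost_dwsrc = k1 ** 2 * n // m    # O(k1^2 n / m) for DWSRC
speedup = cost_src / cost_dwsrc
```

Under these assumptions the candidate-subdictionary strategy cuts the nominal operation count by a factor of several hundred, which is consistent with the direction (though not the exact magnitude) of the running-time gap reported in Tables 1 and 2.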

The above analysis indicates that DWSRC is well suited to large-scale database classification problems.

4. Experiments and Results

In this section, we investigate the performance of the proposed DWSRC method for plant species recognition and compare it with four state-of-the-art approaches: plant species recognition using WSRC [21], Leaf Margin Sequences (LMS) [28], Manifold-Manifold Distance (MMD) [29], and Centroid Contour Distance (CCD) [30]. In the WSRC-based experiments, the parameter selection, data processing, and recognition process are similar to those in DWSRC, while in the LMS-, MMD-, and CCD-based experiments they follow the three references [28–30], respectively.

The important parameters in DWSRC include the scalar regularization parameter μ, the two thresholds T and T1, and the two Gaussian kernel widths β and β1 in (4) and (6). Since μ, β, and β1 are not sensitive to the recognition results [13, 21, 31], μ in (5) is set to 0.001 empirically, and the two Gaussian kernel widths are set as

β = (d_max(x_i, t_j) + d_min(x_i, t_j)) / 2,
β1 = (d_max(y, x_ri) + d_min(y, x_ri)) / 2, (9)

where d_max(x_i, t_j) and d_min(x_i, t_j) are the maximum and minimum Euclidean distances between x_i and t_j, and d_max(y, x_ri) and d_min(y, x_ri) are the maximum and minimum Euclidean distances between y and x_ri.
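The width rule in (9), averaging the extreme distances between a reference point and a set of samples, can be sketched as (the helper name `kernel_width` is ours):

```python
import numpy as np

def kernel_width(X, t):
    """Eq. (9): set the Gaussian kernel width to the mean of the maximum and
    minimum Euclidean distances between the samples (columns of X) and t."""
    d = np.linalg.norm(X - t[:, None], axis=0)   # distance of each column to t
    return (d.max() + d.min()) / 2.0
```

This makes the kernel width scale with the spread of the data around the reference image, so the similarities in (4) and the weights in (6) stay in a usable numeric range.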

All experiments are performed on a platform with a 3.0 GHz CPU and 4.0 GB RAM in MATLAB 2011b.


Figure 3: The flowchart of the proposed method. The original database is divided into similar classes 1, 2, ..., m+1, and a subdictionary is built for each. For a test leaf image y, the similar class and subdictionary with the maximum similarity max_j s(y, t_j) are chosen; the weighted l1-norm minimization problem is constructed and solved on the chosen subdictionary; and y is classified into the class with the minimum reconstruction error m_c(y).

The MATLAB SR toolbox 1.9 (https://sites.google.com/site/sparsereptool) is used to solve the weighted l1-norm minimization problem.

4.1. Data Preparation. All plant species recognition experiments with DWSRC and the four competitive methods are conducted on the ICL Leaf database (http://pan.baidu.com/share/link?shareid=2093145911&uk=1395063007), which was collected by the Intelligent Computing Laboratory (ICL) of the Chinese Academy of Sciences. The database contains 16846 plant leaf images from 220 plant species, with different numbers of leaf images per species and different sizes, from a minimum of 29 × 21 to a maximum of 124 × 107 pixels. Some examples are shown in Figure 4.

From Figure 4, we find that each leaf image in ICL is single, with a simple background and without occlusion, twisting, or noise, so we omit complex preprocessing such as filtering and enhancement. From Figure 4(a), however, it can be seen that the original leaf images in the ICL database are oriented at random angles and have various shapes, colors, and sizes; some leaf images contain footstalks, and some kinds of leaves are similar to each other. Because the sizes of the original color leaf images differ, we first transform each original image to grayscale and geometrically scale it to a common size by the bilinear interpolation algorithm. From Figure 4(b), it is found that the 30 cotton plant leaves have large intraclass differences with low-intensity white backgrounds. To improve the recognition rate, several preprocessing steps are employed prior to species recognition:

(1) Convert each color leaf image into a grayscale image.
(2) Separate the background from the leaf by a small threshold (we use 30); that is, pixel values less than 30 are set to 0 under a gray level of 255.
(3) Reduce noise with a median filter of radius 10.
(4) Cut footstalks off using contour information to detect narrow parallel running chains and remove the longest chain [6, 30].
(5) To align all leaf images, extract the angle by which the major axis of the leaf is oriented with respect to the vertical line, rotate the image by that angle so that the major axis is vertical, and finally remove the white bounding rectangle to superimpose the leaf over a homogeneous background.
(6) Extract the leaf contour with the Canny edge algorithm [30].
(7) Crop each contour to contain only the extents of the leaf, and then resize it to a square of 32 × 32 by the bilinear interpolation algorithm.

Figure 4: Leaf image samples of the ICL database. (a) Leaf image samples of different species; (b) 30 leaf images of the cotton plant.
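A condensed sketch of steps (2), (3), (5), and (7) above, using scipy.ndimage; footstalk removal and Canny edge extraction are omitted, and the function name `preprocess_leaf` and the moment-based orientation estimate are our illustrative choices, not the paper's implementation.

```python
import numpy as np
from scipy import ndimage

def preprocess_leaf(gray, bg_thresh=30, out_size=32):
    """Suppress the background, denoise with a median filter, rotate the
    leaf's major axis to vertical, and resize to out_size x out_size."""
    img = np.where(gray < bg_thresh, 0, gray).astype(float)   # step (2)
    img = ndimage.median_filter(img, size=3)                  # step (3)
    # Step (5): major-axis orientation from second-order image moments.
    ys, xs = np.nonzero(img)
    if len(ys) > 1:
        yc, xc = ys.mean(), xs.mean()
        mu11 = ((xs - xc) * (ys - yc)).mean()
        mu20 = ((xs - xc) ** 2).mean()
        mu02 = ((ys - yc) ** 2).mean()
        angle = 0.5 * np.degrees(np.arctan2(2 * mu11, mu20 - mu02))
        img = ndimage.rotate(img, angle, reshape=False)
    # Step (7): bilinear resize to out_size x out_size.
    zoom = (out_size / img.shape[0], out_size / img.shape[1])
    return ndimage.zoom(img, zoom, order=1)
```

The output of this stage is the fixed-size 32 × 32 array that is vectorized into a subdictionary column in Section 4.2.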

Figure 5 shows a leaf preprocessing example following the above steps.

4.2. Subdictionary Construction. From Figure 4(a), we see that there are innumerable varieties of plant leaves, but these leaves can be clustered into several similar classes by leaf shape description or similarity measure, such as Acerose, Linear, Gladiate, Ensiform, Oblanceolate, Ovate, Oval, Obovate, and Elliptic. The number of classes can be determined by K-means clustering or K-fusion clustering algorithms [32]. Referring to two websites, http://theseedsite.co.uk/leafshapes.html and https://en.wikipedia.org/wiki/Glossary_of_leaf_morphology, and four references [33–36], we define the class number as 21. In practice, we choose 20 kinds of typical leaves, as shown in Figure 6, find the leaves of the ICL database similar to each typical leaf, and collect the remainder into a 21st class, obtaining 21 similar classes.

Suppose t_i (i = 1, 2, ..., 20) is the ith representative leaf image and x_j (j = 1, 2, ..., 16846) is the jth leaf image of the ICL database; the similarity s(t_i, x_j) between t_i and x_j is calculated by (4). Figure 7 shows the similarities between a given dentate leaf image and each leaf image of the ICL database. To clearly show the visual process of selecting the candidate training leaf images, Figure 7 shows only partial results, about 6300 similarities.

From Figure 7, we observe that only a few leaf images are similar to the given test leaf image. We set the threshold T = 0.5; that is, if s(t_i, x_j) ≥ 0.5, x_j is classified into the ith similar class. After most of the leaf images of the ICL database are divided into the 20 similar classes, the remaining images are clustered into the 21st similar class. Finally, we build 21 subdictionaries by stacking all preprocessed contours of each similar class as the columns of a matrix, respectively.


Figure 5: A leaf preprocessing example: original image, grayscale image, footstalk removal, aligned image, binary image, and contour.

Figure 6: 21 kinds of typical leaves. (a) 20 different kinds of typical leaves: Line, Round, Dentate, Polygon, Crenate, Lobed, Palmatifid, Lanceolate, Spatulate, Deltoid, Ovate, Obcordate, Cordate, Reniform, Hastate, Subulate, Peltate, Flabellate, Obovate, and Pinnatifid; (b) others.

Figure 7: Partial similarities between the dentate leaf image and each image of the ICL database (similarity on the vertical axis, training sample number on the horizontal axis).

4.3. Classifying Plant Species by DWSRC. Given a new test leaf image y, we calculate the similarity between y and each typical leaf t_i (i = 1, 2, ..., 20). We set the second threshold T1 = 0.1; if s(y, t_j) < 0.1 for every j = 1, 2, ..., 20, we select the 21st similar class as the candidate similar class and A_21 as the candidate subdictionary. Otherwise, we select the candidate similar class and candidate subdictionary with the maximum similarity s(y, t_j) (j = 1, 2, ..., 20).
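The test-time selection rule above, as a small sketch (the helper name `choose_candidate` is ours; the last subdictionary is assumed to be the "others" class):

```python
import numpy as np

def choose_candidate(y, templates, subdicts, T1=0.1, beta=1.0):
    """Pick the candidate subdictionary for a test image y: the similar class
    with maximum similarity s(y, t_j), or the last ('others') subdictionary
    when every similarity falls below the threshold T1."""
    sims = np.array([np.exp(-np.sum((y - t) ** 2) / beta ** 2)
                     for t in templates])
    if sims.max() < T1:
        return len(subdicts) - 1, subdicts[-1]   # the (m+1)th class
    j = int(sims.argmax())
    return j, subdicts[j]
```

Only the returned subdictionary is then passed to the weighted l1-norm solver of (5), which is where the per-sample cost saving comes from.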

Figure 8: The similarity between y and each typical leaf t_i (i = 1, 2, ..., 20), where y is a cotton leaf (similarity on the vertical axis, typical leaf image number on the horizontal axis).

Figure 8 shows a similarity example where the test image is a cotton plant leaf image. From Figure 8, the maximum similarity is s(t_4, y) = 0.4269, so we choose the fourth similar class and the fourth subdictionary A_4 to construct DWSRC for classifying the test image. We calculate the SR coefficients


Table 1: Average recognition rates (%), standard deviations, and running times of WSRC, LMS, MMD, CCD, and DWSRC.

Method              WSRC           LMS            MMD            CCD            DWSRC
Recognition rate    88.27 ± 1.89   82.73 ± 1.73   86.51 ± 1.46   81.62 ± 2.53   91.12 ± 1.25
Running time (s)    12.18          8.42           9.36           7.23           4.37

Table 2: Average recognition rates (%), standard deviations, and running times of WSRC, LMS, MMD, CCD, and DWSRC.

Method              WSRC           LMS            MMD            CCD            DWSRC
Recognition rate    86.56 ± 2.12   82.50 ± 2.06   85.43 ± 1.52   81.17 ± 2.56   90.64 ± 1.42
Running time (s)    14.32          9.53           10.16          6.47           5.16

and recognize the test leaf by the minimum reconstruction error.

Figures 7 and 8 illustrate the visual process of dividing the whole dataset into several similar classes and constructing the subdictionaries for the SRC algorithm, where the training set is all images of the ICL database and the test set is any leaf image.

4.4. Experimental Results. To illustrate how DWSRC and the four competitive methods work, two kinds of experiments are conducted.

(1) From the ICL database, we randomly select 1100 images from the 220 species, with 5 images per plant species, as the test set and the rest as the training set. The training set is divided into 21 similar classes to build the subdictionaries, and each test leaf is recognized by DWSRC, where the typical samples are shown in Figure 6(a). The random training/test partition is repeated 50 times. The average recognition results of the 50 experiments for the five methods are shown in Table 1.

(2) From the ICL database, we randomly select 2000 images from the 16846 plant leaf images as the test set and the rest as the training set, and repeat the training and testing procedure 50 times. Each time, the training set is again divided into 21 similar classes to build the subdictionaries, and each test leaf is recognized by DWSRC, with the typical samples again shown in Figure 6(a). The average recognition results of the 50 experiments for the five methods are shown in Table 2.
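The repeated random-split protocol above can be sketched as a small evaluation harness; `recognize` stands for any recognizer (DWSRC or a competitor) mapping a train/test partition to an accuracy, and all names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility of the sketch

def repeated_split_eval(n_images, n_test, n_repeats, recognize):
    """Average a recognizer's accuracy over repeated random train/test
    partitions; `recognize(train_idx, test_idx)` returns an accuracy in [0, 1].
    Returns (mean accuracy, standard deviation), as reported in Tables 1-2."""
    accs = []
    for _ in range(n_repeats):
        perm = rng.permutation(n_images)
        test_idx, train_idx = perm[:n_test], perm[n_test:]
        accs.append(recognize(train_idx, test_idx))
    return float(np.mean(accs)), float(np.std(accs))
```

Reporting mean ± standard deviation over 50 random partitions, as the paper does, guards against a single lucky or unlucky split dominating the comparison.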

In Tables 1 and 2, bold fonts highlight the best recognition performance. From the two tables, we can see that the recognition performance of DWSRC is the best. The main reason might be that DWSRC employs only the candidate similar class to solve the SR problem and recognize the test samples, and it makes use of the similarity relationship between the test sample and each training sample; thus the computational cost is significantly reduced, and the side effects of outliers, noise, and modeling error on recognition performance are greatly suppressed. The recognition rate of WSRC is higher than those of LMS, MMD, and CCD, but its computational time is the longest; most of its time is spent solving the l1-norm minimization problem. Since LMS, MMD, and CCD need to extract classifying features from each leaf image, which is time-consuming, they achieve lower recognition rates than WSRC and DWSRC. Because real-world leaves of the same species differ from each other, it is difficult to extract "optimal" features from each leaf image.

5. Conclusions

For the large-scale plant species recognition problem, we proposed a new SRC approach, discriminant weighted SRC (DWSRC), which exploits the locality and similarity of the original dataset and the test samples in sparsely representing the test samples. It makes use of the prior structure information of the dataset and the distances between the test sample and the training samples to construct the SR problem. To represent a test sample, we consider not only the original data structure of the training samples but also the locality and the similarity between the test sample and each training sample. The SR coefficients of the test sample maintain a sparse structure and more discriminative information. DWSRC is essentially a generalization of WSRC: unlike WSRC, we construct several subdictionaries prior to SRC by dividing all training samples into several similar classes and then choose a candidate subdictionary for a given test sample on which to implement WSRC. Thus, the computational cost of solving the l1-norm minimization problem is greatly reduced. In this sense, DWSRC is well suited to large-scale plant species recognition, improving the recognition rate while reducing the computational cost. Recently, deep learning has led to a series of breakthroughs in many pattern recognition research areas; we will study deep learning for leaf-based species recognition in future work.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is supported by the China National Natural Science Foundation under Grant no. 61473237. It is partially supported by the Tianjin Research Program of Application Foundation and Advanced Technology (no. 14JCYBJC42500) and the Tianjin Science and Technology Correspondent Project (no. 16JCTPJC47300). The authors would like to thank the providers of the ICL Leaf database.

References

[1] M.-E. Nilsback and A. Zisserman, "Automated flower classification over a large number of classes," in Proceedings of the 6th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP '08), pp. 722–729, December 2008.
[2] M. Seeland, M. Rzanny, N. Alaqraa, J. Waldchen, P. Mader, and Y. Zhang, "Plant species classification using flower images - A comparative study of local feature representations," PLoS ONE, vol. 12, no. 2, p. e0170629, 2017.
[3] N. Jamil, N. A. C. Hussin, S. Nordin, and K. Awang, "Automatic Plant Identification: Is Shape the Key Feature?" Procedia Computer Science, vol. 76, pp. 436–442, 2015.
[4] E. Mata-Montero and J. Carranza-Rojas, "Automated plant species identification: Challenges and opportunities," IFIP Advances in Information and Communication Technology, vol. 481, pp. 26–36, 2016.
[5] J. Waldchen and P. Mader, "Plant Species Identification Using Computer Vision Techniques: A Systematic Literature Review," Archives of Computational Methods in Engineering, pp. 1–37, 2017.
[6] T. Munisami, M. Ramsurn, S. Kishnah, and S. Pudaruth, "Plant Leaf Recognition Using Shape Features and Colour Histogram with K-nearest Neighbour Classifiers," Procedia Computer Science, vol. 58, pp. 740–747, 2015.
[7] J. Chaki, R. Parekh, and S. Bhattacharya, "Plant leaf recognition using texture and shape features with neural classifiers," Pattern Recognition Letters, vol. 58, pp. 61–68, 2015.
[8] C. Zhao, S. S. F. Chan, W.-K. Cham, and L. M. Chu, "Plant identification using leaf shapes - A pattern counting approach," Pattern Recognition, vol. 48, no. 10, pp. 3203–3215, 2015.
[9] M. M. S. de Souza, F. N. S. Medeiros, G. L. B. Ramalho, I. C. de Paula Jr., and I. N. S. Oliveira, "Evolutionary optimization of a multiscale descriptor for leaf shape analysis," Expert Systems with Applications, vol. 63, pp. 375–385, 2016.
[10] N. Liu and J.-M. Kan, "Improved deep belief networks and multi-feature fusion for leaf identification," Neurocomputing, vol. 216, pp. 460–467, 2016.
[11] B. VijayaLakshmi and V. Mohan, "Kernel-based PSO and FRVM: An automatic plant leaf type detection using texture, shape, and color features," Computers and Electronics in Agriculture, vol. 125, pp. 99–112, 2016.
[12] T. P. Kumar, M. V. P. Reddy, and P. K. Bora, "Leaf identification using shape and texture features," Advances in Intelligent Systems and Computing, vol. 460, pp. 531–541, 2017.
[13] Z. Zhang, Y. Xu, J. Yang, X. Li, and D. Zhang, "A survey of sparse representation: algorithms and applications," IEEE Access, vol. 3, pp. 490–530, 2015.
[14] H. Cheng, "Sparse Representation, Modeling and Learning in Visual Recognition: Theory, Algorithms and Applications," Advances in Computer Vision & Pattern Recognition, vol. 6, pp. 1408–1425.
[15] J.-K. Hsiao, L.-W. Kang, C.-L. Chang, and C.-Y. Lin, "Learning-based leaf image recognition frameworks," Studies in Computational Intelligence, vol. 591, pp. 77–91, 2015.
[16] T. Jin, X. Hou, P. Li, and F. Zhou, "A novel method of automatic plant species identification using sparse representation of leaf tooth features," PLoS ONE, vol. 10, no. 10, Article ID e0139482, 2015.
[17] C.-G. Li, J. Guo, and H.-G. Zhang, "Local sparse representation based classification," in Proceedings of the 20th International Conference on Pattern Recognition (ICPR '10), pp. 649–652, 2010.
[18] J.-X. Mi and J.-X. Liu, "Face Recognition Using Sparse Representation-Based Classification on K-Nearest Subspace," PLoS ONE, vol. 8, no. 3, Article ID e59430, 2013.
[19] Q. Liu, "Kernel Local Sparse Representation Based Classifier," Neural Processing Letters, vol. 43, no. 1, pp. 85–95, 2016.
[20] E. Elhamifar and R. Vidal, "Robust classification using structured sparse representation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 1873–1879, June 2011.
[21] Z. Fan, M. Ni, Q. Zhu, and E. Liu, "Weighted sparse representation for face recognition," Neurocomputing, vol. 151, no. 1, pp. 304–309, 2015.
[22] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210–227, 2009.
[23] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, "Discriminative learned dictionaries for local image analysis," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–8, Anchorage, Alaska, USA, June 2008.
[24] J. Zhang, D. Zhao, F. Jiang, and W. Gao, "Structural group sparse representation for image Compressive Sensing recovery," in Proceedings of the 2013 Data Compression Conference (DCC 2013), pp. 331–340, USA, March 2013.
[25] Y. Sun, Q. Liu, J. Tang, and D. Tao, "Learning discriminative dictionary for group sparse representation," IEEE Transactions on Image Processing, vol. 23, no. 9, pp. 3816–3828, 2014.
[26] S. S. Chen, D. L. Donoho, and M. A. Saunders, "Atomic decomposition by basis pursuit," SIAM Review, vol. 43, no. 1, pp. 129–159, 2001.
[27] X. Tang, G. Feng, and J. Cai, "Weighted group sparse representation for undersampled face recognition," Neurocomputing, vol. 145, pp. 402–415, 2014.
[28] G. Cerutti, L. Tougne, D. Coquin, and A. Vacavant, "Leaf margins as sequences: A structural approach to leaf identification," Pattern Recognition Letters, vol. 49, pp. 177–184, 2014.
[29] J.-X. Du, M.-W. Shao, C.-M. Zhai, J. Wang, Y. Tang, and C. L. P. Chen, "Recognition of leaf image set based on manifold-manifold distance," Neurocomputing, vol. 188, pp. 131–138, 2016.
[30] A. Hasim, Y. Herdiyeni, and S. Douady, "Leaf Shape Recognition using Centroid Contour Distance," in Proceedings of the International Seminar on Science of Complex Natural Systems (ISS-CNS 2015), Indonesia, October 2015.
[31] B. Li, C. Wang, and D.-S. Huang, "Supervised feature extraction based on orthogonal discriminant projection," Neurocomputing, vol. 73, no. 1-3, pp. 191–196, 2009.
[32] C. C. Aggarwal and C. K. Reddy, Data Clustering: Algorithms and Applications, Chapman & Hall/CRC, 2014.
[33] http://theseedsite.co.uk/leafshapes.html
[34] https://en.wikipedia.org/wiki/Glossary_of_leaf_morphology
[35] H. Tsukaya, "Mechanism of leaf-shape determination," Annual Review of Plant Biology, vol. 57, pp. 477–496, 2006.
[36] J. Dkhar and A. Pareek, "What determines a leaf's shape?" EvoDevo, vol. 5, no. 1, article no. 47, 2014.



Figure 2: Plant leaf examples. (a) Simple and complex; (b) single and compound; (c) 10 kinds of leaf shapes: Linear, Oblong, Elliptic, Ovate, Obovate, Orbicular, Cordate, Obcordate, Reniform, and Deltoid.

all leaves into several similar classes by different leaf shape representation criteria, such as (1) simple and complex; (2) single and compound; (3) Linear, Oblong, Elliptic, Ovate, Obovate, Orbicular, Cordate, Obcordate, Reniform, Deltoid, and others, as shown in Figure 2.

The Gaussian kernel distance can capture the nonlinear information within the dataset to measure the distance or similarity between samples. Suppose there are two leaf images x and y; their similarity is defined by the Gaussian kernel distance

s(x, y) = exp(−||x − y||_2^2 / β^2), (4)

where β is the Gaussian kernel width that moderates the effect of large distances between x and y, and ||x − y|| is the Euclidean distance between x and y.

In this section, any original leaf image is transformed into a gray image or binary image. Any gray or binary image is a 2D matrix; thus ||x − y|| is the Euclidean distance between the two matrices x and y.

Given 119899 training leaf images 1199091 1199092 119909119899 in orderto divide these images into 119898 similar classes we find 119898representative leaf images with large different shapes such asLinear Oblong Elliptic andOvate where 119905119895 (119895 = 1 2 119898)is the jth representative image Then we set up a threshold119879 If 119904(119909119894 119905119895) ge 119879 119909119894 is assigned to the jth similar class if119904(119909119894 119905119895) lt 119879 for any 119905119895 (119895 = 1 2 119898) 119909119894 does not belong to119898 given similar classes It is assigned to the (119898 + 1)th similarclassThen 119899 leaf images are roughly divided into119898+1 similarclasses

We concatenate each leaf image of119898 similar class as a one-dimensionality vector and construct119898+1 subdictionaries forthe following WSRC algorithm by all these vectors namely119860119895 isin 119877119899119895times119899 (119895 = 1 2 119898+1) where 119899 is the dimensionalityof the leaf image vector and 119899119895 is the leaf number of the jthsimilar class and each columnof119860119895 is a vector correspondingto a leaf image

32 Discriminant WSRC SRC can ensure sparsity whileit loses the local information of the training data WSRCcan preserve the sparsity and the similarity between thetest sample and its neighbor training data while it ignoresthe prior global structure information of the training dataWe proposed a discriminant WSRC (DWSRC) scheme toimprove the classifying performance of WSRC by integratingthe sparsity and both locality and global structure informa-tion of the training data into a unified framework

Given a test leaf image 119910 and a threshold 1198791 by (4) wecalculate the similarity 119904(119910 119905119895) (119895 = 1 2 119898) between 119910and the jth representative leaf image 119905119895 (119895 = 1 2 119898) If119904(119910 119905119903) gt 1198791 choose the 119903th similar class as the candidatesimilar class with the maximum similarity for classifying119910 and select the corresponding 119903th subdictionary 119860119903 asthe candidate subdictionary Similar to the general WSRCDWSRC solves the followingweighted 1198971-normminimizationproblem

1198861015840 = argmin119886

1003817100381710038171003817119910 minus 119860119903119886100381710038171003817100381722 + 120583 100381710038171003817100381711988211990311988610038171003817100381710038171 (5)

Computational Intelligence and Neuroscience 5

where 119882119903 isin 119877119899119903times119899119903 is a weighted diagonal matrix 119899119903 is theleaf image number of the 119903th similar class and the diagonalelements 1199081199031 1199081199032 119908119903119899119903 of 119882119903 are the Gaussian kerneldistances between 119910 and the candidate samples which aredefined as follows

119908119903119894 = exp(1003817100381710038171003817119910 minus 11990911990311989410038171003817100381710038172212057321 ) (6)

where 119909119903119894 is the 119894th vector of the candidate similar class and1205731is the Gaussian kernel width to adjust the weight decay speed

In (5), W_r penalizes the distance between y and each training sample of the candidate similar class; its purpose is to avoid representing the test sample with training samples that are far from it. Because the numbers of samples in the different similar classes differ, after obtaining the SR coefficient vector a we compute the average reconstruction error of each class in the candidate similar class as follows:

m_c(y) = (1/n_c) ||y - A_r a_c||_2,    (7)

where a_c is the coefficient subvector of a associated with the cth training class and n_c is the number of vectors of the cth class in the candidate similar class.

The test image is then labeled with the class that attains the minimum normalized reconstruction error m_c(y).
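The decision rule of (7) amounts to scoring each class inside the candidate similar class by its normalized residual. A minimal numpy sketch (classify_by_reconstruction is a hypothetical helper name, not the paper's code):

```python
import numpy as np

def classify_by_reconstruction(y, A_r, a, labels):
    """Assign y to the class with minimum normalized residual, per (7).

    labels[i] is the species of column i of the candidate subdictionary A_r,
    and a is the sparse coefficient vector obtained from (5).
    """
    labels = np.asarray(labels)
    best_class, best_err = None, np.inf
    for c in np.unique(labels):
        a_c = np.where(labels == c, a, 0.0)        # keep class-c coefficients only
        n_c = int((labels == c).sum())
        err = np.linalg.norm(y - A_r @ a_c) / n_c  # m_c(y) of (7)
        if err < best_err:
            best_class, best_err = int(c), err
    return best_class
```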

In classical WSRC [21], if the diagonal elements w_1, w_2, …, w_n of the weight matrix W are defined as follows:

w_i = s(x_i, y) if s(x_i, y) > T, and w_i = 0 otherwise,    (8)

then WSRC becomes DWSRC.

DWSRC can generate more discriminative sparse codes, which represent the test sample more robustly by combining linearity and locality information, thereby improving recognition performance. DWSRC is a constrained LASSO problem [21]. Its flowchart is shown in Figure 3.

3.2.1. Feasibility Analysis. In classical SRC, the computational complexity of obtaining the SR coefficients is about O(k^2 n) per test sample [14, 15, 17], where k is the number of nonzero entries in the reconstruction coefficients. Ideally, all nonzero SR coefficients are associated with the columns of the overcomplete dictionary belonging to a single object class, and the test sample can easily be assigned to that class. In practice, however, noise, outliers, modeling error, and many intraclass-dissimilar leaves in the training leaf image set lead to many small nonzero SR coefficients associated with multiple species classes, and the computational complexity of obtaining the SR coefficients approaches O(n^3) per test sample [17]. In DWSRC, all n training samples are split into m similar classes that are used to build m subdictionaries, and a candidate subdictionary is chosen for WSRC. Although the number of training samples in the candidate similar class is smaller than the total number of training samples, dictionary completeness need not be destroyed, because the candidate training samples similar to the test sample are retained, while the noise, outliers, and irregular samples dissimilar to the test sample are excluded. The SR coefficients obtained by DWSRC may not be as sparse as those achieved by classical SRC and WSRC, but the SR coefficients of the relevant training samples still have larger magnitudes, because all training samples of a candidate similar class generally play an important role in sparsely representing the test sample.

If the SR coefficients of all similar classes other than the candidate similar class are regarded as zeros, DWSRC can be viewed as operating on the whole database; in this sense, DWSRC is much sparser than SRC and WSRC. Overall, DWSRC imposes Gaussian kernel distance information on WSRC to suppress small nonzero coefficients, so the number of small nonzero SR coefficients of DWSRC should be significantly reduced. The sample number in the candidate similar class is about n/m, so compared with SRC and WSRC, the computational complexity of DWSRC to obtain the SR coefficients is about O(k_1^2 n/m) per test sample, where k_1 is the number of nonzero entries in the reconstruction coefficients of DWSRC. In general, k_1 is much smaller than k and n, so the computational complexity of DWSRC is considerably lower than that of SRC and WSRC.

The above analysis indicates that DWSRC is more suitable for the large-scale database classification problem.

4. Experiments and Results

In this section, we investigate the performance of the proposed DWSRC method for plant species recognition and compare it with four state-of-the-art approaches: WSRC [21], Leaf Margin Sequences (LMS) [28], Manifold-Manifold Distance (MMD) [29], and Centroid Contour Distance (CCD) [30]. In the WSRC-based experiments, the parameter selection, data processing, and recognition process are similar to those of DWSRC, while in the LMS-, MMD-, and CCD-based experiments they follow the corresponding references [28-30], respectively.

The important parameters of DWSRC are the scalar regularization parameter μ, the two thresholds T and T_1, and the two Gaussian kernel widths β and β_1 in (4) and (6). Since μ, β, and β_1 are not sensitive to the recognition results [13, 21, 31], μ in (5) is set to 0.001 empirically, and the two Gaussian kernel widths are set as

β = (d_max(x_i, t_j) + d_min(x_i, t_j)) / 2,
β_1 = (d_max(y, x_{ri}) + d_min(y, x_{ri})) / 2,    (9)

where d_max(x_i, t_j) and d_min(x_i, t_j) are the maximum and minimum Euclidean distances between x_i and t_j, and d_max(y, x_{ri}) and d_min(y, x_{ri}) are the maximum and minimum Euclidean distances between y and x_{ri}.
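The kernel-width rule (9) is just the midpoint of the extreme Euclidean distances. A small sketch, under the assumption that candidate samples are stored as the rows of a matrix (kernel_width and beta1 are our names, not the paper's):

```python
import numpy as np

def kernel_width(dists):
    # (9): midpoint of the largest and smallest distance.
    return (dists.max() + dists.min()) / 2.0

def beta1(y, X):
    # beta_1 of (9): Euclidean distances between y and each row of X.
    d = np.linalg.norm(X - y, axis=1)
    return kernel_width(d)
```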

All experiments are performed in MATLAB 2011b on a platform with a 3.0 GHz CPU and 4.0 GB RAM. The


[Figure 3 (flowchart): the original database is divided into similar classes 1, 2, …, m + 1, and a subdictionary is built for each; for a test leaf image, the similar class and subdictionary with the maximum similarity max_j s(y, t_j) are chosen; the weighted l1-norm minimization problem is constructed and solved on the chosen similar class and subdictionary; finally, the test sample y is classified into the class with the minimum reconstruction error.]

Figure 3: The flowchart of the proposed method.

MATLAB SR toolbox 1.9 (https://sites.google.com/site/sparsereptool) is used to solve the weighted l1-norm minimization problem.

4.1. Data Preparation. All plant species recognition experiments with DWSRC and the four competing methods are conducted on the ICL Leaf database (http://pan.baidu.com/share/link?shareid=2093145911&uk=1395063007), which was collected by the Intelligent Computing Laboratory (ICL) of the Chinese Academy of Sciences. The database contains 16,846 plant leaf images from 220 kinds of plant species, with different numbers of leaf images per species and different sizes, from a minimum of 29 × 21 to a maximum of 124 × 107 pixels. Some examples are shown in Figure 4.

From Figure 4 we find that each leaf image in ICL contains a single leaf on a simple background, without occlusion, twist, or noise, so we omit much of the complex preprocessing, such as filtering and enhancement. From Figure 4(a), however, it can be seen that the original leaf images in the ICL database are oriented at random angles and have various shapes, colors, and sizes; some leaf images contain footstalks, and some kinds of leaves are similar to each other. Because the sizes of the original color leaf images differ, we first transform each original image to grayscale and geometrically scale it to the same size by the bilinear interpolation algorithm. From Figure 4(b) it can be seen that the 30 cotton plant leaves have large intraclass differences, with low-intensity white backgrounds. To improve the recognition rate, several preprocessing steps are employed prior to species recognition, as follows:

(1) Convert each color leaf image into a grayscale image.

(2) Separate the background from the leaf image by a small threshold (set to 30); that is, pixel values less than 30 are set to 0 under a gray level of 255.

(3) Reduce noise with a median filter of radius 10.

(4) Cut footstalks off using contour information to detect narrow parallel running chains and remove the longest chain [6, 30].

(5) To align all leaf images, extract from each image the angle by which the major axis of the leaf is oriented with respect to the vertical line, rotate the image by that angle so that the major axis is aligned with the vertical line, and


[Figure 4 shows (a) leaf image samples of different species and (b) the 30 leaf images, labeled 157(1)-157(30), of a cotton plant.]

Figure 4: Leaf image samples of the ICL database.

finally remove the white bounding rectangle so that the leaf is superimposed over a homogeneous background.

(6) Extract the leaf contour with the Canny edge algorithm [30].

(7) Crop each contour to contain only the extents of the leaf, and then resize it to a square of 32 × 32 pixels by the bilinear interpolation algorithm.
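Two of the steps above, background thresholding (step (2)) and major-axis estimation for alignment (step (5)), can be sketched with numpy image moments. The helpers below are hypothetical simplifications (median filtering, footstalk removal, and bilinear resizing are omitted):

```python
import numpy as np

def binarize_leaf(gray, thresh=30):
    # Step (2): suppress the near-black background below the threshold.
    out = gray.copy()
    out[out < thresh] = 0
    return out

def major_axis_angle(mask):
    # Step (5): orientation of the leaf's major axis via second-order
    # central moments of the foreground pixel cloud.
    ys, xs = np.nonzero(mask)
    x = xs - xs.mean()
    y = ys - ys.mean()
    return 0.5 * np.arctan2(2.0 * (x * y).mean(),
                            (x * x).mean() - (y * y).mean())
```

Rotating the image by this angle (e.g., with any interpolation routine) then brings the major axis onto the vertical line.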

Figure 5 shows a leaf preprocessing example produced by the above steps.

4.2. Subdictionary Construction. From Figure 4(a) we see that plant leaves are highly varied, but they can be clustered into several similar classes by a leaf shape description or similarity measure, such as Acerose, Linear, Gladiate, Ensiform, Oblanceolate, Ovate, Oval, Obovate, and Elliptic. The number of classes can be determined by K-means and K-fusion clustering algorithms [32]. Referring to two websites, http://theseedsite.co.uk/leafshapes.html and https://en.wikipedia.org/wiki/Glossary_of_leaf_morphology, and four references [33-36], we set the class number to 21. In practice, we choose 20 kinds of typical leaves, as shown in Figure 6, and find the leaves similar to each typical leaf in the ICL database; the remaining leaves form the 21st class, so we obtain 21 similar classes.

Suppose t_i (i = 1, 2, …, 20) is the ith representative leaf image and x_j (j = 1, 2, …, 16846) is the jth leaf image of the ICL database; the similarity s(t_i, x_j) between t_i and x_j is calculated by (4). Figure 7 shows the similarities between a given dentate-shape leaf image and each leaf image of the ICL database. To clearly show the visual process of selecting the candidate training leaf images, Figure 7 shows only partial results, that is, about 6,300 similarities.

From Figure 7 we observe that only a few leaf images are similar to the given test leaf image. We set the threshold T = 0.5; that is, if s(t_i, x_j) ≥ 0.5, x_j is classified into the ith similar class. After most of the leaf images of the ICL database are divided into 20 similar classes, the remaining images are clustered into the 21st similar class. Finally, we build 21 subdictionaries by stacking all preprocessed contours of each similar class as the columns of a matrix, respectively.
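The threshold-based partition described above can be sketched as follows; partition_by_similarity is a hypothetical helper, and the similarity function s(·,·) of (4) is passed in rather than reproduced:

```python
import numpy as np

def partition_by_similarity(X, typicals, sim, T=0.5):
    """Split the samples in X into len(typicals) + 1 similar classes.

    A sample joins the class of the typical leaf it is most similar to,
    provided that similarity reaches T; otherwise it falls into the
    catch-all last class (the 21st in the paper).
    """
    classes = [[] for _ in range(len(typicals) + 1)]
    for j, x in enumerate(X):
        sims = [sim(t, x) for t in typicals]
        best = int(np.argmax(sims))
        if sims[best] >= T:
            classes[best].append(j)
        else:
            classes[-1].append(j)  # remaining images -> extra class
    return classes
```

Stacking the (preprocessed) members of each returned class as matrix columns then yields the 21 subdictionaries.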


Figure 5: The leaf preprocessing example (original image, gray image, footstalks cut, aligned image, binary image, contour).

Figure 6: 21 kinds of typical leaves. (a) 20 different kinds of typical leaves: Line, Round, Dentate, Polygon, Crenate, Lobed, Palmatifid, Lanceolate, Spatulate, Deltoid, Ovate, Obcordate, Cordate, Reniform, Hastate, Subulate, Peltate, Flabellate, Obovate, and Pinnatifid; (b) others.

[Figure 7 plots similarity (vertical axis, 0-1) against training sample number (horizontal axis, 0-6000).]

Figure 7: Partial similarities between the dentate leaf image and each image of the ICL database.

4.3. Classifying Plant Species by DWSRC. Given a new test leaf image y, we calculate the similarity between y and each typical leaf t_i (i = 1, 2, …, 20). We set the second threshold T_1 = 0.1; that is, if s(y, t_j) < 0.1 for every j = 1, 2, …, 20, we select the 21st similar class as the candidate similar class and A_21 as the candidate subdictionary. Otherwise, we select the candidate similar class and candidate subdictionary with the maximum similarity s(y, t_j) (j = 1, 2, …, 20).

[Figure 8 plots similarity (vertical axis, 0-0.5) against typical leaf image number (horizontal axis, 0-20).]

Figure 8: The similarity between y and each typical leaf t_i (i = 1, 2, …, 20), where y is a cotton leaf.

Figure 8 shows a similarity example in which the test image is a cotton plant leaf image. From Figure 8, the maximum similarity is s(t_4, y) = 0.4269, so we choose the fourth similar class and the fourth subdictionary A_4 to construct DWSRC for classifying the test image. We calculate the SR coefficients


Table 1: Average recognition rates (%), standard deviations, and running times of WSRC, LMS, MMD, CCD, and DWSRC.

Method            WSRC           LMS            MMD            CCD            DWSRC
Recognition rate  88.27 ± 1.89   82.73 ± 1.73   86.51 ± 1.46   81.62 ± 2.53   91.12 ± 1.25
Running time (s)  12.18          8.42           9.36           7.23           4.37

Table 2: Average recognition rates (%), standard deviations, and running times of WSRC, LMS, MMD, CCD, and DWSRC.

Method            WSRC           LMS            MMD            CCD            DWSRC
Recognition rate  86.56 ± 2.12   82.50 ± 2.06   85.43 ± 1.52   81.17 ± 2.56   90.64 ± 1.42
Running time (s)  14.32          9.53           10.16          6.47           5.16

and recognize the test leaf by the minimum reconstruction error.

Figures 7 and 8 illustrate only the visual process of dividing the whole dataset into similar classes and constructing the subdictionaries for the SRC algorithm, where the training set consists of all images of the ICL database and the test set can be any leaf image.

4.4. Experimental Results. To illustrate how DWSRC and the four competing methods work, two kinds of experiments are conducted:

(1) From the ICL database, we randomly select 1,100 images from the 220 species, with 5 images per plant species, as the test set, and use the rest as the training set. The training set is divided into 21 similar classes to build the subdictionaries, and each test leaf is recognized by DWSRC, where the typical samples are those shown in Figure 6(a). The random training/test partition is repeated 50 times. The average recognition results of the 50 experiments for the five methods are shown in Table 1.

(2) From the ICL database, we randomly select 2,000 images from the 16,846 plant leaf images as the test set and use the rest as the training set, repeating the training and testing procedure 50 times. In each run, the training set is again divided into 21 similar classes to build the subdictionaries, and each test leaf is recognized by DWSRC, with the typical samples again those shown in Figure 6(a). The average recognition results of the 50 experiments for the five methods are shown in Table 2.
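The 50-run random-split protocol can be sketched generically; evaluate below is a placeholder for the full build-subdictionaries-then-classify pipeline, not the paper's code:

```python
import numpy as np

def random_split(n, n_test, rng):
    # One train/test partition of n sample indices, n_test held out.
    perm = rng.permutation(n)
    return perm[n_test:], perm[:n_test]

def repeated_evaluation(n, n_test, n_runs, evaluate, seed=0):
    """Repeat the partition n_runs times and report mean/std accuracy,
    mirroring the 50-run protocol of Section 4.4."""
    rng = np.random.default_rng(seed)
    accs = [evaluate(*random_split(n, n_test, rng)) for _ in range(n_runs)]
    return float(np.mean(accs)), float(np.std(accs))
```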

In Tables 1 and 2, bold fonts highlight the best recognition performance. The two tables show that the recognition performance of DWSRC is the best. The main reason may be that DWSRC employs only the candidate similar class to solve the SR problem and recognize the test samples, and it makes use of the similarity relationship between the test sample and each training sample; thus the computational cost is significantly reduced, and the side effects of outliers, noise, and modeling error on recognition performance are greatly suppressed. The recognition rate of WSRC is higher than those of LMS, MMD, and CCD, but its computational time is the longest, most of it spent solving the l1-norm minimization problem. Since LMS, MMD, and CCD need to extract classifying features from each leaf image, which is time-consuming, they achieve lower recognition rates than WSRC and DWSRC. Because real-world leaves of the same species differ from one another, it is difficult to extract the "optimal" features from each leaf image.

5. Conclusions

For the large-scale plant species recognition problem, we proposed a new SRC approach, discriminant weighted SRC (DWSRC), which exploits the locality and similarity of the original dataset and the test samples in sparsely representing the test samples. It makes use of the prior structure information of the dataset and the distances between the test sample and the training samples to construct the SR problem. To represent a test sample, we consider not only the original data structure of the training samples but also the locality and similarity between the test sample and each training sample; the SR coefficients of the test sample thus maintain a sparse structure and carry more discriminative information. DWSRC is essentially a generalization of WSRC: unlike WSRC, we construct several subdictionaries prior to SRC by dividing all training samples into several similar classes, and then choose a candidate subdictionary for a given test sample on which to run WSRC. Thus, the computational cost of solving the l1-norm minimization problem is greatly reduced. In this sense, DWSRC is well suited to large-scale plant species recognition: it can improve the recognition rate and reduce the computational cost. Recently, deep learning has led to a series of breakthroughs in many pattern recognition research areas; we will study deep learning for leaf-based species recognition in future work.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is supported by the China National Natural Science Foundation under Grant no. 61473237. It is partially supported by the Tianjin Research Program of Application Foundation and Advanced Technology (no. 14JCYBJC42500) and the Tianjin Science and Technology Correspondent Project (no. 16JCTPJC47300). The authors would like to thank the providers of the ICL Leaf database for their constructive advice.

References

[1] M.-E. Nilsback and A. Zisserman, "Automated flower classification over a large number of classes," in Proceedings of the 6th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP '08), pp. 722-729, December 2008.


[2] M. Seeland, M. Rzanny, N. Alaqraa, J. Wäldchen, P. Mäder, and Y. Zhang, "Plant species classification using flower images - a comparative study of local feature representations," PLoS ONE, vol. 12, no. 2, p. e0170629, 2017.

[3] N. Jamil, N. A. C. Hussin, S. Nordin, and K. Awang, "Automatic plant identification: is shape the key feature?" Procedia Computer Science, vol. 76, pp. 436-442, 2015.

[4] E. Mata-Montero and J. Carranza-Rojas, "Automated plant species identification: challenges and opportunities," IFIP Advances in Information and Communication Technology, vol. 481, pp. 26-36, 2016.

[5] J. Wäldchen and P. Mäder, "Plant species identification using computer vision techniques: a systematic literature review," Archives of Computational Methods in Engineering, pp. 1-37, 2017.

[6] T. Munisami, M. Ramsurn, S. Kishnah, and S. Pudaruth, "Plant leaf recognition using shape features and colour histogram with K-nearest neighbour classifiers," Procedia Computer Science, vol. 58, pp. 740-747, 2015.

[7] J. Chaki, R. Parekh, and S. Bhattacharya, "Plant leaf recognition using texture and shape features with neural classifiers," Pattern Recognition Letters, vol. 58, pp. 61-68, 2015.

[8] C. Zhao, S. S. F. Chan, W.-K. Cham, and L. M. Chu, "Plant identification using leaf shapes - a pattern counting approach," Pattern Recognition, vol. 48, no. 10, pp. 3203-3215, 2015.

[9] M. M. S. de Souza, F. N. S. Medeiros, G. L. B. Ramalho, I. C. de Paula Jr., and I. N. S. Oliveira, "Evolutionary optimization of a multiscale descriptor for leaf shape analysis," Expert Systems with Applications, vol. 63, pp. 375-385, 2016.

[10] N. Liu and J.-M. Kan, "Improved deep belief networks and multi-feature fusion for leaf identification," Neurocomputing, vol. 216, pp. 460-467, 2016.

[11] B. VijayaLakshmi and V. Mohan, "Kernel-based PSO and FRVM: an automatic plant leaf type detection using texture, shape, and color features," Computers and Electronics in Agriculture, vol. 125, pp. 99-112, 2016.

[12] T. P. Kumar, M. V. P. Reddy, and P. K. Bora, "Leaf identification using shape and texture features," Advances in Intelligent Systems and Computing, vol. 460, pp. 531-541, 2017.

[13] Z. Zhang, Y. Xu, J. Yang, X. Li, and D. Zhang, "A survey of sparse representation: algorithms and applications," IEEE Access, vol. 3, pp. 490-530, 2015.

[14] H. Cheng, Sparse Representation, Modeling and Learning in Visual Recognition: Theory, Algorithms and Applications, Advances in Computer Vision & Pattern Recognition, 2015.

[15] J.-K. Hsiao, L.-W. Kang, C.-L. Chang, and C.-Y. Lin, "Learning-based leaf image recognition frameworks," Studies in Computational Intelligence, vol. 591, pp. 77-91, 2015.

[16] T. Jin, X. Hou, P. Li, and F. Zhou, "A novel method of automatic plant species identification using sparse representation of leaf tooth features," PLoS ONE, vol. 10, no. 10, Article ID e0139482, 2015.

[17] C.-G. Li, J. Guo, and H.-G. Zhang, "Local sparse representation based classification," in Proceedings of the 20th International Conference on Pattern Recognition (ICPR '10), pp. 649-652, 2010.

[18] J.-X. Mi and J.-X. Liu, "Face recognition using sparse representation-based classification on K-nearest subspace," PLoS ONE, vol. 8, no. 3, Article ID e59430, 2013.

[19] Q. Liu, "Kernel local sparse representation based classifier," Neural Processing Letters, vol. 43, no. 1, pp. 85-95, 2016.

[20] E. Elhamifar and R. Vidal, "Robust classification using structured sparse representation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 1873-1879, June 2011.

[21] Z. Fan, M. Ni, Q. Zhu, and E. Liu, "Weighted sparse representation for face recognition," Neurocomputing, vol. 151, no. 1, pp. 304-309, 2015.

[22] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210-227, 2009.

[23] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, "Discriminative learned dictionaries for local image analysis," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1-8, Anchorage, Alaska, USA, June 2008.

[24] J. Zhang, D. Zhao, F. Jiang, and W. Gao, "Structural group sparse representation for image compressive sensing recovery," in Proceedings of the 2013 Data Compression Conference (DCC 2013), pp. 331-340, USA, March 2013.

[25] Y. Sun, Q. Liu, J. Tang, and D. Tao, "Learning discriminative dictionary for group sparse representation," IEEE Transactions on Image Processing, vol. 23, no. 9, pp. 3816-3828, 2014.

[26] S. S. Chen, D. L. Donoho, and M. A. Saunders, "Atomic decomposition by basis pursuit," SIAM Review, vol. 43, no. 1, pp. 129-159, 2001.

[27] X. Tang, G. Feng, and J. Cai, "Weighted group sparse representation for undersampled face recognition," Neurocomputing, vol. 145, pp. 402-415, 2014.

[28] G. Cerutti, L. Tougne, D. Coquin, and A. Vacavant, "Leaf margins as sequences: a structural approach to leaf identification," Pattern Recognition Letters, vol. 49, pp. 177-184, 2014.

[29] J.-X. Du, M.-W. Shao, C.-M. Zhai, J. Wang, Y. Tang, and C. L. P. Chen, "Recognition of leaf image set based on manifold-manifold distance," Neurocomputing, vol. 188, pp. 131-138, 2016.

[30] A. Hasim, Y. Herdiyeni, and S. Douady, "Leaf shape recognition using centroid contour distance," in Proceedings of the International Seminar on Science of Complex Natural Systems (ISS-CNS 2015), Indonesia, October 2015.

[31] B. Li, C. Wang, and D.-S. Huang, "Supervised feature extraction based on orthogonal discriminant projection," Neurocomputing, vol. 73, no. 1-3, pp. 191-196, 2009.

[32] C. C. Aggarwal and C. K. Reddy, Data Clustering: Algorithms and Applications, Chapman & Hall/CRC, 2014.

[33] http://theseedsite.co.uk/leafshapes.html

[34] https://en.wikipedia.org/wiki/Glossary_of_leaf_morphology

[35] H. Tsukaya, "Mechanism of leaf-shape determination," Annual Review of Plant Biology, vol. 57, pp. 477-496, 2006.

[36] J. Dkhar and A. Pareek, "What determines a leaf's shape?" EvoDevo, vol. 5, no. 1, article no. 47, 2014.

Page 5: ResearchArticle Discriminant WSRC for Large-Scale Plant ...downloads.hindawi.com/journals/cin/2017/9581292.pdf · ResearchArticle Discriminant WSRC for Large-Scale Plant Species Recognition

Computational Intelligence and Neuroscience 5

where 119882119903 isin 119877119899119903times119899119903 is a weighted diagonal matrix 119899119903 is theleaf image number of the 119903th similar class and the diagonalelements 1199081199031 1199081199032 119908119903119899119903 of 119882119903 are the Gaussian kerneldistances between 119910 and the candidate samples which aredefined as follows

119908119903119894 = exp(1003817100381710038171003817119910 minus 11990911990311989410038171003817100381710038172212057321 ) (6)

where 119909119903119894 is the 119894th vector of the candidate similar class and1205731is the Gaussian kernel width to adjust the weight decay speed

In (5)119882119903 can penalize the distance between 119910 and eachtraining sample of the candidate similar class Its objectiveis to avoid selecting the training samples that are far fromthe test sample to represent the test sample Because thesample numbers of the different similar class are differentafter obtaining the SR coefficient vector 119886 we compute theaverage reconstruction error of each class in the candidatesimilar class as follows

119898119888 (119910) = 11198991198881003817100381710038171003817119910 minus 11986011990311988611988810038171003817100381710038172 (7)

where 119886119888 is the coefficient subvector associated with the cthtraining class in 119886 and 119899119888 is the vector number of the cth classin the candidate similar class

Then the test image is labeled to the class that owns theminimum normalized reconstruction error119898119888(119910)

In classical WSRC [21] if the diagonal elements 1199081 1199082 119908119899 of weighted diagonal matrix119882 is defined as follows

119908119894 = 119904 (119909119894 119910) if 119904 (119909119894 119910) gt 1198790 otherwise (8)

then WSRC is DWSRCDWSRC can generate more discriminative sparse codes

which can be used to represent the test sample more robustlyby combining both linearity and locality information for im-proving recognition performance DWSRC is a constrainedLASSO problem [21] Its flowchart is shown in Figure 3

321 Feasibility Analysis In classical SRC the computationalcomplexity to obtain the SR coefficients is about 119874(1198962119899) pertest sample [14 15 17] where 119896 is the number of nonzeroentries in reconstruction coefficients Ideally all of the nonz-ero SR coefficients will be associated with the columns of theovercomplete dictionary from a single object class and thetest sample can be easily assigned to that class However infact in the training leaf image set noise outliers modelingerror and a lot of intraclass dissimilar leaves may lead tomany small nonzero SR coefficients associated with multiplespecies classesThe computational complexity to obtain the SRcoefficients is nearly 119874(1198993) per test sample [17] In DWSRCall of 119899 training samples are split into 119898 similar classesthat are used to build 119898 subdictionaries and a candidatesubdictionary is chosen for WSRC Although the number ofthe training samples from the candidate similar class is lessthan the number of all the training samples the dictionary

completeness may not be destroyed because the candidatetraining samples that are similar to the test sample areretained while the noise outliers and irregular samples thatare dissimilar to the test sample are excluded The SRcoefficients obtained byDWSRCmight not be as sparse as theones achieved by the classical SRC and WSRC but the SRcoefficients of the relevant training samples still own greatermagnitudes because all training samples of a candidatesimilar class generally play an important role in sparselyrepresenting the test sample

If we consider all SR coefficients of all similar classesexcept the candidate similar class to be zeros we also thinkDWSRC is conducted on whole the database Thus DWSRCis much sparser than SRC and WSRC On the whole inDWSRCGaussian kernel distance information is imposed onthe WSRC to suppress the small nonzero coefficients so thenumber of small nonzero SR coefficients of DWSRC shouldbe significantly reducedThe sample number in the candidatesimilar class is about 119899119898 Compared with SRC and WSRCthe computational complexity of DWSRC to obtain the SRcoefficients is about 119874(11989621119899119898) per test sample where 1198961 isthe number of nonzero entries in reconstruction coefficientsby DWSRC In general 1198961 is much less than 119896 and 119899 so thecomputational complexity of DWSRC is quite less than thatof SRC and WSRC

From the above analysis it is indicated thatDWSR ismoresuitable for the large-scale database classification problem

4 Experiments and Results

In this section we investigate the performance of the pro-posed DWSRC method for plant species recognition andcompare it with four state-of-the-art approaches includingplant species recognition using WSRC [21] Leaf MarginSequences (LMS) [28]ManifoldndashManifold Distance (MMD)[29] and Centroid Contour Distance (CCD) [30] In WSRCbased experiments the parameter selection data processingand recognition process are similar to that in DWSRC whilein LMS MMD and CCD based experiments the parameterselection data processing and recognition process are fromthe three references [28ndash30] respectively

The important parameters in DWSRC include the scalarregularization parameter 120583 two thresholds119879 and1198791 and twoGaussian kernel widths 120573 and 1205731 in (4) and (6) Among themas 120583 120573 and 1205731 are not sensitive to recognition results [13 2131]120583 in (5) is set to 0001 empirically and twoGaussian kernelwidths are set as

120573 = (119889max (119909119894 119905119895) + 119889min (119909119894 119905119895))2 1205731 = (119889max (119910 119909119903119894) + 119889min (119910 119909119903119894))2

(9)

where 119889max(119909119894 119905119895) and 119889min(119909119894 119905119895) are maximum and mini-mum Euclidean distance between 119909119894 and 119905119895 and 119889max(119910 119909119903119894)and 119889min(119910 119909119903119894) are maximum and minimum Euclideandistance between 119910 and 119909119903119894

All experiments are performed on the platform with30GHz CPU and 40GB RAM by MATLAB 2011b The

6 Computational Intelligence and Neuroscience

Figure 3: The flowchart of the proposed method. The original database is divided into similar classes 1, 2, ..., m + 1, from which subdictionaries 1, 2, ..., m + 1 are built. For a test leaf image, the similar class and subdictionary with the maximum similarity max_j s(y, t_j) are chosen; DWSRC then constructs and solves the weighted l1-norm minimization constraint problem on the chosen similar class and subdictionary, and classifies the test sample y into the class with the minimum reconstruction error r_c(y).

The MATLAB SR toolbox 1.9 (https://sites.google.com/site/sparsereptool) is used to solve the weighted l1-norm minimization problem.

4.1. Data Preparation. All plant species recognition experiments with DWSRC and the four competitive methods are conducted on the ICL Leaf database (http://pan.baidu.com/share/link?shareid=2093145911&uk=1395063007), which was collected by the Intelligent Computing Laboratory (ICL) of the Chinese Academy of Sciences. The database contains 16,846 plant leaf images from 220 plant species, with different numbers of leaf images per species and different image sizes, from a minimum of 29 × 21 to a maximum of 124 × 107 pixels. Some examples are shown in Figure 4.

From Figure 4, we find that each leaf image in ICL contains a single leaf with a simple background and without obscureness, twist, or noise, so we can omit much complex preprocessing such as filtering and enhancement. But from Figure 4(a), it can be seen that the original leaf images in the ICL database are oriented at random angles and have various shapes, colors, and sizes. Some leaf images contain footstalks, and some kinds of

leaves are similar to each other. Because the sizes of the original color leaf images differ from each other, we first transform each original image to a grayscale image and geometrically scale it to a common size by the bilinear interpolation algorithm. From Figure 4(b), it is found that the 30 cotton plant leaves have large intraclass differences with low-intensity white backgrounds. To improve the recognition rate, several preprocessing steps prior to species recognition are employed as follows:
(1) Convert each color leaf image into a grayscale image.
(2) Separate the background from the leaf image by a small threshold (set to 30); that is, pixel values less than 30 are set to 0 under a gray level of 255.
(3) Reduce noise by a median filter of radius 10.
(4) Cut footstalks off using contour information to detect narrow parallel running chains and remove the longest chain [6, 30].
(5) To align all leaf images, extract the angle by which the major axis of the leaf is oriented with respect to the vertical line, and rotate the image by that angle so that the major axis is aligned with the vertical line;


Figure 4: Leaf image samples of the ICL database. (a) Leaf image samples of different species; (b) 30 leaf images (157(1)–157(30)) of the cotton plant.

finally, remove the white bounding rectangle to superimpose the leaf over a homogeneous background.
(6) Extract the leaf contour by the Canny edge algorithm [30].
(7) Crop each contour to contain only the extents of the leaf, and then resize it to a square of 32 × 32 by the bilinear interpolation algorithm.
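A simplified numpy sketch of part of this pipeline, covering steps (1), (2), and (7) and using a nearest-neighbour resize as a stand-in for the paper's bilinear interpolation (footstalk removal, median filtering, and axis alignment are omitted; the function name is ours):

```python
import numpy as np

def preprocess(rgb, thresh=30, out_size=32):
    """Grayscale conversion, background suppression, and resizing."""
    # (1) Luminance grayscale, values in 0..255
    gray = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    # (2) Pixels darker than the threshold are treated as background
    gray[gray < thresh] = 0
    # (7) Nearest-neighbour resize to out_size x out_size
    rows = (np.arange(out_size) * gray.shape[0] / out_size).astype(int)
    cols = (np.arange(out_size) * gray.shape[1] / out_size).astype(int)
    return gray[np.ix_(rows, cols)]

img = np.random.randint(0, 256, (120, 90, 3)).astype(float)  # hypothetical input
leaf = preprocess(img)
```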

Figure 5 shows a leaf preprocessing example following the above steps.

4.2. Subdictionary Construction. From Figure 4(a), we see that there are innumerable varieties of plant leaves, but these leaves can be clustered into several similar classes by leaf shape description or similarity measure, such as Acerose, Linear, Gladiate, Ensiform, Oblanceolate, Ovate, Oval, Obovate, and Elliptic. The number of classes can be determined by K-means clustering and K-fusion clustering algorithms [32]. We referred to two websites, http://theseedsite.co.uk/leafshapes.html and https://en.wikipedia.org/wiki/Glossary_of_leaf_morphology, and four references [33–36], and defined the

class number as 21. In fact, in nature we can choose 20 kinds of typical leaves, as shown in Figure 6, and find the leaves similar to each typical leaf in the ICL database. Then we obtain 21 similar classes.

Suppose t_i (i = 1, 2, ..., 20) is the ith representative leaf image, x_j (j = 1, 2, ..., 16846) is the jth leaf image of the ICL database, and the similarity s(t_i, x_j) between t_i and x_j is calculated by (4). Figure 7 shows the similarities between a given dentate-shape leaf image and each leaf image of the ICL database. In order to clearly show the visual process of selecting the candidate training leaf images, Figure 7 only shows partial results, that is, about 6300 similarities.

From Figure 7, we can observe that only a few leaf images are similar to the given test leaf image. We set the threshold T = 0.5; that is, if s(t_i, x_j) ≥ 0.5, x_j is classified into the ith similar class. After most of the leaf images of the ICL database are divided into the 20 similar classes, the remaining images are clustered into the 21st similar class. Finally, we build 21 subdictionaries by stacking all preprocessed contours of each similar class as the columns of a matrix, respectively.
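This partition can be sketched as follows, assuming the similarity of (4) has the Gaussian form s(t_i, x_j) = exp(-||t_i - x_j||² / β²) — the exact form of (4) is not reproduced in this section, so that form and the function name are our assumptions:

```python
import numpy as np

def build_subdictionaries(X, typicals, beta, T=0.5):
    """Assign each training contour to the similar class of its most
    similar typical leaf if the similarity clears the threshold T;
    everything else goes into the extra (21st) class."""
    # Gaussian similarity between every sample and every typical leaf
    sims = np.exp(-((X[:, None, :] - typicals[None, :, :]) ** 2)
                  .sum(-1) / beta ** 2)           # shape (n_samples, n_typicals)
    best = sims.argmax(axis=1)
    classes = [[] for _ in range(len(typicals) + 1)]
    for j, (b, s) in enumerate(zip(best, sims.max(axis=1))):
        classes[b if s >= T else -1].append(j)
    # Each subdictionary stacks the members of its class as columns
    return [X[idx].T for idx in classes]
```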


Figure 5: The leaf preprocessing example (original image, gray image, footstalks cut, aligned image, binary image, contour).

Figure 6: 21 kinds of typical leaves. (a) 20 different kinds of typical leaves: Line, Round, Dentate, Polygon, Crenate, Lobed, Palmatifid, Lanceolate, Spatulate, Deltoid, Ovate, Obcordate, Cordate, Reniform, Hastate, Subulate, Peltate, Flabellate, Obovate, and Pinnatifid; (b) others.

Figure 7: Partial similarities between the dentate leaf image and each image of the ICL database (horizontal axis: training sample number, 0–6000; vertical axis: similarity, 0–1).

4.3. Classifying Plant Species by DWSRC. Given a new test leaf image y, we calculate the similarity between y and each typical leaf t_i (i = 1, 2, ..., 20). We set the second threshold T1 = 0.1; that is, if s(y, t_j) < 0.1 for every j = 1, 2, ..., 20, we select the 21st similar class as the candidate similar class and A_21 as the candidate subdictionary. Otherwise, we select the candidate similar class and candidate subdictionary with the maximum similarity s(y, t_j) (j = 1, 2, ..., 20).
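This selection rule can be sketched as follows (again assuming a Gaussian form for the similarity of (4); the function name is ours):

```python
import numpy as np

def choose_subdictionary(y, typicals, subdicts, beta1, T1=0.1):
    """Pick the subdictionary of the typical leaf most similar to y;
    if no similarity reaches T1, fall back to the extra class
    (A_21 in the paper)."""
    sims = np.exp(-((typicals - y) ** 2).sum(axis=1) / beta1 ** 2)
    if sims.max() < T1:
        return len(subdicts) - 1, subdicts[-1]   # the 21st similar class
    k = int(sims.argmax())
    return k, subdicts[k]
```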

Figure 8: The similarity between y and each typical leaf t_i (i = 1, 2, ..., 20), where y is a cotton leaf (horizontal axis: typical leaf image number, 0–20; vertical axis: similarity, 0–0.5).

Figure 8 shows a similarity example where the test image is a cotton plant leaf image. From Figure 8, the maximum similarity is s(t_4, y) = 0.4269. We then choose the fourth similar class and the fourth subdictionary A_4 to construct DWSRC for classifying the test image. We calculate the SR coefficients


Table 1: Average recognition rates (%, mean ± standard deviation) and running times of WSRC, LMS, MMD, CCD, and DWSRC.

Method            WSRC           LMS            MMD            CCD            DWSRC
Recognition rate  88.27 ± 1.89   82.73 ± 1.73   86.51 ± 1.46   81.62 ± 2.53   91.12 ± 1.25
Running time (s)  12.18          8.42           9.36           7.23           4.37

Table 2: Average recognition rates (%, mean ± standard deviation) and running times of WSRC, LMS, MMD, CCD, and DWSRC.

Method            WSRC           LMS            MMD            CCD            DWSRC
Recognition rate  86.56 ± 2.12   82.50 ± 2.06   85.43 ± 1.52   81.17 ± 2.56   90.64 ± 1.42
Running time (s)  14.32          9.53           10.16          6.47           5.16

and recognize the test leaf by the minimum reconstruction error.
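This final stage can be sketched with a weighted iterative soft-thresholding (ISTA) solver as a stand-in for the MATLAB SR toolbox solver the paper actually uses, followed by minimum-reconstruction-error classification. The weight vector w would come from the distances between the test sample and the columns of the chosen subdictionary; all function names here are ours:

```python
import numpy as np

def weighted_l1_solve(A, y, w, mu=1e-3, n_iter=500):
    """min_c 0.5 * ||y - A c||^2 + mu * sum_i w_i |c_i|, solved by ISTA."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    c = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = c - A.T @ (A @ c - y) / L      # gradient step
        thr = mu * w / L                   # per-coefficient threshold
        c = np.sign(g) * np.maximum(np.abs(g) - thr, 0.0)  # weighted shrinkage
    return c

def classify(A, labels, y, c):
    """Assign y to the class with the minimum reconstruction error r_c(y)."""
    errs = {k: np.linalg.norm(y - A[:, labels == k] @ c[labels == k])
            for k in np.unique(labels)}
    return min(errs, key=errs.get)
```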

Figures 7 and 8 illustrate the process of dividing the whole dataset into several similar classes and constructing the subdictionary for the SRC algorithm, where the training set consists of all images of the ICL database and the test sample can be any leaf image.

4.4. Experimental Results. To illustrate how DWSRC and the four competitive methods work, two kinds of experiments are conducted.
(1) From the ICL database, we randomly select 1100 images from the 220 species, with 5 images per plant species, as the test set and the rest as the training set. The training set is divided into 21 similar classes to build the subdictionaries, and each test leaf is recognized by DWSRC, where the typical samples are shown in Figure 6(a). The random training/test partition is repeated 50 times. The average recognition results of the 50 experiments of the five methods are shown in Table 1.
(2) From the ICL database, we randomly select 2000 images from the 16,846 plant leaf images as the test set and the rest as the training set, and repeat the training and testing procedure 50 times. Each time, the training set is also divided into 21 similar classes to build the subdictionaries, and each test leaf is recognized by DWSRC, where the typical samples are also shown in Figure 6(a). The average recognition results of the 50 experiments of the five methods are shown in Table 2.
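Experiment (1)'s per-species hold-out split can be sketched as follows (variable and function names are ours):

```python
import random
from collections import defaultdict

def split_by_species(samples, per_species=5, seed=0):
    """Hold out per_species random images of each species as the test
    set; the rest form the training set. `samples` is a list of
    (image_id, species_id) pairs."""
    rng = random.Random(seed)
    by_species = defaultdict(list)
    for img, sp in samples:
        by_species[sp].append(img)
    test, train = [], []
    for sp, imgs in by_species.items():
        held = set(rng.sample(imgs, per_species))  # sample without replacement
        test += [(i, sp) for i in imgs if i in held]
        train += [(i, sp) for i in imgs if i not in held]
    return train, test
```

Repeating this with 50 different seeds and averaging the per-run accuracies reproduces the protocol described above.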

From Tables 1 and 2, we can see that the recognition performance of DWSRC is the best. The main reason might be that DWSRC employs only the candidate similar class to solve the SR problem and recognize the test samples, and it makes use of the similarity relationship between the test sample and each training sample. So the computational cost is significantly reduced, and the side effects of outliers, noise, and modeling error on the recognition performance are greatly suppressed. The recognition rate of WSRC is higher than those of LMS, MMD, and CCD, but its computational time is the longest; most of the time is taken to solve the l1-norm minimization problem. LMS, MMD, and CCD need to extract classifying features from each leaf image, which is time-consuming, and they achieve lower recognition rates than WSRC and DWSRC; because real-world leaves of the same species differ from each other, it is difficult to extract the "optimal" features from each leaf image.

5. Conclusions

For the large-scale plant species recognition problem, we proposed a new SRC approach, discriminant weighted SRC (DWSRC), which aims to exploit the locality and similarity of the original dataset and the test samples in sparsely representing the test samples. It makes use of the prior structure information of the dataset and the distances between the test sample and the training samples to construct the SR problem. To represent a test sample, we not only consider the original data structure of the training samples but also take into account the locality and the similarity between the test sample and each training sample. The SR coefficients of the test sample maintain a sparse structure and carry more discriminative information. DWSRC is essentially a generalization of WSRC: unlike WSRC, we construct several subdictionaries prior to SRC by dividing all training samples into several similar classes and then choose a candidate subdictionary for a given test sample to implement WSRC. Thus, the computational cost of solving the l1-norm minimization problem is greatly reduced. In this sense, DWSRC is well suited to large-scale plant species recognition, as it can improve the recognition rate and reduce the computational cost. Recently, deep learning has led to a series of breakthroughs in many pattern recognition research areas; we will study deep learning for leaf-based species recognition in future work.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is supported by the China National Natural Science Foundation under Grant no. 61473237. It is partially supported by the Tianjin Research Program of Application Foundation and Advanced Technology (no. 14JCYBJC42500) and the Tianjin Science and Technology Correspondent Project (no. 16JCTPJC47300). The authors would like to thank the providers of the ICL Leaf database for their constructive advice.

References

[1] M.-E. Nilsback and A. Zisserman, "Automated flower classification over a large number of classes," in Proceedings of the 6th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP '08), pp. 722–729, December 2008.


[2] M. Seeland, M. Rzanny, N. Alaqraa, J. Wäldchen, P. Mäder, and Y. Zhang, "Plant species classification using flower images - a comparative study of local feature representations," PLoS ONE, vol. 12, no. 2, p. e0170629, 2017.

[3] N. Jamil, N. A. C. Hussin, S. Nordin, and K. Awang, "Automatic plant identification: is shape the key feature?" Procedia Computer Science, vol. 76, pp. 436–442, 2015.

[4] E. Mata-Montero and J. Carranza-Rojas, "Automated plant species identification: challenges and opportunities," IFIP Advances in Information and Communication Technology, vol. 481, pp. 26–36, 2016.

[5] J. Wäldchen and P. Mäder, "Plant species identification using computer vision techniques: a systematic literature review," Archives of Computational Methods in Engineering, pp. 1–37, 2017.

[6] T. Munisami, M. Ramsurn, S. Kishnah, and S. Pudaruth, "Plant leaf recognition using shape features and colour histogram with K-nearest neighbour classifiers," Procedia Computer Science, vol. 58, pp. 740–747, 2015.

[7] J. Chaki, R. Parekh, and S. Bhattacharya, "Plant leaf recognition using texture and shape features with neural classifiers," Pattern Recognition Letters, vol. 58, pp. 61–68, 2015.

[8] C. Zhao, S. S. F. Chan, W.-K. Cham, and L. M. Chu, "Plant identification using leaf shapes - a pattern counting approach," Pattern Recognition, vol. 48, no. 10, pp. 3203–3215, 2015.

[9] M. M. S. de Souza, F. N. S. Medeiros, G. L. B. Ramalho, I. C. de Paula Jr., and I. N. S. Oliveira, "Evolutionary optimization of a multiscale descriptor for leaf shape analysis," Expert Systems with Applications, vol. 63, pp. 375–385, 2016.

[10] N. Liu and J.-M. Kan, "Improved deep belief networks and multi-feature fusion for leaf identification," Neurocomputing, vol. 216, pp. 460–467, 2016.

[11] B. VijayaLakshmi and V. Mohan, "Kernel-based PSO and FRVM: an automatic plant leaf type detection using texture, shape, and color features," Computers and Electronics in Agriculture, vol. 125, pp. 99–112, 2016.

[12] T. P. Kumar, M. V. P. Reddy, and P. K. Bora, "Leaf identification using shape and texture features," Advances in Intelligent Systems and Computing, vol. 460, pp. 531–541, 2017.

[13] Z. Zhang, Y. Xu, J. Yang, X. Li, and D. Zhang, "A survey of sparse representation: algorithms and applications," IEEE Access, vol. 3, pp. 490–530, 2015.

[14] H. Cheng, "Sparse representation, modeling and learning in visual recognition: theory, algorithms and applications," Advances in Computer Vision & Pattern Recognition, vol. 6, pp. 1408–1425.

[15] J.-K. Hsiao, L.-W. Kang, C.-L. Chang, and C.-Y. Lin, "Learning-based leaf image recognition frameworks," Studies in Computational Intelligence, vol. 591, pp. 77–91, 2015.

[16] T. Jin, X. Hou, P. Li, and F. Zhou, "A novel method of automatic plant species identification using sparse representation of leaf tooth features," PLoS ONE, vol. 10, no. 10, Article ID e0139482, 2015.

[17] C.-G. Li, J. Guo, and H.-G. Zhang, "Local sparse representation based classification," in Proceedings of the 20th International Conference on Pattern Recognition (ICPR '10), pp. 649–652, 2010.

[18] J.-X. Mi and J.-X. Liu, "Face recognition using sparse representation-based classification on K-nearest subspace," PLoS ONE, vol. 8, no. 3, Article ID e59430, 2013.

[19] Q. Liu, "Kernel local sparse representation based classifier," Neural Processing Letters, vol. 43, no. 1, pp. 85–95, 2016.

[20] E. Elhamifar and R. Vidal, "Robust classification using structured sparse representation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 1873–1879, June 2011.

[21] Z. Fan, M. Ni, Q. Zhu, and E. Liu, "Weighted sparse representation for face recognition," Neurocomputing, vol. 151, no. 1, pp. 304–309, 2015.

[22] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210–227, 2009.

[23] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, "Discriminative learned dictionaries for local image analysis," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–8, Anchorage, Alaska, USA, June 2008.

[24] J. Zhang, D. Zhao, F. Jiang, and W. Gao, "Structural group sparse representation for image compressive sensing recovery," in Proceedings of the 2013 Data Compression Conference (DCC 2013), pp. 331–340, USA, March 2013.

[25] Y. Sun, Q. Liu, J. Tang, and D. Tao, "Learning discriminative dictionary for group sparse representation," IEEE Transactions on Image Processing, vol. 23, no. 9, pp. 3816–3828, 2014.

[26] S. S. Chen, D. L. Donoho, and M. A. Saunders, "Atomic decomposition by basis pursuit," SIAM Review, vol. 43, no. 1, pp. 129–159, 2001.

[27] X. Tang, G. Feng, and J. Cai, "Weighted group sparse representation for undersampled face recognition," Neurocomputing, vol. 145, pp. 402–415, 2014.

[28] G. Cerutti, L. Tougne, D. Coquin, and A. Vacavant, "Leaf margins as sequences: a structural approach to leaf identification," Pattern Recognition Letters, vol. 49, pp. 177–184, 2014.

[29] J.-X. Du, M.-W. Shao, C.-M. Zhai, J. Wang, Y. Tang, and C. L. P. Chen, "Recognition of leaf image set based on manifold-manifold distance," Neurocomputing, vol. 188, pp. 131–138, 2016.

[30] A. Hasim, Y. Herdiyeni, and S. Douady, "Leaf shape recognition using centroid contour distance," in Proceedings of the International Seminar on Science of Complex Natural Systems (ISS-CNS 2015), Indonesia, October 2015.

[31] B. Li, C. Wang, and D.-S. Huang, "Supervised feature extraction based on orthogonal discriminant projection," Neurocomputing, vol. 73, no. 1-3, pp. 191–196, 2009.

[32] C. C. Aggarwal and C. K. Reddy, Data Clustering: Algorithms and Applications, Chapman & Hall/CRC, 2014.

[33] http://theseedsite.co.uk/leafshapes.html

[34] https://en.wikipedia.org/wiki/Glossary_of_leaf_morphology

[35] H. Tsukaya, "Mechanism of leaf-shape determination," Annual Review of Plant Biology, vol. 57, pp. 477–496, 2006.

[36] J. Dkhar and A. Pareek, "What determines a leaf's shape?" EvoDevo, vol. 5, no. 1, article no. 47, 2014.


[12] T P Kumar M V P Reddy and P K Bora ldquoLeaf identificationusing shape and texture featuresrdquoAdvances in Intelligent Systemsand Computing vol 460 pp 531ndash541 2017

[13] Z Zhang Y Xu J Yang X Li andD Zhang ldquoA survey of sparserepresentation algorithms and applicationsrdquo IEEEAccess vol 3pp 490ndash530 2015

[14] H Cheng ldquoSparse Representation Modeling and Learningin Visual Recognition Theory Algorithms and ApplicationsrdquoAdvances in Computer Vision amp Pattern Recognition vol 6 pp1408ndash1425 93

[15] J-K Hsiao L-W Kang C-L Chang and C-Y Lin ldquoLearning-based leaf image recognition frameworksrdquo Studies in Computa-tional Intelligence vol 591 pp 77ndash91 2015

[16] T Jin X Hou P Li and F Zhou ldquoA novel method of automaticplant species identification using sparse representation of leaftooth featuresrdquo PLoS ONE vol 10 no 10 Article ID e01394822015

[17] C-G Li J Guo and H-G Zhang ldquoLocal sparse representationbased classificationrdquo in Proceedings of the 20th InternationalConference on Pattern Recognition (ICPR rsquo10) pp 649ndash652 2010

[18] J-X Mi and J-X Liu ldquoFace Recognition Using Sparse Repre-sentation-Based Classification on K-Nearest Subspacerdquo PLoSONE vol 8 no 3 Article ID e59430 2013

[19] Q Liu ldquoKernel Local Sparse Representation Based ClassifierrdquoNeural Processing Letters vol 43 no 1 pp 85ndash95 2016

[20] E Elhamifar and R Vidal ldquoRobust classification using struc-tured sparse representationrdquo in Proceedings of the IEEE Confer-ence on Computer Vision and Pattern Recognition (CVPR rsquo11)pp 1873ndash1879 June 2011

[21] Z Fan M Ni Q Zhu and E Liu ldquoWeighted sparse represen-tation for face recognitionrdquo Neurocomputing vol 151 no 1 pp304ndash309 2015

[22] JWright A Y Yang A Ganesh S S Sastry and YMa ldquoRobustface recognition via sparse representationrdquo IEEE Transactionson Pattern Analysis and Machine Intelligence vol 31 no 2 pp210ndash227 2009

[23] J Mairal F Bach J Ponce G Sapiro and A ZissermanldquoDiscriminative learned dictionaries for local image analysisrdquo inProceedings of the 26th IEEEConference onComputer Vision andPattern Recognition (CVPR rsquo08) pp 1ndash8 Anchorage AlaskaUSA June 2008

[24] J Zhang D Zhao F Jiang and W Gao ldquoStructural groupsparse representation for image Compressive Sensing recoveryrdquoin Proceedings of the 2013 Data Compression Conference DCC2013 pp 331ndash340 USA March 2013

[25] Y Sun Q Liu J Tang and D Tao ldquoLearning discriminativedictionary for group sparse representationrdquo IEEE Transactionson Image Processing vol 23 no 9 pp 3816ndash3828 2014

[26] S S Chen D L Donoho and M A Saunders ldquoAtomic decom-position by basis pursuitrdquo SIAM Review vol 43 no 1 pp 129ndash159 2001

[27] X TangG Feng and J Cai ldquoWeighted group sparse representa-tion for undersampled face recognitionrdquo Neurocomputing vol145 pp 402ndash415 2014

[28] G Cerutti L Tougne D Coquin and A Vacavant ldquoLeaf mar-gins as sequences A structural approach to leaf identificationrdquoPattern Recognition Letters vol 49 pp 177ndash184 2014

[29] J-X Du M-W Shao C-M Zhai J Wang Y Tang and CL P Chen ldquoRecognition of leaf image set based on manifold-manifold distancerdquo Neurocomputing vol 188 pp 131ndash138 2016

[30] A Hasim Y Herdiyeni and S Douady ldquoLeaf Shape Recogni-tion using Centroid Contour Distancerdquo in Proceedings of theInternational Seminar on Science of Complex Natural Systems2015 ISS-CNS 2015 Indonesia October 2015

[31] B Li CWang and D-S Huang ldquoSupervised feature extractionbased on orthogonal discriminant projectionrdquoNeurocomputingvol 73 no 1-3 pp 191ndash196 2009

[32] C C Aggarwal and C K Reddy Data Clustering Algorithmsand Applications Chapman amp HallCRC 2014

[33] httptheseedsitecoukleafshapeshtml[34] httpsenwikipediaorgwikiGlossary of leaf morphology[35] H Tsukaya ldquoMechanism of leaf-shape determinationrdquo Annual

Review of Plant Biology vol 57 pp 477ndash496 2006[36] J Dkhar and A Pareek ldquoWhat determines a leaf rsquos shaperdquo

EvoDevo vol 5 no 1 article no 47 2014

Submit your manuscripts athttpswwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 201

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 201

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 7: ResearchArticle Discriminant WSRC for Large-Scale Plant ...downloads.hindawi.com/journals/cin/2017/9581292.pdf · ResearchArticle Discriminant WSRC for Large-Scale Plant Species Recognition

Computational Intelligence and Neuroscience 7

Figure 4: Leaf image samples from the ICL database. (a) Leaf image samples of different species; (b) 30 leaf images of the cotton plant (samples 157(1)–157(30)).

finally, remove the white bounding rectangle to superimpose the leaf over a homogeneous background.
(6) Extract the leaf contour by the Canny edge algorithm [30].
(7) Crop each contour to the size that contains only the extent of the leaf, and then resize it to 32 × 32 by the bilinear interpolation algorithm.

Figure 5 shows a leaf preprocessing example produced by the above steps.
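The preprocessing pipeline above can be sketched in outline; the following is a minimal NumPy illustration (thresholding, bounding-box cropping, and bilinear resizing to 32 × 32), assuming a grayscale image with values in [0, 1] and a dark leaf on a light background. The Canny contour extraction of step (6) is omitted for brevity; all function names are illustrative, not the paper's.

```python
import numpy as np

def binarize(gray, thresh=0.5):
    # Assume a dark leaf on a light background; 1 marks leaf pixels.
    return (gray < thresh).astype(float)

def crop_to_extent(mask):
    # Crop to the bounding box containing only the extent of the leaf.
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    return mask[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

def bilinear_resize(img, out_h=32, out_w=32):
    # Resize to out_h x out_w by bilinear interpolation.
    h, w = img.shape
    ys = np.linspace(0.0, h - 1.0, out_h)
    xs = np.linspace(0.0, w - 1.0, out_w)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    top = img[np.ix_(y0, x0)] * (1 - wx) + img[np.ix_(y0, x1)] * wx
    bot = img[np.ix_(y1, x0)] * (1 - wx) + img[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy

# Example: a synthetic 100 x 80 "leaf" image.
gray = np.ones((100, 80))
gray[20:70, 10:60] = 0.1             # dark leaf region
patch = bilinear_resize(crop_to_extent(binarize(gray)))
print(patch.shape)                   # (32, 32)
```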

4.2. Subdictionary Construction. From Figure 4(a) we see that there are innumerable plant leaf varieties, but these leaves can be clustered into several similar classes by leaf shape description or a similarity measure, for example, Acerose, Linear, Gladiate, Ensiform, Oblanceolate, Ovate, Oval, Obovate, and Elliptic. The number of classes can be determined by the K-means and K-fusion clustering algorithms [32]. We referred to two websites, http://theseedsite.co.uk/leafshapes.html and https://en.wikipedia.org/wiki/Glossary_of_leaf_morphology, and four references [33–36], and defined the class number as 21. In practice, we choose 20 kinds of typical leaves, as shown in Figure 6, and find the leaves similar to each typical leaf in the ICL database; together with the remaining leaves, we obtain 21 similar classes.

Suppose t_i (i = 1, 2, ..., 20) is the ith representative leaf image and x_j (j = 1, 2, ..., 16846) is the jth leaf image of the ICL database; the similarity s(t_i, x_j) between t_i and x_j is calculated by (4). Figure 7 shows the similarities between a given dentate-shape leaf image and each leaf image of the ICL database. In order to clearly show the visual process of selecting the candidate training leaf images, Figure 7 only shows partial results, that is, about 6300 similarities.

From Figure 7 we can observe that a few leaf images are similar to the given test leaf image. We set the threshold T = 0.5; that is, if s(t_i, x_j) ≥ 0.5, we classify x_j into the ith similar class. After most of the leaf images of the ICL database have been divided into the 20 similar classes, the remaining images are clustered into the 21st similar class. Finally, we build 21 subdictionaries by stacking all preprocessed contours of each similar class as the columns of a matrix, respectively.
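The thresholded partition above can be sketched as follows. This is a hypothetical NumPy illustration in which cosine similarity stands in for the paper's similarity measure of Eq. (4) (not reproduced here), with T = 0.5 and an extra "others" class collecting the leftovers.

```python
import numpy as np

def similarity(a, b):
    # Stand-in for the paper's similarity measure, Eq. (4).
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def build_subdictionaries(typicals, samples, T=0.5):
    """Assign each sample to its most similar typical class when the
    similarity reaches T; remaining samples form the last class."""
    K = len(typicals)
    groups = [[] for _ in range(K + 1)]
    for x in samples:
        sims = [similarity(t, x) for t in typicals]
        k = int(np.argmax(sims))
        groups[k if sims[k] >= T else K].append(x)
    # Stack each class's (preprocessed) samples as columns of a subdictionary.
    d = samples[0].size
    return [np.column_stack(g) if g else np.zeros((d, 0)) for g in groups]

typicals = [np.array([1.0, 0, 0, 0]), np.array([0, 1.0, 0, 0])]
samples = [np.array([0.9, 0.1, 0, 0]),   # similar to typical 0
           np.array([0, 0, 1.0, 0])]     # similar to neither -> "others"
subdicts = build_subdictionaries(typicals, samples)
print([D.shape[1] for D in subdicts])    # [1, 0, 1]
```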


Figure 5: The leaf preprocessing example (original image → gray image → footstalk removal → aligned image → binary image → contour).

Figure 6: 21 kinds of typical leaves. (a) 20 different kinds of typical leaves: Line, Round, Dentate, Polygon, Crenate, Lobed, Palmatifid, Lanceolate, Spatulate, Deltoid, Ovate, Obcordate, Cordate, Reniform, Hastate, Subulate, Peltate, Flabellate, Obovate, and Pinnatifid; (b) others.

Figure 7: Partial similarities between the dentate leaf image and each image of the ICL database (vertical axis: similarity, 0–1; horizontal axis: training sample number, 0–6000).

4.3. Classifying Plant Species by DWSRC. Given a new test leaf image y, we calculate the similarity between y and each typical leaf t_i (i = 1, 2, ..., 20). We set a second threshold T1 = 0.1: if s(y, t_j) < 0.1 for every j = 1, 2, ..., 20, we select the 21st similar class as the candidate similar class and A_21 as the candidate subdictionary; otherwise, we select the candidate similar class and candidate subdictionary with the maximum similarity s(y, t_j) (j = 1, 2, ..., 20).
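The two-threshold selection rule can be sketched as a small helper; as before, cosine similarity is a labeled stand-in for Eq. (4), and the function name and signature are illustrative.

```python
import numpy as np

def similarity(a, b):
    # Stand-in for the paper's similarity measure, Eq. (4).
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def select_subdictionary(y, typicals, subdicts, T1=0.1):
    """Return the index and subdictionary of the candidate similar class:
    the last ("others") class when every similarity falls below T1,
    otherwise the class of maximum similarity."""
    sims = np.array([similarity(y, t) for t in typicals])
    if np.all(sims < T1):
        return len(typicals), subdicts[-1]
    k = int(np.argmax(sims))
    return k, subdicts[k]

typicals = [np.array([1.0, 0, 0]), np.array([0, 1.0, 0])]
subdicts = [np.eye(3)[:, :1], np.eye(3)[:, 1:2], np.eye(3)[:, 2:]]
k, D = select_subdictionary(np.array([0.2, 0.9, 0.0]), typicals, subdicts)
print(k)                                  # 1
k2, D2 = select_subdictionary(np.array([0.0, 0.0, 1.0]), typicals, subdicts)
print(k2)                                 # 2 (the "others" class)
```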

Figure 8: The similarity between y and each typical leaf t_i (i = 1, 2, ..., 20), where y is a cotton leaf (vertical axis: similarity, 0–0.5; horizontal axis: typical leaf image number, 0–20).

Figure 8 is a similarity example where the test image is a cotton plant leaf image. From Figure 8, the maximum similarity is s(t_4, y) = 0.4269. We therefore choose the fourth similar class and the fourth subdictionary A_4 to construct DWSRC for classifying the test image. We calculate the SR coefficients


Table 1: Average recognition rates (%), standard deviations, and running times of WSRC, LMS, MMD, CCD, and DWSRC (experiment 1).

Method | WSRC | LMS | MMD | CCD | DWSRC
Recognition rate (%) | 88.27 ± 1.89 | 82.73 ± 1.73 | 86.51 ± 1.46 | 81.62 ± 2.53 | 91.12 ± 1.25
Running time (s) | 12.18 | 8.42 | 9.36 | 7.23 | 4.37

Table 2: Average recognition rates (%), standard deviations, and running times of WSRC, LMS, MMD, CCD, and DWSRC (experiment 2).

Method | WSRC | LMS | MMD | CCD | DWSRC
Recognition rate (%) | 86.56 ± 2.12 | 82.50 ± 2.06 | 85.43 ± 1.52 | 81.17 ± 2.56 | 90.64 ± 1.42
Running time (s) | 14.32 | 9.53 | 10.16 | 6.47 | 5.16

and recognize the test leaf by the minimum reconstruction error.

Figures 7 and 8 illustrate the visual process of dividing the whole dataset into several similar classes and constructing the subdictionary for the SRC algorithm, where the training set consists of all images of the ICL database and the test set is any leaf image.
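The final step, sparsely representing y over the chosen subdictionary and classifying by the minimum reconstruction error, can be sketched as follows. This is a simplified stand-in for the paper's WSRC solver: the weighted l1-minimization is approximated by a few hundred iterations of ISTA (proximal gradient descent), with locality weights taken as distances between y and the dictionary atoms. Function names and parameters are illustrative assumptions, not the paper's.

```python
import numpy as np

def wsrc_classify(y, A, labels, lam=0.01, n_iter=500):
    """Weighted-SRC sketch: ISTA for min ||y - A c||^2 + lam * ||diag(w) c||_1,
    then assign the label whose atoms give the minimum reconstruction error."""
    A = A / (np.linalg.norm(A, axis=0, keepdims=True) + 1e-12)  # unit columns
    y = y / (np.linalg.norm(y) + 1e-12)
    w = np.linalg.norm(A - y[:, None], axis=0)    # farther atoms cost more
    L = np.linalg.norm(A, 2) ** 2                 # step-size scale (spectral norm^2)
    c = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = c - A.T @ (A @ c - y) / L             # gradient step on the fit term
        c = np.sign(z) * np.maximum(np.abs(z) - lam * w / L, 0.0)  # soft threshold
    errors = {k: np.linalg.norm(y - A[:, labels == k] @ c[labels == k])
              for k in np.unique(labels)}
    return min(errors, key=errors.get)

# Two classes of atoms; the test sample is a noisy copy of a class-0 atom.
A = np.array([[1.0, 0.9, 0.0, 0.0],
              [0.0, 0.1, 1.0, 0.9],
              [0.0, 0.0, 0.0, 0.1]])
labels = np.array([0, 0, 1, 1])
y = np.array([1.0, 0.05, 0.0])
pred = wsrc_classify(y, A, labels)
print(pred)                               # 0
```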

4.4. Experimental Results. To illustrate how DWSRC and four competitive methods work, two kinds of experiments are conducted.
(1) From the ICL database, we randomly select 1100 images from 220 species, with 5 images per plant species, as the test set and use the rest as the training set. The training set is divided into 21 similar classes to build the subdictionaries, and each test leaf is recognized by DWSRC, where the typical samples are those shown in Figure 6(a). The random training/test partition is repeated 50 times. The average recognition results of the five methods over the 50 runs are shown in Table 1.
(2) From the ICL database, we randomly select 2000 images from the 16846 plant leaf images as the test set, use the rest as the training set, and repeat the training and testing procedure 50 times. Each time, the training set is again divided into 21 similar classes to build the subdictionaries, and each test leaf is recognized by DWSRC, with the typical samples of Figure 6(a). The average recognition results of the five methods over the 50 runs are shown in Table 2.

From Tables 1 and 2 we can see that the recognition performance of DWSRC is the best. The main reason might be that DWSRC employs only the candidate similar class to solve the SR problem and recognize the test samples, and it makes use of the similarity relationship between the test sample and each training sample. The computational cost is thus significantly reduced, and the side effects of outliers, noise, and modeling error on the recognition performance are greatly suppressed. The recognition rate of WSRC is higher than those of LMS, MMD, and CCD, but its computational time is the longest; most of the time is spent solving the l1-norm minimization problem. Since LMS, MMD, and CCD need to extract classifying features from each leaf image, which is time-consuming, they have lower recognition rates than WSRC and DWSRC. Because real-world leaves of the same species differ from one another, it is difficult to extract the "optimal" features from each leaf image.

5. Conclusions

For the large-scale plant species recognition problem, we proposed a new SRC approach, discriminant weighted SRC (DWSRC), which exploits the locality and similarity of the original dataset and the test samples in sparsely representing the test samples. It makes use of the prior structure information of the dataset and the distance between the test sample and the training samples to construct the SR problem. To represent a test sample, we consider not only the original data structure of the training samples but also the locality and the similarity between the test sample and each training sample. The SR coefficients of the test sample thus maintain a sparse structure and carry more discriminative information. DWSRC is essentially a generalization of WSRC: unlike WSRC, we construct several subdictionaries prior to SRC by dividing all training samples into several similar classes and then choose a candidate subdictionary for a given test sample on which to implement WSRC. Thus, the computational cost of solving the l1-norm minimization problem is greatly reduced. In this sense, DWSRC is well suited to large-scale plant species recognition, as it can improve the recognition rate while reducing the computational cost. Recently, deep learning has led to a series of breakthroughs in many pattern recognition research areas; we will study deep learning for leaf-based species recognition in future work.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is supported by the China National Natural Science Foundation under Grant no. 61473237. This work is partially supported by the Tianjin Research Program of Application Foundation and Advanced Technology (no. 14JCYBJC42500) and the Tianjin Science and Technology Correspondent Project (no. 16JCTPJC47300). The authors would like to thank the ICL Leaf database for their constructive advice.

References

[1] M.-E. Nilsback and A. Zisserman, "Automated flower classification over a large number of classes," in Proceedings of the 6th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP '08), pp. 722–729, December 2008.
[2] M. Seeland, M. Rzanny, N. Alaqraa, J. Wäldchen, P. Mäder, and Y. Zhang, "Plant species classification using flower images—a comparative study of local feature representations," PLoS ONE, vol. 12, no. 2, p. e0170629, 2017.
[3] N. Jamil, N. A. C. Hussin, S. Nordin, and K. Awang, "Automatic plant identification: is shape the key feature?" Procedia Computer Science, vol. 76, pp. 436–442, 2015.
[4] E. Mata-Montero and J. Carranza-Rojas, "Automated plant species identification: challenges and opportunities," IFIP Advances in Information and Communication Technology, vol. 481, pp. 26–36, 2016.
[5] J. Wäldchen and P. Mäder, "Plant species identification using computer vision techniques: a systematic literature review," Archives of Computational Methods in Engineering, pp. 1–37, 2017.
[6] T. Munisami, M. Ramsurn, S. Kishnah, and S. Pudaruth, "Plant leaf recognition using shape features and colour histogram with k-nearest neighbour classifiers," Procedia Computer Science, vol. 58, pp. 740–747, 2015.
[7] J. Chaki, R. Parekh, and S. Bhattacharya, "Plant leaf recognition using texture and shape features with neural classifiers," Pattern Recognition Letters, vol. 58, pp. 61–68, 2015.
[8] C. Zhao, S. S. F. Chan, W.-K. Cham, and L. M. Chu, "Plant identification using leaf shapes - a pattern counting approach," Pattern Recognition, vol. 48, no. 10, pp. 3203–3215, 2015.
[9] M. M. S. de Souza, F. N. S. Medeiros, G. L. B. Ramalho, I. C. de Paula Jr., and I. N. S. Oliveira, "Evolutionary optimization of a multiscale descriptor for leaf shape analysis," Expert Systems with Applications, vol. 63, pp. 375–385, 2016.
[10] N. Liu and J.-M. Kan, "Improved deep belief networks and multi-feature fusion for leaf identification," Neurocomputing, vol. 216, pp. 460–467, 2016.
[11] B. VijayaLakshmi and V. Mohan, "Kernel-based PSO and FRVM: an automatic plant leaf type detection using texture, shape, and color features," Computers and Electronics in Agriculture, vol. 125, pp. 99–112, 2016.
[12] T. P. Kumar, M. V. P. Reddy, and P. K. Bora, "Leaf identification using shape and texture features," Advances in Intelligent Systems and Computing, vol. 460, pp. 531–541, 2017.
[13] Z. Zhang, Y. Xu, J. Yang, X. Li, and D. Zhang, "A survey of sparse representation: algorithms and applications," IEEE Access, vol. 3, pp. 490–530, 2015.
[14] H. Cheng, Sparse Representation, Modeling and Learning in Visual Recognition: Theory, Algorithms and Applications, Advances in Computer Vision & Pattern Recognition, Springer, 2015.
[15] J.-K. Hsiao, L.-W. Kang, C.-L. Chang, and C.-Y. Lin, "Learning-based leaf image recognition frameworks," Studies in Computational Intelligence, vol. 591, pp. 77–91, 2015.
[16] T. Jin, X. Hou, P. Li, and F. Zhou, "A novel method of automatic plant species identification using sparse representation of leaf tooth features," PLoS ONE, vol. 10, no. 10, Article ID e0139482, 2015.
[17] C.-G. Li, J. Guo, and H.-G. Zhang, "Local sparse representation based classification," in Proceedings of the 20th International Conference on Pattern Recognition (ICPR '10), pp. 649–652, 2010.
[18] J.-X. Mi and J.-X. Liu, "Face recognition using sparse representation-based classification on k-nearest subspace," PLoS ONE, vol. 8, no. 3, Article ID e59430, 2013.
[19] Q. Liu, "Kernel local sparse representation based classifier," Neural Processing Letters, vol. 43, no. 1, pp. 85–95, 2016.
[20] E. Elhamifar and R. Vidal, "Robust classification using structured sparse representation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 1873–1879, June 2011.
[21] Z. Fan, M. Ni, Q. Zhu, and E. Liu, "Weighted sparse representation for face recognition," Neurocomputing, vol. 151, no. 1, pp. 304–309, 2015.
[22] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210–227, 2009.
[23] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, "Discriminative learned dictionaries for local image analysis," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–8, Anchorage, Alaska, USA, June 2008.
[24] J. Zhang, D. Zhao, F. Jiang, and W. Gao, "Structural group sparse representation for image compressive sensing recovery," in Proceedings of the 2013 Data Compression Conference (DCC 2013), pp. 331–340, USA, March 2013.
[25] Y. Sun, Q. Liu, J. Tang, and D. Tao, "Learning discriminative dictionary for group sparse representation," IEEE Transactions on Image Processing, vol. 23, no. 9, pp. 3816–3828, 2014.
[26] S. S. Chen, D. L. Donoho, and M. A. Saunders, "Atomic decomposition by basis pursuit," SIAM Review, vol. 43, no. 1, pp. 129–159, 2001.
[27] X. Tang, G. Feng, and J. Cai, "Weighted group sparse representation for undersampled face recognition," Neurocomputing, vol. 145, pp. 402–415, 2014.
[28] G. Cerutti, L. Tougne, D. Coquin, and A. Vacavant, "Leaf margins as sequences: a structural approach to leaf identification," Pattern Recognition Letters, vol. 49, pp. 177–184, 2014.
[29] J.-X. Du, M.-W. Shao, C.-M. Zhai, J. Wang, Y. Tang, and C. L. P. Chen, "Recognition of leaf image set based on manifold-manifold distance," Neurocomputing, vol. 188, pp. 131–138, 2016.
[30] A. Hasim, Y. Herdiyeni, and S. Douady, "Leaf shape recognition using centroid contour distance," in Proceedings of the International Seminar on Science of Complex Natural Systems (ISS-CNS 2015), Indonesia, October 2015.
[31] B. Li, C. Wang, and D.-S. Huang, "Supervised feature extraction based on orthogonal discriminant projection," Neurocomputing, vol. 73, no. 1–3, pp. 191–196, 2009.
[32] C. C. Aggarwal and C. K. Reddy, Data Clustering: Algorithms and Applications, Chapman & Hall/CRC, 2014.
[33] http://theseedsite.co.uk/leafshapes.html
[34] https://en.wikipedia.org/wiki/Glossary_of_leaf_morphology
[35] H. Tsukaya, "Mechanism of leaf-shape determination," Annual Review of Plant Biology, vol. 57, pp. 477–496, 2006.
[36] J. Dkhar and A. Pareek, "What determines a leaf's shape?" EvoDevo, vol. 5, no. 1, article no. 47, 2014.


[11] B VijayaLakshmi and V Mohan ldquoKernel-based PSO andFRVM An automatic plant leaf type detection using textureshape and color featuresrdquo Computers and Electronics in Agri-culture vol 125 pp 99ndash112 2016

[12] T P Kumar M V P Reddy and P K Bora ldquoLeaf identificationusing shape and texture featuresrdquoAdvances in Intelligent Systemsand Computing vol 460 pp 531ndash541 2017

[13] Z Zhang Y Xu J Yang X Li andD Zhang ldquoA survey of sparserepresentation algorithms and applicationsrdquo IEEEAccess vol 3pp 490ndash530 2015

[14] H Cheng ldquoSparse Representation Modeling and Learningin Visual Recognition Theory Algorithms and ApplicationsrdquoAdvances in Computer Vision amp Pattern Recognition vol 6 pp1408ndash1425 93

[15] J-K Hsiao L-W Kang C-L Chang and C-Y Lin ldquoLearning-based leaf image recognition frameworksrdquo Studies in Computa-tional Intelligence vol 591 pp 77ndash91 2015

[16] T Jin X Hou P Li and F Zhou ldquoA novel method of automaticplant species identification using sparse representation of leaftooth featuresrdquo PLoS ONE vol 10 no 10 Article ID e01394822015

[17] C-G Li J Guo and H-G Zhang ldquoLocal sparse representationbased classificationrdquo in Proceedings of the 20th InternationalConference on Pattern Recognition (ICPR rsquo10) pp 649ndash652 2010

[18] J-X Mi and J-X Liu ldquoFace Recognition Using Sparse Repre-sentation-Based Classification on K-Nearest Subspacerdquo PLoSONE vol 8 no 3 Article ID e59430 2013

[19] Q Liu ldquoKernel Local Sparse Representation Based ClassifierrdquoNeural Processing Letters vol 43 no 1 pp 85ndash95 2016

[20] E Elhamifar and R Vidal ldquoRobust classification using struc-tured sparse representationrdquo in Proceedings of the IEEE Confer-ence on Computer Vision and Pattern Recognition (CVPR rsquo11)pp 1873ndash1879 June 2011

[21] Z Fan M Ni Q Zhu and E Liu ldquoWeighted sparse represen-tation for face recognitionrdquo Neurocomputing vol 151 no 1 pp304ndash309 2015

[22] JWright A Y Yang A Ganesh S S Sastry and YMa ldquoRobustface recognition via sparse representationrdquo IEEE Transactionson Pattern Analysis and Machine Intelligence vol 31 no 2 pp210ndash227 2009

[23] J Mairal F Bach J Ponce G Sapiro and A ZissermanldquoDiscriminative learned dictionaries for local image analysisrdquo inProceedings of the 26th IEEEConference onComputer Vision andPattern Recognition (CVPR rsquo08) pp 1ndash8 Anchorage AlaskaUSA June 2008

[24] J Zhang D Zhao F Jiang and W Gao ldquoStructural groupsparse representation for image Compressive Sensing recoveryrdquoin Proceedings of the 2013 Data Compression Conference DCC2013 pp 331ndash340 USA March 2013

[25] Y Sun Q Liu J Tang and D Tao ldquoLearning discriminativedictionary for group sparse representationrdquo IEEE Transactionson Image Processing vol 23 no 9 pp 3816ndash3828 2014

[26] S S Chen D L Donoho and M A Saunders ldquoAtomic decom-position by basis pursuitrdquo SIAM Review vol 43 no 1 pp 129ndash159 2001

[27] X TangG Feng and J Cai ldquoWeighted group sparse representa-tion for undersampled face recognitionrdquo Neurocomputing vol145 pp 402ndash415 2014

[28] G Cerutti L Tougne D Coquin and A Vacavant ldquoLeaf mar-gins as sequences A structural approach to leaf identificationrdquoPattern Recognition Letters vol 49 pp 177ndash184 2014

[29] J-X Du M-W Shao C-M Zhai J Wang Y Tang and CL P Chen ldquoRecognition of leaf image set based on manifold-manifold distancerdquo Neurocomputing vol 188 pp 131ndash138 2016

[30] A Hasim Y Herdiyeni and S Douady ldquoLeaf Shape Recogni-tion using Centroid Contour Distancerdquo in Proceedings of theInternational Seminar on Science of Complex Natural Systems2015 ISS-CNS 2015 Indonesia October 2015

[31] B Li CWang and D-S Huang ldquoSupervised feature extractionbased on orthogonal discriminant projectionrdquoNeurocomputingvol 73 no 1-3 pp 191ndash196 2009

[32] C C Aggarwal and C K Reddy Data Clustering Algorithmsand Applications Chapman amp HallCRC 2014

[33] httptheseedsitecoukleafshapeshtml[34] httpsenwikipediaorgwikiGlossary of leaf morphology[35] H Tsukaya ldquoMechanism of leaf-shape determinationrdquo Annual

Review of Plant Biology vol 57 pp 477ndash496 2006[36] J Dkhar and A Pareek ldquoWhat determines a leaf rsquos shaperdquo

EvoDevo vol 5 no 1 article no 47 2014


Computational Intelligence and Neuroscience 9

Table 1: Average recognition rates (%), standard deviations, and running times of WSRC, LMS, MMD, CCD, and DWSRC.

Method                WSRC          LMS           MMD           CCD           DWSRC
Recognition rate (%)  88.27 ± 1.89  82.73 ± 1.73  86.51 ± 1.46  81.62 ± 2.53  91.12 ± 1.25
Running time (s)      12.18         8.42          9.36          7.23          4.37

Table 2: Average recognition rates (%), standard deviations, and running times of WSRC, LMS, MMD, CCD, and DWSRC.

Method                WSRC          LMS           MMD           CCD           DWSRC
Recognition rate (%)  86.56 ± 2.12  82.50 ± 2.06  85.43 ± 1.52  81.17 ± 2.56  90.64 ± 1.42
Running time (s)      14.32         9.53          10.16         6.47          5.16

and recognize the test leaf by the minimum reconstruction error.

Figures 7 and 8 illustrate the process of dividing the whole dataset into several similar classes and constructing the subdictionaries for the SRC algorithm, where the training set consists of all images of the ICL database and the test set is any leaf image.
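The grouping and selection stage can be sketched in a few lines. The paper groups training leaves into similar classes by their mutual similarity; plain k-means over feature vectors, with each cluster center standing in for a "typical sample," is only an illustrative stand-in here, and `build_subdictionaries`/`choose_subdictionary` are hypothetical helper names:

```python
import numpy as np

def build_subdictionaries(X, n_groups, n_iter=20, seed=0):
    """Stage 1 (sketch): partition training samples into 'similar classes'
    with k-means; each group's samples form one subdictionary."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), n_groups, replace=False)].astype(float)
    for _ in range(n_iter):
        # assign every sample to its nearest center
        assign = np.argmin(
            np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2), axis=1
        )
        for g in range(n_groups):
            if np.any(assign == g):          # skip empty groups
                centers[g] = X[assign == g].mean(axis=0)
    return centers, assign

def choose_subdictionary(centers, y):
    """Stage 2 entry point: pick the group whose typical sample (center)
    is most similar to the test sample y."""
    return int(np.argmin(np.linalg.norm(centers - y, axis=1)))
```

WSRC is then run only against the samples of the chosen group, so the ℓ1 problem involves far fewer dictionary atoms than the full training set.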

4.4. Experimental Results. To illustrate how DWSRC and the four competing methods work, two kinds of experiments are conducted.

(1) From the ICL database, we randomly select 1100 images from 220 species, with 5 images per species, as the test set and use the rest as the training set. The training set is divided into 21 similar classes to build the subdictionaries, and each test leaf is recognized by DWSRC, where the typical samples are shown in Figure 6(a). The random training/test partition is repeated 50 times. The average recognition results of the five methods over the 50 runs are shown in Table 1.

(2) From the ICL database, we randomly select 2000 of the 16846 plant leaf images as the test set and use the rest as the training set, and we repeat the training and testing procedure 50 times. In each run, the training set is again divided into 21 similar classes to build the subdictionaries, and each test leaf is recognized by DWSRC, with the typical samples also shown in Figure 6(a). The average recognition results of the five methods over the 50 runs are shown in Table 2.
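The 50-run protocol above amounts to repeated holdout evaluation. A minimal sketch, assuming `classify` is any function mapping (training samples, training labels, test sample) to a predicted label; the helper name and the toy nearest-neighbour classifier are illustrative, not from the paper:

```python
import numpy as np

def repeated_holdout(X, y, n_test, n_repeats, classify, seed=0):
    """Mean and standard deviation of accuracy over repeated random
    train/test partitions, as in the 50-run experiments."""
    rng = np.random.default_rng(seed)
    accs = []
    for _ in range(n_repeats):
        idx = rng.permutation(len(X))
        test, train = idx[:n_test], idx[n_test:]
        correct = sum(
            classify(X[train], y[train], X[i]) == y[i] for i in test
        )
        accs.append(correct / n_test)
    return float(np.mean(accs)), float(np.std(accs))

# toy stand-in classifier: 1-nearest neighbour on raw vectors
def nearest_neighbour(X_train, y_train, x):
    return y_train[np.argmin(np.linalg.norm(X_train - x, axis=1))]
```

Table entries such as 91.12 ± 1.25 are exactly such mean ± standard-deviation pairs over the 50 partitions.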

Tables 1 and 2 show that the recognition performance of DWSRC is the best. The main reason might be that DWSRC employs only the candidate similar class to solve the SR problem and recognize the test samples, while exploiting the similarity between the test sample and each training sample. The computational cost is therefore significantly reduced, and the side effects of outliers, noise, and modeling error on the recognition performance are greatly suppressed. The recognition rate of WSRC is higher than those of LMS, MMD, and CCD, but its running time is the longest; most of the time is spent solving the ℓ1-norm minimization problem. Since LMS, MMD, and CCD need to extract classifying features from each leaf image, which is time-consuming, they have lower recognition rates than WSRC and DWSRC. Because real-world leaves of the same species differ from each other, it is difficult to extract the "optimal" features from each leaf image.
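The decision rule discussed here, weighted ℓ1 minimization over the chosen subdictionary followed by minimum reconstruction error, can be sketched as below. ISTA is used as a stand-in solver (the text does not fix one at this point), and the weights follow the usual WSRC convention that training samples closer to the test sample are penalized less; all function names are illustrative:

```python
import numpy as np

def weighted_sr(D, y, w, lam=0.01, n_iter=500):
    """min_x 0.5*||y - D @ x||^2 + lam * ||w * x||_1 via ISTA.
    D: (d, n) subdictionary whose columns are the training samples of the
    candidate similar class; w: per-atom locality weights."""
    L = np.linalg.norm(D, 2) ** 2            # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = x - D.T @ (D @ x - y) / L        # gradient step on the data term
        x = np.sign(z) * np.maximum(np.abs(z) - lam * w / L, 0.0)  # soft threshold
    return x

def min_residual_class(D, y, labels, x):
    """Assign y to the class whose atoms reconstruct it best."""
    residuals = {
        c: np.linalg.norm(y - D[:, labels == c] @ x[labels == c])
        for c in np.unique(labels)
    }
    return min(residuals, key=residuals.get)
```

Because `D` holds only one similar class rather than the whole training set, each ℓ1 problem is much smaller, which is the source of the running-time gap between DWSRC and WSRC in Tables 1 and 2.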

5. Conclusions

For the large-scale plant species recognition problem, we proposed a new SRC approach, discriminant weighted SRC (DWSRC), which aims to exploit the locality and similarity of the original dataset and the test samples in sparsely representing the test samples. It uses the prior structure of the dataset and the distances between the test sample and the training samples to construct the SR problem. To represent a test sample, we consider not only the original data structure of the training samples but also the locality and the similarity between the test sample and each training sample. The SR coefficients of the test sample thus maintain a sparse structure and carry more discriminative information. DWSRC is essentially a generalization of WSRC: unlike WSRC, we construct several subdictionaries prior to SRC by dividing all training samples into several similar classes and then choose a candidate subdictionary for a given test sample on which to run WSRC. Thus, the computational cost of solving the ℓ1-norm minimization problem is greatly reduced. In this sense, DWSRC is well suited to large-scale plant species recognition: it can improve the recognition rate while reducing the computational cost. Recently, deep learning has led to a series of breakthroughs in many pattern recognition research areas; we will study deep learning for leaf-based species recognition in future work.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is supported by the China National Natural Science Foundation under Grant no. 61473237. It is partially supported by the Tianjin Research Program of Application Foundation and Advanced Technology (no. 14JCYBJC42500) and the Tianjin Science and Technology Correspondent Project (no. 16JCTPJC47300). The authors would like to thank the maintainers of the ICL Leaf database for their constructive advice.

References

[1] M.-E. Nilsback and A. Zisserman, "Automated flower classification over a large number of classes," in Proceedings of the 6th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP '08), pp. 722–729, December 2008.

[2] M. Seeland, M. Rzanny, N. Alaqraa, J. Waldchen, P. Mader, and Y. Zhang, "Plant species classification using flower images: a comparative study of local feature representations," PLoS ONE, vol. 12, no. 2, p. e0170629, 2017.

[3] N. Jamil, N. A. C. Hussin, S. Nordin, and K. Awang, "Automatic plant identification: is shape the key feature?" Procedia Computer Science, vol. 76, pp. 436–442, 2015.

[4] E. Mata-Montero and J. Carranza-Rojas, "Automated plant species identification: challenges and opportunities," IFIP Advances in Information and Communication Technology, vol. 481, pp. 26–36, 2016.

[5] J. Waldchen and P. Mader, "Plant species identification using computer vision techniques: a systematic literature review," Archives of Computational Methods in Engineering, pp. 1–37, 2017.

[6] T. Munisami, M. Ramsurn, S. Kishnah, and S. Pudaruth, "Plant leaf recognition using shape features and colour histogram with K-nearest neighbour classifiers," Procedia Computer Science, vol. 58, pp. 740–747, 2015.

[7] J. Chaki, R. Parekh, and S. Bhattacharya, "Plant leaf recognition using texture and shape features with neural classifiers," Pattern Recognition Letters, vol. 58, pp. 61–68, 2015.

[8] C. Zhao, S. S. F. Chan, W.-K. Cham, and L. M. Chu, "Plant identification using leaf shapes - a pattern counting approach," Pattern Recognition, vol. 48, no. 10, pp. 3203–3215, 2015.

[9] M. M. S. de Souza, F. N. S. Medeiros, G. L. B. Ramalho, I. C. de Paula Jr., and I. N. S. Oliveira, "Evolutionary optimization of a multiscale descriptor for leaf shape analysis," Expert Systems with Applications, vol. 63, pp. 375–385, 2016.

[10] N. Liu and J.-M. Kan, "Improved deep belief networks and multi-feature fusion for leaf identification," Neurocomputing, vol. 216, pp. 460–467, 2016.

[11] B. VijayaLakshmi and V. Mohan, "Kernel-based PSO and FRVM: an automatic plant leaf type detection using texture, shape, and color features," Computers and Electronics in Agriculture, vol. 125, pp. 99–112, 2016.

[12] T. P. Kumar, M. V. P. Reddy, and P. K. Bora, "Leaf identification using shape and texture features," Advances in Intelligent Systems and Computing, vol. 460, pp. 531–541, 2017.

[13] Z. Zhang, Y. Xu, J. Yang, X. Li, and D. Zhang, "A survey of sparse representation: algorithms and applications," IEEE Access, vol. 3, pp. 490–530, 2015.

[14] H. Cheng, "Sparse representation, modeling and learning in visual recognition: theory, algorithms and applications," Advances in Computer Vision & Pattern Recognition, vol. 6, pp. 1408–1425.

[15] J.-K. Hsiao, L.-W. Kang, C.-L. Chang, and C.-Y. Lin, "Learning-based leaf image recognition frameworks," Studies in Computational Intelligence, vol. 591, pp. 77–91, 2015.

[16] T. Jin, X. Hou, P. Li, and F. Zhou, "A novel method of automatic plant species identification using sparse representation of leaf tooth features," PLoS ONE, vol. 10, no. 10, Article ID e0139482, 2015.

[17] C.-G. Li, J. Guo, and H.-G. Zhang, "Local sparse representation based classification," in Proceedings of the 20th International Conference on Pattern Recognition (ICPR '10), pp. 649–652, 2010.

[18] J.-X. Mi and J.-X. Liu, "Face recognition using sparse representation-based classification on k-nearest subspace," PLoS ONE, vol. 8, no. 3, Article ID e59430, 2013.

[19] Q. Liu, "Kernel local sparse representation based classifier," Neural Processing Letters, vol. 43, no. 1, pp. 85–95, 2016.

[20] E. Elhamifar and R. Vidal, "Robust classification using structured sparse representation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 1873–1879, June 2011.

[21] Z. Fan, M. Ni, Q. Zhu, and E. Liu, "Weighted sparse representation for face recognition," Neurocomputing, vol. 151, no. 1, pp. 304–309, 2015.

[22] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210–227, 2009.

[23] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, "Discriminative learned dictionaries for local image analysis," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–8, Anchorage, Alaska, USA, June 2008.

[24] J. Zhang, D. Zhao, F. Jiang, and W. Gao, "Structural group sparse representation for image compressive sensing recovery," in Proceedings of the 2013 Data Compression Conference (DCC 2013), pp. 331–340, USA, March 2013.

[25] Y. Sun, Q. Liu, J. Tang, and D. Tao, "Learning discriminative dictionary for group sparse representation," IEEE Transactions on Image Processing, vol. 23, no. 9, pp. 3816–3828, 2014.

[26] S. S. Chen, D. L. Donoho, and M. A. Saunders, "Atomic decomposition by basis pursuit," SIAM Review, vol. 43, no. 1, pp. 129–159, 2001.

[27] X. Tang, G. Feng, and J. Cai, "Weighted group sparse representation for undersampled face recognition," Neurocomputing, vol. 145, pp. 402–415, 2014.

[28] G. Cerutti, L. Tougne, D. Coquin, and A. Vacavant, "Leaf margins as sequences: a structural approach to leaf identification," Pattern Recognition Letters, vol. 49, pp. 177–184, 2014.

[29] J.-X. Du, M.-W. Shao, C.-M. Zhai, J. Wang, Y. Tang, and C. L. P. Chen, "Recognition of leaf image set based on manifold-manifold distance," Neurocomputing, vol. 188, pp. 131–138, 2016.

[30] A. Hasim, Y. Herdiyeni, and S. Douady, "Leaf shape recognition using centroid contour distance," in Proceedings of the International Seminar on Science of Complex Natural Systems (ISS-CNS 2015), Indonesia, October 2015.

[31] B. Li, C. Wang, and D.-S. Huang, "Supervised feature extraction based on orthogonal discriminant projection," Neurocomputing, vol. 73, no. 1-3, pp. 191–196, 2009.

[32] C. C. Aggarwal and C. K. Reddy, Data Clustering: Algorithms and Applications, Chapman & Hall/CRC, 2014.

[33] http://theseedsite.co.uk/leafshapes.html.

[34] https://en.wikipedia.org/wiki/Glossary_of_leaf_morphology.

[35] H. Tsukaya, "Mechanism of leaf-shape determination," Annual Review of Plant Biology, vol. 57, pp. 477–496, 2006.

[36] J. Dkhar and A. Pareek, "What determines a leaf's shape?" EvoDevo, vol. 5, no. 1, article 47, 2014.
