
Hindawi Publishing Corporation
Mathematical Problems in Engineering
Volume 2016, Article ID 8272859, 12 pages
http://dx.doi.org/10.1155/2016/8272859

Research Article
Active Discriminative Dictionary Learning for Weather Recognition

Caixia Zheng,1,2 Fan Zhang,1 Huirong Hou,1 Chao Bi,1,2 Ming Zhang,1,3 and Baoxue Zhang4

1 School of Computer Science and Information Technology, Northeast Normal University, Changchun 130117, China
2 School of Mathematics and Statistics, Northeast Normal University, Changchun 130024, China
3 Key Laboratory of Intelligent Information Processing of Jilin Universities, Northeast Normal University, Changchun 130117, China
4 College of Statistics, Capital University of Economics and Business, Beijing 100070, China

Correspondence should be addressed to Ming Zhang; zhangm545@nenu.edu.cn and Baoxue Zhang; bxzhang@nenu.edu.cn

Received 12 December 2015; Accepted 22 March 2016

Academic Editor: Xiao-Qiao He

Copyright © 2016 Caixia Zheng et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Weather recognition based on outdoor images is a brand-new and challenging subject, which is widely required in many fields. This paper presents a novel framework for recognizing different weather conditions. Compared with other algorithms, the proposed method possesses the following advantages. Firstly, our method extracts both visual appearance features of the sky region and physical characteristics features of the nonsky region in images. Thus, the extracted features are more comprehensive than those of some existing methods, in which only the features of the sky region are considered. Secondly, unlike other methods, which used traditional classifiers (e.g., SVM and K-NN), we use discriminative dictionary learning as the classification model for weather, which could address the limitations of previous works. Moreover, an active learning procedure is introduced into dictionary learning to avoid requiring a large number of labeled samples to train the classification model while still achieving good weather recognition performance. Experiments and comparisons are performed on two datasets to verify the effectiveness of the proposed method.

1. Introduction

Traditional weather detection depends on expensive sensors and is restricted by the number of weather stations. If we could use existing surveillance cameras capturing images of the local environment to detect weather conditions, it would be possible to turn weather observation and recognition into a low-cost and powerful computer vision application. In addition, most current computer vision systems are designed to execute in clear weather [1]. However, in many outdoor applications (e.g., driver assistance systems [2], video surveillance [3], and robot navigation [4]), there is no escape from "bad" weather. Hence, research on image-based weather recognition is in urgent demand; it can be used to recognize the weather conditions so that many vision systems can adaptively tune their models or adjust their parameters under different weather conditions.

To date, despite its remarkable value, only a few works have been proposed to address the weather recognition problem. In [5, 6], some researchers proposed weather recognition models (sunny or rainy) for images captured by in-vehicle cameras. However, these models heavily relied on prior information about vehicles, which may weaken their performance. Song et al. [7] proposed a method to classify traffic images into sunny, foggy, snowy, and rainy. Their method extracted several features such as the image inflection point, power spectral slope, and image noise and used K-nearest neighbor (K-NN) as the classification model. This method applies only to classifying the weather conditions of traffic scenes. Lu et al. [8] applied a collaborative learning approach to label an outdoor image as either sunny or cloudy. In their method, an existence vector is first introduced to indicate the confidence that the corresponding weather feature is present in the given image. Then, the existence vector and weather features are combined for weather recognition. However, this method involves many complicated technologies (shadow detection, image matting, etc.); thus its performance largely depends on the accuracy of these technologies. Chen et al. [9] employed a support vector machine (SVM) with the help of active learning to classify the weather conditions of images into sunny, cloudy, and overcast.



Nevertheless, they only extracted features from the sky part of the images, and the useful information in the nonsky part of the images is neglected.

In general, the above methods have been successfully applied to weather recognition applications. However, they still suffer from the following limitations. Firstly, since these methods extracted features (e.g., SIFT, LBP, and HSV color) from whole images or only the sky part of images, they neglected the different kinds of useful information in the sky part and nonsky part of images. In the sky part of images, the distribution of clouds and the color of the sky are key factors for weather recognition, so visual appearance features such as texture, color, and shape should be extracted. In the nonsky part of images, features based on physical properties, which can characterize the changes in images caused by varying weather conditions, such as image contrast [5] and the dark channel [10], should be considered. Secondly, these methods directly used K-NN or SVM based algorithms to classify the different weather conditions. Although K-NN is a simple classifier and easy to implement, it is not robust enough for the complicated real-world images in practical applications, while SVM is a complex classifier for which it is difficult to select an appropriate kernel function. Lastly, these methods required a large amount of labeled data for training, which is often expensive and seldom available.

To address the above problems, we propose a novel framework to recognize different weather conditions (sunny, cloudy, and overcast) from outdoor images acquired by a static camera looking at the same scene over a period of time. The proposed method extracts not only features from the sky parts of images, which are relevant to the visual manifestations of different weather conditions, but also features based on physical characteristics in the nonsky parts. Thus, the extracted features are more comprehensive for distinguishing images captured under various weather situations. Unlike other methods, which used traditional classifiers (e.g., SVM, K-NN), discriminative dictionary learning is used as the classification model. Moreover, in order to achieve good weather recognition performance with few labeled samples, an active discriminative dictionary learning algorithm (ADDL) is proposed. In ADDL, an active learning procedure is introduced into the dictionary learning, which selects informative and representative samples for learning a compact and discriminative dictionary to classify weather conditions. As far as we know, the proposed framework is the first approach that combines active learning with dictionary learning for weather recognition.

The rest of this paper is organized as follows. Section 2 briefly reviews some related work. Section 3 presents the details of the feature extraction and the proposed ADDL algorithm. Extensive experiments and comparisons are conducted in Section 4, and Section 5 concludes the paper.

2. Related Work

2.1. Dictionary Learning. Dictionary learning has emerged in recent years as one of the most popular tools for learning the intrinsic structure of images, and it has achieved state-of-the-art performance in many computer vision and pattern recognition tasks [11-13]. Unsupervised dictionary learning algorithms such as K-SVD [14] have achieved satisfactory results in image restoration, but they are not suitable for classification tasks because they only require that the learned dictionary faithfully represent the training samples. By exploiting the label information of the training dataset, some supervised dictionary learning approaches have been proposed to learn a discriminative dictionary for the classification task. Among these methods, one may directly use the training data of all classes as the dictionary, and the test image can then be classified by finding which class leads to the minimal reconstruction error. Such a naive supervised dictionary learning method is the sparse representation based classification (SRC) algorithm [15], which has shown good performance in face recognition. However, it is less effective in classification when the raw training images include noise and trivial information, and it cannot sufficiently exploit the discriminative information in the training data. Fortunately, this problem can be addressed by properly learning a dictionary from the original training data. After the test samples are encoded over the learned dictionary, both the coding residual and the coding coefficients can be employed for identifying the different classes of samples [16]. Jiang et al. [17] proposed a method named label consistent K-SVD (LC-K-SVD), which encouraged samples from the same class to have similar sparse codes by applying a binary class-label sparse code matrix. The Fisher discrimination dictionary learning (FDDL) method [18] was proposed based on the Fisher criterion; it learned a structured dictionary to distinguish samples from different classes. In most existing supervised dictionary learning methods, $\ell_0$-norm or $\ell_1$-norm sparsity regularization was adopted. As a result, they often suffer heavy computational costs in both the training and testing phases. In order to address this limitation, Gu et al. [19] developed a projective dictionary pair learning (DPL) algorithm, which learns an analysis dictionary together with a synthesis dictionary to attain the goal of signal representation and discrimination. The DPL method successfully avoids solving the $\ell_0$-norm or $\ell_1$-norm problem to accelerate the training and testing process, so we adopt the DPL algorithm as the classification model in our work.

2.2. Active Learning. Active learning, which aims to construct an efficient training dataset for improving classification performance through iterative sampling, has been well studied in the computer vision field. According to [20], existing active learning algorithms can generally be divided into three categories: stream-based [21, 22], membership query synthesis [23, 24], and pool-based [25-27]. Among them, pool-based active learning is the most widely used for real-world learning problems because it assumes that there is a small set of labeled data and a large pool of unlabeled data available. This assumption is consistent with the actual situation. In this paper, we adopt pool-based active learning.

The crucial point of pool-based active learning is how to define a strategy to rank the unlabeled samples in the pool. There are two criteria, informativeness and representativeness, which are widely considered for evaluating unlabeled samples.

[Figure 1: The flow chart of the proposed method. Features are extracted from the input images (visual appearance based features and physical characteristics based features); the labeled dataset is used for dictionary learning, and samples are iteratively selected from the unlabeled dataset by the informativeness and representativeness measures to expand the training dataset until the iteration limit is reached; the final learned dictionary is output and applied to the testing dataset to obtain the classification results.]

Informativeness measures the capacity of samples to reduce the uncertainty of the classification model, while representativeness measures whether samples represent well the overall input patterns of the unlabeled data [20]. Most active learning methods take only one of the two criteria into account when selecting unlabeled samples, which restricts the performance of active learning [28]. Although several approaches [29-31] have considered both the informativeness and the representativeness of unlabeled data, to our knowledge almost no researchers have introduced these two criteria of active learning into dictionary learning algorithms.

3. Proposed Method

In this section, we present an efficient scheme, including effective feature extraction and the ADDL classification model, to classify the weather conditions of images into sunny, cloudy, and overcast. Figure 1 shows a flow chart of the overall procedure of our method. First, visual appearance based features are extracted from the sky region and physical characteristics based features are extracted from the nonsky region of images. Secondly, the labeled training dataset is used to learn an initial dictionary, and then samples are iteratively selected from the unlabeled dataset based on two measures: informativeness and representativeness. These selected samples are used to expand the labeled dataset for learning a more discriminative dictionary. Finally, the testing dataset is classified by the learned dictionary.

3.1. Feature Extraction. Feature extraction is an essential preprocessing step in pattern recognition problems. In order to express the difference among images of the same scene taken under various weather situations, we analyze the different visual manifestations of images caused by different weather conditions and extract features which describe both the visual appearance properties and the physical characteristics of images.

From the viewpoint of the visual appearance features of images, the sky is the most important part of an outdoor image for identifying the weather. In the sunny case, the sky appears blue because the light is refracted as it passes through the atmosphere, scattering blue light, while under the overcast condition the sky is white or grey due to the thick cloud cover. The cloudy day is a situation between sunny and overcast: on a cloudy day, clouds float in the sky and exhibit a wide range of shapes and textures. Hence, we extract SIFT [32], HSV color, LBP [33], and the gradient magnitude of the sky parts of images as the visual appearance features. These extracted features are used to describe the texture, color, and shape of images for weather classification, as illustrated by the sketch below.
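For concreteness, the following is a minimal sketch (in Python with NumPy and OpenCV, our own tooling assumption) of how two of these visual appearance features could be computed over a precomputed sky region; the paper itself reuses the sky detector and feature extraction procedure of [9], and the SIFT and LBP extraction is omitted here. The helper names are hypothetical.

```python
# A minimal sketch (not the authors' exact pipeline) of two of the four
# visual appearance features: HSV color histograms and gradient magnitude.
# The sky mask, SIFT, and LBP extraction are assumed to be provided elsewhere.
import cv2
import numpy as np

def hsv_histograms(sky_pixels_bgr, bins=200):
    """200-bin histograms of the H, S, and V channels of the sky pixels."""
    hsv = cv2.cvtColor(sky_pixels_bgr, cv2.COLOR_BGR2HSV)
    feats = []
    for ch, max_val in zip(cv2.split(hsv), (180, 256, 256)):
        hist, _ = np.histogram(ch, bins=bins, range=(0, max_val))
        feats.append(hist / max(hist.sum(), 1))  # normalize each histogram
    return np.concatenate(feats)  # 600-dim HSV color feature

def gradient_magnitude(sky_gray):
    """Per-pixel gradient magnitude, the basis of the shape/texture feature."""
    gx = cv2.Sobel(sky_gray, cv2.CV_64F, 1, 0)
    gy = cv2.Sobel(sky_gray, cv2.CV_64F, 0, 1)
    return np.sqrt(gx ** 2 + gy ** 2)
```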

Unlike Chen et al. [9], who directly eliminated the nonsky regions of images, we extract two features from the nonsky parts of images based on physical characteristics, which also serve as powerful features to distinguish different weather conditions. The interaction of light with the atmosphere has been studied as atmospheric optics. On a sunny (clear) day, the light rays reflected by scene objects reach the observer without alteration or attenuation. However, under bad weather conditions, atmospheric effects cannot be neglected anymore [34-36]. Bad weather (e.g., overcast) causes a decay in the image contrast which is exponential in the depths of the scene points [35]. So images of the same scene taken in different weather conditions should have different contrasts. The contrast CON of an image can be computed by

$$\mathrm{CON} = \frac{E_{\max} - E_{\min}}{E_{\max} + E_{\min}}, \tag{1}$$

where $E_{\max}$ and $E_{\min}$ are the maximum and minimum intensities of the image, respectively.

Overcast weather may come with haze [8]. The dark channel prior presented in [10] is effective for detecting image haze. It is based on the observation that most haze-free patches in the nonsky regions of images have a very low intensity in at least one RGB color channel. The dark channel $J^{\mathrm{dark}}$ was defined as

$$J^{\mathrm{dark}}(x) = \min_{c \in \{r,g,b\}} \left( \min_{x' \in \Omega(x)} J^{c}(x') \right), \tag{2}$$

where $J^c$ is one color channel of the image $J$ and $\Omega(x)$ is a local patch centered at $x$.

In summary, both the visual appearance features (SIFT, HSV color, LBP, and the gradient magnitude) of the sky region and the physical characteristics based features (the contrast and the dark channel) of the nonsky region are extracted to distinguish images under different weather situations. Then the "bag-of-words" model [37] is used to code each feature to form the feature vectors.
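A minimal sketch of the two physical-characteristics features follows, directly implementing (1) and (2). The 15-pixel patch size for Ω(x) is our assumption (a common setting in [10]); the 32 × 32 block partitioning and the bag-of-words coding described in Section 4.1 are not shown.

```python
# A sketch of the contrast (Eq. (1)) and dark channel (Eq. (2)) features.
import numpy as np

def contrast(block_gray):
    """CON = (E_max - E_min) / (E_max + E_min), Eq. (1)."""
    e_max, e_min = float(block_gray.max()), float(block_gray.min())
    return (e_max - e_min) / (e_max + e_min + 1e-8)  # guard against 0/0

def dark_channel(block_rgb, patch=15):
    """Min over RGB channels of the min over a local patch, Eq. (2)."""
    min_rgb = block_rgb.min(axis=2)        # pixel-wise min over channels
    h, w = min_rgb.shape
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode='edge')
    dark = np.empty_like(min_rgb)
    for i in range(h):                     # min filter over local patches
        for j in range(w):
            dark[i, j] = padded[i:i + patch, j:j + patch].min()
    return dark
```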

3.2. The Proposed ADDL Classification Model. Inspired by the technological advancements of discriminative dictionary learning and active learning, we design an active discriminative dictionary learning (ADDL) algorithm to improve the discriminative power of the learned dictionary. In the ADDL algorithm, DPL [19] is applied to learn a discriminative dictionary for recognizing different weather conditions, and a strategy of active sample selection is developed to iteratively select unlabeled samples from a given pool to enlarge the training dataset and thus improve the DPL classification performance. The criterion for active sample selection includes the informativeness measure and the representativeness measure.

3.2.1. DPL Algorithm. Suppose there is a set of $m$-dimensional training samples from $C$ classes, denoted by $X = [X_1, \ldots, X_c, \ldots, X_C]$ with the label set $Y = [Y_1, \ldots, Y_c, \ldots, Y_C]$, where $X_c \in \mathbb{R}^{m \times n}$ denotes the sample set of the $c$-th class and $Y_c$ denotes the corresponding label set. The DPL algorithm [19] jointly learns an analysis dictionary $P = [P_1, \ldots, P_c, \ldots, P_C] \in \mathbb{R}^{kC \times m}$ and a synthesis dictionary $D = [D_1, \ldots, D_c, \ldots, D_C] \in \mathbb{R}^{m \times kC}$ to avoid the costly $\ell_0$-norm or $\ell_1$-norm sparse coding process. $P$ and $D$ are used for linearly encoding the representation coefficients and for class-specific discriminative reconstruction, respectively. The objective function of the DPL model is

$$\{P^{*}, D^{*}\} = \arg\min_{P, D} \sum_{c=1}^{C} \left\| X_c - D_c P_c X_c \right\|_F^2 + \lambda \left\| P_c \bar{X}_c \right\|_F^2, \quad \text{s.t. } \left\| d_i \right\|_2^2 \le 1, \tag{3}$$

where $D_c \in \mathbb{R}^{m \times k}$ and $P_c \in \mathbb{R}^{k \times m}$ represent the subdictionary pair corresponding to class $c$, $\bar{X}_c$ represents the complementary matrix of $X_c$ (the training samples of all classes other than $c$), $\lambda > 0$ is a scalar constant that controls the discriminative property of $P$, and $d_i$ denotes the $i$-th atom of the dictionary $D$.

The objective function in (3) is generally nonconvex, but it can be relaxed to the following form by introducing a variable matrix $A$:

$$\{P^{*}, A^{*}, D^{*}\} = \arg\min_{P, A, D} \sum_{c=1}^{C} \left\| X_c - D_c A_c \right\|_F^2 + \tau \left\| P_c X_c - A_c \right\|_F^2 + \lambda \left\| P_c \bar{X}_c \right\|_F^2, \quad \text{s.t. } \left\| d_i \right\|_2^2 \le 1, \tag{4}$$

where $\tau$ is an algorithm parameter. According to [19], the objective function in (4) can be solved in an alternately updated manner.
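The sketch below illustrates one possible alternating scheme for (4), using the closed-form updates of A and P from [19]. For the constrained update of D, [19] employs an ADMM solver; we approximate it here by a least-squares update followed by column normalization, so this is an illustrative simplification rather than the exact algorithm. The defaults k = 25, tau = 25, and lam = 0.05 follow the parameter study in Section 4.3.1; gamma is our own small ridge term for numerical stability.

```python
# A simplified sketch of the alternating updates for (4).
import numpy as np

def dpl_train(X_list, k=25, tau=25.0, lam=0.05, gamma=1e-4, iters=20):
    """X_list[c]: m x n_c data matrix of class c. Returns lists D, P."""
    C = len(X_list)
    m = X_list[0].shape[0]
    rng = np.random.default_rng(0)
    D = [rng.standard_normal((m, k)) for _ in range(C)]
    P = [rng.standard_normal((k, m)) for _ in range(C)]
    for _ in range(iters):
        for c in range(C):
            Xc = X_list[c]
            Xbar = np.hstack([X_list[j] for j in range(C) if j != c])
            # Update A_c (fix D, P): ridge-type closed form from [19].
            A = np.linalg.solve(D[c].T @ D[c] + tau * np.eye(k),
                                D[c].T @ Xc + tau * P[c] @ Xc)
            # Update P_c (fix A): closed form from [19].
            P[c] = tau * A @ Xc.T @ np.linalg.inv(
                tau * Xc @ Xc.T + lam * Xbar @ Xbar.T + gamma * np.eye(m))
            # Update D_c (fix A): least squares plus column normalization,
            # an approximation of the ADMM step used in [19] to enforce
            # ||d_i||_2 <= 1.
            D[c] = Xc @ A.T @ np.linalg.inv(A @ A.T + gamma * np.eye(k))
            D[c] /= np.maximum(np.linalg.norm(D[c], axis=0), 1.0)
    return D, P
```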

When $D$ and $P$ have been learned, given a test sample $x_t$, the class-specific reconstruction residual is used to assign the class label. So the classification model associated with DPL is defined as

$$\mathrm{label}(x_t) = \arg\min_{c} \left\| x_t - D_c P_c x_t \right\|_2. \tag{5}$$
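In code, the classifier in (5) is only a few lines. The sketch below assumes the per-class dictionary pairs D and P produced by a DPL training routine such as the one sketched above, with classes indexed 0 to C-1.

```python
# A minimal sketch of the DPL classification rule (5): assign the label of
# the class whose dictionary pair gives the smallest reconstruction residual.
import numpy as np

def dpl_classify(x, D, P):
    """x: m-dim test sample; D, P: lists of per-class dictionary pairs."""
    residuals = [np.linalg.norm(x - D[c] @ (P[c] @ x)) for c in range(len(D))]
    return int(np.argmin(residuals))
```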

When dealing with the classification task, DPL requires sufficient labeled training samples to learn a discriminative dictionary pair that obtains good results. In fact, it is difficult and expensive to obtain vast quantities of labeled data. If we can exploit the information provided by massive, inexpensive unlabeled samples and choose small amounts of "profitable" samples (the "profitable" unlabeled samples are the ones that are most beneficial for improving the DPL classification performance) from the unlabeled dataset to be labeled manually, we can learn a more discriminative dictionary than one learned using only a limited number of labeled training data. To achieve this, we introduce the active learning technique into DPL in the next section.

3.2.2. Introducing Active Learning to DPL. When evaluating whether a sample is "profitable" or not, two measures are considered: informativeness and representativeness. The proposed ADDL iteratively evaluates both the informativeness and representativeness of unlabeled samples in a given pool to seek the ones that are most beneficial for improving the DPL classification performance. Specifically, the informativeness measure is constructed from the reconstruction error and the entropy of the probability distribution over the class-specific reconstruction errors, and the representativeness is obtained from the distribution of the unlabeled dataset.

Assume that we are given an initial training dataset $\mathcal{D}_l = \{x_1, x_2, \ldots, x_{n_l}\}$ with its label set $\mathcal{Y}_l = \{y_1, y_2, \ldots, y_{n_l}\}$ and an unlabeled dataset $\mathcal{D}_u = \{x_{n_l+1}, x_{n_l+2}, \ldots, x_N\}$, where $x_i \in \mathbb{R}^{m \times 1}$ is an $m$-dimensional feature vector and $y_i \in \{1, 2, 3\}$ is the corresponding class label (sunny, cloudy, or overcast). The target of ADDL is to iteratively select the $N_s$ most "profitable" samples, denoted by $\mathcal{D}_s$, from $\mathcal{D}_u$, query their labels $\mathcal{Y}_s$, and then add them to $\mathcal{D}_l$ to improve the performance of the dictionary learning classification model.

Informativeness Measure. The informativeness measure is an effective criterion to select informative samples for reducing the uncertainty of the classification model; it captures the relationship of the candidate samples with the current classification model. Probabilistic classification models select the sample that has the largest entropy in the conditional distribution over its labels [38, 39]. Query-by-committee algorithms choose the samples which have the most disagreement among a committee [22, 26]. In SVM methods, the most informative sample is regarded as the one that is closest to the separating hyperplane [40, 41]. Since DPL is used as the classification model in our framework, we design an informativeness measure based on the reconstruction error and the entropy of the probability distribution over the class-specific reconstruction errors of the sample.

For dictionary learning, the samples which are well represented by the current learned dictionary are less likely to provide more information for further refining the dictionary. Instead, the samples that have large reconstruction error and large uncertainty should be the main focus, because they carry additional information that is not captured by the current dictionary. As a consequence, the informativeness measure is defined as follows:

$$E(x_j) = \mathrm{Error}R_j + \mathrm{Entropy}_j, \tag{6}$$

where $\mathrm{Error}R_j$ and $\mathrm{Entropy}_j$ denote the reconstruction error of the sample $x_j \in \mathcal{D}_u$ with respect to the current learned dictionary and the entropy of the probability distribution over the class-specific reconstruction errors of the sample $x_j \in \mathcal{D}_u$, respectively. $\mathrm{Error}R_j$ is defined as

$$\mathrm{Error}R_j = \min_{c} \left\| x_j - D_c P_c x_j \right\|^2, \tag{7}$$

where $D_c$ and $P_c$ represent the subdictionary pair corresponding to class $c$, which is learned by the DPL algorithm as shown in (4). A larger $\mathrm{Error}R_j$ indicates that the current learned dictionary does not represent the sample $x_j$ well.

Since the class-specific reconstruction error is used to identify the class label of sample $x_j$ (as shown in (5)), the probability distribution of the class-specific reconstruction errors of $x_j$ can be acquired. The class-specific reconstruction error probability of $x_j$ in class $c$ is defined as

$$p_c(x_j) = \frac{\left\| x_j - D_c P_c x_j \right\|^2}{\sum_{c=1}^{C} \left\| x_j - D_c P_c x_j \right\|^2}. \tag{8}$$

The class probability distribution $p(x_j)$ for sample $x_j$ is computed as $p(x_j) = [p_1, p_2, \ldots, p_C]$, which demonstrates how well the dictionary distinguishes the input sample. That is, if an input sample can be expressed well by the current dictionary, we will get a small value of $\|x_j - D_c P_c x_j\|_2$ for one of the class-specific subdictionaries, and thus the class distribution should reach a valley at the most likely class. Entropy is a measure of uncertainty. Hence, in order to estimate the uncertainty of an input sample's label, the entropy of the probability distribution over the class-specific reconstruction errors is calculated as follows:

$$\mathrm{Entropy}_j = -\sum_{c=1}^{C} p_c(x_j) \log p_c(x_j). \tag{9}$$

A high $\mathrm{Entropy}_j$ value demonstrates that $x_j$ is difficult to classify with the current learned dictionary; thus it should be selected to be labeled and added to the labeled set for further dictionary training.
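A compact sketch of the informativeness computation, combining (6)-(9), follows; it assumes the per-class dictionary pairs D and P from the DPL sketch above. The small eps inside the logarithm is our own safeguard against zero probabilities.

```python
# A sketch of the informativeness measure (6)-(9): reconstruction error of
# the best class plus the entropy of the normalized class-wise residuals.
import numpy as np

def informativeness(x, D, P, eps=1e-12):
    errs = np.array([np.linalg.norm(x - D[c] @ (P[c] @ x)) ** 2
                     for c in range(len(D))])
    error_r = errs.min()                    # Eq. (7)
    p = errs / errs.sum()                   # Eq. (8)
    entropy = -(p * np.log(p + eps)).sum()  # Eq. (9)
    return error_r + entropy                # Eq. (6)
```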

Representativeness Measure. Since the informativeness measure only considers how the candidate samples relate to the current classification model, it ignores the potential structure information of the whole unlabeled input dataset. Therefore, the representativeness measure is employed as an additional criterion to choose useful unlabeled samples. The representativeness measure evaluates whether the samples represent well the overall input patterns of the unlabeled data; it exploits the relation between a candidate sample and the rest of the unlabeled samples.

The distribution of the unlabeled data is very useful for training a good classifier. In previous active learning work, the marginal density and the cosine distance were used as representativeness measures to capture information about the data distribution [39, 42]. Li and Guo [43] defined a more straightforward representativeness measure, called mutual information. Its intention is to select the samples located in the dense regions of the unlabeled data distribution, which are more representative of the remaining unlabeled data than the ones located in sparse regions. We introduce the representativeness measure framework proposed by Li and Guo [43] into dictionary learning.

For an unlabeled sample $x_j$, the mutual information with respect to the other unlabeled samples is defined as follows:

$$M(x_j) = H(x_j) - H(x_j \mid X_{U_j}) = \frac{1}{2} \ln \left( \frac{\sigma_j^2}{\sigma_{j|U_j}^2} \right), \tag{10}$$

where $H(x_j)$ and $H(x_j \mid X_{U_j})$ denote the entropy and the conditional entropy of sample $x_j$, respectively; $U_j$ represents the index set of the unlabeled samples with $j$ removed from $U$, $U_j = U - \{j\}$; and $X_{U_j}$ represents the set of samples indexed by $U_j$. $\sigma_j^2$ and $\sigma_{j|U_j}^2$ can be calculated by the following formulas:

$$\sigma_j^2 = \mathcal{K}(x_j, x_j), \qquad \sigma_{j|U_j}^2 = \sigma_j^2 - \Sigma_{jU_j} \Sigma_{U_jU_j}^{-1} \Sigma_{U_jj}. \tag{11}$$

Inputs: Labeled set $\mathcal{D}_l$ and its label set $\mathcal{Y}_l$; unlabeled set $\mathcal{D}_u$; the number of iterations $I_t$; and the number of unlabeled samples $N_s$ to be selected in each iteration.
(1) Initialization: learn an initial dictionary pair $D^*$ and $P^*$ by the DPL algorithm from $\mathcal{D}_l$.
(2) For $i = 1$ to $I_t$ do
(3) Compute $E(x_j)$ and $M(x_j)$ by (6) and (10) for each sample $x_j$ in the unlabeled dataset $\mathcal{D}_u$.
(4) Select $N_s$ samples (denoted by $\mathcal{D}_s$) from $\mathcal{D}_u$ by (13) and add them into $\mathcal{D}_l$ with their class labels, which are manually assigned by the user. Then update $\mathcal{D}_u = \mathcal{D}_u - \mathcal{D}_s$ and $\mathcal{D}_l = \mathcal{D}_l \cup \mathcal{D}_s$.
(5) Learn the refined dictionaries $D^*_{\mathrm{new}}$ and $P^*_{\mathrm{new}}$ over the expanded dataset $\mathcal{D}_l$.
(6) End for
Output: Final learned dictionary pair $D^*_{\mathrm{new}}$ and $P^*_{\mathrm{new}}$.

Algorithm 1: Active discriminative dictionary learning (ADDL).

Assume that the index set $U_j = \{1, 2, 3, \ldots, t\}$; then $\Sigma_{U_j U_j}$ is a kernel matrix defined over all the unlabeled samples indexed by $U_j$, computed as

$$\Sigma_{U_j U_j} = \begin{pmatrix} \mathcal{K}(x_1, x_1) & \mathcal{K}(x_1, x_2) & \cdots & \mathcal{K}(x_1, x_t) \\ \mathcal{K}(x_2, x_1) & \mathcal{K}(x_2, x_2) & \cdots & \mathcal{K}(x_2, x_t) \\ \vdots & \vdots & \ddots & \vdots \\ \mathcal{K}(x_t, x_1) & \mathcal{K}(x_t, x_2) & \cdots & \mathcal{K}(x_t, x_t) \end{pmatrix}, \tag{12}$$

where $\mathcal{K}(\cdot, \cdot)$ is a symmetric positive definite kernel function. In our approach, we apply the simple and effective linear kernel $\mathcal{K}(x_i, x_j) = x_i^{T} x_j$ for our dictionary learning task.

The mutual information is used to implicitly exploit the information between the selected samples and the remaining ones. The samples which have large mutual information should be selected from the pool of unlabeled data for refining the learned dictionary in DPL.
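The following sketch computes the representativeness score of one pooled sample according to (10)-(12) with the linear kernel. The small regularizer gamma added to the kernel submatrix is our own numerical safeguard, not part of the paper, and a practical implementation would precompute the kernel matrix once rather than per call.

```python
# A sketch of the representativeness measure (10)-(12) with a linear kernel.
# X_u: m x n matrix of unlabeled samples (columns); returns M for column j.
import numpy as np

def representativeness(X_u, j, gamma=1e-6):
    n = X_u.shape[1]
    rest = [i for i in range(n) if i != j]                # index set U_j
    K = X_u.T @ X_u                                       # linear kernel matrix
    var_j = K[j, j]                                       # sigma_j^2, Eq. (11)
    K_uu = K[np.ix_(rest, rest)] + gamma * np.eye(n - 1)  # regularized block
    k_ju = K[j, rest]
    var_cond = var_j - k_ju @ np.linalg.solve(K_uu, k_ju) # Eq. (11)
    return 0.5 * np.log(max(var_j, gamma) / max(var_cond, gamma))  # Eq. (10)
```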

Procedure of Active Samples Selection. Based on the above analysis, we aim to integrate the strengths of the informativeness measure and the representativeness measure to select unlabeled samples from the pool of unlabeled data. We choose the samples that have not only a large reconstruction error and a large entropy of the probability distribution over the class-specific reconstruction errors with respect to the DPL classification model but also a large representativeness regarding the rest of the unlabeled samples. The sample set $\mathcal{D}_s = \{x^s_1, x^s_2, \ldots, x^s_{N_s}\}$ is iteratively selected from the pool by the following formula:

$$x^{s} = \arg\max_{x_j} \left( E(x_j) + M(x_j) \right), \quad j \in \{n_l + 1, n_l + 2, \ldots, N\}. \tag{13}$$

The overall procedure of our ADDL is given in Algorithm 1.
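Putting the pieces together, a sketch of the ADDL loop of Algorithm 1 is given below. It reuses the hypothetical helpers from the earlier sketches (dpl_train, informativeness, representativeness) and an oracle callback standing in for the human annotator; labels are assumed to be 0-based and consecutive. Scoring every pooled sample this way is illustrative rather than efficient.

```python
# A sketch of the ADDL loop (Algorithm 1), under the assumptions stated above.
import numpy as np

def addl(X_l, y_l, X_u, oracle, n_select=50, iters=3, **dpl_args):
    """X_l: m x n_l labeled data; X_u: m x n_u unlabeled pool."""
    for _ in range(iters):
        X_by_class = [X_l[:, y_l == c] for c in np.unique(y_l)]
        D, P = dpl_train(X_by_class, **dpl_args)
        # Score every pooled sample by informativeness + representativeness.
        scores = np.array([informativeness(X_u[:, j], D, P)
                           + representativeness(X_u, j)
                           for j in range(X_u.shape[1])])
        picked = np.argsort(scores)[-n_select:]      # Eq. (13): largest scores
        X_l = np.hstack([X_l, X_u[:, picked]])
        y_l = np.concatenate([y_l, [oracle(X_u[:, j]) for j in picked]])
        X_u = np.delete(X_u, picked, axis=1)
    X_by_class = [X_l[:, y_l == c] for c in np.unique(y_l)]
    return dpl_train(X_by_class, **dpl_args)         # final dictionary pair
```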

4. Experiments

In this section, the performance of the proposed framework is evaluated on two weather datasets. We first give the details about the datasets and experimental settings. Then the experimental results are provided and analyzed.

4.1. Datasets and Experimental Setting

Datasets. The first dataset employed in our experiment is the dataset provided by Chen et al. [9] (denoted as DATASET 1). DATASET 1 contains 1000 images of size 3966 × 270, and each image has been manually labeled as sunny, cloudy, or overcast. There are 276 sunny images, 251 cloudy images, and 473 overcast images in DATASET 1. Figure 2 shows three images from DATASET 1 labeled sunny, cloudy, and overcast, respectively.

Because there are few available public datasets for weather recognition, we constructed a new dataset (denoted as DATASET 2) to test the performance of the proposed method. The images in DATASET 2 are selected from the panorama images collected on the roof of the BC building at EPFL (http://panorama.epfl.ch provides high resolution (13200 × 900) panorama images from 2005 until now, recorded every 10 minutes during daytime) and categorized into sunny, cloudy, or overcast based on the classification criterion presented in [9]. It includes 5000 images, which were captured approximately every 30 minutes during daytime in 2014, and the size of each image is 4821 × 400. Although both DATASET 1 and DATASET 2 are constructed from the images provided by http://panorama.epfl.ch, DATASET 2 is more challenging because it contains a large number of images captured in different seasons. In Figure 3, some examples of the different labeled images in DATASET 2 are shown.

[Figure 2: Examples in DATASET 1. (a) Sunny; (b) cloudy; (c) overcast.]

[Figure 3: Examples in DATASET 2. (a) Sunny; (b) cloudy; (c) overcast.]

Experimental Setting. For the sky region of each image, four types of visual appearance features are extracted, including 200-dim SIFT, a 600-dim HSV color feature (200-dim histogram of the H channel, 200-dim histogram of the S channel, and 200-dim histogram of the V channel), 200-dim LBP, and a 200-dim gradient magnitude feature. These features are only extracted from the sky part of each image, and the sky detector and feature extraction procedure provided by Chen et al. [9] are used in this paper. Besides the visual appearance features, two kinds of features based on the physical characteristics of images captured under different weather are also extracted from the nonsky region; they consist of a 200-dim bag-of-words representation of the contrast and a 200-dim bag-of-words representation of the dark channel. To be specific, we divide each nonsky region of the image into 32 × 32 blocks, extract the contrast and the dark channel features by (1) and (2), and then use the bag-of-words model [37] to code each feature.

In our experiment, 50% of the images in each dataset are randomly selected for training, and the remaining data is used for testing. The training data are randomly partitioned into a labeled sample set $\mathcal{D}_l$ and an unlabeled sample set $\mathcal{D}_u$; $\mathcal{D}_l$ is applied for learning an initial dictionary, and $\mathcal{D}_u$ is utilized for actively selecting "profitable" samples to iteratively improve the classification performance. To make the experimental results more convincing, each of the following experimental processes is repeated ten times, and then the mean and standard deviation of the classification accuracy are reported.

4.2. Experiment I: Verifying the Performance of Feature Extraction. The effectiveness of the feature extraction in our method is evaluated first. Many previous works merely extract the visual appearance features of the sky part for weather recognition [9]. In order to validate the power of our extracted features based on the physical characteristics of images, the results of two weather recognition schemes are compared: one uses only the visual appearance features to classify the weather conditions, and the other combines the visual appearance features with the features based on physical characteristics. In order to weaken the influence of the classifier on the weather recognition results, K-NN classification (K is empirically set to 30), SVM with the radial basis function kernel, and the original DPL without the active sample selection procedure are applied in this experiment. Figures 4 and 5 show the comparison results on DATASET 1 and DATASET 2, respectively.

In Figures 4 and 5, the x-axis represents the number of training samples and the y-axis represents the average classification accuracy. The red dotted lines indicate that just the six visual appearance features of the sky area are used, and the blue solid lines indicate that both the six visual appearance features and the two features based on physical characteristics of the nonsky area are applied for recognition. From Figures 4 and 5, it is clearly observed that the combination of visual appearance features and physical features achieves better performance on the weather recognition task.

4.3. Experiment II: Recognizing Weather Conditions by the Proposed ADDL. In this section, the performance of the proposed ADDL algorithm is evaluated. First, an experiment is conducted to determine the best parameters for ADDL. Then ADDL is compared against several popular classification methods.

4.3.1. Parameters Selection. There are three important parameters in the proposed approach, namely, $k$, $\lambda$, and $\tau$: $k$ is the number of atoms in each subdictionary $D_c$ learned from the samples of each class, $\lambda$ is used to control the discriminative property of $P$, and $\tau$ is a scalar constant in the DPL algorithm. The performance of our ADDL under various values of $k$, $\lambda$, and $\tau$ is studied on DATASET 1. Figure 6(a) lists the classification results for $k = 15, 25, 35, 45, 55, 65, 75, 85, 95$. It can be seen that the highest average classification accuracy is obtained when $k = 25$, which demonstrates that ADDL is effective in learning a compact dictionary. Based on the observation in Figure 6(a), we set $k$ to 25 in all experiments. The classification results obtained by using different $\tau$ and $\lambda$ are shown in Figures 6(b) and 6(c). From Figures 6(b) and 6(c), the optimal values of $\lambda$ and $\tau$ are 0.05 and 25, respectively. This is because a too big or too small $\lambda$ value will lead the reconstruction coefficients in ADDL to be too sparse or too dense, which deteriorates the classification performance. If $\tau$ is too large, the effect of the reconstruction error constraint (the first term in (4)) and the sparse constraint (the third term in (4)) is weakened, which decreases the discrimination ability of the learned dictionary. On the contrary, if $\tau$ is too small, the second term in (4) will be neglected in the dictionary learning, which also reduces the performance of the algorithm. Hence we set $\lambda = 0.05$ and $\tau = 25$ for all experiments.

[Figure 4: Weather recognition results over DATASET 1 using different features. (a) K-NN; (b) SVM; (c) DPL. Each panel plots the average accuracy (%) against the number of training samples (100 to 500) for the 6-feature and 8-feature sets.]

Now we evaluate whether active sample selection can improve the recognition performance. 500 samples in DATASET 1 are randomly selected as training data, and the remaining samples are used for testing. In the training dataset, 50 samples are randomly selected as the labeled dataset $\mathcal{D}_l$, and the remaining 450 samples form the unlabeled dataset $\mathcal{D}_u$. The proposed ADDL uses the labeled dataset $\mathcal{D}_l$ to learn an initial dictionary pair and then iteratively selects 50 samples from $\mathcal{D}_u$ to label for expanding the training dataset. Figure 7 shows the recognition accuracy versus the number of iterations.

In Figure 7, the 0th iteration indicates that only the initial 50 labeled samples are used to learn the dictionary, and the 9th iteration means using all 500 training samples to learn the dictionary for recognition. The recognition ability of ADDL is improved by the active sample selection procedure, and it achieves the highest accuracy when the number of iterations is 3, at which point a total of 200 samples are used for training. It is worth mentioning that ADDL obtains the best results when the number of iterations is set to 3; if more than 3 iterations are used, the recognition rate drops by about 1%. This is because there are some noisy examples or "outliers" in the unlabeled dataset, and more of them are selected to learn the dictionary as the number of iterations increases, which interferes with the dictionary learning and degrades the classification performance. In the following, the number of iterations is set to 3 for all experiments.

4.3.2. Comparisons of ADDL with Other Methods. Here the proposed ADDL is compared with several methods. The first two methods are the K-NN algorithm used by Song et al. [7] and SVM with the radial basis function kernel (RBF-SVM) adopted by Roser and Moosmann [5]. The third method is SRC [15], which directly uses all training samples as the dictionary for classification. In order to confirm that the active sample selection in our ADDL method is effective, the proposed ADDL is also compared with the original DPL method [19]. We further compare ADDL with the method proposed by Chen et al. [9]. As far as we know, the work in [9] is the only framework which addresses the same problem as our method, that is, recognizing different weather conditions (sunny, cloudy, and overcast) in images captured by a still camera; it actively selects useful samples to train an SVM for recognizing different weather conditions.

[Figure 5: Weather recognition results over DATASET 2 using different features. (a) K-NN; (b) SVM; (c) DPL. Each panel plots the average accuracy (%) against the number of training samples (100 to 2500) for the 6-feature and 8-feature sets.]

For DATASET 1, 500 images are randomly selected as the training samples, and the rest of the images are used as the testing samples. In the training dataset, 50 images are randomly chosen as the initial labeled training dataset $\mathcal{D}_l$, and the remaining 450 images are regarded as the unlabeled dataset $\mathcal{D}_u$. ADDL and Chen's method [9] both include the active learning procedure; thus they iteratively choose 150 samples from $\mathcal{D}_u$ to be labeled based on their own criteria of sample selection and add these samples to $\mathcal{D}_l$ for further training the classification model. For the K-NN, RBF-SVM, SRC, and DPL methods, which are without the active learning procedure, 150 samples are randomly selected from $\mathcal{D}_u$ to be labeled for expanding the labeled training dataset $\mathcal{D}_l$. Table 1 lists the comparisons of our approach with several methods for weather classification.

[Figure 6: Selection of parameters. The x-axis represents the different values of the parameters and the y-axis the average classification accuracy. (a) The average classification rate under different k; (b) under different λ; (c) under different τ.]

[Figure 7: Recognition accuracy on DATASET 1 versus the number of iterations.]

Table 1: Comparisons on DATASET 1 among different methods.

Methods               Classification rate (mean ± std)
K-NN                  82.9 ± 1.2
RBF-SVM               85.6 ± 1.6
SRC                   89.7 ± 0.9
DPL                   91.3 ± 1.6
Chen's method [9]     92.98 ± 0.55
ADDL                  94.0 ± 0.2

As can be seen from Table 1, ADDL outperforms the other methods; the mean classification rate of ADDL reaches about 94%.

In DATASET 2, 2500 images are randomly selected as the training samples, and the rest of the images are used as the testing samples. In the training dataset, 50 images are randomly chosen as the initial labeled training dataset $\mathcal{D}_l$, and the remaining 2450 images are regarded as the unlabeled dataset $\mathcal{D}_u$. All parameter settings for DATASET 2 are the same as for DATASET 1. Table 2 lists the recognition results of the different methods, which indicates that the proposed ADDL performs better than the other methods.

Table 2: Comparison on DATASET 2 among different methods.

Methods               Classification rate (mean ± std)
K-NN                  85.0 ± 1.6
RBF-SVM               88.1 ± 1.4
SRC                   88.2 ± 1.1
DPL                   89.6 ± 0.8
Chen's method [9]     88.8 ± 1.6
ADDL                  90.4 ± 1.0

From Tables 1 and 2, two points can be observed. First, the recognition performances of K-NN, RBF-SVM, SRC, and DPL are overall inferior to the proposed ADDL algorithm. This is probably because these four algorithms randomly select the unlabeled data from the given pool and do not consider whether the selected samples are beneficial for improving the performance of the classification model. Second, although the proposed ADDL and Chen's method [9] both include the active learning paradigm, the proposed ADDL performs better than Chen's method [9]. This is due to the fact that Chen's method [9] only considers the informativeness and ignores the representativeness of samples when selecting the unlabeled samples from the given pool.

5. Conclusions

We have presented an effective framework for classifying three types of weather (sunny, cloudy, and overcast) based on outdoor images. Through an analysis of the different visual manifestations of images caused by different weather, various features are separately extracted from the sky area and nonsky area of images; they describe the visual appearance properties and the physical characteristics of images under different weather conditions, respectively. The ADDL approach was proposed to learn a more discriminative dictionary for improving the weather classification performance by selecting informative and representative unlabeled samples from a given pool to expand the training dataset. Since there are few image datasets for weather recognition, we have collected and labeled a new weather dataset for testing the proposed algorithm. The experimental results show that ADDL is a fairly effective and inspiring strategy for weather classification, which can also be used in many other computer vision tasks.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This work is supported by the Fund of the Jilin Provincial Science & Technology Department (nos. 20130206042GX and 20140204089GX) and the National Natural Science Foundation of China (nos. 61403078, 11271064, and 61471111).

References

[1] F. Nashashibi, R. de Charette, and A. Lia, "Detection of unfocused raindrops on a windscreen using low level image processing," in Proceedings of the 11th International Conference on Control Automation Robotics and Vision (ICARCV '10), pp. 1410-1415, Singapore, December 2010.

[2] H. Kurihata, T. Takahashi, I. Ide et al., "Rainy weather recognition from in-vehicle camera images for driver assistance," in Proceedings of the IEEE Intelligent Vehicles Symposium, pp. 205-210, IEEE, Las Vegas, Nev, USA, June 2005.

[3] H. Woo, Y. M. Jung, J.-G. Kim, and J. K. Seo, "Environmentally robust motion detection for video surveillance," IEEE Transactions on Image Processing, vol. 19, no. 11, pp. 2838-2848, 2010.

[4] H. Katsura, J. Miura, M. Hild, and Y. Shirai, "A view-based outdoor navigation using object recognition robust to changes of weather and seasons," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '03), vol. 3, pp. 2974-2979, IEEE, Las Vegas, Nev, USA, October 2003.

[5] M. Roser and F. Moosmann, "Classification of weather situations on single color images," in Proceedings of the IEEE Intelligent Vehicles Symposium, pp. 798-803, IEEE, Eindhoven, The Netherlands, June 2008.

[6] X. Yan, Y. Luo, and X. Zheng, "Weather recognition based on images captured by vision system in vehicle," in Advances in Neural Networks: ISNN 2009, pp. 390-398, Springer, Berlin, Germany, 2009.

[7] H. Song, Y. Chen, and Y. Gao, "Weather condition recognition based on feature extraction and K-NN," in Foundations and Practical Applications of Cognitive Systems and Information Processing: Proceedings of the First International Conference on Cognitive Systems and Information Processing (CSIP 2012), Advances in Intelligent Systems and Computing, pp. 199-210, Springer, Berlin, Germany, 2014.

[8] C. Lu, D. Lin, J. Jia, and C.-K. Tang, "Two-class weather classification," in Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '14), pp. 3718-3725, Columbus, Ohio, USA, June 2014.

[9] Z. Chen, F. Yang, A. Lindner, G. Barrenetxea, and M. Vetterli, "How is the weather: automatic inference from images," in Proceedings of the 19th IEEE International Conference on Image Processing (ICIP '12), pp. 1853-1856, Orlando, Fla, USA, October 2012.

[10] K. He, J. Sun, and X. Tang, "Single image haze removal using dark channel prior," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 12, pp. 2341-2353, 2011.

[11] A. Li and H. Shouno, "Dictionary-based image denoising by fused-lasso atom selection," Mathematical Problems in Engineering, vol. 2014, Article ID 368602, 10 pages, 2014.

[12] Z. Feng, M. Yang, L. Zhang, Y. Liu, and D. Zhang, "Joint discriminative dimensionality reduction and dictionary learning for face recognition," Pattern Recognition, vol. 46, no. 8, pp. 2134-2143, 2013.

[13] J. Dong, C. Sun, and W. Yang, "A supervised dictionary learning and discriminative weighting model for action recognition," Neurocomputing, vol. 158, pp. 246-256, 2015.

[14] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation," IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4311-4322, 2006.

[15] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210-227, 2009.

[16] J. Mairal, F. Bach, and J. Ponce, "Task-driven dictionary learning," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 4, pp. 791-804, 2012.

[17] Z. Jiang, Z. Lin, and L. S. Davis, "Label consistent K-SVD: learning a discriminative dictionary for recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 11, pp. 2651-2664, 2013.

[18] M. Yang, L. Zhang, X. Feng, and D. Zhang, "Sparse representation based Fisher discrimination dictionary learning for image classification," International Journal of Computer Vision, vol. 109, no. 3, pp. 209-232, 2014.

[19] S. Gu, L. Zhang, W. Zuo, and X. Feng, "Projective dictionary pair learning for pattern classification," in Proceedings of the 28th Annual Conference on Neural Information Processing Systems (NIPS '14), pp. 793-801, December 2014.

[20] B. Settles, Active Learning Literature Survey, vol. 52, University of Wisconsin-Madison, Madison, Wis, USA, 2010.

[21] D. Cohn, L. Atlas, and R. Ladner, "Improving generalization with active learning," Machine Learning, vol. 15, no. 2, pp. 201-221, 1994.

[22] I. Dagan and S. P. Engelson, "Committee-based sampling for training probabilistic classifiers," in Proceedings of the 12th International Conference on Machine Learning, pp. 150-157, Morgan Kaufmann, Tahoe City, Calif, USA, 1995.

[23] D. Angluin, "Queries and concept learning," Machine Learning, vol. 2, no. 4, pp. 319-342, 1988.

[24] R. D. King, K. E. Whelan, F. M. Jones et al., "Functional genomic hypothesis generation and experimentation by a robot scientist," Nature, vol. 427, no. 6971, pp. 247-252, 2004.

[25] S. Chakraborty, V. Balasubramanian, and S. Panchanathan, "Dynamic batch mode active learning," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 2649-2656, Providence, RI, USA, June 2011.

[26] K. Nigam and A. McCallum, "Employing EM in pool-based active learning for text classification," in Proceedings of the 15th International Conference on Machine Learning (ICML '98), pp. 350-358, Morgan Kaufmann, Madison, Wis, USA, 1998.

[27] A. Kapoor, K. Grauman, R. Urtasun, and T. Darrell, "Active learning with Gaussian processes for object categorization," in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1-8, IEEE, Rio de Janeiro, Brazil, October 2007.

[28] S. J. Huang, R. Jin, and Z. H. Zhou, "Active learning by querying informative and representative examples," in Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS '10), pp. 892-900, 2010.

[29] P. Donmez, J. G. Carbonell, and P. N. Bennett, "Dual strategy active learning," in Machine Learning: ECML 2007, vol. 4701 of Lecture Notes in Computer Science, pp. 116-127, Springer, Berlin, Germany, 2007.

[30] S. C. H. Hoi, R. Jin, J. Zhu, and M. R. Lyu, "Semi-supervised SVM batch mode active learning for image retrieval," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1-7, Anchorage, Alaska, USA, June 2008.

[31] Z. Xu, K. Yu, V. Tresp, X. Xu, and J. Wang, "Representative sampling for text classification using support vector machines," in Proceedings of the 25th European Conference on IR Research (ECIR '03), Springer, 2003.

[32] A. Bosch, A. Zisserman, and X. Munoz, "Image classification using random forests and ferns," in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1-8, IEEE, Rio de Janeiro, Brazil, October 2007.

[33] T. Ojala, M. Pietikainen, and D. Harwood, "A comparative study of texture measures with classification based on feature distributions," Pattern Recognition, vol. 29, no. 1, pp. 51-59, 1996.

[34] S. G. Narasimhan and S. K. Nayar, "Chromatic framework for vision in bad weather," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '00), pp. 598-605, Hilton Head Island, SC, USA, June 2000.

[35] S. G. Narasimhan and S. K. Nayar, "Removing weather effects from monochrome images," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. II-186-II-193, IEEE, Kauai, Hawaii, USA, December 2001.

[36] S. G. Narasimhan and S. K. Nayar, "Vision and the atmosphere," International Journal of Computer Vision, vol. 48, no. 3, pp. 233-254, 2002.

[37] G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, "Visual categorization with bags of keypoints," in Proceedings of the International Workshop on Statistical Learning in Computer Vision (ECCV '04), pp. 1-22, Prague, Czech Republic, 2004.

[38] D. D. Lewis and W. A. Gale, "A sequential algorithm for training text classifiers," in Proceedings of the Seventeenth Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval (SIGIR '94), pp. 3-12, Springer, Berlin, Germany, 1994.

[39] B. Settles and M. Craven, "An analysis of active learning strategies for sequence labeling tasks," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '08), pp. 1070-1079, Association for Computational Linguistics, 2008.

[40] G. Schohn and D. Cohn, "Less is more: active learning with support vector machines," in Proceedings of the 17th International Conference on Machine Learning (ICML '00), pp. 839-846, Morgan Kaufmann, 2000.

[41] S. Tong and D. Koller, "Support vector machine active learning with applications to text classification," The Journal of Machine Learning Research, vol. 2, pp. 45-66, 2002.

[42] M. Szummer and T. S. Jaakkola, "Information regularization with partially labeled data," in Advances in Neural Information Processing Systems, pp. 1025-1032, MIT Press, 2002.

[43] X. Li and Y. Guo, "Adaptive active learning for image classification," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '13), pp. 859-866, IEEE, Portland, Ore, USA, June 2013.


weather conditions of images into sunny, cloudy, or overcast. Nevertheless, they extracted features only from the sky part of the images, so the useful information in the nonsky part was neglected.

In general, the above methods have been successfully applied to weather recognition. However, they still suffer from the following limitations. Firstly, since these methods extracted features (e.g., SIFT, LBP, and HSV color) from whole images or only from the sky part of images, they neglected the different kinds of useful information carried by the sky and nonsky parts. In the sky part of an image, the distribution of cloud and the color of the sky are key factors for weather recognition, so visual appearance features such as texture, color, and shape should be extracted. In the nonsky part, features based on physical properties, which can characterize the changes in images caused by varying weather conditions, such as image contrast [5] and the dark channel [10], should be considered. Secondly, these methods directly used K-NN or SVM based algorithms to classify the different weather conditions. Although K-NN is a simple classifier that is easy to implement, it is not robust enough for the complicated real-world images encountered in practical applications, while SVM is a complex classifier for which it is difficult to select an appropriate kernel function. Lastly, these methods required a large amount of labeled data for training, which is often expensive and seldom available.

To address the above problems, we propose a novel framework to recognize different weather conditions (sunny, cloudy, and overcast) from outdoor images acquired by a static camera looking at the same scene over a period of time. The proposed method extracts not only features from the sky parts of images, which are relevant to the visual manifestations of different weather conditions, but also features based on physical characteristics in the nonsky parts. Thus, the extracted features are more comprehensive for distinguishing images captured under various weather situations. Unlike other methods, which used traditional classifiers (e.g., SVM and K-NN), discriminative dictionary learning is used as the classification model. Moreover, in order to achieve good weather recognition performance with only a few labeled samples, an active discriminative dictionary learning (ADDL) algorithm is proposed. In ADDL, an active learning procedure is introduced into dictionary learning, which selects informative and representative samples for learning a compact and discriminative dictionary to classify weather conditions. As far as we know, the proposed framework is the first approach that combines active learning with dictionary learning for weather recognition.

The rest of this paper is organized as follows. Section 2 briefly reviews related work. Section 3 presents the details of the feature extraction and the proposed ADDL algorithm. Extensive experiments and comparisons are conducted in Section 4, and Section 5 concludes the paper.

2. Related Work

2.1. Dictionary Learning. Dictionary learning has emerged in recent years as one of the most popular tools for learning the intrinsic structure of images, and it has achieved state-of-the-art performance in many computer vision and pattern recognition tasks [11-13]. Unsupervised dictionary learning algorithms such as K-SVD [14] have achieved satisfactory results in image restoration, but they are not suitable for classification tasks because they only require that the learned dictionary faithfully represent the training samples. By exploiting the label information of the training dataset, some supervised dictionary learning approaches have been proposed to learn a discriminative dictionary for classification. Among these methods, one may directly use the training data of all classes as the dictionary, and a test image can then be classified by finding which class leads to the minimal reconstruction error. Such a naive supervised dictionary learning method is the sparse representation based classification (SRC) algorithm [15], which has shown good performance in face recognition. However, it is less effective for classification when the raw training images include noise and trivial information, and it cannot sufficiently exploit the discriminative information in the training data. Fortunately, this problem can be addressed by properly learning a dictionary from the original training data: after the test samples are encoded over the learned dictionary, both the coding residual and the coding coefficients can be employed for identifying the different classes of samples [16]. Jiang et al. [17] proposed a method named label consistent K-SVD (LC-K-SVD), which encouraged samples from the same class to have similar sparse codes by applying a binary class label sparse code matrix. The Fisher discrimination dictionary learning (FDDL) method [18] was proposed based on the Fisher criterion; it learns a structured dictionary to distinguish samples from different classes. Most existing supervised dictionary learning methods adopt $\ell_0$-norm or $\ell_1$-norm sparsity regularization and consequently suffer heavy computational costs in both the training and testing phases. To address this limitation, Gu et al. [19] developed a projective dictionary pair learning (DPL) algorithm, which learns an analysis dictionary together with a synthesis dictionary to attain the goal of signal representation and discrimination. DPL avoids solving $\ell_0$-norm or $\ell_1$-norm problems and thus accelerates both training and testing, so we adopt DPL as the classification model in our work.

2.2. Active Learning. Active learning, which aims to construct an efficient training dataset to improve classification performance through iterative sampling, has been well studied in the computer vision field. According to [20], existing active learning algorithms can be divided into three categories: stream-based [21, 22], membership query synthesis [23, 24], and pool-based [25-27]. Among them, pool-based active learning is the most widely used for real-world learning problems because it assumes that there is a small set of labeled data and a large pool of unlabeled data available, an assumption consistent with most practical situations. In this paper, we adopt pool-based active learning.

The crucial point of pool-based active learning is how to define a strategy to rank the unlabeled samples in the pool. Two criteria, informativeness and representativeness, are widely considered for evaluating unlabeled samples. Informativeness measures the capacity of a sample to reduce the uncertainty of the classification model, while representativeness measures whether a sample represents the overall input patterns of the unlabeled data well [20]. Most active learning methods take only one of the two criteria into account when selecting unlabeled samples, which restricts their performance [28]. Although several approaches [29-31] have considered both the informativeness and the representativeness of unlabeled data, to our knowledge these two criteria of active learning have scarcely been introduced into dictionary learning algorithms.

Figure 1: The flow chart of the proposed method. Extracted features (visual appearance based and physical characteristics based) form the labeled, unlabeled, and testing datasets; a dictionary is learned from the labeled data and iteratively refined by expanding the training set with samples selected from the unlabeled data according to the informativeness and representativeness measures, until the final dictionary is output and used for classification.

3. Proposed Method

In this section, we present an efficient scheme, comprising effective feature extraction and the ADDL classification model, to classify the weather conditions of images into sunny, cloudy, and overcast. Figure 1 shows a flow chart of the overall procedure. First, visual appearance based features are extracted from the sky region, and physical characteristics based features are extracted from the nonsky region of images. Secondly, the labeled training dataset is used to learn an initial dictionary, and then samples are iteratively selected from the unlabeled dataset based on two measures, informativeness and representativeness. These selected samples are used to expand the labeled dataset for learning a more discriminative dictionary. Finally, the testing dataset is classified by the learned dictionary.

3.1. Features Extraction. Feature extraction is an essential preprocessing step in pattern recognition problems. In order to express the differences among images of the same scene taken under various weather situations, we analyze the different visual manifestations of images caused by different weather conditions and extract features that describe both the visual appearance properties and the physical characteristics of images.

From the viewpoint of visual appearance, the sky is the most important part of an outdoor image for identifying the weather. In the sunny case, the sky appears blue because light is refracted as it passes through the atmosphere, scattering blue light, while under the overcast condition the sky is white or grey due to the thick cloud cover. The cloudy day is a situation between sunny and overcast: on a cloudy day, clouds float in the sky and exhibit a wide range of shapes and textures. Hence, we extract SIFT [32], HSV color, LBP [33], and the gradient magnitude from the sky parts of images as their visual appearance features. These features are used to describe the texture, color, and shape of images for weather classification.

Unlike Chen et al. [9], who directly eliminated the nonsky regions of images, we extract two features from the nonsky parts based on physical characteristics, which also serve as powerful cues for distinguishing different weather conditions. The interaction of light with the atmosphere has been studied as atmospheric optics. On a sunny (clear) day, the light rays reflected by scene objects reach the observer without alteration or attenuation. However, under bad weather conditions, atmospheric effects can no longer be neglected [34-36]. Bad weather (e.g., overcast) causes a decay in image contrast that is exponential in the depths of the scene points [35], so images of the same scene taken under different weather conditions should have different contrasts. The contrast CON of an image can be computed by

$$\mathrm{CON} = \frac{E_{\max} - E_{\min}}{E_{\max} + E_{\min}}, \qquad (1)$$

where $E_{\max}$ and $E_{\min}$ are the maximum and minimum intensities of the image, respectively.

Overcast weather may come with haze [8]. The dark channel prior presented in [10] is effective for detecting image haze. It is based on the observation that most haze-free patches in the nonsky regions of images have a very low intensity in at least one RGB color channel. The dark channel $J^{\mathrm{dark}}$ is defined as

$$J^{\mathrm{dark}}(x) = \min_{c \in \{r,g,b\}} \Bigl( \min_{x' \in \Omega(x)} J^{c}(x') \Bigr), \qquad (2)$$

where $J^{c}$ is one color channel of the image $J$ and $\Omega(x)$ is a local patch centered at $x$.

In summary, both the visual appearance features (SIFT, HSV color, LBP, and the gradient magnitude) of the sky region and the physical characteristics based features (the contrast and the dark channel) of the nonsky region are extracted to distinguish images under different weather situations. The "bag-of-words" model [37] is then used to code each feature to form the feature vectors.
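To make (1) and (2) concrete, the following is a minimal NumPy/SciPy sketch of the two physical features; the patch size and the small constant guarding against division by zero are illustrative choices, not values specified by the paper:

    import numpy as np
    from scipy.ndimage import minimum_filter

    def contrast(gray):
        # Eq. (1): CON = (E_max - E_min) / (E_max + E_min)
        e_max, e_min = float(gray.max()), float(gray.min())
        return (e_max - e_min) / (e_max + e_min + 1e-8)  # epsilon avoids division by zero

    def dark_channel(rgb, patch_size=15):
        # Eq. (2): minimum over the r, g, b channels, then over the local patch Omega(x)
        per_pixel_min = rgb.min(axis=2)
        return minimum_filter(per_pixel_min, size=patch_size)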

3.2. The Proposed ADDL Classification Model. Inspired by the technological advances in discriminative dictionary learning and active learning, we design an active discriminative dictionary learning (ADDL) algorithm to improve the discriminative power of the learned dictionary. In ADDL, DPL [19] is applied to learn a discriminative dictionary for recognizing different weather conditions, and an active sample selection strategy is developed to iteratively select unlabeled samples from a given pool to enlarge the training dataset and thereby improve the DPL classification performance. The criterion for active sample selection combines an informativeness measure and a representativeness measure.

3.2.1. DPL Algorithm. Suppose there is a set of $m$-dimensional training samples from $C$ classes, denoted by $X = [X_1, \ldots, X_c, \ldots, X_C]$ with the label set $Y = [Y_1, \ldots, Y_c, \ldots, Y_C]$, where $X_c \in R^{m \times n}$ denotes the sample set of the $c$th class and $Y_c$ denotes the corresponding label set. The DPL algorithm [19] jointly learns an analysis dictionary $P = [P_1, \ldots, P_c, \ldots, P_C] \in R^{kC \times m}$ and a synthesis dictionary $D = [D_1, \ldots, D_c, \ldots, D_C] \in R^{m \times kC}$ to avoid the costly $\ell_0$-norm or $\ell_1$-norm sparse coding process. $P$ and $D$ are used for linearly encoding the representation coefficients and for class-specific discriminative reconstruction, respectively. The objective function of the DPL model is

$$\{P^*, D^*\} = \arg\min_{P,D} \sum_{c=1}^{C} \|X_c - D_c P_c X_c\|_F^2 + \lambda \|P_c \bar{X}_c\|_F^2, \quad \text{s.t. } \|d_i\|_2^2 \le 1, \qquad (3)$$

where $D_c \in R^{m \times k}$ and $P_c \in R^{k \times m}$ form the subdictionary pair corresponding to class $c$, $\bar{X}_c$ represents the complementary matrix of $X_c$ (i.e., the training samples of all classes except $c$), $\lambda > 0$ is a scalar constant that controls the discriminative property of $P$, and $d_i$ denotes the $i$th atom of the synthesis dictionary $D$.

The objective function in (3) is generally nonconvex, but it can be relaxed to the following form by introducing a variable matrix $A$:

$$\{P^*, A^*, D^*\} = \arg\min_{P,A,D} \sum_{c=1}^{C} \|X_c - D_c A_c\|_F^2 + \tau \|P_c X_c - A_c\|_F^2 + \lambda \|P_c \bar{X}_c\|_F^2, \quad \text{s.t. } \|d_i\|_2^2 \le 1, \qquad (4)$$

where $\tau$ is an algorithm parameter. According to [19], the objective function in (4) can be solved in an alternately updated manner.

Once $D$ and $P$ have been learned, given a test sample $x_t$, the class-specific reconstruction residual is used to assign the class label, so the classification model associated with DPL is defined as

$$\mathrm{label}(x_t) = \arg\min_{c} \|x_t - D_c P_c x_t\|_2. \qquad (5)$$

When dealing with a classification task, DPL requires sufficient labeled training samples to learn a discriminative dictionary pair that yields good results. In practice, it is difficult and expensive to obtain vast quantities of labeled data. If we can exploit the information provided by massive, inexpensive unlabeled samples and choose small amounts of "profitable" samples (the "profitable" unlabeled samples are the ones that are most beneficial for improving the DPL classification performance) from the unlabeled dataset to be labeled manually, we can learn a more discriminative dictionary than one learned using only a limited number of labeled training data. To achieve this, we introduce the active learning technique into DPL in the next section.
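As an illustration of the decision rule (5), here is a small sketch assuming the per-class subdictionaries learned by DPL are held in two Python lists; the function name and data layout are hypothetical:

    import numpy as np

    def dpl_classify(x_t, D_list, P_list):
        # Eq. (5): assign the class whose dictionary pair (D_c, P_c)
        # reconstructs the test sample x_t with the smallest residual
        residuals = [np.linalg.norm(x_t - D_c @ (P_c @ x_t))
                     for D_c, P_c in zip(D_list, P_list)]
        return int(np.argmin(residuals))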

3.2.2. Introducing Active Learning to DPL. When evaluating whether a sample is "profitable" or not, two measures are considered: informativeness and representativeness. The proposed ADDL iteratively evaluates both the informativeness and the representativeness of the unlabeled samples in a given pool, seeking the ones that are most beneficial for improving the DPL classification performance. Specifically, the informativeness measure is constructed from the reconstruction error and the entropy of the probability distribution over the class-specific reconstruction errors, while the representativeness is obtained from the distribution of the unlabeled dataset.

Assume that we are given an initial training dataset $D_l = \{x_1, x_2, \ldots, x_{n_l}\}$ with its label set $Y_l = \{y_1, y_2, \ldots, y_{n_l}\}$ and an unlabeled dataset $D_u = \{x_{n_l+1}, x_{n_l+2}, \ldots, x_N\}$, where $x_i \in R^{m \times 1}$ is an $m$-dimensional feature vector and $y_i \in \{1, 2, 3\}$ is the corresponding class label (sunny, cloudy, or overcast). The target of ADDL is to iteratively select the $N_s$ most "profitable" samples, denoted by $D_s$, from $D_u$, query their labels $Y_s$, and then add them to $D_l$ to improve the performance of the dictionary learning classification model.

Informativeness Measure. The informativeness measure is an effective criterion for selecting informative samples that reduce the uncertainty of the classification model; it captures the relationship between the candidate samples and the current classification model. Probabilistic classification models select the sample that has the largest entropy on the conditional distribution over its labels [38, 39]. Query-by-committee algorithms choose the samples that cause the most disagreement among a committee [22, 26]. In SVM methods, the most informative sample is regarded as the one closest to the separating hyperplane [40, 41]. Since DPL is used as the classification model in our framework, we design an informativeness measure based on the reconstruction error and the entropy of the probability distribution over the class-specific reconstruction errors of a sample.

For dictionary learning, samples that are well represented by the currently learned dictionary are unlikely to provide additional information for further refining the dictionary. Instead, samples with large reconstruction error and large uncertainty deserve the most attention, because they carry information that is not yet captured by the current dictionary. As a consequence, the informativeness measure is defined as follows:

$$E(x_j) = \mathrm{Error\_R}_j + \mathrm{Entropy}_j, \qquad (6)$$

where $\mathrm{Error\_R}_j$ and $\mathrm{Entropy}_j$ denote, respectively, the reconstruction error of the sample $x_j \in D_u$ with respect to the currently learned dictionary and the entropy of the probability distribution over the class-specific reconstruction errors of $x_j$. $\mathrm{Error\_R}_j$ is defined as

$$\mathrm{Error\_R}_j = \min_{c} \|x_j - D_c P_c x_j\|_2, \qquad (7)$$

where $D_c$ and $P_c$ form the subdictionary pair corresponding to class $c$, learned by the DPL algorithm as shown in (4). A larger $\mathrm{Error\_R}_j$ indicates that the currently learned dictionary does not represent the sample $x_j$ well.

Since the class-specific reconstruction error is used to identify the class label of sample $x_j$ (as shown in (5)), the probability distribution of the class-specific reconstruction errors of $x_j$ can be acquired. The class-specific reconstruction error probability of $x_j$ for class $c$ is defined as

$$p_c(x_j) = \frac{\|x_j - D_c P_c x_j\|_2}{\sum_{c=1}^{C} \|x_j - D_c P_c x_j\|_2}. \qquad (8)$$

The class probability distribution for sample $x_j$ is computed as $p(x_j) = [p_1, p_2, \ldots, p_C]$, which demonstrates how well the dictionary distinguishes the input sample. That is, if an input sample can be expressed well by the current dictionary, $\|x_j - D_c P_c x_j\|_2$ will be small for one of the class-specific subdictionaries, and thus the class distribution should reach its valley at the most likely class. Entropy is a measure of uncertainty. Hence, in order to estimate the uncertainty of an input sample's label, the entropy of the probability distribution over the class-specific reconstruction errors is calculated as follows:

$$\mathrm{Entropy}_j = -\sum_{c=1}^{C} p_c(x_j) \log p_c(x_j). \qquad (9)$$

A high $\mathrm{Entropy}_j$ value demonstrates that $x_j$ is difficult to classify with the currently learned dictionary; thus it should be selected, labeled, and added to the labeled set for further training of the dictionary.
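A compact sketch of the informativeness score (6)-(9), reusing the hypothetical D_list/P_list layout of the DPL classifier sketch above; the small epsilon inside the logarithm is an added numerical safeguard, not part of the paper's formulation:

    import numpy as np

    def informativeness(x_j, D_list, P_list, eps=1e-12):
        # class-specific residuals ||x_j - D_c P_c x_j||_2
        r = np.array([np.linalg.norm(x_j - D_c @ (P_c @ x_j))
                      for D_c, P_c in zip(D_list, P_list)])
        error_R = r.min()                       # Eq. (7)
        p = r / r.sum()                         # Eq. (8)
        entropy = -np.sum(p * np.log(p + eps))  # Eq. (9)
        return error_R + entropy                # Eq. (6)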

Representativeness Measure. Since the informativeness measure considers only how the candidate samples relate to the current classification model, it ignores the potential structural information of the whole unlabeled dataset. Therefore, the representativeness measure is employed as an additional criterion for choosing useful unlabeled samples. The representativeness measure evaluates whether a sample represents the overall input patterns of the unlabeled data well, by exploiting the relation between the candidate sample and the rest of the unlabeled samples.

The distribution of the unlabeled data is very useful for training a good classifier. In previous active learning work, the marginal density and the cosine distance were used as representativeness measures to capture the data distribution [39, 42]. Li and Guo [43] defined a more straightforward representativeness measure based on mutual information. Its intention is to select samples located in the dense region of the unlabeled data distribution, which are more representative with regard to the remaining unlabeled data than samples located in sparse regions. We introduce the representativeness measure framework proposed by Li and Guo [43] into dictionary learning.

For an unlabeled sample $x_j$, the mutual information with respect to the other unlabeled samples is defined as follows:

$$M(x_j) = H(x_j) - H(x_j \mid X_{U_j}) = \frac{1}{2} \ln\left(\frac{\sigma_j^2}{\sigma_{j|U_j}^2}\right), \qquad (10)$$

where $H(x_j)$ and $H(x_j \mid X_{U_j})$ denote the entropy and the conditional entropy of sample $x_j$, respectively; $U_j$ represents the index set of the unlabeled samples with $j$ removed from $U$, that is, $U_j = U - \{j\}$; and $X_{U_j}$ represents the set of samples indexed by $U_j$. $\sigma_j^2$ and $\sigma_{j|U_j}^2$ can be calculated by the following formulas:

$$\sigma_j^2 = \mathcal{K}(x_j, x_j), \qquad \sigma_{j|U_j}^2 = \sigma_j^2 - \Sigma_{jU_j} \Sigma_{U_jU_j}^{-1} \Sigma_{U_jj}. \qquad (11)$$

Assume that the index set $U_j = (1, 2, 3, \ldots, t)$; then $\Sigma_{U_jU_j}$ is a kernel matrix defined over all the unlabeled samples indexed by $U_j$, computed in the following form:

$$\Sigma_{U_jU_j} = \begin{pmatrix} \mathcal{K}(x_1, x_1) & \mathcal{K}(x_1, x_2) & \cdots & \mathcal{K}(x_1, x_t) \\ \mathcal{K}(x_2, x_1) & \mathcal{K}(x_2, x_2) & \cdots & \mathcal{K}(x_2, x_t) \\ \vdots & \vdots & \ddots & \vdots \\ \mathcal{K}(x_t, x_1) & \mathcal{K}(x_t, x_2) & \cdots & \mathcal{K}(x_t, x_t) \end{pmatrix}. \qquad (12)$$

$\mathcal{K}(\cdot, \cdot)$ is a symmetric positive definite kernel function; in our approach, we apply the simple and effective linear kernel $\mathcal{K}(x_i, x_j) = x_i^{\mathsf{T}} x_j$ for our dictionary learning task. The mutual information implicitly exploits the relation between the selected samples and the remaining ones: samples with large mutual information should be selected from the pool of unlabeled data for refining the dictionary learned by DPL.
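The conditional variance in (11) is the Schur complement of the kernel matrix; a minimal sketch with the linear kernel follows (the small ridge term added to the kernel matrix and the epsilons are illustrative numerical safeguards, not part of the paper's formulation):

    import numpy as np

    def mutual_information(j, X_u):
        # X_u: unlabeled samples stacked row-wise; linear kernel K = X_u X_u^T
        K = X_u @ X_u.T
        idx = np.delete(np.arange(X_u.shape[0]), j)            # U_j = U - {j}
        var_j = K[j, j]                                        # sigma_j^2 in Eq. (11)
        k_jU = K[j, idx]
        K_UU = K[np.ix_(idx, idx)] + 1e-6 * np.eye(idx.size)   # ridge for stability
        var_cond = var_j - k_jU @ np.linalg.solve(K_UU, k_jU)  # sigma_{j|U_j}^2
        return 0.5 * np.log((var_j + 1e-12) / max(var_cond, 1e-12))  # Eq. (10)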

Procedure of Active Samples Selection. Based on the above analysis, we integrate the strengths of the informativeness measure and the representativeness measure to select unlabeled samples from the pool. We choose samples that not only have a large reconstruction error and a large entropy of the probability distribution over the class-specific reconstruction errors with respect to the DPL classification model but also have large representativeness with regard to the rest of the unlabeled samples. The sample set $D_s = \{x_1^s, x_2^s, \ldots, x_{N_s}^s\}$ is iteratively selected from the pool by the following formula:

$$x^s = \arg\max_{x_j} \left( E(x_j) + M(x_j) \right), \quad j \in \{n_l+1, n_l+2, \ldots, N\}. \qquad (13)$$

The overall procedure of our ADDL is summarized in Algorithm 1.

Algorithm 1: Active discriminative dictionary learning (ADDL).
Inputs: labeled set $D_l$ with its label set $Y_l$; unlabeled set $D_u$; the number of iterations $I_t$; the number of unlabeled samples $N_s$ to be selected in each iteration.
(1) Initialization: learn an initial dictionary pair $D^*$ and $P^*$ by the DPL algorithm from $D_l$.
(2) For $i = 1$ to $I_t$ do
(3) Compute $E(x_j)$ and $M(x_j)$ by (6) and (10) for each sample $x_j$ in the unlabeled dataset $D_u$.
(4) Select $N_s$ samples (denoted by $D_s$) from $D_u$ by (13) and add them to $D_l$ with class labels manually assigned by the user; then update $D_u = D_u - D_s$ and $D_l = D_l \cup D_s$.
(5) Learn the refined dictionaries $D^*_{\mathrm{new}}$ and $P^*_{\mathrm{new}}$ over the expanded dataset $D_l$.
(6) End for
Output: the final learned dictionary pair $D^*_{\mathrm{new}}$ and $P^*_{\mathrm{new}}$.
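Putting the pieces together, one round of Algorithm 1's selection step (Eq. (13)) can be sketched as follows; train_dpl and oracle (manual labeling) are placeholders for the DPL solver of (4) and the human annotator, and the other helpers are the sketches given above:

    import numpy as np

    def select_profitable(X_u, D_list, P_list, n_s):
        # score every unlabeled sample by E(x_j) + M(x_j) and take the n_s best
        E = np.array([informativeness(x, D_list, P_list) for x in X_u])
        M = np.array([mutual_information(j, X_u) for j in range(len(X_u))])
        return np.argsort(E + M)[-n_s:]   # indices of the n_s largest scores

    # One ADDL iteration (steps (3)-(5) of Algorithm 1):
    # picked = select_profitable(X_u, D_list, P_list, n_s=50)
    # X_l = np.vstack([X_l, X_u[picked]]); y_l = np.concatenate([y_l, oracle(picked)])
    # X_u = np.delete(X_u, picked, axis=0)
    # D_list, P_list = train_dpl(X_l, y_l)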

4. Experiments

In this section, the performance of the proposed framework is evaluated on two weather datasets. We first give the details of the datasets and the experimental settings; then the experimental results are provided and analyzed.

4.1. Datasets and Experimental Setting

Datasets. The first dataset employed in our experiments is the one provided by Chen et al. [9] (denoted as DATASET 1). DATASET 1 contains 1000 images of size 3966 × 270, and each image has been manually labeled as sunny, cloudy, or overcast. There are 276 sunny images, 251 cloudy images, and 473 overcast images in DATASET 1. Figure 2 shows three images from DATASET 1 labeled sunny, cloudy, and overcast, respectively.

Because few public datasets are available for weather recognition, we construct a new dataset (denoted as DATASET 2) to test the performance of the proposed method. The images in DATASET 2 are selected from the panorama images collected on the roof of the BC building at EPFL (http://panorama.epfl.ch provides high resolution (13200 × 900) panorama images from 2005 until now, recorded every 10 minutes during daytime) and categorized into sunny, cloudy, or overcast based on the classification criterion presented in [9]. It includes 5000 images captured approximately every 30 minutes during daytime in 2014, and the size of each image is 4821 × 400. Although both DATASET 1 and DATASET 2 are constructed from the images provided by http://panorama.epfl.ch, DATASET 2 is more challenging because it contains a large number of images captured in different seasons. Figure 3 shows some examples of the differently labeled images in DATASET 2.

Experimental Setting. For the sky region of each image, four types of visual appearance features are extracted: a 200-dim SIFT feature, a 600-dim HSV color feature (200-dim histogram of the H channel, 200-dim histogram of the S channel, and 200-dim histogram of the V channel), a 200-dim LBP feature, and a 200-dim gradient magnitude feature. These features are extracted only from the sky part of each image; the sky detector and the feature extraction procedure provided by Chen et al. [9] are used in this paper. Besides the visual appearance features, two kinds of features based on the physical characteristics of images captured under different weather are also extracted from the nonsky region, consisting of a 200-dim bag-of-words representation of the contrast and a 200-dim bag-of-words representation of the dark channel. To be specific, we divide each nonsky region into 32 × 32 blocks, extract the contrast and the dark channel features by (1) and (2), and then use the bag-of-words model [37] to code each feature.

Figure 2: Examples in DATASET 1: (a) sunny; (b) cloudy; (c) overcast.

Figure 3: Examples in DATASET 2: (a) sunny; (b) cloudy; (c) overcast.
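As an illustration of this coding step, the following sketch builds a 200-word vocabulary over block-level features and codes an image as a normalized histogram; the use of scikit-learn's KMeans is an assumption for illustration, since the paper does not state which clustering implementation was used:

    import numpy as np
    from sklearn.cluster import KMeans

    # Vocabulary learned once from all training blocks (one feature row per block):
    # vocab = KMeans(n_clusters=200).fit(train_block_features)

    def bow_code(block_features, vocab):
        # assign each block of one image to its nearest visual word
        words = vocab.predict(block_features)
        hist, _ = np.histogram(words, bins=np.arange(vocab.n_clusters + 1))
        return hist / max(hist.sum(), 1)  # normalized 200-dim feature vector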

In our experiments, 50% of the images in each dataset are randomly selected for training, and the remaining data are used for testing. The training data are randomly partitioned into a labeled sample set $D_l$ and an unlabeled sample set $D_u$: $D_l$ is used for learning an initial dictionary, and $D_u$ is utilized for actively selecting "profitable" samples to iteratively improve the classification performance. To make the experimental results more convincing, each of the following experimental processes is repeated ten times, and the mean and standard deviation of the classification accuracy are reported.

4.2. Experiment I: Verifying the Performance of Feature Extraction. The effectiveness of the feature extraction in our method is evaluated first. Many previous works merely extract visual appearance features from the sky part for weather recognition [9]. In order to validate the power of our extracted features based on the physical characteristics of images, the results of two weather recognition schemes are compared: one uses only the visual appearance features to classify the weather conditions, and the other combines the visual appearance features with the features based on physical characteristics. In order to weaken the influence of the classifier on the recognition results, K-NN classification (with K empirically set to 30), SVM with the radial basis function kernel, and the original DPL without the active sample selection procedure are applied in this experiment. Figures 4 and 5 show the comparison results on DATASET 1 and DATASET 2, respectively.

In Figures 4 and 5, the x-axis represents the number of training samples and the y-axis the average classification accuracy. The red dotted lines indicate that only the six visual appearance features of the sky area are used, and the blue solid lines indicate that both the six visual appearance features and the two physical characteristics based features of the nonsky area are applied for recognition. From Figures 4 and 5, it is clearly observed that the combination of visual appearance features and physical features achieves better performance on the weather recognition task.

4.3. Experiment II: Recognizing Weather Conditions by the Proposed ADDL. In this section, the performance of the proposed ADDL algorithm is evaluated. First, an experiment is conducted to determine the best parameters for ADDL; then ADDL is compared against several popular classification methods.

4.3.1. Parameters Selection. There are three important parameters in the proposed approach, namely, $k$, $\lambda$, and $\tau$: $k$ is the number of atoms in each subdictionary $D_c$ learned from the samples of each class, $\lambda$ is used to control the discriminative property of $P$, and $\tau$ is a scalar constant in the DPL algorithm. The performance of our ADDL under various values of $k$, $\lambda$, and $\tau$ is studied on DATASET 1. Figure 6(a) lists the classification results for $k = 15, 25, 35, 45, 55, 65, 75, 85, 95$. The highest average classification accuracy is obtained at $k = 25$, which demonstrates that ADDL is effective in learning a compact dictionary. According to the observation in Figure 6(a), we set $k$ to 25 in all experiments. The classification results obtained using different $\tau$ and $\lambda$ are shown in Figures 6(b) and 6(c), from which the optimal values of $\lambda$ and $\tau$ are 0.05 and 25, respectively. This is because a too large or too small $\lambda$ leads the reconstruction coefficients in ADDL to be too sparse or too dense, which deteriorates the classification performance. If $\tau$ is too large, the effect of the reconstruction error constraint (the first term in (4)) and the sparse constraint (the third term in (4)) is weakened, which decreases the discrimination ability of the learned dictionary. On the contrary, if $\tau$ is too small, the second term in (4) is neglected in dictionary learning, which also reduces the performance of the algorithm. Hence, we set $\lambda = 0.05$ and $\tau = 25$ for all experiments.

Figure 4: Weather recognition results on DATASET 1 using different features: (a) K-NN; (b) SVM; (c) DPL. Each panel plots the average accuracy (%) against the number of training samples (100-500), comparing 6 features with 8 features.

Now we evaluate whether active sample selection can improve the recognition performance. 500 samples in DATASET 1 are randomly selected as training data, and the remaining samples are used for testing. In the training dataset, 50 samples are randomly selected as the labeled dataset $D_l$, and the remaining 450 samples form the unlabeled dataset $D_u$. The proposed ADDL uses the labeled dataset $D_l$ to learn an initial dictionary pair and then iteratively selects 50 samples from $D_u$ to label, expanding the training dataset. Figure 7 shows the recognition accuracy versus the number of iterations.

In Figure 7, the 0th iteration indicates that only the initial 50 labeled samples are used to learn the dictionary, and the 9th iteration means that all 500 training samples are used. The recognition ability of ADDL is improved by the active sample selection procedure, and it achieves the highest accuracy at 3 iterations, where 200 samples in total are used for training. It is worth noting that if more than 3 iterations are run, the recognition rate drops by about 1%. This is because there are some noisy examples or "outliers" in the unlabeled dataset, and more of them are selected for dictionary learning as the number of iterations increases, which interferes with the dictionary learning and degrades the classification performance. In the following, the number of iterations is set to 3 for all experiments.

4.3.2. Comparisons of ADDL with Other Methods. Here, the proposed ADDL is compared with several other methods.

Figure 5: Weather recognition results on DATASET 2 using different features: (a) K-NN; (b) SVM; (c) DPL. Each panel plots the average accuracy (%) against the number of training samples (100-2500), comparing 6 features with 8 features.

The first two methods are the K-NN algorithm used by Song et al. [7] and SVM with the radial basis function kernel (RBF-SVM) adopted by Roser and Moosmann [5]. The third method is SRC [15], which directly uses all training samples as the dictionary for classification. In order to confirm that the active sample selection in our ADDL method is effective, the proposed ADDL is also compared with the original DPL method [19]. We further compare ADDL with the method proposed by Chen et al. [9]; as far as we know, the work in [9] is the only framework that addresses the same problem as our method, that is, recognizing different weather conditions (sunny, cloudy, and overcast) in images captured by a still camera. It actively selects useful samples to train an SVM for recognizing different weather conditions.

For DATASET 1, 500 images are randomly selected as the training samples, and the rest of the images are used as the testing samples. In the training dataset, 50 images are randomly chosen as the initial labeled training dataset $D_l$, and the remaining 450 images are regarded as the unlabeled dataset $D_u$. ADDL and Chen's method [9] both include an active learning procedure; thus they iteratively choose 150 samples from $D_u$ to be labeled according to their respective sample selection criteria and add these samples to $D_l$ for further training of the classification model. For the K-NN, RBF-SVM, SRC, and DPL methods, which lack an active learning procedure, 150 samples are randomly selected from $D_u$ to be labeled for expanding the labeled training dataset $D_l$. Table 1 lists the comparisons of our approach with these methods for weather classification.

Figure 6: Selection of parameters. The x-axis represents the different values of the parameters and the y-axis represents the average classification accuracy (%): (a) the average classification rate under different $k$; (b) the average classification rate under different $\lambda$; (c) the average classification rate under different $\tau$.

Figure 7: Recognition accuracy on DATASET 1 versus the number of iterations.

Table 1: Comparisons on DATASET 1 among different methods.

Methods              Classification rate (mean ± std, %)
K-NN                 82.9 ± 1.2
RBF-SVM              85.6 ± 1.6
SRC                  89.7 ± 0.9
DPL                  91.3 ± 1.6
Chen's method [9]    92.98 ± 0.55
ADDL                 94.0 ± 0.2

As can be seen from Table 1, ADDL outperforms the other methods; the mean classification rate of ADDL reaches about 94%.

For DATASET 2, 2500 images are randomly selected as the training samples, and the rest of the images are used as the testing samples.

Table 2: Comparisons on DATASET 2 among different methods.

Methods              Classification rate (mean ± std, %)
K-NN                 85.0 ± 1.6
RBF-SVM              88.1 ± 1.4
SRC                  88.2 ± 1.1
DPL                  89.6 ± 0.8
Chen's method [9]    88.8 ± 1.6
ADDL                 90.4 ± 1.0

In the training dataset, 50 images are randomly chosen as the initial labeled training dataset $D_l$, and the remaining 2450 images are regarded as the unlabeled dataset $D_u$. All parameter settings for DATASET 2 are the same as for DATASET 1. Table 2 lists the recognition results of the different methods, which indicate that the proposed ADDL performs better than the other methods.

From Tables 1 and 2, two points can be observed. First, the recognition performance of K-NN, RBF-SVM, SRC, and DPL is overall inferior to that of the proposed ADDL algorithm. This is probably because these four algorithms randomly select the unlabeled data from the given pool, without considering whether the selected samples are beneficial for improving the performance of the classification model. Second, although the proposed ADDL and Chen's method [9] both include the active learning paradigm, the proposed ADDL performs better. This is due to the fact that Chen's method [9] considers only the informativeness and ignores the representativeness of samples when selecting unlabeled samples from the given pool.

5. Conclusions

We have presented an effective framework for classifying three types of weather (sunny, cloudy, and overcast) from outdoor images. Through an analysis of the different visual manifestations of images caused by different weather conditions, separate features are extracted from the sky area and the nonsky area of images, describing the visual appearance properties and the physical characteristics of images under different weather conditions, respectively. The ADDL approach was proposed to learn a more discriminative dictionary for improving weather classification performance, by selecting informative and representative unlabeled samples from a given pool to expand the training dataset. Since few image datasets exist for weather recognition, we have collected and labeled a new weather dataset for testing the proposed algorithm. The experimental results show that ADDL is an effective and promising strategy for weather classification, which can also be used in many other computer vision tasks.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This work is supported by the Fund of the Jilin Provincial Science & Technology Department (nos. 20130206042GX and 20140204089GX) and the National Natural Science Foundation of China (nos. 61403078, 11271064, and 61471111).

References

[1] F. Nashashibi, R. de Charette, and A. Lia, "Detection of unfocused raindrops on a windscreen using low level image processing," in Proceedings of the 11th International Conference on Control, Automation, Robotics and Vision (ICARCV '10), pp. 1410-1415, Singapore, December 2010.
[2] H. Kurihata, T. Takahashi, I. Ide et al., "Rainy weather recognition from in-vehicle camera images for driver assistance," in Proceedings of the IEEE Intelligent Vehicles Symposium, pp. 205-210, IEEE, Las Vegas, Nev, USA, June 2005.
[3] H. Woo, Y. M. Jung, J.-G. Kim, and J. K. Seo, "Environmentally robust motion detection for video surveillance," IEEE Transactions on Image Processing, vol. 19, no. 11, pp. 2838-2848, 2010.
[4] H. Katsura, J. Miura, M. Hild, and Y. Shirai, "A view-based outdoor navigation using object recognition robust to changes of weather and seasons," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '03), vol. 3, pp. 2974-2979, IEEE, Las Vegas, Nev, USA, October 2003.
[5] M. Roser and F. Moosmann, "Classification of weather situations on single color images," in Proceedings of the IEEE Intelligent Vehicles Symposium, pp. 798-803, IEEE, Eindhoven, The Netherlands, June 2008.
[6] X. Yan, Y. Luo, and X. Zheng, "Weather recognition based on images captured by vision system in vehicle," in Advances in Neural Networks - ISNN 2009, pp. 390-398, Springer, Berlin, Germany, 2009.
[7] H. Song, Y. Chen, and Y. Gao, "Weather condition recognition based on feature extraction and K-NN," in Foundations and Practical Applications of Cognitive Systems and Information Processing: Proceedings of the First International Conference on Cognitive Systems and Information Processing (CSIP 2012), Advances in Intelligent Systems and Computing, pp. 199-210, Springer, Berlin, Germany, 2014.
[8] C. Lu, D. Lin, J. Jia, and C.-K. Tang, "Two-class weather classification," in Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '14), pp. 3718-3725, Columbus, Ohio, USA, June 2014.
[9] Z. Chen, F. Yang, A. Lindner, G. Barrenetxea, and M. Vetterli, "How is the weather: automatic inference from images," in Proceedings of the 19th IEEE International Conference on Image Processing (ICIP '12), pp. 1853-1856, Orlando, Fla, USA, October 2012.
[10] K. He, J. Sun, and X. Tang, "Single image haze removal using dark channel prior," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 12, pp. 2341-2353, 2011.
[11] A. Li and H. Shouno, "Dictionary-based image denoising by fused-lasso atom selection," Mathematical Problems in Engineering, vol. 2014, Article ID 368602, 10 pages, 2014.
[12] Z. Feng, M. Yang, L. Zhang, Y. Liu, and D. Zhang, "Joint discriminative dimensionality reduction and dictionary learning for face recognition," Pattern Recognition, vol. 46, no. 8, pp. 2134-2143, 2013.
[13] J. Dong, C. Sun, and W. Yang, "A supervised dictionary learning and discriminative weighting model for action recognition," Neurocomputing, vol. 158, pp. 246-256, 2015.
[14] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation," IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4311-4322, 2006.
[15] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210-227, 2009.
[16] J. Mairal, F. Bach, and J. Ponce, "Task-driven dictionary learning," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 4, pp. 791-804, 2012.
[17] Z. Jiang, Z. Lin, and L. S. Davis, "Label consistent K-SVD: learning a discriminative dictionary for recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 11, pp. 2651-2664, 2013.
[18] M. Yang, L. Zhang, X. Feng, and D. Zhang, "Sparse representation based Fisher discrimination dictionary learning for image classification," International Journal of Computer Vision, vol. 109, no. 3, pp. 209-232, 2014.
[19] S. Gu, L. Zhang, W. Zuo, and X. Feng, "Projective dictionary pair learning for pattern classification," in Proceedings of the 28th Annual Conference on Neural Information Processing Systems (NIPS '14), pp. 793-801, December 2014.
[20] B. Settles, Active Learning Literature Survey, vol. 52, University of Wisconsin-Madison, Madison, Wis, USA, 2010.
[21] D. Cohn, L. Atlas, and R. Ladner, "Improving generalization with active learning," Machine Learning, vol. 15, no. 2, pp. 201-221, 1994.
[22] I. Dagan and S. P. Engelson, "Committee-based sampling for training probabilistic classifiers," in Proceedings of the 12th International Conference on Machine Learning, pp. 150-157, Morgan Kaufmann, Tahoe City, Calif, USA, 1995.
[23] D. Angluin, "Queries and concept learning," Machine Learning, vol. 2, no. 4, pp. 319-342, 1988.
[24] R. D. King, K. E. Whelan, F. M. Jones et al., "Functional genomic hypothesis generation and experimentation by a robot scientist," Nature, vol. 427, no. 6971, pp. 247-252, 2004.
[25] S. Chakraborty, V. Balasubramanian, and S. Panchanathan, "Dynamic batch mode active learning," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 2649-2656, Providence, RI, USA, June 2011.
[26] K. Nigam and A. McCallum, "Employing EM in pool-based active learning for text classification," in Proceedings of the 15th International Conference on Machine Learning (ICML '98), pp. 350-358, Morgan Kaufmann, Madison, Wis, USA, 1998.
[27] A. Kapoor, K. Grauman, R. Urtasun, and T. Darrell, "Active learning with Gaussian processes for object categorization," in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1-8, IEEE, Rio de Janeiro, Brazil, October 2007.
[28] S. J. Huang, R. Jin, and Z. H. Zhou, "Active learning by querying informative and representative examples," in Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS '10), pp. 892-900, 2010.
[29] P. Donmez, J. G. Carbonell, and P. N. Bennett, "Dual strategy active learning," in Machine Learning: ECML 2007, J. N. Kok, J. Koronacki, R. L. de Mantaras, S. Matwin, D. Mladenic, and A. Skowron, Eds., vol. 4701 of Lecture Notes in Computer Science, pp. 116-127, Springer, Berlin, Germany, 2007.
[30] S. C. H. Hoi, R. Jin, J. Zhu, and M. R. Lyu, "Semi-supervised SVM batch mode active learning for image retrieval," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1-7, Anchorage, Alaska, USA, June 2008.
[31] Z. Xu, K. Yu, V. Tresp, X. Xu, and J. Wang, "Representative sampling for text classification using support vector machines," in Proceedings of the 25th European Conference on IR Research (ECIR '03), Springer, 2003.
[32] A. Bosch, A. Zisserman, and X. Munoz, "Image classification using random forests and ferns," in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1-8, IEEE, Rio de Janeiro, Brazil, October 2007.
[33] T. Ojala, M. Pietikainen, and D. Harwood, "A comparative study of texture measures with classification based on feature distributions," Pattern Recognition, vol. 29, no. 1, pp. 51-59, 1996.
[34] S. G. Narasimhan and S. K. Nayar, "Chromatic framework for vision in bad weather," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '00), pp. 598-605, Hilton Head Island, SC, USA, June 2000.
[35] S. G. Narasimhan and S. K. Nayar, "Removing weather effects from monochrome images," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. II-186-II-193, IEEE, Kauai, Hawaii, USA, December 2001.
[36] S. G. Narasimhan and S. K. Nayar, "Vision and the atmosphere," International Journal of Computer Vision, vol. 48, no. 3, pp. 233-254, 2002.
[37] G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, "Visual categorization with bags of keypoints," in Proceedings of the International Workshop on Statistical Learning in Computer Vision (ECCV '04), pp. 1-22, Prague, Czech Republic, 2004.
[38] D. D. Lewis and W. A. Gale, "A sequential algorithm for training text classifiers," in SIGIR '94: Proceedings of the Seventeenth Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, pp. 3-12, Springer, Berlin, Germany, 1994.
[39] B. Settles and M. Craven, "An analysis of active learning strategies for sequence labeling tasks," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '08), pp. 1070-1079, Association for Computational Linguistics, 2008.
[40] G. Schohn and D. Cohn, "Less is more: active learning with support vector machines," in Proceedings of the 17th International Conference on Machine Learning (ICML '00), pp. 839-846, Morgan Kaufmann, 2000.
[41] S. Tong and D. Koller, "Support vector machine active learning with applications to text classification," The Journal of Machine Learning Research, vol. 2, pp. 45-66, 2002.
[42] M. Szummer and T. S. Jaakkola, "Information regularization with partially labeled data," in Advances in Neural Information Processing Systems, pp. 1025-1032, MIT Press, 2002.
[43] X. Li and Y. Guo, "Adaptive active learning for image classification," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '13), pp. 859-866, IEEE, Portland, Ore, USA, June 2013.



[Figure 1: The flow chart of the proposed method. Features (visual appearance based and physical characteristics based) are extracted from the images; a dictionary is learned from the labeled training dataset and iteratively refined by expanding the labeled set with samples selected from the unlabeled dataset according to the informativeness and representativeness measures; the final dictionary classifies the testing dataset.]

samples. Informativeness measures the capacity of the samples to reduce the uncertainty of the classification model, while representativeness measures whether the samples represent the overall input patterns of the unlabeled data well [20]. Most active learning methods take only one of the two criteria into account when selecting unlabeled samples, which restricts the performance of the active learning [28]. Although several approaches [29-31] have considered both informativeness and representativeness of unlabeled data, to our knowledge almost no researchers have introduced these two criteria of active learning into dictionary learning algorithms.

3. Proposed Method

In this section, we present an efficient scheme, including effective feature extraction and the ADDL classification model, to classify the weather conditions of images into sunny, cloudy, and overcast. Figure 1 shows a flow chart of the overall procedure of our method. First, the visual appearance based features are extracted from the sky region and the physical characteristics based features are extracted from the nonsky region of images. Secondly, the labeled training dataset is used to learn an initial dictionary, and then samples are iteratively selected from the unlabeled dataset based on two measures: informativeness and representativeness. These selected samples are used to expand the labeled dataset for learning a more discriminative dictionary. Finally, the testing dataset is classified by the learned dictionary.

3.1. Features Extraction. Feature extraction is an essential preprocessing step in pattern recognition problems. In order to express the difference among images of the same scene taken under various weather situations, we analyze the different visual manifestations of images caused by different weather conditions and extract features which describe both the visual appearance properties and the physical characteristics of images.

From the viewpoint of visual appearance, the sky is the most important part of an outdoor image for identifying the weather. In the sunny case the sky appears blue, because the light is refracted as it passes through the atmosphere, scattering blue light, while under the overcast condition the sky is white or grey due to the thick cloud cover. The cloudy day is a situation between sunny and overcast: on a cloudy day, the clouds floating in the sky exhibit a wide range of shapes and textures. Hence, we extract SIFT [32], HSV color, LBP [33], and the gradient magnitude of the sky parts of images as the visual appearance features. These extracted features describe the texture, color, and shape of images for weather classification.

Unlike Chen et al. [9], who directly eliminated the nonsky regions of images, we extract two features from the nonsky parts of images based on their physical characteristics, which also serve as powerful features to distinguish different weather conditions. The interaction of light with the atmosphere has been studied as atmospheric optics. On a sunny (clear) day, the light rays reflected by scene objects reach the observer without alteration or attenuation. However, under bad weather conditions, atmospheric effects cannot be neglected anymore [34-36]. Bad weather (e.g., overcast) causes a decay in image contrast which is exponential in the depths of scene points [35]. So images of the same scene taken in different weather conditions should have different


contrasts. The contrast CON of an image can be computed by

$$\mathrm{CON} = \frac{E_{\max} - E_{\min}}{E_{\max} + E_{\min}}, \tag{1}$$

where $E_{\max}$ and $E_{\min}$ are the maximum and minimum intensities of the image, respectively.

Overcast weather may come with haze [8]. The dark channel prior presented in [10] is effective to detect image haze. It is based on the observation that most haze-free patches in the nonsky regions of images have a very low intensity in at least one RGB color channel. The dark channel $J^{\mathrm{dark}}$ was defined as

$$J^{\mathrm{dark}}(x) = \min_{c \in \{r,g,b\}} \Bigl( \min_{x' \in \Omega(x)} J^{c}(x') \Bigr), \tag{2}$$

where $J^{c}$ is one color channel of the image $J$ and $\Omega(x)$ is a local patch centered at $x$.

In summary, both the visual appearance features (SIFT, HSV color, LBP, and the gradient magnitude) of the sky region and the physical characteristics based features (the contrast and the dark channel) of the nonsky region are extracted to distinguish images under different weather situations. Then the "bag-of-words" model [37] is used to code each feature for forming the feature vectors.
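To make the two physical features concrete, the following Python sketch computes (1) and (2) for an image or block with NumPy/SciPy; the 15-pixel patch size and the small epsilon guard are our own illustrative choices, not values specified in the paper.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def contrast(gray):
    """Global contrast of a gray-scale image or block, following (1)."""
    e_max, e_min = float(gray.max()), float(gray.min())
    return (e_max - e_min) / (e_max + e_min + 1e-8)  # epsilon avoids division by zero

def dark_channel(rgb, patch_size=15):
    """Dark channel of an RGB image, following (2): the per-pixel minimum over
    the three color channels, then a minimum filter over a local patch Omega(x)."""
    min_over_channels = rgb.min(axis=2)
    return minimum_filter(min_over_channels, size=patch_size)
```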

3.2. The Proposed ADDL Classification Model. Inspired by the technological advancements of discriminative dictionary learning and active learning, we design an active discriminative dictionary learning (ADDL) algorithm to improve the discriminative power of the learned dictionary. In the ADDL algorithm, DPL [19] is applied to learn a discriminative dictionary for recognizing different weather conditions, and a strategy of active sample selection is developed to iteratively select unlabeled samples from a given pool to enlarge the training dataset, improving the DPL classification performance. The criterion of active sample selection includes the informativeness measure and the representativeness measure.

3.2.1. DPL Algorithm. Suppose there is a set of $m$-dimensional training samples from $C$ classes, denoted by $X = [X_1, \ldots, X_c, \ldots, X_C]$ with the label set $Y = [Y_1, \ldots, Y_c, \ldots, Y_C]$, where $X_c \in R^{m \times n}$ denotes the sample set of the $c$th class and $Y_c$ denotes the corresponding label set. The DPL [19] algorithm jointly learns an analysis dictionary $P = [P_1; \ldots; P_c; \ldots; P_C] \in R^{kC \times m}$ and a synthesis dictionary $D = [D_1, \ldots, D_c, \ldots, D_C] \in R^{m \times kC}$ to avoid resolving the costly $\ell_0$-norm or $\ell_1$-norm sparse coding process. $P$ and $D$ are used for linearly encoding the representation coefficients and for class-specific discriminative reconstruction, respectively. The objective function of the DPL model is

$$\{P^{*}, D^{*}\} = \arg\min_{P,D} \sum_{c=1}^{C} \|X_c - D_c P_c X_c\|_F^2 + \lambda \|P_c \bar{X}_c\|_F^2, \quad \text{s.t. } \|d_i\|_2^2 \le 1, \tag{3}$$

where $D_c \in R^{m \times k}$ and $P_c \in R^{k \times m}$ represent the subdictionary pair corresponding to class $c$, $\bar{X}_c$ represents the complementary matrix of $X_c$ (i.e., the training samples of all the other classes), $\lambda > 0$ is a scalar constant to control the discriminative property of $P$, and $d_i$ denotes the $i$th atom of dictionary $D$.

The objective function in (3) is generally nonconvex, but it can be relaxed to the following form by introducing a variable matrix $A$:

$$\{P^{*}, A^{*}, D^{*}\} = \arg\min_{P,A,D} \sum_{c=1}^{C} \|X_c - D_c A_c\|_F^2 + \tau \|P_c X_c - A_c\|_F^2 + \lambda \|P_c \bar{X}_c\|_F^2, \quad \text{s.t. } \|d_i\|_2^2 \le 1, \tag{4}$$

where $\tau$ is an algorithm parameter. According to [19], the objective function in (4) can be solved in an alternately updated manner.

When $D$ and $P$ have been learned, given a test sample $x_t$, the class-specific reconstruction residual is used to assign the class label. So the classification model associated with DPL is defined as

$$\mathrm{label}(x_t) = \arg\min_{c} \|x_t - D_c P_c x_t\|_2. \tag{5}$$

When dealing with the classification task, DPL requires sufficient labeled training samples to learn a discriminative dictionary pair for obtaining good results. In fact, it is difficult and expensive to obtain a vast quantity of labeled data. If we can exploit the information provided by the massive, inexpensive unlabeled samples and choose small amounts of "profitable" samples (the "profitable" unlabeled samples are the ones that are most beneficial for the improvement of the DPL classification performance) from the unlabeled dataset to be labeled manually, we would learn a more discriminative dictionary than the one learned only from a limited number of labeled training data. To achieve this, we introduce the active learning technique into DPL in the next section.
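As a minimal sketch of the DPL decision rule (5), assuming the dictionary pairs have already been learned, the classifier reduces to a residual comparison; the list-of-matrices layout of the dictionaries below is our own convention.

```python
import numpy as np

def dpl_classify(x_t, D_list, P_list):
    """Label a test sample by the smallest class-specific reconstruction
    residual ||x_t - D_c P_c x_t||_2, as in (5). D_list[c] is the m x k
    synthesis subdictionary and P_list[c] the k x m analysis subdictionary
    of class c."""
    residuals = [np.linalg.norm(x_t - D @ (P @ x_t))
                 for D, P in zip(D_list, P_list)]
    return int(np.argmin(residuals))
```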

3.2.2. Introducing Active Learning to DPL. When evaluating whether a sample is "profitable" or not, two measures are considered: informativeness and representativeness. The proposed ADDL iteratively evaluates both the informativeness and the representativeness of the unlabeled samples in a given pool, seeking the ones that are most beneficial for the improvement of the DPL classification performance. Specifically, in this study the informativeness measure is constructed from the reconstruction error and the entropy of the probability distribution over the class-specific reconstruction errors, and the representativeness is obtained from the distribution of the unlabeled dataset.

Assume that we are given an initial training dataset $\mathbf{D}_l = \{x_1, x_2, \ldots, x_{n_l}\}$ with its label set $\mathbf{Y}_l = \{y_1, y_2, \ldots, y_{n_l}\}$ and an unlabeled dataset $\mathbf{D}_u = \{x_{n_l+1}, x_{n_l+2}, \ldots, x_N\}$, where $x_i \in R^{m \times 1}$ is an $m$-dimensional feature vector and $y_i \in \{1, 2, 3\}$ is the corresponding class label (sunny, cloudy, or overcast). The target of ADDL is to iteratively select the $N_s$ most "profitable" samples, denoted by $\mathbf{D}_s$, from $\mathbf{D}_u$, query their labels $\mathbf{Y}_s$, and then add them to $\mathbf{D}_l$ for improving the performance of the dictionary learning classification model.

Informativeness Measure. The informativeness measure is an effective criterion to select informative samples for reducing the uncertainty of the classification model; it captures the relationship of the candidate samples with the current classification model. Probabilistic classification models select the sample that has the largest entropy of the conditional distribution over its labels [38, 39]. Query-by-committee algorithms choose the samples which have the most disagreement among a committee [22, 26]. In SVM methods, the most informative sample is regarded as the one that is closest to the separating hyperplane [40, 41]. Since DPL is used as the classification model in our framework, we design an informativeness measure based on the reconstruction error and the entropy of the probability distribution over the class-specific reconstruction errors of the sample.

For dictionary learning, the samples which are well represented by the current learned dictionary are less likely to provide more information for further refining the dictionary. Instead, the samples that have large reconstruction error and large uncertainty should be the main concern, because they carry additional information that is not captured by the current dictionary. As a consequence, the informativeness measure is defined as follows:

$$E(x_j) = \mathrm{ErrorR}_j + \mathrm{Entropy}_j, \tag{6}$$

where $\mathrm{ErrorR}_j$ and $\mathrm{Entropy}_j$ denote, respectively, the reconstruction error of the sample $x_j \in \mathbf{D}_u$ with respect to the current learned dictionary and the entropy of the probability distribution over the class-specific reconstruction errors of the sample $x_j \in \mathbf{D}_u$. $\mathrm{ErrorR}_j$ is defined as

$$\mathrm{ErrorR}_j = \min_{c} \|x_j - D_c P_c x_j\|_2, \tag{7}$$

where $D_c$ and $P_c$ represent the subdictionary pair corresponding to class $c$, which is learned by the DPL algorithm as shown in (4). A larger $\mathrm{ErrorR}_j$ indicates that the current learned dictionary does not represent the sample $x_j$ well.

Since the class-specific reconstruction error is used to identify the class label of sample $x_j$ (as shown in (5)), the probability distribution of the class-specific reconstruction error of $x_j$ can be acquired. The class-specific reconstruction error probability of $x_j$ in class $c$ is defined as

$$p_c(x_j) = \frac{\|x_j - D_c P_c x_j\|_2}{\sum_{c=1}^{C} \|x_j - D_c P_c x_j\|_2}. \tag{8}$$

The class probability distribution $p(x_j)$ for sample $x_j$ is computed as $p(x_j) = [p_1, p_2, \ldots, p_C]$, which demonstrates how well the dictionary distinguishes the input sample. That is, if an input sample can be expressed well by the current dictionary, we will get a small value of $\|x_j - D_c P_c x_j\|_2$ for one of the class-specific subdictionaries, and thus the class distribution should reach its valley at the most likely class. Entropy is a measure of uncertainty. Hence, in order to estimate the uncertainty of an input sample's label, the entropy of the probability distribution over the class-specific reconstruction errors is calculated as follows:

$$\mathrm{Entropy}_j = -\sum_{c=1}^{C} p_c(x_j) \log p_c(x_j). \tag{9}$$

A high $\mathrm{Entropy}_j$ value demonstrates that $x_j$ is difficult to classify by the current learned dictionary; thus it should be selected to be labeled and added to the labeled set for further training of the dictionary.
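Combining (6)–(9), the informativeness score of an unlabeled sample could be computed as in the sketch below; the epsilon added for numerical stability is our own assumption.

```python
import numpy as np

def informativeness(x_j, D_list, P_list, eps=1e-12):
    """E(x_j) = ErrorR_j + Entropy_j, combining (6)-(9): the best class-specific
    reconstruction error plus the entropy of the normalized per-class errors."""
    errs = np.array([np.linalg.norm(x_j - D @ (P @ x_j))
                     for D, P in zip(D_list, P_list)])
    error_r = errs.min()                     # (7): best reconstruction error
    p = errs / (errs.sum() + eps)            # (8): per-class error distribution
    entropy = -np.sum(p * np.log(p + eps))   # (9): uncertainty of the label
    return error_r + entropy                 # (6)
```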

Representativeness Measure. Since the informativeness measure only considers how the candidate samples relate to the current classification model, it ignores the potential structure information of the whole input unlabeled dataset. Therefore, the representativeness measure is employed as an additional criterion to choose useful unlabeled samples. The representativeness measure evaluates whether the samples represent the overall input patterns of the unlabeled data well; it exploits the relation between the candidate sample and the rest of the unlabeled samples.

The distribution of unlabeled data is very useful for training a good classifier. In previous active learning work, the marginal density and the cosine distance were used as the representativeness measure to gain information about the data distribution [39, 42]. Li and Guo [43] defined a more straightforward representativeness measure called mutual information. Its intention is to select the samples located in the dense region of the unlabeled data distribution, which are more representative regarding the remaining unlabeled data than the ones located in the sparse region. We introduce the framework of the representativeness measure proposed by Li and Guo [43] into dictionary learning.

For an unlabeled sample $x_j$, the mutual information with respect to the other unlabeled samples is defined as follows:

$$M(x_j) = H(x_j) - H(x_j \mid X_{U_j}) = \frac{1}{2} \ln\Biggl(\frac{\sigma_j^2}{\sigma_{j|U_j}^2}\Biggr), \tag{10}$$

where $H(x_j)$ and $H(x_j \mid X_{U_j})$ denote the entropy and the conditional entropy of sample $x_j$, respectively; $U_j$ represents the index set of the unlabeled samples with $j$ removed from $U$, that is, $U_j = U - \{j\}$, and $X_{U_j}$ represents the set of samples indexed by $U_j$. $\sigma_j^2$ and $\sigma_{j|U_j}^2$ can be calculated by the following formulas:

$$\sigma_j^2 = \mathcal{K}(x_j, x_j), \qquad \sigma_{j|U_j}^2 = \sigma_j^2 - \Sigma_{jU_j} \Sigma_{U_jU_j}^{-1} \Sigma_{U_jj}. \tag{11}$$

Inputs: Labeled set $\mathbf{D}_l$ and its label set $\mathbf{Y}_l$; unlabeled set $\mathbf{D}_u$; the number of iterations $I_t$; and the number of unlabeled samples $N_s$ to be selected in each iteration.
(1) Initialization: learn an initial dictionary pair $D^{*}$ and $P^{*}$ by the DPL algorithm from $\mathbf{D}_l$.
(2) For $i = 1$ to $I_t$ do
(3) Compute $E(x_j)$ and $M(x_j)$ by (6) and (10) for each sample $x_j$ in the unlabeled dataset $\mathbf{D}_u$.
(4) Select $N_s$ samples (denoted by $\mathbf{D}_s$) from $\mathbf{D}_u$ by (13) and add them into $\mathbf{D}_l$ with their class labels, which are manually assigned by the user. Then update $\mathbf{D}_u = \mathbf{D}_u - \mathbf{D}_s$ and $\mathbf{D}_l = \mathbf{D}_l \cup \mathbf{D}_s$.
(5) Learn the refined dictionaries $D^{*}_{\mathrm{new}}$ and $P^{*}_{\mathrm{new}}$ over the expanded dataset $\mathbf{D}_l$.
(6) End for
Output: Final learned dictionary pair $D^{*}_{\mathrm{new}}$ and $P^{*}_{\mathrm{new}}$.

Algorithm 1: Active discriminative dictionary learning (ADDL).

Assume that the index set $U_j = (1, 2, 3, \ldots, t)$; then $\Sigma_{U_jU_j}$ is a kernel matrix defined over all the unlabeled samples indexed by $U_j$, computed in the following form:

$$\Sigma_{U_jU_j} = \begin{pmatrix} \mathcal{K}(x_1, x_1) & \mathcal{K}(x_1, x_2) & \cdots & \mathcal{K}(x_1, x_t) \\ \mathcal{K}(x_2, x_1) & \mathcal{K}(x_2, x_2) & \cdots & \mathcal{K}(x_2, x_t) \\ \vdots & \vdots & \ddots & \vdots \\ \mathcal{K}(x_t, x_1) & \mathcal{K}(x_t, x_2) & \cdots & \mathcal{K}(x_t, x_t) \end{pmatrix}. \tag{12}$$

$\mathcal{K}(\cdot, \cdot)$ is a symmetric positive definite kernel function. In our approach, we apply the simple and effective linear kernel $\mathcal{K}(x_i, x_j) = x_i^{\mathsf{T}} x_j$ for our dictionary learning task.

The mutual information is used to implicitly exploit the information between the selected samples and the remaining ones. The samples which have large mutual information should be selected from the pool of unlabeled data for refining the learned dictionary in DPL.
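Under the linear-kernel reading above, the representativeness score (10)–(12) could be computed as follows; the small ridge term that keeps the kernel submatrix invertible is our own numerical safeguard, not part of the formulation.

```python
import numpy as np

def representativeness(j, X_u, ridge=1e-8):
    """M(x_j) = 0.5 * ln(sigma_j^2 / sigma_{j|U_j}^2), as in (10)-(12), with a
    linear kernel. X_u is an n x m matrix of unlabeled samples; j indexes the
    candidate sample."""
    K = X_u @ X_u.T                                    # (12): Gram matrix of the pool
    idx = [i for i in range(X_u.shape[0]) if i != j]   # U_j: every index but j
    s_jj = K[j, j]                                     # sigma_j^2
    k_ju = K[j, idx]                                   # Sigma_{j, U_j}
    K_uu = K[np.ix_(idx, idx)] + ridge * np.eye(len(idx))
    s_cond = s_jj - k_ju @ np.linalg.solve(K_uu, k_ju)  # (11): conditional variance
    return 0.5 * np.log(s_jj / max(s_cond, 1e-12))      # (10)
```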

Procedure of Active Sample Selection. Based on the above analysis, we integrate the strengths of the informativeness measure and the representativeness measure to select unlabeled samples from the pool of unlabeled data. We choose the samples that have not only a large reconstruction error and entropy of the probability distribution over the class-specific reconstruction errors with respect to the DPL classification model but also large representativeness regarding the rest of the unlabeled samples. The sample set $\mathbf{D}_s = \{x^{s}_1, x^{s}_2, \ldots, x^{s}_{N_s}\}$ is iteratively selected from the pool by the following formula:

$$x^{s} = \arg\max_{x_j} \bigl(E(x_j) + M(x_j)\bigr), \quad j \in \{n_l + 1, n_l + 2, \ldots, N\}. \tag{13}$$

The overall procedure of our ADDL is given in Algorithm 1.
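One selection round, reusing the two scoring sketches above, might then look as follows; ranking the pool and taking the top $N_s$ scores is our own batch reading of (13) and step (4) of Algorithm 1.

```python
def addl_select(X_u, D_list, P_list, n_s):
    """Score every unlabeled sample by E(x_j) + M(x_j), as in (13), and return
    the indices of the n_s highest-scoring samples, to be labeled by the user
    and moved from D_u to D_l before the dictionary pair is relearned."""
    scores = [informativeness(X_u[j], D_list, P_list) + representativeness(j, X_u)
              for j in range(X_u.shape[0])]
    return sorted(range(X_u.shape[0]), key=scores.__getitem__, reverse=True)[:n_s]
```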

4. Experiments

In this section, the performance of the proposed framework is evaluated on two weather datasets. We first give the details about the datasets and experimental settings; then the experimental results are provided and analyzed.

4.1. Datasets and Experimental Setting

Datasets. The first dataset employed in our experiment is the one provided by Chen et al. [9] (denoted as DATASET 1). DATASET 1 contains 1000 images of size 3966 × 270, and each image has been manually labeled as sunny, cloudy, or overcast. There are 276 sunny images, 251 cloudy images, and 473 overcast images in DATASET 1. Figure 2 shows three images from DATASET 1 with the labels sunny, cloudy, and overcast, respectively.

Because there are few available public datasets for weather recognition, we construct a new dataset (denoted as DATASET 2) to test the performance of the proposed method. The images in DATASET 2 are selected from the panorama images collected on the roof of the BC building at EPFL (http://panorama.epfl.ch provides high resolution (13200 × 900) panorama images from 2005 till now, recorded every 10 minutes during daytime) and categorized into sunny, cloudy, or overcast based on the classification criterion presented in [9]. It includes 5000 images which were captured approximately every 30 minutes during daytime in 2014, and the size of each image is 4821 × 400. Although both DATASET 1 and DATASET 2 are constructed based on the images provided by http://panorama.epfl.ch, DATASET 2 is more challenging because it contains a large number of images captured in different seasons. In Figure 3, some examples of differently labeled images in DATASET 2 are shown.

[Figure 2: Examples in DATASET 1. (a) Sunny; (b) cloudy; (c) overcast.]

[Figure 3: Examples in DATASET 2. (a) Sunny; (b) cloudy; (c) overcast.]

Experimental Setting. For the sky region of each image, four types of visual appearance features are extracted, including 200-dim SIFT, a 600-dim HSV color feature (200-dim histogram of the H channel, 200-dim histogram of the S channel, and 200-dim histogram of the V channel), 200-dim LBP, and a 200-dim gradient magnitude feature. These features are only extracted from the sky part of each image, and the sky detector and feature extraction procedure provided by Chen et al. [9] are used in this paper. Besides the visual appearance features, two kinds of features based on the physical characteristics of images captured under different weather are also extracted from the nonsky region, consisting of a 200-dim bag-of-words representation of the contrast and a 200-dim bag-of-words representation of the dark channel. To be specific, we divide each nonsky region of the image into 32 × 32 blocks, extract the contrast and the dark channel features by (1) and (2), and then use the bag-of-words model [37] to code each feature.
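The coding step could be realized, for instance, with a k-means codebook as in the sketch below; the use of k-means and the L1 normalization are our assumptions about the standard bag-of-words pipeline of [37], not details stated in the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def bow_histogram(block_features, codebook):
    """Quantize per-block feature values against a learned codebook and pool
    them into one 200-bin bag-of-words histogram per image region.
    block_features: (n_blocks, feature_dim) array."""
    words = codebook.predict(block_features)          # nearest visual word per block
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)                # L1-normalized histogram

# The codebook is learned once from training blocks (hypothetical variable name):
# codebook = KMeans(n_clusters=200, n_init=10).fit(train_block_features)
```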

In our experiments, 50% of the images in each dataset are randomly selected for training, and the remaining data are used for testing. The training data are randomly partitioned into a labeled sample set $\mathbf{D}_l$ and an unlabeled sample set $\mathbf{D}_u$: $\mathbf{D}_l$ is used for learning an initial dictionary, and $\mathbf{D}_u$ is utilized for actively selecting "profitable" samples to iteratively improve the classification performance. To make the experimental results more convincing, each of the following experimental processes is repeated ten times, and then the mean and standard deviation of the classification accuracy are reported.

4.2. Experiment I: Verifying the Performance of Feature Extraction. The effectiveness of the feature extraction of our method is evaluated first. Many previous works merely extract the visual appearance features of the sky part for weather recognition [9]. In order to validate the power of our extracted features based on the physical characteristics of images, the results of two weather recognition schemes are compared: one only uses the visual appearance features to classify the weather conditions, and the other combines the visual appearance features with the features based on the physical characteristics of images. In order to weaken the influence of the classifier on the results of weather recognition, K-NN classification (K is empirically set to 30), SVM with the radial basis function kernel, and the original DPL without the active sample selection procedure are applied in this experiment. Figures 4 and 5 show the comparison results on DATASET 1 and DATASET 2, respectively.

In Figures 4 and 5, the x-axis represents the number of training samples and the y-axis represents the average classification accuracy. The red dotted lines indicate that just the six visual appearance features of the sky area are used, and the blue solid lines indicate that both the six visual appearance features and the two features based on the physical characteristics of the nonsky area are applied for recognition. From Figures 4 and 5, it is clearly observed that the combination of visual appearance features and physical features achieves better performance for the weather recognition task.

4.3. Experiment II: Recognizing Weather Conditions by the Proposed ADDL. In this section, the performance of the proposed ADDL algorithm is evaluated. First, an experiment is conducted to find the best parameters for ADDL; then ADDL is compared against several popular classification methods.

4.3.1. Parameter Selection. There are three important parameters in the proposed approach, namely, $k$, $\lambda$, and $\tau$: $k$ is the number of atoms in each subdictionary $D_c$ learned from the samples of each class, $\lambda$ is used to control the discriminative property of $P$, and $\tau$ is a scalar constant in the DPL algorithm. The performance of our ADDL under various values of $k$, $\lambda$, and $\tau$ is studied on DATASET 1. Figure 6(a) lists the classification results for $k = 15, 25, 35, 45, 55, 65, 75, 85, 95$. It can be seen that the highest average classification accuracy is obtained when $k = 25$, which demonstrates that ADDL is effective in learning a compact dictionary. According to the observation in Figure 6(a), we set $k$ to 25 in all experiments. The classification results obtained by using different $\tau$ and $\lambda$ are shown in Figures 6(b) and 6(c). From Figures 6(b) and 6(c), the optimal values of $\lambda$ and $\tau$ are 0.05 and 25, respectively. This is because a too big or too small $\lambda$ value will make the reconstruction coefficients in ADDL too sparse or too dense, which deteriorates the classification performance. If $\tau$ is too large, the effect of the reconstruction error constraint (the first term in (4)) and the sparse constraint (the third term in (4)) is weakened, which decreases the discrimination ability of the learned dictionary. On the contrary, if $\tau$ is too small, the second term in (4) will be neglected in dictionary learning, which also reduces the performance of the algorithm. Hence, we set $\lambda = 0.05$ and $\tau = 25$ for all experiments.


[Figure 4: Weather recognition results over DATASET 1 using different features. (a) K-NN; (b) SVM; (c) DPL. Each panel plots the average accuracy (%) against the number of training samples for the 6-feature and 8-feature schemes.]

Now we evaluate whether active sample selection can improve the recognition performance. 500 samples in DATASET 1 are randomly selected as training data, and the remaining samples are used for testing. In the training dataset, 50 samples are randomly selected as the labeled dataset $\mathbf{D}_l$, and the remaining 450 samples form the unlabeled dataset $\mathbf{D}_u$. The proposed ADDL uses the labeled dataset $\mathbf{D}_l$ to learn an initial dictionary pair and then iteratively selects 50 samples from $\mathbf{D}_u$ to label for expanding the training dataset. Figure 7 shows the recognition accuracy versus the number of iterations.

In Figure 7, the 0th iteration indicates that only the initial 50 labeled samples are used to learn the dictionary, and the 9th iteration means using all 500 training samples to learn the dictionary for recognition. The recognition ability of ADDL is improved by the active sample selection procedure, and it achieves the highest accuracy when the number of iterations is 3 (in total, 200 samples are used for training). It is worth mentioning that if the number of iterations is larger than 3, the recognition rate drops by about 1%. This is because there are some noisy examples or "outliers" in the unlabeled dataset, and more of them are selected to learn the dictionary as the number of iterations increases, which interferes with the dictionary learning and degrades the classification performance. In the following, the number of iterations is set to 3 for all experiments.

[Figure 5: Weather recognition results over DATASET 2 using different features. (a) K-NN; (b) SVM; (c) DPL. Each panel plots the average accuracy (%) against the number of training samples for the 6-feature and 8-feature schemes.]

4.3.2. Comparisons of ADDL with Other Methods. Here the proposed ADDL is compared with several methods. The first two methods are the K-NN algorithm used by Song et al. [7] and SVM with the radial basis function kernel (RBF-SVM) adopted by Roser and Moosmann [5]. The third method is SRC [15], which directly uses all training samples as the dictionary for classification. In order to confirm that the active sample selection in our ADDL method is effective, the proposed ADDL is compared with the original DPL method [19]. We also compare ADDL with the method proposed by Chen et al. [9]. As far as we know, the work in [9] is the only framework which addresses the same problem as our method, that is, recognizing different weather conditions (sunny, cloudy, and overcast) of images captured by a still camera; it actively selects useful samples to train SVM for recognizing different weather conditions.

[Figure 6: Selection of parameters. The x-axis represents the different values of the parameters and the y-axis the average classification accuracy. (a) The average classification rate under different k; (b) under different λ; (c) under different τ.]

[Figure 7: Recognition accuracy on DATASET 1 versus the number of iterations.]

For DATASET 1, 500 images are randomly selected as the training samples, and the rest of the images are used as the testing samples. In the training dataset, 50 images are randomly chosen as the initial labeled training dataset $\mathbf{D}_l$, and the remaining 450 images are regarded as the unlabeled dataset $\mathbf{D}_u$. ADDL and Chen's method [9] both include the active learning procedure; thus they iteratively choose 150 samples from $\mathbf{D}_u$ to be labeled based on their criteria of sample selection and add these samples to $\mathbf{D}_l$ for further training the classification model. For the K-NN, RBF-SVM, SRC, and DPL methods, which are without the active learning procedure, 150 samples are randomly selected from $\mathbf{D}_u$ to be labeled for expanding the labeled training dataset $\mathbf{D}_l$. Table 1 lists the comparisons of our approach with several methods for weather classification.

Table 1: Comparisons on DATASET 1 among different methods.

Methods             Classification rate (mean ± std)
K-NN                82.9 ± 1.2
RBF-SVM             85.6 ± 1.6
SRC                 89.7 ± 0.9
DPL                 91.3 ± 1.6
Chen's method [9]   92.98 ± 0.55
ADDL                94.0 ± 0.2

As can be seen from Table 1, ADDL outperforms the other methods; the mean classification rate of ADDL reaches about 94%.

For DATASET 2, 2500 images are randomly selected as the training samples, and the rest of the images are used as the testing samples. In the training dataset, 50 images are randomly chosen as the initial labeled training dataset $\mathbf{D}_l$, and the remaining 2450 images are regarded as the unlabeled dataset $\mathbf{D}_u$. All parameter settings for DATASET 2 are the same as for DATASET 1. Table 2 lists the recognition results of the different methods, which indicates that the proposed ADDL performs better than the other methods.

Table 2: Comparisons on DATASET 2 among different methods.

Methods             Classification rate (mean ± std)
K-NN                85.0 ± 1.6
RBF-SVM             88.1 ± 1.4
SRC                 88.2 ± 1.1
DPL                 89.6 ± 0.8
Chen's method [9]   88.8 ± 1.6
ADDL                90.4 ± 1.0

From Tables 1 and 2, two points can be observed. First, the recognition performances of K-NN, RBF-SVM, SRC, and DPL are overall inferior to the proposed ADDL algorithm. This is probably because these four algorithms randomly select the unlabeled data from the given pool and do not consider whether the selected samples are beneficial for improving the performance of the classification model. Second, although the proposed ADDL and Chen's method [9] both include the active learning paradigm, the proposed ADDL performs better than Chen's method [9]. This is due to the fact that Chen's method [9] only considers the informativeness and ignores the representativeness of samples when selecting the unlabeled samples from the given pool.

5. Conclusions

We have presented an effective framework for classifying three types of weather (sunny, cloudy, and overcast) based on outdoor images. Through the analysis of the different visual manifestations of images caused by different weathers, various features are separately extracted from the sky area and the nonsky area of images, describing the visual appearance properties and the physical characteristics of images under different weather conditions, respectively. The ADDL approach was proposed to learn a more discriminative dictionary for improving the weather classification performance by selecting informative and representative unlabeled samples from a given pool to expand the training dataset. Since there are few image datasets for weather recognition, we have collected and labeled a new weather dataset for testing the proposed algorithm. The experimental results show that ADDL is a fairly effective and inspiring strategy for weather classification, which can also be used in many other computer vision tasks.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This work is supported by the Fund of the Jilin Provincial Science & Technology Department (nos. 20130206042GX and 20140204089GX) and the National Natural Science Foundation of China (nos. 61403078, 11271064, and 61471111).

References

[1] F. Nashashibi, R. de Charette, and A. Lia, "Detection of unfocused raindrops on a windscreen using low level image processing," in Proceedings of the 11th International Conference on Control Automation Robotics and Vision (ICARCV '10), pp. 1410–1415, Singapore, December 2010.
[2] H. Kurihata, T. Takahashi, I. Ide, et al., "Rainy weather recognition from in-vehicle camera images for driver assistance," in Proceedings of the IEEE Intelligent Vehicles Symposium, pp. 205–210, IEEE, Las Vegas, Nev, USA, June 2005.
[3] H. Woo, Y. M. Jung, J.-G. Kim, and J. K. Seo, "Environmentally robust motion detection for video surveillance," IEEE Transactions on Image Processing, vol. 19, no. 11, pp. 2838–2848, 2010.
[4] H. Katsura, J. Miura, M. Hild, and Y. Shirai, "A view-based outdoor navigation using object recognition robust to changes of weather and seasons," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '03), vol. 3, pp. 2974–2979, IEEE, Las Vegas, Nev, USA, October 2003.
[5] M. Roser and F. Moosmann, "Classification of weather situations on single color images," in Proceedings of the IEEE Intelligent Vehicles Symposium, pp. 798–803, IEEE, Eindhoven, The Netherlands, June 2008.
[6] X. Yan, Y. Luo, and X. Zheng, "Weather recognition based on images captured by vision system in vehicle," in Advances in Neural Networks - ISNN 2009, pp. 390–398, Springer, Berlin, Germany, 2009.
[7] H. Song, Y. Chen, and Y. Gao, "Weather condition recognition based on feature extraction and K-NN," in Foundations and Practical Applications of Cognitive Systems and Information Processing: Proceedings of the First International Conference on Cognitive Systems and Information Processing (CSIP 2012), Beijing, China, December 2012, Advances in Intelligent Systems and Computing, pp. 199–210, Springer, Berlin, Germany, 2014.
[8] C. Lu, D. Lin, J. Jia, and C.-K. Tang, "Two-class weather classification," in Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '14), pp. 3718–3725, Columbus, Ohio, USA, June 2014.
[9] Z. Chen, F. Yang, A. Lindner, G. Barrenetxea, and M. Vetterli, "How is the weather: automatic inference from images," in Proceedings of the 19th IEEE International Conference on Image Processing (ICIP '12), pp. 1853–1856, Orlando, Fla, USA, October 2012.
[10] K. He, J. Sun, and X. Tang, "Single image haze removal using dark channel prior," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 12, pp. 2341–2353, 2011.
[11] A. Li and H. Shouno, "Dictionary-based image denoising by fused-lasso atom selection," Mathematical Problems in Engineering, vol. 2014, Article ID 368602, 10 pages, 2014.
[12] Z. Feng, M. Yang, L. Zhang, Y. Liu, and D. Zhang, "Joint discriminative dimensionality reduction and dictionary learning for face recognition," Pattern Recognition, vol. 46, no. 8, pp. 2134–2143, 2013.
[13] J. Dong, C. Sun, and W. Yang, "A supervised dictionary learning and discriminative weighting model for action recognition," Neurocomputing, vol. 158, pp. 246–256, 2015.
[14] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation," IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4311–4322, 2006.
[15] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210–227, 2009.
[16] J. Mairal, F. Bach, and J. Ponce, "Task-driven dictionary learning," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 4, pp. 791–804, 2012.
[17] Z. Jiang, Z. Lin, and L. S. Davis, "Label consistent K-SVD: learning a discriminative dictionary for recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 11, pp. 2651–2664, 2013.
[18] M. Yang, L. Zhang, X. Feng, and D. Zhang, "Sparse representation based Fisher discrimination dictionary learning for image classification," International Journal of Computer Vision, vol. 109, no. 3, pp. 209–232, 2014.
[19] S. Gu, L. Zhang, W. Zuo, and X. Feng, "Projective dictionary pair learning for pattern classification," in Proceedings of the 28th Annual Conference on Neural Information Processing Systems (NIPS '14), pp. 793–801, December 2014.
[20] B. Settles, Active Learning Literature Survey, vol. 52, University of Wisconsin-Madison, Madison, Wis, USA, 2010.
[21] D. Cohn, L. Atlas, and R. Ladner, "Improving generalization with active learning," Machine Learning, vol. 15, no. 2, pp. 201–221, 1994.
[22] I. Dagan and S. P. Engelson, "Committee-based sampling for training probabilistic classifiers," in Proceedings of the 12th International Conference on Machine Learning, pp. 150–157, Morgan Kaufmann, Tahoe City, Calif, USA, 1995.
[23] D. Angluin, "Queries and concept learning," Machine Learning, vol. 2, no. 4, pp. 319–342, 1988.
[24] R. D. King, K. E. Whelan, F. M. Jones, et al., "Functional genomic hypothesis generation and experimentation by a robot scientist," Nature, vol. 427, no. 6971, pp. 247–252, 2004.
[25] S. Chakraborty, V. Balasubramanian, and S. Panchanathan, "Dynamic batch mode active learning," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 2649–2656, Providence, RI, USA, June 2011.
[26] K. Nigam and A. McCallum, "Employing EM in pool-based active learning for text classification," in Proceedings of the 15th International Conference on Machine Learning (ICML '98), pp. 350–358, Morgan Kaufmann, Madison, Wis, USA, 1998.
[27] A. Kapoor, K. Grauman, R. Urtasun, and T. Darrell, "Active learning with Gaussian processes for object categorization," in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1–8, IEEE, Rio de Janeiro, Brazil, October 2007.
[28] S. J. Huang, R. Jin, and Z. H. Zhou, "Active learning by querying informative and representative examples," in Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS '10), pp. 892–900, 2010.
[29] P. Donmez, J. G. Carbonell, and P. N. Bennett, "Dual strategy active learning," in Machine Learning: ECML 2007, J. N. Kok, J. Koronacki, R. L. de Mantaras, S. Matwin, D. Mladenic, and A. Skowron, Eds., vol. 4701 of Lecture Notes in Computer Science, pp. 116–127, Springer, Berlin, Germany, 2007.
[30] S. C. H. Hoi, R. Jin, J. Zhu, and M. R. Lyu, "Semi-supervised SVM batch mode active learning for image retrieval," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–7, Anchorage, Alaska, USA, June 2008.
[31] Z. Xu, K. Yu, V. Tresp, X. Xu, and J. Wang, "Representative sampling for text classification using support vector machines," in Proceedings of the 25th European Conference on IR Research (ECIR '03), Springer, 2003.
[32] A. Bosch, A. Zisserman, and X. Munoz, "Image classification using random forests and ferns," in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1–8, IEEE, Rio de Janeiro, Brazil, October 2007.
[33] T. Ojala, M. Pietikainen, and D. Harwood, "A comparative study of texture measures with classification based on feature distributions," Pattern Recognition, vol. 29, no. 1, pp. 51–59, 1996.
[34] S. G. Narasimhan and S. K. Nayar, "Chromatic framework for vision in bad weather," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '00), pp. 598–605, Hilton Head Island, SC, USA, June 2000.
[35] S. G. Narasimhan and S. K. Nayar, "Removing weather effects from monochrome images," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. II-186–II-193, IEEE, Kauai, Hawaii, USA, December 2001.
[36] S. G. Narasimhan and S. K. Nayar, "Vision and the atmosphere," International Journal of Computer Vision, vol. 48, no. 3, pp. 233–254, 2002.
[37] G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, "Visual categorization with bags of keypoints," in Proceedings of the International Workshop on Statistical Learning in Computer Vision (ECCV '04), pp. 1–22, Prague, Czech Republic, 2004.
[38] D. D. Lewis and W. A. Gale, "A sequential algorithm for training text classifiers," in SIGIR '94: Proceedings of the Seventeenth Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, pp. 3–12, Springer, Berlin, Germany, 1994.
[39] B. Settles and M. Craven, "An analysis of active learning strategies for sequence labeling tasks," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '08), pp. 1070–1079, Association for Computational Linguistics, 2008.
[40] G. Schohn and D. Cohn, "Less is more: active learning with support vector machines," in Proceedings of the 17th International Conference on Machine Learning (ICML '00), pp. 839–846, Morgan Kaufmann, 2000.
[41] S. Tong and D. Koller, "Support vector machine active learning with applications to text classification," The Journal of Machine Learning Research, vol. 2, pp. 45–66, 2002.
[42] M. Szummer and T. S. Jaakkola, "Information regularization with partially labeled data," in Advances in Neural Information Processing Systems, pp. 1025–1032, MIT Press, 2002.
[43] X. Li and Y. Guo, "Adaptive active learning for image classification," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '13), pp. 859–866, IEEE, Portland, Ore, USA, June 2013.


Page 4: Research Article Active Discriminative Dictionary Learning

4 Mathematical Problems in Engineering

contrasts The contrast CON of an image can be computedby

CON =119864max minus 119864min119864max + 119864min

(1)

where 119864max and 119864min are the maximum and minimumintensities of the image respectively

Overcast weather may come with haze [8] The darkchannel prior presented in [10] is effective to detect theimage haze It is based on the observation that most haze-free patches in the nonsky regions of images should have avery low intensity at least one RGB color channel The darkchannel 119869dark was defined as

119869dark

(119909) = min119888isin119903119892119887

( min1199091015840isinΩ(119909)

(119869119888(1199091015840))) (2)

where 119869119888 is one color channel of the image J and Ω(119909) is a

local patch centered at 119909In summary both the visual appearance features (SIFT

HSV color LBP and the gradient magnitude) of the skyregion and the physical characteristics based features (thecontrast and the dark channel) of the nonsky region areextracted to distinguish the images under different weathersituations And then the ldquobag-of-wordsrdquo model [37] is usedto code each feature for forming the feature vectors

32 The Proposed ADDL Classification Model Inspired bythe technological advancements of discriminative dictionarylearning and active learning we design an active discrimi-native dictionary learning (ADDL) algorithm to improve thediscriminative power of the learned dictionary In ADDLalgorithm DPL [19] is applied to learn a discriminativedictionary for recognizing different weather conditions andthe strategy of active samples selection is developed toiteratively select the unlabeled sample from a given pool toenlarge the training dataset for improving the DPL classifi-cation performance The criterion of active samples selectionincludes the informativenessmeasure and the representative-ness measure

321 DPL Algorithm Suppose there is a set of 119898-dimen-sionality training samples from 119862 classes denoted by 119883

= [1198831 119883

119888 119883

119862] with the label set 119884 = [119884

1 119884

119888

119884119862] where 119883

119888isin 119877119898times119899 denotes the sample set of 119888th class

and 119884119888denotes the corresponding label set DPL [19] algo-

rithm jointly learned an analysis dictionary 119875 = [1198751 119875

119888

119875119862] isin 119877119896119862times119898 and a synthesis dictionary 119863 = [119863

1 119863

119888

119863119862] isin 119877119898times119896119862 to avoid resolving the costly 119897

0-norm or 119897

1-

norm sparse coding process 119875 and 119863 were used for linearencoding representation coefficients and class-specific dis-criminative reconstruction respectively The object functionof DPL model is

119875lowast 119863lowast = argmin

119875119863

119862

sum

119888=1

1003817100381710038171003817119883119888 minus 119863119888119875119888119883119888

10038171003817100381710038172

119865+ 120582

10038171003817100381710038171003817119875119888119883119888

10038171003817100381710038171003817

2

119865

st 100381710038171003817100381711988911989410038171003817100381710038172

2le 1

(3)

where 119863119888

isin 119877119898times119896 and 119875

119888isin 119877119896times119898 represent subdictionary

pairs corresponding to class 119888119883119888represents the complemen-

tary matrix of 119883119888 120582 ≻ 0 is a scalar constant to control the

discriminative property of 119875 and 119889119894denotes the 119894th element

of dictionary 119863The objective function in (3) is generally nonconvex But

it can be relaxed to the following form by introducing avariable matrix 119860

119875lowast 119860lowast 119863lowast = argmin119875119860119863

119862

sum

119888=1

1003817100381710038171003817119883119888 minus 119863119888119860119888

10038171003817100381710038172

119865

+ 1205911003817100381710038171003817119875119888119883119888 minus 119860

119888

10038171003817100381710038172

119865

+ 12058210038171003817100381710038171003817119875119888119883119888

10038171003817100381710038171003817

2

119865

st 100381710038171003817100381711988911989410038171003817100381710038172

2le 1

(4)

where 120591 is an algorithm parameter According to [19] theobjective function in (4) can be solved by an alternativelyupdated manner

When 119863 and 119875 have been learned given a test sample119909119905 the class-specific reconstruction residual is used to assign

the class label So the classification model associated with theDPL is defined as

label (119909119905) = argmin

119888

1003817100381710038171003817119909119905 minus 119863119888119875119888119909119905

10038171003817100381710038172 (5)

When dealing with the classification task DPL requiressufficient labeled training samples to learn discriminativedictionary pair for obtaining good results In fact it is difficultand expensive to obtain the vast quantity of labeled dataIf we can exploit the information provided by the massiveinexpensive unlabeled samples and choose small amounts ofthe ldquoprofitablerdquo samples (the ldquoprofitablerdquo unlabeled samplesare the ones that are most beneficial for the improvement ofthe DPL classification performance) from unlabeled datasetto be labeled manually we would learn a more discriminativedictionary than the one learned only using a limited numberof labeled training data To achieve this we introduce theactive learning technique to DPL in the next section

322 Introducing Active Learning to DPL When evaluatingone sample is ldquoprofitablerdquo or not two measures are consid-ered informativeness and representativeness The proposedADDL iteratively evaluates both informativeness and repre-sentativeness of unlabeled samples in a given pool for seekingthe ones that are most beneficial for the improvement oftheDPL classification performance Specifically the informa-tiveness measure is constructed based on the reconstructionerror and the entropy on the probability distribution over theclass-specific reconstruction error and the representativenessis obtained from the distribution of the unlabeled dataset inthis study

Assume that we are given an initial training dataset D119897=

1199091 1199092 119909

119899119897 with its label set Y

119897= 1199101 1199102 119910

119899119897 and

an unlabeled dataset D119906

= 119909119899119897+1

119909119899119897+2

119909119873 where 119909

119894isin

119877119898times1 is an 119898 dimensional feature vector and 119910

119894isin 1 2 3

is the corresponding class label (sunny cloudy or overcast)

Mathematical Problems in Engineering 5

The target of ADDL is to iteratively select119873119904most ldquoprofitablerdquo

samples denoted by D119904 from D

119906to query their labels Y

119904and

then add them to D119897for improving the performance of the

dictionary learning classification model

Informativeness Measure Informativeness measure is aneffective criterion to select informative samples for reducingthe uncertainty of the classification model which capturesthe relationship of the candidate samples with the currentclassification model Probabilistic classification models selectthe one that has the largest entropy on the conditionaldistribution over its labels [38 39] Query-by-committeealgorithms choose the samples which have themost disagree-ment among a committee [22 26] In SVM methods themost informative sample is regarded as the one that is closestto the separating hyperplane [40 41] Since DPL is usedas the classification model in our framework we design aninformativeness measure based on the reconstruction errorand the entropy on the probability distribution over the class-specific reconstruction error of the sample

For dictionary learning the samples which are well-represented through the current learned dictionary are lesslikely to provide more information in further refining thedictionary Instead the samples have large reconstructionerror and large uncertainty should be mainly cared aboutbecause they have some additional information that is notcaptured by the current dictionary As a consequence theinformativeness measure is defined as follows

119864 (119909119895) = Error 119877

119895+ Entropy

119895 (6)

where Error 119877119895and Entropy

119895denote the reconstruction error

of the sample 119909119895

isin D119906with respect to the current learned

dictionary and the entropy of probability distribution overclass-specific reconstruction error of the sample 119909

119895isin D119906

respectively Error 119877119895is defined as

Error 119877119895= min119888

10038171003817100381710038171003817119909119895minus 119863119888119875119888119909119895

10038171003817100381710038171003817

2

(7)

where119863119888and119875119888represent subdictionary pairs corresponding

to the class 119888 which is learned by DPL algorithm as shownin (4) The larger Error 119877

119895indicates that the current learned

dictionary does not represent the sample 119909119895well

Since the class-specific reconstruction error is used to identify the class label of sample $x_j$ (as shown in (5)), the probability distribution of the class-specific reconstruction error of $x_j$ can be obtained. The class-specific reconstruction error probability of $x_j$ in class $c$ is defined as

$$p_c(x_j) = \frac{\left\| x_j - D_c P_c x_j \right\|^2}{\sum_{c=1}^{C} \left\| x_j - D_c P_c x_j \right\|^2}. \quad (8)$$

The class probability distribution for sample $x_j$ is computed as $p(x_j) = [p_1, p_2, \ldots, p_C]$, which indicates how well the dictionary distinguishes the input sample. That is, if an input sample can be expressed well by one of the class-specific subdictionaries, we will get a small value of $\|x_j - D_c P_c x_j\|^2$ for that subdictionary, and the class distribution should reach its valley at the most likely class. Entropy is a measure of uncertainty. Hence, in order to estimate the uncertainty of an input sample's label, the entropy of the probability distribution over the class-specific reconstruction errors is calculated as follows:

$$\mathrm{Entropy}_j = -\sum_{c=1}^{C} p_c(x_j) \log p_c(x_j). \quad (9)$$

A high $\mathrm{Entropy}_j$ value demonstrates that $x_j$ is difficult to classify with the current learned dictionary; thus it should be selected, labeled, and added to the labeled set for further dictionary training.
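To make (6)-(9) concrete, the following is a minimal Python/NumPy sketch of the informativeness score, assuming the class-specific dictionary pairs $(D_c, P_c)$ produced by DPL are available as arrays; the function name and the small epsilon guard are our illustrative additions, not part of the original method.

```python
import numpy as np

def informativeness(x, D_list, P_list):
    """Informativeness E(x) of one unlabeled sample, per (6)-(9)."""
    # Class-specific reconstruction errors ||x - D_c P_c x||^2, one per class.
    errors = np.array([np.sum((x - D @ (P @ x)) ** 2)
                       for D, P in zip(D_list, P_list)])

    # Error_R in (7): the smallest reconstruction error over all classes.
    error_r = errors.min()

    # p_c(x) in (8): normalize the class-specific errors into a distribution.
    p = errors / errors.sum()

    # Entropy in (9); the epsilon guards against log(0).
    entropy = -np.sum(p * np.log(p + 1e-12))

    # E(x) in (6): reconstruction error plus entropy.
    return error_r + entropy
```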

Representativeness Measure. Since the informativeness measure only considers how the candidate samples relate to the current classification model, it ignores the potential structure information of the whole unlabeled dataset. Therefore, the representativeness measure is employed as an additional criterion for choosing useful unlabeled samples. The representativeness measure evaluates whether the samples represent the overall input patterns of the unlabeled data well, exploiting the relation between each candidate sample and the rest of the unlabeled samples.

The distribution of the unlabeled data is very useful for training a good classifier. In previous active learning work, the marginal density and the cosine distance were used as representativeness measures to capture the data distribution [39, 42]. Li and Guo [43] defined a more straightforward representativeness measure based on mutual information. Its intention is to select the samples located in the dense region of the unlabeled data distribution, which are more representative of the remaining unlabeled data than those located in sparse regions. We introduce the representativeness framework proposed by Li and Guo [43] into dictionary learning.

For an unlabeled sample $x_j$, the mutual information with respect to the other unlabeled samples is defined as follows:

$$M(x_j) = H(x_j) - H(x_j \mid X_{U_j}) = \frac{1}{2} \ln\!\left( \frac{\sigma_j^2}{\sigma_{j|U_j}^2} \right), \quad (10)$$

where $H(x_j)$ and $H(x_j \mid X_{U_j})$ denote the entropy and the conditional entropy of sample $x_j$, respectively; $U_j$ represents the index set of the unlabeled samples with $j$ removed from $U$, $U_j = U - \{j\}$; and $X_{U_j}$ represents the set of samples indexed by $U_j$. $\sigma_j^2$ and $\sigma_{j|U_j}^2$ can be calculated by the following formulas:

$$\sigma_j^2 = \mathcal{K}(x_j, x_j), \qquad \sigma_{j|U_j}^2 = \sigma_j^2 - \Sigma_{jU_j} \Sigma_{U_jU_j}^{-1} \Sigma_{U_jj}. \quad (11)$$


Inputs: Labeled set $\mathcal{D}_l$ and its label set $\mathcal{Y}_l$; unlabeled set $\mathcal{D}_u$; the number of iterations $I_t$; and the number of unlabeled samples $N_s$ to be selected in each iteration.
(1) Initialization: learn an initial dictionary pair $D^*$ and $P^*$ by the DPL algorithm from $\mathcal{D}_l$.
(2) For $i = 1$ to $I_t$ do
(3) Compute $E(x_j)$ and $M(x_j)$ by (6) and (10) for each sample $x_j$ in the unlabeled dataset $\mathcal{D}_u$.
(4) Select $N_s$ samples (denoted by $\mathcal{D}_s$) from $\mathcal{D}_u$ by (13) and add them to $\mathcal{D}_l$ with their class labels, which are manually assigned by the user. Then update $\mathcal{D}_u = \mathcal{D}_u - \mathcal{D}_s$ and $\mathcal{D}_l = \mathcal{D}_l \cup \mathcal{D}_s$.
(5) Learn the refined dictionaries $D^*_{\mathrm{new}}$ and $P^*_{\mathrm{new}}$ over the expanded dataset $\mathcal{D}_l$.
(6) End for
Output: Final learned dictionary pair $D^*_{\mathrm{new}}$ and $P^*_{\mathrm{new}}$.

Algorithm 1: Active discriminative dictionary learning (ADDL).

Assume that the index set $U_j = (1, 2, 3, \ldots, t)$; $\Sigma_{U_jU_j}$ is a kernel matrix defined over all the unlabeled samples indexed by $U_j$ and is computed in the following form:

$$\Sigma_{U_jU_j} = \begin{pmatrix} \mathcal{K}(x_1, x_1) & \mathcal{K}(x_1, x_2) & \cdots & \mathcal{K}(x_1, x_t) \\ \mathcal{K}(x_2, x_1) & \mathcal{K}(x_2, x_2) & \cdots & \mathcal{K}(x_2, x_t) \\ \vdots & \vdots & \ddots & \vdots \\ \mathcal{K}(x_t, x_1) & \mathcal{K}(x_t, x_2) & \cdots & \mathcal{K}(x_t, x_t) \end{pmatrix}. \quad (12)$$

$\mathcal{K}(\cdot,\cdot)$ is a symmetric positive definite kernel function. In our approach, we apply the simple and effective kernel $\mathcal{K}(x_i, x_j) = \|x_i - x_j\|^2$ for our dictionary learning task.

The mutual information is used to implicitly exploit the relation between the selected samples and the remaining ones. The samples with large mutual information should be selected from the pool of unlabeled data for refining the dictionary learned by DPL.
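Similarly, a sketch of the representativeness score in (10)-(12) is given below. The kernel is passed in as a generic symmetric positive definite function, the small ridge added to the kernel matrix is a numerical safeguard of ours rather than part of the paper, and the sketch assumes $\mathcal{K}(x, x) > 0$ so that the logarithm in (10) is defined.

```python
import numpy as np

def representativeness(j, X, kernel):
    """Mutual-information measure M(x_j) of sample j, per (10)-(12).

    X is the (N_u x m) matrix of unlabeled samples; `kernel` is a
    symmetric positive definite kernel function k(a, b).
    """
    n = X.shape[0]
    U_j = [i for i in range(n) if i != j]          # index set U_j = U - {j}

    sigma2_j = kernel(X[j], X[j])                  # prior variance in (11)

    # Kernel blocks Sigma_{jU_j} (length t) and Sigma_{U_jU_j} (t x t), as in (12).
    k_jU = np.array([kernel(X[j], X[i]) for i in U_j])
    K_UU = np.array([[kernel(X[a], X[b]) for b in U_j] for a in U_j])
    K_UU += 1e-8 * np.eye(len(U_j))                # ridge for a stable solve

    # Conditional variance in (11): sigma_j^2 - Sigma_jU Sigma_UU^{-1} Sigma_Uj.
    sigma2_cond = sigma2_j - k_jU @ np.linalg.solve(K_UU, k_jU)

    # M(x_j) in (10): variance reduction expressed as mutual information.
    return 0.5 * np.log(sigma2_j / sigma2_cond)
```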

Procedure of Active Sample Selection. Based on the above analysis, we integrate the strengths of the informativeness and representativeness measures to select unlabeled samples from the pool. We choose the samples that have not only a large reconstruction error and a large entropy of the probability distribution over the class-specific reconstruction errors with respect to the DPL classification model but also a large representativeness regarding the rest of the unlabeled samples. The sample set $\mathcal{D}_s = \{x^s_1, x^s_2, \ldots, x^s_{N_s}\}$ is iteratively selected from the pool by the following formula (a maximization, since the most profitable samples score highest on both terms):

$$x^s = \arg\max_{x_j} \left( E(x_j) + M(x_j) \right), \quad j \in \{n_l + 1, n_l + 2, \ldots, N\}. \quad (13)$$

The overall procedure of our ADDL is given in Algorithm 1.
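Putting the pieces together, a compact sketch of Algorithm 1 might look as follows; it reuses the two helper functions sketched above, while `train_dpl` (a stand-in for the DPL training routine), `oracle` (a stand-in for the human annotator), and the inner-product kernel are assumptions of this sketch.

```python
import numpy as np

def addl(X_l, y_l, X_u, n_iter, n_select, train_dpl, oracle):
    """Sketch of Algorithm 1 (active discriminative dictionary learning)."""
    kernel = lambda a, b: float(a @ b)         # assumed kernel for M(x_j)
    D_list, P_list = train_dpl(X_l, y_l)       # step (1): initial pair
    for _ in range(n_iter):                    # step (2)
        # Step (3): score every unlabeled sample by E(x_j) + M(x_j).
        scores = np.array([informativeness(x, D_list, P_list) +
                           representativeness(j, X_u, kernel)
                           for j, x in enumerate(X_u)])
        # Step (4): take the N_s most profitable samples (eq. (13)) and
        # query their labels from the annotator.
        picked = np.argsort(scores)[-n_select:]
        X_l = np.vstack([X_l, X_u[picked]])
        y_l = np.concatenate([y_l, [oracle(x) for x in X_u[picked]]])
        X_u = np.delete(X_u, picked, axis=0)
        D_list, P_list = train_dpl(X_l, y_l)   # step (5): retrain the pair
    return D_list, P_list
```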

4. Experiments

In this section, the performance of the proposed framework is evaluated on two weather datasets. We first give the details of the datasets and the experimental settings; then the experimental results are provided and analyzed.

4.1. Datasets and Experimental Setting

Datasets. The first dataset employed in our experiments is the one provided by Chen et al. [9] (denoted as DATASET 1). DATASET 1 contains 1000 images of size 3966 × 270, and each image has been manually labeled as sunny, cloudy, or overcast. There are 276 sunny images, 251 cloudy images, and 473 overcast images in DATASET 1. Figure 2 shows three images from DATASET 1 labeled sunny, cloudy, and overcast, respectively.

Because there are few publicly available datasets for weather recognition, we construct a new dataset (denoted as DATASET 2) to test the performance of the proposed method. The images in DATASET 2 are selected from the panorama images collected on the roof of the BC building at EPFL (http://panorama.epfl.ch provides high resolution (13200 × 900) panorama images from 2005 until now, recorded every 10 minutes during daytime) and categorized into sunny, cloudy, or overcast based on the classification criterion presented in [9]. It includes 5000 images, captured approximately every 30 minutes during daytime in 2014, and the size of each image is 4821 × 400. Although both DATASET 1 and DATASET 2 are constructed from the images provided by http://panorama.epfl.ch, DATASET 2 is more challenging because it contains a large number of images captured in different seasons. Figure 3 shows some examples of the differently labeled images in DATASET 2.

Experimental Setting. For the sky region of each image, four types of visual appearance features are extracted: 200-dim SIFT, 600-dim HSV color feature (200-dim histogram of the H channel, 200-dim histogram of the S channel, and 200-dim histogram of the V channel), 200-dim LBP, and 200-dim gradient magnitude feature. These features are extracted only from the sky part of each image; the sky detector and feature extraction procedure provided by Chen et al. [9] are used in this paper. Besides the visual appearance features, two kinds of features based on the physical characteristics of images captured under different weather are also extracted from the nonsky region.


Figure 2: Examples in DATASET 1: (a) sunny; (b) cloudy; (c) overcast.

Figure 3: Examples in DATASET 2: (a) sunny; (b) cloudy; (c) overcast.

These consist of a 200-dim bag-of-words representation of the contrast and a 200-dim bag-of-words representation of the dark channel. To be specific, we divide each nonsky region of the image into 32 × 32 blocks, extract the contrast and dark channel features by (1) and (2), and then use the bag-of-words model [37] to code each feature.
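As a rough illustration of this block-wise scheme, the sketch below computes per-block contrast and dark channel values; since the paper's exact statistics are given by its (1) and (2) earlier in the text, the block standard deviation and block minimum used here are stand-ins.

```python
import numpy as np

def block_features(img, block=32):
    """Per-block physical features for the nonsky region of an image.

    `img` is an H x W x 3 RGB array in [0, 1]. The dark channel of a
    block is taken as the minimum over its pixels and color channels
    (after the dark channel prior of He et al. [10]); the contrast is
    approximated by the standard deviation of the block intensities.
    """
    h, w, _ = img.shape
    contrast, dark = [], []
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            patch = img[r:r + block, c:c + block, :]
            contrast.append(patch.std())   # block contrast proxy
            dark.append(patch.min())       # block dark channel
    return np.array(contrast), np.array(dark)
```

Each list of block values would then be quantized against a 200-word codebook (learned, e.g., by k-means) to form the 200-dim bag-of-words histograms described above.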

In our experiments, 50% of the images in each dataset are randomly selected for training, and the remaining data are used for testing. The training data are randomly partitioned into a labeled sample set $\mathcal{D}_l$ and an unlabeled sample set $\mathcal{D}_u$: $\mathcal{D}_l$ is applied for learning an initial dictionary, and $\mathcal{D}_u$ is utilized for actively selecting "profitable" samples to iteratively improve the classification performance. To make the experimental results more convincing, each of the following experimental processes is repeated ten times, and the mean and standard deviation of the classification accuracy are reported.
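This protocol can be summarized by a small driver like the one below, where `run_once` is a hypothetical function that performs one random split, training, and testing cycle and returns an accuracy.

```python
import numpy as np

def repeated_eval(run_once, n_trials=10):
    """Repeat the split/train/test cycle and report mean and std accuracy,
    matching the ten-repetition protocol described above."""
    accs = np.array([run_once(seed) for seed in range(n_trials)])
    return accs.mean(), accs.std()
```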

4.2. Experiment I: Verifying the Performance of Feature Extraction. The effectiveness of the feature extraction in our method is evaluated first. Many previous works merely extract the visual appearance features of the sky part for weather recognition [9]. In order to validate the power of our extracted features based on the physical characteristics of images, the results of two weather recognition schemes are compared: one uses only the visual appearance features to classify the weather conditions, and the other combines the visual appearance features with the features based on the physical characteristics of images. In order to weaken the influence of the classifier on the results of weather recognition, K-NN classification (K is empirically set to 30), SVM with the radial basis function kernel, and the original DPL without the active sample selection procedure are applied in this experiment. Figures 4 and 5 show the comparison results on DATASET 1 and DATASET 2, respectively.

In Figures 4 and 5, the x-axis represents the number of training samples and the y-axis the average classification accuracy. The red dotted lines indicate that only the six visual appearance features of the sky area are used, and the blue solid lines indicate that both the six visual appearance features and the two features based on the physical characteristics of the nonsky area are applied for recognition. From Figures 4 and 5, it is clearly observed that the combination of visual appearance features and physical features achieves better performance on the weather recognition task.

4.3. Experiment II: Recognizing Weather Conditions by the Proposed ADDL. In this section, the performance of the proposed ADDL algorithm is evaluated. First, an experiment is conducted to determine the best parameters for ADDL. Then ADDL is compared against several popular classification methods.

4.3.1. Parameter Selection. There are three important parameters in the proposed approach, namely, $k$, $\lambda$, and $\tau$: $k$ is the number of atoms in each subdictionary $D_c$ learned from the samples of each class, $\lambda$ controls the discriminative property of $P$, and $\tau$ is a scalar constant in the DPL algorithm. The performance of our ADDL under various values of $k$, $\lambda$, and $\tau$ is studied on DATASET 1. Figure 6(a) lists the classification results for $k = 5, 15, 25, 35, 45, 55, 65, 75, 85, 95$. It can be seen that the highest average classification accuracy is obtained when $k = 25$, which demonstrates that ADDL is effective at learning a compact dictionary; according to this observation, we set $k$ to 25 in all experiments. The classification results obtained with different $\tau$ and $\lambda$ are shown in Figures 6(b) and 6(c), from which the optimal values of $\lambda$ and $\tau$ are 0.05 and 25, respectively. This is because a too large or too small $\lambda$ makes the reconstruction coefficients in ADDL too sparse or too dense, which deteriorates the classification performance. If $\tau$ is too large, the effect of the reconstruction error constraint (the first term in (4)) and the sparse constraint (the third term in (4)) is weakened, which decreases the discrimination ability of the learned dictionary. On the contrary, if $\tau$ is too small, the second term in (4) is neglected in dictionary learning, which also reduces the performance of the algorithm. Hence, we set $\lambda = 0.05$ and $\tau = 25$ for all experiments.
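The parameter study amounts to a sweep over candidate values. A sketch is given below, with the caveat that the paper varies one parameter at a time while this sketch, for brevity, scans the full grid; `train_eval` is a hypothetical function returning the average accuracy of ADDL for a given setting.

```python
import itertools

def select_params(train_eval,
                  ks=(5, 15, 25, 35, 45, 55, 65, 75, 85, 95),
                  lams=(0.0005, 0.005, 0.05, 0.5, 1.0, 3.0, 5.0, 7.0, 9.0),
                  taus=(0.005, 0.5, 5, 25, 45, 65, 85)):
    """Exhaustively score every (k, lambda, tau) setting and keep the best."""
    best = max(itertools.product(ks, lams, taus),
               key=lambda p: train_eval(*p))
    return best  # the paper reports k = 25, lambda = 0.05, tau = 25
```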


Figure 4: Weather recognition results over DATASET 1 by using different features ((a) K-NN; (b) SVM; (c) DPL; x-axis: number of training samples, 100–500; y-axis: average accuracy, %; curves: 6 features versus 8 features).

Now we evaluate whether active sample selection can improve the recognition performance. 500 samples in DATASET 1 are randomly selected as training data, and the remaining samples are used for testing. In the training dataset, 50 samples are randomly selected as the labeled dataset $\mathcal{D}_l$, and the remaining 450 samples form the unlabeled dataset $\mathcal{D}_u$. The proposed ADDL uses the labeled dataset $\mathcal{D}_l$ to learn an initial dictionary pair and then iteratively selects 50 samples from $\mathcal{D}_u$ to label for expanding the training dataset. Figure 7 shows the recognition accuracy versus the number of iterations.

In Figure 7, the 0th iteration indicates that only the initial 50 labeled samples are used to learn the dictionary, and the 9th iteration means that all 500 training samples are used to learn the dictionary for recognition. The recognition ability of ADDL is improved by the active sample selection procedure, and it achieves the highest accuracy when the number of iterations is 3, at which point 200 samples in total are used for training. It is worth mentioning that if more than 3 iterations are used, the recognition rate drops by about 1%. This is because there are some noisy examples or "outliers" in the unlabeled dataset, and more of them are selected to learn the dictionary as the number of iterations increases, which interferes with the dictionary learning and degrades the classification performance. In the following, the number of iterations is set to 3 for all experiments.

4.3.2. Comparisons of ADDL with Other Methods. Here the proposed ADDL is compared with several methods.


Figure 5: Weather recognition results over DATASET 2 by using different features ((a) K-NN; (b) SVM; (c) DPL; x-axis: number of training samples, 100–2500; y-axis: average accuracy, %; curves: 6 features versus 8 features).

The first two methods are the K-NN algorithm used by Song et al. [7] and SVM with the radial basis function kernel (RBF-SVM) adopted by Roser and Moosmann [5]. The third method is SRC [15], which directly uses all training samples as the dictionary for classification. In order to confirm that the active sample selection in our ADDL method is effective, the proposed ADDL is also compared with the original DPL method [19]. We further compare ADDL with the method proposed by Chen et al. [9]. As far as we know, the work in [9] is the only framework that addresses the same problem as our method, that is, recognizing different weather conditions (sunny, cloudy, and overcast) in images captured by a still camera; it actively selects useful samples for training an SVM to recognize different weather conditions.

For DATASET 1, 500 images are randomly selected as the training samples, and the rest of the images are used as the testing samples. In the training dataset, 50 images are randomly chosen as the initial labeled training dataset $\mathcal{D}_l$, and the remaining 450 images are regarded as the unlabeled dataset $\mathcal{D}_u$. ADDL and Chen's method [9] both include the active learning procedure; thus they iteratively choose 150 samples from $\mathcal{D}_u$ to be labeled based on their criteria of sample selection and add these samples to $\mathcal{D}_l$ for further training the classification model. For the K-NN, RBF-SVM, SRC, and DPL methods, which lack the active learning procedure, 150 samples are randomly selected from $\mathcal{D}_u$ to be labeled for expanding the labeled training dataset $\mathcal{D}_l$.


Figure 6: Selection of parameters; the x-axis represents the different parameter values and the y-axis the average classification accuracy. (a) The average classification rate under different $k$; (b) under different $\lambda$; (c) under different $\tau$.

Figure 7: Recognition accuracy on DATASET 1 versus the number of iterations.

Table 1: Comparisons on DATASET 1 among different methods.

Methods              Classification rate (mean ± std)
K-NN                 82.9 ± 1.2
RBF-SVM              85.6 ± 1.6
SRC                  89.7 ± 0.9
DPL                  91.3 ± 1.6
Chen's method [9]    92.98 ± 0.55
ADDL                 94.0 ± 0.2

Table 1 lists the comparisons of our approach with several methods for weather classification. As can be seen from Table 1, ADDL outperforms the other methods; the mean classification rate of ADDL reaches about 94%.

For DATASET 2, 2500 images are randomly selected as the training samples, and the rest of the images are used as the testing samples.

Table 2: Comparison on DATASET 2 among different methods.

Methods              Classification rate (mean ± std)
K-NN                 85.0 ± 1.6
RBF-SVM              88.1 ± 1.4
SRC                  88.2 ± 1.1
DPL                  89.6 ± 0.8
Chen's method [9]    88.8 ± 1.6
ADDL                 90.4 ± 1.0

In the training dataset, 50 images are randomly chosen as the initial labeled training dataset $\mathcal{D}_l$, and the remaining 2450 images are regarded as the unlabeled dataset $\mathcal{D}_u$. All parameter settings for DATASET 2 are the same as for DATASET 1. Table 2 lists the recognition results of the different methods, which indicates that the proposed ADDL performs better than the other methods.

From Tables 1 and 2, two points can be observed. First, the recognition performances of K-NN, RBF-SVM, SRC, and DPL are overall inferior to the proposed ADDL algorithm. This is probably because these four algorithms randomly select the unlabeled data from the given pool and do not consider whether the selected samples are beneficial for improving the performance of the classification model. Second, although the proposed ADDL and Chen's method [9] both include the active learning paradigm, the proposed ADDL performs better than Chen's method [9]. This is due to the fact that Chen's method [9] only considers the informativeness and ignores the representativeness


of the samples when selecting the unlabeled samples from the given pool.

5. Conclusions

We have presented an effective framework for classifying three types of weather (sunny, cloudy, and overcast) based on outdoor images. Through an analysis of the different visual manifestations of images caused by different weathers, various features are separately extracted from the sky area and the nonsky area of images, describing the visual appearance properties and the physical characteristics of images under different weather conditions, respectively. The ADDL approach was proposed to learn a more discriminative dictionary and improve the weather classification performance by selecting informative and representative unlabeled samples from a given pool to expand the training dataset. Since there are few image datasets for weather recognition, we have collected and labeled a new weather dataset for testing the proposed algorithm. The experimental results show that ADDL is a fairly effective and inspiring strategy for weather classification, which can also be used in many other computer vision tasks.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This work is supported by the Fund of Jilin Provincial Science & Technology Department (nos. 20130206042GX and 20140204089GX) and the National Natural Science Foundation of China (nos. 61403078, 11271064, and 61471111).

References

[1] F. Nashashibi, R. de Charette, and A. Lia, "Detection of unfocused raindrops on a windscreen using low level image processing," in Proceedings of the 11th International Conference on Control Automation Robotics and Vision (ICARCV '10), pp. 1410–1415, Singapore, December 2010.

[2] H. Kurihata, T. Takahashi, I. Ide et al., "Rainy weather recognition from in-vehicle camera images for driver assistance," in Proceedings of the IEEE Intelligent Vehicles Symposium, pp. 205–210, IEEE, Las Vegas, Nev, USA, June 2005.

[3] H. Woo, Y. M. Jung, J.-G. Kim, and J. K. Seo, "Environmentally robust motion detection for video surveillance," IEEE Transactions on Image Processing, vol. 19, no. 11, pp. 2838–2848, 2010.

[4] H. Katsura, J. Miura, M. Hild, and Y. Shirai, "A view-based outdoor navigation using object recognition robust to changes of weather and seasons," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '03), vol. 3, pp. 2974–2979, IEEE, Las Vegas, Nev, USA, October 2003.

[5] M. Roser and F. Moosmann, "Classification of weather situations on single color images," in Proceedings of the IEEE Intelligent Vehicles Symposium, pp. 798–803, IEEE, Eindhoven, The Netherlands, June 2008.

[6] X. Yan, Y. Luo, and X. Zheng, "Weather recognition based on images captured by vision system in vehicle," in Advances in Neural Networks—ISNN 2009, pp. 390–398, Springer, Berlin, Germany, 2009.

[7] H. Song, Y. Chen, and Y. Gao, "Weather condition recognition based on feature extraction and K-NN," in Foundations and Practical Applications of Cognitive Systems and Information Processing: Proceedings of the First International Conference on Cognitive Systems and Information Processing, Beijing, China, Dec 2012 (CSIP2012), Advances in Intelligent Systems and Computing, pp. 199–210, Springer, Berlin, Germany, 2014.

[8] C. Lu, D. Lin, J. Jia, and C.-K. Tang, "Two-class weather classification," in Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '14), pp. 3718–3725, Columbus, Ohio, USA, June 2014.

[9] Z. Chen, F. Yang, A. Lindner, G. Barrenetxea, and M. Vetterli, "How is the weather: automatic inference from images," in Proceedings of the 19th IEEE International Conference on Image Processing (ICIP '12), pp. 1853–1856, Orlando, Fla, USA, October 2012.

[10] K. He, J. Sun, and X. Tang, "Single image haze removal using dark channel prior," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 12, pp. 2341–2353, 2011.

[11] A. Li and H. Shouno, "Dictionary-based image denoising by fused-lasso atom selection," Mathematical Problems in Engineering, vol. 2014, Article ID 368602, 10 pages, 2014.

[12] Z. Feng, M. Yang, L. Zhang, Y. Liu, and D. Zhang, "Joint discriminative dimensionality reduction and dictionary learning for face recognition," Pattern Recognition, vol. 46, no. 8, pp. 2134–2143, 2013.

[13] J. Dong, C. Sun, and W. Yang, "A supervised dictionary learning and discriminative weighting model for action recognition," Neurocomputing, vol. 158, pp. 246–256, 2015.

[14] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation," IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4311–4322, 2006.

[15] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210–227, 2009.

[16] J. Mairal, F. Bach, and J. Ponce, "Task-driven dictionary learning," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 4, pp. 791–804, 2012.

[17] Z. Jiang, Z. Lin, and L. S. Davis, "Label consistent K-SVD: learning a discriminative dictionary for recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 11, pp. 2651–2664, 2013.

[18] M. Yang, L. Zhang, X. Feng, and D. Zhang, "Sparse representation based Fisher discrimination dictionary learning for image classification," International Journal of Computer Vision, vol. 109, no. 3, pp. 209–232, 2014.

[19] S. Gu, L. Zhang, W. Zuo, and X. Feng, "Projective dictionary pair learning for pattern classification," in Proceedings of the 28th Annual Conference on Neural Information Processing Systems (NIPS '14), pp. 793–801, December 2014.

[20] B. Settles, Active Learning Literature Survey, vol. 52, University of Wisconsin-Madison, Madison, Wis, USA, 2010.

[21] D. Cohn, L. Atlas, and R. Ladner, "Improving generalization with active learning," Machine Learning, vol. 15, no. 2, pp. 201–221, 1994.

[22] I. Dagan and S. P. Engelson, "Committee-based sampling for training probabilistic classifiers," in Proceedings of the 12th International Conference on Machine Learning, pp. 150–157, Morgan Kaufmann, Tahoe City, Calif, USA, 1995.

[23] D. Angluin, "Queries and concept learning," Machine Learning, vol. 2, no. 4, pp. 319–342, 1988.

[24] R. D. King, K. E. Whelan, F. M. Jones et al., "Functional genomic hypothesis generation and experimentation by a robot scientist," Nature, vol. 427, no. 6971, pp. 247–252, 2004.

[25] S. Chakraborty, V. Balasubramanian, and S. Panchanathan, "Dynamic batch mode active learning," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 2649–2656, Providence, RI, USA, June 2011.

[26] K. Nigam and A. McCallum, "Employing EM in pool-based active learning for text classification," in Proceedings of the 15th International Conference on Machine Learning (ICML '98), pp. 350–358, Morgan Kaufmann, Madison, Wis, USA, 1998.

[27] A. Kapoor, K. Grauman, R. Urtasun, and T. Darrell, "Active learning with Gaussian processes for object categorization," in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1–8, IEEE, Rio de Janeiro, Brazil, October 2007.

[28] S. J. Huang, R. Jin, and Z. H. Zhou, "Active learning by querying informative and representative examples," in Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS '10), pp. 892–900, 2010.

[29] P. Donmez, J. G. Carbonell, and P. N. Bennett, "Dual strategy active learning," in Machine Learning: ECML 2007, J. N. Kok, J. Koronacki, R. L. de Mantaras, S. Matwin, D. Mladenic, and A. Skowron, Eds., vol. 4701 of Lecture Notes in Computer Science, pp. 116–127, Springer, Berlin, Germany, 2007.

[30] S. C. H. Hoi, R. Jin, J. Zhu, and M. R. Lyu, "Semi-supervised SVM batch mode active learning for image retrieval," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–7, Anchorage, Alaska, USA, June 2008.

[31] Z. Xu, K. Yu, V. Tresp, X. Xu, and J. Wang, "Representative sampling for text classification using support vector machines," in Proceedings of the 25th European Conference on IR Research (ECIR '03), Springer, 2003.

[32] A. Bosch, A. Zisserman, and X. Munoz, "Image classification using random forests and ferns," in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1–8, IEEE, Rio de Janeiro, Brazil, October 2007.

[33] T. Ojala, M. Pietikainen, and D. Harwood, "A comparative study of texture measures with classification based on feature distributions," Pattern Recognition, vol. 29, no. 1, pp. 51–59, 1996.

[34] S. G. Narasimhan and S. K. Nayar, "Chromatic framework for vision in bad weather," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '00), pp. 598–605, Hilton Head Island, SC, USA, June 2000.

[35] S. G. Narasimhan and S. K. Nayar, "Removing weather effects from monochrome images," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. II-186–II-193, IEEE, Kauai, Hawaii, USA, December 2001.

[36] S. G. Narasimhan and S. K. Nayar, "Vision and the atmosphere," International Journal of Computer Vision, vol. 48, no. 3, pp. 233–254, 2002.

[37] G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, "Visual categorization with bags of keypoints," in Proceedings of the International Workshop on Statistical Learning in Computer Vision (ECCV '04), pp. 1–22, Prague, Czech Republic, 2004.

[38] D. D. Lewis and W. A. Gale, "A sequential algorithm for training text classifiers," in SIGIR '94: Proceedings of the Seventeenth Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, organised by Dublin City University, pp. 3–12, Springer, Berlin, Germany, 1994.

[39] B. Settles and M. Craven, "An analysis of active learning strategies for sequence labeling tasks," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '08), pp. 1070–1079, Association for Computational Linguistics, 2008.

[40] G. Schohn and D. Cohn, "Less is more: active learning with support vector machines," in Proceedings of the 17th International Conference on Machine Learning (ICML '00), pp. 839–846, Morgan Kaufmann, 2000.

[41] S. Tong and D. Koller, "Support vector machine active learning with applications to text classification," The Journal of Machine Learning Research, vol. 2, pp. 45–66, 2002.

[42] M. Szummer and T. S. Jaakkola, "Information regularization with partially labeled data," in Advances in Neural Information Processing Systems, pp. 1025–1032, MIT Press, 2002.

[43] X. Li and Y. Guo, "Adaptive active learning for image classification," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '13), pp. 859–866, IEEE, Portland, Ore, USA, June 2013.


Page 5: Research Article Active Discriminative Dictionary Learning

Mathematical Problems in Engineering 5

The target of ADDL is to iteratively select119873119904most ldquoprofitablerdquo

samples denoted by D119904 from D

119906to query their labels Y

119904and

then add them to D119897for improving the performance of the

dictionary learning classification model

Informativeness Measure Informativeness measure is aneffective criterion to select informative samples for reducingthe uncertainty of the classification model which capturesthe relationship of the candidate samples with the currentclassification model Probabilistic classification models selectthe one that has the largest entropy on the conditionaldistribution over its labels [38 39] Query-by-committeealgorithms choose the samples which have themost disagree-ment among a committee [22 26] In SVM methods themost informative sample is regarded as the one that is closestto the separating hyperplane [40 41] Since DPL is usedas the classification model in our framework we design aninformativeness measure based on the reconstruction errorand the entropy on the probability distribution over the class-specific reconstruction error of the sample

For dictionary learning the samples which are well-represented through the current learned dictionary are lesslikely to provide more information in further refining thedictionary Instead the samples have large reconstructionerror and large uncertainty should be mainly cared aboutbecause they have some additional information that is notcaptured by the current dictionary As a consequence theinformativeness measure is defined as follows

119864 (119909119895) = Error 119877

119895+ Entropy

119895 (6)

where Error 119877119895and Entropy

119895denote the reconstruction error

of the sample 119909119895

isin D119906with respect to the current learned

dictionary and the entropy of probability distribution overclass-specific reconstruction error of the sample 119909

119895isin D119906

respectively Error 119877119895is defined as

Error 119877119895= min119888

10038171003817100381710038171003817119909119895minus 119863119888119875119888119909119895

10038171003817100381710038171003817

2

(7)

where119863119888and119875119888represent subdictionary pairs corresponding

to the class 119888 which is learned by DPL algorithm as shownin (4) The larger Error 119877

119895indicates that the current learned

dictionary does not represent the sample 119909119895well

Since the class-specific reconstruction error is used toidentify the class label of sample 119909

119895(as shown in (5)) the

probability distribution of class-specific reconstruction errorof 119909119895can be acquired The class-specific reconstruction error

probability of 119909119895in class c is defined as

119901119888(119909119895) =

10038171003817100381710038171003817119909119895minus 119863119888119875119888119909119895

10038171003817100381710038171003817

2

sum119862

119888=1

10038171003817100381710038171003817119909119895minus 119863119888119875119888119909119895

10038171003817100381710038171003817

2 (8)

The class probability distribution 119901(119909119895) for sample 119909

119895is

computed as 119901(119909119895) = [119901

1 1199012 119901

119862] which demonstrates

how well the dictionary distinguishes the input sample Thatis if an input sample can be expressed well by the currentdictionary we will get a small value of 119909

119895minus 1198631198881198751198881199091198952

to one of class-specific subdictionaries and thus the classdistribution should reach the valley at the most likely classEntropy is a measure of uncertainty Hence in order toestimate the uncertainty of an input sample label the entropyof probability distribution over class-specific reconstructionerror is calculated as follows

Entropy119895= minus

119862

sum

119888=1

119901119888(119909119895) log119901

119888(119909119895) (9)

The high Entropy119895value demonstrates that 119909

119895is difficult to be

classified by the current learned dictionary thus it should beselected to be labeled and added to the labeled set for furthertraining dictionary learning

Representativeness Measure Since the informativeness mea-sure only considers how the candidate samples relate to thecurrent classificationmodel it ignores the potential structureinformation of the whole input unlabeled dataset Thereforethe representativeness measure is employed as an additionalcriterion to choose the useful unlabeled samples Represen-tativeness measure is to evaluate whether the samples wellrepresent the overall input patterns of unlabeled data whichexploits the relation between the candidate sample and therest of unlabeled samples

The distribution of unlabeled data is very useful fortraining a good classifier In previous active learning workthe marginal density and cosine distance are used as therepresentativeness measure to gain the information of datadistribution [39 42] Li and Guo [43] defined a morestraightforward representativeness measure called mutualinformation Its intention is to select the samples located inthe density region of the unlabeled data distribution which ismore representative regarding the remaining unlabeled datathan the ones located in the sparse region We introduce theframework of representativenessmeasure proposed by Li andGuo [43] into the dictionary learning

For an unlabeled sample 119909119895 the mutual information with

respect to other unlabeled samples is defined as follows

119872(119909119895) = 119867(119909

119895) minus 119867(119909

119895| 119883119880119895

) =1

2ln(

1205902

119895

1205902119895|119880119895

) (10)

where 119867(119909119895) and 119867(119909

119895| 119883119880119895

) denote the entropy and theconditional entropy of sample 119909

119895 respectively 119880

119895represents

the index set of unlabeled samples where 119895 has been removedfrom 119880 119880

119895= 119880 minus 119895 and 119883

119880119895

represents the set of samplesindexed by119880

119895 1205902119895and 120590

2

119895|119880119895

can be calculated by the followingformulas

1205902

119895= K (119909

119895 119909119895)

1205902

119895|119880119895

= 1205902

119895minus sum

119895119880119895

minus1

sum

119880119895119880119895

sum

119880119895119895

(11)

6 Mathematical Problems in Engineering

Inputs Labeled set D119897and its label set Y

119897 Unlabeled set D

119906 the number of iteration 119868

119905

and the number of unlabeled samples 119873119904to be selected in each iteration

(1) Initialization Learn an initial dictionary pair 119863lowast and 119875lowast by DPL algorithm from the D

119897

(2) For 119894 = 1 to 119868119905 do

(3) Compute 119864(119909119895) and 119872(119909

119895) by (6) and (10) for each sample 119909

119895in the unlabeled dataset D

119906

(4) Select 119873119904samples (denoted by D

119904) from the D

119906by (13) and add them into D

119897with their

class labels which manually assigned by user Then updates D119906= D119906minus D119904and D

119897= D119897cup D119904

(5) Learn the refined dictionaries 119863lowastnew and 119875lowast

new over the expanded dataset D119897

(6) End forOutput Final learned dictionary pair 119863lowastnew and 119875

lowast

new

Algorithm 1 Active discriminative dictionary learning (ADDL)

Assume that the index set 119880119895= (1 2 3 119905) and sum

119880119895119880119895

is a kernel matrix defined over all the unlabeled samplesindexed by 119880

119895 it is computed by the following form

sum

119880119895119880119895

= (

K (1199091 1199091) K (119909

1 1199092) sdot sdot sdot K (119909

1 119909119905)

K (1199092 1199091) K (119909

2 1199092) sdot sdot sdot K (119909

2 119909119905)

K (119909119905 1199091) K (119909

119905 1199092) sdot sdot sdot K (119909

119905 119909119905)

)

(12)

K(sdot) is a symmetric positive definite kernel function Inour approach we apply simple and effective linear kernelK(119909119895 119909119895) = 119909

119894minus 1199091198952 for our dictionary learning task

The mutual information is used to implicitly exploit theinformation between the selected samples and the remainingones The samples which have large mutual informationshould be selected from the pool of unlabeled data for refiningthe learned dictionary in DPL

Procedure of Active Samples Selection Based on the aboveanalysis we aim to integrate the strengths of informativenessmeasure and representativeness measure to select the unla-beled samples from the pool of the unlabeled dataWe choosethe samples that have not only large reconstruction errorand the entropy of probability distribution over class-specificreconstruction error with respect to the DPL classificationmodel but also the large representativeness regarding the restof unlabeled samples The sample set D

119904= 119909119904

1 119909119904

2 119909

119904

119873119904

are iteratively selected from pool by the following formula

119909119904= argmin119909119895

(119864 (119909119895) + 119872(119909

119895))

119895 isin 119899119897 + 1 119899119897 + 2 119873

(13)

The overall of our ADDL is given in Algorithm 1

4 Experiments

In this section the performance of the proposed frameworkis evaluated on two weather datasets We first give the

details about the datasets and experimental settingsThen theexperimental results are provided and analyzed

41 Datasets and Experimental Setting

Datasets The first dataset employed in our experiment is thedataset provided by Chen et al [9] (denoted as DATASET1) DATASET 1 contains 1000 images of size 3966 times 270 andeach image has been manually labeled as sunny cloudy orovercast There are 276 sunny images 251 cloudy images and473 overcast images in DATASET 1 Figure 2 shows threeimages from DATASET 1 with the label sunny cloudy andovercast respectively

Because there are few available public datasets forweather recognition we construct a new dataset (denotedas DATASET 2) to test the performance of the proposedmethod The images in DATASET 2 are selected from thepanorama images collected on the roof of BC buildingat EPFL (httppanoramaepflch provides high resolution(13200times 900) panorama images from 2005 till now recordingat every 10 minutes during daytime) and categorized intosunny cloudy or overcast based on the classification criterionpresented in [9] It includes 5000 images whichwere capturedat approximately every 30 minutes during daytime in 2014and the size of each image is 4821 times 400 Although bothDATASET 1 and DATASET 2 are constructed based on theimages provided by httppanoramaepflch DATASET 2is more challenging because it contains a large number ofimages captured in different seasons In Figure 3 some exam-ples of different labeled images in DATASET 2 are shown

Experimental Setting For the sky region of each image fourtypes of visual appearance features are extracted including200-dim SIFT 600-dim HSV color feature (200-dim his-togram of H channel 200-dim histogram of S channel and200-dimhistogramofV channel) 200-dimLBP and 200-dimgradient magnitude featureThese features are only extractedfrom the sky part of each image and the sky detector andfeature extraction procedure provided by Chen et al [9] areused in this paper Besides the visual appearance features twokinds of features based on physical characteristics of imagescaptured under different weather are also extracted fromthe nonsky region which consists of 200-dim bag-of-wordsrepresentation of the contrast and 200-dim bag-of-word

Mathematical Problems in Engineering 7

(a) (b)

(c)

Figure 2 Examples in DATASET 1 (a) Sunny (b) cloudy (c) overcast

(a) (b)

(c)

Figure 3 Examples in DATASET 2 (a) Sunny (b) cloudy (c) overcast

representation of the dark channel To be specific we divideeach nonsky region of the image into 32 times 32 blocks andextract the contrast and the dark channel features by (1)and (2) and then use bag-of-words model [37] to code eachfeature

In our experiment 50 images are randomly selectedin each dataset for training and the remaining data is usedfor testing The training data are randomly partitioned intolabeled sample set D

119897and unlabeled sample set D

119906 D119897is

applied for learning an initial dictionary andD119906is utilized for

actively selecting ldquoprofitablerdquo samples to iteratively improvethe classification performance To make the experimentresultsmore convincing each following experimental processis repeated ten times and then the mean and standarddeviation of the classification accuracy are reported

42 Experiment I Verifying the Performance of FeatureExtraction The effectiveness of the feature extraction ofour method is first evaluated Many previous works merelyextract the visual appearance features of the sky part forweather recognition [9] In order to validate the power ofour extracted features based on physical characteristics ofimages the results of two weather recognition schemes arecompared One only uses the visual appearance features toclassify the weather conditions and the other combines thevisual appearance features with features based on physicalcharacteristics of images to identify the weather conditionsIn order to weaken the influence of classifier on the results ofweather recognition119870-NN classification (119870 is experientiallyset to 30) SVM with the radial basis function kernel andthe original DPL without active samples selection procedureare applied in this experiment Figures 4 and 5 show thecomparison results on DATASET 1 and DATASET 2 respect-ively

In Figures 4 and 5 119909-axis represents the different numberof training samples and 119910-axis represents the average classifi-cation accuracy The red dotted lines indicate just six visualappearance features of the sky area are used and the bluesolid lines indicate both six visual appearance features and

two features based on physical characteristics of the nonskyarea are applied for recognition From Figures 4 and 5 it isclearly observed that the combination of visual appearancefeatures and physical features can achieve better performancefor weather recognition task

43 Experiment II Recognizing Weather Conditions by theProposed ADDL In this section the performance of theproposed ADDL algorithm is evaluated First experimentis conducted to give the best parameters for ADDL AndthenADDL is compared against several popular classificationmethods

431 Parameters Selection There are three important param-eters in the proposed approach that is 119896 120582 and 120591 119896 isthe number of atoms in each subdictionary 119863

119888learned from

samples in each class 120582 is used to control the discriminativeproperty of 119875 and 120591 is a scalar constant in DPL algorithmThe performances of our ADDL under various values of 119896 120582and 120591 are studied on DATASET 1 Figure 6(a) lists the classi-fication results when 119896 = 15 25 35 45 55 65 75 85 95 Itcan be seen that the highest average classification accuracyis obtained when 119896 = 25 This demonstrates that ADDLis effective to learn a compact dictionary According tothe observation in Figure 6(a) we set 119896 to be 25 in allexperiments The classification results obtained by using thedifferent 120591 and 120582 are shown in Figures 6(b) and 6(c) FromFigures 6(b) and 6(c) the optimal values of 120582 and 120591 are 005and 25 respectively This is because of the fact that a too bigor too small 120582 value will lead the reconstruction coefficient inADDL to be too sparse or too dense which will deterioratethe classification performance If 120591 is too large the effect ofthe reconstruction error constraint (the first term in (4)) andthe sparse constraint (the third term in (4)) is weakenedwhich will decrease the discrimination ability of the learneddictionary On the contrary if 120591 is too small the second termin (4) will be neglected in dictionary learning which alsoreduces the performance of algorithmHence we set120582 = 005

and 120591 = 25 for all experiments

8 Mathematical Problems in Engineering

100 200 300 400 50076

78

80

82

84

86

88

90

Number of training samples

Aver

age a

ccur

acy

()

6 features8 features

(a) K-NN

100 200 300 400 50076

78

80

82

84

86

88

90

92

Number of training samples

Aver

age a

ccur

acy

()

6 features8 features

(b) SVM

100 200 300 400 50084

86

88

90

92

94

Number of training samples

Aver

age a

ccur

acy

()

6 features8 features

(c) DPL

Figure 4 Weather recognition results over DATASET 1 by using different features

Now we evaluate weather active samples selection canimprove the recognition performance 500 samples inDATASET 1 are randomly selected as training data and theremaining samples are used for testing In training dataset50 samples are randomly selected as the labeled dataset D

119897

and the remaining 450 samples are selected as the unlabeleddatasetD

119906The proposedADDLuses the labeled datasetD

119897to

learn an initial dictionary pair and then iteratively selects 50samples from D

119906to label for expanding the training dataset

Figure 7 shows the recognition accuracy versus the numberof iterations

In Figure 7 the 0th iteration indicates that we only usethe initial 50 labeled samples to learn the dictionary and the9th iterationmeans using all 500 training samples to learn thedictionary for recognition The recognition ability of ADDL

is improved by active samples selection procedure and itachieves highest accuracy when the number of iterationsis 3 total 200 samples are used for training It is worthmentioning that ADDL obtains the best results when thenumber of iterations is set as 3 If iterations are larger than 3the recognition rates will drop about 1This is because thereare somenoisy examples or ldquooutliersrdquo in the unlabeled datasetand the more noisy examples or ldquooutliersrdquo will be selectedto learn the dictionary along with the increase of iterationswhich interferes with the dictionary learning and leads to theclassification performance degradation In the following thenumber of iterations is set to 3 for all experiments

432 Comparisons of ADDL with Other Methods Here theproposed ADDL is compared with several methods The first

Mathematical Problems in Engineering 9

100 400 700 1000 1300 1600 1900 2200 250079

81

83

85

87

89

91

Number of training samples

Aver

age a

ccur

acy

()

6 features8 features

(a) K-NN

100 400 700 1000 1300 1600 1900 2200 250083

85

87

89

91

93

Number of training samples

Aver

age a

ccur

acy

()

6 features8 features

(b) SVM

100 400 700 1000 1300 1600 1900 2200 250086

88

90

92

94

Number of training samples

Aver

age a

ccur

acy

()

6 features8 features

(c) DPL

Figure 5 Weather recognition results over DATASET 2 by using different features

two methods are 119870-NN algorithm used by Song et al [7]and SVM with the radial basis function kernel (RBF-SVM)adopted by Roser and Moosmann [5] The third methodis SRC [15] which directly uses all training samples as thedictionary for classification In order to confirm that theactive samples selection in our ADDLmethod is effective theproposed ADDL is compared with the original DPL method[19] We also compare ADDL with the method proposedby Chen et al [9] As far as we know the work in [9] isthe only framework which addresses the same problem withour method that is recognizing different weather conditions(sunny cloudy and overcast) of images captured by a stillcamera It actively selected useful samples to training SVMfor recognizing different weather conditions

For DATASET 1 500 images are randomly selected as thetraining samples and the rest of images are used as the testingsamples In the training dataset 50 images are randomlychosen as the initial labeled training dataset D

119897 and the

remaining 450 images are regarded as the unlabeled datasetD119906 ADDL and Chenrsquos method [9] both include the active

learning procedure thus they iteratively choose 150 samplesfrom D

119906to be labeled based on their criterion of the samples

selection and add these samples toD119897for further training the

classification model For 119870-NN RBF-SVM SRC and DPLmethods which are without the active learning procedure150 samples are randomly selected from D

119906to be labeled

for expanding the labeled training dataset D119897 Table 1 lists

the comparisons of our approach with several methods for

10 Mathematical Problems in Engineering

5 15 25 35 45 55 65 75 85 95

90

92

94Av

erag

e acc

urac

y (

)

k

(a)

00005 0005 005 05 10 30 50 70 90

90

92

94

Aver

age a

ccur

acy

()

120582

(b)

0005 05 5 25 45 65 85

90

92

94

Aver

age a

ccur

acy

()

120591

(c)

Figure 6 Selection of parameters 119909-axis represents the different values of parameters and 119910-axis represents the average classificationaccuracy (a)The average classification rate under different k (b)The average classification rate under different 120582 (c)The average classificationrate under different 120591

0 1 2 3 4 5 6 7 8 98284868890929496

Iteration

Aver

age a

ccur

acy

()

Figure 7 Recognition accuracy on DATASET 1 versus the numberof iterations

Table 1 Comparisons on DATASET 1 among different methods

Methods Classification rate (mean plusmn std)119870-NN 829plusmn 12RBF-SVM 856plusmn 16SRC 897plusmn 09DPL 913plusmn 16Chenrsquos method [9] 9298plusmn 055ADDL 940plusmn 02

weather classification As can be seen from Table 1 ADDLoutperforms other methods The mean classification rate ofADDL reaches about 94

In DATASET 2 2500 images are randomly selected asthe training samples and the rest of images are used as

Table 2 Comparison on DATASET 2 among different methods

Methods Classification rate (mean plusmn std)119870-NN 850plusmn 16RBF-SVM 881plusmn 14SRC 882plusmn 11DPL 896plusmn 08Chenrsquos method [9] 888plusmn 16ADDL 904plusmn 10

the testing samples In the training dataset 50 images arerandomly chosen as the initial labeled training datasetD

119897 the

remaining 2450 images are regarded as the unlabeled datasetD119897 All parameters setting for DATASET 2 are the same as

DATASET 1 Table 2 lists the recognition results of differentmethods which indicates that the validity of the proposedADDL is better than other methods

FromTables 1 and 2 two points can be observed First wecan find that the recognition performances of 119870-NN RBF-SVM SRC and DPL are overall inferior to the proposedADDL algorithm This is probably because these four algo-rithms randomly select the unlabeled data from the givenpool which do not consider whether the selected samples arebeneficial for improving the performance of the classificationmodel or not Second although the proposed ADDL andChenrsquosmethod [9] both include the active learning paradigmthe proposed ADDL performs better than Chenrsquos method[9] This is due to the fact that Chenrsquos method [9] onlyconsiders the informativeness and ignores representativeness

Mathematical Problems in Engineering 11

of samples when selecting the unlabeled samples from thegiven pool

5 Conclusions

We have presented an effective framework for classifyingthree types of weather (sunny cloudy and overcast) basedon the outdoor images Through the analysis of the differentvisual manifestations of images caused by different weathersthe various features are separately extracted from the sky areaand nonsky area of images which describes visual appear-ance properties and physical characteristics of images underdifferent weather conditions respectively ADDL approachwas proposed to learn a more discriminative dictionary forimproving the weather classification performance by select-ing the informative and representative unlabeled samplesfrom a given pool to expand the training dataset Sincethere is not much image dataset on weather recognition wehave collected and labeled a new weather dataset for testingthe proposed algorithm The experimental results show thatADDL is a fairly effective and inspiring strategy for weatherclassification which also can be used inmany other computervision tasks

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This work is supported by the Fund of the Jilin Provincial Science & Technology Department (nos. 20130206042GX and 20140204089GX) and the National Natural Science Foundation of China (nos. 61403078, 11271064, and 61471111).

References

[1] F. Nashashibi, R. de Charette, and A. Lia, "Detection of unfocused raindrops on a windscreen using low level image processing," in Proceedings of the 11th International Conference on Control, Automation, Robotics and Vision (ICARCV '10), pp. 1410–1415, Singapore, December 2010.

[2] H. Kurihata, T. Takahashi, I. Ide et al., "Rainy weather recognition from in-vehicle camera images for driver assistance," in Proceedings of the IEEE Intelligent Vehicles Symposium, pp. 205–210, IEEE, Las Vegas, Nev, USA, June 2005.

[3] H. Woo, Y. M. Jung, J.-G. Kim, and J. K. Seo, "Environmentally robust motion detection for video surveillance," IEEE Transactions on Image Processing, vol. 19, no. 11, pp. 2838–2848, 2010.

[4] H. Katsura, J. Miura, M. Hild, and Y. Shirai, "A view-based outdoor navigation using object recognition robust to changes of weather and seasons," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '03), vol. 3, pp. 2974–2979, IEEE, Las Vegas, Nev, USA, October 2003.

[5] M. Roser and F. Moosmann, "Classification of weather situations on single color images," in Proceedings of the IEEE Intelligent Vehicles Symposium, pp. 798–803, IEEE, Eindhoven, The Netherlands, June 2008.

[6] X. Yan, Y. Luo, and X. Zheng, "Weather recognition based on images captured by vision system in vehicle," in Advances in Neural Networks—ISNN 2009, pp. 390–398, Springer, Berlin, Germany, 2009.

[7] H. Song, Y. Chen, and Y. Gao, "Weather condition recognition based on feature extraction and K-NN," in Foundations and Practical Applications of Cognitive Systems and Information Processing: Proceedings of the First International Conference on Cognitive Systems and Information Processing, Beijing, China, Dec 2012 (CSIP2012), Advances in Intelligent Systems and Computing, pp. 199–210, Springer, Berlin, Germany, 2014.

[8] C. Lu, D. Lin, J. Jia, and C.-K. Tang, "Two-class weather classification," in Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '14), pp. 3718–3725, Columbus, Ohio, USA, June 2014.

[9] Z. Chen, F. Yang, A. Lindner, G. Barrenetxea, and M. Vetterli, "How is the weather: automatic inference from images," in Proceedings of the 19th IEEE International Conference on Image Processing (ICIP '12), pp. 1853–1856, Orlando, Fla, USA, October 2012.

[10] K. He, J. Sun, and X. Tang, "Single image haze removal using dark channel prior," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 12, pp. 2341–2353, 2011.

[11] A. Li and H. Shouno, "Dictionary-based image denoising by fused-lasso atom selection," Mathematical Problems in Engineering, vol. 2014, Article ID 368602, 10 pages, 2014.

[12] Z. Feng, M. Yang, L. Zhang, Y. Liu, and D. Zhang, "Joint discriminative dimensionality reduction and dictionary learning for face recognition," Pattern Recognition, vol. 46, no. 8, pp. 2134–2143, 2013.

[13] J. Dong, C. Sun, and W. Yang, "A supervised dictionary learning and discriminative weighting model for action recognition," Neurocomputing, vol. 158, pp. 246–256, 2015.

[14] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation," IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4311–4322, 2006.

[15] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210–227, 2009.

[16] J. Mairal, F. Bach, and J. Ponce, "Task-driven dictionary learning," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 4, pp. 791–804, 2012.

[17] Z. Jiang, Z. Lin, and L. S. Davis, "Label consistent K-SVD: learning a discriminative dictionary for recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 11, pp. 2651–2664, 2013.

[18] M. Yang, L. Zhang, X. Feng, and D. Zhang, "Sparse representation based Fisher discrimination dictionary learning for image classification," International Journal of Computer Vision, vol. 109, no. 3, pp. 209–232, 2014.

[19] S. Gu, L. Zhang, W. Zuo, and X. Feng, "Projective dictionary pair learning for pattern classification," in Proceedings of the 28th Annual Conference on Neural Information Processing Systems (NIPS '14), pp. 793–801, December 2014.

[20] B. Settles, Active Learning Literature Survey, vol. 52, University of Wisconsin-Madison, Madison, Wis, USA, 2010.

[21] D. Cohn, L. Atlas, and R. Ladner, "Improving generalization with active learning," Machine Learning, vol. 15, no. 2, pp. 201–221, 1994.

[22] I. Dagan and S. P. Engelson, "Committee-based sampling for training probabilistic classifiers," in Proceedings of the 12th International Conference on Machine Learning, pp. 150–157, Morgan Kaufmann, Tahoe City, Calif, USA, 1995.

[23] D. Angluin, "Queries and concept learning," Machine Learning, vol. 2, no. 4, pp. 319–342, 1988.

[24] R. D. King, K. E. Whelan, F. M. Jones et al., "Functional genomic hypothesis generation and experimentation by a robot scientist," Nature, vol. 427, no. 6971, pp. 247–252, 2004.

[25] S. Chakraborty, V. Balasubramanian, and S. Panchanathan, "Dynamic batch mode active learning," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 2649–2656, Providence, RI, USA, June 2011.

[26] K. Nigam and A. McCallum, "Employing EM in pool-based active learning for text classification," in Proceedings of the 15th International Conference on Machine Learning (ICML '98), pp. 350–358, Morgan Kaufmann, Madison, Wis, USA, 1998.

[27] A. Kapoor, K. Grauman, R. Urtasun, and T. Darrell, "Active learning with Gaussian processes for object categorization," in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1–8, IEEE, Rio de Janeiro, Brazil, October 2007.

[28] S. J. Huang, R. Jin, and Z. H. Zhou, "Active learning by querying informative and representative examples," in Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS '10), pp. 892–900, 2010.

[29] P. Donmez, J. G. Carbonell, and P. N. Bennett, "Dual strategy active learning," in Machine Learning: ECML 2007, J. N. Kok, J. Koronacki, R. L. de Mantaras, S. Matwin, D. Mladenic, and A. Skowron, Eds., vol. 4701 of Lecture Notes in Computer Science, pp. 116–127, Springer, Berlin, Germany, 2007.

[30] S. C. H. Hoi, R. Jin, J. Zhu, and M. R. Lyu, "Semi-supervised SVM batch mode active learning for image retrieval," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–7, Anchorage, Alaska, USA, June 2008.

[31] Z. Xu, K. Yu, V. Tresp, X. Xu, and J. Wang, "Representative sampling for text classification using support vector machines," in Proceedings of the 25th European Conference on IR Research (ECIR '03), Springer, 2003.

[32] A. Bosch, A. Zisserman, and X. Munoz, "Image classification using random forests and ferns," in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1–8, IEEE, Rio de Janeiro, Brazil, October 2007.

[33] T. Ojala, M. Pietikainen, and D. Harwood, "A comparative study of texture measures with classification based on feature distributions," Pattern Recognition, vol. 29, no. 1, pp. 51–59, 1996.

[34] S. G. Narasimhan and S. K. Nayar, "Chromatic framework for vision in bad weather," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '00), pp. 598–605, Hilton Head Island, SC, USA, June 2000.

[35] S. G. Narasimhan and S. K. Nayar, "Removing weather effects from monochrome images," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. II-186–II-193, IEEE, Kauai, Hawaii, USA, December 2001.

[36] S. G. Narasimhan and S. K. Nayar, "Vision and the atmosphere," International Journal of Computer Vision, vol. 48, no. 3, pp. 233–254, 2002.

[37] G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, "Visual categorization with bags of keypoints," in Proceedings of the International Workshop on Statistical Learning in Computer Vision (ECCV '04), pp. 1–22, Prague, Czech Republic, 2004.

[38] D. D. Lewis and W. A. Gale, "A sequential algorithm for training text classifiers," in SIGIR '94: Proceedings of the Seventeenth Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, organised by Dublin City University, pp. 3–12, Springer, Berlin, Germany, 1994.

[39] B. Settles and M. Craven, "An analysis of active learning strategies for sequence labeling tasks," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '08), pp. 1070–1079, Association for Computational Linguistics, 2008.

[40] G. Schohn and D. Cohn, "Less is more: active learning with support vector machines," in Proceedings of the 17th International Conference on Machine Learning (ICML '00), pp. 839–846, Morgan Kaufmann, 2000.

[41] S. Tong and D. Koller, "Support vector machine active learning with applications to text classification," The Journal of Machine Learning Research, vol. 2, pp. 45–66, 2002.

[42] M. Szummer and T. S. Jaakkola, "Information regularization with partially labeled data," in Advances in Neural Information Processing Systems, pp. 1025–1032, MIT Press, 2002.

[43] X. Li and Y. Guo, "Adaptive active learning for image classification," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '13), pp. 859–866, IEEE, Portland, Ore, USA, June 2013.



[30] S C H Hoi R Jin J Zhu and M R Lyu ldquoSemi-supervisedSVM batch mode active learning for image retrievalrdquo in Pro-ceedings of the 26th IEEE Conference on Computer Vision andPattern Recognition (CVPR rsquo08) pp 1ndash7 Anchorage AlaskaUSA June 2008

[31] Z Xu K Yu V Tresp X Xu and J Wang ldquoRepresentativesampling for text classification using support vector machinesrdquoin Proceedings of the 25th European Conference on IR Research(ECIR rsquo03) Springer 2003

[32] A Bosch A Zisserman and X Munoz ldquoImage classificationusing random forests and fernsrdquo in Proceedings of the IEEE 11thInternational Conference on Computer Vision (ICCV rsquo07) pp 1ndash8 IEEE Rio de Janeiro Brazil October 2007

[33] T Ojala M Pietikainen and D Harwood ldquoA comparativestudy of texture measures with classification based on featuredistributionsrdquo Pattern Recognition vol 29 no 1 pp 51ndash59 1996

[34] S G Narasimhan and S K Nayar ldquoChromatic framework forvision in bad weatherrdquo in Proceedings of the IEEE Conference onComputer Vision and Pattern Recognition (CVPR rsquo00) pp 598ndash605 Hilton Head Island SC USA June 2000

[35] S G Narasimhan and S K Nayar ldquoRemoving weather effectsfrom monochrome imagesrdquo in Proceedings of the IEEE Com-puter Society Conference on Computer Vision and Pattern Recog-nition vol 2 pp II-186ndashII-193 IEEE Kauai Hawaii USADecember 2001

[36] S G Narasimhan and S K Nayar ldquoVision and the atmosphererdquoInternational Journal of Computer Vision vol 48 no 3 pp 233ndash254 2002

[37] G Csurka C Dance L Fan J Willamowski and C BrayldquoVisual categorization with bags of keypointsrdquo in Proceedings ofthe International Workshop on Statistical Learning in ComputerVision (ECCV rsquo04) pp 1ndash22 Prague Czech Republic 2004

[38] D D Lewis andW A Gale ldquoA sequential algorithm for trainingtext classifiersrdquo in SIGIR rsquo94 Proceedings of the SeventeenthAnnual International ACM-SIGIR Conference on Research andDevelopment in Information Retrieval organised by Dublin CityUniversity pp 3ndash12 Springer Berlin Germany 1994

[39] B Settles and M Craven ldquoAn analysis of active learning strate-gies for sequence labeling tasksrdquo inProceedings of the Conferenceon Empirical Methods in Natural Language Processing (EMNLPrsquo08) pp 1070ndash1079 Association for Computational Linguistics2008

[40] G Schohn andD Cohn ldquoLess ismore active learning with sup-port vector machinesrdquo in Proceedings of the 17th InternationalConference on Machine Learning (ICML rsquo00) pp 839ndash846Morgan Kaufmann 2000

[41] S Tong and D Koller ldquoSupport vector machine active learningwith applications to text classificationrdquo The Journal of MachineLearning Research vol 2 pp 45ndash66 2002

[42] M Szummer and T S Jaakkola ldquoInformation regularizationwith partially labeled datardquo in Advances in Neural InformationProcessing Systems pp 1025ndash1032 MIT Press 2002

[43] X Li and Y Guo ldquoAdaptive active learning for image classifica-tionrdquo in Proceedings of the 26th IEEE Conference on ComputerVision and Pattern Recognition (CVPR rsquo13) pp 859ndash866 IEEEPortland Ore USA June 2013

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 8: Research Article Active Discriminative Dictionary Learning

8 Mathematical Problems in Engineering

Figure 4: Weather recognition results over DATASET 1 by using different features. (a) K-NN; (b) SVM; (c) DPL. Each panel plots the average accuracy (%) against the number of training samples (100-500) for the 6-feature and 8-feature sets.

Now we evaluate whether active sample selection can improve the recognition performance. 500 samples in DATASET 1 are randomly selected as training data and the remaining samples are used for testing. In the training dataset, 50 samples are randomly selected as the labeled dataset D_l, and the remaining 450 samples form the unlabeled dataset D_u. The proposed ADDL uses the labeled dataset D_l to learn an initial dictionary pair and then iteratively selects 50 samples from D_u to label for expanding the training dataset. Figure 7 shows the recognition accuracy versus the number of iterations.

In Figure 7, the 0th iteration indicates that only the initial 50 labeled samples are used to learn the dictionary, and the 9th iteration means that all 500 training samples are used to learn the dictionary for recognition. The recognition ability of ADDL is improved by the active sample selection procedure, and it achieves the highest accuracy at 3 iterations, when a total of 200 samples are used for training. If more than 3 iterations are run, the recognition rate drops by about 1%. This is because there are some noisy examples or "outliers" in the unlabeled dataset, and more of them are selected to learn the dictionary as the number of iterations increases, which interferes with the dictionary learning and degrades the classification performance. In the following, the number of iterations is set to 3 for all experiments.
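The loop just described can be summarized in code. Below is a minimal sketch of the pool-based procedure, assuming a generic train routine (standing in for the dictionary-pair learning step) and a select_batch scoring function standing in for the ADDL selection criterion; the batch size of 50 and the oracle labeling step follow the protocol above, while the function names themselves are illustrative.

```python
import numpy as np

def active_learning_loop(X_l, y_l, X_u, train, select_batch,
                         batch_size=50, n_iters=3, oracle=None):
    """Pool-based active learning: start from a small labeled set,
    iteratively move the batch_size highest-scoring unlabeled
    samples into the training set, and retrain the model."""
    model = train(X_l, y_l)                      # initial dictionary/classifier
    for _ in range(n_iters):                     # 3 iterations worked best here
        scores = select_batch(model, X_u, X_l)   # informativeness/representativeness
        picked = np.argsort(scores)[-batch_size:]
        X_new = X_u[picked]
        y_new = oracle(X_new)                    # human annotator labels the batch
        X_l = np.vstack([X_l, X_new])
        y_l = np.concatenate([y_l, y_new])
        X_u = np.delete(X_u, picked, axis=0)     # shrink the unlabeled pool
        model = train(X_l, y_l)                  # retrain on the expanded set
    return model
```

With n_iters = 3 this reproduces the 50 + 3 x 50 = 200 training samples at which the accuracy in Figure 7 peaks.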

4.3.2. Comparisons of ADDL with Other Methods. Here the proposed ADDL is compared with several other methods.

Figure 5: Weather recognition results over DATASET 2 by using different features. (a) K-NN; (b) SVM; (c) DPL. Each panel plots the average accuracy (%) against the number of training samples (100-2500) for the 6-feature and 8-feature sets.

The first two methods are the K-NN algorithm used by Song et al. [7] and the SVM with radial basis function kernel (RBF-SVM) adopted by Roser and Moosmann [5]. The third method is SRC [15], which directly uses all training samples as the dictionary for classification. To confirm that the active sample selection in our ADDL method is effective, the proposed ADDL is also compared with the original DPL method [19]. We further compare ADDL with the method proposed by Chen et al. [9]. As far as we know, the work in [9] is the only framework that addresses the same problem as our method, that is, recognizing different weather conditions (sunny, cloudy, and overcast) in images captured by a still camera; it actively selects useful samples to train an SVM for recognizing different weather conditions.
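For concreteness, the SRC baseline's decision rule can be sketched as follows. This is an illustrative implementation assuming scikit-learn's Lasso as the l1 solver; the solver choice and the alpha value are assumptions, not details from [15].

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(X_train, y_train, x, alpha=0.01):
    """SRC: use the raw training samples themselves as dictionary atoms
    and assign x to the class whose atoms reconstruct it best."""
    D = X_train.T / np.linalg.norm(X_train.T, axis=0, keepdims=True)  # d x n, unit-norm atoms
    code = Lasso(alpha=alpha, max_iter=5000).fit(D, x).coef_          # sparse code of x over D
    residuals = {}
    for c in np.unique(y_train):
        delta = np.where(y_train == c, code, 0.0)    # keep only class-c coefficients
        residuals[c] = np.linalg.norm(x - D @ delta)  # class-wise reconstruction error
    return min(residuals, key=residuals.get)          # smallest class residual wins
```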

For DATASET 1, 500 images are randomly selected as the training samples and the rest of the images are used as the testing samples. In the training dataset, 50 images are randomly chosen as the initial labeled training dataset D_l, and the remaining 450 images are regarded as the unlabeled dataset D_u. ADDL and Chen's method [9] both include the active learning procedure; thus they iteratively choose 150 samples from D_u to be labeled according to their own sample selection criteria and add these samples to D_l for further training of the classification model. For the K-NN, RBF-SVM, SRC, and DPL methods, which lack the active learning procedure, 150 samples are randomly selected from D_u to be labeled for expanding the labeled training dataset D_l. Table 1 lists the comparisons of our approach with these methods for weather classification.

Figure 6: Selection of parameters. The x-axis represents the different values of the parameters and the y-axis represents the average classification accuracy (%). (a) The average classification rate under different k; (b) the average classification rate under different λ; (c) the average classification rate under different τ.

Figure 7: Recognition accuracy on DATASET 1 versus the number of iterations (x-axis: iteration 0-9; y-axis: average accuracy, %).

Table 1: Comparisons on DATASET 1 among different methods.

Methods               Classification rate (mean ± std)
K-NN                  82.9 ± 1.2
RBF-SVM               85.6 ± 1.6
SRC                   89.7 ± 0.9
DPL                   91.3 ± 1.6
Chen's method [9]     92.98 ± 0.55
ADDL                  94.0 ± 0.2

As can be seen from Table 1, ADDL outperforms the other methods; the mean classification rate of ADDL reaches about 94%.

In DATASET 2, 2500 images are randomly selected as the training samples and the rest of the images are used as the testing samples. In the training dataset, 50 images are randomly chosen as the initial labeled training dataset D_l, and the remaining 2450 images are regarded as the unlabeled dataset D_u. All parameter settings for DATASET 2 are the same as for DATASET 1. Table 2 lists the recognition results of the different methods and indicates that the proposed ADDL outperforms the others.

Table 2: Comparison on DATASET 2 among different methods.

Methods               Classification rate (mean ± std)
K-NN                  85.0 ± 1.6
RBF-SVM               88.1 ± 1.4
SRC                   88.2 ± 1.1
DPL                   89.6 ± 0.8
Chen's method [9]     88.8 ± 1.6
ADDL                  90.4 ± 1.0
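The mean ± std figures in Tables 1 and 2 come from repeating the random train/test split; a minimal sketch of such an evaluation harness is given below. The number of repetitions is not stated in this excerpt, so n_runs = 10 is an assumption, and run_once stands in for any of the compared training/testing pipelines.

```python
import numpy as np

def evaluate(X, y, run_once, n_train, n_runs=10, seed=0):
    """Repeat the random train/test split and report mean ± std accuracy,
    matching the 'mean ± std' format of Tables 1 and 2."""
    rng = np.random.default_rng(seed)
    accs = []
    for _ in range(n_runs):
        idx = rng.permutation(len(y))
        tr, te = idx[:n_train], idx[n_train:]
        accs.append(run_once(X[tr], y[tr], X[te], y[te]))  # returns accuracy in [0, 1]
    return 100 * np.mean(accs), 100 * np.std(accs)
```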

From Tables 1 and 2, two points can be observed. First, the recognition performances of K-NN, RBF-SVM, SRC, and DPL are overall inferior to that of the proposed ADDL algorithm. This is probably because these four algorithms randomly select the unlabeled data from the given pool and do not consider whether the selected samples are beneficial for improving the performance of the classification model. Second, although the proposed ADDL and Chen's method [9] both include the active learning paradigm, the proposed ADDL performs better. This is because Chen's method [9] considers only the informativeness and ignores the representativeness of samples when selecting the unlabeled samples from the given pool.
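For intuition, a common way to combine these two criteria is a weighted per-sample score: an entropy term measures the informativeness of the model's class posterior, and a density term measures how representative a sample is of the unlabeled pool. The sketch below illustrates this generic idea only; the RBF similarity, the weight β, and the posterior input are assumptions, not the exact ADDL selection rule.

```python
import numpy as np
from scipy.spatial.distance import cdist

def selection_scores(P_class, X_u, beta=0.5):
    """Combine informativeness (entropy of predicted class posteriors,
    shape (n_u, n_classes)) with representativeness (average RBF
    similarity of each row of X_u, shape (n_u, d), to the rest of the pool)."""
    eps = 1e-12
    entropy = -np.sum(P_class * np.log(P_class + eps), axis=1)
    sim = np.exp(-cdist(X_u, X_u, "sqeuclidean"))   # pairwise RBF similarities
    representativeness = sim.mean(axis=1)
    return beta * entropy + (1.0 - beta) * representativeness  # label the top scorers
```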

5. Conclusions

We have presented an effective framework for classifying three types of weather (sunny, cloudy, and overcast) based on outdoor images. Through an analysis of the different visual manifestations of images caused by different weather, various features are extracted separately from the sky and nonsky areas of images, describing the visual appearance properties and physical characteristics of images under different weather conditions, respectively. The ADDL approach was proposed to learn a more discriminative dictionary, and thereby improve the weather classification performance, by selecting informative and representative unlabeled samples from a given pool to expand the training dataset. Since few image datasets for weather recognition exist, we have collected and labeled a new weather dataset for testing the proposed algorithm. The experimental results show that ADDL is an effective and inspiring strategy for weather classification, which can also be used in many other computer vision tasks.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This work is supported by the Fund of the Jilin Provincial Science & Technology Department (nos. 20130206042GX and 20140204089GX) and the National Natural Science Foundation of China (nos. 61403078, 11271064, and 61471111).

References

[1] F. Nashashibi, R. de Charette, and A. Lia, "Detection of unfocused raindrops on a windscreen using low level image processing," in Proceedings of the 11th International Conference on Control Automation Robotics and Vision (ICARCV '10), pp. 1410–1415, Singapore, December 2010.

[2] H. Kurihata, T. Takahashi, I. Ide et al., "Rainy weather recognition from in-vehicle camera images for driver assistance," in Proceedings of the IEEE Intelligent Vehicles Symposium, pp. 205–210, IEEE, Las Vegas, Nev, USA, June 2005.

[3] H. Woo, Y. M. Jung, J.-G. Kim, and J. K. Seo, "Environmentally robust motion detection for video surveillance," IEEE Transactions on Image Processing, vol. 19, no. 11, pp. 2838–2848, 2010.

[4] H. Katsura, J. Miura, M. Hild, and Y. Shirai, "A view-based outdoor navigation using object recognition robust to changes of weather and seasons," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '03), vol. 3, pp. 2974–2979, IEEE, Las Vegas, Nev, USA, October 2003.

[5] M. Roser and F. Moosmann, "Classification of weather situations on single color images," in Proceedings of the IEEE Intelligent Vehicles Symposium, pp. 798–803, IEEE, Eindhoven, The Netherlands, June 2008.

[6] X. Yan, Y. Luo, and X. Zheng, "Weather recognition based on images captured by vision system in vehicle," in Advances in Neural Networks – ISNN 2009, pp. 390–398, Springer, Berlin, Germany, 2009.

[7] H. Song, Y. Chen, and Y. Gao, "Weather condition recognition based on feature extraction and K-NN," in Foundations and Practical Applications of Cognitive Systems and Information Processing (CSIP 2012), Advances in Intelligent Systems and Computing, pp. 199–210, Springer, Berlin, Germany, 2014.

[8] C. Lu, D. Lin, J. Jia, and C.-K. Tang, "Two-class weather classification," in Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '14), pp. 3718–3725, Columbus, Ohio, USA, June 2014.

[9] Z. Chen, F. Yang, A. Lindner, G. Barrenetxea, and M. Vetterli, "How is the weather: automatic inference from images," in Proceedings of the 19th IEEE International Conference on Image Processing (ICIP '12), pp. 1853–1856, Orlando, Fla, USA, October 2012.

[10] K. He, J. Sun, and X. Tang, "Single image haze removal using dark channel prior," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 12, pp. 2341–2353, 2011.

[11] A. Li and H. Shouno, "Dictionary-based image denoising by fused-lasso atom selection," Mathematical Problems in Engineering, vol. 2014, Article ID 368602, 10 pages, 2014.

[12] Z. Feng, M. Yang, L. Zhang, Y. Liu, and D. Zhang, "Joint discriminative dimensionality reduction and dictionary learning for face recognition," Pattern Recognition, vol. 46, no. 8, pp. 2134–2143, 2013.

[13] J. Dong, C. Sun, and W. Yang, "A supervised dictionary learning and discriminative weighting model for action recognition," Neurocomputing, vol. 158, pp. 246–256, 2015.

[14] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation," IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4311–4322, 2006.

[15] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210–227, 2009.

[16] J. Mairal, F. Bach, and J. Ponce, "Task-driven dictionary learning," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 4, pp. 791–804, 2012.

[17] Z. Jiang, Z. Lin, and L. S. Davis, "Label consistent K-SVD: learning a discriminative dictionary for recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 11, pp. 2651–2664, 2013.

[18] M. Yang, L. Zhang, X. Feng, and D. Zhang, "Sparse representation based Fisher discrimination dictionary learning for image classification," International Journal of Computer Vision, vol. 109, no. 3, pp. 209–232, 2014.

[19] S. Gu, L. Zhang, W. Zuo, and X. Feng, "Projective dictionary pair learning for pattern classification," in Proceedings of the 28th Annual Conference on Neural Information Processing Systems (NIPS '14), pp. 793–801, December 2014.

[20] B. Settles, Active Learning Literature Survey, vol. 52, University of Wisconsin-Madison, Madison, Wis, USA, 2010.

[21] D. Cohn, L. Atlas, and R. Ladner, "Improving generalization with active learning," Machine Learning, vol. 15, no. 2, pp. 201–221, 1994.

[22] I. Dagan and S. P. Engelson, "Committee-based sampling for training probabilistic classifiers," in Proceedings of the 12th International Conference on Machine Learning, pp. 150–157, Morgan Kaufmann, Tahoe City, Calif, USA, 1995.

[23] D. Angluin, "Queries and concept learning," Machine Learning, vol. 2, no. 4, pp. 319–342, 1988.

[24] R. D. King, K. E. Whelan, F. M. Jones et al., "Functional genomic hypothesis generation and experimentation by a robot scientist," Nature, vol. 427, no. 6971, pp. 247–252, 2004.

[25] S. Chakraborty, V. Balasubramanian, and S. Panchanathan, "Dynamic batch mode active learning," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 2649–2656, Providence, RI, USA, June 2011.

[26] K. Nigam and A. McCallum, "Employing EM in pool-based active learning for text classification," in Proceedings of the 15th International Conference on Machine Learning (ICML '98), pp. 350–358, Morgan Kaufmann, Madison, Wis, USA, 1998.

[27] A. Kapoor, K. Grauman, R. Urtasun, and T. Darrell, "Active learning with Gaussian processes for object categorization," in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1–8, IEEE, Rio de Janeiro, Brazil, October 2007.

[28] S. J. Huang, R. Jin, and Z. H. Zhou, "Active learning by querying informative and representative examples," in Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS '10), pp. 892–900, 2010.

[29] P. Donmez, J. G. Carbonell, and P. N. Bennett, "Dual strategy active learning," in Machine Learning: ECML 2007, J. N. Kok, J. Koronacki, R. L. de Mantaras, S. Matwin, D. Mladenic, and A. Skowron, Eds., vol. 4701 of Lecture Notes in Computer Science, pp. 116–127, Springer, Berlin, Germany, 2007.

[30] S. C. H. Hoi, R. Jin, J. Zhu, and M. R. Lyu, "Semi-supervised SVM batch mode active learning for image retrieval," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–7, Anchorage, Alaska, USA, June 2008.

[31] Z. Xu, K. Yu, V. Tresp, X. Xu, and J. Wang, "Representative sampling for text classification using support vector machines," in Proceedings of the 25th European Conference on IR Research (ECIR '03), Springer, 2003.

[32] A. Bosch, A. Zisserman, and X. Munoz, "Image classification using random forests and ferns," in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1–8, IEEE, Rio de Janeiro, Brazil, October 2007.

[33] T. Ojala, M. Pietikainen, and D. Harwood, "A comparative study of texture measures with classification based on feature distributions," Pattern Recognition, vol. 29, no. 1, pp. 51–59, 1996.

[34] S. G. Narasimhan and S. K. Nayar, "Chromatic framework for vision in bad weather," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '00), pp. 598–605, Hilton Head Island, SC, USA, June 2000.

[35] S. G. Narasimhan and S. K. Nayar, "Removing weather effects from monochrome images," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. II-186–II-193, IEEE, Kauai, Hawaii, USA, December 2001.

[36] S. G. Narasimhan and S. K. Nayar, "Vision and the atmosphere," International Journal of Computer Vision, vol. 48, no. 3, pp. 233–254, 2002.

[37] G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, "Visual categorization with bags of keypoints," in Proceedings of the International Workshop on Statistical Learning in Computer Vision (ECCV '04), pp. 1–22, Prague, Czech Republic, 2004.

[38] D. D. Lewis and W. A. Gale, "A sequential algorithm for training text classifiers," in Proceedings of the Seventeenth Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval (SIGIR '94), pp. 3–12, Springer, Berlin, Germany, 1994.

[39] B. Settles and M. Craven, "An analysis of active learning strategies for sequence labeling tasks," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '08), pp. 1070–1079, Association for Computational Linguistics, 2008.

[40] G. Schohn and D. Cohn, "Less is more: active learning with support vector machines," in Proceedings of the 17th International Conference on Machine Learning (ICML '00), pp. 839–846, Morgan Kaufmann, 2000.

[41] S. Tong and D. Koller, "Support vector machine active learning with applications to text classification," The Journal of Machine Learning Research, vol. 2, pp. 45–66, 2002.

[42] M. Szummer and T. S. Jaakkola, "Information regularization with partially labeled data," in Advances in Neural Information Processing Systems, pp. 1025–1032, MIT Press, 2002.

[43] X. Li and Y. Guo, "Adaptive active learning for image classification," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '13), pp. 859–866, IEEE, Portland, Ore, USA, June 2013.

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 9: Research Article Active Discriminative Dictionary Learning

Mathematical Problems in Engineering 9

100 400 700 1000 1300 1600 1900 2200 250079

81

83

85

87

89

91

Number of training samples

Aver

age a

ccur

acy

()

6 features8 features

(a) K-NN

100 400 700 1000 1300 1600 1900 2200 250083

85

87

89

91

93

Number of training samples

Aver

age a

ccur

acy

()

6 features8 features

(b) SVM

100 400 700 1000 1300 1600 1900 2200 250086

88

90

92

94

Number of training samples

Aver

age a

ccur

acy

()

6 features8 features

(c) DPL

Figure 5 Weather recognition results over DATASET 2 by using different features

two methods are 119870-NN algorithm used by Song et al [7]and SVM with the radial basis function kernel (RBF-SVM)adopted by Roser and Moosmann [5] The third methodis SRC [15] which directly uses all training samples as thedictionary for classification In order to confirm that theactive samples selection in our ADDLmethod is effective theproposed ADDL is compared with the original DPL method[19] We also compare ADDL with the method proposedby Chen et al [9] As far as we know the work in [9] isthe only framework which addresses the same problem withour method that is recognizing different weather conditions(sunny cloudy and overcast) of images captured by a stillcamera It actively selected useful samples to training SVMfor recognizing different weather conditions

For DATASET 1 500 images are randomly selected as thetraining samples and the rest of images are used as the testingsamples In the training dataset 50 images are randomlychosen as the initial labeled training dataset D

119897 and the

remaining 450 images are regarded as the unlabeled datasetD119906 ADDL and Chenrsquos method [9] both include the active

learning procedure thus they iteratively choose 150 samplesfrom D

119906to be labeled based on their criterion of the samples

selection and add these samples toD119897for further training the

classification model For 119870-NN RBF-SVM SRC and DPLmethods which are without the active learning procedure150 samples are randomly selected from D

119906to be labeled

for expanding the labeled training dataset D119897 Table 1 lists

the comparisons of our approach with several methods for

10 Mathematical Problems in Engineering

5 15 25 35 45 55 65 75 85 95

90

92

94Av

erag

e acc

urac

y (

)

k

(a)

00005 0005 005 05 10 30 50 70 90

90

92

94

Aver

age a

ccur

acy

()

120582

(b)

0005 05 5 25 45 65 85

90

92

94

Aver

age a

ccur

acy

()

120591

(c)

Figure 6 Selection of parameters 119909-axis represents the different values of parameters and 119910-axis represents the average classificationaccuracy (a)The average classification rate under different k (b)The average classification rate under different 120582 (c)The average classificationrate under different 120591

0 1 2 3 4 5 6 7 8 98284868890929496

Iteration

Aver

age a

ccur

acy

()

Figure 7 Recognition accuracy on DATASET 1 versus the numberof iterations

Table 1 Comparisons on DATASET 1 among different methods

Methods Classification rate (mean plusmn std)119870-NN 829plusmn 12RBF-SVM 856plusmn 16SRC 897plusmn 09DPL 913plusmn 16Chenrsquos method [9] 9298plusmn 055ADDL 940plusmn 02

weather classification As can be seen from Table 1 ADDLoutperforms other methods The mean classification rate ofADDL reaches about 94

In DATASET 2 2500 images are randomly selected asthe training samples and the rest of images are used as

Table 2 Comparison on DATASET 2 among different methods

Methods Classification rate (mean plusmn std)119870-NN 850plusmn 16RBF-SVM 881plusmn 14SRC 882plusmn 11DPL 896plusmn 08Chenrsquos method [9] 888plusmn 16ADDL 904plusmn 10

the testing samples In the training dataset 50 images arerandomly chosen as the initial labeled training datasetD

119897 the

remaining 2450 images are regarded as the unlabeled datasetD119897 All parameters setting for DATASET 2 are the same as

DATASET 1 Table 2 lists the recognition results of differentmethods which indicates that the validity of the proposedADDL is better than other methods

FromTables 1 and 2 two points can be observed First wecan find that the recognition performances of 119870-NN RBF-SVM SRC and DPL are overall inferior to the proposedADDL algorithm This is probably because these four algo-rithms randomly select the unlabeled data from the givenpool which do not consider whether the selected samples arebeneficial for improving the performance of the classificationmodel or not Second although the proposed ADDL andChenrsquosmethod [9] both include the active learning paradigmthe proposed ADDL performs better than Chenrsquos method[9] This is due to the fact that Chenrsquos method [9] onlyconsiders the informativeness and ignores representativeness

Mathematical Problems in Engineering 11

of samples when selecting the unlabeled samples from thegiven pool

5 Conclusions

We have presented an effective framework for classifyingthree types of weather (sunny cloudy and overcast) basedon the outdoor images Through the analysis of the differentvisual manifestations of images caused by different weathersthe various features are separately extracted from the sky areaand nonsky area of images which describes visual appear-ance properties and physical characteristics of images underdifferent weather conditions respectively ADDL approachwas proposed to learn a more discriminative dictionary forimproving the weather classification performance by select-ing the informative and representative unlabeled samplesfrom a given pool to expand the training dataset Sincethere is not much image dataset on weather recognition wehave collected and labeled a new weather dataset for testingthe proposed algorithm The experimental results show thatADDL is a fairly effective and inspiring strategy for weatherclassification which also can be used inmany other computervision tasks

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

This work is supported by Fund of Jilin Provincial Sci-ence amp Technology Department (nos 20130206042GX and20140204089GX) and National Natural Science Foundationof China (nos 61403078 11271064 and 61471111)

References

[1] F Nashashibi R de Charette and A Lia ldquoDetection ofunfocused raindrops on a windscreen using low level imageprocessingrdquo in Proceedings of the 11th International Conferenceon Control Automation Robotics and Vision (ICARCV rsquo10) pp1410ndash1415 Singapore December 2010

[2] H Kurihata T Takahashi I Ide et al ldquoRainy weather recog-nition from in-vehicle camera images for driver assistancerdquo inProceedings of the IEEE Intelligent Vehicles Symposium pp 205ndash210 IEEE Las Vegas Nev USA June 2005

[3] H Woo Y M Jung J-G Kim and J K Seo ldquoEnvironmentallyrobust motion detection for video surveillancerdquo IEEE Transac-tions on Image Processing vol 19 no 11 pp 2838ndash2848 2010

[4] H Katsura J Miura M Hild and Y Shirai ldquoA view-basedoutdoor navigation using object recognition robust to changesof weather and seasonsrdquo in Proceedings of the IEEERSJ Interna-tional Conference on Intelligent Robots and Systems (IROS rsquo03)vol 3 pp 2974ndash2979 IEEE Las Vegas Nev USAOctober 2003

[5] M Roser and F Moosmann ldquoClassification of weather sit-uations on single color imagesrdquo in Proceedings of the IEEEIntelligent Vehicles Symposium pp 798ndash803 IEEE EindhovenThe Netherlands June 2008

[6] X Yan Y Luo and X Zheng ldquoWeather recognition based onimages captured by vision system in vehiclerdquo in Advances in

Neural NetworksmdashISNN 2009 pp 390ndash398 Springer BerlinGermany 2009

[7] H Song Y Chen and Y Gao ldquoWeather condition recognitionbased on feature extraction and K-NNrdquo in Foundations andPractical Applications of Cognitive Systems and Information Pro-cessing Proceedings of the First International Conference on Cog-nitive Systems and Information Processing Beijing China Dec2012 (CSIP2012) Advances in Intelligent Systems and Comput-ing pp 199ndash210 Springer Berlin Germany 2014

[8] C Lu D Lin J Jia and C-K Tang ldquoTwo-class weather class-ificationrdquo in Proceedings of the 27th IEEE Conference on Com-puter Vision and Pattern Recognition (CVPR rsquo14) pp 3718ndash3725Columbus Ohio USA June 2014

[9] Z Chen F Yang A Lindner G Barrenetxea and M VetterlildquoHow is the weather automatic inference from imagesrdquo inProceedings of the 19th IEEE International Conference on ImageProcessing (ICIP rsquo12) pp 1853ndash1856 Orlando Fla USAOctober2012

[10] K He J Sun and X Tang ldquoSingle image haze removal usingdark channel priorrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 33 no 12 pp 2341ndash2353 2011

[11] A Li and H Shouno ldquoDictionary-based image denoising byfused-lasso atom selectionrdquo Mathematical Problems in Engi-neering vol 2014 Article ID 368602 10 pages 2014

[12] Z Feng M Yang L Zhang Y Liu and D Zhang ldquoJoint dis-criminative dimensionality reduction and dictionary learningfor face recognitionrdquo Pattern Recognition vol 46 no 8 pp2134ndash2143 2013

[13] J Dong C Sun andW Yang ldquoA supervised dictionary learningand discriminative weighting model for action recognitionrdquoNeurocomputing vol 158 pp 246ndash256 2015

[14] M Aharon M Elad and A Bruckstein ldquoK-SVD an algorithmfor designing overcomplete dictionaries for sparse representa-tionrdquo IEEE Transactions on Signal Processing vol 54 no 11 pp4311ndash4322 2006

[15] JWright A Y Yang A Ganesh S S Sastry and YMa ldquoRobustface recognition via sparse representationrdquo IEEE Transactionson Pattern Analysis and Machine Intelligence vol 31 no 2 pp210ndash227 2009

[16] J Mairal F Bach and J Ponce ldquoTask-driven dictionarylearningrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 34 no 4 pp 791ndash804 2012

[17] Z Jiang Z Lin and L S Davis ldquoLabel consistent K-SVD learn-ing a discriminative dictionary for recognitionrdquo IEEE Transac-tions on Pattern Analysis and Machine Intelligence vol 35 no11 pp 2651ndash2664 2013

[18] M Yang L Zhang X Feng and D Zhang ldquoSparse representa-tion based fisher discrimination dictionary learning for imageclassificationrdquo International Journal of Computer Vision vol109 no 3 pp 209ndash232 2014

[19] S Gu L Zhang W Zuo and X Feng ldquoProjective dictionarypair learning for pattern classificationrdquo inProceedings of the 28thAnnual Conference on Neural Information Processing Systems(NIPS rsquo14) pp 793ndash801 December 2014

[20] B Settles Active Learning Literature Survey vol 52 Universityof Wisconsin-Madison Madison Wis USA 2010

[21] D Cohn L Atlas and R Ladner ldquoImproving generalizationwith active learningrdquo Machine Learning vol 15 no 2 pp 201ndash221 1994

[22] I Dagan and S P Engelson ldquoCommittee-based sampling fortraining probabilistic classifiersrdquo in Proceedings of the 12th

12 Mathematical Problems in Engineering

International Conference on Machine Learning pp 150ndash157Morgan Kaufmann Tahoe City Calif USA 1995

[23] D Angluin ldquoQueries and concept learningrdquoMachine Learningvol 2 no 4 pp 319ndash342 1988

[24] R D King K E Whelan F M Jones et al ldquoFunctional geno-mic hypothesis generation and experimentation by a robot sci-entistrdquo Nature vol 427 no 6971 pp 247ndash252 2004

[25] S Chakraborty V Balasubramanian and S PanchanathanldquoDynamic batch mode active learningrdquo in Proceedings of theIEEE Conference on Computer Vision and Pattern Recognition(CVPR rsquo11) pp 2649ndash2656 Providence RI USA June 2011

[26] K Nigam and A McCallum ldquoEmploying EM in pool-basedactive learning for text classificationrdquo in Proceedings of the 15thInternational Conference on Machine Learning (ICML rsquo98) pp350ndash358 Morgan Kaufmann Madison Wis USA 1998

[27] A Kapoor K Grauman R Urtasun and T Darrell ldquoActivelearning with Gaussian processes for object categorizationrdquo inProceedings of the IEEE 11th International Conference on Com-puter Vision (ICCV rsquo07) pp 1ndash8 IEEE Rio de Janeiro BrazilOctober 2007

[28] S J Huang R Jin and Z H Zhou ldquoActive learning by queryinginformative and representative examplesrdquo in Proceedings of theAnnual Conference on Neural Information Processing Systems(NIPS rsquo10) pp 892ndash900 2010

[29] P Donmez J G Carbonell and P N Bennett ldquoDual strategyactive learningrdquo inMachine Learning ECML 2007 J N Kok JKoronacki R L de Mantaras S Matwin D Mladenic and ASkowron Eds vol 4701 of Lecture Notes in Computer Sciencepp 116ndash127 Springer Berlin Germany 2007

[30] S C H Hoi R Jin J Zhu and M R Lyu ldquoSemi-supervisedSVM batch mode active learning for image retrievalrdquo in Pro-ceedings of the 26th IEEE Conference on Computer Vision andPattern Recognition (CVPR rsquo08) pp 1ndash7 Anchorage AlaskaUSA June 2008

[31] Z Xu K Yu V Tresp X Xu and J Wang ldquoRepresentativesampling for text classification using support vector machinesrdquoin Proceedings of the 25th European Conference on IR Research(ECIR rsquo03) Springer 2003

[32] A Bosch A Zisserman and X Munoz ldquoImage classificationusing random forests and fernsrdquo in Proceedings of the IEEE 11thInternational Conference on Computer Vision (ICCV rsquo07) pp 1ndash8 IEEE Rio de Janeiro Brazil October 2007

[33] T Ojala M Pietikainen and D Harwood ldquoA comparativestudy of texture measures with classification based on featuredistributionsrdquo Pattern Recognition vol 29 no 1 pp 51ndash59 1996

[34] S G Narasimhan and S K Nayar ldquoChromatic framework forvision in bad weatherrdquo in Proceedings of the IEEE Conference onComputer Vision and Pattern Recognition (CVPR rsquo00) pp 598ndash605 Hilton Head Island SC USA June 2000

[35] S G Narasimhan and S K Nayar ldquoRemoving weather effectsfrom monochrome imagesrdquo in Proceedings of the IEEE Com-puter Society Conference on Computer Vision and Pattern Recog-nition vol 2 pp II-186ndashII-193 IEEE Kauai Hawaii USADecember 2001

[36] S G Narasimhan and S K Nayar ldquoVision and the atmosphererdquoInternational Journal of Computer Vision vol 48 no 3 pp 233ndash254 2002

[37] G Csurka C Dance L Fan J Willamowski and C BrayldquoVisual categorization with bags of keypointsrdquo in Proceedings ofthe International Workshop on Statistical Learning in ComputerVision (ECCV rsquo04) pp 1ndash22 Prague Czech Republic 2004

[38] D D Lewis andW A Gale ldquoA sequential algorithm for trainingtext classifiersrdquo in SIGIR rsquo94 Proceedings of the SeventeenthAnnual International ACM-SIGIR Conference on Research andDevelopment in Information Retrieval organised by Dublin CityUniversity pp 3ndash12 Springer Berlin Germany 1994

[39] B Settles and M Craven ldquoAn analysis of active learning strate-gies for sequence labeling tasksrdquo inProceedings of the Conferenceon Empirical Methods in Natural Language Processing (EMNLPrsquo08) pp 1070ndash1079 Association for Computational Linguistics2008

[40] G Schohn andD Cohn ldquoLess ismore active learning with sup-port vector machinesrdquo in Proceedings of the 17th InternationalConference on Machine Learning (ICML rsquo00) pp 839ndash846Morgan Kaufmann 2000

[41] S Tong and D Koller ldquoSupport vector machine active learningwith applications to text classificationrdquo The Journal of MachineLearning Research vol 2 pp 45ndash66 2002

[42] M Szummer and T S Jaakkola ldquoInformation regularizationwith partially labeled datardquo in Advances in Neural InformationProcessing Systems pp 1025ndash1032 MIT Press 2002

[43] X Li and Y Guo ldquoAdaptive active learning for image classifica-tionrdquo in Proceedings of the 26th IEEE Conference on ComputerVision and Pattern Recognition (CVPR rsquo13) pp 859ndash866 IEEEPortland Ore USA June 2013

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 10: Research Article Active Discriminative Dictionary Learning

10 Mathematical Problems in Engineering

5 15 25 35 45 55 65 75 85 95

90

92

94Av

erag

e acc

urac

y (

)

k

(a)

00005 0005 005 05 10 30 50 70 90

90

92

94

Aver

age a

ccur

acy

()

120582

(b)

0005 05 5 25 45 65 85

90

92

94

Aver

age a

ccur

acy

()

120591

(c)

Figure 6 Selection of parameters 119909-axis represents the different values of parameters and 119910-axis represents the average classificationaccuracy (a)The average classification rate under different k (b)The average classification rate under different 120582 (c)The average classificationrate under different 120591

0 1 2 3 4 5 6 7 8 98284868890929496

Iteration

Aver

age a

ccur

acy

()

Figure 7 Recognition accuracy on DATASET 1 versus the numberof iterations

Table 1 Comparisons on DATASET 1 among different methods

Methods Classification rate (mean plusmn std)119870-NN 829plusmn 12RBF-SVM 856plusmn 16SRC 897plusmn 09DPL 913plusmn 16Chenrsquos method [9] 9298plusmn 055ADDL 940plusmn 02

weather classification As can be seen from Table 1 ADDLoutperforms other methods The mean classification rate ofADDL reaches about 94

In DATASET 2 2500 images are randomly selected asthe training samples and the rest of images are used as

Table 2 Comparison on DATASET 2 among different methods

Methods Classification rate (mean plusmn std)119870-NN 850plusmn 16RBF-SVM 881plusmn 14SRC 882plusmn 11DPL 896plusmn 08Chenrsquos method [9] 888plusmn 16ADDL 904plusmn 10

the testing samples In the training dataset 50 images arerandomly chosen as the initial labeled training datasetD

119897 the

remaining 2450 images are regarded as the unlabeled datasetD119897 All parameters setting for DATASET 2 are the same as

DATASET 1 Table 2 lists the recognition results of differentmethods which indicates that the validity of the proposedADDL is better than other methods

FromTables 1 and 2 two points can be observed First wecan find that the recognition performances of 119870-NN RBF-SVM SRC and DPL are overall inferior to the proposedADDL algorithm This is probably because these four algo-rithms randomly select the unlabeled data from the givenpool which do not consider whether the selected samples arebeneficial for improving the performance of the classificationmodel or not Second although the proposed ADDL andChenrsquosmethod [9] both include the active learning paradigmthe proposed ADDL performs better than Chenrsquos method[9] This is due to the fact that Chenrsquos method [9] onlyconsiders the informativeness and ignores representativeness

Mathematical Problems in Engineering 11

of samples when selecting the unlabeled samples from thegiven pool

5 Conclusions

We have presented an effective framework for classifyingthree types of weather (sunny cloudy and overcast) basedon the outdoor images Through the analysis of the differentvisual manifestations of images caused by different weathersthe various features are separately extracted from the sky areaand nonsky area of images which describes visual appear-ance properties and physical characteristics of images underdifferent weather conditions respectively ADDL approachwas proposed to learn a more discriminative dictionary forimproving the weather classification performance by select-ing the informative and representative unlabeled samplesfrom a given pool to expand the training dataset Sincethere is not much image dataset on weather recognition wehave collected and labeled a new weather dataset for testingthe proposed algorithm The experimental results show thatADDL is a fairly effective and inspiring strategy for weatherclassification which also can be used inmany other computervision tasks

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

This work is supported by Fund of Jilin Provincial Sci-ence amp Technology Department (nos 20130206042GX and20140204089GX) and National Natural Science Foundationof China (nos 61403078 11271064 and 61471111)

References

[1] F Nashashibi R de Charette and A Lia ldquoDetection ofunfocused raindrops on a windscreen using low level imageprocessingrdquo in Proceedings of the 11th International Conferenceon Control Automation Robotics and Vision (ICARCV rsquo10) pp1410ndash1415 Singapore December 2010

[2] H Kurihata T Takahashi I Ide et al ldquoRainy weather recog-nition from in-vehicle camera images for driver assistancerdquo inProceedings of the IEEE Intelligent Vehicles Symposium pp 205ndash210 IEEE Las Vegas Nev USA June 2005

[3] H Woo Y M Jung J-G Kim and J K Seo ldquoEnvironmentallyrobust motion detection for video surveillancerdquo IEEE Transac-tions on Image Processing vol 19 no 11 pp 2838ndash2848 2010

[4] H Katsura J Miura M Hild and Y Shirai ldquoA view-basedoutdoor navigation using object recognition robust to changesof weather and seasonsrdquo in Proceedings of the IEEERSJ Interna-tional Conference on Intelligent Robots and Systems (IROS rsquo03)vol 3 pp 2974ndash2979 IEEE Las Vegas Nev USAOctober 2003

[5] M Roser and F Moosmann ldquoClassification of weather sit-uations on single color imagesrdquo in Proceedings of the IEEEIntelligent Vehicles Symposium pp 798ndash803 IEEE EindhovenThe Netherlands June 2008

[6] X Yan Y Luo and X Zheng ldquoWeather recognition based onimages captured by vision system in vehiclerdquo in Advances in

Neural NetworksmdashISNN 2009 pp 390ndash398 Springer BerlinGermany 2009

[7] H Song Y Chen and Y Gao ldquoWeather condition recognitionbased on feature extraction and K-NNrdquo in Foundations andPractical Applications of Cognitive Systems and Information Pro-cessing Proceedings of the First International Conference on Cog-nitive Systems and Information Processing Beijing China Dec2012 (CSIP2012) Advances in Intelligent Systems and Comput-ing pp 199ndash210 Springer Berlin Germany 2014

[8] C Lu D Lin J Jia and C-K Tang ldquoTwo-class weather class-ificationrdquo in Proceedings of the 27th IEEE Conference on Com-puter Vision and Pattern Recognition (CVPR rsquo14) pp 3718ndash3725Columbus Ohio USA June 2014

[9] Z Chen F Yang A Lindner G Barrenetxea and M VetterlildquoHow is the weather automatic inference from imagesrdquo inProceedings of the 19th IEEE International Conference on ImageProcessing (ICIP rsquo12) pp 1853ndash1856 Orlando Fla USAOctober2012

[10] K He J Sun and X Tang ldquoSingle image haze removal usingdark channel priorrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 33 no 12 pp 2341ndash2353 2011

[11] A Li and H Shouno ldquoDictionary-based image denoising byfused-lasso atom selectionrdquo Mathematical Problems in Engi-neering vol 2014 Article ID 368602 10 pages 2014

[12] Z Feng M Yang L Zhang Y Liu and D Zhang ldquoJoint dis-criminative dimensionality reduction and dictionary learningfor face recognitionrdquo Pattern Recognition vol 46 no 8 pp2134ndash2143 2013

[13] J Dong C Sun andW Yang ldquoA supervised dictionary learningand discriminative weighting model for action recognitionrdquoNeurocomputing vol 158 pp 246ndash256 2015

[14] M Aharon M Elad and A Bruckstein ldquoK-SVD an algorithmfor designing overcomplete dictionaries for sparse representa-tionrdquo IEEE Transactions on Signal Processing vol 54 no 11 pp4311ndash4322 2006

[15] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210–227, 2009.

[16] J. Mairal, F. Bach, and J. Ponce, "Task-driven dictionary learning," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 4, pp. 791–804, 2012.

[17] Z. Jiang, Z. Lin, and L. S. Davis, "Label consistent K-SVD: learning a discriminative dictionary for recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 11, pp. 2651–2664, 2013.

[18] M. Yang, L. Zhang, X. Feng, and D. Zhang, "Sparse representation based Fisher discrimination dictionary learning for image classification," International Journal of Computer Vision, vol. 109, no. 3, pp. 209–232, 2014.

[19] S. Gu, L. Zhang, W. Zuo, and X. Feng, "Projective dictionary pair learning for pattern classification," in Proceedings of the 28th Annual Conference on Neural Information Processing Systems (NIPS '14), pp. 793–801, December 2014.

[20] B. Settles, Active Learning Literature Survey, vol. 52, University of Wisconsin-Madison, Madison, Wis, USA, 2010.

[21] D. Cohn, L. Atlas, and R. Ladner, "Improving generalization with active learning," Machine Learning, vol. 15, no. 2, pp. 201–221, 1994.

[22] I. Dagan and S. P. Engelson, "Committee-based sampling for training probabilistic classifiers," in Proceedings of the 12th International Conference on Machine Learning, pp. 150–157, Morgan Kaufmann, Tahoe City, Calif, USA, 1995.

[23] D. Angluin, "Queries and concept learning," Machine Learning, vol. 2, no. 4, pp. 319–342, 1988.

[24] R. D. King, K. E. Whelan, F. M. Jones et al., "Functional genomic hypothesis generation and experimentation by a robot scientist," Nature, vol. 427, no. 6971, pp. 247–252, 2004.

[25] S. Chakraborty, V. Balasubramanian, and S. Panchanathan, "Dynamic batch mode active learning," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 2649–2656, Providence, RI, USA, June 2011.

[26] K. Nigam and A. McCallum, "Employing EM in pool-based active learning for text classification," in Proceedings of the 15th International Conference on Machine Learning (ICML '98), pp. 350–358, Morgan Kaufmann, Madison, Wis, USA, 1998.

[27] A. Kapoor, K. Grauman, R. Urtasun, and T. Darrell, "Active learning with Gaussian processes for object categorization," in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1–8, IEEE, Rio de Janeiro, Brazil, October 2007.

[28] S. J. Huang, R. Jin, and Z. H. Zhou, "Active learning by querying informative and representative examples," in Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS '10), pp. 892–900, 2010.

[29] P. Donmez, J. G. Carbonell, and P. N. Bennett, "Dual strategy active learning," in Machine Learning: ECML 2007, J. N. Kok, J. Koronacki, R. L. de Mantaras, S. Matwin, D. Mladenic, and A. Skowron, Eds., vol. 4701 of Lecture Notes in Computer Science, pp. 116–127, Springer, Berlin, Germany, 2007.

[30] S. C. H. Hoi, R. Jin, J. Zhu, and M. R. Lyu, "Semi-supervised SVM batch mode active learning for image retrieval," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–7, Anchorage, Alaska, USA, June 2008.

[31] Z. Xu, K. Yu, V. Tresp, X. Xu, and J. Wang, "Representative sampling for text classification using support vector machines," in Proceedings of the 25th European Conference on IR Research (ECIR '03), Springer, 2003.

[32] A. Bosch, A. Zisserman, and X. Munoz, "Image classification using random forests and ferns," in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1–8, IEEE, Rio de Janeiro, Brazil, October 2007.

[33] T. Ojala, M. Pietikainen, and D. Harwood, "A comparative study of texture measures with classification based on feature distributions," Pattern Recognition, vol. 29, no. 1, pp. 51–59, 1996.

[34] S. G. Narasimhan and S. K. Nayar, "Chromatic framework for vision in bad weather," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '00), pp. 598–605, Hilton Head Island, SC, USA, June 2000.

[35] S. G. Narasimhan and S. K. Nayar, "Removing weather effects from monochrome images," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. II-186–II-193, IEEE, Kauai, Hawaii, USA, December 2001.

[36] S. G. Narasimhan and S. K. Nayar, "Vision and the atmosphere," International Journal of Computer Vision, vol. 48, no. 3, pp. 233–254, 2002.

[37] G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, "Visual categorization with bags of keypoints," in Proceedings of the International Workshop on Statistical Learning in Computer Vision (ECCV '04), pp. 1–22, Prague, Czech Republic, 2004.

[38] D. D. Lewis and W. A. Gale, "A sequential algorithm for training text classifiers," in SIGIR '94: Proceedings of the Seventeenth Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, organised by Dublin City University, pp. 3–12, Springer, Berlin, Germany, 1994.

[39] B. Settles and M. Craven, "An analysis of active learning strategies for sequence labeling tasks," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '08), pp. 1070–1079, Association for Computational Linguistics, 2008.

[40] G. Schohn and D. Cohn, "Less is more: active learning with support vector machines," in Proceedings of the 17th International Conference on Machine Learning (ICML '00), pp. 839–846, Morgan Kaufmann, 2000.

[41] S. Tong and D. Koller, "Support vector machine active learning with applications to text classification," The Journal of Machine Learning Research, vol. 2, pp. 45–66, 2002.

[42] M. Szummer and T. S. Jaakkola, "Information regularization with partially labeled data," in Advances in Neural Information Processing Systems, pp. 1025–1032, MIT Press, 2002.

[43] X. Li and Y. Guo, "Adaptive active learning for image classification," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '13), pp. 859–866, IEEE, Portland, Ore, USA, June 2013.


