Applied Soft Computing 10 (2010) 806–812

Honey Bees Mating Optimization algorithm for financial classification problems

Magdalene Marinaki a, Yannis Marinakis b, Constantin Zopounidis c,*

a Technical University of Crete, Department of Production Engineering and Management, Industrial Systems Control Laboratory, 73100 Chania, Greece

b Technical University of Crete, Department of Production Engineering and Management, Decision Support Systems Laboratory, 73100 Chania, Greece

c Technical University of Crete, Department of Production Engineering and Management, Financial Engineering Laboratory, University Campus, 73100 Chania, Greece

* Corresponding author. Tel.: +30 28210 37236; fax: +30 28210 69410.

E-mail addresses: [email protected] (M. Marinaki), [email protected] (Y. Marinakis), [email protected] (C. Zopounidis).

A R T I C L E I N F O

Article history:

Received 2 December 2008

Received in revised form 8 September 2009

Accepted 10 September 2009

Available online 17 September 2009

Keywords:

Honey Bees Mating Optimization

Metaheuristics

Feature selection problem

Financial classification problem

A B S T R A C T

Nature inspired methods are approaches that are used in various fields and for the solution of a number of problems. This study uses a nature inspired method, namely Honey Bees Mating Optimization, that is based on the mating behaviour of honey bees, for a financial classification problem. Financial decisions are often based on classification models which are used to assign a set of observations into predefined groups. One important step towards the development of accurate financial classification models involves the selection of the appropriate independent variables (features) which are relevant for the problem at hand. The proposed method uses the Honey Bees Mating Optimization algorithm for the feature selection step, while for the classification step Nearest Neighbor based classifiers are used. The performance of the method is tested in a financial classification task involving credit risk assessment. The results of the proposed method are compared with the results of a particle swarm optimization algorithm, an ant colony optimization, a genetic algorithm and a tabu search algorithm.

© 2009 Elsevier B.V. All rights reserved.

doi:10.1016/j.asoc.2009.09.010

1. Introduction

Several biological and natural processes have increasingly influenced methodologies in science and technology in recent years. Feedback control processes, artificial neurons, the description of the DNA molecule and related genomics matters, studies of the behaviour of natural immunological systems, and more, represent some of the very successful domains of this kind in a variety of real world applications. During the last decade, nature inspired intelligence has become increasingly popular through the development and utilization of intelligent paradigms in advanced information systems design. Cross-disciplinary team-based thinking attempts to cross-fertilize engineering and life science understanding into advanced inter-operable systems. The methods contribute to technological advances driven by concepts from nature/biology, including advances in structural genomics (intelligent drug design through imprecise databases), mapping of genes to proteins and proteins to genes (one-to-many and many-to-one characteristics of naturally occurring organisms), modelling of complete cell structures (showing modularity and hierarchy), functional genomics (handling of hybrid sources and heterogeneous and inconsistent origins of disparate databases), self-organization of natural systems, etc. Among the most popular nature inspired approaches, when the task is optimization within complex domains of data or information, are those methods representing successful animal and micro-organism team behaviour, such as swarm or flocking intelligence (bird flocks or fish schools inspired particle swarm optimization [1]), artificial immune systems (that mimic the biological one [2,3]), or ant colonies (ants' foraging behaviour gave rise to ant colony optimization [4,5]), etc.

In the last few years a number of swarm intelligence algorithms based on the behaviour of bees have been presented [6]. These algorithms are divided mainly into two categories according to the behaviour in nature that they simulate: the foraging behaviour and the mating behaviour. The most important approaches that simulate the foraging behaviour of bees are the Artificial Bee Colony (ABC) algorithm proposed by Karaboga and Basturk [7,8], the Virtual Bee algorithm proposed by Yang [9], the Bee Colony Optimization algorithm proposed by Teodorovic and Dell'Orco [10], the BeeHive algorithm proposed by Wedde et al. [11], the Bee Swarm Optimization algorithm proposed by Drias et al. [12] and the Bees algorithm proposed by Pham et al. [13]. The Artificial Bee Colony algorithm [7,8] is mainly applied to continuous optimization problems and simulates the waggle dance behaviour that a swarm of bees performs during the foraging process. In this algorithm there are three groups of bees: the employed bees (bees that determine the food source (possible solutions) from a prespecified set of food sources and share this information (waggle dance) with the other bees in the hive), the onlooker bees (bees that, based on the information they take from the employed bees, search for a better food source in the neighborhood of the memorized food sources) and the scout bees (employed bees whose food source has been abandoned and that search for a new food source randomly). The Virtual Bee algorithm [9] is also applied to continuous optimization problems. In this algorithm, the population of bees is associated with a memory and a food source, and then all the memories communicate with each other through a waggle dance procedure. The whole procedure is similar to a genetic algorithm and it has been applied to two function optimization problems with two parameters. In the BeeHive algorithm [11], a protocol inspired by the dance language and foraging behaviour of honey bees is used. In the Bees Swarm Optimization [12], a bee initially finds an initial solution (food source) and from this solution the other solutions are produced with certain strategies. Then, every bee is assigned to a solution and, when the bees have completed their search, they communicate with each other using a waggle dance strategy and the best solution becomes the new reference solution. To avoid cycling the authors use a tabu list. In the Bees algorithm [13], a population of initial solutions (food sources) is randomly generated. Then, the bees are assigned to the solutions based on their fitness function. The bees return to the hive and, based on their food sources, a number of bees are assigned to the same food source in order to find a better neighborhood solution. In the Bee Colony Optimization algorithm [10], a step by step solution is produced by each forager bee and when the foragers return to the hive a waggle dance is performed by each forager. Then the other bees, based on a probability, follow the foragers. This algorithm resembles the Ant Colony Optimization algorithm [5] but it does not use the concept of pheromone trails at all.

Contrary to the fact that there are many algorithms based on the foraging behaviour of bees, the main algorithm proposed based on the marriage behaviour is the Honey Bees Mating Optimization (HBMO) algorithm, which was presented in [14,15]. Since then, it has been used in a number of different applications [16–18]. The HBMO algorithm simulates the mating process of the queen of the hive. The mating process begins when the queen flies away from the nest performing the mating flight, during which the drones follow the queen and mate with her in the air. The algorithm is a swarm intelligence algorithm since it uses a swarm of bees where there are three kinds of bees: the queen, the drones and the workers. There are a number of procedures that can be applied inside the swarm. In the HBMO algorithm, the procedure of mating of the queen with the drones is described. From this point of view one could classify the HBMO algorithm as a memetic algorithm, since we have an elitist genetic algorithm where the queen plays the role of super-parent. However, a number of details differentiate the HBMO algorithm from a simple memetic algorithm. First, the queen flies randomly in the air and, based on her speed and her energy, if she meets a drone then there is a possibility to mate with him. Even if the queen mates with the drone, she does not directly create a brood but stores the genotype of the drone (by the term "genotype" we mean some of the basic characteristics of the drone, i.e. part of the solution) in her spermatheca, and the brood is created only when the mating flight has been completed. Another difference of the proposed algorithm from a memetic algorithm is that the broods are not created by using one queen and one drone; each brood uses parts of the solutions (genotypes) of the one queen and of more than one drone. In our proposed algorithm, in addition to this classic procedure that has been used by researchers in the previously published algorithms based on Honey Bees Mating Optimization [14–18], we also use an adaptive memory procedure so that the queen can store parts of the solutions of good drones selected in previous mating flights, in order to use them in a new mating flight and to produce fitter drones. Another difference from a classic memetic algorithm is the role of the workers. One could say that, since the role of the workers is simply brood care and they are only a local search phase in the algorithm, we have one of the basic characteristics of memetic algorithms (a genetic algorithm with a local search phase [19]). But here we have a strict parallelism of the local search phase with what happens in real life, i.e. with the foraging behaviour of the honey bees: each of the workers, which are different honey bees, takes care of one brood, finding food for it and feeding it with royal jelly in order to make it fitter, and if this brood is better than the queen it takes her place. Thus, this algorithm combines both the mating process of the queen and one part of the foraging behaviour of the honey bees inside the hive.

This paper presents a novel approach to solve the Feature Subset Selection Problem using the Honey Bees Mating Optimization algorithm. In the classification phase of the proposed algorithm, a number of variants of the Nearest Neighbor classification method are used [20]. The algorithm is used for the credit risk assessment classification task, which is a very challenging and important management science problem in the domain of financial analysis [21]. Modern finance is a broad field often involved with hard decision-making problems related to risk management. In several cases, financial decision-making problems require the assignment of the available options into predefined groups/classes. Credit risk analysis, bankruptcy prediction, and country risk assessment, among others, are some typical examples [22]. In this context the development of reliable classification models is clearly of major importance to researchers and practitioners. The development of financial classification models is a complicated process, involving careful data collection and pre-processing, model development, validation and implementation. Focusing on model development, several methods have been used, including statistical methods, artificial intelligence techniques and operations research methodologies. In all cases, the quality of the data is a fundamental point. This is mainly related to the adequacy of the sample data in terms of the number of observations and the relevance of the decision attributes (i.e., independent variables) used in the analysis. In recent years, applications of nature inspired methods to financial problems have been developed [23–28]. More precisely, in [24] a procedure that utilizes a genetic algorithm in order to solve the Feature Subset Selection Problem is presented and is combined with a number of Nearest Neighbor based classifiers. The genetic based classification algorithm is applied for the solution of the credit risk assessment classification problem. In [25], a memetic algorithm, which is based on the concepts of genetic algorithms and particle swarm optimization, is presented. Contrary to genetic algorithms, in this algorithm the evolution of each individual of the population is performed using a particle swarm optimization algorithm. The memetic-based classification algorithm is combined with a number of Nearest Neighbor based classifiers and is tested in a very significant financial classification task, involving the identification of qualified audit reports. In [26] a tabu search metaheuristic combined with a number of Nearest Neighbor classifiers is applied for the solution of the credit risk assessment problem, while in [27] an ant colony optimization algorithm combined with a number of Nearest Neighbor classifiers is presented for the solution of the same financial classification problem. All the methods presented in [24–27] gave very satisfactory results, which were in all cases better than the results of the classic metaheuristic algorithms they were compared with.

The rest of the paper is organized as follows: the next section provides a short description of the feature selection problem. In Section 3, a detailed analysis of the proposed algorithm is presented. Section 4 describes the application context using the aforementioned financial data sets and the experimental settings, whereas Section 5 presents the obtained computational results. The last section concludes the paper and discusses some future research directions.

2. Feature selection problem

Recently, there has been an increasing need for novel data-mining methodologies that can analyze and interpret large volumes of data. The proper selection of the right set of features for classification is one of the most important problems in designing a good classifier. Feature selection is widely used as the first stage of a classification task to reduce the dimension of the problem, decrease noise, and improve speed by eliminating irrelevant or redundant features. The basic feature selection problem is an optimization problem, with a performance measure for each subset of features that measures its ability to classify the samples. The problem is to search through the space of feature subsets to identify the optimal or near-optimal subset with respect to the performance measure.

In the literature many successful feature selection algorithms have been proposed. These algorithms can be classified into two categories based on whether feature selection is done independently of the learning algorithm used to construct the classifier. If feature selection depends on the learning algorithm, the approach is referred to as a wrapper model. Otherwise, it is said to be a filter model. Filters, such as mutual information (MI), are based on statistical tools. Wrappers assess subsets of features according to their usefulness to a given classifier.

Unfortunately, finding the optimum feature subset has been proved to be NP-hard [29]. Many algorithms are, thus, proposed to find suboptimum solutions in a comparably smaller amount of time [30]. The Branch and Bound approaches (BB) [31], the Sequential Forward/Backward Search (SFS/SBE) [32–34] and the filter approaches [33,35] search for suboptimum solutions. One of the most important filter approaches is Kira and Rendell's Relief algorithm [36]. Stochastic algorithms, including Simulated Annealing (SA) [37], Scatter Search algorithms [38], Ant Colony Optimization [39–43] and Genetic algorithms (GA) [33,44] are of great interest because they often yield high accuracy and are much faster.

3. The Honey Bees Mating Optimization algorithm

In this paper, as has already been mentioned, an algorithm for the solution of the feature selection problem based on the Honey Bees Mating Optimization is presented. This algorithm is combined with three Nearest Neighbor based classifiers: the 1-Nearest Neighbor, the k-Nearest Neighbor and the Weighted k-Nearest Neighbor (wk-nn) classifiers. A pseudocode of the proposed Honey Bees Mating Optimization based classification algorithm is presented in the following and then an analytical description of the steps of the algorithm is given.

3.1. Nearest Neighbor classifiers

Initially, the classic 1-Nearest Neighbor (1-nn) method is used. The 1-nn works as follows: in each iteration of the feature selection algorithm, a number of features are activated. For each sample of the test set, the Euclidean distance from each sample in the training set is calculated. The Euclidean distance is calculated as follows:

$$ D_{ij} = \sqrt{\sum_{l=1}^{d} \left| x_{il} - x_{jl} \right|^{2}} \tag{1} $$

where $D_{ij}$ is the distance between the test sample $i = 1, \ldots, M_{test}$ ($M_{test}$ is the number of test samples) and the training sample $j = 1, \ldots, M_{train}$ ($M_{train}$ is the number of training samples), and $l = 1, \ldots, d$ runs over the $d$ features activated in each iteration.

With this procedure the nearest sample from the training set is found. Thus, each test sample is classified in the same class to which its nearest sample from the training set belongs.
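The 1-nn rule with the distance of Eq. (1) can be illustrated with a minimal Python sketch. The function names, the toy samples and the class labels below are hypothetical and are not taken from the paper; the only element borrowed from the text is that the distance is summed over the activated features only.

```python
import math


def euclidean_distance(x, y, active):
    """Distance of Eq. (1), summed only over the activated feature indices."""
    return math.sqrt(sum((x[l] - y[l]) ** 2 for l in active))


def classify_1nn(test_sample, train_samples, train_labels, active):
    """Return the label of the training sample closest to test_sample."""
    distances = [euclidean_distance(test_sample, t, active) for t in train_samples]
    nearest = min(range(len(train_samples)), key=lambda j: distances[j])
    return train_labels[nearest]


# Hypothetical toy data: three training samples with three features each.
train_x = [[0.0, 0.0, 5.0], [1.0, 1.0, 0.0], [4.0, 4.0, 5.0]]
train_y = ["low risk", "low risk", "high risk"]

# Only features 0 and 1 are activated in this (hypothetical) iteration.
print(classify_1nn([3.5, 3.0, 0.0], train_x, train_y, active=[0, 1]))
```

Changing the set of activated features changes which training sample is nearest, which is why the classifier has to be re-evaluated for every candidate feature subset.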

The previous approach may be extended to the k-Nearest Neighbor (k-nn) method, where we examine the k nearest samples from the training set and then classify the test sample by using a voting scheme. The most common way is to choose the class that is most represented among these k samples.

Thus, the k-nn method makes a decision based on the majority class membership among the k nearest neighbors of an unknown sample. In other words, every member among the k nearest neighbors has an equal share in the vote.
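The equal-weight vote can be sketched as follows. The sketch assumes that the distances from the test sample to all training samples have already been computed; the function name and the toy values are hypothetical, and no special treatment of ties is attempted.

```python
from collections import Counter


def classify_knn(distances, labels, k):
    """Majority vote among the k training samples with the smallest distances."""
    order = sorted(range(len(distances)), key=lambda j: distances[j])
    votes = Counter(labels[j] for j in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical distances from one test sample to five training samples.
distances = [0.5, 2.0, 0.7, 3.0, 0.9]
labels = ["A", "B", "B", "A", "B"]

print(classify_knn(distances, labels, k=3))
```

With k = 1 the function reduces to the 1-nn rule, since only the single nearest sample votes.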

However, it is natural to attach more weight to those members that are closer to the test sample. This method is called the Weighted k-Nearest Neighbor (wk-nn). In this method, the most distant neighbor from the test sample is denoted by $i = 1$ while the nearest neighbor is denoted by $i = k$. The $i$th neighbor receives the weight

$$ w_i = \frac{i}{\sum_{j=1}^{k} j} \tag{2} $$

Thus, the following holds:

$$ w_k \geq w_{k-1} \geq \cdots \geq w_1 > 0 \tag{3} $$

$$ w_k + w_{k-1} + \cdots + w_1 = 1. \tag{4} $$
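A short sketch of the weights of Eq. (2) and of one possible weighted vote is given below. The paper's text does not spell out how the weights are aggregated into a decision; summing the weights per class and choosing the class with the largest total is an assumption made here for illustration, and all names are hypothetical.

```python
def wknn_weights(k):
    """Weights of Eq. (2): w_i = i / (1 + 2 + ... + k), for i = 1, ..., k."""
    total = k * (k + 1) // 2
    return [i / total for i in range(1, k + 1)]


def classify_wknn(distances, labels, k):
    """Weighted vote: the nearest neighbor has rank i = k, the farthest rank i = 1."""
    nearest_first = sorted(range(len(distances)), key=lambda j: distances[j])[:k]
    weights = wknn_weights(k)
    scores = {}
    for position, j in enumerate(nearest_first):
        rank = k - position  # position 0 is the nearest neighbor, so rank = k
        scores[labels[j]] = scores.get(labels[j], 0.0) + weights[rank - 1]
    # Assumption: the class with the largest summed weight wins.
    return max(scores, key=scores.get)


# Hypothetical example: the two nearest samples are "A", the next three are "B".
distances = [0.1, 0.2, 0.3, 0.4, 0.5]
labels = ["A", "A", "B", "B", "B"]

print(wknn_weights(5))
print(classify_wknn(distances, labels, k=5))
```

In this toy case the unweighted vote would favour "B" (three votes against two), whereas the weighted vote favours "A" (9/15 against 6/15), which shows the effect of giving closer neighbors more influence.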

3.2. Fitness function

The fitness function measures the quality of the produced members of the population. In this problem, the quality is measured with the overall classification accuracy (see below for a complete description). Thus, for each bee the classifiers (1-Nearest Neighbor, k-Nearest Neighbor or wk-Nearest Neighbor) are called and the produced overall classification accuracy (OCA) gives the fitness function. In the fitness function we would like to maximize the OCA. The accuracy of a $C$-class problem can be described using a $C \times C$ confusion matrix. The element $c_{ij}$ in row $i$ and column $j$ describes the number of samples of true class $j$ classified as class $i$, i.e., all correctly classified samples are placed on the diagonal and the remaining misclassified cases in the upper and lower triangular parts. Thus, the overall classification accuracy (OCA) can be easily obtained as:

$$ \mathrm{OCA} = 100 \times \frac{\sum_{i=1}^{C} c_{ii}}{\sum_{i=1}^{C} \sum_{j=1}^{C} c_{ij}}. \tag{5} $$
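As a small worked illustration of Eq. (5), the following sketch computes the OCA from a confusion matrix stored as a list of rows. The matrix values are hypothetical and serve only to show the arithmetic.

```python
def overall_classification_accuracy(confusion):
    """Eq. (5): 100 * (sum of the diagonal) / (sum of all entries)."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return 100.0 * correct / total


# Hypothetical 2-class confusion matrix: entry [i][j] counts samples of
# true class j that were classified as class i.
confusion = [[50, 10],
             [5, 35]]

print(overall_classification_accuracy(confusion))
```

Here 85 of the 100 samples lie on the diagonal, so the OCA is 85%.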

3.3. HBMO algorithm

Initially, we have to choose the population of the honey bees that will configure the initial hive. Each bee is randomly placed in the d-dimensional space as a candidate solution (in the feature selection problem d corresponds to the number of activated features). One of the key issues in designing a successful algorithm for the feature selection problem is to find a suitable mapping between feature selection problem solutions and bees in the Honey Bees Mating Optimization algorithm. Every candidate feature in HBMO is mapped into a binary bee where the bit 1 denotes that the corresponding feature is selected and the bit 0 denotes that the feature is not selected. Afterwards the fitness of each individual is calculated as described previously and the best member of the initial population of bees is selected as the queen of the hive, while all the other members of the population are the drones.
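The binary encoding and the choice of the queen can be sketched in a few lines. In the paper the fitness of a bee is the OCA of a Nearest Neighbor classifier restricted to the selected features; to keep the sketch self-contained, a hypothetical stand-in fitness (`toy_fitness`) is used instead, and all function names are illustrative only.

```python
import random


def random_bee(n_features, rng):
    """A bee is a bit vector: 1 = feature selected, 0 = feature not selected."""
    return [rng.randint(0, 1) for _ in range(n_features)]


def active_features(bee):
    """Indices of the features switched on by this bee."""
    return [l for l, bit in enumerate(bee) if bit == 1]


def init_hive(n_bees, n_features, fitness, rng):
    """Create a random population; the fittest bee is the queen, the rest are drones."""
    bees = [random_bee(n_features, rng) for _ in range(n_bees)]
    bees.sort(key=fitness, reverse=True)
    return bees[0], bees[1:]


# Stand-in fitness (assumption): agreement with a hypothetical "relevant" mask.
RELEVANT = [1, 0, 1, 1, 0, 0, 1, 0]


def toy_fitness(bee):
    return sum(1 for b, r in zip(bee, RELEVANT) if b == r)


queen, drones = init_hive(6, len(RELEVANT), toy_fitness, random.Random(0))
print(queen, active_features(queen), len(drones))
```

Replacing `toy_fitness` by a function that runs a classifier on `active_features(bee)` and returns its OCA would give the wrapper setting described in the text.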

Before the process of mating begins, the user has to define a number that corresponds to the size of the queen's spermatheca. This number corresponds to the maximum number of matings of the queen in a single mating flight. Each time the queen successfully mates with a drone, the genotype of the drone is stored and a variable is increased by one until the size of the spermatheca is reached. Another two parameters have to be defined: the number of queens and the number of broods that will be born by all queens. In this implementation of the Honey Bees Mating Optimization (HBMO) algorithm, the number of queens is set equal to one, because in real life only one queen will survive in a hive, and the number of broods is set equal to the number corresponding to the queen's spermatheca size. Then, we are ready to begin the mating flight of the queen. At the start of the flight, the queen is initialized with some energy content (initially, the speed and the energy of the queen are generated at random) and returns to her nest when the energy is within some threshold from zero to full spermatheca [16]. A drone mates with a queen probabilistically using the following annealing function [14,15]:

$$ \mathrm{Prob}(D) = e^{-\Delta(f) / \mathrm{Speed}(t)} \tag{6} $$

where $\mathrm{Prob}(D)$ is the probability of adding the sperm of drone $D$ to the spermatheca of the queen (that is, the probability of a successful mating), $\Delta(f)$ is the absolute difference between the fitness of $D$ and the fitness of the queen and $\mathrm{Speed}(t)$ is the speed of the queen at time $t$. The probability of mating is high when the queen is still at the beginning of her mating flight, and therefore her speed is high, or when the fitness of the drone is as good as the queen's. After each transition in space, the queen's speed and energy decay according to the following equations:

$$ \mathrm{Speed}(t+1) = \alpha \times \mathrm{Speed}(t) \tag{7} $$

$$ \mathrm{energy}(t+1) = \alpha \times \mathrm{energy}(t) \tag{8} $$

where $\alpha$ is a factor $\in (0, 1)$ and is the amount of speed and energy reduction after each transition and each step. A number of mating flights are realized. If the mating is successful (i.e., the drone passes the probabilistic decision rule), the drone's sperm is stored in the queen's spermatheca. By crossing over the drones' genotypes with the queen's, a new brood (trial solution) is formed, which can later be improved by employing workers to conduct local search. One of the major differences of the Honey Bees Mating Optimization algorithm from the classic evolutionary algorithms is that, since the queen stores the sperm of a number of different drones in her spermatheca, she can use parts of the genotypes of different drones to create a new solution, which gives the possibility of fitter broods. In the crossover phase, we use a crossover procedure which initially identifies the common characteristics of the queen and the drone and then passes them on to the broods. This crossover operator is a kind of adaptive memory procedure. The adaptive memory was initially proposed by Rochat and Taillard [45] as a part of a tabu search metaheuristic for the solution of the Vehicle Routing Problem. This procedure stores characteristics of good solutions. Each time a new good solution is found the adaptive memory is updated. In our case, in the first generation the adaptive memory is empty. In order to add a solution or a part of a solution to the adaptive memory there are a number of possibilities:

(1) The candidate for the adaptive memory solution is a previous best solution (queen) and its fitness function is at most 10% worse than the value of the current best solution.

(2) The candidate for the adaptive memory solution is a member of the population (drone) and its fitness function is at most 10% worse than the value of the current best solution.

(3) A sequence of activated features is common for the queen and for a number of drones.

More analytically, in this crossover operator, the points are selected randomly from the adaptive memory, from the selected drones and from the queen. Thus, initially two crossover operator
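Returning to the mating flight governed by Eqs. (6)–(8), a minimal sketch is given here. Several details are assumptions made for illustration and are not stated in the text: drones are encountered uniformly at random, the same drone may be encountered more than once, and the flight stops when the energy falls below a positive threshold or the spermatheca is full. All names and parameter values are hypothetical.

```python
import math
import random


def mating_flight(queen_fitness, drone_fitnesses, spermatheca_size,
                  speed, energy, alpha, energy_threshold, rng):
    """Simulate one mating flight; return the indices of the drones that mated."""
    spermatheca = []
    while energy > energy_threshold and len(spermatheca) < spermatheca_size:
        # Assumption: the queen meets a drone chosen uniformly at random.
        d = rng.randrange(len(drone_fitnesses))
        delta = abs(queen_fitness - drone_fitnesses[d])
        # Eq. (6): probability of a successful mating.
        if rng.random() < math.exp(-delta / speed):
            spermatheca.append(d)
        # Eqs. (7) and (8): speed and energy decay by the same factor alpha.
        speed *= alpha
        energy *= alpha
    return spermatheca


rng = random.Random(1)
mated = mating_flight(queen_fitness=90.0,
                      drone_fitnesses=[88.0, 75.0, 89.5, 60.0],
                      spermatheca_size=3,
                      speed=5.0, energy=1.0, alpha=0.9,
                      energy_threshold=0.1, rng=rng)
print(mated)
```

Because the speed shrinks at every step, drones whose fitness is far from the queen's are accepted mostly early in the flight, while drones as fit as the queen are accepted with probability one at any time.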

Therefore, within the scope of the current study the analysis was based only on financial information. A total of 26 financial ratios (Table 1) and annual changes (features) were initially considered, based on data availability and previous studies on credit assessment and bankruptcy prediction.

5. Computational results

The algorithms were implemented in Fortran 90 (Lahey f95 compiler) on a Centrino Mobile Intel Pentium M 750 / 1.86 GHz, running Suse Linux 9.1. To test the efficiency of the proposed method, the 10-fold cross-validation procedure is utilized. Initially, the data set is divided into 10 disjoint groups containing approximately M/10 samples each, where M is the number of samples in the data set. Next, each of these groups is in turn removed from the data set, a model is built from the remaining parts (the training set) and, then, the accuracy of the model is calculated using the excluded part (the test set).
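The 10-fold procedure can be sketched as an index generator. How the paper forms the ten groups is not stated; the interleaved assignment used below is an assumption chosen only because it yields disjoint groups of approximately M/10 samples, and the function name is hypothetical.

```python
def ten_fold_splits(n_samples, n_folds=10):
    """Yield (train_indices, test_indices) for each of the n_folds groups."""
    indices = list(range(n_samples))
    # Assumption: samples are assigned to groups in an interleaved fashion.
    folds = [indices[f::n_folds] for f in range(n_folds)]
    for f in range(n_folds):
        test = folds[f]
        train = [i for g in range(n_folds) if g != f for i in folds[g]]
        yield train, test


for train, test in ten_fold_splits(25):
    print(len(train), len(test))
```

The accuracy reported for a feature subset would then be obtained by evaluating the classifier on each test group in turn, using the remaining nine groups as the training set.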

As has already been mentioned, three approaches that use different classifiers, the 1-nn, the k-nn and the wk-nn, are used. In the HBMO based classifier the value of k is changed dynamically depending on the number of iterations. Each generation uses a different k. The reason why k does not have a constant value is that we would like to ensure the diversity of the bees in each iteration of HBMO. The determination of k is done by using a random number generator with a uniform distribution in (0, 1) in each iteration. Then, the produced number is converted to an

$$ b_i(t) = \begin{cases} q_i(t), & \text{if } rand_i(0,1) \leq Cr_1 \\ ad_i(t), & \text{if } Cr_1 < rand_i(0,1) \leq Cr_2 \\ d_i(t), & \text{otherwise.} \end{cases} $$
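A possible reading of this piecewise rule is sketched below. The surviving text does not define the symbols, so the interpretation used here is an assumption: $b$ is the brood, $q$ the queen, $ad$ a solution from the adaptive memory and $d$ a drone stored in the spermatheca, with $Cr_1 \leq Cr_2$ acting as two thresholds. The sketch also simplifies the procedure to a single drone and a single adaptive memory entry; all names are hypothetical.

```python
import random


def crossover(queen, memory, drone, cr1, cr2, rng):
    """Build a brood bit by bit from the queen, the adaptive memory or the drone."""
    brood = []
    for q_bit, ad_bit, d_bit in zip(queen, memory, drone):
        r = rng.random()
        if r <= cr1:
            brood.append(q_bit)    # inherit from the queen
        elif r <= cr2:
            brood.append(ad_bit)   # inherit from the adaptive memory
        else:
            brood.append(d_bit)    # inherit from the drone
    return brood


queen = [1, 0, 1, 1, 0, 0]
memory = [1, 1, 0, 1, 0, 1]
drone = [0, 0, 1, 0, 1, 1]

print(crossover(queen, memory, drone, cr1=0.4, cr2=0.7, rng=random.Random(0)))
```

Positions where the three parents agree are inherited unchanged whatever the random draw, which is consistent with the idea that common characteristics are passed on to the broods.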
