
Research Article
The Application of Pattern Recognition in Electrofacies Analysis

Huan Li,1 Xiao Yang,2 and Wenhong Wei1

1 Dongguan University of Technology, Dongguan 523808, China
2 School of Information Science and Technology, Tsinghua University, Beijing 100084, China

Correspondence should be addressed to Wenhong Wei; weiwh@dgut.edu.cn

Received 24 February 2014; Accepted 26 April 2014; Published 4 June 2014

Academic Editor: Guiming Luo

Copyright © 2014 Huan Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Pattern recognition is an important analytical tool in electrofacies analysis. In this paper, we study several commonly used clustering and classification algorithms. On the basis of the advantages and disadvantages of existing algorithms, we introduce the KMRIC algorithm, which improves the initial centers of K-means. We also propose the AKM algorithm, which automatically determines the number of clusters, and we apply the support vector machine to classification. Finally, we apply these algorithms to electrofacies analysis, where experiments on real-world datasets are carried out to compare the merits of the various algorithms.

1. Introduction

The basic principle of electrofacies analysis is to determine the lithological types corresponding to electrofacies according to the known lithological and underlying parameters in the key well. Then clustering and discriminant analysis of the key well and noncoring wells are conducted to judge the lithofacies automatically.

Clustering means the process of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar to each other with respect to a certain similarity measure and dissimilar to objects of other groups. Cluster analysis has been applied in a wide range of areas, including artificial intelligence, bioinformatics, biology, computer vision, data compression, image analysis, information retrieval, machine learning, marketing, medicine, pattern recognition, spatial database analysis, statistics, recommendation systems, and web mining.

Dong et al. [1] proposed an improvement method based on K-means, which obtains the optimized initial center from a group of initial clustering centers. The K-means algorithm is one of the most popular and widespread partitioning clustering algorithms because of its superior feasibility and efficiency in dealing with large amounts of data. The main drawback of the K-means (KM) algorithm is that the clustering result is sensitive to the selection of the initial cluster centers and may converge to local optima. At present, the development tendency of clustering methods is to find a globally optimal solution by combining global optimization methods, such as simulated annealing and particle swarm optimization, with local methods like K-means [2–6]. Pelleg and Moore [7] proposed an algorithm which can automatically determine the optimal number of clusters during clustering. The challenge of clustering high-dimensional data has emerged in recent years. Clustering high-dimensional data is the cluster analysis of data with anywhere from a few dozen to many thousands of dimensions. Such high-dimensional data spaces are often encountered in areas such as medicine, biology, bioinformatics, and the clustering of text documents where, if a word-frequency vector is used, the number of dimensions equals the size of the dictionary. In high-dimensional clustering, generally the original space is first transformed by PCA, SVD, the K-L transformation, or other dimensionality reduction methods; then clustering is performed in the low-dimensional space. Bertini et al. [8] introduced a high-dimensional visualization technology, showing multidimensional data on a two-dimensional plane.



K-means [9, 10] is the clustering method most widely used in science and engineering nowadays. However, it has the following five deficiencies [3, 5]:

(1) The results are sensitive to the initial centers.
(2) Only a locally optimal solution can be obtained, rather than the globally optimal solution.
(3) The number of clusters k must be set in advance artificially.
(4) Error points impose serious impacts on the results of clustering.
(5) The algorithm lacks scalability.

This paper introduces improved algorithms addressing these deficiencies of K-means; deficiencies (1) and (2) are illustrated concretely in the sketch below.

2. Improved K-Means Method of Initial Centers

Aimed at disadvantages (1) and (4) of the K-means algorithm, we propose a K-means algorithm with refined initial centers (KMRIC for short), based on the work of our predecessors [1]:

(1) Randomly extract $J$ sample subsets $S_i$, $i = 1, 2, \ldots, J$.

(2) Conduct K-means clustering on each of the $J$ sample subsets, respectively, over the whole data field to get $J$ sets $\mathrm{CM}_i$, $i = 1, 2, \ldots, J$, and let $\mathrm{CM} = \bigcup_{i=1}^{J} \mathrm{CM}_i$, in which there are at most $K \times J$ points.

(3) Conduct K-means clustering on $\mathrm{CM}$, taking each $\mathrm{CM}_i$ as the initial clustering centers, to get $J$ clustering center sets $\mathrm{FM}_i$, $i = 1, 2, \ldots, J$.

It can be seen from Figure 1 that the clustering centers obtained from the different subsample sets lie near the real clustering centers, with one cluster formed by each subsample set. In step (3), selecting the center set with the minimum sum of squared deviations as the improved initial clustering centers reduces the randomness brought by random selection. In step (2), to eliminate the influence of error points, a modified K-means algorithm (KmeansMod) is adopted. KmeansMod makes the following modification to the standard K-means: when the standard K-means algorithm has completed, the number of data points contained in each cluster is checked. If a cluster contains zero data points, its original center is replaced by the data point furthest from the cluster centers, and then the K-means algorithm is rerun.
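To make steps (1)–(3) concrete, here is a minimal sketch of the KMRIC center refinement as we read it, using NumPy and scikit-learn. The subset count J and subset fraction are illustrative parameters, the KmeansMod empty-cluster repair is omitted for brevity, and the sum of squared deviations is evaluated over the candidate set CM (the paper does not state the exact evaluation set).

```python
import numpy as np
from sklearn.cluster import KMeans

def kmric_centers(X, k, J=10, subset_frac=0.1, seed=None):
    """Refined initial centers for K-means (KMRIC), per steps (1)-(3)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Steps (1)-(2): cluster J random subsets to get candidate center sets CM_i.
    cm = []
    for _ in range(J):
        idx = rng.choice(n, size=max(k, int(subset_frac * n)), replace=False)
        cm.append(KMeans(n_clusters=k, n_init=1).fit(X[idx]).cluster_centers_)
    CM = np.vstack(cm)  # CM = union of the CM_i, at most K*J points
    # Step (3): cluster CM itself, seeded with each CM_i in turn; keep the
    # solution with the minimum sum of squared deviations.
    best, best_sse = None, np.inf
    for centers in cm:
        km = KMeans(n_clusters=k, init=centers, n_init=1).fit(CM)
        if km.inertia_ < best_sse:
            best, best_sse = km.cluster_centers_, km.inertia_
    return best

# Usage: seed the final K-means run over the whole data field.
# labels = KMeans(n_clusters=k, init=kmric_centers(X, k), n_init=1).fit_predict(X)
```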

The KMRIC algorithm eliminates the sensitivity of the K-means algorithm to the data input sequence and the initial centers, which is an obvious improvement over the standard K-means. When applied to large-scale data, KMRIC can reduce the number of iterations and improve execution efficiency.

3. Adaptive K-Means

The number of clusters $k$ in the K-means algorithm must be set in advance manually. In practice, however, we do not know the value of $k$, and in the case of high-dimensional data it is even more difficult to select the correct value of $k$.

Figure 1: Multicombination clustering centers obtained from multiple sample subsets (one marker stands for the real clustering centers, the other for the clusters obtained from the different subsample sets).

X-means, put forward by Pelleg and Moore [7], can automatically determine the number of clusters. However, X-means is prone to splitting the data into more clusters than actually exist, which is particularly obvious when the data are not strictly subject to the normal distribution. We therefore introduce the Anderson-Darling statistic studied by Lewis [11] as the criterion for measuring normality and propose an adaptive K-means (AKM for short).

The AKM algorithm first assumes that all data are in the same cluster; the number of clusters is then gradually increased in subsequent iterations. In each iteration, whether each cluster satisfies the normal distribution is judged at once; if not, the cluster is split into two clusters. After each splitting, K-means clustering is carried out over the whole data field to improve the clustering results. The iteration ends when no further splitting occurs, and the final clustering results are obtained. The schematic diagram of the AKM algorithm is shown in Figure 2: the data are first divided into three categories, then each category is split into two subclasses, and finally the results after one splitting are used to judge whether each subclass follows a Gaussian distribution.
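The outer loop just described can be sketched as follows (our reading, not the authors' code). Here should_split stands for the normality test detailed below and is assumed to return True when a cluster fails it; max_k is a safety bound we add to guarantee termination.

```python
import numpy as np
from sklearn.cluster import KMeans

def akm(X, should_split, max_k=20):
    """AKM outer loop: start from one cluster, split clusters failing normality."""
    labels = np.zeros(len(X), dtype=int)
    centers = X.mean(axis=0, keepdims=True)
    while len(centers) < max_k:
        k = len(centers)
        # K-means over the whole data field with the current centers.
        labels = KMeans(n_clusters=k, init=centers, n_init=1).fit_predict(X)
        new_centers = []
        for j in range(k):
            C = X[labels == j]
            if len(C) == 0:
                continue  # drop an emptied center
            if len(C) > 2 and should_split(C):
                # Cluster fails the normality test: replace its center by two.
                new_centers.extend(
                    KMeans(n_clusters=2, n_init=5).fit(C).cluster_centers_)
            else:
                new_centers.append(C.mean(axis=0))
        if len(new_centers) == k:  # no cluster was split: finished
            return labels, np.array(new_centers)
        centers = np.array(new_centers)
    return labels, centers
```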

The judgment of splitting is as follows:

(1) Select the confidence level $\alpha$.

(2) Run the KMRIC program and split $X$ into two to get two clustering centers $c_1$ and $c_2$.

(3) Let $v = c_1 - c_2$ be the $N$-dimensional vector connecting the two centers, which is the main direction for judging the normal distribution. $X$ is projected onto $v$: $x'_i = \langle x_i, v \rangle / \|v\|^2$. $X'$ is then transformed to have mean 0 and variance 1.

(4) Let $z_i = F(x'_{(i)})$. If the statistic $A^2_*(Z)$ is not significant with respect to the confidence level $\alpha$, accept $H_0$: reserve the original clustering center $c$ and abandon $c_1$ and $c_2$. Otherwise, reject $H_0$ and replace the original clustering center $c$ by $c_1$ and $c_2$.


Figure 2: Schematic diagram of the AKM algorithm. (a) The data are divided into three categories; (b) each category is split into two subclasses; (c) the results after one splitting are used to judge whether each subclass follows a Gaussian distribution.

Here $A^2_*(Z)$ is the Anderson-Darling statistic:

$$A^2(Z) = -\frac{1}{n}\sum_{i=1}^{n}(2i-1)\left[\log(z_i) + \log(1 - z_{n+1-i})\right] - n. \quad (1)$$
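As a worked illustration of steps (3) and (4) together with statistic (1), the following sketch projects a cluster onto v = c1 − c2, standardizes the projection, and computes A²(Z). We assume F is the standard normal CDF (implied by the zero-mean, unit-variance transformation); the small-sample correction factor that turns A² into A²* is omitted here.

```python
import numpy as np
from scipy.stats import norm

def anderson_darling_split_stat(X, c1, c2):
    v = c1 - c2                      # main direction v = c1 - c2
    x = X @ v / np.dot(v, v)         # projection x'_i = <x_i, v> / ||v||^2
    x = (x - x.mean()) / x.std()     # transform to mean 0 and variance 1
    z = np.sort(norm.cdf(x))         # z_i = F(x'_(i)), sorted ascending
    n = len(z)
    i = np.arange(1, n + 1)
    # Equation (1): A^2(Z) = -(1/n) sum (2i-1)[log z_i + log(1 - z_{n+1-i})] - n
    return -np.sum((2 * i - 1) * (np.log(z) + np.log(1 - z[::-1]))) / n - n
```

If the statistic exceeds the critical value at confidence level α, H0 (normality) is rejected and the cluster is split.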

Figure 3 shows the two distribution circumstances: in Figure 3(a) the subclass follows a Gaussian distribution, but in Figure 3(b) the subclass does not follow a Gaussian distribution. The AKM algorithm can judge whether each subclass follows a Gaussian distribution.

AKM integrates the determination of the number of clusters with the clustering process itself, so it can automatically determine the optimal number of clusters, thus avoiding the subjectivity in the selection of the number of clusters and the blindness of initialization, and it can also distinguish the error points.

4. Discriminant Methods

4.1. Fisher Classification. The Fisher method is in essence about dimension compression. Samples which can easily be separated in a higher-dimensional space may become difficult to identify when projected onto an arbitrary straight line, as the different types mix together. Generally, however, a best direction can be found such that the samples are separated when projected onto it. How to find this best direction, and how to realize the projection transformation toward it, are the two problems solved by the Fisher algorithm. Figure 4 shows the schematic diagram of Fisher linear discriminant analysis: in Figure 4(a) the samples cannot be separated when projected onto a coordinate axis, while in Figure 4(b) the projected samples can be separated by looking for a suitable direction.
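For the two-class case, the best direction described above has the classical closed form w = S_w^{-1}(m1 − m2), where S_w is the pooled within-class scatter matrix. A minimal sketch of standard Fisher linear discriminant analysis (not code from the paper):

```python
import numpy as np

def fisher_direction(X1, X2):
    """Direction maximizing between-class relative to within-class scatter."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Pooled within-class scatter matrix S_w.
    Sw = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
    w = np.linalg.solve(Sw, m1 - m2)  # fails when S_w is singular (cf. Section 6.3.2)
    return w / np.linalg.norm(w)

# Samples are then classified by thresholding the projections X @ w.
```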

Figure 3: Judging whether each subclass follows a Gaussian distribution. (a) The subclass follows a Gaussian distribution; (b) the subclass does not follow a Gaussian distribution.

Figure 4: Schematic diagram of Fisher linear discriminant analysis. (a) The samples cannot be separated when projected onto a coordinate axis; (b) the projected samples can be separated by looking for a suitable direction.

4.2. Potential Function Classification. The potential function, a common method used in nonlinear classifiers, is a way to solve pattern classification problems via the concept of an electric field. In potential function classification, the samples belonging to one category are treated as positive charges, while the samples belonging to the other category are treated as negative charges, thus turning the pattern classification problem into a matter of superposing positive and negative charges, where the equipotential line at zero electric potential is the decision boundary. The training course of the potential function algorithm is a process of accumulating electric potential as the samples are input one after another by exploiting the potential function.
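A minimal sketch of this idea, assuming a Gaussian potential K(x, x_k) = exp(−α‖x − x_k‖²) since the paper does not state which potential function it uses. Positive-class samples contribute positive charge and negative-class samples negative charge; the sign of the accumulated potential at a query point gives its class, and the zero-potential line is the decision boundary.

```python
import numpy as np

def accumulated_potential(x, X_pos, X_neg, alpha=1.0):
    """Net potential at x from positive and negative training charges."""
    charge = lambda S: np.exp(-alpha * np.sum((S - x) ** 2, axis=1)).sum()
    return charge(X_pos) - charge(X_neg)

def classify(x, X_pos, X_neg, alpha=1.0):
    return 1 if accumulated_potential(x, X_pos, X_neg, alpha) > 0 else -1
```

This non-iterative version accumulates all samples at once; the sequential training described above would instead add a sample's charge only when the current potential misclassifies it.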

4.3. Least Squares Support Vector Machine (LS-SVM). Based on the VC-dimension theory of statistical learning and the structural risk minimization principle, the support vector machine method [12] converts the practical problem into a high-dimensional feature space through a nonlinear transformation and realizes the nonlinear discriminant function of the original space by constructing a linear discriminant function in the higher-dimensional space. The LS-SVM introduces a least squares linear system into the support vector machine to replace the traditional quadratic programming method for solving classification and estimation problems, and it is an extension of the traditional support vector machine.
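A minimal sketch of one common LS-SVM formulation (the paper gives no implementation details, so the RBF kernel and the linear system below are our assumptions): training reduces to solving a single (n+1)-dimensional linear system instead of a quadratic program.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y], y in {-1, +1}."""
    n = len(y)
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    b, alpha = sol[0], sol[1:]
    # Decision function: sign(sum_i alpha_i K(x, x_i) + b).
    return lambda X_test: np.sign(rbf_kernel(X_test, X, sigma) @ alpha + b)
```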

5. Procedures of Electrofacies Analysis

The procedure of electrofacies analysis is shown in Figure 5.

Figure 5: Flow diagram of electrofacies analysis. For typical wells: input the wells, set up dynamic data, select well-logging items or apply principal component analysis, conduct clustering analysis, and set up the electrofacies-lithofacies library with the lithology dictionary; for unknown wells: input the wells, set up dynamic data, select well-logging items, conduct discriminant analysis, and determine the lithofacies.

5.1. Feature Extraction of Log Data. The primary step in establishing electrofacies is to extract a set of log-data features that can reflect the lithologic character of sedimentary rock. Generally there are 9 or more types of well-logging items, and those logging items are interrelated. There are two ways to eliminate redundant information, simplify the control methods, and reduce the calculation amount: (1) principal component analysis; (2) selecting logging items manually. The extracted logging items are recorded in table stdlogdata as the data source for clustering analysis.
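A minimal sketch of option (1) (ours; the 95% retained-variance threshold is an illustrative assumption, not a value from the paper): standardize the interrelated logging items, then keep the leading principal components as the clustering features.

```python
import numpy as np

def pca_reduce(logdata, var_keep=0.95):
    """Reduce standardized well-logging items to their leading principal components."""
    Z = (logdata - logdata.mean(axis=0)) / logdata.std(axis=0)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    d = int(np.searchsorted(explained, var_keep)) + 1  # smallest d reaching the threshold
    return Z @ Vt[:d].T  # component scores: the data source for clustering
```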

5.2. Clustering Analysis. In order to find electrofacies of the same type and establish a standard library for electrofacies analysis, cluster analysis must be conducted on the strata. Finally, the classification results acquired by clustering should be recorded in the column "Category" of table stdlogdata, and the lithology should be recorded in the column "Lithology" according to the lithology dictionary.

5.3. Discriminant Analysis. After establishing the lithofacies database, namely, the electrofacies of the typical wells, it is possible to discriminate the lithofacies of other wells. After discrimination, the data and discriminant results are written into table anylogdata, and the logging items are written into table anylogitem.

6. Comparison and Analysis of Algorithm Results

6.1. Experimental Data. The Iris dataset [13] usually serves as a benchmark testing dataset; each record contains 4 attributes of an iris, totaling 150 samples. The correct classification result is that each type contains 50 samples. Each record of the electrofacies data contains eight attributes, totaling 177 samples. For the real electrofacies data there is no strictly accurate number of categories or standard classification; judging by experience, 8 classes may be rational.

6.2. Analysis of Experimental Clustering Results

6.2.1. Iris Dataset. It can easily be seen from Figures 6–9 that the clusters obtained by the standard K-means algorithm differ considerably from the standard results, while the clustering results obtained by ISODATA and KMRIC come near to the standard ones and are the same as the results obtained by the built-in K-means algorithm of Matlab. AKM yields only two categories: the second and the third categories are deemed to belong to the same normal distribution and are never split apart, for they are close to each other and partly overlap (see Tables 1 and 2).

Table 1: Clustering method comparison under the Iris dataset.

            K-means   ISODATA   KMRIC    AKM    Matlab
Type I         30        50       50      53      50
Type II        24        39       39      97      38
Type III       96        61       61       0      62
Accuracy     69.3%     92.6%    92.6%   66.7%    92%

Table 2: Clustering method comparison under the electrofacies dataset.

            K-means   ISODATA   KMRIC    AKM    Matlab
Type I         56        35       47      46      47
Type II        38        30       46      40      45
Type III       36        26       26      26      23
Type IV        18        23       14      23      23
Type V         11        17       13      13      13
Type VI         9        13       12      12      12
Type VII        8        12       10      10      11
Type VIII       1        10        9       7       3
Type IX         0         8        0       0       0
Type X          0         3        0       0       0

6.2.2. Electrofacies Dataset. It can be seen from Figures 10–13 that the clustering results obtained by K-means have large errors, while the clusters obtained by KMRIC and AKM are relatively rational and basically reflect the right classification; AKM can also identify the correct number of clusters automatically. Compared with ISODATA, AKM is more accurate in determining the number of clusters, and its clustering results are more rational as well. Besides, this shows that the hypothesis-testing way of judging the number of clusters in AKM is more universal than judging it based on the between-class distance as in ISODATA.


Figure 6: Clustering results of the dataset obtained by Matlab.

Figure 7: Clustering results obtained by standard K-means.

Table 3: Number of misclassifications and accuracy of various discriminant methods under the Iris dataset.

            Fisher   Potential function   LS-SVM
Type I         0              0              0
Type II        1              0              0
Type III       0              0              0
Total          1              0              0
Accuracy     96.7%          100%           100%

6.3. Experimental Results and Analysis of Classification

6.3.1. Iris Dataset. See Table 3.

Figure 8: Clustering results obtained by ISODATA and KMRIC.

Figure 9: Clustering results obtained by AKM.

Table 4: Number of misclassifications of various discriminant methods under the electrofacies dataset.

            Fisher   Potential function   LS-SVM
Type I         —              0              0
Type II        —              0              2
Type III       —              0              0
Type IV        —              1              2
Type V         —              0              0
Type VI        —              0              0
Type VII       —              0              3
Type VIII      —              0              2
Total          —              1              9
Accuracy       —            94.9%          76.9%

6.3.2. Electrofacies Dataset. It can be seen from Tables 3 and 4 that all three classification methods work well when processing the Iris data, for the data structure of Iris is quite simple and low in dimension. As for the electrofacies data, Fisher discriminant analysis is not applicable because the within-class scatter matrix $S_w$ is singular, while the potential function and LS-SVM still classify with good accuracy. The application of LS-SVM to multiclassification remains for further study.


Figure 10: Clustering results obtained by standard K-means.

Figure 11: Clustering results obtained by ISODATA.

Figure 12: Clustering results obtained by KMRIC.

Figure 13: Clustering results obtained by AKM.

7. Conclusion

On the basis of analyzing the strengths and weaknesses of the existing main clustering algorithms, this paper proposed the KMRIC algorithm for improving the initial points and the AKM algorithm for determining the number of clusters. The support vector machine has also been used for classification. Finally, the algorithms were applied to electrofacies analysis, and through the experimental analysis a comparison was made among the algorithms. According to the experimental results, the KMRIC algorithm erases the sensitivity of the K-means algorithm to the data input sequence and initial centers, and it achieves an obvious improvement relative to K-means and ISODATA. The AKM algorithm merges the process of determining the number of clusters with the clustering process itself to avoid the subjectivity in selecting the number of clusters and the blindness of the initial division; under general conditions, the correct number of clusters and rational clusters can be found.

There are some other problems that remain open. The volatility of results caused by the randomness of selecting initial points in KMRIC exists in both KMRIC and AKM. To address this problem, we can lower the randomness by selecting the optimal initial points repeatedly. Hierarchical clustering is a very stable method, but its disadvantage is the massive calculation cost. How to combine hierarchical clustering with the abovementioned methods may be taken as a direction of future improvement.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The research of the authors was supported by the National Nature Science Foundation of China (no. 61103037), the Nature Science Foundation of Guangdong Province (no. S2013010011858), the Guangdong University Outstanding Young Talents Cultivation Project (no. 2012LYM_0125), and the Dongguan Science and Technology Project (no. 2012108102007).


References

[1] S. Dong, D. D. Zhou, and W. Ding, "Flow cluster algorithm based on improved K-means method," IETE Journal of Research, vol. 59, no. 4, pp. 326–333, 2013.

[2] J. Q. He, H. Dai, and X. Song, "The combination stretching function technique with simulated annealing algorithm for global optimization," Optimization Methods and Software, vol. 29, no. 3, pp. 629–645, 2014.

[3] J. Liu and T. Z. Liu, "Detecting community structure in complex networks using simulated annealing with k-means algorithms," Physica A: Statistical Mechanics and Its Applications, vol. 389, no. 11, pp. 2300–2309, 2010.

[4] S. H. Kim and L. Li, "Statistical identifiability and convergence evaluation for nonlinear pharmacokinetic models with particle swarm optimization," Computer Methods and Programs in Biomedicine, vol. 113, no. 2, pp. 413–432, 2014.

[5] S. Kalyani and K. S. Swarup, "Particle swarm optimization based K-means clustering approach for security assessment in power systems," Expert Systems with Applications, vol. 38, no. 9, pp. 10839–10846, 2011.

[6] D. H. Wang, J. F. Wang, and X. Y. Xu, "A relevance vector machine and bare-bones particle swarm optimization hybrid algorithm for PD pattern recognition of XLPE cable," Journal of Computational Information Systems, vol. 8, no. 2, pp. 451–458, 2012.

[7] D. Pelleg and A. W. Moore, "X-means: extending K-means with efficient estimation of the number of clusters," in Proceedings of the 17th International Conference on Machine Learning, pp. 727–734, 2000.

[8] E. Bertini, A. Tatu, and D. Keim, "Quality metrics in high-dimensional data visualization: an overview and systematization," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 12, pp. 2203–2212, 2011.

[9] L. M. Li and Z. S. Wang, "Method of redundant features eliminating based on k-means clustering," Applied Mechanics and Materials, vol. 488, pp. 1023–1026, 2014.

[10] C. H. Lin, C. C. Chen, H. L. Lee et al., "Fast K-means algorithm based on a level histogram for image retrieval," Expert Systems with Applications, vol. 41, no. 7, pp. 3276–3283, 2014.

[11] P. A. W. Lewis, "Distribution of the Anderson-Darling statistic," Annals of Mathematical Statistics, vol. 32, pp. 1118–1124, 1961.

[12] M. Z. Tang and C. H. Yang, "Excellent operational pattern recognition based on simultaneously optimizing cost-sensitive support vector machine," CIESC Journal, vol. 64, no. 12, pp. 4509–4514, 2013.

[13] C. L. Blake and C. J. Merz, "UCI repository of machine learning databases," 1998.

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 2: Research Article The Application of Pattern Recognition in ...downloads.hindawi.com/journals/jam/2014/640406.pdf · Research Article The Application of Pattern Recognition in Electrofacies

2 Journal of Applied Mathematics

K-means [9 10] is a clustering method most widely usedin science and engineering nowadays However it has thefollowing 5 deficiencies [3 5]

(1) The results are initial center initiative(2) Only local optimal solution can be obtained rather

than global optimal solution(3) The number of clustering k should be set in advanced

artificially(4) The error point imposes serious impacts on the results

of clustering(5) The algorithm lacks scalability

The paper introduces an improved algorithm accordingto the deficiencies of K-means

2 Improve K-Means Method of Initial Center

Aimed at the disadvantages (1) and (4) inK-means algorithmwe propose a K-means algorithm with refined initial centers(KMRIC for short) based on the works of predecessors [1]

(1) Randomly extract J sample subsets 119878119894 119894 = 1 2 119869

(2) Conduct K-means clustering of 119869 sample subsetsrespectively on the whole data field to get J setsCM119894 119894 = 1 2 119869 CM = ⋃119869

119894=1CM119894 in which there

are119870 times 119869 points for CM at most(3) Conduct K-means clustering on CM by taking CM as

the initial clustering center to get J clustering centersets FM

119894119894 = 1 2 119869

It can be seen from Figure 1 that the clustering center isobtained fromdifferent subsample set near the real clusteringcenter and clustering is formed by different subsample setIn (3) selecting the one with the minimum sum of squaresof deviations as the improved initial clustering center canreduce the randomness brought by random selection In (2)to eliminate the influence of error point the modified K-means algorithm (KmeansMod) is adoptedKmeansMod hasthe following modification based on the standard K-meanswhen the standardK-means algorithm is completed the datapoint contained in each clustering will be checked If the datapoint contained in a clustering is zero the original center willbe replaced by taking the data point furthest to the clusteringcenter as a new center and then the K-means algorithm isreran

KMRPIC algorithm eliminates the sensitivity ofK-meansalgorithm to data input consequence and initial centerswhich is an obvious improvement compared with K-meanseffect When applied to large-scale data KMRIC can reducethe iterations and improve the execution efficiency

3 Adaptive K-Means

The number of clusters k of K-means algorithm should be setin advance manually However actually we do not know thevalue of k especially in the case of high dimension of data soit is more difficult to select the correct value of k

stands for the real clusteringstands for the cluster that is obtained by different subsample set

Figure 1 Multicombination clustering center obtained from multi-ple sample subsets

X-means put forward by Pelleg and Moore [7] canautomatically determine the number of clusters HoweverX-means is prone to split data into more clusters than theactual ones which is particularly obvious when the data isnot strictly subject to the normal distribution Lewis [11]statistics are introduced as the standard of measuring thenormal distribution and propose an adaptiveK-means (AKMfor short)

The AKM algorithm first assumes that all data are inthe same cluster then the number of clustering is graduallyincreased in the subsequent iterations In each iterationwhether each cluster satisfied the normal distribution isjudged at once if not the cluster should be split into twoclusters After each splitting K-means clustering is carriedout in the whole data field to improve the clustering resultsThe iteration ends until there is no splitting and then the finalclustering results will be obtained The schematic diagram ofAKM algorithm is shown in Figure 2 In Figure 2 clusteringis divided into three categories firstly then each categoryis split into two subclasses At last the results are got afterone splitting to judge whether each subclass follows Gaussiandistribution

The judgment of splitting is as follows

(1) Select the confidence level 120572

(2) Run KMRIC program and split X into two to get twoclustering centers 119888

1 1198882

(3) Let ] = 1198881minus1198882be anN-dimensional vector connecting

the two centers which is themain direction of judgingthe normal distribution X is projected on ] 1199091015840

119894=

(⟨119909119894 ]⟩]2)1198831015840 is transformed to make its mean as 0

and variance as 1

(4) Suppose that 119911119894= 119865(119909

1015840

(119894)) The results 1198602

lowast(119885) with

respect to confidence level 120572 are not significant soaccept119867

0 reserve the original clustering center 119888 and

abandon 1198881and 1198882 Otherwise reject 119867

0 and replace

the original clustering center 119888 by 1198881and 1198882

Journal of Applied Mathematics 3

(a) It is divided into three categories (b) Each category is split into two subclasses

(c) Get the results after one splitting to judge whether eachsubclass follows Gaussian distribution

Figure 2 Schematic diagram of AKM algorithm

1198602

lowast(119885) is the statistics of Anderson Darling

1198602(119885) = minus

1

119899

119899

sum

119894=1

(2119894 minus 1) [log (119911119894) + log (1 minus 119911119899+1minus119894)] minus 119899

(1)

Figure 3 shows two distribution circumstances InFigure 3(a) the subclass follows Gaussian distributionbut in Figure 3(b) the subclass does not follow Gaussiandistribution AKM algorithm can judge whether eachsubclass follows Gaussian distribution

AKM integrates the determination process of the numberof clusters and the clustering process which can automati-cally determine the optimal number of clusters thus avoidingthe subjectivity in the selection of number of clusters and theblindness of initialization and can also distinguish the errors

4 Discriminant Method

41 Fisher Classification Fisher method actually is about thedimension compression Projecting the samples which can

be easily separated in higher space on a straight line arbitrarilymay be difficult to be identified for different types mixedtogether Generally the best direction can always be foundto separate the samples when projected on that directionBut how to find out the best direction and how to realizethe transformations of projection toward the best directionare the very two problems to be solved by Fisher algorithmFigure 4 shows analysis schematic diagram of Fisher algo-rithm using linear discriminant In Figure 4(a) the samplecannot be identifiedwhen being projected on coordinate axisand in Figure 4(b) the projection samples can be identified bylooking for a direction

42 Potential Function Classification Potential function acommon method used in nonlinear classifier is a way tosolve the classification problems of pattern via the conceptionof electric field In the potential function classification thesamples belonging to one category are treated as positivecharge while the samples belonging to another category aretreated as the negative charge thus turning the classificationproblems of pattern to the matter of transferring the positive

4 Journal of Applied Mathematics

(a) The subclass follows Gaussian distribution (b) The subclass does not follow Gaussian distribution

Figure 3 Judge whether each subclass follows Gaussian distribution

(a) The sample cannot be identified when beingprojected on coordinate axis

(b) Theprojection samples can be identified by lookingfor a direction

Figure 4 Schematic diagram of Fisher linear discriminant analysis

charge and negative charge and the equipotential line whereits electric potential is zero is the decision boundary Thetraining course of potential function algorithm is a processof accumulating electric potential when the samples are inputone after another by exploiting the potential function

43 Least Squares Support Vector Machine (LS-SVM) Basedon the VC dimension theory of statistical learning theoryand the structural riskminimization principle support vectormachinesmethod [12] converts the practical problem to high-dimensional feature space through nonlinear transform-ation and realizes the nonlinear discriminant function in theoriginal space by constructing linear discriminant functionin higher space By means of introducing the least squareslinear system into support vector machine to replace thetraditional one quadratic programming method which isadopted to settle the problems of classification and estima-tion is a kind of extension of traditional support vectormachine

5 Procedures of Electrofacies Analysis

The procedure of electrofacies analysis is shown in Figure 5

51 Feature Extraction of Log Data The primary step toestablish electrofacies is to extract a set of log data featuresthat can reflect the lithologic character of sedimentary rockGenerally there are 9 types of well-logging items or moreand those logging items are interrelated There are two waysto eliminate gibberish simplify control methods and reducecalculated amount (1) principal component analysis (2)Select logging items manually The extracted logging itemswill be recorded in Table stdlogdata as the data source forclustering analysis

52 Clustering Analysis In order to find out the electrofaciesof the same type and establish a standard library in electro-facies analysis clustering analysis must be conducted to stra-tum Finally the classification results acquired by clustering

Journal of Applied Mathematics 5

Input number of typical

Set up dynamic data

Principal component

analysis

Select well-

logging item

Clustering analysis

Lithology dictionary

Set up electrofacies-lithofa

Input number of unknown

Set up dynamic data

Select well-logging item

Discriminate analysis

Determine

Figure 5 Flow diagram of electrofacies analysis

should be recorded in the column of ldquoCategoryrdquo in Tablestdlogdata and the lithology be recorded in the column ofldquoLithologyrdquo according to the lithology dictionary

53 Discriminant Analysis After establishing lithofaciesdatabase namely the electrofacies of type well it is possibleto discriminate the lithofacies of other wells After discrim-ination the data and discriminant results will be written inTable anylogdata and the logging items bewritten in the Tableanylogitem

6 Comparison and Analysis ofResults of Algorithm

61 Experimental Data The Iris dataset [13] usually serves asthe testing dataset for benchmark function in which eachrecord contains 4 attributes of Iris totaling 150 samplesThe correct classification result is that each type of datahas 50 samples Eight attributes are included in each set ofdata of electrofacies totaling 177 samples As for the realdata in electrofacies there is no strictly accurate number ofcategories and standard classification Judging by experience8 classifications may be rational

62 Analysis of Experimental Results of Cluster

621 Iris Dataset It can be easily seen from Figures 6ndash9that the cluster obtained by standard K-means algorithm ispretty different from the standard results while the clusteringresults obtained by ISODATA and KMRIC come near to thestandard ones and are the same as the results obtained bybuilt-in K-means algorithm of Matlab AKM has only twocategories The second and the third categories are deemedas belonging to the same normal distribution that are neverapart for they are approximate to each other and have someparts overlapped (see Table 1 and 2)

Table 1 Clustering method comparison under Iris dataset

119870-means ISODATA KMRIC AKM MatlabType I 30 50 50 53 50Type II 24 39 39 97 38Type III 96 61 61 0 62Accuracy 693 926 926 667 92

Table 2 Clustering method comparisons under Iris dataset

119870-means ISODATA KMRIC AKM MatlabType I 56 35 47 46 47Type II 38 30 46 40 45Type III 36 26 26 26 23Type IV 18 23 14 23 23Type V 11 17 13 13 13Type VI 9 13 12 12 12Type VII 8 12 10 10 11Type VIII 1 10 9 7 3Type IX 0 8 0 0 0Type X 0 3 0 0 0

622 Electrofacies Dataset It can be seen from Figures 10ndash13that the clustering results obtained by K-means have largeerror while the cluster obtained by KMRIC and AKM isrelatively rational and can basically reflect the right classifi-cation and AKM can also identify the accurate number ofclustering automatically Compared with ISODATA AKMis more accurate in determining the number of clusteringand its clustering results are more rational as well Besidesit proves that the hypothesis testing way to judge the numberof clustering of AKM is more universal than that by judgingit based on the between-class distance of ISODATA

6 Journal of Applied Mathematics

15

1

05

0

minus05

minus1

minus15

minus4 minus3 minus2 minus1 0 1 2 3 4

Figure 6 Clustering results of dataset by Matlab figure

15

1

05

0

minus05

minus1

minus15

minus4 minus3 minus2 minus1 0 1 2 3 4

Figure 7 Clustering results obtained by standard K-means

Table 3 Number of misclassification and accuracy of variousdiscriminant methods under Iris dataset

Fisher Potential function LS-SVMType I 0 0 0Type II 1 0 0Type III 0 0 0Total 1 0 0Accuracy 967 100 100

63 Experimental Results and Analysis of Classification

631 Iris Dataset See Table 3

632 Electrofacies Dataset It can be seen fromTables 3 and 4that these three classification methods all work well whenprocessing the Iris data for the data structure of Iris is quitesimple and low in dimension As for electrofacies data Fisherdiscriminant analysis is not applicable due to the singular

15

1

05

0

minus05

minus1

minus15

minus4 minus3 minus2 minus1 0 1 2 3 4

Figure 8 Clustering results obtained by ISODATA and KMRIC

15

1

05

0

minus05

minus1

minus15

minus4 minus3 minus2 minus1 0 1 2 3 4

Figure 9 Clustering results obtained by AKM

Table 4 Number of misclassification of various discriminantmethods under electrofacies dataset

Fisher Potential function LS-SVMType I mdash 0 0Type II mdash 0 2Type III mdash 0 0Type IV mdash 1 2Type V mdash 0 0Type VI mdash 0 0Type VII mdash 0 3Type VIII mdash 0 2Total mdash 1 9Accuracy mdash 949 769

within-class scatter 119878119908

matrix while the potential func-tion and LS-SVM still have better accuracy to classificationThe multiclassification of LS-SVM application remains forfurther study

Journal of Applied Mathematics 7

15

10

5

0

minus5

minus10

minus15

minus30 minus20 minus10 0 10 20 30

Figure 10 Clustering results obtained by standard K-means

15

10

5

0

minus5

minus10

minus15

minus30 minus20 minus10 0 10 20 30

Figure 11 Clustering results obtained by ISODATA

7 Conclusion

On the basis of analyzing the strengths and weaknesses of theexisting main algorithms for clustering this paper proposedthe KMRIC algorithm for improving initial points and theAKM algorithm for determining the number of clusters Thesupport vector machine has also been used for classificationFinally the algorithms are applied to electrofacies analysisThrough the experimental analysis comparison was madeamong algorithms According to the experimental resultsthe KMRIC algorithm erases the sensibility of K-meansalgorithm to data input sequence and initial centers and itachieves an obvious improvement relative to K-means andISODATA AKMalgorithmmixes the process of determiningthe number of clusters and the clustering process togetherto avoid the subjectivity in selecting the number of clustersand the blindness in initial divisions Under general condi-tion the number of clusters and rational clusters can be foundcorrectly

There are some other problems that remain open Thevolatility of results which was caused by the randomness

15

10

5

0

minus5

minus10

minus15

minus30 minus20 minus10 0 10 20 30

Figure 12 Clustering results obtained by KMRIC

15

10

5

0

minus5

minus10

minus15

minus30 minus20 minus10 0 10 20 30

Figure 13 Clustering results obtained by AKM

of selecting initial points in KMRIC existed in KMRIC andAKM To address this problem we can lower the randomnessby selecting the optimal initial points repeatedly Hierarchicalclustering is a very stable method but its disadvantage is themassive calculation cost How to combine the hierarchicalclustering and the abovementionedmethods may be taken asthe improvement direction in future

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

Research of the authors was supported by theNationalNatureScience Foundation of China (no 61103037) Nature ScienceFoundation of Guangdong Province (no S2013010011858)Project of Guangdong University of Outstanding Young Tal-ents Cultivation (no 2012LYM 0125) and Dongguan Scienceand Technology Project (no 2012108102007)

8 Journal of Applied Mathematics

References

[1] S Dong D D Zhou and W Ding ldquoFlow cluster algorithmbased on improvedK-meansmethodrdquo IETE Journal of Researchvol 59 no 4 pp 326ndash333 2013

[2] J Q He H Dai and X Song ldquoThe combination stretching fun-ction technique with simulated annealing algorithm for globaloptimizationrdquo Optimization Methods and Software vol 29 no3 pp 629ndash645 2014

[3] J Liu and T Z Liu ldquoDetecting community structure in complexnetworks using simulated annealing with k-means algorithmsrdquoPhysica A StatisticalMechanics and Its Applications vol 389 no11 pp 2300ndash2309 2010

[4] S H Kim and L Li ldquoStatistical identifiability and convergenceevaluation for nonlinear pharmacokinetic models with particleswarm optimizationrdquo Computer Methods and Programs inBiomedicine vol 113 no 2 pp 413ndash432 2014

[5] S Kalyani andK S Swarup ldquoParticle swarmoptimization basedK-means clustering approach for security assessment in powersystemsrdquo Expert Systems with Applications vol 38 no 9 pp10839ndash10846 2011

[6] D H Wang J F Wang and X Y Xu ldquoA relevance vectormachine and bare-bones particle swarm optimization hybridalgorithm for PD pattern recognition of XLPE cablerdquo Journal ofComputational Information Systems vol 8 no 2 pp 451ndash4582012

[7] D Pelleg and AWMoore ldquoX-means extending K-means withefficient estimation of the number of clustersrdquo in Proceedings ofthe 17th International Conference onMachine Learning pp 727ndash734 2000

[8] E Bertini A Tatu and D Keim ldquoQuality metrics in high-dimensional data visualization an overview and systemati-zationrdquo IEEE Transactions on Visualization and ComputerGraphics vol 17 no 12 pp 2203ndash2212 2011

[9] L M Li and Z S Wang ldquoMethod of redundant features elimi-nating based on k-means clusteringrdquo Applied Mechanics andMaterials vol 488 pp 1023ndash1026 2014

[10] C H Lin C C Chen H L Lee et al ldquoFast K-means algorithmbased on a level histogram for image retrievalrdquo Expert Systemswith Applications vol 41 no 7 pp 3276ndash3283 2014

[11] P AW Lewis ldquoDistribution of the Anderson-Darling statisticrdquoAnnals of Mathematical Statistics vol 32 pp 1118ndash1124 1961

[12] M Z Tang and C H Yang ldquoExcellent operational patternrecognition based on simultaneously optimizing cost-sensitivesupport vector machinerdquo CIESC Journal vol 64 no 12 pp4509ndash4514 2013

[13] C L Blake and C J Merz ldquoUCI repository of machine learningdatabasesrdquo 1998

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 3: Research Article The Application of Pattern Recognition in ...downloads.hindawi.com/journals/jam/2014/640406.pdf · Research Article The Application of Pattern Recognition in Electrofacies

Journal of Applied Mathematics 3

(a) It is divided into three categories (b) Each category is split into two subclasses

(c) Get the results after one splitting to judge whether eachsubclass follows Gaussian distribution

Figure 2 Schematic diagram of AKM algorithm

1198602

lowast(119885) is the statistics of Anderson Darling

1198602(119885) = minus

1

119899

119899

sum

119894=1

(2119894 minus 1) [log (119911119894) + log (1 minus 119911119899+1minus119894)] minus 119899

(1)

Figure 3 shows two distribution circumstances InFigure 3(a) the subclass follows Gaussian distributionbut in Figure 3(b) the subclass does not follow Gaussiandistribution AKM algorithm can judge whether eachsubclass follows Gaussian distribution

AKM integrates the determination process of the numberof clusters and the clustering process which can automati-cally determine the optimal number of clusters thus avoidingthe subjectivity in the selection of number of clusters and theblindness of initialization and can also distinguish the errors

4 Discriminant Method

41 Fisher Classification Fisher method actually is about thedimension compression Projecting the samples which can

be easily separated in higher space on a straight line arbitrarilymay be difficult to be identified for different types mixedtogether Generally the best direction can always be foundto separate the samples when projected on that directionBut how to find out the best direction and how to realizethe transformations of projection toward the best directionare the very two problems to be solved by Fisher algorithmFigure 4 shows analysis schematic diagram of Fisher algo-rithm using linear discriminant In Figure 4(a) the samplecannot be identifiedwhen being projected on coordinate axisand in Figure 4(b) the projection samples can be identified bylooking for a direction

42 Potential Function Classification Potential function acommon method used in nonlinear classifier is a way tosolve the classification problems of pattern via the conceptionof electric field In the potential function classification thesamples belonging to one category are treated as positivecharge while the samples belonging to another category aretreated as the negative charge thus turning the classificationproblems of pattern to the matter of transferring the positive

4 Journal of Applied Mathematics

(a) The subclass follows Gaussian distribution (b) The subclass does not follow Gaussian distribution

Figure 3 Judge whether each subclass follows Gaussian distribution

(a) The sample cannot be identified when beingprojected on coordinate axis

(b) Theprojection samples can be identified by lookingfor a direction

Figure 4 Schematic diagram of Fisher linear discriminant analysis

charge and negative charge and the equipotential line whereits electric potential is zero is the decision boundary Thetraining course of potential function algorithm is a processof accumulating electric potential when the samples are inputone after another by exploiting the potential function

43 Least Squares Support Vector Machine (LS-SVM) Basedon the VC dimension theory of statistical learning theoryand the structural riskminimization principle support vectormachinesmethod [12] converts the practical problem to high-dimensional feature space through nonlinear transform-ation and realizes the nonlinear discriminant function in theoriginal space by constructing linear discriminant functionin higher space By means of introducing the least squareslinear system into support vector machine to replace thetraditional one quadratic programming method which isadopted to settle the problems of classification and estima-tion is a kind of extension of traditional support vectormachine

5 Procedures of Electrofacies Analysis

The procedure of electrofacies analysis is shown in Figure 5

51 Feature Extraction of Log Data The primary step toestablish electrofacies is to extract a set of log data featuresthat can reflect the lithologic character of sedimentary rockGenerally there are 9 types of well-logging items or moreand those logging items are interrelated There are two waysto eliminate gibberish simplify control methods and reducecalculated amount (1) principal component analysis (2)Select logging items manually The extracted logging itemswill be recorded in Table stdlogdata as the data source forclustering analysis

52 Clustering Analysis In order to find out the electrofaciesof the same type and establish a standard library in electro-facies analysis clustering analysis must be conducted to stra-tum Finally the classification results acquired by clustering

Journal of Applied Mathematics 5

Input number of typical

Set up dynamic data

Principal component

analysis

Select well-

logging item

Clustering analysis

Lithology dictionary

Set up electrofacies-lithofa

Input number of unknown

Set up dynamic data

Select well-logging item

Discriminate analysis

Determine

Figure 5 Flow diagram of electrofacies analysis

should be recorded in the column of ldquoCategoryrdquo in Tablestdlogdata and the lithology be recorded in the column ofldquoLithologyrdquo according to the lithology dictionary

53 Discriminant Analysis After establishing lithofaciesdatabase namely the electrofacies of type well it is possibleto discriminate the lithofacies of other wells After discrim-ination the data and discriminant results will be written inTable anylogdata and the logging items bewritten in the Tableanylogitem

6 Comparison and Analysis ofResults of Algorithm

61 Experimental Data The Iris dataset [13] usually serves asthe testing dataset for benchmark function in which eachrecord contains 4 attributes of Iris totaling 150 samplesThe correct classification result is that each type of datahas 50 samples Eight attributes are included in each set ofdata of electrofacies totaling 177 samples As for the realdata in electrofacies there is no strictly accurate number ofcategories and standard classification Judging by experience8 classifications may be rational

62 Analysis of Experimental Results of Cluster

6.2.1. Iris Dataset. It can easily be seen from Figures 6–9 that the clusters obtained by the standard K-means algorithm differ considerably from the standard results, while the clustering results obtained by ISODATA and KMRIC come close to the standard ones and coincide with those of Matlab's built-in K-means. AKM produces only two categories: the second and third categories are deemed to belong to the same normal distribution and are never separated, since they lie close to each other and partly overlap (see Tables 1 and 2).

Table 1: Clustering method comparison on the Iris dataset.

              K-means   ISODATA   KMRIC    AKM     Matlab
Type I        30        50        50       53      50
Type II       24        39        39       97      38
Type III      96        61        61       0       62
Accuracy      69.3%     92.6%     92.6%    66.7%   92%

Table 2: Clustering method comparison on the electrofacies dataset.

              K-means   ISODATA   KMRIC    AKM     Matlab
Type I        56        35        47       46      47
Type II       38        30        46       40      45
Type III      36        26        26       26      23
Type IV       18        23        14       23      23
Type V        11        17        13       13      13
Type VI       9         13        12       12      12
Type VII      8         12        10       10      11
Type VIII     1         10        9        7       3
Type IX       0         8         0        0       0
Type X        0         3         0        0       0
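The accuracy row in Table 1 compares a clustering against reference labels even though cluster indices are arbitrary. One standard way to score this, offered here as a hedged sketch and not necessarily the exact bookkeeping behind these tables, is to find the best one-to-one matching between clusters and true classes with the Hungarian algorithm:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(true_labels, cluster_labels):
    """Best-match accuracy: relabel clusters to maximize agreement."""
    classes = np.unique(true_labels)
    clusters = np.unique(cluster_labels)
    # Contingency table: rows are clusters, columns are true classes.
    cont = np.array([[np.sum((cluster_labels == c) & (true_labels == t))
                      for t in classes] for c in clusters])
    rows, cols = linear_sum_assignment(-cont)     # maximize matched counts
    return cont[rows, cols].sum() / len(true_labels)
```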

6.2.2. Electrofacies Dataset. It can be seen from Figures 10–13 that the clustering results obtained by K-means have large errors, while the clusters obtained by KMRIC and AKM are relatively rational and basically reflect the correct classification; AKM also identifies the correct number of clusters automatically. Compared with ISODATA, AKM determines the number of clusters more accurately, and its clustering results are more rational as well. This also suggests that AKM's hypothesis-testing criterion for the number of clusters is more universal than ISODATA's between-class-distance criterion.
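The hypothesis-testing criterion referred to above can be sketched with the Anderson-Darling statistic [11]: project the pooled points of two candidate clusters onto the line joining their centers and test that projection for normality. This follows the common G-means-style construction and is an assumption about AKM's criterion, not a verbatim transcription.

```python
import numpy as np
from scipy.stats import anderson

def should_merge(X1, X2, level_index=2):
    """Decide whether two clusters plausibly form one Gaussian component."""
    c1, c2 = X1.mean(axis=0), X2.mean(axis=0)
    v = (c1 - c2) / np.linalg.norm(c1 - c2)
    # One-dimensional projection of the pooled samples onto the center line.
    proj = np.concatenate([X1 @ v, X2 @ v])
    result = anderson(proj, dist='norm')
    # level_index picks one of scipy's tabulated significance levels
    # (15%, 10%, 5%, 2.5%, 1%); index 2 corresponds to the 5% level.
    return result.statistic < result.critical_values[level_index]
```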

Figure 6: Clustering results of the Iris dataset obtained by Matlab's built-in K-means.

Figure 7: Clustering results obtained by standard K-means.

Table 3: Number of misclassifications and accuracy of various discriminant methods on the Iris dataset.

              Fisher   Potential function   LS-SVM
Type I        0        0                    0
Type II       1        0                    0
Type III      0        0                    0
Total         1        0                    0
Accuracy      96.7%    100%                 100%

6.3. Experimental Results and Analysis of Classification

6.3.1. Iris Dataset. See Table 3.

6.3.2. Electrofacies Dataset. It can be seen from Tables 3 and 4 that all three classification methods work well on the Iris data, since the structure of the Iris data is quite simple and low-dimensional. For the electrofacies data, Fisher discriminant analysis is not applicable because the within-class scatter matrix $S_w$ is singular, while the potential function and LS-SVM still classify with good accuracy. The multiclass application of LS-SVM remains for further study.

Figure 8: Clustering results obtained by ISODATA and KMRIC.

Figure 9: Clustering results obtained by AKM.

Table 4: Number of misclassifications of various discriminant methods on the electrofacies dataset.

              Fisher   Potential function   LS-SVM
Type I        —        0                    0
Type II       —        0                    2
Type III      —        0                    0
Type IV       —        1                    2
Type V        —        0                    0
Type VI       —        0                    0
Type VII      —        0                    3
Type VIII     —        0                    2
Total         —        1                    9
Accuracy      —        94.9%                76.9%


Figure 10: Clustering results obtained by standard K-means.

Figure 11: Clustering results obtained by ISODATA.

7. Conclusion

On the basis of analyzing the strengths and weaknesses of the existing main clustering algorithms, this paper proposed the KMRIC algorithm for improving the initial points and the AKM algorithm for determining the number of clusters. The support vector machine was also applied to classification. Finally, the algorithms were applied to electrofacies analysis and compared experimentally. According to the experimental results, the KMRIC algorithm removes the sensitivity of K-means to the data input order and the initial centers, achieving an obvious improvement over K-means and ISODATA. The AKM algorithm merges the determination of the number of clusters into the clustering process itself, which avoids the subjectivity of selecting the number of clusters and the blindness of the initial partitions; under general conditions, the correct number of clusters and rational clusters can be found.

Some problems remain open. The volatility of results caused by the randomness of selecting initial points in KMRIC affects both KMRIC and AKM. To address this problem, the randomness can be lowered by selecting the optimal initial points repeatedly. Hierarchical clustering is a very stable method, but its disadvantage is its massive computational cost; how to combine hierarchical clustering with the abovementioned methods may be taken as a direction for future improvement.
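One way to damp this volatility, sketched below under the assumption of a generic clustering callback (kmeans_fn stands in for any routine with randomized initialization, such as KMRIC), is to repeat the run several times and keep the partition with the lowest within-cluster sum of squares:

```python
import numpy as np

def best_of_restarts(X, k, kmeans_fn, n_restarts=10, seed=0):
    """Run a randomized clustering routine repeatedly; keep the best run.

    kmeans_fn(X, k, rng) -> (centers, labels) is a hypothetical stand-in
    for any K-means-style routine with random initialization.
    """
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_restarts):
        centers, labels = kmeans_fn(X, k, rng)
        # Inertia: total squared distance from each point to its center.
        inertia = float(np.sum((X - centers[labels]) ** 2))
        if best is None or inertia < best[0]:
            best = (inertia, centers, labels)
    return best[1], best[2]
```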

Figure 12: Clustering results obtained by KMRIC.

Figure 13: Clustering results obtained by AKM.


Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

Research of the authors was supported by the National Natural Science Foundation of China (no. 61103037), the Natural Science Foundation of Guangdong Province (no. S2013010011858), the Project of Guangdong University Outstanding Young Talents Cultivation (no. 2012LYM 0125), and the Dongguan Science and Technology Project (no. 2012108102007).


References

[1] S. Dong, D. D. Zhou, and W. Ding, "Flow cluster algorithm based on improved K-means method," IETE Journal of Research, vol. 59, no. 4, pp. 326–333, 2013.

[2] J. Q. He, H. Dai, and X. Song, "The combination stretching function technique with simulated annealing algorithm for global optimization," Optimization Methods and Software, vol. 29, no. 3, pp. 629–645, 2014.

[3] J. Liu and T. Z. Liu, "Detecting community structure in complex networks using simulated annealing with k-means algorithms," Physica A: Statistical Mechanics and Its Applications, vol. 389, no. 11, pp. 2300–2309, 2010.

[4] S. H. Kim and L. Li, "Statistical identifiability and convergence evaluation for nonlinear pharmacokinetic models with particle swarm optimization," Computer Methods and Programs in Biomedicine, vol. 113, no. 2, pp. 413–432, 2014.

[5] S. Kalyani and K. S. Swarup, "Particle swarm optimization based K-means clustering approach for security assessment in power systems," Expert Systems with Applications, vol. 38, no. 9, pp. 10839–10846, 2011.

[6] D. H. Wang, J. F. Wang, and X. Y. Xu, "A relevance vector machine and bare-bones particle swarm optimization hybrid algorithm for PD pattern recognition of XLPE cable," Journal of Computational Information Systems, vol. 8, no. 2, pp. 451–458, 2012.

[7] D. Pelleg and A. W. Moore, "X-means: extending K-means with efficient estimation of the number of clusters," in Proceedings of the 17th International Conference on Machine Learning, pp. 727–734, 2000.

[8] E. Bertini, A. Tatu, and D. Keim, "Quality metrics in high-dimensional data visualization: an overview and systematization," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 12, pp. 2203–2212, 2011.

[9] L. M. Li and Z. S. Wang, "Method of redundant features eliminating based on k-means clustering," Applied Mechanics and Materials, vol. 488, pp. 1023–1026, 2014.

[10] C. H. Lin, C. C. Chen, H. L. Lee, et al., "Fast K-means algorithm based on a level histogram for image retrieval," Expert Systems with Applications, vol. 41, no. 7, pp. 3276–3283, 2014.

[11] P. A. W. Lewis, "Distribution of the Anderson-Darling statistic," Annals of Mathematical Statistics, vol. 32, pp. 1118–1124, 1961.

[12] M. Z. Tang and C. H. Yang, "Excellent operational pattern recognition based on simultaneously optimizing cost-sensitive support vector machine," CIESC Journal, vol. 64, no. 12, pp. 4509–4514, 2013.

[13] C. L. Blake and C. J. Merz, "UCI repository of machine learning databases," 1998.

