
Research Article

EC-Structure: Establishing Consumption Structure through Mining E-Commerce Data to Discover Consumption Upgrade

Lin Guo (1) and Dongliang Zhang (2)

(1) School of Economics and Management, Changchun University of Science and Technology, Jilin 130022, China
(2) Institution of Technical Science, Fudan University, Shanghai 200000, China

Correspondence should be addressed to Lin Guo; guolin@cust.edu.cn

Received 24 December 2018; Accepted 26 February 2019; Published 12 March 2019

Guest Editor: Thiago C. Silva

Copyright © 2019 Lin Guo and Dongliang Zhang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The traditional methods of analyzing consumption structure have many limitations, and data acquisition is difficult, so it is hard to scientifically verify the accuracy of algorithms. With the development of the Internet economy, many researchers focus on mining knowledge of consumer behavior using big data analysis technology. Because consumption decisions are influenced not only by personal characteristics but also by social trends and environment, it is one-sided to analyze the impact of a single factor on the phenomenon of consumption. The authors of this paper combine the consumption structure analysis method and data processing technology, using data from an e-commerce platform to extract the consumption structure of cities, compare the structural differences between different periods, and then discover consumption upgrading according to swarm intelligence. Experiments on several different datasets compare the proposed algorithm with other similar algorithms and illustrate its efficacy and stable performance in consumption structure analysis.

1. Introduction

With the continuous expansion of consumption scale, consumers' personalized demands are becoming increasingly obvious. Consumer behaviors such as purchasing decisions are influenced not only by personal characteristics but also by interpersonal relationships, social environments, network culture, and so on. Therefore, analyzing consumption from the individual perspective alone is one-sided and unscientific.

At present, most research on consumption upgrading is carried out from the macro perspective and is based on national statistical yearbook data. The analysis of consumption upgrading from a macro perspective is relatively simple. However, it is difficult to find individual consumption structures at the micro level from the perspective of consumers. From the perspective of management and economics, consumption upgrading is difficult to measure, there is no strict boundary with which to distinguish between consumption upgrading and nonupgrading, and relevant experimental data is also difficult to obtain. The authors of this paper quantify the judgment process of consumption upgrading, propose a set of evaluation criteria, and use big data with comprehensive coverage of user features for mining. Therefore, the algorithm proposed in this paper is scientific and accurate.

This paper combines the consumption structure analysis method and data processing technology to extract collective wisdom and construct an economic map that describes urban economic hotspots. The algorithm studies consumers' consumption structures to investigate the mechanism of consumption upgrading and builds a consumption upgrading model to analyze whether consumption upgrading occurred. The results obtained from this multiangle and multidimensional research are intended to be comprehensive and reasonable.

2. Related Work

The research on consumption function theory is mainly focused on the Permanent Income Hypothesis (PIH) and the Life Cycle Hypothesis (LCH). The only difference between PIH and LCH is that the former usually uses an indefinite (infinite) horizon while the latter uses a definite (finite) one, so they are generally called LC-PIH when combined. Many researchers [1–4] use LC-PIH to study the consumption problems of Western residents, but their conclusions are inconsistent.

Hindawi, Complexity, Volume 2019, Article ID 6543590, 8 pages, https://doi.org/10.1155/2019/6543590

With the development of the Internet economy, an increasing number of researchers focus on the impact of Internet technology. Electronic commerce (e-commerce) is any type of business or commercial transaction that involves information transfer across the Internet [5]. A user's behaviors on a website can reflect their interests and purchase intentions. Therefore, consumption structure can be analyzed by studying the data of e-commerce platforms, which is difficult to achieve using traditional research methods.

When analyzing the characteristics of a network, the distribution of user activity is investigated, and a network of bidders that is connected by common interest in individual articles is constructed [6, 7]. The network's cluster structure corresponds with the main user groups according to common interests, exhibiting hierarchy and overlap.

Regarding the characteristics of users, Curme [5], Chattopadhyay [8], and Guo [9] analyze user behaviors from the perspective of complex systems and extract implicit semantic information from large-scale semistructured data. Glass [10], Singh [11], and Aviano [12] find personal characteristics through feature extraction from original website data, mine important features of users using feature selection methods, and finally extract a user profile model. Kim [13], Ouaftouh [14], and Diao [15] classify customers into different groups according to similarities that are calculated from demographic and psychological features, or from features such as customer value, customer consumption, and lifetime value.

3. Classification of Consumption Data

The rapid development of social networking has resulted in explosive growth of data that contains large amounts of high-quality information, such as information about user interests and interpersonal relationships. Therefore, social data processing and knowledge mining are intricate and indispensable. Consumption data contains much redundant or nonrepresentative data. We adopt the matrix analysis method to quickly classify consumption data into negative data, positive data, insufficient data, and controversial data.

Definition 1 (negative data). The normalized number of negative comments is greater than 0, and the normalized number of positive comments is less than 0.

Definition 2 (positive data). The normalized number of positive comments is greater than 0, and the normalized number of negative comments is less than 0.

Definition 3 (insufficient data). The numbers of positive and negative comments are both relatively small, so there is no authoritative data on which to base a judgment about the nature of the data.

Definition 4 (controversial data). The numbers of positive and negative comments are both relatively large, so there is no clearly salient feature on which to base a judgment about the nature of the data.

Because socialized data can reflect the characteristics of and interpersonal relationships in real society, knowledge of regional and overall consumption structure (the coverage of the analysis results is determined by the data capture granularity) can be acquired by analyzing the <user, comment> data gathered from websites. After word segmentation and semantic analysis, we obtain the <consumption object, comment times, positive times, negative times> data. Because the evaluation data differs greatly between consumption objects, we standardize the data to keep it within a common range and to measure the popularity of different consumption objects. The formula for data normalization is as follows:

$$
(x, y) = \left( \frac{x_i - \bar{x}}{s_x}, \; \frac{y_i - \bar{y}}{s_y} \right) \tag{1}
$$

$$
\bar{x} = \frac{1}{n} \sum_{p=1}^{n} x_p \tag{2}
$$

$$
\bar{y} = \frac{1}{n} \sum_{p=1}^{n} y_p \tag{3}
$$

$$
s_x = \sqrt{\frac{1}{n-1} \sum_{p=1}^{n} (x_p - \bar{x})^2} \tag{4}
$$

$$
s_y = \sqrt{\frac{1}{n-1} \sum_{p=1}^{n} (y_p - \bar{y})^2} \tag{5}
$$

(x, y) is the calculated result after standardization. Its mean value is 0, its variance is 1, and it is dimensionless. (x, y) can be mapped to a two-dimensional coordinate interval [-1, +1]. The standardized variable value fluctuates around 0. A value greater than 0 indicates that (x, y) is higher than the average level, and a value less than 0 indicates that (x, y) is lower than the average level.
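The standardization in Eqs. (1)–(5) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `standardize` and the sample counts are hypothetical, and the sample standard deviation (divisor n - 1) is used as in Eqs. (4) and (5). Z-scores are not bounded by themselves; the text does not specify how the values are mapped into [-1, +1], so no such rescaling is attempted here.

```python
import math


def standardize(values):
    """Return z-scores using the sample standard deviation (divisor n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if std == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(v - mean) / std for v in values]


# Hypothetical comment counts for four consumption objects.
positive_times = [120, 30, 45, 5]
negative_times = [10, 40, 8, 2]

x = standardize(positive_times)  # standardized positive counts
y = standardize(negative_times)  # standardized negative counts
points = list(zip(x, y))         # one (x, y) node per consumption object
```

Each entry of `points` is then the (x, y) coordinate of one consumption object in the consumption matrix.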

The authors of this paper use nodes to describe consumption objects. Therefore, the location of a node in the matrix describes the status of the corresponding consumption object. In the consumption matrix, the X and Y coordinates represent the standardized positive data and the standardized negative data, respectively.

The consumption matrix is divided into four regions, as shown in Figure 1, so the consumption data is divided into four categories.

Figure 1: Coordinate distribution of four types of nodes (regions: negative, positive, controversial, insufficient; both axes range from -1 to +1).

It can be seen from Figure 1 that the X-coordinate value of a node in the negative partition is less than 0 and its Y-coordinate value is greater than 0, which indicates that the nodes in the negative partition have more negative data than the average. The nodes in the positive partition are just the opposite. The other two types of nodes are controversial and insufficient nodes. A controversial node has much positive and much negative information. An insufficient node has little positive and little negative information. Neither of these two types of nodes can be assigned to one specific partition. To reduce the impact of invalid or redundant data on the accuracy of the analysis, the authors of this paper focus only on the nodes in the positive and negative areas. In addition, manual analysis of the nodes in the positive and negative regions shows that the data about these nodes is authoritative and clean. It is enough to describe the consumption structure of users and is sufficient for the subsequent experiments.
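The partitioning described above amounts to a sign test on the two standardized coordinates. The following sketch assumes that "relatively large" and "relatively small" in Definitions 3 and 4 mean above and below the average in both coordinates; the function name `classify`, the `boundary` label for points lying exactly on an axis, and the sample points are hypothetical and not part of the original method.

```python
def classify(x, y):
    """Classify a node by its standardized positive (x) and negative (y) counts."""
    if x > 0 and y < 0:
        return "positive"
    if x < 0 and y > 0:
        return "negative"
    if x > 0 and y > 0:
        return "controversial"
    if x < 0 and y < 0:
        return "insufficient"
    # Assumption: the text does not say how points on an axis are handled.
    return "boundary"


# Hypothetical standardized nodes: (x, y) per consumption object.
nodes = {
    "object_a": (0.8, -0.4),
    "object_b": (-0.6, 0.7),
    "object_c": (0.5, 0.9),
    "object_d": (-0.3, -0.2),
}

# Keep only the nodes in the positive and negative partitions.
kept = {name: xy for name, xy in nodes.items()
        if classify(*xy) in ("positive", "negative")}
```

Here `kept` contains only `object_a` and `object_b`, mirroring the decision to drop controversial and insufficient nodes before further analysis.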

4. Analysis of Consumption Coefficient

From the data obtained in the above process, the matrix ITEM(x, y) is constructed from positive and negative data. Each node in the matrix describes the positive and negative comments about a consumption object. By comparing the consumption matrixes from different periods, consumption trends, changes to the structure, and consumption upgrading can be judged and identified.

With the continuous expansion of consumption scale, consumers' personalized demands are becoming increasingly obvious. Consumer behavior, such as purchase decisions, is affected by multiple factors, so consumption hotspots and structures often change. This change may be strong or weak. Significant changes in structure are relatively easy to detect, but weak changes are difficult to capture. Therefore, to identify changes in consumption structures, it is necessary to calculate the proportions of different consumption objects in the total consumption field and to discover changes of consumption structure in time by analyzing changes in these proportions. In this paper, we calculate the consumption coefficient of each consumption object, measure the proportion of different consumption objects in the total consumption field, and then identify changes and trends in the consumption structures of users.

The formula for calculating the consumption coefficient is as follows:

$$
coeff = \frac{heat(q)}{heat(all)} \times 100\% \tag{6}
$$

coeff is the consumption coefficient of consumption object q, heat(q) represents the heat of the consumption object q, and heat(all) represents the heat of all consumption objects. It can be seen from the formula that the consumption coefficient is the proportion of a certain consumption object in the total of all consumption objects. The consumption coefficient is added to the matrix ITEM(x, y) as an additional parameter, so the expression of the matrix becomes ITEM(x, y, coeff). Therefore, the structural characteristics of consumption can be described in three dimensions. By analyzing ITEM(x, y, coeff), the implied information about consumption structure, consumption trend, and consumption upgrading can be obtained. The detailed analysis process is described in the next section.
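As a small worked example of Eq. (6), the sketch below computes the coefficient for every consumption object at once. The text does not define how heat is measured, so the heat values here are simply assumed to be nonnegative counts (for example, comment counts); the function name and the sample objects are hypothetical.

```python
def consumption_coefficients(heat):
    """Map each consumption object to its share of total heat, in percent."""
    total = sum(heat.values())
    if total <= 0:
        raise ValueError("total heat must be positive")
    return {obj: h / total * 100 for obj, h in heat.items()}


# Hypothetical heat values for three consumption objects.
heat = {"restaurants": 600, "cinemas": 300, "gyms": 100}
coeff = consumption_coefficients(heat)
# restaurants account for 60% of the total heat, cinemas 30%, gyms 10%
```

Because the coefficients are shares of one total, they sum to 100, which makes matrixes from different periods comparable even when the overall volume of comments changes.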

5. Discovery of Individual Consumption Upgrading

To compare the consumption data from different periods, the degree of difference between the matrixes that describe the consumption structures in those periods needs to be calculated. If the degree of difference exceeds a certain threshold value, consumption upgrading is considered to have happened. Here, the consumption matrix at moment n is denoted as ITEM_n, and the consumption matrix at moment n+1 is denoted as ITEM_{n+1}. By comparing ITEM_n and ITEM_{n+1}, the differences between consumption matrixes can be calculated and structural changes can be detected. The formula for calculating the degree of difference is as follows:

$$
COR(ITEM_n, ITEM_{n+1}) = \frac{\sum_{i=1}^{n} \left( ITEM_i - \overline{ITEM_n} \right) \sum_{j=1}^{n+1} \left( ITEM_j - \overline{ITEM_{n+1}} \right)}{\sqrt{\sum_{i=1}^{n} \left( ITEM_i - \overline{ITEM_n} \right)^2 \sum_{j=1}^{n+1} \left( ITEM_j - \overline{ITEM_{n+1}} \right)^2}} \tag{7}
$$

According to the formula, the coefficient COR is obtained by dividing the covariance by the product of the standard deviations of the two variables. The covariance reflects the degree of correlation between two random variables. When the covariance is greater than 0, the two variables are positively correlated, and when the covariance is less than 0, they are negatively correlated. Note that the coefficient is meaningful only when neither variable has a standard deviation of zero, and the range of the coefficient is [-1, 1]. When COR is 1, ITEM_n and ITEM_{n+1} are completely positively correlated. When COR is -1, ITEM_n and ITEM_{n+1} are completely negatively correlated. The greater the absolute value of COR, the stronger the correlation between ITEM_n and ITEM_{n+1}. The closer COR is to 0, the weaker the correlation between ITEM_n and ITEM_{n+1}.

Figure 2: The coefficient COR of different consumption structures. (a) COR is +1.0; (b) COR is 0; (c) COR is -1.0. Axes in each panel: X, Y.

Through the above methods, we can build consumption matrixes for different periods and judge the differences in consumption structure between periods by calculating the coefficient COR. When COR approaches 1 or -1, the consumption structure is taken to have changed significantly, so consumption upgrading can be considered to have occurred. As shown in Figure 2, the closer the coefficient is to 1 or -1, the greater the structural difference, while a coefficient near 0 indicates that the consumption structure changed little and there is no upgrading.
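The comparison between two periods can be sketched as follows. This sketch reads COR as the standard Pearson correlation coefficient, which matches the verbal description (covariance divided by the product of the standard deviations); it also assumes that each period is summarized by one value per consumption object, listed in the same order in both periods. The function names, the sample vectors, and the threshold 0.8 are hypothetical. The decision rule simply follows the criterion as stated in the text, namely that |COR| close to 1 is taken as upgrading.

```python
import math


def cor(a, b):
    """Pearson correlation of two equally long sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("COR is undefined when a sequence is constant")
    return cov / math.sqrt(var_a * var_b)


def upgrading_detected(c, threshold=0.8):
    """Criterion as stated in the text; the threshold value is an assumption."""
    return abs(c) >= threshold


# Hypothetical consumption coefficients of the same three objects
# at moment n and at moment n+1.
item_n = [1.0, 2.0, 3.0]
item_n1 = [6.0, 4.0, 2.0]

c = cor(item_n, item_n1)
flag = upgrading_detected(c)
```

With these sample vectors the ranking of the objects is exactly reversed between the two periods, so `c` is -1 and `flag` is true under the stated criterion.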

6. Experiment

The datasets used throughout the experiments are Zachary's Karate Club (http://www-personal.umich.edu/~mejn/netdata), Dolphin's Associations (http://www-personal.umich.edu/~mejn/netdata), LesMiserables (http://wiki.gephi.org/index.php/Datasets), MovieLens (http://www.datatang.com/datares/detail.aspx?id=44295), and the EP dataset (http://www.dianping.com).

(1) The dataset of Zachary's Karate Club is a social network of friendships between 34 members; edges in the graph describe the higher frequency of interactions between members.
(2) The dataset of Dolphin's Associations is an undirected social network of frequent associations between 62 dolphins, which has 62 nodes and 159 edges.
(3) The dataset of LesMiserables is a coappearance network of characters in Les Miserables, which contains 77 nodes and 254 edges.
(4) The dataset of MovieLens is a synthesized recommendation system and virtual community, which is commonly used for social computing.
(5) The EP dataset was captured from an e-commerce platform (dianping.com). It contains 15,890,209 pieces of data and was updated in August 2018. The data collection fields are shop id (unique), province, city, city id, area, big cate (the primary classification), big cate id, small cate (the secondary classification), small cate id, service rating, all remarks, very good remarks (5-star review), good remarks (4-star review), common remarks (3-star review), bad remarks (2-star review), and very bad remarks (1-star review).

Figure 3: The positive and negative distribution of comments.

Comparison Methods. NMFOSC [16] presents an approach to community detection that utilizes a nonnegative matrix factorization model to divide overlapping communities from networks. RNM [17] is a local expansion method based on rough neighborhood. CPM [18] greedily expands natural communities of seeds, using a local fitness function, until the whole graph is covered. EdgeB-Cluster [19] bundles similar edges, adjusts the locations of nodes to optimize the visualized output of the graph, and analyzes networks at the community level.

Through the analysis of a consumption object in a certain region of the e-commerce platform, it was found that the number of positive comments is very large. This is because good comments are sometimes deliberately added to improve a store's reputation, which results in too many good comments. In contrast, the numbers of neutral and negative comments are relatively reasonable, and few of these comments are intentionally added or deleted, so they are convincing. Based on the above factors, the authors of this paper did not analyze the quantity of positive comments and considered only the quantity of neutral and negative comments. Experimental verification of the quality of the neutral and negative comments showed that the data is authentic, abundant, and sufficient to describe the object to be tested.

Figure 3 shows the distribution of positive and negative comments. The nodes in the insufficient and controversial areas do not provide valuable information for subsequent analysis, so they were removed. It can be seen that there are more positive nodes than negative ones and that the difference between them is large. It is important to note that, although the nodes in the negative area represent negative user comments, they still provide useful knowledge about consumption trends and therefore cannot be removed.

Figure 3 shows the regional analysis results, while Figure 4 shows the overall analysis results. Figure 4 is the visual output of the results generated by the algorithm. Figure 4(a) shows the distribution of nodes, and the colors of the nodes indicate the heat of different consumption objects. The darker a node's color is, the more attention the node received. Figure 4(b) is a heat map of a center node, and the black node is the center node. Figure 4(b) shows that there is a certain correlation between the central node and a large number of other nodes, indicating that there are many high correlations between different consumer groups. Thus, the characteristics of consumption objects can be further analyzed based on the relationships between consumers and commodities.

Figure 5 depicts the distribution of different consumption objects. In this case, a node with a ratio of more than 0.6 is regarded as a popular consumption node, while a node with a ratio of less than or equal to 0.6 is regarded as an unpopular consumption node. Of course, if the ratio threshold is lowered, additional nodes will fall into the popular consumption area. It can be seen from Figure 5 that most nodes belong to the nonhot field, which is in line with the actual situation.

Figure 6(a) depicts the characteristics of nodes that were divided into two categories to describe different consumption heat (some representative nodes are extracted). It can be seen that the characteristics of nodes in different categories vary greatly. Figure 6(b) describes the closeness centrality distribution of the nodes that belong to the same category. It shows that the node locations follow a normal distribution, so the similarity between the nodes in the same category is very high. That is to say, the classification is reasonable.

Figure 4: Visualization of the results generated by the algorithm. (a) Distribution of nodes; (b) The heat map.

Figure 5: The distribution of different consumption objects (x-axis: node, ×10^4; y-axis: ratio, from 0 to 1).

For the purpose of analyzing the experimental results, the following measurement parameters are used [20]. Multiplicity Precision is calculated by $MP = \min(|C(e) \cap C(e')|, |L(e) \cap L(e')|) / |C(e) \cap C(e')|$, and Multiplicity Recall by $MR = \min(|C(e) \cap C(e')|, |L(e) \cap L(e')|) / |L(e) \cap L(e')|$. Let L(e) and C(e) denote the category and the cluster of an item e; e is a cluster with n items belonging to the same category, and e' is a cluster merging n items from unary categories. FB is a comprehensive measure of MP and MR, calculated as $FB = 2 \times MP \times MR / (MP + MR)$.
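The three measures can be illustrated for a single pair of items. The sketch below treats C(e) and L(e) as sets of cluster and category labels, which is an assumption about the data representation; the function names and labels are hypothetical. It returns `None` when a measure is undefined because the pair shares no cluster (for MP) or no category (for MR), and it does not aggregate over all item pairs.

```python
def multiplicity_scores(clusters_e, clusters_e2, categories_e, categories_e2):
    """Return (MP, MR) for one pair of items given their label sets."""
    shared_clusters = len(set(clusters_e) & set(clusters_e2))
    shared_categories = len(set(categories_e) & set(categories_e2))
    overlap = min(shared_clusters, shared_categories)
    mp = overlap / shared_clusters if shared_clusters else None
    mr = overlap / shared_categories if shared_categories else None
    return mp, mr


def f_measure(mp, mr):
    """FB = 2 * MP * MR / (MP + MR); defined as 0 when both are 0."""
    if mp + mr == 0:
        return 0.0
    return 2 * mp * mr / (mp + mr)


# Hypothetical pair: the items share two clusters but only one category.
mp, mr = multiplicity_scores({"c1", "c2"}, {"c1", "c2"},
                             {"food"}, {"food", "film"})
fb = f_measure(mp, mr)
```

For this pair, MP is 0.5 (two shared clusters, one shared category), MR is 1.0, and FB is 2/3. As a consistency check against the reported results, `f_measure(0.92, 1.00)` rounds to 0.96.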

Table 1 supports the validity and feasibility of the algorithm. The numbers in italic indicate the highest value of the same parameter in each row. Table 1 compares the algorithm proposed in this paper with other similar algorithms. The datasets Karate Club, Dolphin, LesMiserables, and MovieLens are used to evaluate the performance of the algorithms in structural analysis. The EP dataset is used to evaluate performance in e-commerce data analysis. It is found that EC-Structure performs better than the other algorithms and performs stably across different datasets. The main reasons for which EC-Structure is superior to the other algorithms are that (1) it reduces the influence of erroneous e-business platform data on the algorithm, (2) it adds the consumption coefficient as a parameter with which to measure the proportions of different consumption objects, and (3) the coefficient COR can help researchers accurately judge changes in consumption structures. Therefore, the algorithm is effective in operation.

7. Conclusion

Research on consumer behavior can be carried out by extracting and analyzing useful information from a large amount of incomplete, vague, and random consumer behavior data. The algorithm proposed in this paper builds consumption structures and a consumption upgrading model based on data from e-commerce platforms to analyze whether consumption upgrading occurred. The results of the experiment verified the implementation efficacy and analysis accuracy of the algorithm. The implementation efficacy of the proposed algorithm is superior to that of the other algorithms, and it runs stably with different datasets.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the Youth Fund of Humanity and Social Science of the Ministry of Education of China (Grant no. 18YJCZH041) and the Project of the Education Department of Jilin Province of China (Grant no. JJKH20190612SK).

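Table 1: The performance comparisons.

| Dataset       | NMFOSC             | RNM                | CPM                | EdgeB-Cluster      | EC-Structure       |
|               | MR   / MP   / FB   | MR   / MP   / FB   | MR   / MP   / FB   | MR   / MP   / FB   | MR   / MP   / FB   |
| Karate Club   | 1.00 / 0.92 / 0.96 | 0.84 / 1.00 / 0.91 | 0.58 / 0.94 / 0.71 | 1.00 / 1.00 / 1.00 | 1.00 / 1.00 / 1.00 |
| Dolphin       | 0.64 / 0.90 / 0.75 | 0.46 / 0.97 / 0.62 | 0.40 / 0.94 / 0.56 | 0.73 / 0.98 / 0.83 | 0.80 / 0.98 / 0.88 |
| LesMiserables | 0.72 / 0.87 / 0.79 | 0.80 / 0.88 / 0.84 | 0.48 / 0.89 / 0.62 | 0.81 / 0.88 / 0.84 | 0.88 / 0.83 / 0.85 |
| MovieLens     | 0.83 / 0.85 / 0.84 | 0.56 / 0.86 / 0.68 | 0.81 / 0.86 / 0.83 | 0.81 / 0.88 / 0.84 | 0.82 / 0.88 / 0.85 |
| EP dataset    | 0.85 / 0.80 / 0.82 | 0.53 / 0.56 / 0.54 | 0.53 / 0.65 / 0.58 | 0.79 / 0.83 / 0.81 | 0.85 / 0.82 / 0.83 |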

Figure 6: The characteristic distribution of hot consumption and cold consumption nodes. (a) Degree distribution (x-axis: Value; y-axis: Count; regions labeled cold consumption area and hot consumption area); (b) Closeness centrality distribution (x-axis: Value; y-axis: Count).

References

[1] P. N. Ireland, "Using the permanent income hypothesis for forecasting," Federal Reserve Bank of Richmond Economic Quarterly, vol. 81, no. 1, pp. 49–63, 1995.

[2] L. A. Fisher and G. Kingston, "Improved forecasts of tax revenue via the permanent income hypothesis," Australian Economic Review, vol. 50, no. 1, pp. 21–31, 2017.

[3] L. Zhou, C. Wang, and S. O. Finance, "Household debt and consumption-evidence from micro data," Soft Science, vol. 3, pp. 32–43, 2018.

[4] M. Zagler, "Empirical evidence on growth and business cycles," Empirica, vol. 44, pp. 1–20, 2017.

[5] C. Curme, T. Preis, and H. E. Stanley, "Quantifying the semantics of search behavior before stock market moves," Proceedings of the National Academy of Sciences of the United States of America, vol. 111, no. 32, pp. 11600–11605, 2014.

[6] J. Reichardt and S. Bornholdt, Ebay users form stable groups of common interest, 2005.

[7] H. Halpin, "The semantics of search," in Social Semantics, pp. 149–186, Springer US, 2013.

[8] T. Chattopadhyay, S. Maiti, A. Pal et al., "Automatic discovery of emerging trends using cluster name synthesis on user consumption data: extended abstract," in Proceedings of International Conference Companion on World Wide Web, pp. 981–983, 2016.

[9] L. Guo, W. Zuo, and T. Peng, "Inference network building and movements prediction based on analysis of induced dependencies," IET Software, vol. 11, no. 1, pp. 12–17, 2017.

[10] B. Glass, Z. Benenson, and R. Landwirth, "Look before you leap: improving the users' ability to detect fraud in electronic marketplaces," in Proceedings of the CHI Conference on Human Factors in Computing Systems, pp. 3870–3882, ACM, 2016.

[11] P. Singh and M. Singh, "Fraud detection by monitoring customer behavior and activities," Annals of Regional Science, vol. 49, no. 1, pp. 1–27, 2012.

[12] D. Aviano, B. L. Putro, and E. P. Nugroho, "Behavioral tracking analysis on learning management system with apriori association rules algorithm," in Proceedings of the 2017 3rd International Conference on Science in Information Technology (ICSITech), Bandung, Indonesia, 2017.

[13] K. Kim, Y. Choi, and J. Park, "Pricing fraud detection in online shopping malls using a finite mixture model," Electronic Commerce Research and Applications, vol. 12, no. 3, pp. 195–207, 2013.

[14] S. Ouaftouh, A. Zellou, and A. Idri, "User profile model: a user dimension based classification," in Proceedings of the 2015 10th International Conference on Intelligent Systems: Theories and Applications (SITA), Rabat, Morocco, 2015.

[15] Y. Diao, K. Y. Liu, and L. Hu, "Classification of massive user load characteristics in distribution network based on agglomerative hierarchical algorithm," in Proceedings of the 2016 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), Chengdu, China, 2017.

[16] N. Chen, Y. Liu, and H.-C. Chao, "Overlapping community detection using non-negative matrix factorization with orthogonal and sparseness constraints," IEEE Access, vol. 6, pp. 21266–21274, 2017.

[17] Z. H. Zhang, D. Q. Miao, and J. Qian, "Detecting overlapping communities with heuristic expansion method based on rough neighborhood," Chinese Journal of Computer, vol. 36, no. 10, 2013.

[18] F. Havemann, M. Heinz, and A. Struch, "Identification of overlapping communities and their hierarchy by locally calculating community-changing resolution levels," Journal of Statistical Mechanics: Theory and Experiment, vol. 1, 2011.

[19] L. Guo, W. Zuo, T. Peng, and B. K. Adhikari, "Attribute-based edge bundling for visualizing social networks," Physica A: Statistical Mechanics and Its Applications, vol. 438, pp. 48–55, 2015.

[20] E. Amigo, J. Gonzalo, J. Artiles, and F. Verdejo, "A comparison of extrinsic clustering evaluation metrics based on formal constraints," Information Retrieval, vol. 12, no. 4, pp. 461–486, 2009.

Hindawiwwwhindawicom Volume 2018

MathematicsJournal of


Mathematical Problems in Engineering

Applied MathematicsJournal of


Probability and StatisticsHindawiwwwhindawicom Volume 2018

Journal of


Mathematical PhysicsAdvances in

Complex AnalysisJournal of


OptimizationJournal of



Engineering Mathematics

International Journal of


Operations ResearchAdvances in

Journal of


Function SpacesAbstract and Applied AnalysisHindawiwwwhindawicom Volume 2018

International Journal of Mathematics and Mathematical Sciences


Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018Volume 2018

Numerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisAdvances inAdvances in Discrete Dynamics in

Nature and SocietyHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Dierential EquationsInternational Journal of

Volume 2018


Decision SciencesAdvances in


AnalysisInternational Journal of


Stochastic AnalysisInternational Journal of

Submit your manuscripts atwwwhindawicom

2 Complexity

that the former usually uses an indefinite limitation while thelatter uses a definite limitation so they are generally calledLC-PIHwhen combinedMany researchers [1ndash4] use LC-PIHto study the consumption problems of Western residents buttheir conclusions are inconsistent

With the development of the Internet economy anincreasing number of researchers focus on the impact ofInternet technology Electronic commerce (e-commerce) isany type of business or commercial transaction that involvesinformation transfer across the Internet [5] A userrsquos behav-iors on a website can reflect their interests and purchaseintentions Therefore consumption structure can be analyzedby studying the data of e-commerce platforms which isdifficult to realize using to traditional research methods

When analyzing the characteristics of a network thedistribution of user activity is investigated and a network ofbidders that is connected by common interest in individualarticles is constructed [6 7] The networkrsquos cluster structurecorrespondswith themain user groups according to commoninterests exhibiting hierarchy and overlap

Regarding the characteristics of uses Curme [5] Chat-topadhyay [8] and Guo [9] analyze user behaviors from theperspective of complex systems and extract implicit semanticinformation from large-scale semistructured data Glass [10]Singh [11] and Aviano [12] find personal characteristicsthrough the feature extraction of original website datamine important features of users using to feature selectionmethods and finally extract a user profile model Kim [13]Ouaftouh [14] and Diao [15] classify customers into differentgroups according to their similarities that are calculatedthrough demographic features and psychological features orfeatures such as customer value customer consumption andlifetime value

3 Classification of Consumption Data

The rapid development of social networking resulted in theexplosive growth of data that contains large amounts ofhigh-quality information such as information about userinterests and interpersonal relationships Therefore socialdata processing and knowledge mining are intricate andindispensable Consumption data contains much redundantor nonrepresentative data We adopt the matrix analysismethod to quickly classify consumption data into negativedata positive data insufficient evidence data and disputeddata

Definition 1 (negative data). The normalized value of negative comment times is greater than 0, and the normalized value of positive comment times is less than 0.

Definition 2 (positive data). The normalized value of positive comment times is greater than 0, and the normalized value of negative comment times is less than 0.

Definition 3 (insufficient data). The number of positive and negative comments is relatively small, so there is no authoritative data based on which to measure the nature of the data.

Definition 4 (controversial data). The number of positive and negative comments is relatively large, so there is no way to find absolute salient features based on which to measure the nature of the data.

Because socialized data can reflect the characteristics of, and interpersonal relationship information about, real society, knowledge of regional and overall consumption structure (the coverage of the analysis results is determined by the data capture granularity) can be acquired by analyzing the <user, comment> data gathered from websites. After word segmentation and semantic analysis, we can obtain the <consumption object, comment times, positive times, negative times> data. Because the evaluation data differ greatly between consumption objects, we standardize the data to a common range in order to measure the popularity of different consumption objects. The formula for data normalization is shown as follows:

$$(x, y) = \left(\frac{x_i - \bar{x}}{s_x},\ \frac{y_i - \bar{y}}{s_y}\right) \tag{1}$$

$$\bar{x} = \frac{1}{n}\sum_{p=1}^{n} x_p \tag{2}$$

$$\bar{y} = \frac{1}{n}\sum_{p=1}^{n} y_p \tag{3}$$

$$s_x = \sqrt{\frac{1}{n-1}\sum_{p=1}^{n}\left(x_p - \bar{x}\right)^2} \tag{4}$$

$$s_y = \sqrt{\frac{1}{n-1}\sum_{p=1}^{n}\left(y_p - \bar{y}\right)^2} \tag{5}$$

(x, y) is the calculated result after standardization. Its mean value is 0, its variance is 1, and it is dimensionless. (x, y) can be mapped to a two-dimensional coordinate interval [-1, +1]. The standardized variable value fluctuates around 0. A value greater than 0 indicates that (x, y) is higher than the average level, and a value less than 0 indicates that (x, y) is lower than the average level.
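The standardization in Eqs. (1)-(5) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `standardize` and the sample comment counts are hypothetical, and the only thing taken from the text is the use of the sample standard deviation (divisor n - 1).

```python
import math


def standardize(values):
    """Return z-scores using the sample standard deviation (n - 1 divisor)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n                                   # Eqs. (2)/(3)
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    std = math.sqrt(variance)                                # Eqs. (4)/(5)
    if std == 0:
        raise ValueError("all values are identical; standard deviation is zero")
    return [(v - mean) / std for v in values]                # Eq. (1)


# Hypothetical positive and negative comment counts for five consumption objects.
positive_times = [120, 45, 300, 8, 77]
negative_times = [15, 60, 280, 3, 5]

x = standardize(positive_times)
y = standardize(negative_times)
```

Each pair `(x[i], y[i])` is then the standardized position of consumption object `i`, with both coordinates centered on 0.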

The authors of this paper use nodes to describe consumption objects. Therefore, the locations of nodes in the matrix can describe the status of consumption objects. In the consumption matrix, the X and Y coordinates represent the standardized values of the positive data and the negative data, respectively. Across the four regions of the matrix, the consumption data is divided into four categories.
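The four-way split can be sketched as a simple sign test on the standardized coordinates. This is a sketch under stated assumptions: the function name `classify` is hypothetical, "relatively large" and "relatively small" in Definitions 3 and 4 are read as above and below the average on both axes, and nodes lying exactly on an axis are labeled "boundary" because the definitions use strict inequalities.

```python
def classify(x, y):
    """Map standardized (positive, negative) comment counts to a partition name.

    x: standardized positive comment times
    y: standardized negative comment times
    """
    if x < 0 and y > 0:
        return "negative"        # Definition 1
    if x > 0 and y < 0:
        return "positive"        # Definition 2
    if x < 0 and y < 0:
        return "insufficient"    # Definition 3, read as both below average
    if x > 0 and y > 0:
        return "controversial"   # Definition 4, read as both above average
    return "boundary"            # assumption: a coordinate equal to 0 is left unassigned


# Hypothetical standardized nodes.
labels = [classify(-0.5, 0.7), classify(0.9, -0.2), classify(0.4, 0.6), classify(-0.3, -0.8)]
```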

The consumption matrix is divided into four regions, as shown in Figure 1.

It can be seen from Figure 1 that the X-coordinate value of a node in the negative partition is less than 0 and its Y-coordinate value is greater than 0, which indicates that the nodes in the negative partition have more negative data than the average. The nodes in the positive partition are just the opposite. The other two types of nodes are controversial and insufficient nodes. Among them the controversial node

[Figure 1: The consumption matrix, divided into four regions (negative, positive, controversial, insufficient); both axes run from -1 to +1.]






$$\mathrm{coeff} = \frac{\mathit{heat}(q)}{\mathit{heat}(\mathit{all})} \times 100 \tag{6}$$
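Eq. (6) expresses the heat of one item q as a percentage of the overall heat. The sketch below is illustrative only: the text shown here does not define how heat is measured, so the code assumes that heat is a nonnegative number per consumption object and that heat(all) is the sum over all objects. The function name `heat_coefficient` and the sample values are hypothetical.

```python
def heat_coefficient(heat, q):
    """Return heat(q) as a percentage of the total heat, following Eq. (6).

    heat: dict mapping each consumption object to its heat value
    q:    the object of interest
    """
    total = sum(heat.values())   # assumption: heat(all) is the sum over all objects
    if total == 0:
        raise ValueError("total heat is zero; the coefficient is undefined")
    return 100 * heat[q] / total


# Hypothetical heat values for three consumption objects.
heat = {"phone": 30, "laptop": 50, "camera": 20}
coeff_phone = heat_coefficient(heat, "phone")
```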






(7)


[Figure: Scatter plots of Y against X for three correlation (COR) values: (a) COR is +1.0, (b) COR is 0, (c) COR is -1.0.]




6 Experiment














[Figure: ratio (0 to 1) plotted against node count (×10^4, 0 to 7).]





7 Conclusion


Data Availability




Acknowledgments


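Table 1: The performance comparisons.

| Dataset | NMFO MR | NMFO MP | NMFO FB | SCRN MR | SCRN MP | SCRN FB | MCP MR | MCP MP | MCP FB | MEdgeB-Cluster MR | MEdgeB-Cluster MP | MEdgeB-Cluster FB | EC-Structure MR | EC-Structure MP | EC-Structure FB |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| KarateClub | 1.00 | 0.92 | 0.96 | 0.84 | 1.00 | 0.91 | 0.58 | 0.94 | 0.71 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Dolphin | 0.64 | 0.90 | 0.75 | 0.46 | 0.97 | 0.62 | 0.40 | 0.94 | 0.56 | 0.73 | 0.98 | 0.83 | 0.80 | 0.98 | 0.88 |
| LesMiserables | 0.72 | 0.87 | 0.79 | 0.80 | 0.88 | 0.84 | 0.48 | 0.89 | 0.62 | 0.81 | 0.88 | 0.84 | 0.88 | 0.83 | 0.85 |
| MovieLens | 0.83 | 0.85 | 0.84 | 0.56 | 0.86 | 0.68 | 0.81 | 0.86 | 0.83 | 0.81 | 0.88 | 0.84 | 0.82 | 0.88 | 0.85 |
| EPdataset | 0.85 | 0.80 | 0.82 | 0.53 | 0.56 | 0.54 | 0.53 | 0.65 | 0.58 | 0.79 | 0.83 | 0.81 | 0.85 | 0.82 | 0.83 |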
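The relationship between the three metric columns of Table 1 can be checked with a short script. This is a sketch under an assumption: the text shown here does not define MR, MP, or FB, so the code treats FB as the harmonic mean of MR and MP (an F-measure). The EC-Structure values from Table 1 agree with that reading to within rounding. The function name `f_measure` is hypothetical.

```python
def f_measure(mr, mp):
    """Harmonic mean of two scores; assumed here to be how FB relates to MR and MP."""
    if mr + mp == 0:
        return 0.0
    return 2 * mr * mp / (mr + mp)


# (MR, MP, FB) triples for EC-Structure, copied from Table 1.
ec_structure = {
    "KarateClub": (1.00, 1.00, 1.00),
    "Dolphin": (0.80, 0.98, 0.88),
    "LesMiserables": (0.88, 0.83, 0.85),
    "MovieLens": (0.82, 0.88, 0.85),
    "EPdataset": (0.85, 0.82, 0.83),
}

# Largest difference between the recomputed harmonic mean and the tabulated FB.
max_gap = max(abs(f_measure(mr, mp) - fb) for mr, mp, fb in ec_structure.values())
```

A `max_gap` below 0.01 means the tabulated FB values are consistent with the harmonic-mean reading at two-decimal precision.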

[Figure: heatmap annotated with a "hot consumption area"; two accompanying histograms plot Count against Value.]



References




























