Research Article
EC-Structure: Establishing Consumption Structure through Mining E-Commerce Data to Discover Consumption Upgrade
Lin Guo¹ and Dongliang Zhang²
¹School of Economics and Management, Changchun University of Science and Technology, Jilin 130022, China
²Institution of Technical Science, Fudan University, Shanghai 200000, China
Correspondence should be addressed to Lin Guo; guolin@cust.edu.cn
Received 24 December 2018; Accepted 26 February 2019; Published 12 March 2019
Guest Editor: Thiago C. Silva
Copyright © 2019 Lin Guo and Dongliang Zhang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Traditional methods of analyzing consumption structure have many limitations, and data acquisition is difficult, so it is hard to verify the accuracy of algorithms scientifically. With the development of the Internet economy, many researchers focus on mining knowledge of consumer behavior using big data analysis technology. Because consumption decisions are influenced not only by personal characteristics but also by social trends and environment, it is one-sided to analyze the impact of a single factor on consumption. The authors of this paper combine the consumption structure analysis method with data processing technology, using data from an e-commerce platform to extract the consumption structure of cities, compare the structural differences between periods, and then discover consumption upgrading according to swarm intelligence. Experiments on several different datasets compare the proposed algorithm with similar algorithms and illustrate its efficacy and stable performance in consumption structure analysis.
1. Introduction

With the continuous expansion of consumption scale, consumers' personalized demands are becoming increasingly obvious. Consumer behaviors such as purchasing decisions are influenced not only by personal characteristics but also by interpersonal relationships, social environments, network culture, and so on. Therefore, analyzing consumption from the individual perspective alone is one-sided and unscientific.
At present, most research on consumption upgrading is carried out from the macro perspective and is based on national statistical yearbook data. Analyzing consumption upgrading from a macro perspective is relatively simple. However, it is difficult to find individual consumption structures at the micro level from the perspective of consumers. From the perspective of management and economics, consumption upgrading is difficult to measure, there is no strict boundary between consumption upgrading and nonupgrading, and relevant experimental data is difficult to obtain. The authors of this paper quantify the judgment process of consumption upgrading, propose a set of evaluation criteria, and mine big data with comprehensive coverage of user features. Therefore, the algorithm proposed in this paper is scientific and accurate.
This paper combines the consumption structure analysis method and data processing technology to extract collective wisdom and construct an economic map that describes urban economic hotspots. The algorithm studies consumers' consumption structures to investigate the mechanism of consumption upgrading and builds a consumption upgrading model to analyze whether consumption upgrading has occurred. The results obtained from this multiangle and multidimensional research will be comprehensive and reasonable.
Hindawi, Complexity, Volume 2019, Article ID 6543590, 8 pages, https://doi.org/10.1155/2019/6543590

2. Related Work

Research on consumption function theory mainly focuses on the Permanent Income Hypothesis (PIH) and the Life Cycle Hypothesis (LCH). The only difference between PIH and LCH is that the former usually assumes an indefinite (infinite) time horizon while the latter assumes a definite (finite) one, so they are generally called LC-PIH when combined. Many researchers [1–4] use LC-PIH to study the consumption problems of Western residents, but their conclusions are inconsistent.
With the development of the Internet economy, an increasing number of researchers focus on the impact of Internet technology. Electronic commerce (e-commerce) is any type of business or commercial transaction that involves information transfer across the Internet [5]. A user's behaviors on a website can reflect their interests and purchase intentions. Therefore, consumption structure can be analyzed by studying the data of e-commerce platforms, which is difficult to achieve using traditional research methods.

When analyzing the characteristics of a network, the distribution of user activity is investigated, and a network of bidders connected by common interest in individual articles is constructed [6, 7]. The network's cluster structure corresponds with the main user groups according to common interests, exhibiting hierarchy and overlap.

Regarding the characteristics of users, Curme [5], Chattopadhyay [8], and Guo [9] analyze user behaviors from the perspective of complex systems and extract implicit semantic information from large-scale semistructured data. Glass [10], Singh [11], and Aviano [12] find personal characteristics through feature extraction from original website data, mine important user features using feature selection methods, and finally extract a user profile model. Kim [13], Ouaftouh [14], and Diao [15] classify customers into different groups according to similarities calculated from demographic and psychological features or from features such as customer value, customer consumption, and lifetime value.
3. Classification of Consumption Data

The rapid development of social networking has resulted in the explosive growth of data that contains large amounts of high-quality information, such as information about user interests and interpersonal relationships. Therefore, social data processing and knowledge mining are intricate and indispensable. Consumption data contains much redundant or nonrepresentative data. We adopt the matrix analysis method to quickly classify consumption data into negative data, positive data, insufficient data, and controversial data.
Definition 1 (negative data). The normalized value of negative comment times is greater than 0, and the normalized value of positive comment times is less than 0.

Definition 2 (positive data). The normalized value of positive comment times is greater than 0, and the normalized value of negative comment times is less than 0.

Definition 3 (insufficient data). The number of positive and negative comments is relatively small, so there is no authoritative data based on which to measure the nature of the data.

Definition 4 (controversial data). The number of positive and negative comments is relatively large, so there is no way to find absolute salient features based on which to measure the nature of the data.
Because socialized data can reflect the characteristics of real society and the interpersonal relationships within it, knowledge of regional and overall consumption structure (the coverage of the analysis results is determined by the data capture granularity) can be acquired by analyzing the ⟨user, comment⟩ data gathered from websites. After word segmentation and semantic analysis, we obtain the ⟨consumption object, comment times, positive times, negative times⟩ data. Because the evaluation data for different consumption objects differ greatly, we standardize the data to bring it within a common range and measure the popularity of different consumption objects. The formula for data normalization is as follows:
$$(x, y) = \left( \frac{x_i - \bar{x}}{s_x}, \frac{y_i - \bar{y}}{s_y} \right) \tag{1}$$

$$\bar{x} = \frac{1}{n} \sum_{p=1}^{n} x_p \tag{2}$$

$$\bar{y} = \frac{1}{n} \sum_{p=1}^{n} y_p \tag{3}$$

$$s_x = \sqrt{\frac{1}{n-1} \sum_{p=1}^{n} (x_p - \bar{x})^2} \tag{4}$$

$$s_y = \sqrt{\frac{1}{n-1} \sum_{p=1}^{n} (y_p - \bar{y})^2} \tag{5}$$
(x, y) is the calculated result after standardization. Its mean value is 0, its variance is 1, and it is dimensionless. (x, y) can be mapped to a two-dimensional coordinate interval [-1, +1]. The standardized variable value fluctuates around 0: a value greater than 0 indicates that (x, y) is higher than the average level, and a value less than 0 indicates that (x, y) is lower than the average level.
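As a minimal sketch of Equations (1)-(5), assume the positive and negative comment counts of each consumption object are held in two plain Python lists. The function name `standardize` and the sample counts are hypothetical and do not come from the paper. Note that standardized values are not guaranteed to fall inside [-1, +1]; any further mapping to that interval is not specified in the text and is not attempted here.

```python
import math


def standardize(values):
    """Return z-scores of `values` using the sample standard deviation (n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    mean = sum(values) / n  # Eq. (2) / Eq. (3)
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))  # Eq. (4) / Eq. (5)
    if s == 0:
        raise ValueError("standard deviation is zero; all values are equal")
    return [(v - mean) / s for v in values]  # Eq. (1), one coordinate at a time


# Hypothetical comment counts for four consumption objects (illustration only).
positive_counts = [120, 30, 45, 5]
negative_counts = [10, 40, 2, 8]

x = standardize(positive_counts)  # standardized positive counts (X coordinate)
y = standardize(negative_counts)  # standardized negative counts (Y coordinate)
```

After this step each list has mean 0 and sample variance 1, so the sign of a value tells whether an object is above or below the average level.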
The authors of this paper use nodes to describe consumption objects, so the location of a node in the matrix describes the status of the corresponding consumption object. In the consumption matrix, the X and Y coordinates represent the standardized positive data and the standardized negative data, respectively. The consumption matrix is divided into four regions, one for each category of consumption data, as shown in Figure 1.
Figure 1: Coordinate distribution of four types of nodes. Both axes run from -1 to +1, and the four regions are labeled negative, positive, controversial, and insufficient.

It can be seen from Figure 1 that the X-coordinate value of a node in the negative partition is less than 0 and its Y-coordinate value is greater than 0, which indicates that nodes in the negative partition have more negative data than the average. The nodes in the positive partition are just the opposite. The other two types of nodes are controversial and insufficient nodes. A controversial node has much positive and negative information, whereas an insufficient node has little positive and negative information. Neither of these two types of nodes can be classified into one specific partition. To reduce the impact of invalid or redundant data on the accuracy of the analysis, the authors of this paper focus only on the nodes in the positive and negative areas. In addition, manual analysis of the nodes in the positive and negative regions shows that the data about these nodes is authoritative and clean. It is sufficient to describe the consumption structure of users and to support the subsequent experiments.
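The four-way partition described above can be sketched as a sign test on the standardized coordinates. This is only an illustration under stated assumptions: the names `classify_node` and `keep_informative` are hypothetical, the sample coordinates are made up, and the handling of points lying exactly on an axis (labeled "boundary" here) is an assumption, since the text does not define it.

```python
def classify_node(x, y):
    """Classify a node by the signs of its standardized coordinates.

    x: standardized positive comment count, y: standardized negative comment count.
    """
    if x > 0 and y < 0:
        return "positive"
    if x < 0 and y > 0:
        return "negative"
    if x > 0 and y > 0:
        return "controversial"  # many positive and many negative comments
    if x < 0 and y < 0:
        return "insufficient"   # few positive and few negative comments
    return "boundary"           # assumption: a coordinate equal to 0 is left unclassified


def keep_informative(nodes):
    """Keep only nodes in the positive and negative partitions."""
    return {
        name: xy
        for name, xy in nodes.items()
        if classify_node(*xy) in ("positive", "negative")
    }


# Hypothetical standardized coordinates for four consumption objects.
nodes = {"A": (0.8, -0.4), "B": (-0.6, 0.9), "C": (0.7, 0.5), "D": (-0.3, -0.2)}
kept = keep_informative(nodes)
```

In this example only A (positive) and B (negative) are retained; C is controversial and D is insufficient, so both are dropped before further analysis.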
4. Analysis of Consumption Coefficient

By analyzing the data obtained in the above process, the matrix ITEM(x, y) is constructed from the positive and negative data. Each node in the matrix describes the positive and negative comments about a consumption object. By comparing the consumption matrixes from different periods, consumption trends, structural changes, and consumption upgrading can be identified.

With the continuous expansion of consumption scale, consumers' personalized demands are becoming increasingly obvious. Consumer behavior, such as purchase decisions, is affected by multiple factors, so consumption hotspots and structures often change. This change may be strong or weak. Significant structural changes are relatively easy to detect, but weak changes are difficult to capture. Therefore, to identify changes in consumption structure, it is necessary to calculate the proportions of different consumption objects in the total consumption field and to detect changes in consumption structure promptly by analyzing changes in these proportions. In this paper, we calculate the consumption coefficient of each consumption object, measure the proportion of each consumption object in the total consumption field, and then identify changes and trends in the consumption structures of users.
The formula for calculating the consumption coefficient is as follows:

$$\mathrm{coeff} = \frac{\mathit{heat}(q)}{\mathit{heat}(\mathit{all})} \times 100\% \tag{6}$$
coeff is the consumption coefficient of consumption object q, heat(q) represents the heat of consumption object q, and heat(all) represents the heat of all consumption objects. It can be seen from the formula that the consumption coefficient is the proportion of a given consumption object among all consumption objects. The consumption coefficient is added to the matrix ITEM(x, y) as an additional parameter, so the matrix becomes ITEM(x, y, coeff). The structural characteristics of consumption can therefore be described in three dimensions. By analyzing ITEM(x, y, coeff), the implied information about consumption structure, consumption trends, and consumption upgrading can be obtained. The detailed analysis process is described in the next section.
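Equation (6) can be illustrated with a short sketch. The text does not define how heat is measured, so the example assumes, purely for illustration, that the heat of an object is a nonnegative count such as its number of comments. The function name `consumption_coefficients` and the sample values are hypothetical.

```python
def consumption_coefficients(heat):
    """Map each consumption object to its share of total heat, in percent (Eq. 6).

    heat: dict from object name to a nonnegative heat value
    (assumed here to be a comment count; the paper does not specify this).
    """
    total = sum(heat.values())
    if total <= 0:
        raise ValueError("total heat must be positive")
    return {obj: 100 * h / total for obj, h in heat.items()}


# Hypothetical heat values for three consumption objects.
heat = {"restaurant": 500, "cinema": 300, "gym": 200}
coeff = consumption_coefficients(heat)
```

Because every coefficient is a share of the same total, the coefficients of all objects sum to 100, which makes coefficients from different periods directly comparable.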
5. Discovery of Individual Consumption Upgrading

To compare the consumption data from different periods, the degree of difference between the matrixes that describe the consumption structures of those periods needs to be calculated. If the degree of difference exceeds a certain threshold, consumption upgrading is considered to have occurred. Here, the consumption matrix at moment n is denoted ITEM_n, and the consumption matrix at moment n+1 is denoted ITEM_{n+1}. By comparing ITEM_n and ITEM_{n+1}, the difference between the consumption matrixes can be calculated and structural changes can be detected. The formula for calculating the degree of difference is as follows:
$$\mathrm{COR}(\mathrm{ITEM}_n, \mathrm{ITEM}_{n+1}) = \frac{\sum_{i=1}^{n} (\mathrm{ITEM}_i - \overline{\mathrm{ITEM}_n}) \sum_{j=1}^{n+1} (\mathrm{ITEM}_j - \overline{\mathrm{ITEM}_{n+1}})}{\sqrt{\sum_{i=1}^{n} (\mathrm{ITEM}_i - \overline{\mathrm{ITEM}_n})^2 \sum_{j=1}^{n+1} (\mathrm{ITEM}_j - \overline{\mathrm{ITEM}_{n+1}})^2}} \tag{7}$$
According to the formula, the coefficient COR is obtained by dividing the covariance by the standard deviations of the two variables. The covariance reflects the degree of correlation between two random variables: when the covariance is greater than 0, the two variables are positively correlated, and when it is less than 0, they are negatively correlated. Note that the coefficient is meaningful only when both variables are not zero, and its range is [-1, 1]. When COR is 1, ITEM_n and ITEM_{n+1} are completely positively correlated. When COR is -1, ITEM_n and ITEM_{n+1} are completely negatively correlated. The greater the absolute value of COR, the stronger the correlation between ITEM_n and ITEM_{n+1}. The closer COR is to 0, the weaker the correlation between ITEM_n and ITEM_{n+1}.

Figure 2: The coefficient COR of different consumption structure. Each panel plots Y against X: (a) COR is +1.0; (b) COR is 0; (c) COR is -1.0.
Through the above methods, we can build consumption matrixes for different periods and judge the differences in consumption structure between periods by calculating the coefficient COR. When COR approaches 1 or -1, the consumption structure has changed significantly, so consumption upgrading can be considered to have occurred. As shown in Figure 2, the closer the coefficient is to 1 or -1, the greater the structural difference, while a coefficient near 0 indicates that the consumption structure changed little and that there is no upgrading.
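The comparison of two periods can be sketched as follows. This is a hedged illustration, not the authors' implementation: it uses the standard Pearson correlation coefficient (covariance divided by the product of the standard deviations), which matches the verbal description of COR, and it assumes that each period is summarized by one value per consumption object (for example, its consumption coefficient) with the objects listed in the same order in both periods. The names `cor` and `upgrading_detected` and the threshold value 0.8 are hypothetical. The decision rule follows the text above, where an absolute value of COR close to 1 is read as a significant structural change.

```python
import math


def cor(a, b):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("sequences must have the same length, at least 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("COR is undefined when a sequence has zero variance")
    return cov / math.sqrt(var_a * var_b)


UPGRADE_THRESHOLD = 0.8  # hypothetical value; the paper does not state its threshold


def upgrading_detected(item_n, item_n1, threshold=UPGRADE_THRESHOLD):
    """Apply the rule stated in the text: |COR| near 1 signals upgrading."""
    return abs(cor(item_n, item_n1)) >= threshold


# Hypothetical per-object values for two consecutive periods.
period_n = [1, 2, 3, 4]
period_n1 = [8, 6, 4, 2]
changed = upgrading_detected(period_n, period_n1)
```

The guard against zero variance mirrors the remark that the coefficient is only meaningful for nondegenerate inputs.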
6. Experiment

The datasets used throughout the experiments are Zachary's Karate Club (http://www-personal.umich.edu/~mejn/netdata), Dolphin's Associations (http://www-personal.umich.edu/~mejn/netdata), LesMiserables (http://wiki.gephi.org/index.php/Datasets), MovieLens (http://www.datatang.com/datares/detail.aspx?id=44295), and the EP dataset (http://www.dianping.com).

(1) The dataset of Zachary's Karate Club is a social network of friendships between 34 members; edges in the graph describe the higher frequency of interactions between members.
(2) The dataset of Dolphin's Associations is an undirected social network of frequent associations between 62 dolphins, which has 62 nodes and 159 edges.
(3) The dataset of LesMiserables is a coappearance network of characters in LesMiserables, which contains 77 nodes and 254 edges.
(4) The dataset of MovieLens is a synthesized recommendation system and virtual community, which is commonly used for social computing.
(5) The EP dataset was captured from an e-commerce platform (dianping.com). It contains 15,890,209 pieces of data and was updated in August 2018. The data collection fields are shop id (unique), province, city, city id, area, big cate (the primary classification), big cate id, small cate (the secondary classification), small cate id, service rating, all remarks, very good remarks (5-star review), good remarks (4-star review), common remarks (3-star review), bad remarks (2-star review), and very bad remarks (1-star review).

Figure 3: The positive and negative distribution of comments.
Comparison Methods. NMFOSC [16] presents an approach to community detection that uses a nonnegative matrix factorization model to detect overlapping communities in networks. RNM [17] is a local expansion method based on rough neighborhood. CPM [18] greedily expands natural communities of seeds, using a local fitness function, until the whole graph is covered. EdgeB-Cluster [19] bundles similar edges, adjusts the locations of nodes to optimize the visualized output of the graph, and analyzes networks at the community level.
Analysis of a consumption object in a certain region on the e-commerce platform showed that the number of positive comments is very large. This is because good comments are deliberately inflated to improve a store's reputation, which results in too many good comments. By contrast, the numbers of neutral and negative comments are relatively reasonable, and few of these comments are intentionally added or deleted, so they are convincing. For these reasons, the authors of this paper did not analyze the quantity of positive comments and considered only the quantities of neutral and negative comments. Experimental verification of the quality of the neutral and negative comments showed that the data is authentic, abundant, and sufficient to describe the object to be tested.
Figure 3 shows the distribution of positive and negative comments. The nodes in the insufficient and controversial areas do not provide valuable information for subsequent analysis, so they were removed. It can be seen that there are more positive nodes than negative ones and that the difference between them is large. It is important to note that although nodes in the negative area represent negative user comments, they still provide useful knowledge about consumption trends and therefore cannot be removed.
Figure 3 shows the regional analysis results, while Figure 4 shows the overall analysis results. Figure 4 is the visual output of the results generated by the algorithm. Figure 4(a) shows the distribution of nodes; the colors of the nodes indicate the heat of different consumption objects. The darker a node's color, the more attention the node received. Figure 4(b) is a heat map of a center node, which is shown in black. Figure 4(b) shows that there is a certain correlation between the center node and a large number of other nodes, indicating that there are many high correlations between different consumer groups. Thus, the characteristics of consumption objects can be further analyzed based on the relationships between consumers and commodities.

Figure 5 depicts the distribution of different consumption objects. In this case, a node with a ratio of more than 0.6 is regarded as a popular consumption node, while a node with a ratio of less than or equal to 0.6 is regarded as an unpopular consumption node. Of course, if the ratio threshold is lowered, additional nodes will fall into the popular consumption area. It can be seen from Figure 5 that most nodes belong to the non-hot field, which is in line with the actual situation.

Figure 6(a) depicts the characteristics of nodes that were divided into two categories to describe different consumption heat (some representative nodes are extracted). It can be seen that the characteristics of nodes in different categories vary greatly. Figure 6(b) describes the closeness centrality distribution of the nodes that belong to the same category. The node locations follow a normal distribution, so the similarity between nodes in the same category is very high. That is to say, the classification is reasonable.
Figure 4: Visualization of the results generated by the algorithm. (a) Distribution of nodes; (b) the heat map.

Figure 5: The distribution of different consumption objects (ratio, from 0 to 1, plotted against node, from 0 to 7 × 10^4).

To analyze the experimental results, the following measurement parameters are used [20]. Multiplicity Precision is calculated by

$$MP = \frac{\min(|C(e) \cap C(e')|, |L(e) \cap L(e')|)}{|C(e) \cap C(e')|}$$

and Multiplicity Recall by

$$MR = \frac{\min(|C(e) \cap C(e')|, |L(e) \cap L(e')|)}{|L(e) \cap L(e')|}$$

Let L(e) and C(e) denote the category and the cluster of an item e; e is a cluster with n items belonging to the same category, and e′ is a cluster merging n items from unary categories. FB is a comprehensive measure of MP and MR, calculated as

$$FB = \frac{MP \times MR \times 2}{MP + MR}$$
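The three measures above can be sketched for a single pair of items. The sketch assumes that the clusters and categories of each item are given as Python sets of labels; the function names and the sample labels are hypothetical. How pairwise values are averaged into the dataset-level scores is not described in this section and is therefore not shown.

```python
def multiplicity_precision(clusters_e, clusters_e2, cats_e, cats_e2):
    """MP for one item pair: min(shared clusters, shared categories) / shared clusters."""
    shared_clusters = len(clusters_e & clusters_e2)
    shared_cats = len(cats_e & cats_e2)
    if shared_clusters == 0:
        raise ValueError("MP is undefined when the items share no cluster")
    return min(shared_clusters, shared_cats) / shared_clusters


def multiplicity_recall(clusters_e, clusters_e2, cats_e, cats_e2):
    """MR for one item pair: min(shared clusters, shared categories) / shared categories."""
    shared_clusters = len(clusters_e & clusters_e2)
    shared_cats = len(cats_e & cats_e2)
    if shared_cats == 0:
        raise ValueError("MR is undefined when the items share no category")
    return min(shared_clusters, shared_cats) / shared_cats


def f_measure(mp, mr):
    """FB: harmonic combination of MP and MR."""
    if mp + mr == 0:
        return 0.0
    return 2 * mp * mr / (mp + mr)


# Hypothetical labels: the two items share two clusters but only one category.
c_e, c_e2 = {"c1", "c2"}, {"c1", "c2"}
l_e, l_e2 = {"A"}, {"A", "B"}

mp = multiplicity_precision(c_e, c_e2, l_e, l_e2)
mr = multiplicity_recall(c_e, c_e2, l_e, l_e2)
fb = f_measure(mp, mr)
```

As a consistency check on the FB formula, `f_measure(0.98, 0.80)` rounds to 0.88, which agrees with the MP, MR, and FB values reported for EC-Structure on the Dolphin dataset in Table 1.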
Table 1 demonstrates the validity and feasibility of the algorithm. The numbers in italic indicate the highest value of the same parameter in each row. Table 1 compares the algorithm proposed in this paper with other similar algorithms. The Karate Club, Dolphin, LesMiserables, and MovieLens datasets are used to assess the performance of the algorithms in structural analysis, and the EP dataset is used to assess performance in e-commerce data analysis. EC-Structure achieves the highest FB on every dataset (tied with EdgeB-Cluster on Karate Club) and performs stably across datasets. The main reasons why EC-Structure is superior to the other algorithms are that (1) it reduces the influence of erroneous e-business platform data on the algorithm, (2) it adds the consumption coefficient as a parameter with which to measure the proportions of different consumption objects, and (3) the coefficient COR helps researchers accurately judge changes in consumption structures. Therefore, the algorithm is effective.
7. Conclusion

Consumer behavior can be studied by extracting and analyzing useful information from large amounts of incomplete, vague, and random consumer behavior data. The algorithm proposed in this paper builds consumption structures and a consumption upgrading model based on data from e-commerce platforms to analyze whether consumption upgrading has occurred. The experimental results verified the implementation efficacy and analysis accuracy of the algorithm: its implementation efficacy is superior to that of the other algorithms, and it runs stably on different datasets.
Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the Youth Fund of Humanity and Social Science of the Ministry of Education of China (Grant no. 18YJCZH041) and the Project of the Education Department of Jilin Province of China (Grant no. JJKH20190612SK).
Table 1: The performance comparisons.

Dataset         NMFOSC              RNM                 CPM                 EdgeB-Cluster       EC-Structure
                MR    MP    FB      MR    MP    FB      MR    MP    FB      MR    MP    FB      MR    MP    FB
Karate Club     1.00  0.92  0.96    0.84  1.00  0.91    0.58  0.94  0.71    1.00  1.00  1.00    1.00  1.00  1.00
Dolphin         0.64  0.90  0.75    0.46  0.97  0.62    0.40  0.94  0.56    0.73  0.98  0.83    0.80  0.98  0.88
LesMiserables   0.72  0.87  0.79    0.80  0.88  0.84    0.48  0.89  0.62    0.81  0.88  0.84    0.88  0.83  0.85
MovieLens       0.83  0.85  0.84    0.56  0.86  0.68    0.81  0.86  0.83    0.81  0.88  0.84    0.82  0.88  0.85
EP dataset      0.85  0.80  0.82    0.53  0.56  0.54    0.53  0.65  0.58    0.79  0.83  0.81    0.85  0.82  0.83
Figure 6: The characteristic distribution of hot consumption and cold consumption nodes. (a) Degree distribution (count against value), with a cold consumption area and a hot consumption area marked; (b) closeness centrality distribution (count against value).
References
[1] P. N. Ireland, "Using the permanent income hypothesis for forecasting," Federal Reserve Bank of Richmond Economic Quarterly, vol. 81, no. 1, pp. 49–63, 1995.
[2] L. A. Fisher and G. Kingston, "Improved forecasts of tax revenue via the permanent income hypothesis," Australian Economic Review, vol. 50, no. 1, pp. 21–31, 2017.
[3] L. Zhou, C. Wang, and S. O. Finance, "Household debt and consumption-evidence from micro data," Soft Science, vol. 3, pp. 32–43, 2018.
[4] M. Zagler, "Empirical evidence on growth and business cycles," Empirica, vol. 44, pp. 1–20, 2017.
[5] C. Curme, T. Preis, and H. E. Stanley, "Quantifying the semantics of search behavior before stock market moves," Proceedings of the National Academy of Sciences of the United States of America, vol. 111, no. 32, pp. 11600–11605, 2014.
[6] J. Reichardt and S. Bornholdt, Ebay users form stable groups of common interest, 2005.
[7] H. Halpin, "The semantics of search," in Social Semantics, pp. 149–186, Springer US, 2013.
[8] T. Chattopadhyay, S. Maiti, A. Pal et al., "Automatic discovery of emerging trends using cluster name synthesis on user consumption data: extended abstract," in Proceedings of International Conference Companion on World Wide Web, pp. 981–983, 2016.
[9] L. Guo, W. Zuo, and T. Peng, "Inference network building and movements prediction based on analysis of induced dependencies," IET Software, vol. 11, no. 1, pp. 12–17, 2017.
[10] B. Glass, Z. Benenson, and R. Landwirth, "Look before you leap: improving the users' ability to detect fraud in electronic marketplaces," in Proceedings of the CHI Conference on Human Factors in Computing Systems, pp. 3870–3882, ACM, 2016.
[11] P. Singh and M. Singh, "Fraud detection by monitoring customer behavior and activities," Annals of Regional Science, vol. 49, no. 1, pp. 1–27, 2012.
[12] D. Aviano, B. L. Putro, and E. P. Nugroho, "Behavioral tracking analysis on learning management system with apriori association rules algorithm," in Proceedings of the 2017 3rd International Conference on Science in Information Technology (ICSITech), Bandung, Indonesia, 2017.
[13] K. Kim, Y. Choi, and J. Park, "Pricing fraud detection in online shopping malls using a finite mixture model," Electronic Commerce Research and Applications, vol. 12, no. 3, pp. 195–207, 2013.
[14] S. Ouaftouh, A. Zellou, and A. Idri, "User profile model: a user dimension based classification," in Proceedings of the 2015 10th International Conference on Intelligent Systems: Theories and Applications (SITA), Rabat, Morocco, 2015.
[15] Y. Diao, K. Y. Liu, and L. Hu, "Classification of massive user load characteristics in distribution network based on agglomerative hierarchical algorithm," in Proceedings of the 2016 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), Chengdu, China, 2017.
[16] N. Chen, Y. Liu, and H.-C. Chao, "Overlapping community detection using non-negative matrix factorization with orthogonal and sparseness constraints," IEEE Access, vol. 6, pp. 21266–21274, 2017.
[17] Z. H. Zhang, D. Q. Miao, and J. Qian, "Detecting overlapping communities with heuristic expansion method based on rough neighborhood," Chinese Journal of Computer, vol. 36, no. 10, 2013.
[18] F. Havemann, M. Heinz, and A. Struch, "Identification of overlapping communities and their hierarchy by locally calculating community-changing resolution levels," Journal of Statistical Mechanics: Theory and Experiment, vol. 1, 2011.
[19] L. Guo, W. Zuo, T. Peng, and B. K. Adhikari, "Attribute-based edge bundling for visualizing social networks," Physica A: Statistical Mechanics and Its Applications, vol. 438, pp. 48–55, 2015.
[20] E. Amigo, J. Gonzalo, J. Artiles, and F. Verdejo, "A comparison of extrinsic clustering evaluation metrics based on formal constraints," Information Retrieval, vol. 12, no. 4, pp. 461–486, 2009.
2 Complexity
that the former usually uses an indefinite limitation while thelatter uses a definite limitation so they are generally calledLC-PIHwhen combinedMany researchers [1ndash4] use LC-PIHto study the consumption problems of Western residents buttheir conclusions are inconsistent
With the development of the Internet economy anincreasing number of researchers focus on the impact ofInternet technology Electronic commerce (e-commerce) isany type of business or commercial transaction that involvesinformation transfer across the Internet [5] A userrsquos behav-iors on a website can reflect their interests and purchaseintentions Therefore consumption structure can be analyzedby studying the data of e-commerce platforms which isdifficult to realize using to traditional research methods
When analyzing the characteristics of a network thedistribution of user activity is investigated and a network ofbidders that is connected by common interest in individualarticles is constructed [6 7] The networkrsquos cluster structurecorrespondswith themain user groups according to commoninterests exhibiting hierarchy and overlap
Regarding the characteristics of uses Curme [5] Chat-topadhyay [8] and Guo [9] analyze user behaviors from theperspective of complex systems and extract implicit semanticinformation from large-scale semistructured data Glass [10]Singh [11] and Aviano [12] find personal characteristicsthrough the feature extraction of original website datamine important features of users using to feature selectionmethods and finally extract a user profile model Kim [13]Ouaftouh [14] and Diao [15] classify customers into differentgroups according to their similarities that are calculatedthrough demographic features and psychological features orfeatures such as customer value customer consumption andlifetime value
3 Classification of Consumption Data
The rapid development of social networking resulted in theexplosive growth of data that contains large amounts ofhigh-quality information such as information about userinterests and interpersonal relationships Therefore socialdata processing and knowledge mining are intricate andindispensable Consumption data contains much redundantor nonrepresentative data We adopt the matrix analysismethod to quickly classify consumption data into negativedata positive data insufficient evidence data and disputeddata
Definition 1 (negative data). The normalized value of negative comment times is greater than 0, and the normalized value of positive comment times is less than 0.

Definition 2 (positive data). The normalized value of positive comment times is greater than 0, and the normalized value of negative comment times is less than 0.

Definition 3 (insufficient data). The numbers of positive and negative comments are both relatively small, so there is no authoritative data on which to base a judgment about the nature of the data.

Definition 4 (controversial data). The numbers of positive and negative comments are both relatively large, so there is no clearly salient feature on which to base a judgment about the nature of the data.
Because socialized data can reflect the characteristics of, and interpersonal relationships in, real society, knowledge of the regional and overall consumption structure can be acquired by analyzing the <user, comment> data gathered from websites (the coverage of the analysis results is determined by the data capture granularity). After word segmentation and semantic analysis, we obtain data of the form <consumption object, comment times, positive times, negative times>. Because the evaluation data differ greatly between consumption objects, we standardize the data to bring them into a common range and so measure the popularity of different consumption objects. The formulas for data normalization are as follows:
$$(x, y) = \left( \frac{x_i - \bar{x}}{s_x}, \frac{y_i - \bar{y}}{s_y} \right) \tag{1}$$

$$\bar{x} = \frac{1}{n} \sum_{p=1}^{n} x_p \tag{2}$$

$$\bar{y} = \frac{1}{n} \sum_{p=1}^{n} y_p \tag{3}$$

$$s_x = \sqrt{\frac{1}{n-1} \sum_{p=1}^{n} (x_p - \bar{x})^2} \tag{4}$$

$$s_y = \sqrt{\frac{1}{n-1} \sum_{p=1}^{n} (y_p - \bar{y})^2} \tag{5}$$
(x, y) is the calculated result after standardization. Each standardized variable has a mean of 0 and a variance of 1 and is dimensionless. (x, y) can be mapped to the two-dimensional coordinate interval [-1, +1]. The standardized values fluctuate around 0: a value greater than 0 indicates that the quantity is higher than the average level, and a value less than 0 indicates that it is lower than the average level.
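The standardization in equations (1)-(5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `standardize` and the comment counts are hypothetical and do not come from the paper's data.

```python
import math


def standardize(values):
    """Z-score standardization as in equations (1)-(5).

    Subtracts the sample mean and divides by the sample standard
    deviation (n - 1 in the denominator, as in equations (4)-(5)).
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if std == 0:
        raise ValueError("all values are identical; standardization is undefined")
    return [(v - mean) / std for v in values]


# Hypothetical positive and negative comment counts for five consumption objects.
positive_counts = [120, 30, 45, 200, 5]
negative_counts = [10, 40, 2, 60, 3]

x = standardize(positive_counts)  # standardized positive data (X coordinate)
y = standardize(negative_counts)  # standardized negative data (Y coordinate)
```

With these made-up counts the standardized values have mean 0 and sample variance 1, but individual values can exceed 1 in magnitude; any additional rescaling into [-1, +1] would be a separate step that the text does not specify.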
The authors of this paper use nodes to describe consumption objects. Therefore, the location of a node in the matrix describes the status of the corresponding consumption object. In the consumption matrix, the X and Y coordinates represent the standardized positive data and the standardized negative data, respectively. The matrix is split into four intervals, so the consumption data is divided into four categories.

The consumption matrix is divided into four regions, as shown in Figure 1.
Figure 1: Coordinate distribution of four types of nodes (regions: negative, positive, controversial, and insufficient; both axes run from -1 to +1).

It can be seen from Figure 1 that the X-coordinate value of a node in the negative partition is less than 0 and its Y-coordinate value is greater than 0, which indicates that the nodes in the negative partition have more negative data than the average. The nodes in the positive partition are just the opposite. The other two types of nodes are controversial and insufficient nodes. A controversial node has much positive and much negative information, whereas an insufficient node has little of either. Neither of these two types of nodes can be classified into one specific partition. To reduce the impact of invalid or redundant data on the accuracy of the analysis, the authors of this paper focus only on the nodes in the positive and negative areas. In addition, manual analysis of the nodes in the positive and negative regions shows that the data about these nodes is authoritative and clean. It is enough to describe the consumption structure of users and sufficient for the subsequent experiments.
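The four-way partition of Definitions 1-4 and Figure 1 can be illustrated with a small Python sketch. It assumes the coordinates are already standardized (X for positive data, Y for negative data). The names `classify_node` and `keep_decisive`, the sample nodes, and the handling of points that lie exactly on an axis are assumptions made for illustration; the text does not say how boundary points are treated.

```python
def classify_node(x, y):
    """Return the partition of a node from its standardized coordinates.

    x: standardized positive data, y: standardized negative data.
    """
    if x > 0 and y < 0:
        return "positive"
    if x < 0 and y > 0:
        return "negative"
    if x > 0 and y > 0:
        return "controversial"
    if x < 0 and y < 0:
        return "insufficient"
    # Assumption: points exactly on an axis are left unclassified.
    return "boundary"


def keep_decisive(nodes):
    """Keep only the nodes in the positive and negative partitions."""
    return {
        name: xy
        for name, xy in nodes.items()
        if classify_node(*xy) in ("positive", "negative")
    }


# Hypothetical standardized coordinates for four consumption objects.
nodes = {
    "object_a": (0.8, -0.4),   # positive
    "object_b": (-0.6, 0.7),   # negative
    "object_c": (0.9, 0.9),    # controversial
    "object_d": (-0.5, -0.5),  # insufficient
}
decisive = keep_decisive(nodes)
```

Only `object_a` and `object_b` survive the filter, mirroring the decision to analyze nodes in the positive and negative areas only.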
4 Analysis of Consumption Coefficient
By analyzing the data obtained in the above process, the matrix ITEM(x, y) is constructed from positive and negative data. Each node in the matrix describes the positive and negative comments about a consumption object. By comparing the consumption matrixes from different periods, consumption trends, changes to the structure, and consumption upgrading can be judged and identified.
With the continuous expansion of consumption scale, consumers' personalized demands are becoming increasingly obvious. Consumer behavior, such as purchase decisions, is affected by multiple factors, so consumption hotspots and structures often change. This change may be strong or weak. Significant changes in structure are relatively easy to detect, but weak changes are difficult to capture. Therefore, to identify changes in consumption structures, it is necessary to calculate the proportions of different consumption objects in the total consumption field and to discover changes in consumption structure in time by analyzing changes in those proportions. In this paper, we calculate the consumption coefficient of each consumption object, measure the proportion of different consumption objects in the total consumption field, and then identify changes and trends in the consumption structures of users.
The formula for calculating the consumption coefficient is as follows:

$$\mathrm{coeff} = \frac{\mathrm{heat}(q)}{\mathrm{heat}(\mathrm{all})} \times 100 \tag{6}$$
coeff is the consumption coefficient of consumption object q, heat(q) represents the heat of consumption object q, and heat(all) represents the heat of all consumption objects. It can be seen from the formula that the consumption coefficient is the share of a given consumption object in the total over all consumption objects. The consumption coefficient is added to the matrix ITEM(x, y) as an additional parameter, so the expression of the matrix becomes ITEM(x, y, coeff). Therefore, the structural characteristics of consumption can be described in three dimensions. By analyzing ITEM(x, y, coeff), the implied information about consumption structure, consumption trends, and consumption upgrading can be obtained. The detailed analysis process is described in the next section.
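As a rough illustration of equation (6), the following Python sketch computes the consumption coefficient for each object. The text does not define how heat is measured, so the sketch assumes, purely for illustration, that heat is a nonnegative count (for example, a number of comments) and that heat(all) is the sum over all objects. The function name and sample values are hypothetical.

```python
def consumption_coefficients(heat):
    """Compute coeff = heat(q) / heat(all) * 100 for every object q.

    heat: mapping from consumption object to its heat value.
    Assumption: heat(all) is the sum of the individual heat values.
    """
    total = sum(heat.values())
    if total <= 0:
        raise ValueError("total heat must be positive")
    return {q: h / total * 100 for q, h in heat.items()}


# Hypothetical heat values for three consumption objects.
heat = {"restaurants": 500, "cinemas": 300, "gyms": 200}
coeffs = consumption_coefficients(heat)
```

Under this assumption the coefficients sum to 100, so a change in one object's coefficient between two periods directly reflects a shift in its share of total consumption.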
5 Discovery of Individual Consumption Upgrading
To compare the consumption data from different periods, the degree of difference between the matrixes that describe the consumption structures in those periods needs to be calculated. If the degree of difference exceeds a certain threshold value, then consumption upgrading is considered to have happened. Here, the consumption matrix at moment n is denoted as ITEM_n, and the consumption matrix at moment n+1 is denoted as ITEM_{n+1}. By comparing ITEM_n and ITEM_{n+1}, the differences between the consumption matrixes can be calculated, and structural changes can be detected. The formula for calculating the degree of difference is as follows:
$$\mathrm{COR}(\mathrm{ITEM}_n, \mathrm{ITEM}_{n+1}) = \frac{\sum_{i=1}^{n} \left( \mathrm{ITEM}_i - \overline{\mathrm{ITEM}_n} \right) \sum_{j=1}^{n+1} \left( \mathrm{ITEM}_j - \overline{\mathrm{ITEM}_{n+1}} \right)}{\sqrt{\sum_{i=1}^{n} \left( \mathrm{ITEM}_i - \overline{\mathrm{ITEM}_n} \right)^2 \sum_{j=1}^{n+1} \left( \mathrm{ITEM}_j - \overline{\mathrm{ITEM}_{n+1}} \right)^2}} \tag{7}$$
According to the formula, the coefficient COR is obtained by dividing the covariance by the product of the standard deviations of the two variables. The covariance reflects the degree of correlation between two random variables: when the covariance is greater than 0, the two variables are positively correlated, and when it is less than 0, they are negatively correlated. Note that the coefficient is meaningful only when neither variable is zero, and the range of the coefficient is [-1, 1]. When COR is 1, ITEM_n and ITEM_{n+1} are completely positively correlated. When COR is -1, ITEM_n and ITEM_{n+1} are completely negatively correlated. The greater the absolute value of COR is, the stronger the correlation between ITEM_n and ITEM_{n+1} is. The closer COR is to 0, the weaker the correlation between ITEM_n and ITEM_{n+1} is.

Figure 2: The coefficient COR of different consumption structures, plotted as Y against X: (a) COR is +1.0; (b) COR is 0; (c) COR is -1.0.
Through the above methods, we can build consumption matrixes for different periods and judge the differences in consumption structure between periods by calculating the coefficient COR. When COR approaches 1 or -1, it means that the consumption structure changed significantly, so it can be considered that consumption upgrading occurred. As shown in Figure 2, the closer the coefficient is to 1 or -1, the greater the structural difference is, while a coefficient near 0 indicates that the consumption structure changed little and that there is no upgrading.
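A minimal Python sketch of the comparison step follows. It is an interpretation, not the authors' implementation: it uses the standard Pearson correlation (sum of products of paired deviations divided by the square root of the product of the sums of squared deviations) as a reading of equation (7), and it assumes that the two consumption matrixes have been flattened into equal-length vectors over the same consumption objects. The function names, the sample vectors, and the threshold of 0.8 are hypothetical; the text does not state a threshold value.

```python
import math


def cor(a, b):
    """Pearson correlation between two equal-length vectors."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    ss_a = sum((p - mean_a) ** 2 for p in a)
    ss_b = sum((q - mean_b) ** 2 for q in b)
    if ss_a == 0 or ss_b == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(ss_a * ss_b)


def exceeds_threshold(cor_value, threshold=0.8):
    """Decision rule as stated in the text: flag when |COR| is close to 1.

    The threshold is a hypothetical value chosen for illustration.
    """
    return abs(cor_value) >= threshold


# Hypothetical flattened consumption data for two periods.
item_n = [1.0, 2.0, 3.0]
item_n1_up = [2.0, 4.0, 6.0]
item_n1_down = [6.0, 4.0, 2.0]

cor_up = cor(item_n, item_n1_up)
cor_down = cor(item_n, item_n1_down)
```

The guard against constant vectors corresponds to the remark that the coefficient is only meaningful when the standard deviations are nonzero.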
6 Experiment
The datasets used throughout the experiments are Zachary's Karate Club (http://www-personal.umich.edu/~mejn/netdata), Dolphin's Associations (http://www-personal.umich.edu/~mejn/netdata), LesMiserables (http://wiki.gephi.org/index.php/Datasets), MovieLens (http://www.datatang.com/datares/detail.aspx?id=44295), and the EP dataset (http://www.dianping.com).

(1) The dataset of Zachary's Karate Club is a social network of friendships between 34 members; edges in the graph describe the more frequent interactions between members.
(2) The dataset of Dolphin's Associations is an undirected social network of frequent associations between 62 dolphins, which has 62 nodes and 159 edges.
(3) The dataset of LesMiserables is a coappearance network of characters in Les Miserables, which contains 77 nodes and 254 edges.
(4) The dataset of MovieLens is a synthesized recommendation system and virtual community, which is commonly used for social computing.
(5) The EP dataset was captured from an e-commerce platform (dianping.com). It contains 15,890,209 pieces of data and was updated in August 2018. The data collection fields are shop id (unique), province, city, city id, area, big cate (the primary classification), big cate id, small cate (the secondary classification), small cate id, service rating, all remarks, very good remarks (5-star review), good remarks (4-star review), common remarks (3-star review), bad remarks (2-star review), and very bad remarks (1-star review).

Figure 3: The positive and negative distribution of comments.
Comparison Methods. NMFOSC [16] presents an approach to community detection that uses a nonnegative matrix factorization model to separate overlapping communities in networks. RNM [17] is a local expansion method based on rough neighborhood. CPM [18] greedily expands natural communities of seeds, using a local fitness function, until the whole graph is covered. EdgeB-Cluster [19] bundles similar edges, adjusts the locations of nodes to optimize the visualized output of the graph, and analyzes networks at the community level.
Through the analysis of a consumption object in a certain region of the e-commerce platform, it was found that the number of positive comments is very large. This is because some stores deliberately increase the number of good comments to improve their reputation, which results in too many good comments. In contrast, the numbers of neutral and negative comments are relatively reasonable, and few of these comments are intentionally added or deleted, so they are convincing. Based on the above factors, the authors of this paper did not analyze the quantity of positive comments and considered only the quantity of neutral and negative comments. Experimental verification of the quality of the neutral and negative comments showed that the data is authentic, abundant, and sufficient to describe the object to be tested.
Figure 3 shows the distribution of positive and negative comments. The nodes in the insufficient and controversial areas do not provide valuable information for subsequent analysis, so they were removed. It can be seen that there are more positive nodes than negative ones and that the difference between them is large. It is important to note that, although the nodes in the negative area represent negative user comments, they still provide useful knowledge about consumption trends and therefore cannot be removed.
Figure 3 shows the regional analysis results, while Figure 4 shows the overall analysis results. Figure 4 is the visual output of the results generated by the algorithm. Figure 4(a) shows the distribution of nodes, and the colors of nodes indicate the heat of different consumption objects: the darker a node's color is, the more attention the node received. Figure 4(b) is a heat map of a center node, which is shown in black. Figure 4(b) shows that there is a certain correlation between the central node and a large number of other nodes, indicating that there are many high correlations between different consumer groups. Thus, the characteristics of consumption objects can be further analyzed based on the relationships between consumers and commodities.
Figure 5 depicts the distribution of different consumption objects. In this case, a node with a ratio of more than 0.6 is regarded as a popular consumption node, while a node with a ratio of less than or equal to 0.6 is regarded as an unpopular consumption node. Of course, if the ratio threshold is lowered, then additional nodes will fall into the popular consumption area. It can be seen from Figure 5 that most nodes belong to the nonhot field, which is in line with the actual situation.
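The threshold rule described for Figure 5 can be sketched as follows. The function name `split_by_ratio` and the sample ratios are hypothetical; only the 0.6 cut-off and the strict "more than" comparison come from the text.

```python
def split_by_ratio(ratios, threshold=0.6):
    """Split nodes into popular (ratio > threshold) and unpopular (ratio <= threshold)."""
    popular = {node for node, ratio in ratios.items() if ratio > threshold}
    unpopular = set(ratios) - popular
    return popular, unpopular


# Hypothetical ratios for three consumption nodes.
ratios = {"node_a": 0.9, "node_b": 0.6, "node_c": 0.1}

popular, unpopular = split_by_ratio(ratios)
popular_low, unpopular_low = split_by_ratio(ratios, threshold=0.5)
```

With the default threshold only `node_a` is popular; lowering the threshold to 0.5 moves `node_b` into the popular set, which is the effect the paragraph describes.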
Figure 6(a) depicts the characteristics of nodes that were divided into two categories to describe different consumption heat (some representative nodes are extracted). It can be seen that the characteristics of nodes in different categories vary greatly. Figure 6(b) describes the closeness centrality distribution of the nodes that belong to the same category. It shows that the node locations follow a normal distribution, so the similarity between the nodes in the same category is very high. That is to say, the classification is reasonable.
Figure 4: Visualization of the results generated by the algorithm: (a) distribution of nodes; (b) the heat map.

Figure 5: The distribution of different consumption objects (ratio, from 0 to 1, against node, in units of 10^4).

For the purpose of analyzing the experimental results, the following measurement parameters are used [20]. Multiplicity Precision is calculated by

$$MP = \frac{\min\left( |C(e) \cap C(e')|, |L(e) \cap L(e')| \right)}{|C(e) \cap C(e')|}$$

and Multiplicity Recall by

$$MR = \frac{\min\left( |C(e) \cap C(e')|, |L(e) \cap L(e')| \right)}{|L(e) \cap L(e')|}$$

Let L(e) and C(e) denote the category and the cluster of an item e; e is a cluster with n items belonging to the same category, and e' is a cluster merging n items from unary categories. FB is a comprehensive measure of MP and MR, calculated as

$$FB = \frac{2 \times MP \times MR}{MP + MR}$$
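The three measures can be sketched in Python for a single pair of items. This is only an illustration: it treats C(e) and L(e) as sets (of cluster identifiers and category labels, respectively), and the function names and the sample pair are hypothetical.

```python
def multiplicity_precision(clusters_e, clusters_o, categories_e, categories_o):
    """MP for one item pair: min(shared clusters, shared categories) / shared clusters."""
    shared_clusters = len(clusters_e & clusters_o)
    shared_categories = len(categories_e & categories_o)
    if shared_clusters == 0:
        raise ValueError("MP is undefined when the items share no cluster")
    return min(shared_clusters, shared_categories) / shared_clusters


def multiplicity_recall(clusters_e, clusters_o, categories_e, categories_o):
    """MR for one item pair: min(shared clusters, shared categories) / shared categories."""
    shared_clusters = len(clusters_e & clusters_o)
    shared_categories = len(categories_e & categories_o)
    if shared_categories == 0:
        raise ValueError("MR is undefined when the items share no category")
    return min(shared_clusters, shared_categories) / shared_categories


def fb(mp, mr):
    """Harmonic mean of MP and MR: FB = 2 * MP * MR / (MP + MR)."""
    if mp + mr == 0:
        return 0.0
    return 2 * mp * mr / (mp + mr)


# Hypothetical pair: the items share two clusters but only one category.
mp = multiplicity_precision({1, 2}, {1, 2}, {"A"}, {"A", "B"})
mr = multiplicity_recall({1, 2}, {1, 2}, {"A"}, {"A", "B"})
score = fb(mp, mr)
```

For this pair MP is 0.5 and MR is 1.0. As a sanity check of the FB formula, MP = 0.92 and MR = 1.00 give FB of about 0.96, which is the combination reported for NMFOSC on Karate Club in Table 1.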
Table 1 demonstrates the validity and feasibility of the algorithm. The numbers in italic indicate the highest value of the same parameter in each row. Table 1 compares the algorithm proposed in this paper with other similar algorithms. The datasets Karate Club, Dolphin, LesMiserables, and MovieLens are used to test the performance of the algorithms in structural analysis. The EP dataset is used to test performance in e-commerce data analysis. It is found that EC-Structure performs better than the other algorithms and performs stably across datasets. The main reasons for which EC-Structure is superior to the other algorithms are that (1) it reduces the influence of erroneous e-business platform data on the algorithm, (2) it adds the consumption coefficient as a parameter with which to measure the proportions of different consumption objects, and (3) the coefficient COR can help researchers accurately judge changes in consumption structures. Therefore, the algorithm is effective.
7 Conclusion
Research on consumer behavior can be carried out by extracting and analyzing useful information from a large amount of incomplete, vague, and random consumer behavior data. The algorithm proposed in this paper builds consumption structures and a consumption upgrading model based on data from e-commerce platforms to analyze whether consumption upgrading occurred. The results of the experiment verified the implementation efficacy and analysis accuracy of the algorithm. The implementation efficacy of the proposed algorithm is superior to that of the other algorithms, and it runs stably with different datasets.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This work was supported by the Youth Fund of Humanity and Social Science of the Ministry of Education of China (Grant no. 18YJCZH041) and the Project of the Education Department of Jilin Province of China (Grant no. JJKH20190612SK).
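Table 1: The performance comparisons.

Dataset | NMFOSC MR | NMFOSC MP | NMFOSC FB | RNM MR | RNM MP | RNM FB | CPM MR | CPM MP | CPM FB | EdgeB-Cluster MR | EdgeB-Cluster MP | EdgeB-Cluster FB | EC-Structure MR | EC-Structure MP | EC-Structure FB
Karate Club | 1.00 | 0.92 | 0.96 | 0.84 | 1.00 | 0.91 | 0.58 | 0.94 | 0.71 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00
Dolphin | 0.64 | 0.90 | 0.75 | 0.46 | 0.97 | 0.62 | 0.40 | 0.94 | 0.56 | 0.73 | 0.98 | 0.83 | 0.80 | 0.98 | 0.88
LesMiserables | 0.72 | 0.87 | 0.79 | 0.80 | 0.88 | 0.84 | 0.48 | 0.89 | 0.62 | 0.81 | 0.88 | 0.84 | 0.88 | 0.83 | 0.85
MovieLens | 0.83 | 0.85 | 0.84 | 0.56 | 0.86 | 0.68 | 0.81 | 0.86 | 0.83 | 0.81 | 0.88 | 0.84 | 0.82 | 0.88 | 0.85
EP dataset | 0.85 | 0.80 | 0.82 | 0.53 | 0.56 | 0.54 | 0.53 | 0.65 | 0.58 | 0.79 | 0.83 | 0.81 | 0.85 | 0.82 | 0.83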
Figure 6: The characteristic distribution of hot consumption and cold consumption nodes: (a) degree distribution (count against value, with the cold consumption area and the hot consumption area marked); (b) closeness centrality distribution (count against value).
References
[1] P. N. Ireland, "Using the permanent income hypothesis for forecasting," Federal Reserve Bank of Richmond Economic Quarterly, vol. 81, no. 1, pp. 49–63, 1995.
[2] L. A. Fisher and G. Kingston, "Improved forecasts of tax revenue via the permanent income hypothesis," Australian Economic Review, vol. 50, no. 1, pp. 21–31, 2017.
[3] L. Zhou, C. Wang, and S. O. Finance, "Household debt and consumption-evidence from micro data," Soft Science, vol. 3, pp. 32–43, 2018.
[4] M. Zagler, "Empirical evidence on growth and business cycles," Empirica, vol. 44, pp. 1–20, 2017.
[5] C. Curme, T. Preis, and H. E. Stanley, "Quantifying the semantics of search behavior before stock market moves," Proceedings of the National Academy of Sciences of the United States of America, vol. 111, no. 32, pp. 11600–11605, 2014.
[6] J. Reichardt and S. Bornholdt, Ebay users form stable groups of common interest, 2005.
[7] H. Halpin, "The semantics of search," in Social Semantics, pp. 149–186, Springer US, 2013.
[8] T. Chattopadhyay, S. Maiti, A. Pal et al., "Automatic discovery of emerging trends using cluster name synthesis on user consumption data: extended abstract," in Proceedings of International Conference Companion on World Wide Web, pp. 981–983, 2016.
[9] L. Guo, W. Zuo, and T. Peng, "Inference network building and movements prediction based on analysis of induced dependencies," IET Software, vol. 11, no. 1, pp. 12–17, 2017.
[10] B. Glass, Z. Benenson, and R. Landwirth, "Look before you leap: improving the users' ability to detect fraud in electronic marketplaces," in Proceedings of the CHI Conference on Human Factors in Computing Systems, pp. 3870–3882, ACM, 2016.
[11] P. Singh and M. Singh, "Fraud detection by monitoring customer behavior and activities," Annals of Regional Science, vol. 49, no. 1, pp. 1–27, 2012.
[12] D. Aviano, B. L. Putro, and E. P. Nugroho, "Behavioral tracking analysis on learning management system with apriori association rules algorithm," in Proceedings of the 2017 3rd International Conference on Science in Information Technology (ICSITech), Bandung, Indonesia, 2017.
[13] K. Kim, Y. Choi, and J. Park, "Pricing fraud detection in online shopping malls using a finite mixture model," Electronic Commerce Research and Applications, vol. 12, no. 3, pp. 195–207, 2013.
[14] S. Ouaftouh, A. Zellou, and A. Idri, "User profile model: a user dimension based classification," in Proceedings of the 2015 10th International Conference on Intelligent Systems: Theories and Applications (SITA), Rabat, Morocco, 2015.
[15] Y. Diao, K. Y. Liu, and L. Hu, "Classification of massive user load characteristics in distribution network based on agglomerative hierarchical algorithm," in Proceedings of the 2016 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), Chengdu, China, 2017.
[16] N. Chen, Y. Liu, and H.-C. Chao, "Overlapping community detection using non-negative matrix factorization with orthogonal and sparseness constraints," IEEE Access, vol. 6, pp. 21266–21274, 2017.
[17] Z. H. Zhang, D. Q. Miao, and J. Qian, "Detecting overlapping communities with heuristic expansion method based on rough neighborhood," Chinese Journal of Computer, vol. 36, no. 10, 2013.
[18] F. Havemann, M. Heinz, and A. Struch, "Identification of overlapping communities and their hierarchy by locally calculating community-changing resolution levels," Journal of Statistical Mechanics: Theory and Experiment, vol. 1, 2011.
[19] L. Guo, W. Zuo, T. Peng, and B. K. Adhikari, "Attribute-based edge bundling for visualizing social networks," Physica A: Statistical Mechanics and Its Applications, vol. 438, pp. 48–55, 2015.
[20] E. Amigo, J. Gonzalo, J. Artiles, and F. Verdejo, "A comparison of extrinsic clustering evaluation metrics based on formal constraints," Information Retrieval, vol. 12, no. 4, pp. 461–486, 2009.
Complexity 3
negative
positive
controversial
insufficient
+1
+10-1
-1
Figure 1 Coordinate distribution of four types of nodes
has much positive and negative information The insufficientnode has little positive and negative information Neither ofthese two types of nodes can be classified into one specificpartition To reduce the impact of invalid or redundant dataon the accuracy of the analysis the authors of this paperonly focus on the nodes in positive and negative areasIn addition by manually analyzing the nodes in positiveand negative regions it is found that the data about thesenodes is authoritative and clean It is enough to describethe information about consumption structure of users andsufficient for the subsequent experiments
4 Analysis of Consumption Coefficient
By analyzing the data obtained in the above process thematrix ITEM(119909 y) is constructed from positive and negativedata Each node in the matrix describes the situation ofthe positive and negative comments about a consumptionobject By comparing the consumption matrixes from dif-ferent periods the consumption trends changes to thestructure and consumption upgrading can be judged andidentified
With the continuous expansion of consumption scaleconsumersrsquo personalized demands are becoming increasinglyobvious Consumer behavior such as purchase decisionsis affected by multiple factors so consumption hotspotsand structures often change This change may be strongor weak of course significant changes in structures arerelatively easy to detect but weak changes are difficultto capture Therefore to identify changes in consumptionstructures it is necessary to calculate the proportions ofdifferent consumption objects in the total consumption fieldand discover changes of consumption structure in time byanalyzing changes in proportion In this paper we calculatethe consumption coefficient of each consumption objectmeasure the proportion of different consumption objects inthe total consumption field and then identify changes andtrends in the consumption structures of users
The formula for calculating the consumption coefficientis as follows
coeff = ℎ119890119886119905 (119902)ℎ119890119886119905 (119886119897119897) times 100 (6)
coeff is the consumption coefficient of consumptionobject q heat(q) represents the heat of the consumption
object q heat(all) represents the heat of all consumptionobjects It can be seen from the formula that the consump-tion coefficient is the proportion of a certain consumptionobject in the total consumption objects The consumptioncoefficient is added to the matrix ITEM(119909 y) as an addi-tional parameter so the expression of the matrix becomesITEM(119909 y coeff ) Therefore the structural characteristics ofconsumption can be described from three dimensions Byanalyzing ITEM(119909 y coeff ) the implied information aboutconsumption structure consumption trend and consump-tion upgrading can be obtainedThe detailed analysis processis described in the next section
5 Discovery of IndividualConsumption Upgrading
To compare the consumption data fromdifferent periods thediffering degrees of the matrixes that describe the consump-tion structures in different periods need to be calculatedIf the difference degree exceeds a certain threshold valuethen the consumption upgrading phenomenon is consideredas happening Here the consumption matrix at momentn is denoted as ITEMn and the consumption matrix atmoment n+1 is denoted as ITEMn+1 By comparing ITEMnand ITEMn+1 the differences of different consumptionmatrixes can be calculated and structural changes can bedetectedThe formula for calculating degree of difference is asfollows
119862119874119877(ITEM119899 ITEM119899+1)= sum119899119894=1 (119868119879119864119872119894 minus 119868119879119864119872119899)sum119899+1119895=1 (119868119879119864119872119895 minus 119868119879119864119872119899+1)radicsum119899119894=1 (119868119879119864119872119894 minus 119868119879119864119872119899)2sum119899+1119895=1 (119868119879119864119872119895 minus 119868119879119864119872119899+1)2
(7)
According to the formula the coefficient COR is obtainedby dividing the covariance by the standard deviation of twovariables The covariance can reflect the correlation degreebetween two random variables When the covariance isgreater than 0 it means that the two variables are positivelycorrelated and when the covariance is less than 0 it meansthat the two variables are negatively correlated Note that thecoefficient is meaningful when both variables are not zeroand the range of the coefficient is [-1 1] When COR is 1ITEMn and ITEMn+1 are completely positively correlatedWhen COR is -1 ITEMn and ITEMn+1 are completely
4 Complexity
20
18
16
14
12
10
00 02 04 06 08 10
+10
X
Y
(a) COR is +10
00
02
04
06
08
10
00
00
02 04 06 08 10X
Y
(b) COR is 0-10
00
02
04
06
08
10
00 02 04 06 08 10X
Y
(c) COR is -10
Figure 2 The coefficient COR of different consumption structure
negatively correlated The greater the absolute value of CORis the stronger the correlation degree between ITEMn andITEMn+1 is The closer the coefficient COR is to 0 the weakerthe correlation degree between ITEMn and ITEMn+1 is
Through the above methods we can build consumptionmatrixes for different periods and judge the differences inconsumption structure in between periods by calculating thecoefficient CORWhenCOR approaches 1 or -1 it means thatthe consumption structure significantly changed so it can beconsidered that consumption upgrading occurred As shownin Figure 2 the closer the coefficient is 1 or -1 the greaterthe structural difference is while a coefficient that is near 0indicates that the consumption structure changed little andthere is no upgrading
6 Experiment
The datasets used throughout the experiments are ZacharyrsquosKarate Club(httpwww-personalumichedusimmejnnetda-ta) DolphinrsquosAssociations(httpwww-personalumichedu
simmejnnetdata) LesMiserables(httpwikigephiorgindexphpDatasets) MovieLens(httpwwwdatatangcomdatar-esdetailaspxid=44295) and EP dataset(httpwwwdian-pingcom)(1) The dataset of Zacharyrsquos Karate Club is a socialnetwork of friendships between 34 members so edges in thegraph describe the higher frequency of interactions betweenmembers(2)The dataset of Dolphinrsquos Associations is an undirectedsocial network of frequent associations between 62 dolphinswhich has 62 nodes and 159 edges(3) The dataset of LesMiserables is a coappearance net-work of characters in LesMiserables which contains 77 nodesand 254 edges(4)Thedataset of MovieLens is a synthesized recommen-dation system and virtual community which is commonlyused for social computing(5) The EP dataset was captured from an e-commerceplatform (dianpingcom) It contains 15890209 pieces of dataand was updated in August 2018The data collection fields are
Complexity 5
1
minus05 00 05
minus1
0
Figure 3 The positive and negative distribution of comments
shop id (uniqueness) province city city id area big cate(the primary classification) big cate id small cate (thesecondary classification) small cate id service rating allremarks very good remarks (5-star review) good remarks(4-star review) common remarks (3-star review) badremarks (2-star review) and very bad remarks (1-starreview)
Comparison Methods NMFOSC [16] presents an approachto community detection that utilizes a nonnegative matrixfactorization model to divide overlapping communities fromnetworks RNM [17] is a local expansion method based onrough neighborhood CPM [18] greedily expands naturalcommunities of seeds until the whole graph is covered byusing a local fitness function EdgeB-Cluster [19] bundlessimilar edges adjusts the locations of nodes to optimize thevisualized output of the graph and analyzes networks from acommunity level
Through the analysis of a consumption object in a certainregion in the e-commerce platform it was found that thenumber of positive comments is very large This is becausethere is a phenomenon of deliberately increasing the numberof good comments to improve the storersquos reputation whichresults in the presence of too many good comments Onthe contrary the numbers of neutral and negative commentsare relatively reasonable and few of these comments areintentionally added or deleted so they are convincing Basedon the above factors the authors of this paper did not analyzethe quantity of positive comments and only consideredthe quantity of neutral and negative comments Throughexperimental verification of the quality of the neutral andnegative comments it was found that the data is authenticand abundant and enough to describe the object to betested
Figure 3 shows the distribution of positive and negative comments. The nodes in the insufficient and controversial areas do not provide valuable information for subsequent analysis, so they were removed. There are more positive nodes than negative ones, and the difference between them is large. Note that although the nodes in the negative area represent negative user comments, they still provide useful knowledge about consumption trends and therefore cannot be removed.
Figure 3 shows the regional analysis results, while Figure 4 shows the overall analysis results. Figure 4 is the visual output of the results generated by the algorithm. Figure 4(a) shows the distribution of nodes, where the color of a node indicates the heat of the corresponding consumption object: the darker the node, the more attention it received. Figure 4(b) is a heat map of a center node, which is shown in black. It shows that the center node is correlated with a large number of other nodes, indicating that there are many strong correlations between different consumer groups. Thus, the characteristics of consumption objects can be further analyzed based on the relationships between consumers and commodities.
Figure 5 depicts the distribution of different consumption objects. Here, a node with a ratio greater than 0.6 is regarded as a popular consumption node, while a node with a ratio less than or equal to 0.6 is regarded as an unpopular consumption node. If the ratio threshold is lowered, additional nodes fall into the popular consumption area. Figure 5 shows that most nodes belong to the non-hot area, which is in line with the actual situation.
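The threshold rule can be illustrated with a short sketch. The function name and the sample ratios are hypothetical; only the 0.6 cut-off and the strict/non-strict comparison come from the text.

```python
def split_by_heat(ratios, threshold=0.6):
    """Split node indices into popular (ratio > threshold) and unpopular (<= threshold)."""
    popular = [i for i, r in enumerate(ratios) if r > threshold]
    unpopular = [i for i, r in enumerate(ratios) if r <= threshold]
    return popular, unpopular


# Made-up ratios for five nodes.
ratios = [0.95, 0.60, 0.61, 0.10, 0.45]
popular, unpopular = split_by_heat(ratios)

# Lowering the threshold moves additional nodes into the popular area.
lowered_popular, _ = split_by_heat(ratios, threshold=0.4)
```

A node with a ratio of exactly 0.6 is classed as unpopular, matching the "less than or equal to" wording.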
Figure 6(a) depicts the characteristics of nodes divided into two categories that describe different consumption heat (some representative nodes are extracted). The characteristics of nodes in different categories vary greatly. Figure 6(b) describes the closeness centrality distribution of the nodes belonging to the same category. It shows that the node locations follow a normal distribution, so the similarity between the nodes in the same category is very high; that is, the classification is reasonable.
Figure 4: Visualization of the results generated by the algorithm. (a) Distribution of nodes. (b) The heat map.

Figure 5: The distribution of different consumption objects (horizontal axis: node, ×10^4; vertical axis: ratio).

To analyze the experimental results, the following measures are used [20]. Multiplicity Precision is calculated by $MP = \min(|C(e) \cap C(e')|, |L(e) \cap L(e')|) / |C(e) \cap C(e')|$ and Multiplicity Recall by $MR = \min(|C(e) \cap C(e')|, |L(e) \cap L(e')|) / |L(e) \cap L(e')|$. Let $L(e)$ and $C(e)$ denote the category and the cluster of an item $e$; $e$ is a cluster with $n$ items belonging to the same category, and $e'$ is a cluster merging $n$ items from unary categories. FB is a comprehensive measure of MP and MR, calculated as $FB = 2 \times MP \times MR / (MP + MR)$.
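The measures above can be sketched in code. The pairwise MP and MR terms follow the formulas in the text; averaging them first over the partners of each item and then over all items is an assumption borrowed from the extended BCubed scheme in [20], since the text does not state how the pairwise values are aggregated. The function name and toy data are hypothetical.

```python
from statistics import mean


def multiplicity_scores(clusters, categories):
    """Compute averaged MP, MR and FB for a (possibly overlapping) clustering.

    clusters[e]   -> set of cluster ids assigned to item e, i.e. C(e)
    categories[e] -> set of gold category labels of item e, i.e. L(e)
    Assumption: every item has at least one cluster and one category.
    """
    items = list(clusters)
    precisions, recalls = [], []
    for e in items:
        p_terms, r_terms = [], []
        for other in items:
            shared_clusters = len(clusters[e] & clusters[other])
            shared_categories = len(categories[e] & categories[other])
            overlap = min(shared_clusters, shared_categories)
            if shared_clusters:      # the pair co-occurs in some cluster
                p_terms.append(overlap / shared_clusters)
            if shared_categories:    # the pair co-occurs in some category
                r_terms.append(overlap / shared_categories)
        precisions.append(mean(p_terms))
        recalls.append(mean(r_terms))
    mp, mr = mean(precisions), mean(recalls)
    fb = 2 * mp * mr / (mp + mr) if mp + mr else 0.0
    return mp, mr, fb


# Toy data: two gold categories, one item of category Y placed in the wrong cluster.
categories = {"a": {"X"}, "b": {"X"}, "c": {"Y"}, "d": {"Y"}}
clusters = {"a": {1}, "b": {1}, "c": {1}, "d": {2}}
mp, mr, fb = multiplicity_scores(clusters, categories)
```

Representing C(e) and L(e) as sets keeps the same code usable for overlapping communities, where an item can belong to several clusters at once.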
Table 1 compares the algorithm proposed in this paper with other similar algorithms and supports its validity and feasibility. The numbers in italic indicate the highest value of the same parameter in each row. The Karate Club, Dolphin, LesMiserables, and MovieLens datasets are used to evaluate the performance of the algorithms in structural analysis, and the EP dataset is used to evaluate performance in e-commerce data analysis. EC-Structure attains the highest FB on every dataset (tied with EdgeB-Cluster on Karate Club) and performs stably across datasets. The main reasons that EC-Structure outperforms the other algorithms are that (1) it reduces the influence of erroneous e-business platform data on the algorithm, (2) it introduces the consumption coefficient as a parameter with which to measure the proportions of different consumption objects, and (3) the coefficient COR helps researchers accurately judge changes in consumption structures. Therefore, the algorithm is effective in practice.
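As a small cross-check of the comparison, the sketch below picks the method with the highest FB for each dataset. The FB values are transcribed from Table 1 on the assumption that the decimal points lost in extraction belong after the first digit (for example, 096 read as 0.96); the helper name is hypothetical.

```python
# FB values per dataset and method, transcribed from Table 1
# (decimal points restored; this reading is an assumption).
fb_scores = {
    "Karate Club":   {"NMFOSC": 0.96, "RNM": 0.91, "CPM": 0.71, "EdgeB-Cluster": 1.00, "EC-Structure": 1.00},
    "Dolphin":       {"NMFOSC": 0.75, "RNM": 0.62, "CPM": 0.56, "EdgeB-Cluster": 0.83, "EC-Structure": 0.88},
    "LesMiserables": {"NMFOSC": 0.79, "RNM": 0.84, "CPM": 0.62, "EdgeB-Cluster": 0.84, "EC-Structure": 0.85},
    "MovieLens":     {"NMFOSC": 0.84, "RNM": 0.68, "CPM": 0.83, "EdgeB-Cluster": 0.84, "EC-Structure": 0.85},
    "EP dataset":    {"NMFOSC": 0.82, "RNM": 0.54, "CPM": 0.58, "EdgeB-Cluster": 0.81, "EC-Structure": 0.83},
}


def best_methods(row):
    """Return every method that reaches the top FB in one table row (ties kept)."""
    top = max(row.values())
    return sorted(name for name, value in row.items() if value == top)


winners = {dataset: best_methods(row) for dataset, row in fb_scores.items()}
```

Keeping ties in the result makes the shared top score on Karate Club visible instead of silently choosing one method.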
7 Conclusion
Research on consumer behavior can be conducted by extracting and analyzing useful information from a large amount of incomplete, vague, and random consumer behavior data. The algorithm proposed in this paper builds consumption structures and a consumption upgrading model from e-commerce platform data to analyze whether consumption upgrading has occurred. The experimental results verified the implementation efficacy and analysis accuracy of the algorithm: its efficacy is superior to that of the other algorithms compared, and it runs stably on different datasets.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This work was supported by the Youth Fund of Humanity and Social Science of the Ministry of Education of China (Grant no. 18YJCZH041) and the Project of the Education Department of Jilin Province of China (Grant no. JJKH20190612SK).
Table 1: The performance comparisons.

               NMFOSC            RNM               CPM               EdgeB-Cluster     EC-Structure
Dataset        MR    MP    FB    MR    MP    FB    MR    MP    FB    MR    MP    FB    MR    MP    FB
Karate Club    1.00  0.92  0.96  0.84  1.00  0.91  0.58  0.94  0.71  1.00  1.00  1.00  1.00  1.00  1.00
Dolphin        0.64  0.90  0.75  0.46  0.97  0.62  0.40  0.94  0.56  0.73  0.98  0.83  0.80  0.98  0.88
LesMiserables  0.72  0.87  0.79  0.80  0.88  0.84  0.48  0.89  0.62  0.81  0.88  0.84  0.88  0.83  0.85
MovieLens      0.83  0.85  0.84  0.56  0.86  0.68  0.81  0.86  0.83  0.81  0.88  0.84  0.82  0.88  0.85
EP dataset     0.85  0.80  0.82  0.53  0.56  0.54  0.53  0.65  0.58  0.79  0.83  0.81  0.85  0.82  0.83
Figure 6: The characteristic distribution of hot consumption and cold consumption nodes. (a) Degree distribution (horizontal axis: value; vertical axis: count). (b) Closeness centrality distribution (horizontal axis: value; vertical axis: count). The plots mark a hot consumption area and a cold consumption area.
References
[1] P. N. Ireland, "Using the permanent income hypothesis for forecasting," Federal Reserve Bank of Richmond Economic Quarterly, vol. 81, no. 1, pp. 49–63, 1995.
[2] L. A. Fisher and G. Kingston, "Improved forecasts of tax revenue via the permanent income hypothesis," Australian Economic Review, vol. 50, no. 1, pp. 21–31, 2017.
[3] L. Zhou, C. Wang, and S. O. Finance, "Household debt and consumption-evidence from micro data," Soft Science, vol. 3, pp. 32–43, 2018.
[4] M. Zagler, "Empirical evidence on growth and business cycles," Empirica, vol. 44, pp. 1–20, 2017.
[5] C. Curme, T. Preis, and H. E. Stanley, "Quantifying the semantics of search behavior before stock market moves," Proceedings of the National Academy of Sciences of the United States of America, vol. 111, no. 32, pp. 11600–11605, 2014.
[6] J. Reichardt and S. Bornholdt, Ebay users form stable groups of common interest, 2005.
[7] H. Halpin, "The semantics of search," in Social Semantics, pp. 149–186, Springer US, 2013.
[8] T. Chattopadhyay, S. Maiti, A. Pal et al., "Automatic discovery of emerging trends using cluster name synthesis on user consumption data: extended abstract," in Proceedings of International Conference Companion on World Wide Web, pp. 981–983, 2016.
[9] L. Guo, W. Zuo, and T. Peng, "Inference network building and movements prediction based on analysis of induced dependencies," IET Software, vol. 11, no. 1, pp. 12–17, 2017.
[10] B. Glass, Z. Benenson, and R. Landwirth, "Look before you leap: improving the users' ability to detect fraud in electronic marketplaces," in Proceedings of the CHI Conference on Human Factors in Computing Systems, pp. 3870–3882, ACM, 2016.
[11] P. Singh and M. Singh, "Fraud detection by monitoring customer behavior and activities," Annals of Regional Science, vol. 49, no. 1, pp. 1–27, 2012.
[12] D. Aviano, B. L. Putro, and E. P. Nugroho, "Behavioral tracking analysis on learning management system with apriori association rules algorithm," in Proceedings of the 2017 3rd International Conference on Science in Information Technology (ICSITech), Bandung, Indonesia, 2017.
[13] K. Kim, Y. Choi, and J. Park, "Pricing fraud detection in online shopping malls using a finite mixture model," Electronic Commerce Research and Applications, vol. 12, no. 3, pp. 195–207, 2013.
[14] S. Ouaftouh, A. Zellou, and A. Idri, "User profile model: a user dimension based classification," in Proceedings of the 2015 10th International Conference on Intelligent Systems: Theories and Applications (SITA), Rabat, Morocco, 2015.
[15] Y. Diao, K. Y. Liu, and L. Hu, "Classification of massive user load characteristics in distribution network based on agglomerative hierarchical algorithm," in Proceedings of the 2016 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), Chengdu, China, 2017.
[16] N. Chen, Y. Liu, and H.-C. Chao, "Overlapping community detection using non-negative matrix factorization with orthogonal and sparseness constraints," IEEE Access, vol. 6, pp. 21266–21274, 2017.
[17] Z. H. Zhang, D. Q. Miao, and J. Qian, "Detecting overlapping communities with heuristic expansion method based on rough neighborhood," Chinese Journal of Computer, vol. 36, no. 10, 2013.
[18] F. Havemann, M. Heinz, and A. Struch, "Identification of overlapping communities and their hierarchy by locally calculating community-changing resolution levels," Journal of Statistical Mechanics: Theory and Experiment, vol. 1, 2011.
[19] L. Guo, W. Zuo, T. Peng, and B. K. Adhikari, "Attribute-based edge bundling for visualizing social networks," Physica A: Statistical Mechanics and its Applications, vol. 438, pp. 48–55, 2015.
[20] E. Amigó, J. Gonzalo, J. Artiles, and F. Verdejo, "A comparison of extrinsic clustering evaluation metrics based on formal constraints," Information Retrieval, vol. 12, no. 4, pp. 461–486, 2009.
Complexity 5
1
minus05 00 05
minus1
0
Figure 3 The positive and negative distribution of comments
shop id (uniqueness) province city city id area big cate(the primary classification) big cate id small cate (thesecondary classification) small cate id service rating allremarks very good remarks (5-star review) good remarks(4-star review) common remarks (3-star review) badremarks (2-star review) and very bad remarks (1-starreview)
Comparison Methods NMFOSC [16] presents an approachto community detection that utilizes a nonnegative matrixfactorization model to divide overlapping communities fromnetworks RNM [17] is a local expansion method based onrough neighborhood CPM [18] greedily expands naturalcommunities of seeds until the whole graph is covered byusing a local fitness function EdgeB-Cluster [19] bundlessimilar edges adjusts the locations of nodes to optimize thevisualized output of the graph and analyzes networks from acommunity level
Through the analysis of a consumption object in a certainregion in the e-commerce platform it was found that thenumber of positive comments is very large This is becausethere is a phenomenon of deliberately increasing the numberof good comments to improve the storersquos reputation whichresults in the presence of too many good comments Onthe contrary the numbers of neutral and negative commentsare relatively reasonable and few of these comments areintentionally added or deleted so they are convincing Basedon the above factors the authors of this paper did not analyzethe quantity of positive comments and only consideredthe quantity of neutral and negative comments Throughexperimental verification of the quality of the neutral andnegative comments it was found that the data is authenticand abundant and enough to describe the object to betested
Figure 3 shows the distribution of positive and negativecomments The nodes in the insufficient and controversialareas do not provide valuable information for subsequentanalysis so they were removed It can be seen that there aremore positive nodes than negative ones and the difference
between them is large It is important to note that althoughthe nodes in the negative area represent that the userrsquoscomments are negative they still provide useful knowledgeabout consumption trends that cannot be removed
Figure 3 shows the regional analysis results while Figure 4shows the overall analysis results Figure 4 is the visual outputof the results generated by the algorithm Figure 4(a) showsthe distribution of nodes and the colors of nodes indicatethe heat of different consumption objects The darker a nodecolor is the more attention the node received Figure 4(b)is a heat map of a center node and the black node isthe center node Figure 4(b) shows that there is a certaincorrelation between the central node and a large number ofother nodes indicating that there are many high correlationsbetween different consumer groups Thus the characteristicsof consumption objects can be further analyzed based on therelationships between consumers and commodities
Figure 5 depicts the distribution of different consumptionobjects In this case the node with a ratio of more than06 is regarded as a popular consumption node while anode with a ratio of less than or equal to 06 is regardedas an unpopular consumption node Of course if the ratiothreshold is lowered then additional nodes will be dividedinto popular consumption areas It can be seen from Figure 5that most nodes belong to the nonhot field which is in-linewith the actual situation
Figure 6(a) depicts the characteristics of nodes that weredivided into two categories to describe different consumptionheat (some representative nodes are extracted) It can beseen that the characteristics of nodes in different categoriesvary greatly Figure 6(b) describes the closeness centralitydistribution of the nodes belong to the same category Thisshows that the node locations have normal distribution sothe similarity between the nodes in the same category is veryhigh That is to say the classification is reasonable
For the purpose of analyzing the experimental resultsthe following measurement parameters are used [20] Mul-tiplicity Precision calculated by 119872119875 = min(|119862(119890) cap 119862(1198901015840)||119871(119890) cap 119871(1198901015840)|)|119862(119890) cap 119862(1198901015840)| Multiplicity Recall by 119872119877 =min(|119862(119890) cap 119862(1198901015840)| |119871(119890) cap 119871(1198901015840)|)|119871(119890) cap 119871(1198901015840)| Let L(e) and
6 Complexity
(a) Distribution of nodes (b) The heat map
Figure 4 Visualization of the results generated by the algorithm
1
09
08
06
07
05
04
03
01
02
0 1 2 3 4 5 6 7
times 104node
ratio
Figure 5 The distribution of different consumption objects
C(e) denote the category and the cluster of an item e e is acluster with n items belonging to the same category and 1198901015840is a cluster merging n items from unary categories FB is acomprehensive measure ofMP andMR and the algorithm isFB=MPtimesMRtimes2(MP+MR)
Table 1 displays a comparison of the algorithm proposed in this paper with other similar algorithms and supports the validity and feasibility of the algorithm. The numbers in italic indicate the highest value of the same parameter in each row. The datasets Karate Club, Dolphin, LesMiserables, and MovieLens are used to evaluate the performance of the algorithms in structural analysis. The EP dataset is used to evaluate the performance of e-commerce data analysis. It is found that EC-Structure attains the highest FB on every dataset (tied with MEdgeB-Cluster on Karate Club) and performs stably with different datasets. The main reasons for which EC-Structure is superior to the other algorithms are that (1) it reduces the influence of erroneous e-business platform data on the algorithm, (2) it adds the consumption coefficient as a parameter with which to measure the proportions of different consumption objects, and (3) the coefficient COR can help researchers accurately judge changes in consumption structures. Therefore, the algorithm is effective in operation.
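As a quick consistency check, the FB value reported for EC-Structure on the EP dataset can be recomputed from the corresponding MR and MP entries of Table 1 (taking MR = 0.85 and MP = 0.82). The worked arithmetic below only illustrates the FB formula and adds no new result:

```latex
\[
FB = \frac{2 \times MP \times MR}{MP + MR}
   = \frac{2 \times 0.82 \times 0.85}{0.82 + 0.85}
   = \frac{1.394}{1.67}
   \approx 0.83
\]
```

The result agrees with the FB entry listed for EC-Structure on the EP dataset.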
7 Conclusion
Research on consumer behavior can be conducted by extracting and analyzing useful information from a large amount of incomplete, vague, and random consumer behavior data. The algorithm proposed in this paper builds consumption structures and a consumption upgrading model based on the data from e-commerce platforms to analyze whether consumption upgrading has occurred. The results of the experiment verified the implementation efficacy and analysis accuracy of the algorithm. The implementation efficacy of the proposed algorithm is superior to those of the other algorithms compared, and it runs stably with different datasets.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This work was supported by the Youth Fund of Humanity and Social Science of the Ministry of Education of China (Grant no. 18YJCZH041) and the Project of the Education Department of Jilin Province of China (Grant no. JJKH20190612SK).
Table 1: The performance comparisons.

Dataset        |      NMFO      |      SCRN      |      MCP       | MEdgeB-Cluster |  EC-Structure
               | MR   MP   FB   | MR   MP   FB   | MR   MP   FB   | MR   MP   FB   | MR   MP   FB
Karate Club    | 1.00 0.92 0.96 | 0.84 1.00 0.91 | 0.58 0.94 0.71 | 1.00 1.00 1.00 | 1.00 1.00 1.00
Dolphin        | 0.64 0.90 0.75 | 0.46 0.97 0.62 | 0.40 0.94 0.56 | 0.73 0.98 0.83 | 0.80 0.98 0.88
LesMiserables  | 0.72 0.87 0.79 | 0.80 0.88 0.84 | 0.48 0.89 0.62 | 0.81 0.88 0.84 | 0.88 0.83 0.85
MovieLens      | 0.83 0.85 0.84 | 0.56 0.86 0.68 | 0.81 0.86 0.83 | 0.81 0.88 0.84 | 0.82 0.88 0.85
EP dataset     | 0.85 0.80 0.82 | 0.53 0.56 0.54 | 0.53 0.65 0.58 | 0.79 0.83 0.81 | 0.85 0.82 0.83
Figure 6: The characteristic distribution of hot consumption and cold consumption nodes (the plot marks a cold consumption area and a hot consumption area). (a) Degree distribution (x-axis: Value; y-axis: Count). (b) Closeness centrality distribution (x-axis: Value; y-axis: Count).
References
[1] P. N. Ireland, “Using the permanent income hypothesis for forecasting,” Federal Reserve Bank of Richmond Economic Quarterly, vol. 81, no. 1, pp. 49–63, 1995.
[2] L. A. Fisher and G. Kingston, “Improved forecasts of tax revenue via the permanent income hypothesis,” Australian Economic Review, vol. 50, no. 1, pp. 21–31, 2017.
[3] L. Zhou, C. Wang, and S. O. Finance, “Household debt and consumption-evidence from micro data,” Soft Science, vol. 3, pp. 32–43, 2018.
[4] M. Zagler, “Empirical evidence on growth and business cycles,” Empirica, vol. 44, pp. 1–20, 2017.
[5] C. Curme, T. Preis, and H. E. Stanley, “Quantifying the semantics of search behavior before stock market moves,” Proceedings of the National Academy of Sciences of the United States of America, vol. 111, no. 32, pp. 11600–11605, 2014.
[6] J. Reichardt and S. Bornholdt, Ebay users form stable groups of common interest, 2005.
[7] H. Halpin, “The semantics of search,” in Social Semantics, pp. 149–186, Springer US, 2013.
[8] T. Chattopadhyay, S. Maiti, A. Pal et al., “Automatic discovery of emerging trends using cluster name synthesis on user consumption data: extended abstract,” in Proceedings of International Conference Companion on World Wide Web, pp. 981–983, 2016.
[9] L. Guo, W. Zuo, and T. Peng, “Inference network building and movements prediction based on analysis of induced dependencies,” IET Software, vol. 11, no. 1, pp. 12–17, 2017.
[10] B. Glass, Z. Benenson, and R. Landwirth, “Look before you leap: improving the users’ ability to detect fraud in electronic marketplaces,” in Proceedings of the CHI Conference on Human Factors in Computing Systems, pp. 3870–3882, ACM, 2016.
[11] P. Singh and M. Singh, “Fraud detection by monitoring customer behavior and activities,” Annals of Regional Science, vol. 49, no. 1, pp. 1–27, 2012.
[12] D. Aviano, B. L. Putro, and E. P. Nugroho, “Behavioral tracking analysis on learning management system with apriori association rules algorithm,” in Proceedings of the 2017 3rd International Conference on Science in Information Technology (ICSITech), Bandung, Indonesia, 2017.
[13] K. Kim, Y. Choi, and J. Park, “Pricing fraud detection in online shopping malls using a finite mixture model,” Electronic Commerce Research and Applications, vol. 12, no. 3, pp. 195–207, 2013.
[14] S. Ouaftouh, A. Zellou, and A. Idri, “User profile model: a user dimension based classification,” in Proceedings of the 2015 10th International Conference on Intelligent Systems: Theories and Applications (SITA), Rabat, Morocco, 2015.
[15] Y. Diao, K. Y. Liu, and L. Hu, “Classification of massive user load characteristics in distribution network based on agglomerative hierarchical algorithm,” in Proceedings of the 2016 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), Chengdu, China, 2017.
[16] N. Chen, Y. Liu, and H.-C. Chao, “Overlapping community detection using non-negative matrix factorization with orthogonal and sparseness constraints,” IEEE Access, vol. 6, pp. 21266–21274, 2017.
[17] Z. H. Zhang, D. Q. Miao, and J. Qian, “Detecting overlapping communities with heuristic expansion method based on rough neighborhood,” Chinese Journal of Computer, vol. 36, no. 10, 2013.
[18] F. Havemann, M. Heinz, and A. Struch, “Identification of overlapping communities and their hierarchy by locally calculating community-changing resolution levels,” Journal of Statistical Mechanics: Theory and Experiment, vol. 1, 2011.
[19] L. Guo, W. Zuo, T. Peng, and B. K. Adhikari, “Attribute-based edge bundling for visualizing social networks,” Physica A: Statistical Mechanics and Its Applications, vol. 438, pp. 48–55, 2015.
[20] E. Amigo, J. Gonzalo, J. Artiles, and F. Verdejo, “A comparison of extrinsic clustering evaluation metrics based on formal constraints,” Information Retrieval, vol. 12, no. 4, pp. 461–486, 2009.