Market Manipulation of Bitcoin: Evidence from Mining the Mt. Gox Transaction Network

Weili Chen, Jun Wu, Zibin Zheng*, Chuan Chen*, and Yuren Zhou
School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
National Engineering Research Center of Digital Life, Sun Yat-sen University, Guangzhou, 510006, China
{chenwli9, wujun53}@mail2.sysu.edu.cn, {zhzibin, chenchuan, zhouyuren}@mail.sysu.edu.cn
* Zibin Zheng and Chuan Chen are both corresponding authors.
arXiv:1902.01941v1 [q-fin.GN] 19 Jan 2019

Abstract: The cryptocurrency market is a huge market without effective supervision. It is of great importance for investors and regulators to recognize whether market manipulation exists and what patterns it takes. This paper proposes an approach that mines the transaction networks of exchanges to answer this question. Taking the leaked transaction history of the Mt. Gox Bitcoin exchange as a sample, we first divide the accounts into three categories according to their characteristics and then construct three graphs from the transaction history. Many observations and findings are obtained by analyzing the constructed graphs. To evaluate the influence of the accounts' transaction behavior on the Bitcoin exchange price, the graphs are rebuilt as time series and reshaped into matrices. By applying singular value decomposition (SVD) to the matrices, we identify many base networks that are strongly correlated with the price fluctuation. When further analyzing the most important accounts in the base networks, we find plenty of market manipulation patterns. Based on these findings, we conclude that there was serious market manipulation on the Mt. Gox exchange and that supervision of the cryptocurrency market must be strengthened.

Index Terms: Bitcoin, Blockchain, Transaction network, Temporal network, Singular value decomposition

I. INTRODUCTION

Bitcoin has become one of the hottest buzzwords among investors and researchers. It is the first and most famous decentralized digital currency [1], and because it is secured by cryptography we call it a cryptocurrency. Unlike fiat currencies, which are usually issued by financial institutions, no centralized organization or country controls the issuance and operation of Bitcoin. Furthermore, because of decentralization, users in the Bitcoin system are anonymous. These two characteristics (i.e., decentralization and anonymity) have attracted many users to Bitcoin since its creation in 2009. It is estimated that there are more than 10 million users in the Bitcoin system [2].

Since the famous “Bitcoin Pizza Day”, when a programmer bought two pizzas with 10,000 BTC on May 22, 2010, Bitcoin has been exchanged for fiat currencies. Soon afterward, a Bitcoin exchange, Mt. Gox, launched. By 2013, and until it filed for bankruptcy protection in February 2014, Mt. Gox was the largest Bitcoin intermediary and the world's leading Bitcoin exchange [3]. Nowadays, there are more than 1,700 cryptocurrencies inspired by Bitcoin, and the daily transaction volume is over $150 billion according to coinmarketcap.com at the time of writing.

The huge fluctuation of cryptocurrency exchange prices is an important reason why investors are attracted to the market. Figure 1 shows the Bitcoin price (in this paper, the exchange rate between Bitcoin and the US dollar) from 2012/12 to 2015/6. During this period, the Bitcoin price rose sharply from about $10/BTC to more than $1,000/BTC and then fell back below $200/BTC. This extreme price fluctuation has also led many researchers to look for the determinants of the Bitcoin price. Four categories of factors are discussed: 1) economic factors (e.g., the supply and demand of Bitcoin) [4]; 2) technical factors (e.g., hash rate and difficulty) [5]; 3) interest factors (through proxy variables such as Google Trends) [6]; and 4) other financial assets (e.g., gold, stocks). In addition, using principal component analysis (analogous to SVD), the paper [7] indicates that the Bitcoin price is strongly correlated with the transactions on the blockchain ledger.

Fig. 1. Bitcoin-USD exchange price at Bitstamp exchange, with the period being studied shaded.

However, these factors are discussed based on data outside the exchanges. Because of the lack of supervision, a natural conjecture is that the extreme fluctuation may be related to market manipulation on the exchanges. This conjecture is hard to verify, as it is very difficult to obtain detailed trading data from a trading platform. Surprisingly, many transaction histories from April 2011 to November 2013 of the once famous Bitcoin exchange Mt. Gox were leaked in the
form of CSV files. These data provide a perfect opportunity to test the conjecture.

Verifying whether market manipulation exists and identifying possible manipulation patterns is urgent and of great importance, as plenty of investors who dream of getting rich overnight are attracted to the market. The answer to this question will help investors recognize the potential risks and help regulators shape legislation. Based on the leaked data, a recent paper [8] points out that the Mt. Gox exchange manipulated the Bitcoin price; it builds a regression model to identify the influence of the activities of some suspicious accounts on the price. We adopt a completely different method and obtain more results, including fake volume, price manipulation, and manipulation patterns.

Figure 2 shows an overview of our analysis. We first verify the leaked data and remove many unreasonable records. Then, by comparing the transaction prices with the disclosed Mt. Gox price on quandl.com, we find many abnormal transactions. Using these transactions, we divide the accounts into three categories: extremely high account (EHA), extremely low account (ELA), and normal account (NMA). Next, we construct the extremely high graph (EHG), extremely low graph (ELG), and normal graph (NMG) by treating accounts as nodes and transactions as edges. We conduct various graph structure analyses on EHG, ELG, and NMG, such as node and edge classification, clustering measurement, and degree distribution. This investigation leads to new observations and findings. For example, the abnormal accounts (i.e., EHA and ELA) might be controlled by the exchange and used to provide liquidity and fake volume for the exchange. Finally, by dividing the graphs into daily snapshots and arranging them in a matrix, we extract base graphs through singular value decomposition (SVD). By doing this, we find that the abnormal accounts' transactions are strongly related to the Bitcoin price. Furthermore, we find many strange transaction patterns (such as self-loop, bi-direction, and triangle) among abnormal accounts. We consider these patterns evidence of market manipulation on the exchange.

Fig. 2. An overview of our analysis.

In summary, we make the following major contributions:

• To the best of our knowledge, this is the first study of cryptocurrency market manipulation via graph analysis and SVD. We demonstrate the effectiveness of the method by applying it to the leaked Mt. Gox transaction data.
• We obtain many new observations and findings by characterizing the activities of different accounts (i.e., static network analysis) and applying SVD to the daily snapshots of the graphs (i.e., temporal network analysis). These findings convince us that there were many market manipulation behaviors on the exchange.
• We detect many market manipulation patterns that have not been reported in this area. These patterns are strong evidence of market manipulation and can help investors and regulators recognize the dark side of the market and its severity.

The rest of the paper is organized as follows. After introducing the data set in Section II, we detail the static network analysis in Section III and the temporal network analysis in Section IV. Finally, we discuss related work in Section V and conclude the paper in Section VI.

II. DATA SET

In early 2014, the transaction history of Mt. Gox from April 2011 to November 2013 was leaked in the form of CSV files. Table I shows a segment of the leaked data recorded on 2013/10/01. Two rows with the same Trade Id form a complete transaction from the seller (Type=sell) to the buyer (Type=buy). The volume of the transaction is recorded in Bitcoins and the turnover in Money; thus the real-time price of Bitcoin at the moment of the transaction is Money/Bitcoins. Each user has a unique identity (User Id), with FIPS location codes recorded in the country (User Country) and state (User State) fields. Some other attributes (e.g., transaction fees) are not included in the table, as they are not used in this study.

Data Cleaning. As there are many duplicate entries in the leaked data, we clean the data in a way similar to previous studies [3], [8]. Specifically, we use the combination of four key fields (date, user ID, type, and Bitcoins) to remove duplicated entries (de-duplication strategy 2 in [3]). After this step, we remove all single-row transactions to make sure that each transaction has a corresponding buyer and seller (i.e., is a completed transaction). Then we remove all duplicated complete transactions. By doing this, the data narrow from approximately 18 million rows to 13.5 million rows (i.e., 6.7 million completed transactions). This method is stricter than the method in [8], as complete transactions with the same trade ID are treated as duplicates. We adopt a stricter method in the hope of providing more reliable results.
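The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the exact procedure used for the study: the dictionary keys (`trade_id`, `date`, `user_id`, `type`, `bitcoins`, `money`) and the function name `clean_trades` are hypothetical, and the sketch simply discards any trade that does not end up with exactly one buy row and one sell row.

```python
from collections import defaultdict


def clean_trades(rows):
    """Deduplicate rows and keep only complete buy/sell pairs.

    `rows` is an iterable of dicts with the (assumed) keys
    trade_id, date, user_id, type, bitcoins, money.
    """
    # Step 1: drop duplicate entries on (date, user ID, type, Bitcoins).
    seen = set()
    unique_rows = []
    for row in rows:
        key = (row["date"], row["user_id"], row["type"], row["bitcoins"])
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)

    # Step 2: group rows by trade ID, so each trade ID yields at most one trade.
    by_trade = defaultdict(list)
    for row in unique_rows:
        by_trade[row["trade_id"]].append(row)

    # Step 3: keep trades that have exactly one buy row and one sell row.
    trades = []
    for trade_id, legs in by_trade.items():
        sides = {leg["type"]: leg for leg in legs}
        if len(legs) != 2 or set(sides) != {"buy", "sell"}:
            continue  # single-row or malformed transaction
        buy, sell = sides["buy"], sides["sell"]
        if buy["bitcoins"] <= 0:
            continue  # price would be undefined
        trades.append({
            "trade_id": trade_id,
            "date": buy["date"],
            "seller": sell["user_id"],
            "buyer": buy["user_id"],
            "bitcoins": buy["bitcoins"],
            "price": buy["money"] / buy["bitcoins"],  # Money / Bitcoins
        })
    return trades
```

Applied to the first two rows of Table I, the sketch returns one complete transaction from user 295701 (seller) to user 125439 (buyer).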

Advantages. The leaked Mt. Gox data have many advantages for understanding transaction behavior in the cryptocurrency market and its influence on the price. First, Mt. Gox was the dominant exchange and Bitcoin was the main cryptocurrency during the period; thus an analysis of the cryptocurrency market based on this data set is more reliable and representative. Second, these data are much more finely grained than data extracted from the blockchain, since most trading activity is recorded only in the exchange. Furthermore, users can be identified by their accounts in the leaked data, whereas it is hard to identify a user on the blockchain because of its anonymity mechanism.

TABLE I
A SEGMENT OF THE LEAKED DATA

Trade Id          Date                User Id  Type  Currency  Bitcoins    Money     User Country  User State
1380587338975940  2013/10/1 0:28:58   125439   buy   USD       0.5         71.69169  US            NC
1380587338975940  2013/10/1 0:28:58   295701   sell  USD       0.5         71.69169  CA            QC
1380739642844790  2013/10/2 18:47:22  609336   buy   USD       0.26177217  33.96631  US            PA
1380739642844790  2013/10/2 18:47:22  36865    sell  USD       0.26177217  33.96631  US            CA

III. STATIC NETWORK ANALYSIS

A. Account Classification

Before delving deeper into the leaked Mt. Gox data, we check whether the Bitcoin exchange price of each transaction (i.e., Money/Bitcoins) falls between the highest and lowest disclosed exchange prices on the same day. To this end, we first download the Bitcoin exchange rates (BTC vs. USD) on Mt. Gox from quandl.com (we call this the reference price). Then we compare the exchange price of each transaction with the reference price. Surprisingly, we find some abnormal transactions with a very high or very low exchange price. For example, on 2013/08/30, one transaction (trade ID=1377875127221631) had an exchange price of $4933.84/BTC and another (trade ID=1377876535345547) had an exchange price of only $0.81/BTC, whereas the highest and lowest exchange prices in the downloaded data on the same day are $142.76/BTC and $128.56/BTC, respectively.

These transactions are abnormal, as the exchange price is clearly outside the reasonable range. To distinguish the transaction behavior of different accounts and its influence on the price, we divide all the accounts into three categories: extremely high account (EHA), extremely low account (ELA), and normal account (NMA). As a first step, we apply a simple approach to identify an abnormal transaction. Suppose the highest and lowest reference prices on day $t$ are $H_t$ and $L_t$. We regard a transaction with a real-time price larger than $1.5 \times H_t$ as an extremely high price transaction (EHT), and one with a real-time price lower than $0.5 \times L_t$ as an extremely low price transaction (ELT). Both kinds of transactions are referred to as abnormal transactions (ABTs). Note that we use $(0.5 \times L_t, 1.5 \times H_t)$ instead of $(L_t, H_t)$ to identify abnormal transactions because many exchanges (and thus many reference prices) exist at the same time, and we cannot be sure that the reference price is the real price on the exchange. The factors 0.5 and 1.5 are wide enough to avoid flagging any normal transaction. Finally, an account is an EHA if it has at least one extremely high price transaction, and an ELA if it has at least one extremely low price transaction. Both EHAs and ELAs are referred to as abnormal accounts (ABAs). Note that an abnormal account can be both an EHA and an ELA if it is involved in both EHTs and ELTs. An NMA is an account involved in no abnormal transactions; that is, all of its transactions are normal transactions (NMTs).

TABLE II
STATISTICS OF ACCOUNTS AND TRANSACTIONS

Category  Accounts  Tx       ABT     EHT     ELT
EHA       10702     1406850  179701  138743  40958
ELA       5835      2486807  85784   29737   56047
ABA       14916     3025992  194790  138743  56047
NMA       104427    812865   0       0       0
All       119343    6775117  194790  138743  56047
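The classification rule can be sketched as follows. This is only an illustration under stated assumptions: the function names and the tuple layout are hypothetical, and the sketch assumes that both the seller and the buyer of an abnormal transaction count as "having" that transaction.

```python
def label_transaction(price, high, low):
    """Label one transaction given the day's reference high and low prices."""
    if price > 1.5 * high:
        return "EHT"  # extremely high price transaction
    if price < 0.5 * low:
        return "ELT"  # extremely low price transaction
    return "NMT"      # normal transaction


def classify_accounts(trades, reference):
    """Split accounts into EHA, ELA and NMA sets.

    trades:    iterable of (seller, buyer, day, price) tuples
    reference: dict mapping day -> (H_t, L_t)
    """
    eha, ela, everyone = set(), set(), set()
    for seller, buyer, day, price in trades:
        high, low = reference[day]
        label = label_transaction(price, high, low)
        everyone.update((seller, buyer))
        # Assumption: both parties of an abnormal transaction are flagged.
        if label == "EHT":
            eha.update((seller, buyer))
        elif label == "ELT":
            ela.update((seller, buyer))
    nma = everyone - eha - ela  # accounts with no abnormal transaction
    return eha, ela, nma
```

With the 2013/08/30 reference prices quoted above ($H_t$ = 142.76, $L_t$ = 128.56), the thresholds are 214.14 and 64.28, so the two example transactions are labeled EHT and ELT respectively. An account may appear in both returned sets `eha` and `ela`.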

Table II shows the number of accounts and of each kind of transaction for each category of accounts. Four observations can be made from the table: 1) there are 14,916 abnormal accounts, which account for 12.5% (14916/119343) of all accounts (note that the number of ABAs is not the sum of the numbers of EHAs and ELAs, because some accounts belong to both categories); 2) abnormal transactions (ABTs) among ABAs account for 2.8% (≈194790/6775117) of all transactions; 3) the normal transactions among ABAs (3025992-194790=2831202) account for more than 41% (2831202/6775117) of all transactions; and 4) the sum of the numbers of transactions (Tx) among ABAs and among NMAs is far less than the number of all transactions; thus many transactions occurred between ABAs and NMAs.
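The arithmetic behind these observations can be rechecked directly from Table II. The short sketch below only reproduces the ratios; the variable names are illustrative, and the last line assumes, as observation 4 does, that the Tx entries for ABA and NMA count transactions whose two parties both belong to that category.

```python
# Counts copied from Table II.
accounts = {"ABA": 14916, "NMA": 104427, "All": 119343}
tx = {"ABA": 3025992, "NMA": 812865, "All": 6775117}
abt = 194790  # abnormal transactions

share_abnormal_accounts = accounts["ABA"] / accounts["All"]  # observation 1
share_abt = abt / tx["All"]                                  # observation 2
normal_among_aba = tx["ABA"] - abt                           # observation 3
share_normal_among_aba = normal_among_aba / tx["All"]
# Observation 4: transactions that are neither ABA-ABA nor NMA-NMA.
cross_tx = tx["All"] - tx["ABA"] - tx["NMA"]

print(f"{share_abnormal_accounts:.1%} of accounts are abnormal")
print(f"{share_abt:.2%} of transactions are abnormal")
print(f"{share_normal_among_aba:.1%} of transactions are normal ones among ABAs")
print(f"{cross_tx} transactions connect an ABA with an NMA")
```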

Based on these observations, one can conclude that the abnormal transactions do not occur by accident (observation 2) and that the abnormal accounts behave normally most of the time (observation 3). Thus the abnormal accounts must exist for some special purpose. One of the most likely purposes is to provide liquidity (observation 4, Section III-C). Considering a trader and investor's analysis of the recent cryptocurrency market, which reports that on some exchanges most of the disclosed trading volume is fake [9], another possible purpose of these accounts is to create fake volume. Price manipulation is also a likely purpose (Section IV). In fact, we find that the abnormal transactions are strongly correlated with the Bitcoin exchange price and that there are many abnormal patterns in the transactions.

B. Graph Construction

As each transaction contains a buyer and a seller, we can easily construct a directed graph from the records by treating each account as a node. Specifically, we define the constructed graph $G$ as follows.

Graph Definition. $G = (V, E, w)$, where $V$ is a set of nodes representing users (denoted by user ID) in the leaked data, $E$ is a set of edges, each an ordered pair of nodes, and $w$ is a function associating each edge with a weight. Each pair $(u, v)$ indicates that there was at least one transaction between users $u$ (seller) and $v$ (buyer) in the whole data set. $w: E \rightarrow \mathbb{R}^+$ maps each edge to a weight, which is the total amount of Bitcoins transferred along the edge by one or more transactions.

In the remainder of this paper, we use the terms account, user, and node interchangeably. To better compare network characteristics, we construct three graphs according to the nodes' categories:

• EHG: the graph in which all nodes are EHAs.
• ELG: the graph in which all nodes are ELAs.
• NMG: the graph in which all nodes are NMAs.

To construct the graphs, we adopt the following steps. Since each complete transaction has both a buy and a sell record (with the same transaction ID) after data validation, we first construct a set of tuples $(S, B, v, t, l)$ from every complete transaction, where $S$ and $B$ represent the seller and buyer (denoted by user ID), $v$ is the amount of the transaction in Bitcoin, $t$ is the transaction time, and $l$ is a label indicating the category of the transaction (i.e., EHT, ELT, or NMT). We call this set the transaction tuples, as each tuple corresponds to a unique transaction. Based on the transaction tuples, the aforementioned graphs are easy to construct. For example, to construct the EHG, we select all the tuples in which both the seller and the buyer are EHAs and sum the $v$ entries grouped by $S$ and $B$. The resulting tuples $(S, B, v)$ form the EHG. The other graphs are constructed in the same way, except that different tuples are selected according to the nodes' category.
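The select-and-sum step can be sketched in a few lines. The function name `build_graph` and the representation of the graph as a dictionary keyed by (seller, buyer) are illustrative choices, not something prescribed by the paper.

```python
from collections import defaultdict


def build_graph(tuples, members):
    """Aggregate transaction tuples into a weighted directed graph.

    tuples:  iterable of (S, B, v, t, l) transaction tuples
    members: set of account IDs allowed as nodes (e.g. all EHAs for the EHG)
    Returns a dict mapping the edge (S, B) to the total Bitcoin volume.
    """
    weights = defaultdict(float)
    for seller, buyer, volume, _time, _label in tuples:
        # Keep a tuple only if both endpoints belong to the category.
        if seller in members and buyer in members:
            weights[(seller, buyer)] += volume
    return dict(weights)
```

Passing the set of EHAs as `members` gives the EHG, the set of ELAs gives the ELG, and the set of NMAs gives the NMG. Restricting `tuples` to a single day gives that day's snapshot of the same graph.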

C. Graph Analysis

This subsection investigates the constructed graphs using various graph-analysis metrics. Figure 3 shows the three graphs. There are more nodes in the NMG drawing, indicating that the NMG is more sparsely connected (note that we select 5,000 edges for each graph). We investigate the statistics and metrics below.

Fig. 3. Visualization of (a) EHG, (b) ELG, and (c) NMG. For ease of illustration, we randomly select 5,000 edges from each graph to draw the figure.

Table III shows the statistics and metrics for each constructed graph. For comparison, we also construct the abnormal graph (ABG, the graph of all abnormal accounts) and the complete graph (CG, the graph of all accounts). In the following, we first introduce each statistic or metric and then detail the observations.

The number of nodes in each graph is the number of accounts in each category, in accordance with the statistics in Table II. The only exception is that the number of nodes in NMG is less than the number of NMAs, because some normal accounts interact only with abnormal accounts and are therefore not included in the NMG.

TABLE III
STATISTICS OF GRAPHS

Graph  Nodes   Edges    Cluster  Avg. degree  Avg. wgt. degree
EHG    10702   212900   0.30     19.89        505.43
ELG    5835    413881   0.42     70.93        3107.68
ABG    14916   612885   0.31     41.09        1439.04
NMG    86457   655882   0.03     7.59         76.21
CG     119343  2682719  0.28     22.48        426.54

An edge in the graph indicates a “channel” between two accounts for buying or selling Bitcoin. As can be seen from the table, the number of edges in each graph is far less than the number of transactions, which means that many channels are used more than once. Another notable result is that the sum of the numbers of edges in ABG and NMG is much less than the number of edges in CG. This indicates that many edges are channels between normal and abnormal accounts, which is evidence that the abnormal accounts provide liquidity in the exchange. The number of edges in ABG (612,885) is slightly smaller than the sum of the numbers of edges in EHG and ELG (626,781), because edges among accounts that are both EHAs and ELAs are counted in both graphs; ABG also contains edges connecting EHAs and ELAs that appear in neither.

We report the clustering coefficient of each graph in column 4 of Table III. The clustering coefficients differ greatly among EHG, ELG, and NMG. The large clustering coefficients (0.30 in EHG and 0.42 in ELG) reveal that if two abnormal accounts A and B both trade with an abnormal account C, then A and B are very likely to trade with each other. In other words, the abnormal accounts are likely to form triangles through transactions. Conversely, the clustering coefficient of NMG is very small (0.03), which indicates a normal situation, as the probability of three normal accounts forming a triangle is very small. This result indicates that the abnormal accounts behave strangely and hints at the existence of market manipulation on the exchange.
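To make the triangle interpretation concrete, the sketch below computes an average local clustering coefficient. Which variant was used for Table III is not stated here, so this is an assumption: edge directions and weights are ignored, self-loops are dropped, and nodes with fewer than two neighbors contribute zero. The function name is illustrative.

```python
from itertools import combinations


def average_clustering(edges):
    """Average local clustering coefficient of the undirected simple graph.

    edges: iterable of (u, v) pairs; direction and multiplicity are ignored.
    """
    neighbors = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    total = 0.0
    for nbrs in neighbors.values():
        k = len(nbrs)
        if k < 2:
            continue  # no pair of neighbors, contributes 0
        # Count pairs of neighbors that also trade with each other.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in neighbors[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(neighbors) if neighbors else 0.0
```

A triangle yields 1.0 and a simple path yields 0.0. The pair enumeration is quadratic in the degree, so very high-degree nodes would need a faster method on graphs of the size reported in Table III.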

The degree of a node is the number of edges connected to the node. In our case, the degree of a node indicates the number of accounts trading with that node. Figure 4 shows the degree distributions of the three graphs, all of which approximately follow a power-law distribution, meaning that there are few large-degree nodes and many small-degree nodes. We estimate the parameters using the free statistical software R [10] and a contributed package [11], and plot the fitted line $y \sim x^{-\alpha}$ for each distribution in red. The smaller the $\alpha$, the more variable the nodes' degrees. Thus the degrees of abnormal accounts are less variable than those of normal accounts. This may be because the abnormal accounts are controlled by the same organizations.

Fig. 4. Degree distribution of (a) EHG, (b) ELG, and (c) NMG.
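For readers who want a rough estimate of $\alpha$ without R, the sketch below uses the standard continuous maximum-likelihood estimator $\hat{\alpha} = 1 + n / \sum_i \ln(x_i / x_{\min})$. It is a simplified stand-in, not the procedure of the package cited as [11]: it treats the integer degrees as continuous values and takes the cutoff `x_min` as given rather than estimating it.

```python
import math


def powerlaw_alpha_mle(degrees, x_min=1.0):
    """Continuous maximum-likelihood estimate of alpha for y ~ x^(-alpha).

    Only values >= x_min are used; x_min is assumed known.
    """
    tail = [d for d in degrees if d >= x_min]
    log_sum = sum(math.log(d / x_min) for d in tail)
    if log_sum == 0:
        raise ValueError("need at least one value strictly above x_min")
    return 1.0 + len(tail) / log_sum
```

Because node degrees are discrete, a dedicated discrete estimator would be more accurate for small `x_min`; the continuous formula is shown only because it is short and easy to check by hand.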

Columns 5 and 6 of Table III show the average degree and the average weighted degree of the graphs. The large average degrees of EHG and ELG indicate that the abnormal accounts are used more frequently than normal accounts. The weighted degree is computed by using the transaction volume (in Bitcoin) as the edge weight; thus the average weighted degree reflects the average transaction volume handled per account. The average weighted degree of ELG is far larger than that of EHG; one possible reason is that the exchange price of transactions in ELG is relatively low, so the transaction volume is large. Whatever the reason, an obvious fact remains: the average weighted degrees of EHG and ELG are larger than that of NMG, even relative to their larger average degrees (dividing column 6 by column 5 gives about 25 and 44 BTC per edge versus about 10 BTC per edge), which means that edges between abnormal accounts transfer more Bitcoin than edges between normal accounts.

Based on the results and analysis above, we summarize the findings as follows:

• Finding 1: Some abnormal accounts (12.5%) trade at very high or very low exchange prices in some transactions. We consider these accounts abnormal and under the control of the exchange for two reasons: 1) abnormal transactions account for 2.8% of all transactions, so they do not occur by accident; 2) such abnormal exchange prices could not be obtained by ordinary users.
• Finding 2: Many seemingly normal transactions occurred between abnormal accounts (> 41%). There are two possible purposes for these transactions: 1) they are fake volume used to create an illusion of active trading; 2) they provide liquidity for the exchange.
• Finding 3: The graphs of abnormal accounts have very large clustering coefficients. One possible reason is that these accounts are controlled by one organization, so the trades are not completely random.

These findings indicate that the exchange was likely involved in trading manipulation. As the exchange price is the key factor in trading, the following section discusses the possibility of price manipulation on the exchange.

IV. TEMPORAL NETWORK ANALYSIS

As discussed above, the transaction networks of abnormal accounts (i.e., EHG and ELG) differ greatly from the NMG. We want to know whether these transactions are correlated with the Bitcoin price, and what kinds of users and transactions (i.e., graph structures) influence the Bitcoin price most. To this end, we compute daily snapshots of the graphs using a method similar to that of Section III-B. To detect important changes in the graph structure, we compare successive snapshots of the graphs using singular value decomposition (SVD). The goal is to detect a set of base networks and represent each day's snapshot as a linear combination of these base networks. Unlike in Section III, in this section we focus on transaction data after 2012/12/01. There are several reasons for this choice. First, the recent paper that demonstrates price manipulation on Mt. Gox uses the same transaction history [8]. Second, the Bitcoin price skyrocketed during this period. Third, Mt. Gox was the main Bitcoin exchange during this period. Finally, most of the abnormal users and transactions (more than 60%) are found after that day.

A. Extract Base Networks

To evaluate which networks influence the price most, we need to construct daily snapshots of the three graphs: $EHG_t$, $ELG_t$, and $NMG_t$. We adopt the same process for each graph series and describe it here for EHG. First, we construct the aggregate network (i.e., EHG) based on tuples after 2012/12/01. Assume there are $n$ nodes and $L$ edges in the aggregate network; then it can be represented by an $n \times n$ weighted adjacency matrix $G$ with $L$ non-zero elements. We rearrange $G$ into a vector $g$ of length $L$ containing all the non-zero elements. We call this vector the edge-weight vector. It describes the graph structure of the aggregate network, as each element represents a possible edge and its weight. To construct the daily snapshot $EHG_t$ on day $t$, we recalculate the edge-weight vector $g_t$ (i.e., the graph structure on day $t$) based on the transaction tuples on day $t$. Note that we do not change the order of the vector; thus the $i$-th element of every edge-weight vector refers to the same edge, and it may be zero if the edge does not exist on a specific day. For $T$ snapshots, we now build the $T \times L$ graph time series matrix $X$ such that the $t$-th row of $X$ equals $g_t$. By doing this, we build a matrix with $T$ samples, each representing a daily graph structure.
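The construction of $X$ can be sketched as follows. The input layout (a dictionary from day to a list of (seller, buyer, volume) tuples) and the function name are assumptions made for illustration; the essential point is that one fixed edge order is taken from the aggregate network and reused for every day.

```python
def build_time_series(tuples_by_day):
    """Build the T x L graph time series matrix X.

    tuples_by_day: dict mapping day -> list of (seller, buyer, volume)
    Returns (days, edges, X), where X[t][i] is the volume on edge i on day t.
    """
    # Fix one edge order from the aggregate network so that column i
    # refers to the same edge in every daily snapshot.
    edges = sorted({(s, b) for day in tuples_by_day.values() for s, b, _ in day})
    column = {edge: i for i, edge in enumerate(edges)}

    days = sorted(tuples_by_day)
    X = [[0.0] * len(edges) for _ in days]
    for row, day in zip(X, days):
        for seller, buyer, volume in tuples_by_day[day]:
            row[column[(seller, buyer)]] += volume  # zero if edge absent that day
    return days, edges, X
```

Each row of the returned `X` is one edge-weight vector $g_t$. For the graphs in Table III a dense list of lists would be large, so a sparse representation may be preferable in practice.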

To account for the variation of the daily graph structure, we normalize $X$ such that each row sums to 1 and then subtract the column averages from each column. As a result, both the row and column sums of the matrix are zero. We compute the singular value decomposition of the matrix $X$:

$$X = U \Sigma V^T \tag{1}$$

where $U$ is a $T \times T$ unitary matrix, $\Sigma$ is a $T \times L$ rectangular diagonal matrix with non-negative values on the diagonal, and $V$ is an $L \times L$ unitary matrix. The non-negative values on the diagonal are the singular values, usually sorted in descending order. The left-singular vectors, contained in the columns of $U$, are a set of orthonormal eigenvectors of $XX^T$, and the right-singular vectors, contained in the columns of $V$, are a set of orthonormal eigenvectors of $X^TX$. Since $T < L$ in this case, there are at most $T$ non-zero singular values. We denote the sorted singular values as $(\sigma_1, \cdots, \sigma_T)$, the left-singular vectors as $(u_1, \cdots, u_T)$, and the right-singular vectors as $(v_1, \cdots, v_T)$, where $u_i$ and $v_i$ are column vectors satisfying

$$u_i^T u_j = v_i^T v_j = \delta_{ij} \tag{2}$$

Given the special meaning of the matrix $X$, we can interpret the singular vectors and singular values as follows: 1) the right-singular vectors can be seen as base networks, and the element $v_i(l)$ (i.e., the $l$-th element of the $i$-th right-singular vector) gives the weight of the $l$-th edge in the $i$-th base network; 2) the left-singular vectors account for the temporal variation of the base networks: the $t$-th value of $u_i$ (denoted $u_i(t)$) gives the contribution of the $i$-th base network on day $t$; 3) the singular value $\sigma_i$, which is the square root of the $i$-th non-zero eigenvalue of both $X^TX$ and $XX^T$, indicates the overall importance of the $i$-th base network in approximating the whole matrix. Note that the singular values are sorted in decreasing order and thus give decreasing contributions to the result.

B. Detecting Graph Structural Changes

As the (normalized) weight of the $l$-th edge in the daily graph structure on day $t$ can be written as

$$x_{tl} = \sum_{i=1}^{T} \sigma_i u_i(t) v_i(l), \tag{3}$$

to detect graph structural changes we need to consider two terms: $\sigma_i$ (i.e., the importance of the $i$-th base network) and $u_i(t)$ (i.e., the contribution of the $i$-th base network on day $t$).

At first glance, we consider the daily influence of the first and most important base network (i.e., $u_1(t)$). We want to know the correlation between the variation of $u_1(t)$ and the fluctuation of the Bitcoin exchange price. As the range of the price is (12, 1207), we adopt a simple mathematical transform to make sure most of the transformed price falls in the interval (0, 1). Specifically, we adopt the log transform $B(t) = \log_{1000} P(t)$, where $P(t)$ is the closing exchange price of Bitcoin on day $t$. Table IV (left part) shows three commonly used correlation coefficients (i.e., the Pearson, Spearman, and Kendall correlation coefficients) between $u_1(t)$ and the log-transformed price $B(t)$. The results show that the daily variation of the first base network in EHG and ELG has a strong correlation with the Bitcoin exchange price. However, in NMG there is almost no correlation between the two variables. The result suggests that the transactions made between abnormal accounts have a great influence on the Bitcoin exchange price.
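As a rough illustration of this step, the following sketch computes the log transform and the three coefficients with the standard library only. The function names and the toy series are hypothetical, and the Kendall variant shown is the simple tau-a without tie correction, which may differ from the variant used for Table IV.

```python
import math

def log_price(p):
    """B(t) = log_1000 P(t)."""
    return math.log(p) / math.log(1000)

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(x):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def kendall(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    n, s = len(x), 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

# Hypothetical toy series: closing prices and a left-singular vector u_1(t).
prices = [12.0, 50.0, 130.0, 1207.0]
u1 = [-0.5, -0.1, 0.1, 0.6]
B = [log_price(p) for p in prices]
rho_p, rho_s, rho_k = pearson(u1, B), spearman(u1, B), kendall(u1, B)
```

Note that the largest price in the stated range, 1207, maps slightly above 1, which is consistent with "most" rather than all transformed prices falling in (0, 1).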

Motivated by this result, we want to know to what extent the log-transformed price can be estimated with a combination of the left-singular vectors, i.e.,

$$B(t) \sim c_0 + \sum_{i=1}^{N} c_i u_i(t), \tag{4}$$

TABLE IV
CORRELATION COEFFICIENTS BETWEEN THE LEFT-SINGULAR VECTORS OF THE NETWORK TIME SERIES MATRIX AND THE BITCOIN EXCHANGE PRICE

| Graph | 1st base network: ρP | ρS | ρK | Fitted 10 base networks: ρP | ρS | ρK |
|-------|------|------|------|-------|-------|-------|
| EHG | 0.56 | 0.60 | 0.44 | 0.811 | 0.807 | 0.620 |
| ELG | 0.58 | 0.82 | 0.64 | 0.871 | 0.834 | 0.652 |
| NMG | 0.05 | 0.15 | 0.12 | 0.239 | 0.398 | 0.289 |

Fig. 5. Singular values in the order of their importance (x-axis: order of singular value; y-axis: singular value; series: ELG, EHG, NMG).

where $c_0$ is the mean of $B(t)$ and $c_i$ can be computed as the dot product of $B(t)$ and $u_i(t)$. As the left-singular vectors are orthonormal and span the $T$-dimensional linear space, $B(t)$ can be reconstructed exactly from the $u_i(t)$ when $N = T$. However, this is not what we desire in this case. The purpose of this study is to identify some important base networks and accounts that have a great influence on the Bitcoin price. To proceed, we first try to select some important base networks among the detected base networks. We draw the scree plot of the singular values, as shown in Fig. 5. As can be seen from the graph, the curve of the singular values clearly levels off to the right of the dotted line (i.e., the 10th singular value). Thus we select the first 10 base networks for the following analysis.
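Why the coefficients reduce to a mean and plain dot products can be seen from a short least-squares argument. The sketch below assumes the column-centered $X$ described earlier and only considers base networks with $\sigma_i > 0$; the all-ones vector $\mathbf{1}$ and the vector notation $B$, $\hat{B}$ are introduced here for illustration.

```latex
% Columns of X are centered, so \mathbf{1}^T X = 0. For \sigma_i > 0:
\mathbf{1}^T u_i = \frac{1}{\sigma_i}\,\mathbf{1}^T X v_i = 0

% Hence \{\mathbf{1}/\sqrt{T},\, u_1, \dots, u_N\} is an orthonormal set, and the
% least-squares fit of B onto its span decouples coefficient by coefficient:
\hat{B} = c_0 \mathbf{1} + \sum_{i=1}^{N} c_i u_i, \qquad
c_0 = \frac{1}{T}\sum_{t=1}^{T} B(t), \qquad
c_i = u_i^T B = \sum_{t=1}^{T} u_i(t)\, B(t)
```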

Before analyzing accounts in the selected base networks, we approximate $B(t)$ with the selected networks. To evaluate the fitting effect, we calculate the correlation coefficients between the fitted price series and $B(t)$. The right part of Table IV shows the correlation coefficients. Surprisingly, the three correlation coefficients are greatly enhanced as compared with the first left-singular vector alone. In particular, the Pearson correlation coefficient between the fitted series and $B(t)$ is 0.87 for ELG, while it is only 0.24 for NMG. The great difference indicates a strong correlation between abnormal accounts’ transactions and the Bitcoin exchange price, which is strong evidence of price manipulation in Mt. Gox.

Figure 6 shows the trends of $B(t)$ and the fitted price. As can be seen from the graph, though the shape of the peak in April of 2013 is missed, the trend of $B(t)$ has been captured by the selected base networks of EHG and ELG, whereas the base networks in NMG have no effect in capturing the trend.

Fig. 6. Approximating the log-transformed Bitcoin price with the linear combination of the selected base networks of (a) EHG, (b) ELG, and (c) NMG (x-axis: date, 2012-12 to 2013-12; y-axis: exchange price; series: fitted 10 base networks, exchange price).

Fig. 7. The time-varying contribution $u_i(t)$ of the first four base networks of (a) EHG, (b) ELG, and (c) NMG (x-axis: date, 2012-12 to 2013-12; y-axis: left-singular vector weight; series: $u_1(t)$, $u_2(t)$, $u_3(t)$, $u_4(t)$).

To show the structural variation of the networks, we draw the time-varying contribution $u_i(t)$ of the first four base networks in Fig. 7. In most cases, $u_i(t)$ exhibits a few abrupt changes, partitioning the history of the transactions into separate time periods. The most notable abrupt changes are in December of 2012, when the Bitcoin exchange price was very smooth, and in November of 2013, when the price skyrocketed. During the two periods, the effects of the first four base networks of EHG and ELG are both significant; however, the base networks in NMG have no distinct effect during the smooth period and show an effect on only a few days during the skyrocketing period.

C. Abnormal Transaction Patterns

As discussed above, the transactions between abnormal users have a great correlation with the Bitcoin exchange price. A natural question is which edges (i.e., transactions), and thus which accounts, are the most influential, and whether the transactions show certain patterns during the period. To this end, based on the extracted 10 base networks, we further extract the top-10 ranking edges (by the absolute value of their weights) in each base network. In EHG, we find only 44 distinct edges instead of the 100 maximally possible, involving a total of 28 accounts. In ELG, 57 edges and 46 accounts were found. We call these the core abnormal accounts.
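The edge selection described above might be sketched as follows. The function name `core_edges`, the argument layout, and the toy vectors are assumptions made for illustration; the real base networks would be the first 10 right-singular vectors.

```python
def core_edges(base_networks, edges, k=10):
    """Collect the top-k edges (by absolute weight) of each base network.

    base_networks: list of right-singular vectors, one weight per edge.
    edges: list of (seller, buyer) pairs in the same column order.
    Returns the set of distinct edges and the set of accounts they touch.
    """
    selected = set()
    for v in base_networks:
        top = sorted(range(len(v)), key=lambda l: abs(v[l]), reverse=True)[:k]
        selected.update(edges[l] for l in top)
    accounts = {a for e in selected for a in e}
    return selected, accounts

# Hypothetical toy input: 2 base networks over 4 edges, top-2 edges each.
edges = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "c")]
base = [[0.9, -0.4, 0.1, 0.0],
        [0.8, 0.1, -0.5, 0.2]]
selected, accounts = core_edges(base, edges, k=2)
```

Because the same edge can rank highly in several base networks, the number of distinct edges is usually smaller than the number of base networks times `k`, which matches the 44 (rather than 100) distinct edges reported for EHG.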

To identify special transaction patterns, we draw the daily subgraph of the core abnormal accounts. We find that there are many abnormal transaction patterns (i.e., market manipulation patterns) in the networks. In order to save space, we show only 6 typical patterns in Fig. 8. These subgraphs are all extracted from ELG on different days. To illustrate them more clearly, we fix the layout of the graph (i.e., the position of the accounts in each graph is fixed) and mark the special patterns in red. The width of a line denotes the number of transactions between the two accounts, and the number at the right-hand side of a directed edge gives that number explicitly. We briefly explain the 6 patterns as follows.

• Self-Loop: A pattern in which an account makes transactions with itself. Figure 8a shows the subgraph on 2013/02/07, when account 231 made 749 transactions with itself. Self-loops are forbidden for normal accounts in any exchange, as they make no sense. Thus a reasonable explanation for the self-loop pattern is that the account may belong to the exchange and may be used to increase the daily transaction volume or to manipulate the price.

• Unidirection: The unidirectional pattern indicates more than one transaction from account A to account B. Figure 8b shows a unidirectional pattern on 2013/08/15, where account 527332 made 322 sell transactions to account 231. It is possible for an account to sell Bitcoin to another account more than once; however, it is almost impossible for two normal accounts to interact such a large number of times on the same day.

• Bi-direction: In the bi-directional pattern, two accounts interact with each other many times. It is a typical market manipulation behavior, especially when the two accounts are controlled by the same user. Figure 8c shows the bi-directional pattern on 2013/04/14, where account 144834 interacted with account 231 more than 150 times.

• Triangle: The triangle pattern indicates a triangle-like structure between three accounts. It may take various forms when the direction of the edges is considered. Figure 8d shows a special form of the triangle pattern on 2013/10/25. It is special because the accounts form a loop through transactions (account 282004→71885→490089→282004).

• Polygon: The polygon pattern is a more complicated transaction pattern in which many accounts form a polygon-like group where each edge carries more than one transaction. Figure 8e shows a quadrangle pattern on 2013/09/19; it seems that account 282004 sends Bitcoin to account 527332 through the “bridge accounts” 488195 and 231 in more than two hundred transactions.

• Star: A star pattern has a core account that buys Bitcoin from or sells Bitcoin to many accounts. Figure 8f shows a typical star where account 282004 sells Bitcoin to accounts 488195, 490089, 527332, and 231.

Fig. 8. Some typical abnormal transaction patterns: (a) Self-Loop, (b) Unidirection, (c) Bidirection, (d) Triangle, (e) Polygon, (f) Star.
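Several of the patterns listed above can be searched for mechanically in a daily subgraph. The following is a minimal sketch, not the procedure used in the paper (which inspects drawn subgraphs): the function name, the `min_tx` and `min_star` thresholds, and the toy account numbers are all hypothetical.

```python
def find_patterns(day_edges, min_tx=2, min_star=3):
    """Search one day's subgraph for repeated-trade patterns.

    day_edges: dict mapping (seller, buyer) -> number of transactions that day.
    Only edges with at least min_tx transactions are considered.
    """
    heavy = {e for e, n in day_edges.items() if n >= min_tx}
    self_loops = sorted(s for s, b in heavy if s == b)
    bidirectional = sorted((s, b) for s, b in heavy if s < b and (b, s) in heavy)
    succ = {}                       # successors, self-loops excluded
    for s, b in heavy:
        if s != b:
            succ.setdefault(s, set()).add(b)
    triangles = set()
    for a in succ:
        for b in succ[a]:
            for c in succ.get(b, ()):
                if c != a and a in succ.get(c, ()):
                    # Directed loop a -> b -> c -> a; keep one canonical rotation.
                    triangles.add(min((a, b, c), (b, c, a), (c, a, b)))
    stars = {s: sorted(bs) for s, bs in succ.items() if len(bs) >= min_star}
    return self_loops, bidirectional, sorted(triangles), stars

# Hypothetical daily subgraph; account numbers are made up.
day = {(1, 1): 5, (2, 3): 4, (3, 2): 6,
       (4, 5): 3, (5, 6): 2, (6, 4): 2,
       (7, 8): 1,
       (9, 10): 2, (9, 11): 2, (9, 12): 3}
self_loops, bidirectional, triangles, stars = find_patterns(day)
```

The `min_tx` filter reflects the argument in the text that a single trade between two accounts is unremarkable, while many trades along the same edge on one day are not.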

Generally speaking, it is not surprising for a transaction network to form a special structure by chance, as transactions are random. However, in our case this is hardly possible, as each edge represents far more than one transaction in a single day. Thus it seems quite possible that these accounts are controlled by a certain group and that these transactions have special purposes.

Based on the results, we summarize the findings as follows.

• Finding 4: The daily fluctuations of the selected base networks of EHG and ELG have a strong correlation with the Bitcoin exchange price. On the contrary, the daily fluctuation of the base networks of NMG has almost no correlation with the Bitcoin exchange price. This finding indicates that the transaction behavior of the abnormal accounts affects the fluctuation of the Bitcoin exchange price.

• Finding 5: The trend of the Bitcoin exchange price can be captured by the selected base networks of EHG and ELG. It means that the trend of the price can be predicted from transactions between abnormal accounts.

• Finding 6: There are many unusual transaction patterns (e.g., self-loop, bi-direction, star) between abnormal accounts. These patterns imply that these accounts are controlled by the same group and are strong evidence of price manipulation.

V. RELATED WORK

Blockchain technology is a new technology that has many research directions and attracts the interest of researchers from various fields [12], [13]. Our research is related to previous work in two areas. The first related area is the study of the large fluctuations of the Bitcoin price. As aforementioned, many driving factors of the price have been found. Because all the related data are time series, the most used methods in the analysis are time-series-based models, such as the vector space model [14], the vector error-correction model [15], the ARDL bounds testing method [16], wavelet analysis [5], and vector autoregression [15].

Another related area is the study of the blockchain data (i.e., the transaction ledger) for different topics. Because the blockchain data are publicly accessible and users are anonymous in the system, a common topic is to mine the blockchain data to reveal users’ privacy [17]–[19]. Because it is relatively lawless, blockchain has become an area full of various scams. Thus mining the blockchain data to detect scams is also a critical topic. Recently, there have been many studies on this topic, such as Bitcoin-based scams [20], smart-contract-based Ponzi schemes [21], [22], money laundering [23], and attacks [24]. See [25] for a full survey of this topic.

VI. CONCLUSION AND FUTURE WORK

We conduct a systematic study of the leaked Mt. Gox transaction data through graph analysis. By comparing the transaction prices in the data with the disclosed daily price, many abnormal transactions were identified and used to divide the accounts into three categories. Based on this classification, we construct three graphs (i.e., EHG, ELG, and NMG) and obtain many findings by analyzing these graphs through various metrics. These findings convinced us that there are many market manipulation behaviors in the exchange. In order to reveal the relationship between these behaviors and the Bitcoin price, the graphs are reconstructed into daily graph series and reshaped into matrices. By applying SVD to the matrices, some very important base networks are identified. By inspecting the base networks, we find that the daily variation of the abnormal base networks is closely related to the Bitcoin price and that the networks contain many market manipulation patterns. Based on these findings, and considering that Bitcoin is dominant in the market, we propose to strengthen supervision of this market. In the future, we will conduct a more thorough study of the data to reveal the extent to which the market is affected and to discuss the changes in the behavior of investors under extreme price fluctuation.

ACKNOWLEDGMENT

The work described in this paper was supported by the National Key Research and Development Program (2016YFB1000101), the National Natural Science Foundation of China (61722214, 11801595), the Pearl River S&T Nova Program of Guangzhou (201710010046), and the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2016ZT06D211).

REFERENCES

[1] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” 2008. [Online]. Available: https://bitcoin.org/bitcoin.pdf

[2] C. Burniske and A. White, “Bitcoin: Ringing the bell for a new asset class,” 2017. [Online]. Available: https://research.ark-invest.com/hubfs/1_Download_Files_ARK-Invest/White_Papers/Bitcoin-Ringing-The-Bell-For-A-New-Asset-Class.pdf

[3] A. Feder, N. Gandal, J. Hamrick, and T. Moore, “The impact of ddos and other security shocks on bitcoin currency exchanges: Evidence from mt. gox,” Journal of Cybersecurity, vol. 3, no. 2, pp. 137–144, 2018.

[4] M. Buchholz, J. Delaney, J. Warren, and J. Parker, “Bits and bets, information, price volatility, and demand for bitcoin,” Economics, vol. 312, 2012.

[5] L. Kristoufek, “What are the main drivers of the bitcoin price? evidence from wavelet coherence analysis,” PloS one, vol. 10, no. 4, p. e0123923, 2015.

[6] L. Kristoufek, “Bitcoin meets google trends and wikipedia: Quantifying the relationship between phenomena of the internet era,” Scientific Reports, vol. 3, p. 3415, 2013.

[7] D. Kondor, I. Csabai, J. Szule, M. Posfai, and G. Vattay, “Inferring the interplay between network structure and market effects in bitcoin,” New Journal of Physics, vol. 16, no. 12, p. 125003, 2014.

[8] N. Gandal, J. Hamrick, T. Moore, and T. Oberman, “Price manipulation in the bitcoin ecosystem,” Journal of Monetary Economics, vol. 95, pp. 86–96, 2018.

[9] S. Ribes, “Chasing fake volume: a crypto-plague,” Mar. 2018. [Online]. Available: https://medium.com/@sylvainartplayribes/chasing-fake-volume-a-crypto-plague-ea1a3c1e0b5e

[10] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2016. [Online]. Available: https://www.R-project.org/

[11] C. S. Gillespie, “Fitting heavy tailed distributions: The poweRlaw package,” Journal of Statistical Software, vol. 64, no. 2, pp. 1–16, 2015. [Online]. Available: http://www.jstatsoft.org/v64/i02/

[12] Z. Zheng, S. Xie, H. Dai, X. Chen, and H. Wang, “Blockchain challenges and opportunities: a survey,” International Journal of Web and Grid Services, vol. 14, pp. 352–375, 2018.

[13] Z. Zheng, S. Xie, H. Dai, X. Chen, and H. Wang, “An overview of blockchain technology: Architecture, consensus, and future trends,” in 2017 IEEE International Congress on Big Data (BigData Congress). IEEE, 2017, pp. 557–564.

[14] I. Georgoula, D. Pournarakis, C. Bilanakos, D. Sotiropoulos, and M. G. Giaglis, “Using time-series and sentiment analysis to detect the determinants of bitcoin prices,” Available at SSRN: https://ssrn.com/abstract=2607167, 2015.

[15] P. Ciaian, M. Rajcaniova, and d. Kancs, “The economics of bitcoin price formation,” Applied Economics, vol. 48, no. 19, pp. 1799–1815, 2016.

[16] J. Bouoiyour, R. Selmi et al., “What does bitcoin look like?” Annals of Economics and Finance, vol. 16, no. 2, pp. 449–492, 2015.

[17] F. Reid and M. Harrigan, “An analysis of anonymity in the bitcoin system,” in Security and Privacy in Social Networks. Springer, 2013, pp. 197–223.

[18] E. Androulaki, G. O. Karame, M. Roeschlin, T. Scherer, and S. Capkun, “Evaluating user privacy in bitcoin,” in International Conference on Financial Cryptography and Data Security. Springer, 2013, pp. 34–51.

[19] S. Athey, I. Parashkevov, V. Sarukkai, and J. Xia, “Bitcoin pricing, adoption, and usage: Theory and evidence,” Available at SSRN: https://ssrn.com/abstract=2826674, 2016.

[20] M. Vasek and T. Moore, “There’s no free lunch, even using Bitcoin: Tracking the popularity and profits of virtual currency scams,” in International Conference on Financial Cryptography and Data Security. Springer, 2015, pp. 44–61.

[21] M. Bartoletti, S. Carta, T. Cimoli, and R. Saia, “Dissecting ponzi schemes on ethereum: identification, analysis, and impact,” arXiv preprint arXiv:1703.03779, 2017.

[22] W. Chen, Z. Zheng, J. Cui, E. Ngai, P. Zheng, and Y. Zhou, “Detecting ponzi schemes on ethereum: Towards healthier blockchain technology,” in Proceedings of the 2018 World Wide Web Conference on World Wide Web. ACM, 2018, pp. 1409–1418.

[23] M. Moser, R. Bohme, and D. Breuker, “An inquiry into money laundering tools in the bitcoin ecosystem,” in eCrime Researchers Summit (eCRS), 2013. IEEE, 2013, pp. 1–14.

[24] T. Chen, Y. Zhu, Z. Li, J. Chen, X. Li, X. Luo, X. Lin, and X. Zhang, “Understanding ethereum via graph analysis,” in IEEE INFOCOM 2018 - IEEE Conference on Computer Communications. IEEE, 2018, pp. 1484–1492.

[25] W. Chen and Z. Zheng, “Blockchain data analysis: A review of status, trends and challenges,” Journal of Computer Research and Development, vol. 55, no. 9, pp. 1853–1870, 2018.


form of CSV files. These data provide a perfect opportunity for answering the conjecture.

Verifying whether there is market manipulation and identifying possible manipulation patterns is urgent and of great importance, as plenty of investors who dream of getting rich overnight are attracted to the market. The answer to this question will help investors recognize the potential risks and will support regulation and legislation. Based on the leaked data, a recent paper [8] points out that the Mt. Gox exchange manipulated the Bitcoin price, by building a regression model to identify the influence of the activities of some suspicious accounts on the price. We adopt a completely different method and obtain more results, including fake volume, price manipulation, and manipulation patterns.

Figure 2 shows an overview of our analysis. We first verify the leaked data and remove many unreasonable records. Then, by comparing the transaction price with the disclosed Mt. Gox price on quandl.com, we find many abnormal transactions. Using these transactions, we divide the accounts into three categories: extreme high account (EHA), extreme low account (ELA), and normal account (NMA). Next, we construct the extreme high graph (EHG), extreme low graph (ELG), and normal graph (NMG) by treating the accounts as nodes and the transactions as edges. We conduct various graph structure analyses on EHG, ELG, and NMG, such as node and edge classification, measuring graph clustering, and degree distribution. Such investigation leads to new observations and findings. For example, the abnormal accounts (i.e., EHA and ELA) might be controlled by the exchange and used to provide liquidity and fake volume for the exchange. Finally, by dividing the graphs into daily snapshots and reshaping them into a matrix, we extract some base graphs through singular value decomposition (SVD). By doing this, we find that the abnormal accounts’ transactions are strongly related to the Bitcoin price. Furthermore, we find many strange transaction patterns (such as self-loop, bi-direction, triangle, etc.) within abnormal accounts. These patterns are considered evidence of market manipulation in the exchange.

Fig. 2. An overview of our analysis.

In summary, we make the following major contributions:

• To the best of our knowledge, this is the first study on market manipulation of cryptocurrency via graph analysis and SVD. Besides, we demonstrate the effectiveness of the method by applying it to the leaked Mt. Gox transaction data.

• We obtain many new observations and findings by characterizing the activities of different accounts (i.e., static network analysis) and applying SVD to the daily snapshots of the graphs (i.e., temporal network analysis). These findings convinced us that there are many market manipulation behaviors in the exchange.

• We detect many market manipulation patterns which have never been reported in this area. These patterns are strong evidence of market manipulation and can help investors and regulators recognize the dark side of the market and its severity.

The rest of the paper is organized as follows. After introducing the data set in Section II, we detail the static network analysis in Section III and the temporal network analysis in Section IV. Finally, we provide some related work in Section V and conclude the paper in Section VI.

II. DATA SET

In early 2014, the transaction history of Mt. Gox from April 2011 to November 2013 was leaked in the form of CSV files. Table I reports a segment of the leaked data recorded on 2013/10/01 and 2013/10/02. Two rows with the same Trade Id indicate a complete transaction from the seller (Type=sell) to the buyer (Type=buy). The volume of the transaction is recorded in Bitcoins and the turnover in Money; thus the real-time price of Bitcoin at the transaction moment is Money/Bitcoins. Each user has a unique identity (User Id), with the FIPS location codes recorded in the country (User Country) and state (User State) fields. There are some other attributes (e.g., transaction fees) not included in the table, as they are not used in this study.

Data Cleaning. As there are many duplicate entries in the leaked data, we clean the data in a similar way to previous studies [3], [8]. Specifically, we use the combination of four key fields (date, user ID, type, and Bitcoins) to remove duplicated entries (de-duplication strategy 2 in [3]). After this step, we remove all single-row transactions to make sure that each transaction has a corresponding buyer and seller (i.e., is a completed transaction). Then we remove all duplicated complete transactions. By doing this, the data narrow from approximately 18 million rows to 13.5 million rows (i.e., 6.7 million completed transactions). This method is stricter than the method in [8], as complete transactions with the same trade ID are treated as duplicates. We adopt a stricter method in the hope of providing more reliable results.
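The cleaning steps can be sketched as below. This is one possible reading of the description, not the authors' code: the dictionary keys, the function name `clean`, and the choice to keep the first buy and first sell row per trade ID are assumptions, and the toy rows are made up.

```python
def clean(rows):
    """De-duplicate rows and keep only complete (buy + sell) transactions.

    rows: list of dicts with keys trade_id, date, user_id, type, bitcoins
    (hypothetical field names).
    Returns a dict mapping trade_id -> (sell_row, buy_row).
    """
    # Step 1: drop rows that repeat the four key fields.
    seen, unique = set(), []
    for r in rows:
        key = (r["date"], r["user_id"], r["type"], r["bitcoins"])
        if key not in seen:
            seen.add(key)
            unique.append(r)
    # Steps 2 and 3: group by trade ID, keeping at most one buy and one sell
    # row per ID, then discard IDs that lack either side.
    by_trade = {}
    for r in unique:
        by_trade.setdefault(r["trade_id"], {}).setdefault(r["type"], r)
    return {tid: (sides["sell"], sides["buy"])
            for tid, sides in by_trade.items()
            if "sell" in sides and "buy" in sides}

# Hypothetical rows: trade 1 is complete (with a duplicated buy row),
# trade 2 has only a buy row and is dropped.
rows = [
    {"trade_id": 1, "date": "d1", "user_id": 10, "type": "buy", "bitcoins": 0.5},
    {"trade_id": 1, "date": "d1", "user_id": 20, "type": "sell", "bitcoins": 0.5},
    {"trade_id": 1, "date": "d1", "user_id": 10, "type": "buy", "bitcoins": 0.5},
    {"trade_id": 2, "date": "d2", "user_id": 30, "type": "buy", "bitcoins": 1.0},
]
completed = clean(rows)
```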

Advantages. The leaked Mt. Gox data have many advantages for understanding transaction behaviors in cryptocurrency and their influence on the price. First of all, Mt. Gox was the dominant exchange and Bitcoin was the main cryptocurrency during the period; thus analyzing the cryptocurrency market based on this data set is more reliable and representative. Second, these data are much more finely grained than data extracted from the blockchain, since most trading activity is recorded only in the exchange. Furthermore, users can be identified by their accounts in the leaked data, while it is hard to identify a user in the blockchain because of its anonymity mechanism.

TABLE I
A SEGMENT OF THE LEAKED DATA

| Trade Id | Date | User Id | Type | Currency | Bitcoins | Money | User Country | User State |
|----------|------|---------|------|----------|----------|-------|--------------|------------|
| 1380587338975940 | 2013/10/1 0:28:58 | 125439 | buy | USD | 0.5 | 71.69169 | US | NC |
| 1380587338975940 | 2013/10/1 0:28:58 | 295701 | sell | USD | 0.5 | 71.69169 | CA | QC |
| 1380739642844790 | 2013/10/2 18:47:22 | 609336 | buy | USD | 0.26177217 | 33.96631 | US | PA |
| 1380739642844790 | 2013/10/2 18:47:22 | 36865 | sell | USD | 0.26177217 | 33.96631 | US | CA |

III. STATIC NETWORK ANALYSIS

A. Account Classification

Before delving deeper into the leaked Mt. Gox data, we check the Bitcoin exchange price of each transaction (i.e., Money/Bitcoins) to inspect whether it falls between the highest and lowest disclosed exchange price on the same day. To this end, we first download all the Bitcoin exchange rates (BTC vs. USD) on Mt. Gox from quandl.com (we call this the reference price). Then we compare the exchange price of each transaction with the reference price. Surprisingly, we find that there are some abnormal transactions which have a very high or very low exchange price. For example, on 2013/08/30, a transaction (trade ID=1377875127221631) had an exchange price of $4933.84/BTC and another transaction (trade ID=1377876535345547) had an exchange price of only $0.81/BTC, whereas on the same day the highest and lowest exchange prices in the downloaded data are $142.76/BTC and $128.56/BTC, respectively.

These transactions are abnormal, as the exchange price is clearly out of the reasonable range. In order to distinguish the transaction behavior of different accounts and its influence on the price, we divide all the accounts into three categories: extremely high account (EHA), extremely low account (ELA), and normal account (NMA). As a first step, we apply a simple approach to identify an abnormal transaction. Suppose the highest and the lowest reference prices on day $t$ are $H_t$ and $L_t$. We regard a transaction with a real-time price larger than $1.5 \times H_t$ as an extremely high price transaction (EHT) and one with a real-time price lower than $0.5 \times L_t$ as an extremely low price transaction (ELT). Both kinds of transactions are referred to as abnormal transactions (ABTs). Please note that we use $(0.5 \times L_t, 1.5 \times H_t)$ instead of $(L_t, H_t)$ to identify an abnormal transaction because there are many exchanges (and thus many reference prices) at the same time, and we cannot be sure that the reference price is the real price of the exchange. However, the parameters 0.5 and 1.5 are wide enough to avoid flagging any normal transaction. Finally, an account is an EHA if it has at least one extremely high price transaction and an ELA if it has at least one extremely low price transaction. Both EHAs and ELAs are referred to as abnormal accounts (ABAs). Please note that an abnormal account can be both an EHA and an ELA if it is involved in both EHTs and ELTs. An NMA is an account involved in no abnormal transactions; that is to say, all of its transactions are normal transactions (NMTs).

TABLE II
STATISTICS OF ACCOUNTS AND TRANSACTIONS

| Category | Accounts | Tx | ABT | EHT | ELT |
|----------|----------|----|-----|-----|-----|
| EHA | 10702 | 1406850 | 179701 | 138743 | 40958 |
| ELA | 5835 | 2486807 | 85784 | 29737 | 56047 |
| ABA | 14916 | 3025992 | 194790 | 138743 | 56047 |
| NMA | 104427 | 812865 | 0 | 0 | 0 |
| All | 119343 | 6775117 | 194790 | 138743 | 56047 |
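The classification rule can be expressed compactly. The sketch below assumes that both the buyer and the seller of an abnormal transaction are marked; the function names, the tuple layout, and the toy transactions are hypothetical, while the reference range for 2013/08/30 is the one quoted in the text.

```python
def classify_transaction(price, high, low, up=1.5, down=0.5):
    """Label a transaction by comparing its price with the reference range."""
    if price > up * high:
        return "EHT"
    if price < down * low:
        return "ELT"
    return "NMT"

def classify_accounts(transactions, reference):
    """transactions: (seller, buyer, bitcoins, money, day) tuples;
    reference: day -> (H_t, L_t). Returns the EHA, ELA and NMA sets."""
    eha, ela, seen = set(), set(), set()
    for seller, buyer, btc, money, day in transactions:
        high, low = reference[day]
        label = classify_transaction(money / btc, high, low)
        seen.update((seller, buyer))
        if label == "EHT":
            eha.update((seller, buyer))
        elif label == "ELT":
            ela.update((seller, buyer))
    # An account may be in both eha and ela; NMAs are in neither.
    return eha, ela, seen - eha - ela

reference = {"2013-08-30": (142.76, 128.56)}
# Hypothetical transactions (accounts and amounts are made up).
txs = [
    ("a", "b", 1.0, 500.0, "2013-08-30"),   # price 500 > 1.5 * 142.76
    ("c", "d", 2.0, 1.62, "2013-08-30"),    # price 0.81 < 0.5 * 128.56
    ("e", "f", 1.0, 135.0, "2013-08-30"),   # within the reference range
    ("a", "c", 1.0, 130.0, "2013-08-30"),   # normal trade between abnormal accounts
]
eha, ela, nma = classify_accounts(txs, reference)
```

The last toy transaction illustrates the later observation that abnormal accounts also take part in many normally priced trades.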

Table II shows the number of accounts and of all kinds of transactions for each category of accounts. Four observations can be made from the table: 1) there are 14916 abnormal accounts, which account for 12.5% (14916/119343) of all the accounts (please note that the number of ABAs is not the sum of the numbers of EHAs and ELAs, due to the existence of accounts contained in both categories); 2) the abnormal transactions (ABTs) among ABAs account for 2.8% (≈194790/6775117) of all transactions; 3) the normal transactions among ABAs (3025992−194790=2831202) account for more than 41% (2831202/6775117) of all transactions; and 4) the sum of the numbers of transactions (Tx) among ABAs and among NMAs is far less than the number of all transactions; thus many transactions occurred between ABAs and NMAs.
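The four observations follow from simple arithmetic on Table II, which can be re-checked with a few lines. The variable names below are illustrative; the numbers are the ones reported in the table.

```python
# Figures taken from Table II.
accounts_all, accounts_aba = 119343, 14916
tx_all, tx_aba, tx_nma, abt = 6775117, 3025992, 812865, 194790

share_aba = accounts_aba / accounts_all     # observation 1
share_abt = abt / tx_all                    # observation 2
normal_among_aba = tx_aba - abt             # observation 3: 2831202
share_normal = normal_among_aba / tx_all
mixed = tx_all - tx_aba - tx_nma            # observation 4: ABA <-> NMA trades

print(f"{share_aba:.1%} of accounts are abnormal")
print(f"{share_abt:.1%} of transactions are abnormal")
print(f"{share_normal:.1%} of transactions are normal trades among ABAs")
print(f"{mixed} transactions link an ABA and an NMA")
```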

Based on these observations, one can conclude that the abnormal transactions do not occur by accident (observation 2) and that the abnormal accounts behave normally most of the time (observation 3). Thus the existence of the abnormal accounts must serve a certain special purpose. One of the most likely purposes is providing liquidity (observation 4, Section III-C). Considering a trader and investor's analysis of the recent cryptocurrency market, which reports that most of the disclosed trading volume of some exchanges is fake [9], another possible purpose of these accounts is producing fake volume. Besides, price manipulation is also a likely purpose (Section IV). In fact, we find that the abnormal transactions are highly correlated with the Bitcoin exchange price and that there are many abnormal patterns in the transactions.

B. Graph Construction

As each transaction contains a buyer and a seller, we can easily construct a directed graph from the records by considering each account as a node. Specifically, we define the constructed graph $G$ as follows.

Graph Definition: $G = (V, E, w)$, where $V$ is a set of nodes representing users (denoted by user ID) in the leaked data, $E$ is a set of edges, each representing an ordered pair of nodes, and $w$ is the function associating each edge with a weight. Each pair indicates that there was at least one transaction between users $u$ (seller) and $v$ (buyer) in the whole dataset. $w: E \rightarrow \mathbb{R}^+$ maps each edge to a weight, which is the total amount of Bitcoins transferred along the edge by one or more transactions.

In the remainder of this paper, we use the terms account, user, and node interchangeably. To better compare network characteristics, we construct three graphs according to the nodes’ categories as follows.

• EHG: The graph in which all nodes are EHAs.
• ELG: The graph in which all nodes are ELAs.
• NMG: The graph in which all nodes are NMAs.

To construct the graphs, we adopt the following steps. Since each complete transaction has both a buy and a sell record (with the same transaction ID), after data validation we first construct a set of tuples $(S, B, v, t, l)$ from every complete transaction, where $S$ and $B$ represent the seller and buyer (denoted by user ID), $v$ is the corresponding amount of the transaction in Bitcoin, $t$ is the transaction time, and $l$ is a label indicating the category of the transaction (i.e., EHT, ELT, or NMT). We call this set the transaction tuples, as each tuple corresponds to a unique transaction. Based on the transaction tuples, the aforementioned graphs are easy to construct. For example, to construct the EHG, we select all the tuples in which both the seller and the buyer are EHAs and sum the $v$ entries grouped by $S$ and $B$. The generated new tuples $(S, B, v)$ then form the EHG. The other graphs are constructed in the same way, except that different tuples are selected according to the nodes’ category.
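The group-and-sum step can be sketched as follows. The function name `build_graph` and the toy tuples are hypothetical; the tuple layout $(S, B, v, t, l)$ follows the text.

```python
from collections import defaultdict

def build_graph(tuples, members):
    """Build a weighted directed graph restricted to one account category.

    tuples: iterable of (S, B, v, t, l) transaction tuples.
    members: set of accounts in the category (e.g., all EHAs).
    Returns a dict mapping (S, B) -> total Bitcoins sent from S to B.
    """
    weights = defaultdict(float)
    for s, b, v, _t, _l in tuples:
        # Keep a transaction only if both endpoints belong to the category.
        if s in members and b in members:
            weights[(s, b)] += v
    return dict(weights)

# Hypothetical transaction tuples.
tuples = [
    ("a", "b", 1.0, 1, "EHT"),
    ("a", "b", 2.5, 2, "NMT"),
    ("b", "c", 1.0, 2, "NMT"),
]
ehg = build_graph(tuples, members={"a", "b"})
```

Note that the selection is by account category, not by transaction label, so normally priced trades between two EHAs also contribute to the EHG edge weights.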

C. Graph Analysis

This subsection investigates the constructed graphs using various metrics from graph analysis. Figure 3 shows the three graphs. We find that there are more nodes in the NMG drawing, indicating that the NMG is more sparsely connected (note that we select 5000 edges for each graph). We investigate the statistics and metrics in the following.

Fig. 3. Visualization of (a) EHG, (b) ELG, and (c) NMG. For ease of illustration, we randomly select 5000 edges from each graph to draw the figure.

Table III shows all the statistics and metrics for each constructed graph. For comparison, we also constructed the abnormal graph (i.e., the graph of all abnormal accounts, ABG) and the complete graph (i.e., the graph of all accounts, CG). In the following, we first introduce the statistics or metrics and then detail the observations.

The number of nodes in each graph is the number of accounts in each category, which is in accordance with the statistics in Table II. The only exception is that the number of nodes in NMG is less than the number of NMAs, because some normal accounts interact only with abnormal accounts and are thus not included in the NMG.

TABLE III
STATISTICS OF GRAPHS

| Graph | Nodes | Edges | Cluster | Avg. degree | Avg. wgt. degree |
|-------|-------|-------|---------|-------------|------------------|
| EHG | 10702 | 212900 | 0.30 | 19.89 | 505.43 |
| ELG | 5835 | 413881 | 0.42 | 70.93 | 3107.68 |
| ABG | 14916 | 612885 | 0.31 | 41.09 | 1439.04 |
| NMG | 86457 | 655882 | 0.03 | 7.59 | 76.21 |
| CG | 119343 | 2682719 | 0.28 | 22.48 | 426.54 |

An edge in the graph indicates a “channel” between two accounts for buying or selling Bitcoin. As can be seen from the table, the number of edges in each graph is far less than the number of transactions, which means that many channels are used more than once. Another notable result is that the sum of the numbers of edges in ABG and NMG is much less than the number of edges in the CG. This result indicates that many edges are channels between normal and abnormal accounts, and is evidence that the abnormal accounts provide liquidity in the exchange. The number of edges in ABG is slightly larger than the sum of the number of edges in EHG and ELG, since there are some edges connecting EHAs and ELAs.

We compute the clustering coefficient of all the graphs in column 4 of Table III. As can be seen, the clustering coefficients differ greatly among EHG, ELG, and NMG. The large clustering coefficients (i.e., 0.30 in EHG and 0.42 in ELG) reveal that if two abnormal accounts A and B both trade with an abnormal account C, then A and B are very likely to trade with each other. In other words, the abnormal accounts are likely to form triangles through transactions. Conversely, the clustering coefficient of NMG is very small (i.e., 0.03), which indicates a normal situation, as the probability of three normal accounts forming a triangle is very small. This result indicates that the abnormal accounts behave strangely and hints at the existence of market manipulation in the exchange.
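To make the triangle interpretation concrete, the following minimal sketch computes the average local clustering coefficient of two small undirected toy graphs. The edge lists and the function name `average_clustering` are hypothetical illustrations, not the paper's data or tooling, and the source does not state which variant of the coefficient was used.

```python
from itertools import combinations

def average_clustering(edges):
    """Average local clustering coefficient of an undirected simple graph."""
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # self-loops are ignored for this metric
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    total = 0.0
    for node, nbrs in neighbors.items():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbors contribute 0
        # Count pairs of neighbors that are themselves connected (triangles).
        links = sum(1 for x, y in combinations(nbrs, 2) if y in neighbors[x])
        total += 2.0 * links / (k * (k - 1))
    return total / len(neighbors)

# Hypothetical toy graphs: one with a triangle, one star without triangles.
tight = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
star = [("H", "P"), ("H", "Q"), ("H", "R")]
tight_cc = average_clustering(tight)
star_cc = average_clustering(star)
```

A graph whose partners also trade with each other (`tight`) yields a high coefficient, while a hub with unrelated partners (`star`) yields zero, which mirrors the contrast between the abnormal graphs and NMG.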

The degree of a node is the number of edges connected to the node. In our case, the degree of a node indicates the number of accounts trading with that node. Figure 4 shows the degree distributions of the three graphs, all of which approximately follow a power-law distribution, meaning that there are few large-degree nodes and many small-degree nodes. We estimate the parameters using the free statistical software R [10] and the contributed package [11], and plot the fitted line $y \sim x^{-\alpha}$ for each distribution in red. The smaller the $\alpha$, the more variable the node degrees. Thus, the abnormal accounts show less variability than normal accounts. This result may be because the abnormal accounts are controlled by the same organizations.
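As an illustration of how an exponent $\alpha$ can be estimated, the sketch below uses the continuous maximum-likelihood estimator $\hat{\alpha} = 1 + n / \sum_i \ln(x_i / x_{\min})$ on synthetic data. This is a simplified stand-in written in Python, not the R-based procedure used in the paper; the function names, the sample size, and the choice $x_{\min} = 1$ are assumptions made for the example.

```python
import math
import random

def fit_power_law_alpha(values, x_min=1.0):
    """Continuous MLE of alpha for p(x) ~ x^(-alpha), using values >= x_min."""
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    return 1.0 + len(tail) / log_sum

def sample_power_law(alpha, n, x_min=1.0, seed=0):
    """Draw n synthetic samples by inverse-transform sampling (hypothetical data)."""
    rng = random.Random(seed)
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]

synthetic = sample_power_law(alpha=2.5, n=20000)
alpha_hat = fit_power_law_alpha(synthetic)  # should be close to 2.5
```

Real degree data are discrete and need a fitted lower cutoff, so a dedicated package remains the more careful choice for actual analysis.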

(a) EHG (b) ELG (c) NMG

Fig. 4. Degree distribution of EHG, ELG, and NMG.

Columns 5 and 6 in Table III show the average degree and the average weighted degree of the graphs. The large average degrees of EHG and ELG indicate that the abnormal accounts are used more frequently than normal accounts. The weighted degree is computed by setting the transaction volume (i.e., Bitcoin) as the weight; thus, the average weighted degree represents the average transaction volume per node. As can be seen, the average weighted degree of ELG is far larger than that of EHG; one possible reason is that the exchange price of transactions in ELG is relatively low, and thus the transaction volume is large. Whatever the reason, an obvious fact remains: the average weighted degrees of EHG and ELG are larger than that of NMG, which means that the edges between abnormal accounts transfer more Bitcoin than the edges between normal accounts.
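The two averages can be sketched as follows, assuming the convention that is consistent with the node and edge counts in Table III, namely that the average degree of a directed graph is the number of edges divided by the number of nodes. Dividing the total edge weight by the number of nodes for the weighted variant is an assumption of this sketch, and the toy edge list is hypothetical.

```python
def graph_averages(weighted_edges):
    """Return (average degree, average weighted degree) of a directed graph.

    weighted_edges: list of (seller, buyer, btc_volume), one entry per edge.
    """
    nodes = set()
    total_weight = 0.0
    for seller, buyer, btc in weighted_edges:
        nodes.update((seller, buyer))
        total_weight += btc
    n, m = len(nodes), len(weighted_edges)
    return m / n, total_weight / n

# Hypothetical toy graph with 3 nodes and 4 directed edges.
toy = [("a", "b", 2.0), ("b", "c", 1.0), ("c", "a", 3.0), ("a", "c", 4.0)]
avg_degree, avg_weighted_degree = graph_averages(toy)

# Consistency check against the EHG node and edge counts reported in Table III.
ehg_nodes, ehg_edges = 10702, 212900
ehg_avg_degree = round(ehg_edges / ehg_nodes, 2)  # 19.89
```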

Based on the results and analysis discussed above, we summarize the findings as follows.

• Finding 1: There are some abnormal accounts (12.5%) that trade at a very high or very low exchange price in some transactions. We consider these accounts abnormal and under the control of the exchange for two reasons: 1) abnormal transactions account for 2.8% of all transactions, so they are unlikely to have occurred by accident; 2) such abnormal exchange prices would not be available to ordinary users.

• Finding 2: Many seemingly normal transactions occurred between abnormal accounts (> 41%). There are two possible purposes for these transactions: 1) they are fake volume used to create an illusion of active trading; 2) they provide liquidity for the exchange.

• Finding 3: The graphs of abnormal accounts have very large clustering coefficients. One possible reason is that these accounts are controlled by one organization, and thus the trades are not completely random.

These findings indicate that the exchange was likely involved in trading manipulation. As the exchange price is the key factor in trading, in the following section we discuss the possibility of price manipulation by the exchange.

IV. TEMPORAL NETWORK ANALYSIS

As discussed above, the transaction networks of abnormal accounts (i.e., EHG and ELG) differ greatly from the NMG. We want to know whether these transactions are correlated with the Bitcoin price, and what kinds of users and transactions (i.e., graph structures) influence the Bitcoin price greatly. To this end, we calculate daily snapshots of the graphs by adopting a method similar to that of Section III-B. To detect important changes in the graph structure, we compare successive snapshots of the graphs using singular value decomposition (SVD). The goal is to detect a set of base networks and represent each day’s snapshot as a linear combination of these base networks. Unlike in Section III, in this section we focus on transaction data after 2012/12/01. There are several reasons for this choice. Firstly, the recent paper that demonstrates price manipulation in Mt. Gox uses the same transaction history [8]. Secondly, the Bitcoin price skyrocketed during this period. Thirdly, Mt. Gox was the main Bitcoin exchange during this period. Finally, most abnormal users and transactions (more than 60%) are found after that day.

A. Extract Base Networks

To evaluate which networks influence the price greatly, we need to construct the daily snapshots of the three graphs: $\mathrm{EHG}_t$, $\mathrm{ELG}_t$, and $\mathrm{NMG}_t$. We adopt the same process to construct each graph series. First of all, we construct the aggregate network (e.g., EHG) based on tuples after 2012/12/01. Assume there are $n$ nodes and $L$ edges in the aggregate network; then it can be represented by an $n \times n$ weighted adjacency matrix $G$ with $L$ non-zero elements. We rearrange $G$ into a vector $g$ of length $L$ containing all the non-zero elements. We call this vector the edge-weight vector. It describes the graph structure of the aggregate network, as each element represents a possible edge and its weight. To construct the daily snapshot $\mathrm{EHG}_t$ on day $t$, we recalculate the edge-weight vector $g_t$ (i.e., the graph structure on day $t$) based on the transaction tuples on day $t$. Please note that we do not change the order of the vector; thus, the $i$-th element of every edge-weight vector indicates the same edge, and it may be zero if the edge does not exist on a specific day. For $T$ snapshots, we now build the $T \times L$ graph time-series matrix $X$ such that the $t$-th row of $X$ equals $g_t$. By doing this, we build a matrix with $T$ samples, in which each sample represents a daily graph structure.
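The construction of the time-series matrix can be sketched as follows. The tuple layout `(seller, buyer, volume, day)`, the function name `build_series_matrix`, and the toy transactions are hypothetical; the only property taken from the text is that the edge order is fixed once from the aggregate network and reused for every day.

```python
def build_series_matrix(tuples):
    """Build the T x L matrix X whose t-th row is the edge-weight vector g_t.

    tuples: list of (seller, buyer, volume_btc, day).
    """
    # Fix the edge order once, from the aggregate network over all days.
    edge_index = {}
    for seller, buyer, _volume, _day in tuples:
        edge_index.setdefault((seller, buyer), len(edge_index))
    days = sorted({day for _s, _b, _v, day in tuples})
    day_index = {day: i for i, day in enumerate(days)}
    # Edges that do not occur on a given day keep the weight 0.
    X = [[0.0] * len(edge_index) for _ in days]
    for seller, buyer, volume, day in tuples:
        X[day_index[day]][edge_index[(seller, buyer)]] += volume
    return days, list(edge_index), X

# Hypothetical transaction tuples over three days.
toy_tuples = [
    (1, 2, 0.5, "2012-12-01"),
    (1, 2, 1.5, "2012-12-01"),
    (2, 3, 2.0, "2012-12-01"),
    (1, 2, 1.0, "2012-12-02"),
    (3, 1, 4.0, "2012-12-03"),
]
days, edges, X = build_series_matrix(toy_tuples)
```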

To account for the variation of the daily graph structure, we normalize $X$ such that the sum of each row equals 1, and then subtract the column averages from each column. As a result, both the row and column sums of the matrix are zero. We compute the singular value decomposition of the matrix $X$:

$$X = U \Sigma V^{T} \qquad (1)$$

where $U$ is a $T \times T$ unitary matrix, $\Sigma$ is a $T \times L$ rectangular diagonal matrix with non-negative values on the diagonal, and $V$ is an $L \times L$ unitary matrix. The non-negative values on the diagonal are the singular values, usually sorted in descending order. The left-singular vectors, contained in the columns of $U$, are a set of orthonormal eigenvectors of $X X^{T}$, and the right-singular vectors, contained in the columns of $V$, are a set of orthonormal eigenvectors of $X^{T} X$. Since $T < L$ in this case, there are at most $T$ non-zero singular values. We denote the sorted singular values as $(\sigma_1, \cdots, \sigma_T)$, the left-singular vectors as $(u_1, \cdots, u_T)$, and the right-singular vectors as $(v_1, \cdots, v_T)$, where $u_i$ and $v_i$ are column vectors subject to the following equations:

$$u_i^{T} u_j = v_i^{T} v_j = \delta_{ij} \qquad (2)$$

Based on the special meaning of the matrix $X$, we can interpret the singular vectors and the singular values as follows: 1) the right-singular vectors can be seen as base networks, and the element $v_i(l)$ (i.e., the $l$-th element of the $i$-th right-singular vector) gives the weight of the $l$-th edge in the $i$-th base network; 2) the left-singular vectors account for the temporal variation of the base networks: the $t$-th value of $u_i$ (denoted as $u_i(t)$) gives the contribution of the $i$-th base network on day $t$; 3) the singular values $\sigma_i$, which are the square roots of the non-zero eigenvalues of both $X^{T} X$ and $X X^{T}$, indicate the overall importance of the $i$-th base network in approximating the whole matrix. Please note that the singular values are sorted in decreasing order and thus give decreasing contributions to the result.
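The normalization and decomposition described above can be sketched with NumPy. The random toy matrix (6 days, 15 edges) and the function name `base_networks` are hypothetical, and the sketch assumes every day has at least one transaction so that no row sum is zero. It uses the economy-size SVD, which returns only the first $T$ right-singular vectors.

```python
import numpy as np

def base_networks(X):
    """Row-normalize, column-center, and decompose a T x L graph time-series matrix."""
    X = np.asarray(X, dtype=float)
    X = X / X.sum(axis=1, keepdims=True)    # each row now sums to 1
    X = X - X.mean(axis=0, keepdims=True)   # each column (and row) now sums to 0
    # Columns of U are u_i, rows of Vt are v_i, s holds sigma_i in descending order.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return X, U, s, Vt

rng = np.random.default_rng(0)
toy = rng.random((6, 15)) + 0.01  # hypothetical data: T = 6 days, L = 15 edges
Xc, U, s, Vt = base_networks(toy)

# Each entry of the matrix is the sum over i of sigma_i * u_i(t) * v_i(l).
reconstructed = U @ np.diag(s) @ Vt
```

In this toy run the smallest singular value is numerically zero: centering the columns removes one degree of freedom, so the centered matrix has rank at most $T - 1$.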

B. Detecting Graph Structural Changes

As the (normalized) weight of the $l$-th edge in the daily graph structure on day $t$ can be written as

$$x_{tl} = \sum_{i=1}^{T} \sigma_i u_i(t) v_i(l), \qquad (3)$$

to detect graph structural changes we need to consider two terms: $\sigma_i$ (i.e., the importance of the $i$-th base network) and $u_i(t)$ (i.e., the contribution of the $i$-th base network on day $t$).

At first glance, we consider the daily influence of the first, and most important, base network (i.e., $u_1(t)$). We want to know the correlation between the variation of $u_1(t)$ and the fluctuation of the Bitcoin exchange price. As the range of the price is (12, 1207), we adopt a simple mathematical transform so that most of the transformed prices fall in the interval (0, 1). Specifically, we adopt the log transform $B(t) = \log_{1000} P(t)$, where $P(t)$ is the closing exchange price of Bitcoin on day $t$. Table IV (left part) shows three commonly used correlation coefficients (i.e., the Pearson, Spearman, and Kendall correlation coefficients) between $u_1(t)$ and the log-transformed price $B(t)$. The results show that the daily variation of the first base network in EHG and ELG has a strong correlation with the Bitcoin exchange price. However, in NMG there is almost no correlation between the two variables. The result indicates that the transactions made between abnormal accounts have a great influence on the Bitcoin exchange price.
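The log transform and two of the correlation coefficients can be sketched in plain Python. The price and $u_1(t)$ values below are hypothetical placeholders rather than the paper's series, and the rank-based Spearman helper assumes there are no tied values.

```python
import math

def log1000(price):
    """Base-1000 logarithm, B(t) = log_1000 P(t)."""
    return math.log(price) / math.log(1000.0)

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """Ranks starting at 1; valid only when all values are distinct."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = float(rank)
    return result

def spearman(x, y):
    return pearson(ranks(x), ranks(y))

prices = [12.0, 50.0, 140.0, 1207.0]   # hypothetical closing prices
B = [log1000(p) for p in prices]       # the price range (12, 1207) maps to about (0.36, 1.03)
u1 = [-0.4, -0.1, 0.1, 0.5]            # hypothetical u_1(t) values
rho_p = pearson(u1, B)
rho_s = spearman(u1, B)
```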

Motivated by this result, we want to know to what extent the log-transformed price can be estimated with a combination of the left-singular vectors, i.e.,

$$B(t) \sim c_0 + \sum_{i=1}^{N} c_i u_i(t) \qquad (4)$$

TABLE IV
CORRELATION COEFFICIENTS BETWEEN THE LEFT-SINGULAR VECTORS OF THE NETWORK TIME SERIES MATRIX AND THE BITCOIN EXCHANGE PRICE

Graph   The 1st base network     The fitted 10 base networks
        ρP     ρS     ρK         ρP      ρS      ρK
EHG     0.56   0.60   0.44       0.811   0.807   0.620
ELG     0.58   0.82   0.64       0.871   0.834   0.652
NMG     0.05   0.15   0.12       0.239   0.398   0.289

Fig. 5. Singular values in the order of their importance (x-axis: order of singular value; y-axis: singular value; curves: ELG, EHG, and NMG).

where $c_0$ is the mean of $B(t)$ and $c_i$ can be computed as the dot product of $B(t)$ and $u_i(t)$. As the left-singular vectors are orthonormal and span the $T$-dimensional linear space, $B(t)$ can be reconstructed exactly from the $u_i(t)$ when $N = T$. However, this is not what we desire in this case. The purpose of this study is to identify important base networks and accounts that have a great influence on the Bitcoin price. To proceed, we first select the important base networks among the detected ones. We draw the scree plot of the singular values, as shown in Fig. 5. As can be seen from the graph, the curve of the singular values clearly levels off to the right of the dotted line (i.e., the 10th singular value). Thus, we select the first 10 base networks for the following analysis.
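The projection in (4) can be sketched as follows on random toy data. The function name `fit_with_left_vectors` and all inputs are hypothetical. One detail is an assumption of this sketch rather than a statement from the text: the coefficients are computed from the centered series $B(t) - c_0$, which makes the $N = T$ reconstruction exact and coincides with the plain dot product of $B(t)$ and $u_i(t)$ for every $u_i$ with a non-zero singular value, because those vectors sum to zero when the columns of $X$ are centered.

```python
import numpy as np

def fit_with_left_vectors(B, U, n_components):
    """Approximate B(t) by c0 + sum_i c_i * u_i(t) over the first n_components vectors."""
    B = np.asarray(B, dtype=float)
    c0 = B.mean()
    Un = U[:, :n_components]
    c = Un.T @ (B - c0)        # projection coefficients onto the orthonormal u_i
    return c0 + Un @ c, c0, c

# Hypothetical toy data: T = 8 days, L = 30 edges, random "price" series B.
rng = np.random.default_rng(1)
T, L = 8, 30
X = rng.random((T, L))
X = X / X.sum(axis=1, keepdims=True)
X = X - X.mean(axis=0, keepdims=True)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
B = rng.random(T)

fit_full, c0, c_full = fit_with_left_vectors(B, U, T)   # N = T reproduces B
fit_3, _, _ = fit_with_left_vectors(B, U, 3)            # N = 3 is only an approximation
```

Keeping only a few components trades exactness for interpretability, which is the motivation for selecting the first 10 base networks from the scree plot.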

Before analyzing accounts in the selected base networks, we approximate $B(t)$ with the selected networks. To evaluate the quality of the fit, we calculate the correlation coefficients between the fitted price series and $B(t)$. The right part of Table IV shows the correlation coefficients. The three correlation coefficients are greatly enhanced compared with those of the first left-singular vector alone. In particular, the Pearson correlation coefficient between the ELG fit and $B(t)$ is 0.87, while it is only 0.24 for NMG. This large difference indicates a strong correlation between abnormal accounts’ transactions and the Bitcoin exchange price, which is strong evidence of price manipulation in Mt. Gox.

Figure 6 shows the trends of $B(t)$ and the fitted price. As can be seen from the graph, although the shape of the peak in April 2013 is missed, the trend of $B(t)$ is captured by the selected base networks of EHG and ELG, whereas the base networks of NMG fail to capture the trend.

(a) EHG (b) ELG (c) NMG

Fig. 6. Approximation of the log-transformed Bitcoin price with the linear combination of the selected base networks of EHG, ELG, and NMG. Each panel plots the exchange price and the fit with 10 base networks against the date (2012-12 to 2013-12).

(a) EHG (b) ELG (c) NMG

Fig. 7. The time-varying contribution $u_i(t)$ of the first four base networks. Each panel plots the left-singular vector weights $u_1(t)$ to $u_4(t)$ against the date (2012-12 to 2013-12).

To show the structural variation of the networks, we draw the time-varying contribution $u_i(t)$ of the first four base networks in Fig. 7. In most cases, $u_i(t)$ exhibits a few abrupt changes, partitioning the transaction history into separate time periods. The most notable abrupt changes are in December 2012, when the Bitcoin exchange price was very stable, and in November 2013, when the price skyrocketed. During the two periods, the effects of the first four base networks of EHG and ELG are both significant; however, the base networks in NMG have no distinct effect during the stable period and show an effect on only a few days during the skyrocketing period.

C. Abnormal transaction patterns

As discussed above, the transactions between abnormal users are strongly correlated with the Bitcoin exchange price. A natural question is which edges (i.e., transactions), and thus accounts, are the most influential, and whether the transactions show certain patterns during the period. To this end, based on the 10 extracted base networks, we further extract the top-10 ranking edges (by the absolute value of their weights) in each base network. In EHG, we find only 44 distinct edges instead of the maximum possible 100, involving a total of 28 accounts. In ELG, 57 edges and 46 accounts are found. We call these core abnormal accounts.
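The edge selection can be sketched as follows. The function name `core_edges`, the toy base networks, and the edge list are hypothetical; the sketch only illustrates why the number of distinct edges can fall below the number of base networks times the number of edges taken from each, since the same edge may rank highly in several base networks.

```python
def core_edges(base_networks, edge_list, top_k=10):
    """Collect the top_k edges by absolute weight from each base network.

    base_networks: list of right-singular vectors, each a list of edge weights.
    edge_list: list of (seller, buyer) pairs in the same order as the weights.
    """
    selected = set()
    for v in base_networks:
        ranked = sorted(range(len(v)), key=lambda l: abs(v[l]), reverse=True)
        selected.update(edge_list[l] for l in ranked[:top_k])
    accounts = {account for edge in selected for account in edge}
    return selected, accounts

# Hypothetical example: 2 base networks over 4 edges, taking the top 2 of each.
toy_edges = [(1, 2), (2, 3), (3, 1), (4, 4)]
toy_bases = [[0.9, -0.1, 0.05, 0.0], [0.1, -0.8, 0.0, 0.3]]
edges_found, accounts_found = core_edges(toy_bases, toy_edges, top_k=2)
```

Here edge (2, 3) is selected by both base networks, so only 3 distinct edges are found instead of the maximum possible 4.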

To identify special transaction patterns, we draw the daily subgraph of the core abnormal accounts. We find that there are many abnormal transaction patterns (i.e., market manipulation patterns) in the networks. To save space, we show only 6 typical patterns in Fig. 8. These subgraphs are all extracted from ELG on different days. For clarity, we fix the layout of the graph (i.e., the position of each account is the same in every subgraph) and mark the special patterns in red. The width of a line denotes the number of transactions between the two accounts, and the number at the right-hand side of a directed edge gives that number of transactions. We briefly explain the 6 patterns as follows.

• Self-Loop: A pattern in which an account makes transactions with itself. Figure 8a shows the subgraph on 2013/02/07, when account 231 made 749 transactions with itself. Self-loops are restricted for normal accounts in any exchange, as they make no sense. Thus, a reasonable explanation for the self-loop pattern is that the account may belong to the exchange and may be used to increase the daily transaction volume or to manipulate the price.

• Unidirection: The unidirectional pattern indicates more than one transaction from account A to account B. Figure 8b shows a unidirectional pattern on 2013/08/15, where account 527332 made 322 sell transactions to account 231. It is possible for an account to sell Bitcoin to another account more than once; however, it is almost impossible for two normal accounts to interact such a large number of times on the same day.

• Bi-direction: The bi-directional pattern, in which two accounts interact with each other many times, is a typical market manipulation behavior, especially when the two accounts are controlled by the same user. Figure 8c shows the bi-directional pattern on 2013/04/14, where account 144834 interacts with account 231 more than 150 times.

(a) Self-Loop (b) Unidirection (c) Bidirection (d) Triangle (e) Polygon (f) Star

Fig. 8. Some typical abnormal transaction patterns.

• Triangle: The triangle pattern indicates a triangle-like structure among three accounts. It may take various forms when the direction of the edges is considered. Figure 8d shows a special form of the triangle pattern on 2013/10/25. It is special because the accounts form a loop through transactions (account 282004 → 71885 → 490089 → 282004).

• Polygon: The polygon pattern is a more complicated transaction pattern in which many accounts form a polygon-like group and each edge carries more than one transaction. Figure 8e shows a quadrangle pattern on 2013/09/19; it seems that account 282004 sends Bitcoin to account 527332 through the “bridge accounts” 488195 and 231 in more than two hundred transactions.

• Star: A star pattern has a core account that buys Bitcoin from, or sells Bitcoin to, many accounts. Figure 8f shows a typical star, where account 282004 sells Bitcoin to accounts 488195, 490089, 527332, and 231.
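Several of the patterns listed above can be detected automatically in a daily subgraph. The following is a minimal sketch: the function name `find_patterns`, the thresholds `min_tx` and `star_size`, and the toy transaction counts (with made-up account IDs 1 to 7) are all hypothetical and are not taken from the leaked data.

```python
def find_patterns(tx_counts, min_tx=2, star_size=3):
    """Detect simple patterns in one day's directed transaction counts.

    tx_counts: dict mapping (seller, buyer) to the number of transactions that day.
    """
    heavy = {edge for edge, n in tx_counts.items() if n >= min_tx}
    self_loops = sorted(a for a, b in heavy if a == b)
    bidirectional = sorted((a, b) for a, b in heavy if a < b and (b, a) in heavy)
    unidirectional = sorted(
        (a, b) for a, b in heavy if a != b and (b, a) not in tx_counts
    )
    successors = {}
    for a, b in heavy:
        if a != b:
            successors.setdefault(a, set()).add(b)
    # Directed 3-cycles a -> b -> c -> a, stored once per set of accounts.
    cycles = set()
    for a, succ in successors.items():
        for b in succ:
            for c in successors.get(b, ()):
                if c != a and a in successors.get(c, ()):
                    cycles.add(tuple(sorted((a, b, c))))
    stars = sorted(a for a, succ in successors.items() if len(succ) >= star_size)
    return {
        "self_loop": self_loops,
        "unidirection": unidirectional,
        "bi_direction": bidirectional,
        "triangle_loop": sorted(cycles),
        "star": stars,
    }

# Hypothetical daily counts for accounts 1..7.
toy_counts = {
    (1, 1): 749, (2, 1): 322, (3, 1): 80, (1, 3): 75,
    (4, 5): 12, (5, 6): 9, (6, 4): 15, (4, 1): 5, (4, 2): 3, (7, 1): 1,
}
patterns = find_patterns(toy_counts)
```

The threshold `min_tx` encodes the observation that a single transaction between two accounts is unremarkable, while many transactions on the same edge in one day are not.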

Generally speaking, it is not surprising for a transaction network to occasionally form a special structure, as transactions are random. However, in our case chance is an implausible explanation, as each edge represents far more than one transaction in a single day. Thus, it seems quite possible that these accounts are controlled by a certain group and that these transactions have special purposes.

Based on the results, we summarize the findings as follows.

• Finding 4: The daily fluctuations of the selected base networks of EHG and ELG have a strong correlation with the Bitcoin exchange price. On the contrary, the daily fluctuation of the base networks of NMG has no correlation with the Bitcoin exchange price. This finding indicates that the transaction behavior of the abnormal accounts affects the fluctuation of the Bitcoin exchange price.

• Finding 5: The trend of the Bitcoin exchange price can be captured by the selected base networks of EHG and ELG. This means that the trend of the price can be predicted from transactions between abnormal accounts.

• Finding 6: There are many unusual transaction patterns (e.g., self-loop, bi-direction, star) between abnormal accounts. These patterns imply that these accounts are controlled by the same group and are strong evidence of price manipulation.

V. RELATED WORK

Blockchain is a new technology that has many research directions and attracts the interest of researchers from various fields [12], [13]. Our research is related to previous work in two areas. The first is the study of the large fluctuations of the Bitcoin price. As aforementioned, many driving factors of the price have been found. Because all the related data are time series, the most commonly used methods in the analysis are time-series models, such as the vector space model [14], the vector error-correction model [15], the ARDL bounds testing method [16], wavelet analysis [5], and vector autoregression [15].

Another related area is the study of blockchain data (i.e., the transaction ledger) for different topics. Because blockchain data are publicly accessible and users are anonymous in the system, a common topic is mining the blockchain data to reveal users’ privacy [17]–[19]. Because it is relatively lawless, blockchain has become an area full of various scams. Thus, mining the blockchain data to detect scams is also a critical topic. Recently, there have been many studies on this topic, such as Bitcoin-based scams [20], smart-contract-based Ponzi schemes [21], [22], money laundering [23], and attacks [24]. See [25] for a full survey of this topic.

VI. CONCLUSION AND FUTURE WORK

We conduct a systematic study of the leaked Mt. Gox transaction data through graph analysis. By comparing the transaction prices in the data with the disclosed daily prices, many abnormal transactions were identified and used to divide the accounts into three categories. Based on this classification, we construct three graphs (i.e., EHG, ELG, and NMG) and obtain many findings by analyzing these graphs through various metrics. These findings convinced us that there were many market manipulation behaviors in the exchange. In order to reveal the relationship between these behaviors and the Bitcoin price, the graphs are reconstructed into daily graph series and reshaped into matrices. By applying SVD to the matrices, some very important base networks are identified. By inspecting the base networks, we find that the daily variation of the abnormal base networks is closely related to the Bitcoin price, and we find many market manipulation patterns. Based on these findings, and considering that Bitcoin is dominant in the market, we propose strengthening supervision of this market. In the future, we will conduct a more thorough study of the data to reveal the extent to which the market is affected and to discuss the changes in investor behavior under extreme price fluctuation.

ACKNOWLEDGMENT

The work described in this paper was supported by the National Key Research and Development Program (2016YFB1000101), the National Natural Science Foundation of China (61722214, 11801595), the Pearl River S&T Nova Program of Guangzhou (201710010046), and the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2016ZT06D211).

REFERENCES

[1] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” 2008. [Online]. Available: https://bitcoin.org/bitcoin.pdf

[2] C. Burniske and A. White, “Bitcoin: Ringing the bell for a new asset class,” 2017. [Online]. Available: https://research.ark-invest.com/hubfs/1_Download_Files_ARK-Invest/White_Papers/Bitcoin-Ringing-The-Bell-For-A-New-Asset-Class.pdf

[3] A. Feder, N. Gandal, J. Hamrick, and T. Moore, “The impact of ddos and other security shocks on bitcoin currency exchanges: Evidence from mt. gox,” Journal of Cybersecurity, vol. 3, no. 2, pp. 137–144, 2018.

[4] M. Buchholz, J. Delaney, J. Warren, and J. Parker, “Bits and bets, information, price volatility, and demand for bitcoin,” Economics, vol. 312, 2012.

[5] L. Kristoufek, “What are the main drivers of the bitcoin price? evidence from wavelet coherence analysis,” PloS one, vol. 10, no. 4, p. e0123923, 2015.

[6] L. Kristoufek, “Bitcoin meets google trends and wikipedia: Quantifying the relationship between phenomena of the internet era,” Scientific Reports, vol. 3, p. 3415, 2013.

[7] D. Kondor, I. Csabai, J. Szule, M. Posfai, and G. Vattay, “Inferring the interplay between network structure and market effects in bitcoin,” New Journal of Physics, vol. 16, no. 12, p. 125003, 2014.

[8] N. Gandal, J. Hamrick, T. Moore, and T. Oberman, “Price manipulation in the bitcoin ecosystem,” Journal of Monetary Economics, vol. 95, pp. 86–96, 2018.

[9] S. Ribes, “Chasing fake volume: a crypto-plague,” Mar. 2018. [Online]. Available: https://medium.com/@sylvainartplayribes/chasing-fake-volume-a-crypto-plague-ea1a3c1e0b5e

[10] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2016. [Online]. Available: https://www.R-project.org/

[11] C. S. Gillespie, “Fitting heavy tailed distributions: The poweRlaw package,” Journal of Statistical Software, vol. 64, no. 2, pp. 1–16, 2015. [Online]. Available: http://www.jstatsoft.org/v64/i02/

[12] Z. Zheng, S. Xie, H. Dai, X. Chen, and H. Wang, “Blockchain challenges and opportunities: a survey,” International Journal of Web and Grid Services, vol. 14, pp. 352–375, 2018.

[13] Z. Zheng, S. Xie, H. Dai, X. Chen, and H. Wang, “An overview of blockchain technology: Architecture, consensus, and future trends,” in 2017 IEEE International Congress on Big Data (BigData Congress). IEEE, 2017, pp. 557–564.

[14] I. Georgoula, D. Pournarakis, C. Bilanakos, D. Sotiropoulos, and M. G. Giaglis, “Using time-series and sentiment analysis to detect the determinants of bitcoin prices,” Available at SSRN: https://ssrn.com/abstract=2607167, 2015.

[15] P. Ciaian, M. Rajcaniova, and d. Kancs, “The economics of bitcoin price formation,” Applied Economics, vol. 48, no. 19, pp. 1799–1815, 2016.

[16] J. Bouoiyour, R. Selmi et al., “What does bitcoin look like?” Annals of Economics and Finance, vol. 16, no. 2, pp. 449–492, 2015.

[17] F. Reid and M. Harrigan, “An analysis of anonymity in the bitcoin system,” in Security and Privacy in Social Networks. Springer, 2013, pp. 197–223.

[18] E. Androulaki, G. O. Karame, M. Roeschlin, T. Scherer, and S. Capkun, “Evaluating user privacy in bitcoin,” in International Conference on Financial Cryptography and Data Security. Springer, 2013, pp. 34–51.

[19] S. Athey, I. Parashkevov, V. Sarukkai, and J. Xia, “Bitcoin pricing, adoption, and usage: Theory and evidence,” Available at SSRN: https://ssrn.com/abstract=2826674, 2016.

[20] M. Vasek and T. Moore, “There’s no free lunch, even using Bitcoin: Tracking the popularity and profits of virtual currency scams,” in International Conference on Financial Cryptography and Data Security. Springer, 2015, pp. 44–61.

[21] M. Bartoletti, S. Carta, T. Cimoli, and R. Saia, “Dissecting ponzi schemes on ethereum: identification, analysis, and impact,” arXiv preprint arXiv:1703.03779, 2017.

[22] W. Chen, Z. Zheng, J. Cui, E. Ngai, P. Zheng, and Y. Zhou, “Detecting ponzi schemes on ethereum: Towards healthier blockchain technology,” in Proceedings of the 2018 World Wide Web Conference on World Wide Web. ACM, 2018, pp. 1409–1418.

[23] M. Moser, R. Bohme, and D. Breuker, “An inquiry into money laundering tools in the bitcoin ecosystem,” in eCrime Researchers Summit (eCRS), 2013. IEEE, 2013, pp. 1–14.

[24] T. Chen, Y. Zhu, Z. Li, J. Chen, X. Li, X. Luo, X. Lin, and X. Zhang, “Understanding ethereum via graph analysis,” in IEEE INFOCOM 2018 - IEEE Conference on Computer Communications. IEEE, 2018, pp. 1484–1492.

[25] W. Chen and Z. Zheng, “Blockchain data analysis: A review of status, trends and challenges,” Journal of Computer Research and Development, vol. 55, no. 9, pp. 1853–1870, 2018.


TABLE I
A SEGMENT OF THE LEAKED DATA

Trade Id          Date                User Id  Type  Currency  Bitcoins    Money     User Country  User State
1380587338975940  2013/10/1 0:28:58   125439   buy   USD       0.5         71.69169  US            NC
1380587338975940  2013/10/1 0:28:58   295701   sell  USD       0.5         71.69169  CA            QC
1380739642844790  2013/10/2 18:47:22  609336   buy   USD       0.26177217  33.96631  US            PA
1380739642844790  2013/10/2 18:47:22  36865    sell  USD       0.26177217  33.96631  US            CA

data extracted from the blockchain, since most trading activity is recorded only in the exchange. Furthermore, users can be identified by their accounts in the leaked data, while it is hard to identify a user in the blockchain because of its anonymity mechanism.

III. STATIC NETWORK ANALYSIS

A. Account Classification

Before delving deeper into the Mt. Gox leaked data, we check the Bitcoin exchange price of each transaction (i.e., Money/Bitcoin) to inspect whether it falls between the highest and lowest disclosed exchange prices on the same day. To this end, we first download all the Bitcoin exchange rates (BTC vs. USD) on Mt. Gox from quandl.com (we call this the reference price). Then, we compare the exchange price of each transaction with the reference price. Surprisingly, we find that there are some abnormal transactions which have a very high or very low exchange price. For example, on 2013/08/30, a transaction (trade ID=1377875127221631) had an exchange price of $4933.84/BTC and another transaction (trade ID=1377876535345547) had an exchange price of only $0.81/BTC, whereas on the same day the highest and lowest exchange prices in the downloaded data are $142.76/BTC and $128.56/BTC, respectively.

These transactions are abnormal, as the exchange price is clearly outside the reasonable range. In order to distinguish the transaction behavior of different accounts and its influence on the price, we divide all the accounts into three categories: extremely high account (EHA), extremely low account (ELA), and normal account (NMA). As a first step, we apply a simple approach to identify an abnormal transaction. Suppose the highest and the lowest reference prices on day $t$ are $H_t$ and $L_t$; we regard a transaction with a real-time price larger than $1.5 \times H_t$ as an extremely high price transaction (EHT), and one with a real-time price lower than $0.5 \times L_t$ as an extremely low price transaction (ELT). Both kinds of transactions are referred to as abnormal transactions (ABTs). Please note that we use $(0.5 \times L_t, 1.5 \times H_t)$ instead of $(L_t, H_t)$ to identify an abnormal transaction because there are many exchanges (and thus many reference prices) at the same time, and we cannot be sure that the reference price is the real price of the exchange. However, the parameters 0.5 and 1.5 are wide enough to avoid flagging any normal transaction. Finally, an account is an EHA if it has at least one extremely high price transaction, and an ELA if it has at least one extremely low price transaction. Both EHAs and ELAs are referred to as abnormal accounts (ABAs). Please note that an abnormal account can be both an EHA and an ELA if it is involved in both an EHT and an ELT. An NMA is an account involved in no abnormal transactions; that is to say, all of its transactions are normal transactions (NMTs).

TABLE II
STATISTICS OF ACCOUNTS AND TRANSACTIONS

Category  Accounts  Tx         ABT      EHT      ELT
EHA       10,702    1,406,850  179,701  138,743  40,958
ELA       5,835     2,486,807  85,784   29,737   56,047
ABA       14,916    3,025,992  194,790  138,743  56,047
NMA       104,427   812,865    0        0        0
All       119,343   6,775,117  194,790  138,743  56,047
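The classification rule can be sketched as follows. The function names, the tuple layout `(seller, buyer, price, day)`, and the toy prices are hypothetical, and the sketch assumes that both the seller and the buyer of an abnormal transaction are flagged, which the text does not state explicitly.

```python
def classify_transaction(price, low_ref, high_ref, low_factor=0.5, high_factor=1.5):
    """Label one transaction as EHT, ELT, or NMT against the day's reference range."""
    if price > high_factor * high_ref:
        return "EHT"
    if price < low_factor * low_ref:
        return "ELT"
    return "NMT"

def classify_accounts(transactions, reference):
    """Map each account to its categories: a subset of {EHA, ELA}, or {NMA}.

    transactions: list of (seller, buyer, price, day).
    reference: dict mapping day to (L_t, H_t), the lowest and highest reference prices.
    """
    flags = {}
    for seller, buyer, price, day in transactions:
        low_ref, high_ref = reference[day]
        kind = classify_transaction(price, low_ref, high_ref)
        for account in (seller, buyer):  # assumption: both sides are flagged
            account_flags = flags.setdefault(account, set())
            if kind == "EHT":
                account_flags.add("EHA")
            elif kind == "ELT":
                account_flags.add("ELA")
    return {acc: (f if f else {"NMA"}) for acc, f in flags.items()}

# Hypothetical data: one day with reference range (100, 120).
reference = {"d1": (100.0, 120.0)}
transactions = [
    ("a", "b", 500.0, "d1"),   # above 1.5 * 120, so EHT
    ("b", "c", 10.0, "d1"),    # below 0.5 * 100, so ELT
    ("c", "d", 110.0, "d1"),   # inside the range, so NMT
    ("e", "d", 105.0, "d1"),
]
categories = classify_accounts(transactions, reference)
```

Account `b` ends up in both categories, which illustrates why the number of ABAs is smaller than the sum of the numbers of EHAs and ELAs.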

Table II shows the number of accounts and all kinds of transactions for each category of accounts. Four observations can be made from the table: 1) there are 14,916 abnormal accounts, which account for 12.5% (14,916/119,343) of all the accounts (please note that the number of ABAs is not the sum of the numbers of EHAs and ELAs, due to the existence of accounts contained in both categories); 2) abnormal transactions (ABTs) among ABAs account for 2.8% (≈194,790/6,775,117) of all transactions; 3) the number of normal transactions among ABAs (3,025,992 − 194,790 = 2,831,202) accounts for more than 41% (2,831,202/6,775,117) of all transactions; and 4) the sum of the numbers of transactions (Tx) among ABAs and among NMAs is far less than the number of all transactions; thus, many transactions occurred between ABAs and NMAs.
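The four observations can be checked with a few lines of arithmetic on the counts in Table II. The variable names are hypothetical, and treating the difference in the last line as the number of transactions between ABAs and NMAs follows the reading given in observation 4.

```python
# Counts taken from Table II.
accounts = {"EHA": 10702, "ELA": 5835, "ABA": 14916, "NMA": 104427, "All": 119343}
tx = {"ABA": 3025992, "NMA": 812865, "All": 6775117}
abt = 194790

share_abnormal_accounts = accounts["ABA"] / accounts["All"]      # observation 1
share_abnormal_tx = abt / tx["All"]                              # observation 2
normal_tx_among_aba = tx["ABA"] - abt                            # observation 3
share_normal_among_aba = normal_tx_among_aba / tx["All"]
# Accounts counted as both EHA and ELA (inclusion-exclusion).
both_eha_and_ela = accounts["EHA"] + accounts["ELA"] - accounts["ABA"]
# Transactions not accounted for within ABAs or within NMAs (observation 4).
tx_between_aba_and_nma = tx["All"] - tx["ABA"] - tx["NMA"]
```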

Based on these observations, one can infer that the abnormal transactions do not occur by accident (observation 2) and that the abnormal accounts behave normally most of the time (observation 3). Thus, the abnormal accounts likely exist for a special purpose. One of the most likely purposes is providing liquidity (observation 4, Section III-C). Considering a recent analysis of the cryptocurrency market by a trader and investor, which reports that in some exchanges most of the disclosed trading volume is fake [9], another possible purpose for these accounts is creating fake volume. Besides, price manipulation is also a likely purpose (Section IV). In fact, we find that the abnormal transactions are strongly correlated with the Bitcoin exchange price and that there are many abnormal patterns in the transactions.

B. Graph Construction

As each transaction contains a buyer and a seller, we can easily construct a directed graph from the records by considering each account as a node. Specifically, we define the constructed graph $G$ as follows.

Graph Definition: $G = (V, E, w)$, where $V$ is a set of nodes representing users (denoted by user ID) in the leaked data, $E$ is a set of edges, each of which is an ordered pair of nodes, and $w$ is a function associating each edge with a weight. Each pair $(u, v)$ indicates that there was at least one transaction between users $u$ (seller) and $v$ (buyer) in the whole dataset. $w: E \rightarrow \mathbb{R}^{+}$ maps each edge to a weight, which is the total amount of Bitcoins transferred along the edge by one or more transactions.

In the remainder of this paper, we use the terms account, user, and node interchangeably. To better compare network characteristics, we construct three graphs according to the nodes’ categories as follows.

• EHG: the graph in which all nodes are EHAs.
• ELG: the graph in which all nodes are ELAs.
• NMG: the graph in which all nodes are NMAs.

To construct the graphs, we adopt the following steps. Since each complete transaction has both a buy record and a sell record (sharing the same transaction ID), after data validation we first construct a set of tuples $(S, B, v, t, l)$ from every complete transaction, where $S$ and $B$ represent the seller and the buyer (denoted by user ID), $v$ is the amount of the transaction in Bitcoin, $t$ is the transaction time, and $l$ is a label indicating the category of the transaction (i.e., EHT, ELT, or NMT). We call this set the transaction tuples, as each tuple corresponds to a unique transaction. Based on the transaction tuples, the aforementioned graphs are easy to construct. For example, to construct the EHG, we select all the tuples in which both the seller and the buyer are EHAs and sum the $v$ entries grouped by $S$ and $B$. The resulting tuples $(S, B, v)$ form the EHG. The other graphs are constructed in the same way, except that different tuples are selected according to the nodes' category.
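The aggregation step described above can be sketched as follows. This is a minimal illustration rather than the procedure actually run on the leaked data: the function name `build_graph`, the toy tuples, and the account IDs are all hypothetical.

```python
from collections import defaultdict


def build_graph(tuples, members):
    """Aggregate (S, B, v, t, l) transaction tuples into weighted directed edges.

    Only transactions whose seller and buyer are both in `members` are kept.
    Returns a dict mapping (seller, buyer) to the total Bitcoin volume.
    """
    weights = defaultdict(float)
    for seller, buyer, volume, _time, _label in tuples:
        if seller in members and buyer in members:
            weights[(seller, buyer)] += volume
    return dict(weights)


# Hypothetical toy data, not taken from the leaked dataset.
toy_tuples = [
    (1, 2, 0.5, "2013-01-01", "EHT"),
    (1, 2, 1.5, "2013-01-02", "NMT"),
    (2, 3, 2.0, "2013-01-02", "EHT"),
    (3, 9, 4.0, "2013-01-03", "NMT"),  # account 9 is not an EHA, so this is dropped
]
eha = {1, 2, 3}  # hypothetical set of EHA user IDs
ehg = build_graph(toy_tuples, eha)
print(ehg)
```

Passing the set of ELAs or NMAs instead of `eha` would yield the ELG or NMG edge list in the same way.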

C. Graph Analysis

This subsection investigates the constructed graphs using various graph-analysis metrics. Figure 3 shows the three graphs. There are more nodes in the drawing of NMG, indicating that NMG is more sparsely connected (note that we select 5,000 edges for each graph). We investigate the statistics and metrics in the following.

Fig. 3. Visualization of (a) EHG, (b) ELG, and (c) NMG. For ease of illustration, we randomly select 5,000 edges from each graph to draw the figure.

Table III shows all the statistics and metrics for each constructed graph. For comparison, we also constructed the abnormal graph (i.e., the graph of all abnormal accounts, ABG) and the complete graph (i.e., the graph of all accounts, CG). In the following, we first introduce the statistics or metrics and then detail the observations.

The number of nodes in each graph is the number of accounts in the corresponding category, in accordance with the statistics in Table II. The only exception is that the number of nodes in NMG is less than the number of NMAs, because some normal accounts interact only with abnormal accounts and are therefore not included in the NMG.

TABLE III
STATISTICS OF GRAPHS

Graph   Nodes     Edges       Clustering   Avg. degree   Avg. weighted degree
EHG     10,702    212,900     0.30         19.89         505.43
ELG     5,835     413,881     0.42         70.93         3107.68
ABG     14,916    612,885     0.31         41.09         1439.04
NMG     86,457    655,882     0.03         7.59          76.21
CG      119,343   2,682,719   0.28         22.48         426.54

An edge in the graph indicates a “channel” between two accounts for buying or selling Bitcoin. As can be seen from the table, the number of edges in each graph is far less than the number of transactions, which means that many channels are used more than once. Another notable result is that the sum of the numbers of edges in ABG and NMG is much less than the number of edges in CG. This indicates that many edges are channels between normal and abnormal accounts, and it is evidence that the abnormal accounts provide liquidity in the exchange. The number of edges in ABG is not simply the sum of the numbers of edges in EHG and ELG, because some edges connect EHAs with ELAs and some accounts belong to both categories.

We report the clustering coefficient of each graph in column 4 of Table III. As can be seen, the clustering coefficients differ greatly among EHG, ELG, and NMG. The large clustering coefficients (i.e., 0.30 in EHG and 0.42 in ELG) reveal that if two abnormal accounts A and B both trade with an abnormal account C, then A and B are very likely to trade with each other. In other words, the abnormal accounts are likely to form triangles through transactions. Conversely, the clustering coefficient of NMG is very small (i.e., 0.03), which indicates a normal situation, as the probability of three normal accounts forming a triangle is very small. This result indicates that the abnormal accounts behave strangely and heralds the existence of market manipulation in the exchange.
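To make the triangle interpretation concrete, the sketch below computes the average local clustering coefficient of a small undirected graph. It is an illustration only: the text does not say which variant of the clustering coefficient is reported in Table III, so the choice of the average local coefficient, the treatment of edges as undirected, and the toy edge lists are all assumptions.

```python
from itertools import combinations


def average_clustering(edges):
    """Average local clustering coefficient of an undirected simple graph."""
    neighbors = {}
    for u, v in edges:
        if u == v:
            continue  # self-loops do not contribute to triangles
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    total = 0.0
    for nbrs in neighbors.values():
        k = len(nbrs)
        if k < 2:
            continue  # a node with fewer than two neighbors has coefficient 0
        # Count pairs of neighbors that are themselves connected.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in neighbors[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(neighbors) if neighbors else 0.0


# Hypothetical toy graphs.
triangle_plus_tail = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
star = [("X", "A"), ("X", "B"), ("X", "C")]
print(average_clustering(triangle_plus_tail))
print(average_clustering(star))
```

In the first toy graph, A and B both trade with C and also with each other, so the coefficient is high; in the star, no two partners of X trade with each other, so it is zero.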

The degree of a node is the number of edges connected to the node. In our case, the degree of a node indicates the number of accounts trading with that node. Figure 4 shows the degree distributions of the three graphs, all of which approximately follow a power-law distribution, meaning that there are few large-degree nodes and many small-degree nodes. We estimate the parameters using the free statistical software R [10] and the contributed package [11], and plot the fitted line $y \sim x^{-\alpha}$ for each distribution in red. The smaller the $\alpha$, the more variable the nodes' degrees. Thus, the abnormal accounts show less variability than the normal accounts. This may be because the abnormal accounts are controlled by the same organizations.
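As a rough illustration of how an exponent $\alpha$ can be estimated, the sketch below applies the standard continuous maximum-likelihood estimator $\hat{\alpha} = 1 + n / \sum_i \ln(x_i / x_{\min})$ to synthetic data. This is not the procedure of the R package cited above, which handles discrete data and the choice of $x_{\min}$ more carefully; the function name, the fixed $x_{\min}$, and the synthetic sample are assumptions made for the example.

```python
import math
import random


def powerlaw_alpha_mle(values, xmin):
    """Continuous maximum-likelihood estimate of a power-law exponent.

    Uses alpha = 1 + n / sum(ln(x_i / xmin)) over the values x_i >= xmin.
    """
    tail = [x for x in values if x >= xmin]
    if not tail:
        raise ValueError("no observations at or above xmin")
    log_sum = sum(math.log(x / xmin) for x in tail)
    return 1.0 + len(tail) / log_sum


# Synthetic sample drawn from a power law by inverse-transform sampling.
rng = random.Random(0)
true_alpha, xmin = 2.5, 1.0
sample = [
    xmin * (1.0 - rng.random()) ** (-1.0 / (true_alpha - 1.0))
    for _ in range(20000)
]
alpha_hat = powerlaw_alpha_mle(sample, xmin)
print(round(alpha_hat, 2))
```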

Fig. 4. Degree distributions of (a) EHG, (b) ELG, and (c) NMG.

Columns 5 and 6 in Table III show the average degree and the average weighted degree of the graphs. The large average degrees of EHG and ELG indicate that the abnormal accounts are used more frequently than the normal accounts. The weighted degree is computed by setting the transaction volume (i.e., in Bitcoin) as the weight; thus, the average weighted degree represents the average transaction volume per account. As can be seen, the average weighted degree of ELG is far larger than that of EHG. One possible reason is that the exchange price of transactions in ELG is relatively low, so the transaction volume is large. Whatever the reason, the average weighted degrees of EHG and ELG are clearly larger than that of NMG, which means that the edges between abnormal accounts transfer more Bitcoin than the edges between normal accounts.
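A quick arithmetic check, with the node and edge counts of Table III typed in by hand, suggests that the reported average degree of each graph equals the number of edges divided by the number of nodes. The variable names are illustrative only.

```python
# (nodes, edges, reported average degree) copied from Table III.
table_iii = {
    "EHG": (10702, 212900, 19.89),
    "ELG": (5835, 413881, 70.93),
    "ABG": (14916, 612885, 41.09),
    "NMG": (86457, 655882, 7.59),
    "CG": (119343, 2682719, 22.48),
}

# Edges per node, rounded to two decimals like the table entries.
edges_per_node = {
    name: round(edges / nodes, 2) for name, (nodes, edges, _avg) in table_iii.items()
}
for name, (_nodes, _edges, reported) in table_iii.items():
    print(name, edges_per_node[name], reported)
```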

Based on the results and analysis discussed above, we summarize the findings as follows:

• Finding 1: Some accounts (12.5%) trade at a very high or very low exchange price in some transactions. We consider these accounts abnormal and under the control of the exchange for two reasons: 1) the abnormal transactions account for 2.8% of all transactions, so they did not occur by accident; 2) such abnormal exchange prices cannot be obtained by ordinary users.

• Finding 2: Many seemingly normal transactions occurred between abnormal accounts (> 41%). There are two possible purposes for these transactions: 1) they are fake volume used to create an illusion of active trading; 2) they provide liquidity for the exchange.

• Finding 3: The graphs of abnormal accounts have very large clustering coefficients. One possible reason is that these accounts are controlled by one organization, so their trades are not completely random.

These findings indicate that the exchange was likely involved in trading manipulation. As the exchange price is the key factor in trading, the following section discusses the possibility of price manipulation by the exchange.

IV. TEMPORAL NETWORK ANALYSIS

As discussed above, the transaction networks of abnormal accounts (i.e., EHG and ELG) differ greatly from the NMG. We want to know whether these transactions are correlated with the Bitcoin price, and what kinds of users and transactions (i.e., graph structures) influence the Bitcoin price greatly. To this end, we calculate daily snapshots of the graphs using a method similar to that of Section III-B. To detect important changes in the graph structure, we compare successive snapshots of the graphs using singular value decomposition (SVD). The goal is to detect a set of base networks and represent each day's snapshot as a linear combination of these base networks. Unlike in Section III, in this section we focus on transaction data after 2012/12/01. There are several reasons for this choice. Firstly, the recent paper that demonstrates price manipulation in Mt. Gox uses the same transaction history [8]. Secondly, the Bitcoin price skyrocketed during this period. Thirdly, Mt. Gox was the main Bitcoin exchange during this period. Finally, most of the abnormal users and transactions (more than 60%) are found after that day.

A. Extract Base Networks

To evaluate which networks influence the price greatly, we need to construct the daily snapshots of the three graphs: $EHG_t$, $ELG_t$, and $NMG_t$. We adopt the same process to construct each graph series. First, we construct the aggregate network (e.g., EHG) based on the tuples after 2012/12/01. Assume there are $n$ nodes and $L$ edges in the aggregate network; it can then be represented by an $n \times n$ weighted adjacency matrix $G$ with $L$ non-zero elements. We rearrange $G$ into a vector $g$ of length $L$ containing all the non-zero elements, and call it the edge-weight vector. The vector describes the graph structure of the aggregate network, as each element represents a possible edge and its weight. To construct the daily snapshot $EHG_t$ on day $t$, we recalculate the edge-weight vector $g_t$ (i.e., the graph structure on day $t$) based on the transaction tuples of day $t$. Note that we do not change the order of the vector; thus, the $i$-th element of every edge-weight vector refers to the same edge, and it is zero if the edge does not exist on a specific day. For $T$ snapshots, we now build the $T \times L$ graph time series matrix $X$ such that the $t$-th row of $X$ equals $g_t$. In this way, we obtain a matrix with $T$ samples, each of which represents a daily graph structure.
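The construction of the graph time series matrix can be sketched with NumPy as below. This is a minimal sketch under simplifying assumptions: tuples are reduced to (seller, buyer, volume, day), the edge order is fixed by sorting, and the function name and toy data are hypothetical.

```python
import numpy as np


def build_time_series_matrix(tuples, days):
    """Return (X, edges), where X[t, l] is the Bitcoin volume on edge l on day t.

    `edges` fixes the column order, so column l refers to the same
    (seller, buyer) pair in every daily snapshot.
    """
    edges = sorted({(s, b) for s, b, _v, _d in tuples})
    col = {e: i for i, e in enumerate(edges)}
    row = {d: i for i, d in enumerate(days)}
    X = np.zeros((len(days), len(edges)))
    for s, b, v, d in tuples:
        X[row[d], col[(s, b)]] += v
    return X, edges


# Hypothetical toy data: two days, two edges.
toy = [
    ("a", "b", 1.0, "d1"),
    ("a", "b", 2.0, "d1"),
    ("b", "c", 4.0, "d1"),
    ("b", "c", 3.0, "d2"),
]
X, edges = build_time_series_matrix(toy, ["d1", "d2"])
print(edges)
print(X)
```

Edge ("a", "b") is absent on day "d2", so the corresponding entry of that row stays zero, as described above.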

To account for the variation of the daily graph structure, we normalize $X$ such that the sum of each row equals 1, and then subtract the column averages from each column. As a result, both the row sums and the column sums of the matrix are zero. We compute the singular value decomposition of the matrix $X$:

$$X = U \Sigma V^{T} \tag{1}$$

where $U$ is a $T \times T$ unitary matrix, $\Sigma$ is a $T \times L$ rectangular diagonal matrix with non-negative values on the diagonal, and $V$ is an $L \times L$ unitary matrix. The non-negative values on the diagonal are the singular values, which are usually sorted in descending order. The left-singular vectors, contained in the columns of $U$, are a set of orthonormal eigenvectors of $XX^{T}$, and the right-singular vectors, contained in the columns of $V$, are a set of orthonormal eigenvectors of $X^{T}X$. Since $T < L$ in this case, there are at most $T$ non-zero singular values. We denote the sorted singular values as $(\sigma_1, \cdots, \sigma_T)$, the left-singular vectors as $(u_1, \cdots, u_T)$, and the right-singular vectors as $(v_1, \cdots, v_T)$, where $u_i$ and $v_i$ are column vectors subject to the following equations:

$$u_i^{T} u_j = v_i^{T} v_j = \delta_{ij} \tag{2}$$

Based on the special meaning of matrix $X$, we can interpret the singular vectors and the singular values as follows: 1) the right-singular vectors can be seen as base networks, and the element $v_i(l)$ (i.e., the $l$-th element of the $i$-th right-singular vector) gives the weight of the $l$-th edge in the $i$-th base network; 2) the left-singular vectors account for the temporal variation of the base networks: the $t$-th value of $u_i$ (denoted $u_i(t)$) gives the contribution of the $i$-th base network on day $t$; 3) the singular values $\sigma_i$, which are the square roots of the non-zero eigenvalues of both $X^{T}X$ and $XX^{T}$, indicate the overall importance of the $i$-th base network in approximating the whole matrix. Note that the singular values are sorted in decreasing order and thus make decreasing contributions to the result.
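The normalization and decomposition steps above can be sketched with NumPy as follows. This is an illustration on random data, not the analysis pipeline itself: the function name is hypothetical, every day is assumed to have at least one transaction (so no row sum is zero), and the reduced SVD (`full_matrices=False`) is used, which keeps only the first $T$ right-singular vectors.

```python
import numpy as np


def base_networks(X):
    """Row-normalize and column-center X, then compute its reduced SVD.

    Returns (Xc, U, s, Vt): rows of Vt are the base networks v_i, columns of
    U are the temporal contributions u_i, and s holds the singular values
    in descending order.
    """
    X = np.asarray(X, dtype=float)
    Xn = X / X.sum(axis=1, keepdims=True)      # each row now sums to 1
    Xc = Xn - Xn.mean(axis=0, keepdims=True)   # each column now has zero mean
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc, U, s, Vt


# Random stand-in for a graph time series matrix with T = 5 days, L = 12 edges.
rng = np.random.default_rng(0)
X_demo = rng.random((5, 12))
Xc, U, s, Vt = base_networks(X_demo)
print(np.allclose(Xc.sum(axis=0), 0.0), np.allclose(Xc.sum(axis=1), 0.0))
print(np.allclose(U @ np.diag(s) @ Vt, Xc))
```

The two printed checks correspond to the zero row and column sums and to Eq. (1).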

B. Detecting Graph Structural Changes

As the (normalized) weight of the $l$-th edge in the daily graph structure on day $t$ can be written as

$$x_{tl} = \sum_{i=1}^{T} \sigma_i u_i(t) v_i(l), \tag{3}$$

to detect graph structural changes we need to consider two terms: $\sigma_i$ (i.e., the importance of the $i$-th base network) and $u_i(t)$ (i.e., the contribution of the $i$-th base network on day $t$).

As a first step, we consider the daily influence of the first and most important base network (i.e., $u_1(t)$). We want to know the correlation between the variation of $u_1(t)$ and the fluctuation of the Bitcoin exchange price. As the price ranges over (12, 1207), we adopt a simple mathematical transform to ensure that most of the transformed prices fall in the interval (0, 1). Specifically, we adopt the log transform $B(t) = \log_{1000} P(t)$, where $P(t)$ is the closing exchange price of Bitcoin on day $t$. Table IV (left part) shows three commonly used correlation coefficients (i.e., the Pearson, Spearman, and Kendall correlation coefficients) between $u_1(t)$ and the log-transformed price $B(t)$. The results show that the daily variation of the first base network in EHG and in ELG is strongly correlated with the Bitcoin exchange price, whereas in NMG there is almost no correlation between the two variables. This result suggests that the transactions made between abnormal accounts have a great influence on the Bitcoin exchange price.
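The log transform and two of the three correlation coefficients can be sketched in plain Python as follows. The helper names are hypothetical, the Spearman helper assumes there are no tied values, and the Kendall coefficient is omitted for brevity.

```python
import math


def log_price(p):
    """B(t) = log base 1000 of the closing price P(t)."""
    return math.log(p) / math.log(1000.0)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of ranks (assumes no ties)."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0] * len(values)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    return pearson(ranks(x), ranks(y))


# The endpoints of the price range map to roughly 0.36 and 1.03.
print(round(log_price(12), 2), round(log_price(1207), 2))
```

With these helpers, the left part of Table IV would correspond to calls such as `pearson(u1, B)` on the daily series, where `u1` and `B` are placeholders for $u_1(t)$ and $B(t)$.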

Motivated by this result, we want to know to what extent the log-transformed price can be estimated by a combination of the left-singular vectors, i.e.,

$$B(t) \sim c_0 + \sum_{i=1}^{N} c_i u_i(t), \tag{4}$$

TABLE IV
CORRELATION COEFFICIENTS BETWEEN THE LEFT-SINGULAR VECTORS OF THE NETWORK TIME SERIES MATRIX AND THE BITCOIN EXCHANGE PRICE

Graph   The 1st base network     The fitted 10 base networks
        ρP     ρS     ρK         ρP      ρS      ρK
EHG     0.56   0.60   0.44       0.811   0.807   0.620
ELG     0.58   0.82   0.64       0.871   0.834   0.652
NMG     0.05   0.15   0.12       0.239   0.398   0.289

Fig. 5. Singular values in order of importance, for ELG, EHG, and NMG (x-axis: order of singular value; y-axis: singular value).

where $c_0$ is the mean of $B(t)$ and $c_i$ can be computed as the dot product of $B(t)$ and $u_i(t)$. As the left-singular vectors are orthonormal and span the $T$-dimensional linear space, $B(t)$ can be reconstructed exactly from the $u_i(t)$ when $N = T$. However, this is not what we want here. The purpose of this study is to identify important base networks and accounts that have a great influence on the Bitcoin price. To proceed, we first select the important base networks among those detected. We draw the scree plot of the singular values, as shown in Fig. 5. As can be seen from the graph, the curve of singular values clearly levels off to the right of the dotted line (i.e., the 10th singular value). Thus, we select the first 10 base networks for the following analysis.
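The fit of Eq. (4) can be sketched with NumPy as below, using random data in place of the real matrices. The function name and the synthetic inputs are assumptions. The sketch relies on the columns of the matrix having been centered, so that every left-singular vector with a non-zero singular value sums to zero and the mean of $B(t)$ can serve as $c_0$.

```python
import numpy as np


def fit_price(B, U, N):
    """Approximate B(t) by c0 + sum_{i<=N} c_i u_i(t), with c_i = u_i . B."""
    B = np.asarray(B, dtype=float)
    c0 = B.mean()
    c = U[:, :N].T @ B           # one coefficient per selected base network
    return c0 + U[:, :N] @ c


# Random stand-ins: T = 6 days, L = 20 edges, and a synthetic price series.
rng = np.random.default_rng(1)
X = rng.random((6, 20))
Xn = X / X.sum(axis=1, keepdims=True)
Xc = Xn - Xn.mean(axis=0, keepdims=True)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
B_demo = rng.random(6)

fit_small = fit_price(B_demo, U, 2)   # coarse fit from two base networks
fit_full = fit_price(B_demo, U, 5)    # all T - 1 non-trivial components
print(np.abs(fit_small - B_demo).max(), np.abs(fit_full - B_demo).max())
```

In this toy setting the centered matrix has rank $T - 1$, so using all $T - 1$ non-trivial components reproduces the series up to rounding, while a smaller $N$ gives a coarser approximation.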

Before analyzing the accounts in the selected base networks, we approximate $B(t)$ with the selected networks. To evaluate the quality of the fit, we calculate the correlation coefficients between the fitted price series and $B(t)$. The right part of Table IV shows the correlation coefficients. Surprisingly, all three correlation coefficients are greatly enhanced compared with those of the first left-singular vector alone. In particular, the Pearson correlation coefficient between the fitted series and $B(t)$ is 0.87 for ELG but only 0.24 for NMG. This large difference indicates a strong correlation between the abnormal accounts' transactions and the Bitcoin exchange price, which is strong evidence of price manipulation in Mt. Gox.

Figure 6 shows the trends of $B(t)$ and the fitted price. As can be seen from the graph, although the shape of the peak in April 2013 is missed, the trend of $B(t)$ is captured by the selected base networks of EHG and ELG, whereas the base networks of NMG fail to capture the trend.

Fig. 6. Approximation of the log-transformed Bitcoin price with the linear combination of the selected base networks of (a) EHG, (b) ELG, and (c) NMG. Each panel plots the price fitted from 10 base networks and the exchange price against the date (2012-12 to 2013-12).

Fig. 7. The time-varying contribution $u_i(t)$ of the first four base networks of (a) EHG, (b) ELG, and (c) NMG. Each panel plots the left-singular vector weights $u_1(t)$ to $u_4(t)$ against the date (2012-12 to 2013-12).

To show the structural variation of the networks, we draw the time-varying contribution $u_i(t)$ of the first four base networks in Fig. 7. In most cases, $u_i(t)$ exhibits a few abrupt changes, partitioning the transaction history into separate time periods. The most notable abrupt changes are in December 2012, when the Bitcoin exchange price was very stable, and in November 2013, when the price skyrocketed. During these two periods, the effects of the first four base networks of EHG and ELG are both significant; however, the base networks of NMG have no distinct effect during the stable period and show an effect on only a few days during the skyrocketing period.

C. Abnormal Transaction Patterns

As discussed above, the transactions between abnormal users are strongly correlated with the Bitcoin exchange price. A natural question is which edges (i.e., transactions), and thus which accounts, are the most influential, and whether the transactions show certain patterns during the period. To this end, based on the 10 extracted base networks, we further extract the top-10 ranking edges (by the absolute value of their weights) in each base network. In EHG, we find only 44 distinct edges, instead of the maximum possible 100, involving a total of 28 accounts. In ELG, 57 edges and 46 accounts were found. We call these the core abnormal accounts.
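The edge-selection step can be sketched with NumPy as follows. The function name, the toy base networks, and the toy edge list are hypothetical; `Vt` stands for a matrix whose rows are the right-singular vectors and `edges` for the list that fixes the column order.

```python
import numpy as np


def core_edges(Vt, edges, n_networks=10, top_k=10):
    """Collect the top_k edges by |weight| from each of the first n_networks rows.

    Returns (selected_edges, accounts): the distinct edges and the accounts
    that appear in them.
    """
    selected = set()
    for i in range(min(n_networks, Vt.shape[0])):
        top = np.argsort(-np.abs(Vt[i]))[:top_k]   # indices of largest |v_i(l)|
        selected.update(edges[j] for j in top)
    accounts = {account for edge in selected for account in edge}
    return selected, accounts


# Hypothetical toy example: two base networks over three edges.
Vt_demo = np.array([[0.9, -0.1, 0.4],
                    [0.1, -0.95, 0.3]])
edges_demo = [(1, 2), (2, 3), (3, 1)]
sel, acc = core_edges(Vt_demo, edges_demo, n_networks=2, top_k=1)
print(sorted(sel), sorted(acc))
```

Because the same edge can rank highly in several base networks, the number of distinct edges is generally smaller than `n_networks * top_k`, which is consistent with finding 44 rather than 100 edges in EHG.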

To identify special transaction patterns, we draw the daily subgraphs of the core abnormal accounts. We find many abnormal transaction patterns (i.e., market manipulation patterns) in the networks. To save space, we show only 6 typical patterns in Fig. 8. These subgraphs are all extracted from ELG on different days. For clarity, we fix the layout of the graph (i.e., the position of each account is the same in every subgraph) and mark the special patterns in red. The width of a line reflects the number of transactions between the two accounts, and the number to the right of each directed edge gives that number of transactions. We briefly explain the 6 patterns as follows:

• Self-Loop: a pattern in which an account makes transactions with itself. Figure 8a shows the subgraph on 2013/02/07, when account 231 made 749 transactions with itself. Self-loops are forbidden for normal accounts in any exchange, as they make no sense. Thus, a reasonable explanation for the self-loop pattern is that the account may belong to the exchange and may be used to increase the daily transaction volume or to manipulate the price.

• Unidirection: the unidirectional pattern consists of more than one transaction from account A to account B. Figure 8b shows a unidirectional pattern on 2013/08/15, when account 527332 made 322 sell transactions to account 231. It is possible for an account to sell Bitcoin to another account more than once; however, it is almost impossible for two normal accounts to interact so many times on the same day.

• Bi-direction: the bi-directional pattern, in which two accounts interact with each other many times, is a typical market manipulation behavior, especially when the two accounts are controlled by the same user. Figure 8c shows the bi-directional pattern on 2013/04/14, when account 144834 interacted with account 231 more than 150 times.

• Triangle: the triangle pattern is a triangle-like structure among three accounts. It may take various forms when the direction of the edges is considered. Figure 8d shows a special form of the triangle pattern on 2013/10/25. It is special because the accounts form a loop through transactions (account 282004 → 71885 → 490089 → 282004).

• Polygon: the polygon pattern is a more complicated transaction pattern in which many accounts form a polygon-like group and each edge carries more than one transaction. Figure 8e shows a quadrangle pattern on 2013/09/19; it seems that account 282004 sends Bitcoin to account 527332 through the “bridge accounts” 488195 and 231 in more than two hundred transactions.

• Star: a star pattern has a core account that buys Bitcoin from, or sells Bitcoin to, many accounts. Figure 8f shows a typical star, in which account 282004 sells Bitcoin to accounts 488195, 490089, 527332, and 231.

Fig. 8. Some typical abnormal transaction patterns: (a) Self-Loop, (b) Unidirection, (c) Bidirection, (d) Triangle, (e) Polygon, (f) Star.
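The three simplest patterns above (self-loop, unidirection, and bi-direction) can be flagged from one day's transactions with a few lines of Python. This sketch is not the procedure used to draw Fig. 8, which relies on visual inspection of daily subgraphs; the function name, the threshold `min_count`, and the toy account labels are all hypothetical.

```python
from collections import Counter


def daily_patterns(pairs, min_count=2):
    """Flag simple repeated-trade patterns among one day's (seller, buyer) pairs.

    An edge is considered only if it carries at least `min_count` transactions.
    """
    counts = Counter(pairs)
    heavy = {e: n for e, n in counts.items() if n >= min_count}
    self_loops = sorted(e for e in heavy if e[0] == e[1])
    # Report each bidirectional pair once, with the smaller label first.
    bidirectional = sorted(
        e for e in heavy if e[0] < e[1] and (e[1], e[0]) in heavy
    )
    unidirectional = sorted(
        e for e in heavy if e[0] != e[1] and (e[1], e[0]) not in heavy
    )
    return {
        "self_loop": self_loops,
        "bidirectional": bidirectional,
        "unidirectional": unidirectional,
    }


# Hypothetical one-day sample of (seller, buyer) pairs.
day = (
    [("A", "A")] * 3      # A trades with itself three times
    + [("B", "A")] * 4    # B sells to A four times, never the reverse
    + [("C", "D")] * 2    # C and D trade in both directions
    + [("D", "C")] * 2
    + [("E", "F")]        # a single ordinary trade, below the threshold
)
patterns = daily_patterns(day)
print(patterns)
```

Triangles, polygons, and stars would require cycle or neighborhood searches on the same thresholded edge set and are left out of this sketch.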

Generally speaking, it is not surprising for a transaction network to contain such structures by chance, as transactions are random. In our case, however, chance is an implausible explanation, because each edge represents far more than one transaction in a single day. Thus, it seems quite possible that these accounts are controlled by a certain group and that these transactions serve special purposes.

Based on the results, we summarize the findings as follows:

• Finding 4: The daily fluctuations of the selected base networks of EHG and ELG are strongly correlated with the Bitcoin exchange price. In contrast, the daily fluctuation of the base networks of NMG shows almost no correlation with the Bitcoin exchange price. This finding indicates that the transaction behavior of the abnormal accounts affects the fluctuation of the Bitcoin exchange price.

• Finding 5: The trend of the Bitcoin exchange price can be captured by the selected base networks of EHG and ELG. This means that the trend of the price can be predicted from the transactions between abnormal accounts.

• Finding 6: There are many unusual transaction patterns (e.g., self-loop, bi-direction, star) between abnormal accounts. These patterns imply that these accounts are controlled by the same group and are strong evidence of price manipulation.

V. RELATED WORK

Blockchain technology is a new technology that has many research directions and attracts the interest of researchers from various fields [12], [13]. Our research is related to previous work in two areas. The first is the study of the large fluctuations of the Bitcoin price. As mentioned above, many driving factors of the price have been found. Because all the related data are time series, the most commonly used methods in the analysis are time-series models, such as the vector space model [14], the vector error-correction model [15], the ARDL bounds testing method [16], wavelet analysis [5], and vector autoregression [15].

The other related area is the study of blockchain data (i.e., the transaction ledger) for different topics. Because blockchain data are publicly accessible and users are anonymous in the system, a common topic is mining the blockchain data to reveal users' private information [17]–[19]. Because it is relatively unregulated, the blockchain has become an area full of various scams. Thus, mining blockchain data to detect scams is also a critical topic. Recently, there have been many studies on this topic, covering Bitcoin-based scams [20], smart-contract-based Ponzi schemes [21], [22], money laundering [23], and attacks [24]. See [25] for a full survey of this topic.

VI. CONCLUSION AND FUTURE WORK

We conduct a systematic study of the leaked Mt. Gox transaction data through graph analysis. By comparing the transaction prices in the data with the disclosed daily price, many abnormal transactions were identified and used to divide the accounts into three categories. Based on this classification, we construct three graphs (i.e., EHG, ELG, and NMG) and obtain many findings by analyzing these graphs with various metrics. These findings convinced us that there was much market manipulation behavior in the exchange. To reveal the relationship between this behavior and the Bitcoin price, the graphs are reconstructed into daily graph series and reshaped into matrices. By applying SVD to the matrices, some very important base networks are identified. By inspecting the base networks, we find that the daily variation of the abnormal base networks is closely related to the Bitcoin price, and we find many market manipulation patterns. Based on these findings, and considering that Bitcoin is dominant in the market, we propose strengthening the supervision of this market. In the future, we will conduct a more thorough study of the data to reveal the extent to which the market is affected and to discuss the changes in investor behavior under extreme price fluctuation.

ACKNOWLEDGMENT

The work described in this paper was supported by the National Key Research and Development Program (2016YFB1000101), the National Natural Science Foundation of China (61722214, 11801595), the Pearl River S&T Nova Program of Guangzhou (201710010046), and the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2016ZT06D211).

REFERENCES

[1] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” 2008. [Online]. Available: https://bitcoin.org/bitcoin.pdf

[2] C. Burniske and A. White, “Bitcoin: Ringing the bell for a new asset class,” 2017. [Online]. Available: https://research.ark-invest.com/hubfs/1_Download_Files_ARK-Invest/White_Papers/Bitcoin-Ringing-The-Bell-For-A-New-Asset-Class.pdf

[3] A. Feder, N. Gandal, J. Hamrick, and T. Moore, “The impact of DDoS and other security shocks on Bitcoin currency exchanges: Evidence from Mt. Gox,” Journal of Cybersecurity, vol. 3, no. 2, pp. 137–144, 2018.

[4] M. Buchholz, J. Delaney, J. Warren, and J. Parker, “Bits and bets, information, price volatility, and demand for Bitcoin,” Economics, vol. 312, 2012.

[5] L. Kristoufek, “What are the main drivers of the Bitcoin price? Evidence from wavelet coherence analysis,” PLoS ONE, vol. 10, no. 4, p. e0123923, 2015.

[6] L. Kristoufek, “Bitcoin meets Google Trends and Wikipedia: Quantifying the relationship between phenomena of the Internet era,” Scientific Reports, vol. 3, p. 3415, 2013.

[7] D. Kondor, I. Csabai, J. Szule, M. Posfai, and G. Vattay, “Inferring the interplay between network structure and market effects in Bitcoin,” New Journal of Physics, vol. 16, no. 12, p. 125003, 2014.

[8] N. Gandal, J. Hamrick, T. Moore, and T. Oberman, “Price manipulation in the Bitcoin ecosystem,” Journal of Monetary Economics, vol. 95, pp. 86–96, 2018.

[9] S. Ribes, “Chasing fake volume: a crypto-plague,” Mar. 2018. [Online]. Available: https://medium.com/@sylvainartplayribes/chasing-fake-volume-a-crypto-plague-ea1a3c1e0b5e

[10] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2016. [Online]. Available: https://www.R-project.org/

[11] C. S. Gillespie, “Fitting heavy tailed distributions: The poweRlaw package,” Journal of Statistical Software, vol. 64, no. 2, pp. 1–16, 2015. [Online]. Available: http://www.jstatsoft.org/v64/i02/

[12] Z. Zheng, S. Xie, H. Dai, X. Chen, and H. Wang, “Blockchain challenges and opportunities: a survey,” International Journal of Web and Grid Services, vol. 14, pp. 352–375, 2018.

[13] Z. Zheng, S. Xie, H. Dai, X. Chen, and H. Wang, “An overview of blockchain technology: Architecture, consensus, and future trends,” in 2017 IEEE International Congress on Big Data (BigData Congress). IEEE, 2017, pp. 557–564.

[14] I. Georgoula, D. Pournarakis, C. Bilanakos, D. Sotiropoulos, and M. G. Giaglis, “Using time-series and sentiment analysis to detect the determinants of Bitcoin prices,” Available at SSRN: https://ssrn.com/abstract=2607167, 2015.

[15] P. Ciaian, M. Rajcaniova, and d. Kancs, “The economics of Bitcoin price formation,” Applied Economics, vol. 48, no. 19, pp. 1799–1815, 2016.

[16] J. Bouoiyour, R. Selmi, et al., “What does Bitcoin look like?” Annals of Economics and Finance, vol. 16, no. 2, pp. 449–492, 2015.

[17] F. Reid and M. Harrigan, “An analysis of anonymity in the Bitcoin system,” in Security and Privacy in Social Networks. Springer, 2013, pp. 197–223.

[18] E. Androulaki, G. O. Karame, M. Roeschlin, T. Scherer, and S. Capkun, “Evaluating user privacy in Bitcoin,” in International Conference on Financial Cryptography and Data Security. Springer, 2013, pp. 34–51.

[19] S. Athey, I. Parashkevov, V. Sarukkai, and J. Xia, “Bitcoin pricing, adoption, and usage: Theory and evidence,” Available at SSRN: https://ssrn.com/abstract=2826674, 2016.

[20] M. Vasek and T. Moore, “There's no free lunch, even using Bitcoin: Tracking the popularity and profits of virtual currency scams,” in International Conference on Financial Cryptography and Data Security. Springer, 2015, pp. 44–61.

[21] M. Bartoletti, S. Carta, T. Cimoli, and R. Saia, “Dissecting Ponzi schemes on Ethereum: identification, analysis, and impact,” arXiv preprint arXiv:1703.03779, 2017.

[22] W. Chen, Z. Zheng, J. Cui, E. Ngai, P. Zheng, and Y. Zhou, “Detecting Ponzi schemes on Ethereum: Towards healthier blockchain technology,” in Proceedings of the 2018 World Wide Web Conference on World Wide Web. ACM, 2018, pp. 1409–1418.

[23] M. Moser, R. Bohme, and D. Breuker, “An inquiry into money laundering tools in the Bitcoin ecosystem,” in eCrime Researchers Summit (eCRS), 2013. IEEE, 2013, pp. 1–14.

[24] T. Chen, Y. Zhu, Z. Li, J. Chen, X. Li, X. Luo, X. Lin, and X. Zhange, “Understanding Ethereum via graph analysis,” in IEEE INFOCOM 2018 - IEEE Conference on Computer Communications. IEEE, 2018, pp. 1484–1492.

[25] W. Chen and Z. Zheng, “Blockchain data analysis: A review of status, trends and challenges,” Journal of Computer Research and Development, vol. 55, no. 9, pp. 1853–1870, 2018.

  • I Introduction
  • II Data Set
  • III Static Network Analysis
    • III-A Account Classification
    • III-B Graph Construction
    • III-C Graph Analysis
      • IV Temporal Network Analysis
        • IV-A Extract Base Networks
        • IV-B Detecting Graph Structural Changes
        • IV-C Abnormal transaction patterns
          • V Related Work
          • VI Conclusion and Future Work
          • References
Page 4: Cryptocurrency Market Cap, Live Prices, Charts & Ticker - Market … - Evidence... · 2020. 6. 14. · Abstract—The cryptocurrency market is a very huge market without effective

nodes and w is the function associating each edge to a weightEach pair indicates that there was at least one transactionbetween users u (seller) and v (buyer) in the whole datasetw E rarr R+ maps each edge with a weight which is thetotal amount of Bitcoins transferred along the edge by one ormore transactions

In the remainder of this paper we use the term accountuser and node interchangeably To better compare networkcharacteristics we construct three graphs according to thenodesrsquo categories as follows

bull EHG The graph that all nodes are EHAsbull ELG The graph that all nodes are ELAsbull NMG The graph that all nodes are NMAsTo construct the graph we adopt the following steps Since

each complete transaction has both a buy and sell record(has the same transaction ID) after data validation we firstconstruct a set of tuples (SB v t l) from every completetransaction where S and B represents the seller and buyer(denoted by user ID) v is the corresponding amount of thetransaction in Bitcoin t is the transaction time and l is a labelindicating the category of the transaction (ie EHT ELT orNMT) We call this set as transaction tuple as each tuplecorresponds to a unique transaction Based on the transactiontuple the aforementioned graphs are easy to construct Forexample to construct the EHG we select all the tuples inwhich both the seller and the buyer are EHAs and sum the ventry grouped by S and B Then the generated new tuples(SB v) is the EHG Other graphs are constructed as the sameexcept by selecting different tuples according to the nodesrsquocategory

C Graph Analysis

This subsection investigates the constructed graphs fromvarious metrics in graph analysis Figure 3 shows the threegraphs We can find that there are more nodes in NMGindicating the NMG is more sparse in connection (note that weselect 5000 edges for each graph) We investigate the statisticsand metrics in the following

(a) EHG (b) ELG (c) NMG

Fig 3 Visualization of EHG ELG and NMG For the ease of illustrationwe randomly select 5000 edges from each graph to draw the figure

Table III shows all the statistics and metrics for eachconstructed graph For comparison we also constructed theabnormal graph (ie the graph of all abnormal accountsABG) and the complete graph (ie the graph of all accounts

CG) In the following we first introduce the statistics ormetrics and then detail the observations

The number of nodes in each graph is the number ofaccounts in each category which is in accordance with thestatistics in Table II The only exception is that the numberof nodes in NMG is less than the number of NMA becausesome normal accounts interact with abnormal accounts thusit is not included the NMG

TABLE IIISTATICS OF GRAPHS

graph nodes edges cluster avg degree avg wgt degreeEHG 10702 212900 030 1989 50543ELG 5835 413881 042 7093 310768ABG 14916 612885 031 4109 143904NMG 86457 655882 003 759 7621CG 119343 2682719 028 2248 42654

An edge in the graph indicates a “channel” between two accounts for buying or selling Bitcoin. As can be seen from the table, the number of edges in each graph is far smaller than the number of transactions, which means that many channels are used more than once. Another notable result is that the sum of the numbers of edges in ABG and NMG is much smaller than the number of edges in the CG. This result indicates that many edges are channels between normal and abnormal accounts, and is evidence that the abnormal accounts provide liquidity in the exchange. The number of edges in ABG is slightly larger than the sum of the number of edges in EHG and ELG, since there are some edges connecting EHAs and ELAs.

We report the clustering coefficient of all the graphs in column 4 of Table III. As can be seen, the clustering coefficients differ greatly among EHG, ELG, and NMG. The large clustering coefficients (i.e., 0.3 in EHG and 0.42 in ELG) reveal that if two abnormal accounts A and B trade with an abnormal account C, then A and B are very likely to trade with each other. In other words, the abnormal accounts are likely to form triangles through transactions. Conversely, the clustering coefficient of NMG is very small (i.e., 0.03), which indicates a normal situation, as the probability of three normal accounts forming a triangle is very small. This result indicates that the abnormal accounts behave strangely and heralds the existence of market manipulation in the exchange.
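To make the triangle interpretation concrete, the sketch below computes an average local clustering coefficient. The text does not say which definition was used, so this is an assumption: edge direction and weights are ignored, self-loops are dropped, and nodes with fewer than two neighbors count as zero. The function name and the toy graphs are hypothetical.

```python
from itertools import combinations

def average_clustering(edges):
    """Average local clustering coefficient of an undirected simple graph."""
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    total = 0.0
    for nbrs in neighbors.values():
        k = len(nbrs)
        if k < 2:
            continue  # coefficient is 0 with fewer than two neighbors
        # Count pairs of neighbors that are themselves connected.
        links = sum(1 for x, y in combinations(nbrs, 2) if y in neighbors[x])
        total += 2.0 * links / (k * (k - 1))
    return total / len(neighbors) if neighbors else 0.0

# A and B both trade with C and also with each other; D only trades with C.
triangle_plus_tail = [("A", "C"), ("B", "C"), ("A", "B"), ("C", "D")]
# A hub with three partners that never trade with each other.
star = [("C", "A"), ("C", "B"), ("C", "D")]

high = average_clustering(triangle_plus_tail)
low = average_clustering(star)
```

The first toy graph contains a closed triangle and scores well above zero, whereas the star has no closed triangle and scores exactly zero.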

The degree of a node is the number of edges connected to the node. In our case, the degree of a node indicates the number of accounts trading with that node. Figure 4 shows the degree distributions of the three graphs, all of which approximately follow a power-law distribution, meaning that there are few large-degree nodes and many small-degree nodes. We estimate the parameters using the free statistical software R [10] and the contributed poweRlaw package [11], and plot the fitted line $y \sim x^{-\alpha}$ for each distribution in red. The smaller the $\alpha$, the more variable the nodes' degrees. Thus, the abnormal accounts show less variability as compared with normal accounts. This result may be because the abnormal accounts are controlled by the same organizations.
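For readers who want a feel for how the exponent is estimated, here is a rough sketch. It is not the procedure of the poweRlaw package cited above, which fits discrete distributions and also estimates the lower cutoff. Instead it uses the closed-form maximum-likelihood estimator for a continuous power law, with a cutoff `x_min` that is assumed to be given. The function name and sample degrees are hypothetical.

```python
import math

def powerlaw_alpha(degrees, x_min=1.0):
    """Continuous maximum-likelihood estimate of alpha in p(x) ~ x^(-alpha).

    Uses alpha = 1 + n / sum(ln(x_i / x_min)) over all x_i >= x_min.
    """
    tail = [d for d in degrees if d >= x_min]
    log_sum = sum(math.log(d / x_min) for d in tail)
    if log_sum == 0.0:
        raise ValueError("need at least one degree above x_min")
    return 1.0 + len(tail) / log_sum

# Hypothetical degree sample: many small degrees, a few large ones.
sample_degrees = [1, 1, 1, 2, 2, 3, 5, 8, 20]
alpha_hat = powerlaw_alpha(sample_degrees, x_min=1.0)
```

A heavier tail (more large-degree nodes) makes the logarithm sum larger and therefore the estimated exponent smaller.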

Fig. 4. Degree distribution of (a) EHG, (b) ELG, and (c) NMG.

Columns 5 and 6 in Table III show the average degree and the average weighted degree of the graphs. The large average degrees of EHG and ELG indicate that the abnormal accounts are used more frequently than normal accounts. The weighted degree is computed by setting the transaction volume (i.e., Bitcoin) as the weight; thus, the average weighted degree represents the average transaction volume per node. As can be seen, the average weighted degree of ELG is far larger than that of EHG; one possible reason is that the exchange price of transactions in ELG is relatively low, and thus the transaction volume is large. Whatever the reason, the fact remains that the average weighted degrees of EHG and ELG are larger than that of NMG, which means that the edges between abnormal accounts transfer more Bitcoin than the edges between normal accounts.
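As a quick sanity check, the average degrees reported in Table III appear to equal the number of edges divided by the number of nodes. This is an observation from the reported numbers, not a definition stated in the text. The snippet below recomputes that ratio from the node and edge counts.

```python
# (nodes, edges) as reported in Table III
table_iii = {
    "EHG": (10702, 212900),
    "ELG": (5835, 413881),
    "ABG": (14916, 612885),
    "NMG": (86457, 655882),
    "CG": (119343, 2682719),
}

# Edges per node, rounded to two decimals for comparison with the table.
avg_degree = {
    name: round(edges / nodes, 2) for name, (nodes, edges) in table_iii.items()
}
```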

Based on the results and analysis discussed above, we summarize the findings as follows.

• Finding 1: There are some abnormal accounts (12.5%) that trade at a very high or low exchange price in some transactions. We consider these accounts abnormal and under the control of the exchange for two reasons: 1) the proportion of abnormal transactions accounts for 28, so they did not occur by accident; 2) such abnormal exchange prices could not appear for ordinary users.

• Finding 2: Many seemingly normal transactions occurred between abnormal accounts (> 41). There are two possible purposes for these transactions: 1) they are fake volume used to create an illusion of active trading; 2) they provide liquidity for the exchange.

• Finding 3: The graphs of abnormal accounts have very large clustering coefficients. One possible reason is that these accounts are controlled by one organization, and thus the trades are not completely random.

These findings indicate that the exchange was likely involved in trading manipulation. As the exchange price is the key factor in trading, in the following section we discuss the possibility of price manipulation in the exchange.

IV. TEMPORAL NETWORK ANALYSIS

As discussed above, the transaction networks of abnormal accounts (i.e., EHG and ELG) show a great difference from the NMG. We want to know whether these transactions are correlated with the Bitcoin price, and what kinds of users and transactions (i.e., graph structures) influence the Bitcoin price greatly. To this end, we calculate the daily snapshots of the graphs by adopting a method similar to that of Section III-B. To detect important changes in the graph structure, we compare successive snapshots of the graphs using singular value decomposition (SVD). The goal is to detect a set of base networks and represent each day's snapshot as a linear combination of these base networks. Unlike in Section III, we focus our study on transaction data after 2012/12/01 in this section. Several reasons support our choice. Firstly, the recent paper that demonstrates the price manipulation of Mt. Gox uses the same transaction history [8]. Secondly, the Bitcoin price skyrocketed during this period. Thirdly, Mt. Gox was the main Bitcoin exchange during this period. Finally, more abnormal users and transactions (more than 60) are found after that day.

A. Extract Base Networks

To evaluate which networks influence the price greatly, we need to construct the daily snapshots of the three graphs, $EHG_t$, $ELG_t$, and $NMG_t$. We adopt the same process to construct each graph series. First of all, we construct the aggregate network (e.g., EHG) based on tuples after 2012/12/01. Assume there are $n$ nodes and $L$ edges in the aggregate network; then it can be represented by an $n \times n$ weighted adjacency matrix $G$ in which there are $L$ non-zero elements. We rearrange $G$ into a vector $g$ of length $L$ containing all the non-zero elements. We call this vector the edge-weight vector. The vector describes the graph structure of the aggregate network, as each element represents a possible edge and its weight. To construct the daily snapshot $EHG_t$ on day $t$, we recalculate the edge-weight vector $g_t$ (i.e., the graph structure on day $t$) based on the transaction tuples on day $t$. Please note that we do not change the order of the vector; thus, the $i$-th element of all the edge-weight vectors indicates the same edge, and it may be zero if the edge does not exist on a specific day. For $T$ snapshots, we now build the $T \times L$ graph time series matrix $X$ such that the $t$-th row of $X$ equals $g_t$. By doing this, we build a special matrix with $T$ samples, where each sample represents a daily graph structure.
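The construction of the graph time series matrix can be sketched as follows. This is a minimal illustration under assumptions: each day is given as a dictionary from (seller, buyer) to total weight, and the fixed edge order is taken to be the sorted order of all edges seen on any day. The function name and data are hypothetical.

```python
def edge_weight_matrix(daily_edges):
    """Build the T x L graph time series matrix from per-day edge weights.

    daily_edges: list of dicts, one per day, mapping (seller, buyer) -> weight.
    Returns the fixed edge order and a list of T rows, each of length L.
    """
    # Every edge of the aggregate network gets one fixed position.
    edge_order = sorted({edge for day in daily_edges for edge in day})
    # An edge that is absent on a given day contributes a zero entry.
    rows = [[day.get(edge, 0.0) for edge in edge_order] for day in daily_edges]
    return edge_order, rows

days = [
    {(1, 2): 3.0, (2, 3): 1.0},
    {(1, 2): 1.0, (3, 1): 4.0},
    {(2, 3): 2.0},
]
edge_order, X = edge_weight_matrix(days)
```

Because the column order never changes, column $l$ of the result always refers to the same edge, which is what allows the rows to be compared across days.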

To account for the variation of the daily graph structure, we normalize $X$ such that the sum of each row equals 1 and then subtract the column averages from each column. As a result, both the row and column sums in the matrix will be zero. We compute the singular value decomposition of the matrix $X$:

$$X = U \Sigma V^{T}, \tag{1}$$

where $U$ is a $T \times T$ unitary matrix, $\Sigma$ is a $T \times L$ rectangular diagonal matrix with non-negative values on the diagonal, and $V$ is an $L \times L$ unitary matrix. The non-negative values on the diagonal are the singular values and are usually sorted in descending order. The left-singular vectors contained in the columns of $U$ are a set of orthonormal eigenvectors of $XX^{T}$, and the right-singular vectors contained in the columns of $V$ are a set of orthonormal eigenvectors of $X^{T}X$. Since in this case $T < L$, there are at most $T$ nonzero singular values. We denote the sorted singular values as $(\sigma_1, \cdots, \sigma_T)$, the left-singular vectors as $(u_1, \cdots, u_T)$, and the right-singular vectors as $(v_1, \cdots, v_T)$, where $u_i$ and $v_i$ are column vectors subject to the following equations:

$$u_i^{T} u_j = v_i^{T} v_j = \delta_{ij}. \tag{2}$$

Based on the special meaning of the matrix $X$, we can interpret the singular vectors and the singular values as follows: 1) the right-singular vectors can be seen as base networks, and the element $v_i(l)$ (i.e., the $l$-th element of the $i$-th right-singular vector) gives the weight of the $l$-th edge in the $i$-th base network; 2) the left-singular vectors account for the temporal variation of the base networks, and the $t$-th value of $u_i$ (denoted as $u_i(t)$) provides the contribution of the $i$-th base network on day $t$; 3) the singular values $\sigma_i$, which are the square roots of the non-zero eigenvalues of both $X^{T}X$ and $XX^{T}$, indicate the overall importance of the $i$-th base network in approximating the whole matrix. Please note that the singular values are sorted in decreasing order and thus give decreasing contributions to the result.
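The normalization and decomposition can be sketched with NumPy as below. This is an illustration on synthetic data, not the authors' implementation. It assumes every row of the raw matrix has a positive sum (a day with no transactions would need separate handling), and it uses the thin SVD, so only the first $\min(T, L)$ singular vectors are returned. The function name `base_networks` is hypothetical.

```python
import numpy as np

def base_networks(X):
    """Row-normalize, column-center, and decompose a T x L matrix."""
    X = np.asarray(X, dtype=float)
    X = X / X.sum(axis=1, keepdims=True)    # each row now sums to 1
    X = X - X.mean(axis=0, keepdims=True)   # each column now sums to 0
    # Thin SVD: U is T x K, s holds K singular values, Vt is K x L, K = min(T, L).
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return X, U, s, Vt

rng = np.random.default_rng(0)
raw = rng.random((5, 8)) + 0.1              # synthetic positive edge weights
Xn, U, s, Vt = base_networks(raw)

# Row i of Vt is the i-th base network v_i; column i of U is u_i(t).
first_base_network = Vt[0]
first_daily_contribution = U[:, 0]
```

On this synthetic input, the row and column sums of the normalized matrix vanish, the singular vectors are orthonormal as in Eq. (2), and multiplying the three factors back together recovers the normalized matrix.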

B. Detecting Graph Structural Changes

As the (normalized) weight of the $l$-th edge in the daily graph structure on day $t$ can be written as

$$x_{tl} = \sum_{i=1}^{T} \sigma_i u_i(t) v_i(l), \tag{3}$$

to detect graph structural changes, we need to consider two terms: $\sigma_i$ (i.e., the importance of the $i$-th base network) and $u_i(t)$ (i.e., the contribution of the $i$-th base network on day $t$).

At first glance, we consider the daily influence of the first, and also the most important, base network (i.e., $u_1(t)$). We want to know the correlation between the variation of $u_1(t)$ and the fluctuation of the Bitcoin exchange price. As the range of the price is (12, 1207), we adopt a simple mathematical transform to make sure most of the transformed price falls in the interval (0, 1). Specifically, we adopt the log transform $B(t) = \log_{1000} P(t)$, where $P(t)$ is the closing exchange price of Bitcoin on day $t$. Table IV (left part) shows three commonly used correlation coefficients (i.e., the Pearson, Spearman, and Kendall correlation coefficients) between $u_1(t)$ and the log-transformed price $B(t)$. The results show that the daily variation of the first base network in EHG and ELG has a strong correlation with the Bitcoin exchange price. However, in NMG, there is almost no correlation between the two variables. The result suggests that the transactions made between abnormal accounts have a great influence on the Bitcoin exchange price.
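The log transform and the three correlation coefficients can be sketched in plain Python as follows. The helper names and the sample series are hypothetical. The rank-based measures here assume no tied values, and the Kendall coefficient is computed in its simplest (tau-a) form; the text does not say which variant was used.

```python
import math

def log_price(p):
    """B(t) = log base 1000 of the closing price P(t)."""
    return math.log(p, 1000)

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """Ranks starting at 1; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(x, y):
    """Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def kendall(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)

prices = [12.0, 100.0, 1000.0, 1207.0, 500.0]   # hypothetical closing prices
u1 = [-0.4, -0.1, 0.3, 0.2, 0.25]               # hypothetical u_1(t) values
B = [log_price(p) for p in prices]
coefficients = (pearson(u1, B), spearman(u1, B), kendall(u1, B))
```

With base 1000, a price of 1000 USD maps to exactly 1, so only prices above that level fall outside the unit interval.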

Motivated by this result, we want to know to what extent the log-transformed price can be estimated by a combination of the left-singular vectors, i.e.,

$$B(t) \sim c_0 + \sum_{i=1}^{N} c_i u_i(t), \tag{4}$$

TABLE IV
CORRELATION COEFFICIENTS BETWEEN THE LEFT-SINGULAR VECTORS OF THE NETWORK TIME SERIES MATRIX AND THE BITCOIN EXCHANGE PRICE

Graph   The 1st base network      The fitted 10 base networks
        ρP      ρS      ρK        ρP       ρS       ρK
EHG     0.56    0.60    0.44      0.811    0.807    0.620
ELG     0.58    0.82    0.64      0.871    0.834    0.652
NMG     0.05    0.15    0.12      0.239    0.398    0.289

Fig. 5. Singular values in the order of their importance (x-axis: order of singular value; y-axis: singular value; curves: ELG, EHG, and NMG).

where $c_0$ is the mean of $B(t)$ and $c_i$ can be computed as the dot product of $B(t)$ and $u_i(t)$. As the left-singular vectors are orthonormal and span the $T$-dimensional linear space, $B(t)$ can be reconstructed exactly from the $u_i(t)$ when $N = T$. However, this is not what we desire in this case. The purpose of this study is to identify some important base networks and accounts that have a great influence on the Bitcoin price. To proceed, we first select some important base networks among the detected base networks. We draw the scree plot of the singular values, as shown in Fig. 5. As can be seen from the graph, the curve of the singular values clearly levels off to the right of the dotted line (i.e., the 10th singular value). Thus, we select the first 10 base networks for the following analysis.
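The fit in Eq. (4) can be sketched as an orthogonal projection. This is a minimal illustration on synthetic data, and the function name `fit_price` is hypothetical. One detail is an assumption made for numerical safety: the coefficients are computed from the mean-centered series $B(t) - c_0$. For left-singular vectors with nonzero singular value this gives the same $c_i$ as the plain dot product, because after column centering those vectors are orthogonal to the constant vector.

```python
import numpy as np

def fit_price(B, U, n_components):
    """Approximate B(t) by c0 + sum_i c_i u_i(t) with the first N columns of U."""
    B = np.asarray(B, dtype=float)
    c0 = B.mean()
    Un = U[:, :n_components]
    c = Un.T @ (B - c0)          # c_i = <B - c0, u_i>
    return c0 + Un @ c

# Synthetic stand-ins for the normalized matrix X and the price series B(t).
rng = np.random.default_rng(1)
X = rng.random((6, 9))
X = X / X.sum(axis=1, keepdims=True)
X = X - X.mean(axis=0, keepdims=True)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
B = rng.random(6)

fitted = fit_price(B, U, 3)
errors = [np.linalg.norm(B - fit_price(B, U, k)) for k in range(7)]
```

The residual shrinks (or stays equal) as more base networks are added and vanishes when all $T$ vectors are used, which is why a small $N$ has to be chosen deliberately rather than by goodness of fit alone.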

Before analyzing the accounts in the selected base networks, we approximate $B(t)$ with the selected networks. To evaluate the quality of the fit, we calculate the correlation coefficients between the fitted price series and $B(t)$. The right part of Table IV shows the correlation coefficients. Notably, the three correlation coefficients are greatly enhanced compared with those of the first left-singular vector alone. In particular, the Pearson correlation coefficient between the fitted series of ELG and $B(t)$ is 0.87, while it is only 0.24 for NMG. The great difference indicates a strong correlation between abnormal accounts' transactions and the Bitcoin exchange price, which is strong evidence of price manipulation in Mt. Gox.

Figure 6 shows the trends of $B(t)$ and the fitted price. As can be seen from the graph, though the shape of the peak in April 2013 is missed, the trend of $B(t)$ is captured by the selected base networks of EHG and ELG, whereas the base networks of NMG fail to capture the trend.

Fig. 6. Approximation of the log-transformed Bitcoin price with the linear combination of the selected base networks of (a) EHG, (b) ELG, and (c) NMG (x-axis: date, 2012-12 to 2013-12; y-axis: exchange price; curves: fitted 10 base networks and exchange price).

Fig. 7. The time-varying contribution $u_i(t)$ of the first four base networks of (a) EHG, (b) ELG, and (c) NMG (x-axis: date, 2012-12 to 2013-12; y-axis: left-singular vector weight; curves: $u_1(t)$ to $u_4(t)$).

To show the structural variation of the networks, we draw the time-varying contribution $u_i(t)$ of the first four base networks in Fig. 7. In most cases, the $u_i(t)$ exhibit a few abrupt changes, partitioning the history of the transactions into separate time periods. The most notable abrupt changes are in December 2012, when the Bitcoin exchange price was very smooth, and in November 2013, when the price skyrocketed. During these two periods, the effects of the first four base networks of EHG and ELG are both significant; however, the base networks of NMG have no distinct effect during the smooth period and show an effect on only a few days during the skyrocketing period.

C. Abnormal transaction patterns

As discussed above, the transactions between abnormal users have a strong correlation with the Bitcoin exchange price. A natural question is which edges (i.e., transactions), and thus which accounts, are the most influential, and whether the transactions show certain patterns during the period. To this end, based on the 10 extracted base networks, we further extract the top-10 ranking edges (by the absolute value of their weights) in each base network. In EHG, we find only 44 distinct edges instead of the 100 maximally possible, which involve a total of 28 accounts. In ELG, 57 edges and 46 accounts are found. We call these accounts the core abnormal accounts.
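The selection of core edges and accounts can be sketched as follows. This is an illustration with hypothetical names and toy data: `base_networks` stands for the selected right-singular vectors, and `edge_order` for the fixed list that maps each vector position to its (seller, buyer) edge.

```python
def core_edges(base_networks, edge_order, top_k=10):
    """Collect the top_k edges by absolute weight from each base network.

    base_networks: list of right-singular vectors, each of length L.
    edge_order: list of L (seller, buyer) pairs, one per vector position.
    """
    edges = set()
    for v in base_networks:
        ranked = sorted(range(len(v)), key=lambda pos: abs(v[pos]), reverse=True)
        edges.update(edge_order[pos] for pos in ranked[:top_k])
    # The core accounts are the endpoints of the collected edges.
    accounts = {account for edge in edges for account in edge}
    return edges, accounts

edge_order = [(1, 2), (2, 3), (3, 1), (4, 1)]
networks = [
    [0.9, -0.1, 0.3, 0.0],
    [0.1, -0.8, 0.5, 0.2],
]
edges, accounts = core_edges(networks, edge_order, top_k=2)
```

In this toy case the two base networks share one of their top edges, so only three distinct edges are found instead of the four maximally possible, mirroring the overlap reported above.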

To identify special transaction patterns, we draw the daily subgraphs of the core abnormal accounts. We find that there are many abnormal transaction patterns (i.e., market manipulation patterns) in the networks. To save space, we show only 6 typical patterns in Fig. 8. These subgraphs are all extracted from ELG on different days. For clearer illustration, we fix the layout of the graph (i.e., the position of the accounts in each graph is fixed) and mark the special patterns in red. The width of a line denotes the number of transactions between the two accounts, and the number to the right of a directed edge gives that number of transactions. We briefly explain the 6 patterns as follows.

• Self-Loop: A pattern in which an account makes transactions with itself. Figure 8a shows the subgraph on 2013/02/07, when account 231 made 749 transactions with itself. Self-loops are prohibited for normal accounts in any exchange, as they make no sense. Thus, a reasonable explanation for the self-loop pattern is that the account may belong to the exchange and may be used to inflate the daily transaction volume or to manipulate the price.

• Unidirection: The unidirectional pattern indicates more than one transaction from account A to account B. Figure 8b shows a unidirectional pattern on 2013/08/15, when account 527332 made 322 sell transactions to account 231. It is possible for an account to sell Bitcoin to another account more than once; however, it is almost impossible for two normal accounts to interact such a large number of times on the same day.

• Bi-direction: In the bi-directional pattern, two accounts interact with each other many times. It is a typical market manipulation behavior, especially when the two accounts are controlled by the same user. Figure 8c shows the bi-direction pattern on 2013/04/14, when account 144834 interacted with account 231 more than 150 times.

Fig. 8. Some typical abnormal transaction patterns: (a) Self-Loop, (b) Unidirection, (c) Bidirection, (d) Triangle, (e) Polygon, (f) Star.

• Triangle: The triangle pattern indicates a triangle-like structure among three accounts. It may take various forms when the direction of the edges is considered. Figure 8d shows a special form of the triangle pattern on 2013/10/25. It is special because the accounts form a loop through transactions (account 282004 → 71885 → 490089 → 282004).

• Polygon: The polygon pattern is a more complicated transaction pattern in which several accounts form a polygon-like group, with each edge carrying more than one transaction. Figure 8e shows a quadrangle pattern on 2013/09/19; it seems that account 282004 sends Bitcoin to account 527332 through the “bridge accounts” 488195 and 231 in more than two hundred transactions.

• Star: A star pattern has a core account that buys Bitcoin from, or sells Bitcoin to, many accounts. Figure 8f shows a typical star, in which account 282004 sells Bitcoin to accounts 488195, 490089, 527332, and 231.
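Several of the patterns listed above can be flagged mechanically from one day's transaction counts. The sketch below is an illustration only: the function name, the `min_count` threshold for a repeated edge, the minimum of three partners for a star, and the sample account IDs and counts are all hypothetical. Polygons longer than three edges are not covered, and only outgoing (selling) stars are detected.

```python
def find_patterns(day_counts, min_count=2):
    """Flag simple patterns in one day's directed transaction counts.

    day_counts: dict mapping (seller, buyer) -> number of transactions.
    min_count: assumed threshold for treating an edge as repeated.
    """
    heavy = {e for e, c in day_counts.items() if c >= min_count}
    self_loops = {e for e in heavy if e[0] == e[1]}
    links = heavy - self_loops
    bidirectional = {(a, b) for a, b in links if (b, a) in links and a < b}
    unidirectional = {(a, b) for a, b in links if (b, a) not in links}
    out = {}
    for a, b in links:
        out.setdefault(a, set()).add(b)
    # Directed loops a -> b -> c -> a, reported once from the smallest ID.
    triangles = {
        (a, b, c)
        for a, targets in out.items()
        for b in targets
        for c in out.get(b, ())
        if a in out.get(c, ()) and a < b and a < c
    }
    # A seller with at least three distinct repeated buyers.
    stars = {a: sorted(t) for a, t in out.items() if len(t) >= 3}
    return {
        "self_loop": self_loops,
        "bidirection": bidirectional,
        "unidirection": unidirectional,
        "triangle": triangles,
        "star": stars,
    }

# Hypothetical daily counts with small made-up account IDs.
day = {(1, 1): 749, (2, 1): 322, (3, 4): 5, (4, 5): 4, (5, 3): 6, (6, 7): 1}
patterns = find_patterns(day)
```

The single transaction from account 6 to account 7 falls below the threshold and is ignored, which reflects the argument that an isolated trade between two accounts is unremarkable while hundreds on the same day are not.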

Generally speaking, it is not surprising for a transaction network to form a special structure, as transactions are random. However, in our case this is implausible as a random outcome, since each edge represents far more than one transaction in a single day. Thus, it seems quite possible that these accounts are controlled by a certain group and that these transactions have special purposes.

Based on the results, we summarize the findings as follows.

• Finding 4: The daily fluctuations of the selected base networks of EHG and ELG have a strong correlation with the Bitcoin exchange price. On the contrary, the daily fluctuation of the base networks of NMG has almost no correlation with the Bitcoin exchange price. This finding indicates that the behavior of the abnormal accounts' transactions affects the fluctuation of the Bitcoin exchange price.

• Finding 5: The trend of the Bitcoin exchange price can be captured by the selected base networks of EHG and ELG. It means that the trend of the price can be predicted by transactions between abnormal accounts.

• Finding 6: There are many unusual transaction patterns (e.g., self-loop, bi-direction, star) between abnormal accounts. These patterns imply that these accounts are controlled by the same group and are strong evidence of price manipulation.

V. RELATED WORK

Blockchain is a new technology with many research directions, attracting the interest of researchers from various fields [12], [13]. Our research is related to previous work in two areas. The first is the study of the large fluctuations of the Bitcoin price. As mentioned above, many driving factors of the price have been found. Since all the related data are time series, the most commonly used methods in the analysis are time-series models, such as the vector space model [14], the vector error-correction model [15], the ARDL bounds testing method [16], wavelet analysis [5], and vector autoregression [15].

Another related area is the study of blockchain data (i.e., the transaction ledger) for different topics. Because blockchain data are publicly accessible and users are anonymous in the system, a common topic is mining the blockchain data to reveal users' privacy [17]–[19]. Because it is relatively lawless, blockchain has become an area full of various scams. Thus, mining the blockchain data to detect scams is also a critical topic. Recently, there have been many studies on this topic, such as Bitcoin-based scams [20], smart-contract-based Ponzi schemes [21], [22], money laundering [23], and attacks [24]. See [25] for a full survey of this topic.

VI. CONCLUSION AND FUTURE WORK

We conduct a systematic study of the leaked Mt. Gox transaction data through graph analysis. By comparing the transaction prices in the transaction data with the disclosed daily prices, many abnormal transactions are identified and used to divide the accounts into three categories. Based on this classification, we construct three graphs (i.e., EHG, ELG, and NMG) and obtain many findings by analyzing these graphs with various metrics. These findings convince us that there were many market manipulation behaviors in the exchange. In order to reveal the relationship between these behaviors and the Bitcoin price, the graphs are reconstructed into daily graph series and reshaped into matrices. By applying SVD to the matrices, some very important base networks are identified. By inspecting the base networks, we find that the daily variation of the abnormal base networks is closely related to the Bitcoin price, and we identify many market manipulation patterns. Based on these findings, and considering that Bitcoin is dominant in the market, we propose strengthening supervision of this market. In the future, we will conduct a more thorough study of the data to reveal the extent to which the market is affected and to discuss the changes in the behavior of investors under extreme price fluctuations.

ACKNOWLEDGMENT

The work described in this paper was supported by the National Key Research and Development Program (2016YFB1000101), the National Natural Science Foundation of China (61722214, 11801595), the Pearl River S&T Nova Program of Guangzhou (201710010046), and the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2016ZT06D211).

REFERENCES

[1] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” 2008. [Online]. Available: https://bitcoin.org/bitcoin.pdf

[2] C. Burniske and A. White, “Bitcoin: Ringing the bell for a new asset class,” 2017. [Online]. Available: https://research.ark-invest.com/hubfs/1_Download_Files_ARK-Invest/White_Papers/Bitcoin-Ringing-The-Bell-For-A-New-Asset-Class.pdf

[3] A. Feder, N. Gandal, J. Hamrick, and T. Moore, “The impact of DDoS and other security shocks on bitcoin currency exchanges: Evidence from Mt. Gox,” Journal of Cybersecurity, vol. 3, no. 2, pp. 137–144, 2018.

[4] M. Buchholz, J. Delaney, J. Warren, and J. Parker, “Bits and bets, information, price volatility, and demand for bitcoin,” Economics, vol. 312, 2012.

[5] L. Kristoufek, “What are the main drivers of the bitcoin price? Evidence from wavelet coherence analysis,” PloS one, vol. 10, no. 4, p. e0123923, 2015.

[6] L. Kristoufek, “Bitcoin meets google trends and wikipedia: Quantifying the relationship between phenomena of the internet era,” Scientific Reports, vol. 3, p. 3415, 2013.

[7] D. Kondor, I. Csabai, J. Szule, M. Posfai, and G. Vattay, “Inferring the interplay between network structure and market effects in bitcoin,” New Journal of Physics, vol. 16, no. 12, p. 125003, 2014.

[8] N. Gandal, J. Hamrick, T. Moore, and T. Oberman, “Price manipulation in the bitcoin ecosystem,” Journal of Monetary Economics, vol. 95, pp. 86–96, 2018.

[9] S. Ribes, “Chasing fake volume: a crypto-plague,” Mar. 2018. [Online]. Available: https://medium.com/@sylvainartplayribes/chasing-fake-volume-a-crypto-plague-ea1a3c1e0b5e

[10] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2016. [Online]. Available: https://www.R-project.org/

[11] C. S. Gillespie, “Fitting heavy tailed distributions: The poweRlaw package,” Journal of Statistical Software, vol. 64, no. 2, pp. 1–16, 2015. [Online]. Available: http://www.jstatsoft.org/v64/i02/

[12] Z. Zheng, S. Xie, H. Dai, X. Chen, and H. Wang, “Blockchain challenges and opportunities: a survey,” International Journal of Web and Grid Services, vol. 14, pp. 352–375, 2018.

[13] Z. Zheng, S. Xie, H. Dai, X. Chen, and H. Wang, “An overview of blockchain technology: Architecture, consensus, and future trends,” in 2017 IEEE International Congress on Big Data (BigData Congress). IEEE, 2017, pp. 557–564.

[14] I. Georgoula, D. Pournarakis, C. Bilanakos, D. Sotiropoulos, and M. G. Giaglis, “Using time-series and sentiment analysis to detect the determinants of bitcoin prices,” Available at SSRN: https://ssrn.com/abstract=2607167, 2015.

[15] P. Ciaian, M. Rajcaniova, and d. Kancs, “The economics of bitcoin price formation,” Applied Economics, vol. 48, no. 19, pp. 1799–1815, 2016.

[16] J. Bouoiyour, R. Selmi et al., “What does bitcoin look like?” Annals of Economics and Finance, vol. 16, no. 2, pp. 449–492, 2015.

[17] F. Reid and M. Harrigan, “An analysis of anonymity in the bitcoin system,” in Security and Privacy in Social Networks. Springer, 2013, pp. 197–223.

[18] E. Androulaki, G. O. Karame, M. Roeschlin, T. Scherer, and S. Capkun, “Evaluating user privacy in bitcoin,” in International Conference on Financial Cryptography and Data Security. Springer, 2013, pp. 34–51.

[19] S. Athey, I. Parashkevov, V. Sarukkai, and J. Xia, “Bitcoin pricing, adoption, and usage: Theory and evidence,” Available at SSRN: https://ssrn.com/abstract=2826674, 2016.

[20] M. Vasek and T. Moore, “There's no free lunch, even using Bitcoin: Tracking the popularity and profits of virtual currency scams,” in International Conference on Financial Cryptography and Data Security. Springer, 2015, pp. 44–61.

[21] M. Bartoletti, S. Carta, T. Cimoli, and R. Saia, “Dissecting ponzi schemes on ethereum: identification, analysis, and impact,” arXiv preprint arXiv:1703.03779, 2017.

[22] W. Chen, Z. Zheng, J. Cui, E. Ngai, P. Zheng, and Y. Zhou, “Detecting ponzi schemes on ethereum: Towards healthier blockchain technology,” in Proceedings of the 2018 World Wide Web Conference on World Wide Web. ACM, 2018, pp. 1409–1418.

[23] M. Moser, R. Bohme, and D. Breuker, “An inquiry into money laundering tools in the bitcoin ecosystem,” in eCrime Researchers Summit (eCRS), 2013. IEEE, 2013, pp. 1–14.

[24] T. Chen, Y. Zhu, Z. Li, J. Chen, X. Li, X. Luo, X. Lin, and X. Zhang, “Understanding ethereum via graph analysis,” in IEEE INFOCOM 2018 - IEEE Conference on Computer Communications. IEEE, 2018, pp. 1484–1492.

[25] W. Chen and Z. Zheng, “Blockchain data analysis: A review of status, trends and challenges,” Journal of Computer Research and Development, vol. 55, no. 9, pp. 1853–1870, 2018.

Page 5: Cryptocurrency Market Cap, Live Prices, Charts & Ticker - Market … - Evidence... · 2020. 6. 14. · Abstract—The cryptocurrency market is a very huge market without effective

(a) EHG (b) ELG (c) NMG

Fig 4 Degree distribution of EHG ELG and NMG

Column 5 and 6 in Table III show the average degreeand the weighted average degree of the graphs The largeaverage degrees of EHG and ELG indicate that the abnormalaccounts are used more frequently than normal accounts Theweighted degree is computed by setting the transaction volume(ie Bitcoin) as the weight thus the average weighted degreerepresents the average transaction volume for each edge Ascan be seen the average weighted degree of ELG is far largerthan it of EHG one possible reason that the exchange priceof transactions in ELG is relatively low thus the transactionvolume is large Whatever the reason is an obvious factremains that the average weighted degree of EHG and ELGare larger than that of NMG which means the edges betweenabnormal accounts transfer more Bitcoin than edges betweennormal accounts

Based on the results and analysis discussed above wesummarize the findings as follows

bull Finding 1 There are some abnormal accounts (125)which trading with very high or low exchange price insome transactions We consider these accounts abnormaland under control by the exchange for two reasons 1)the proportion of the abnormal transactions account for28 thus it is not occurred by accident 2) the abnormalexchange price is impossible to appear on ordinary users

bull Finding 2 Many seemingly normal transactions occurredbetween abnormal accounts ( gt 41) There are two pos-sible purposes for these transactions 1) these transactionsare the fake volume that used to create an illusion ofactive trading 2) to provide liquidity for the exchange

bull Finding 3 The graphs of abnormal accounts have verylarge clustering coefficients One possible reason is thatthese accounts are controlled by one organization andthus the trade is not completely random

These findings indicate that the exchange was likely in-volved in trading manipulation As the exchange price is thekey factor of trading in the following section we will discussthe possibility of price manipulation of the exchange

IV TEMPORAL NETWORK ANALYSIS

As discussed above the transaction network of abnormalaccounts (ie EHG and ELG) show a great difference fromthe NMG We want to know whether these transactions have acorrelation with the Bitcoin price and what kind of users and

transactions (ie graph structure) influence the Bitcoin pricegreatly To this end we calculate the daily snapshots of thegraphs by adopting the method similar to III-B To detect im-portant changes in the graph structure we compare successivesnapshots of the graphs using singular value decomposition(SVD) The goal is to detect a set of base networks andrepresents each dayrsquos snapshot as a linear combination of thesebase networks Unlike in Section III we focused our studyon transaction data after 20121201 in this section Thereare many reasons supporting our choice Firstly the recentpaper which proves the price manipulation of Mt Gox usesthe same transaction history [8] Secondly the Bitcoin priceexperienced a skyrocketing during this period Thirdly MtGox was the main Bitcoin exchange during this period Finallymore abnormal users and transactions (more than 60) arefound after that day

A Extract Base Networks

To evaluate which networks influence the price greatly weneed to construct the daily snapshots of the three graphsEHGt ELGt and NMGt We adopt the same process toconstruct the graph series First of all we construct the aggre-gate networks (ie EHG) based on tuples after 20121201Assume there are n nodes and L edges in the aggregatenetwork then it can be represented by a n times n weightedadjacency matrix G in which there are L non-zero elementsWe rearrange G into an L long vector g containing all the non-zero elements We call this vector as edge-weight vector Thevector describes the graph structure of the aggregate networkas each element represents a possible edge and its weight Toconstruct the daily snapshots of EHGt on day t we recalculatethe edge-weight vector gt (ie the graph structure on day t)based on transaction tuples on day t Please note that we donot change the order of the vector thus the i-th element of allthe edge-weight vectors indicate the same edge and it maybe zero if the edge does not exist on a specific day For Tsnapshots we now build the TtimesL graph time series matrix Xsuch that the t-th row of X equals gt By doing this we builda special matrix with T samples and each sample representsa daily graph structure

To account for the variation of the daily graph structure wenormalize X such that the sum of each row equals 1 and thensubtract the column averages from each column As a resultboth the row and column sums in the matrix will be zero Wecompute the singular value decomposition of the matrix X

X = UΣV T (1)

where $U$ is a $T \times T$ unitary matrix, $\Sigma$ is a $T \times L$ diagonal matrix with non-negative values on the diagonal, and $V$ is an $L \times L$ unitary matrix. The non-negative values on the diagonal are the singular values, which are usually sorted in descending order. The left-singular vectors, contained in the columns of $U$, are a set of orthonormal eigenvectors of $XX^{T}$, and the right-singular vectors, contained in the columns of $V$, are a set of orthonormal eigenvectors of $X^{T}X$. Since $T < L$ in this case, there are at most $T$ non-zero singular values. We denote the sorted singular values as $(\sigma_1, \cdots, \sigma_T)$, the left-singular vectors as $(u_1, \cdots, u_T)$ and the right-singular vectors as $(v_1, \cdots, v_T)$, where $u_i$ and $v_i$ are column vectors subject to the following equations:

$$u_i^{T} u_j = v_i^{T} v_j = \delta_{ij} \tag{2}$$
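The normalization, centering and decomposition steps can be checked numerically. The following is a minimal sketch on random data, assuming NumPy; the helper name `normalize_and_center` and the toy sizes are hypothetical, and the sketch assumes every day has at least one transaction so that no row sum is zero.

```python
import numpy as np


def normalize_and_center(X):
    """Scale each row to sum to 1, then subtract the column averages."""
    row_sums = X.sum(axis=1, keepdims=True)  # assumed non-zero for every day
    Xn = X / row_sums
    return Xn - Xn.mean(axis=0, keepdims=True)


rng = np.random.default_rng(0)
X_raw = rng.random((5, 8))            # toy sizes: T = 5 days, L = 8 edges
Xc = normalize_and_center(X_raw)

# Thin SVD: U is T x T, s holds the T singular values, Vt is T x L.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

print(np.allclose(Xc.sum(axis=0), 0.0), np.allclose(Xc.sum(axis=1), 0.0))
print(np.allclose(U.T @ U, np.eye(5)), np.allclose(Vt @ Vt.T, np.eye(5)))
print(s)
```

`np.linalg.svd` returns the singular values already sorted in descending order. Because the columns are centered, the rank of the matrix is at most T - 1, so the last singular value is zero up to rounding.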

Based on the special meaning of the matrix $X$, we can interpret the singular vectors and the singular values as follows:
1) The right-singular vectors can be seen as base networks, and the element $v_i(l)$ (i.e., the $l$-th element of the $i$-th right-singular vector) gives the weight of the $l$-th edge in the $i$-th base network.
2) The left-singular vectors account for the temporal variation of the base networks: the $t$-th value of $u_i$ (denoted $u_i(t)$) provides the contribution of the $i$-th base network on day $t$.
3) The singular values $\sigma_i$, which are the square roots of the non-zero eigenvalues of both $X^{T}X$ and $XX^{T}$, indicate the overall importance of the $i$-th base network in approximating the whole matrix.
Note that the singular values are sorted in decreasing order and thus give decreasing contributions to the result.

B. Detecting Graph Structural Changes

As the (normalized) weight of the $l$-th edge in the daily graph structure on day $t$ can be written as

$$x_{tl} = \sum_{i=1}^{T} \sigma_i u_i(t) v_i(l), \tag{3}$$

to detect graph structural changes we need to consider two terms: $\sigma_i$ (i.e., the importance of the $i$-th base network) and $u_i(t)$ (i.e., the contribution of the $i$-th base network on day $t$).
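Eq. (3) can be illustrated by summing its terms explicitly. The sketch below, on random data with NumPy, rebuilds a matrix from all of its base networks and from only the first two; the function name `reconstruct` and the toy sizes are hypothetical. It relies on a standard property of the SVD: the Frobenius error of the truncated sum equals the root of the sum of the squared singular values that were dropped.

```python
import numpy as np


def reconstruct(U, s, Vt, k):
    """Sum the first k terms sigma_i * u_i(t) * v_i(l) of Eq. (3)."""
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


rng = np.random.default_rng(1)
X = rng.standard_normal((6, 9))       # toy matrix: T = 6, L = 9
U, s, Vt = np.linalg.svd(X, full_matrices=False)

full = reconstruct(U, s, Vt, k=len(s))   # all T terms recover X
approx = reconstruct(U, s, Vt, k=2)      # only the two largest base networks

error = np.linalg.norm(X - approx)       # Frobenius norm of the residual
expected = np.sqrt(np.sum(s[2:] ** 2))   # energy of the discarded terms
print(np.allclose(full, X), error, expected)
```

This is why a small number of base networks with large singular values can summarize most of the day-to-day variation in the graph structure.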

As a first glance, we consider the daily influence of the first, and also the most important, base network (i.e., $u_1(t)$). We want to know the correlation between the variation of $u_1(t)$ and the fluctuation of the Bitcoin exchange price. As the range of the price is (12, 1207), we adopt a simple mathematical transform so that most of the transformed price falls in the interval (0, 1). Specifically, we adopt the log transform $B(t) = \log_{1000} P(t)$, where $P(t)$ is the closing exchange price of Bitcoin on day $t$. Table IV (left part) shows three commonly used correlation coefficients (i.e., the Pearson, Spearman and Kendall correlation coefficients) between $u_1(t)$ and the log-transformed price $B(t)$. The results show that the daily variation of the first base network in EHG and ELG has a strong correlation with the Bitcoin exchange price (Pearson coefficients of 0.56 and 0.58, respectively). In NMG, however, there is almost no correlation between the two variables (Pearson coefficient of 0.05). This result indicates that the transactions made between abnormal accounts have a great influence on the Bitcoin exchange price.
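The log transform and the three coefficients can be sketched as follows. This is an illustrative implementation with NumPy only, not the authors' code: the price and $u_1(t)$ values are made up, the function names are hypothetical, and the Spearman and Kendall versions are simplified ones that assume no tied values (Kendall is computed as tau-a).

```python
import math
import numpy as np


def log_price(p):
    """B(t) = log_1000 P(t)."""
    return np.log(p) / math.log(1000.0)


def pearson(x, y):
    return float(np.corrcoef(x, y)[0, 1])


def _ranks(v):
    # Rank of each value (0 = smallest); valid only when there are no ties.
    return np.argsort(np.argsort(v)).astype(float)


def spearman(x, y):
    # Spearman correlation is the Pearson correlation of the ranks.
    return pearson(_ranks(x), _ranks(y))


def kendall(x, y):
    # Tau-a: (concordant pairs - discordant pairs) / number of pairs.
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
    return float(total / (n * (n - 1) / 2))


prices = np.array([12.0, 30.0, 100.0, 250.0, 1207.0])  # made-up closing prices
u1 = np.array([-0.5, -0.1, -0.2, 0.3, 0.6])            # made-up u_1(t) values
B = log_price(prices)

print(B)
print(pearson(u1, B), spearman(u1, B), kendall(u1, B))
```

With base 1000, a price of $1,000 maps to exactly 1, so the observed range (12, 1207) maps to roughly (0.36, 1.03).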

Motivated by this result, we want to know to what extent the log-transformed price can be estimated by a combination of the left-singular vectors, i.e.,

$$B(t) \sim c_0 + \sum_{i=1}^{N} c_i u_i(t) \tag{4}$$

TABLE IV
CORRELATION COEFFICIENTS BETWEEN THE LEFT-SINGULAR VECTORS OF THE NETWORK TIME SERIES MATRIX AND THE BITCOIN EXCHANGE PRICE

         The 1st base network      The fitted 10 base networks
Graph    ρP      ρS      ρK        ρP       ρS       ρK
EHG      0.56    0.60    0.44      0.811    0.807    0.620
ELG      0.58    0.82    0.64      0.871    0.834    0.652
NMG      0.05    0.15    0.12      0.239    0.398    0.289

Fig. 5. Singular values in the order of their importance. [Scree plot; x-axis: order of singular value (0 to 350); y-axis: singular value; curves: ELG, EHG and NMG; dotted line at the 10th singular value.]

where $c_0$ is the mean of $B(t)$ and $c_i$ can be computed as the dot product of $B(t)$ and $u_i(t)$. As the left-singular vectors are orthonormal and span the $T$-dimensional linear space, $B(t)$ can be reconstructed from the $u_i(t)$ when $N = T$. However, this is not what we desire in this case. The purpose of this study is to identify the important base networks and accounts that have a great influence on the Bitcoin price. To proceed, we first select the important base networks among the detected ones. We draw the scree plot of the singular values, as shown in Fig. 5. As can be seen from the graph, the curve of the singular values clearly levels off to the right of the dotted line (i.e., the 10th singular value). Thus, we select the first 10 base networks for the following analysis.
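The fit of Eq. (4) can be sketched as a projection onto the first $N$ left-singular vectors. The example below uses synthetic data and NumPy; the function name `fit_price` and the synthetic price series are hypothetical. One detail is an assumption of this sketch: the coefficients are computed from the mean-centered series $B(t) - c_0$. This coincides with the plain dot product of $B(t)$ and $u_i(t)$ whenever $u_i$ is orthogonal to the constant vector, which holds for every vector with a non-zero singular value because the columns of $X$ are centered.

```python
import numpy as np


def fit_price(B, U, n):
    """Approximate B(t) by c0 + sum over the first n terms c_i * u_i(t)."""
    c0 = B.mean()
    c = U[:, :n].T @ (B - c0)      # projection coefficients c_1 .. c_n
    return c0 + U[:, :n] @ c


rng = np.random.default_rng(2)
T, L = 30, 50                       # toy sizes with T < L
X = rng.random((T, L))
X = X / X.sum(axis=1, keepdims=True)
X = X - X.mean(axis=0, keepdims=True)
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Synthetic stand-in for the log-transformed price series.
B = np.linspace(0.4, 1.0, T) + 0.05 * rng.standard_normal(T)

residuals = [np.linalg.norm(B - fit_price(B, U, n)) for n in (1, 10, T)]
print(residuals)
```

The residual can only shrink as more vectors are added and vanishes at $N = T$, which is why a fit with all vectors says nothing about which base networks matter; restricting the fit to the leading ones is what makes the comparison across EHG, ELG and NMG informative.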

Before analyzing the accounts in the selected base networks, we approximate $B(t)$ with the selected networks. To evaluate the quality of the fit, we calculate the correlation coefficients between the fitted price series and $B(t)$. The right part of Table IV shows these correlation coefficients. Notably, all three correlation coefficients are much higher than those obtained with the first left-singular vector alone. In particular, the Pearson correlation coefficient between the fitted series and $B(t)$ is 0.87 for ELG but only 0.24 for NMG. This large difference indicates a strong correlation between the abnormal accounts’ transactions and the Bitcoin exchange price, which is strong evidence of price manipulation in Mt. Gox.

Figure 6 shows the trends of $B(t)$ and the fitted price. As can be seen from the graph, although the shape of the peak in April 2013 is missed, the trend of $B(t)$ is captured by the selected base networks of EHG and ELG, whereas the base networks of NMG fail to capture the trend.

Fig. 6. Approximation of the log-transformed Bitcoin price by the linear combination of the selected base networks of EHG, ELG and NMG. [Panels: (a) EHG, (b) ELG, (c) NMG; x-axis: date (2012-12 to 2013-12); y-axis: exchange price; curves: fitted (10 base networks) and exchange price.]

Fig. 7. The time-varying contribution $u_i(t)$ of the first four base networks. [Panels: (a) EHG, (b) ELG, (c) NMG; x-axis: date (2012-12 to 2013-12); y-axis: left-singular vector weight; curves: $u_1(t)$, $u_2(t)$, $u_3(t)$ and $u_4(t)$.]

To show the structural variation of the networks, we draw the time-varying contribution $u_i(t)$ of the first four base networks in Fig. 7. In most cases, $u_i(t)$ exhibits a few abrupt changes that partition the transaction history into separate time periods. The most notable abrupt changes are in December 2012, when the Bitcoin exchange price was very stable, and in November 2013, when the price skyrocketed. During these two periods, the effects of the first four base networks of EHG and ELG are both significant; however, the base networks of NMG have no distinct effect during the stable period and show an effect on only a few days during the skyrocketing period.

C. Abnormal transaction patterns

As discussed above, the transactions between abnormal users have a strong correlation with the Bitcoin exchange price. A natural question is which edges (i.e., transactions), and thus which accounts, are the most influential, and whether the transactions show certain patterns during the period. To this end, based on the 10 extracted base networks, we further extract the top-10 ranking edges (by the absolute value of their weights) in each base network. In EHG, we find only 44 distinct edges instead of the maximum possible 100, involving a total of 28 accounts. In ELG, 57 edges and 46 accounts are found. We call these the core abnormal accounts.
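The edge selection described above can be sketched as follows. The function name `core_edges`, the toy weights and the integer account identifiers are hypothetical; `Vt` stands for a matrix whose rows are the right-singular vectors (base networks) and `edges` for the fixed edge order of the aggregate network.

```python
import numpy as np


def core_edges(Vt, edges, num_networks=10, top=10):
    """Collect the top-ranked edges (by |weight|) of the leading base networks."""
    selected = set()
    for i in range(num_networks):
        # Indices of the `top` largest absolute weights in base network i.
        order = np.argsort(-np.abs(Vt[i]))[:top]
        selected.update(edges[l] for l in order)
    accounts = {account for edge in selected for account in edge}
    return selected, accounts


# Toy example: 2 base networks over 4 edges, keeping the top 2 edges of each.
edges = [(1, 2), (2, 3), (3, 1), (4, 1)]
Vt = np.array([
    [0.9, -0.1, 0.4, 0.1],
    [0.7, -0.6, 0.1, 0.2],
])
selected, accounts = core_edges(Vt, edges, num_networks=2, top=2)
print(sorted(selected), sorted(accounts))
```

In the toy example both base networks rank edge (1, 2) highly, so only 3 distinct edges are found instead of the maximum possible 4. The same overlap explains why far fewer than 100 distinct edges are found in the real data.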

To identify special transaction patterns, we draw the daily subgraphs of the core abnormal accounts. We find many abnormal transaction patterns (i.e., market manipulation patterns) in the networks. To save space, we show only 6 typical patterns in Fig. 8. These subgraphs are all extracted from ELG on different days. For clarity, we fix the layout of the graph (i.e., the position of each account is the same in every graph) and mark the special patterns in red. The width of each line reflects the number of transactions between the two accounts, and this number is also printed at the right-hand side of the directed edge. We briefly explain the 6 patterns as follows.

Fig. 8. Some typical abnormal transaction patterns. [Panels: (a) Self-Loop, (b) Unidirection, (c) Bidirection, (d) Triangle, (e) Polygon, (f) Star.]

• Self-Loop: A pattern in which an account makes transactions with itself. Figure 8a shows the subgraph on 2013/02/07, when account 231 made 749 transactions with itself. Self-loops are restricted for normal accounts on any exchange, as they make no sense. Thus, a reasonable explanation for the self-loop pattern is that the account may belong to the exchange and may be used to inflate the daily transaction volume or to manipulate the price.

• Unidirection: The unidirectional pattern indicates more than one transaction from account A to account B. Figure 8b shows a unidirectional pattern on 2013/08/15, when account 527332 made 322 sell transactions to account 231. It is possible for an account to sell Bitcoin to another account more than once; however, it is almost impossible for two normal accounts to interact such a large number of times on the same day.

• Bi-direction: In the bi-directional pattern, two accounts interact with each other many times. It is a typical market manipulation behavior, especially when the two accounts are controlled by the same user. Figure 8c shows the bi-directional pattern on 2013/04/14, when account 144834 interacted with account 231 more than 150 times.

• Triangle: The triangle pattern indicates a triangle-like structure between three accounts. It may take various forms when the direction of the edges is considered. Figure 8d shows a special form of the triangle pattern on 2013/10/25. It is special because the accounts form a loop through their transactions (account 282004 → 71885 → 490089 → 282004).

• Polygon: The polygon pattern is a more complicated transaction pattern in which many accounts form a polygon-like group and each edge carries more than one transaction. Figure 8e shows a quadrangle pattern on 2013/09/19; it seems that account 282004 sends Bitcoin to account 527332 through the “bridge accounts” 488195 and 231 in more than two hundred transactions.

• Star: A star pattern has a core account that buys Bitcoin from, or sells Bitcoin to, many accounts. Figure 8f shows a typical star, in which account 282004 sells Bitcoin to accounts 488195, 490089, 527332 and 231.
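Several of the patterns listed above can be flagged mechanically on a daily subgraph. The sketch below is a hypothetical illustration, not the procedure used to produce Fig. 8: the function name `find_patterns`, the thresholds `min_count` and `min_star`, and the toy account identifiers are all assumptions, and longer polygons are not covered.

```python
from collections import defaultdict


def find_patterns(edge_counts, min_count=2, min_star=3):
    """Flag simple patterns in one day's graph {(seller, buyer): count}."""
    # Keep only edges that carry repeated transactions on that day.
    heavy = {edge for edge, count in edge_counts.items() if count >= min_count}

    self_loops = sorted(a for a, b in heavy if a == b)
    bidirectional = sorted(
        (a, b) for a, b in heavy if a < b and (b, a) in heavy
    )
    unidirectional = sorted(
        (a, b) for a, b in heavy if a != b and (b, a) not in heavy
    )

    successors = defaultdict(set)
    for a, b in heavy:
        if a != b:
            successors[a].add(b)

    # Directed loops a -> b -> c -> a, reported once from the smallest id.
    triangles = sorted(
        (a, b, c)
        for a in list(successors)
        for b in successors[a]
        for c in successors.get(b, ())
        if a < b and a < c and a in successors.get(c, ())
    )
    # A star core sells to at least `min_star` different accounts.
    stars = sorted(a for a, outs in successors.items() if len(outs) >= min_star)

    return {
        "self_loop": self_loops,
        "unidirection": unidirectional,
        "bi_direction": bidirectional,
        "triangle": triangles,
        "star": stars,
    }


# Toy daily graph with hypothetical account ids.
day = {
    (1, 1): 5,                          # self-loop
    (2, 1): 3,                          # repeated one-way selling
    (3, 4): 2, (4, 3): 2,               # two-way trading
    (5, 6): 2, (6, 7): 2, (7, 5): 2,    # loop of three accounts
    (8, 2): 2, (8, 3): 2, (8, 5): 4,    # one seller, many buyers
    (9, 1): 1,                          # a single trade, ignored
}
patterns = find_patterns(day)
print(patterns)
```

Filtering on a minimum count first reflects the argument made in the text: a single trade between two accounts is unremarkable, whereas many trades on the same edge within one day are not.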

Generally speaking, it is not surprising for a transaction network to form a special structure by chance, as transactions are random. In our case, however, chance is an implausible explanation, as each edge represents far more than one transaction in a single day. Thus, it seems quite possible that these accounts are controlled by a certain group and that these transactions serve special purposes.

Based on the results, we summarize the findings as follows:

• Finding 4: The daily fluctuations of the selected base networks of EHG and ELG have a strong correlation with the Bitcoin exchange price. On the contrary, the daily fluctuation of the base networks of NMG has almost no correlation with the Bitcoin exchange price. This finding indicates that the transaction behavior of the abnormal accounts affects the fluctuation of the Bitcoin exchange price.

• Finding 5: The trend of the Bitcoin exchange price can be captured by the selected base networks of EHG and ELG. This means that the trend of the price can be predicted from the transactions between abnormal accounts.

• Finding 6: There are many unusual transaction patterns (e.g., self-loop, bi-direction, star) between abnormal accounts. These patterns imply that these accounts are controlled by the same group, and they are strong evidence of price manipulation.

V. RELATED WORK

Blockchain technology is a new technology that has many research directions and attracts the interest of researchers from various fields [12], [13]. Our research is related to previous work in two areas. The first is the study of the large fluctuations of the Bitcoin price. As mentioned above, many driving factors of the price have been found. Because all the related data are time series, the most commonly used methods in these analyses are time-series models, such as the vector space model [14], the vector error-correction model [15], the ARDL bounds testing method [16], wavelet analysis [5], and vector autoregression [15].

The second related area is the study of blockchain data (i.e., the transaction ledger) for different purposes. Because blockchain data are publicly accessible and users are anonymous in the system, a common topic is mining the blockchain data to reveal users’ private information [17]–[19]. Because it is relatively lawless, blockchain has also become an area full of scams. Thus, mining the blockchain data to detect scams is a critical topic as well. Recently, there have been many studies on this topic, covering Bitcoin-based scams [20], smart-contract-based Ponzi schemes [21], [22], money laundering [23], and attacks [24]. See [25] for a full survey of this topic.

VI. CONCLUSION AND FUTURE WORK

We conduct a systematic study of the leaked Mt. Gox transaction data through graph analysis. By comparing the prices in the transaction data with the disclosed daily price, many abnormal transactions were identified and used to divide the accounts into three categories. Based on this classification, we construct three graphs (i.e., EHG, ELG and NMG) and obtain many findings by analyzing these graphs with various metrics. These findings convince us that there was a great deal of market manipulation in the exchange. To reveal the relationship between this behavior and the Bitcoin price, the graphs are reconstructed into daily graph series and reshaped into matrices. By applying SVD to the matrices, some very important base networks are identified. By inspecting the base networks, we find that the daily variation of the abnormal base networks is closely related to the Bitcoin price, and we uncover many market manipulation patterns. Based on these findings, and considering that Bitcoin is dominant in the market, we propose strengthening the supervision of this market. In the future, we will conduct a more thorough study of the data to reveal the extent to which the market is affected and to discuss the changes in investors’ behavior under extreme price fluctuations.

ACKNOWLEDGMENT

The work described in this paper was supported by the National Key Research and Development Program (2016YFB1000101), the National Natural Science Foundation of China (61722214, 11801595), the Pearl River S&T Nova Program of Guangzhou (201710010046), and the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2016ZT06D211).

REFERENCES

[1] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” 2008. [Online]. Available: https://bitcoin.org/bitcoin.pdf

[2] C. Burniske and A. White, “Bitcoin: Ringing the bell for a new asset class,” 2017. [Online]. Available: https://research.ark-invest.com/hubfs/1_Download_Files_ARK-Invest/White_Papers/Bitcoin-Ringing-The-Bell-For-A-New-Asset-Class.pdf

[3] A. Feder, N. Gandal, J. Hamrick, and T. Moore, “The impact of ddos and other security shocks on bitcoin currency exchanges: Evidence from mt. gox,” Journal of Cybersecurity, vol. 3, no. 2, pp. 137–144, 2018.

[4] M. Buchholz, J. Delaney, J. Warren, and J. Parker, “Bits and bets, information, price volatility, and demand for bitcoin,” Economics, vol. 312, 2012.

[5] L. Kristoufek, “What are the main drivers of the bitcoin price? evidence from wavelet coherence analysis,” PloS one, vol. 10, no. 4, p. e0123923, 2015.

[6] L. Kristoufek, “Bitcoin meets google trends and wikipedia: Quantifying the relationship between phenomena of the internet era,” Scientific Reports, vol. 3, p. 3415, 2013.

[7] D. Kondor, I. Csabai, J. Szule, M. Posfai, and G. Vattay, “Inferring the interplay between network structure and market effects in bitcoin,” New Journal of Physics, vol. 16, no. 12, p. 125003, 2014.

[8] N. Gandal, J. Hamrick, T. Moore, and T. Oberman, “Price manipulation in the bitcoin ecosystem,” Journal of Monetary Economics, vol. 95, pp. 86–96, 2018.

[9] S. Ribes, “Chasing fake volume: a crypto-plague,” Mar. 2018. [Online]. Available: https://medium.com/@sylvainartplayribes/chasing-fake-volume-a-crypto-plague-ea1a3c1e0b5e

[10] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2016. [Online]. Available: https://www.R-project.org/

[11] C. S. Gillespie, “Fitting heavy tailed distributions: The poweRlaw package,” Journal of Statistical Software, vol. 64, no. 2, pp. 1–16, 2015. [Online]. Available: http://www.jstatsoft.org/v64/i02/

[12] Z. Zheng, S. Xie, H. Dai, X. Chen, and H. Wang, “Blockchain challenges and opportunities: a survey,” International Journal of Web and Grid Services, vol. 14, pp. 352–375, 2018.

[13] Z. Zheng, S. Xie, H. Dai, X. Chen, and H. Wang, “An overview of blockchain technology: Architecture, consensus, and future trends,” in 2017 IEEE International Congress on Big Data (BigData Congress). IEEE, 2017, pp. 557–564.

[14] I. Georgoula, D. Pournarakis, C. Bilanakos, D. Sotiropoulos, and M. G. Giaglis, “Using time-series and sentiment analysis to detect the determinants of bitcoin prices,” Available at SSRN: https://ssrn.com/abstract=2607167, 2015.

[15] P. Ciaian, M. Rajcaniova, and d. Kancs, “The economics of bitcoin price formation,” Applied Economics, vol. 48, no. 19, pp. 1799–1815, 2016.

[16] J. Bouoiyour, R. Selmi et al., “What does bitcoin look like?” Annals of Economics and Finance, vol. 16, no. 2, pp. 449–492, 2015.

[17] F. Reid and M. Harrigan, “An analysis of anonymity in the bitcoin system,” in Security and Privacy in Social Networks. Springer, 2013, pp. 197–223.

[18] E. Androulaki, G. O. Karame, M. Roeschlin, T. Scherer, and S. Capkun, “Evaluating user privacy in bitcoin,” in International Conference on Financial Cryptography and Data Security. Springer, 2013, pp. 34–51.

[19] S. Athey, I. Parashkevov, V. Sarukkai, and J. Xia, “Bitcoin pricing, adoption, and usage: Theory and evidence,” Available at SSRN: https://ssrn.com/abstract=2826674, 2016.

[20] M. Vasek and T. Moore, “There’s no free lunch, even using Bitcoin: Tracking the popularity and profits of virtual currency scams,” in International Conference on Financial Cryptography and Data Security. Springer, 2015, pp. 44–61.

[21] M. Bartoletti, S. Carta, T. Cimoli, and R. Saia, “Dissecting ponzi schemes on ethereum: identification, analysis, and impact,” arXiv preprint arXiv:1703.03779, 2017.

[22] W. Chen, Z. Zheng, J. Cui, E. Ngai, P. Zheng, and Y. Zhou, “Detecting ponzi schemes on ethereum: Towards healthier blockchain technology,” in Proceedings of the 2018 World Wide Web Conference on World Wide Web. ACM, 2018, pp. 1409–1418.

[23] M. Moser, R. Bohme, and D. Breuker, “An inquiry into money laundering tools in the bitcoin ecosystem,” in eCrime Researchers Summit (eCRS), 2013. IEEE, 2013, pp. 1–14.

[24] T. Chen, Y. Zhu, Z. Li, J. Chen, X. Li, X. Luo, X. Lin, and X. Zhange, “Understanding ethereum via graph analysis,” in IEEE INFOCOM 2018-IEEE Conference on Computer Communications. IEEE, 2018, pp. 1484–1492.

[25] W. Chen and Z. Zheng, “Blockchain data analysis: A review of status, trends and challenges,” Journal of Computer Research and Development, vol. 55, no. 9, pp. 1853–1870, 2018.


[11] C S Gillespie ldquoFitting heavy tailed distributions The poweRlawpackagerdquo Journal of Statistical Software vol 64 no 2 pp 1ndash162015 [Online] Available httpwwwjstatsoftorgv64i02

[12] Z Zheng S Xie H Dai X Chen and H Wang ldquoBlockchain challengesand opportunities a surveyrdquo International Journal of Web and GridServices vol 14 pp 352ndash375 2018

[13] mdashmdash ldquoAn overview of blockchain technology Architecture consensusand future trendsrdquo in 2017 IEEE International Congress on Big Data(BigData Congress) IEEE 2017 pp 557ndash564

[14] I Georgoula D Pournarakis C Bilanakos D Sotiropoulosand M G Giaglis ldquoUsing time-series and sentiment analy-sis to detect the determinants of bitcoin pricesrdquo Available atSSRNhttpsssrncomabstract=2607167 2015

[15] P Ciaian M Rajcaniova and d Kancs ldquoThe economics of bitcoin priceformationrdquo Applied Economics vol 48 no 19 pp 1799ndash1815 2016

[16] J Bouoiyour R Selmi et al ldquoWhat does bitcoin look likerdquo Annals ofEconomics and Finance vol 16 no 2 pp 449ndash492 2015

[17] F Reid and M Harrigan ldquoAn analysis of anonymity in the bitcoinsystemrdquo in Security and Privacy in Social Networks Springer 2013pp 197ndash223

[18] E Androulaki G O Karame M Roeschlin T Scherer and S CapkunldquoEvaluating user privacy in bitcoinrdquo in International Conference onFinancial Cryptography and Data Security Springer 2013 pp 34ndash51

[19] S Athey I Parashkevov V Sarukkai and J Xia ldquoBitcoin pric-ing adoption and usage Theory and evidencerdquo Available atSSRNhttpsssrncomabstract=2826674 2016

[20] M Vasek and T Moore ldquoTherersquos no free lunch even using BitcoinTracking the popularity and profits of virtual currency scamsrdquo inInternational Conference on Financial Cryptography and Data SecuritySpringer 2015 pp 44ndash61

[21] M Bartoletti S Carta T Cimoli and R Saia ldquoDissecting ponzischemes on ethereum identification analysis and impactrdquo arXivpreprint arXiv170303779 2017

[22] W Chen Z Zheng J Cui E Ngai P Zheng and Y Zhou ldquoDetectingponzi schemes on ethereum Towards healthier blockchain technologyrdquoin Proceedings of the 2018 World Wide Web Conference on World WideWeb ACM 2018 pp 1409ndash1418

[23] M Moser R Bohme and D Breuker ldquoAn inquiry into money laun-dering tools in the bitcoin ecosystemrdquo in eCrime Researchers Summit(eCRS) 2013 IEEE 2013 pp 1ndash14

[24] T Chen Y Zhu Z Li J Chen X Li X Luo X Lin and X ZhangeldquoUnderstanding ethereum via graph analysisrdquo in IEEE INFOCOM 2018-IEEE Conference on Computer Communications IEEE 2018 pp1484ndash1492

[25] W Chen and Z Zheng ldquoBlockchain data analysis A review of statustrends and challengesrdquo Journal of Computer Research and Developmentvol 55 no 9 pp 1853ndash1870 2018

  • I Introduction
  • II Data Set
  • III Static Network Analysis
    • III-A Account Classification
    • III-B Graph Construction
    • III-C Graph Analysis
      • IV Temporal Network Analysis
        • IV-A Extract Base Networks
        • IV-B Detecting Graph Structural Changes
        • IV-C Abnormal transaction patterns
          • V Related Work
          • VI Conclusion and Future Work
          • References
Page 7: Cryptocurrency Market Cap, Live Prices, Charts & Ticker - Market … - Evidence... · 2020. 6. 14. · Abstract—The cryptocurrency market is a very huge market without effective

[Figure 6: panels (a) EHG, (b) ELG, (c) NMG; x-axis: Date (2012-12 to 2013-12); y-axis: Exchange price; curves: fit from 10 base networks and exchange price.]

Fig. 6. Approximating the log-transformed Bitcoin price with the linear combination of the selected base networks of EHG, ELG, and NMG.

[Figure 7: panels (a) EHG, (b) ELG, (c) NMG; x-axis: Date (2012-12 to 2013-12); y-axis: Left-singular vector weight; curves: u_1(t), u_2(t), u_3(t), u_4(t).]

Fig. 7. The time-varying contribution u_i(t) of the first four base networks.

To show the structural variation of the networks, we draw the time-varying contribution u_i(t) of the first four base networks in Fig. 7. In most cases, u_i(t) exhibits a few abrupt changes, partitioning the transaction history into separate time periods. The most notable abrupt changes occur in December 2012, when the Bitcoin exchange price was very smooth, and in November 2013, when the price skyrocketed. During both periods, the effects of the first four base networks of EHG and ELG are significant; however, the base networks of NMG have no distinct effect during the smooth period and show an effect on only a few days during the skyrocketing period.
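The decomposition behind u_i(t) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the daily graphs are flattened into a matrix with one row per day and one column per edge, so that the left-singular vectors are indexed by time. The function names and the threshold rule used to flag abrupt changes are hypothetical.

```python
import numpy as np


def base_networks(daily_weights, k=4):
    """Split a (days x edges) matrix into k base networks via truncated SVD.

    Assumption: daily_weights[t, e] is the transaction count on edge e on
    day t. Returns (u, s, v) where u[:, i] is the time-varying contribution
    u_i(t), s[i] is the singular value, and v[i] holds the edge weights of
    base network i.
    """
    x = np.asarray(daily_weights, dtype=float)
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return u[:, :k], s[:k], vt[:k]


def abrupt_changes(u_i, n_sigma=3.0):
    """Days on which u_i(t) jumps by more than n_sigma standard deviations
    of its day-over-day differences (a hypothetical threshold rule)."""
    step = np.diff(np.asarray(u_i, dtype=float))
    return np.flatnonzero(np.abs(step) > n_sigma * step.std()) + 1


# Toy series (made-up data): edge 0 is active for 10 days, then edge 1 takes over.
toy = np.zeros((20, 3))
toy[:10, 0] = 5.0
toy[10:, 1] = 2.0
u, s, v = base_networks(toy, k=2)
print(abrupt_changes(u[:, 0]))  # day indices where the first contribution jumps
```

In this toy series the first base network is concentrated on edge 0, so its contribution drops to zero when activity moves to edge 1, which is the kind of jump that partitions the history into periods.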

C. Abnormal transaction patterns

As discussed above, the transactions between abnormal users have a strong correlation with the Bitcoin exchange price. A natural question is which edges (i.e., transactions), and thus which accounts, are the most influential, and whether the transactions show certain patterns during the period. To this end, based on the 10 extracted base networks, we further extract the top-10 ranking edges (by the absolute value of their weights) in each base network. In EHG, we find only 44 distinct edges instead of the maximum possible 100, involving a total of 28 accounts. In ELG, 57 edges and 46 accounts were found. We call these the core abnormal accounts.
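The edge selection described above can be sketched in a few lines. The sketch assumes each base network is a weight vector aligned with a list of (seller, buyer) edges; the function names, weights, and account ids are hypothetical. Because the selected edges are collected in a set, an edge that ranks highly in several base networks is counted once, which is why the number of distinct edges can fall well below 10 × 10.

```python
def core_edges(base_nets, edge_list, top=10):
    """Union of the `top` largest-|weight| edges over all base networks.

    base_nets: one weight vector per base network, aligned with edge_list.
    edge_list: (seller, buyer) account pairs, one per vector entry.
    """
    core = set()
    for weights in base_nets:
        ranked = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))
        core.update(edge_list[i] for i in ranked[:top])
    return core


def core_accounts(edges):
    """All accounts that appear as an endpoint of a core edge."""
    return {account for edge in edges for account in edge}


# Toy input (made-up weights and account ids), keeping 2 edges per network.
edge_list = [(1, 1), (1, 2), (3, 1), (4, 5)]
base_nets = [[0.9, -0.1, 0.0, 0.5], [-0.8, 0.05, 0.6, 0.0]]
edges = core_edges(base_nets, edge_list, top=2)
print(len(edges), sorted(core_accounts(edges)))
```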

To identify special transaction patterns, we draw the daily subgraph of the core abnormal accounts. We find that there are many abnormal transaction patterns (i.e., market manipulation patterns) in the networks. To save space, we show only six typical patterns in Fig. 8. These subgraphs are all extracted from ELG on different days. For clarity, we fix the layout of the graphs (i.e., the position of each account is the same in every graph) and mark the special patterns in red. The width of a line and the number on the right-hand side of each directed edge both indicate the number of transactions between the two accounts. We briefly explain the six patterns as follows.

• Self-Loop: A pattern in which an account makes transactions with itself. Figure 8a shows the subgraph on 2013/02/07, when account 231 made 749 transactions with itself. Self-loops are restricted for normal accounts in any exchange, as they make no sense. Thus, a reasonable explanation for the self-loop pattern is that the account may belong to the exchange and may be used to increase the daily transaction volume or to manipulate the price.

• Unidirection: The unidirectional pattern indicates more than one transaction from account A to account B. Figure 8b shows a unidirectional pattern on 2013/08/15, when account 527332 made 322 sell transactions to account 231. It is possible for an account to sell Bitcoin to another account more than once; however, it is almost impossible for two normal accounts to interact so many times on the same day.

• Bi-direction: The bi-directional pattern, in which two accounts interact with each other many times, is a typical market manipulation behavior, especially when the two accounts are controlled by the same user. Figure 8c shows the bi-directional pattern on 2013/04/14, when account 144834 interacted with account 231 more than 150 times.

• Triangle: The triangle pattern indicates a triangle-like structure among three accounts. It can take various forms depending on the direction of the edges. Figure 8d shows a special form of the triangle pattern on 2013/10/25. It is special because the accounts form a loop through transactions (account 282004 → 71885 → 490089 → 282004).

• Polygon: The polygon pattern is a more complicated transaction pattern in which many accounts form a polygon-like group and each edge carries more than one transaction. Figure 8e shows a quadrangle pattern on 2013/09/19; it seems that account 282004 sends Bitcoin to account 527332 through the “bridge accounts” 488195 and 231 in more than two hundred transactions.

• Star: A star pattern has a core account that buys Bitcoin from or sells Bitcoin to many accounts. Figure 8f shows a typical star, in which account 282004 sells Bitcoin to accounts 488195, 490089, 527332, and 231.

Fig. 8. Some typical abnormal transaction patterns: (a) Self-Loop, (b) Unidirection, (c) Bidirection, (d) Triangle, (e) Polygon, (f) Star.
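Several of the patterns listed above can be flagged mechanically from one day's edge counts. The following is a minimal sketch under stated assumptions, not the procedure used in the paper: the input maps (seller, buyer) pairs to that day's transaction count, account ids are integers, and the function name and both thresholds are hypothetical. It covers self-loops, unidirectional and bi-directional edges, directed triangle loops, and stars on the selling side; longer polygons are left out.

```python
def find_patterns(daily_counts, min_count=2, star_degree=3):
    """Flag simple structural patterns in one day's transaction subgraph.

    daily_counts maps (seller, buyer) to that day's number of transactions.
    Only edges with at least min_count transactions are considered.
    """
    heavy = {edge for edge, n in daily_counts.items() if n >= min_count}
    sells_to = {}
    for a, b in heavy:
        if a != b:
            sells_to.setdefault(a, set()).add(b)

    return {
        "self_loop": sorted(a for a, b in heavy if a == b),
        "unidirection": sorted(
            (a, b) for a, b in heavy if a != b and (b, a) not in heavy
        ),
        "bi_direction": sorted(
            (a, b) for a, b in heavy if a < b and (b, a) in heavy
        ),
        # Directed 3-cycles a -> b -> c -> a, listed once from the smallest id.
        "triangle_loop": sorted(
            (a, b, c)
            for a, targets in sells_to.items()
            for b in targets
            for c in sells_to.get(b, ())
            if a < b and a < c and a in sells_to.get(c, ())
        ),
        # Core accounts that sell to at least star_degree distinct accounts.
        "star": sorted(a for a, t in sells_to.items() if len(t) >= star_degree),
    }


# Toy day (made-up ids and counts): one instance of each pattern.
toy_day = {
    (1, 1): 5,
    (2, 3): 4, (3, 2): 3,
    (4, 5): 2, (5, 6): 2, (6, 4): 2,
    (7, 8): 9, (7, 9): 9, (7, 10): 9,
    (11, 12): 1,
}
print(find_patterns(toy_day))
```

The `min_count` filter reflects the argument in the text that a single transaction between two accounts is unremarkable, while many transactions on the same day are not.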

Generally speaking, it is not surprising for a transaction network to form a special structure, as transactions are random. However, in our case, this cannot be explained by chance, as each edge represents far more than one transaction in a single day. Thus, it seems quite possible that these accounts are controlled by a certain group and that these transactions serve special purposes.

Based on the results, we summarize the findings as follows.

• Finding 4: The daily fluctuations of the selected base networks of EHG and ELG have a strong correlation with the Bitcoin exchange price. In contrast, the daily fluctuations of the base networks of NMG have no correlation with the Bitcoin exchange price. This finding indicates that the transaction behavior of the abnormal accounts affects the fluctuation of the Bitcoin exchange price.

• Finding 5: The trend of the Bitcoin exchange price can be captured by the selected base networks of EHG and ELG. This means that the trend of the price can be predicted from the transactions between abnormal accounts.

• Finding 6: There are many unusual transaction patterns (e.g., self-loop, bi-direction, star) between abnormal accounts. These patterns imply that these accounts are controlled by the same group and are strong evidence of price manipulation.
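One way to read Findings 4 and 5 is as a regression of the log-transformed price on the time-varying contributions of the selected base networks. The sketch below is illustrative only: the ordinary least-squares fit, the intercept term, and the function name are assumptions, since the exact fitting procedure is not restated here, and the data are synthetic.

```python
import numpy as np


def fit_log_price(u, price):
    """Least-squares fit of log(price) on the columns of u plus an intercept.

    u: (days x k) array of contributions u_i(t) of k base networks.
    price: daily exchange price, one positive value per day.
    Returns the coefficients, the fitted log price, and their correlation
    with the observed log price.
    """
    u = np.asarray(u, dtype=float)
    design = np.column_stack([np.ones(len(u)), u])
    target = np.log(np.asarray(price, dtype=float))
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    fitted = design @ coef
    corr = np.corrcoef(fitted, target)[0, 1]
    return coef, fitted, corr


# Synthetic check: a price built from known weights should be recovered.
rng = np.random.default_rng(0)
u_demo = rng.normal(size=(50, 3))
price_demo = np.exp(1.0 + u_demo @ np.array([0.5, -0.2, 0.1]))
coef, fitted, corr = fit_log_price(u_demo, price_demo)
print(np.round(coef, 3), round(float(corr), 3))
```

The returned correlation is an in-sample measure of how closely the linear combination tracks the log price.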

V. RELATED WORK

Blockchain is a new technology that has many research directions and attracts the interest of researchers from various fields [12], [13]. Our research is related to previous work in two areas. The first related area is the study of the large fluctuations of the Bitcoin price. As mentioned above, many driving factors of the price have been found. Because all the related data are time series, the most commonly used methods in these analyses are time-series models, such as the vector space model [14], the vector error-correction model [15], the ARDL bounds testing method [16], wavelet analysis [5], and vector autoregression [15].

Another related area is the study of blockchain data (i.e., the transaction ledger) for different topics. Because blockchain data are publicly accessible and users are anonymous in the system, a common topic is mining the blockchain data to reveal users’ privacy [17]–[19]. Because it is relatively lawless, blockchain has become an area full of various scams. Thus, mining the blockchain data to detect scams is also a critical topic. Recently, there have been many studies on this topic, such as Bitcoin-based scams [20], smart-contract-based Ponzi schemes [21], [22], money laundering [23], and attacks [24]. See [25] for a full survey of this topic.

VI. CONCLUSION AND FUTURE WORK

We conduct a systematic study of the leaked Mt. Gox transaction data through graph analysis. By comparing the transaction prices in the data with the disclosed daily price, many abnormal transactions were identified and used to divide the accounts into three categories. Based on this classification, we construct three graphs (i.e., EHG, ELG, and NMG) and obtain many findings by analyzing these graphs with various metrics. These findings convinced us that there are many market manipulation behaviors in the exchange. To reveal the relationship between these behaviors and the Bitcoin price, the graphs are reconstructed into daily graph series and reshaped into matrices. By applying SVD to the matrices, some very important base networks are identified. By inspecting the base networks, we find that the daily variation of the abnormal base networks is closely related to the Bitcoin price, and we find many market manipulation patterns. Based on these findings, and considering that Bitcoin is dominant in the market, we propose strengthening supervision in this market. In the future, we will conduct a more thorough study of the data to reveal the extent to which the market is affected and to discuss the changes in investor behavior under extreme price fluctuation.

ACKNOWLEDGMENT

The work described in this paper was supported by the National Key Research and Development Program (2016YFB1000101), the National Natural Science Foundation of China (61722214, 11801595), the Pearl River S&T Nova Program of Guangzhou (201710010046), and the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2016ZT06D211).

REFERENCES

[1] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” 2008. [Online]. Available: https://bitcoin.org/bitcoin.pdf

[2] C. Burniske and A. White, “Bitcoin: Ringing the bell for a new asset class,” 2017. [Online]. Available: https://research.ark-invest.com/hubfs/1_Download_Files_ARK-Invest/White_Papers/Bitcoin-Ringing-The-Bell-For-A-New-Asset-Class.pdf

[3] A. Feder, N. Gandal, J. Hamrick, and T. Moore, “The impact of ddos and other security shocks on bitcoin currency exchanges: Evidence from mt. gox,” Journal of Cybersecurity, vol. 3, no. 2, pp. 137–144, 2018.

[4] M. Buchholz, J. Delaney, J. Warren, and J. Parker, “Bits and bets, information, price volatility, and demand for bitcoin,” Economics, vol. 312, 2012.

[5] L. Kristoufek, “What are the main drivers of the bitcoin price? evidence from wavelet coherence analysis,” PloS one, vol. 10, no. 4, p. e0123923, 2015.

[6] L. Kristoufek, “Bitcoin meets google trends and wikipedia: Quantifying the relationship between phenomena of the internet era,” Scientific Reports, vol. 3, p. 3415, 2013.

[7] D. Kondor, I. Csabai, J. Szule, M. Posfai, and G. Vattay, “Inferring the interplay between network structure and market effects in bitcoin,” New Journal of Physics, vol. 16, no. 12, p. 125003, 2014.

[8] N. Gandal, J. Hamrick, T. Moore, and T. Oberman, “Price manipulation in the bitcoin ecosystem,” Journal of Monetary Economics, vol. 95, pp. 86–96, 2018.

[9] S. Ribes, “Chasing fake volume: a crypto-plague,” Mar. 2018. [Online]. Available: https://medium.com/@sylvainartplayribes/chasing-fake-volume-a-crypto-plague-ea1a3c1e0b5e

[10] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2016. [Online]. Available: https://www.R-project.org/

[11] C. S. Gillespie, “Fitting heavy tailed distributions: The poweRlaw package,” Journal of Statistical Software, vol. 64, no. 2, pp. 1–16, 2015. [Online]. Available: http://www.jstatsoft.org/v64/i02/

[12] Z. Zheng, S. Xie, H. Dai, X. Chen, and H. Wang, “Blockchain challenges and opportunities: a survey,” International Journal of Web and Grid Services, vol. 14, pp. 352–375, 2018.

[13] Z. Zheng, S. Xie, H. Dai, X. Chen, and H. Wang, “An overview of blockchain technology: Architecture, consensus, and future trends,” in 2017 IEEE International Congress on Big Data (BigData Congress). IEEE, 2017, pp. 557–564.

[14] I. Georgoula, D. Pournarakis, C. Bilanakos, D. Sotiropoulos, and M. G. Giaglis, “Using time-series and sentiment analysis to detect the determinants of bitcoin prices,” Available at SSRN: https://ssrn.com/abstract=2607167, 2015.

[15] P. Ciaian, M. Rajcaniova, and d. Kancs, “The economics of bitcoin price formation,” Applied Economics, vol. 48, no. 19, pp. 1799–1815, 2016.

[16] J. Bouoiyour, R. Selmi et al., “What does bitcoin look like?” Annals of Economics and Finance, vol. 16, no. 2, pp. 449–492, 2015.

[17] F. Reid and M. Harrigan, “An analysis of anonymity in the bitcoin system,” in Security and Privacy in Social Networks. Springer, 2013, pp. 197–223.

[18] E. Androulaki, G. O. Karame, M. Roeschlin, T. Scherer, and S. Capkun, “Evaluating user privacy in bitcoin,” in International Conference on Financial Cryptography and Data Security. Springer, 2013, pp. 34–51.

[19] S. Athey, I. Parashkevov, V. Sarukkai, and J. Xia, “Bitcoin pricing, adoption, and usage: Theory and evidence,” Available at SSRN: https://ssrn.com/abstract=2826674, 2016.

[20] M. Vasek and T. Moore, “There’s no free lunch, even using Bitcoin: Tracking the popularity and profits of virtual currency scams,” in International Conference on Financial Cryptography and Data Security. Springer, 2015, pp. 44–61.

[21] M. Bartoletti, S. Carta, T. Cimoli, and R. Saia, “Dissecting ponzi schemes on ethereum: identification, analysis, and impact,” arXiv preprint arXiv:1703.03779, 2017.

[22] W. Chen, Z. Zheng, J. Cui, E. Ngai, P. Zheng, and Y. Zhou, “Detecting ponzi schemes on ethereum: Towards healthier blockchain technology,” in Proceedings of the 2018 World Wide Web Conference on World Wide Web. ACM, 2018, pp. 1409–1418.

[23] M. Moser, R. Bohme, and D. Breuker, “An inquiry into money laundering tools in the bitcoin ecosystem,” in eCrime Researchers Summit (eCRS), 2013. IEEE, 2013, pp. 1–14.

[24] T. Chen, Y. Zhu, Z. Li, J. Chen, X. Li, X. Luo, X. Lin, and X. Zhange, “Understanding ethereum via graph analysis,” in IEEE INFOCOM 2018-IEEE Conference on Computer Communications. IEEE, 2018, pp. 1484–1492.

[25] W. Chen and Z. Zheng, “Blockchain data analysis: A review of status, trends and challenges,” Journal of Computer Research and Development, vol. 55, no. 9, pp. 1853–1870, 2018.

  • I Introduction
  • II Data Set
  • III Static Network Analysis
    • III-A Account Classification
    • III-B Graph Construction
    • III-C Graph Analysis
      • IV Temporal Network Analysis
        • IV-A Extract Base Networks
        • IV-B Detecting Graph Structural Changes
        • IV-C Abnormal transaction patterns
          • V Related Work
          • VI Conclusion and Future Work
          • References
Page 8: Cryptocurrency Market Cap, Live Prices, Charts & Ticker - Market … - Evidence... · 2020. 6. 14. · Abstract—The cryptocurrency market is a very huge market without effective

(a) Self-Loop (b) Unidirection (c) Bidirection

(d) Triangle (e) Polygon (f) Star

Fig 8 Some typical abnormal transaction patterns

interact with each other many times Figure 8c shows thebi-direction pattern on 20130414 where account 144834interact with account 231 for more than 150 times

bull Triangle The triangle pattern indicates a triangle-likestructure between three accounts It may contain vari-ous forms when considering the direction of the edgeFigure 8d shows a special form of triangle pattern on20131025 It is special because the accounts form a loopthrough transactions (account 282004rarr71885rarr490089rarr282004)

bull Polygon Polygon pattern is a more complicated trans-action pattern where many accounts form a polygon-likegroup with each edge has more than one transactionsFigure 8e shows a quadrangle pattern on 20130919it seems that account 282004 sends Bitcoin to account527332 through the ldquobridge accountsrdquo 488195 and 231for more than two hundred transactions

bull Star A star pattern has a core account that buys orsells Bitcoin to many accounts Figure 8f shows a typicalstar where the account 282004 sell Bitcoin to accounts488195 490089 527332 and 231

Generally speaking it is not surprising for a transactionnetwork to form a special structure as transactions are randomHowever in our case it is impossible as each edge representsfar more than one transaction in a single day Thus it seemsquite possible that these accounts are controlled by a certain

group and these transactions have special purposesBased on the results we summarize the findings as followsbull Finding 4 The daily fluctuations of the selected base

networks of EHG and ELG have a strong correlationwith the Bitcoin exchange price On the contrary thedaily fluctuation of the base networks of NMG has nocorrelation with the Bitcoin exchange price This findingindicates that the behavior of the abnormal accountsrsquotransaction affects the fluctuation of Bitcoin exchangeprice

bull Finding 5 The trend of the Bitcoin exchange price canbe captured by the selected base networks of EHG andELG It means that the trend of the price can be predictedby transactions between abnormal accounts

bull Finding 6 There are many unusual transaction patterns(eg self-loop bi-direction star) between abnormal ac-counts These patterns imply that these accounts arecontrolled by the same group and are strong evidenceof price manipulation

V RELATED WORK

Blockchain technology is a new technology which hasmany research directions and attracts the interest of researchersfrom various fields [12] [13] Our research is related toprevious work in two areas The first related area is thestudy of understanding the big fluctuation of Bitcoin price As

aforementioned many driving factors of the price are foundDue to all the related data are time series the most usedmethod in the analysis is time series based model such asvector space model [14] vector error-correction model [15]ARDL bounds testing method [16] wavelet analysis [5] andvector autoregressive [15]

Another related area is the study of the blockchain data (iethe transaction ledger) for different topics Due to the publiclyaccessible of the blockchain data and users are anonymous inthe system a common topic is to mine the blockchain datato reveal usersrsquo privacy [17]ndash[19] Because of the relativelylawless blockchain has become an area full of various scamsThus mining the blockchain data to detect scams is also acritical topic Recently there are many studies on this topicsuch as Bitcoin-based scams [20] the smart contract basedPonzi schemes [21] [22] money laundry [23] attacks [24]See [25] for a full survey of this topic

VI CONCLUSION AND FUTURE WORK

We conduct a systematic study to analyze the leaked MtGox transaction data through graph analysis By comparingthe transaction price of the transaction data with the discloseddaily price many abnormal transactions were identified andwere used to divide the accounts into three categories Basedon this classification we construct three graphs (ie EHGELG and NMG) and obtain many findings by analyzing thesegraphs through various metrics These findings convinced usthat there are many market manipulation behaviors in theexchange In order to reveal the relationship between thesebehaviors and the Bitcoin price the graphs are reconstructedinto daily graph series and reshaped into matrices Throughadopting SVD to the matrices some very important base net-works are identified By inspecting the base networks we findthat the daily variation of the abnormal base networks closelyrelated to the Bitcoin price and many market manipulationpatterns Based on these findings and considering Bitcoin isdominant in the market we propose to strengthen supervisionin this market In the future we will conduct a more thoroughstudy of the data to reveal the extent to which the market isaffected and to discuss the changes in the behavior of investorsunder the extreme fluctuation price

ACKNOWLEDGMENT

The work described in this paper was supported bythe National Key Research and Development Program(2016YFB1000101)the National Natural Science Foundationof China (6172221411801595) the Pearl River SampT NovaProgram of Guangzhou (201710010046) and the Program forGuangdong Introducing Innovative and Entrepreneurial Teams(2016ZT06D211)

REFERENCES

[1] S Nakamoto ldquoBitcoin A peer-to-peer electronic cash systemrdquo 2008[Online] Available httpsbitcoinorgbitcoinpdf

[2] C Burniske and A White ldquoBitcoin Ringing the bellfor a new asset classrdquo 2017 [Online] Availablehttpsresearchark-investcomhubfs1 Download Files ARK-InvestWhite PapersBitcoin-Ringing-The-Bell-For-A-New-Asset-Classpdf

[3] A Feder N Gandal J Hamrick and T Moore ldquoThe impact of ddosand other security shocks on bitcoin currency exchanges Evidence frommt goxrdquo Journal of Cybersecurity vol 3 no 2 pp 137ndash144 2018

[4] M Buchholz J Delaney J Warren and J Parker ldquoBits and betsinformation price volatility and demand for bitcoinrdquo Economics vol312 2012

[5] L Kristoufek ldquoWhat are the main drivers of the bitcoin price evidencefrom wavelet coherence analysisrdquo PloS one vol 10 no 4 p e01239232015

[6] mdashmdash ldquoBitcoin meets google trends and wikipedia Quantifying therelationship between phenomena of the internet erardquo Scientific Reportsvol 3 p 3415 2013

[7] D Kondor I Csabai J Szule M Posfai and G Vattay ldquoInferring theinterplay between network structure and market effects in bitcoinrdquo NewJournal of Physics vol 16 no 12 p 125003 2014

[8] N Gandal J Hamrick T Moore and T Oberman ldquoPrice manipulationin the bitcoin ecosystemrdquo Journal of Monetary Economics vol 95 pp86ndash96 2018

[9] S Ribes ldquoChasing fake volume a crypto-plaguerdquo Mar2018 [Online] Available httpsmediumcomsylvainartplayribeschasing-fake-volume-a-crypto-plague-ea1a3c1e0b5e

[10] R Core Team R A Language and Environment for StatisticalComputing R Foundation for Statistical Computing Vienna Austria2016 [Online] Available httpswwwR-projectorg

[11] C S Gillespie ldquoFitting heavy tailed distributions The poweRlawpackagerdquo Journal of Statistical Software vol 64 no 2 pp 1ndash162015 [Online] Available httpwwwjstatsoftorgv64i02

[12] Z Zheng S Xie H Dai X Chen and H Wang ldquoBlockchain challengesand opportunities a surveyrdquo International Journal of Web and GridServices vol 14 pp 352ndash375 2018

[13] mdashmdash ldquoAn overview of blockchain technology Architecture consensusand future trendsrdquo in 2017 IEEE International Congress on Big Data(BigData Congress) IEEE 2017 pp 557ndash564

[14] I Georgoula D Pournarakis C Bilanakos D Sotiropoulosand M G Giaglis ldquoUsing time-series and sentiment analy-sis to detect the determinants of bitcoin pricesrdquo Available atSSRNhttpsssrncomabstract=2607167 2015

[15] P Ciaian M Rajcaniova and d Kancs ldquoThe economics of bitcoin priceformationrdquo Applied Economics vol 48 no 19 pp 1799ndash1815 2016

[16] J Bouoiyour R Selmi et al ldquoWhat does bitcoin look likerdquo Annals ofEconomics and Finance vol 16 no 2 pp 449ndash492 2015

[17] F Reid and M Harrigan ldquoAn analysis of anonymity in the bitcoinsystemrdquo in Security and Privacy in Social Networks Springer 2013pp 197ndash223

[18] E Androulaki G O Karame M Roeschlin T Scherer and S CapkunldquoEvaluating user privacy in bitcoinrdquo in International Conference onFinancial Cryptography and Data Security Springer 2013 pp 34ndash51

[19] S Athey I Parashkevov V Sarukkai and J Xia ldquoBitcoin pric-ing adoption and usage Theory and evidencerdquo Available atSSRNhttpsssrncomabstract=2826674 2016

[20] M Vasek and T Moore ldquoTherersquos no free lunch even using BitcoinTracking the popularity and profits of virtual currency scamsrdquo inInternational Conference on Financial Cryptography and Data SecuritySpringer 2015 pp 44ndash61

[21] M Bartoletti S Carta T Cimoli and R Saia ldquoDissecting ponzischemes on ethereum identification analysis and impactrdquo arXivpreprint arXiv170303779 2017

[22] W Chen Z Zheng J Cui E Ngai P Zheng and Y Zhou ldquoDetectingponzi schemes on ethereum Towards healthier blockchain technologyrdquoin Proceedings of the 2018 World Wide Web Conference on World WideWeb ACM 2018 pp 1409ndash1418

[23] M Moser R Bohme and D Breuker ldquoAn inquiry into money laun-dering tools in the bitcoin ecosystemrdquo in eCrime Researchers Summit(eCRS) 2013 IEEE 2013 pp 1ndash14

[24] T Chen Y Zhu Z Li J Chen X Li X Luo X Lin and X ZhangeldquoUnderstanding ethereum via graph analysisrdquo in IEEE INFOCOM 2018-IEEE Conference on Computer Communications IEEE 2018 pp1484ndash1492

[25] W Chen and Z Zheng ldquoBlockchain data analysis A review of statustrends and challengesrdquo Journal of Computer Research and Developmentvol 55 no 9 pp 1853ndash1870 2018

  • I Introduction
  • II Data Set
  • III Static Network Analysis
    • III-A Account Classification
    • III-B Graph Construction
    • III-C Graph Analysis
      • IV Temporal Network Analysis
        • IV-A Extract Base Networks
        • IV-B Detecting Graph Structural Changes
        • IV-C Abnormal transaction patterns
          • V Related Work
          • VI Conclusion and Future Work
          • References
Page 9: Cryptocurrency Market Cap, Live Prices, Charts & Ticker - Market … - Evidence... · 2020. 6. 14. · Abstract—The cryptocurrency market is a very huge market without effective

aforementioned many driving factors of the price are foundDue to all the related data are time series the most usedmethod in the analysis is time series based model such asvector space model [14] vector error-correction model [15]ARDL bounds testing method [16] wavelet analysis [5] andvector autoregressive [15]

Another related area is the study of the blockchain data (iethe transaction ledger) for different topics Due to the publiclyaccessible of the blockchain data and users are anonymous inthe system a common topic is to mine the blockchain datato reveal usersrsquo privacy [17]ndash[19] Because of the relativelylawless blockchain has become an area full of various scamsThus mining the blockchain data to detect scams is also acritical topic Recently there are many studies on this topicsuch as Bitcoin-based scams [20] the smart contract basedPonzi schemes [21] [22] money laundry [23] attacks [24]See [25] for a full survey of this topic

VI CONCLUSION AND FUTURE WORK

We conduct a systematic study to analyze the leaked MtGox transaction data through graph analysis By comparingthe transaction price of the transaction data with the discloseddaily price many abnormal transactions were identified andwere used to divide the accounts into three categories Basedon this classification we construct three graphs (ie EHGELG and NMG) and obtain many findings by analyzing thesegraphs through various metrics These findings convinced usthat there are many market manipulation behaviors in theexchange In order to reveal the relationship between thesebehaviors and the Bitcoin price the graphs are reconstructedinto daily graph series and reshaped into matrices Throughadopting SVD to the matrices some very important base net-works are identified By inspecting the base networks we findthat the daily variation of the abnormal base networks closelyrelated to the Bitcoin price and many market manipulationpatterns Based on these findings and considering Bitcoin isdominant in the market we propose to strengthen supervisionin this market In the future we will conduct a more thoroughstudy of the data to reveal the extent to which the market isaffected and to discuss the changes in the behavior of investorsunder the extreme fluctuation price

ACKNOWLEDGMENT

The work described in this paper was supported by the National Key Research and Development Program (2016YFB1000101), the National Natural Science Foundation of China (61722214, 11801595), the Pearl River S&T Nova Program of Guangzhou (201710010046), and the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2016ZT06D211).

REFERENCES

[1] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” 2008. [Online]. Available: https://bitcoin.org/bitcoin.pdf

[2] C. Burniske and A. White, “Bitcoin: Ringing the bell for a new asset class,” 2017. [Online]. Available: https://research.ark-invest.com/hubfs/1_Download_Files_ARK-Invest/White_Papers/Bitcoin-Ringing-The-Bell-For-A-New-Asset-Class.pdf

[3] A. Feder, N. Gandal, J. Hamrick, and T. Moore, “The impact of DDoS and other security shocks on Bitcoin currency exchanges: Evidence from Mt. Gox,” Journal of Cybersecurity, vol. 3, no. 2, pp. 137–144, 2018.

[4] M. Buchholz, J. Delaney, J. Warren, and J. Parker, “Bits and bets, information, price volatility, and demand for bitcoin,” Economics, vol. 312, 2012.

[5] L. Kristoufek, “What are the main drivers of the bitcoin price? Evidence from wavelet coherence analysis,” PloS one, vol. 10, no. 4, p. e0123923, 2015.

[6] L. Kristoufek, “Bitcoin meets Google Trends and Wikipedia: Quantifying the relationship between phenomena of the internet era,” Scientific Reports, vol. 3, p. 3415, 2013.

[7] D. Kondor, I. Csabai, J. Szule, M. Posfai, and G. Vattay, “Inferring the interplay between network structure and market effects in bitcoin,” New Journal of Physics, vol. 16, no. 12, p. 125003, 2014.

[8] N. Gandal, J. Hamrick, T. Moore, and T. Oberman, “Price manipulation in the bitcoin ecosystem,” Journal of Monetary Economics, vol. 95, pp. 86–96, 2018.

[9] S. Ribes, “Chasing fake volume: a crypto-plague,” Mar. 2018. [Online]. Available: https://medium.com/@sylvainartplayribes/chasing-fake-volume-a-crypto-plague-ea1a3c1e0b5e

[10] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2016. [Online]. Available: https://www.R-project.org/

[11] C. S. Gillespie, “Fitting heavy tailed distributions: The poweRlaw package,” Journal of Statistical Software, vol. 64, no. 2, pp. 1–16, 2015. [Online]. Available: http://www.jstatsoft.org/v64/i02/

[12] Z. Zheng, S. Xie, H. Dai, X. Chen, and H. Wang, “Blockchain challenges and opportunities: a survey,” International Journal of Web and Grid Services, vol. 14, pp. 352–375, 2018.

[13] Z. Zheng, S. Xie, H. Dai, X. Chen, and H. Wang, “An overview of blockchain technology: Architecture, consensus, and future trends,” in 2017 IEEE International Congress on Big Data (BigData Congress). IEEE, 2017, pp. 557–564.

[14] I. Georgoula, D. Pournarakis, C. Bilanakos, D. Sotiropoulos, and M. G. Giaglis, “Using time-series and sentiment analysis to detect the determinants of bitcoin prices,” Available at SSRN: https://ssrn.com/abstract=2607167, 2015.

[15] P. Ciaian, M. Rajcaniova, and D. Kancs, “The economics of bitcoin price formation,” Applied Economics, vol. 48, no. 19, pp. 1799–1815, 2016.

[16] J. Bouoiyour, R. Selmi et al., “What does bitcoin look like?” Annals of Economics and Finance, vol. 16, no. 2, pp. 449–492, 2015.

[17] F. Reid and M. Harrigan, “An analysis of anonymity in the bitcoin system,” in Security and Privacy in Social Networks. Springer, 2013, pp. 197–223.

[18] E. Androulaki, G. O. Karame, M. Roeschlin, T. Scherer, and S. Capkun, “Evaluating user privacy in bitcoin,” in International Conference on Financial Cryptography and Data Security. Springer, 2013, pp. 34–51.

[19] S. Athey, I. Parashkevov, V. Sarukkai, and J. Xia, “Bitcoin pricing, adoption, and usage: Theory and evidence,” Available at SSRN: https://ssrn.com/abstract=2826674, 2016.

[20] M. Vasek and T. Moore, “There’s no free lunch, even using Bitcoin: Tracking the popularity and profits of virtual currency scams,” in International Conference on Financial Cryptography and Data Security. Springer, 2015, pp. 44–61.

[21] M. Bartoletti, S. Carta, T. Cimoli, and R. Saia, “Dissecting ponzi schemes on ethereum: identification, analysis, and impact,” arXiv preprint arXiv:1703.03779, 2017.

[22] W. Chen, Z. Zheng, J. Cui, E. Ngai, P. Zheng, and Y. Zhou, “Detecting ponzi schemes on ethereum: Towards healthier blockchain technology,” in Proceedings of the 2018 World Wide Web Conference on World Wide Web. ACM, 2018, pp. 1409–1418.

[23] M. Moser, R. Bohme, and D. Breuker, “An inquiry into money laundering tools in the bitcoin ecosystem,” in eCrime Researchers Summit (eCRS), 2013. IEEE, 2013, pp. 1–14.

[24] T. Chen, Y. Zhu, Z. Li, J. Chen, X. Li, X. Luo, X. Lin, and X. Zhang, “Understanding ethereum via graph analysis,” in IEEE INFOCOM 2018-IEEE Conference on Computer Communications. IEEE, 2018, pp. 1484–1492.

[25] W. Chen and Z. Zheng, “Blockchain data analysis: A review of status, trends and challenges,” Journal of Computer Research and Development, vol. 55, no. 9, pp. 1853–1870, 2018.
