A Systematic Study of the Mobile App Ecosystem

Thanasis Petsas, Antonis Papadogiannakis, Evangelos P. Markatos, Michalis Polychronakis, Thomas Karagiannis


Smartphone Adoption Explodes

• Smartphone adoption:
– 10x faster than the 1980s PC revolution
– 2x faster than the 1990s Internet boom
– 3x faster than social networks

• 1.4 B smartphones will be in use by 2013!

Source:

2

Mobile Apps are Getting Popular

• 50B+ downloads, 1M+ apps
• 50B+ downloads, 900K+ apps
• Windows Store: 2B+ downloads, 100K+ apps

3

A Plethora of Marketplaces

• In addition to the official marketplaces...

• Many alternative markets

4

Motivation

• App popularity
– What does the app popularity distribution look like?
– Is it similar to other domains?
  • WWW, P2P, UGC
– Can we model app popularity?

• App pricing
– How does price affect app popularity?
– What is the developers’ income?
– What are the common pricing strategies?

5

Data Collection

[Figure: crawler hosts fetch app stats and APKs from the marketplaces through PlanetLab proxies and store them in a database]

6

Datasets

Appstore         Crawling period   Total apps*   New apps / day   Total downloads*   Daily downloads
SlideMe (free)   5 months          16,578        28.0             96 M               215.7 K
SlideMe (paid)   5 months          5,606         6.5              914 K              5.2 K
1Mobile          4.5 months        156,221       210.4            453 M              651.5 K
AppChina         2 months          55,357        336.0            2,623 M            24.1 M
Anzhi            2 months          60,196        29.6             2,816 M            23.7 M

* Last day of the crawling period

~300K apps in total

Paid apps:
• fewer downloads
• fewer uploads

7

App Popularity: Is There a Pareto Effect?

[Figure: y-axis "Downloads (%) CDF", x-axis "Normalized App Ranking (%)"]

10% of the apps account for 90% of the downloads
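A check like the one on this slide can be sketched in a few lines of Python. The function name `download_share` and the toy download counts are illustrative assumptions, not data from the study.

```python
def download_share(downloads, top_fraction):
    """Fraction of all downloads captured by the top `top_fraction` of apps."""
    ranked = sorted(downloads, reverse=True)
    # Always include at least one app, even for very small fractions.
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / sum(ranked)


# Hypothetical per-app download counts for a ten-app store.
toy_counts = [900, 30, 20, 15, 10, 8, 7, 5, 3, 2]
share = download_share(toy_counts, 0.10)
```

Sweeping `top_fraction` from 0 to 1 would trace out the kind of CDF plotted on the slide.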

8

App Popularity: Is There a Power-law Behavior?

Let’s focus on one appstore

9

App Popularity: Deviations from Zipf

• WWW (INFOCOM ’99)
• P2P (SOSP ’03)
• UGC (IMC ’07)

10

Truncation for small x values: Fetch-at-most-once

• Also observed in P2P workloads
• Users appear to download an application at most once

[Figure panels: P2P (SOSP ’03), simulations]
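The fetch-at-most-once behavior can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: app popularity follows a Zipf law, every user downloads the same number of apps, and a repeated pick is simply redrawn. The function name, parameters, and sizes are hypothetical and are not the authors' simulator.

```python
import random
from collections import Counter
from itertools import accumulate


def simulate_fetch_at_most_once(num_apps, num_users, per_user, exponent, seed=0):
    """Count downloads per app when no user fetches the same app twice."""
    if per_user > num_apps:
        raise ValueError("a user cannot download more distinct apps than exist")
    rng = random.Random(seed)
    apps = list(range(num_apps))  # app i has overall rank i + 1
    # Zipf weights: popularity of rank r is proportional to 1 / r**exponent.
    cum_weights = list(accumulate(1.0 / (r ** exponent) for r in range(1, num_apps + 1)))
    counts = Counter()
    for _ in range(num_users):
        owned = set()
        while len(owned) < per_user:
            app = rng.choices(apps, cum_weights=cum_weights)[0]
            if app not in owned:  # repeat picks are discarded, not counted
                owned.add(app)
                counts[app] += 1
    return counts


counts = simulate_fetch_at_most_once(num_apps=200, num_users=100, per_user=5, exponent=1.0)
```

Because each user contributes at most one download per app, no app can exceed `num_users` downloads, which flattens the head of the curve relative to a pure Zipf draw.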

11

Truncation for large x values: clustering effect

• Other studies attribute this truncation to information filtering
• Our suggestion: the clustering effect

[Figure panel: UGC (IMC ’07)]

12

App Clustering

[Figure: example app clusters: Games, Reader, Social, Tool]

• Apps are grouped into clusters

• App clusters can be formed by
– App categories
– Recommendation systems
– User communities
– Other grouping forces

13

Clustering Hypothesis

• Users tend to download apps from the same clusters

I like Games!

I like Social apps!

14

Validating Clustering Effect in User Downloads

Dataset: 361,282 user comment streams, 60,196 apps in 34 categories

53% of users commented on apps from a single category

94% of users commented on apps from up to 5 categories
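Percentages of this kind can be computed from per-user comment streams as sketched below. The data layout (a dict mapping each user to the categories of the apps they commented on), the function name, and the sample streams are assumptions for illustration only.

```python
def share_within(user_categories, max_categories):
    """Fraction of users whose comments span at most `max_categories` categories."""
    within = sum(
        1 for cats in user_categories.values() if len(set(cats)) <= max_categories
    )
    return within / len(user_categories)


# Hypothetical comment streams: user id -> category of each commented app.
streams = {
    "u1": ["games", "games", "games"],
    "u2": ["social", "games"],
    "u3": ["reader"],
    "u4": ["games", "social", "tool", "reader", "news", "music"],
}
single = share_within(streams, 1)
up_to_five = share_within(streams, 5)
```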

15

User Temporal Affinity

User downloads sequence: a1, a2, a3, a4, a5

Pair 1 (a1, a2): Aff1 = 0
Pair 2 (a2, a3): Aff2 = 1
Pair 3 (a3, a4): Aff3 = 1
Pair 4 (a4, a5): Aff4 = 0
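One way to read the example on this slide: each consecutive pair scores 1 when both apps fall in the same category and 0 otherwise. The sketch below averages those scores over a sequence. The averaging step, the handling of the `depth` parameter, the function name, and the sample categories are assumptions for illustration, not definitions taken from the slides.

```python
def temporal_affinity(categories, depth=1):
    """Share of downloads whose category matches one of the previous `depth` downloads."""
    n = len(categories)
    if n <= depth:
        raise ValueError("sequence must be longer than the depth")
    hits = sum(
        1 for i in range(depth, n) if categories[i] in categories[i - depth:i]
    )
    return hits / (n - depth)


# Hypothetical categories for a1..a5, giving pair scores 0, 1, 1, 0.
sequence = ["games", "social", "social", "social", "tool"]
affinity_depth_1 = temporal_affinity(sequence)

# Alternating categories score 0 at depth 1 but 1 at depth 2.
alternating = ["games", "social", "games", "social", "games"]
affinity_depth_2 = temporal_affinity(alternating, depth=2)
```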

16

Users Exhibit a Strong Temporal Affinity to Categories

[Figure annotations: 0.55, 0.14, 3.9x]

17

Modeling Appstore Workloads

[Figure: apps ordered by popularity (top to bottom) within clusters: Reader, Games, Social, Productivity, ...]

APP-CLUSTERING model

1. Download the 1st app – overall app ranking
2. Download another app
   2.1 with prob. p from a previous app cluster c – cluster app ranking
   2.2 with prob. 1-p – overall app ranking
3. If user’s downloads < d go to 2.
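The three steps of the APP-CLUSTERING model can be sketched as a small simulation. Several details here are assumptions rather than parts of the slides: apps are assigned to clusters round-robin, the "previous app cluster" is the cluster of a uniformly chosen earlier download, repeated picks are redrawn (fetch-at-most-once), and the draw falls back to the overall ranking when a cluster has no apps left for that user. The function and helper names are hypothetical.

```python
import random
from itertools import accumulate


def zipf_cum_weights(n, exponent):
    """Cumulative Zipf weights for ranks 1..n."""
    return list(accumulate(1.0 / (r ** exponent) for r in range(1, n + 1)))


def simulate_app_clustering(A, C, U, d, p, z_r, z_c, seed=0):
    """Return per-app download counts for U users who each fetch d distinct apps."""
    if d > A:
        raise ValueError("d cannot exceed the number of apps")
    rng = random.Random(seed)
    apps = list(range(A))  # app i has overall rank i + 1
    # Assumption: round-robin cluster assignment, so cluster rank follows overall rank.
    cluster_of = [a % C for a in apps]
    members = [[a for a in apps if cluster_of[a] == c] for c in range(C)]
    overall_cw = zipf_cum_weights(A, z_r)
    cluster_cw = [zipf_cum_weights(len(m), z_c) for m in members]
    downloads = [0] * A
    for _ in range(U):
        owned = []
        while len(owned) < d:  # step 3: repeat until the user has d apps
            pool, cw = apps, overall_cw  # steps 1 and 2.2: overall ranking
            if owned and rng.random() < p:  # step 2.1: stay in a previous cluster
                c = cluster_of[rng.choice(owned)]
                # Fall back to the overall ranking if the cluster is exhausted.
                if any(a not in owned for a in members[c]):
                    pool, cw = members[c], cluster_cw[c]
            app = rng.choices(pool, cum_weights=cw)[0]
            if app not in owned:  # fetch-at-most-once
                owned.append(app)
                downloads[app] += 1
    return downloads


downloads = simulate_app_clustering(A=100, C=5, U=50, d=4, p=0.9, z_r=1.0, z_c=1.0)
```

Setting `p=0` reduces the sketch to a Zipf draw with fetch-at-most-once, which makes it easy to compare the two behaviors on the same parameters.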

18

Model Parameters

Symbol   Parameter description
A        Number of apps
D        Total downloads
d        Downloads per user (average)
C        Number of clusters
U        Number of users
zr       Zipf exponent for overall app ranking
ZG       Overall Zipf distribution of all apps
p        Percentage of downloads based on clustering effect
zc       Zipf exponent for cluster’s app ranking
Zc       Zipf distribution of apps in cluster c
D(i,j)   Predicted downloads for app with total rank i and rank j in its cluster
         Number of downloads of the most popular app

19

Results

AppChina

Model               Distance from measured data
ZIPF                0.77
ZIPF-at-most-once   0.71
APP-CLUSTERING      0.15

20

App Pricing

• Main Questions:
– What are the differences between paid & free apps?
– What is the developers’ income range?
– What are the common developer strategies?
  • How do they affect revenue?

21

The influence of cost

Clear power-law

Free Paid

Users are more selective when downloading paid apps

22

Developers’ Income

Median: < 10 $

80% < 100 $

95% < 1500 $

Quality is more important than quantity

23

Developers Create a Few Apps

A large portion of developers create only 1 app

95% of developers create < 10 apps

10% of developers offer free & paid apps

24

Can Free Apps Generate Higher Income Than Paid Apps?

[Figure: y-axis "Necessary ad income (USD)", x-axis "Day"]

Average: 0.21 $

An average free app needs about 0.21 $/download to match the income of a paid app
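A break-even figure of this kind can be obtained by dividing a paid app's income by a free app's download count. The sketch below assumes that simple ratio; the function name and the input numbers are hypothetical and only illustrate the arithmetic.

```python
def break_even_ad_income(paid_income, free_downloads):
    """Ad income per download a free app needs to match a paid app's income."""
    if free_downloads <= 0:
        raise ValueError("free_downloads must be positive")
    return paid_income / free_downloads


# Hypothetical figures, chosen only to illustrate the arithmetic.
needed = break_even_ad_income(paid_income=210.0, free_downloads=1000)
```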

25

Conclusions

• App popularity: Zipf with truncated ends
– Fetch-at-most-once
– Clustering effect

• Practical implications
– New replacement policies for app caching
– Effective prefetching
– Better recommendation systems
– Increase income

26

Thank you!

27

Backup Slides

28

Modeling Appstore Workloads

• Each user downloads d apps randomly
– Fetch-at-most-once: a user downloads an app only once
– Clustering effect: a user downloads a percentage of the apps based on previous selections

• Each app has two rankings
– an overall ranking
– a ranking in its cluster

29

30

Developers Focus on Few Categories

80% of developers focus on 1 category

~99% of developers focus on 1-5 categories

31

Choosing the Right Number of Users

Minimum distance

32

Distance From Actual Data

APP-CLUSTERING:
• up to 7.2 times closer than ZIPF
• up to 6.4 times closer than ZIPF-at-most-once

33

Apps Are Not Updated Often

[Figure: y-axis "Apps (CDF)", x-axis "Number of Updated"]

34

Temporal Affinity for Different Depth Levels

[Figure: y-axis "Apps (CDF)", x-axis "Number of Updated"]

35

Clustering-based User Behavior Affects LRU Cache Performance Negatively

[Figure: y-axis "Cache Hit Ratio (%)", x-axis "Cache Size (% of total apps)"]
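The hit ratio on the y-axis can be measured by replaying a download trace against an LRU cache. The sketch below is a generic LRU replay, not the authors' experimental setup; the function name and the short trace are assumptions for illustration.

```python
from collections import OrderedDict


def lru_hit_ratio(requests, cache_size):
    """Replay `requests` against an LRU cache and return the fraction of hits."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for app in requests:
        if app in cache:
            hits += 1
            cache.move_to_end(app)  # mark as most recently used
        else:
            cache[app] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used app
    return hits / len(requests)


# Hypothetical request trace of app identifiers.
trace = ["a", "b", "a", "c", "b", "a"]
small_cache = lru_hit_ratio(trace, cache_size=2)
large_cache = lru_hit_ratio(trace, cache_size=3)
```

Feeding this function traces generated with and without clustering would allow a comparison along the lines of the plotted curves.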

User affinity to app categories - Equations

Depth 1:

Depth d:
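The equations on this slide did not survive extraction. One form consistent with the worked examples in this deck (a score of 0 or 1 per step, with three steps for a five-app sequence at depth 2) is sketched below. The symbols n (sequence length) and c(a_i) (category of the i-th downloaded app), as well as the averaging over steps, are assumptions and are not taken from the slide.

```latex
% Depth 1: fraction of consecutive download pairs that share a category
\mathrm{Aff}^{(1)} = \frac{1}{n-1} \sum_{i=2}^{n}
    \mathbb{1}\left[ c(a_i) = c(a_{i-1}) \right]

% Depth d: fraction of downloads whose category matches one of the previous d downloads
\mathrm{Aff}^{(d)} = \frac{1}{n-d} \sum_{i=d+1}^{n}
    \mathbb{1}\left[ c(a_i) \in \{ c(a_{i-1}), \dots, c(a_{i-d}) \} \right]
```

With d = 1 the second expression reduces to the first.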

User Temporal Affinity

Depth = 2

Sequence: a1, a2, a1, a2, a1

Aff1 = 1
Aff2 = 1
Aff3 = 1

37