Discovering Playing Patterns: Time Series Clustering of Free-To-Play Game Data

Alain Saas, Anna Guitart and África Periáñez (Silicon Studio)

IEEE CIG 2016, Santorini, 21 September 2016

About us

• Who are we?
  ◦ Game studio and graphics middleware company based in Tokyo
  ◦ Research project to provide Game Data Science as a service
  ◦ Goals: predict player behavior, scale to big data, and provide intuitive result visualization
• Which data?
  ◦ RPG free-to-play games
  ◦ Time series (TS) of two games
  ◦ TS of in-app purchases and activity behavioral data

Challenge

Unsupervised clustering of Time Series of player activity

• Why?
  ◦ discover temporal player patterns
  ◦ evaluate game events and support business diagnosis
  ◦ assess common characteristics of players belonging to the same cluster
• How?
  1. representation techniques: reducing the high dimensionality of TS
  2. similarity measures for free-to-play game data
  3. hierarchical clustering
  4. visual validation of the results

3 of 17

Representation methods

Symbolic Aggregate Approximation

Trend Extraction

Discrete Wavelet Transform
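As an illustration of how one of these representations reduces dimensionality, here is a minimal sketch of Symbolic Aggregate Approximation. It assumes the textbook formulation (z-normalization, piecewise aggregate approximation into `w` segments, then mapping to an alphabet of size `a` using equiprobable Gaussian breakpoints). The function names `paa` and `sax` are illustrative and not taken from the slides.

```python
from statistics import NormalDist, mean, pstdev


def paa(series, w):
    """Piecewise Aggregate Approximation: mean of w nearly equal segments."""
    n = len(series)
    if not 1 <= w <= n:
        raise ValueError("w must be between 1 and len(series)")
    # Segment k covers indices [k*n//w, (k+1)*n//w).
    return [mean(series[k * n // w:(k + 1) * n // w]) for k in range(w)]


def sax(series, w, a):
    """Return a SAX word of length w over an alphabet of size a (2..26)."""
    if not 2 <= a <= 26:
        raise ValueError("a must be between 2 and 26")
    mu, sigma = mean(series), pstdev(series)
    # z-normalize; a constant series is mapped to all zeros (assumption).
    z = [0.0] * len(series) if sigma == 0 else [(x - mu) / sigma for x in series]
    # Breakpoints split the standard normal into a equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(k / a) for k in range(1, a)]
    letters = "abcdefghijklmnopqrstuvwxyz"
    word = ""
    for value in paa(z, w):
        index = sum(value > b for b in breakpoints)
        word += letters[index]
    return word
```

For example, `sax(list(range(9)), w=3, a=3)` compresses a nine-point ramp into a three-letter word, one letter per segment.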

Similarity measures

Dynamic Time Warping

$$\mathrm{DTW}(X, Y) = \min_{r \in M} \left( \sum_{m=1}^{M} \left| x_{i_m} - y_{j_m} \right| \right)$$
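The minimization over warping paths is usually computed by dynamic programming. The sketch below assumes the unconstrained textbook recurrence with the absolute difference as local cost, matching the formula above; `dtw_distance` is an illustrative name, and the slides do not state which implementation or window constraint was used.

```python
def dtw_distance(x, y):
    """Dynamic Time Warping cost between two numeric sequences."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = local + min(cost[i - 1][j],      # repeat y[j-1]
                                     cost[i][j - 1],      # repeat x[i-1]
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]
```

Two profiles that differ only by a shift on the time axis, such as `[0, 1, 2, 1, 0, 0]` and `[0, 0, 1, 2, 1, 0]`, get a DTW cost of zero even though their pointwise differences are not zero.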

Correlation-based measure

$$\mathrm{COR}(X, Y) = \frac{\sum_{n=1}^{N} (x_n - \bar{X})(y_n - \bar{Y})}{\sqrt{\sum_{n=1}^{N} (x_n - \bar{X})^2} \, \sqrt{\sum_{n=1}^{N} (y_n - \bar{Y})^2}}$$
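A direct transcription of the correlation coefficient is sketched below. Turning it into a dissimilarity via `sqrt(2 * (1 - COR))` is one common convention and is an assumption here, since the slides give only the coefficient; the names `cor` and `cor_dissimilarity` are illustrative.

```python
from math import sqrt


def cor(x, y):
    """Pearson correlation between two equal-length series."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = (sqrt(sum((a - mean_x) ** 2 for a in x))
           * sqrt(sum((b - mean_y) ** 2 for b in y)))
    if den == 0:
        raise ValueError("correlation is undefined for a constant series")
    return num / den


def cor_dissimilarity(x, y):
    """Assumed conversion: 0 for perfectly correlated, 2 for anti-correlated."""
    # max() guards against tiny negative values caused by rounding.
    return sqrt(max(0.0, 2.0 * (1.0 - cor(x, y))))
```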

Temporal Correlation and Raw Values Behaviors measure

$$\mathrm{CORT}(X, Y) = \frac{\sum_{n=1}^{N-1} (x_{n+1} - x_n)(y_{n+1} - y_n)}{\sqrt{\sum_{n=1}^{N-1} (x_{n+1} - x_n)^2} \, \sqrt{\sum_{n=1}^{N-1} (y_{n+1} - y_n)^2}}$$
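The temporal correlation coefficient compares day-to-day changes rather than raw values. A minimal sketch of the formula above follows; `cort` is an illustrative name, and how the coefficient is combined with a raw-value distance is not shown on the slide and is left out here.

```python
from math import sqrt


def cort(x, y):
    """Temporal correlation: cosine similarity of first differences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    dx = [b - a for a, b in zip(x, x[1:])]
    dy = [b - a for a, b in zip(y, y[1:])]
    num = sum(p * q for p, q in zip(dx, dy))
    den = sqrt(sum(p * p for p in dx)) * sqrt(sum(q * q for q in dy))
    if den == 0:
        raise ValueError("CORT is undefined when a series never changes")
    return num / den
```

Series that rise and fall on the same days give a value near 1 regardless of their level, while series that move in opposite directions give a value near -1.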

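Complexity-Invariant Distance measure

$$\mathrm{CID}(X, Y) = \mathrm{dist}(X, Y) \cdot \mathrm{CF}(X, Y)$$

where CF is the complexity correction factor

$$\mathrm{CF}(X, Y) = \frac{\max(\mathrm{CE}(X), \mathrm{CE}(Y))}{\min(\mathrm{CE}(X), \mathrm{CE}(Y))}$$

and CE is the complexity estimation

$$\mathrm{CE}(X) = \sqrt{\sum_{n=1}^{N-1} (x_n - x_{n+1})^2}$$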
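The three definitions above compose as in the following sketch. It assumes the Euclidean distance for `dist`, which the slide leaves unspecified, and the handling of constant series (zero complexity) is also an assumption. The names `complexity_estimate` and `cid` are illustrative.

```python
from math import sqrt


def complexity_estimate(x):
    """CE(X): length of the series when 'stretched out' as a line."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, x[1:])))


def cid(x, y):
    """Complexity-invariant distance with an assumed Euclidean base."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    base = sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    ce_x, ce_y = complexity_estimate(x), complexity_estimate(y)
    low, high = min(ce_x, ce_y), max(ce_x, ce_y)
    if high == 0:
        return base            # both series constant: assume CF = 1
    if low == 0:
        return float("inf")    # one series constant: CF is unbounded
    return base * (high / low)
```

For `x = [0, 2, 0, 2]` and `y = [0, 1, 0, 1]`, the Euclidean distance is √2 and the correction factor is 2, so the result is about 2.83.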

Similarity measure comparison

Figure panels:
• Euclidean vs. Correlation
• Correlation vs. Complexity-Invariant Distance
• Dynamic Time Warping vs. Correlation
• Correlation vs. Discrete Wavelet Transform

Comparison of clustering methods

• DTW: Dynamic Time Warping
  ◦ similar player profiles with a shift on the time axis
  ◦ different patterns but at different scale
• DWT: Discrete Wavelet Transform
  ◦ dimensionality reduction
  ◦ frequency of the series
• SAX: Symbolic Aggregate Approximation
  ◦ parameters w, a
• COR: Correlation
  ◦ similar geometric and synchronous profiles
  ◦ sensitive to noisy data and outliers
• CORT: Temporal Correlation
  ◦ similar to COR but with time consideration?
• CID: Complexity-Invariant Distance
  ◦ similar complexity patterns
  ◦ good for sparse time series
• COR+trend: Correlation and trend extraction
  ◦ addresses COR's sensitivity to noise
  ◦ does not work well with sparse time series

Hierarchical clustering

Agglomerative clustering with the Ward method: each merge leads to the minimum increase in total within-cluster variance.

Linkage methods shown: Single Linkage, Complete Linkage, Average Linkage, Centroid Method, Ward Method
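To make the Ward criterion concrete, the sketch below runs agglomerative clustering on a precomputed dissimilarity matrix using the Lance-Williams update for Ward on squared dissimilarities. This is a simple O(n³) illustration, not the implementation used in the talk. Ward's criterion strictly assumes Euclidean distances, so applying it to other dissimilarities is a heuristic. The name `ward_clustering` is illustrative.

```python
def ward_clustering(dist, k):
    """Cut a Ward agglomeration of a dissimilarity matrix into k clusters.

    dist: symmetric n x n list of lists; returns a list of n labels.
    """
    n = len(dist)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of items")
    # Squared dissimilarities between active clusters, keyed by (low, high).
    d2 = {(i, j): dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)}
    members = {i: [i] for i in range(n)}
    next_id = n
    while len(members) > k:
        # Merging the closest pair gives the smallest variance increase.
        (a, b), d_ab = min(d2.items(), key=lambda item: item[1])
        n_a, n_b = len(members[a]), len(members[b])
        for c in members:
            if c in (a, b):
                continue
            n_c = len(members[c])
            d_ac = d2[(min(a, c), max(a, c))]
            d_bc = d2[(min(b, c), max(b, c))]
            # Lance-Williams update for Ward's method.
            d2[(c, next_id)] = ((n_a + n_c) * d_ac + (n_b + n_c) * d_bc
                                - n_c * d_ab) / (n_a + n_b + n_c)
        d2 = {p: v for p, v in d2.items() if a not in p and b not in p}
        members[next_id] = members.pop(a) + members.pop(b)
        next_id += 1
    labels = [0] * n
    for label, ids in enumerate(sorted(members.values(), key=min)):
        for i in ids:
            labels[i] = label
    return labels
```

With four points at positions 0, 1, 10 and 11 on a line and `k=2`, the two nearby pairs end up in separate clusters.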

Our data

Time series measured per user per day.

Game activity (behavioral data)
• Time: the amount of time spent in the game
• Sessions: the total number of playing sessions
• Actions: the total number of actions performed

In-app sales
• Purchase: the total amount of in-app purchases

Data selection, constraints

Time series: multi-dimensional data ⇒ selection of a period P

• our data contain weekly game events
• period P of length 21 days
• played time → active users: a connection on at least 6 out of 7 days each week
• purchases → paying users: at least one purchase in period P

• players alive during period P
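The selection constraints above could be expressed as simple filters, as in this sketch. It assumes one value per day over the 21-day period, treats a day with positive played time as a connection, and aligns weeks with the start of the period; none of these details are specified on the slide. The "alive during period P" check is omitted because its definition is not given. All names are illustrative.

```python
PERIOD_DAYS = 21  # length of period P, from the slide


def is_active(daily_play_time, min_days_per_week=6):
    """True if the player connected on enough days in every week of P."""
    if len(daily_play_time) != PERIOD_DAYS:
        raise ValueError("expected one value per day of the period")
    weeks = [daily_play_time[i:i + 7] for i in range(0, PERIOD_DAYS, 7)]
    return all(sum(1 for t in week if t > 0) >= min_days_per_week
               for week in weeks)


def is_paying(daily_purchases):
    """True if the player made at least one purchase during P."""
    return any(p > 0 for p in daily_purchases)
```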

Datasets and tests

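| Game            | Data              | Technique | Clusters | Date range          |
|-----------------|-------------------|-----------|----------|---------------------|
| Age of Ishtaria | Daily played time | COR-trend | 8        | Oct 2014 - Jan 2016 |
| Age of Ishtaria | Daily purchase    | CID       | 5        | Oct 2014 - Jan 2016 |
| Grand Sphere    | Daily played time | COR-trend | 8        | Jun 2015 - Mar 2016 |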

Clustering time series of time played

1. representation method: trend extraction

2. similarity measure: correlation

3. hierarchical clustering: Ward method

4. validation of results: visualization with heatmap (raw data)
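The first two steps of this pipeline can be sketched as follows. The slides do not say how the trend is extracted, so a 7-day moving average is used here purely as a stand-in assumption; `moving_average` and `pearson` are illustrative names. The toy series share a linear trend but carry opposite day-to-day noise.

```python
from math import sqrt


def moving_average(series, window=7):
    """Stand-in trend extraction: mean over each full sliding window."""
    if not 1 <= window <= len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]


def pearson(x, y):
    """Pearson correlation between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = (sqrt(sum((a - mean_x) ** 2 for a in x))
           * sqrt(sum((b - mean_y) ** 2 for b in y)))
    return num / den


# Two 21-day toy series: same upward trend, opposite alternating noise.
noise = [5 if day % 2 == 0 else -5 for day in range(21)]
player_a = [day + e for day, e in zip(range(21), noise)]
player_b = [day - e for day, e in zip(range(21), noise)]

raw_similarity = pearson(player_a, player_b)
trend_similarity = pearson(moving_average(player_a), moving_average(player_b))
```

On this toy data the correlation of the raw series is low, while the correlation of the smoothed series is close to 1, which illustrates why extracting the trend first can reduce the sensitivity of correlation to noise.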

Extraction of player characteristics

Clustering time series of time played

Also able to extract differentiated patterns, as in Age of Ishtaria

Clustering time series of purchases

1. similarity measure: complexity-invariant distance

2. hierarchical clustering: Ward method

3. validation of results: visualization with heatmap (raw data)

Summary and Next Steps

• Unsupervised clustering of time series data from two free-to-play games

• Evaluate several similarity measures and representation methods

• Extract meaningful behavioral patterns of players

• Assess impact of weekly game events

• Discover hidden playing dynamics regarding purchases and time played

• Feature for churn prediction

• Event recommender

• Cluster level behaviour

http://www.siliconstudio.co.jp/rd/4front/

Thank you!
