Content Recommendation using Factorisation Machines; PyCon Ireland 2016

Nov 2016- PyCon Dublin 2016 Fabrikatyr – Factorisation Machines

Fabrikatyr Analytics
Uncover tangible truths amidst the noise of modern media

Recommendation service using Factorisation Machines
PyCon Dublin - 2016

@Conr

@fabrikatyr

Agenda
● The Problem
● Factorisation machines as a method
● Getting the data right
● Modelling and Deployment
● Further Research

Business Problem

Increase user engagement by displaying content which is both personalised and interesting

Content can be user generated or taken from a 3rd-party content provider

Users Communities

A user can be in MANY communities

Behaviours are consistent across communities; content consumption is not

Interesting content will generate
● Likes
● Comments
● Shares

Content can be ‘EverGreen’

Optimisation Problem

How to select a method to deploy which can answer the challenges?

● General recommender: accuracy is important, but the goal is to generate recommendations which are consumed
● Speed & Scalability: the system needs to respond quickly to trends and topics across communities
● Sparse data set: there are lots of ‘hidden’ behaviours; measurable engagement is low, so the data set is very sparse
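To make "very sparse" concrete, here is a minimal sketch of measuring the density of a user-by-content engagement log. The event pairs and the user and content counts are made-up illustrative values, not figures from the talk.

```python
# Hypothetical engagement log: (user_id, content_id) pairs with a measurable
# action (like, comment or share). Values are illustrative only.
events = [
    ("u1", "c1"), ("u1", "c3"),
    ("u2", "c3"),
    ("u4", "c2"),
]
n_users = 5      # assumed total number of users, including silent ones
n_content = 4    # assumed total number of content items

observed = len(set(events))      # distinct user/content pairs actually seen
possible = n_users * n_content   # every user could engage with every item
density = observed / possible
sparsity = 1.0 - density

print(f"density={density:.2f}, sparsity={sparsity:.2f}")
```

In a real community platform the possible pairs grow with users times content while observed engagement grows far more slowly, which is why the chosen method has to cope with mostly empty data.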

Factorisation Machines appeared to be the method which answered the challenge

Method | Accuracy | Speed | Sparsity
Factorisation Machines | General accuracy | Quick | Designed for it
Collaborative Filter | Too accurate | Suitable | Suitable
Support Vector Machines | Too accurate | Suitable | Unsuitable
Random Forest / CART | General accuracy | Unsuitable | Unsuitable

Factorisation machines as a method

Factorisation Machine - The Equation
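The equation shown on this slide did not survive extraction. For reference, the second-order factorisation machine as defined by Rendle (2010), the paper listed first in the References, is usually written as below. The symbols (n features, k latent factors) follow that paper and are an assumption about what the slide displayed.

```latex
\hat{y}(\mathbf{x}) = w_0 + \sum_{i=1}^{n} w_i x_i
  + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle \mathbf{v}_i, \mathbf{v}_j \rangle \, x_i x_j ,
\qquad
\langle \mathbf{v}_i, \mathbf{v}_j \rangle = \sum_{f=1}^{k} v_{i,f} \, v_{j,f}
```

Here $w_0$ is the global bias, $w_i$ the weight of feature $i$, and $\mathbf{v}_i \in \mathbb{R}^k$ the latent vector of feature $i$. Every pairwise interaction is modelled through the inner product of two latent vectors instead of its own free parameter, which is what lets the model estimate interactions that were rarely or never observed together.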

Factorisation Machine - The Equation
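One reason the method is quick: the pairwise interaction term of a second-order factorisation machine does not need a double loop over features. Rendle (2010) shows it can be rewritten so that it costs O(kn) instead of O(kn²). The notation below (latent factors $v_{i,f}$, n features, k factors) is taken from that paper, not from the slide, whose image was lost.

```latex
\sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle \mathbf{v}_i, \mathbf{v}_j \rangle \, x_i x_j
= \frac{1}{2} \sum_{f=1}^{k} \left[ \left( \sum_{i=1}^{n} v_{i,f} \, x_i \right)^{2}
  - \sum_{i=1}^{n} v_{i,f}^{2} \, x_i^{2} \right]
```

With sparse input only the non-zero $x_i$ contribute to the inner sums, so the cost per prediction scales with the number of active features rather than with n.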

Factorisation Machine - The Equation
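As a worked illustration, the sketch below scores one feature vector with a second-order factorisation machine: a global bias, a linear term, and pairwise interactions weighted by inner products of latent vectors. It is a dependency-free sketch of the standard model from Rendle (2010), not the code used in the talk; the function names and all numbers are made up for the example.

```python
from itertools import combinations


def fm_predict(x, w0, w, V):
    """Second-order FM score using the O(k*n) form of the pairwise term.

    x  : list of n feature values (mostly zeros for sparse data)
    w0 : global bias
    w  : list of n linear weights
    V  : n rows of k latent factors, V[i][f]
    """
    linear = sum(w_i * x_i for w_i, x_i in zip(w, x))
    pairwise = 0.0
    for f in range(len(V[0])):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


def fm_predict_naive(x, w0, w, V):
    """Same score via the explicit double sum over feature pairs."""
    linear = sum(w_i * x_i for w_i, x_i in zip(w, x))
    pairwise = sum(
        sum(a * b for a, b in zip(V[i], V[j])) * x[i] * x[j]
        for i, j in combinations(range(len(x)), 2)
    )
    return w0 + linear + pairwise


# Toy example: 4 binary features, 2 latent factors (made-up numbers).
x = [1.0, 0.0, 1.0, 1.0]
w0 = 0.1
w = [0.2, -0.1, 0.05, 0.3]
V = [[0.1, 0.2], [0.0, -0.3], [0.4, 0.1], [-0.2, 0.5]]

score_fast = fm_predict(x, w0, w, V)
score_naive = fm_predict_naive(x, w0, w, V)
```

Both functions return the same score; the first one avoids enumerating feature pairs, which matters once the feature vector has thousands of columns.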

Limits of Factorisation Machines
● Need to understand your features as the model
● Not good with ‘dense’ data with binary outcomes
● Relatively new method, but supported by most languages
● General model, so predictions are also general

Getting the data right

The data from the systems needs to be examined and structured before executing the model

3 groups of information
● Users
● Content
● Context

Time was NOT a feature

Not all USER behaviours are important when using a generalised model

Important

Unimportant

● Is the user an Admin / Moderator
● Has the user ‘logged in’

● User behaviour
● Engagement
● Count of community memberships

Content needs to be given ‘Context’ to be worked with effectively

Engagement
● Did the user ‘like’ the content?
● Did the user ‘comment’ on the content?
● Did the user ‘share’ the content?

Keywords
● Which keywords does the content have?

The general behaviour is that a small set of users and content generates most of the activity

The final result was a ‘wide’ dataset with one row per user event and many columns
● Each time a user either saw content or engaged with it, a row must be added to the data set
● Keywords, likes, etc. all receive a 1 or a 0 for ALL the events
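A minimal sketch of how such a wide table could be assembled, assuming each raw event records the user, the content item, its keywords and the three engagement flags. The field names, the `build_columns` and `to_wide_row` helpers and the sample events are hypothetical, chosen only to illustrate the one-row-per-event, 1-or-0 layout described above.

```python
# Hypothetical raw events: one dict per time a user saw or engaged with content.
events = [
    {"user": "u1", "content": "c1", "keywords": ["python", "data"],
     "liked": True, "commented": False, "shared": False},
    {"user": "u2", "content": "c1", "keywords": ["python", "data"],
     "liked": False, "commented": False, "shared": False},
    {"user": "u1", "content": "c2", "keywords": ["travel"],
     "liked": True, "commented": True, "shared": False},
]

FLAGS = ("liked", "commented", "shared")


def build_columns(events):
    """Collect every indicator column that appears anywhere in the log."""
    cols = set()
    for e in events:
        cols.add(f"user={e['user']}")
        cols.add(f"content={e['content']}")
        cols.update(f"kw={k}" for k in e["keywords"])
    return sorted(cols) + list(FLAGS)


def to_wide_row(event, columns):
    """Return one 0/1 row: 1 where the indicator applies to this event."""
    active = {f"user={event['user']}", f"content={event['content']}"}
    active.update(f"kw={k}" for k in event["keywords"])
    active.update(flag for flag in FLAGS if event[flag])
    return [1 if c in active else 0 for c in columns]


columns = build_columns(events)
wide = [to_wide_row(e, columns) for e in events]
```

Each row is almost entirely zeros, which is the sparse shape factorisation machines are designed for. Which of the engagement flags is used as the prediction target is not stated on the slide.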

Modelling and Deployment

We used an application which could deploy our Python model at scale

Turi Predictive Services supports model predictions by hosting and managing machine learning models as low-latency RESTful services. Turi was acquired by Apple Inc. for $200 million.

Domino Data Lab is an alternative.

SFrame versus Pandas - works for Factorisation Machines

SFrame is a scalable, out-of-core dataframe, which allows you to work with datasets that are larger than the amount of RAM on your system.

Similar to Spark RDD
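The out-of-core idea can be illustrated without SFrame itself. The sketch below is not the SFrame API; it is a plain-Python stand-in that streams a CSV row by row, so only one row is held in memory at a time. The `count_likes` helper, the `liked` column and the sample data are hypothetical.

```python
import csv
import io


def count_likes(lines):
    """Stream rows and keep only running totals, never the whole table."""
    total = liked = 0
    for row in csv.DictReader(lines):
        total += 1
        liked += int(row["liked"])
    return total, liked


# A small in-memory stand-in for a file too large to load at once. With a
# real file, pass an open file handle (e.g. a hypothetical events.csv) instead.
sample = io.StringIO("user,content,liked\nu1,c1,1\nu2,c1,0\nu1,c2,1\n")
total, liked = count_likes(sample)
print(total, liked)
```

An out-of-core dataframe applies the same principle behind a dataframe interface, keeping the data on disk and reading it in pieces as operations need it.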

Solution Architecture

[Architecture diagram. Components shown: Content store; Guest data; Factorisation Machine model; Scoring Engine & Recommendations.]

Balance between online / offline calculations

[Architecture diagram. Components shown: Machine Learning Service; Content store; URLs; Guest behaviour; Guest data; Factorisation Machine model; Content consumption; Guest classification; Scoring Engine & Recommendations; Content URL; C# content server; Model weights. Parts of the flow are labelled (Offline/batch) and (Online).]
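One way to read the offline/online balance, sketched under assumptions: a batch job trains the model and exports its weights, and the online scoring engine only loads those weights and ranks candidate content per request. The JSON weight layout, the feature names and the `score` and `rank_content` helpers are hypothetical; the slide does not state which components run in which mode.

```python
import json

# --- Offline / batch side (assumed): export trained weights as JSON. ---
# The numbers are placeholders standing in for a trained model.
exported = json.dumps({
    "w0": 0.0,
    "w": {"user=u1": 0.1, "content=c1": 0.2, "content=c2": -0.1},
    "V": {"user=u1": [0.5, 0.1], "content=c1": [0.4, 0.0],
          "content=c2": [-0.2, 0.3]},
})


# --- Online side (assumed): load weights once, score per request. ---
def score(weights, active):
    """FM score for a list of active (value 1) binary features."""
    w, V = weights["w"], weights["V"]
    known = [f for f in active if f in w]   # ignore features unseen in training
    total = weights["w0"] + sum(w[f] for f in known)
    for a in range(len(known)):
        for b in range(a + 1, len(known)):
            total += sum(p * q for p, q in zip(V[known[a]], V[known[b]]))
    return total


def rank_content(weights, user, candidates):
    """Return candidate content ids, best score first."""
    scored = [(score(weights, [f"user={user}", f"content={c}"]), c)
              for c in candidates]
    return [c for _, c in sorted(scored, reverse=True)]


weights = json.loads(exported)
ranking = rank_content(weights, "u1", ["c1", "c2"])
```

Keeping training offline and shipping only the weights means the online path is a handful of multiplications per candidate, which suits a low-latency content server.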

Further analysis
● Injecting content into the model
● Consumption tracking using A/B testing
● Presentation bias - does rank affect consumption?

Thank you - Any Questions?

@Conr

@fabrikatyr

References

● http://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf

● https://github.com/ibayer/fastFM

● www.libfm.org

● https://github.com/zhengruifeng/spark-libFM

● https://github.com/scikit-learn-contrib/polylearn