33
Recommendation Systems 2 Tufts COMP 135: Introduction to Machine Learning https://www.cs.tufts.edu/comp/135/2019s/ Many ideas/slides attributable to: Liping Liu (Tufts), Emily Fox (UW) Matt Gormley (CMU) Prof. Mike Hughes

20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Recommendation Systems

2

Tufts COMP 135: Introduction to Machine Learninghttps://www.cs.tufts.edu/comp/135/2019s/

Many ideas/slides attributable to:Liping Liu (Tufts), Emily Fox (UW)Matt Gormley (CMU)

Prof. Mike Hughes

Page 2: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Recommendation Task:Which users will like which items?

3Mike Hughes - Tufts COMP 135 - Spring 2019

Page 3: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

4Mike Hughes - Tufts COMP 135 - Spring 2019

• Need recommendation everywhere

Page 4: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Utility matrix

5Mike Hughes - Tufts COMP 135 - Spring 2019

• The “value” or “utility” of items to users • Only known when ratings happen• In practice, very sparse, many entries unknown

42

Page 5: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Rec Sys Unit Objectives

• Explain Recommendation Task• Predict which users will like which items

• Explain two major types of recommendation• Content-based (have features for items/users)• Collaborative filtering (only have scores for item+user pairs)

• Detailed Approach: Matrix Factorization + SGD

• Evaluation: • Precision/recall for binary recs

6Mike Hughes - Tufts COMP 135 - Spring 2019

Page 6: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

7Mike Hughes - Tufts COMP 135 - Spring 2019

Task: RecommendationSupervisedLearning

Unsupervised Learning

Reinforcement Learning

Collaborative filtering

Content filtering

Page 7: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Content-based recommendation

8Mike Hughes - Tufts COMP 135 - Spring 2019

Page 8: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Content-based

9Mike Hughes - Tufts COMP 135 - Spring 2019

FEATURE VALUEis_round 1is_juicy 1average_price $1.99/lb

Key aspect:Have common features for each item

Page 9: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Content-Based Recommendation• Reduce per-user prediction to supervised

prediction problem

10Mike Hughes - Tufts COMP 135 - Spring 2019

What features are necessary? What are pitfalls?

Fig. Credit: Emily Fox (UW)

Page 10: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Possible Per-Item Features

• Movie• Set of actors, director, genre, year

• Document• Bag of words, author, genre, citations

• Product• Tags, reviews

11Mike Hughes - Tufts COMP 135 - Spring 2019

Page 11: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Content-Based Recommender

12Mike Hughes - Tufts COMP 135 - Spring 2019

Fig. Credit: Emily Fox (UW)

Page 12: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Collaborative filtering

13Mike Hughes - Tufts COMP 135 - Spring 2019

Page 13: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

External Slides

• Matt Gormley at CMU

• https://www.cs.cmu.edu/~mgormley/courses/10601-s17/slides/lecture25-mf.pdf

• We’ll use page 4 – 34• Start: ”Recommender Systems” slide• Stop at: comparison of optimization algorithms

14Mike Hughes - Tufts COMP 135 - Spring 2019

Page 14: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Matrix Factorization (MF)

15Mike Hughes - Tufts COMP 135 - Spring 2019

• User ! represented by vector "# ∈ %&• Item ' represented by vector () ∈ %&• Inner product "#*() approximates the utility +#)• Intuition: • Two items with similar vectors get similar utility scores

from the same user;• Two users with similar vectors give similar utility

scores to the same item

Page 15: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Training an MF model

16Mike Hughes - Tufts COMP 135 - Spring 2019

• Variables to optimize• ! = #$: & = 1,… ,* , + = ,$: - = 1,… ,.

• Training objective

• How to optimize?• Stochastic gradient descent, visit each user-item entry at

random!

• Key practical aspects• Regularization terms to prevent overfitting

min!,+ 2$3

.$3 − #$5,36 + 82

$#$ 66 + 82

3,3 6

6

Page 16: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Include intercept/bias terms!

17Mike Hughes - Tufts COMP 135 - Spring 2019

• Per-userscalar+,• Per-item scalar -.

Why include these?

min2,4 5,.

6,. − 8,9:. − +, − -.; + =5

,8, ;; + =5

.:. ;

;

Page 17: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Include intercept/bias terms!

18Mike Hughes - Tufts COMP 135 - Spring 2019

• Per-userscalar+,• Per-item scalar -.

Why include these? Improve accuracySome items just more popularSome users just more positive

min2,4 5,.

6,. − 8,9:. − +, − -.; + =5

,8, ;; + =5

.:. ;

;

Page 18: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Summary of Methods

19Mike Hughes - Tufts COMP 135 - Spring 2019

Page 19: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

20Mike Hughes - Tufts COMP 135 - Spring 2019

Task: RecommendationSupervisedLearning

Unsupervised Learning

Reinforcement Learning

Collaborative filtering

Content-based filtering

Page 20: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Recall: Supervised Method

21Mike Hughes - Tufts COMP 135 - Spring 2019

SupervisedLearning

Unsupervised Learning

Reinforcement Learning

Data, Label PairsPerformance

measureTask

data x

labely

{xn, yn}Nn=1

Training

Prediction

Evaluation

Page 21: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Example: Per-User Predictor

22Mike Hughes - Tufts COMP 135 - Spring 2019

SupervisedLearning

Unsupervised Learning

Reinforcement Learning

For each item n:x: User-Item Featurey: Rating Score

PerformancemeasureTask

User-itemFeature vectorx

Predicted rating

y

{xn, yn}Nn=1

Regressor/

Classifier

Content-based filtering

Page 22: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

23Mike Hughes - Tufts COMP 135 - Spring 2019 23Mike Hughes - Tufts COMP 135 - Spring 2019

Data Examples

data x

SupervisedLearning

Unsupervised Learning

Reinforcement Learning

{xn}Nn=1Task

summaryor model

of x

Performancemeasure

Recall: Unsupervised Method

Page 23: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

24Mike Hughes - Tufts COMP 135 - Spring 2019 24Mike Hughes - Tufts COMP 135 - Spring 2019

Data Examples Matrix M

Specific entry

indicies(i,j)

SupervisedLearning

Unsupervised Learning

Reinforcement Learning

Task

Low-rankfactors

that reconstruct

M

Performancemeasure

Example: Matrix Factorization

Value ofM_ij

Collaborative filtering

Page 24: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Evaluation

25Mike Hughes - Tufts COMP 135 - Spring 2019

Page 25: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Evaluation

26Mike Hughes - Tufts COMP 135 - Spring 2019

Item ranking 1 2 3 4 5 6 7 8Actual usage 1 0 1 0 0 0 1 1

Assumptions• For given user, we can rate each item with score• We care most about our top-score predictions• Setup:

• Algorithm rates each item with score• Sort items from high to low score• Have “true” relevant/not usage labels (unused by algo.)

Page 26: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

27Mike Hughes - Tufts COMP 135 - Spring 2019

Page 27: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

28Mike Hughes - Tufts COMP 135 - Spring 2019

Page 28: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

External Slides

• Emily Fox’s slides

• https://courses.cs.washington.edu/courses/cse416/18sp/slides/L13_matrix-factorization.pdf#page=19

• Start: Slide 19 (world of all baby products)• Stop: End of that section

29Mike Hughes - Tufts COMP 135 - Spring 2019

Page 29: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Precision-Recall Curve

30Mike Hughes - Tufts COMP 135 - Spring 2019

recall (= TPR)

prec

isio

n

Page 30: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Precision @ k

• Assume only top k results are predicted positive• E.g. Netflix can only show you ~8 results on screen at a

time, we want most of these to be relevant• Example:

• Prec @ 1 : 100%• Prec @ 2: 50%• Prec @ 8: 50%

31Mike Hughes - Tufts COMP 135 - Spring 2019

Item ranking 1 2 3 4 5 6 7 8Actual usage 1 0 1 0 0 0 1 1

Page 31: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Cold Start Issue

32Mike Hughes - Tufts COMP 135 - Spring 2019

• New user entering the system• Hard for both content-based and matrix factors• Matching similar users• Trial-and-error

• New item entering the system• Easy with per-user content-based recommendation

• IF easy to get the item’s feature vector• Hard with matrix factorization

• Trial-and-error

Page 32: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Practical Issues in Real Systems

33Mike Hughes - Tufts COMP 135 - Spring 2019

• Recommendation system and users form a loopy system• RS changes user’s behavior• User generate data for RS

• User groups becoming more homogeneous• Youtube recommendation of politic videos:

recommend videos from the same camp

Page 33: 20 recommendation systems• Recommendation system and users form a loopy system • RS changes user’s behavior • User generate data for RS • User groups becoming more homogeneous

Rec Sys Unit Objectives

• Explain Recommendation Task• Predict which users will like which items

• Explain two major types of recommendation• Content-based (have features for items/users)• Collaborative filtering (only have scores for item+user pairs)

• Detailed Approach: Matrix Factorization + SGD

• Evaluation: • Precision/recall for binary recs

34Mike Hughes - Tufts COMP 135 - Spring 2019