
Bayesian Personalized Ranking for Non-Uniformly Sampled Items


DESCRIPTION

The slide set describing our approach to the KDD Cup 2011, presented at the KDD Cup workshop in San Diego, California.


Page 1: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Zeno Gantner, Lucas Drumond, Christoph Freudenthaler, Lars Schmidt-Thieme

University of Hildesheim

21 August 2011

Zeno Gantner et al., University of Hildesheim 1 / 15

Page 2: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Questions (and Answers)

Who? Which?

How?

Why? Where?

What?


Page 3: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Which problem to solve?

Rating Prediction (Track 1)

vs.

Item Prediction (Track 2)


Page 4: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

How did we tackle the problem?

Bayesian Personalized Ranking:

$$\mathrm{BPR}(D_S) = \arg\max_{\Theta} \sum_{(u,i,j) \in D_S} \ln \sigma\bigl(s_{u,i}(\Theta) - s_{u,j}(\Theta)\bigr) - \lambda \|\Theta\|^2$$

- $D_S$ contains all pairs of positive and negative items for each user,
- $\sigma(x) = \frac{1}{1 + e^{-x}}$ is the logistic function,
- $\Theta$ represents the model parameters,
- $s_{u,i}(\Theta)$ is the predicted score for user $u$ and item $i$, and
- $\lambda \|\Theta\|^2$ is a regularization term to prevent overfitting.
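To make the criterion concrete, here is a minimal Python sketch that evaluates it for a fixed set of scores. The function names, the toy scores, and the flat parameter list are illustrative assumptions; they are not taken from the slides or from MyMediaLite.

```python
import math

def log_sigmoid(x):
    """ln sigma(x) with sigma(x) = 1 / (1 + exp(-x)), written to avoid overflow."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def bpr_criterion(score, triples, params, reg):
    """Sum of ln sigma(s_ui - s_uj) over all (u, i, j), minus reg * ||params||^2."""
    fit = sum(log_sigmoid(score(u, i) - score(u, j)) for u, i, j in triples)
    return fit - reg * sum(p * p for p in params)

# Hypothetical toy data: one user "u", item "a" is positive, "b" and "c" are negative.
toy_scores = {("u", "a"): 2.0, ("u", "b"): 0.5, ("u", "c"): -1.0}
triples = [("u", "a", "b"), ("u", "a", "c")]
value = bpr_criterion(lambda u, i: toy_scores[(u, i)], triples,
                      params=[0.1, -0.2], reg=0.01)
print(value)
```

The criterion is at most zero; it moves towards zero as positive items are scored further above negative ones, and the regularization term pulls it back down for large parameters.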

interpretation 1: reduce ranking to pairwise classification [Balcan et al. 2008]

interpretation 2: optimize for smoothed area under the ROC curve (AUC)
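The second interpretation can be illustrated with a small Python sketch: the per-user AUC is the fraction of (positive, negative) item pairs ranked in the right order, and the smoothed version replaces the 0/1 indicator with ln σ of the score difference. The function names and toy scores are assumptions made for this example, and the per-user normalization is included only for comparison with AUC.

```python
import math

def user_auc(scores, positives, negatives):
    """Fraction of (positive, negative) pairs that are ranked in the right order."""
    correct = sum(1 for i in positives for j in negatives if scores[i] > scores[j])
    return correct / (len(positives) * len(negatives))

def smoothed_user_auc(scores, positives, negatives):
    """Same average, with the 0/1 indicator replaced by ln sigma(s_i - s_j)."""
    def log_sigmoid(x):
        return -math.log1p(math.exp(-x)) if x >= 0 else x - math.log1p(math.exp(x))
    total = sum(log_sigmoid(scores[i] - scores[j]) for i in positives for j in negatives)
    return total / (len(positives) * len(negatives))

# Hypothetical scores for one user: "a" and "b" are positive, "c" and "d" negative.
scores = {"a": 2.0, "b": 0.5, "c": 1.0, "d": -1.0}
auc = user_auc(scores, positives=["a", "b"], negatives=["c", "d"])
smooth = smoothed_user_auc(scores, positives=["a", "b"], negatives=["c", "d"])
print(auc, smooth)
```

Unlike the indicator, the smoothed term is differentiable in the scores, which is what makes gradient-based learning possible.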

Model: matrix factorization

Learning: stochastic gradient ascent

[Rendle et al., UAI 2009]
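A rough Python sketch of stochastic gradient ascent for a matrix factorization model under the BPR criterion follows. It assumes the score is the dot product of user and item factors, samples all triples uniformly, and uses hypothetical names and default hyperparameters throughout; it is not the MyMediaLite implementation.

```python
import math
import random

def sigmoid(x):
    """Logistic function, written so that large |x| does not overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def score(P, Q, u, i):
    """Predicted score s_{u,i}: dot product of user and item factors."""
    return sum(p * q for p, q in zip(P[u], Q[i]))

def bpr_sgd_step(P, Q, u, i, j, lr, reg):
    """One ascent step on ln sigma(s_ui - s_uj) - reg * ||Theta||^2.

    The factor 2 from differentiating the squared norm is absorbed into reg.
    """
    x = score(P, Q, u, i) - score(P, Q, u, j)
    g = sigmoid(-x)  # derivative of ln sigma(x) with respect to x
    for f in range(len(P[u])):
        pu, qi, qj = P[u][f], Q[i][f], Q[j][f]  # old values, used for all three updates
        P[u][f] += lr * (g * (qi - qj) - reg * pu)
        Q[i][f] += lr * (g * pu - reg * qi)
        Q[j][f] += lr * (-g * pu - reg * qj)

def train(positives, n_items, k=8, n_steps=10_000, lr=0.05, reg=0.01, seed=0):
    """Sample (u, i, j) uniformly and apply SGD steps; returns the factor matrices.

    Assumes every user has at least one positive and one negative item.
    """
    rng = random.Random(seed)
    users = sorted(positives)
    P = {u: [rng.gauss(0.0, 0.1) for _ in range(k)] for u in users}
    Q = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(n_steps):
        u = rng.choice(users)
        i = rng.choice(sorted(positives[u]))
        j = rng.randrange(n_items)
        while j in positives[u]:  # resample until j is a negative item for u
            j = rng.randrange(n_items)
        bpr_sgd_step(P, Q, u, i, j, lr, reg)
    return P, Q
```

A call such as `train({0: {0, 1}, 1: {2, 3}}, n_items=4)` returns the learned user and item factors; items are then ranked for a user by `score`.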

Page 5: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

How did we tackle the problem?

$$\mathrm{BPR}(D_S) = \arg\max_{\Theta} \sum_{(u,i,j) \in D_S} \ln \sigma(s_{u,i} - s_{u,j}) - \lambda \|\Theta\|^2$$

problem: all negative items j are given the same weight

solution: adapt weights in the optimization criterion (and sampling probabilities in the learning algorithm)

$$\mathrm{WBPR}(D_S) = \arg\max_{\Theta} \sum_{(u,i,j) \in D_S} w_u w_i w_j \ln \sigma(s_{u,i} - s_{u,j}) - \lambda \|\Theta\|^2,$$

where

$$w_j = \sum_{u \in U} \delta(j \in I_u^+). \tag{1}$$
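Equation (1) sets the weight of a negative item j to the number of users who have j among their positive items. The Python sketch below computes these weights and draws negative items with probability proportional to them, which is one way to read "adapt the sampling probabilities". The names and toy data are assumptions for this example; w_u and w_i are not defined on the slide and are left out.

```python
import random
from collections import Counter

def item_weights(positives):
    """w_j = number of users u with j in I_u^+, as in Eq. (1)."""
    counts = Counter()
    for items in positives.values():
        counts.update(items)  # each user contributes 1 per positive item
    return counts

def sample_negative(u, positives, weights, rng):
    """Draw a negative item j for user u with probability proportional to w_j.

    Items that no user has as a positive have weight 0 and are never drawn.
    Assumes user u has at least one candidate negative item.
    """
    candidates = [j for j in sorted(weights) if j not in positives[u]]
    return rng.choices(candidates, weights=[weights[j] for j in candidates], k=1)[0]

# Hypothetical toy data: positive items per user.
positives = {"u1": {"a", "b"}, "u2": {"a"}, "u3": {"a", "c"}}
weights = item_weights(positives)
rng = random.Random(0)
negative_for_u2 = sample_negative("u2", positives, weights, rng)
print(dict(weights), negative_for_u2)
```

Sampling a triple with probability proportional to its weight and then taking an unweighted gradient step gives, in expectation, the same update direction as weighting each term in the sum.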



Page 7: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Why did we not win? But also: why did we perform better than others?

Why did we perform better than others?

- straightforward model that matches the prediction task pretty well
- scalability (e.g. k = 480 factors per user/item)
- integration of rating information (see paper)
- ensembles (see paper)

Why did we not win?

- ... two possible answers ...

Page 8: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Why did we not win?

Taxonomy

Page 9: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Why did we not win?

Learn the right contrast

[Figure: three candidate contrasts over the item groups "rating >= 80", "rating < 80", and "no rating": a "liked?" contrast, a "rated?" contrast, and a third option, "rating >= 80" vs. "no rating", marked with a question mark.]
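The contrasts can be read as different ways of splitting one user's items into a positive and a negative side before forming training pairs. The Python sketch below encodes one such reading: "liked" puts only ratings of 80 or more on the positive side, "rated" puts every rated item there, and the third option drops low-rated items entirely. This assignment is an assumption based on the labels that survive from the figure, and the function and contrast names are hypothetical.

```python
def split_by_contrast(ratings, all_items, contrast, threshold=80):
    """Return (positive_items, negative_items) for one user under a given contrast.

    ratings maps item -> rating for the items this user has rated.
    """
    rated = set(ratings)
    high = {i for i in rated if ratings[i] >= threshold}
    low = rated - high
    unrated = set(all_items) - rated
    if contrast == "liked":             # rating >= 80 vs. everything else
        return high, low | unrated
    if contrast == "rated":             # any rating vs. no rating
        return high | low, unrated
    if contrast == "liked_vs_unrated":  # rating >= 80 vs. no rating; low ratings unused
        return high, unrated
    raise ValueError(f"unknown contrast: {contrast}")

# Hypothetical user: rated "a" with 90 and "b" with 50, never rated "c".
ratings = {"a": 90, "b": 50}
items = {"a", "b", "c"}
for name in ("liked", "rated", "liked_vs_unrated"):
    print(name, split_by_contrast(ratings, items, name))
```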


Page 13: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Where next?

- classification → ranking → pairwise classification
- pairwise classification: try other losses, e.g. soft margin (hinge) loss
- Bayesian² Personalized Ranking
- beyond KDD Cup: consider different sampling schemes ...
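For the "other losses" item, the following Python sketch compares the logistic loss that BPR uses on the score difference s_ui - s_uj with a hinge loss on the same difference. The margin of 1 in the hinge loss and the function names are assumptions for illustration; the slides do not specify them.

```python
import math

def logistic_loss(margin):
    """-ln sigma(margin) = ln(1 + exp(-margin)), the per-pair loss behind BPR."""
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

def hinge_loss(margin):
    """Soft margin loss: zero once the positive item leads by at least 1."""
    return max(0.0, 1.0 - margin)

# margin = s_ui - s_uj for one (user, positive item, negative item) triple
for margin in (-2.0, 0.0, 0.5, 2.0):
    print(f"{margin:+.1f}  logistic={logistic_loss(margin):.3f}  hinge={hinge_loss(margin):.3f}")
```

The practical difference is that the hinge loss stops producing gradient for pairs that are already separated by the margin, while the logistic loss keeps pushing every pair a little.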


Page 14: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Summary

- Use matrix factorization optimized for Bayesian Personalized Ranking (BPR) to solve the item ranking problem.
- BPR reduces ranking (in this case: binary variables) to pairwise classification.
- Extend BPR to use a different sampling scheme: Weighted BPR (WBPR).
- Open question: Learn a different contrast?
- Details can be found in the paper.
- Code: http://ismll.de/mymedialite/examples/kddcup2011.html

advertisement: Contribute to http://recsyswiki.com!


Page 15: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Questions

Page 16: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Acknowledgements

Thank you

- The organizers, for hosting a great competition.
- The participants, for sharing their insights.

Funding

- German Research Council (Deutsche Forschungsgemeinschaft, DFG) project Multirelational Factorization Models.
- Development of the MyMediaLite software was co-funded by the European Commission FP7 project MyMedia under the grant agreement no. 215006.

Picture credits

- by Michael Sauers, under Creative Commons by-nc-sa 2.0: http://www.flickr.com/photos/travelinlibrarian/223839049/
- by Rob Starling, under Creative Commons by-sa 2.0: http://en.wikipedia.org/wiki/File:Air_New_Zealand_B747-400_ZK-SUI_at_LHR.jpg

Page 17: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Numbers?

contrast   k     error in %
“liked”    320   5.52
“liked”    480   5.08
“rated”    320   5.15
“rated”    480   4.87

Estimated error on validation split (not leaderboard).


Page 18: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Advertisement

MyMediaLite: Recommender System Algorithm Library

functionality

- rating prediction
- item recommendation from implicit feedback
- group recommendation

target groups

- researchers, educators and students
- application developers

development

- written in C#, runs on Mono
- GNU General Public License (GPL)
- regular releases (ca. 1 per month)

- simple
- free
- scalable
- well-documented
- well-tested
- choice

http://ismll.de/mymedialite


Page 19: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Advertisement

RecSys Wiki is looking for contributions

Alan

Zeno

http://recsyswiki.com
