
Client-side hybrid rating prediction for recommendation

Andrés Moreno (1,2), Harold Castro (1), Michel Riveill (2)

(1) School of Engineering, Universidad de los Andes, Bogotá, Colombia

(2) I3S, Université de Nice Sophia Antipolis, France

UMAP, 2014

Outline

Motivation: Privacy in recommender systems
  Recommender Systems
  Privacy considerations

A client-side agent for recommendation
  Applying client-side predictive models
  Content-based model
  Collaborative Filtering model (CF)
  Hybrid prediction under expert advice

Final considerations


Recommender systems

- Recommender systems are personalization systems that automatically calculate the relevance of a large collection of data items for a user. The relevance mapping between users and items is used to select, screen out, or rank items based on her preferences and situation.

[Figure: recommendation server architecture, showing user profiles (U1 ... Uu) and item profiles (I1 ... Ii), each with features f1 ... fk, log files, the training, prediction, recommendation, and interaction log components, user feedback, and item suggestions.]

Privacy considerations

- Recommender systems gather information about users and store it in a centralized entity; they then apply heuristics or data mining techniques to learn the users' interests, with the purpose of detecting which elements are relevant for each user.

- Users trust that the information submitted or registered about them will be used for filtering purposes; however, their information can be used for purposes other than filtering, which constitutes an exposure risk [LFR06].


According to [Fon99], keeping user profile information on a centralized entity can lead to exposure risks in five ways:

- Deception by the recipient: the system can lie about its privacy policies.

- Mission creep: the system expands its goals in a previously unforeseen manner, changing the use of personal information for other purposes related to the new goals of the organization.

- Accidental disclosure: information about users can be made available accidentally.

- Disclosure by malicious intent: storage security is breached and personal information is stolen.

- Forced disclosure: systems must disclose the information for legal reasons.

Proposed architecture

A client-side recommender system is a privacy-by-architecture solution that avoids these exposure scenarios:

- Keep user profile information on the user's device

- Do not reveal user ratings

Online Client-based predictive models

- Rating prediction task on the client, tested with the MovieLens 10M dataset; possible ratings are restricted to O = {1, 2, 3, 4, 5}.

- Hybrid model
  - Content-based model (CB)
  - Collaborative Filtering model (CF)

Content-based model profiles and prediction

- Content-based filtering (CB): items are described by their features or characteristics, which are used to estimate their relevance for the user.

[Figure: keyword profile of the item "Star Wars": actor:harrison_ford, actor:james_earl_jones, actor:mark_hamill, actor:alec_guinness, actor:denis_lawson, actor:carrie_fisher, director:george_lucas, genre:Adventure, genre:Action, genre:Sci-Fi. A second, unlabeled keyword list: actor:kathy_griffin, actor:uma_thurman, actor:bruce_willis, actor:christopher_walken, actor:samel_l_jackson, actor:john_travolta, genre:Crime, genre:Comedy.]

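- The user has a list of frequent concepts (keywords) $C_u$; items are likewise described by keywords $C_i$.

- Each user has $|O|$ vectors $w_u^o \in \mathbb{R}^{|C_u|}$, one per rating $o \in O$.

- $m_{ui} : (C_i \times C_u) \to \mathbb{R}^{|C_u|}$ is a binary vector with $m_{ui}[f] = \mathbb{1}_{C_u[f] \in C_i}$.

- Rating prediction is: $\hat{r}_{ui} = \dfrac{\sum_{o \in O} \sigma(\langle w_u^o, m_{ui} \rangle) \times o}{\sum_{o \in O} \sigma(\langle w_u^o, m_{ui} \rangle)}$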
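The content-based prediction rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`match_vector`, `predict_cb`), the three-concept user profile, and the weight values are hypothetical and do not come from the experiments; only the formula itself is taken from the slide.

```python
import math

RATINGS = (1, 2, 3, 4, 5)  # the rating set O


def sigmoid(x):
    """Logistic function sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def match_vector(user_concepts, item_concepts):
    """Binary vector m_ui: entry f is 1 when the user's f-th concept describes the item."""
    item_set = set(item_concepts)
    return [1.0 if concept in item_set else 0.0 for concept in user_concepts]


def predict_cb(weights, m_ui):
    """Sigmoid-weighted average of the rating values, with one weight vector per rating."""
    scores = {
        o: sigmoid(sum(w * m for w, m in zip(weights[o], m_ui)))
        for o in RATINGS
    }
    total = sum(scores.values())
    return sum(scores[o] * o for o in RATINGS) / total


# Hypothetical toy data (assumption, for illustration only).
user_concepts = ["actor:harrison_ford", "genre:Sci-Fi", "genre:Crime"]
item_concepts = ["actor:harrison_ford", "director:george_lucas", "genre:Sci-Fi"]
weights = {o: [0.0, 0.0, 0.0] for o in RATINGS}
weights[5] = [2.0, 1.0, 0.0]  # this user tends to rate matching items with 5

m_ui = match_vector(user_concepts, item_concepts)
print(m_ui)
print(round(predict_cb(weights, m_ui), 3))
```

With all weights at zero every rating gets the same score, so the prediction falls back to the midpoint 3; positive weights on the rating-5 vector pull the prediction upward.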



Content-based model training

- How are $C_i$ and $C_u$ calculated?
  - $C_i$: expert knowledge (IMDB.com, rottentomatoes.com) [CBK11]
  - $C_u$: concepts the user has interacted with at least N times, tracked with a count-min sketch structure [DSHK08][MHS+13]

- How is $w_u^o$ updated?
  - Online logistic regression on each vector
  - Decreasing learning rate: $\gamma_t = \gamma_0 (1 + \alpha \gamma_0 t)^{-c}$
  - Update: $w_u^o \leftarrow w_u^o - \gamma(t_u) \left( \sigma(\langle w_u^o, m_{ui} \rangle) - \mathbb{1}_{r_{ui} = o} \right) m_{ui}$
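A minimal sketch of this update step follows. The default values of `gamma0`, `alpha`, and `c` are placeholders chosen for the example (the slides do not give `c`), and the helper names are hypothetical.

```python
import math

RATINGS = (1, 2, 3, 4, 5)  # the rating set O


def sigmoid(x):
    """Logistic function sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def learning_rate(t, gamma0, alpha, c):
    """Decreasing schedule gamma_t = gamma0 * (1 + alpha * gamma0 * t) ** (-c)."""
    return gamma0 * (1.0 + alpha * gamma0 * t) ** (-c)


def update_cb(weights, m_ui, rating, t_u, gamma0=0.5, alpha=1e-6, c=0.5):
    """One online logistic-regression step on each of the |O| weight vectors.

    The vector for the observed rating is trained towards target 1 and the
    other vectors towards target 0. Default hyperparameters are assumptions.
    """
    gamma = learning_rate(t_u, gamma0, alpha, c)
    for o in RATINGS:
        target = 1.0 if rating == o else 0.0
        score = sigmoid(sum(w * m for w, m in zip(weights[o], m_ui)))
        error = score - target
        weights[o] = [w - gamma * error * m for w, m in zip(weights[o], m_ui)]
    return weights


# Hypothetical example: the user's first two concepts match the item, rating is 5.
weights = {o: [0.0, 0.0, 0.0] for o in RATINGS}
update_cb(weights, m_ui=[1.0, 1.0, 0.0], rating=5, t_u=0)
print(weights[5], weights[1])
```

Only the coordinates of concepts that actually describe the item are changed, because the gradient is proportional to $m_{ui}$.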


Content-based results

- RMSE on the dataset; results with N = 5, α = 10E−6.

[Figure: "Metadata predictor RMSE". RMSE (y-axis, 0.95 to 1.3) versus γ0 (x-axis, 0 to 0.9); curves: RMSE train, RMSE cv.]

CF model profiles and prediction

- Collaborative Filtering model (CF): users and items are described by latent features, trained from user-item interactions.

[Figure: for the item "Star Wars", user vectors $q_u^1, \dots, q_u^5$ and item vector $p_i$, each over features f1 ... fk.]

- Each user has $|O|$ vectors $q_u^o \in \mathbb{R}^F$ ($o \in O$).

- An item is described by a vector $p_i \in \mathbb{R}^F$.

- The model predicts the probability that user $u$ will give rating $o$ to item $i$.

- Restriction on the user profile: $q_{u,f}^o \ge 0$ and $\sum_{o \in O} q_{u,f}^o = 1$.

- Restriction on the item profile: $p_{i,f} \ge 0$ and $\sum_{f \in F} p_{i,f} = 1$.

- Probability is $\pi_{ui}^o = \langle q_u^o, p_i \rangle$.

- Rating prediction is: $\hat{r}_{ui} = \sum_{o \in O} \pi_{ui}^o \times o$
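The two restrictions are what make $\pi_{ui}^o$ a probability distribution over ratings: $\sum_{o} \langle q_u^o, p_i \rangle = \sum_{f} p_{i,f} \sum_{o} q_{u,f}^o = \sum_{f} p_{i,f} = 1$. The short sketch below checks this on a toy profile with F = 2; all numbers and function names are hypothetical and serve only as an illustration.

```python
RATINGS = (1, 2, 3, 4, 5)  # the rating set O


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rating_probabilities(q_u, p_i):
    """pi_ui^o = <q_u^o, p_i> for every rating o."""
    return {o: dot(q_u[o], p_i) for o in RATINGS}


def predict_cf(q_u, p_i):
    """Expected rating under the predicted distribution."""
    probs = rating_probabilities(q_u, p_i)
    return sum(probs[o] * o for o in RATINGS)


# Hypothetical profiles with F = 2 latent features.
# For each feature, the user entries sum to 1 across the five ratings.
q_u = {
    1: [0.0, 0.1],
    2: [0.0, 0.2],
    3: [0.1, 0.4],
    4: [0.3, 0.2],
    5: [0.6, 0.1],
}
p_i = [0.75, 0.25]  # non-negative, sums to 1

probs = rating_probabilities(q_u, p_i)
print(probs)
print(sum(probs.values()))
print(predict_cf(q_u, p_i))
```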



CF model training

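- How is $q_u^o$ updated?
  - Stochastic projected regression on each vector [IICM11]
  - Decreasing learning rate: $\gamma_t = \gamma_0 (1 + \alpha \gamma_0 t)^{-c}$
  - Update: $q_u^o \leftarrow q_u^o + \gamma(t_u) \left( \mathbb{1}_{r_{ui} = o} - \langle p_i, q_u^o \rangle \right) p_i$, followed by the projection $q_u \leftarrow \Pi_{D_{user}}(q_u)$

- How is $p_i$ updated?
  - Update: $p_i \leftarrow p_i + \gamma(t_i) \left( 1 - \langle p_i, q_u^o \rangle \right) q_u^o$, followed by the projection $p_i \leftarrow \Pi_{D_{item}}(p_i)$

- The server does not need the value of $r_{ui}$ in order to update $p_i$.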
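The sketch below illustrates one projected update of each profile. Several points are assumptions rather than statements from the slides: the projections $\Pi_{D_{user}}$ and $\Pi_{D_{item}}$ are taken to be Euclidean projections onto the probability simplex (per feature across ratings for the user, across features for the item), the item step is assumed to use the vector of the observed rating, and the split into a client-computed direction (`item_gradient`) and a server-side step (`update_item`) is one hypothetical way to keep $r_{ui}$ on the client.

```python
RATINGS = (1, 2, 3, 4, 5)  # the rating set O


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1} (sort-based method)."""
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(sorted(v, reverse=True), start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0:
            theta = candidate  # keep the threshold of the last index that qualifies
    return [max(x - theta, 0.0) for x in v]


def update_user(q_u, p_i, rating, gamma):
    """Gradient step on every q_u^o, then projection of each feature column over o."""
    stepped = {}
    for o in RATINGS:
        target = 1.0 if rating == o else 0.0
        error = target - dot(p_i, q_u[o])
        stepped[o] = [q + gamma * error * p for q, p in zip(q_u[o], p_i)]
    for f in range(len(p_i)):
        column = project_simplex([stepped[o][f] for o in RATINGS])
        for o, value in zip(RATINGS, column):
            stepped[o][f] = value
    return stepped


def item_gradient(q_u, p_i, rating):
    """Client side: direction (1 - <p_i, q_u^o>) q_u^o for the observed rating o."""
    q_o = q_u[rating]
    error = 1.0 - dot(p_i, q_o)
    return [error * q for q in q_o]


def update_item(p_i, gradient, gamma):
    """Server side: applies the received direction, then projects p_i onto the simplex."""
    return project_simplex([p + gamma * g for p, g in zip(p_i, gradient)])


# Hypothetical starting point with F = 2: uniform user profile, uniform item profile.
q_u = {o: [0.2, 0.2] for o in RATINGS}
p_i = [0.5, 0.5]

q_new = update_user(q_u, p_i, rating=5, gamma=0.1)
p_new = update_item(p_i, item_gradient(q_u, p_i, rating=5), gamma=0.1)
print(q_new[5], q_new[1])
print(p_new)
```

In this arrangement `update_item` receives only a direction vector, never the rating value itself.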

Collaborative Filtering results

- RMSE on the dataset, α = 10E−6, for increasing γ0.

[Figure: "RMSE evolution across γ0 and F for CF model". RMSE (y-axis, 1.1 to 1.6) versus F dimension (x-axis, 0 to 100); cross-validation curves for γ0 = 0.05, 0.1, 0.25, and 0.5.]

Hybrid prediction

- How can both models be used for rating prediction? [BL06]

[Figure: client-side agent containing the prediction component and the recommendation component. For the item "Star Wars", the CB model (keyword profile) and the CF model (vectors $q_u^1, \dots, q_u^5$ and $p_i$) produce rating estimates (labeled 1 and 2), which are combined into the prediction $p_{i,t}$.]

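- $\ell : (\mathbb{R} \times O) \to \mathbb{R}$ is a loss function that scores a prediction.

- Cumulative regret: $R_{E,n} = \sum_{t=1}^{n} \left( \ell(p_{i,t}, r_{i,t}) - \ell(r^E_{i,t}, r_{i,t}) \right)$, where $r^E_{i,t}$ is the prediction of expert $E$ and $r_{i,t}$ is the observed rating.

- Expert weight: $W_{E,t-1} = \dfrac{\exp(\eta_t R_{E,t-1})}{\sum_{e \in \mathcal{E}} \exp(\eta_t R_{e,t-1})}$

- Final prediction: weighted average of the experts, $p_{i,t} = \dfrac{\sum_{E \in \mathcal{E}} W_{E,t-1} \, r^E_{i,t}}{\sum_{E \in \mathcal{E}} W_{E,t-1}}$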
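The combination rule can be sketched as a small exponentially weighted forecaster over the two experts. The squared loss, the constant value of `eta` (the slides use a time-dependent $\eta_t$ without giving its form), and the class and method names are assumptions made for this example.

```python
import math


def squared_loss(prediction, rating):
    """Assumed loss function; the slides only require some loss l(prediction, rating)."""
    return (prediction - rating) ** 2


class ExponentialWeightsHybrid:
    """Combines expert predictions with weights proportional to exp(eta * regret)."""

    def __init__(self, expert_names, eta=0.1, loss=squared_loss):
        self.eta = eta
        self.loss = loss
        self.regret = {name: 0.0 for name in expert_names}

    def weights(self):
        # Subtracting the largest regret avoids overflow and leaves the
        # normalized weights unchanged.
        top = max(self.regret.values())
        raw = {e: math.exp(self.eta * (r - top)) for e, r in self.regret.items()}
        total = sum(raw.values())
        return {e: value / total for e, value in raw.items()}

    def predict(self, expert_predictions):
        """Weighted average of the expert predictions (weights already sum to 1)."""
        w = self.weights()
        return sum(w[e] * expert_predictions[e] for e in w)

    def update(self, expert_predictions, rating):
        """Adds loss(combined) - loss(expert) to each expert's cumulative regret."""
        combined_loss = self.loss(self.predict(expert_predictions), rating)
        for e, prediction in expert_predictions.items():
            self.regret[e] += combined_loss - self.loss(prediction, rating)


# Hypothetical stream in which the CB expert is consistently closer to the rating.
hybrid = ExponentialWeightsHybrid(["CB", "CF"])
stream = [({"CB": 4.0, "CF": 2.0}, 4), ({"CB": 3.5, "CF": 2.5}, 4)]
for predictions, rating in stream:
    print(round(hybrid.predict(predictions), 3))
    hybrid.update(predictions, rating)
print(hybrid.weights())
```

An expert that has incurred less loss than the combined forecaster accumulates positive regret and therefore gains weight in later rounds.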

Exponential weighted regret results

- RMSE on the dataset, α = 10E−6, for increasing γ0. The γ0 of the CB model in the hybrid is set to 0.75.

[Figure: "RMSE on test set". RMSE (y-axis, 1.05 to 1.5) versus γ0 (x-axis, 0.1 to 0.9); curves: RMSE CB model, RMSE CF model, RMSE Hybrid model.]

Summary

- Client-side agents help avoid user exposure risks.

- Placed in an online learning setting, hybridization of client-side predictive models improves on the predictive performance of the individual models.

- Outlook
  - The current model still reveals implicit interactions to the recommendation server.

References

[BL06] Nicolò Cesa-Bianchi and Gábor Lugosi, Prediction, learning, and games, Cambridge University Press, New York, NY, USA, 2006.

[CBK11] Ivan Cantador, Peter Brusilovsky, and Tsvi Kuflik, 2nd workshop on information heterogeneity and fusion in recommender systems (HetRec 2011), Proceedings of the 5th ACM conference on Recommender systems (New York, NY, USA), RecSys 2011, ACM, 2011.

[DSHK08] Xenofontas Dimitropoulos, Marc Stoecklin, Paul Hurley, and Andreas Kind, The eternal sunshine of the sketch data structure, Comput. Netw. 52 (2008), no. 17, 3248–3257.

[Fon99] Leonard N. Foner, Political artifacts and personal privacy: The Yenta multi-agent distributed matchmaking system, Ph.D. thesis, Program in Media Arts and Sciences, School of Architecture and Planning, Massachusetts Institute of Technology, June 1999.

[IICM11] Sibren Isaacman, Stratis Ioannidis, Augustin Chaintreau, and Margaret Martonosi, Distributed rating prediction in user generated content streams, Proceedings of the fifth ACM conference on Recommender systems (New York, NY, USA), RecSys '11, ACM, 2011, pp. 69–76.

[LFR06] Shyong Lam, Dan Frankowski, and John Riedl, Do you trust your recommendations? An exploration of security and privacy issues in recommender systems, Emerging Trends in Information and Communication Security (Gunter Muller, ed.), Lecture Notes in Computer Science, vol. 3995, Springer Berlin / Heidelberg, Berlin, Heidelberg, 2006, pp. 14–29.

[MHS+13] H. Brendan McMahan, Gary Holt, D. Sculley, Michael Young, Dietmar Ebner, Julian Grady, Lan Nie, Todd Phillips, Eugene Davydov, Daniel Golovin, Sharat Chikkerur, Dan Liu, Martin Wattenberg, Arnar M. Hrafnkelsson, Tom Boulos, and Jeremy Kubica, Ad click prediction: A view from the trenches, Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (New York, NY, USA), KDD '13, ACM, 2013, pp. 1222–1230.
