View
847
Download
0
Category
Preview:
Citation preview
France is the 2nd manga consumer in the world
1. Japan - 500M volumes2. France - 13M volumes3. US - 9M volumes
Japan Expo 2015, summer festival about Japanese culture250k tickets over 4 dayse130 per personMore than e8M every year
2 / 21
Mangaki
Let’s use algorithms to discoverpearls of Japanese culture!
23 members:3 mad developers
1 graphic designer1 intern
Feb 2014 - PhD about Adaptive TestingOct 2014 - MangakiOct 2015 - Student Demo Cup winner, Microsoft prizeFeb 2016 - Japanese Culture Embassy Prize, Paris
We’re flying to Tokyo!
3 / 21
Mangaki
2000 users150 → 14000 works anime / manga / OST
290000 ratings fav / like / dislike / neutral / willsee / wontsee
People rate a few works Preference Elicitation
And receive recommendations Collaborative Filtering
4 / 21
Collaborative Filtering
ProblemUsers u = 1, . . . , n and items i = 1, . . . ,mEvery user u rates some items Ru(rui : rating of user u on item i)How to guess unknown ratings?
k -nearest neighbor
Similarity score between users:
score(u, v) =Ru · Rv
||Ru|| · ||Rv ||.
Let us find the k nearest neighbors of the userAnd recommend what they liked that he did not rate
5 / 21
Preference Elicitation
ProblemWhat questions to ask adaptively to a new user?
4 decksPopularityControversy (Reddit)
controversy(L,D) = (L + D)min(L/D,D/L)
Most likedPrecious pearls: few rates but almost no dislike
Problem: most people cannot rate the controversial items
6 / 21
Yahoo’s Bootstrapping Decision Trees
Good balance between like, dislike and unknown outcomes.
7 / 21
Matrix Completion
If the matrix of ratings is assumed low rank:
M =
R1R2...
Rn
= = CP
Every line Ru is a linear combination of few profiles P.
1. Explicable profiles
If P P1 : adventure P2 : romance P3 : plot twistAnd Cu 0,2 −0,5 0,6⇒ u likes a bit adventure, dislikes romance, loves plot twists.
2. We get user2vec & item2vec in the same space!⇒ An user likes items that are close to him.
8 / 21
Mangaki’s Explained Profiles
P1 loves stories for adults (Ghibli/SF), hates stories for teensPaprika : In the near future, a machine that allows to enteranother person’s dreams for psychotherapy treatment has beenrobbed. The doctors investigate.
P2 likes the most popular works, hates really weird worksSame-family homosexual romances:- Mira is a high school student in love with his father, Kyosuke.- They are involved both in a romantic and sexual relationship.- Trouble arises when Mira finds adoption papers.- Mira thinks [his dad] is cheating on him with a girl, which turns out tobe his mother.- Meanwhile, Mira is chased by his best friend, who is also in lovewith him.- And the end, it turns out that Kyosuke is his uncle and everything isright again.
9 / 21
Modelling Diversity: Determinantal Point Processes
Let’s sample a few items that are far from each other and askthem to the user.
12 / 21
Determinantal Point Processes
We want to sample over n itemsK : n × n similarity matrix over items (positive semidefinite)
P is a determinantal point process if Y is drawn such that:
∀A ⊂ Y , P(A ⊆ Y ) ∝ det(KA)
Example
K =
1 2 3 42 5 6 73 6 8 94 7 9 0
A = {1,2,4} will be includedwith probability prop. to
KA = det
1 2 42 5 74 7 0
13 / 21
Link with diversity
The determinant is the volume of the vectorsNon-correlated (diverse) vectors will increase the volumeWe need to sample k diverse elements efficiently
14 / 21
Results
We assume the eigenvalues of K are known.
Kulesza & Taskar, ICML 2011Sampling k diverse elements from a DPP of size n hascomplexity O(nk2).
Kang (Samsung), NIPS 2013
We found an algorithm ϵ-close with complexity O(k3 log(k/ϵ)).
Rebeschini & Karbasi, COLT 2015No, you did not.Your proof of complexity is false, but at least your algorithmsamples correctly.
Vie, RecSysFR 2016Please calm down guys, we’re all friends here.
15 / 21
Work in progress: Preference Elicitation in Mangaki
Keep the 100 most popular worksPick 5 diverse works using k -DPPAsk the user to rate themEstimate the user’s vectorFilter less informative works accordinglyRepeat.
16 / 21
Recommended