Khalid El-Arini Carnegie Mellon University Joint work with: Ulrich Paquet, Ralf Herbrich, Jurgen Van Gael, Blaise Agüera y Arcas Transparent User Models for Personalization


Khalid El-Arini
Carnegie Mellon University

Joint work with: Ulrich Paquet, Ralf Herbrich, Jurgen Van Gael, Blaise Agüera y Arcas

Transparent User Models for Personalization


Personalization is ubiquitous.


• YouTube: 72+ hours/minute of new video
• Facebook: 950 million+ users
• Twitter: 400+ million tweets/day
• Shopping:
  [1994]: 500K unique consumer goods sold in U.S.
  [2010]: Amazon alone offered 24 million.

Personalization is invaluable.

Keyword search is not enough.


Personalization is often wrong.


“Basil…is not a neo-Nazi. Lukas…is not a shadowy stalker. David…is not Korean. […] intent on giving them such labels.” - J. Zaslow, November 26, 2002


“there's just one way to change its mind: outfox it.” - J. Zaslow, November 26, 2002

What recourse do we have?

Can we do better?


We propose an alternative.

Why am I getting this?

You behave like a vegan hipster

Vegan? Really? Why?

You:
• tweeted with #meatlessmonday
• follow @WholeFoods
• …


You behave like a Brooklyn hipster

Goal: Achieve transparency via interpretable user features, learned from user activity


Badges


Approach Model Experiments Summary


1. Define a vocabulary of badges

Example badges: Apple fanboy, vegan, runner, photographer

Rich, interpretable and explainable


2. Identify exemplars

How do I find vegans?


observed label

Take advantage of how users describe themselves
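One way to read this step: treat a badge's label as observed whenever a hand-picked phrase appears in the user's profile text. The sketch below is only an illustration under that assumption; the `BADGE_LABELS` phrases and the `observed_labels` helper are hypothetical and do not come from the talk.

```python
import re

# Hypothetical badge vocabulary: badge name -> phrases that count as a
# self-applied label. These phrases are assumptions for illustration.
BADGE_LABELS = {
    "vegan": ["vegan"],
    "runner": ["runner", "marathoner"],
    "photographer": ["photographer"],
}

def observed_labels(profile_text):
    """Return the set of badges whose label phrase appears in a profile."""
    text = profile_text.lower()
    found = set()
    for badge, phrases in BADGE_LABELS.items():
        for phrase in phrases:
            # Whole-word match, so "runner" does not fire on "forerunner".
            if re.search(r"\b" + re.escape(phrase) + r"\b", text):
                found.add(badge)
                break
    return found
```

A keyword match like this is deliberately crude: it only yields exemplars, and the remaining users still have to be inferred from behavior.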


Most vegans don’t label themselves as “vegan” on Twitter…

we want to infer the attributes of these users


3. Model characteristic behavior
• Hashtags: #meatlessmonday
• Retweets: RT @WholeFoods
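As a rough sketch of how hashtags and retweets could be turned into a set of actions per tweet, assuming plain tweet text and the conventional `RT @name` prefix (the `extract_actions` helper and its normalization choices are assumptions, not the talk's pipeline):

```python
import re

HASHTAG = re.compile(r"#(\w+)")
RETWEET = re.compile(r"\bRT @(\w+)")

def extract_actions(tweet):
    """Return the set of actions (hashtags and retweeted accounts) in a tweet."""
    # Hashtags are lowercased so "#MeatlessMonday" and "#meatlessmonday" coincide.
    actions = {"#" + tag.lower() for tag in HASHTAG.findall(tweet)}
    # Retweets are keyed by the account being retweeted.
    actions |= {"RT @" + name for name in RETWEET.findall(tweet)}
    return actions
```

Collecting these sets over all of a user's tweets gives the binary "has user u performed action j?" observations.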



Model sketch

• We have no negative training examples.
  Use a generative model.
• Actions can be explained by multiple badges, even for the same user.
  Noisy-or to combine badges.
• How do we deal with user corrections?
  Observing a latent variable.

Plate notation: B badges (i=1…B), N users (u=1…N), F actions (j=1…F)

b_i(u): Does user u have badge i?

λ_i(u): Does user u have label for badge i in his profile?

a_j(u): Has user u performed action j?

s_ij: Does badge i explain action j?

φ_ij (hyperparameters α_φ, β_φ): What’s the probability that a user with badge i performs action j?

φ_bg: What is the background probability for each action?

Noisy-or: Can at least one of my badges (or the background) explain it?
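A minimal sketch of the noisy-or combination, assuming the standard form: the action goes unexplained only if the background and every relevant badge all fail to produce it. The argument layout (nested lists indexed by badge i and action j) is an assumption for illustration; the slides do not spell out the exact parameterization.

```python
def noisy_or(b_u, s, phi, phi_bg, j):
    """P(user performs action j) under a standard noisy-or.

    b_u[i]    : 1 if the user has badge i, else 0
    s[i][j]   : 1 if badge i can explain action j, else 0
    phi[i][j] : probability that a user with badge i performs action j
    phi_bg[j] : background probability of action j
    """
    p_none = 1.0 - phi_bg[j]  # the background fails to explain the action
    for i, has_badge in enumerate(b_u):
        if has_badge and s[i][j]:
            p_none *= 1.0 - phi[i][j]  # badge i also fails to explain it
    return 1.0 - p_none
```

With no badges, the result falls back to the background probability; each additional explaining badge can only raise it.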

Beta priors to control sparsity

γ_i^T, γ_i^F (hyperparameters α_T, β_T, α_F, β_F):
• Beta prior to encode low recall (e.g., 10%)
• Beta prior to encode high precision (e.g., 99.9%)
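To make the role of these priors concrete: a Beta(α, β) distribution has mean α/(α+β), so a prior can be centered on a target rate by choosing the two pseudo-counts accordingly. The sketch below uses the slide's example rates, but the pseudo-count totals (100 and 1000) are assumed values chosen for illustration, not the talk's settings.

```python
def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

def beta_from_mean(mean, strength):
    """Beta parameters with the given mean.

    `strength` is alpha + beta, i.e. how many pseudo-observations
    the prior is worth.
    """
    return mean * strength, (1.0 - mean) * strength

# Illustrative pseudo-count totals (assumed, not taken from the talk).
recall_prior = beta_from_mean(0.10, 100.0)
precision_prior = beta_from_mean(0.999, 1000.0)
```

A larger `strength` makes the prior harder for the data to override.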

[Figure: full graphical model in plate notation. Variables: η_i, s_ij, φ_ij, φ_bg, a_j(u), b_i(u), λ_i(u), γ_i^T, γ_i^F, ω_i. Hyperparameters: α_φ, β_φ, α_η, β_η, α_ω, β_ω, α_T, β_T, α_F, β_F. Plates: i=1…B, j=1…F, u=1…N.]


Inference

• Collapsed Gibbs sampler (with MH steps)

[Figure: model variables s_ij, φ_ij, φ_bg, b_i(u)]
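The slide names the sampler without giving its updates. As a generic illustration of the Gibbs part only (this is not the authors' collapsed sampler, and it omits the MH steps), resampling a binary variable such as b_i(u) comes down to comparing the joint probability under its two values. The helper name and its inputs are hypothetical.

```python
import math
import random

def gibbs_sample_binary(log_joint_0, log_joint_1, rng=random):
    """Draw a binary variable from its full conditional.

    log_joint_0 / log_joint_1 are the unnormalized log joint probabilities
    with the variable set to 0 and 1, all other variables held fixed.
    Returns (sample, p1) where p1 = P(variable = 1 | rest).
    """
    diff = log_joint_1 - log_joint_0
    # Numerically stable logistic of diff.
    if diff >= 0:
        p1 = 1.0 / (1.0 + math.exp(-diff))
    else:
        e = math.exp(diff)
        p1 = e / (1.0 + e)
    sample = 1 if rng.random() < p1 else 0
    return sample, p1
```

In a full sampler this draw would be repeated for every user and badge, with the log joints recomputed from the current state each time.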

[Figure: full graphical model, with the output “You behave like a vegan hipster.”]



Data description

• Start with 7 million Twitter users
• Manually define 31 sample badges by specifying labels
• Gather 2 million tweets from August 2011
• Recall: actions are hashtags and retweets

Remove infrequent actions and inactive users, leaving us with:

• 75,880 users
• 32,030 actions
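The filtering step could look roughly like the sketch below. The talk does not state its cutoffs, so both thresholds are hypothetical parameters, and the dictionary layout is an assumption.

```python
from collections import Counter

def filter_sparse(user_actions, min_users_per_action, min_actions_per_user):
    """Drop rare actions, then drop users left with too few actions.

    user_actions: dict mapping user id -> set of actions.
    Both thresholds are hypothetical; the talk does not give its cutoffs.
    """
    # Count how many distinct users performed each action.
    action_counts = Counter(
        action for actions in user_actions.values() for action in actions
    )
    frequent = {a for a, n in action_counts.items() if n >= min_users_per_action}

    filtered = {}
    for user, actions in user_actions.items():
        kept = actions & frequent
        if len(kept) >= min_actions_per_user:
            filtered[user] = kept
    return filtered
```

This is a single pass: removing users can push some actions back under the threshold, so a stricter version would repeat the two steps until nothing changes.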

Badge statistics

[Figure: bar chart over the 31 badges; vertical axis ticks run from 0 to 20,000. Badges called out: artist, photographer, country music fan, book worm.]


Can we learn badges?

[Figure: Vegetarian badge]

[Figure: Runner badge]

[Figure: Hacker badge]

[Figure: Manchester United badge]


Do all badges look this good?

No, but most do.

Over-generalized: wine lover

Overwhelmed: Ruby on Rails


Can we just use the labels directly?


Inferred Apple fanboy badge

Self-described Apple fanboys


Comparative Analysis

• Compare to labeled LDA [Ramage+ 2009]
  – LDA extension where each document is labeled with multiple tags
  – One-to-one mapping between topics and tags
  – Document explained only by topics associated with its tags
• Hold out random 10% of labels, treat as ground truth, and try to predict them
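The hold-out protocol can be sketched as follows, assuming labels are stored as (user, badge) pairs and that each model outputs a score per badge for a user. Both helper names and the data layout are assumptions; the scoring models themselves are out of scope here.

```python
import random

def hold_out_labels(labels, fraction=0.10, seed=0):
    """Split (user, badge) label pairs into training and held-out lists."""
    rng = random.Random(seed)
    pairs = sorted(labels)  # fixed order so the split is reproducible
    rng.shuffle(pairs)
    n_held = int(round(fraction * len(pairs)))
    return pairs[n_held:], pairs[:n_held]

def rank_of_label(scores, badge):
    """1-based rank of `badge` when badges are sorted by descending score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(badge) + 1
```

A rank of 1 means the held-out label was the model's top prediction for that user; averaging the rank over all held-out pairs gives one number per model to compare.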

Better predictive performance

[Figure: rank of held-out labels, with the “better” direction marked on the axis]

Better predictions for active users

[Figure: axis marked with the “better” direction]

Sparse badges

[Figure: Apple fanboy (badges) vs. Apple fanboy (L-LDA)]



Leveraged how users describe themselves to build interpretable user features

You behave like a vegan hipster


Empirically showed we can infer a user’s attributes from his behavior


Thank you


What recourse do we have?

Collaborative filtering

Content-based filtering

Can we do better?


Most vegans don’t label themselves as “vegan” on Twitter…

…but what about non-vegans?

“I drink too much and hate vegans.”