45
Recommender Systems: An Introduction to the Art Tony White

Recommender Systems: An Introduction to the Art Tony White

Embed Size (px)

Citation preview

Page 1: Recommender Systems: An Introduction to the Art Tony White

Recommender Systems: An Introduction to the Art

Tony White

Page 2: Recommender Systems: An Introduction to the Art Tony White

- 2 -

ACKNOWLEDGEMENT

Several of these slides are associated with material from “Recommender Systems: An Introduction”.

Page 3: Recommender Systems: An Introduction to the Art Tony White

- 3 -

Agenda

Definition

Collaborative Filtering

Content-based Filtering

Page 4: Recommender Systems: An Introduction to the Art Tony White

- 4 -

What is a recommender system?

It is a utility to help users find items in which they are interested among catalogues containing a huge number of services and products.

It helps users deal with information overload in search engines’ results.

It has roots in:– information retrieval, – cognitive science, – customer marketing modeling and management.

Page 5: Recommender Systems: An Introduction to the Art Tony White

- 5 -

Recommender systems

RS seen as a function

Given:– User model (e.g. ratings, preferences, demographics, situational context)– Items (with or without description of item characteristics)

Find:– Relevance score. Used for ranking.

Relation to Information Retrieval: – IR is finding material [..] of an unstructured nature [..] that satisfies an

information need from within large collections [..].» (Manning et al. 20081)

(1) Manning, Raghavan, and Schütze, Introduction to information retrieval, Cambridge University Press, 2008

Page 6: Recommender Systems: An Introduction to the Art Tony White

- 6 -

Recommender System Paradigms

Recommender systems reduce information overload by estimating relevance

Page 7: Recommender Systems: An Introduction to the Art Tony White

- 7 -

Recommender System Paradigms

Personalized recommendations

Page 8: Recommender Systems: An Introduction to the Art Tony White

- 8 -

Recommender System Paradigms

Collaborative: "Tell me what's popular among my peers"

Page 9: Recommender Systems: An Introduction to the Art Tony White

- 9 -

Recommender System Paradigms

Content-based: "Show me more of the same what I've liked"

Page 10: Recommender Systems: An Introduction to the Art Tony White

- 10 -

Recommender System Paradigms

Knowledge-based: "Tell me what fits based on my needs"

Page 11: Recommender Systems: An Introduction to the Art Tony White

- 11 -

Recommender System Paradigms

Hybrid: combinations of various inputs and/or composition of different mechanism

Page 12: Recommender Systems: An Introduction to the Art Tony White

Collaborative Filtering

Page 13: Recommender Systems: An Introduction to the Art Tony White

- 13 -

Collaborative Filtering (CF)

The most prominent approach to generate recommendations– used by large, commercial e-commerce sites– well-understood, various algorithms and variations exist– applicable in many domains (book, movies, DVDs, ..)

Approach– use the "wisdom of the crowd" to recommend items

Basic assumption and idea– Users give ratings to catalog items (implicitly or explicitly)– Customers who had similar tastes in the past, will have similar tastes in the

future

Page 14: Recommender Systems: An Introduction to the Art Tony White

- 14 -

Pure CF Approaches

Input– Only a matrix of given user–item ratings

Output types– A (numerical) prediction indicating to what degree the current user will like or

dislike a certain item– A top-N list of recommended items

Page 15: Recommender Systems: An Introduction to the Art Tony White

- 15 -

User-based nearest-neighbor collaborative filtering (1)

Page 16: Recommender Systems: An Introduction to the Art Tony White

- 16 -

User-based nearest-neighbor collaborative filtering (2)

Example– A database of ratings of the current user, Alice, and some other users is given:

– Determine whether Alice will like or dislike Item5, which Alice has not yet rated or seen

Item1 Item2 Item3 Item4 Item5

Alice 5 3 4 4 ?

User1 3 1 2 3 3

User2 4 3 4 3 5

User3 3 3 1 5 4

User4 1 5 5 2 1

Page 17: Recommender Systems: An Introduction to the Art Tony White

- 17 -

User-based nearest-neighbor collaborative filtering (3)

Some first questions– How do we measure similarity?– How many neighbors should we consider?– How do we generate a prediction from the neighbors' ratings?

Item1 Item2 Item3 Item4 Item5

Alice 5 3 4 4 ?

User1 3 1 2 3 3

User2 4 3 4 3 5

User3 3 3 1 5 4

User4 1 5 5 2 1

Page 18: Recommender Systems: An Introduction to the Art Tony White

- 18 -

Measuring user similarity (1)

 

Page 19: Recommender Systems: An Introduction to the Art Tony White

- 19 -

Measuring user similarity (2)

Item1 Item2 Item3 Item4 Item5

Alice 5 3 4 4 ?

User1 3 1 2 3 3

User2 4 3 4 3 5

User3 3 3 1 5 4

User4 1 5 5 2 1

sim = 0.85

sim = 0.71

sim = 0.00

sim = -0.79

Assumes that you don’t include item5 in average

Zeynep
Punkt statt Komma als Trennzeichen.
Page 20: Recommender Systems: An Introduction to the Art Tony White

- 20 -

Pearson correlation

Takes differences in rating behavior into account

Works well in usual domains, compared with alternative measures– such as cosine similarity

Item1 Item2 Item3 Item4

0

1

2

3

4

5

6 Alice

User1

User4

Ratings

Page 21: Recommender Systems: An Introduction to the Art Tony White

- 21 -

Making predictions

 

Only include reasonably similar users (>0). Also, average here includes ALL reasonably similarusers ratings (not the same average as before).

Page 22: Recommender Systems: An Introduction to the Art Tony White

- 22 -

Improving the metrics / prediction function

Not all neighbor ratings might be equally "valuable"– Agreement on commonly liked items is not so informative as agreement on

controversial items– Possible solution: Give more weight to items that have a higher variance

Value of number of co-rated items– Use "significance weighting", by e.g., linearly reducing the weight when the

number of co-rated items is low

Case amplification– Intuition: Give more weight to "very similar" neighbors, i.e., where the

similarity value is close to 1.

Neighborhood selection– Use similarity threshold or fixed number of neighbors

Page 23: Recommender Systems: An Introduction to the Art Tony White

- 23 -

Memory-based and model-based approaches

User-based CF is said to be "memory-based"– the rating matrix is directly used to find neighbors / make predictions– does not scale for most real-world scenarios– large e-commerce sites have tens of millions of customers and millions of

items

Model-based approaches– based on an offline pre-processing or "model-learning" phase– at run-time, only the learned model is used to make predictions– models are updated / re-trained periodically– large variety of techniques used – model-building and updating can be computationally expensive– item-based CF is an example for model-based approaches

Page 24: Recommender Systems: An Introduction to the Art Tony White

- 24 -

Item-based collaborative filtering

Basic idea: – Use the similarity between items (and not users) to make predictions

Example: – Look for items that are similar to Item5– Take Alice's ratings for these items to predict the rating for Item5

Item1 Item2 Item3 Item4 Item5

Alice 5 3 4 4 ?

User1 3 1 2 3 3

User2 4 3 4 3 5

User3 3 3 1 5 4

User4 1 5 5 2 1

Page 25: Recommender Systems: An Introduction to the Art Tony White

- 25 -

The cosine similarity measure

 

 

Page 26: Recommender Systems: An Introduction to the Art Tony White

- 26 -

Making predictions

A common prediction function:

Neighborhood size is typically also limited to a specific size

Not all neighbors are taken into account for the prediction

An analysis of the MovieLens dataset indicates that "in most real-world situations, a neighborhood of 20 to 50 neighbors seems reasonable" (Herlocker et al. 2002)

 

Page 27: Recommender Systems: An Introduction to the Art Tony White

- 27 -

Data sparsity problems

Cold start problem– How to recommend new items? What to recommend to new users?

Straightforward approaches– Ask/force users to rate a set of items– Use another method (e.g., content-based, demographic or simply non-

personalized) in the initial phase– Default voting: assign default values to items that only one of the two users to

be compared has rated (Breese et al. 1998)

Alternatives– Use better algorithms (beyond nearest-neighbor approaches)– Example:

In nearest-neighbor approaches, the set of sufficiently similar neighbors might be too small to make good predictions

Assume "transitivity" of neighborhoods

Page 28: Recommender Systems: An Introduction to the Art Tony White

- 28 -

Association rule mining

Page 29: Recommender Systems: An Introduction to the Art Tony White

- 29 -

Recommendation based on Association Rule Mining

Simplest approach– transform 5-point ratings into binary

ratings (1 = above user average)

Mine rules such as– Item1 → Item5

support (2/4), confidence (2/2) (without Alice)

Make recommendations for Alice (basic method)– Determine "relevant" rules based on Alice's transactions

(the above rule will be relevant as Alice bought Item1)– Determine items not already bought by Alice– Sort the items based on the rules' confidence values

Different variations possible– dislike statements, user associations ..

Item1 Item2 Item3 Item4 Item5

Alice 1 0 0 0 ?User1 1 0 1 0 1

User2 1 0 1 0 1

User3 0 0 0 1 1

User4 0 1 1 0 0

Page 30: Recommender Systems: An Introduction to the Art Tony White

- 30 -

The Apriori Algorithm

Scan D

itemset sup.{1} 2{2} 3{3} 3{4} 1{5} 3

C1

itemset sup.{1} 2{2} 3{3} 3{5} 3

L1

itemset sup{1 3} 2{2 3} 2{2 5} 3{3 5} 2

L2itemset sup

{1 2} 1{1 3} 2{1 5} 1{2 3} 2{2 5} 3{3 5} 2

C2 itemset{1 2}{1 3}{1 5}{2 3}{2 5}{3 5}

C2

Scan D

C3 itemset{2 3 5}

Scan D L3 itemset sup{2 3 5} 2

TID Items100 1 3 4200 2 3 5300 1 2 3 5400 2 5

Database DMin support =50%

Page 31: Recommender Systems: An Introduction to the Art Tony White

Content-based recommendation

Page 32: Recommender Systems: An Introduction to the Art Tony White

- 32 -

Content-based recommendation

While CF – methods do not require any information about the items, it might be reasonable to exploit such information; and recommend fantasy novels to people who liked fantasy novels in the past

What do we need: some information about the available items such as the genre ("content") some sort of user profile describing what the user likes (the preferences)

The task: learn user preferences locate/recommend items that are "similar" to the user preferences

"show me more of the same what I've liked"

Page 33: Recommender Systems: An Introduction to the Art Tony White

- 33 -

What is the "content"?

Most CB-recommendation techniques applied to recommending text documents.– Like web pages or newsgroup messages for example.

Content of items can also be represented as text documents.– With textual descriptions of their basic characteristics.– Structured: Each item is described by the same set of attributes

– Unstructured: free-text description.

Title Genre Author Type Price Keywords

The Night of the Gun

Memoir David Carr Paperback 29.90 Press and journalism, drug addiction, personal memoirs, New York

The Lace Reader

Fiction, Mystery

Brunonia Barry

Hardcover 49.90 American contemporary fiction, detective, historical

Into the Fire Romance, Suspense

Suzanne Brockmann

Hardcover 45.90 American fiction, murder, neo-Nazism

Page 34: Recommender Systems: An Introduction to the Art Tony White

- 34 -

Item representation

Content representation and item similarities

Simple approach– Compute the similarity of an unseen item with the

user profile based on the keyword overlap (e.g. using the Dice coefficient)

– Or use and combine multiple metrics

Title Genre Author Type Price Keywords

The Night of the Gun

Memoir David Carr Paperback 29.90 Press and journalism, drug addiction, personal memoirs, New York

The Lace Reader

Fiction, Mystery

Brunonia Barry

Hardcover 49.90 American contemporary fiction, detective, historical

Into the Fire

Romance, Suspense

Suzanne Brockmann

Hardcover 45.90 American fiction, murder, neo-Nazism

User profile

Title Genre Author Type Price Keywords

… Fiction Brunonia, Barry, Ken Follett

Paperback 25.65 Detective, murder, New York

 

 

Page 35: Recommender Systems: An Introduction to the Art Tony White

- 35 -

Simple keyword representation has its problems – in particular when automatically extracted as

not every word has similar importance longer documents have a higher chance to have an overlap with the user profile

Standard measure: TF-IDF– Encodes text documents in multi-dimensional Euclidian space

weighted term vector– TF: Measures, how often a term appears (density in a document)

assuming that important terms appear more often normalization has to be done in order to take document length into account

– IDF: Aims to reduce the weight of terms that appear in all documents

Page 36: Recommender Systems: An Introduction to the Art Tony White

- 36 -

TF-IDF II

or log(1+freqij)

Page 37: Recommender Systems: An Introduction to the Art Tony White

- 37 -

Example TF-IDF representation

 

Example taken from http://informationretrieval.org

Antony and Cleopatra

Julius Caesar

The Tempest

Hamlet Othello Macbeth

Antony 157 73 0 0 0 0

Brutus 4 157 0 1 0 0

Caesar 232 227 0 2 1 1

Calpurnia 0 10 0 0 0 0

Cleopatra 57 0 0 0 0 0

mercy 1.51 0 3 5 5 1

worser 1.37 0 1 1 1 0

 

Page 38: Recommender Systems: An Introduction to the Art Tony White

- 38 -

Example TF-IDF representation

 

Example taken from http://informationretrieval.org

Antony and Cleopatra

Julius Caesar

The Tempest

Hamlet Othello Macbeth

Antony 157 73 0 0 0 0

Brutus 4 157 0 1 0 0

Caesar 232 227 0 2 1 1

Calpurnia 0 10 0 0 0 0

Cleopatra 57 0 0 0 0 0

mercy 1.51 0 3 5 5 1

worser 1.37 0 1 1 1 0

Antony and Cleopatra

Julius Caesar

The Tempest

Hamlet Othello Macbeth

Antony 5.25 3.18 0 0 0 0.35

Brutus 1.21 6.1 0 1 0 0

Caesar 8.59 2.54 0 1.51 0.25 0

Calpurnia 0 1.54 0 0 0 0

Cleopatra 2.85 0 0 0 0 0

mercy 1.51 0 1.9 0.12 5.25 0.88

worser 1.37 0 0.11 4.15 0.25 1.95

Page 39: Recommender Systems: An Introduction to the Art Tony White

- 39 -

Improving the vector space model

Vectors are usually long and sparse

remove stop words – They will appear in nearly all documents.– e.g. "a", "the", "on", …

use stemming– Aims to replace variants of words by their common stem– e.g. "went" "go", "stemming" "stem", …

size cut-offs – only use top n most representative words to remove "noise" from data– e.g. use top 100 words

Page 40: Recommender Systems: An Introduction to the Art Tony White

- 40 -

Cosine similarity

 

Page 41: Recommender Systems: An Introduction to the Art Tony White

- 41 -

Recommending items

Page 42: Recommender Systems: An Introduction to the Art Tony White

- 42 -

Recommending items

Retrieval quality depends on individual capability to formulate queries with right keywords.

Query-based retrieval: Rocchio's method– The SMART System: Users are allowed to rate (relevant/irrelevant) retrieved

documents (feedback)– The system then learns a prototype of relevant/irrelevant documents– Queries are then automatically extended with additional terms/weight of relevant

documents

Page 43: Recommender Systems: An Introduction to the Art Tony White

- 43 -

Rocchio details

Document collections D+ (liked) and D- (disliked)– Calculate prototype vector for these categories.

Computing modified query Qi+1 from current query Qi with:

 

, , used to fine-tune the feedback – weight for original query– weight for positive feedback– weight for negative feedback

Often only positive feedback is used– More valuable than negative feedback

Typical α=1,β=0.8,γ=0

Page 44: Recommender Systems: An Introduction to the Art Tony White

- 44 -

Practical challenges of Rocchio's method

Page 45: Recommender Systems: An Introduction to the Art Tony White

- 45 -

Summary

Described collaborative filtering and content-based filtering

Problems of cold start, sparsity and dimensionality

Knowledge based systems deal with domains where few examples exist

Integration of social network data also used