
Personalized Recommender Systems: Mining User Preference

Joonseok Lee
Georgia Institute of Technology

2013/06/10


Agenda
- Introduction
- Types of Recommender Systems
- Collaborative Filtering Toolkit PREA and Comparative Study
- Evaluation of Recommender Systems


Why recommendation?


Examples
- Product recommendation
- Friend recommendation
- Rating prediction
- Personalized web search


Types of Recommender Systems

By approach:
- Content-based Recommender
- Collaborative Filtering (Memory-based, Model-based)
- Hybrid Recommender

By goal:
- Recommending good items
- Predicting unseen ratings
- Maximizing a utility function


Content-based Filtering

Content information:
- Demographic data about users: age, gender, geographic location
- Product attributes: movie (genre, director, release year); book (author, language, publication year, genre)

Depends on the domain:
- Domain-specific modeling is necessary, and the algorithm differs from domain to domain.
- With abundant domain knowledge or data, it can be very powerful.
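As a toy illustration of content-based scoring (the movie names and attribute vectors below are made up for the example), one common approach is to average the attribute vectors of the items a user liked into a profile, then rank unseen items by cosine similarity to that profile:

```python
import numpy as np

# Hypothetical item attribute vectors (columns: action, comedy, drama).
movies = {"A": np.array([1.0, 0.0, 1.0]),
          "B": np.array([0.0, 1.0, 0.0]),
          "C": np.array([1.0, 0.0, 0.0])}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# User profile = mean attribute vector of liked items.
liked = ["A", "C"]
profile = np.mean([movies[m] for m in liked], axis=0)

# Score only the items the user has not seen yet.
scores = {m: cosine(profile, v) for m, v in movies.items() if m not in liked}
# "B" shares no attributes with the profile, so its score is 0.
```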


Collaborative Filtering

Definition: people collaborate to help one another perform filtering by recording their reactions to the products they consume. In other words, use other users' feedback to fit my preference!

Uses only rating data from users:
- Direct feedback: ratings
- Indirect feedback: click-throughs, page views

Independent of domain: many models and algorithms work regardless of the domain.


User-based Collaborative Filtering

Find items preferred by users similar to me:
- Su: the set of users similar to user u.
- Trust another user's opinion in proportion to the similarity between him or her and me.


User-based Collaborative Filtering

- 11 -

1 2 3

1 4

2 5 4

3 1 2

4 5

5 4

4 5

3

2

3 2

Users

Items

80%

50%

0.8 * 4 + 0.5 * 2 0.8 + 0.5 = 3.23
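The weighted-average prediction above can be sketched in a few lines of Python (the function name and inputs are illustrative, not PREA code):

```python
import numpy as np

def predict_rating(similarities, neighbor_ratings):
    """Similarity-weighted average of the neighbors' ratings for one item."""
    s = np.asarray(similarities, dtype=float)
    r = np.asarray(neighbor_ratings, dtype=float)
    return float(np.dot(s, r) / s.sum())

# Two neighbors with similarity 0.8 and 0.5 rated the item 4 and 2:
print(round(predict_rating([0.8, 0.5], [4, 2]), 2))  # 3.23
```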


Item-based Collaborative Filtering

Works the same way as user-based CF, but column-wise: similarities are computed between items rather than between users.

Often more powerful than user-based CF, because item neighborhoods tend to be more stable than user neighborhoods.
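A minimal sketch of the column-wise similarity computation, assuming a dense ratings matrix with 0 standing for unrated cells (the function and sample matrix are illustrative):

```python
import numpy as np

def item_similarity(R):
    """Cosine similarity between the item columns of a ratings matrix R
    (rows = users, columns = items, 0 = unrated)."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0          # guard against all-zero columns
    Rn = R / norms                   # normalize each column to unit length
    return Rn.T @ Rn                 # entry (i, j) = cosine(item i, item j)

R = np.array([[4.0, 0.0, 5.0],
              [5.0, 2.0, 4.0],
              [1.0, 5.0, 0.0]])
S = item_similarity(R)
# S is symmetric with ones on the diagonal for rated items.
```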


Memory-based CF Summary

Pros:
- Simple and easy to implement.
- Can explain the reason for a recommendation.

Cons:
- Huge memory consumption.
- Not scalable.


Matrix Factorization

Approximate the user-item rating matrix R by the product of two low-rank factor matrices, U (users x k) and V (items x k), so that R ≈ U V^T.

(Example, rank k = 1: with user factors u = (1.3, 1.6, 0.6, 1.4, 1.3) and item factors v = (3, 1.4, 3, 2, 1.8), the predicted rating of user i for item j is u_i * v_j. For instance, user 1 on item 1: 1.3 * 3 = 3.9; user 4 on item 1: 1.4 * 3 = 4.2. Multiplying out all entries fills in every unseen cell of the matrix.)
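A minimal SGD-based factorization sketch, assuming a squared-error loss with L2 regularization over the observed entries (the hyperparameters and sample matrix are illustrative, not the exact setup from the slides):

```python
import numpy as np

def factorize(R, k=2, steps=3000, lr=0.01, reg=0.02, seed=0):
    """Factor R (0 = unrated) into U (users x k) and V (items x k)
    by stochastic gradient descent on the observed entries."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = 0.1 * rng.standard_normal((n_users, k))
    V = 0.1 * rng.standard_normal((n_items, k))
    observed = list(zip(*np.nonzero(R)))
    for _ in range(steps):
        for u, i in observed:
            err = R[u, i] - U[u] @ V[i]          # prediction error
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * U[u] - reg * V[i])
    return U, V

R = np.array([[4.0, 0.0, 5.0, 0.0],
              [5.0, 2.0, 0.0, 4.0],
              [1.0, 0.0, 2.0, 0.0]])
U, V = factorize(R)
pred = U @ V.T   # a predicted rating for every (user, item) cell
```

The unobserved cells of `pred` are the model's rating predictions, exactly as in the filled-in matrix above.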


Matrix Factorization Summary

Pros:
- Predictions are among the most accurate.

Cons:
- Computationally expensive.
- Difficult to explain why we recommend an item.


Why PREA?

Since the Netflix Prize (2006), many state-of-the-art algorithms have been proposed, but they are implemented only by their authors, in different languages, on different datasets, and with different evaluation measures.

A standardized implementation is needed for a fair comparison of CF algorithms. PREA implements those algorithms on the same interface.


Features


Comparative Study

Motivation: each algorithm may perform well in different situations. We observe how well each algorithm performs depending on dataset size and density.

- Three variables: user count, item count, density
- Dataset: Netflix data
- Evaluation measures: MAE, RMSE, and an asymmetric measure


- Accuracy tends to improve with larger and denser data.
- Some algorithms depend strongly on dataset size, but only in sparse cases (PMF, Slope1, etc.).
- The shapes of the contour lines differ (Constant's are horizontal, while ItemAvg's are vertical).


Best-performing Algorithm

- The identity of the best-performing algorithm depends non-linearly on user count, item count, and density.
- NMF is dominant in low-density cases, while BPMF works well for high-density cases and larger datasets.
- Regularized SVD and PMF perform well for density levels of 2%-4%.


Overall Comparison


Conclusions

- The performance of every algorithm depends on dataset size and density, but the nature of the dependency varies a lot.
- Matrix-factorization methods generally have the highest accuracy; NMF works well for sparse data.
- There is a general trade-off between accuracy and other factors such as low variance across datasets, computational efficiency, memory consumption, and a small number of adjustable parameters.


Online Evaluation

Test with real users, in a real situation!
- Set up several recommender engines on a target system.
- Redirect each group of subjects to a different recommender.
- Observe how much user behavior is influenced by the recommender system.

Limitations:
- Very costly.
- Requires exposing an imperfect version to real users, which may create a negative experience and make them avoid the system in the future.


Offline Experiments

Identify promising algorithms before online evaluation!

Train/test data split: learn a model from the training data, then evaluate it on the test data.

How to split (simulating online behavior):
- Using timestamps, allow only ratings made before the one being predicted.
- Hide all ratings after some specific timestamp.
- For each test user, hide some portion of their most recent ratings.
- Regardless of timestamps, randomly hide some portion of the ratings.
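The last, timestamp-free strategy can be sketched as follows (the function name and the dense 0-for-unrated matrix convention are assumptions for the example):

```python
import numpy as np

def random_holdout(R, test_fraction=0.2, seed=0):
    """Randomly hide a portion of the observed ratings for testing."""
    rng = np.random.default_rng(seed)
    users, items = np.nonzero(R)                 # observed (user, item) pairs
    n_test = int(len(users) * test_fraction)
    idx = rng.choice(len(users), size=n_test, replace=False)
    train, test = R.copy(), np.zeros_like(R)
    for j in idx:
        u, i = users[j], items[j]
        test[u, i] = R[u, i]                     # move the rating to the test set
        train[u, i] = 0.0                        # ...and hide it from training
    return train, test

R = np.array([[4.0, 0.0, 5.0, 3.0],
              [5.0, 2.0, 0.0, 4.0],
              [1.0, 0.0, 2.0, 0.0]])
train, test = random_holdout(R, test_fraction=0.25)
# train and test are disjoint and together recover R.
```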


Predicting Ratings

Goal: evaluate the accuracy of predictions. Popular metrics:
- Root Mean Squared Error (RMSE)
- Mean Absolute Error (MAE)
- Normalized Mean Absolute Error (NMAE)
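The three metrics can be computed as follows (the 1-5 rating range used to normalize NMAE is an assumption; substitute your dataset's actual range):

```python
import numpy as np

def rmse(actual, pred):
    """Root Mean Squared Error: sqrt of the mean squared difference."""
    a, p = np.asarray(actual, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((a - p) ** 2)))

def mae(actual, pred):
    """Mean Absolute Error: mean of the absolute differences."""
    a, p = np.asarray(actual, float), np.asarray(pred, float)
    return float(np.mean(np.abs(a - p)))

def nmae(actual, pred, r_min=1.0, r_max=5.0):
    """MAE normalized by the rating range, so 0 = perfect."""
    return mae(actual, pred) / (r_max - r_min)

actual = [4, 2, 5, 1]
pred   = [3.5, 2.5, 4.0, 1.0]
# mae = (0.5 + 0.5 + 1.0 + 0.0) / 4 = 0.5;  nmae = 0.5 / 4 = 0.125
```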


Recommending Items

Goal: suggesting good items (not discouraging bad items). Popular metrics:
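Common choices for this goal include precision@k and recall@k over a ranked recommendation list; a minimal sketch (the sample data is made up):

```python
def precision_recall_at_k(ranked, relevant, k):
    """Precision@k and recall@k for one user's ranked recommendation list."""
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k, hits / len(relevant)

ranked = ["a", "b", "c", "d", "e"]   # recommended items, best first
relevant = {"a", "c", "f"}           # items the user actually liked
p, r = precision_recall_at_k(ranked, relevant, k=3)
# 2 of the top 3 are relevant: p = 2/3; 2 of 3 relevant items found: r = 2/3
```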

THE END

Thank you for your attention!
