Performance Prediction in Recommender Systemsir.ii.uam.es/~alejandro/2011/umap-slides.pdf · IRG IR Group @ UAM User Modeling, Adaptation and Personalization 2011 July 13, Girona,

User Modeling, Adaptation and Personalization 2011

July 13, Girona, Spain IRGIR Group @ UAM

Performance Prediction in

Recommender Systems

Alejandro BellogínSupervisor: Pablo Castells

Co-supervisor: Iván CantadorEscuela Politécnica Superior

Universidad Autónoma de Madrid

@abellogin

[email protected]

IRGIR Group @ UAM


July 13, Girona, Spain

Recommender Systems

Content-based filtering (CB), Collaborative Filtering (CF), Hybrid Filtering (HF)

For example: User-based Collaborative Filtering

i1 ik im

u1 rat11 rat1k rat1m

uj ratj1 ? rjm

un ratn1 ratnk ratnm

[ ]

, s im , ,j k j k

v N u

g u i C u v ra t v i

s im , : P earso nu v

[ ] : m o st s im ilarN u

IRGIR Group @ UAM



Motivation

Can we detect ambiguous users?

In fact, when is a user considered ambiguous?

IRGIR Group @ UAM



Hypothesis

The amount of uncertainty (ambiguity)

in user data may correlate with the

accuracy of a system’s

recommendations

IRGIR Group @ UAM



Hypothesis

The amount of uncertainty (ambiguity)

in user data may correlate with the

accuracy of a system’s

recommendations

IRGIR Group @ UAM



Research Question

How to dynamically adapt a recommendation strategy to the

user’s preference information available at a certain time?

IRGIR Group @ UAM



Research Question

How to dynamically adapt a recommendation strategy to the

user’s preference information available at a certain time?

Or, if we predict which are the ambiguous users, can we treat

them in a way such the system’s performance increases?

IRGIR Group @ UAM



Proposal

1. Define a predictor of performance = (u, i, r, …)

2. Introduce the predictor in an adaptive strategy:

a) Evaluate its predictiveness using correlation with performance measure

b) Evaluate final performance: static vs adaptive strategy

IRGIR Group @ UAM



Predictor definition

Based on performance prediction from Information Retrieval

• “Estimation of the system’s performance in response to a specific query”

User clarity: captures uncertainty in user data

• Distance between the user’s and the system’s probability model

• X may be: users, items, ratings, or a combination

|c la rity | lo g

x X c

p x uu p x u

p u

system’s model

user’s model

IRGIR Group @ UAM



Applications

Neighbour weighting in Collaborative Filtering

• User’s neighbours are weighted according to their similarity

• Can we take into account the neighbour’s confidence/ambiguity?

Hybrid recommendation

• Weight is the same for every item and user (learnt from training)

• What about boosting those users predicted to perform better for some method?

IRGIR Group @ UAM



Adaptive Strategies

User neighbour weighting

• Static: [ ]

, s im , ,

v N u

g u i C u v r a t v i

IRGIR Group @ UAM



Adaptive Strategies

User neighbour weighting [1]

• Static:

• Adaptive:

[ ]

, s im , ,

v N u


[ ]

, γ s im , ,

v N u

g u i C v u v r a t v i

IRGIR Group @ UAM



Adaptive Strategies


• Static:

• Adaptive:

Hybrid recommendation

• Static: R 1 R 2, , 1 ,g u i g u i g u i

[ ]

, s im , ,

v N u


[ ]

, γ s im , ,

v N u


IRGIR Group @ UAM



Adaptive Strategies


• Static:

• Adaptive:

Hybrid recommendation [3]

• Static:

• Adaptive:

R 1 R 2, , 1 ,g u i g u i g u i

R 1 R 2, γ , 1 γ ,g u i u g u i u g u i

[ ]

, s im , ,

v N u


[ ]

, γ s im , ,

v N u


IRGIR Group @ UAM



0,80

0,82

0,84

0,86

0,88

0,90

0,92

0,94

0,96

0,98

10 20 30 40 50 60 70 80 90 10 20 30 40 50 60 70 80 90

MA

E

% of ratings for training

Standard CF

Clarity-enhanced CF

b) Neighbourhood size: 500

0,80

0,81

0,82

0,83

0,84

0,85

0,86

0,87

0,88

100 150 200 250 300 350 400 450 500 100 150 200 250 300 350 400 450 500

MA

E

Neighbourhood size

Standard CF

Clarity-enhanced CF

b) 80% training

Results – Neighbour weighting

Correlation analysis [1]

• With respect to Neighbour Goodness metric: “how good a neighbour is to her vicinity”

Performance [1] (MAE = Mean Average Error, the lower the better)

IRGIR Group @ UAM



0,80

0,82

0,84

0,86

0,88

0,90

0,92

0,94

0,96

0,98

10 20 30 40 50 60 70 80 90 10 20 30 40 50 60 70 80 90

MA

E

% of ratings for training

Standard CF

Clarity-enhanced CF

b) Neighbourhood size: 500

0,80

0,81

0,82

0,83

0,84

0,85

0,86

0,87

0,88

100 150 200 250 300 350 400 450 500 100 150 200 250 300 350 400 450 500

MA

E

Neighbourhood size

Standard CF

Clarity-enhanced CF

b) 80% training

Results – Neighbour weighting


• With respect to Neighbour Goodness metric: “how good a neighbour is to her vicinity”

Performance [1] (MAE = Mean Average Error, the lower the better)

Improvement of over 5% wrt. the baseline

Plus, it does not degrade performance

Positive, although not very strong correlations

IRGIR Group @ UAM



0

0,01

0,02

0,03

0,04

0,05

0,06

0,07

H1 H2 H3 H4

MAP@50

Adaptive Static

0

0,05

0,1

0,15

0,2

H1 H2 H3 H4

nDCG@50

Adaptive Static

Results – Hybrid recommendation


• With respect to nDCG@50 (nDCG, normalized Discount Cumulative Gain)

Performance [3]

IRGIR Group @ UAM



0

0,01

0,02

0,03

0,04

0,05

0,06

0,07

H1 H2 H3 H4

MAP@50

Adaptive Static

0

0,05

0,1

0,15

0,2

H1 H2 H3 H4

nDCG@50

Adaptive Static

Results – Hybrid recommendation


• With respect to nDCG@50 (nDCG, normalized Discount Cumulative Gain)

Performance [3]

In average, most of the predictors obtain positive, strong correlations

Adaptive strategy outperforms static for

different combination of recommenders

IRGIR Group @ UAM



Contributions

Inferring user’s performance in a recommender system

Building adaptive recommendation strategies

• Dynamic neighbour weighting: according to expected goodness of neighbour

• Dynamic hybrid recommendation: based on predicted performance

Encouraging results

• Adaptive strategies obtain better (or equal) results than static

• Positive predictive power (good correlations between predictors and metrics)

IRGIR Group @ UAM



Related publications

[1] A Performance Prediction Aproach to Enhance Collaborative

Filtering Performance. A. Bellogín and P. Castells. In ECIR 2010.

[2] Predicting the Performance of Recommender Systems: An

Information Theoretic Approach. A. Bellogín, P. Castells, and I.

Cantador. In ICTIR 2011.

[3] Performance Prediction for Dynamic Ensemble Recommender

Systems. A. Bellogín, P. Castells, and I. Cantador. In press.

IRGIR Group @ UAM



Future Work

What is performance?

We need a theoretical background

• Why do some predictors work better?

Explore other input sources

• Implicit data (with time)

• Social links

Larger datasets

IRGIR Group @ UAM



FW – Performance definition

What is performance?

RMSE?

Precision?

User satisfaction?

IRGIR Group @ UAM



FW – Theoretical background

We need a theoretical background

• Why do some predictors work better?

IRGIR Group @ UAM



FW – Other input sources



• Social links

= (u, …)

Ratings

Implicit

Social

IRGIR Group @ UAM






• Social links

= (u, …)

Ratings

Implicit

Social

IRGIR Group @ UAM






• Social links

= (u, …)

Ratings

Implicit

Social

IRGIR Group @ UAM






• Social links

= (u, …)

Ratings

Implicit

Social

IRGIR Group @ UAM



Thank you!

Performance Prediction in

Recommender Systems

Alejandro Bellogín

Supervisor: Pablo Castells

Co-supervisor: Iván Cantador Escuela Politécnica Superior

Universidad Autónoma de Madrid

@abellogin

[email protected]

IRGIR Group @ UAM



Questions to the committee

In the same way as we have translated the performance prediction concept fromIR to RS, is there any concept from the User Modelling area which infers theambiguity in a user profile and can be incorporated in a similar way into RS?

Up to now, we have focused our research on user-based CF and ensemblerecommenders. We believe this idea may also be useful in a personalisationscenario, where depending on how ambiguous a user is predicted to be, thepersonalisation should receive more or less weight than the query. Could this beinteresting for the UMAP community? Moreover, is there any otherapplication where the proposal may also be relevant?

In theory, correlation values between a predictor and a performance metric shoulduncover some aspects of the user, such as her ambiguity and uncertainty. At thismoment, we have checked that performance predictors are able to capture rating noise (as in Amatriain et al., UMAP 2009). If a user study could be conducted, which variables should be measured in order to validate our predictors?

IRGIR Group @ UAM



Answers (from reviews)

Concepts from User Modelling area which infers the ambiguity in a

user profile

• More general: context

• Goal: how to find the best fit of the conditions for a particular user goal

Useful for personalisation? Or any other application?

• It could be, but the model might be much more complex

Variables to measure in a hypothetical user study

• It depends on the user profile representation

Documents

Performance Prediction in Recommender Systemsir.ii.uam.es/~alejandro/2011/umap-slides.pdf · IRG IR Group @ UAM User Modeling, Adaptation and Personalization 2011 July 13, Girona,