Extending recommendation systems with semantics and context-awareness
CCIA 2011
Victor Codina & Luigi Ceccaroni
Departament de Llenguatges i Sistemes Informàtics
Knowledge Engineering and Machine Learning Group
Health Informatics
Personalized Computational Medicine
Outline
Traditional vs. Contextual recommendation
State-of-the-art & Current limitations
Research question
Semantics acquisition & exploitation
Proposed model
Experimental evaluation
Conclusions & Future work
Traditional recommendation problem
Regression problem:
o Given a pair (u ∈ U, i ∈ I), predict the item’s degree of utility for the user
Estimation is based only on user and item information
Recommendation model (diagram):
o Inputs: preferences (u), attributes (i), preference matrix
o Model types: content-based (CB) recommender, collaborative filtering (CF) recommender, hybrid recommender
Context-aware recommendation problem
Context as an additional dimension for estimation
o Given a tuple (u, i, c), predict the item’s degree of utility in context c
o Context = “situated action”
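Paradigms for incorporating the context c into the recommendation model:
o Pre-filtering: c is used to select the training data
o Post-filtering: c is used to adjust the recommendations produced by the model
o Multi-Dimensional (MD): c is used inside the model itself
Representational view of context:
o Example: c = (winter, cold), where c1 = Season and c2 = Temperature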
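The pre-filtering paradigm can be sketched as follows: ratings are reduced to those recorded in the target context, and a traditional (u, i) recommender is then trained on what remains. This is only an illustration; the `Rating` record, the `prefilter` function and the sample data are assumptions, not part of the slides.

```python
from typing import NamedTuple


class Rating(NamedTuple):
    """Hypothetical rating record with a representational context tuple."""
    user: str
    item: str
    context: tuple  # e.g. (season, temperature)
    value: float


def prefilter(ratings, target_context):
    """Keep only the ratings whose context exactly matches the target context."""
    return [r for r in ratings if r.context == target_context]


# Made-up sample data for illustration only.
ratings = [
    Rating("u1", "i1", ("winter", "cold"), 4.0),
    Rating("u1", "i2", ("summer", "hot"), 2.0),
    Rating("u2", "i1", ("winter", "cold"), 5.0),
]
winter_cold = prefilter(ratings, ("winter", "cold"))
```

Exact matching is the rigid case: a rating given in (winter, mild) would be discarded even though it is close to the target context.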
State-of-the-art & limitations
Adaptations of latent-factor models (MD paradigm)
Examples:
o N-dimensional Tensor Factorization
o Bias-based Matrix Factorization with temporal dynamics
Best prediction accuracy results on recent competitions
o E.g.: Netflix challenge (2009), Yahoo! Labs KDD Cup (2011)
Main limitations of latent-factor models:
o Lack of transparency in explaining recommendations
o Low cold-start performance (users and items with few ratings)
o Lack of novelty and diversity of recommendations
Research questions & main assumptions
Research questions
o Q1. Can we overcome the limitations and improve global recommendation quality (not only prediction accuracy) by exploiting domain and context knowledge?
o Q2. Under which conditions is this improvement maximized?
Main assumptions
o There exist semantic relationships among entities of the recommendation space (users, items, contexts)
o The adequate exploitation of these semantic relationships is useful to overcome current limitations
Knowledge acquisition and representation
Goal: given two domain/context concepts x and y, compute their similarity S(x,y)
Explicit similarity
o Knowledge source: ontologies, i.e. taxonomies (ODP) and thesauri (WordNet)
o Ontology-based similarity measures: edge-based (LCA), node-based (MICA), logic-based
Implicit similarity
o Knowledge source: data collections, i.e. folksonomies and item descriptions
o Statistics-based similarity measures: probabilistic measures (PMI), dimensionality reduction (LSA), graph-based (SimRank)
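As an illustration of an implicit, statistics-based measure, the sketch below computes pointwise mutual information (PMI) between tags from their co-occurrence on items in a folksonomy. The function name, the data layout (item id mapped to a list of tags) and the sample tags are assumptions made for this example; the slides do not specify how PMI was computed.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_similarity(item_tags):
    """Return PMI for every tag pair that co-occurs on at least one item.

    PMI(x, y) = log( P(x, y) / (P(x) * P(y)) ), with probabilities
    estimated as fractions of items carrying the tag(s).
    """
    n_items = len(item_tags)
    tag_count = Counter()
    pair_count = Counter()
    for tags in item_tags.values():
        unique = sorted(set(tags))  # sorted so each pair has one canonical order
        tag_count.update(unique)
        pair_count.update(combinations(unique, 2))
    pmi = {}
    for (x, y), n_xy in pair_count.items():
        p_xy = n_xy / n_items
        p_x = tag_count[x] / n_items
        p_y = tag_count[y] / n_items
        pmi[(x, y)] = math.log(p_xy / (p_x * p_y))
    return pmi


# Made-up folksonomy for illustration only.
item_tags = {
    "m1": ["space", "aliens"],
    "m2": ["space", "aliens"],
    "m3": ["romance", "space"],
    "m4": ["romance"],
}
sims = pmi_similarity(item_tags)
```

Here "aliens" and "space" co-occur more often than chance (positive PMI), while "romance" and "space" co-occur less often than chance (negative PMI).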
User/Item representation
Concept-based modeling (weighted overlay approach)
o Domain knowledge: concepts d1, d2, d3, d4 (concepts = item attributes)
o Item ‘i’ is represented by a profile Pi of concept weights (e.g. relevance of d3)
o User ‘u’ is represented by a profile Pu of concept weights (e.g. degree of interest in d1)
Interest inferring method
o Explicit feedback (rating average)
o Implicit feedback (seen frequency)
Attribute weighting method
o Structured content (IDF)
o Unstructured content (TF-IDF, tagshare)
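A minimal sketch of the weighted overlay idea, assuming IDF for item attribute weights and the rating average for user interest. The function names, data layout and sample values are hypothetical; the exact weighting formulas used by the authors are not given in the slides.

```python
import math
from collections import defaultdict


def idf_weights(item_attrs):
    """IDF of each attribute: log(number of items / items having the attribute)."""
    n = len(item_attrs)
    df = defaultdict(int)
    for attrs in item_attrs.values():
        for a in set(attrs):
            df[a] += 1
    return {a: math.log(n / count) for a, count in df.items()}


def item_profile(attrs, idf):
    """Item profile Pi: each attribute of the item weighted by its IDF."""
    return {a: idf[a] for a in set(attrs)}


def user_profile(user_ratings, item_attrs):
    """User profile Pu: interest in a concept is the average rating the
    user gave to items that have that concept (explicit feedback)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for item, rating in user_ratings.items():
        for a in set(item_attrs[item]):
            sums[a] += rating
            counts[a] += 1
    return {a: sums[a] / counts[a] for a in sums}


# Made-up data for illustration only.
item_attrs = {"i1": ["comedy", "drama"], "i2": ["comedy"], "i3": ["thriller"]}
idf = idf_weights(item_attrs)
p_i1 = item_profile(item_attrs["i1"], idf)
p_u = user_profile({"i1": 4.0, "i2": 2.0}, item_attrs)
```

In this example the rarer attribute "drama" gets a higher weight in the profile of i1 than the more common "comedy".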
Knowledge exploitation
Knowledge type: Contextual
o Can be used for: measuring the semantic matching among different context states
o Possible benefits: less rigid contextual filtering than with exact matching
Knowledge type: Domain-based
o Can be used for: applying semantic inference methods over user/item concept-based profiles (spreading activation, reasoning based on DLs)
o Possible benefits: enriching item/user profiles with new, semantically related concepts
o Can be used for: measuring the matching between two users/items using various semantic matching strategies, either pairwise (best-pairs or all-pairs) or groupwise (set-, vector- or graph-based)
o Possible benefits: more precise similarity measurements than with traditional measures
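The two pairwise strategies can be sketched as below. The aggregation formulas (a weighted mean for all-pairs, a weighted mean of per-concept maxima for best-pairs), the function names and the similarity table are assumptions chosen for illustration; the slides name the strategies but do not define them.

```python
def all_pairs(p_u, p_i, sim):
    """Weighted mean of sim(x, y) over every (user concept, item concept) pair."""
    num = sum(wu * wi * sim(x, y)
              for x, wu in p_u.items() for y, wi in p_i.items())
    den = sum(wu * wi for wu in p_u.values() for wi in p_i.values())
    return num / den if den else 0.0


def best_pairs(p_u, p_i, sim):
    """For each item concept keep only its most similar user concept,
    then average those best similarities weighted by the item weights."""
    den = sum(p_i.values())
    if not p_u or not den:
        return 0.0
    num = sum(wi * max(sim(x, y) for x in p_u) for y, wi in p_i.items())
    return num / den


# Hypothetical concept similarities, e.g. acquired from a folksonomy.
SIM = {("aliens", "space"): 0.8}


def sim(x, y):
    """Symmetric lookup; identical concepts have similarity 1."""
    if x == y:
        return 1.0
    return SIM.get((x, y), SIM.get((y, x), 0.0))


p_u = {"space": 1.0, "romance": 0.5}  # user profile Pu
p_i = {"aliens": 1.0}                 # item profile Pi
score_all = all_pairs(p_u, p_i, sim)
score_best = best_pairs(p_u, p_i, sim)
```

The user and the item share no concept, so a matching based on exact concept overlap would score 0, whereas both semantic strategies give a positive score through the "space"/"aliens" relationship.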
Case study: an MD semantically-enhanced CB
Contextual prediction model (bias-based). Labelled terms of the model:
o Overall rating average
o Contextual user bias
o Contextual item bias
o All-pairs item-user semantic matching
o Session bias of (u,d)
o Contextual bias of (u,d)
Stochastic gradient descent is used for model training
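The general shape of a bias-based contextual predictor trained with stochastic gradient descent can be sketched as follows. This is not the authors' equation, which is not reproduced in this text: the sketch assumes a prediction of the form overall average + contextual user bias + contextual item bias, omits the semantic matching and session terms, and uses hypothetical names, hyperparameters and data.

```python
from collections import defaultdict


def train_bias_model(ratings, epochs=50, lr=0.05, reg=0.02):
    """Fit contextual biases by SGD on squared error with L2 regularization.

    ratings: list of (user, item, context, value) tuples.
    """
    mu = sum(r[3] for r in ratings) / len(ratings)  # overall rating average
    b_user = defaultdict(float)  # contextual user bias, keyed by (user, context)
    b_item = defaultdict(float)  # contextual item bias, keyed by (item, context)
    for _ in range(epochs):
        for u, i, c, value in ratings:
            err = value - (mu + b_user[(u, c)] + b_item[(i, c)])
            # Gradient step on each bias involved in this rating.
            b_user[(u, c)] += lr * (err - reg * b_user[(u, c)])
            b_item[(i, c)] += lr * (err - reg * b_item[(i, c)])
    return mu, b_user, b_item


def predict(model, u, i, c):
    """Unseen (user, context) or (item, context) pairs fall back to a zero bias."""
    mu, b_user, b_item = model
    return mu + b_user.get((u, c), 0.0) + b_item.get((i, c), 0.0)


# Made-up ratings for illustration only.
ratings = [
    ("u1", "i1", "weekend", 5.0),
    ("u1", "i2", "weekday", 3.0),
    ("u2", "i1", "weekend", 4.0),
    ("u2", "i2", "weekday", 1.0),
]
model = train_bias_model(ratings)
```

Keying the biases by context is what makes the model multi-dimensional: the same user can have a different bias on weekends than on weekdays.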
MovieLens Dataset
Contextual concepts without semantics
o 3 contextual factors (season, time of day, weekend or not)
Domain concepts with implicit semantics
o Set of pre-selected tags + set of genres
o Semantic relationships among tags acquired from folksonomy
The original dataset was pruned by selecting only items with a certain number of pre-selected tags
Offline experiment
The last ratings of each user (according to timestamp) are used for testing
o In this way we simulate future predictions for each user
5-fold cross validation
Two recommendation tasks evaluated
o Rating prediction (RMSE) and Top-10 recommendation (Recall)
Threshold-based cold-start performance evaluation
o User profile size < 25 ratings: 10% of users
Performance comparison of the proposed model with:
o 3 model variants
o 5 baseline models
o 1 model based on matrix factorization
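The evaluation protocol above can be sketched with three small helpers: a per-user split that holds out the last ratings by timestamp, RMSE for rating prediction, and Recall for a top-k list. The function names, the tuple layout and the number of held-out ratings per user are assumptions; the slides do not state how many ratings were held out.

```python
import math


def split_last_ratings(ratings, n_test=1):
    """Hold out each user's last n_test ratings (by timestamp) for testing.

    ratings: list of (user, item, value, timestamp) tuples.
    """
    by_user = {}
    for r in ratings:
        by_user.setdefault(r[0], []).append(r)
    train, test = [], []
    for user_ratings in by_user.values():
        ordered = sorted(user_ratings, key=lambda r: r[3])
        train.extend(ordered[:-n_test])
        test.extend(ordered[-n_test:])
    return train, test


def rmse(pairs):
    """Root mean squared error over (predicted, actual) pairs."""
    pairs = list(pairs)
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))


def recall_at_k(ranked_items, relevant_items, k=10):
    """Fraction of the relevant items that appear in the top-k ranking."""
    relevant = set(relevant_items)
    if not relevant:
        return 0.0
    return len(set(ranked_items[:k]) & relevant) / len(relevant)


# Made-up ratings for illustration only.
ratings = [
    ("u1", "i1", 4.0, 10), ("u1", "i2", 3.0, 30), ("u1", "i3", 5.0, 20),
    ("u2", "i1", 2.0, 5), ("u2", "i3", 4.0, 15),
]
train, test = split_last_ratings(ratings)
```

Holding out the most recent ratings, rather than a random sample, means the model is always evaluated on ratings made after the ones it was trained on.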
Results
Paired t-test significance among 4 model variants:
o Model 1 “Static-CB” (static bias + traditional Item-User matching)
o Model 2 “Static-SemCB” (static bias + “All-pairs” matching)
o Model 3 “Contextual-CB” (contextual bias + traditional matching)
o Model 4 “Contextual-SemCB” (contextual bias + “All-pairs” matching)
Global RMSE (bar chart, Models 1 to 4; y-axis from 0.83 to 0.851). P-values shown: 0.05, 0.001, 0.17
Cold-start RMSE (bar chart, Models 1 to 4; y-axis from 0.916 to 0.919). P-values shown: 0.05, 0.62, 0.01
P-values are shown in red. E.g. a p-value of 0.05 means that, if there were no real difference between two models, a difference at least this large would be observed only 5% of the time
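A paired t-test compares two models on matched samples. The sketch below computes only the t statistic and its degrees of freedom with the standard library; it assumes the paired samples are per-fold RMSE values, which the slides do not state, and the numbers are made up for illustration. Turning the statistic into a p-value requires the Student t distribution (available, for example, in SciPy).

```python
import math
from statistics import mean, stdev


def paired_t_statistic(errors_a, errors_b):
    """Return (t, degrees of freedom) for paired samples of equal length.

    t = mean(d) / (stdev(d) / sqrt(n)), where d are the pairwise differences.
    """
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n)), n - 1


# Hypothetical per-fold RMSE of two model variants (not results from the slides).
rmse_model_a = [0.90, 0.85, 0.88, 0.92, 0.87]
rmse_model_b = [0.88, 0.84, 0.85, 0.90, 0.86]
t_stat, dof = paired_t_statistic(rmse_model_a, rmse_model_b)
```

Because the test works on the differences within each pair, it can detect a small but consistent improvement even when the variation between folds is larger than the improvement itself.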
Conclusions
Context-awareness improves prediction accuracy for users with a certain number of ratings (non cold-start)
o 25+ ratings: 90% of users
Semantics slightly improves cold-start performance
The knowledge acquisition method for the MovieLens folksonomy may not be adequate: limited domain knowledge
MovieLens users rate several movies at once and not just after seeing the movie
o Rating-session-specific effects have a major influence on user ratings: distorted contextual information
Future work
Extending evaluation of the proposed CB model:
o Using datasets from other domains (e.g. music, tourism, health)
o Experimenting with other sources of knowledge (e.g. Amazon movie taxonomy)
o Experimenting with other methods for semantics exploitation
o Evaluating other properties (e.g. diversity, novelty, coverage)
Extending CF models with the proposed semantic approach:
o Neighborhood-based
o Matrix Factorization
Backup slides
Prediction models of all variants
Model 1 (Static-CB):
Model 2 (Static-SemCB):
Model 3 (Contextual-CB):
Model 4 (Contextual-SemCB):