Semantically-Enhanced Recommendation Algorithms

Transcript

  • 1. Semantically-Enhanced Recommendation Algorithms
    CCIA 2012
    Victor Codina & Luigi Ceccaroni
    vcodina@lsi.upc.edu, lceccaroni@BDigital.org
    Departament de Llenguatges i Sistemes Informàtics, Knowledge Engineering and Machine Learning Group
    Health Informatics, Personalized Computational Medicine

2. Semantically-Enhanced Recommendation Algorithms - Victor Codina & Luigi Ceccaroni

3. The value of recommendations
   • Netflix: 2/3 of the movies rented are recommended
   • Google News: 38% more clickthrough
   • Amazon: 35% of sales come from recommendations
   All these systems employ a Collaborative Filtering (CF) approach as their main component.

4. But in most online services the CF approach does not work so well. Why?
   • Usually: lack of data
   • Other reasons: lack of context-awareness, domain-specific particularities

5. Outline
   • Cold-start problem and existing solutions
   • Proposed solution to overcome cold start
   • Evaluation and results

6. Outline
   • Cold-start problem and existing solutions
      o Cold-start problem
      o Hybrid recommenders
   • Proposed solution to overcome cold start
   • Evaluation and results

7. What is the cold-start problem?
   • Narrow view
      o No ratings at all associated with items or users
   • Wider view
      o Few ratings associated with items or users
   • Cold-start scenarios:

                              Users: many ratings    Users: few ratings
      Items: many ratings     Normal                 New user
      Items: few ratings      New item               New user & item

8. Typical solution: hybrid recommender combining CF with content-based filtering
   • PAST SOLUTION: Collaborative Filtering + Traditional Content-based filtering
      o New item
      o Limitation: lack of understanding and exploitation of domain semantics
   • MORE RECENT SOLUTION: Collaborative Filtering + Semantically-Enhanced Content-based filtering
      o New user
      o Limitation: the need for domain ontologies describing explicit metadata relations

9.
Outline
   • Cold-start problem and existing solutions
   • Proposed solution to overcome cold start
      o Acquisition of implicit semantics
      o Methods for semantics exploitation
   • Evaluation and results

10. Acquisition of implicit domain semantics
   • Implicit semantics = semantic similarities among item attributes extracted from Vector Space Models (VSMs)
   • Distributional hypothesis: words that share similar contexts share similar meaning
   [Diagram: Attribute x Context matrix with weights w_a,c, where the contexts are items or users -> Transformation (SVD, conditional probabilities) -> Similarity measure (Cosine, Jaccard) -> Attribute semantic similarities]

11. Semantic similarities are context-dependent
   • Item-based
      o Similarity is measured in terms of how many items are similarly described by both attributes
   • User-based
      o Similarity is measured in terms of how many users are similarly interested in both attributes
   • Example: top-5 tags similar to Sci-Fi, calculated using cosine similarity without matrix transformation

      User-based               Item-based
      Scifi     0.79598457     Scifi     0.48631117
      future    0.6889696      aliens    0.42508063
      space     0.65459067     dystopia  0.34769687
      aliens    0.6110453      space     0.32580933
      robots    0.59465224     future    0.27470198

12. Exploitation of implicit semantics in content-based filtering
   [Diagram, USER MODELING: item attributes (attribute relevance w_i,a in [0,1]) and user ratings r_u,i feed a user modeling technique that produces the user interests (degree of interest w_u,a in [-1,1]); step 1, profile expansion, turns them into expanded user interests.]
   [Diagram, PREDICTION GENERATION: the user interests are matched against the item attributes, either by vector-based matching or by step 2, semantic matching, to produce the item score.]

13.
Method 1: User profile expansion by constrained spreading activation
   • Attribute semantic similarities (can be symmetric or not, depending on the similarity measure used):

            a1    a2    a3    a4    a5
      a1    1     0.5   0.2   0     0.3
      a2    0.5   1     0.3   0     0.1
      a3    0.2   0.3   1     0.7   0.8
      a4    0     0     0.7   1     0
      a5    0.3   0.1   0.8   0     1

   • User interests [-1,1]:          a1 = 0, a2 = 0.5 (activated node), a3 = -0.1, a4 = 0, a5 = 0
   • Expanded user interests [-1,1]: a1 = 0.25 (new interest), a2 = 0.5, a3 = 0.05 (weight updated), a4 = 0, a5 = 0
   • Similarities used in the expansion: a2-a1 (0.5) and a2-a3 (0.3)
   • Method hyper-parameters:
      - activation threshold = 0.25
      - fan-out threshold = 0.25
      - max. expansion levels = 1

14. Method 2: Prediction generation by pair-wise semantic matching strategies
   • Item attributes [0,1]: a1 = 0, a2 = 0.3, a3 = 0, a4 = 0, a5 = 0.7
   • User interests [-1,1]: a1 = 0, a2 = 0.5, a3 = -0.1, a4 = 0, a5 = 0
   • Attribute semantic similarities: same matrix as on slide 13 (can be symmetric or not, depending on the similarity measure used); the pairs involved are the direct match a2-a2 (1), a3-a5 (0.8), a3-a2 (0.3) and a2-a5 (0.1)
   • Result (using the product as aggregation function):

      Approach                  Result
      Vector-based matching     0.15
      All-pairs matching        0.15 - 0.056 - 0.009 + 0.035 = 0.12
      Best-pairs matching       0.15 - 0.056 = 0.094

   • Method hyper-parameter:
      - similarity threshold = 0.05

15. Outline
   • Cold-start problem and existing solutions
   • Proposed solution to overcome cold start
   • Evaluation and results
      o MovieLens data set
      o Experimental results

16. Offline experimentation with a MovieLens dataset extended with movie metadata
   Data set statistics after pruning unusual attribute values and movies with few attributes:

      Users                       2113
      Movies                      1646
      Attributes                  4 (genres, directors, actors and tags)
      Attribute values            2886
      Ratings per user on avg.    239
      Rating density              14%

17.
Evaluation of methods for semantics exploitation
   • Baseline = Traditional CB using hybrid user modeling technique
   • Expansion-CB = CSA-same + User-based + raw frequencies
   • Matching-CB = Best-pairs-same + User-based + Forbes-Zhu method
   • BPR-MF = CF based on matrix factorization optimized for ranking

18. Conclusions
   • The cold-start problem can be very critical
      o Above all in systems with small databases
   • Existing solutions have some limitations
      o Traditional CB cannot solve the new-user scenario
      o Semantically-enhanced CB requires domain ontologies to work
   • Exploitation of implicit semantics can be a good alternative to overcome the cold-start problem
      o User-based semantics is more effective than item-based
      o The best-pairs semantic matching method is more effective than profile expansion based on spreading activation

19. Future work
   • Experimenting with data sets from different domains
      o Million Song data set
   • Extending the study of Vector Space Models
      o Probabilistic similarity measures (e.g. Kullback-Leibler)
   • Applying the same approach to enhance the cold-start performance of context-aware recommenders
      o Implicit semantics of contextual conditions can also be acquired from user data
      o Similarly, pair-wise semantic strategies can be employed to enhance contextual user modeling
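
The acquisition step on slides 10 and 11 (attribute vectors over a user or item context, compared with cosine similarity and no matrix transformation) can be sketched as below. This is a minimal illustration, not the authors' implementation: the function names and the tiny `user_context` matrix are hypothetical, and the weights are made-up values rather than MovieLens data.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def attribute_similarities(matrix):
    """Pairwise cosine similarity between the rows of an attribute x context matrix.

    `matrix` maps an attribute name to its context vector: one weight per user
    (user-based semantics) or one weight per item (item-based semantics).
    """
    names = sorted(matrix)
    return {a: {b: cosine(matrix[a], matrix[b]) for b in names} for a in names}


# Hypothetical user-based context: each column is one user's interest in the tag.
user_context = {
    "scifi":   [0.9, 0.8, 0.0, 0.1],
    "space":   [0.8, 0.6, 0.0, 0.0],
    "romance": [0.0, 0.1, 0.9, 0.7],
}

sims = attribute_similarities(user_context)

# Rank the other tags by similarity to "scifi", most similar first.
ranking = sorted(
    (tag for tag in sims["scifi"] if tag != "scifi"),
    key=lambda tag: sims["scifi"][tag],
    reverse=True,
)
print(ranking)
```

Swapping the user columns for item columns gives the item-based variant; as the example on slide 11 shows, the two contexts can rank the same tags differently.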
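
The worked example on slide 13 (Method 1) can be reproduced with a short sketch. The slides give the inputs, the thresholds and the expanded profile, but not the update rule itself; the additive rule used here (target weight += source interest x similarity) is an assumption inferred from those numbers, and `expand_profile` is a hypothetical name. Only one expansion level is implemented, matching max. expansion levels = 1.

```python
# Attribute semantic similarities from slide 13 (rows and columns are a1..a5).
SIM = [
    [1.0, 0.5, 0.2, 0.0, 0.3],
    [0.5, 1.0, 0.3, 0.0, 0.1],
    [0.2, 0.3, 1.0, 0.7, 0.8],
    [0.0, 0.0, 0.7, 1.0, 0.0],
    [0.3, 0.1, 0.8, 0.0, 1.0],
]

# User interests in [-1, 1] for a1..a5, as on slide 13.
USER = [0.0, 0.5, -0.1, 0.0, 0.0]


def expand_profile(user, sim, activation=0.25, fan_out=0.25):
    """One level of constrained spreading activation (assumed additive update)."""
    expanded = list(user)
    for src, interest in enumerate(user):
        if abs(interest) < activation:
            continue  # interest too weak: this node is not activated
        for dst in range(len(user)):
            if dst != src and sim[src][dst] >= fan_out:
                # Spread the source interest, damped by the similarity.
                expanded[dst] += interest * sim[src][dst]
    # Keep the expanded weights inside the [-1, 1] interest range.
    return [max(-1.0, min(1.0, w)) for w in expanded]


expanded = expand_profile(USER, SIM)
# Up to floating-point rounding this is the slide's expanded profile:
# a1 = 0.25 (new interest), a2 = 0.5, a3 = 0.05 (weight updated), a4 = 0, a5 = 0.
print([round(w, 3) for w in expanded])
```

Only a2 passes the activation threshold, and only its links to a1 (0.5) and a3 (0.3) pass the fan-out threshold, which is why a5 stays at 0 despite a similarity of 0.1.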
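
The three matching strategies on slide 14 (Method 2) can be sketched in the same way. The formulas are assumptions reconstructed from the slide's results: each matched pair contributes user interest x item relevance x similarity (the product aggregation), all-pairs sums every pair above the similarity threshold, and best-pairs keeps, for each user interest, only its most similar item attribute. The function names are hypothetical.

```python
# Attribute semantic similarities from slides 13 and 14 (a1..a5).
SIM = [
    [1.0, 0.5, 0.2, 0.0, 0.3],
    [0.5, 1.0, 0.3, 0.0, 0.1],
    [0.2, 0.3, 1.0, 0.7, 0.8],
    [0.0, 0.0, 0.7, 1.0, 0.0],
    [0.3, 0.1, 0.8, 0.0, 1.0],
]
USER = [0.0, 0.5, -0.1, 0.0, 0.0]  # user interests in [-1, 1]
ITEM = [0.0, 0.3, 0.0, 0.0, 0.7]   # item attribute relevance in [0, 1]


def vector_match(user, item):
    """Direct matching: only identical attributes contribute (dot product)."""
    return sum(u * i for u, i in zip(user, item))


def all_pairs_match(user, item, sim, threshold=0.05):
    """Sum over every (user attribute, item attribute) pair above the threshold."""
    score = 0.0
    for a, u in enumerate(user):
        for b, i in enumerate(item):
            if sim[a][b] >= threshold:
                score += u * i * sim[a][b]
    return score


def best_pairs_match(user, item, sim, threshold=0.05):
    """For each user interest, use only its most similar item attribute."""
    score = 0.0
    for a, u in enumerate(user):
        if u == 0.0:
            continue
        candidates = [
            b for b, i in enumerate(item) if i > 0.0 and sim[a][b] >= threshold
        ]
        if candidates:
            best = max(candidates, key=lambda b: sim[a][b])
            score += u * item[best] * sim[a][best]
    return score


scores = {
    "vector": vector_match(USER, ITEM),
    "all_pairs": all_pairs_match(USER, ITEM, SIM),
    "best_pairs": best_pairs_match(USER, ITEM, SIM),
}
# Up to floating-point rounding: 0.15, 0.12 and 0.094, as on slide 14.
print({name: round(value, 3) for name, value in scores.items()})
```

In this example the negative interest in a3 lowers the score through its strong similarity to a5 (0.8), an effect that direct vector matching cannot capture because the item has no weight on a3.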