Albert-Ludwigs-Universität Freiburg
Various Aspects of Recommender Systems
May 2nd, 2017
Master project SS17
Master Project
Prof. Dr. Georg Lausen
Dr. Michael Färber
Anas Alzoghbi
Victor Anthony Arrascue Ayala
Agenda
Organization
Recommender Systems
Topics
- Finding complementary products (Anthony)
- Cross-domain recommendations (Anthony)
- Scientific Paper recommendation (Anas)
- Recommending new Wikipedia articles (Michael)
- Recommending references for (scientific) texts (Michael)
03.05.2017 Various Aspects of Recommender Systems SS17 2
Requirements
Study regulations (Studienordnung)
- 16 ECTS → 480 hours
Master project
- Team size: 1-3 students
- Project report: ~10-12 pages per student
- Short presentations: 2-3 (individual as needed)
- Final presentation: 25 min
Some preconditions
- Recommended lecture “Data Analysis and Query Language” or similar
General goals
Collective work on a project
Gain experience in research and development methods
Improve individual programming skills
Become familiar with new topics (Semantic Web, recommender systems, …)
Learn about problems of larger projects
Assessment
Workload of every student must be clearly distinguishable
Some Criteria
- Methodology
- The scope and difficulty of the work / implementation
- Individual contribution
- Team performance: a successful project has a positive effect
- Role and participation in the team (coordination, etc.)
- Quality of code (formatting, documentation, testing)
- Individual report (project report)
- Presentations (especially the final presentation)
Organization
Meetings
- Building 51 – SR 01 029
Website
- Apply via HISinOne
SVN repository
Master projects
1. Finding complementary products (Anthony)
2. Cross-domain recommendations (Anthony)
3. Scientific Paper recommendation (Anas)
4. Recommending new Wikipedia articles (Michael)
5. Recommending references for (scientific) texts (Michael)
Finding complementary products - 1st project
“Products that are sold separately but that are used together, each creating a demand for the other”
CP – Traditional Approaches
Data Mining (Association Rules)
- Require transactions
Limitations
- Cold start for new items
- Unpopular products
- No explanations
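As a minimal sketch of the association-rule idea, assuming a toy list of transactions (the basket contents and the function names below are made up for illustration), support and confidence for a rule A → B could be computed like this. It also illustrates the cold-start limitation: an item that never appears in a transaction cannot produce a rule.

```python
from typing import FrozenSet, List

# Hypothetical toy transactions; a real system would mine these from purchase logs.
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"camera", "memory card"}),
    frozenset({"camera", "memory card", "tripod"}),
    frozenset({"camera", "tripod"}),
    frozenset({"phone", "phone case"}),
]


def support(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent: FrozenSet[str], consequent: FrozenSet[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Estimate P(consequent | antecedent) from co-occurrence counts."""
    base = support(antecedent, transactions)
    if base == 0.0:
        # Cold start: an item that was never bought yields no rule at all.
        return 0.0
    return support(antecedent | consequent, transactions) / base
```

For example, `confidence(frozenset({"camera"}), frozenset({"memory card"}), TRANSACTIONS)` measures how often a memory card was bought in baskets that contain a camera.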
CP – Problem
Predict whether a complementary relationship holds
No transactions
Using Semantic Web technologies
- Linked Open Data (DBpedia): knowledge graph
Based on the products' meta-data
- Publicly available
CP – Solution scheme
Learning to Identify Complementary Products from DBpedia. Victor Anthony Arrascue Ayala, Trong-Nghia Cheng, Anas Alzoghbi, Georg Lausen. LDOW@WWW 2017
Evaluation using Amazon’s data
Goal: improving the scheme
1. Reproduce pipeline
2. Add new features
- Observable graph-features
- Meta-data: e.g. price
3. Extend evaluation
- Other categories (Books, Movies and TV, etc.)
- Ranking vs. classification
Compulsory task
1. Read the paper
2. Extract product attributes
- Smallest category
- Using NER tool (Alchemy / Spotlight)
3. Create knowledge graph
- Crawl links between attributes from DBpedia
4. Data analysis
- Product coverage
- Quality of the interconnections
- Etc…
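The data-analysis step could start from two simple numbers. The sketch below assumes that the NER step yields a mapping from product id to the set of linked DBpedia entities and that the crawl yields (subject, object) link pairs; both data shapes and all names are illustrative assumptions, not part of the task description.

```python
from typing import Dict, Iterable, Set, Tuple


def product_coverage(product_entities: Dict[str, Set[str]]) -> float:
    """Share of products for which at least one entity was found."""
    if not product_entities:
        return 0.0
    covered = sum(1 for entities in product_entities.values() if entities)
    return covered / len(product_entities)


def connected_share(entities: Set[str], edges: Iterable[Tuple[str, str]]) -> float:
    """Share of entities that take part in at least one crawled link."""
    if not entities:
        return 0.0
    linked = set()
    for subject, obj in edges:
        if subject in entities:
            linked.add(subject)
        if obj in entities:
            linked.add(obj)
    return len(linked) / len(entities)
```

A low `product_coverage` would suggest that the attribute extraction misses many products, while a low `connected_share` would hint at a sparsely interconnected knowledge graph.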
Submission of compulsory task
Prerequisite for participation
Report
- Introduction
- Problem statement (1 page)
- Solution proposal (1 page)
- Data analysis (2 pages)
- Related work (1 page)
1 team, max. 3 students
Deadline: 16.05.2017, 12:00
Cross-domain recommendations - 2nd project
“The research on cross-domain recommendation generally aims to exploit knowledge from a source domain D_S to perform or improve recommendations in a target domain D_T” [RS Handbook]
CDRS – Problem
For each user
- Given a set of likes for items in D_S
- Predict items in D_T
Using Semantic Web technologies
- Linked Open Data (DBpedia): knowledge graph
- Items are interconnected
CP – Solution scheme (not assessed)
Learning to Identify Complementary Products from DBpedia. Victor Anthony Arrascue Ayala, Trong-Nghia Cheng, Anas Alzoghbi, Georg Lausen. LDOW@WWW 2017
Evaluation using Facebook’s data (likes)
Liked ?
Goal: try the scheme
1. Reproduce pipeline
2. Implement a recommender on top
- Predict if a user would like the item
- Predict top-k recommendations
- *Optional: Integrate into RecRD4J
3. Evaluate the recommender
- Use standard metrics: Precision, Recall
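For the evaluation step, precision and recall are usually computed on the top-k list of each user. The following is a minimal sketch, assuming a ranked list of recommended item ids and the set of items the user actually liked in the target domain (the function name and data shapes are illustrative):

```python
from typing import List, Set, Tuple


def precision_recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> Tuple[float, float]:
    """Return (precision@k, recall@k) for one user's ranked recommendations."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    # Recall is undefined for a user without liked items; report 0.0 here.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Averaging both values over all test users gives the figures typically reported for a top-k recommender.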
Compulsory task
1. Read the paper
2. Build infrastructure
- Large dataset (approx. 15 GB)
3. Data analysis
- For each domain (books, movies, music)
- Quality of the interconnections
- Long-tail
- Sparsity
- Etc…
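Sparsity and the long tail can be quantified per domain with a few lines. This is a sketch under the assumption that the likes of one domain are available as (user, item) pairs; the `head_fraction` parameter and the function names are illustrative choices.

```python
from collections import Counter
from typing import Iterable, Tuple


def sparsity(likes: Iterable[Tuple[str, str]]) -> float:
    """1 - (observed user-item pairs / all possible user-item pairs)."""
    pairs = set(likes)
    if not pairs:
        return 1.0
    users = {user for user, _ in pairs}
    items = {item for _, item in pairs}
    return 1.0 - len(pairs) / (len(users) * len(items))


def head_share(likes: Iterable[Tuple[str, str]], head_fraction: float = 0.2) -> float:
    """Share of all likes received by the most popular `head_fraction` of items."""
    counts = Counter(item for _, item in set(likes))
    if not counts:
        return 0.0
    ranked = sorted(counts.values(), reverse=True)
    head_size = max(1, int(len(ranked) * head_fraction))
    return sum(ranked[:head_size]) / sum(ranked)
```

A `head_share` close to 1 means that a few popular items collect most likes, i.e. a pronounced long tail of rarely liked items.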
Submission of compulsory task
Prerequisite for participation
Report
- Introduction
- Problem statement (1 page)
- Solution proposal (1 page)
- Data analysis (2 pages)
- Related work (1 page)
1 team, max. 3 students
Deadline: 16.05.2017, 12:00
Scientific Paper recommendation - 3rd project
Recommend scientific papers to users
Content-based, collaborative filtering and hybrid
Paper features (meta-data)
- Textual features: Title, Abstract, Keyword list
- Non-textual features: Publication year, Authors, Venue, Publisher, …
Textual paper representation
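[Figure: term extraction turns a paper into a paper vector over the terms k_1 … k_n; the entries for k_1 … k_i are set to 1 and the entries for k_{i+1} … k_n hold tf-idf weights.]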
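The tf-idf part of such a paper vector can be sketched as follows. The weighting used here (relative term frequency times log(N / document frequency)) is one common variant and an assumption; the slide does not state which variant is meant, and the binary entries of the vector are not covered.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf_vectors(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Compute one sparse tf-idf vector (term -> weight) per tokenized paper."""
    n_docs = len(docs)
    # Document frequency: in how many papers does each term occur?
    df: Counter = Counter()
    for tokens in docs:
        df.update(set(tokens))
    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors
```

Terms that occur in every paper get weight 0 under this variant, so they do not contribute to the similarity between papers.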
Rating Matrix
HyPRec
Master Project WS 2016
Scientific papers recommender
Probabilistic Topic Modeling (LDA)
Matrix factorization (ALS Algorithm)
Python
GitHub https://github.com/mostafa-mahmoud/sahwaka
HyPRec - Architecture
Data Parser
- Citeulike Dataset (csv files)
- Mysql DB
Papers Model
- Textual representation: Tf-IDF
- Latent topics: LDA
- Features: Publication year, authors, publisher, ...
Recommender
- Content-Based Filtering: Item-based CBF
- Collaborative Filtering: Matrix Factorization CF
- Hybrid: Weighted (Linear Combination)
Evaluator
- Metrics Calculator: MRR, NDCG, Recall
- Train-Test splitter: User-Based K-Fold Split
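Two parts of this architecture are easy to illustrate in isolation: the weighted hybrid and the ranking metrics. The sketch below is not taken from HyPRec; the function names, the default weight of 0.5 and the binary-relevance form of NDCG are assumptions chosen for illustration.

```python
import math
from typing import Dict, List, Set


def hybrid_scores(cbf: Dict[str, float], cf: Dict[str, float],
                  weight: float = 0.5) -> Dict[str, float]:
    """Linear combination per item: weight * CBF score + (1 - weight) * CF score."""
    items = set(cbf) | set(cf)
    return {i: weight * cbf.get(i, 0.0) + (1 - weight) * cf.get(i, 0.0) for i in items}


def reciprocal_rank(ranked: List[str], relevant: Set[str]) -> float:
    """1 / position of the first relevant item, or 0.0 if none is ranked."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(rankings: Dict[str, List[str]],
                         relevant: Dict[str, Set[str]]) -> float:
    """MRR: average reciprocal rank over all users."""
    if not rankings:
        return 0.0
    total = sum(reciprocal_rank(r, relevant.get(u, set())) for u, r in rankings.items())
    return total / len(rankings)


def ndcg_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """NDCG@k with binary relevance (1 if the item is relevant, else 0)."""
    dcg = sum(1.0 / math.log2(pos + 1)
              for pos, item in enumerate(ranked[:k], start=1) if item in relevant)
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(pos + 1) for pos in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0
```

With `weight=1.0` the hybrid reduces to pure content-based scores and with `weight=0.0` to pure collaborative scores, which makes the weight a natural parameter to tune during evaluation.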
Regulations
One team – max 3 students
Weekly meetings
Programming language: Python
Regulations
Compulsory task (Deadline: 17.05.2017, prerequisite for participation)
- Get familiar with HyPRec
- Implement a simple Recommender (User-based CF)
- Submit evaluation results (small presentation)
Starting Report (Submission: 24.05.2017)
- Problem statement (1 page)
- Solution proposal (1 page)
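A simple user-based CF recommender, as asked for in the compulsory task, could look roughly like the sketch below. It is independent of HyPRec: the rating dictionary, cosine similarity and the neighbourhood size are assumptions for illustration, not the project's interfaces.

```python
import math
from typing import Dict, List, Tuple

Ratings = Dict[str, Dict[str, float]]  # user -> {paper id: rating}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse rating vectors."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(user: str, ratings: Ratings, k_neighbours: int = 2,
              top_n: int = 5) -> List[Tuple[str, float]]:
    """Score unseen papers by the similarity-weighted ratings of the nearest users."""
    target = ratings[user]
    neighbours = sorted(
        ((cosine(target, other_ratings), other)
         for other, other_ratings in ratings.items() if other != user),
        reverse=True,
    )[:k_neighbours]
    scores: Dict[str, float] = {}
    for similarity, other in neighbours:
        if similarity <= 0:
            continue  # users without overlap carry no signal
        for paper, rating in ratings[other].items():
            if paper not in target:
                scores[paper] = scores.get(paper, 0.0) + similarity * rating
    # Highest score first; ties are broken by paper id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
```

With implicit feedback, such as papers saved to a user's library, every rating can simply be set to 1.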
New Wikipedia Article Recommendation - 4th project
Motivation: Writing New Wikipedia Articles
What to write about?
- Michael Slager
- LG G4
- Dan Fredinburg
- Oleg Kalashnikov
- Adult Beginners
What to do?
1. Use the list of requested articles: https://en.wikipedia.org/wiki/Wikipedia:Requested_articles
2. Read news or consume other media.
Automatically recommend relevant novel Wikipedia articles based on a news stream.
Distinguish between notable and not-notable entities
Existing Approach for Recommending New Wikipedia Articles
See Färber et al.: “On Emerging Entity Detection”, EKAW 2016.
Task
Build a “live system” for Wikipedia article recommendation.
Task
Improve the system via…
- Better selection of news sources
- Distributed processing of news articles (especially text annotation)
- Consideration of very recently added Wikipedia pages
- Better features / adapted existing features
- Improved binary classification, e.g., by using a recurrent neural network
- Word embeddings for a better representation of candidates in news articles
- Other knowledge graphs, e.g., Wikidata or CrunchBase
Compulsory task
1. Read related work (esp. “On Emerging Entity Detection”, EKAW 2016).
2. Extract Wikipedia articles which were inserted between two Wikipedia dumps (given the Wikipedia indices).
3. Annotate news articles (from between the Wikipedia versions) via an entity linking tool and extract noun phrases.
4. Calculate statistics about annotations.
5. Correlate new Wikipedia articles and their mentions with meta-information of news articles (e.g., which sources are suitable for predicting new Wikipedia articles).
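Step 2 can be approached as a set difference over the two index files. The sketch below assumes index lines of the form `offset:page_id:title`, which is how the multistream index files of Wikipedia dumps are commonly laid out; this format, the function names and the handling of renamed pages are assumptions that should be checked against the actual files.

```python
from typing import Dict, Iterable, Set


def parse_index(lines: Iterable[str]) -> Dict[int, str]:
    """Map page id -> title for index lines shaped like 'offset:page_id:title'."""
    pages: Dict[int, str] = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Titles may contain colons themselves, so split at most twice.
        _offset, page_id, title = line.split(":", 2)
        pages[int(page_id)] = title
    return pages


def new_articles(old_lines: Iterable[str], new_lines: Iterable[str]) -> Set[str]:
    """Titles whose page id and title both appear only in the newer index."""
    old_pages = parse_index(old_lines)
    new_pages = parse_index(new_lines)
    old_titles = set(old_pages.values())
    return {
        title for page_id, title in new_pages.items()
        if page_id not in old_pages and title not in old_titles
    }
```

The result will still contain redirects and non-article pages, so a further filtering step is likely needed before the titles are used as ground truth.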
Submission of compulsory task
1 team, max. 2 students
Report, Deadline: 16.05.2017, 12:00, prerequisite for participation
- Introduction (1 page)
- Data analysis (2 pages)
- Related work (1 page)
Project proposal (24.05.2017)
- Additional sections: Problem statement (1 page), proposed approach/improvements of the system (2 pages), proposed evaluation (1 page)
Citation Recommendation - 5th project
Idea: Enrich (scientific) text with citation markers (e.g., “[1]”) and references.
Approach
1. Create model:
- Extract citations with context from publication corpus.
- Develop & implement features for ranking publications.
2. Apply model:
- Extract citation contexts from input text.
- Determine which publications to cite in which context.
- Add citations to text.
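Extracting citations with their context can be prototyped with a regular expression. This sketch assumes numeric markers such as `[1]` or `[2, 3]` and a fixed character window around each marker; the pattern, the window size and the function name are illustrative assumptions, and real corpora use many other citation styles.

```python
import re
from typing import List, Tuple

# Matches numeric citation markers like [1] or [2, 3].
MARKER = re.compile(r"\[(\d+(?:\s*,\s*\d+)*)\]")


def citation_contexts(text: str, window: int = 50) -> List[Tuple[List[int], str]]:
    """Return (reference numbers, surrounding text) for every citation marker."""
    contexts = []
    for match in MARKER.finditer(text):
        refs = [int(number) for number in match.group(1).split(",")]
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        # Keep the text on both sides of the marker, without the marker itself.
        context = (text[start:match.start()] + text[match.end():end]).strip()
        contexts.append((refs, context))
    return contexts
```

The extracted contexts can then serve as training examples that pair a piece of text with the publications cited at that position.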
Useful Data Sets
“Scholarly”
- 101k papers in computer science domain, PDF+metadata
arXiv.org
- Over 1M papers (PDF+metadata)
- Different fields: Physics, Mathematics, Computer Science, Quantitative Biology, Quantitative Finance and Statistics
CiteSeerX
- Database with publications and citations
- Approx. 7M papers
DBLP, Microsoft Academic Graph, …
Compulsory task
Read related work
Analyze and compare existing data sets for citation recommendation, including
- citation context extraction
- publication meta-data retrieval
- citation graph creation
- incorporating external data sets (e.g., DBLP, PageRank, …)
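Citation graph creation and a PageRank-style score can be sketched in a few lines. The example assumes that citations are available as (citing, cited) id pairs; the damping factor of 0.85, the fixed number of iterations and the function names are conventional choices made for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def build_graph(citations: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Adjacency list: paper -> list of papers it cites."""
    graph: Dict[str, List[str]] = {}
    for citing, cited in citations:
        graph.setdefault(citing, []).append(cited)
        graph.setdefault(cited, [])  # make sure cited-only papers are nodes too
    return graph


def pagerank(graph: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Power-iteration PageRank; rank of papers without references is spread evenly."""
    n = len(graph)
    if n == 0:
        return {}
    rank = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        dangling = sum(rank[node] for node, outs in graph.items() if not outs)
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in graph}
        for node, outs in graph.items():
            for target in outs:
                new_rank[target] += damping * rank[node] / len(outs)
        rank = new_rank
    return rank
```

Such a score could be one of several features for ranking candidate publications, next to plain citation counts.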
Submission of compulsory task
1 team, max. 3 students
Report, Deadline: 16.05.2017, 12:00, prerequisite for participation
- Introduction (1 page)
- Analysis & comparison of data sets and tools (2 pages)
- Related work (for task in general) (2 pages)
Project proposal (24.05.2017)
- Additional sections: Problem statement (1 page), proposed approach (2 pages), proposed evaluation (1 page)