Presentation at the Iknow 2012 conference in Graz, Austria.
© author(s) of these slides including research results from the KOM research network and TU Darmstadt; otherwise it is specified at the respective slide 7-Sep-12
Prof. Dr.-Ing. Ralf Steinmetz KOM - Multimedia Communications Lab
Iknow_Ranking_Sem_Info_v9.0__2012.09.07_MA.pptx
Ranking Resources in Folksonomies By Exploiting Semantic Information
i-KNOW 2012, Saarbrücken
Thomas Rodenhausen Mojisola Anjorin Renato Domínguez García Christoph Rensing Ralf Steinmetz
[Figure: folksonomy example with users, tags (Research Talk, Ranking, Algorithms, Slideshare) and resources; tag types: Type, Event, Person, Location, Other, Topic, Activity]
Social Tagging Applications
www.wordle.net
Social tagging applications are used to organize, classify, manage and share knowledge resources.
- Tags are freely chosen keywords attached to resources
- Tags often describe an aspect of the resource
[Figure: tag cloud with example tags: Keynote, I-know 2012, Lora Aroyo, Graz, Semantic Web, Dial E for Events, Preparing for Conference]
Recommendations in Social Tagging Applications
Overview
- Motivation
- Basics
- Folksonomy
- Folksonomy Extended by Tag Types
- Graph-based Resource Recommendation
- Challenge: Concept Drift
- AspectScore & InteliScore
- Evaluation Methodology and Metrics
- Results
- Conclusion & Future Work
Folksonomy
A folksonomy is a quadruple F := (U, T, R, Y), where
- U: users
- T: tags
- R: resources
- Y ⊆ U × T × R: tag assignments
[Figure: example folksonomy linking users, tags (Research Talk, Ranking, Algorithms, Slideshare) and resources]
[Hotho et al. 2006]
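The definition above can be held in a very small data structure. The following is a minimal sketch, not code from the talk: the names `TagAssignment` and `Folksonomy` and the sample data are illustrative assumptions, and U, T and R are simply derived from Y, so users without any tag assignment are not represented.

```python
from typing import NamedTuple


class TagAssignment(NamedTuple):
    """One element of Y, in the order U x T x R."""
    user: str
    tag: str
    resource: str


class Folksonomy:
    """F := (U, T, R, Y); U, T and R are derived from the assignments Y."""

    def __init__(self, assignments):
        self.Y = set(assignments)
        self.U = {y.user for y in self.Y}
        self.T = {y.tag for y in self.Y}
        self.R = {y.resource for y in self.Y}


# Hypothetical sample data, for illustration only.
folksonomy = Folksonomy([
    TagAssignment("alice", "ranking", "slides-1"),
    TagAssignment("alice", "algorithms", "slides-1"),
    TagAssignment("bob", "ranking", "paper-7"),
])
```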
Folksonomy Extended by Tag Types
An extended folksonomy is a quintuple F_A := (U, T, R, A, Y), where
- U: users
- T: tags
- R: resources
- A: tag types
- Y ⊆ U × T × R × A: tag assignments
A = {Topic, Resource Type, Location, Person, Event, Activity, Other}
[Böhnstedt et al. 2009]
[Figure: example folksonomy linking users, tags (Research Talk, Ranking, Algorithms, Slideshare) and resources, with tag types Type, Event, Person, Location, Other, Topic, Activity]
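As a rough illustration of the extended definition, each tag assignment can carry a fourth component, the tag type. This is a sketch under assumptions: the names `TagType`, `TypedTagAssignment` and `tags_of_type` and the sample assignments are hypothetical; only the seven tag types come from the slide.

```python
from enum import Enum
from typing import NamedTuple


class TagType(Enum):
    """The set A of tag types listed on the slide."""
    TOPIC = "Topic"
    RESOURCE_TYPE = "Resource Type"
    LOCATION = "Location"
    PERSON = "Person"
    EVENT = "Event"
    ACTIVITY = "Activity"
    OTHER = "Other"


class TypedTagAssignment(NamedTuple):
    """One element of Y, in the order U x T x R x A."""
    user: str
    tag: str
    resource: str
    tag_type: TagType


# Hypothetical sample data: the same tag used with two different types.
Y = {
    TypedTagAssignment("alice", "Tango", "news-about-tango", TagType.TOPIC),
    TypedTagAssignment("alice", "Buenos Aires", "news-about-tango", TagType.TOPIC),
    TypedTagAssignment("alice", "Buenos Aires", "tourism-in-argentina", TagType.LOCATION),
}


def tags_of_type(assignments, tag_type):
    """Return all tags that were assigned with the given tag type."""
    return {y.tag for y in assignments if y.tag_type is tag_type}
```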
Graph-based Resource Recommendation
[Figure: Folksonomy Graph → Scorer (graph-based ranking algorithm, e.g. PageRank, HITS) → Ranked Resources]

Recommendation List
Item  Score
r1    0.9
r2    0.7
r3    0.5
r4    0.2
Adapted PageRank
[Figure: example folksonomy graph with tag nodes Tango, Buenos Aires (two nodes) and Dancing Festival]
PageRank's intelligent surfer model
- The ranking of a node is determined by how often the surfer visits the node.
- Adjoining edges are followed with a certain probability, determined by the edge weights.
- The query node acts as the starting point and focus, i.e. the surfer returns to this node with a certain probability, determined by the node weights.
[Hotho et al. 2006]
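The surfer model above can be sketched as a power iteration over a weighted, undirected graph. This is a minimal illustration under assumptions, not the implementation used in the talk: the function name `adapted_pagerank`, the damping value 0.7, the fixed iteration count and the toy graph are all hypothetical. Every node is assumed to have at least one edge, and at least one node weight must be positive.

```python
def adapted_pagerank(graph, preference, damping=0.7, iterations=100):
    """Rank nodes with a random surfer that follows weighted edges.

    graph: {node: {neighbour: edge_weight}}, each undirected edge listed
           in both directions.
    preference: {node: node_weight}; normalised here to sum to 1.
    """
    nodes = list(graph)
    total_pref = sum(preference.get(n, 0.0) for n in nodes)
    p = {n: preference.get(n, 0.0) / total_pref for n in nodes}
    w = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # With probability (1 - damping) the surfer jumps back to a
        # node chosen according to the node weights.
        new_w = {n: (1.0 - damping) * p[n] for n in nodes}
        for u in nodes:
            out = sum(graph[u].values())
            for v, weight in graph[u].items():
                # Otherwise an edge is followed in proportion to its weight.
                new_w[v] += damping * w[u] * weight / out
        w = new_w
    return w


# Hypothetical toy graph: one user, two tags, two resources.
graph = {
    "user": {"Tango": 1, "Buenos Aires": 1},
    "Tango": {"user": 1, "r1": 1},
    "Buenos Aires": {"user": 1, "r2": 1},
    "r1": {"Tango": 1},
    "r2": {"Buenos Aires": 1},
}
scores = adapted_pagerank(graph, {"Tango": 1.0})
```

With all node weight on Tango, the resource attached to Tango (`r1`) ends up ranked above `r2`.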
Challenge: Concept Drift
Concept drift is a challenge for graph-based ranking algorithms.
- e.g. ambiguous tags can cause concept drift, as a single tag might represent multiple semantic concepts
[Figure: the ambiguous tag "football" connects the resources FC Barcelona Website, News about Messi and Dallas Cowboys' Website]
InteliScore
The semantic information gained from the semantic relatedness between tags is used to reduce concept drift.
Semantic Relatedness (XESA)
XESA calculates the semantic relatedness between pairs of tokens (tags) using the English Wikipedia as reference corpus.
[Scholl et al. 2010]
[Figure: tags Tango, Buenos Aires and Dancing Festival with the value 0.005; image source: wikipedia.org]
KOM – Multimedia Communications Lab 12
AspectScore
Tag types help to alleviate concept drift.
- Tags are disambiguated with respect to the different aspects of a resource that a user may describe while tagging
[Figure: resources News about Tango and Tourism in Argentina tagged with Buenos Aires, with tag types topic and location]
KOM – Multimedia Communications Lab 13
AspectScore
Tag types help to alleviate concept drift.
- e.g. by focusing on the tags describing the content of resources
Assumption: The tags of type "Topic" describe the content of the resources well; therefore, "Topic" tags are given priority.
[Figure: resources News about Tango, Tourism in Argentina and Google Map of Buenos Aires tagged with Buenos Aires, with tag types topic and location]
KOM – Multimedia Communications Lab 14
AspectScore: Step 1
1. Transform Query Node into Query Tags
The user query node is transformed into tag nodes, weighted by how frequently the user has used each tag.
Assumption: The tags of a user describe the user's interests well.
[Figure: query tags with usage frequencies: Tango 3, Buenos Aires 1, Buenos Aires 1]
[Abel 2011]
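Step 1 can be sketched as a frequency count over the tag assignments of the query user. This is an illustration under assumptions: the function name `query_tags_for_user` and the sample data are hypothetical, and counting the same tag separately per tag type is an assumption suggested by the two separate Buenos Aires nodes in the figure.

```python
from collections import Counter


def query_tags_for_user(assignments, user):
    """Turn a user query node into weighted query tags.

    assignments: iterable of (user, tag, resource, tag_type) tuples.
    Returns {(tag, tag_type): number of times the user used it}.
    """
    return Counter(
        (tag, tag_type)
        for (u, tag, _resource, tag_type) in assignments
        if u == user
    )


# Hypothetical sample data, chosen to reproduce the weights in the figure.
Y = [
    ("alice", "Tango", "r1", "Topic"),
    ("alice", "Tango", "r2", "Topic"),
    ("alice", "Tango", "r3", "Topic"),
    ("alice", "Buenos Aires", "r1", "Location"),
    ("alice", "Buenos Aires", "r4", "Topic"),
    ("bob", "Tango", "r1", "Topic"),
]
query_tags = query_tags_for_user(Y, "alice")
```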
KOM – Multimedia Communications Lab 15
AspectScore: Step 1
1. Transform Query Node into Query Tags
[Figure: folksonomy graph with tag nodes Tango, Buenos Aires (two nodes) and Dancing Festival; labels: Query Node, Query Tags, Query Tag]
AspectScore: Step 2
1. Transform Query Node into Query Tags
2. Create Folksonomy Graph for each Query Tag
Depending on the ranking algorithm, e.g. FolkRank
[Figure: folksonomy graph with node weights Tango 3, Buenos Aires 0, Buenos Aires 0, Dancing Festival 0]
AspectScore: Step 3
1. Transform Query Node into Query Tags
2. Create Folksonomy Graph for each Query Tag
3. Adapt Edge Weights
Edge weights are adapted (in several iteration steps) depending on the query tag.
[Figure: folksonomy graph with adapted edge weights and node weights Tango 3, Buenos Aires 0, Buenos Aires 0, Dancing Festival 0]
AspectScore: Step 4
1. Transform Query Node into Query Tags
2. Create Folksonomy Graph for each Query Tag
3. Adapt Edge Weights
4. Run Ranking Algorithm
Run e.g. FolkRank on the adapted folksonomy graph.
[Figure: adapted folksonomy graph with node weights Tango 3, Buenos Aires 0, Buenos Aires 0, Dancing Festival 0]
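As a rough sketch of step 4, FolkRank can be approximated as a differential ranking: a weighted PageRank with the query preference minus a baseline ranking without it. Several things here are assumptions rather than details from the talk: the function names, the damping value 0.7, the fixed iteration count, the toy graph, and the use of a uniform preference vector with the same damping for the baseline. Every node is assumed to have at least one edge.

```python
def weighted_pagerank(graph, preference, damping=0.7, iterations=100):
    """Power iteration on {node: {neighbour: edge_weight}}."""
    nodes = list(graph)
    total_pref = sum(preference.get(n, 0.0) for n in nodes)
    p = {n: preference.get(n, 0.0) / total_pref for n in nodes}
    w = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_w = {n: (1.0 - damping) * p[n] for n in nodes}
        for u in nodes:
            out = sum(graph[u].values())
            for v, weight in graph[u].items():
                new_w[v] += damping * w[u] * weight / out
        w = new_w
    return w


def folkrank(graph, preference, damping=0.7):
    """Differential ranking: preference ranking minus baseline ranking."""
    uniform = {n: 1.0 for n in graph}  # baseline: no node is preferred
    with_pref = weighted_pagerank(graph, preference, damping)
    baseline = weighted_pagerank(graph, uniform, damping)
    return {n: with_pref[n] - baseline[n] for n in graph}


# Hypothetical toy graph: one user, two tags, two resources.
graph = {
    "user": {"Tango": 1, "Buenos Aires": 1},
    "Tango": {"user": 1, "r1": 1},
    "Buenos Aires": {"user": 1, "r2": 1},
    "r1": {"Tango": 1},
    "r2": {"Buenos Aires": 1},
}
scores = folkrank(graph, {"Tango": 3.0})
```

Subtracting the baseline removes the part of a node's score that comes from its general popularity, so the result reflects what the query tag contributes.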
AspectScore: Step 5
1. Transform Query Node into Query Tags
2. Create Folksonomy Graph for each Query Tag
3. Adapt Edge Weights
4. Run Ranking Algorithm
5. Accumulate Results for each Query Node
The resulting rankings are accumulated, giving preference to certain tag types, e.g. topic tags.
[Figure: query tags with accumulation weights: Tango 3·δ_Topic, Buenos Aires 1·δ_Location, Buenos Aires 1·δ_Topic]
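One way to read step 5 is as a weighted sum of the per-query-tag rankings, where each ranking is scaled by the usage frequency of the query tag and a tag-type weight δ. The weighted-sum form, the function name `accumulate`, the δ values and the scores below are assumptions made for illustration; the slide only shows the factors 3·δ_Topic, 1·δ_Location and 1·δ_Topic.

```python
from collections import defaultdict


def accumulate(rankings, frequencies, type_weights):
    """Combine per-query-tag rankings into one ranked resource list.

    rankings: {(tag, tag_type): {resource: score}}
    frequencies: {(tag, tag_type): usage count for the query user}
    type_weights: {tag_type: delta}, larger means more preferred
    """
    total = defaultdict(float)
    for query_tag, scores in rankings.items():
        _tag, tag_type = query_tag
        factor = frequencies[query_tag] * type_weights[tag_type]
        for resource, score in scores.items():
            total[resource] += factor * score
    return sorted(total.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores and weights, for illustration only.
rankings = {
    ("Tango", "Topic"): {"r1": 0.5, "r2": 0.1},
    ("Buenos Aires", "Location"): {"r2": 0.6, "r3": 0.4},
    ("Buenos Aires", "Topic"): {"r3": 0.3},
}
frequencies = {
    ("Tango", "Topic"): 3,
    ("Buenos Aires", "Location"): 1,
    ("Buenos Aires", "Topic"): 1,
}
type_weights = {"Topic": 1.0, "Location": 0.5}
result = accumulate(rankings, frequencies, type_weights)
```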
Evaluation Methodology: LeavePostOut
A post is the set of all tag assignments of a user u for a resource r: P_{u,r} = {(u, t, r) | (u, t, r) ∈ Y}
For LeavePostOut, the recommendation task with a user as input is harder than with a tag as input.
[Figure: two example graphs with tags Tango, Buenos Aires and Dancing Festival]
[Jäschke et al. 2007]
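The LeavePostOut split can be sketched as removing one whole post from Y and keeping it as the held-out test data. The function name `leave_post_out` and the sample data are hypothetical; triples follow the order of Y ⊆ U × T × R, i.e. (user, tag, resource).

```python
def leave_post_out(Y, user, resource):
    """Split Y into training data and the held-out post of (user, resource).

    Y: set of (user, tag, resource) tag assignments.
    """
    post = {(u, t, r) for (u, t, r) in Y if u == user and r == resource}
    return Y - post, post


# Hypothetical sample data.
Y = {
    ("alice", "Tango", "r1"),
    ("alice", "Buenos Aires", "r1"),
    ("alice", "Tango", "r2"),
    ("bob", "Tango", "r1"),
}
training, held_out = leave_post_out(Y, "alice", "r1")
```

After the split, the training data no longer links alice to r1, while the tag Tango is still linked to r1 through the post of bob.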
KOM – Multimedia Communications Lab 23
Evaluation Methodology: LeaveRTOut
RT_{r,t} = {(u, t, r) | (u, t, r) ∈ Y}, i.e. all assignments of a tag t to a resource r
For LeaveRTOut, the recommendation task with a tag as input is harder than with a user as input.
[Figure: two example graphs with tags Tango, Buenos Aires and Dancing Festival]
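The LeaveRTOut split can be sketched in the same way, this time removing every assignment that connects one resource with one tag, regardless of the user. The function name `leave_rt_out` and the sample data are hypothetical; triples follow the order (user, tag, resource).

```python
def leave_rt_out(Y, resource, tag):
    """Split Y into training data and the held-out set RT for (resource, tag).

    Y: set of (user, tag, resource) tag assignments.
    """
    held_out = {(u, t, r) for (u, t, r) in Y if r == resource and t == tag}
    return Y - held_out, held_out


# Hypothetical sample data.
Y = {
    ("alice", "Tango", "r1"),
    ("alice", "Buenos Aires", "r1"),
    ("alice", "Tango", "r2"),
    ("bob", "Tango", "r1"),
}
training, held_out = leave_rt_out(Y, "r1", "Tango")
```

Here the training data keeps the link between alice and r1 (through Buenos Aires) but no longer contains any link between Tango and r1.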
KOM – Multimedia Communications Lab 24
Bibsonomy corpus with a p-core extraction at level 5 to reduce noise and to focus on the dense portion of the corpus
Evaluation Corpus
Knowledge and Data Engineering Group, University of Kassel: Benchmark Folksonomy Data from Bibsonomy, version of July 7th 2011
Corpus size before and after the p-core extraction:

                      Before    After
Users                   7243       69
Bookmark resources    281550        9
Bibtex resources      469654      134
Tags                  216094      179
Tag assignments      2740834     3269
Bookmark posts        330192       51
Bibtex posts          526691      959

Tag Type               Count
Topic                   2225
Other                    486
Resource Type            198
Event                    182
Person/Organisation      143
Activity                  35
FReSET – Domínguez García et al 2012 http://www.kom.tu-darmstadt.de/research-results/downloads/software/freset/
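The p-core extraction can be sketched as an iterative pruning. This sketch assumes the common definition in which every user, tag and resource must occur in at least k posts (k = 5 in the talk); the function name `p_core`, the data layout and the sample data are hypothetical and not taken from the talk.

```python
from collections import Counter


def p_core(posts, k):
    """Prune posts until every user, tag and resource occurs in >= k posts.

    posts: {(user, resource): set of tags}
    """
    posts = {key: set(tags) for key, tags in posts.items()}
    while True:
        users = Counter(u for (u, _r) in posts)
        resources = Counter(r for (_u, r) in posts)
        tags = Counter(t for ts in posts.values() for t in ts)
        pruned = {}
        for (u, r), ts in posts.items():
            if users[u] < k or resources[r] < k:
                continue  # drop posts of rare users and rare resources
            kept = {t for t in ts if tags[t] >= k}  # drop rare tags
            if kept:
                pruned[(u, r)] = kept
        if pruned == posts:
            return pruned  # nothing changed, so the core is stable
        posts = pruned


# Hypothetical sample data, pruned at level 2.
posts = {
    ("u1", "r1"): {"a", "b"},
    ("u1", "r2"): {"a"},
    ("u2", "r1"): {"a"},
    ("u2", "r2"): {"a", "c"},
    ("u3", "r3"): {"a"},
}
core = p_core(posts, 2)
```

The loop repeats because removing one rare item can push another item below the threshold.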
KOM – Multimedia Communications Lab 25
Evaluation Metrics
Mean Normalized Precision:
The mean of the normalized Precision at k over several queries Q

\[
\mathrm{MNP}(Q, k) = \frac{1}{|Q|} \sum_{j=1}^{|Q|} \frac{\mathrm{Precision}_j(k)}{\mathrm{Precision}_{\max,j}(k)}
\]

Mean Average Precision:
The mean of the Average Precision over several queries Q

\[
\mathrm{MAP}(Q) = \frac{1}{|Q|} \sum_{j=1}^{|Q|} \frac{1}{m_j} \sum_{k=1}^{m_j} \mathrm{Precision}(R_{jk})
\]
[Manning et al 2008]
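The two metrics can be sketched in a few lines. Two points are assumptions rather than statements from the slide: the maximum achievable precision at k is taken to be min(k, m_j) / k, where m_j is the number of relevant resources for query j, and relevant resources that never appear in the ranking contribute a precision of 0 to the Average Precision. All function names and the sample queries are hypothetical; every query is assumed to have at least one relevant resource.

```python
def precision_at_k(ranking, relevant, k):
    """Fraction of the top k ranked items that are relevant."""
    return sum(1 for item in ranking[:k] if item in relevant) / k


def normalized_precision_at_k(ranking, relevant, k):
    """Precision at k divided by the best precision achievable at k."""
    best = min(k, len(relevant)) / k
    return precision_at_k(ranking, relevant, k) / best


def average_precision(ranking, relevant):
    """Mean of the precision values at the rank of each relevant item."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Hypothetical queries: (ranked result list, set of relevant resources).
queries = [
    (["a", "b", "c", "d"], {"a", "c"}),
    (["x", "y"], {"y"}),
]
mnp = mean(normalized_precision_at_k(r, rel, 2) for r, rel in queries)
map_score = mean(average_precision(r, rel) for r, rel in queries)
```

For these two queries the sketch gives MNP(Q, 2) = 0.75 and MAP(Q) = 2/3.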
Visualization of Results with Violin Plots
A violin plot is a combination of a box plot and a density trace.
[Hintze et al. 1998]
[Figure: violin plot with the median, the 1st quartile and the 3rd quartile marked]
Evaluation Results for LeavePostOut
Approach      MAP
AspectScore   0.2240
FolkRank      0.2136
InteliScore   0.1801
Popularity    0.0937
Evaluation results for the recommendation task having tag as input
Evaluation Results for LeaveRTOut
Approach      MAP
Popularity    0.0834
AspectScore   0.0589
FolkRank      0.0529
InteliScore   0.0433
Evaluation results for the recommendation task having tag as input
Conclusion and Future Work
Exploiting semantic information for resource ranking in folksonomies
- AspectScore: tag disambiguation and importance of tags (based on type)
- InteliScore: based on semantic relatedness between tags, e.g. XESA
Limitations
- Manually labeled tag type dataset: error prone, subjective
- XESA is based on the English Wikipedia: no semantic relatedness measurable for 27% of tags in corpus
Future Work
- Evaluation using the CROKODIL corpus, an e-learning application with tag types
- User study
[Figure: tag types Type, Event, Person, Location, Other, Topic, Activity]
www.crokodil.de
Questions & Contact