NTNU@MediaEval 2011: Social Event Detection Task (SED)
Massimiliano Ruocco, Heri Ramampiaro
Data and Information Management Group
Department of Computer and Information Science
Norwegian University of Science and Technology
MediaEval 2011 Workshop - Pisa
MediaEval2011 – SED Task
Outline
- Proposed Approach
- Experiments
- Results
- Conclusions and Future Work
Framework - Workflow
[Workflow diagram. Blocks: Query Expansion, Search, Clustering, Semantic Merge, Refinement, Categorization. Resources: Dataset, LastFM, DBPedia (SPARQL endpoint). Output: Clustered List.]
Challenge 1
Framework – Challenge 1
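Query Expansion
- Names of football venues located in Rome and Barcelona
- Location of each venue (latitude and longitude)
- Output: list of venue names in different languages, each with its location
- $V = \{(v_1^1, \ldots, v_{N_1}^1, g_1), \ldots, (v_1^M, \ldots, v_{N_M}^M, g_M)\}$, where $v_i^j$ is the $i$-th name of venue $j$ and $g_j$ is its location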
Resources
- Query language: SPARQL
- Database: DBPedia
- Java interface: Jena
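The query-expansion step above could be sketched roughly as follows. This is a minimal Python sketch, not the authors' Jena code: the function names `build_venue_query` and `group_venues`, the row format, and the choice of the `dbo:Stadium` class and `dbo:location` property are assumptions made for illustration. The slides only state that SPARQL queries were run against DBPedia.

```python
def build_venue_query(city_resource: str) -> str:
    """Build a SPARQL query for stadiums located in a city.

    The class and property names are assumed; the slides do not show
    the actual query.
    """
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?venue ?label ?lat ?long WHERE {{
  ?venue a dbo:Stadium ;
         dbo:location <{city_resource}> ;
         rdfs:label ?label ;
         geo:lat ?lat ;
         geo:long ?long .
}}
"""


def group_venues(rows):
    """Group result rows into (names, location) pairs, one per venue.

    Each row is a dict with keys "venue", "label", "lat" and "long"
    (a hypothetical, already-parsed form of the SPARQL result set).
    """
    grouped = {}
    for row in rows:
        names, _ = grouped.setdefault(
            row["venue"], ([], (row["lat"], row["long"]))
        )
        if row["label"] not in names:
            names.append(row["label"])
    return list(grouped.values())
```

Each returned pair corresponds to one element of V: the multilingual labels play the role of the venue names and the coordinate pair plays the role of the location.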
Search + Categorization
- Queries: terms combined with OR, or terms combined with OR plus a spatial constraint
- Categorization over different textual metadata (title, tags, description)
- Output: result list grouped by venue (topic: soccer)
- $R = \{(r_1^1, \ldots, r_{N_1}^1), \ldots, (r_1^M, \ldots, r_{N_M}^M)\}$
Resources
- Index: Solr
- Categorization: SemanticHacker API
- Categories: Open Directory Project
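As an illustration of the two query forms in the search step, the sketch below builds Solr request parameters for a set of venue names, optionally restricted by distance from the venue. It is only a sketch under assumptions: the helper `build_solr_params`, the field names `text` and `location`, and the radius are hypothetical, since the slides do not describe the index schema. The `{!geofilt ...}` filter is Solr's standard distance filter.

```python
def build_solr_params(venue_names, location=None, radius_km=1.0,
                      operator="OR", text_field="text",
                      geo_field="location"):
    """Build Solr query parameters for one venue.

    venue_names: name variants of the venue (one tuple of V).
    location:    optional (lat, lon) pair; when given, a geofilt
                 filter query adds the spatial constraint.
    operator:    "OR" or "AND", the way the terms are combined.
    """
    quoted = [f'{text_field}:"{name}"' for name in venue_names]
    params = {"q": f" {operator} ".join(quoted)}
    if location is not None:
        lat, lon = location
        # Keep only documents within radius_km of the venue.
        params["fq"] = (
            f"{{!geofilt sfield={geo_field} pt={lat},{lon} d={radius_km}}}"
        )
    return params
```

Calling it with `location=None` gives the plain "terms in OR" query; passing the venue coordinates gives the variant with the spatial constraint.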
Clustering
- Results grouped by temporal tag
- Quality Threshold clustering (QT clustering)
- Output: pictures grouped by temporal tag and venue
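For readers unfamiliar with QT clustering, here is a minimal sketch of the general algorithm applied to one-dimensional timestamps. It assumes the cluster diameter is the difference between the latest and earliest timestamp; the function name `qt_cluster` and the threshold value are illustrative, and the slides do not state which distance or threshold was actually used.

```python
def qt_cluster(values, max_diameter):
    """Quality Threshold clustering of 1-D values (e.g. timestamps).

    Repeatedly: grow a candidate cluster around every remaining point
    by adding the point that keeps the diameter smallest, stop when the
    diameter would exceed max_diameter, keep the largest candidate,
    remove its points, and start again on the rest.
    """
    remaining = list(range(len(values)))
    clusters = []
    while remaining:
        best = []
        for seed in remaining:
            candidate = [seed]
            lo = hi = values[seed]
            pool = [i for i in remaining if i != seed]
            while pool:
                # Point whose addition yields the smallest diameter.
                nxt = min(
                    pool,
                    key=lambda i: max(hi, values[i]) - min(lo, values[i]),
                )
                new_lo = min(lo, values[nxt])
                new_hi = max(hi, values[nxt])
                if new_hi - new_lo > max_diameter:
                    break
                candidate.append(nxt)
                pool.remove(nxt)
                lo, hi = new_lo, new_hi
            if len(candidate) > len(best):
                best = candidate
        clusters.append(sorted(values[i] for i in best))
        chosen = set(best)
        remaining = [i for i in remaining if i not in chosen]
    return clusters
```

Unlike k-means, the number of clusters is not fixed in advance; only the maximum diameter is, which fits event detection where the number of events is unknown.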
Challenge 2
Framework – Challenge 2
Query Expansion
- Extraction of the location and venue names for “Paradiso” and “Parc del Fórum”
- Output: list of venue names, each with its location
- $V = \{(v_1^1, \ldots, v_{N_1}^1, g_1), \ldots, (v_1^M, \ldots, v_{N_M}^M, g_M)\}$
Resources
- Service: LastFM API
- Database: LastFM
Search
- Queries: terms combined with OR plus a spatial constraint, or terms combined with AND
- Output: result list grouped by venue
- $R = \{(r_1^1, \ldots, r_{N_1}^1), \ldots, (r_1^M, \ldots, r_{N_M}^M)\}$
Resources
- Index: Solr
Clustering + Semantic Merge
- Results grouped by temporal tag
- Quality Threshold clustering (QT clustering)
- Semantic merge: based on entity names representing the artist and the event name
- Semantic similarity: number of shared entity names
- Output: pictures grouped by temporal tag and venue
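The semantic merge can be pictured with the sketch below, which merges clusters whose entity-name sets overlap. It is a simplified, greedy single-pass version written for illustration: the function `semantic_merge`, the `min_shared` threshold and the cluster representation are assumptions, since the slides define the similarity (number of shared entity names) but not the merge rule or the threshold.

```python
def semantic_merge(clusters, min_shared=1):
    """Greedily merge clusters that share enough entity names.

    clusters:   list of (photo_ids, entity_names) pairs.
    min_shared: hypothetical threshold on the number of shared
                entity names needed to merge two clusters.
    Returns a list of (photo_ids, entity_names) pairs.
    """
    merged = []
    for photos, entities in clusters:
        photos, entities = list(photos), set(entities)
        keep = []
        for m_photos, m_entities in merged:
            # Semantic similarity = number of shared entity names.
            if len(entities & m_entities) >= min_shared:
                photos = m_photos + photos
                entities |= m_entities
            else:
                keep.append((m_photos, m_entities))
        keep.append((photos, entities))
        merged = keep
    return merged
```

Because it is a single pass, the result can depend on the input order; a stricter variant would repeat the pass until no further merge occurs.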
Refinement
- One refinement query per cluster, built from:
  - Top-k most frequent tags
  - Top-k most frequent entity names
- Categorization: filter over the query results (using the search engine score)
Resources
- Index: Solr
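Building the refinement query from the top-k most frequent terms of a cluster might look like the sketch below. The helper names, the photo representation (a dict with a list of terms per field) and the choice of OR to combine the terms are assumptions for illustration; the slides only say that the query uses the top-k tags or entity names.

```python
from collections import Counter


def top_k_terms(cluster, field, k):
    """Return the k most frequent terms of `field` across a cluster.

    cluster: list of photos, each a dict mapping a field name
             (e.g. "tags") to a list of terms.
    """
    counts = Counter(
        term for photo in cluster for term in photo.get(field, [])
    )
    return [term for term, _ in counts.most_common(k)]


def refinement_query(cluster, field="tags", k=3):
    """Build a query string from the cluster's top-k terms.

    Combining the terms with OR is an assumption, not something
    stated on the slides.
    """
    terms = top_k_terms(cluster, field, k)
    return " OR ".join(f'{field}:"{term}"' for term in terms)
```

The results of this query would then be filtered, for example by a minimum search-engine score, before being added to the cluster.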
Results - Experiments
Challenge 1
- Run 1: Categorization with tags only
- Run 2: Categorization with all textual metadata
Challenge 2
- Run 1: No Refinement step
- Run 2: Refinement with top-100 tags
- Run 3: Refinement with entity names
[Figure: homogeneity of each run for Challenge 1 and Challenge 2]
[Figure: completeness of each run for Challenge 1 and Challenge 2]
Conclusions and Future Work
- Tag metadata is more representative than the other textual metadata
- Using entity names in event cluster refinement gives better performance
- The Refinement block improves completeness
- Future work: use the Refinement block for general-purpose event clustering
Thanks for your attention
Questions?
http://www.idi.ntnu.no/~ruocco/