NTNU@MediaEval 2011: Social Event Detection Task (SED)

Massimiliano Ruocco, Heri Ramampiaro

Data and Information Management Group, Department of Computer and Information Science, Norwegian University of Science and Technology

[email protected]

MediaEval 2011 Workshop - Pisa



MediaEval2011 – SED Task

Outline

- Proposed Approach

- Experiments

- Results

- Conclusions and Future Work


Framework - Workflow

[Workflow diagram. Processing blocks: Query Expansion, Search, Categorization, Clustering, Semantic Merge, Refinement. Data sources: Dataset, LastFM, DBPedia (through a SPARQL endpoint). Output: Clustered List.]


Challenge 1


Framework – Challenge 1

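Query Expansion

- Names of football venues based in Rome and Barcelona

- Location of each venue (latitude and longitude)

- Output: list of venue names in different languages, with the related location

- $V = \{(v_1^1, \dots, v_{N_1}^1, g_1), \dots, (v_1^M, \dots, v_{N_M}^M, g_M)\}$, where each tuple holds the name variants of one venue and its location $g$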


Resources

- Query language: SPARQL

- Database: DBPedia

- Java interface: Jena
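
The query expansion step above can be sketched as follows. The slides name SPARQL, DBPedia and Jena but do not show the query, so everything here is an assumption for illustration: the class `Venue`, the functions `build_stadium_query` and `group_rows`, and the choice of the `dbo:Stadium`, `dbo:location`, `geo:lat` and `geo:long` vocabulary. Python is used in place of the Java/Jena interface, and no endpoint is contacted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Venue:
    """One element of V: the name variants of a venue plus its location g."""
    names: List[str] = field(default_factory=list)      # v_1 ... v_N, any language
    location: Optional[Tuple[float, float]] = None      # g = (latitude, longitude)


def build_stadium_query(city_resource: str) -> str:
    """Return a SPARQL query string for stadium labels and coordinates.

    The vocabulary used here is an assumption, not taken from the slides.
    """
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?venue ?label ?lat ?long WHERE {{
  ?venue a dbo:Stadium ;
         dbo:location dbr:{city_resource} ;
         rdfs:label ?label ;
         geo:lat ?lat ;
         geo:long ?long .
}}
""".strip()


def group_rows(rows) -> List[Venue]:
    """Group (venue_uri, label, lat, long) result rows into Venue entries."""
    venues: Dict[str, Venue] = {}
    for uri, label, lat, lon in rows:
        venue = venues.setdefault(uri, Venue(location=(lat, lon)))
        if label not in venue.names:
            venue.names.append(label)   # one label per language becomes one v_i
    return list(venues.values())
```

Because `rdfs:label` is not filtered by language, each venue would come back once per label, which is how one entry of V can collect names in several languages.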


Framework – Challenge 1

Search + Categorization

- Queries: terms in OR, or terms in OR plus a spatial constraint

- Categorization over different textual metadata (title, tags, description)

- Output: result list grouped by venue (topic: soccer)

- $R = \{(r_1^1, \dots, r_{N_1}^1), \dots, (r_1^M, \dots, r_{N_M}^M)\}$


Resources

- Index: Solr

- Categorization: SemanticHacker API

- Categories: Open Directory Project
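
A minimal sketch of the two query forms used in the search step (terms in OR, with and without a spatial constraint), written as Solr request parameters. The field names `title`, `tags`, `description` and `location`, the radius, and the helper names `or_query` and `solr_params` are assumptions; the slides do not describe the index schema. The `{!geofilt ...}` filter is Solr's standard spatial filter syntax.

```python
def or_query(terms, fields=("title", "tags", "description")):
    """Join the name variants with OR and apply them to each text field."""
    phrases = " OR ".join(f'"{t}"' for t in terms)
    return " OR ".join(f"{f}:({phrases})" for f in fields)


def solr_params(terms, location=None, radius_km=1.0):
    """Build request parameters; when a location is given, add a spatial filter."""
    params = {"q": or_query(terms), "wt": "json"}
    if location is not None:
        lat, lon = location
        # geofilt keeps only documents within radius_km of the given point
        params["fq"] = f"{{!geofilt sfield=location pt={lat},{lon} d={radius_km}}}"
    return params
```

The sketch does not escape quotes or other special characters inside the terms, which a real query builder would need to do.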


Framework – Challenge 1

Clustering

- Results grouped by temporal tag

- Quality Threshold clustering (QT clustering)

- Output: pictures grouped by temporal tag and venue
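
A minimal sketch of Quality Threshold clustering as it might be applied to the temporal tags of the pictures of one venue. The slides do not give the distance or the threshold, so the absolute time difference used in the usage note, the parameter `max_diameter`, and the function name `qt_clustering` are assumptions.

```python
def qt_clustering(items, max_diameter, distance):
    """Quality Threshold clustering.

    For every remaining item, grow a candidate cluster by repeatedly adding
    the item that keeps the cluster diameter smallest, stopping when the
    diameter would exceed max_diameter. Keep the largest candidate, remove
    its members, and repeat on what is left.
    """
    remaining = list(range(len(items)))
    clusters = []
    while remaining:
        best = []
        for seed in remaining:
            candidate = [seed]
            others = [i for i in remaining if i != seed]
            while others:
                def farthest(i):
                    # largest distance from item i to the current candidate members
                    return max(distance(items[i], items[j]) for j in candidate)
                nxt = min(others, key=farthest)
                if farthest(nxt) > max_diameter:
                    break
                candidate.append(nxt)
                others.remove(nxt)
            if len(candidate) > len(best):
                best = candidate
        clusters.append([items[i] for i in best])
        chosen = set(best)
        remaining = [i for i in remaining if i not in chosen]
    return clusters
```

With timestamps expressed in hours, `qt_clustering(hours, 5, lambda a, b: abs(a - b))` would group pictures taken within five hours of each other. Unlike k-means, the number of clusters does not have to be fixed in advance, at the cost of a quadratic or worse running time.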



Challenge 2


Framework – Challenge 2

Query Expansion

- Extraction of location and venue names for “Paradiso” and “Parc del Fórum”

- Output: list of venue names with the related location

- $V = \{(v_1^1, \dots, v_{N_1}^1, g_1), \dots, (v_1^M, \dots, v_{N_M}^M, g_M)\}$

Resources

- Service: LastFM API

- Database: LastFM



Framework – Challenge 2

Search

- Queries: terms in OR plus a spatial constraint, or terms in AND

- Output: result list grouped by venue

- $R = \{(r_1^1, \dots, r_{N_1}^1), \dots, (r_1^M, \dots, r_{N_M}^M)\}$

Resources

- Index: Solr



Framework – Challenge 2

Clustering + Semantic Merge

- Results grouped by temporal tag

- Quality Threshold clustering (QT clustering)

- Semantic merge: based on entity names representing artist and event names

- Semantic similarity: number of shared entity names

- Output: pictures grouped by temporal tag and venue
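
A minimal sketch of the semantic merge, using the number of shared entity names as the similarity. How the entity names are extracted, and which threshold was used, is not stated in the slides; the `(photo_ids, entity_names)` cluster representation, the parameter `min_shared`, and the function name `semantic_merge` are assumptions. The sketch also ignores any venue or time constraint on which clusters may be merged.

```python
def semantic_merge(clusters, min_shared=1):
    """Merge clusters that share at least min_shared entity names.

    clusters: list of (photo_ids, entity_names) pairs.
    Returns a new list of (photo_ids, entity_names) pairs.
    """
    merged = [(list(photos), set(names)) for photos, names in clusters]
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                shared = merged[i][1] & merged[j][1]
                if len(shared) >= min_shared:
                    # fold cluster j into cluster i, then rescan from the start
                    merged[i] = (merged[i][0] + merged[j][0],
                                 merged[i][1] | merged[j][1])
                    del merged[j]
                    changed = True
                    break
            if changed:
                break
    return merged
```

Rescanning after each merge lets a merged cluster attract further clusters through the entity names it has just gained.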



Framework – Challenge 2

Refinement

- Refinement query for each cluster:

  - Top-k most frequent tags

  - Top-k most frequent entity names

- Categorization: filter over the query result (using the search engine score)


Resources

- Index: Solr
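
A minimal sketch of the refinement step: build a query for each cluster from its top-k most frequent tags (or entity names), then keep only the results whose search engine score passes a threshold. The photo representation as a dictionary, the OR combination of the terms, the `min_score` threshold, and all function names are assumptions; the slides only state that the top-k terms form the query and that the score is used as a filter.

```python
from collections import Counter


def top_k_terms(cluster_photos, k, key="tags"):
    """Return the k most frequent terms (tags or entity names) in a cluster."""
    counts = Counter(term
                     for photo in cluster_photos
                     for term in photo.get(key, []))
    return [term for term, _ in counts.most_common(k)]


def refinement_query(cluster_photos, k=100, key="tags"):
    """Build an OR query from the top-k terms of the cluster."""
    return " OR ".join(f'"{t}"' for t in top_k_terms(cluster_photos, k, key))


def filter_by_score(results, min_score):
    """Keep only the results whose search engine score reaches min_score."""
    return [doc for doc in results if doc["score"] >= min_score]
```

Passing `key="entities"` would give the entity-name variant, assuming each photo carries such a field.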


Results - Experiments

Challenge 2

- Run 1: no Refinement step

- Run 2: Refinement with top-100 tags

- Run 3: Refinement with entity names

Challenge 1

- Run 1: Categorization with tags only

- Run 2: Categorization with all textual metadata

Results - Experiments

[Plot residue: chart labelled "homogeneity" for the Challenge 1 and Challenge 2 runs listed above; values not recoverable.]

Results - Experiments

[Plot residue: chart labelled "completeness" for the Challenge 1 and Challenge 2 runs listed above; values not recoverable.]


Conclusions and Future Work

- Tag metadata are more representative

- Using entity names in event cluster refinement gives better performance

- The Refinement block is useful for improving completeness

- Future work: use the Refinement block for general-purpose event clustering


Thank you for your attention

Questions?

http://www.idi.ntnu.no/~ruocco/