Using Ontologies to Strengthen Folksonomies and Enrich Information Retrieval in Weblogs

Talk at ICWSM2007 - http://www.icwsm.org/papers/paper15.html

October 24, 2009 Using Ontologies to Strengthen Folksonomies and Enrich Information Retrieval in Weblogs - ICWSM 2007

Using Ontologies to Strengthen Folksonomies and Enrich Information Retrieval in Weblogs

Alexandre Passant
EDF, Recherche & Développement, Clamart, France
Université Paris 4, Laboratoire LaLICC, Paris, France

1 Corporate Blogging

Classical (old-school) information systems

•  Features:
   –  Workflows
   –  Access control
   –  Templates
   –  Archives

•  Designed to store:
   –  Project reports
   –  Internal notes
   –  Hierarchical information

•  But what about informal and unstructured content?
   –  Short news
   –  E-mails
   –  Coffee-break discussions

A new vision: Enterprise 2.0

•  Web 2.0 concepts in a corporate environment

•  Write and share any knowledge:
   –  Blogs
   –  Comments, trackbacks

•  Re-use and capitalize on knowledge:
   –  Wikis
   –  Open structure and cooperation

•  Be informed:
   –  RSS feeds
   –  Shared subscriptions

EDF R&D Experiment

•  EDF R&D:
   –  2000 researchers
   –  3 research centers

•  Our platform:
   –  RSS subscription system
   –  Blogs
   –  Group wikis

•  Experiment (1.5 years):
   –  1500 feeds, 8000 blog posts
   –  80 bloggers, 600 readers / subscribers

2 Weblogs Indexing and Folksonomies

Indexing blog posts

•  Metadata about the container:
   –  Automatically created metadata: author, creation date, title
   –  Dublin Core standards

•  Metadata about blog post content:
   –  Previously defined: ontologies, taxonomies
   –  User-defined: free keywords and folksonomies

•  Folksonomies:
   –  Easy to adopt
   –  Built from scratch

Limits of folksonomies

•  Tag variation:
   –  Different tags for the same meaning: synonymy, abbreviations, acronyms …
   –  Typos

•  Tag ambiguity:
   –  Different meanings for the same tag: acronyms, homonymy …

•  Flat organisation:
   –  No way to suggest related tags, for example:
      Paris / France
      blog / socialsoftware

Ambiguity example from del.icio.us

Some tools to reduce variation and explosion

•  Tag completion:
   –  Forward and “backward” completion
   –  AJAX

•  Tag suggestion:
   –  Based on blog post content
   –  Suggesting existing tags
   –  But also new ones, using named-entity extraction
   –  AJAX
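A minimal sketch of tag completion. The slides do not define the two modes, so this assumes "forward" completion matches the start of an existing tag and "backward" completion matches its end. The function name and the sample tags are hypothetical.

```python
def complete_tag(typed, known_tags, limit=5):
    """Suggest existing tags for a partially typed one.

    Forward completion (assumed): the tag starts with the typed text.
    Backward completion (assumed): the tag ends with the typed text.
    """
    needle = typed.strip().lower()
    if not needle:
        return []
    forward = sorted(t for t in known_tags if t.lower().startswith(needle))
    backward = sorted(
        t for t in known_tags
        if t.lower().endswith(needle) and t not in forward
    )
    # Forward matches are listed first, then backward ones.
    return (forward + backward)[:limit]


# Hypothetical folksonomy
TAGS = {"semanticweb", "web", "web2.0", "weblog", "wiki"}
suggestions = complete_tag("web", TAGS)
```

Proposing tags that already exist is what limits variation: a user typing "web" is steered towards "semanticweb" instead of creating yet another spelling.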

3 The Semantic Web

A Few Words About the Semantic Web

•  Semantic Web:
   –  Extending the current Web
   –  Unified description of resources

•  Ontologies:
   –  Representation of a domain
   –  Common vocabulary

•  Languages:
   –  Modeling: RDF(S), OWL …
   –  Querying: SPARQL …

Weblogging and the Semantic Web

•  Tools:
   –  Snippet Manager [Cayzer et al. 2005]
   –  semiBlog [Möller et al. 2006]

•  Folksonomies:
   –  Tag ontology to represent connections between tags [Newman 2005]
   –  Tagging model [Gruber 2005]

•  Vocabularies:
   –  RSS 1.0: based on RDF, can be extended with other vocabularies
   –  SIOC: Semantically-Interlinked Online Communities [Breslin et al. 2005]

•  State of the art:
   –  "What next for Semantic Blogging" [Cayzer 2006]

4 Strengthening Folksonomies

A mix of Folksonomies and Ontologies

•  Mixing both approaches:
   –  Keep the open spirit of folksonomies
   –  Use an ontology layer as a formal way to represent data
   –  Remove ambiguity and add meaning to tags to enhance the search experience

•  How?
   –  Link tags to domain ontology concepts (classes / instances)
   –  Create links between posts and ontology concepts using SIOC

•  First steps:
   –  Analyse the folksonomy and its most popular tags
   –  Create a core ontology (based on PROTON) and its instances
   –  300 classes, 600 instances, mainly named entities

A simple way to link tags to ontology instances

From Tags to Ontology

1)  Create and tag post

2)  Link tag to ontology

3)  Link post to ontology

4)  Create new post

5)  Infer related posts
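The five steps can be sketched as follows. This is an illustration under stated assumptions, not the platform's implementation: triples live in a plain Python set instead of a triple store, and the `ex:` / `post:` identifiers, the tags and the function names are all hypothetical. Only the use of `sioc:topic` to connect a post to a concept comes from the slides.

```python
# Triples are (subject, predicate, object) tuples; every identifier
# below is a hypothetical placeholder.
tag_to_concept = {}   # step 2: tag -> ontology instance
triples = set()       # step 3: post -> sioc:topic -> ontology instance


def link_tag(tag, concept):
    """Step 2: record which ontology instance a tag stands for."""
    tag_to_concept[tag.lower()] = concept


def publish(post, tags):
    """Steps 1, 3 and 4: tag a post, then link it to its tags' concepts."""
    for tag in tags:
        concept = tag_to_concept.get(tag.lower())
        if concept is not None:  # tags without a link stay plain tags
            triples.add((post, "sioc:topic", concept))


def related_posts(post):
    """Step 5: other posts sharing at least one concept with `post`."""
    topics = {o for s, p, o in triples if s == post and p == "sioc:topic"}
    return sorted({s for s, p, o in triples
                   if p == "sioc:topic" and o in topics and s != post})


link_tag("sw", "ex:SemanticWeb")
link_tag("semanticweb", "ex:SemanticWeb")
publish("post:1", ["sw", "rdf"])       # first post, tagged "sw"
publish("post:2", ["SemanticWeb"])     # new post, different tag
```

The two posts use different tags, yet `related_posts("post:2")` finds `post:1` because both tags point to the same ontology instance.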

Dealing with ambiguity and maintaining instances

•  User interface to remove ambiguity when adding a new post

•  Ability to browse the ontology and link a tag to another class / instance when it has a new or different meaning

•  Restricted back office to create new instances

5 Enhanced Information Retrieval

A Semantic Search Engine

•  Decentralized data:
   –  Ontologies
   –  Blog posts with RDF descriptions
   –  A central 3store queried with SPARQL

•  Using tag / ontology links to:
   –  Deal with tag variations
   –  Remove ambiguity
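A rough sketch of how tag / ontology links could serve both purposes at query time. It assumes in-memory dictionaries in place of the 3store and SPARQL, and all tags, instances, posts and the `search` function are hypothetical.

```python
# Hypothetical links: tag -> ontology instances the tag may denote.
TAG_LINKS = {
    "sw": {"ex:SemanticWeb"},
    "semanticweb": {"ex:SemanticWeb"},
    "paris": {"ex:Paris_France", "ex:Paris_Texas"},
}
# Hypothetical index: post -> ontology instances it is about.
POST_TOPICS = {
    "post:1": {"ex:SemanticWeb"},
    "post:2": {"ex:SemanticWeb", "ex:Paris_France"},
    "post:3": {"ex:Paris_Texas"},
}


def search(tag, chosen=None):
    """Search by tag through the ontology layer.

    Variation: every tag linked to the same instance returns the same posts.
    Ambiguity: a tag linked to several instances returns the candidates
    until the caller picks one with `chosen`.
    """
    concepts = TAG_LINKS.get(tag.lower(), set())
    if len(concepts) > 1 and chosen is None:
        return {"ambiguous": sorted(concepts), "posts": []}
    if chosen is not None:
        concepts = concepts & {chosen}
    posts = sorted(p for p, topics in POST_TOPICS.items() if topics & concepts)
    return {"ambiguous": [], "posts": posts}
```

Here `search("sw")` and `search("SemanticWeb")` return the same posts, while `search("paris")` returns the two candidate instances and waits for a choice.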

Suggesting related tags and posts

•  Goal:
   –  Offer suggestions to users based on their current tag search
   –  Let users discover potentially interesting posts

•  Different approaches, using both tagging and ontologies:
   –  Using instance co-occurrence
   –  Using instance types
   –  Defining specific rules

•  Run on the fly and suggest results, using SPARQL

Using instance co-occurrence

•  Basic approach:

–  Posts sharing concepts

–  Ontology to remove ambiguity
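A small sketch of the co-occurrence approach, assuming posts are already linked to ontology instances. The post identifiers, the `ex:` instances and the function name are hypothetical, and ranking by raw count is an illustrative choice.

```python
from collections import Counter

# Hypothetical index: post -> ontology instances it is linked to.
POST_TOPICS = {
    "post:1": {"ex:EDF", "ex:NuclearEnergy"},
    "post:2": {"ex:EDF", "ex:NuclearEnergy", "ex:France"},
    "post:3": {"ex:EDF", "ex:France"},
    "post:4": {"ex:Python"},
}


def cooccurring(instance, post_topics):
    """Rank the instances that appear in the same posts as `instance`."""
    counts = Counter()
    for topics in post_topics.values():
        if instance in topics:
            counts.update(topics - {instance})
    # Most frequent first; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


suggestions = cooccurring("ex:EDF", POST_TOPICS)
```

Counting instances instead of raw tags is where the ontology helps: two spellings of one concept add up, and two meanings of one tag stay apart.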

Using instance types

•  Suggesting instances of the same class:

–  Programming languages

–  Energy companies
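A minimal sketch of type-based suggestion: given the instance behind the current search, propose the other instances of the same class. The instances, class names and function name are hypothetical, and each instance is assumed to have a single class for simplicity.

```python
# Hypothetical typing information: instance -> class.
INSTANCE_TYPE = {
    "ex:Python": "ex:ProgrammingLanguage",
    "ex:Java": "ex:ProgrammingLanguage",
    "ex:Ruby": "ex:ProgrammingLanguage",
    "ex:CompanyA": "ex:EnergyCompany",
    "ex:CompanyB": "ex:EnergyCompany",
}


def same_class_suggestions(instance, instance_type=INSTANCE_TYPE):
    """Return the other instances sharing the class of `instance`."""
    cls = instance_type.get(instance)
    if cls is None:
        return []
    return sorted(i for i, t in instance_type.items()
                  if t == cls and i != instance)
```

A search on `ex:Python` would then suggest `ex:Java` and `ex:Ruby`, even if no post mentions them together.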

Defining specific rules

•  Rule(s) for each class:
   –  Location / {sub|top}locations
   –  Company / companies working in the same area

Rule example

•  RDF rule (N3):

   {
     ?x rdf:type sioc:Post .
     ?y rdf:type sioc:Post .
     ?x sioc:topic ?a .
     ?y sioc:topic ?b .
     ?a rdf:type ptop:Agent .
     ?b rdf:type ptop:Agent .
     ?a company:expertiseIn ?d .
     ?b company:expertiseIn ?d
   } => {
     ?x sioc:related_to ?y
   } .

•  Reformulated with SPARQL:

   SELECT ?post ?topic ?domain
   WHERE {
     <$uri> rdf:type ptop:Agent ;
            company:expertiseIn ?domain .
     ?post sioc:topic ?topic .
     ?topic rdf:type ptop:Agent ;
            company:expertiseIn ?domain .
   }
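The logic of this rule can also be traced in plain Python. This is only a sketch over hypothetical in-memory data, not how the platform evaluates it (the slides run SPARQL on the triple store). Unlike the rule, it skips pairing a post with itself.

```python
from itertools import combinations

# Hypothetical data standing in for the triple store.
TYPES = {
    "ex:CompanyA": "ptop:Agent",
    "ex:CompanyB": "ptop:Agent",
    "ex:Paris": "ptop:Location",
}
EXPERTISE = {   # company:expertiseIn
    "ex:CompanyA": {"ex:Nuclear"},
    "ex:CompanyB": {"ex:Nuclear", "ex:Wind"},
}
TOPICS = {      # sioc:topic
    "post:1": {"ex:CompanyA"},
    "post:2": {"ex:CompanyB"},
    "post:3": {"ex:Paris"},
}


def agent_domains(post):
    """Expertise domains of every ptop:Agent the post is about."""
    domains = set()
    for topic in TOPICS.get(post, ()):
        if TYPES.get(topic) == "ptop:Agent":
            domains |= EXPERTISE.get(topic, set())
    return domains


def related_pairs():
    """Pairs of distinct posts whose agents share an expertise domain."""
    pairs = set()
    for x, y in combinations(sorted(TOPICS), 2):
        if agent_domains(x) & agent_domains(y):
            pairs.add((x, y))   # corresponds to: x sioc:related_to y
    return pairs
```

With this data only `post:1` and `post:2` are related: their companies both have expertise in `ex:Nuclear`, while `post:3` is about a location.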

Running and Displaying results

•  SPARQL queries on the triplestore

•  Offering different tag clouds to split results into clusters:
   –  Instances
   –  Specific rules
   –  Co-occurrence

•  Let users know why they should have a look at related posts

6 Conclusion

Conclusion

•  Mixing folksonomies and ontologies:
   –  Keep the simplicity of free tagging
   –  Ontologies to reduce the limits of free tagging
   –  Infer related posts / topics
   –  A mix of top-down and bottom-up approaches

•  A real use case:
   –  Adapt model and interfaces to users' needs

•  Future work:
   –  Knowledge extraction from weblogs
   –  Use wikis to add properties to ontology instances
   –  New ways to browse and query blog posts

Thank you!