Using Ontologies to Strengthen Folksonomies and Enrich Information Retrieval in Weblogs

Talk at ICWSM2007 - http://www.icwsm.org/papers/paper15.html

October 24, 2009 Using Ontologies to Strengthen Folksonomies and Enrich Information Retrieval in Weblogs - ICWSM 2007

Using Ontologies to Strengthen Folksonomies and Enrich Information Retrieval in Weblogs

Alexandre Passant
EDF, Recherche & Développement, Clamart, France
Université Paris 4, Laboratoire LaLICC, Paris, France

1 Corporate Blogging

Classical (old-school) information systems

•  Features:
    –  Workflows
    –  Access control
    –  Templates
    –  Archives

•  Designed to store:
    –  Project reports
    –  Internal notes
    –  Hierarchical information

•  But what about informal and unstructured content?
    –  Short news
    –  E-mails
    –  Coffee-break discussions

A new vision: Enterprise 2.0

•  Web 2.0 concepts in a corporate environment

•  Write and share any knowledge:
    –  Blogs
    –  Comments, trackbacks

•  Re-use and capitalize knowledge:
    –  Wikis
    –  Open structure and cooperation

•  Be informed:
    –  RSS feeds
    –  Shared subscriptions

EDF R&D Experiment

•  EDF R&D:
    –  2000 researchers
    –  3 research centers

•  Our platform:
    –  RSS subscription system
    –  Blogs
    –  Group wikis

•  Experiment (1.5 years):
    –  1500 feeds, 8000 blog posts
    –  80 bloggers, 600 readers / subscribers

2 Weblogs Indexing and Folksonomies

Indexing blog posts

•  Metadata about the container:
    –  Automatically created metadata: author, creation date, title
    –  Dublin Core standards

•  Metadata about blog post content:
    –  Previously defined: ontologies, taxonomies
    –  User-defined: free keywords and folksonomies

•  Folksonomies:
    –  Easy to adopt
    –  Built from scratch

Limits of folksonomies

•  Tag variation: different tags for the same meaning
    –  Synonymy, abbreviations, acronyms …
    –  Typos

•  Tag ambiguity: different meanings for the same tag
    –  Acronyms, homonymy …

•  Flat organisation: no way to suggest related tags
    –  Paris / France
    –  blog / socialsoftware
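
To make these limits concrete, here is a minimal sketch of exact-match tag search over a flat folksonomy. The posts and tags are invented for illustration and are not taken from the EDF platform.

```python
# Hypothetical posts and tags, invented for illustration only.
posts = {
    "post-1": {"semweb"},
    "post-2": {"semanticweb"},
    "post-3": {"semantic-web", "paris"},
    "post-4": {"paris"},  # the tag alone does not say which "paris" is meant
}

def search_by_tag(tag):
    """Exact-match tag search, as a flat folksonomy would offer."""
    return sorted(post for post, tags in posts.items() if tag in tags)

# Tag variation: three posts share one topic, but only one is found.
print(search_by_tag("semweb"))
# Tag ambiguity: both posts are returned, whatever each author meant.
print(search_by_tag("paris"))
```

Nothing in the tag set itself tells the search that "semweb" and "semanticweb" belong together, which is the gap the rest of the talk addresses.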

Ambiguity example from del.icio.us

Some tools to reduce variation and explosion

•  Tag completion:
    –  Forward and “backward” completion
    –  AJAX

•  Tag suggestion:
    –  Based on blog post content
    –  Suggesting existing tags
    –  But also new ones, using named-entity extraction
    –  AJAX
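
A minimal sketch of these two tools, under stated assumptions: the slide does not define "backward" completion, so it is taken here to mean that the typed text matches inside a tag rather than at its start. The tag list and function names are hypothetical, and named-entity extraction is left out.

```python
import re

# Hypothetical tag vocabulary, invented for illustration.
existing_tags = ["semanticweb", "web2.0", "weblog", "sparql", "socialsoftware"]

def complete_forward(typed, tags):
    """Forward completion: tags that start with the typed text."""
    typed = typed.lower()
    return [t for t in tags if typed and t.startswith(typed)]

def complete_backward(typed, tags):
    """Assumed meaning of "backward" completion: the typed text appears
    inside the tag but not at its start."""
    typed = typed.lower()
    return [t for t in tags if typed and typed in t and not t.startswith(typed)]

def suggest_existing(content, tags):
    """Suggest existing tags that occur as words in the post content."""
    words = {w.strip(".") for w in re.findall(r"[a-z0-9.]+", content.lower())}
    return [t for t in tags if t in words]

print(complete_forward("web", existing_tags))
print(complete_backward("web", existing_tags))
print(suggest_existing("Querying RDF with SPARQL on our weblog.", existing_tags))
```

Both tools push authors toward tags that already exist, which is how they limit variation.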

3 The Semantic Web

A Few Words About the Semantic Web

•  Semantic Web
    –  Extending the current Web
    –  Unified description of resources

•  Ontologies
    –  Representation of a domain
    –  Common vocabulary

•  Languages
    –  Modeling: RDF(S), OWL …
    –  Querying: SPARQL …

Weblogging and the Semantic Web

•  Tools:
    –  Snippet Manager [Cayzer et al. 2005]
    –  semiBlog [Möller et al. 2006]

•  Folksonomies:
    –  Tag ontology to represent connections between tags [Newman 2005]
    –  Tagging model [Gruber 2005]

•  Vocabularies:
    –  RSS 1.0: based on RDF, can be extended with other vocabularies
    –  SIOC: Semantically-Interlinked Online Communities [Breslin et al. 2005]

•  State of the art:
    –  “What next for Semantic Blogging” [Cayzer 2006]

4 Strengthening Folksonomies

A mix of Folksonomies and Ontologies

•  Mixing both approaches:
    –  Keep the open spirit of folksonomies
    –  Use an ontology layer as a formal way to represent data
    –  Remove ambiguity and add meaning to tags to enhance the search experience

•  How?
    –  Link tags to domain ontology concepts (classes / instances)
    –  Create links between posts and ontology concepts using SIOC

•  First steps:
    –  Analyse the folksonomy and its most popular tags
    –  Create a core ontology (based on PROTON) and its instances
    –  300 classes, 600 instances, mainly named entities
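
The first of these steps, finding the most popular tags, can be sketched as a simple count. This is only an illustration: the tagged posts below are invented and the function name is hypothetical.

```python
from collections import Counter

# Hypothetical tagged posts, invented for illustration.
tagged_posts = [
    ["edf", "nuclear"],
    ["edf", "semanticweb"],
    ["semanticweb", "sioc"],
    ["edf"],
]

def most_popular_tags(posts, n):
    """Count how many posts use each tag and return the n most used."""
    counts = Counter(tag for tags in posts for tag in set(tags))
    return counts.most_common(n)

print(most_popular_tags(tagged_posts, 2))
```

The most used tags are natural candidates for the first classes and instances of the core ontology.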

A simple way to link tags to ontology instances

From Tags to Ontology

1)  Create and tag post
2)  Link tag to ontology
3)  Link post to ontology
4)  Create new post
5)  Infer related post
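
The five steps can be walked through with a toy in-memory triple set. This is a sketch under assumptions, not the platform's implementation: the `ex:` and `post:` identifiers and the `ex:taggedWith` property are invented, and only `sioc:topic` and `sioc:related_to` are SIOC terms named in the talk.

```python
# Hypothetical identifiers: "ex:", "post:" and ex:taggedWith are invented.
tag_to_concept = {}   # tag -> ontology concept
triples = set()       # (subject, predicate, object)

def tag_post(post, tags):
    """Steps 1 and 4: create a post and attach free tags to it."""
    for tag in tags:
        triples.add((post, "ex:taggedWith", tag))

def link_tag(tag, concept):
    """Step 2: link a tag to an ontology concept."""
    tag_to_concept[tag] = concept

def link_post(post):
    """Step 3: link the post to the concepts behind its tags."""
    for s, p, o in list(triples):
        if s == post and p == "ex:taggedWith" and o in tag_to_concept:
            triples.add((post, "sioc:topic", tag_to_concept[o]))

def infer_related():
    """Step 5: posts that share a topic are related to each other."""
    topics = [(s, o) for s, p, o in triples if p == "sioc:topic"]
    for post_a, topic_a in topics:
        for post_b, topic_b in topics:
            if post_a != post_b and topic_a == topic_b:
                triples.add((post_a, "sioc:related_to", post_b))

tag_post("post:1", ["edf"])                   # 1) create and tag post
link_tag("edf", "ex:EDF")                     # 2) link tag to ontology
link_post("post:1")                           # 3) link post to ontology
tag_post("post:2", ["electricitedefrance"])   # 4) create new post, other tag
link_tag("electricitedefrance", "ex:EDF")
link_post("post:2")
infer_related()                               # 5) infer related post
print(("post:1", "sioc:related_to", "post:2") in triples)  # True
```

The two posts never share a tag; they become related only through the concept both tags point to.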

Dealing with ambiguity and maintaining instances

•  User interface to remove ambiguity when adding a new post

•  Ability to browse the ontology and link a tag to another class or instance when the tag is new or carries a different meaning

•  Restricted back office to create new instances
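
A minimal sketch of the disambiguation step, assuming a tag can be linked to several candidate concepts. The candidate table, the concept names, and the `choose` callback (a stand-in for the user interface) are all hypothetical.

```python
# Hypothetical candidate meanings for tags, invented for illustration.
candidates = {
    "java": ["ex:Java_Language", "ex:Java_Island"],
    "edf": ["ex:EDF"],
}

def resolve(tag, choose):
    """Return the concept for a tag, calling `choose` only when the tag
    is ambiguous. Returns None for an unknown tag, which would then need
    a new link or a new instance."""
    options = candidates.get(tag, [])
    if not options:
        return None
    if len(options) == 1:
        return options[0]
    return choose(tag, options)

def pick_first(tag, options):
    """Stand-in for the user's choice: always take the first option."""
    return options[0]

print(resolve("edf", pick_first))
print(resolve("java", pick_first))
print(resolve("unknown", pick_first))
```

Keeping the choice in a callback mirrors the slide's split between what users decide when posting and what the restricted back office maintains.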

5 Enhanced Information Retrieval

A Semantic Search Engine

•  Decentralized data:
    –  Ontologies
    –  Blog posts with RDF descriptions
    –  A central 3store with SPARQL

•  Using tag / ontology links to:
    –  Deal with tag variations
    –  Remove ambiguity
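
The sketch below imitates this kind of search with plain Python tuples in place of a real triple store and SPARQL endpoint. The `match` helper, the `ex:means` property, and all identifiers are assumptions made for illustration.

```python
# Hypothetical triples; "ex:" names and ex:means are invented for illustration.
store = {
    ("tag:semweb", "ex:means", "ex:SemanticWeb"),
    ("tag:semanticweb", "ex:means", "ex:SemanticWeb"),
    ("post:1", "sioc:topic", "ex:SemanticWeb"),
    ("post:2", "sioc:topic", "ex:SemanticWeb"),
    ("post:3", "sioc:topic", "ex:Paris"),
}

def match(s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [t for t in store
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

def search(tag):
    """Find posts about the concept(s) a tag is linked to, whatever tag
    each post's author actually typed."""
    posts = set()
    for _, _, concept in match(s=f"tag:{tag}", p="ex:means"):
        posts.update(post for post, _, _ in match(p="sioc:topic", o=concept))
    return sorted(posts)

print(search("semweb"))
print(search("semanticweb"))
```

Because the lookup goes through the concept, both tag spellings return the same posts.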

Suggesting related tags and posts

•  Goal:
    –  Offer suggestions to users based on their current tag search
    –  Let users discover potentially interesting posts

•  Different approaches, using both tagging and ontologies:
    –  Using instance co-occurrence
    –  Using instance types
    –  Defining specific rules

•  Run on the fly and suggest results, using SPARQL

Using instance co-occurrence

•  Basic approach:

–  Posts sharing concepts

–  Ontology to remove ambiguity
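
A minimal sketch of co-occurrence-based suggestion, assuming each post is already linked to ontology concepts. The post-to-concept table and the ranking by simple counts are illustrative choices, not something stated on the slide.

```python
from collections import Counter

# Hypothetical post -> concepts links (as sioc:topic would provide).
post_topics = {
    "post:1": {"ex:EDF", "ex:NuclearEnergy"},
    "post:2": {"ex:EDF", "ex:NuclearEnergy", "ex:France"},
    "post:3": {"ex:EDF", "ex:NuclearEnergy"},
    "post:4": {"ex:SemanticWeb"},
}

def co_occurring(concept):
    """Rank the concepts that appear on the same posts as `concept`."""
    counts = Counter()
    for topics in post_topics.values():
        if concept in topics:
            counts.update(topics - {concept})
    return counts.most_common()

print(co_occurring("ex:EDF"))
```

Counting concepts rather than raw tags means that spelling variants of one tag contribute to the same count.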

Using instance types

•  Suggesting instances of the same class:

–  Programming languages

–  Energy companies
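
A sketch of type-based suggestion: given one instance, propose the other instances of its class. The instance-to-class table and all names in it are hypothetical.

```python
# Hypothetical instances and classes, invented for illustration.
instance_type = {
    "ex:Python": "ex:ProgrammingLanguage",
    "ex:Java": "ex:ProgrammingLanguage",
    "ex:PHP": "ex:ProgrammingLanguage",
    "ex:EDF": "ex:EnergyCompany",
    "ex:OtherUtility": "ex:EnergyCompany",
}

def same_class(instance):
    """Other instances that share the class of `instance`."""
    cls = instance_type.get(instance)
    if cls is None:
        return []
    return sorted(i for i, c in instance_type.items()
                  if c == cls and i != instance)

print(same_class("ex:Python"))
print(same_class("ex:EDF"))
```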

Defining specific rules

•  Rule(s) for each class:
    –  Location / {sub|top}locations
    –  Company / companies working in the same area
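
As an illustration of the location rule, the sketch below suggests the sub-locations and enclosing locations of a given place. The `part_of` hierarchy and the function names are assumptions, and only direct sub-locations are returned.

```python
# Hypothetical location hierarchy: child -> parent (invented "ex:" names).
part_of = {
    "ex:Clamart": "ex:France",
    "ex:Paris": "ex:France",
    "ex:France": "ex:Europe",
}

def top_locations(location):
    """All enclosing locations, nearest first."""
    result = []
    while location in part_of:
        location = part_of[location]
        result.append(location)
    return result

def sub_locations(location):
    """Locations directly contained in `location`."""
    return sorted(child for child, parent in part_of.items()
                  if parent == location)

def suggest_for_location(location):
    """Class-specific rule for locations: sub-locations, then top-locations."""
    return sub_locations(location) + top_locations(location)

print(suggest_for_location("ex:France"))
print(suggest_for_location("ex:Paris"))
```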

Rule example

•  RDF rule

    {
      ?x rdf:type sioc:Post .
      ?y rdf:type sioc:Post .
      ?x sioc:topic ?a .
      ?y sioc:topic ?b .
      ?a rdf:type ptop:Agent .
      ?b rdf:type ptop:Agent .
      ?a company:expertiseIn ?d .
      ?b company:expertiseIn ?d
    } => {
      ?x sioc:related_to ?y
    } .

•  Reformulated with SPARQL

    # $uri is a placeholder substituted before the query is run
    SELECT ?post ?topic ?domain
    WHERE {
      <$uri> rdf:type ptop:Agent ;
             company:expertiseIn ?domain .
      ?post sioc:topic ?topic .
      ?topic rdf:type ptop:Agent ;
             company:expertiseIn ?domain .
    }
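
To show what the rule derives, here is a sketch that applies the same conditions to a small in-memory triple set. It is plain Python, not an N3 reasoner or a SPARQL engine. The instance names are invented, and skipping pairs where both posts are the same is an addition the rule itself does not state.

```python
# Hypothetical data; prefixes follow the slide, instance names are invented.
triples = {
    ("post:1", "rdf:type", "sioc:Post"),
    ("post:2", "rdf:type", "sioc:Post"),
    ("post:3", "rdf:type", "sioc:Post"),
    ("post:1", "sioc:topic", "ex:CompanyA"),
    ("post:2", "sioc:topic", "ex:CompanyB"),
    ("post:3", "sioc:topic", "ex:CompanyC"),
    ("ex:CompanyA", "rdf:type", "ptop:Agent"),
    ("ex:CompanyB", "rdf:type", "ptop:Agent"),
    ("ex:CompanyC", "rdf:type", "ptop:Agent"),
    ("ex:CompanyA", "company:expertiseIn", "ex:Energy"),
    ("ex:CompanyB", "company:expertiseIn", "ex:Energy"),
    ("ex:CompanyC", "company:expertiseIn", "ex:Software"),
}

def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the data."""
    return {o for s, p, o in triples if s == subject and p == predicate}

def is_a(subject, cls):
    return (subject, "rdf:type", cls) in triples

def related_posts():
    """Posts whose topics are agents with expertise in a common domain."""
    posts = sorted(s for s, p, o in triples
                   if p == "rdf:type" and o == "sioc:Post")
    inferred = set()
    for x in posts:
        for y in posts:
            if x == y:
                continue  # added here: do not relate a post to itself
            for a in objects(x, "sioc:topic"):
                for b in objects(y, "sioc:topic"):
                    shared = (objects(a, "company:expertiseIn")
                              & objects(b, "company:expertiseIn"))
                    if is_a(a, "ptop:Agent") and is_a(b, "ptop:Agent") and shared:
                        inferred.add((x, "sioc:related_to", y))
    return inferred

print(sorted(related_posts()))
```

Only the two posts about companies with a shared domain of expertise end up related; the third post is left out.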

Running and Displaying results

•  SPARQL queries on the triplestore

•  Offering different tagclouds to split results into clusters:
    –  Instances
    –  Specific rules
    –  Co-occurrence

•  Let users know why they should have a look at related posts
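
A sketch of the clustering idea: each suggestion keeps the approach that produced it and a short reason, so results can be shown in separate groups. The suggestion tuples and reason strings are invented for illustration.

```python
from collections import defaultdict

# Hypothetical suggestions: (post, approach, reason); all values invented.
suggestions = [
    ("post:7", "co-occurrence", "shares the concept ex:EDF"),
    ("post:9", "instances", "about ex:Java, same class as ex:Python"),
    ("post:4", "specific rules", "about ex:France, which contains ex:Paris"),
    ("post:8", "co-occurrence", "shares the concept ex:France"),
]

def cluster(items):
    """Group suggestions by the approach that produced them, keeping the
    reason so it can be shown to the user."""
    clusters = defaultdict(list)
    for post, approach, reason in items:
        clusters[approach].append((post, reason))
    return dict(clusters)

for approach, entries in cluster(suggestions).items():
    print(approach)
    for post, reason in entries:
        print(f"  {post}: {reason}")
```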

6 Conclusion

Conclusion

•  Mixing folksonomies and ontologies:
    –  Keep the simplicity of free tagging
    –  Use ontologies to reduce the limits of free tagging
    –  Infer related posts / topics
    –  A mix of top-down and bottom-up approaches

•  A real use case:
    –  Adapt the model and interfaces to users' needs

•  Future work:
    –  Knowledge extraction from weblogs
    –  Use wikis to add properties to ontology instances
    –  New ways to browse and query blog posts

Thank you!