Finding, Linking and Organizing Resources with Linked Data & Natural Language Processing

Paul Buitelaar

Unit for Natural Language Processing
Digital Enterprise Research Institute - National University of Ireland, Galway

Copyright 2010 Digital Enterprise Research Institute. All rights reserved, Paul Buitelaar

Finding, Linking & Organizing Resources

What does that mean?
What is a (re)source?
What is a link?
What resources can we link, and how?
How to find and organize resources and links?

Let’s go through an example…

Linking van Gogh (resources)

Linking van Gogh (links)

Linking van Gogh (objects)

[Diagram: linked objects personObj-1, personRepresentationObj-1, personBirthObj-1, personDeathObj-1, locationObj-1, locationObj-2, personObj-2 and personRepresentationObj-2, with the values “Vincent van Gogh”, “Theo van Gogh”, “March 30th, 1853”, “Zundert”, “July 29th, 1890” and “Auvers-sur-Oise”]
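The object graph on this slide can be read as a set of subject-predicate-object statements. The following is a minimal sketch in plain Python, not the representation used in the talk: the object identifiers and literal values are taken from the slide, while the predicate names (`hasRepresentation`, `hasBirth`, `hasDeath`, `date`, `location`, `label`) and the pairing of each value with a particular object are assumptions made for illustration.

```python
# Object identifiers and literal values come from the slide; the predicate
# names and the exact wiring between objects are assumptions.
TRIPLES = {
    ("personObj-1", "hasRepresentation", "personRepresentationObj-1"),
    ("personRepresentationObj-1", "label", "Vincent van Gogh"),
    ("personObj-1", "hasBirth", "personBirthObj-1"),
    ("personBirthObj-1", "date", "March 30th, 1853"),
    ("personBirthObj-1", "location", "locationObj-1"),
    ("locationObj-1", "label", "Zundert"),
    ("personObj-1", "hasDeath", "personDeathObj-1"),
    ("personDeathObj-1", "date", "July 29th, 1890"),
    ("personDeathObj-1", "location", "locationObj-2"),
    ("locationObj-2", "label", "Auvers-sur-Oise"),
    ("personObj-2", "hasRepresentation", "personRepresentationObj-2"),
    ("personRepresentationObj-2", "label", "Theo van Gogh"),
}


def objects(subject, predicate, triples=TRIPLES):
    """Return all objects linked from `subject` via `predicate`, sorted."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def follow(start, path, triples=TRIPLES):
    """Follow a chain of predicates from `start` and return the end values."""
    frontier = [start]
    for predicate in path:
        frontier = [o for node in frontier for o in objects(node, predicate, triples)]
    return frontier


if __name__ == "__main__":
    # Walk person -> birth event -> location -> label.
    print(follow("personObj-1", ["hasBirth", "location", "label"]))
```

Keeping events (birth, death) and representations (names) as objects in their own right, rather than as plain attributes of the person, is what lets each of them carry further links.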

Finding Resources (and Links)

Structured Data
  Proprietary databases, thesauri, etc.
  Open-domain databases, thesauri, etc.
  … increasingly turned into ‘Linked Open Data’

Unstructured Data
  Proprietary textual descriptions, images, videos, etc.
  Open-domain textual descriptions, images, videos, etc.
  … to be turned into & connected with ‘Linked Open Data’

Finding Links in Text

Linking van Gogh - continued

[Diagram: linked objects personObj-1, personRepresentationObj-1, artistObj-1, artistObj-2, personRepresentationObj-2 and artistTechniqueObj-1, with the values “Vincent van Gogh”, “Diego Velazquez” and “bold brushstrokes”]

The Remainder of this Talk

Linked Open Data (LOD)
Some LOD applications & tools
LOD and Natural Language Processing

Linked Open Data

Turning the web of documents into a Web of Data
Uniquely identifying web objects (documents, images, named-entities, facts, …)
Enabling the discovery & interlinking of web objects through semantic metadata
Open access to data
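To make “uniquely identifying web objects” a little more concrete, the sketch below mints one URI per object and writes statements about it as N-Triples-style lines. It is an illustration under assumptions rather than anything shown in the talk: the namespace `http://example.org/` and the property names `name` and `relatedTo` are placeholders, and the literal escaping covers only backslashes and double quotes.

```python
BASE = "http://example.org/"  # placeholder namespace, an assumption


def uri(local_name):
    """Mint a URI reference for a web object from a local name."""
    return f"<{BASE}{local_name}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(statements):
    """Serialize (subject, predicate, object, object_is_uri) tuples, one line each."""
    lines = []
    for subject, predicate, obj, object_is_uri in statements:
        rendered = uri(obj) if object_is_uri else literal(obj)
        lines.append(f"{uri(subject)} {uri(predicate)} {rendered} .")
    return lines


if __name__ == "__main__":
    statements = [
        ("personObj-1", "name", "Vincent van Gogh", False),
        ("personObj-1", "relatedTo", "personObj-2", True),  # hypothetical link
    ]
    for line in to_ntriples(statements):
        print(line)
```

Because every object has its own URI, a second data set can link to `personObj-1` simply by reusing that identifier as the object of its own statements.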

Linked Open Data ‘cloud’

Linked Open Media Data

LOD Applications

Search Engine for the Web of Data
  SIGMA http://sig.ma (builds on http://sindice.com/)
  Contact: Giovanni Tummarello, DERI

Music Recommendation
  http://dbrec.net
  Contact: Alexandre Passant, DERI

Research Collaboration Support, Expert Finding
  http://saffron.deri.ie/
  Contact: Paul Buitelaar, DERI

Search the Web of Data with SIGMA

More Data – but also more issues…

dbrec : the Web of Data recommends…

Mary Black is related to Frances Black …

… and this is why

Saffron : Expert Finding

Expertise Topic Extraction

Publication Browsing

Expert Browsing

Publication Details: Abstract/PDF

Publication Details: Authors/Topics

Expertise Topic Details

Personalization

Personalized Expert Recommendation

Linking Objects in Saffron

[Diagram: linked object types Author, Document, Title, PDF, Topic, Affiliation, Researcher, Picture and ExpertiseTopic]

Other LOD Application Areas

Linked Open Drug Data (Matthias Samwald, DERI)
  http://esw.w3.org/HCLSIG/LODD - W3C WG includes participation by Johnson & Johnson, AstraZeneca
  http://esw.w3.org/HCLSIG/LODD/Data

Open Government Data (Richard Cyganiak, DERI)
  http://linkeddata.deri.ie/node/72 - includes data sets from USA, UK, Australia, Canada, Sweden, New Zealand

Library Linked Data (Jodi Schneider, DERI)
  http://www.w3.org/2005/Incubator/lld/

Financial Linked Data (Sean O’Riain, DERI)
  Linking Enterprise Data for Business Intelligence

Financial Linked Data

Linked with extracted Financial Facts (amounts), annotated with semantic metadata (financial meaning) according to eXtensible Business Reporting Language (XBRL)

http://www.monnet-project.eu/

Some LOD Tools

‘RDB2RDF’ - mapping relational DBs to RDF
  http://www.w3.org/2001/sw/rdb2rdf/ (incl. Survey Report)

‘Silk’ (Freie Universitaet Berlin) - specify links to use in discovering relationships between LOD data items
  http://www4.wiwiss.fu-berlin.de/bizer/silk/

Semantic Drupal, ‘sparqlviews’ (Lin Clark, DERI) - easy integration of Linked Data in CMS Drupal
  http://semantic-drupal.com/
  http://drupal.org/project/sparql_views

EU Projects
  http://latc-project.eu/
  http://lod2.eu/
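As a rough illustration of what “mapping relational DBs to RDF” involves, the sketch below turns each row of a table into one subject with one statement per column. It is not the mapping defined by the RDB2RDF working group and does not use any of the tools listed above; the `artist` table, its columns and the `http://example.org/` namespace are hypothetical, and the SQLite database exists only in memory.

```python
import sqlite3

BASE = "http://example.org/"  # placeholder namespace, an assumption


def rows_to_triples(conn, table, key_column):
    """Map each row of `table` to triples: one subject per row, one predicate per column."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key_column]}"
        for column, value in record.items():
            if column == key_column or value is None:
                continue  # the key is already in the subject URI; NULLs yield no triple
            triples.append((subject, f"{BASE}{table}#{column}", value))
    return triples


def demo():
    """Build a tiny hypothetical table and map it to triples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT, birthplace TEXT)")
    conn.execute("INSERT INTO artist VALUES (1, 'Vincent van Gogh', 'Zundert')")
    triples = rows_to_triples(conn, "artist", "id")
    conn.close()
    return triples


if __name__ == "__main__":
    for triple in demo():
        print(triple)
```

A real mapping would also need to handle foreign keys (which become links between subjects), datatypes and composite keys; those are left out here to keep the idea visible.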

Open LOD Issues

How to integrate new LOD into the LOD cloud, with addition of information rather than duplication?
  Entity consolidation
    dbpedia:JohnSmith owl:sameAs bbcmusic:JohnSmith
  Vocabulary alignment
    geonames:location owl:sameAs dbpedia:place

How to identify the most fitting LOD resources for a particular application/domain?
  Estimate application/domain semantics
  Match application/domain semantics with LOD semantics
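Entity consolidation can be sketched by treating owl:sameAs as an equivalence relation (it is symmetric and transitive) and grouping identifiers into equivalence classes with a union-find structure. This is one possible approach, not necessarily the one the speaker has in mind; the first pair below comes from the slide, while `musicbrainz:JohnSmith` is a hypothetical third identifier added to show transitivity.

```python
def consolidate(same_as_pairs):
    """Group identifiers connected by sameAs statements into equivalence classes."""
    parent = {}

    def find(node):
        """Return the representative of the class containing `node`."""
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving keeps chains short
            node = parent[node]
        return node

    for left, right in same_as_pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    pairs = [
        ("dbpedia:JohnSmith", "bbcmusic:JohnSmith"),      # from the slide
        ("bbcmusic:JohnSmith", "musicbrainz:JohnSmith"),  # hypothetical
    ]
    print(consolidate(pairs))
```

Once identifiers are grouped, statements about any member of a group can be merged under a single canonical identifier, which adds information from the new data set without duplicating the entity.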

LOD and Natural Language Processing

[Diagram with the labels: Domain Corpus, Domain/Application Semantics, Linked Open Data, LOD vocabularies, Linked Open Data for Domain/Application, LOD instances from domain corpus; nodes X, Y, Z and Y1, Z1]
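One very simple way to picture the step from a domain corpus to fitting LOD vocabularies is to count how often each vocabulary's labels occur in the corpus and rank the vocabularies by that count. The sketch below is an assumption about how such matching could look, not the method presented in the talk: the corpus sentences and the two vocabularies are hypothetical, and only single-word labels are matched.

```python
import re
from collections import Counter


def term_counts(corpus):
    """Count lowercase word tokens across the documents of a corpus."""
    counts = Counter()
    for document in corpus:
        counts.update(re.findall(r"[a-z]+", document.lower()))
    return counts


def rank_vocabularies(corpus, vocabularies):
    """Score each vocabulary by how often its labels occur in the corpus."""
    counts = term_counts(corpus)
    scores = {
        name: sum(counts[label.lower()] for label in labels)
        for name, labels in vocabularies.items()
    }
    # Highest score first; ties are broken alphabetically by vocabulary name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    corpus = [  # hypothetical domain corpus
        "The painter used bold brushstrokes.",
        "The painter admired another painter.",
    ]
    vocabularies = {  # hypothetical vocabularies with their labels
        "art": ["painter", "painting"],
        "music": ["singer", "album"],
    }
    print(rank_vocabularies(corpus, vocabularies))
```

A more realistic version would use multi-word term extraction and weight terms by how specific they are to the domain, which is where natural language processing comes in.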

Acknowledgements & Further Info

DERI colleagues on all things ‘linked open data’; for more info:
  http://linkeddata.deri.ie/

The Saffron team (in alphabetical order)
  Georgeta Bordea, Fergal Monaghan, Krystian Samp
  http://saffron.deri.ie/

Grant support
  Science Foundation Ireland Grant No. SFI/08/CE/I1380 for Lion-2
  http://nlp.deri.ie/
  EU FP7 Grant No. 248458 for the Monnet project on Multilingual Ontologies for Networked Knowledge
  http://www.monnet-project.eu