Seamless semantics - avoiding semantic discontinuity

Page 1: Seamless semantics - avoiding semantic discontinuity

Steffen Staab Seamless Semantics
Institute for Web Science and Technologies · University of Koblenz-Landau, Germany
Web and Internet Science Group · ECS · University of Southampton, UK

Seamless Semantics: Avoiding Semantic Discontinuity

Steffen Staab

University of Southampton

&

Universität Koblenz-Landau

Page 2: Seamless semantics - avoiding semantic discontinuity

A professorial colleague:

“I have bought myself a 30" screen, because half of my work is re-typing existing material.”

Why do we have to re-type at all?

Page 3: Seamless semantics - avoiding semantic discontinuity

Why do we have to retype at all?

Examples:
• CVs
• Bibliographies
• Visa entry forms
• Addresses
• Purchase orders
• ...

Reasons:
• No semantics
• Other semantics
• Other formats
• Other importance ranking
• (Partial) incompleteness

Page 4: Seamless semantics - avoiding semantic discontinuity

“Solution 1”: Taylorism (aka workflow)

https://www.flickr.com/photos/treborscholz/2857596696

Page 5: Seamless semantics - avoiding semantic discontinuity

“Solution 2”: Conversion tools

Page 6: Seamless semantics - avoiding semantic discontinuity

Why do we know the “how” (see talk by Maria-Esther & Axel) but not the “what”?

Page 7: Seamless semantics - avoiding semantic discontinuity

Traditional Information System

Diagram labels: Business Logics · Structured Data · Unstructured Data · Presentation and Interaction

Characteristics:
• Processes known
• Data structures known
• Meaning of data in schema and implicit in code

Page 8: Seamless semantics - avoiding semantic discontinuity

Information ecosystems nowadays

Examples:
• Open Data
• 1000s of DBs per company
• Ad-hoc data

Characteristics:
• Some structure
• Late structure
• Social context
• Meaning of data most important

Page 9: Seamless semantics - avoiding semantic discontinuity

How does data receive meaning?

Explicit:
• Formal schema/ontology
  – By someone else?

Implicit:
• Names are just used for describing

Social:
• Communities converge
  – By discussion
  – By emergence

Meaning?

Page 10: Seamless semantics - avoiding semantic discontinuity

Talking many languages...

Sublanguages:
• For consumer
  – title, ...
• Global retailers
  – barcode
• US food industry
  – serving size, calories, ...
• Producer
  – batch number
• ...

https://www.youtube.com/watch?v=ga1aSJXCFe0

Depending on who you are, you encounter the (un)expected, the (un)known, the (un)understandable, ...

Page 11: Seamless semantics - avoiding semantic discontinuity

The Unknown

https://www.flickr.com/photos/wrobel/8175902444/

Page 12: Seamless semantics - avoiding semantic discontinuity

We know a bit...

1. URIs as identifiers
2. HTTP lookup
3. RDF (triples)
4. Relations, also to other locations
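The four points above can be illustrated with a minimal sketch. It is not a real Linked Data client: the `WEB` dictionary stands in for HTTP lookup, and all URIs, the `lookup` and `crawl` helpers, and the vocabulary terms are hypothetical names chosen for illustration.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical in-memory stand-in for the Web: each URI "dereferences"
# to the triples that describe it. A real client would issue an HTTP GET.
WEB: Dict[str, List[Triple]] = {
    "http://example.org/artist/1": [
        ("http://example.org/artist/1", "http://example.org/vocab/name", "Some Artist"),
        ("http://example.org/artist/1", "http://example.org/vocab/made",
         "http://other.example.net/record/7"),
    ],
    "http://other.example.net/record/7": [
        ("http://other.example.net/record/7", "http://example.org/vocab/title", "Some Record"),
    ],
}


def lookup(uri: str) -> List[Triple]:
    """Simulated HTTP lookup (point 2): return the triples found at a URI."""
    return WEB.get(uri, [])


def crawl(start: str, max_hops: int = 2) -> Set[Triple]:
    """Collect triples by following object URIs to other locations (point 4)."""
    seen: Set[str] = set()
    found: Set[Triple] = set()
    frontier = [start]
    for _ in range(max_hops):
        next_frontier: List[str] = []
        for uri in frontier:
            if uri in seen:
                continue
            seen.add(uri)
            for triple in lookup(uri):
                found.add(triple)
                obj = triple[2]
                # Objects that look like URIs are links worth following.
                if obj.startswith("http://") and obj not in seen:
                    next_frontier.append(obj)
        frontier = next_frontier
    return found
```

Starting from the artist URI, one hop yields only the artist's own triples; a second hop also pulls in the record description hosted at the other (simulated) location.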

Page 13: Seamless semantics - avoiding semantic discontinuity

What is Linked Data good for, or what should it be good for?

• Data integration is (relatively) easy
  – Migrating different data sources to linked data is (relatively) easy
• Late schema is easy
  – Just add some more fields
• Ignoring data is easy
  – Think of crisps
• Serendipitous use
  – Discover new information & new sources by following links
• Data repurposing / pointing
  – Use what others have done at both schema and data level

Dealing with the unknown: data and data schema
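A small sketch of "late schema" and "ignoring data", assuming a triple-like representation. The product identifier, the `ex:` predicate names, and the `view` helper are made up for illustration and do not come from any real vocabulary.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical description of a bag of crisps; predicate names are illustrative.
crisps: List[Triple] = [
    ("ex:crisps42", "ex:title", "Salted Crisps"),
    ("ex:crisps42", "ex:barcode", "0000000000000"),
    ("ex:crisps42", "ex:batchNumber", "B-17"),
]


def view(triples: List[Triple], understood: Set[str]) -> Dict[str, str]:
    """Keep only the predicates this consumer understands; ignore the rest."""
    return {p: o for (_, p, o) in triples if p in understood}


# Late schema: a new field is just one more triple, with no migration step.
crisps.append(("ex:crisps42", "ex:calories", "150"))

# Each party reads its own sublanguage and silently skips everything else.
consumer_view = view(crisps, {"ex:title"})
retailer_view = view(crisps, {"ex:title", "ex:barcode"})
```

Adding the calories field does not disturb either view, which is the sense in which unknown data can simply be ignored.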

Page 14: Seamless semantics - avoiding semantic discontinuity

Issue: From Data Publishing to Understanding

?

De-contextualization Re-contextualization

Publishing data the structure of which you know is easier than understanding what you do not know

Page 15: Seamless semantics - avoiding semantic discontinuity

Agenda

1. Reducing language friction
2. Reducing re-use friction
3. Reducing information loss

Page 16: Seamless semantics - avoiding semantic discontinuity

Reducing Language Friction

Page 17: Seamless semantics - avoiding semantic discontinuity

Italian

Spanish

French

Page 18: Seamless semantics - avoiding semantic discontinuity

Language Dimensions (in the Semantic Web)

1 Lexicalization
2 Modularization
3 Generalization/Specialization
4 Sophistication

Page 19: Seamless semantics - avoiding semantic discontinuity

1 Lexicalization

http://img.remastersys.com/nimg/c1/a4/20dc0bfd21b1eac7c08889238b38-300x300-0/recyclable_laminated_plastic_potato_chips_bag_with_back_side_sealing.jpg

[Cimiano et al]

https://www.flickr.com/photos/theimpulsivebuy/11056507874

Page 20: Seamless semantics - avoiding semantic discontinuity

2 Modularization

Multimedia (@WeST)
• FOAF
• F event ontology
• COMM
• ...

Sensors (@Galway)
• SSN ontology
• COMM
• F

Italian
Spanish
French

[Scherp et al] [Leggieri et al]

Page 21: Seamless semantics - avoiding semantic discontinuity

2 Pattern as Micro-Module for Image Tagging

[Scherp & Saathoff, WWW-2010] [Troncy et al 2007]

Page 22: Seamless semantics - avoiding semantic discontinuity

3 Understanding via generalization

Fracture of femur → Fracture of bone

The femur is the bone in your upper leg.
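The femur example can be sketched as a lookup in a class hierarchy. This is only an illustration of the idea: the `SUPERCLASS` table, the `BodyPart` class, and both helper functions are hypothetical, and a real system would use an ontology reasoner rather than a dictionary.

```python
from typing import Dict, List

# Hypothetical toy hierarchy: each class maps to its direct superclass.
SUPERCLASS: Dict[str, str] = {
    "Femur": "Bone",
    "Bone": "BodyPart",
}


def ancestors(cls: str) -> List[str]:
    """Return all superclasses of cls, nearest first."""
    result: List[str] = []
    while cls in SUPERCLASS:
        cls = SUPERCLASS[cls]
        result.append(cls)
    return result


def generalize_fracture(site: str) -> List[str]:
    """List more general readings of 'Fracture of <site>'."""
    return [f"Fracture of {parent}" for parent in ancestors(site)]


print(generalize_fracture("Femur"))
```

A reader who does not know the term "femur" can still act on the more general "Fracture of Bone".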

Page 23: Seamless semantics - avoiding semantic discontinuity

3 Generalization/Specialization

[Figure: ontology modules arranged along a specificity axis, starting from a generic core; the legend distinguishes reused ontology modules from contributed ontology modules.]

Modules shown: DOLCE · Descriptions & Situations · Ontology of Plans · Ontology of Information Objects · Core Software Ontology · Core Ontology of Software Components · Core Ontology of Web Services

http://cos.ontoware.org

Page 24: Seamless semantics - avoiding semantic discontinuity

4 Sophistication

Page 25: Seamless semantics - avoiding semantic discontinuity

4 Ontology API Model for Image Tagging

Page 26: Seamless semantics - avoiding semantic discontinuity

4 Automatically Generated Ontology API

Page 27: Seamless semantics - avoiding semantic discontinuity

4 Comparing the two structures

Page 28: Seamless semantics - avoiding semantic discontinuity

4 OntoMDE Workflow

Model of Ontologies (MoOn)
Adding declarative layer: structuring the ontologies into semantic units

Ontology API Model (OAM)
Adding declarative layer: structuring pragmatic units specifying how entities are to be used together

Page 29: Seamless semantics - avoiding semantic discontinuity

Reducing Re-use Friction: Semantic Programming

Page 30: Seamless semantics - avoiding semantic discontinuity

Example scenario: Jamendo

Data about license-free music
• ~1 million triples
• Classes and predicates from 18 different ontologies
  – FOAF, Tag ontology, music ontology, …

Simple programming task:
• List, for every music artist, all the records they made
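To make the task concrete, here is a sketch over a handful of toy triples. The data is invented, and the use of `foaf:made` to link an artist to a record is an assumption for illustration; the actual Jamendo dataset would have to be explored to confirm which predicates it uses.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Toy data; identifiers and the choice of foaf:made are assumptions.
TRIPLES: List[Triple] = [
    ("ex:artist1", "rdf:type", "mo:MusicArtist"),
    ("ex:artist2", "rdf:type", "mo:MusicArtist"),
    ("ex:recordA", "rdf:type", "mo:Record"),
    ("ex:recordB", "rdf:type", "mo:Record"),
    ("ex:artist1", "foaf:made", "ex:recordA"),
    ("ex:artist1", "foaf:made", "ex:recordB"),
]


def records_by_artist(data: List[Triple]) -> Dict[str, List[str]]:
    """Map every music artist to the records they made (possibly none)."""
    artists = {s for (s, p, o) in data if p == "rdf:type" and o == "mo:MusicArtist"}
    records = {s for (s, p, o) in data if p == "rdf:type" and o == "mo:Record"}
    result: Dict[str, List[str]] = {artist: [] for artist in sorted(artists)}
    for s, p, o in data:
        if p == "foaf:made" and s in artists and o in records:
            result[s].append(o)
    return result


for artist, recs in records_by_artist(TRIPLES).items():
    print(artist, recs)
```

Even this small version needs to know three vocabulary terms in advance, which is the schema knowledge the talk argues a programmer should not have to look up by hand.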

Page 31: Seamless semantics - avoiding semantic discontinuity

Software Development Process Overview

Activities:
• Creation of initial data model
• Exploration of the data source
• Creation of model in code
• Query design / implementation
• Mapping of query results

Artifacts:
• data model design
• revised data model design
• data model prototype
• data queries
• final data model

Page 32: Seamless semantics - avoiding semantic discontinuity

Accessing Artists Using Apache Jena

Page 33: Seamless semantics - avoiding semantic discontinuity

From artists to songs

Observations
• SPARQL queries are strings
• Results are strings
• Requires good understanding of the data source

RDF typing is lost
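These observations can be reproduced without any RDF library. The sketch below uses a hypothetical `objects` helper as a stand-in for a string-based query interface; it is not the Jena API, and the `ex:year` predicate is invented.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

TRIPLES: List[Triple] = [
    ("ex:artist1", "foaf:made", "ex:recordA"),
    ("ex:recordA", "ex:year", "2008"),
]


def objects(data: List[Triple], subject: str, predicate: str) -> List[str]:
    """Stand-in for a string-based query: everything goes in and comes out as str."""
    return [o for (s, p, o) in data if s == subject and p == predicate]


good = objects(TRIPLES, "ex:artist1", "foaf:made")
typo = objects(TRIPLES, "ex:artist1", "foaf:maed")  # misspelt, yet no error is raised

# The year comes back as a plain string; the caller must know to convert it.
year_text = objects(TRIPLES, "ex:recordA", "ex:year")[0]
year = int(year_text)
```

The misspelt predicate yields an empty result rather than an error, and nothing tells the programmer that the year was meant to be a number.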

Page 34: Seamless semantics - avoiding semantic discontinuity

Programming Language Support for RDF Access

Static Typing
• Errors detected before execution
• Misspelling discovered by compiler!
  – Anecdote: 2nd place because of misspelt code
• Static types are a form of documentation
  – Less knowledge about data source required
• Better IDE integration / autocompletion
• Code generation
  – Sommer
  – Winter
  – OntoMDE

Dynamic Typing
• E.g. ActiveRDF (Oren et al 2007)
• “convention over configuration”
• Dynamic metaprogramming allows for slick code
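The dynamic-typing side can be sketched in a few lines. This is not ActiveRDF (a Ruby library); it is a hypothetical Python `Resource` class that maps attribute names to predicates by a naming convention, with made-up data.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class Resource:
    """Dynamic wrapper: attribute access is mapped to predicates by convention."""

    def __init__(self, uri: str, data: List[Triple], prefix: str = "foaf:") -> None:
        self._uri = uri
        self._data = data
        self._prefix = prefix

    def __getattr__(self, name: str) -> List[str]:
        # Called only when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        predicate = self._prefix + name
        values = [o for (s, p, o) in self._data if s == self._uri and p == predicate]
        if not values:
            raise AttributeError(f"{self._uri} has no values for {predicate}")
        return values


DATA: List[Triple] = [
    ("ex:artist1", "foaf:name", "Some Artist"),
    ("ex:artist1", "foaf:made", "ex:recordA"),
]

artist = Resource("ex:artist1", DATA)
print(artist.name, artist.made)
```

The code is compact and needs no declarations, but a misspelling such as `artist.maed` surfaces only as an `AttributeError` at run time, which is the trade-off against the static approach listed above.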

Page 35: Seamless semantics - avoiding semantic discontinuity

Programming with Linked Data

Page 36: Seamless semantics - avoiding semantic discontinuity

Programming with Linked Data

Tasks of the Programmer

1 Schema exploration
2 Programming code types
3 Programming queries
4 Programming procedures for creating, manipulating, and persisting objects

Page 37: Seamless semantics - avoiding semantic discontinuity

Node Path Query Language Using Autocompletion

Exploration of classes

Page 38: Seamless semantics - avoiding semantic discontinuity

Node Path Query Language Using Autocompletion

Exploration of classes

Exploration of relations

Page 39: Seamless semantics - avoiding semantic discontinuity

Node Path Query Language: Query Formulation

Exploration of classes

Exploration of relations

Querying for instances

Type set of mo:MusicArtist

No definition or declaration needed

Page 40: Seamless semantics - avoiding semantic discontinuity

Node Path Query Language for Code Development

Exploration of classes

Exploration of relations

Querying for instances

Developing code with queries

All translated into SPARQL queries at
• Development time
• Type inference at compile time (but also as part of IDE)
• Querying again at run time

One language to bind them all

One language to bind them all

Page 41: Seamless semantics - avoiding semantic discontinuity

Node Path Query Language for Code Development

Exploration of classes

Exploration of relations

Querying for instances

Developing code with queries

Developing code with new classes

All translated into SPARQL queries at
• Development time
• Run time update
• Persistence!

Page 42: Seamless semantics - avoiding semantic discontinuity

NPQL

NPQL (Node Path Query Language)
• Intensional queries: describing RDF classes and properties for reuse in IDE and in host language metaprogramming
• Extensional queries: class instances and property instances
• Compilation to SPARQL for reuse of existing endpoints

Ongoing discussion about details of NPQL
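The idea of compiling a node path to SPARQL can be sketched as follows. The slides do not show NPQL syntax, so the path here is simply a start class plus a list of properties; the `compile_path` function and the variable naming scheme are hypothetical and should not be read as the actual NPQL compiler.

```python
from typing import Dict, List


def compile_path(start_class: str, properties: List[str], prefixes: Dict[str, str]) -> str:
    """Turn 'instances of start_class, then follow each property' into SPARQL text."""
    lines = [f"PREFIX {name}: <{iri}>" for name, iri in sorted(prefixes.items())]
    last = f"?n{len(properties)}"  # variable bound to the end of the path
    lines.append(f"SELECT DISTINCT {last} WHERE {{")
    lines.append(f"  ?n0 a {start_class} .")
    for i, prop in enumerate(properties):
        lines.append(f"  ?n{i} {prop} ?n{i + 1} .")
    lines.append("}")
    return "\n".join(lines)


PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "mo": "http://purl.org/ontology/mo/",
}

# Extensional-style query: whatever music artists made (foaf:made is an assumption).
query = compile_path("mo:MusicArtist", ["foaf:made"], PREFIXES)
print(query)
```

Because the output is ordinary SPARQL text, it could be sent to any existing endpoint, which is the reuse argument made in the last bullet above.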

Page 43: Seamless semantics - avoiding semantic discontinuity

LITEQ

NPQL (Node Path Query Language)
• Intensional queries
• Extensional queries
• Compilation to SPARQL

LITEQ (Language Integrated Types, Extensions and Queries)
• Implementation of NPQL as F# Type Provider in Visual Studio
• Autocompletion using NPQL queries
• Automatic typing of extensional query results by intensional queries

Page 44: Seamless semantics - avoiding semantic discontinuity

Outlook: Programming with Linked Data

• More expressive query languages
  – Derived data types in tractable description logics!
• More precise combined type inference
  – (Derived) type from data source
  – Type inference in programming language
• Programming across data sources
  – Federated queries
  – Link-traversal-based queries (the unknown sources)
• Integration of schema induction
  – Low quality of schema/ontologies
• Improved autocompletion

Page 45: Seamless semantics - avoiding semantic discontinuity

Conclusion

Page 46: Seamless semantics - avoiding semantic discontinuity

Issue: From Data Publishing to Unknown Data Understanding

• Cognition, Storytelling, Pragmatics
• Ontology Patterns, Conceptual Modeling, Metamodels, ...
• Quantity, Pertinence, Manner

Page 47: Seamless semantics - avoiding semantic discontinuity

What is missing?

...a lot...

• Indexing
• Search
• Data and schema quality
• Pragmatics
• ...

Page 48: Seamless semantics - avoiding semantic discontinuity

Semantic Web

Social Web & Web Retrieval

Interactive Web & Human Computing

Web & Economy

Software & Services

Computational Social Science

Thank You!