The Semantic Web Riccardo Rosati Dottorato in Ingegneria Informatica Sapienza Università di Roma...

Preview:

Citation preview

The Semantic Web

Riccardo Rosati

Dottorato in Ingegneria InformaticaSapienza Università di Roma

a.a. 2006/07

The Semantic Web - course overview 2

Overview

the aim of this course is to provide an introduction to the Semantic Web...

... with emphasis on two aspects:• emphasis on AI-related aspects, in particular

knowledge representation and reasoning• emphasis on database-related aspects, in particular

data integration

The Semantic Web - course overview 3

Overview

• Lecture 1: Introduction to the Semantic Web

• Lecture 2: The XML layer

• Lecture 3: The RDF layer

• Lecture 4: The Ontology layer 1– Description Logics, OWL

• Lecture 5: The Ontology layer 2– reasoning in OWL, OWL species

Overview

• Lecture 6: The Ontology layer 3– OWL technologies: OWL Tools, QuOnto

• Lecture 7: The rule layer

• Lecture 8: The RDF layer 2– RDF semantics

• Lecture 9: The Ontology layer 4– query answering in OWL

• Lecture 10: Handling inconsistency (Jan Chomicki)

Lecture 1

Introduction to the Semantic Web

Introduction to the Semantic Web 6

What is the Semantic Web?

• “The Semantic Web is a Web of actionable information—information derived from data through a semantic theory for interpreting the symbols.”

• “The semantic theory provides an account of ‘meaning’ in which the logical connection of terms establishes interoperability between systems”

(Shadbot, Hall, Berners-Lee, The Semantic Web revisited, IEEE Intelligent Systems, May 2006)

Introduction to the Semantic Web 7

The Semantic Web: why?

• search on the Web: problems...• ...due to the way in which information is stored on

the Web• Problem 1: web documents do not distinguish

between information content and presentation (“solved” by XML)• Problem 2: different web documents may

represent in different ways semantically related pieces of information

• this leads to hard problems for “intelligent” information search on the Web

Introduction to the Semantic Web 8

Separating content and presentation

Problem 1: web documents do not distinguish between information content and presentation

• problem due to the HTML language• problem “solved” by current technology

– stylesheets (HTML, XML)– XML

• stylesheets allow for separating formatting attributes from the information presented

Introduction to the Semantic Web 9

Separating content and presentation

• XML: eXtensible Mark-up Language• XML documents are written through a user-

defined set of tags • tags are used to express the “semantics” of the

various pieces of information

Introduction to the Semantic Web 10

XML: example

HTML:<H1>Seminari di Ingegneria del Software</H1>

<UL> <LI>Teacher: Giuseppe De Giacomo

<LI>Room: 7 <LI>Prerequisites: none</UL>

XML:<course>

<title>Seminari di Ingegneria del Software </title>

<teacher>Giuseppe De Giacomo</teacher><room>1AI, 1I</room><prereq>none</prereq>

</course>

Introduction to the Semantic Web 11

Limitations of XML

XML does not solve all the problems:• legacy HTML documents• different XML documents may express

information with the same meaning using different tags

Introduction to the Semantic Web 12

The need for a “Semantic” Web

Problem 2: different web documents may represent in different ways semantically related pieces of information

• different XML documents do not share the “semantics” of information

• idea: annotate (mark-up) pieces of information to express the “meaning” of such a piece of information

• the meaning of such tags is shared! shared semantics

Introduction to the Semantic Web 13

The Semantic Web initiative

viewpoint:

the Web = a web of data

goal:

to provide a common framework to share data on the Web across application boundaries

main ideas:• ontology• standards• “layers”

Introduction to the Semantic Web 14

The Semantic Web Tower

Introduction to the Semantic Web 15

The Semantic Web Layers

• XML layer

• RDF + RDFS layer

• Ontology layer

• Proof-rule layer

• Trust layer

Introduction to the Semantic Web 16

The XML layer

• XML (eXtensible Markup Language) – user-definable and domain-specific markup

• URI (Uniform Resource Identifier)– universal naming for Web resources

– same URI = same resource

– URIs are the “ground terms” of the SW

• W3C standards

Introduction to the Semantic Web 17

The RDF + RDFS layer

RDF = a simple conceptual data modelW3C standard (1999)

RDF model = set of RDF triples

triple = expression (statement)

(subject, predicate, object)

• subject = resource• predicate = property (of the resource)• object = value (of the property)=> an RDF model is a graph

Introduction to the Semantic Web 18

The RDF + RDFS layer

Example of RDF graph:

http://www.w3.org/TR/REC-rdf-syntax/

“Ora Lassila”

dc:Creator

“1999-02-22”

dc:Date

“W3C”

dc:Publisher

Introduction to the Semantic Web 19

The RDF + RDFS layer

• RDFS = RDF Schema• “vocabulary” for RDF• W3C standard (2004)

example:

Person

Student Researcher

subClassOfsubClassOf

Jeentype

hasSuperVisordomain range

Frank

type

hasSuperVisor

RDFS

RDF

Introduction to the Semantic Web 20

The Ontology layer

ontology = shared conceptualization conceptual model

(more expressive than RDF + RDFS) expressed in a true knowledge representation

language

OWL (Web Ontology Language) = standard language for ontologies

Introduction to the Semantic Web 21

The proof/rule layer

beyond OWL:• proof/rule layer• rule: informal notion• rules are used to perform inference over

ontologies• rules as a tool for capturing further knowledge

(not expressible in OWL ontologies)

Introduction to the Semantic Web 22

The Trust layer

• SW top layer:• support for provenance/trust• provenance:

– where does the information come from?

– how this information has been obtained?

– can I trust this information?

• largely unexplored issue • no standardization effort

Introduction to the Semantic Web 23

The Semantic Web: main ingredients

• underlying web layer (URI, XML)– reusing and extending web technologies

• basic conceptual modeling language (RDF)• ontology language (OWL)• rules/proof• reusing and extending AI technologies

– knowledge representation – automated reasoning

• ...and database technologies– data integration

Introduction to the Semantic Web 24

The Semantic Web from an AI perspective

• the notion of ontology• the role of logic and Description Logics• the role of rule-based formalisms• ... (agent technology)

Introduction to the Semantic Web 25

The notion of ontology

• ontology = shared conceptualization of a domain of interest

• shared vocabulary => simple (shallow) ontology• (complex) relationships between “terms” => deep

ontology• AI view:

– ontology = logical theory (knowledge base)

• DB view:– ontology = conceptual model

Introduction to the Semantic Web 26

Ontologies: example

class-def animal % animals are a classclass-def plant % plants are a class subclass-of NOT animal % that is disjoint from animalsclass-def tree subclass-of plant % trees are a type of plantsclass-def branch slot-constraint is-part-of % branches are parts of some tree has-value tree max-cardinality 1class-def defined carnivore % carnivores are animals subclass-of animal slot-constraint eats % that eat any other animals value-type animalclass-def defined herbivore % herbivores are animals subclass-of animal, NOT carnivore % that are not carnivores, and slot-constraint eats % they eat plants or parts of plants value-type plant OR (slot-constraint is-part-of has-value

plant)

Introduction to the Semantic Web 27

Ontologies: the role of logic

• ontology = logical theory • why?

– declarative

– formal semantics

– reasoning (sound and complete inference techniques)

• well-established correspondence between conceptual modeling formalisms and logic

Introduction to the Semantic Web 28

Ontologies and Description Logics

• OWL is based on a fragment of first-order predicate logic (FOL)

• Description Logics (DLs) = subclasses of FOL– only unary and binary predicates– function-free– quantification allowed only in restricted form– (variable-free syntax)– decidable reasoning

• DLs are one of the most prominent languages for Knowledge Representation

Introduction to the Semantic Web 29

Ontologies and Description Logics

• expressive abilities of DLs have been widely explored

• reasoning in DLs has been extensively studied• DL reasoners have been developed and optimized

DLs as a central technology for the SW

Introduction to the Semantic Web 30

Rule-based formalisms

• Prolog• Logic programming• Constraint (logic) programming• Production rules• Datalog• ...

Rule language for SW not standardized yet

RIF (Rule Interchange Format) W3C working group

Introduction to the Semantic Web 31

The SW from a database perspective

• ontology as a virtual database schema• ontology-based information access• the Semantic Web as a framework for data

integration

Introduction to the Semantic Web 32

Virtual data integration

abstract model of a virtual data integration system:triple (G,S,M) • G = global information schema• S = set of data sources• M = mapping between sources and global schema

1. the data sources are autonomous and heterogeneous information systems

2. the global schema is a virtual conceptualization of the domain of interest

3. the mapping is a declarative specification of the relationship between sources and global schema

Introduction to the Semantic Web 33

Ontology-based information access

• ontology = virtual global information schema• ontology language = conceptual modeling

language• the Web = distributed, heterogeneous, autonomous

set of data sourcesbut :• data integration considers different formalisms for

expressing schema and data• the architecture of a data integration system is

centralized

Introduction to the Semantic Web 34

Ontology-based information access

• the data integration approach can be generalized to non-centralized architectures

=> peer data management systems• despite the differences in the formalisms adopted,

there is a tight relationship between the SW and data integration

=> dealing with incomplete information

Introduction to the Semantic Web 35

The Semantic Web and data integration

similarity between data integration and the SW:• global schema = ontology• sources = web sites (data)• mapping = ??

however:• different languages• different technologies

=> the SW is essentially a data integration technology

Introduction to the Semantic Web 36

Very quick overview of SW technologies

main current technologies and standards for the SW: • RDF • RDFS• OWL

Introduction to the Semantic Web 37

RDF

• RDF = Resource Description Framework• RDF data model is an abstract, conceptual layer

(independent of XML)• RDF data model = set of RDF triples

• triple = (subject, predicate, object)– subject = resource– predicate = property (of the resource)– object = value (of the property)

• standardized in 1999

Introduction to the Semantic Web 38

RDFS

• RDFS = RDF Schema• set of predefined predicates:

– subClassOf– subPropertyOf– domain– range– ...

• ...with predefined semantics!• standardized in 2004

Introduction to the Semantic Web 39

OWL

• OWL = Web Ontology Language• the OWL family is constituted by 3 different

languages (with different expressive power):– OWL Full– OWL-DL– OWL-Lite

• technology at an early stage – standardized in 2004– reasoning techniques and tools are very recent– “optimization” of reasoning is largely unexplored

Introduction to the Semantic Web 40

OWL vs. RDFS

• class-def• subclass-of• property-def• subproperty-of• domain• range

• class-def• subclass-of• property-def• subproperty-of• domain• range

• class-expressions• AND, OR, NOT

• role-constraints• has-value, value-type• cardinality

• role-properties• trans, symm...

• class-expressions• AND, OR, NOT

• role-constraints• has-value, value-type• cardinality

• role-properties• trans, symm...

RDF(S) OWL

Introduction to the Semantic Web 41

From RDFS to OWL Full

with respect to the relative expressive power:

OWL Full > OWL-DL > OWL-Lite > RDFS

Recommended