36
Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia, Rui Zhang and Vincenzo Maltese

Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

Embed Size (px)

Citation preview

Page 1: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

Data and Knowledge Representation Languages

Introduction to the course

Originally by Alessandro Agostini and Fausto GiunchigliaModified by Fausto Giunchiglia, Rui Zhang and Vincenzo Maltese

Page 2: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

2

Website and provisional syllabushttp://disi.unitn.it/~ldkr/ldkr2015/index.html

http://disi.unitn.it/~ldkr/ldkr2015/syllabus.pdf

Page 3: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

3

Outline Modeling and logical modeling

Domain Language Theory Model

Languages Logic: formal languages Using Logic

Specification Automation Expressiveness Efficiency VS. Complexity Decidability

Page 4: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

4

Modeling the world

DOMAIN(e.g. the girls in foreground)

INDIVIDUAL

SET

RELATION(e.g. Near, Sister)

A model is an abstraction of a part of the world

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 5: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

5

Modeling

World

LanguageL

TheoryT

DomainD

ModelM

DataKnowledge

Meaning

MentalModel

SEMANTICGAP

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 6: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

6

Modeling: explained World: the phenomenon we want to describe Domain: the abstract relevant elements in the real world Mental Model: what we have in mind. It is the first

abstraction of the world (subject to the semantic gap) Language: the set of words and rules we use to build

sentences used to express our mental model Theory: the set of sentences (constraints) about the world

expressed in the language that limit the possible models Model: the formalization of the mental model, i.e. the set of

true facts in the language, in agreement with the theory Semantic gap: the impossibility to capture the entire reality

NOTE: this does not necessarily need to be in formal semantics

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 7: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

7

Mental modelsThere is a monkey in a laboratory with some bananas hanging out of reach from the ceiling. A box is available that will enable the monkey to reach the bananas if he climbs on it. The monkey and box have height Low, …

… but if the monkey climbs onto the box he will have height High, the same as the bananas. [...]”

NOTE: each picture is a different model

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 8: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

8

Example of informal Modeling

World

MentalModel

SEMANTICGAP

ModelM

L: natural language

D: {monkey, banana, tree}

T: “If the monkey climbs on the tree, he can get the banana”

M: “The monkey actually climbs on the tree and gets the banana”

TheoryT

LanguageL

DomainD

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 9: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

9

Logical Modeling

Modeling

Realization

World

LanguageL

TheoryT

DomainD

ModelM

DataKnowledge

Meaning

MentalModel

SEMANTICGAP

Inte

rpre

tatio

n

I

En

tailm

en

t

NOTE: the key point is that in logical modeling we have formal semantics

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 10: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

10

Logical Modeling: explained Domain: relevant objects Logical Language: the set of formal words and rules we

use to build complex sentences Interpretation: the function that associates elements of

the language to the elements in the domain Theory / Knowledge Base (KB): the set of facts which

are always true (in data and knowledge) Model: the set of true facts in the language describing

the mental model (the part of the world observed), in agreement with the theory

Truth-relation / logical entailment (⊨): deduction, reasoning, inference

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 11: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

“The author of Romeo and Juliet is Shakespeare”author(R&G, Shakespeare)GirlsPlaying = {Mary, Sara, Julie}

“Bananas are yellow” “Bananas have a curve shape”“All men are mortal” man mortalGirlsPlaying = True

Intensional vs Extensional semantics Intensional: I can only express the fact that a

given proposition is true or false

Extensional: I provides the objects of the domain corresponding to the proposition

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

11

Page 12: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

12

Example of formal (intentional) modeling

World

SEMANTICGAP

L = {MClimbs, MGetBanana, , , }

D= {T, F}

T = { MClimbs MGetBanana}

A possible model M:I(MClimbs) = TI(MGetBanana) = T

MentalModel

ModelM

TheoryT

LanguageL

DomainD

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 13: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

13

Example of formal (extensional) modeling

World

SEMANTICGAP

L = {Monkey, Climbs, GetBanana, , , }

D= {Cita, Banana#1, Tree#1}

T = { (Monkey Climbs) GetBanana}

A possible model M:I(Monkey) = {Cita}I(Climbs) = {(Cita, Tree#1)}I(GetBanana) = {(Cita, Banana#1)}

MentalModel

ModelM

TheoryT

LanguageL

DomainD

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 14: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

14

A (usually finite) set of words (the alphabet) and rules to compose them to build “correct sentences” e.g. Monkey and GetBanana are words e.g. Monkey GetBanana is a sentence (rule: A B)

A tool for codifying our (mental) model (what we have in mind): Sentences (syntax) with an intended meaning

(semantics) e.g. with the word Monkey we mean

Language

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 15: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

15

We need an algorithm for checking correctness of the sentences in a language

Language and correctness

PARSERs, LYes, correct!

No

We say that a language is decidable if it is possible to create such a tool that in finite time can take the decision

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 16: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

Syntax and Semantics Syntax: the way a language is written:

Syntax is determined by a set of rules saying how to construct the expressions of the language from the set of atomic tokens (i.e., terms, characters, symbols).

The set of atomic tokens is called alphabet of symbols, or simply the alphabet).

Semantics: the way a language is interpreted: It determines the meaning of the syntactic constructs

(expressions), that is, the relationship between syntactic constructs and the elements of some universe of meanings (the intended model).

Such relationship is called interpretation.

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

16

Page 17: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

Example of Syntax and Semantics Suppose we want to represent the fact that Mary

and Sara are near each other.

ENGLISH

Mary is near Sara.

formal syntax, informal semantics

‘SYMBOLIZED’ ENGLISH

near(M,S)

near(Mary, Sara)

formal syntax, informal semantics

LOGICS with an interpretation function I

“near” by I it means near as spatial relation

“M” by I it means Mary the girl

“S” by I it means Sara the girl

formal syntax, formal semantics

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

17

Page 18: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

18

Formal vs. Informal languages Language = Syntax (what we write) + Semantics (what we

mean)

Formal syntax Infinite/finite (always recognizable) alphabet Finite set of formal constructors and building rules for

phrase construction Algorithm for correctness checking (a phrase in a language)

Formal Semantics The relationship between syntactic constructs in a

language L and the elements of an universe of meanings D is a (mathematical) function I: L pow(D)

Informal syntax/semantics the opposite of formal, namely the absence of the elements

above.

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 19: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

Levels of Formalization

LogicsDiagramsNLProgramming

Languages

EnglishItalian

RussianHindi

...

ERUML...

SQL...

PLFOLDL...

Both Syntax and Semantics can be formal or informal.

Level1 Leveln

FORMALITY

informal semi-formal formal

19

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 20: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

20

Informal languages: natural language Let us try to recognize relevant entities, relations

and properties in the NL text below

The Monkey-Bananas (MB) problem by McCarthy, 1969“ There is a monkey in a laboratory with some bananas hanging out of reach from the ceiling. A box is available that will enable the monkey to reach the bananas if he climbs on it. The monkey and box have height Low, but if the monkey climbs onto the box he will have height High, the same as the bananas. [...]”

Question:How shall the monkey reach the bananas?

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 21: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

21

Languages with formal syntax In the Entity-Relationship (ER) Model [Chen 1976]

the alphabet is a set of graphical objects, that are used to construct schemas (the sentences).

Entity Relation Attribute

Monkey Box

Climb

0..1 0..n

Examples of ER sentences:

Banana Height

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 22: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

22

Languages with formal syntax and semantics Logics has two fundamental components:

L is a formal language (in syntax and semantics) I is an interpretation function which maps sentences into

a formal model M (over all possible ones) with a domain D

Domain D = {T, F} or D’ = {o1, …, on} Language L = {A,,,} Interpretation I: L D is intensional or I: L pow(D’) is

extensional Theory: a set of sentences which are always true in the

language (facts) Model: the set of true facts in the language describing the

mental model (the part of the world observed), in agreement with the theory

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 23: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

TheoryA theory T is set of facts:

A fact written in the language L defines a piece of knowledge (something true) about the domain D

A theory can be seen as a set of constraints on possible models to filter out all undesired ones

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

23

Page 24: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

24

Data and Knowledge Knowledge is factual information about what is true in

general (axioms and theorems) e.g. singers sing songs Data is factual information about specific individuals

(observed knowledge) e.g. Michael Jackson is a singer

A finite theory T is called: An ontology if it contains knowledge only A knowledge base (KB) if it contains knowledge and data A database (DB) if it only contains data. A database + its schema is the simplest kind of

knowledge base

NOTE: Sometimes the terms Ontology and KB are used interchangeably

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 25: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

25

ModelA model M is an abstract (in mathematical sense) representation of the intended truths via interpretation I of the language L

e.g. Madonna is a singer, Sting is a singer, …

M ⊨ T (M entails T) iff M ⊨ A, for each formula A in T

A model M of a theory T is any interpretation function that satisfies all the facts in T (M satisfies T)There can be many models satisfying the theory T. They are a subset of all possible interpretation functions over L.In case there are no models for T, we say that the theory T is unsatisfiable.

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 26: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

26

Usages of Representation LanguagesThe two purposes in modeling

Specification: Representation of the problem at the proper level of

abstraction Allow informal/formal syntax and informal/formal semantics

Automation (Automated Reasoning) Computing consequences or properties of the chosen

specifications It requires formal semantics.

Logics have formal syntax and formal semantics!

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 27: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

27

Why Natural Languages?

Used for Advantages Disadvantages

For informal specification

Often used at the very beginning of problem solving, when we need a direct, “flexible”, well-understood language and the problem is still largely unclear

Useful to interact with users

Semantics is informal, largely ambiguous

Pragmatically inefficient for automation

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 28: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

28

Why Diagrams?

Used for Advantages Disadvantages

To provide more structured and organized specification than natural languages

Informal/formal syntax (depends on the kind of diagram)

Largely structured and organized; usually used in representation with unified languages when things are non-trivial or more precision is required w.r.t. Natural Language

Useful to interact with users

Semantics is informal, largely ambiguous

Pragmatically inefficient for automation

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 29: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

29

Why Logic?

Used for Advantages Disadvantages

Formal specification

Automation

Well-understood with formal syntax and formal semantics: we can better specify and prove correctness

Pragmatically efficient for automation exploiting the explicitly codified semantics: reasoning services

It can be hardly used to interact with users

An exponential grow in cost (computational, man power)

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 30: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

30

Logics for Automation: reasoning Logics provides a notion of deduction

axioms, deductive machinery, theorem Deduction can be used to implement reasoners

Reasoners allow inferring conclusions from known facts (i.e., a set of “premises”, premises can be axioms or theorems).

From implicit knowledge to explicit knowledge

Reasoning services: Model Checking (EVAL) Is a sentence ψ true in model M? Satisfiability (SAT) Is there a model M where ψ is true? Validity (VAL) Is ψ true according to all possible models? Entailment (ENT) ψ1 true implies ψ2 true (in all models)

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 31: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

How to use logical modeling in practice

31

1. Define a logic most often by reseachers once for all (not a trivial task!)

2. Choose the right logic for the problem Given a problem the computer scientist must choose

the right logic, most often one of the many available

3. Write the theory The computer scientist writes a theory T

4. Use reasoning services Computer scientists use reasoning services to solve

their programs

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 32: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

32

An Important Trade-Off There is a trade-off between:

expressive power (expressiveness) and computational efficiency provided by a (logical)

language

This trade-off is a measure of the tension between specification and automation

To use logic for modeling, the modeler must find the right trade off between expressiveness in the language for more tractable forms of reasoning services.

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 33: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

33

Examples of Expressiveness

LANGUAGE NL SENTENCE FORMULAPropositional logic

Fausto likes skiing

I like skiing

Fausto-likes-skiing

I-like-skiing

Modal logic I believe I like skiing B(I-like-skiing)

First-order logic Every person likes skiing

I like skiing

Fausto likes skiing

∀ person.like-skiing(person)

like-skiing(I)

like-skiing(Fausto)

Description Logic

Every person likes cars person ⊑ ∃ likes.Car

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 34: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

34

Efficiency VS. Complexity Efficiency

Performing in the best possible manner; satisfactory and economical to use [Webster]

In modeling it applies to reasoning In time, space, consumption of resources

Complexity (or computational complexity) of reasoning The difficulty to compute a reasoning task expressed by

using a logic With degrees of expressiveness, we may classify the logical

languages according to some “degrees of complexity”

NOTE: When logic is used we always pay a performance price and therefore we use it when it is cost-effective

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 35: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

35

Decidability The existence of an effective method to

determine the validity of formulas in a logical language

A logic is decidable if there is an effective method to determine whether arbitrary formulas are included in a theory

A decision procedure is an algorithm that, given a decision problem, terminates with the correct yes/no answer.

In this course we focus on logics that are expressive enough to model real problems but are still decidable

MODELING :: LOGICAL MODELING :: LANGUAGES :: LOGIC :: USING LOGIC

Page 36: Data and Knowledge Representation Languages Introduction to the course Originally by Alessandro Agostini and Fausto Giunchiglia Modified by Fausto Giunchiglia,

36

Appendix: Greek lettersΑα Alpha Νν Nu

Ββ Beta Ξξ Xi

Γγ Gamma Οο Omicron

Δδ Delta Ππ Pi

Εε Epsilon Ρρ Rho

Ζζ Zeta Σσς Sigma

Ηη Eta Ττ Tau

Θθ Theta Υυ Upsilon

Ιι Iota Φφ Phi

Κκ Kappa Χχ Chi

Λλ Lamdba Ψψ Psi

Μμ Mu Ωω Omega