November 2005 CSA3180: Semantics I 1

CSA3180: Natural Language Processing

Semantics I – Truth Conditions, FOL, Quantified Sentences, XML and Taxonomies

• Truth Conditions and First Order Logic
• Quantified Sentences
• Translating English into FOL and vice-versa
• XML in an NLP context
• Semantic Web
• Taxonomies

Introduction

• Quantification and FOL/English translation slides partly based on Introduction to Logic Lectures by Angelo Dalli given in 2000

• Quotes from W3C website and NLPRS 2001 Tokyo

• Will introduce the concepts of linking semantics to syntactic objects

• Taxonomies and the use of XML in an NLP context

Quantification

Predicate Logic addresses shortcomings of Propositional Logic mainly by introducing predicates.

Atomic or compound propositional statements like “This whiteboard is white” do not allow us to express more general claims, like “You can write on all whiteboards”.

Propositional to Predicate

Predicate logic uses the notion of variables. Variables are used as placeholders that indicate relationships between quantifiers and argument positions of predicates.

So apart from statements like father(Max) and mother(Claire) we can have father(X) and mother(Y).

Propositional Logic

Propositional logic is thus similar to algebra using constants only (like 1+(2/3)), while predicate logic uses variables (like x+(y/z)).

Variables

• Variables are named symbolically - a,b,c. In Prolog they usually start with an uppercase letter.

• Variables can appear in argument lists, e.g. big(i)

• Variables can appear in place of constants, e.g. student(x), noisy(x)

• With the help of variables we can produce wffs (well-formed formulae) - man(x), mortal(x)

Formulae vs. Sentences

A formula like man(x) is not a sentence because it does not make an identifiable claim. To make such claims we require quantifiers in order to actually bind the variables (in this case ‘x’).

Examples of atomic wffs: cube(a), big(a), green(a), doctor(x), expensive(x)

Examples of sentences which we would like to represent in FOL:

All green cubes are green

Some doctors are expensive

Quantifiers

The need for quantifiers thus follows from the lack of expressiveness of Propositional logic, and from the need to turn Predicate logic wffs containing variables into sentences.

Quantifiers tell us about the number or quantity of things that satisfy some of the conditions within the scope of the quantifier.

They are also used to help bind variables to values within a universe of discourse.

The universe of discourse is the domain of the interpretation under consideration, or, more formally, ‘the set of individual objects which we are discussing now’.

UNIVERSAL Quantifier

The first of the two quantifiers is the universal quantifier ∀: “for all” or “for every”.

The domain of the quantifier when we say (∀x) includes all those objects that can take up the value of ‘x’ in the universe of discourse - all have to bind.

Renaming the bound variable throughout the scope of ∀ does not change the meaning:

(∀x)(is_integer(x) → has_prime_fac(x))

is exactly equivalent to

(∀y)(is_integer(y) → has_prime_fac(y))

However, the following is not equivalent, because ‘y’ is not bound by the quantifier:

(∀x)(is_integer(x) → has_prime_fac(y))

UNIVERSAL Quantifier

E.g.1. Every (all) student is noisy.

That is, for all x, if x is a student, then x is noisy.

For all x, (student(x) → noisy(x))

(∀x)(student(x) → noisy(x))

E.g.2. All men are mortal. Socrates is a man. Therefore Socrates is mortal.

For all y, (is_a_man(y) → is_mortal(y))

(∀y)(is_a_man(y) → is_mortal(y))
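The Socrates argument can be sketched as one step of rule application over a set of facts. This is only an illustration: the tuple encoding of facts and the helper name apply_rule are assumptions made here, not part of the course material.

```python
# Facts are (predicate, individual) pairs; this encoding is hypothetical.
facts = {("is_a_man", "socrates")}

# The universal rule "for all y, is_a_man(y) -> is_mortal(y)",
# stored as a (premise, conclusion) pair of predicate names.
rule = ("is_a_man", "is_mortal")

def apply_rule(facts, rule):
    """Instantiate the universal rule for every individual matching the premise."""
    premise, conclusion = rule
    derived = {(conclusion, arg) for (pred, arg) in facts if pred == premise}
    return facts | derived

closed = apply_rule(facts, rule)
print(("is_mortal", "socrates") in closed)  # True
```

Because the rule is universally quantified, it applies to any individual that satisfies the premise, not just to Socrates.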

EXISTENTIAL Quantifier

The second quantifier is the existential quantifier ∃, meaning “there exists” at least one object in the domain that binds with the variable to satisfy the wff.

The scope of ∃, that is, the part of the formula to which it applies, is determined in the same way as for ∀: exactly where the variable is bound to some value or object within the domain of discourse.

So, in this case the use of brackets and the order of the quantifiers are very important, as seen in this example:

∀x ∃y (y = 2x)

Is it O.K. if: ∃y ∀x (y = 2x)

More about scope in ‘Free vs Bound’ slide.
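A small check of why quantifier order matters in the y = 2x example. The integers are infinite, so this sketch assumes two small finite ranges (xs and ys, both chosen here for illustration) as stand-ins for the domain.

```python
# Finite stand-ins for the integers (an assumption so that the check terminates).
xs = range(0, 5)    # values x may take
ys = range(0, 10)   # values y may take

# "for all x there exists y such that y = 2x"
forall_exists = all(any(y == 2 * x for y in ys) for x in xs)

# "there exists y such that for all x, y = 2x"
exists_forall = any(all(y == 2 * x for x in xs) for y in ys)

print(forall_exists)  # True
print(exists_forall)  # False
```

In the first reading y may depend on x; in the second a single y has to work for every x, which is a much stronger claim.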

EXISTENTIAL Quantifier

E.g.1. Some persons never learn.

That is, there exists at least one x such that x is a person and x will never learn.

(∃x)(person(x) ∧ never_learns(x))

E.g.2. Some footballers will never play in the Premier or First division.

Reformulating, there exists y such that y is a footballer and y will not play in the premier or first division.

There exists at least one y such that

ftball(y) ∧ ~(prem(y) ∨ div1(y))

(∃y)(ftball(y) ∧ ~(prem(y) ∨ div1(y)))

Free vs. Bound Variables

If P is a wff and ‘v’ is a variable, then:

∀v P and ∃v P are wffs too and ‘v’ is bound in P.

E.g. ∀x (student(x) → noisy(x))

‘x’ is bound within the scope of the ∀.

A variable which is not bound in P is said to be unbound or free in P.

E.g. ∀x student(x) → noisy(y)

‘y’ is unbound within the scope of ∀.

A sentence is a wff with NO unbound variables.
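The free/bound distinction can be sketched as a recursive walk over a formula. The nested-tuple encoding and the names free_vars and is_sentence are assumptions made for this illustration, and every predicate argument is assumed to be a variable (no constants).

```python
# Formulas are nested tuples (a hypothetical encoding):
#   ("pred", name, (var, ...)), ("not", f), ("and", f, g), ("or", f, g),
#   ("implies", f, g), ("forall", var, f), ("exists", var, f)

def free_vars(f):
    """Return the set of variables that occur free in formula f."""
    op = f[0]
    if op == "pred":
        return set(f[2])
    if op == "not":
        return free_vars(f[1])
    if op in ("and", "or", "implies"):
        return free_vars(f[1]) | free_vars(f[2])
    if op in ("forall", "exists"):
        # The quantifier binds its variable inside its scope.
        return free_vars(f[2]) - {f[1]}
    raise ValueError("unknown operator: " + str(op))

def is_sentence(f):
    """A sentence is a wff with no free variables."""
    return not free_vars(f)

closed_wff = ("forall", "x", ("implies",
                              ("pred", "student", ("x",)),
                              ("pred", "noisy", ("x",))))
open_wff = ("implies",
            ("forall", "x", ("pred", "student", ("x",))),
            ("pred", "noisy", ("y",)))

print(free_vars(closed_wff), is_sentence(closed_wff))
print(free_vars(open_wff), is_sentence(open_wff))
```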

Points to Remember

• Quantified sentences make claims about some intended domain of discourse.

• A sentence of the form ∀x P(x) is TRUE iff the wff P(x) is satisfied by every object in the domain of discourse.

• A sentence of the form ∃x P(x) is TRUE iff the wff P(x) is satisfied by some object (at least one) in the domain of discourse.
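Over a finite domain of discourse these two truth conditions can be checked directly. The following is a minimal sketch; the domain and the predicates is_positive and is_even are made up for illustration.

```python
# Hypothetical finite domain of discourse and predicates (not from the slides).
domain = [1, 2, 3, 4, 5, 6]

def is_positive(x):
    return x > 0

def is_even(x):
    return x % 2 == 0

def forall(pred, domain):
    """True iff every object in the domain satisfies pred."""
    return all(pred(x) for x in domain)

def exists(pred, domain):
    """True iff at least one object in the domain satisfies pred."""
    return any(pred(x) for x in domain)

print(forall(is_positive, domain))  # True
print(forall(is_even, domain))      # False
print(exists(is_even, domain))      # True
```

Python's built-in all and any play the roles of the universal and existential quantifier here, which only works because the domain is finite and can be enumerated.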

Translating Quantified Sentences

• ∀ is often used in sentences like the following:

• Every P is a Q.

∀x (P(x) → Q(x))

• While ∃ is normally used as follows:

• There is a P which also has property Q.

∃x (P(x) ∧ Q(x))

• It is often tempting to translate the latter sentence as:

∃x (P(x) → Q(x))

• but this means something rather different, being true just in case there is an object which is either not a P or else is a Q; in particular, it is true when there is no object satisfying P(x).
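The difference between the two existential translations shows up in a small made-up world where there is a dog but nothing is happy. The objects and the sets dogs and happy are assumptions for this sketch.

```python
# Hypothetical world: one dog, and nothing is happy.
domain = ["rex", "tom", "felix"]
dogs = {"rex"}
happy = set()

def P(x):
    return x in dogs

def Q(x):
    return x in happy

def implies(a, b):
    """Truth table for material implication."""
    return (not a) or b

# "There is a P which is also a Q": exists x (P(x) and Q(x))
with_conjunction = any(P(x) and Q(x) for x in domain)

# The tempting but wrong translation: exists x (P(x) -> Q(x))
with_implication = any(implies(P(x), Q(x)) for x in domain)

print(with_conjunction)  # False: no dog is happy
print(with_implication)  # True: "tom" is not a dog, so the implication holds for him
```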

Vacuously True Sentences

• Suppose we try to evaluate the sentence:

∀x (student(x) → noisy(x))

• in a world where there are no students. Nobody will satisfy the first part (student(x)) and so, from the truth table for implication, all the possible instances come out True - hence the universal statement holds.

• From this we can conclude that any sentence of the form:

∀x (P(x) → Q(x))

• is vacuously true in a world where nothing satisfies the first part, P(x).
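Vacuous truth can be observed by evaluating the sentence in a world with no students. The world below is invented for this sketch.

```python
def implies(a, b):
    """Truth table for material implication."""
    return (not a) or b

# A hypothetical world that contains objects but no students.
world = [
    {"name": "desk", "student": False, "noisy": False},
    {"name": "radio", "student": False, "noisy": True},
]

# forall x (student(x) -> noisy(x))
every_student_is_noisy = all(implies(o["student"], o["noisy"]) for o in world)

# exists x (student(x) and noisy(x))
some_student_is_noisy = any(o["student"] and o["noisy"] for o in world)

print(every_student_is_noisy)  # True, vacuously
print(some_student_is_noisy)   # False
```

The universal sentence holds even though the corresponding existential one fails, which is why "every student is noisy" does not by itself imply that a noisy student exists.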

Complex Noun Phrases

• Most of the time we use ∀ to translate sentences with “every” or “all”.

• Every small dog that is at home is happy.

∀x ((small(x) ∧ dog(x) ∧ at_home(x)) → happy(x))

• and we use ∃ to translate sentences involving “a”.

• A small happy dog is at home.

∃x (small(x) ∧ happy(x) ∧ dog(x) ∧ at_home(x))

• However, sometimes “a” also has a universal sense, as in:

• A dog is a kind of mammal.

∀x ∃y (dog(x) → (kind_of(x,y) ∧ mammal(y)))

Quantifier Equivalence

• If it is a known fact that not everything has some property, then it follows that there is something that does not have that property.

• Symbolically, ~∀x P(x) ⇔ ∃x ~P(x)

• Similar to ~(A ∧ B ∧ …) ⇔ (~A ∨ ~B ∨ …)

• ~(P(x1) ∧ P(x2) ∧ ...) ⇔ (~P(x1) ∨ ~P(x2) ∨ …)

• Similarly, if it is a known fact that it’s not the case that something has a property, then all things do not have that property.

• Symbolically, ~∃x P(x) ⇔ ∀x ~P(x)

• Similar to ~(A ∨ B ∨ …) ⇔ (~A ∧ ~B ∧ …)

• ~(P(x1) ∨ P(x2) ∨ ...) ⇔ (~P(x1) ∧ ~P(x2) ∧ …)
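Both equivalences can be checked exhaustively on a small finite domain by trying every possible extension of P. This is a sanity check under the assumption of a three-object domain, not a proof for arbitrary domains; the function name check_equivalences is made up.

```python
from itertools import product

domain = ["a", "b", "c"]  # hypothetical three-object domain

def check_equivalences(domain):
    """Try every assignment of True/False to P(x) for x in the domain."""
    for values in product([False, True], repeat=len(domain)):
        P = dict(zip(domain, values))
        # ~forall x P(x)  vs  exists x ~P(x)
        lhs1 = not all(P[x] for x in domain)
        rhs1 = any(not P[x] for x in domain)
        # ~exists x P(x)  vs  forall x ~P(x)
        lhs2 = not any(P[x] for x in domain)
        rhs2 = all(not P[x] for x in domain)
        if lhs1 != rhs1 or lhs2 != rhs2:
            return False
    return True

print(check_equivalences(domain))  # True
```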

Multiple Quantifiers

• Some cube is to the left of some tetrahedron.

∃x ∃y (cube(x) ∧ tet(y) ∧ leftof(x,y))

Precisely expressing the logical formula as an English sentence reading from left to right:

‘There exists x, there exists y, such that x is a cube, y is a tetrahedron and x is on the left of y’

• All cubes are to the left of all tetrahedrons.

∀x ∀y ((cube(x) ∧ tet(y)) → leftof(x,y))

‘For all x, for all y, if x is a cube and y is a tetrahedron, then x is to the left of y’
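The two sentences can be evaluated in small blocks worlds with nested quantifiers. The worlds and the encoding of position as a number (smaller means further left) are assumptions made for this sketch.

```python
def implies(a, b):
    return (not a) or b

def some_cube_left_of_some_tet(world):
    # exists x exists y (cube(x) and tet(y) and leftof(x, y))
    return any(x["shape"] == "cube" and y["shape"] == "tet" and x["pos"] < y["pos"]
               for x in world for y in world)

def all_cubes_left_of_all_tets(world):
    # forall x forall y ((cube(x) and tet(y)) -> leftof(x, y))
    return all(implies(x["shape"] == "cube" and y["shape"] == "tet", x["pos"] < y["pos"])
               for x in world for y in world)

# Hypothetical worlds: "pos" is a horizontal coordinate.
world_a = [{"shape": "cube", "pos": 0}, {"shape": "cube", "pos": 1}, {"shape": "tet", "pos": 2}]
world_b = [{"shape": "cube", "pos": 0}, {"shape": "tet", "pos": 1}, {"shape": "cube", "pos": 2}]

print(some_cube_left_of_some_tet(world_a), all_cubes_left_of_all_tets(world_a))  # True True
print(some_cube_left_of_some_tet(world_b), all_cubes_left_of_all_tets(world_b))  # True False
```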

Prenex Form

• When translating from English to FOL, quantifiers and connectives usually end up mixed together.

• In prenex form all quantifiers are put at the start of the sentence, followed by a wff that is quantifier-free.

Q1v1 Q2v2 … Qnvn P

• Where every Qi is either ∀ or ∃, each vi is a variable and P is a quantifier-free wff.

Restrictions and Sets

Restricted quantifiers – quantifiers that are restricted to some set membership.

Ex. Let P(x) denote the predicate that is true when x is a person. Then the set P generated by P(x) is the set of all persons.

This is denoted formally by (∀x ∈ P)

Alternatively you can define P(x) and then say that x ∈ P. Then you can simply write down ∀x

Restrictions and Sets

(∀x ∈ P)

P(x) generates P, which is the set of all people
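A restricted quantifier can be read as an ordinary quantifier guarded by the predicate that generates the set. The sketch below compares the two readings; the universe and the predicate can_speak are invented for illustration.

```python
# Hypothetical universe of discourse.
universe = ["alice", "bob", "rex", "whiteboard"]
persons = {"alice", "bob"}
speakers = {"alice", "bob"}

def P(x):
    """P(x): x is a person."""
    return x in persons

def can_speak(x):
    return x in speakers

# P(x) generates the set P of all persons in the universe.
P_set = {x for x in universe if P(x)}

# Restricted form: for all x in P, can_speak(x)
restricted = all(can_speak(x) for x in P_set)

# Unrestricted form over the whole universe: for all x (P(x) -> can_speak(x))
unrestricted = all((not P(x)) or can_speak(x) for x in universe)

print(P_set == persons)           # True
print(restricted, unrestricted)   # True True
```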

FOL to English Translation

• Two main steps:

• 1. Translate the formula by writing the literal meanings of the logical symbols and predicates as they occur.

• 2. Reword the sentence so that it has the same logical meaning (the truth or falsity of the sentence should not change) but is written in more ‘acceptable’ English. This actually involves avoiding the use of variable names.
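Step 1, the literal translation, is mechanical enough to sketch in code; step 2, the rewording, is left to the human translator. The tuple encoding of formulas, the GLOSS table and the function name literal are all assumptions made for this illustration.

```python
# Hypothetical glosses for predicates; {0} is replaced by the first argument.
GLOSS = {
    "student": "{0} is a student",
    "noisy": "{0} is noisy",
}

def literal(f):
    """Step 1: spell out each logical symbol and predicate as it occurs."""
    op = f[0]
    if op == "pred":
        return GLOSS[f[1]].format(*f[2])
    if op == "not":
        return "it is not the case that " + literal(f[1])
    if op == "and":
        return literal(f[1]) + " and " + literal(f[2])
    if op == "or":
        return literal(f[1]) + " or " + literal(f[2])
    if op == "implies":
        return "if " + literal(f[1]) + " then " + literal(f[2])
    if op == "forall":
        return "for all " + f[1] + ", " + literal(f[2])
    if op == "exists":
        return "there exists " + f[1] + " such that " + literal(f[2])
    raise ValueError("unknown operator: " + str(op))

formula = ("forall", "x", ("implies",
                           ("pred", "student", ("x",)),
                           ("pred", "noisy", ("x",))))

print(literal(formula))  # for all x, if x is a student then x is noisy
```

Rewording that output by hand (step 2) gives "Every student is noisy", which no longer mentions the variable x.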

Alternative Notations

Course Notation    Alternative Notations

¬P                 ~P, !P, P̄, Np

P ∧ Q              P&Q, P&&Q, P.Q, PQ, Kpq

P ∨ Q              P|Q, P||Q, P+Q, Apq

P → Q              P ⊃ Q, Cpq

P ↔ Q              P ≡ Q, Epq

X Y X,Y

Some simple exercises…

Let van(x) represent ‘x is a van’, car(x) represent ‘x is a car’, bike(x) represent ‘x is a bike’, exp(x,y) ‘x is more expensive than y’, faster(x,y) ‘x is faster than y’.

Translate the following formulae into natural language:

1. ∀x (bike(x) → ∃y (car(y) ∧ exp(y,x)))

2. ∀x ∀y ((van(x) ∧ bike(y)) → faster(x,y))

3. ∃z (car(z) ∧ ∀x ∀y ((van(x) ∧ bike(y)) → (faster(z,x) ∧ faster(z,y) ∧ exp(z,x) ∧ exp(z,y))))

English to FOL Translation

• Inverse translation is much more challenging. Three main steps:

• Identify predicates in the sentence.

• Rearrange the sentence into a logical formulation. Capture the essential meaning of the sentence using predicates, quantifiers and connectives.

• Cater for expressions involving time such as ‘always’, ‘afterwards’, etc.

Some more simple exercises…

• Translate the following natural language statements into predicate logic:

1. Every school boy thinks that Robin Hood is a hero.

2. Some people will never learn to keep their mouth shut or to respect other people.

3. A person’s mother is always older than that same person.

eXtensible Markup Language (XML)

• Universal structured data representation language

• Framework for web publishing

• E-Commerce Applications (B2B/B2C)

• “Point of Creation” Bottleneck – people are lazy!

• Too time consuming to markup NLP texts manually

eXtensible Markup Language (XML)

• NLP applications should help in automatic markup of texts using XML

• Gives back much richer text structure and documents

• Intelligence to documents

• Disambiguation and search functionalities

Semantic Web

The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. It is a collaborative effort led by W3C with participation from a large number of researchers and industrial partners. It is based on the Resource Description Framework (RDF), which integrates a variety of applications using XML for syntax and URIs for naming.

"The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." -- Tim Berners-Lee, James Hendler, Ora Lassila, The Semantic Web, Scientific American, May 2001

Semantic Web

• Next generation Web?

• http://www.w3.org/2001/sw/

• Many small applications, lots of hype, few widespread uses

• Most notable: RDF/RSS/Atom for blogs and news syndication (also for podcasting)

NLP for XML (NLPRS 2001)

Ontology extraction into XML based structured languages using XML Schema

Message Translation for multilingual B2B, B2C e-commerce applications

Automatic XML to XML schema mapping by XML vocabulary translators with morphological analyzers

Web (XHTML) resource discovery and indexing

Automatic hyperlink (XLink) generation

Multimodal techniques to take advantage of XML compound documents (e.g. search the key string in XHTML, MathML, SVG and SMIL components at the same time)

XML for NLP (NLPRS 2001)

NL Corpora representation languages and the conversions among them, from and to RDB, and from raw text

XML based Machine Translation / Interlingua

XML based multilingual Web contents management system

Tree transducers implemented by XSLT

IR powered by both NLP and XML

Task-oriented Summarization using XML Schemas

VoiceXML applications and the dialogue scenario generation

Foreign language e-Education (CALL) material (texts, drills, grading systems etc.) generation by XML

Taxonomies

Taxonomy (from Greek ταξινομία (taxinomia) from the words taxis = order and nomos = law) may refer to either the classification of things, or the principles underlying the classification. Almost anything, animate objects, inanimate objects, places, and events, may be classified according to some taxonomic scheme.

Wikipedia Definition

Taxonomies/Ontologies

Used to markup texts

Define XML tags (or SGML) used to markup semantic objects

Example: Use <noun> tag to markup “nouns”

Frequently hierarchical

Confusion with Ontologies – often referring to same thing (ontologies used more in Knowledge Management)

Ontologies seen sometimes as being broader in scope than taxonomies
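Marking up semantic objects with a taxonomy tag can be sketched with Python's standard xml.etree.ElementTree module. The toy noun lexicon, the <s> wrapper element and the function name mark_up are assumptions made for this illustration; a real system would use a part-of-speech tagger rather than a word list.

```python
import xml.etree.ElementTree as ET

# Toy lexicon standing in for a real tagger (hypothetical).
NOUNS = {"dog", "home"}

def mark_up(sentence):
    """Wrap every known noun in a <noun> element inside an <s> element."""
    root = ET.Element("s")
    last = None  # most recently created <noun> element, if any
    for token in sentence.split():
        if token.lower() in NOUNS:
            last = ET.SubElement(root, "noun")
            last.text = token
            last.tail = " "
        elif last is None:
            # Plain text before the first <noun> goes into root.text.
            root.text = (root.text or "") + token + " "
        else:
            # Plain text after a <noun> goes into that element's tail.
            last.tail = (last.tail or "") + token + " "
    # Trim the trailing space left by the loop.
    if last is not None:
        last.tail = last.tail.rstrip()
    elif root.text:
        root.text = root.text.rstrip()
    return ET.tostring(root, encoding="unicode")

print(mark_up("A small dog is at home"))
# <s>A small <noun>dog</noun> is at <noun>home</noun></s>
```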

Scientific vs. Folk

Scientific taxonomies:

Objective

Universal

Example: Biological Taxonomy (Linnaean/Evolutionary Tree)

Folk taxonomies:

Subjective

Vernacular naming system

Social knowledge representation

Example: Flickr, del.icio.us, podcast labels

More or less the same thing as folksonomies

Taxonomies/Ontologies

Formally represent an acyclic graph/tree

XML or SGML frequently used as base language

Prolog can also be used (80’s AI projects)

FOL can also be used (Cyc)

Modern standards: OWL, RDF, RDFS, OIL, DAML, DAML+OIL

Welcome to acronym world!
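A tree-shaped taxonomy can be sketched as a child-to-parent mapping, with category membership answered by walking up the tree. The terms and the names PARENT, ancestors and is_a are invented for this illustration and do not follow any of the standards listed above.

```python
# Hypothetical taxonomy: each term points to its parent category.
PARENT = {
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
}

def ancestors(term):
    """Return the chain of categories above term, nearest first."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain

def is_a(term, category):
    """True iff term equals category or lies below it in the tree."""
    return term == category or category in ancestors(term)

print(ancestors("dog"))        # ['mammal', 'animal']
print(is_a("dog", "animal"))   # True
print(is_a("dog", "bird"))     # False
```

Because the structure is acyclic, the walk up the tree always terminates at a root term that has no parent.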

Stuff to lookup

RDF, DAML+OIL

RSS

Podcasting – behind the scenes

(Non-comprehensive) List of NLP-related projects using ontologies

http://www.cs.utexas.edu/users/mfkb/related.html