Linguistic Knowledge Representation Scott Farrar Department of Linguistics


Linguistic Knowledge Representation

Scott Farrar
Department of Linguistics
farrar@u.arizona.edu

Problems to Overcome

1. Specifying the relationship between linguistic and other forms of knowledge.

Inference from Knowledge

John’s hand is in his pocket.

John owns the hand.
The hand is physically attached to John.
Hand is a body-part, not a person.
The hand is physically contained in the pocket, not the other way around.
John wants his hand to be in his pocket.
This event is occurring now.
John’s hand is not in Bill’s pocket.
A pocket is a container in clothing.
A hand is smaller than a pocket.

Knowledge-source labels as they appear on the slide: L; CS, V; L, CS; V; CS; CS; L; CS; CS

L = Linguistic, CS = Commonsense, V = Visual

Problems to Overcome

1. Specifying the relationship between linguistic and other forms of knowledge.
2. Dealing with ambiguity and other issues of natural language processing (NLP).

What is Meaning?

Symbols, Representation, Extensions

[010011101]

moon

What is Meaning?

Conceptual structure hypothesis:

The meaning of a word is the corresponding mental representation in the mind of an agent.

The Lexicon links Linguistic Knowledge to the Rest of Cognition

[Figure: Jackendoff’s Model. Components shown: Conceptual Structure, Spatial Structure, Syntactic Structure; inputs: Vision, Haptic, Auditory; output: Motor.]

The Computational Lexicon

form: what data structures comprise the lexicon?
organization: how are the data structures organized?
content: what information is contained in the data structures?

Form of the Lexicon

Features (Katz and Fodor 1963): knowledge is a conjunction of features (monadic predicates)

bachelor(x) → unmarried(x) & male(x) & young(x)

Form of the Lexicon

Frames (Minsky 1975): knowledge is organized around concepts

give:
  <agent Person>
  <recipient Person>
  <theme PhysicalObject>

In each pair, the first element is the slot and the second is its value.
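A frame such as the one for give can be sketched as a mapping from slots to the concept each filler must belong to. The Python below is illustrative only: the names `Frame`, `IS_A`, and `fits`, and the small type table, are assumptions made for the example and are not part of Minsky's proposal.

```python
from dataclasses import dataclass

# Hypothetical type table: each individual is tagged with one concept.
IS_A = {"king": "Person", "people": "Person", "bread": "PhysicalObject"}

@dataclass
class Frame:
    name: str
    slots: dict  # slot name -> concept required of the filler

    def fits(self, fillers: dict) -> bool:
        """True if every slot is filled by an individual of the required concept."""
        return all(
            slot in fillers and IS_A.get(fillers[slot]) == concept
            for slot, concept in self.slots.items()
        )

give = Frame("give", {"agent": "Person", "recipient": "Person", "theme": "PhysicalObject"})

ok = give.fits({"agent": "king", "recipient": "people", "theme": "bread"})
bad = give.fits({"agent": "bread", "recipient": "people", "theme": "king"})
```

Here `ok` holds because each filler has the concept its slot requires, while `bad` fails because bread is not a Person.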

Form of the Lexicon

Attribute-value matrices (feature structures)
From the knowledge engineering (AI) community
e.g., Head-Driven Phrase Structure Grammar

<see example>

Organization of the Lexicon

Semantic network (Quillian 1966): knowledge is interconnected

[Figure: semantic network with nodes animal, bird, tweety, lungs, and feathers, joined by links labeled Sub, Inst, and has-part (twice).]
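One way to read such a network is as a set of labeled edges along which properties are inherited. In the sketch below, the edge directions (bird Sub animal, tweety Inst bird, animal has-part lungs, bird has-part feathers) are an assumed reading of the diagram, and `parts_of` is a hypothetical helper.

```python
# Labeled edges: (source, label, target). The directions are an assumption.
EDGES = [
    ("bird", "Sub", "animal"),
    ("tweety", "Inst", "bird"),
    ("animal", "has-part", "lungs"),
    ("bird", "has-part", "feathers"),
]

def parts_of(node):
    """Collect has-part values for a node, inheriting along Inst and Sub links."""
    parts, frontier, seen = set(), [node], set()
    while frontier:
        current = frontier.pop()
        if current in seen:
            continue
        seen.add(current)
        for source, label, target in EDGES:
            if source != current:
                continue
            if label == "has-part":
                parts.add(target)
            elif label in ("Inst", "Sub"):
                frontier.append(target)  # climb to the more general node
    return parts

tweety_parts = parts_of("tweety")
```

Under these assumed edges, tweety inherits feathers from bird and lungs from animal.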

Hierarchy of Concepts

Artifact

Machine Tool

Motorized-Machine

Non-motorized-Machine

automobile drill loom windmill

… …

Problems to Overcome

1. Specifying the relationship between linguistic and other forms of knowledge.
2. Dealing with ambiguity and other issues of natural language processing (NLP).
3. Determining what role visual knowledge of objects and events has in the disambiguation process.

Formalism for Ambiguity Resolution
(9/13/02)

Scott Farrar
Department of Linguistics
farrar@u.arizona.edu

Goals of the Present Research

Build a theoretical system that can construct a visual scene from English text input.
  focus on the problem of lexical ambiguity
  access and use the visual knowledge linked to lexical items
  argue for a knowledge-rich approach to natural language processing (lexicon)

Lexical Ambiguity

When one linguistic form has multiple meanings:

The book is on the edge of the table. [area]
The edge of the table is sharp. [line]

The park is five blocks away. [large dimension]
Kids like to play with blocks. [small dimension]

The middle (of the bench) is wet. [center-part]
Put the pan in the middle (between the bowls). [space-between]

To vote “YES” check the upper box. [2d]
Put your hand in the box. [3d]

Natural Language Processing

syntax

“The king gave the people bread.”

give: tense: past agent: the king recipient: the people theme: bread

The king gave the people bread DT N VB DT N N

(the king) (gave) (the people) (bread)

The people ate the bread.

The people have bread.

lexicon

semantics

Grammar

other knowledge

How much visual knowledge does the lexicon have access to?

    Type of concept           Example
a.  a-spatial                 fear, hour, duration
b.  extrinsically spatial     animal, robot, instrument
c.  intrinsically spatial     horse, man, violin, my leg
d.  strictly spatial          square, margin, height

(Bierwisch 1996: 52)

Content of the Lexicon

Purely linguistic:

dog

“dog”, [da:g]
dog collar, not *collar dog
dog+PL = dogs, not *doges

Content of the Lexicon

Purely visual:

dog

shape:
size:
color:
texture:

Content of the Lexicon

Commonsense/other/visual:

dog

has-part (dog, tail)
makes-noise (dog, “bark”)
disjoint (dog, cat)
likes (Scott, terrier)

Knowledge Components

[Figure: three linked components, L, V, and CS.]

knowledge is distributed yet interoperable

L = Linguistic, CS = Commonsense, V = Visual

Formalization of the Problem

input: a list of well-formed English utterances U, where U = {u_1, u_2, u_3, …, u_n}, |U| ≥ 1, and U can be interpreted as a complete visual scene.

Examples of U

{John is standing on the bridge. John has his hands in his pockets. John is wearing a cap…}

{The table is in the middle of the room. There is a ball on the edge of the table…}

not

{Mary loves John. She has known him for four years…}

Formalization of the Problem (cont.)

output: V_U, such that V_U is a visual scene based on U consisting of a 3-tuple <I, O, R>, where I is a set of icons, O is a set of orientations for the icons, and R is a set of relations among the icons.
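The input and output described above can be written down as plain data types. This is a sketch only: the names `Scene`, `Relation`, and `is_valid_input`, the field types, and the sample orientation label are assumptions; the source specifies only that the input is a non-empty list of utterances and that the output is a 3-tuple <I, O, R>.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Relation:
    name: str    # e.g. "On"
    figure: str  # icon being located
    ground: str  # icon it is located relative to

@dataclass
class Scene:
    icons: set = field(default_factory=set)           # I
    orientations: dict = field(default_factory=dict)  # O: icon -> orientation label
    relations: set = field(default_factory=set)       # R

def is_valid_input(utterances):
    """U must be a non-empty list of non-empty utterances (|U| >= 1)."""
    return len(utterances) >= 1 and all(isinstance(u, str) and u for u in utterances)

U = [
    "The table is in the middle of the room.",
    "There is a ball on the edge of the table.",
]

# A hand-built scene for U; producing it automatically is the research problem.
scene = Scene()
scene.icons |= {"table", "room", "ball"}
scene.orientations["table"] = "upright"  # assumed orientation label
scene.relations.add(Relation("In", "table", "room"))
scene.relations.add(Relation("On", "ball", "table"))
```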

A Solution Approach

Represent all knowledge in pure first-order logic
  a knowledge base KB consists of axioms and facts about the domain with no distinctions made between types of knowledge
  a forward-chaining algorithm is used to generate a visual scene
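Forward chaining repeatedly applies rules to the known facts until nothing new can be derived. The sketch below uses ground (variable-free) rules for brevity, which is a simplification of first-order inference; the facts, the rules, and the `Supported` predicate are invented for illustration.

```python
def forward_chain(facts, rules):
    """Saturate a fact set. rules is a list of (premises, conclusion) pairs."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known

facts = {("On", "ball", "table"), ("Contacts", "ball", "table")}
rules = [
    # Ground instance of "if A contacts B, then A is near B".
    ([("Contacts", "ball", "table")], ("Near", "ball", "table")),
    # Hypothetical second rule, to show chaining over a derived fact.
    ([("On", "ball", "table"), ("Near", "ball", "table")], ("Supported", "ball")),
]

derived = forward_chain(facts, rules)
```

The second rule can fire only after the first has added the Near fact, which is the chaining step.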

Formalization of the Lexicon

A lexicon L is at least the 4-tuple <F, R, G, C>, where:

F is the set of linguistic forms.
R is the set of formal relations among members of F.
G is the grammatical information relevant to F.
C is the conceptual content (the meaning).
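The 4-tuple can be mirrored directly in a small container type. This is a minimal sketch: the class name `Lexicon`, the `add` helper, the `plural-of` relation, and the representation of G and C as dictionaries keyed by form are all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Lexicon:
    forms: set = field(default_factory=set)      # F: linguistic forms
    relations: set = field(default_factory=set)  # R: (relation, form, form) triples
    grammar: dict = field(default_factory=dict)  # G: form -> grammatical information
    content: dict = field(default_factory=dict)  # C: form -> set of concepts

    def add(self, form, pos, concepts):
        """Register a form with its part of speech and conceptual content."""
        self.forms.add(form)
        self.grammar[form] = {"pos": pos}
        self.content.setdefault(form, set()).update(concepts)

lex = Lexicon()
lex.add("dog", "N", {"Dog"})
lex.add("dogs", "N", {"Dog"})
lex.relations.add(("plural-of", "dogs", "dog"))  # hypothetical formal relation
```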

Commonsense Knowledge

A ball will not remain stationary on an inclined surface.
A jar can be a container.
Unsupported objects fall.

Formalization of Commonsense Knowledge

KB is the tuple <C, R, I, O>, where:

C is the (possibly infinite) set of concepts.
R is the set of relations over C.
I is the set of individuals.
O is an ontology specifying the precise formalization of C, R, and I.

Visual Knowledge

Concepts:
  AbstractShapes = {Circle, Line, Sphere, …}

Relations:
  SpatialRelations = {In, On, Contains, …}

Axioms:
  Peas are smaller than landmines.
  If object A contacts object B, then A is near B.

Formalization of Visual Knowledge

Visual knowledge V is a subset of KB.

So far so Good

lexicon L = <F, R, G, C>
knowledge base KB = <C, R, I, O>
visual knowledge V ⊆ KB

* Well-understood inferencing procedures exist for FOL knowledge bases:
  theorem proving: Prolog
  forward-chaining: CLIPS

So Far so Intractable

* If represented in pure first-order (or higher order) logic, then the problem will eventually become computationally intractable depending on scope of domain and, for NLP, ambiguity of the word in question (compare ‘edge’ to ‘hand’).

Alternatives

Use a logical system that is well-understood and known to be complete and tractable

Description Logic (Brachman 1979)

Description Logic

A KR formalism (like frames, semantic nets, production systems)
A way to build a conceptualization of the domain
Basic structure is the concept (a structured entity)
Intuitively appealing
Incorporates a subset of FOL
Expressive syntax and decidable inference procedures

e.g., KL-ONE (Brachman 1979), KRYPTON (Brachman, Fikes, and Levesque 1983), LOOM, CLASSIC…

Description Logic Components

Syntax
The KB
Semantics
Reasoning Procedures
Reasoning Tasks

Syntax: Atoms

concepts (unary predicates)
roles (binary predicates)
individuals (constants)

Syntax: Concepts

Round
Flat
Long
Hole
Person
Event

Syntax: Roles

On
In
Strike
Touch
Neighbor
Father

Syntax: Individuals

TABLE-1
JOHN
MY-HAND
ROLLING-EVENT-3

The Knowledge Base of a Description Logic

Terminology (TBox): hierarchy of concepts and roles
Assertions (ABox): axioms for individual objects

Benefits of Dividing the KB

Reasoning is tractable due to sacrifice of expressiveness (Brachman and Levesque 1985)
Philosophically ‘clean’:
  TBox (intensional knowledge, always true, doesn’t change, a priori)
  ABox (extensional knowledge, can change, a posteriori)
Satisfiability of conceptualization (domain) is determined easily when only TBox is considered
Conceptual modeling appears more intuitive

Syntax: Constructors

intersection (C ⊓ D):
  Round ⊓ Flat
  Round ⊓ Flat ⊓ Light
value restriction (∀R.C):
  ∀hasHole.Container
limited existential quantification (∃R.⊤):
  ∃hasHole.⊤

Syntax: TBox Axioms

Definitions
  Ball ≡ Sphere ⊓ Toy
  Box ≡ Container ⊓ Cube
  Biped ≡ Animal ⊓ (=2 hasLegs)

Definitions are the basic operation in the TBox for deriving new concepts (other concepts are primitive).

Syntax: TBox Axioms (cont.)

subsumption operator: ‘⊑’
  Table ⊑ Furniture
  Human ⊑ Animal
  Sphere ⊑ 3DShape

Provide structure for TBox
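Stated subsumption axioms of this kind form a hierarchy that can be queried by following ⊑ links upward. This is a minimal sketch, assuming each concept has at most one stated parent; the extra axiom Furniture ⊑ Artifact and the function name `subsumes` are illustrative assumptions. Actual description logic reasoners also derive subsumptions from concept definitions and do not rely on stated links alone.

```python
# child -> parent, read as "child ⊑ parent"
TBOX = {
    "Table": "Furniture",
    "Human": "Animal",
    "Sphere": "3DShape",
    "Furniture": "Artifact",  # assumed extra axiom, for illustration only
}

def subsumes(general, specific):
    """True if `specific ⊑ general` follows from the stated hierarchy."""
    current = specific
    while current is not None:
        if current == general:
            return True
        current = TBOX.get(current)  # None once the top of the chain is reached
    return False

table_is_artifact = subsumes("Artifact", "Table")
furniture_is_table = subsumes("Table", "Furniture")
```

Subsumption is directional: every Table is an Artifact under the assumed axioms, but not every Furniture is a Table.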

Syntax: ABox Assertions

Concept assertions C(a):
  Person (JOHN)
  Table (TABLE-1)
  Tool (HAMMER-23)

Role assertions R(a,b):
  Likes (JOHN, HAMMER-23)
  On (HAMMER-23, TABLE-1)

Serves to link individuals in ABox to concepts in TBox

Note on T/ABox Relation

TBox imposes selection restrictions on ABox assertions, e.g., Eat(JOHN, TABLE-1) is not possible, because Edible(Table) is not in the TBox

Informal Semantics

A concept C is a set of individuals {a, b, …}
A role R is a relation between a pair of individuals or concepts.
A conjoined concept C ⊓ D is both C and D.

Reasoning Tasks

TBox
  categorization, satisfiability
ABox
  consistency checking w.r.t. TBox

Ambiguity in a DL System

More than one referent concept in TBox
  if a word W represents concept C and concept D, and C and D are disjoint, then W is ambiguous

Represents(‘edge’, LinearBoundary)
Represents(‘edge’, 2DSurface)
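The ambiguity condition can be checked mechanically: a word is ambiguous if any two of the concepts it represents are declared disjoint. In the sketch below, the `hand` entry, the explicit disjointness declaration for LinearBoundary and 2DSurface, and the function name `is_ambiguous` are assumptions added for illustration.

```python
from itertools import combinations

# word -> concepts it represents
REPRESENTS = {
    "edge": {"LinearBoundary", "2DSurface"},
    "hand": {"BodyPart"},  # assumed entry, included for contrast
}

# Unordered pairs of concepts assumed to be declared disjoint in the TBox.
DISJOINT = {frozenset({"LinearBoundary", "2DSurface"})}

def is_ambiguous(word):
    """A word is ambiguous if it represents two mutually disjoint concepts."""
    concepts = sorted(REPRESENTS.get(word, set()))
    return any(frozenset(pair) in DISJOINT for pair in combinations(concepts, 2))

edge_ambiguous = is_ambiguous("edge")
hand_ambiguous = is_ambiguous("hand")
```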

Consistency Checking for Determining Lexical Ambiguity

Recall that ‘box’ is ambiguous, as in ‘Check the box if you are a student.’

2DObject(Box-34)
3DObject(Box-34)

Disjoint (2DObject, 3DObject)
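The clash in this example can be detected by grouping concept assertions by individual and looking for a disjoint pair. This is a sketch, not a full ABox consistency procedure: it checks only stated assertions against declared disjointness, and the individual Box-7 and the function name are hypothetical.

```python
# Unordered pairs of concepts declared disjoint in the TBox.
DISJOINT = {frozenset({"2DObject", "3DObject"})}

def inconsistent_individuals(abox):
    """Return individuals asserted to belong to two disjoint concepts.

    abox is an iterable of (concept, individual) pairs.
    """
    by_individual = {}
    for concept, individual in abox:
        by_individual.setdefault(individual, set()).add(concept)
    return {
        individual
        for individual, concepts in by_individual.items()
        if any(pair <= concepts for pair in DISJOINT)  # both members asserted
    }

abox = [
    ("2DObject", "Box-34"),
    ("3DObject", "Box-34"),
    ("3DObject", "Box-7"),  # hypothetical consistent individual
]
clashes = inconsistent_individuals(abox)
```

An ABox that asserts both readings of the same box is inconsistent, which signals that one sense has to be dropped.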

Consistency Checking for Resolving Ambiguity

The ball is on the edge of the table.
Ball can’t be in two places.
Only one semantic representation obtains from the TBox.
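Resolution can then be sketched as filtering a word's senses against what the context requires. The selection restriction used here (the second argument of On must be a 2DSurface) is an assumption invented for the example, as are the names `SENSES`, `ROLE_RANGE`, and `consistent_senses`.

```python
SENSES = {"edge": ["LinearBoundary", "2DSurface"]}

# Assumed selection restriction: the second argument of On must be a 2DSurface.
ROLE_RANGE = {"On": "2DSurface"}

# Unordered pairs of concepts assumed to be declared disjoint in the TBox.
DISJOINT = {frozenset({"LinearBoundary", "2DSurface"})}

def consistent_senses(word, role):
    """Keep the senses of `word` that do not clash with the role's range."""
    required = ROLE_RANGE[role]
    return [
        sense
        for sense in SENSES[word]
        if frozenset({sense, required}) not in DISJOINT
    ]

remaining = consistent_senses("edge", "On")
```

Under the assumed restriction, only the surface sense of ‘edge’ survives for ‘The ball is on the edge of the table’, so a single representation remains.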

Conclusion

Word sense disambiguation is a challenge to NLP.
NLP can benefit from a knowledge-rich approach.
A combination of visual and other commonsense assertions can enrich the lexicon.
Description logic provides a computationally tractable solution.
