International Technology Alliance In Network & Information Sciences
Fact Extraction using CNL: summary of reasoning (v3)
David Mott, Dave Braines (ETS, IBM UK)
Stephen Poteet (Boeing)
February 2012



[2]

Objective of reasoning

• Define rules for semantic reasoning
  – entities
  – situations
• Standardise processing of NL and CNL to use the same concepts and rules
  – NL processing currently done by basic rules in two agents
  – CNL processing currently done by "linguistic frames"
• Two different rule interpreters, but they use the same rules


[3]

Conceptual Model(s)

• Meta Model: Concept, Entity Concept, Relation Concept, Conceptual Model (relations: belongs to, has as domain)
• Semiotic Triangle: Thing, Meaning, Symbol (relations: stands for, expresses)
• General: Agent, Spatial Entity, Temporal Entity, Situation, Container (relations: has as agent role, is contained in)
• Linguistic: Sentence, Phrase, Word, Noun, Linguistic Category, Linguistic Frame (relations: has as dependent, is parsed from)
• ACM: Place, Church, Person, Village, IED, Facility, .... (relations: is located in)

[Figure: "Our" Semiotic Triangle, based on the original [Ogden, C. K. and Richards, I. A. (1923). ]. Corners: meaning, symbol, thing. Edges: the symbol stands for the thing; the symbol expresses the meaning; the meaning conceptualises the thing.]


[4]

Current NL Processing

[Figure: current NL processing pipeline. Components: SYNCOIN Reports, Message PreProcessor, Stanford Parser, Entity Extractor, Situation Extractor, CE Aggregator, CE Store (for analysis). Other labels: Names; Proper Nouns (places, units); Conceptual Model (concepts, logical rules, linguistic expression); "Stylistic" CE.]

Our focus is on the semantics of the conceptual model


[5]

Meta facts – the hard stuff!

• Conceptualise statements:
    conceptualise a ~ person ~ P.
    conceptualise the person P ~ is married to ~ the person P1.
• can be written in "meta facts", about the concepts themselves:
    there is an entity concept named 'person'.
    there is a relation concept named 'is married to' that has the entity concept 'person' as domain and has the entity concept 'person' as range.
  [Figure: the relation concept 'is married to' has as domain, and has as range, the entity concept 'person'.]
• These meta facts can be used to talk about the concepts:
    the conceptual model m1 contains the entity concept 'person'.
    the relation concept 'is married to' is a symmetric relation.
• or to map between words and concepts:
    the verb '|marry|' expresses the relation concept 'is married to'.


[6]

Meta facts and object facts

• Most "normal" facts are not meta facts:
    the person John is married to the person Jane.
• Sometimes we need to bridge the world of meta facts and "normal" (object) facts

[Figure: three groups of facts linked by a "magic mapping".
  words in a sentence: the noun phrase has the noun |person| as head and |John| as dependent. the verb phrase has the verb |marry| as head.
  meta facts about the things and relations that exist: the relation concept 'is married to' has .... (What do we put in here?)
  object facts about the things and relations that exist: the person John is married to the person Jane.]


[7]

Meta facts – Mapping entities

Meta rule:
    if ( the thing T realises the entity concept EC )
    then ( the thing T is a < EC > )

Meta level fact: the thing John realises the entity concept 'person'.
Object level fact (via the magic mapping): the thing John is a person.
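The entity meta rule can be sketched in a few lines of Python. This is only an illustration: the tuple encoding of facts and the function name `map_entities` are assumptions, not the CE store's real format or API.

```python
# Facts are (predicate, subject, object) tuples; this encoding is an
# assumption made for illustration, not the CE store's real format.
meta_facts = {("realises", "John", "person")}

def map_entities(facts):
    """Meta rule: T realises the entity concept EC  =>  T is a <EC>."""
    return {("is a", thing, concept)
            for (pred, thing, concept) in facts
            if pred == "realises"}

object_facts = map_entities(meta_facts)
```

Here the "magic" is simply that the concept name moves from the object position of a meta fact into the type position of an object fact.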


[8]

Meta facts – Mapping relations

Meta rule:
    if ( the relation concept RC has the sequence ( the thing T , and the thing T2 ) as relation realisation )
    then ( the thing T < RC > the thing T2 )

Meta level fact: the relation concept 'is married to' has the sequence ( the person John , and the person Jane ) as relation realisation.
Object level fact (via the magic mapping): the person John is married to the person Jane.

[Figure: the sequence holds the person John at position "1" and the person Jane at position "2".]


[9]

Meta facts – Mapping attributes

Meta rule:
    if ( the attribute concept AC has the sequence ( the thing T , and the thing T2 ) as attribute realisation )
    then ( the thing T has the thing T2 as < AC > )

Meta level fact: the attribute concept 'sister' has the sequence ( the person John , and the person Jane ) as attribute realisation.
Object level fact (via the magic mapping): the person John has the person Jane as sister.

[Figure: the sequence holds the person John at position "1" and the person Jane at position "2".]
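The relation and attribute mappings differ only in the shape of the object-level sentence they produce. A minimal sketch, assuming a hypothetical helper `realise` that renders a two-element realisation sequence as CE text:

```python
def realise(kind, concept, sequence):
    """Turn a two-element realisation sequence into an object-level CE sentence."""
    first, second = sequence
    if kind == "relation":      # the thing T <RC> the thing T2
        return f"{first} {concept} {second}."
    if kind == "attribute":     # the thing T has the thing T2 as <AC>
        return f"{first} has {second} as {concept}."
    raise ValueError(f"unknown concept kind: {kind}")

pair = ("the person John", "the person Jane")
married = realise("relation", "is married to", pair)
sister = realise("attribute", "sister", pair)
```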


[10]

Add meta syntax to CE rules?

if ...
then ( the thing T < RC > the thing T2 ).

if ( the word W expresses the entity concept EC ) and ....
then ( the thing T is a < EC > ).

if ...
then ( the thing T has the thing T2 as < AC > ).

Magic mapping occurs in the rule interpreter (need to define semantics)


[11]

NP processing logic


[12]

ENTITIES


[13]

Logic of Entities

[ nn_cat_ent_1 ]
if ( the noun phrase NP has the noun N as head and stands for the thing T ) and ( the noun N expresses the entity concept EC )
then ( the thing T realises the entity concept EC ).

Example: "the patrol in East Rashid discovers the facility."
    the noun phrase np1 has the noun |patrol| as head and stands for the thing s1.
    the noun |patrol| expresses the entity concept 'patrol unit'. (from the Analyst's helper)
    so the thing s1 realises the entity concept 'patrol unit', i.e. the thing s1 is a patrol unit.
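As a rough illustration, [ nn_cat_ent_1 ] is a three-way join over parse facts and "expresses" facts. The tuple encoding and the identifiers below are assumptions chosen to mirror the example sentence.

```python
# Hypothetical tuple encoding of the facts behind "the patrol in East Rashid ...".
facts = {
    ("has as head", "np1", "patrol"),
    ("stands for", "np1", "s1"),
    ("expresses", "patrol", "patrol unit"),
}

def nn_cat_ent_1(facts):
    """Head noun expresses EC and the phrase stands for T  =>  T realises EC."""
    heads = [(np, n) for (p, np, n) in facts if p == "has as head"]
    stands = [(np, t) for (p, np, t) in facts if p == "stands for"]
    expresses = [(n, ec) for (p, n, ec) in facts if p == "expresses"]
    return {("realises", t, ec)
            for (np, n) in heads
            for (np2, t) in stands if np2 == np
            for (n2, ec) in expresses if n2 == n}

derived = nn_cat_ent_1(facts)
```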


[14]

Proper Names

[ nn_comname ]
if ( the noun phrase NP stands for the thing T and has the proper noun N as proper name head )
then ( the thing T has the value N as common name ).

Example:
    the noun phrase np1 stands for the thing s1 and has the proper noun |East Rashid| as proper name head.
    so the thing s1 has |East Rashid| as common name.

A "common name" defines a "well known" name that may be used when viewing the output CE as the name of the entity (but care is needed, as uniqueness will only hold within a certain context).


[15]

Adjectives

[ nn_cat_ent_2 ]
if ( the noun phrase NP has the word W as dependent and stands for the thing T ) and ( the word W expresses the entity concept EC )
then ( the thing T realises the entity concept EC ).

Example: "the Christian market"
    the noun phrase np1 has the adjective |Christian| as dependent and stands for the thing s1.
    the adjective |Christian| expresses the entity concept 'christian entity'. (from the Analyst's helper)
    so the thing s1 realises the entity concept 'christian entity', i.e. the thing s1 is a christian entity.

Adjectives are handled similarly to nouns, but this currently requires the conceptual model to contain a noun form of the adjective, e.g. "christian entity". Needs more work here.


[16]

Containers

[ nn_prep_in ]
if ( the noun phrase NP has the prepositional phrase PP as dependent and stands for the thing T ) and ( the prepositional phrase PP has the preposition '|in|' as head and has the noun phrase NP1 as object ) and ( the noun phrase NP1 stands for the thing T1 )
then ( the thing T1 is a container ).

[ nn_prep_in_1 ]
if ( the noun phrase NP has the prepositional phrase PP as dependent and stands for the thing T ) and ( the prepositional phrase PP has the preposition '|in|' as head and has the noun phrase NP1 as object ) and ( the noun phrase NP1 stands for the container T1 )
then ( the thing T is contained in the container T1 ).

Example: "the patrol in East Rashid discovers the facility."
    the noun phrase np1 has the prepositional phrase pp1 as dependent and stands for the patrol unit p1.
    the prepositional phrase pp1 has the preposition |in| as head and has the noun phrase np2 as object.
    the noun phrase np2 stands for the thing t2.
    so the thing t2 is a container, and the patrol unit p1 is contained in the container t2.
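The two container rules can be sketched together as one pass over the parse facts. The tuple encoding, the function name `containment` and the one-thing-per-phrase simplification are assumptions for illustration only.

```python
# Hypothetical encoding of the parse of "the patrol in East Rashid".
facts = {
    ("has as dependent", "np1", "pp1"),
    ("stands for", "np1", "p1"),
    ("has as head", "pp1", "in"),
    ("has as object", "pp1", "np2"),
    ("stands for", "np2", "t2"),
}

def containment(facts):
    """Apply nn_prep_in, then nn_prep_in_1, to '<NP> in <NP1>' structures."""
    def pairs(pred):
        return {(a, b) for (p, a, b) in facts if p == pred}
    stands_for = dict(pairs("stands for"))   # assumes one thing per noun phrase
    derived = set()
    for np, pp in pairs("has as dependent"):
        if (pp, "in") not in pairs("has as head"):
            continue                          # only the preposition |in| is handled
        for pp2, np1 in pairs("has as object"):
            if pp2 == pp and np in stands_for and np1 in stands_for:
                # nn_prep_in: the object of |in| is a container
                derived.add(("is a", stands_for[np1], "container"))
                # nn_prep_in_1: the head phrase's thing is contained in it
                derived.add(("is contained in", stands_for[np], stands_for[np1]))
    return derived

derived = containment(facts)
```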


[17]

"Same as" processing

• Two things may be determined to be the same entity, in which case their properties and relations are cross-propagated

A meta statement of propagation:

if ( the thing T is the same as the thing T1 ) and ( the thing T is an < EC > )
then ( the thing T1 is an < EC > ).

if ( the thing T is the same as the thing T1 ) and ( the thing T < RC > the thing T3 )
then ( the thing T1 < RC > the thing T3 ).

if ( the thing T is the same as the thing T1 ) and ( the thing T has V as < AC > )
then ( the thing T1 has V as < AC > ).

"Same as" inference is implemented in the Prolog rule engine, but leads to a large number of inferences; it may need a better implementation.
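A naive fixpoint version of this propagation, written here in Python rather than Prolog, shows why the number of inferences grows: every fact about one thing is copied to each thing it is the same as. The tuple encoding and the symmetric treatment of "is the same as" are assumptions of this sketch.

```python
def propagate_same_as(facts):
    """Copy 'is a', relation and attribute facts across 'is the same as' pairs."""
    same = {(a, b) for (p, a, b) in facts if p == "is the same as"}
    same |= {(b, a) for (a, b) in same}      # assumption: sameness is symmetric
    facts = set(facts)
    changed = True
    while changed:                           # repeat until nothing new is derived
        changed = False
        for (t, t1) in same:
            for (p, a, b) in list(facts):
                if p == "is the same as" or a != t:
                    continue
                new_fact = (p, t1, b)        # same predicate, subject swapped
                if new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts

result = propagate_same_as({
    ("is the same as", "t1", "1234"),
    ("is a", "1234", "place"),
    ("has as common name", "t1", "East Rashid"),
})
```

As in the rules above, only facts with the thing in subject position are propagated.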


[18]

Common Names

if ( there is a thing named T that has the proper noun PN as common name ) and ( there is a thing named T1 that has the proper noun PN as common name ) and ( the thing T # the thing T1 )
then ( the thing T is the same as the thing T1 ).

Things with the same common name are the same thing (this assumes that the common name is unique).

Proper Names: this is the way to identify the entities in noun phrases as already-known places/people/organisations etc., using a pre-existing set of common names:

there is a place named 1234 that has the proper noun |East Rashid| as common name and has '32,33' as coordinates and is located in the place |Afghanistan|..
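One way to picture the common-name rule is as grouping things by name and pairing up the members of each group. The sketch below is an assumption-laden illustration: `same_by_common_name` is a hypothetical helper, and it emits one fact per unordered pair, whereas the rule as written would fire in both directions.

```python
from collections import defaultdict
from itertools import combinations

def same_by_common_name(common_names):
    """common_names: iterable of (thing, proper_noun) pairs."""
    by_name = defaultdict(set)
    for thing, name in common_names:
        by_name[name].add(thing)
    # Distinct things that share a common name are declared the same.
    return {("is the same as", a, b)
            for things in by_name.values()
            for a, b in combinations(sorted(things), 2)}

same = same_by_common_name([
    ("t1", "East Rashid"),      # thing extracted from the sentence
    ("1234", "East Rashid"),    # place from the factbase of names
    ("f1", "the facility"),     # hypothetical extra entry with a unique name
])
```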


[19]

Places

[ place_1 ]
if ( the thing T is contained in the container P ) and ( the container P is a place )
then ( the thing T is located in the place P ).

Example:
    Factbase of Names: there is a place named 1234 that has the proper noun |East Rashid| as common name and has '32,31' as coordinates.
    Extracted fact: the thing t1 has the proper noun |East Rashid| as common name.
    By sameas processing: the thing t1 is a place.
    Given: the patrol unit p1 is contained in the container t1.
    By [ place_1 ]: the patrol unit p1 is located in the place t1.
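The [ place_1 ] rule itself is a simple filter once the sameas step has typed the container as a place. A minimal sketch, with the tuple encoding assumed for illustration:

```python
def place_1(facts):
    """T is contained in P and P is a place  =>  T is located in P."""
    places = {t for (p, t, c) in facts if p == "is a" and c == "place"}
    return {("is located in", t, container)
            for (p, t, container) in facts
            if p == "is contained in" and container in places}

located = place_1({
    ("is a", "t1", "place"),
    ("is contained in", "p1", "t1"),
})
```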


[20]

Specific ACM semantics

[ attack_perp_1 ]
if ( the attack A has the agent A1 as agent role )
then ( the attack A has the agent A1 as perpetrator ).

[ attack_targ ]
if ( the attack A has the thing A1 as patient role )
then ( the attack A has the thing A1 as target ).

[ discovery_finder_1 ]
if ( the discovery D has the agent A1 as agent role )
then ( the discovery D has the agent A1 as finder ).

[ discovery_find ]
if ( the discovery D has the thing A1 as patient role )
then ( the discovery D has the thing A1 as find ).

This needs to be done for each concept, by the analyst (Analyst's helper).

Is this really necessary – could it be handled by meta level rules based on the range and domain of the entity concepts? We would need to define a new relation in the conceptualise?
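The four rules share one shape: a generic role on a given kind of situation is renamed to an ACM-specific role. One possible way to express that shape, offered only as a sketch and not as the approach the slides settle on, is a lookup table; `ROLE_NAMES` and `acm_roles` are hypothetical names.

```python
# (situation concept, generic role) -> ACM-specific role, taken from the four rules.
ROLE_NAMES = {
    ("attack", "agent role"): "perpetrator",
    ("attack", "patient role"): "target",
    ("discovery", "agent role"): "finder",
    ("discovery", "patient role"): "find",
}

def acm_roles(situation_concept, role_facts):
    """role_facts: iterable of (situation, generic_role, filler) triples."""
    return {(situation, ROLE_NAMES[(situation_concept, role)], filler)
            for (situation, role, filler) in role_facts
            if (situation_concept, role) in ROLE_NAMES}

specific = acm_roles("discovery", [
    ("s1", "agent role", "p1"),
    ("s1", "patient role", "f1"),
])
```

The table still has to be filled in per concept by the analyst, so this does not answer the question of whether domain and range information alone would be enough.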


[21]

SITUATIONS


[22]

Logic of situations

[ vb_sit ]
if ( the verb phrase VB stands for the thing T )
then ( the thing T is a situation ).

[ vb_cat_ent ]
if ( the verb phrase VB has the verb PT as head and stands for the situation T ) and ( the verb PT expresses the relation concept RC )
then ( the situation T is viewed relationally as the relation concept RC ).

Example: "the patrol in East Rashid discovers the facility."
    the verb phrase v1 has the verb |discover| as head and stands for the thing s1.
    the verb |discover| expresses the relation concept 'finds'. (from the Analyst's helper)
    so the thing s1 is a situation, and the situation s1 is viewed relationally as the relation concept 'finds'.

"finds" is really a relation but the situation is an entity, so we need to reconcile these views, hence "is viewed relationally".


[23]

Logic of situations (2)

[ gen_reify ]
if ( the situation S is viewed relationally as the relation concept RC ) and ( the entity concept EC reifies the relation concept RC )
then ( the situation S realises the entity concept EC ).

Example: "the patrol in East Rashid discovers the facility."
    the verb phrase v1 has the verb |discovers| as head and stands for the situation s1.
    the verb |discovers| expresses the relation concept 'finds'. (from the Analyst's helper)
    the situation s1 is viewed relationally as the relation concept 'finds'.
    the entity concept 'discovery' reifies the relation concept 'finds'. (from the Analyst's helper)
    so the situation s1 realises the entity concept 'discovery', i.e. the situation s1 is a discovery.
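The chain from verb phrase to reified entity concept ([ vb_sit ], [ vb_cat_ent ], [ gen_reify ]) can be sketched as nested lookups. The tuple encoding is an assumption, and the input is assumed to contain verb-phrase facts only, since the encoding carries no phrase types.

```python
facts = {
    ("has as head", "v1", "discovers"),
    ("stands for", "v1", "s1"),
    ("expresses", "discovers", "finds"),   # verb -> relation concept
    ("reifies", "discovery", "finds"),     # entity concept -> relation concept
}

def situations(facts):
    def pairs(pred):
        return {(a, b) for (p, a, b) in facts if p == pred}
    derived = set()
    for vp, s in pairs("stands for"):
        derived.add(("is a", s, "situation"))                             # vb_sit
        for vp2, verb in pairs("has as head"):
            if vp2 != vp:
                continue
            for verb2, rc in pairs("expresses"):
                if verb2 != verb:
                    continue
                derived.add(("is viewed relationally as", s, rc))         # vb_cat_ent
                for ec, rc2 in pairs("reifies"):
                    if rc2 == rc:
                        derived.add(("realises", s, ec))                  # gen_reify
    return derived

derived = situations(facts)
```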


[24]

Logic of situations (3)

[Figure: the verb phrase v1 has the verb |finds| as head and stands for the discovery s1; the verb expresses the relation concept 'finds'; the discovery s1 is viewed relationally as 'finds' and realises the entity concept 'discovery', which reifies 'finds'. New on this slide: the discovery s1 has the patrol p1 as agent role and has the facility f1 as patient role.]

[ vb_patient ]
if ( the verb phrase VB has the noun phrase NP as dependent and stands for the situation VBT ) and ( the noun phrase NP stands for the thing NPT )
then ( the situation VBT has the thing NPT as patient role ).

[ vb_agent ]
if ( the sentence phrase SP has the noun phrase NP as dependent and has the verb phrase VB as head ) and ( the noun phrase NP stands for the thing NPT ) and ( the verb phrase VB stands for the situation VBT )
then ( the situation VBT has the thing NPT as agent role ).
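A sketch of the two role rules follows. Because the rules depend on phrase types, the sketch adds a `kinds` dictionary; that dictionary, the tuple encoding and the sentence identifier `sen1` are all assumptions made for illustration.

```python
facts = {
    ("has as head", "sen1", "v1"),
    ("has as dependent", "sen1", "np1"),
    ("has as dependent", "v1", "np2"),
    ("stands for", "v1", "s1"),
    ("stands for", "np1", "p1"),
    ("stands for", "np2", "f1"),
}
kinds = {"sen1": "sentence", "v1": "verb phrase",
         "np1": "noun phrase", "np2": "noun phrase"}

def roles(facts, kinds):
    # Assumes every phrase involved stands for exactly one thing.
    stands_for = {a: b for (p, a, b) in facts if p == "stands for"}
    derived = set()
    for (p, parent, np) in facts:
        if p != "has as dependent" or kinds.get(np) != "noun phrase":
            continue
        if kinds.get(parent) == "verb phrase":                  # vb_patient
            derived.add(("has as patient role", stands_for[parent], stands_for[np]))
        elif kinds.get(parent) == "sentence":                   # vb_agent
            for (p2, sent, vp) in facts:
                if (p2 == "has as head" and sent == parent
                        and kinds.get(vp) == "verb phrase"):
                    derived.add(("has as agent role", stands_for[vp], stands_for[np]))
    return derived

derived = roles(facts, kinds)
```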

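Example: "the patrol in East Rashid discovers the facility."
    the sentence s1 has the noun phrase np1 as dependent and has the verb phrase v1 as head.
    the verb phrase v1 has the noun phrase np2 as dependent.
    the noun phrase np1 stands for the patrol p1.
    the noun phrase np2 stands for the facility f1.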


[25]

Logic of situations (4)

[Figure: the situation diagram from the previous slide for "the patrol in East Rashid discovers the facility.", with the derived object-level facts added: the patrol p1 finds the facility f1, and the patrol p1 is an agent.]

[ gen_relation ]
if ( the situation S has the thing A as agent role and has the thing B as patient role and is viewed relationally as the relation concept RC )
then ( the relation concept RC has the sequence ( the thing A , and the thing B ) as relation instance ).

[ gen-relation-domain ]
if ( the situation S has the thing A as agent role and is viewed relationally as the relation concept RC ) and ( the relation concept RC has the entity concept EC as domain )
then ( the thing A realises the entity concept EC ).

[ gen-relation-range ]
if ( the situation S has the thing B as patient role and is viewed relationally as the relation concept RC ) and ( the relation concept RC has the entity concept EC as range )
then ( the thing B realises the entity concept EC ).
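The three rules can be sketched as one function over a situation record. The record layout and the helper name are assumptions; the domain 'agent' is read off the diagram, and the range 'thing' is a placeholder because the slides do not state the range of 'finds'.

```python
def relation_facts(situation, domains, ranges):
    """situation: dict with 'agent', 'patient' and 'relation' keys (assumed layout)."""
    a, b, rc = situation["agent"], situation["patient"], situation["relation"]
    derived = {("has as relation instance", rc, (a, b))}   # gen_relation
    if rc in domains:
        derived.add(("realises", a, domains[rc]))          # gen-relation-domain
    if rc in ranges:
        derived.add(("realises", b, ranges[rc]))           # gen-relation-range
    return derived

derived = relation_facts(
    {"agent": "p1", "patient": "f1", "relation": "finds"},
    domains={"finds": "agent"},
    ranges={"finds": "thing"},   # placeholder range, not given in the slides
)
```

Feeding the relation instance through the relation meta rule would then give the object-level fact that p1 finds f1.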


[26]

Processing for CNL and NL

• NL processing: rules implemented as standard CE inference engine in agents running against CE store.

• CNL processing: special purpose inference engine, interpreting CE logic in linguistic frames

• SAME CE logic in both


[27]

"Identical" NL and CNL parsers

[Figure: NL Parser and CNL Parser side by side. Labelled resources: lexicon, conceptual model, Reference English Grammar, Semantic Theory. Labelled inputs and outputs: NLP; stylistically expressive CE; basic CE or predicate logic or CE-in-Java. Aims: increase stylistic expressibility of CE; better understanding of linguistics.]


[28]

Linguistic Frame for semantics

there is a linguistic frame named np3 that
    has 'a person' as example and
    defines the noun phrase NP_np3 and
    has the sequence ( the determiner DET_np3 , and the noun COMMON_np3 ) as syntactic pattern and
    is predicated on the thing X and
    has the statement that
        ( the noun COMMON_np3 expresses the entity concept EC_np3 )
    as preconditions and
    has the statement that
        ( the thing X realises the entity concept EC_np3 ) and
        ( the noun phrase NP_np3 stands for the thing X )
    as semantic statement

the word |a| belongs to the linguistic category 'determiner'.
the word |person| is a noun.
the word |person| expresses the entity concept person.

[Figure: the facts above are drawn from the Linguistic Model and the Analyst's Conceptual Model. syntax: determiner + noun form a noun phrase ("a person"). semantics: person(COMMON_np3); v(X), X=COMMON_np3,... where X is the lambda variable.]

We want exactly the same logic here as in the real NL processing (cf. earlier slide on Logic of Entities)
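To make the frame concrete, here is a small sketch of how a frame like np3 might be applied to the words "a person". The dictionaries, the function `apply_np3` and the list-of-words input are assumptions; the actual CNL interpreter is not described at this level in the slides.

```python
LEXICON = {"a": "determiner", "person": "noun"}   # linguistic categories
EXPRESSES = {"person": "person"}                  # word -> entity concept

def apply_np3(words, thing="X"):
    """Match the pattern (determiner, noun) and emit the frame's semantic statements."""
    if [LEXICON.get(w) for w in words] != ["determiner", "noun"]:
        return None                               # syntactic pattern does not match
    noun = words[1]
    if noun not in EXPRESSES:
        return None                               # precondition fails
    return {
        ("realises", thing, EXPRESSES[noun]),
        ("stands for", "NP_np3", thing),
    }

result = apply_np3(["a", "person"])
```

The emitted facts have the same form as the conclusions of the entity rule used in NL processing, which is the point of sharing the logic.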


[29]

CNL semantic processing

[Figure: the sentence "the person John is married to the person Jane" processed as nested truth boxes. Labels: Linguistic Frame for NP with Rules for NP (twice, once per noun phrase); Word Category for Verb with Concept Lookup; Linguistic Frame for VP with Rules for VP; Linguistic Frame for Sentence with Rules for Sentence. NP processing not shown.]

CE facts are passed upwards from box to box. The result is the set of CE facts representing the sentence.


[30]

Analyst Helper

• Provides background linguistic information to NL and CNL parsers, specific to the ACM
• meta information on concepts
  – generated automatically from the ACM
• "expresses" relation between words and concepts
  – only analyst knows what the concepts mean
  – for each concept, ask analyst to say what words express it
  – can use WordNet to make suggestions
• rules to determine ACM specific relations
• Sets of proper names
  – places, people, organisations,...
  – use (assumed unique) common name to identify
• feedback words that have not been recognised when analysing sentences
  – needs interaction between parser and analyst helper
• Basic user interface is needed
  – could be more elaborate if resources available


[31]

What do we need from the analyst's helper?

Analyst's helper:
• the word X expresses the concept Y.
• the entity concept EC reifies the relation concept RC.
• rules, e.g. [ attack_perp_1 ] ?

conceptual model:
• there is an entity concept named EC.
• the relation concept RC has the entity concept EC as domain and the entity concept EC1 as range.
• the attribute concept AC has the entity concept EC as domain and the entity concept EC1 as range.

Proper Names:
• there is a place named 1234 that has the proper noun |East Rashid| as common name and has '32,33' as coordinates...

• Issues:
  – do we name concepts by the user-visible "concept term"? [a detail]
  – the expresses type information seems too simplistic
  – at some stage we need far more detailed background semantics for applying semantic constraints to parsing and for disambiguation


[32]

Flow of information for NLP

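[Figure: flow of information for NLP between the Analyst, the Analyst Helper and the NL parser. Labelled flows: "expresses", conceptual model, Proper Names, meta information, semantic rules. Labelled sources: wordnet/etc and gazetteers etc (each with a "translate" step), ITAnet, MetaModel generator, and a possible semantic rules generator ("generator?"). Feedback from the parser: the word |xxx| is an unrecognised word.]

Actually the CE parser uses the same resources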