OntoSoar: Feeding a Growing Ontology CS 652 Information Extraction and Integration Fall 2012 Peter Lindes pl 12/4/2012 OntoSoar 1


OntoSoar 1

OntoSoar: Feeding a Growing Ontology

CS 652 Information Extraction and Integration, Fall 2012

Peter Lindes

pl 12/4/2012

OntoSoar 2

Project Goals

• Use linguistic technologies to:
  – Find more facts
  – Learn new categories and relations
• Technologies to be used:
  – OntoES
  – Link Grammar Parser
  – LG-Soar
  – Soar
  – Discourse Representation Theory

OntoSoar 3

What is OntoES?

• A system for building OSM models
• Capable of representing extraction ontologies
• Processes text to extract facts

OntoSoar 6

What is Soar?

• A cognitive architecture
• A system that implements that architecture
• Major elements:
  – Short- and long-term memories
  – Decision procedure
  – Perception and action modules
  – Various kinds of learning
• Example applications:
  – TacAirSoar
  – BOLT Project

OntoSoar 11

OntoSoar

OntoSoar 12

Raw OCR’d Text Example

OntoSoar 13

Segmented Text

243314. Charles Christopher Lathrop, N. Y. City, b. 1817, d. 1865, son of Mary Ely and Gerard Lathrop ;
m. 1856, Mary Augusta Andruss, 992 Broad St., Newark, N. J., who was b. 1825, dau. of Judge Caleb Halstead Andruss and Emma Sutherland Goble.
Mrs. Lathrop died at her home, 992 Broad St., Newark, N. J., Friday morning, Nov. 4, 1898.
The funeral services were held at her residence on Monday, Nov. 7, 1898, at half-past two o'clock P. M.
Their children:
1. Charles Halstead, b. 1857, d. 1861.
2. William Gerard, b. 1858, d. 1861.
3. Theodore Andruss, b. i860.
4. Emma Goble, b. 1862.
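A segment such as "1. Charles Halstead, b. 1857, d. 1861." has to become a full sentence before a grammar-based parser can handle it. The slides do not show how this expansion is done, so the following is only a minimal Python sketch under stated assumptions: the `ABBREVIATIONS` table and the `expand_fragment` helper are hypothetical names, and the rule set covers only the `b.` and `d.` abbreviations seen in the children list.

```python
import re

# Hypothetical abbreviation table; the real rule set is not given in the slides.
ABBREVIATIONS = {
    "b.": "was born in",
    "d.": "died in",
}


def expand_fragment(fragment: str) -> str:
    """Expand a terse list entry into a plain English sentence."""
    # Drop a leading list number such as "1. ".
    text = re.sub(r"^\d+\.\s*", "", fragment.strip())
    # Drop the trailing period so it can be re-added exactly once.
    text = text.rstrip(".").strip()
    # The first comma-separated part is taken to be the name.
    name, *clauses = [part.strip() for part in text.split(",")]
    expanded = []
    for clause in clauses:
        abbrev, _, rest = clause.partition(" ")
        phrase = ABBREVIATIONS.get(abbrev)
        # Unknown clauses are passed through unchanged.
        expanded.append(f"{phrase} {rest}" if phrase else clause)
    if not expanded:
        return f"{name}."
    return f"{name} {' and '.join(expanded)}."


print(expand_fragment("1. Charles Halstead, b. 1857, d. 1861."))
```

For the first child entry this prints "Charles Halstead was born in 1857 and died in 1861." Entries with embedded commas or abbreviations such as "N. Y." would need more careful handling than this sketch provides.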

OntoSoar 14

Parsing and Semantics

    +-------------------------------------Xp-------------------------------------+
    |                   +---------------Ss---------------+                       |
    +---------Wd--------+       +----------VJlsi---------+------MVp------+       |
    |         +----G----+       +---Pa--+-MVp-+-IN-+     +--VJrsi-+      +-IN-+  |
    |         |         |       |       |     |    |     |        |      |    |  |
LEFT-WALL Charles.b Halstead was.v-d born.a in.r 1857 and.j-v died.v-d in.r 1861 .

1. Charles Halstead, b. 1857, d. 1861.

Charles Halstead was born in 1857 and died in 1861.

Predicates:
person(P1)
named(P1, "Charles Halstead")
born(P1, "1857")
died(P1, "1861")

Extracted facts:
Person(P1)
Person_Name(P1, "Charles Halstead")
Person_BirthDate(P1, "1857")
Person_DeathDate(P1, "1861")
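For this simple sentence, each predicate corresponds to exactly one extracted fact with the same arguments. The sketch below illustrates that step in Python. The `PREDICATE_TO_FACT` table and the `to_facts` function are assumptions made for illustration; the slides do not describe how OntoSoar stores or applies this mapping.

```python
# Hypothetical rename table, read off the predicate/fact pairs on this slide.
PREDICATE_TO_FACT = {
    "person": "Person",
    "named": "Person_Name",
    "born": "Person_BirthDate",
    "died": "Person_DeathDate",
}


def to_facts(predicates):
    """Rename each known predicate; silently skip unknown ones."""
    facts = []
    for name, *args in predicates:
        fact_name = PREDICATE_TO_FACT.get(name)
        if fact_name is not None:
            facts.append((fact_name, *args))
    return facts


predicates = [
    ("person", "P1"),
    ("named", "P1", "Charles Halstead"),
    ("born", "P1", "1857"),
    ("died", "P1", "1861"),
]

for fact in to_facts(predicates):
    print(fact)
```

A flat rename table is enough only because every predicate here concerns a person and maps to a single fact.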

OntoSoar 15

More Complex Parsing

Charles Christopher Lathrop, N. Y. City, was born in 1817 and died in 1865 and was the son of Mary Ely and Gerard Lathrop ;

                           +-----------------------Ss-------------------
                           +------MXs-----+
                           |    +----Xd---+       +----------VJlsi------
    +-----G-----+-----G----+    |  +-G+-G-+Xc+    +---Pv--+-MVp-+-IN-+
    |           |          |    |  |  |   |  |    |       |     |    |
Charles.b Christopher.b Lathrop , N. Y. City , was.v-d born.v in.r 1817

---+
   +-----------VJrsi----------+
---+        +------VJlsi------+       +----Ost---+    +--------Ju-------+----
   |        +--MVp-+-IN-+     +-VJrsi-+     +-Ds-+-Mp-+    +--G--+-SJls-+
   |        |      |    |     |       |     |    |    |    |     |      |
and.j-v died.v-d in.r 1865 and.j-v was.v-d the son.n of Mary.b Ely.m and.j-n

--SJrs------+
    +---G---+
    |       |
Gerard.m Lathrop [;]

OntoSoar 16

More Complex Semantics

Predicates:
person(P2)
named(P2, "Charles Christopher Lathrop")
place(GE1)
named(GE1, "N. Y. City")
livedIn(P2, GE1)
born(P2, "1817")
died(P2, "1865")
person(P3)
named(P3, "Mary Ely")
son(P2, P3)
person(P4)
named(P4, "Gerard Lathrop")
son(P2, P4)
couple(P3, P4)

Extracted facts:
Person(P2)
Person(P3)
Person(P4)
Person_Name(P2, "Charles Christopher Lathrop")
Person_Name(P3, "Mary Ely")
Person_Name(P4, "Gerard Lathrop")
Person_BirthDate(P2, "1817")
Person_DeathDate(P2, "1865")
Parent_has_Child(P3, P2)
Parent_has_Child(P4, P2)
Male(P2)
Parent_with_Parent(P3, P4)
GeoEntity(GE1)
GeoEntity_Name(GE1, "N. Y. City")
Person_livedIn_GeoEntity(P2, GE1)
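In this example a one-to-one rename is no longer enough: `named` becomes `Person_Name` or `GeoEntity_Name` depending on the entity's type, and `son(child, parent)` yields two facts, `Parent_has_Child(parent, child)` and `Male(child)`. The Python sketch below shows one way to express such rules. All names in it (`TYPE_FACTS`, `SIMPLE_FACTS`, `convert`) are hypothetical, and the rules are inferred only from the predicate and fact lists on this slide, not from the actual OntoSoar implementation.

```python
# Hypothetical rule tables inferred from this slide's example.
TYPE_FACTS = {"person": "Person", "place": "GeoEntity"}
SIMPLE_FACTS = {
    "born": "Person_BirthDate",
    "died": "Person_DeathDate",
    "livedIn": "Person_livedIn_GeoEntity",
    "couple": "Parent_with_Parent",
}


def convert(predicates):
    """Convert predicates to facts, using entity types and derived facts."""
    # First pass: remember the fact-level type of every typed entity.
    entity_type = {
        args[0]: TYPE_FACTS[name]
        for name, *args in predicates
        if name in TYPE_FACTS
    }
    facts = []

    def add(fact):
        # Keep first occurrences only, so Male(P2) is not emitted twice.
        if fact not in facts:
            facts.append(fact)

    for name, *args in predicates:
        if name in TYPE_FACTS:
            add((TYPE_FACTS[name], args[0]))
        elif name == "named" and args[0] in entity_type:
            add((entity_type[args[0]] + "_Name", *args))
        elif name == "son":
            child, parent = args
            add(("Parent_has_Child", parent, child))
            add(("Male", child))
        elif name in SIMPLE_FACTS:
            add((SIMPLE_FACTS[name], *args))
    return facts


predicates = [
    ("person", "P2"), ("named", "P2", "Charles Christopher Lathrop"),
    ("place", "GE1"), ("named", "GE1", "N. Y. City"),
    ("livedIn", "P2", "GE1"), ("born", "P2", "1817"), ("died", "P2", "1865"),
    ("person", "P3"), ("named", "P3", "Mary Ely"), ("son", "P2", "P3"),
    ("person", "P4"), ("named", "P4", "Gerard Lathrop"), ("son", "P2", "P4"),
    ("couple", "P3", "P4"),
]

for fact in convert(predicates):
    print(fact)
```

The output order follows the predicate order rather than the grouping used on the slide, but the set of facts is the same.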

OntoSoar 17

More Learning

He graduated B. A. from Rensselaer Polytechnic College, Troy, N. Y.

                                                      +--------MXs--------+
         +-----Os----+   +-------------Js-------------+---MXs---+   +--Xd-+
 +---Ss--+        +-G+-Mp+       +-----G----+----G----+    +-Xd-+Xca+  +-G+
 |       |        |  |   |       |          |         |    |    |   |  |  |
he graduated.v-d B. A. from Rensselaer Polytechnic College , Troy.b , N. Y.

Predicates:
pro3SingMasc(X1)
institution(I1)
named(I1, "Rensselaer Polytechnic College")
graduatedFrom(X1, I1)
person(X1)
place(GE2)
named(GE2, "Troy, N. Y.")
locatedIn(I1, GE2)

person(P5)
named(P5, "Gardner Bullard")
sameAs(X1, P5)

Extracted facts:
Person(P5)
GeoEntity(GE2)
Person_Name(P5, "Gardner Bullard")
GeoEntity_Name(GE2, "Troy, N. Y.")
Male(P5)
Institution(I1)
Institution_Name(I1, "Rensselaer Polytechnic College")
Person_graduatedFrom_Institution(P5, I1)
Institution_locatedIn_GeoEntity(I1, GE2)
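Once the pronoun referent `X1` has been equated with `P5` through `sameAs(X1, P5)`, every predicate about `X1` can be rewritten in terms of `P5`. The Python sketch below shows only that substitution step. It does not model how Discourse Representation Theory selects the antecedent, and the function name `resolve_same_as` is an assumption for illustration.

```python
def resolve_same_as(predicates):
    """Apply sameAs(alias, target) equations and drop duplicates."""
    # Collect alias -> target pairs from the sameAs predicates.
    aliases = {
        args[0]: args[1]
        for name, *args in predicates
        if name == "sameAs"
    }
    resolved = []
    for name, *args in predicates:
        if name == "sameAs":
            continue
        # Simplification: any argument equal to an alias is replaced,
        # including a string literal that happened to spell the same id.
        rewritten = (name, *[aliases.get(arg, arg) for arg in args])
        if rewritten not in resolved:
            resolved.append(rewritten)
    return resolved


predicates = [
    ("pro3SingMasc", "X1"),
    ("institution", "I1"),
    ("named", "I1", "Rensselaer Polytechnic College"),
    ("graduatedFrom", "X1", "I1"),
    ("person", "X1"),
    ("place", "GE2"),
    ("named", "GE2", "Troy, N. Y."),
    ("locatedIn", "I1", "GE2"),
    ("person", "P5"),
    ("named", "P5", "Gardner Bullard"),
    ("sameAs", "X1", "P5"),
]

for predicate in resolve_same_as(predicates):
    print(predicate)
```

After substitution, `graduatedFrom(P5, I1)` and `pro3SingMasc(P5)` line up with the `Person_graduatedFrom_Institution(P5, I1)` and `Male(P5)` facts listed above.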

OntoSoar 18

Processing Steps

• Gather section of text
• Segment into sentence fragments
• Parse with the LG-Parser
• Build predicates with LG-Soar
• Resolve pronouns using DRT
• Convert predicates to facts
• Match extracted facts against conceptual model
• Record facts that match
• Learn from partial matches
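The last three steps can be pictured as a filter over the extracted facts. The Python sketch below assumes, purely for illustration, that the conceptual model is a dictionary from fact name to argument count. The contents of `model` and the function `match_facts` are hypothetical; the slides do not say which object and relationship sets the ontology already contains, or how learning from partial matches actually works.

```python
def match_facts(facts, model):
    """Split facts into those the model already covers and the rest."""
    recorded, unmatched = [], []
    for fact in facts:
        name, *args = fact
        if model.get(name) == len(args):
            recorded.append(fact)   # matches a known object/relationship set
        else:
            unmatched.append(fact)  # candidate for learning a new one
    return recorded, unmatched


# Hypothetical model: fact name -> number of arguments.
model = {
    "Person": 1,
    "Person_Name": 2,
    "Person_BirthDate": 2,
    "Person_DeathDate": 2,
    "Male": 1,
    "Parent_has_Child": 2,
}

facts = [
    ("Person", "P5"),
    ("Person_Name", "P5", "Gardner Bullard"),
    ("Male", "P5"),
    ("Institution", "I1"),
    ("Person_graduatedFrom_Institution", "P5", "I1"),
]

recorded, unmatched = match_facts(facts, model)
print("recorded:", recorded)
print("to learn from:", unmatched)
```

Under these assumptions, the person facts are recorded, while the institution facts are left over as material from which new categories and relations could be proposed.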