
Lecture 13 Information Extraction


Page 1: Lecture 13 Information Extraction

Lecture 13: Information Extraction

Topics: Named Entity Recognition, Relation Detection, Temporal and Event Processing, Template Filling

Readings: Chapter 22

February 27, 2013

CSCE 771 Natural Language Processing

Page 2: Lecture 13 Information Extraction

(Slides throughout from Speech and Language Processing -- Jurafsky and Martin)

Overview

Last Time

Dialogues: human conversations

Today

Slides from Lecture 24: dialogue systems, dialogue manager design

Finite-state, frame-based; initiative: user, system, mixed; VoiceXML; information extraction

Readings: Chapter 24, Chapter 22

Page 3: Lecture 13 Information Extraction


Information Extraction

• Information extraction turns unstructured information buried in texts into structured data

• Extract proper nouns: "named entity recognition"

• Reference resolution
  • named entity mentions
  • pronoun references

• Relation detection and classification

• Event detection and classification

• Temporal analysis

• Template filling

Page 4: Lecture 13 Information Extraction


Template Filling

Example template for "airfare raise"

Page 5: Lecture 13 Information Extraction


Figure 22.1 List of Named Entity Types

Page 6: Lecture 13 Information Extraction


Figure 22.2 Examples of Named Entity Types

Page 7: Lecture 13 Information Extraction


Figure 22.3 Categorical Ambiguities

Page 8: Lecture 13 Information Extraction


Figure 22.4 Categorical Ambiguity

Page 9: Lecture 13 Information Extraction


Figure 22.5 Chunk Parser for Named Entities

Page 10: Lecture 13 Information Extraction


Figure 22.6 Features used in Training NER

Gazetteers: lists of place names
• www.geonames.com
• www.census.gov
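As an illustration (not from the slides): a gazetteer usually enters an NER model as a simple membership feature per token. A minimal sketch with a made-up place list; a real list would be drawn from sources like the sites above.

# Hypothetical gazetteer feature for NER: is this token a known place name?
# The set below is illustrative; real gazetteers contain millions of entries.
GAZETTEER = {"chicago", "dallas", "denver", "liverpool"}

def gazetteer_feature(token: str) -> bool:
    """Binary feature: True if the lowercased token is a listed place name."""
    return token.lower() in GAZETTEER

print(gazetteer_feature("Denver"))   # True
print(gazetteer_feature("Wagner"))   # False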

Page 11: Lecture 13 Information Extraction


Figure 22.7 Selected Shape Features

Page 12: Lecture 13 Information Extraction


Figure 22.8 Feature encoding for NER

Page 13: Lecture 13 Information Extraction


Figure 22.9 NER as sequence labeling

Page 14: Lecture 13 Information Extraction


Figure 22.10 Statistical Sequence Labeling

Page 15: Lecture 13 Information Extraction


Evaluation of Named Entity Recognition Systems

• Recall the terms from information retrieval:
  • Recall = # correctly labeled / total # that should be labeled
  • Precision = # correctly labeled / total # labeled

• F-measure, where β weights the preference:
  • β = 1: balanced
  • β > 1: favors recall
  • β < 1: favors precision

$F = \dfrac{(\beta^2 + 1)\,P\,R}{\beta^2 P + R}$
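A direct transcription of the formula as code (a minimal sketch, not from the slides):

def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F = (beta^2 + 1) * P * R / (beta^2 * P + R)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (b2 + 1) * precision * recall / (b2 * precision + recall)

# beta = 1 balances P and R; beta > 1 favors recall; beta < 1 favors precision.
print(f_measure(0.8, 0.6))            # balanced F1, ~0.686
print(f_measure(0.8, 0.6, beta=2.0))  # recall-weighted, ~0.632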

Page 16: Lecture 13 Information Extraction


NER Performance Revisited

Recall, Precision, F for high-performance systems:

» F ≈ .92 for PERSON and LOCATION, ≈ .84 for ORG

Practical NER: make several passes over the text (a sketch follows the list):

1. Start with the highest-precision rules (maybe at the expense of recall), so that what you get is right

2. Search for substring matches of previously detected names, using probabilistic string-matching metrics (Chapter 19)

3. Apply name lists focused on the domain

4. Use probabilistic sequence-labeling techniques that exploit tags from previous stages
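A minimal sketch of the multi-pass idea; the rule, text, and names below are made up for illustration, and a real system would combine many rules with the probabilistic techniques above.

import re

# Pass 1: a high-precision rule -- an honorific almost always precedes a PERSON.
HIGH_PRECISION = re.compile(r"\b(Mr\.|Ms\.|Dr\.) ([A-Z][a-z]+ [A-Z][a-z]+)")

def pass_one(text):
    """Collect names found by the high-precision rule (low recall)."""
    return {m.group(2) for m in HIGH_PRECISION.finditer(text)}

def pass_two(text, names):
    """Recall pass: label every other occurrence of an already-found name."""
    spans = []
    for name in names:
        spans += [(m.start(), m.end(), "PERSON")
                  for m in re.finditer(re.escape(name), text)]
    return sorted(spans)

text = "Dr. Tim Wagner spoke first. Tim Wagner later declined to comment."
names = pass_one(text)        # {'Tim Wagner'} -- precise but misses bare mentions
print(pass_two(text, names))  # both mentions now labeled PERSON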

Page 17: Lecture 13 Information Extraction


Relation Detection and Classification

• Consider this sample text:

Citing high fuel prices, [ORG United Airlines] said [TIME Friday] it has increased fares by [MONEY $6] per round trip on flights to some cities also served by lower-cost carriers. [ORG American Airlines], a unit of [ORG AMR Corp.], immediately matched the move, spokesman [PERSON Tim Wagner] said. [ORG United Airlines], a unit of [ORG UAL Corp.], said the increase took effect [TIME Thursday] and applies to most routes where it competes against discount carriers, such as [LOC Chicago] to [LOC Dallas] and [LOC Denver] to [LOC San Francisco].

• After identifying named entities, what else can we extract?

• Relations

Page 18: Lecture 13 Information Extraction


Fig 22.11 Example semantic relations

Page 19: Lecture 13 Information Extraction


Figure 22.12 Example Extraction

Page 20: Lecture 13 Information Extraction


Figure 22.13 Supervised Learning Approaches to Relation Analysis

The algorithm is a two-step process:

1. Identify whether a pair of named entities is related

2. Train a classifier to label the relation (a toy sketch follows)
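A sketch of the two-step setup using scikit-learn; the entity pairs, feature names, and relation labels below are invented for illustration, not taken from the slides.

from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Step 1: binary detector -- are these two entities related at all?
detector = make_pipeline(DictVectorizer(), LogisticRegression())
# Step 2: classifier -- which relation holds (trained on related pairs only)
classifier = make_pipeline(DictVectorizer(), LogisticRegression())

pairs = [
    {"types": "ORG-PER", "head1": "airlines", "head2": "wagner"},
    {"types": "ORG-LOC", "head1": "delta",    "head2": "laguardia"},
    {"types": "LOC-LOC", "head1": "chicago",  "head2": "dallas"},
]
related = [1, 1, 0]                      # gold labels for step 1

detector.fit(pairs, related)
classifier.fit(pairs[:2], ["employs", "has-hub-at"])  # only the related pairs

for p in pairs:
    if detector.predict([p])[0]:         # step 1 gates step 2
        print(p["head1"], classifier.predict([p])[0], p["head2"])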

Page 21: Lecture 13 Information Extraction


Factors Used in Classifying

Features of the named entities:
• named entity types of the two arguments
• concatenation of the two entity types
• headwords of the arguments
• bag-of-words from each of the arguments

Words in the text:
• bag-of-words and bag-of-bigrams
• stemmed versions
• distance between the named entities (in words / intervening named entities)

Syntactic structure (a feature-extraction sketch follows):
• parse-related structures
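A minimal sketch of assembling these factors into a feature dictionary (the feature names and example pair are invented); dictionaries like this feed directly into the DictVectorizer pipeline sketched earlier. Parse-based features are omitted for brevity.

def relation_features(e1, e2, words_between):
    """Feature dict for one candidate entity pair.

    e1, e2: (headword, ne_type) tuples; words_between: tokens between them.
    """
    feats = {
        "type1": e1[1], "type2": e2[1],
        "type_concat": e1[1] + "-" + e2[1],    # concatenation of entity types
        "head1": e1[0].lower(), "head2": e2[0].lower(),
        "distance": len(words_between),        # distance in words
    }
    for w in words_between:                    # bag of words
        feats["bow=" + w.lower()] = 1
    for a, b in zip(words_between, words_between[1:]):  # bag of bigrams
        feats["bigram=" + a.lower() + "_" + b.lower()] = 1
    return feats

print(relation_features(("Wagner", "PER"), ("Airlines", "ORG"),
                        ["is", "a", "spokesman", "for"]))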

Page 22: Lecture 13 Information Extraction


Figure 22.14 a-part-of relation

Page 23: Lecture 13 Information Extraction


Figure 22.15 Sample Features Extracted

Page 24: Lecture 13 Information Extraction


Bootstrapping Example: "has a hub at"

Consider the pattern

/ * has a hub at * /

Google search:

22.4 Milwaukee-based Midwest has a hub at KCI

22.5 Delta has a hub at LaGuardia

…

Two ways to fail:

1. False positive: e.g., "a star topology has a hub at its center"

2. False negative: the pattern simply misses instances such as

22.11 No-frills rival easyJet, which has established a hub at Liverpool
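A minimal sketch of one bootstrapping pass over a toy corpus (sentences adapted from the examples above; a full system would also induce new patterns from the harvested tuples and iterate):

import re

corpus = [
    "Milwaukee-based Midwest has a hub at KCI.",
    "Delta has a hub at LaGuardia.",
    "A star topology has a hub at its center.",   # the false positive above
]

pattern = re.compile(r"(\w[\w-]*) has a hub at (\w+)")

tuples = {("Midwest", "KCI")}                     # seed tuple
for sent in corpus:
    m = pattern.search(sent)
    if m:
        tuples.add((m.group(1), m.group(2)))

# ('topology', 'its') sneaks in -- exactly why patterns need NE-type checks
print(tuples)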

Page 25: Lecture 13 Information Extraction


Figure 22.16 Bootstrapping Relation Extraction

Page 26: Lecture 13 Information Extraction


Using Features to Restrict Patterns

22.13 Budget airline Ryanair, which uses Charleroi as a hub, scrapped all weekend flights

/ [ORG], which uses [LOC] as a hub /
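A sketch of enforcing that restriction over NE-tagged text; the inline [TYPE ...] format mirrors the bracketed examples on the earlier slides, and the exact pattern syntax is illustrative.

import re

tagged = ("Budget airline [ORG Ryanair] , which uses [LOC Charleroi] "
          "as a hub , scrapped all weekend flights")

# Only fires when the slot fillers carry the right NE types
pattern = re.compile(r"\[ORG ([^\]]+)\] , which uses \[LOC ([^\]]+)\] as a hub")

m = pattern.search(tagged)
if m:
    print(m.group(1), "has-hub-at", m.group(2))   # Ryanair has-hub-at Charleroi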

Page 27: Lecture 13 Information Extraction


Semantic Drift

Note that it will be difficult (or impossible) to get annotated materials for training.

The accuracy of the process is heavily dependent on the initial seeds.

Semantic drift occurs when erroneous patterns (seeds) lead to the introduction of erroneous tuples.

Page 28: Lecture 13 Information Extraction


Fig 22.17 Temporal and Durational Expressions

Absolute temporal expressions

Relative temporal expressions

Page 29: Lecture 13 Information Extraction


Fig 22.18 Temporal lexical triggers

Page 30: Lecture 13 Information Extraction


Fig 22.19 MITRE's TempEx tagger (Perl)

Page 31: Lecture 13 Information Extraction


Fig 22.20 Features used to train IOB

Page 32: Lecture 13 Information Extraction


Figure 22.21 TimeML temporal markup

Page 33: Lecture 13 Information Extraction


Temporal Normalization

ISO 8601: the standard for encoding temporal values

YYYY-MM-DD
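A minimal sketch of normalizing to this format with the standard library; the handling of "yesterday" shows why relative expressions need an anchor date (here, a hypothetical document date):

from datetime import datetime, timedelta

def normalize(expr: str, anchor: datetime) -> str:
    """Map a temporal expression to an ISO 8601 YYYY-MM-DD value."""
    if expr == "yesterday":                          # relative: resolve via anchor
        return (anchor - timedelta(days=1)).strftime("%Y-%m-%d")
    return datetime.strptime(expr, "%B %d, %Y").strftime("%Y-%m-%d")

doc_date = datetime(2013, 2, 27)                     # the document's date
print(normalize("February 27, 2013", doc_date))      # 2013-02-27
print(normalize("yesterday", doc_date))              # 2013-02-26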

Page 34: Lecture 13 Information Extraction


Figure 22.22 Sample ISO Patterns

Page 35: Lecture 13 Information Extraction


Event Detection and Analysis

Event detection and classification

Page 36: Lecture 13 Information Extraction


Fig 22.23 Features for Event Detection

Features used in rule-based and statistical techniques

Page 37: Lecture 13 Information Extraction


Fig 22.24 Allen's 13 Temporal Relations
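For reference, a minimal sketch of three of the thirteen relations as predicates over (start, end) intervals on a shared timeline:

def before(a, b):    return a[1] < b[0]           # A ends before B starts
def meets(a, b):     return a[1] == b[0]          # A's end is exactly B's start
def overlaps(a, b):  return a[0] < b[0] < a[1] < b[1]

flight, meeting = (1, 4), (4, 6)
print(before(flight, meeting), meets(flight, meeting))   # False True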

Page 38: Lecture 13 Information Extraction


Figure 22.24 continued

Page 39: Lecture 13 Information Extraction


Figure 22.25 Example from the TimeBank Corpus

Page 40: Lecture 13 Information Extraction


Template Filling

Page 41: Lecture 13 Information Extraction


Figure 22.26 Templates produced by FASTUS (1997)

Page 42: Lecture 13 Information Extraction


Figure 22.27 Levels of processing in FASTUS

Page 43: Lecture 13 Information Extraction


Figure 22.28 FASTUS Stage 2

Page 44: Lecture 13 Information Extraction


Figure 22.29 The 5 Partial Templates of FASTUS

Page 45: Lecture 13 Information Extraction


Figure 22.30 Articles in PubMed

Page 46: Lecture 13 Information Extraction


Figure 22.31 Biomedical classes of named entities