
Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample


Page 1: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Tricks for Statistical Semantic Knowledge Discovery:

A Selectionally Restricted Sample

Marti A. Hearst

UC Berkeley

Page 2: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Goal: Acquire Semantic Information

Page 3: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

► Something on Finin

Page 4: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Tricks I Like

Lots o’ Text

Unambiguous Cues

Rewrite and Verify

Page 5: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Trick: Lots o’ Text

► Idea: words in the same syntactic context are semantically related.

Hindle, ACL ’90, “Noun classification from predicate-argument structure.”
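Hindle’s idea can be sketched with a toy example: nouns that occur as arguments of the same verbs are treated as semantically related. The (verb, noun) pairs below are invented stand-ins for parser output, and plain cosine similarity stands in for the mutual-information measure used in the paper.

```python
# Toy sketch of distributional similarity from predicate-argument pairs.
from collections import Counter
from math import sqrt

# Hypothetical (verb, object-noun) pairs, as if extracted by a parser.
pairs = [
    ("drink", "beer"), ("drink", "wine"), ("drink", "water"),
    ("brew", "beer"), ("brew", "coffee"), ("drink", "coffee"),
    ("drive", "car"), ("park", "car"),
    ("drive", "truck"), ("park", "truck"),
]

def context_vector(noun):
    """Count the verbs that take `noun` as an object."""
    return Counter(v for v, n in pairs if n == noun)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

print(cosine(context_vector("beer"), context_vector("wine")))  # high: shared verb "drink"
print(cosine(context_vector("beer"), context_vector("car")))   # 0.0: no shared verbs
```

With enough text, the same counting over real parses separates beverage nouns from vehicle nouns without any labeled data.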

Page 6: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Trick: Lots o’ Text

► Idea: words in the same syntactic context are semantically related.

Nakov & Hearst, ACL/HLT ’08, “Solving Relational Similarity Problems Using the Web as a Corpus”

Page 7: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Trick: Lots o’ Text

► Idea: bigger is better than smarter!

Banko & Brill, ACL ’01, “Scaling to Very Very Large Corpora for Natural Language Disambiguation”

Page 8: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Trick: Lots o’ Text

► Idea: apply web-scale n-grams to every problem imaginable.

Lapata & Keller, HLT/NAACL ’04, “The Web as a Baseline: Evaluating the Performance of Unsupervised Web-Based Models for a Range of NLP Tasks”

MT candidate selection

Article suggestion

Noun compound interpretation

Noun compound bracketing

Adjective ordering

(web-based counts scored > supervised on some tasks, = supervised on others)
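One of the tasks above, noun compound bracketing, illustrates the web-count recipe directly: compare the frequency of the (w1, w2) bigram against (w2, w3) to choose left vs. right bracketing. This is a minimal sketch of that adjacency model; the counts are made-up stand-ins for web hit counts.

```python
# Toy adjacency model for three-word noun compound bracketing.
# Hypothetical n-gram frequencies standing in for web counts:
counts = {
    ("stem", "cell"): 50000,
    ("cell", "research"): 3000,
    ("liver", "cell"): 800,
    ("cell", "line"): 20000,
}

def bracket(w1, w2, w3):
    """Left bracketing [[w1 w2] w3] if the (w1, w2) bigram outnumbers (w2, w3)."""
    left = counts.get((w1, w2), 0)
    right = counts.get((w2, w3), 0)
    return "left" if left >= right else "right"

print(bracket("stem", "cell", "research"))  # "left":  [[stem cell] research]
print(bracket("liver", "cell", "line"))     # "right": [liver [cell line]]
```

The counts do all the work; no parser or labeled data is involved, which is exactly the appeal of the lots-o’-text trick.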

Page 9: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Limitation

► Sometimes counts alone are too ambiguous.

Solution

► Bootstrap from unambiguous contexts.

Page 10: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Trick: Use Unambiguous Context

► … to build statistics for ambiguous contexts.

Hindle & Rooth, ACL ’91, “Structural Ambiguity and Lexical Relations”

Example: PP attachment

I eat spaghetti with sauce.

Bootstrap from unambiguous contexts:

Spaghetti with sauce is delicious.
I eat with a fork.
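The bootstrapping step can be sketched as follows. The counts are invented stand-ins for statistics harvested from unambiguous sentences (a PP with no competing verb, or no competing noun), and the simple relative-frequency comparison stands in for Hindle & Rooth’s log-likelihood score.

```python
# Toy sketch of PP-attachment disambiguation bootstrapped from
# unambiguous contexts, in the spirit of Hindle & Rooth.
from collections import Counter

noun_prep, verb_prep = Counter(), Counter()
noun_total, verb_total = Counter(), Counter()

# Hypothetical counts from unambiguous sentences:
noun_prep["spaghetti", "with"] = 3   # "Spaghetti with sauce is delicious."
noun_total["spaghetti"] = 3
noun_total["pizza"] = 2              # "pizza" seen, never with "with"
verb_prep["eat", "with"] = 1         # "I eat with a fork."
verb_total["eat"] = 4

def attach(verb, noun, prep):
    """Attach the PP to whichever head is more strongly associated with `prep`."""
    p_noun = noun_prep[noun, prep] / max(noun_total[noun], 1)
    p_verb = verb_prep[verb, prep] / max(verb_total[verb], 1)
    return "noun" if p_noun >= p_verb else "verb"

print(attach("eat", "spaghetti", "with"))  # "noun": sauce modifies spaghetti
print(attach("eat", "pizza", "with"))      # "verb": instrument reading wins
```

The ambiguous "I eat spaghetti with sauce" is then resolved with statistics that never required a hand-disambiguated example.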

Page 11: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Trick: Use Unambiguous Context

► … to identify semantic relations (lexico-syntactic contexts)

Hearst, COLING ’92, “Automatic Acquisition of Hyponyms from Large Text Corpora”

Example: Hyponym Identification
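A minimal sketch of one such lexico-syntactic pattern, "X such as Y, Z, and W", which signals that Y, Z, and W are hyponyms of X. Real systems run these patterns over tagged or parsed text; a plain regex over a raw string is shown only to illustrate the idea.

```python
# Toy extractor for the "such as" Hearst pattern.
import re

# Split a conjoined NP list on commas and "and"/"or".
SEP = re.compile(r",\s*(?:and|or)\s+|,\s*|\s+(?:and|or)\s+")

def hearst_such_as(sentence):
    """Extract (hypernym, hyponym) pairs from 'X such as Y, Z, and W'."""
    m = re.search(r"(\w+)\s+such as\s+(.+)", sentence)
    if not m:
        return []
    hypernym = m.group(1)
    hyponyms = [h.strip() for h in SEP.split(m.group(2).rstrip(".")) if h.strip()]
    return [(hypernym, h) for h in hyponyms]

print(hearst_such_as("injuries such as bruises, wounds, and broken bones."))
# [('injuries', 'bruises'), ('injuries', 'wounds'), ('injuries', 'broken bones')]
```

Because the pattern itself is (nearly) unambiguous, each match yields a high-precision is-a pair without any supervision.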

Page 12: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Combine Tricks 1 and 2

Page 13: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Trick: Use Unambiguous Contexts + Lots o’ Text

► Combine lexico-syntactic patterns with occurrence counts.

Kozareva, Riloff, & Hovy, HLT-ACL ’08, “Semantic Class Learning from the Web with Hyponym Pattern Linkage Graphs”

Page 14: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Trick: Use Unambiguous Contexts + Lots o’ Text

► Combine (usually) unambiguous surface patterns with occurrence counts.

Nakov & Hearst, HLT/EMNLP ’05, “Using the Web as an Implicit Training Set: Application to Structural Ambiguity Resolution”

Left dash: cell-cycle analysis → left
Possessive marker: brain’s stem cell → right
Parentheses: growth factor (beta) → left
Punctuation: health care, provider → left
Abbreviation: tum. necr. (TN) factor → right
Concatenation: healthcare reform → left
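The surface-cue table above can be read as a voting scheme: each (usually) unambiguous pattern observed on the web casts a vote for left or right bracketing of the compound. This sketch mirrors the slide's cue-to-vote table; the observed cue counts are invented.

```python
# Toy majority vote over surface-pattern cues for compound bracketing.
CUE_VOTES = {
    "left_dash": "left",        # e.g. "cell-cycle analysis"
    "possessive": "right",      # e.g. "brain's stem cell"
    "parentheses": "left",      # e.g. "growth factor (beta)"
    "inner_comma": "left",      # e.g. "health care, provider"
    "abbreviation": "right",    # e.g. "tum. necr. (TN) factor"
    "concatenation": "left",    # e.g. "healthcare reform"
}

def vote(observed_cues):
    """Majority vote over cue occurrences observed for one compound."""
    tally = {"left": 0, "right": 0}
    for cue, count in observed_cues.items():
        tally[CUE_VOTES[cue]] += count
    return "left" if tally["left"] >= tally["right"] else "right"

# Hypothetical web snippets found for "brain stem cell":
print(vote({"possessive": 4, "left_dash": 1}))  # "right"
```

Real systems weight cues by reliability rather than counting them equally; equal weights keep the sketch short.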

Page 15: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Trick: Use Unambiguous Contexts + Lots o’ Text

► Identify a “protagonist” in each text to learn narrative structure

Chambers & Jurafsky, ACL ’08, “Unsupervised Learning of Narrative Event Chains”

Page 16: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Trick 3: Rewrite & Verify

Page 17: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Trick: Rewrite & Verify

► Check if alternatives exist in text

Nakov & Hearst, HLT/EMNLP ’05, “Using the Web as an Implicit Training Set: Application to Structural Ambiguity Resolution”

Example: NP bracketing

Prepositional
► stem cells in the brain → right
► stem cells from the brain → right
► cells from the brain stem → left

Verbal
► virus causing human immunodeficiency → left
► pain associated with arthritis migraine → left

Copula
► office building that is a skyscraper → right
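The rewrite-and-verify trick can be sketched as a count comparison: generate paraphrases that are only natural under one bracketing, then check which alternatives actually occur. The frequencies below are invented stand-ins for web counts.

```python
# Toy rewrite-and-verify decision between two bracketings of
# "brain stem cells", using hypothetical paraphrase frequencies.
paraphrase_counts = {
    "cells from the brain stem": 120,   # supports left:  [[brain stem] cells]
    "stem cells in the brain": 2500,    # supports right: [brain [stem cells]]
}

def verify(left_paraphrase, right_paraphrase):
    """Pick the bracketing whose paraphrase is better attested."""
    left = paraphrase_counts.get(left_paraphrase, 0)
    right = paraphrase_counts.get(right_paraphrase, 0)
    return "left" if left > right else "right"

print(verify("cells from the brain stem", "stem cells in the brain"))  # "right"
```

In practice several paraphrase types (prepositional, verbal, copula) each contribute evidence, and their votes are combined.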

Page 18: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Trick: Use Lexical Hierarchies

► To improve generation of pseudo-words for WSD

Nakov & Hearst, HLT/NAACL ’03, “Category-based Pseudo-Words”

► To classify nouns in noun compounds and thus determine the semantic relations between them

Rosario, Hearst, & Fillmore, ACL ’02, “The Descent of Hierarchy, and Selection in Relational Semantics”

► To generate new (faceted) category systems

Stoica, Hearst, & Richardson, NAACL/HLT ’07, “Automating Creation of Hierarchical Faceted Metadata Structures”

Page 19: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Example: Recipes (3500 docs)

Page 20: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Castanet Output (shown in Flamenco)

Page 21: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Castanet Output

Page 22: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Castanet Output

Page 23: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Towards New Approaches to Semantic Analysis

Page 24: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Ideas

► Inducing Semantic Grammars

Boggess, Agarwal, & Davis, AAAI ’91, “Disambiguation of Prepositional Phrases in Automatically Labelled Technical Text”

Page 25: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Ideas

► Use Cognitive Linguistics

Hearst, ’90, ’92, “Direction-Based Text Interpretation”
Talmy’s Force Dynamics + Reddy’s Conduit Metaphor

Path Model
Solves: Was the person in favor of or opposed to the idea?

Page 26: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Using Cognitive Linguistics

► Talmy’s Theory of Force Dynamics

Talmy, “Force Dynamics in Language and Thought,” in Parasession on Causatives and Agentivity, Chicago Linguistic Society, 1985.

Describes how the interaction of agents with respect to force is lexically and grammatically expressed.

Posits two opposing entities: Agonist and Antagonist.
Each entity expresses an intrinsic force: towards rest or motion.
The balance of the strengths of the entities determines the outcome of the event.

► Grammatical expression includes using a clause headed by “despite” to express a weaker antagonist.

Page 27: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Using Cognitive Linguistics

► Reddy’s Conduit Metaphor

Reddy, “The Conduit Metaphor: A Case of Frame Conflict in Our Language about Language,” in Metaphor and Thought, Ortony (Ed.), Cambridge University Press, 1979.

A thought is schematized as an object which is placed by the speaker into a container that is sent along a conduit.

The receiver at the other end is the listener, who removes the objectified thought from the container and thus possesses it.

Inferences that apply to conduits can be applied to communication:
► “Your meaning did not come through.”
► “I can’t put this thought into words.”
► “She is sending you some kind of message with that remark.”

Page 28: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Using Cognitive Linguistics

► Combine into the Path Model

Hearst, “Direction-based Text Interpretation as an Information Access Refinement,” in Text-based Intelligent Systems, Jacobs (Ed.), Lawrence Erlbaum Associates, 1992.

If an agent favors an entity or event, that agent can be said to desire the existence or “well-being” of that entity, and vice versa.

Thus if an agent favors an entity’s triumph in a force-dynamic interaction, then the agent favors that entity or event.

But: force dynamics does not have the expressive power for a sequence.

Instead of focusing on the relative strength of two interacting entities, the model should represent what happens to a single entity through the course of its encounters with other entities.

Thus the entity can be schematized as if it were moving along a path toward some destination or goal.

Page 29: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Using Cognitive Linguistics

► The Path Model

Hearst, “Direction-based Text Interpretation as an Information Access Refinement,” in Text-based Intelligent Systems, Jacobs (Ed.), Lawrence Erlbaum Associates, 1992.

Page 30: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Using Cognitive Linguistics

► The Path Model

Hearst, “Direction-based Text Interpretation as an Information Access Refinement,” in Text-based Intelligent Systems, Jacobs (Ed.), Lawrence Erlbaum Associates, 1992.

Page 31: Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Using Cognitive Linguistics

► The Path Model

Hearst, “Direction-based Text Interpretation as an Information Access Refinement,” in Text-based Intelligent Systems, Jacobs (Ed.), Lawrence Erlbaum Associates, 1992.