PIQUANT Question Answering System
ARDA AQUAINT Program, June Workshop 2002
This work was supported in part by the Advanced Research and Development Activity (ARDA)'s Advanced Question Answering for Intelligence (AQUAINT) Program under contract number MDA904-01-C-0988.
Dave Ferrucci, John Prager, Jennifer Chu-Carroll, Chris Welty, Chris Cesar and Scott Fahlman
IBM Research
Subcontractor: Cycorp
IBM - PIQUANT
Overview
Progress Update
Architecture
QPlans
Working Example
Answer Selection and Resolution
Performance Improvements
Summary
PIQUANT Research Objectives
Integration & impact of knowledge-based systems (e.g., Cyc) in QA
Extensible QA architectures
Declarative question plans
Parallel solution paths and pervasive confidence processing
Deeper linguistic & knowledge-based analysis
Progress Since AQUAINT Kickoff
Architecture Design
  Support for multiple answering agents, solution paths and knowledge sources
  Centralized ontology management & uniform access to knowledge sources
  New question plan modules
Improved Ranking
  Enhanced Answer Selection using deeper linguistic analysis
  Integration of Cyc in Answer Resolution for “sanity checking”
  Integration of multiple knowledge sources
Answering questions previously missed
  Multiple solution paths based on alternative question decomposition
  Integration of Cyc as a knowledge source
Architectural Limitations as of TREC10
Pipeline
  Single Answering Approach
  Limited Extensibility
Single Solution Source
  WordNet added as second-class citizen
No Knowledge System component
  Limited question understanding
  Shallow conceptual map from Q to A
  Limited to explicit matches -- cut off from inferred possibilities
  “Explanations” limited to text passages containing answers
  Can’t filter out crazy answers
Classic Pipeline with WordNet
[Figure: pipeline diagram with components Question, Question Analysis, Answer Classification, Search, Answer Selection, Answer and WordNet; edge labels Text Query, Hit List, Answer Type, WN Query and WN Answer.]
Knowledge Source Services
[Figure: architecture diagram with components Question, Question Analysis, Answer Classification, Search, Answer Selection, Answer Resolution, Answer Justification & Presentation, Answer, WordNet and Cyc; edge labels Text Query, KB Query, Hit List, Answer Type, Text Search Answers, WordNet Answers, Cyc Answers and Answers.]
Answering Agents
[Figure: architecture diagram with components Question, Question Analysis, Answer Classification, Answering Agents, Convert Question to Web Query, Complex Decomposition & Planning, Causality, KS Adaptation Layer, Search, Web, WordNet, Cyc, Answer Selection, Answer Resolution, Answer Justification & Presentation and Answer; edge labels QFrame, QGoals, Hit List and Answers.]
Planning-Based Answering Agent
[Figure: architecture diagram with components Question, Question Analysis, Answer Classification, Answering Agent Selection, Answering Agents, QPlans, Plan Selection, QPlan Execution Eng, QFilter, KS Adaptation Layer, Search, Web, WordNet, Cyc, Answer Selection, Answer Resolution, Answer Justification & Presentation and Answer; edge labels QFrame, QGoals, Hit List, Answer Candidates and Answers.]
QPlans
Plans for attacking different question types
Identifies knowledge sources to use
  Text Search, Cyc, WordNet, …
Specifies preferences, when relevant, of sources
Simple questions have base plans (no recursion)
Complex questions can be broken into sub-plans
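As a rough illustration of what a declarative question plan might hold, the sketch below models a QPlan as plain data: a question type, an ordered list of preferred knowledge sources, and optional sub-plans. The class name, field names and the source ordering are hypothetical and are not taken from the PIQUANT implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QPlan:
    """Hypothetical container for a question plan; not PIQUANT's actual format."""
    question_type: str
    # Knowledge sources to consult, listed in (assumed) order of preference.
    sources: List[str]
    # Empty for a base plan; non-empty when the question is broken into sub-plans.
    sub_plans: List["QPlan"] = field(default_factory=list)

    def is_base_plan(self) -> bool:
        """A base plan has no sub-plans, so executing it needs no recursion."""
        return not self.sub_plans

    def depth(self) -> int:
        """1 for a base plan, larger when sub-plans are nested."""
        return 1 + max((p.depth() for p in self.sub_plans), default=0)


# A simple Property question answered directly from the listed sources.
base_property = QPlan("Property", ["Cyc", "WordNet", "Text Search"])

# A complex Property question that is split into two simpler Property questions.
nested_property = QPlan(
    "Property",
    ["Text Search"],
    sub_plans=[base_property, base_property],
)

print(base_property.is_base_plan(), nested_property.depth())
```

Keeping the plan as data rather than code is one way to make plans easy to add or swap per question type, which is the extensibility goal stated earlier in the deck.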
Sample Question Types
10 identified, 5 with QPlans
When: When was the Battle of Hastings?
Define: What is anorexia nervosa?
Property: What is the population of the capital of Great Britain?
WhatX: What county is Phoenix AZ in?
Super: What is the largest snake in the world?
Mapping Questions to QPlans
[Figure: two example questions, “What is the Declaration of Independence?” and “What is the capital of Great Britain?”, matched against the patterns “What is X?” and “What is the P of X?” and mapped to the question types Define and Property.]
QPlan Example
Ask: “What is the population of the capital of Great Britain?”
Recognize question type: Property
Recognize answer type: NUMBER/POPULATION
Plan
  Text Search: “Population of the capital of Great Britain”
  PA Search: “The capital of Great Britain” and (NUMBER$ or POPULATION$)
  Cyc, DB and WordNet queries
  Decomposition
    For each answer, A, to “What is the capital of Great Britain?”
      Ask: “What is the population of” A
    Each element of the decomposition may be answered by different knowledge sources (e.g., Cyc, WordNet, etc.).
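The two search queries in the plan above can be assembled mechanically from the parsed question. The following is a minimal sketch, assuming the question has already been split into a property, an entity phrase and a list of answer-type labels; the function name and the dictionary keys are hypothetical, and the query syntax is simply copied from the slide.

```python
from typing import Dict, List


def build_property_queries(prop: str, entity: str, answer_types: List[str]) -> Dict[str, str]:
    """Build query strings for a Property question 'What is the <prop> of <entity>?'."""
    # Text search: the question restated as a phrase.
    text_query = f"{prop.capitalize()} of {entity}"
    # PA search: the entity phrase combined with any of the answer-type labels,
    # each written with the trailing '$' used on the slide.
    type_clause = " or ".join(f"{t}$" for t in answer_types)
    entity_phrase = entity[0].upper() + entity[1:]
    pa_query = f'"{entity_phrase}" and ({type_clause})'
    return {"text_search": text_query, "pa_search": pa_query}


queries = build_property_queries(
    "population", "the capital of Great Britain", ["NUMBER", "POPULATION"]
)
print(queries["text_search"])
print(queries["pa_search"])
```

The Cyc, DB and WordNet queries and the decomposition step are left out here, since the deck does not show their query formats.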
Our TREC10 System vs. PIQUANT
[Figure: the TREC10 system sends “What is the population of the capital of Tajikistan?” to Text Search and returns 5.3 Million. Wrong! PIQUANT decomposes the same question into “What is the capital of Tajikistan?” (Cyc: X = Dushanbe) and “What is the population of X?”, then poses “What is the population of Dushanbe?” to Text Search and Cyc; the results shown are 460,000 and nil. Right!]
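The decomposition in this example can be sketched as a loop over the answers to the inner question. The code below is illustrative only: knowledge sources are stubbed as lookup tables, all function names are hypothetical, and the assignment of the 460,000 answer to the text-search stub is an assumption, since the slide residue does not make clear which source returned it.

```python
from typing import Callable, Dict, List, Optional

# A knowledge source maps a question string to an answer, or None when it has none.
Source = Callable[[str], Optional[str]]


def make_stub_source(facts: Dict[str, str]) -> Source:
    """Build a toy source backed by a lookup table (stand-in for Cyc or text search)."""
    return lambda question: facts.get(question)


def ask_all(question: str, sources: Dict[str, Source]) -> List[str]:
    """Ask every source and collect the distinct non-empty answers, in source order."""
    answers: List[str] = []
    for source in sources.values():
        answer = source(question)
        if answer is not None and answer not in answers:
            answers.append(answer)
    return answers


def answer_nested_property(prop: str, inner_question: str, sources: Dict[str, Source]) -> List[str]:
    """For each answer A to the inner question, ask 'What is the <prop> of A?'."""
    results: List[str] = []
    for a in ask_all(inner_question, sources):
        results.extend(ask_all(f"What is the {prop} of {a}?", sources))
    return results


# Stub contents follow the slide's example; which source holds which fact is assumed.
sources = {
    "cyc": make_stub_source({"What is the capital of Tajikistan?": "Dushanbe"}),
    "text_search": make_stub_source({"What is the population of Dushanbe?": "460,000"}),
}
print(answer_nested_property("population", "What is the capital of Tajikistan?", sources))
```

Because each sub-question is sent to every source, a source that returns nothing for one step (here the Cyc stub on the population question) does not block the overall answer.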
PIQUANT Architecture
[Figure: PIQUANT architecture diagram, repeated from the Planning-Based Answering Agent slide.]
Enhance Answer Resolution/Selection
Deeper linguistic analysis
  Identifying and matching answer type
    Named-Entity Tagger
  Matching syntactic relationships between Q and A
    Deep Parser
Multiple knowledge sources to reinforce answers
  Encyclopedia Britannica
“Crazy Answer” Elimination
  Using Cyc
Deeper Linguistic Analysis In Answer Selection
Input
  Passages (typically 10) returned by the search engine
    Candidate passages for question: What is the capital of England?
    “Shaykh Salim Sabah al-Salim continued his talks today with high-ranking officials in the British capital, London.”
    “BRISTOL, capital of south-west England, holds a peculiar fascination for psephologists.”
  Semantic type(s) of answer sought
Process
  Identify candidate answers using a semantic-based named-entity tagger
    “<PERSON>Shaykh Salim Sabah al-Salim</PERSON> continued his talks <DATE>today</DATE> with <ROLE>high-ranking officials</ROLE> in the British capital, <CAPITAL>London</CAPITAL>.”
  Rank candidate answers based on pre-identified features
[Figure: Answer Selection, with inputs Hit List (Passages) and Answer type and output Answers & Ranks.]
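Once a passage carries annotations like the ones in the example above, picking out candidates of the sought semantic type is a simple filter. The sketch below assumes the inline `<TYPE>text</TYPE>` notation shown on the slide is available as a string; the function names are hypothetical, and the ranking step is omitted because the deck does not list the features used.

```python
import re
from typing import List, Tuple

# Matches one annotation of the form <TYPE>text</TYPE>.
TAG_PATTERN = re.compile(r"<([A-Z]+)>(.*?)</\1>")


def tagged_entities(passage: str) -> List[Tuple[str, str]]:
    """Return (type, text) pairs for every annotation in the passage."""
    return [(m.group(1), m.group(2)) for m in TAG_PATTERN.finditer(passage)]


def candidates_of_type(passage: str, answer_type: str) -> List[str]:
    """Keep only the annotated spans whose type matches the sought answer type."""
    return [text for tag, text in tagged_entities(passage) if tag == answer_type]


passage = (
    "<PERSON>Shaykh Salim Sabah al-Salim</PERSON> continued his talks "
    "<DATE>today</DATE> with <ROLE>high-ranking officials</ROLE> "
    "in the British capital, <CAPITAL>London</CAPITAL>."
)
print(candidates_of_type(passage, "CAPITAL"))
```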
Multiple Knowledge Sources
[Figure: architecture diagram with components Question, Question Analysis, Answer Classification, Search, Answer Selection, Answer Resolution, Answer Justification & Presentation, Answer, WordNet and Cyc; Search is shown with two sources, TREC with PA Index and EB with PA Index; edge labels Text Query, KB Query, Hit List, Answer Type, Text Search Answers, WordNet Answers, Cyc Answers and Answers.]
Substantiating answers with multiple sources increases confidence
  TREC Corpus + Encyclopedia Britannica
  Found previously missed answers
  Improved rank of previously found answers
PIQUANT Architecture
[Figure: PIQUANT architecture diagram, repeated from the Planning-Based Answering Agent slide.]
“Crazy Answer” Elimination
Semantic type mismatch
  Examples
    What city in Florida is Sea World in?
      London, San Diego, Tulsa
    Who was Charles Lindbergh’s wife?
      Babe Ruth, Jack Dempsey
  Issue
    Need to determine if an ISA relationship is possible between two entities
Unreasonable numerical ranges
  Examples
    What is the weight of a wolf?
      300 tons
    How many states have a lottery?
      600, 203
    How big is our galaxy in diameter?
      14 feet, 43 feet
  Issues (Under Development at Cycorp)
    Need upper and/or lower bounds on property values
    Need reasonable units for certain measures
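A range check of the kind described under "Unreasonable numerical ranges" could look roughly like the sketch below. Everything in it is an assumption made for illustration: the bounds table, the unit conversions (short tons are assumed) and the function name are hypothetical, and the bounds are deliberately loose placeholders rather than values supplied by Cyc.

```python
from typing import Dict, Tuple

# Conversion factors to kilograms. Short tons are assumed here;
# the slide does not say which ton is meant.
TO_KG: Dict[str, float] = {"kg": 1.0, "pounds": 0.45359237, "tons": 907.18474}

# Illustrative, deliberately loose bounds in kilograms; NOT values from Cyc.
WEIGHT_BOUNDS_KG: Dict[str, Tuple[float, float]] = {"wolf": (1.0, 200.0)}


def is_plausible_weight(entity: str, value: float, unit: str) -> bool:
    """Return False only when the value falls outside known bounds for the entity."""
    if entity not in WEIGHT_BOUNDS_KG or unit not in TO_KG:
        # No bounds or unknown unit: keep the answer rather than eliminate it.
        return True
    low, high = WEIGHT_BOUNDS_KG[entity]
    return low <= value * TO_KG[unit] <= high


print(is_plausible_weight("wolf", 300, "tons"))   # the slide's crazy answer
print(is_plausible_weight("wolf", 90, "pounds"))
```

Defaulting to "plausible" when no bounds are known keeps the filter conservative: it can only remove answers it has positive grounds to reject.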
Performance Evaluation
Evaluation performed on a set of 364 TREC9 questions
Results of Improved Answer Selection/Resolution
  Deeper linguistic analysis
  Multiple knowledge sources to reinforce answers

System               MRR     # Missed Answers   # Answers in Rank 1
TREC10               0.666   64                 203
+Improved Ranking    0.720   47                 228
+Multiple Sources    0.739   42                 235
+Sanity Checking     TBD     TBD                TBD

Substantially increased number of answers in rank 1; particularly important in recursive architecture
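For reference, MRR (mean reciprocal rank) is conventionally the average over all questions of 1/rank of the first correct answer, with a missed question contributing 0. The sketch below uses that standard definition; the deck does not state the rank cutoff used in its evaluation, and the sample ranks are toy data, not the evaluation set.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """ranks[i] is the 1-based rank of the first correct answer to question i,
    or None when the question was missed."""
    if not ranks:
        raise ValueError("need at least one question")
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


# Toy data: one answer at rank 1, one at rank 2, one missed.
print(mean_reciprocal_rank([1, 2, None]))  # 0.5
```

Read against the table, the 203 rank-1 answers in the TREC10 row contribute 203/364, or about 0.558, of the reported 0.666; the remainder comes from correct answers at lower ranks.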
Next Six Months
Richer question-classification, plan development and execution
Ontology synthesis and central management/access
Richer and more robust integration of knowledge sources
  Answer Aggregation
  Answer Elimination
  Answer Generation
Answering Agent for Causality Questions
  Leverage dialog with Cyc regarding event pre and post conditions
    e.g., postCondition (“drink poison”, “die”)
Improve Answer Resolution
Confidence Processing
Implementation Improvements (Speed, Modularity)
PIQUANT June Workshop Update
The End