Increasing the coverage of answer extraction by applying anaphora resolution

IS-LTC, October 10, 2006

Jori Mur, Humanities Computing, University of Groningen

Outline

Background

Question Answering (QA)

Off-line answer extraction

Anaphora resolution for answer extraction

Anaphora resolution technique for definite nouns

Anaphora resolution technique for pronouns

Experiment and Results

Conclusion

Question Answering (QA)

Task: Find an answer in a text collection to a question posed in natural language.

Question: How old is John McEnroe?
Answer: 35 years

Question: When was Hillary Clinton born?
Answer: October 26, 1947

Off-line answer extraction

Use a dependency parser to parse the corpus

Define dependency patterns

[Location Name] has [Number] inhabitants
<have, subj, [Location Name]>
<have, obj, inhabitants>
<inhabitants, det, [Number]>

Match the dependency relations of each sentence in the text against the dependency patterns

Extract and save facts
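
The steps above can be sketched in Python as follows. This is only an illustration, not the Joost implementation: the (head, relation, dependent) triple representation, the function name match_inhabitants, and the example sentence with its number are all assumptions made for the sketch. Real parser output carries far more structure.

```python
from typing import List, Optional, Set, Tuple

# A dependency relation as a (head, relation, dependent) triple,
# mirroring the <have, subj, [Location Name]> notation of the slide.
Triple = Tuple[str, str, str]


def match_inhabitants(
    triples: List[Triple], locations: Set[str]
) -> Optional[Tuple[str, str]]:
    """Return (location, number) if the triples match the pattern
    '[Location Name] has [Number] inhabitants', else None."""
    subject = None
    number = None
    has_object = False
    for head, rel, dep in triples:
        if head == "have" and rel == "subj" and dep in locations:
            subject = dep
        elif head == "have" and rel == "obj" and dep == "inhabitants":
            has_object = True
        elif head == "inhabitants" and rel == "det" and dep.isdigit():
            number = dep
    if subject is not None and has_object and number is not None:
        return subject, number
    return None


# Hypothetical parse of a sentence like "Groningen has 180000 inhabitants";
# the number is made up for the example.
parsed = [
    ("have", "subj", "Groningen"),
    ("have", "obj", "inhabitants"),
    ("inhabitants", "det", "180000"),
]
fact = match_inhabitants(parsed, locations={"Groningen"})
```

All three relations must be present for a match, so a sentence that lacks the number yields no fact. The extracted pair would then be saved in a fact table for later lookup.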

Problem

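Text: McEnroe was injured on his right knee. [...] The problems with his knee kept bothering the 35-year-old American for two weeks.

The age is attached to the definite noun phrase "the 35-year-old American", not to the name McEnroe, so a pattern that requires a name cannot extract this fact.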

Anaphora resolution for definite nouns

Modify patterns to match definite nouns

[Definite noun] has [Number] inhabitants
<have, subj, [Definite noun]>
<have, obj, inhabitants>
<inhabitants, det, [Number]>

Create an instance list using predicate and apposition relations

Select the first preceding name and check whether it occurs together with the noun in the instance list

Fallback: select the first preceding name
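
A minimal sketch of this resolution step, under one reading of the slide: walk back through the preceding names, take the nearest one that the instance list pairs with the noun, and otherwise fall back to the nearest preceding name. The function name resolve_definite_noun, the (name, noun) pair representation, and the name "Becker" are assumptions introduced for illustration.

```python
from typing import List, Optional, Set, Tuple


def resolve_definite_noun(
    noun: str,
    preceding_names: List[str],
    instances: Set[Tuple[str, str]],
) -> Optional[str]:
    """Pick an antecedent name for a definite noun phrase.

    preceding_names is ordered from nearest to farthest.
    instances holds (name, noun) pairs such as ("McEnroe", "American").
    """
    for name in preceding_names:
        if (name, noun) in instances:
            return name
    # Fallback from the slide: take the first preceding name.
    return preceding_names[0] if preceding_names else None


# Hypothetical instance list, as might be harvested from apposition or
# predicate relations such as "McEnroe, the American, ...".
instances = {("McEnroe", "American"), ("Becker", "German")}

antecedent = resolve_definite_noun("American", ["Becker", "McEnroe"], instances)
fallback = resolve_definite_noun("champion", ["Becker", "McEnroe"], instances)
```

Here "American" resolves to McEnroe even though Becker is nearer, because only McEnroe is listed as an instance of that noun. "champion" has no entry, so the fallback returns the nearest name.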

Experiment

12 question types

Age

Date of Birth

Location of Birth

Capital

Date of Death

Location of Death

Manner/Cause of Death

Age at Death

Founded

Function

Inhabitants

Winner

Experiment

CLEF corpus for Dutch: two newspapers (Algemeen Dagblad and NRC Handelsblad), 1994 and 1995

Simple predefined dependency patterns and patterns based on anaphora resolution

200 Dutch questions from CLEF 2005

QA system: Joost

Results for extraction

Around 10,900 extra fact-types

              Basic patterns   Anaphora patterns

Fact-tokens   93497 (86%)      +51644 (34%)

Fact-types    64627 (75%)      +39208 (28%)
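
The figure of around 10,900 extra fact-types can be related to the table with a short calculation. The slides do not label the percentages in parentheses. The sketch below assumes they are precision estimates (the conclusion mentions that precision drops), so the number of correct extra facts is roughly count times percentage. Variable names are illustrative.

```python
# Counts and percentages as given in the table above.
# Assumption: each percentage is the estimated precision of that set of facts.
extra_fact_types = 39208
extra_fact_types_precision = 0.28

extra_fact_tokens = 51644
extra_fact_tokens_precision = 0.34

# Estimated number of correct facts added by the anaphora patterns.
correct_extra_types = round(extra_fact_types * extra_fact_types_precision)
correct_extra_tokens = round(extra_fact_tokens * extra_fact_tokens_precision)

print(correct_extra_types)
print(correct_extra_tokens)
```

Under that assumption the product for fact-types comes to about 10,978, in the region of the figure quoted on the slide.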

Results for QA

200 questions from the CLEF 2005 data set

           Basic patterns   Anaphora patterns

Total      103/200 (52%)    105/200 (53%)

12 types   26/40 (65%)      28/40 (70%)

Discussion of Results

Hypothesis 1: Precision should be increased.

Hypothesis 2: Selection of types was limited.

Hypothesis 3: Answers to questions occur in one sentence

Answer in one sentence

Question 107: Who was the pilot of the mission that repaired the astronomical satellite, the Hubble Space Telescope?

Text AD19940719: Bowersox was the pilot of the mission that repaired the astronomical satellite, the Hubble Space Telescope.

Conclusion

One way to improve the coverage of answer extraction is anaphora resolution

Although precision drops, this does not hurt QA performance. The result even improved.

Future work: investigate what happens when anaphora resolution is applied to a broader range of question types

Future work: investigate what happens when the questions are truly independent of the corpus

Questions?

Anaphora resolution for pronouns

Modify patterns to match pronouns

[Pronoun] has [Number] inhabitants
<have, subj, [Pronoun]>
<have, obj, inhabitants>
<inhabitants, det, [Number]>

Create lists of boys' and girls' names (from a baby-names site on the internet)

Select the first preceding name and check that it does not occur on the name list for the opposite sex of the pronoun

Fallback: select the first preceding name
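
A minimal sketch of the gender check described above. The tiny name sets stand in for the lists collected from a baby-names site. The English pronouns, the names "Bill" and "Steffi", and the function name resolve_pronoun are assumptions for illustration (the actual system works on Dutch text).

```python
from typing import List, Optional

# Hypothetical stand-ins for the boys' and girls' name lists.
BOYS = {"John", "Bill"}
GIRLS = {"Hillary", "Steffi"}

# For each pronoun, the name list of the opposite sex.
# English forms are used here only to keep the sketch readable.
OPPOSITE_SEX_NAMES = {"he": GIRLS, "she": BOYS}


def resolve_pronoun(pronoun: str, preceding_first_names: List[str]) -> Optional[str]:
    """Pick the nearest preceding name that is not listed for the opposite sex.

    preceding_first_names is ordered from nearest to farthest.
    """
    blocked = OPPOSITE_SEX_NAMES.get(pronoun.lower(), set())
    for name in preceding_first_names:
        if name not in blocked:
            return name
    # Fallback from the slide: take the first preceding name.
    return preceding_first_names[0] if preceding_first_names else None


she = resolve_pronoun("she", ["Bill", "Hillary"])
he = resolve_pronoun("he", ["Bill", "Hillary"])
```

A name that appears on neither list passes the check, which matches the slide's wording: a candidate is rejected only when it is on the opposite-sex list.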

Example

Text: NH19941209 35-year old McEnroe ...

Question: How old is McEnroe?

Answer: 35

Name      Age   Id

McEnroe   35    NH19941209
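
The lookup step in this example can be sketched as a small table query. The record layout follows the Name / Age / Id columns shown above. The AgeFact class and the answer_age_question function are hypothetical names introduced for illustration, since the slides do not describe how the system stores its facts.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AgeFact:
    name: str
    age: int
    doc_id: str  # identifier of the source article


# The single fact shown on the slide.
AGE_FACTS = [AgeFact(name="McEnroe", age=35, doc_id="NH19941209")]


def answer_age_question(name: str, facts: List[AgeFact]) -> Optional[AgeFact]:
    """Return the first stored age fact for the given name, if any."""
    for fact in facts:
        if fact.name == name:
            return fact
    return None


result = answer_age_question("McEnroe", AGE_FACTS)
```

Keeping the document identifier with each fact lets the answer be traced back to the article it was extracted from.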