PIRE

Trained Trigger Language Model for Sentence Retrieval in Question Answering Systems

Saeedeh Momtazi
Spoken Language Systems, Saarland University
Email: [email protected] (lsv.uni-saarland.de)
Research Advisor: Dietrich Klakow

Source: ufal.mff.cuni.cz/pire10/Saeedeh-PIRE-Talk.pdf

Outline

• Question Answering System
• Language Models for Information Retrieval
• Trained Triggering Model
• Results
• Summary

What is Question Answering?

• Goal: answer questions like “Who is Warren Moon’s agent?”, e.g. “Leigh Steinberg”
• Pipeline: Internet (10^10 web pages) → Document Retrieval → Sentence Retrieval → Information Extraction → Answer Selection → 1 answer

Sentence Retrieval

• Task: finding a small segment of text that contains the answer [Corrada-Emmanuel, Croft, & Murdock, 2003]
• Benefits beyond document retrieval:
  • Documents are very large
  • Documents span different subject areas
  • The relevant information is expressed much more locally
  • Retrieving the sentences simplifies the next step: information extraction

Language Models for IR

• P(Q|S_i): probability of the query Q under a language model trained on sentence S_i
• Ranking sentences in descending order of this probability

[Figure: the query Q is compared with each of the sentences S1 to S7; the example shown is P(Q|S2).]

Query Likelihood Model

• Unigram language model of sentences and queries [Song & Croft, 1999]

$$P(Q|S) = \prod_{i=1 \ldots M} P(q_i|S)$$

where Q is the query model, S is the sentence model, and q_1, ..., q_M are the query terms.

Maximum Likelihood Estimation

• A separate language model is trained for each sentence in the search space
• The probability is calculated based on the frequency of the query term in the sentence

$$P(Q|S) = \prod_{i=1 \ldots M} P(q_i|S)$$

$$P(q_i|S) = \frac{f_S(q_i)}{\sum_{w} f_S(w)}$$
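The maximum likelihood estimate above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the whitespace tokenizer, the function names, and the toy sentences are all assumptions, and no smoothing is applied, so a sentence that misses any query term gets probability zero.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an illustrative assumption)."""
    return text.lower().split()


def mle_query_likelihood(query, sentence):
    """P(Q|S) = prod_i f_S(q_i) / sum_w f_S(w), with no smoothing."""
    counts = Counter(tokenize(sentence))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    prob = 1.0
    for q in tokenize(query):
        prob *= counts[q] / total  # relative frequency of q in the sentence
    return prob


def rank(query, sentences):
    """Sort sentences in descending order of P(Q|S)."""
    return sorted(sentences,
                  key=lambda s: mle_query_likelihood(query, s),
                  reverse=True)


# Hypothetical toy data, loosely based on the example slides.
sentences = [
    "thomas edison invented the light",
    "karl benz invented the automobile",
]
ranked = rank("invented automobile", sentences)
```

The second toy sentence contains both query terms and is ranked first; the first one lacks "automobile" and scores zero, which is the kind of term mismatch the later slides address.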

Example

• Question: Who invented the automobile?
• Answers:
  • An automobile powered by his own engine was invented by Karl Benz in 1885 and granted a patent.
  • Nicolas Joseph Cugnot invented the first self-propelled mechanical vehicle.
  • Thomas Edison invented the first commercially practical light.
  • Alexander Graham Bell of Scotland is the inventor of the first practical telephone.

(Highlighted on the slides: “invented” and “automobile” in the question and in the first answer; “invented” or “inventor” in the other answers.)

Trained Trigger Model

• A single model is first trained on a large corpus and is then used for all of the sentences to be retrieved
• The trained model is represented by a set of triples:

$$\langle w, w', f_C(w, w') \rangle$$

• $f_C(w, w')$ is the number of times the word w triggers the target word w'.

$$P(Q|S) = \prod_{i=1 \ldots M} P(q_i|S)$$

Trained Trigger Model

• Triggering Model:

$$P(Q|S) = \prod_{i=1 \ldots M} P(q_i|S)$$

$$P(q_i|S) = \frac{1}{N} \sum_{j=1 \ldots N} P(q_i|s_j)$$

$$P(q_i|s_j) = \frac{f_C(q_i, s_j)}{\sum_{q_i} f_C(q_i, s_j)}$$

where s_1, ..., s_N are the terms of the sentence S.
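The three triggering equations can be put together as follows. This is a sketch under stated assumptions: the trained model is held as a plain dictionary mapping (q, s) pairs to counts f_C(q, s), the function names are hypothetical, and the toy counts are invented for illustration only.

```python
from collections import defaultdict


def trigger_prob(q, s, f_c, totals):
    """P(q|s) = f_C(q, s) / sum over q' of f_C(q', s)."""
    total = totals.get(s, 0)
    return f_c.get((q, s), 0) / total if total else 0.0


def trigger_query_likelihood(query_terms, sentence_terms, f_c):
    """P(Q|S) = prod_i (1/N) * sum_j P(q_i|s_j)."""
    # Normalizer for each sentence-side word: sum of counts over all q'.
    totals = defaultdict(int)
    for (_, s), count in f_c.items():
        totals[s] += count
    n = len(sentence_terms)
    if n == 0:
        return 0.0
    prob = 1.0
    for q in query_terms:
        prob *= sum(trigger_prob(q, s, f_c, totals) for s in sentence_terms) / n
    return prob


# Hypothetical toy trigger counts f_C(q, s); not taken from any real corpus.
f_c = {
    ("automobile", "vehicle"): 3,
    ("car", "vehicle"): 1,
    ("invented", "invented"): 2,
}
score = trigger_query_likelihood(["invented", "automobile"],
                                 ["invented", "vehicle"], f_c)
```

With these toy counts the sentence receives a non-zero score even though it never contains the word "automobile", because "vehicle" is associated with it in the trained counts.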

Word Unigram Model

[Figure: the query Q is scored against a separate n-gram model θ_S estimated from each of the sentences S1 to S7; the examples shown are P(Q|S2) with θ_S2 and P(Q|S5) with θ_S5.]

Triggering Model

[Figure: a single word-triggering model θ_C is trained on the corpus C and is used to score the query Q against every sentence S1 to S7; the examples shown are P(Q|S2) and P(Q|S5).]

Types of triggering:

• Self Triggering
• Inside Sentence
• Across Sentence
• Question and Answer Pair

Self Triggering

• Each word can only trigger itself
• Works similarly to the basic bag-of-words model
• Words that appear in both the query and the sentence get higher priority
• It is an essential part of a retrieval engine and has to be used alongside any other triggering model

Inside Sentence Triggering

• Idea: there is a relation between words that appear in the same sentence
• Considers that each word in a sentence triggers all other words in the same sentence
• Uses a large unannotated corpus for training
• The sentence retrieval can retrieve sentences which do not share many terms with the query, but whose terms frequently co-occur with query terms in the same sentences of the training corpus

Example

• Question: Who invented the automobile?
• Answers:
  • Nicolas Joseph Cugnot invented the first self-propelled mechanical vehicle.
  • Thomas Edison invented the first commercially practical light.
  • Alexander Graham Bell of Scotland is the inventor of the first practical telephone.

(Highlighted on the slide: “invented” and “automobile” in the question; “invented” or “inventor” in the answers.)

Inside Sentence Triggering

Training sentences in which “automobile” and “vehicle” co-occur:

“The word automobile meaning a vehicle that moves itself.”

“An automobile includes at least two seats located one behind the other and attachable to a vehicle floor.”
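Collecting inside-sentence trigger counts from an unannotated corpus could look like the sketch below. The tokenization, the function name, and the two shortened corpus sentences are assumptions made for illustration; pairs are formed between token positions, so a word never triggers its own occurrence.

```python
from collections import Counter
from itertools import permutations


def inside_sentence_counts(corpus_sentences):
    """Count f_C(w, w'): each token triggers every other token of its sentence."""
    f_c = Counter()
    for sentence in corpus_sentences:
        tokens = sentence.lower().split()
        # Ordered pairs of distinct positions within the same sentence.
        for i, j in permutations(range(len(tokens)), 2):
            f_c[(tokens[i], tokens[j])] += 1
    return f_c


# Hypothetical, shortened versions of the training sentences above.
corpus = [
    "the automobile is a vehicle",
    "an automobile needs a vehicle floor",
]
counts = inside_sentence_counts(corpus)
```

Here the pair ("automobile", "vehicle") is counted twice, once per sentence, which is the co-occurrence evidence that lets "vehicle" in a candidate sentence support the query term "automobile".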

Across Sentence Triggering

• Idea: two adjacent sentences mostly talk about the same topic, using different words with the same concept and meaning
• Considers that each word of a sentence triggers all of the words of the next sentence in the training corpus
• Uses a large unannotated corpus for training
• Uses a wider context than inside sentence triggering

Example

• Question: Where was the first McDonald’s built?
• Answers:
  • Two brothers from Manchester, Dick and Mac McDonald, open the first McDonald’s in California.
  • The site of the first McDonald’s to be franchised by Ray Kroc is now a museum in Des Plaines, Illinois.
  • The first McDonald’s TV commercial was a pretty low-budget affair.

(Highlighted on the slides: “first McDonald’s” in the question and in every answer; “built” in the question only.)

Across Sentence Triggering

Adjacent training sentences in which “built” is followed by “opened” or “open”:

“The structure of Eiffel Tower was built between 1887 and 1889 as the entrance arch for the Exposition Universelle, a World’s Fair marking the centennial celebration of the French Revolution.” “The tower was inaugurated on 31 March 1889, and opened on 6 May.”

“Wembley Stadium was built by Australian company Brookfield Multiplex.” “The stadium was scheduled to open on 13 May 2006.”
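A possible way to collect across-sentence trigger counts is sketched below. It assumes the corpus is available as an ordered list of sentences per document and that whitespace tokenization is good enough; the function name and the shortened toy document are hypothetical.

```python
from collections import Counter


def across_sentence_counts(document_sentences):
    """Count f_C(w, w'): each word triggers every word of the next sentence."""
    f_c = Counter()
    tokenized = [s.lower().split() for s in document_sentences]
    # Walk over adjacent sentence pairs (current, following).
    for current, following in zip(tokenized, tokenized[1:]):
        for w in current:
            for w_next in following:
                f_c[(w, w_next)] += 1
    return f_c


# Hypothetical, shortened version of the Eiffel Tower example.
doc = [
    "the tower was built in 1889",
    "the tower was opened in may",
]
counts = across_sentence_counts(doc)
```

The counts are directional in this sketch: "built" triggers "opened", but not the other way round, because only the following sentence is used as the target.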

Question and Answer Pair Triggering

• Idea: there is a relation between the words that appear in a question and the words in its answer sentence
• Considers that each word in the question triggers all words in its answer sentence
• Requires supervised training
• Uses a question-answer sentence corpus for training

Example

• Question: How high is Everest?
• Answers:
  • Everest is 29,029 feet.
  • Everest is located in Nepal.
  • Everest has two main climbing routes.

(Highlighted on the slides: “Everest” in the question and in every answer; “high” in the question only.)

Question and Answer Pair Triggering

Training pairs in which “high” in the question co-occurs with “feet” in the answer:

Q: “How high is Pikes peak?”
A: “Pikes peak, Colorado At 14,110 feet, altitude sickness is a consideration when driving up this mountain.”

Q: “How high is Mount Hood?”
A: “Mount Hood is in the Cascade Mountain range and is 11,245 feet.”
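For the supervised variant, trigger counts could be gathered from question-answer pairs roughly as follows. This is only a sketch: the pair format, the function name, and the simplified toy pairs are assumptions, and punctuation handling is ignored.

```python
from collections import Counter


def qa_pair_counts(qa_pairs):
    """Count f_C(q, a): each question word triggers every word of its answer."""
    f_c = Counter()
    for question, answer in qa_pairs:
        answer_tokens = answer.lower().split()
        for q_word in question.lower().split():
            for a_word in answer_tokens:
                f_c[(q_word, a_word)] += 1
    return f_c


# Hypothetical, simplified versions of the two pairs above.
pairs = [
    ("how high is pikes peak", "pikes peak is 14,110 feet"),
    ("how high is mount hood", "mount hood is 11,245 feet"),
]
counts = qa_pair_counts(pairs)
```

After these two pairs, ("high", "feet") has been seen twice, so an answer sentence mentioning "feet" gains support for a "How high" question even though it does not contain the word "high".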

Corpora

• Inside Sentence and Across Sentence: AQUAINT1 Corpus
  • Consists of English newswire text, extracted from:
    • the Xinhua (XIN)
    • the New York Times (NYT)
    • the Associated Press Worldstream (APW)
  • Contains almost 450 million word tokens, as 23 million sentences, as 1.5 million documents
• Question and Answer Pair: QASP corpus
  • Consists of TREC QA track questions from 2002 to 2004 (985 questions)
  • Prepared via Amazon MTurk
• Question and Answer Pair: Yahoo! Answers Corpus
  • Derived from http://answers.yahoo.com/
  • Collected on 10/25/2007
  • Containing 4,483,032 questions

Experiments

• TREC data set
  • Development set: TREC05 (316 questions)
  • Test set: TREC06 (365 questions)
• Judgment
  • QASP Corpus
  • Prepared via Amazon MTurk
• Evaluation Metrics
  • Mean Average Precision
  • Precision@5
  • Mean Reciprocal Rank
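The three evaluation metrics can be computed from per-question relevance judgments as sketched below. The input format (one list of 0/1 judgments per question, in ranked order) and the function names are assumptions; average precision is normalized here by the number of relevant sentences found in the list, which presumes that every relevant sentence appears somewhere in the ranking.

```python
def average_precision(relevance):
    """Average of precision values at the ranks of the relevant sentences."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def precision_at_k(relevance, k=5):
    """Fraction of the top-k retrieved sentences that are relevant."""
    return sum(relevance[:k]) / k


def reciprocal_rank(relevance):
    """1 / rank of the first relevant sentence, or 0 if there is none."""
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Hypothetical judgments for two questions (1 = relevant, 0 = not relevant).
runs = [[1, 0, 1, 0, 0], [0, 1, 0, 0, 0]]
map_score = mean(average_precision(r) for r in runs)
p_at_5 = mean(precision_at_k(r, 5) for r in runs)
mrr = mean(reciprocal_rank(r) for r in runs)
```

Each metric is first computed per question and then averaged over all questions of the test set.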

Results

Model                                    MAP      MRR      P@5
Maximum Likelihood                       0.3701   0.5047   0.2267
Self TTLM                                0.3806   0.5219   0.2349
Inside Sentence TTLM                     0.1911   0.2585   0.1052
Across Sentence TTLM                     0.2367   0.3047   0.1360
Question and Answer Pair TTLM (QASP)     0.2266   0.3263   0.1360
Question and Answer Pair TTLM (Yahoo)    0.0344   0.0415   0.0099

Results (Linear Interpolation)

                       Maximum Likelihood           Self Triggering
Model                  MAP      MRR      P@5        MAP      MRR      P@5
Baseline               0.3701   0.5047   0.2267     0.3806   0.5219   0.2349
+ Inside Sentence      0.4314   0.5469   0.2628     0.4351   0.5572   0.2587
+ Across Sentence      0.4381   0.5631   0.2605     0.4274   0.5480   0.2593
+ QA Pair (QASP)       0.4204   0.5408   0.2663     0.4208   0.5492   0.2622
+ QA Pair (Yahoo)      0.4371   0.5654   0.2628     0.4358   0.5618   0.2645

All the differences are statistically significant at the level of p-value < 0.01 based on t-tests.
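The slides do not spell out the interpolation formula, so the following is only a sketch of one common form: a two-component linear interpolation applied to each query term, with a hypothetical weight lam that would normally be tuned on the development set. The function names and the toy probabilities are assumptions.

```python
def interpolate(p_base, p_trigger, lam):
    """lam * P_base(q|S) + (1 - lam) * P_trigger(q|S), with 0 <= lam <= 1."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_base + (1.0 - lam) * p_trigger


def interpolated_query_likelihood(term_probs, lam):
    """Product over query terms of the interpolated term probabilities.

    term_probs holds one (P_base(q_i|S), P_trigger(q_i|S)) pair per query term.
    """
    prob = 1.0
    for p_base, p_trigger in term_probs:
        prob *= interpolate(p_base, p_trigger, lam)
    return prob


# Hypothetical per-term probabilities; the second query term is missing
# from the sentence (baseline 0.0) but is supported by the trigger model.
term_probs = [(0.2, 0.4), (0.0, 0.5)]
score = interpolated_query_likelihood(term_probs, lam=0.5)
```

In this toy case the baseline alone would give the sentence a score of zero, while the interpolated score stays positive, which illustrates why combining a baseline with a triggering model can help.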

Summary

• Introducing question answering systems
• The necessity of sentence retrieval in a QA system
• Using language models for sentence retrieval
• Describing the current unigram model and its problems
• Proposing the trained triggering model with different types:
  • self
  • inside sentence
  • across sentence
  • question and answer pair
• Linear interpolation of different models

References

• Bilotti, M. W. 2004. Query Expansion Techniques for Question Answering. MS Thesis in Electrical Engineering and Computer Science, Massachusetts Institute of Technology.
• Corrada-Emmanuel, A., Croft, W. B., and Murdock, V. 2003. Answer Passage Retrieval for Question Answering. CIIR Technical Report, Amherst.
• Ponte, J. and Croft, W. B. 1998. A language modeling approach to information retrieval. In Proceedings of the 1998 ACM SIGIR Conference on Research and Development in Information Retrieval, 275-281.
• Lafferty, J. and Zhai, C. 2001. Document language models, query models, and risk minimization for information retrieval. In Proceedings of the 2001 ACM SIGIR Conference on Research and Development in Information Retrieval, 111-119.
• Hiemstra, D. 1998. A Linguistically Motivated Probabilistic Model of Information Retrieval. Second European Conference on Digital Libraries, 569-584.
• Song, F. and Croft, W. B. 1999. A general language model for information retrieval. In Proceedings of the 22nd annual international ACM SIGIR conference on research and development in information retrieval, 279-280, New York.
• Zhai, C. and Lafferty, J. 2001. A study of smoothing methods for language models applied to ad hoc information retrieval. In W. B. Croft, D. J. Harper, D. H. Kraft, and J. Zobel (Eds.), Proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval.
• Gao, J., Nie, J. Y., Wu, G., and Cao, G. 2004. Dependence language model for information retrieval. In Proceedings of the 27th annual international conference on Research and development in information retrieval.

Thanks for your attention!