53
10417/10617 Intermediate Deep Learning: Fall2019 Russ Salakhutdinov Machine Learning Department [email protected] https://deeplearning-cmu-10417.github.io/ Representation Learning for Reading Comprehension

10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

10417/10617IntermediateDeepLearning:

Fall2019RussSalakhutdinov

Machine Learning [email protected]

https://deeplearning-cmu-10417.github.io/

Representation Learning for Reading Comprehension

Page 2: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

• Disclaimer:SomeofthematerialandslidesforthislecturewereborrowedfromBhuwanDhingra’stalk.

2

ReadingComprehension

Page 3: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

ReadingComprehension

Query:“President-electBarackObamasaidTuesdayhewasnotawareofallegedcorruptionbyXwhowasarrestedonchargesoftryingtosellObama’ssenateseat.”FindX.

Document:“…arrestedIllinoisgovernorRodBlagojevichandhischiefofstaffJohnHarrisoncorruptioncharges…includedBlogojevichallegedlyconspiringtosellortradethesenateseatleftvacantbyPresident-electBarackObama…”

Answer:RodBlagojevich

Who-Did-What Dataset (Onishi, Wang, Bansal, Gimpel, McAllester, EMNLP, 2016)

Page 4: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Ø  isadocumentØ  isaquestionoverthecontentsofthatdocumentØ  istheanswertothisquery

• Theanswercomesfromafixedvocabulary.

TASK:Givenadocumentquerypairfindwhichanswers.

ReadingComprehension

Ø  mightconsistofalltokens/spansoftokensinthedocument(ExtractiveQuestionAnswering)

•  QuestionAnswering/InformationExtraction•  Testfortextrepresentationmodels

–  Abetterrepresentationcanhelpanswermorequestions

Page 5: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Approach--SupervisedLearning

ModelPr(c|d, q) = f✓(d, q, c) 8 c 2 A

L(✓) =X

(d,q,a)2D

� logPr(a|d, q) Loss

✓̂ = argmin✓

L(✓) Training

Whatarchitecturalbiasescanwebuildintothemodel?

(Aneuralnetwork)

D = {(d, q, a)}Ni=1 Dataset

Page 6: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

ArchitecturalBias

•  CNNs,RNNshavearchitecturalbiasestowardsimages/sequences

•  Forreadingcomprehensionwhatbiasescanwebuildtoreflectlinguisticphenomena?– Alignment,Paraphrasing,Aggregation– Coreference,SyntacticandSemanticDependencies

DesigningtheconnectivitypatternofaNeuralNetworktoreflectthenatureoftheproblembeingsolved.

(Part1)

(Part2)

Page 7: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Query:“President-electBarackObamasaidTuesdayhewasnotawareofallegedcorruptionbyXwhowasarrestedonchargesoftryingtosellObama’ssenateseat.”FindX.

Document:“…arrestedIllinoisgovernorRodBlagojevichandhischiefofstaffJohnHarrisoncorruptioncharges…includedBlogojevichallegedlyconspiringtosellortradethesenateseatleftvacantbyPresident-electBarackObama…”

TextPhenomena

Who-Did-What Dataset (Onishi, Wang, Bansal, Gimpel, McAllester, EMNLP, 2016)

Alignment

Page 8: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Query:“President-electBarackObamasaidTuesdayhewasnotawareofallegedcorruptionbyXwhowasarrestedonchargesoftryingtosellObama’ssenateseat.”FindX.

Document:“…arrestedIllinoisgovernorRodBlagojevichandhischiefofstaffJohnHarrisoncorruptioncharges…includedBlogojevichallegedlyconspiringtosellortradethesenateseatleftvacantbyPresident-electBarackObama…”

TextPhenomena

Who-Did-What Dataset (Onishi, Wang, Bansal, Gimpel, McAllester, EMNLP, 2016)

Alignment Paraphrasing

Page 9: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Query:“President-electBarackObamasaidTuesdayhewasnotawareofallegedcorruptionbyXwhowasarrestedonchargesoftryingtosellObama’ssenateseat.”FindX.

Document:“…arrestedIllinoisgovernorRodBlagojevichandhischiefofstaffJohnHarrisoncorruptioncharges…includedBlogojevichallegedlyconspiringtosellortradethesenateseatleftvacantbyPresident-electBarackObama…”

TextPhenomena

Who-Did-What Dataset (Onishi, Wang, Bansal, Gimpel, McAllester, EMNLP, 2016)

Alignment AggregationParaphrasing

Page 10: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Alignment

WordVectors+RNNstorepresentDocumentandQuery

Multiplepassesoverthedocument

+PointerSumAttention

AggregationParaphrasing

MultiplicativeAttention

Biases

Page 11: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

•  Ascompositionsofwordvectors

arrested Illinois governor Rod Blagojevich

Thingswhichhelp:Ø  PretrainedGloveembeddingsØ  RandomvectorsforOOVtokensattesttime.

•  Betterthantrained“UNK”embedding.Ø  Characterembeddings

*Dhingra,Liu,Salakhutdinov,Cohen(preprint,2017)**Yang,Dhingra,Yuan,Hu,Cohen,Salakhutdinov(ICLR,2017)

RepresentingDocument/Query

Page 12: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

•  BidirectionalGatedRecurrentUnitsprocessthetokensfromlefttorightandrighttoleft

arrested Illinois governor Rod Blagojevich

RepresentingDocument/Query

dt = [�!ht ; �ht ]

Page 13: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

•  Bothdocumentandqueryarerepresentedasmatrices

RepresentingDocument/Query

arrested Illinois governor Rod Blagojevich corruption by X who was

Q 2 R2h⇥|Q|D 2 R2h⇥|D|

h–StatesizeofeachGRU

Page 14: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

•  ForeachtokeninD,weformatokenspecificrepresentationofthequery

Blagojevich corruption by X who was

di

GatedAttentionMechanism

Dhingra et al (ACL 2017)

↵ij = softmax(qTj di)q̃i =

|Q|X

j=1

↵ijqj

document query

qj

Page 15: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

GatedAttentionMechanism•  Useelement-wisemultiplicationtogatethe

interactionbetweendocumenttokensandquery

corruption by X who wasBlagojevich

di

d0i = di ⇤ q̃i

Dhingra et al (ACL 2017)

q̃i =

|Q|X

j=1

↵ijqj

1. Findfeaturesinthequerywhichmatchthecontextualrepresentationofthedocument

2. Gatethedocumentrepresentationbymultiplyingwiththesefeatures

Page 16: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

MultiHopArchitecture•  Performseveralpassesoverthedocument

–  Allowmodeltocombineevidencefrommultiplesentences

Dhingra et al (ACL 2017)corruption by X who wasBlagojevicharrested

RepeatforKlayers

GatedAttention

GatedAttention

Page 17: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Multi-hopArchitecture•  Performseveralpassesoverthedocument

–  Allowmodeltocombineevidencefrommultiplesentences

Dhingra et al (ACL 2017)

Page 18: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

OutputModel• Probabilitythataparticulartokeninthedocumentanswersthequery:

Ø  Takeaninnerproductbetweenthequeryembeddingandtheoutputofthelastlayer:

• Theprobabilityofaparticularcandidateisthenaggregatedoveralldocumenttokenswhichappearinc:

setofpositionswhereatokenincappearsinthedocumentd.

PointerSumAttentionofKadlecetal.,2016

si =exp (hq(K), d(K)

i i)P

i0 exp (hq(K), d(K)i0 i)

, i = 1, . . . , |D|

Page 19: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

• Theprobabilityofaparticularcandidateisthenaggregatedoveralldocumenttokenswhichappearinc:

• Thecandidatewithmaximumprobabilityisselectedasthepredictedanswer:

• Usecross-entropylossbetweenthepredictedprobabilitiesandthetrueanswers.

OutputModel

Page 20: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Results

Datasets

Westudied5datasets:

Page 21: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Results

Datasets

Westudied5datasets:Stateoftheart

Dhingra et al (ACL 2017)

Page 22: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum
Page 23: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

AnalysisofAttention

Query:“President-electBarackObamasaidTuesdayhewasnotawareofallegedcorruptionbyXwhowasarrestedonchargesoftryingtosellObama’ssenateseat.”FindX.

Document:“…arrestedIllinoisgovernorRodBlagojevichandhischiefofstaffJohnHarrisoncorruptioncharges…includedBlogojevichallegedlyconspiringtosellortradethesenateseatleftvacantbyPresident-electBarackObama…”

Blagojevich corruption by X who was

di

↵ij = softmax(qTj di)q̃i =

|Q|X

j=1

↵ijqj

Page 24: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

AnalysisofAttention

Query:“President-electBarackObamasaidTuesdayhewasnotawareofallegedcorruptionbyXwhowasarrestedonchargesoftryingtosellObama’ssenateseat.”FindX.

Document:“…arrestedIllinoisgovernorRodBlagojevichandhischiefofstaffJohnHarrisoncorruptioncharges…includedBlogojevichallegedlyconspiringtosellortradethesenateseatleftvacantbyPresident-electBarackObama…”

Layer1 Layer2

Dhingra et al (ACL 2017)

Page 25: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Summarysofar

•  Multiplicativeattentionfordocumentandqueryalignment

•  Multiplelayersallowmodeltofocusondifferentsalientaspectsofthequery

Code+Data:https://github.com/bdhingra/ga-reader

Page 26: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Wordsvs.Characters• Word-levelrepresentationsaregoodatlearningthesemanticsofthetokens

• Character-levelrepresentationsaremoresuitableformodelingsub-wordmorphologies(“cat”vs.“cats”)

Ø  Word-levelrepresentationsareobtainedfromalearnedlookuptable

Ø  Character-levelrepresentationsareusuallyobtainedbyapplyingRNNorCNN

• Hybridword-charactermodelshavebeenshowntobesuccessfulinvariousNLPtasks(Yangetal.,2016a,Miyamoto&Cho(2016),Lingetal.,2015)

• Commonlyusedmethodistoconcatenatethesetworepresentations

Page 27: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Fine-GrainedGating• Fine-grainedgatingmechanism:

Word-levelrepresentation

Character-levelrepresentation

Gating

Additionalfeatures:namedentitytags,part-of-speechtags,documentfrequencyvectors,wordlook-uprepresentations

Yang et al, ICLR 2017

Page 28: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Children’sBookTest(CBC)Dataset

Page 29: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Wordsvs.Characters• Highgatevalues:character-levelrepresentations• Lowgatevalues:word-levelrepresentations.

Page 30: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

TalkRoadmap

•  MultiplicativeandFine-grainedAttention•  LinguisticKnowledgeasExplicitMemoryforRNNs

Page 31: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Broad-ContextLanguageModelingHer plain face broke into a huge smile when she saw Terry. “Terry!” she called out. She rushed to meet him and they embraced. “Hon, I want you to meet an old friend, Owen McKenna. Owen, please meet Emily.'' She gave me a quick nod and turned back to X

Task:FindX.

LAMBADA dataset, Paperno et al., ACL 2016

(Xcanbeanywordinthevocabulary)

Passagesarefilteredsuchthathumanscancorrectlyanswerwhengiventhewholecontextbutnotwhengivenonlythelastsentence.

Page 32: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Broad-ContextLanguageModeling(recastasReadingComprehension)

Query:She gave me a quick nod and turned back to XFindX.(NowXisassumedtobeinthedocument)

Document:Her plain face broke into a huge smile when she saw Terry. “Terry!” she called out. She rushed to meet him and they embraced. “Hon, I want you to meet an old friend, Owen McKenna. Owen, please meet Emily.''

Chu et al, EACL 2017

•  ReadingComprehensionapproachesperformbestonthistask

Page 33: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

TextPhenomenaDocument:Her plain face broke into a huge smile when she saw Terry. “Terry!” she called out. She rushed to meet him and they embraced. “Hon, I want you to meet an old friend, Owen McKenna. Owen, please meet Emily.''

Query:

She gave me a quick nod and turned back to XFindX.

conjnsubj nmod

A0:turner A1:destination

SyntacticDependency

SemanticDependency

Page 34: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

TextPhenomenaDocument:Her plain face broke into a huge smile when she saw Terry. “Terry!” she called out. She rushed to meet him and they embraced. “Hon, I want you to meet an old friend, Owen McKenna. Owen, please meet Emily.''

Query:

She gave me a quick nod and turned back to XFindX.

SyntacticDependency

SemanticDependency

X=Terry

Coreference

Page 35: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

ArchitecturalBias

1.  Dependent/Coreferentmentionsmaybefarapartinthedocument–  RNNs(includingGRU/LSTM)tendto“forget”long-terminteractions

Usememory-augmentedarchitecture.2.  Off-the-shelfNLPtoolsavailablewhichextract

thesedependencies–  Example:StanfordCoreNLP

Uselinguisticannotationstoguidememorypropagationinthenetwork.

Page 36: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

LinguisticKnowledge

there ball the left and kitchen the to went she football the got mary

CoreNLP / SENNA

there ball the left and kitchen the to went she football the got mary

1. Coreference2. SyntacticDependencies3. SemanticRoles+Inverserelations

Page 37: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

•  Bothdocumentandqueryarerepresentedasmatrices

RepresentingDocument/Query

arrested Illinois governor Rod Blagojevich corruption by X who was

Q 2 R2h⇥|Q|D 2 R2h⇥|D|

h–StatesizeofeachGRU

Page 38: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

•  Bothdocumentandqueryarerepresentedasmatrices

RepresentingDocument/QueryGraphs

Q 2 R2h⇥|Q|D 2 R2h⇥|D|

there ball the left and kitchen the to went she football the got mary there ball the left and kitchen the to went she football the got mary

GraphRepresentation GraphRepresentation

Page 39: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Forward/BackwardDAGs•  Topologicalordergivenbythesequenceitself

there ball the left and kitchen the to went she football the got mary

there ball the left and kitchen the to went she football the got mary

ForwardDAG

BackwardDAG

*Peng et al (TACL, 2017) use similar decomposition for Relation Extraction

Page 40: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

MemoryasAcyclicGraphEncoding(MAGE)RNN

there ball the left and kitchen the to went she football the got mary

went

she

she

left

to

kitchenh3t

h2th1t

m1t

m3t

Page 41: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

MemoryasAcyclicGraphEncoding(MAGE)RNN

there ball the left and kitchen the to went she football the got mary

went

Updateequations:

mt = m1tkm2

tk . . . km|Ef |t ,

rt = �(Wrxst + Urmt + br),

zt = �(Wzxst + Uzmt + bz),

h̃t = tanh(Whxst + rt � Uhmt + bh),

ht = (1� zt)� ht�1 + zt � h̃t,

Memory

GRUupdate

Page 42: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Performance•  LAMBADAdataset

– Onlyonquestionswhereanswerisincontext

Page 43: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Performance•  LAMBADAdataset

– OnlyonquestionswhereanswerisincontextStateoftheart

Page 44: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Document:marytravelledtothekitchen.danielgotthemilk.hewenttothekitchen.johnpickeduptheapple.marywenttothehallway.johnmovedtothebedroom.sandrajourneyedtothekitchen.danielwentbacktothebedroom.johnwenttotheoffice.danielputdownthemilk.hepickedupthemilk.sandrawenttothebedroom.johndroppedtheapple.marywentbacktothebathroom.

Performance•  FacebookbAbIdataset

–  Task3(1000trainingexamples)

Query:wherewastheapplebeforetheoffice?

Answer:bedroom

Onlycoreferenceannotationsextractedforthisdataset

Page 45: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Performance•  FacebookbAbIdataset

–  Task3

Page 46: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Performance•  FacebookbAbIdataset

–  Task3Onehotindicatorfeatureforentitymentionappendedtoinput

Page 47: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Performance•  FacebookbAbIdataset

–  Task3

Page 48: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Summarysofar

•  LinguisticannotationscanhelpguidememorypropagationinRNNs– Bettermodelingoflongtermdependencies

Page 49: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

AnnotatorNoise

•  Performancedropssharplywithaddednoiseinlinguisticannotations

•  Canwejointlytrainannotationandreadingcomprehensionmodels?–  Canwelearntask-specificknowledge?

Page 50: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

“Common”Sense?

Query:“Whowasthefemalememberofthe1980’spopmusicduoEurythmics?”

Document:“EurythmicswereaBritishmusicduoconsistingofmembersAnnieLennoxandDavidA.Stewart”

Answer:AnnieLennox

TriviaQA Dataset (Joshi et al, ACL 2017)

•  Wherecanwefindsuchknowledge?•  Needmoredatasetsfortestingthisaspecttoo.

Page 51: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

HerplainfacebrokeintoahugesmilewhenshesawTerry.“Terry!”shecalledout.Sherushedtomeethimandtheyembraced.“Hon,Iwantyoutomeetanoldfriend,OwenMcKenna.Owen,pleasemeetEmily.'’ShegavemeaquicknodandturnedbacktoX

CoreferenceDependencyParses

Entityrelations

Wordrelations

CoreNLP

Freebase

WordNet

RecurrentNeuralNetwork

TextRepresentation

IncorporatingPriorKnowledge

Page 52: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Search-and-ReadforopendomainQA•  RCassumespassagecontaininganswerisalreadyknown

•  ForrealQAweneedtosearchforthepassageaswell

7-ElevenstoresweretemporarilyconvertedintoKwikE-martstopromotethereleaseofwhatmovie?

QUASAR Dataset (Dhingra, Mazaitis, Cohen, in submission)

InJuly2007,7-ElevenredesignedsomestorestolooklikeKwik-E-MartsinselectcitiestopromoteTheSimpsonsMovie.

Corpus

Tie-inpromotionsweremadewithseveralcompanies,including7-Eleven,whichtransformedselectedstoresintoKwik-E-Marts.

Retrieval

7-ElevenBecomesKwik-E-MartforSimpsonsMoviePromotion

Excerpts

Page 53: 10417/10617 Intermediate Deep Learning: Fall2019rsalakhu/10417/Lectures/... · reflect the nature of the problem being solved. (Part 1) (Part 2) ... • The candidate with maximum

Thankyou