Beyond Bags of Words: A Markov Random Field Model for Information Retrieval
Don Metzler
Bag of Words Representation

71 the, 31 garden, 22 and, 19 of, 19 to, 19 house, 18 in, 18 white, 12 trees, 12 first, 11 a, 11 president, 8 for, 7 as, 7 gardens, 7 rose, 7 tour, 6 on, 6 was, 6 east, 6 tours, 5 planting, 5 he, 5 is, 5 grounds, 5 that, 5 gardener, 4 history, 4 text-decoration, 4 john, 4 kennedy, 4 april, 4 been, 4 today, 4 with, 4 none, 4 adams, 4 spring, 4 at, 4 had, 3 mrs, 3 lawn, …
Bag of Words Models
Binary Independence Model
Unigram Language Models
Language modeling was first used in speech recognition to model speech generation. In IR, language models are models of text generation.

Typical scenario:
Estimate a language model for every document
Rank documents by the likelihood that the document's model generated the query

Documents are modeled as multinomial distributions over a fixed vocabulary.
Query: q1 q2 q3    Document: D

Ranking via Query Likelihood:

P(q1 q2 q3 | D) = P(q1 | θD) · P(q2 | θD) · P(q3 | θD)

Estimation (maximum likelihood smoothed with collection statistics):

P(w | θD) = (1 − λ) · tf_{w,D} / |D| + λ · cf_w / |C|
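As a sketch, the query-likelihood ranking with Jelinek–Mercer-style interpolation described above can be written in a few lines; the function name, the smoothing weight `lam`, and the toy data are illustrative, not from the original slides:

```python
from collections import Counter

def unigram_score(query, doc, collection, lam=0.5):
    """Query likelihood under a smoothed unigram language model.

    query, doc, collection: lists of tokens.  Each query term's
    probability mixes the document's maximum-likelihood estimate
    with the collection-wide estimate, so unseen terms do not
    zero out the whole score.
    """
    doc_tf, coll_tf = Counter(doc), Counter(collection)
    doc_len, coll_len = len(doc), len(collection)
    score = 1.0
    for w in query:
        p_doc = doc_tf[w] / doc_len          # tf_{w,D} / |D|
        p_coll = coll_tf[w] / coll_len       # cf_w / |C|
        score *= (1 - lam) * p_doc + lam * p_coll
    return score
```

Documents are then ranked by this score (or, in practice, by its logarithm to avoid underflow).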
Bag of Words Models

Pros:
Simple model
Easy to implement
Decent results

Cons:
Too simple
Unrealistic independence assumptions
Inability to explicitly model term dependencies
Beyond Bags of Words
Tree Dependence Model
A method for approximating a complex joint distribution
Compute the EMIM (expected mutual information measure) between every pair of terms
Build a maximum spanning tree over these weights
The tree encodes first-order dependencies
Example tree over terms A–E (edges: D–A, D–C, C–E, C–B):

P(A, B, C, D, E) = P(D) · P(A | D) · P(C | D) · P(E | C) · P(B | C)
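The two steps above can be sketched as follows, treating each document as a set of terms; the function names and the co-occurrence-based EMIM estimate are illustrative, not the original implementation:

```python
import math

def emim(docs, t1, t2):
    """Expected mutual information between the binary occurrence
    variables of two terms, estimated from document co-occurrence."""
    n = len(docs)
    score = 0.0
    for a in (True, False):
        for b in (True, False):
            n_ab = sum(1 for d in docs if (t1 in d) == a and (t2 in d) == b)
            n_a = sum(1 for d in docs if (t1 in d) == a)
            n_b = sum(1 for d in docs if (t2 in d) == b)
            if n_ab:  # skip empty cells (0 * log 0 -> 0)
                score += (n_ab / n) * math.log((n * n_ab) / (n_a * n_b))
    return score

def maximum_spanning_tree(terms, weight):
    """Prim's algorithm, greedily keeping the strongest dependence
    edge into the tree at each step (first-order dependencies)."""
    in_tree, edges = {terms[0]}, []
    while len(in_tree) < len(terms):
        u, v = max(((a, b) for a in in_tree for b in terms if b not in in_tree),
                   key=lambda e: weight(*e))
        edges.append((u, v))
        in_tree.add(v)
    return edges
```

Running `maximum_spanning_tree` with `emim` as the edge weight yields a tree like the one in the figure above.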
n-Gram Language Models
Query: q1 q2 q3    Document: D

Ranking via Query Likelihood:

P(q1 q2 q3 | D) = P(q1 | D) · P(q2 | q1, D) · P(q3 | q2, D)
Estimation (interpolated bigram estimate):

P(wi | wi−1, D) = λ1 · tf_{wi−1 wi, D} / tf_{wi−1, D} + λ2 · tf_{wi, D} / |D| + (1 − λ1 − λ2) · cf_{wi} / |C|
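A sketch of that interpolated estimate, backing off from document bigram evidence to the document unigram and then to the collection; the default weights `lam1`, `lam2` are illustrative:

```python
def bigram_prob(w_prev, w, doc, coll_tf, coll_len, lam1=0.4, lam2=0.3):
    """Smoothed bigram probability P(w | w_prev, D).

    doc: list of tokens; coll_tf: dict of collection term frequencies;
    coll_len: total number of tokens in the collection.
    """
    pairs = list(zip(doc, doc[1:]))
    tf_pair = pairs.count((w_prev, w))           # tf_{w_prev w, D}
    tf_prev = doc.count(w_prev)                  # tf_{w_prev, D}
    p_bigram = tf_pair / tf_prev if tf_prev else 0.0
    p_unigram = doc.count(w) / len(doc)          # tf_{w, D} / |D|
    p_coll = coll_tf.get(w, 0) / coll_len        # cf_w / |C|
    return lam1 * p_bigram + lam2 * p_unigram + (1 - lam1 - lam2) * p_coll
```

Bigram pairs that actually occur in the document receive extra mass, so phrase-preserving documents rank higher than bag-of-words scoring alone would allow.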
Dependency Models
Pros:
More realistic assumptions
Improved retrieval effectiveness

Cons:
Less efficient
Limited notion of dependence
Not well understood
State of the Art IR
Desiderata
Our desired model should be able to:
Support standard IR tasks (ranking, query expansion, etc.)
Easily model dependencies between terms
Handle textual and non-textual features
Consistently and significantly improve effectiveness over bag of words models and existing dependence models

Proposed solution: Markov random fields
Markov Random Fields
MRFs provide a general, robust way of modeling a joint distribution.

The anatomy of an MRF:
Graph G
 • vertices represent random variables
 • edges encode dependence semantics
Potentials over the cliques of G
 • non-negative functions over clique configurations
 • measure the 'compatibility' of each configuration

P(X) = (1 / ZΛ) · ∏_{c ∈ Cliques(G)} ψ(c; Λ)
Markov Random Fields for IR
Modeling Dependencies
Parameter Tying
In theory, a potential function can be associated with every clique in the graph. The typical solution is to define potentials only over the maximal cliques of G, but we need more fine-grained control over our potentials, so we use clique sets:
A clique set is a set of cliques that share a parameter and a potential function
We identified 7 clique sets that are relevant to IR tasks
Clique Sets

Single term document/query cliques
TD = cliques with one query term + D
ψ(domestic, D), ψ(adoption, D), ψ(laws, D)

Ordered terms document/query cliques
OD = cliques with two or more contiguous query terms + D
ψ(domestic, adoption, D), ψ(adoption, laws, D), ψ(domestic, adoption, laws, D)

Unordered terms document/query cliques
UD = cliques with two or more query terms (in any order) + D
ψ(domestic, adoption, D), ψ(adoption, laws, D), ψ(domestic, laws, D), ψ(domestic, adoption, laws, D)
Single term query cliques
TQ = cliques with one query term
ψ(domestic), ψ(adoption), ψ(laws)

Ordered terms query cliques
OQ = cliques with two or more contiguous query terms
ψ(domestic, adoption), ψ(adoption, laws), ψ(domestic, adoption, laws)

Unordered terms query cliques
UQ = cliques with two or more query terms (in any order)
ψ(domestic, adoption), ψ(adoption, laws), ψ(domestic, laws), ψ(domestic, adoption, laws)

Document clique
D = singleton clique containing only D
ψ(D)
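The query-side clique sets above can be enumerated mechanically from the query terms; a sketch, with illustrative names:

```python
from itertools import combinations

def query_clique_sets(query_terms):
    """Enumerate the query-side clique sets described above:
    T_Q (single terms), O_Q (contiguous runs of 2+ terms),
    U_Q (unordered subsets of 2+ terms)."""
    n = len(query_terms)
    single = [(t,) for t in query_terms]
    ordered = [tuple(query_terms[i:j])
               for i in range(n) for j in range(i + 2, n + 1)]
    unordered = [c for k in range(2, n + 1)
                 for c in combinations(query_terms, k)]
    return {"T_Q": single, "O_Q": ordered, "U_Q": unordered}
```

For "domestic adoption laws", note that ψ(domestic, laws) appears only in U_Q, since those terms are not contiguous in the query.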
Features
Parameter Estimation
Given a set of relevance judgments R, we want the maximum a posteriori estimate:

Λ̂ = argmax_Λ P(Λ | R)

What are P(Λ | R), P(R | Λ), and P(Λ)? It depends on how the model is being evaluated! We want P(Λ | R) to be peaked around the parameter setting that maximizes the metric we are interested in.
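One simple way to find such a setting is a direct grid search over parameter values, scoring each candidate by the target evaluation metric on the judged queries; everything here (names, the toy ranking and metric functions) is an illustrative sketch, not the estimation procedure from the slides:

```python
from itertools import product

def grid_search(weight_grid, rank_fn, metric_fn, judgments):
    """Pick the parameter vector maximizing the evaluation metric.

    weight_grid: list of candidate values per parameter;
    rank_fn(weights) produces a ranking; metric_fn(ranking, judgments)
    scores it (e.g. mean average precision against judgments R).
    """
    best, best_score = None, float("-inf")
    for weights in product(*weight_grid):
        score = metric_fn(rank_fn(weights), judgments)
        if score > best_score:
            best, best_score = weights, score
    return best, best_score
```

Because the metric surface, not the likelihood, is being climbed, this directly targets the quantity the model is evaluated on.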
Parameter Estimation
Evaluation Metric Surfaces
Feature Induction
Query Expansion
Query Expansion Example

Original query: hubble telescope achievements
Applications
Application: Ad Hoc Retrieval
Ad Hoc Results
Application: Web Search
Application: XML Retrieval
Example XML Document

<PLAY>
  <TITLE>The Tragedy of Romeo and Juliet</TITLE>
  <ACT>
    <TITLE>ACT I</TITLE>
    <PROLOGUE>
      <TITLE>PROLOGUE</TITLE>
      <SPEECH>
        <SPEAKER>NARRATOR</SPEAKER>
        <LINE>Two households, both alike in dignity,</LINE>
        <LINE>In fair Verona, where we lay our scene,</LINE>
        <LINE>From ancient grudge break to new mutiny,</LINE>
        <LINE>Where civil blood makes civil hands unclean.</LINE>
        <LINE>From forth the fatal loins of these two foes</LINE>
        <LINE>A pair of star-cross’d lovers take their life;</LINE>
        …
      </SPEECH>
    </PROLOGUE>
    <SCENE>
      <TITLE>SCENE I. Verona. A public place.</TITLE>
      <SPEECH>
        <SPEAKER>SAMPSON</SPEAKER>
        <LINE>Gregory, o’ my word, we’ll not carry coals.</LINE>
        …
</PLAY>
Content and Structure Queries
NEXI query language
 • derived from XPath
 • allows a mixture of content and structure

//scene[about(., poison juliet)]
Return scene tags that are about poison and juliet.

//*[about(., plague both houses)]
Return elements (of any type) about "plague both houses".

//scene[about(., juliet chamber)]//speech[about(.//line, jewel)]
Return speech tags about jewel where the enclosing scene is about juliet chamber.
Application: Text Classification
Conclusions