
Modeling Diversity in Information Retrieval


Page 1: Modeling Diversity in   Information Retrieval

ACM SIGIR 2009 Workshop on Redundancy, Diversity, and Interdependent Document Relevance, July 23, 2009, Boston, MA

1

Modeling Diversity in Information Retrieval

ChengXiang (“Cheng”) Zhai

Department of Computer Science

Graduate School of Library & Information Science

Institute for Genomic Biology

Department of Statistics

University of Illinois, Urbana-Champaign

Joint work with John Lafferty, William Cohen, and Xuehua Shen

Page 2: Modeling Diversity in   Information Retrieval

Different Reasons for Diversification

• Redundancy reduction

• Diverse information needs
– Mixture of users

– Single user with an under-specified query

– Aspect retrieval

– Overview of results

• Active relevance feedback

• …

2

Page 3: Modeling Diversity in   Information Retrieval

Outline

• Risk minimization framework

• Capturing different needs for diversification

• Language models for diversification

3

Page 4: Modeling Diversity in   Information Retrieval

4

IR as Sequential Decision Making

User (information need)                    System (model of information need)

A1: Enter a query              →   Which documents to present? How to present them?
                               ←   Ri: results (i = 1, 2, 3, …)
Which documents to view?
A2: View a document            →   Which part of the document to show? How?
                               ←   R': document content
View more?
A3: Click on "Back" button     →   …

Page 5: Modeling Diversity in   Information Retrieval

5

Retrieval Decisions

User U:   A1, A2, …, At-1, At
System:   R1, R2, …, Rt-1
History:  H = {(Ai, Ri)}, i = 1, …, t-1
Document collection: C

Given U, C, At, and H, choose the best Rt ∈ r(At) from all possible responses to At.

Examples:
– At = query "Jaguar": r(At) = all possible rankings of C; the best Rt = the best ranking of C for the query.
– At = click on "Next" button: r(At) = all possible size-k subsets of unseen docs; the best Rt = the best k unseen docs.

Page 6: Modeling Diversity in   Information Retrieval

6

A Risk Minimization Framework

Observed: user U, interaction history H, current user action At, document collection C.
All possible responses: r(At) = {r1, …, rn}.
Inferred user model: M = (θU, S, …), where θU is the information need and S the seen docs.
Loss function: L(ri, At, M).

Optimal response r* minimizes the Bayes risk:

R_t = \arg\min_{r \in r(A_t)} \int_M L(r, A_t, M)\, P(M \mid U, H, A_t, C)\, dM

Page 7: Modeling Diversity in   Information Retrieval

7

A Simplified Two-Step Decision-Making Procedure

• Approximate the Bayes risk by the loss at the mode of the posterior distribution
• Two-step procedure
– Step 1: Compute an updated user model M* based on the currently available information
– Step 2: Given M*, choose a response to minimize the loss function

R_t = \arg\min_{r \in r(A_t)} \int_M L(r, A_t, M)\, P(M \mid U, H, A_t, C)\, dM
    \approx \arg\min_{r \in r(A_t)} L(r, A_t, M^*)\, P(M^* \mid U, H, A_t, C)
    = \arg\min_{r \in r(A_t)} L(r, A_t, M^*)

where M^* = \arg\max_M P(M \mid U, H, A_t, C)
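To make the two-step procedure concrete, here is a minimal Python sketch; the candidate responses, the toy unigram user model, and the negative-likelihood loss are all hypothetical stand-ins, not the framework's actual instantiation:

from collections import Counter

def estimate_user_model(query, history):
    """Step 1: point estimate M* of the user model -- here a simple
    unigram distribution over the words of the query and past queries."""
    words = query.split()
    for past_query, _response in history:
        words += past_query.split()
    counts = Counter(words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def loss(response, user_model):
    """Toy loss: negative 'likelihood' of the response under M*."""
    return -sum(user_model.get(w, 0.0) for w in response.split())

def best_response(query, history, candidates):
    """Step 2: choose the candidate response minimizing L(r, A_t, M*)."""
    m_star = estimate_user_model(query, history)
    return min(candidates, key=lambda r: loss(r, m_star))

print(best_response("jaguar car", [], ["jaguar animal habitat", "jaguar car review"]))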

Page 8: Modeling Diversity in   Information Retrieval

8

Optimal Interactive Retrieval

User issues A1 → the IR system infers M*1 = argmax_M P(M1 | U, H, A1, C) over collection C,
then responds with R1 = argmin_r L(r, A1, M*1).
User issues A2 → the system infers M*2 = argmax_M P(M2 | U, H, A2, C),
then responds with R2 = argmin_r L(r, A2, M*2).
User issues A3 → …

Page 9: Modeling Diversity in   Information Retrieval

Refinement of Risk Minimization

• At ∈ {“enter a query”, “click on Back button”, “click on Next button”, …}
• r(At): decision space (At-dependent)
– r(At) = all possible subsets of C + presentation strategies
– r(At) = all possible rankings of docs in C
– r(At) = all possible rankings of unseen docs
– …
• M: user model
– Essential component: θU = user information need
– S = seen documents
– n = “topic is new to the user”
• L(Rt, At, M): loss function
– Generally measures the utility of Rt for a user modeled as M
– Often encodes retrieval criteria (e.g., using M to select a ranking of docs)
• P(M | U, H, At, C): user model inference
– Often involves estimating a unigram language model θU

9

Page 10: Modeling Diversity in   Information Retrieval

10

Generative Model of Document & Query [Lafferty & Zhai 01]

User U (observed) → query model θQ (inferred) via p(θQ | U) → query q (observed) via p(q | θQ, U)
Source S (partially observed) → document model θD (inferred) via p(θD | S) → document d (observed) via p(d | θD, S)

Relevance: p(R | θQ, θD)

Page 11: Modeling Diversity in   Information Retrieval

11

Risk Minimization with Language Models [Lafferty & Zhai 01, Zhai & Lafferty 06]

Given query q, user U, document set C, and source S, the system chooses among
(D1, π1), (D2, π2), …, (Dn, πn): a subset of documents Di presented with strategy πi,
each choice incurring a loss L. The optimal choice minimizes the expected loss:

(D^*, \pi^*) = \arg\min_{D,\pi} \int_\Theta L(D, \pi, \theta)\, p(\theta \mid q, U, C, S)\, d\theta

Page 12: Modeling Diversity in   Information Retrieval

12

Optimal Ranking for Independent Loss

Decision space = {rankings}. Assume sequential browsing: the user views the ranked list
π = (d_{i_1}, …, d_{i_N}) from the top, and position j is viewed with weight s_j
(s_1 ≥ s_2 ≥ …). With an independent (additive) loss,

L(\pi, \theta) = \sum_{j=1}^{N} s_j\, l(d_{i_j}, \theta)

so the Bayes-optimal ranking is

\pi^* = \arg\min_\pi \sum_{j=1}^{N} s_j \int_\Theta l(d_{i_j}, \theta)\, p(\theta \mid q, U, C, S)\, d\theta

Independent risk = independent scoring: define the risk of a single document

r(d_k \mid q, U, C, S) = \int_\Theta l(d_k, \theta)\, p(\theta \mid q, U, C, S)\, d\theta

and rank documents in increasing order of r(d_k | q, U, C, S).

“Risk ranking principle” [Zhai 02, Zhai & Lafferty 06]
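Under an independent loss, the principle reduces ranking to per-document scoring. A minimal sketch (the precomputed risk table is a hypothetical stand-in for the integral above):

def rank_by_risk(docs, risk):
    """Risk ranking principle: with independent loss, the optimal ranking
    simply sorts documents by individual expected risk r(d | q, U, C, S)."""
    return sorted(docs, key=risk)  # ascending risk = most promising first

# Toy usage with a hypothetical precomputed risk table:
risks = {"d1": 0.7, "d2": 0.2, "d3": 0.4}
print(rank_by_risk(risks.keys(), risks.get))  # ['d2', 'd3', 'd1']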

Page 13: Modeling Diversity in   Information Retrieval

Risk Minimization for Diversification

• Redundancy reduction: loss function includes a redundancy/novelty measure
– Special case: list presentation + MMR [Zhai et al. 03]
• Diverse information needs: loss function defined on latent topics
– Special case: PLSA/LDA + aspect retrieval [Zhai 02]
• Active relevance feedback: loss function considers both relevance and benefit for feedback
– Special case: feedback only (hard queries) [Shen & Zhai 05]

13

Page 14: Modeling Diversity in   Information Retrieval

Subtopic Retrieval

Query: What are the applications of robotics in the world today?

Find as many DIFFERENT applications as possible.

Example subtopics:
A1: spot-welding robotics
A2: controlling inventory
A3: pipe-laying robots
A4: talking robot
A5: robots for loading & unloading memory tapes
A6: robot [telephone] operators
A7: robot cranes
…

Subtopic judgments (docs × subtopics A1 … Ak):
d1: 1 1 0 0 … 0 0
d2: 0 1 1 1 … 0 0
d3: 0 0 0 0 … 1 0
…
dk: 1 0 1 0 … 0 1

Need to model interdependent document relevance

Page 15: Modeling Diversity in   Information Retrieval

Diversify = Remove Redundancy [Zhai et al. 03]

15

Greedy Algorithm for Ranking: Maximal Marginal Relevance (MMR)

Documents are chosen sequentially: given already-presented d1, …, dk-1, pick the next
document to minimize its expected loss,

d_k^* = \arg\min_{d_k} r(d_k \mid d_1, …, d_{k-1}), \quad
r(d_k \mid d_1, …, d_{k-1}) = \int_\Theta l(d_k \mid d_1, …, d_{k-1}, \theta)\, p(\theta \mid q, U, C, S)\, d\theta

With the cost table below, the expected loss of presenting d_k is

l(d_k \mid d_1, …, d_{k-1}, \{\theta_i\}) = c_2\, p(Rel{=}1 \mid d_k)\,(1 - p(New \mid d_k)) + c_3\,(1 - p(Rel{=}1 \mid d_k))

which is rank-equivalent to selecting, at each step, the unseen document that maximizes

p(Rel{=}1 \mid d_k)\,\big(p(New \mid d_k) + c\big), \qquad c = \frac{c_3 - c_2}{c_2}

where c is the “willingness to tolerate redundancy”.

Cost table:
            NEW    NOT-NEW
  REL        0       c2
  NON-REL    c3      c3

c2 < c3, since a redundant relevant doc is better than a non-relevant doc.
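A minimal sketch of this greedy selection; the relevance table and the term-overlap novelty estimate are toy stand-ins, not the paper's actual estimators:

def mmr_rank(docs, p_rel, p_new, c=1.0):
    """Greedy MMR-style ranking: at each step pick the unseen document
    maximizing p(Rel|d) * (p(New|d, selected) + c), per the loss above.
    c = (c3 - c2) / c2 is the willingness to tolerate redundancy."""
    selected, remaining = [], set(docs)
    while remaining:
        d = max(remaining, key=lambda d: p_rel(d) * (p_new(d, selected) + c))
        selected.append(d)
        remaining.remove(d)
    return selected

# Toy usage: novelty = fraction of a doc's terms unseen so far (a crude stand-in).
doc_terms = {"d1": {"robot", "weld"}, "d2": {"robot", "weld", "arm"}, "d3": {"robot", "crane"}}
rel = {"d1": 0.9, "d2": 0.8, "d3": 0.7}
def novelty(d, sel):
    seen = set().union(*(doc_terms[s] for s in sel)) if sel else set()
    return len(doc_terms[d] - seen) / len(doc_terms[d])
print(mmr_rank(doc_terms, rel.get, novelty, c=0.5))  # ['d1', 'd3', 'd2']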

Page 16: Modeling Diversity in   Information Retrieval

A Mixture Model for Redundancy

A reference (already-seen) document defines p(w | Old); the collection defines the
background model p(w | Background). A candidate document d is modeled as a
two-component mixture with mixing weight λ:

p(w \mid d) = \lambda\, p(w \mid Background) + (1 - \lambda)\, p(w \mid Old)

p(New | d) = λ = probability of “new” content (estimated using EM).
p(New | d) can also be estimated using KL-divergence.
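A sketch of the EM estimate of λ = p(New|d) under the two-component mixture above; the word distributions here are toy stand-ins:

def estimate_new(doc_words, p_bg, p_old, iters=50):
    """EM for lambda in p(w|d) = lam*p(w|Background) + (1-lam)*p(w|Old)."""
    lam = 0.5
    for _ in range(iters):
        # E-step: posterior that each word came from the background ("new") component
        post = []
        for w in doc_words:
            pb, po = lam * p_bg.get(w, 1e-9), (1 - lam) * p_old.get(w, 1e-9)
            post.append(pb / (pb + po))
        # M-step: new lambda = average posterior responsibility
        lam = sum(post) / len(post)
    return lam  # = estimated p(New|d)

p_bg = {"the": 0.3, "robot": 0.1, "crane": 0.05, "weld": 0.05}
p_old = {"the": 0.3, "robot": 0.4, "weld": 0.2}
print(estimate_new(["robot", "crane", "the", "crane"], p_bg, p_old))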

Page 17: Modeling Diversity in   Information Retrieval

Evaluation metrics

• Intuitive goals:
– Should see documents from many different subtopics appear early in a ranking (subtopic coverage/recall)
– Should not see many different documents that cover the same subtopics (redundancy)
• How do we quantify these?
– One problem: the “intrinsic difficulty” of queries can vary.

Page 18: Modeling Diversity in   Information Retrieval

Evaluation metrics: a proposal

• Definition: Subtopic recall at rank K is the fraction of subtopics a such that at least one of d1, …, dK is relevant to a.
• Definition: minRank(S, r) is the smallest rank K such that the ranking produced by IR system S has subtopic recall r at rank K.
• Definition: Subtopic precision at recall level r for IR system S is

\text{S-precision}(r) = \frac{\text{minRank}(S_{opt}, r)}{\text{minRank}(S, r)}

This generalizes ordinary recall-precision metrics.

It does not explicitly penalize redundancy.
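These definitions are straightforward to compute for a given ranking; a minimal sketch (per-document subtopic sets are assumed given, and recall level r is assumed reachable in both rankings):

def subtopic_recall(ranking, subtopics_of, n_subtopics, k):
    """Fraction of subtopics covered by the top-k documents."""
    covered = set().union(*(subtopics_of[d] for d in ranking[:k]))
    return len(covered) / n_subtopics

def min_rank(ranking, subtopics_of, n_subtopics, r):
    """Smallest K at which the ranking reaches subtopic recall r."""
    for k in range(1, len(ranking) + 1):
        if subtopic_recall(ranking, subtopics_of, n_subtopics, k) >= r:
            return k
    return None  # recall r never reached

def s_precision(opt_ranking, sys_ranking, subtopics_of, n_subtopics, r):
    """Subtopic precision = minRank(S_opt, r) / minRank(S, r)."""
    return (min_rank(opt_ranking, subtopics_of, n_subtopics, r)
            / min_rank(sys_ranking, subtopics_of, n_subtopics, r))

Note that minRank(Sopt, r) requires the optimal ranking, which is NP-hard in general; a greedy approximation appears later.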

Page 19: Modeling Diversity in   Information Retrieval

Evaluation metrics: rationale

[Figure: recall vs. rank K curves for the optimal system, minRank(Sopt, r), and a real
system, minRank(S, r); subtopic precision at r is the ratio minRank(Sopt, r) / minRank(S, r),
ranging from 0.0 to 1.0.]

For subtopics, the shape of the minRank(Sopt, r) curve is neither predictable nor linear.

Page 20: Modeling Diversity in   Information Retrieval

Evaluating redundancy

Definition: the cost of a ranking d1, …, dK is

\text{cost}(d_1, …, d_K) = b \cdot K + a \sum_{i=1}^{K} |\text{subtopics}(d_i)|

where b is the cost of seeing a document and a is the cost of seeing a subtopic inside a document (the earlier metric corresponds to a = 0).
Definition: minCost(S, r) is the minimal cost at which recall r is obtained.
Definition: weighted subtopic precision at r is

\text{WS-precision}(r) = \frac{\text{minCost}(S_{opt}, r)}{\text{minCost}(S, r)}

We will use a = b = 1.

Page 21: Modeling Diversity in   Information Retrieval

Evaluation Metrics Summary

• Measure performance (size of ranking minRank, cost of ranking minCost) relative to optimal.
• Generalizes ordinary precision/recall.
• Possible problems:
– Computing minRank, minCost is NP-hard!
– A greedy approximation seems to work well for our data set (see the sketch below).
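Computing the optimal ranking is essentially a set-cover problem, so the classic greedy heuristic applies. A minimal sketch of this kind of approximation (not the authors' exact implementation):

def greedy_opt_ranking(docs, subtopics_of):
    """Greedy approximation of the optimal ranking for minRank/minCost:
    repeatedly pick the document covering the most not-yet-covered subtopics
    (the standard greedy set-cover heuristic)."""
    covered, ranking = set(), []
    remaining = set(docs)
    while remaining:
        d = max(remaining, key=lambda d: len(subtopics_of[d] - covered))
        ranking.append(d)
        covered |= subtopics_of[d]
        remaining.remove(d)
    return ranking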

Page 22: Modeling Diversity in   Information Retrieval

Experiment Design

• Dataset: TREC “interactive track” data
– London Financial Times: 210k docs, 500 MB
– 20 queries from TREC 6-8
• Subtopics: average 20, min 7, max 56
• Judged docs: average 40, min 5, max 100
• Non-judged docs assumed not relevant to any subtopic
• Baseline: relevance-based ranking (using language models)
• Two experiments
– Ranking only relevant documents
– Ranking all documents

Page 23: Modeling Diversity in   Information Retrieval

S-Precision: re-ranking relevant docs

Page 24: Modeling Diversity in   Information Retrieval

WS-precision: re-ranking relevant docs

Page 25: Modeling Diversity in   Information Retrieval

Results for ranking all documents

“Upper bound”: use subtopic names to build an explicit subtopic model.

Page 26: Modeling Diversity in   Information Retrieval

Summary: Remove Redundancy

• Mixture model is effective for identifying novelty in relevant documents

• Trading off novelty and relevance is hard

• Relevance seems to be dominating factor in TREC interactive-track data

Page 27: Modeling Diversity in   Information Retrieval

Diversity = Satisfy Diverse Info. Need [Zhai 02]

• Need to directly model latent aspects and then optimize results based on aspect/topic matching

• Reducing redundancy doesn’t ensure complete coverage of diverse aspects

27

Page 28: Modeling Diversity in   Information Retrieval

Aspect Generative Model of Document & Query

User U → query model θQ via p(θQ | U, λ) → query q via p(q | θQ, λ)
Source S → document model θD via p(θD | S, λ) → document d via p(d | θD, λ)

λ = (λ1, …, λk): aspect (topic) models

PLSI: p(d \mid \theta_D, \lambda) = \prod_{i=1}^{n} \sum_{a=1}^{A} p(a \mid \theta_D)\, p(d_i \mid \lambda_a), \quad \text{where } d = d_1 d_2 \cdots d_n

LDA: p(d \mid \theta_D, \lambda) = \int \prod_{i=1}^{n} \sum_{a=1}^{A} p(a \mid \pi)\, p(d_i \mid \lambda_a)\, \mathrm{Dir}(\pi \mid \theta_D)\, d\pi

Page 29: Modeling Diversity in   Information Retrieval

Aspect Loss Function

Under the aspect generative model, the loss of presenting d_k after d_1, …, d_{k-1} is the
KL-divergence between the desired aspect coverage and the combined coverage achieved so far:

l(d_k \mid d_1, …, d_{k-1}, \theta_Q, \{\theta_i\}) = \Delta\big(\hat\theta_Q \,\big\|\, \hat\theta_{D_{1,…,k}}\big)

where the combined aspect coverage mixes the new candidate with the already-shown documents
using a novelty coefficient γ:

p(a \mid \theta_{1,…,k}) = (1-\gamma)\, p(a \mid \theta_k) + \frac{\gamma}{k-1} \sum_{i=1}^{k-1} p(a \mid \theta_i)

Page 30: Modeling Diversity in   Information Retrieval

Aspect Loss Function: Illustration

[Illustration: the desired coverage p(a | θQ) is compared against the combined coverage
formed by the “already covered” aspects p(a | θ1), …, p(a | θk-1) plus a new candidate
p(a | θk). Depending on its aspect distribution, a candidate can be non-relevant,
redundant, or perfect.]

Page 31: Modeling Diversity in   Information Retrieval

Evaluation Measures

• Aspect Coverage (AC): measures per-doc coverage = #distinct-aspects / #docs
• Aspect Uniqueness (AU): measures redundancy = #distinct-aspects / #aspects
• Example (subtopic vectors):
d1 = 0001001, d2 = 0101100, d3 = 1000101

               after d1     after d2     after d3
#docs             1            2            3
#aspects          2            5            8
#uniq-aspects     2            4            5
AC             2/1 = 2.0    4/2 = 2.0    5/3 = 1.67
AU             2/2 = 1.0    4/5 = 0.8    5/8 = 0.625
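A sketch computing AC and AU along a ranking; it reproduces the example's numbers, with subtopic vectors represented as sets of aspect positions:

def ac_au(ranking, aspects_of):
    """Yield (rank, AC, AU) after each document: AC = #distinct / #docs,
    AU = #distinct / #aspects-seen (counting repeats)."""
    distinct, total = set(), 0
    for k, d in enumerate(ranking, start=1):
        total += len(aspects_of[d])
        distinct |= aspects_of[d]
        yield k, len(distinct) / k, len(distinct) / total

aspects = {"d1": {4, 7}, "d2": {2, 4, 5}, "d3": {1, 5, 7}}
for k, ac, au in ac_au(["d1", "d2", "d3"], aspects):
    print(k, round(ac, 2), round(au, 3))  # matches 2.0/1.0, 2.0/0.8, 1.67/0.625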

Page 32: Modeling Diversity in   Information Retrieval

Effectiveness of Aspect Loss Function (PLSI)

                              Aspect Coverage             Aspect Uniqueness
Data set        Novelty γ     Prec (γ)      AC (γ)        Prec (γ)      AU (γ)
Mixed Data      γ = 0         0.265 (0)     0.845 (0)     0.265 (0)     0.355 (0)
                γ > 0         0.249 (0.8)   1.286 (0.8)   0.263 (0.6)   0.344 (0.6)
                Improve       -6.0%         +52.2%        -0.8%         -3.1%
Relevant Data   γ = 0         1 (0)         1.772 (0)     1 (0)         0.611 (0)
                γ > 0         1 (0.1)       2.153 (0.1)   1 (0.9)       0.685 (0.9)
                Improve       0%            +21.5%        0%            +12.1%

(Combined coverage: p(a | θ1,…,k) = (1-γ) p(a | θk) + γ/(k-1) Σ_{i<k} p(a | θi).)

Page 33: Modeling Diversity in   Information Retrieval

Effectiveness of Aspect Loss Function (LDA)

                              Aspect Coverage             Aspect Uniqueness
Data set        Novelty γ     Prec (γ)      AC (γ)        Prec (γ)      AU (γ)
Mixed Data      γ = 0         0.277 (0)     0.863 (0)     0.277 (0)     0.318 (0)
                γ > 0         0.273 (0.5)   0.897 (0.5)   0.259 (0.9)   0.348 (0.9)
                Improve       -1.4%         +3.9%         -6.5%         +9.4%
Relevant Data   γ = 0         1 (0)         1.804 (0)     1 (0)         0.631 (0)
                γ > 0         1 (0.99)      1.866 (0.99)  1 (0.99)      0.705 (0.99)
                Improve       0%            +3.4%         0%            +11.7%

Page 34: Modeling Diversity in   Information Retrieval

Comparison of 4 MMR Methods

                 Mixed Data                     Relevant Data
MMR Method       AC Improve      AU Improve     AC Improve      AU Improve
CC (γ)           0% (+)          0% (+)         +2.6% (1.5)     +13.8% (1.5)
QB (γ)           0% (0)          0% (0)         +1.8% (0.6)     +5.6% (0.99)
MQM (γ)          +0.2% (0.4)     +1.0% (0.95)   +0.2% (0.1)     +1.2% (0.9)
MDM (γ)          +1.5% (0.5)     +2.2% (0.5)    0% (0.1)        +1.1% (0.5)

CC - Cost-based Combination
QB - Query Background Model
MQM - Query Marginal Model
MDM - Document Marginal Model

Page 35: Modeling Diversity in   Information Retrieval

Summary: Diverse Information Need

• Mixture model is effective for capturing latent topics

• Direct modeling of latent aspects/topics is more effective than indirect modeling through MMR in improving aspect coverage, but MMR is better for improving aspect uniqueness

• With direct topic modeling and matching, aspect coverage can be improved at the price of lower relevance-based precision

Page 36: Modeling Diversity in   Information Retrieval

Diversify = Active Feedback [Shen & Zhai 05]

Decision problem: decide the subset D = {d1, …, dk} of documents to present to the user
for relevance judgment.

D^* = \arg\min_D \int_\Theta L(D, \theta)\, p(\theta \mid U, q, C)\, d\theta

L(D, \theta) = \sum_{j} l(D, j, \theta)\, p(j \mid D, U, \theta)

where j = (j_1, …, j_k) ranges over the possible relevance judgments of the documents in D.

Page 37: Modeling Diversity in   Information Retrieval

Independent Loss

Assume the loss decomposes over the selected documents:

l(D, j, \theta) = \sum_{i=1}^{k} l(d_i, j_i, \theta)

so that

L(D, \theta) = \sum_{i=1}^{k} \sum_{j_i} l(d_i, j_i, \theta)\, p(j_i \mid d_i, U, \theta)

Each document can then be scored independently by its risk

r(d_i) = \int_\Theta \sum_{j_i} l(d_i, j_i, \theta)\, p(j_i \mid d_i, U, \theta)\, p(\theta \mid U, q, C)\, d\theta

and the optimal set is

D^* = \arg\min_D \sum_{i=1}^{k} r(d_i)

Page 38: Modeling Diversity in   Information Retrieval

Independent Loss (cont.)

Uncertainty Sampling: with log-loss

l(d_i, 1, \theta) = -\log p(R{=}1 \mid d_i, \theta), \qquad l(d_i, 0, \theta) = -\log p(R{=}0 \mid d_i, \theta)

the risk of a document is the expected entropy of its relevance variable:

r(d_i) = \int_\Theta H(R \mid d_i, \theta)\, p(\theta \mid U, q, C)\, d\theta

→ select the k documents whose relevance status is most uncertain.

Top K: with constant losses l(d_i, 1, \theta) = C_1 and l(d_i, 0, \theta) = C_0 for all d_i ∈ C,
where C_1 < C_0,

r(d_i) = C_0 + (C_1 - C_0) \int_\Theta p(j_i{=}1 \mid d_i, U, \theta)\, p(\theta \mid U, q, C)\, d\theta

→ minimizing risk selects the k documents most likely to be relevant, i.e., ordinary
top-k feedback (see the sketch below).
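A sketch contrasting the two special cases, given per-document relevance probabilities (a stand-in for the integral over θ):

import math

def top_k(p_rel, k):
    """Constant losses (C1 < C0): pick the k most likely relevant docs."""
    return sorted(p_rel, key=p_rel.get, reverse=True)[:k]

def uncertainty_sampling(p_rel, k):
    """Log-loss: pick the k docs with highest relevance entropy H(R|d)."""
    def entropy(d):
        p = min(max(p_rel[d], 1e-12), 1 - 1e-12)
        return -(p * math.log(p) + (1 - p) * math.log(1 - p))
    return sorted(p_rel, key=entropy, reverse=True)[:k]

p = {"d1": 0.95, "d2": 0.52, "d3": 0.10, "d4": 0.60}
print(top_k(p, 2))                 # ['d1', 'd4'] -- most relevant
print(uncertainty_sampling(p, 2))  # ['d2', 'd4'] -- most uncertain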

Page 39: Modeling Diversity in   Information Retrieval

Dependent Loss

A dependent loss combines the relevance of the selected set, \sum_{i=1}^{k} p(j_i{=}1 \mid d_i, U, \theta),
with a diversity measure Δ(D) over the set D.

Heuristics: consider relevance first, then diversity (see the sketch below):
– Gapped Top K: take every (G+1)-th document from the relevance ranking, so
  N = G(K-1) + K top documents are needed to select K.
– K-cluster centroid: select the top N documents, cluster them into K clusters, and
  present the K cluster centroids (an MMR-flavored diversification).
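A sketch of the two diversity heuristics on top of a relevance ranking; scikit-learn's KMeans is a stand-in for whatever clustering the system actually uses, and document vectors are assumed given:

def gapped_top_k(ranking, k, gap):
    """Gapped Top K: take every (gap+1)-th document from the ranking.
    Assumes len(ranking) >= gap*(k-1) + k."""
    return [ranking[i * (gap + 1)] for i in range(k)]

def k_cluster_centroids(ranking, k, n, vector_of):
    """Cluster the top n docs into k groups; return the doc nearest each centroid."""
    import numpy as np
    from sklearn.cluster import KMeans  # stand-in clustering choice
    docs = ranking[:n]
    X = np.array([vector_of[d] for d in docs])
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(X)
    picks = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        centroid = X[members].mean(axis=0)
        best = min(members, key=lambda i: np.linalg.norm(X[i] - centroid))
        picks.append(docs[best])
    return picks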

Page 40: Modeling Diversity in   Information Retrieval

Illustration of Three AF Methods

[Illustration: from a ranked list 1, 2, 3, …, 16, …, Top-K (normal feedback) selects the
top K documents; Gapped Top-K skips G documents between selections; K-cluster centroid
selects one representative per cluster, aiming at high diversity.]

Page 41: Modeling Diversity in   Information Retrieval

Evaluating Active Feedback

[Flow: a query first produces initial results (no feedback). Each AF method (Top-K,
Gapped, Clustering) selects K docs; their relevance labels (+/-) are taken from the
judgment file; the judged docs drive feedback, producing feedback results that are
compared against the initial results.]

Page 42: Modeling Diversity in   Information Retrieval

Retrieval Methods (Lemur toolkit)

Kullback-Leibler divergence scoring: score a document D against query Q by
-D(\theta_Q \,\|\, \theta_D), with query model θQ and document model θD.

Active feedback selects the feedback docs F = {d1, …, dn}; mixture-model feedback then
updates the query model:

\theta_Q' = (1 - \alpha)\, \theta_Q + \alpha\, \theta_F

Only learn from relevant docs. Default parameter settings unless otherwise stated.
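A sketch of the KL scoring and the interpolated query-model update; smoothing and the feedback model θF are simplified stand-ins for what Lemur actually implements:

import math

def kl_score(theta_q, theta_d, eps=1e-9):
    """Rank-equivalent form of -D(theta_Q || theta_D):
    sum_w p(w|theta_Q) * log p(w|theta_D) (query entropy is constant per query)."""
    return sum(pq * math.log(theta_d.get(w, eps)) for w, pq in theta_q.items())

def update_query_model(theta_q, theta_f, alpha=0.5):
    """theta_Q' = (1 - alpha) * theta_Q + alpha * theta_F."""
    words = set(theta_q) | set(theta_f)
    return {w: (1 - alpha) * theta_q.get(w, 0.0) + alpha * theta_f.get(w, 0.0)
            for w in words}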

Page 43: Modeling Diversity in   Information Retrieval

Comparison of Three AF Methods

Collection   Active FB Method   #Rel   MAP      Pr@10doc
HARD         Top-K              146    0.325    0.527
             Gapped             150    0.330    0.548
             Clustering         105    0.332    0.565
AP88-89      Top-K              198    0.228    0.351
             Gapped             180    0.234*   0.389*
             Clustering         118    0.237    0.393

(Results include judged docs; bold font = worst, * = best.)
Top-K is the worst! Clustering uses the fewest relevant docs.

Page 44: Modeling Diversity in   Information Retrieval

Appropriate Evaluation of Active Feedback

Three evaluation settings:

– Original DB with judged docs (AP88-89, HARD): can't tell if the ranking of un-judged
  documents is improved.
– Original DB without judged docs: different methods end up with different test documents.
– New DB (AP88-89 → AP90): shows the learning effect more explicitly, but the new docs
  must be similar to the original docs.

Page 45: Modeling Diversity in   Information Retrieval

Comparison of Different Test Data

Test Data                     Active FB Method   #Rel   MAP     Pr@10doc
AP88-89 (incl. judged docs)   Top-K              198    0.228   0.351
                              Gapped             180    0.234   0.389
                              Clustering         118    0.237   0.393
AP90                          Top-K              198    0.220   0.321
                              Gapped             180    0.222   0.326
                              Clustering         118    0.223   0.325

Clustering generates fewer, but higher-quality examples. Top-K is consistently the worst!

Page 46: Modeling Diversity in   Information Retrieval

Summary: Active Feedback

• Presenting the top-k is not the best strategy

• Clustering can generate fewer, higher quality feedback examples

Page 47: Modeling Diversity in   Information Retrieval

Conclusions

• There are many reasons for diversifying search results (redundancy, diverse information needs, active feedback)

• Risk minimization framework can model all these cases of diversification

• Different scenarios may need different techniques and different evaluation measures

47

Page 48: Modeling Diversity in   Information Retrieval

References

• Risk Minimization
– [Lafferty & Zhai 01] John Lafferty and ChengXiang Zhai. Document language models, query models, and risk minimization for information retrieval. In Proceedings of ACM SIGIR 2001, pages 111-119.
– [Zhai & Lafferty 06] ChengXiang Zhai and John Lafferty. A risk minimization framework for information retrieval. Information Processing and Management, 42(1), Jan. 2006, pages 31-55.
• Subtopic Retrieval
– [Zhai et al. 03] ChengXiang Zhai, William Cohen, and John Lafferty. Beyond independent relevance: methods and evaluation metrics for subtopic retrieval. In Proceedings of ACM SIGIR 2003.
– [Zhai 02] ChengXiang Zhai. Language Modeling and Risk Minimization in Text Retrieval. Ph.D. thesis, Carnegie Mellon University, 2002.
• Active Feedback
– [Shen & Zhai 05] Xuehua Shen and ChengXiang Zhai. Active feedback in ad hoc information retrieval. In Proceedings of ACM SIGIR 2005, pages 59-66.


48

Page 49: Modeling Diversity in   Information Retrieval

49

Thank You!