Diversifying Search Results
Rakesh Agrawal, Sreenivas Gollapudi, Alan Halverson, Samuel Ieong
Search Labs, Microsoft Research
rakesha@microsoft.com, sreenig@microsoft.com, alanhal@microsoft.com, saieong@microsoft.com
WSDM ’09
Outline
• Introduction
• Problem Formulation
• A Greedy Algorithm for DIVERSIFY(K)
• Performance Metrics
• Evaluation
• Conclusions
Introduction
• Minimize the risk of dissatisfaction for the average user
• Assume that there exist:
  – a taxonomy of information
  – a model of user intents
• Consider both the relevance of the documents and the diversity of the search results
• Trade off relevance against diversity
Problem Formulation
• Allocating the number of results shown for each category in proportion to the percentage of users interested in that category may perform poorly
• Example: query "Flash"
  – Technology: 0.6
Problem Formulation
• The objective is defined over an unordered set of results
• Our algorithm is also designed to generate an ordering of results rather than just a set of results
Problem Formulation
• DIVERSIFY(k) is NP-hard
• The documents optimal for DIVERSIFY(k-1) need not be a subset of the documents optimal for DIVERSIFY(k)
• Example: p(c1|q) = p(c2|q) = 0.5
  – DIVERSIFY(1): d1, d2, d3
  – DIVERSIFY(2): d2, d3, d1
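The non-subset property above can be made concrete with hypothetical satisfaction probabilities V(d|q,c) (the numbers below are illustrative, not from the paper). Under the set objective P(S|q) = Σ_c p(c|q)·(1 − Π_{d∈S}(1 − V(d|q,c))), a document that is mediocre for both intents can win for k = 1 yet be excluded from the optimal pair:

```python
from itertools import combinations

# Hypothetical numbers chosen to reproduce the slide's scenario:
# p(c1|q) = p(c2|q) = 0.5, three candidate documents.
p = {'c1': 0.5, 'c2': 0.5}
V = {('d1', 'c1'): 0.5, ('d1', 'c2'): 0.5,   # d1: mediocre for both intents
     ('d2', 'c1'): 0.9,                      # d2: strong for c1 only
     ('d3', 'c2'): 0.9}                      # d3: strong for c2 only

def objective(S):
    """P(S|q): probability that some document in S satisfies the user."""
    total = 0.0
    for c, pc in p.items():
        miss = 1.0  # probability no document in S satisfies intent c
        for d in S:
            miss *= 1.0 - V.get((d, c), 0.0)
        total += pc * (1.0 - miss)
    return total

docs = ['d1', 'd2', 'd3']
best1 = max(combinations(docs, 1), key=objective)  # best single result
best2 = max(combinations(docs, 2), key=objective)  # best pair of results
```

Here `best1` is `('d1',)` with value 0.5, while `best2` is `('d2', 'd3')` with value 0.9, so the k = 1 optimum is not contained in the k = 2 optimum.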
A Greedy Algorithm for DIVERSIFY(K)
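The paper's greedy IA-Select procedure repeatedly picks the document with the highest marginal utility, then discounts the intents that document already covers. A minimal sketch (variable names and the `quality` lookup are illustrative):

```python
def ia_select(k, categories, docs, p_cat, quality):
    """Greedy IA-Select sketch.
    p_cat[c]        ~ P(c|q): probability the query q belongs to category c
    quality[(d, c)] ~ V(d|q,c): probability document d satisfies intent c
    """
    # U(c|q): probability that intent c is still unsatisfied so far
    U = dict(p_cat)
    selected = []
    remaining = list(docs)
    while len(selected) < k and remaining:
        # Pick the document with the highest marginal utility g(d)
        best = max(remaining,
                   key=lambda d: sum(U[c] * quality.get((d, c), 0.0)
                                     for c in categories))
        selected.append(best)
        remaining.remove(best)
        # Discount categories the chosen document already covers
        for c in categories:
            U[c] *= 1.0 - quality.get((best, c), 0.0)
    return selected
```

Because the objective is submodular, this greedy selection achieves the usual (1 − 1/e) approximation guarantee cited in the paper.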
Performance Metrics
• NDCG, MRR, and MAP do not take into account the value of diversification
• Intent-aware measure example:
  – p(c2|q) >> p(c1|q)
  – d1 is Excellent for c1 (but unrelated to c2)
  – d2 is Good for c2 (but unrelated to c1)
  – Classical IR metrics prefer: d1, d2
  – Intent-aware measures prefer: d2, d1
Intent Aware Measure
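The formula on this slide did not survive extraction. In the paper, an intent-aware metric averages its classical counterpart over intents, weighted by the query's intent distribution, e.g. NDCG-IA(S, k) = Σ_c P(c|q) · NDCG(S, k | c). A minimal sketch (helper names are ours):

```python
import math

def ndcg_at_k(gains, k):
    """Classical NDCG@k for a ranked list of graded gain values."""
    dcg = sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))
    ideal = sorted(gains, reverse=True)
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

def ndcg_ia(ranking, k, p_cat, gain):
    """NDCG-IA: per-intent NDCG averaged over the intent distribution.
    gain[(d, c)] is the graded relevance of document d for intent c."""
    return sum(p * ndcg_at_k([gain.get((d, c), 0.0) for d in ranking], k)
               for c, p in p_cat.items())
```

On the slide's example, with p(c2|q) >> p(c1|q), ranking d2 before d1 scores higher under NDCG-IA even though d1 has the higher individual grade.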
Evaluation
• Evaluate our approach against three commercial search engines
• Conduct three sets of experiments
• The experiments differ in how the distributions of intents and the relevance of the documents are obtained
Experiment 1
• Obtain the distributions of intents for both queries and documents via standard classifiers
• Obtain the relevance of documents from a proprietary repository of human judgments to which we have been granted access
• Dataset: 10,000 random queries with the top 50 documents each
• For each query, many of the top 10 documents carry human judgments
Experiment 1
• Sample about 900 queries such that:
  – each belongs to at least two categories
  – a significant fraction of the associated documents have human judgments
Experiment 2
• Obtain the distributions of intents for queries and the document relevance using the Amazon Mechanical Turk platform
• Sample 200 queries from the dataset, each belonging to at least three categories
• Submit these queries, along with the three most likely categories as estimated by the classifier and the top five results produced by IA-Select, to the Mechanical Turk workers
Experiment 3
• IA-Select: p(c|q) obtained from the Amazon Mechanical Turk platform
• Metrics: p(c|q) and the document relevance are the same as those used in Experiment 1
Conclusions
• Provide a greedy algorithm with good approximation guarantees
• To evaluate the effectiveness of our approach, propose generalizations of well-studied metrics that take into account the intents of the users
• Our approach outperforms the results produced by commercial search engines on all of the metrics