Diversifying Search Results
Rakesh Agrawal, Sreenivas Gollapudi, Alan Halverson, Samuel Ieong
Search Labs, Microsoft Research
rakesha@microsoft.com, sreenig@microsoft.com, alanhal@microsoft.com, saieong@microsoft.com
WSDM ’09
Outline
• Introduction
• Problem Formulation
• A Greedy Algorithm for DIVERSIFY(K)
• Performance Metrics
• Evaluation
• Conclusions
Introduction
• Minimize the risk of dissatisfaction for the average user
• Assume that there exist:
  – a taxonomy of information
  – a model of user intents
• Consider both the relevance of the documents and the diversity of the search results
• Trade off relevance against diversity
Problem Formulation
• Allocating the number of results shown for each category in proportion to the percentage of users interested in that category may perform poorly
• Example: query "Flash"
  – Technology: 0.6
Problem Formulation
• The objective is defined over an unordered set of results
• Our algorithm is also designed to generate an ordering of results rather than just a set of results
Problem Formulation
• DIVERSIFY(k) is NP-hard
• The documents optimal for DIVERSIFY(k-1) need not be a subset of the documents optimal for DIVERSIFY(k)
• Example: p(c1|q) = p(c2|q) = 0.5
  – DIVERSIFY(1): d1, d2, d3
  – DIVERSIFY(2): d2, d3, d1
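The non-subset property above can be made concrete with hypothetical satisfaction probabilities V(d|q,c) (the numbers below are illustrative, not from the paper). Under the set objective P(S|q) = Σ_c p(c|q)·(1 − Π_{d∈S}(1 − V(d|q,c))), a document that is mediocre for both intents can win for k = 1 yet be excluded from the optimal pair:

```python
from itertools import combinations

# Hypothetical numbers chosen to reproduce the slide's scenario:
# p(c1|q) = p(c2|q) = 0.5, three candidate documents.
p = {'c1': 0.5, 'c2': 0.5}
V = {('d1', 'c1'): 0.5, ('d1', 'c2'): 0.5,   # d1: mediocre for both intents
     ('d2', 'c1'): 0.9,                      # d2: strong for c1 only
     ('d3', 'c2'): 0.9}                      # d3: strong for c2 only

def objective(S):
    """P(S|q): probability that some document in S satisfies the user."""
    total = 0.0
    for c, pc in p.items():
        miss = 1.0  # probability no document in S satisfies intent c
        for d in S:
            miss *= 1.0 - V.get((d, c), 0.0)
        total += pc * (1.0 - miss)
    return total

docs = ['d1', 'd2', 'd3']
best1 = max(combinations(docs, 1), key=objective)  # best single result
best2 = max(combinations(docs, 2), key=objective)  # best pair of results
```

Here `best1` is `('d1',)` with value 0.5, while `best2` is `('d2', 'd3')` with value 0.9, so the k = 1 optimum is not contained in the k = 2 optimum.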
A Greedy Algorithm for DIVERSIFY(K)
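The paper's greedy IA-Select procedure repeatedly picks the document with the highest marginal utility, then discounts the intents that document already covers. A minimal sketch (variable names and the `quality` lookup are illustrative):

```python
def ia_select(k, categories, docs, p_cat, quality):
    """Greedy IA-Select sketch.
    p_cat[c]        ~ P(c|q): probability the query q belongs to category c
    quality[(d, c)] ~ V(d|q,c): probability document d satisfies intent c
    """
    # U(c|q): probability that intent c is still unsatisfied so far
    U = dict(p_cat)
    selected = []
    remaining = list(docs)
    while len(selected) < k and remaining:
        # Pick the document with the highest marginal utility g(d)
        best = max(remaining,
                   key=lambda d: sum(U[c] * quality.get((d, c), 0.0)
                                     for c in categories))
        selected.append(best)
        remaining.remove(best)
        # Discount categories the chosen document already covers
        for c in categories:
            U[c] *= 1.0 - quality.get((best, c), 0.0)
    return selected
```

Because the objective is submodular, this greedy selection achieves the usual (1 − 1/e) approximation guarantee cited in the paper.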
Performance Metrics
• NDCG, MRR, and MAP do not take into account the value of diversification
• Intent-aware measure example:
  – p(c2|q) >> p(c1|q)
  – d1 is Excellent for c1 (but unrelated to c2)
  – d2 is Good for c2 (but unrelated to c1)
  – Classical IR metrics prefer: d1, d2
  – Intent-aware measures prefer: d2, d1
Intent Aware Measure
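The formula on this slide did not survive extraction. In the paper, an intent-aware metric averages its classical counterpart over intents, weighted by the query's intent distribution, e.g. NDCG-IA(S, k) = Σ_c P(c|q) · NDCG(S, k | c). A minimal sketch (helper names are ours):

```python
import math

def ndcg_at_k(gains, k):
    """Classical NDCG@k for a ranked list of graded gain values."""
    dcg = sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))
    ideal = sorted(gains, reverse=True)
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

def ndcg_ia(ranking, k, p_cat, gain):
    """NDCG-IA: per-intent NDCG averaged over the intent distribution.
    gain[(d, c)] is the graded relevance of document d for intent c."""
    return sum(p * ndcg_at_k([gain.get((d, c), 0.0) for d in ranking], k)
               for c, p in p_cat.items())
```

On the slide's example, with p(c2|q) >> p(c1|q), ranking d2 before d1 scores higher under NDCG-IA even though d1 has the higher individual grade.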
Evaluation
• Evaluate our approach against three commercial search engines
• Conduct three sets of experiments
• The experiments differ in how the distributions of intents and the relevance of the documents are obtained
Experiment 1
• Obtain the distributions of intents for both queries and documents via standard classifiers
• Obtain the relevance of documents from a proprietary repository of human judgments to which we have been granted access
• Dataset: 10,000 random queries with the top 50 documents each
• For each query, many of the top 10 documents carry human judgments
Experiment 1
• Sample about 900 queries such that:
  – each belongs to at least two categories
  – a significant fraction of the associated documents have human judgments
Experiment 2
• Obtain the distributions of intents for queries and the document relevance using the Amazon Mechanical Turk platform
• Sample 200 queries from the dataset, each belonging to at least three categories
• Submit these queries, along with the three most likely categories as estimated by the classifier and the top five results produced by IA-Select, to the Mechanical Turk workers
Experiment 3
• IA-Select: p(c|q) obtained from the Amazon Mechanical Turk platform
• Metrics: p(c|q) and the document relevance are the same as those used in Experiment 1
Conclusions
• Provide a greedy algorithm with good approximation guarantees
• To evaluate the effectiveness of our approach, propose generalizations of well-studied metrics that take into account the intents of the users
• Our approach outperforms the results produced by commercial search engines on all of the metrics