Tutorial on Query Auto-Completion

Yichen Feng, feng36 AT illinois DOT edu

University of Illinois at Urbana-Champaign

Prepared as an assignment for CS410: Text Information Systems in Spring 2016

Query Auto-Completion

• What is Query Auto-Completion (QAC)?
  – Giving search suggestions based on typed prefixes by considering the search history log, query popularity, temporal factors, and personal interests.

QAC is Important

• Speeds up users’ input and improves efficiency
• Suggests possible queries
• Corrects users’ typing errors
• Users may not know how to describe the information they need
• Speed and accuracy
• Minimizes users’ cognitive and physical effort

QAC is Everywhere

(Screenshots of QAC in Piazza, Facebook, Gmail, Amazon, USA Government, and Coursera)

Most Popular Completion

• Traditional QAC (Most Popular Completion, MPC)
  – Queries are suggested according to their past popularity (Mawarkar and Malemath, 2015)
  – Ranked by how often each query occurred in the log
  – Data structure: trie
  – Always treated as the baseline
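The baseline above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the class names `TrieNode` and `MostPopularCompletion` and the toy query log are assumptions made for the example. Each logged query is inserted into a character trie, and completions of a prefix are ranked by how often they were seen.

```python
from collections import Counter


class TrieNode:
    def __init__(self):
        self.children = {}  # next character -> TrieNode
        self.count = 0      # how often a logged query ends at this node


class MostPopularCompletion:
    """Hypothetical MPC baseline: a trie over past queries, ranked by frequency."""

    def __init__(self, query_log):
        self.root = TrieNode()
        for query, count in Counter(query_log).items():
            node = self.root
            for ch in query:
                node = node.children.setdefault(ch, TrieNode())
            node.count += count

    def complete(self, prefix, k=3):
        # Walk down to the node that represents the typed prefix.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        # Collect every logged query below that node.
        found = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.count:
                found.append((text, current.count))
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        # Most frequent first; ties broken alphabetically.
        found.sort(key=lambda item: (-item[1], item[0]))
        return [query for query, _ in found[:k]]


log = ["disney", "dictionary", "disney", "dishes", "dictionary", "disney"]
mpc = MostPopularCompletion(log)
top = mpc.complete("di", k=2)
```

With this toy log, `top` lists "disney" before "dictionary" because it was logged three times rather than twice. The ranking never changes with time or user, which is the limitation the following slides address.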

QAC Challenges

• Cannot capture temporally popular topics
• Cannot treat different people differently
• Cannot take users’ behaviors (e.g. clicks) into account
• Poor performance on mobile devices
• Needs to be optimized

Solutions

• Time-sensitive QAC
  – Robust vs. Recent
• Personalized QAC
  – User behaviors
  – Context-based QAC
• Time-sensitive Personalized QAC (Hybrid model)
• Optimizing search results presentation
• Term-by-term QAC for mobile search
• QAC for rare prefixes

Time-Sensitive QAC (SIGIR 12)

• Time-sensitive: query popularity changes over time
  – “di-”: Dictionary on weekdays, Disney on weekends
• Key idea:
  – Predicting query popularity
    • Forecast quality
    • Success & failure analysis
    • Temporal model selector
  – Rely on shorter but more frequent aggregations of data, and model the overall query trends as time series.
• Method: Time-sensitive auto-completion
  – Candidates are ranked by the estimated frequency of query q at time t

M. Shokouhi and K. Radinsky. Time-sensitive query auto-completion. In SIGIR ’12, pages 601–610, 2012.
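Ranking by a forecast rather than by the total past count can be sketched as follows. This is a minimal sketch that assumes single exponential smoothing as the forecaster, which stands in for whatever time-series model is actually selected. The function names, the smoothing weight `alpha`, and the daily counts are all hypothetical.

```python
def forecast_next(counts, alpha=0.5):
    """One-step-ahead forecast using single exponential smoothing (assumed model)."""
    if not counts:
        return 0.0
    level = float(counts[0])
    for observed in counts[1:]:
        level = alpha * observed + (1 - alpha) * level
    return level


def time_sensitive_rank(prefix, history, k=3, alpha=0.5):
    """Rank completions of `prefix` by forecast frequency instead of total count."""
    scored = [
        (query, forecast_next(counts, alpha))
        for query, counts in history.items()
        if query.startswith(prefix)
    ]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return [query for query, _ in scored[:k]]


# Hypothetical per-day counts, oldest first.
history = {
    "dictionary": [90, 80, 40, 20],
    "disney": [10, 20, 60, 100],
    "dog food": [50, 50, 50, 50],
}
ranking = time_sensitive_rank("di", history)
```

In this toy history "dictionary" has the larger total (230 against 190), so a pure popularity baseline would rank it first. The forecast favors "disney" because its recent counts are rising.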

TS QAC – Recent vs. Robust (WWW 14)

• QAC needs to rank both consistently popular and recently popular queries well
• Motivation: Finding the optimal trade-off between recency and robustness to achieve better QAC
• Key idea:
  – The optimal trade-off can be studied empirically
  – Each query log scenario has different temporal characteristics
• Approaches:
  – Based on past popularity distributions
    • Maximum Likelihood Estimation, Recent Maximum Likelihood Estimation, Last N Query Distribution
  – Based on short-range predicted query popularity
    • Predicted Next N Query Distribution
  – Meta approach: optimize the parameters of the above approaches
    • Online Parameter Learning

S. Whiting, J. McMinn, and J. Jose. Exploring real-time temporal query auto-completion. In DIR Workshop ’13, pages 12–15
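The contrast between a robust estimate and a recent one can be illustrated with two of the past-popularity approaches listed above. The sketch assumes that the Last N Query Distribution counts only the N most recent logged queries that start with the prefix. The function names and the toy query stream are hypothetical.

```python
from collections import Counter, deque


def rank_by_count(counts, k):
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [query for query, _ in ranked[:k]]


def mle_completions(prefix, query_stream, k=3):
    """Robust: maximum likelihood estimate over the whole log."""
    counts = Counter(q for q in query_stream if q.startswith(prefix))
    return rank_by_count(counts, k)


def last_n_completions(prefix, query_stream, n=4, k=3):
    """Recent: counts over only the last n queries matching the prefix."""
    recent = deque((q for q in query_stream if q.startswith(prefix)), maxlen=n)
    return rank_by_count(Counter(recent), k)


# Hypothetical query stream, oldest first.
stream = ["dictionary"] * 5 + ["disney", "dictionary", "disney", "disney"]
robust = mle_completions("di", stream)
recent = last_n_completions("di", stream, n=4)
```

Here `robust` puts "dictionary" first and `recent` puts "disney" first. The window size `n` is the kind of parameter the meta approach would tune per query log.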

Personalized QAC (SIGIR 13)

• QAC needs to make different suggestions to different people by considering their own interests
• Motivation: Query likelihoods vary drastically between different demographic groups [Weber and Castillo, 2010] and individuals [Teevan et al., 2011]
• Key idea:
  – Features based on: users’ age, gender, location, short- and long-term history
  – Novel supervised framework for learning to personalize QAC
• Method:
  – Similar labelling strategy
    • Evaluating by using Mean Reciprocal Rank (MRR)
  – Learning to rank
    • LambdaMART algorithm (boosted decision trees)
    • Location is more effective than the other demographic features

M. Shokouhi. Learning to personalize query auto-completion. In SIGIR’13 2013
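Mean Reciprocal Rank, the evaluation measure named above, is simple to compute: each impression scores 1/rank of the query the user finally submitted, or 0 if that query was not suggested, and the scores are averaged. The sketch below uses hypothetical function names and made-up impressions.

```python
def reciprocal_rank(ranked, target):
    """1 / position of `target` in the suggestion list, or 0.0 if it is absent."""
    for position, candidate in enumerate(ranked, start=1):
        if candidate == target:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(impressions):
    """Average reciprocal rank over (suggestion list, submitted query) pairs."""
    if not impressions:
        return 0.0
    total = sum(reciprocal_rank(ranked, target) for ranked, target in impressions)
    return total / len(impressions)


# Hypothetical impressions: the list shown to the user and the query they submitted.
impressions = [
    (["disney", "dictionary", "dishes"], "disney"),      # rank 1 -> 1.0
    (["disney", "dictionary", "dishes"], "dictionary"),  # rank 2 -> 0.5
    (["disney", "dictionary"], "discord"),               # absent -> 0.0
    (["a", "b", "c", "d"], "d"),                         # rank 4 -> 0.25
]
mrr = mean_reciprocal_rank(impressions)
```

For these four impressions the MRR is (1.0 + 0.5 + 0.0 + 0.25) / 4 = 0.4375.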

Personalized QAC – Context Based (IJARCET 2015)

• A query auto-completer tries to accurately predict what the user is typing
• Objective: Improve search quality by predicting the user’s query based on context
• Key idea:
  – Context
    • Query similarity
    • User’s recent click-throughs
    • Current location and time
    • Keywords and sessions
• Method:
  – Most Popular Completion (MPC)
    • Works well when context is empty
  – Nearest Completion (NC)
    • Works well when context exists, performs poorly when context is empty
  – Hybrid Completion
    • Combines both MPC and NC

V. Mawarkar and V. Malemath. Context Based Query Auto-Completion. In IJARCET, Volume 4 Issue 6, June 2015.
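One plausible way to combine the two completers is sketched below. It is not taken from the slides, and it rests on several assumptions: context similarity is a bag-of-words cosine between a candidate and the user's recent queries, both scores are standardized to z-scores so they are comparable, and a weight `alpha` mixes them. All names and numbers are hypothetical.

```python
import math
import statistics
from collections import Counter


def cosine(text_a, text_b):
    """Bag-of-words cosine similarity between two strings."""
    a, b = Counter(text_a.split()), Counter(text_b.split())
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def zscores(values):
    """Standardize scores; all zeros when every value is the same."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0] * len(values)
    return [(v - mean) / std for v in values]


def hybrid_completion(prefix, popularity, context, alpha=0.7, k=3):
    """Mix context similarity (weight alpha) with popularity (weight 1 - alpha)."""
    candidates = sorted(q for q in popularity if q.startswith(prefix))
    if not candidates:
        return []
    pop = zscores([popularity[q] for q in candidates])
    sim = zscores([cosine(q, context) for q in candidates])
    scored = [
        (alpha * s + (1 - alpha) * p, q) for q, p, s in zip(candidates, pop, sim)
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [q for _, q in scored[:k]]


popularity = {"java download": 50, "java island": 10, "java coffee": 20}
with_context = hybrid_completion("java", popularity, "indonesia travel island")
no_context = hybrid_completion("java", popularity, "")
```

With the travel context, "java island" moves to the top even though it is the least popular candidate. With an empty context every similarity is zero, so the ranking falls back to popularity order, which matches the behavior described for MPC.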

Context Based HCA (IJARCET 2015)

V. Mawarkar and V. Malemath. Context Based Query Auto-Completion. In IJARCET, Volume 4 Issue 6, June 2015.

Personalized QAC – User Behaviors (SIGIR14)

• Objective: Explaining users’ interaction data to further improve QAC performance
• Contributions:
  – First high-resolution QAC query log
    • Records every keystroke
    • Enables further analysis and understanding of user behavior
  – Horizontal skipping bias
    • Introduced for the first time and unique to QAC
  – Vertical position bias
  – Two-dimensional click model
    • Models users’ behavior on PC and mobile devices

Y. Li, A. Dong, H. Wang, H. Deng, Y. Chang, C. Zhai. A Two-dimensional Click Model for Query Auto-completion. In SIGIR’ 2014

Two-Dimensional Click Model (SIGIR14)

(Figure: the H Model and the D Model)

Y. Li, A. Dong, H. Wang, H. Deng, Y. Chang, C. Zhai. A Two-dimensional Click Model for Query Auto-completion. In SIGIR’ 2014

Time-Sensitive Personalized QAC (CIKM14)

• Key idea:
  – Hybrid model
    • Time-sensitivity
    • Personalization
  – Optimal time window
    • Achieving better prediction
• Contributions:
  – Novel hybrid model
  – New query popularity prediction method
    • Ranking evaluated with Mean Reciprocal Rank (MRR)
  – Effectiveness analysis
    • Significantly outperforms state-of-the-art time-sensitive QAC

F. Cai, S. Liang, M. de Rijke. Time-sensitive Personalized Query Auto-completion. In CIKM’ 2014

TSP QAC Performances (CIKM14)

• Trade-off between recency and periodicity
  – Parameter setting is critical for accuracy
• Baseline check
  – Marginally outperforms baselines
• Features are not strongly differentiating
  – Effective with a longer prefix
  – Available evidence matters
• Better QAC ranking
  – Sufficient personal queries
  – Time-sensitive popularity

F. Cai, S. Liang, M. de Rijke. Time-sensitive Personalized Query Auto-completion. In CIKM’ 2014

Presenting Optimized Search Results (WSDM16)

• Objective:
  – Selectively presenting query suggestions based on a probabilistic model to achieve an optimized search result presentation
• Key ideas:
  – Scanning too many query suggestions is time-consuming
  – Measuring users’ time loss
  – Patient users benefit more
• Challenges:
  – Uncertain factors (e.g. intent, query suggestion click probabilities)
  – Unclear how long users spend on scanning

M. P. Kato, K. Tanaka. To Suggest, or Not to Suggest for Queries with Diverse Intents: Optimizing Search Result Presentation. In WSDM’ 2016

Presenting Optimized Search Results (WSDM16)

• Contributions:
  – Searcher model
    • Interacting with query suggestions
    • According to users’ multiple intents
  – Optimizing Search Result Presentation (OSRP)
    • Mainly focusing on ambiguous or underspecified queries
  – Examined effects of query suggestions on search behaviors
    • Conducting a user survey
  – Effectiveness of OSRP
    • Patient users
    • Queries with a limited number of intents


User Survey (WSDM16)


SERP (M. P. Kato and K. Tanaka)

Term-by-Term QAC for Mobile Search (WSDM16)

• Objective:
  – Specialized QAC for mobile search
• Mobile input:
  – Small screen: term-by-term QAC
  – Slower input: high-quality QAC
  – Clumsier input: QAC matters more than on PC
• Key idea:
  – Faster exploration of suggestions
  – Fits text editing on mobile devices

S. Vargas, R. Blanco, P. Mika. Term-by-Term Query Auto-Completion for Mobile Search. In WSDM 2016

Query-Term Graph (WSDM16)

– Based on previously submitted queries
– Efficient way of
  • Storing
  • Retrieving

S. Vargas, R. Blanco, P. Mika. Term-by-Term Query Auto-Completion for Mobile Search. In WSDM 2016
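To make term-by-term suggestion concrete, here is a simplified stand-in, not the query-term graph from the cited paper. It assumes a graph whose edges link each term to the terms that followed it in past queries, with a hypothetical `START` node for the first term. Suggestions are single terms that complete the word being typed.

```python
from collections import Counter, defaultdict

START = "<s>"  # hypothetical marker for the beginning of a query


def build_term_graph(query_log):
    """Edges: previous term -> Counter of terms that followed it in past queries."""
    graph = defaultdict(Counter)
    for query in query_log:
        previous = START
        for term in query.split():
            graph[previous][term] += 1
            previous = term
    return graph


def suggest_next_term(graph, typed, k=3):
    """Suggest single terms that complete the word currently being typed."""
    terms = typed.split()
    if typed.endswith(" ") or not terms:
        committed, partial = terms, ""  # the user is starting a new term
    else:
        committed, partial = terms[:-1], terms[-1]
    previous = committed[-1] if committed else START
    followers = graph.get(previous, Counter())
    candidates = [(t, c) for t, c in followers.items() if t.startswith(partial)]
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [term for term, _ in candidates[:k]]


log = ["new york weather", "new york times", "new york times",
       "new jersey", "news today"]
graph = build_term_graph(log)
first = suggest_next_term(graph, "ne")
second = suggest_next_term(graph, "new ")
third = suggest_next_term(graph, "new york t")
```

Each suggestion is one short term rather than a full query, which suits a small screen. The user builds "new york times" in three taps instead of scanning long completions.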

QAC for Rare Prefixes (CIKM15)

• Motivation: QAC fails when the prefix is sufficiently rare
• Key ideas:
  – Supervised model for ranking synthetic suggestions
  – Candidate queries generated by mining query suffixes
  – Exploring new ranking signals
    • Query n-gram statistics
    • Deep convolutional latent semantic model (CLSM)

B. Mitra, N. Craswell. Query Auto-Completion for Rare Prefixes. In CIKM 2015
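The candidate-generation step can be sketched as follows. It assumes that suffixes are the trailing word n-grams of logged queries, and that a synthetic suggestion is formed by replacing the unfinished last word of the prefix with a mined suffix that starts with it. Ordering by suffix frequency is only a placeholder for the supervised ranker. Function names and the toy log are hypothetical.

```python
from collections import Counter


def mine_suffixes(query_log, max_terms=3):
    """Count the trailing word n-grams (up to max_terms words) of logged queries."""
    counts = Counter()
    for query in query_log:
        terms = query.split()
        for n in range(1, min(max_terms, len(terms)) + 1):
            counts[" ".join(terms[-n:])] += 1
    return counts


def synthetic_candidates(prefix, suffix_counts, k=3):
    """Complete the unfinished last word of `prefix` with popular mined suffixes."""
    head, _, end_term = prefix.rpartition(" ")
    matches = [(s, c) for s, c in suffix_counts.items() if s.startswith(end_term)]
    # Placeholder ordering by suffix frequency; a learned ranker would go here.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [(head + " " + suffix).strip() for suffix, _ in matches[:k]]


log = ["hotels in new york", "weather in new york", "pizza new york",
       "flights to new orleans", "new york times"]
suffixes = mine_suffixes(log)
candidates = synthetic_candidates("cheap parking in ne", suffixes, k=2)
```

The prefix "cheap parking in ne" never appears in the log, so a popularity-based completer returns nothing. The sketch still proposes "cheap parking in new york" and "cheap parking in new orleans" as candidates.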

Model and Features (CIKM15)

• LambdaMART model:
  – Ranking using features
• N-gram based features
  – Model the likelihood that the candidate suggestion is generated by the same language model (LM) as the queries in the search logs
• CLSM based features
  – Based on clickthrough data
  – Effective for modelling query-document relevance
  – Trained on a dataset of prefix-suffix pairs

B. Mitra, N. Craswell. Query Auto-Completion for Rare Prefixes. In CIKM 2015

Future Work

• Short-range query popularity prediction
• Complex relationships between users’ behavior at different keystrokes
• More complex click models
• Modeling personalized temporal patterns for active users (e.g. professional searchers)
• Online user behavior study on mobile
• Other language models (LMs) for rare prefixes

QAC Development Summary

References

1. M. Shokouhi and K. Radinsky. Time-sensitive query auto-completion. In SIGIR ’12, pages 601–610, 2012.
2. S. Whiting, J. McMinn, and J. Jose. Exploring real-time temporal query auto-completion. In DIR Workshop ’13, pages 12–15.
3. M. Shokouhi. Learning to personalize query auto-completion. In SIGIR ’13, 2013.
4. V. Mawarkar and V. Malemath. Context Based Query Auto-Completion. In IJARCET, Volume 4 Issue 6, June 2015.
5. Y. Li, A. Dong, H. Wang, H. Deng, Y. Chang, and C. Zhai. A Two-dimensional Click Model for Query Auto-completion. In SIGIR, 2014.
6. F. Cai, S. Liang, and M. de Rijke. Time-sensitive Personalized Query Auto-completion. In CIKM, 2014.
7. M. P. Kato and K. Tanaka. To Suggest, or Not to Suggest for Queries with Diverse Intents: Optimizing Search Result Presentation. In WSDM, 2016.
8. S. Vargas, R. Blanco, and P. Mika. Term-by-Term Query Auto-Completion for Mobile Search. In WSDM, 2016.
9. B. Mitra and N. Craswell. Query Auto-Completion for Rare Prefixes. In CIKM, 2015.
10. L. Li, H. Deng, A. Dong, Y. Chang, H. Zha, and R. Baeza-Yates. Analyzing User’s Sequential Behavior in Query Auto-Completion via Markov Processes. In Proc. SIGIR ’15, 2015.
11. M. Shokouhi. Detecting seasonal queries by time-series analysis. In Proc. SIGIR, pages 1171–1172, Beijing, China, 2011.
12. R. W. White and G. Marchionini. Examining the effectiveness of real-time query expansion. Inf. Process. Manage., 43:685–704, May 2007.
13. Z. Bar-Yossef and N. Kraus. Context-sensitive query auto-completion. In WWW ’11, pages 107–116, 2011.
