Query Operators Shown Beneficial for Improving Search Results
Gilles Hubert, Guillaume Cabanac, Christian Sallaberry, Damien Palacio
TPDL’11: International Conference on Theory and Practice of Digital Libraries September 25-29, Berlin, Germany
Outline
1. Context: Operators in Search Queries
2. Methodology: Assessing the effects of query operators
3. Experiments and Results: Potential of effectiveness yielded by operators
4. Conclusion and Future Work
Various Operators: quotation marks, must appear (+), boosting operator (^), Boolean operators, proximity operators…
1. Context Operators in Search Queries G. Hubert et al.
Search Engines Offer Query Operators
Information need: “I’m looking for research projects funded in the DL domain”
Regular query vs. query with operators
Case 1: What designers of search engines may expect
Case 2: What users of search engines may believe
Case 3: What designers of search engines may fear
Quantitative Studies
Possible Explanations: Unknown features? No improvement observed?
[Figure: percentage of queries with operators (y-axis, 0% to 25%) by year (x-axis, 1999 to 2007), as reported for Altavista [Silverstein et al., 1999], Excite [Jansen et al., 2000], Excite [Spink et al., 2001], and Google + MSN Search + Yahoo! [White and Morris, 2007]]
Usage of Query Operators
Qualitative Studies
Users
  Average users are not comfortable with “advanced means of searching” [Jansen et al., 2000]
  Expert users resort to query operators more frequently [Hölscher and Strube, 2000; Lucas and Topi, 2002; White and Morris, 2007]
Information Needs
  Operators are used more in dedicated search [Jansen and Pooch, 2001]
  Operators are used when information is difficult to find (e.g., complex information needs) [Aula et al., 2010]
Appropriateness
  Operators must be used in a “semantically appropriate manner” [Eastman and Jansen, 2004]
Effects of Query Operators on Effectiveness
[Eastman and Jansen, 2003]
Eastman and Jansen studied queries with operators
Real users: AOL, Google and MSN Search
Operators: AND, OR, MUST APPEAR and PHRASE
No statistically significant improvement in P@10 (precision at 10)
Study on 20% of all queries
Expert users
Complex needs (Queries with operators)
What about the other 80% of all queries?
Average users
Regular queries (no operators)
Our Research Questions
2. Methodology Assessing the effects of query operators G. Hubert et al.
Q = Do query operators lead to improved search results?
Q1 = What is the maximum gain in effectiveness when enriching a query with operators?
Q2 = Do users succeed in formulating better queries involving operators?
Our Methodology in a Nutshell
[Figure: ranked result lists and their AP (average precision) for the regular query and for the query variants with operators V1, V2, V3, V4, ..., VN]
Overview of the Methodology
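[Figure: the Query Variant Generator takes the query, preOps, and postOps and produces the variants {v1, …, vi, …, vn}; the Search Engine (corpus, IR model) returns a result list l(vi) for each variant; the Evaluation Procedure (qrels, metrics) yields measures of effectiveness. Legend: usual evaluation framework in IR; components introduced for this study.]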
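The evaluation step of this pipeline can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the standard non-interpolated definition of AP, and the document identifiers, judgments, and result lists are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def average_precision(ranking: Iterable[str], relevant: Set[str]) -> float:
    """Non-interpolated average precision of one ranked result list."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at the rank of each relevant document retrieved.
            precision_sum += hits / rank
    # Relevant documents that are never retrieved contribute zero.
    return precision_sum / len(relevant)


# Hypothetical relevance judgments (qrels) for one topic.
qrels: Set[str] = {"d1", "d4", "d7"}

# Hypothetical result lists l(vi) returned by the search engine.
result_lists: Dict[str, List[str]] = {
    "regular": ["d2", "d1", "d3", "d4", "d5"],
    "v1": ["d1", "d4", "d2", "d7", "d3"],
}

ap_by_variant = {
    name: average_precision(ranking, qrels)
    for name, ranking in result_lists.items()
}
```

Comparing the entries of `ap_by_variant` shows whether a given variant ranks relevant documents better than the regular query for that topic.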
Experiment Settings
Standard Test Collections: TREC-7, TREC-8
Query Operators: Must appear (+), Term boosting (^N)
Variant Generation:
  Must appear ‘+’ only
  Boost ‘^’ only, with weights ^10, ^20, ^30, ^40, and ^50
  Both ‘+’ and ‘^’
Search Engine: Terrier with various models: BM25, DFR_BM25, InL2, PL2, TF_IDF
3. Experiments and Results Potential of effectiveness yielded by operators G. Hubert et al.
Variant #   Query variants generated with preOps and postOps
1           encryption equipment export
2           encryption +equipment +export
…           …
124         encryption +equipment export^10
…           …
338         encryption^30 equipment^40 export^50
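A variant generator of this kind could be sketched as below. This is only an illustration under stated assumptions: each term is either left plain, prefixed with ‘+’, or suffixed with one boost weight, and the function names are hypothetical. For three terms it produces 7 × 7 × 7 = 343 strings, whereas the slide numbers variants up to 338, so the authors' generator differs in details not shown here; the sketch does not try to reproduce their numbering.

```python
from itertools import product
from typing import List

# Boost weights listed on the "Experiment Settings" slide.
BOOSTS = (10, 20, 30, 40, 50)


def term_forms(term: str) -> List[str]:
    """Assumed forms of one term: plain, must-appear, or boosted."""
    return [term, f"+{term}"] + [f"{term}^{weight}" for weight in BOOSTS]


def generate_variants(query: str) -> List[str]:
    """Combine every form of every term into full query strings."""
    terms = query.split()
    combos = product(*(term_forms(term) for term in terms))
    return [" ".join(combo) for combo in combos]


variants = generate_variants("encryption equipment export")
```

The first string returned is the regular query itself, and the three example variants listed on the slide all appear in the output.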
Results
TREC-7 per Topic Analysis: Boxplots ‘+’ and ‘^’
Per Topic Analysis: Boxplot
[Figure: how to read the boxplot. Labels: AP of TREC’s regular query; query variant with the highest AP; query variant with the lowest AP; “32 Topics”. Y-axis: AP (Average Precision), ticks from 0.1 to 0.4.]
TREC-7 Per Topic Analysis ‘+’ and ‘^’
MAP of regular queries = 0.1554
MAP of highest-AP variants (┬) = 0.2099 (+35.1%)
TREC-8 per Topic Analysis ‘+’ and ‘^’
MAP of regular queries = 0.1840
MAP of highest-AP variants (┬) = 0.2288 (+24.3%)
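The per-topic analysis can be illustrated with a short sketch: for each topic, keep the highest AP reached by any variant, then average over topics and compare with the MAP of the regular queries. The AP values and topic names below are hypothetical, and the sketch assumes the regular query counts as one of the candidates (as variant 1 does in the variant table).

```python
from statistics import mean
from typing import Dict

# Hypothetical AP values: topic -> {variant name -> AP}.
ap_by_topic: Dict[str, Dict[str, float]] = {
    "topic_A": {"regular": 0.20, "v1": 0.35, "v2": 0.10},
    "topic_B": {"regular": 0.40, "v1": 0.30, "v2": 0.45},
}


def map_regular(ap: Dict[str, Dict[str, float]]) -> float:
    """MAP obtained with the regular query of each topic."""
    return mean(scores["regular"] for scores in ap.values())


def map_best_variant(ap: Dict[str, Dict[str, float]]) -> float:
    """MAP obtained when the highest-AP variant is picked per topic."""
    return mean(max(scores.values()) for scores in ap.values())


baseline = map_regular(ap_by_topic)
upper = map_best_variant(ap_by_topic)
gain_percent = 100.0 * (upper - baseline) / baseline
```

Because the best variant is chosen with knowledge of the relevance judgments, `upper` measures the potential of the operators rather than what a user would obtain unaided.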
Global Analysis: MAP ‘+’ only
            TREC-7 MAP                     TREC-8 MAP
Model       Baseline  VOP     Gain (%)     Baseline  VOP     Gain (%)
BM25        0.1677    0.1836  9.5**        0.1957    0.2154  10.2*
DFR_BM25    0.1683    0.1843  9.5**        0.1965    0.2162  10.0*
InL2        0.1710    0.1852  8.3**        0.1996    0.2172  8.8*
PL2         0.1554    0.1826  17.5**       0.1840    0.2106  14.5**
TF_IDF      0.1674    0.1833  9.5**        0.1964    0.2158  9.9**
Statistical significance is denoted by ‘*’ for p < 0.05 (‘**’ for p < 0.01)
Global Analysis: MAP ‘^’ only
            TREC-7 MAP                     TREC-8 MAP
Model       Baseline  VOP     Gain (%)     Baseline  VOP     Gain (%)
BM25        0.1677    0.2027  20.9**       0.1957    0.2312  18.1**
DFR_BM25    0.1683    0.2034  20.9**       0.1965    0.2316  17.9**
InL2        0.1710    0.2059  20.4**       0.1996    0.2352  17.8**
PL2         0.1554    0.1926  23.9**       0.1840    0.2173  18.1**
TF_IDF      0.1674    0.2026  21.0**       0.1964    0.2312  17.7**
Statistical significance is denoted by ‘*’ for p < 0.05 (‘**’ for p < 0.01)
Global Analysis: MAP ‘+’ and ‘^’
            TREC-7 MAP                     TREC-8 MAP
Model       Baseline  VOP     Gain (%)     Baseline  VOP     Gain (%)
BM25        0.1677    0.2132  27.1**       0.1957    0.2381  21.7**
DFR_BM25    0.1683    0.2133  26.7**       0.1965    0.2387  21.5**
InL2        0.1710    0.2144  25.4**       0.1996    0.2407  20.6**
PL2         0.1554    0.2099  35.1**       0.1840    0.2288  24.3**
TF_IDF      0.1674    0.2131  27.3**       0.1964    0.2383  21.3**
Statistical significance is denoted by ‘*’ for p < 0.05 (‘**’ for p < 0.01)
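The (%) column can be read as the relative gain of the VOP MAP over the baseline MAP. The sketch below recomputes it for a few rows of the ‘+’ and ‘^’ table; the MAP values are copied from the slide, while the helper name and data layout are illustrative assumptions. It does not reproduce the significance test, which the slides do not name.

```python
from typing import Dict, Tuple


def relative_gain_percent(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline`, in percent."""
    return 100.0 * (improved - baseline) / baseline


# (collection, model) -> (baseline MAP, VOP MAP, gain reported on the slide)
reported: Dict[Tuple[str, str], Tuple[float, float, float]] = {
    ("TREC-7", "BM25"): (0.1677, 0.2132, 27.1),
    ("TREC-7", "PL2"): (0.1554, 0.2099, 35.1),
    ("TREC-8", "BM25"): (0.1957, 0.2381, 21.7),
    ("TREC-8", "PL2"): (0.1840, 0.2288, 24.3),
}

# Recompute the gain for each row, rounded to one decimal like the slide.
recomputed = {
    key: round(relative_gain_percent(base, vop), 1)
    for key, (base, vop, _) in reported.items()
}
```

For these rows the recomputed values agree with the reported percentages, including the 35.1% maximum quoted in the conclusions.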
Conclusions
H: The Proper Use of Query Operators Improves Search Results
Methodology to Validate H
  Standard IR test collections: TREC-7 and TREC-8
  Must appear (+) and boosting (^) operators
Findings
  Observed gain in MAP of up to 35.1%
  Statistically significant
  For all tested IR models and collections
Users Should Use Query Operators More Often
4. Conclusion and Future Work G. Hubert et al.
Future Work
Short Term
  Experimenting with our methodology in various contexts
    Additional IR collections
    Additional IR models
    Additional query operators
Medium Term
  Address Q2: Do users succeed in formulating queries with operators, so that these lead to a significant gain in effectiveness?
  Study other factors
    Number of terms
    Selection of terms
Long Term
  Additional dimensions of information
    Geographic IR
Thank you