Predicting Searcher Frustration
Henry Feild (UMass Amherst), James Allan (UMass Amherst), Rosie Jones (Akamai*)
Center for Intelligent Information Retrieval
July 20, 2010
* work done while at Yahoo!


Page 1

Center for Intelligent Information Retrieval

Henry Feild (UMass Amherst) James Allan (UMass Amherst)

Rosie Jones (Akamai*)

July 20, 2010

Predicting Searcher Frustration

* work done while at Yahoo!

Page 2

2 Center for Intelligent Information Retrieval July 20, 2010

Satisfaction vs. Frustration vs. Success

  Dissatisfying:
•  getting a red light

  Frustrating:
•  getting every single red light between your house and the airport

  Success:
•  reaching the airport in time to catch your flight

  Take-aways:
•  you can be dissatisfied and not frustrated
•  you can be successful but still frustrated along the way*

* Ceaparu et al. (Journal of HCI, 2004)

Page 3

Real search example

  What was the best-selling TV model in 2008?
  Actual search sequence from a UMass study:
•  television set sales 2008
•  “television set” sales 2008
•  “television” sales 2008
•  google trends
•  “television” sales statistics 2008

[callout: user got frustrated starting here]

Questions:
1.  Can we detect when users get frustrated?
2.  Can we do something to help users once we know they are frustrated?


Page 5

Outline

  Ways of detecting frustration
  User study overview
  Models
  Conclusion

Page 6

Ways of detecting frustration

  Physical sensors
•  camera: predicts 6 mental states
•  pressure-sensitive mouse: pressure sensors around the mouse
•  pressure-sensitive chair: pressure sensors on the back and seat of the chair

  Intelligent tutoring systems
•  user cognitive state prediction [Cooper et al., UMAP 2009]
•  frustration prediction [Kapoor et al., J. of Human-Computer Studies, 2007]: predicts when the user will click an “I’m frustrated” button


Page 10

Ways of detecting frustration

  Query logs
•  search level: query + navigation
•  task level: all searches related to an information need
•  user level: ‘personalization’; aggregate stats over previous tasks

Example tasks:
•  Where’s the nearest cafe?
•  When’s the next time Dave Matthews is playing in Boston?
•  What are the best grad school programs in CS?

Example search log:
television set sales 2008   <click> <scroll>
“television set” sales 2008   <click>
“television” sales 2008   <click> <back>
...

Page 11

Ways of detecting frustration

  Query logs
•  search level: query + navigation
•  task level: all searches related to an information need
•  user level: ‘personalization’; aggregate stats over previous tasks

  Search engine switching [White & Dumais, CIKM 2009]
  Next action prediction [Downey, IJCAI 2007]
  Task satisfaction [Huffman & Hochster, SIGIR 2007; Fox et al., TOIS 2005]

television set sales 2008   <click> <scroll>
“television set” sales 2008   <click>
“television” sales 2008   <click> <back>
...

Page 12

Outline

  Ways of detecting frustration
  User study overview
  Models
  Conclusion

Page 13

User study

  30 users
  assigned 7–8 pre-defined tasks
  searched the web
•  Google, Yahoo!, Bing, Ask.com
  prompted for feedback
  logged sensor readings + web browsing

Page 14

Frustration reporting dialog


Page 16

Frustration labels

Frustration Level   Search
1                   television set sales 2008
1                   “television set” sales 2008
1                   “television” sales 2008
2                   google trends
3                   “television” sales statistics 2008

Page 17

Statistics

[Figure: per-user frequency of frustration levels, from 1 (None) to 5 (Extreme), showing total and frustrated instances for each user ID.]

          Frustration   No Frustration
Success        46             85
Failure        72              8

Page 18

Outline

  Ways of detecting frustration
  User study overview
  Models
  Conclusion

Page 19

Sensor features

  240 total
•  10 sensor readings (from camera, mouse, & chair)
•  min, max, mean, std-dev
•  over time windows preceding the frustration judgment:
   •  30 seconds
   •  entire search
   •  entire task
•  two versions of each:
   •  including time spent responding to prompts
   •  excluding time spent responding to prompts
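The windowed aggregation above can be sketched as follows; the data layout (timestamped readings per sensor) and function names are illustrative assumptions, not the study's actual pipeline.

```python
# Sketch of the aggregate features described above: for one sensor stream,
# compute min / max / mean / std-dev over the readings that fall in a time
# window preceding a frustration judgment. Data layout is an assumption.
from statistics import mean, pstdev

def window_features(readings, t_judgment, window=30.0):
    """readings: list of (timestamp, value) pairs for a single sensor."""
    vals = [v for t, v in readings if t_judgment - window <= t <= t_judgment]
    if not vals:
        return None  # no readings fell inside the window
    return {
        "min": min(vals),
        "max": max(vals),
        "mean": mean(vals),
        "stddev": pstdev(vals),
    }

# Example: a hypothetical mouse-pressure sensor sampled once per second.
readings = [(float(t), 0.1 * t) for t in range(100)]
feats = window_features(readings, t_judgment=60.0)  # the 30-second window
```

With 10 sensors, 4 statistics, 3 window sizes, and 2 prompt-handling variants, this scheme yields the 240 features (10 × 4 × 3 × 2).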


Page 27

Query log features

  43 total
•  search-level: search duration, query length, average word length in query, pages clicked, ...
•  task-level: task duration, # of searches, average query length, ...
•  user-level: average # of URLs visited per task, average # of actions per task, ...

television set sales 2008   <click> <scroll>
“television set” sales 2008   <click>
“television” sales 2008   <click> <back>
...
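A few of the search-level features can be sketched directly from a query and its navigation events; the function name and event encoding below are assumptions for illustration.

```python
# Sketch of some search-level query-log features listed above, computed
# from a query string and its navigation events.
def search_level_features(query, actions):
    """actions: list of event strings such as 'click', 'scroll', 'back'."""
    words = query.split()
    return {
        "query_length": len(words),                                # in words
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "pages_clicked": sum(1 for a in actions if a == "click"),
    }

# The example search from the slides, with hypothetical events:
feats = search_level_features("television set sales 2008",
                              ["click", "scroll", "click", "back"])
```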


Page 34

Modeling

  logistic regression
•  binarize instances:
   •  1 = “not frustrated”
   •  2–5 = “frustrated”
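A minimal sketch of this setup, assuming a single toy feature (task duration) and plain stochastic gradient descent; the study's model was trained on the much richer feature sets described earlier.

```python
# Binarize the 1-5 frustration labels (1 = "not frustrated", 2-5 =
# "frustrated") and fit a logistic regression by stochastic gradient
# descent. Toy data and learning rate are assumptions.
import math

def binarize(levels):
    return [0 if lv == 1 else 1 for lv in levels]

def fit_logistic(X, y, lr=0.5, epochs=2000):
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))       # sigmoid
            g = p - yi                           # gradient of log-loss
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, xi):
    z = sum(wj * xj for wj, xj in zip(w, xi)) + b
    return 1 if 1.0 / (1.0 + math.exp(-z)) >= 0.5 else 0

# Toy feature: task duration in minutes; longer tasks tend to frustrate.
X = [[1.0], [2.0], [3.0], [8.0], [9.0], [10.0]]
y = binarize([1, 1, 1, 4, 3, 5])
w, b = fit_logistic(X, y)
```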

Page 35

Models

  all features
•  query log + sensors

  Sequential Forward Selection (SFS) over:
•  all features (7 features automatically chosen)
•  query log features (5 features automatically chosen)
•  sensor features (3 features automatically chosen)

  search engine switching [White & Dumais, CIKM 2009]
•  5 query log features

  Markov Model Likelihood (event patterns) [Hassan et al. WSDM 2009]
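Sequential Forward Selection itself is simple to sketch: starting from the empty set, greedily add whichever remaining feature most improves a scoring function, stopping when nothing helps. The score below is a made-up stand-in for the cross-validated model accuracy the real procedure would use.

```python
# Generic Sequential Forward Selection (SFS): greedy wrapper-style
# feature selection against an arbitrary scoring function.
def sfs(features, score):
    selected = []
    best = score(selected)
    improved = True
    while improved:
        improved = False
        for f in set(features) - set(selected):
            s = score(selected + [f])
            if s > best:
                best, best_f, improved = s, f, True
        if improved:
            selected.append(best_f)
    return selected

# Toy score: only features "a" and "c" carry signal; "a" helps more.
def toy_score(subset):
    return 0.5 + 0.3 * ("a" in subset) + 0.1 * ("c" in subset)

chosen = sfs(["a", "b", "c"], toy_score)
```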

Page 36

Features from two of the models

SFS-QL+Sensors: SFS over query log and sensor features

1. task duration

2. proportion of unique queries in task

3. mean of ‘unsure’, 30-sec, no prompts

4. minimum of ‘unsure’, search, prompts

5. stddev of ‘concentrating’, 30-sec, no prompts

6. minimum of ‘net-back-change’, search, no prompts

7. minimum of ‘concentrating’, search, no prompts

W&D: Model used by White & Dumais (CIKM 2009) to detect switching between search engines

[task] task duration

[user] average number of URLs visited per task

[search] character length of most recent query

[search] average token length of most recent query

[task] number of actions performed in task

Page 37

Results

Model               Accuracy   Fβ=0.5   Mean Average Precision
W&D                   0.75      0.80      0.87
SFS-QL+Sensors        0.69      0.72      0.85
SFS-QL                0.69      0.73      0.80
W&D+MML-time          0.66      0.69      0.76
MML-time              0.56      0.62      0.65
SFS-Sensors           0.55      0.61      0.65
QL+Sensors            0.54      0.49      0.59
Always frustrated     0.44      0.55      ---
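For reference, the Fβ=0.5 column weights precision twice as heavily as recall; the standard formula reproduces, for example, the baseline row from the "Always frustrated" classifier's precision (0.49) at recall 1.0.

```python
# F_beta combines precision and recall; beta = 0.5 favors precision.
def f_beta(precision, recall, beta=0.5):
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# "Always frustrated" labels every search as frustrated: recall is 1.0 and
# precision equals the fraction of frustrated instances (0.49 here).
baseline = f_beta(0.49, 1.0)  # rounds to the 0.55 reported in the table
```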

Page 38

Outline

  Ways of detecting frustration
  User study overview
  Models
  Conclusion

Page 39

Conclusions

  Searcher frustration is detectable
  Sensors are not helpful using our processing methods
  Best prediction criteria:
•  long task duration
•  the user tends to visit few URLs per task
•  few clicks and other actions are performed
•  the most recent query is long, but has very short words

Page 40

Future work

  What models work best in real search environments?

  How can we help frustrated searchers?

Page 41

Results

[Figure: precision–recall curves for all models (W&D, W&D+MML-time, SFS-QL+Sensors, SFS-QL, MML-time, SFS-Sensors, QL+Sensors) and the “Always frustrated” baseline (precision = 0.49). Recall on the x-axis, 0.0–1.0; precision on the y-axis, 0.5–1.0.]