WIN-WIN SEARCH: DUAL-AGENT STOCHASTIC GAME IN SESSION SEARCH

Jiyun Luo, Sicong Zhang, Grace Hui Yang
Department of Computer Science, Georgetown University
{jl1749, sz303}@georgetown.edu, [email protected]

Win-Win Search: Dual-Agent Stochastic Game in Session Search (SIGIR 2014)



AGE OF EMPIRE


A NEW PERSPECTIVE TO LOOK AT SEARCH

[Figure labels: User; Information need; Observed documents; Documents to explore]

Devise a strategy for helping the user explore the information space in order to learn which documents are relevant and which aren’t, and satisfy their information need.

WHY DO USERS MAKE CERTAIN MOVES?

Markov Chain of Decision Making States

RELATED WORK

• queries suitable for personalization [Teevan et al. SIGIR’08]
• task types [Kanoulas et al. TREC’12]
• roles of task stage and task type [Liu et al. SIGIR’10]
• session query changes [Guan et al. SIGIR’13]
• user intentions and attention [Carterette et al. CIKM’11]
• user click model [Craswell et al. SIGIR’07]
• page re-ranking [Jin et al. WWW’13]
• search topics [Jones et al. CIKM’08]
• ads selection using POMDP [Yuan et al. CIKM’12]

Our work is a retrieval model, not a user study.

OUR SOLUTION

Try to find an optimal solution through a sequence of dynamic interactions

Trial and Error: learn from repeated, varied attempts which are continued until success

TRIAL AND ERROR

• q1 – "dulles hotels"
• q2 – "dulles airport"
• q3 – "dulles airport location"
• q4 – "dulles metrostop"

RECAP – CHARACTERISTICS OF DYNAMIC IR

• Rich interactions
    Query formulation, document clicks, document examination, eye movement, mouse movements, etc.
• Temporal dependency
• Overall goal

WHAT IS A DESIRABLE MODEL FOR DYNAMIC IR

• Model interactions, which means it needs to have placeholders for actions;
• Model the information need hidden behind user queries and other interactions;
• Set up a reward mechanism to guide the entire search algorithm to adjust its retrieval strategies;
• Represent Markov properties to handle the temporal dependency.

A model in a Trial and Error setting will do!

A Markov Model will do!

WIN-WIN SEARCH

• Two agents work together to fulfill the information need
• Dual-agent stochastic game
• Partially Observable Markov Decision Process
• Joint Optimization
• To achieve Win-win

WIN-WIN SEARCH

• A tuple (S, T, A, R, γ, O, Θ, B)
    • S: state space
    • T: transition matrix
    • A: action space (Au, Ase, Σu, Σse)
    • R: reward function (Ru, Rse)
    • γ: discount factor, 0 < γ ≤ 1
    • O: observation set (Ωu, Ωse)
        An observation is a symbol emitted according to a hidden state.
    • Θ: observation function
        Θ(s,a,o) is the probability that o is observed when the system transitions into state s after taking action a, i.e. P(o|s,a).
    • B: belief space
        Belief is a probability distribution over hidden states.
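As a rough illustration of how the tuple could be held in code, here is a minimal Python sketch. The class and field names (`State`, `WinWinModel`, `uniform_belief`) and the default discount value are assumptions made for this example; they do not come from the slides.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    """The four hidden decision states (relevance x search mode)."""
    RT = "relevant, exploitation"
    RR = "relevant, exploration"
    NRT = "non-relevant, exploitation"
    NRR = "non-relevant, exploration"


@dataclass
class WinWinModel:
    # T: maps (s_i, s_j) to a transition probability (hypothetical layout).
    transition: dict
    # Theta: maps (s, a, o) to P(o | s, a) (hypothetical layout).
    observation_fn: dict
    # gamma: discount factor; 0.9 is a placeholder, not a value from the slides.
    gamma: float = 0.9

    def __post_init__(self):
        # The slides require 0 < gamma <= 1.
        if not 0.0 < self.gamma <= 1.0:
            raise ValueError("gamma must satisfy 0 < gamma <= 1")


def uniform_belief():
    """A belief is a probability distribution over the hidden states."""
    return {s: 1.0 / len(State) for s in State}
```

Starting from a uniform belief is only one possible initialization; the slides do not state how the initial belief is chosen.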

| Name | Symbol | Meaning |
|---|---|---|
| state | S | the four hidden decision states |
| user action | Au | add/remove/keep query terms |
| search engine action | Ase | increase/decrease/keep term weights, adjust search techniques, etc. |
| message from user to search engine | Σu | clicked and SAT clicked documents |
| message from search engine to user | Σse | top k returned documents |
| user's observation | Ωu | observations that the user makes from the world |
| search engine's observation | Ωse | observations that the search engine makes from the world and from the user |
| user reward | Ru | relevant information the user gains from reading the documents |
| search engine reward | Rse | nDCG that the search engine gains by returning documents |
| belief state | B | belief states generated from the belief updater and shared by both agents |
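Since the search engine reward is measured in nDCG, a small sketch of that metric may help. This version assumes linear gains and a log2 rank discount, and it computes the ideal ranking from the returned list only; the slides do not say which nDCG variant or cutoff was used, so treat both as assumptions.

```python
import math


def dcg(relevances, k):
    """Discounted cumulative gain over the top k graded relevance values."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg(relevances, k=10):
    """DCG normalized by the DCG of the same grades sorted in descending order."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0
```

A ranking that already lists documents in descending order of relevance scores 1.0, and a list with no relevant documents scores 0.0.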

STATES (S)

• SRT: Relevant & Exploitation
• SRR: Relevant & Exploration
• SNRT: Non-Relevant & Exploitation
• SNRR: Non-Relevant & Exploration

Example query changes:
• scooter price ⟶ scooter stores
• collecting old US coins ⟶ selling old US coins
• Philadelphia NYC travel ⟶ Philadelphia NYC train
• Boston tourism ⟶ NYC tourism

[Figure: diagram of the four states, with start label q0]

ACTIONS (AU, ASE, ΣU, ΣSE)

• User Action (Au)
    • add query terms (+Δq)
    • remove query terms (-Δq)
    • keep query terms (qtheme)
• Search Engine Action (Ase)
    • increase term weights
    • decrease term weights
    • keep term weights
    • adjust search techniques, etc.
• Message from the user (Σu)
    • clicked documents
    • SAT clicked documents
• Message from search engine (Σse)
    • top k returned documents

DUAL-AGENT STOCHASTIC GAME

1. At iteration t, the user agent takes action $a_u^t$ (query change).
2. The search engine picks the best action $a_{se}^t$ to search.
3. The search engine returns document set Dt as message $\Sigma_{se}^t$.
4. The user agent examines Dt and sends clicks as feedback messages $\Sigma_u^t$.
5. The user agent again takes action $a_u^{t+1}$ (query changes).
6. The world moves into iteration t + 1.
7. The loop continues.

Messages are essentially documents that an agent thinks are relevant.
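The numbered steps of the game can be sketched as a simple loop. This is a toy illustration under stated assumptions: `run_session`, `toy_engine`, and `toy_user` are hypothetical names, the engine is a plain term-overlap filter, and the user always clicks the first result. None of this is the authors' implementation.

```python
def run_session(user_queries, search_engine, user_feedback):
    """Play one session of the user / search-engine game.

    user_queries: the user's query at each iteration (the user action)
    search_engine(query, history): returns the ranked document list D_t
    user_feedback(docs): returns the clicked documents (the user's message)
    """
    history = []
    for t, query in enumerate(user_queries, start=1):   # steps 1, 5, 6
        docs = search_engine(query, history)            # steps 2 and 3
        clicks = user_feedback(docs)                    # step 4
        history.append({"t": t, "query": query, "docs": docs, "clicks": clicks})
    return history


def toy_engine(query, history):
    """Hypothetical engine: return documents sharing any term with the query."""
    corpus = ["dulles hotels near airport", "dulles airport location", "metro stop map"]
    terms = set(query.split())
    return [d for d in corpus if terms & set(d.split())]


def toy_user(docs):
    """Hypothetical user: click only the top result."""
    return docs[:1]


session = run_session(["dulles hotels", "dulles airport"], toy_engine, toy_user)
```

The `history` list is what a real engine would consult in step 2 to choose its action; the toy engine ignores it.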

OBSERVATION FUNCTION (O)

Probability of making observation ω after taking action a and landing in state s

e.g., the probability of making observation ω after taking action a and landing in state SRT = O(SREL, a, ω) · O(SEXPLOITATION, a, ω)

OBSERVATION FUNCTION (O)

• Intuition: Relevant or Non-relevant?

    s_t is likely to be
        Relevant        if ∃d ∈ D_{t-1} and d is SAT Clicked
        Non-Relevant    otherwise

• Observation function

$$O(s_t = \mathrm{Rel}, \Sigma_u, \omega_t = \mathrm{Rel}) \propto P(s_t = \mathrm{Rel} \mid \omega_t = \mathrm{Rel})\, P(\omega_t = \mathrm{Rel} \mid \Sigma_u)$$

• $P(s_t = \mathrm{Rel} \mid \omega_t = \mathrm{Rel})$ and $P(\omega_t = \mathrm{Rel} \mid \Sigma_u)$ are estimated from
    • log data
    • TREC ground truth

$$P(\omega_t = \mathrm{Rel} \mid \Sigma_u) = \frac{\#\text{ of observed relevance}}{\#\text{ of observations}}$$

$$P(s_t = \mathrm{Rel} \mid \omega_t = \mathrm{Rel}) = \frac{\#\text{ of observed true relevance}}{\#\text{ of observed relevance}}$$

OBSERVATION FUNCTION (O)

• Intuition: Exploration or Exploitation?

    s_t is likely to be
        Exploration     if (+Δq_t ≠ ∅ and +Δq_t ∉ D_{t-1}) or (+Δq_t = ∅ and -Δq_t ≠ ∅)
        Exploitation    if (+Δq_t ≠ ∅ and +Δq_t ∈ D_{t-1}) or (+Δq_t = ∅ and -Δq_t = ∅)

• Observation function

$$O(s_t = \mathrm{Exploration}, a_u = +\Delta q_t, \Sigma_{se} = D_{t-1}, \omega_t = \mathrm{Exploration}) \propto P(s_t = \mathrm{Exploration} \mid \omega_t = \mathrm{Exploration})\, P(\omega_t = \mathrm{Exploration} \mid +\Delta q_t, D_{t-1})$$

• $P(s_t = \mathrm{Exploration} \mid \omega_t = \mathrm{Exploration})$ and $P(\omega_t = \mathrm{Exploration} \mid +\Delta q_t, D_{t-1})$ are estimated from
    • log data
    • human judgment

$$P(\omega_t = \mathrm{Exploration} \mid +\Delta q_t, D_{t-1}) = \frac{\#\text{ of observed exploration due to add terms}}{\#\text{ of observations due to add terms}}$$

$$P(s_t = \mathrm{Exploration} \mid \omega_t = \mathrm{Exploration}) = \frac{\#\text{ of observed true exploration}}{\#\text{ of observed exploration}}$$
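The two intuition rules (Relevant vs. Non-Relevant, Exploration vs. Exploitation) can be written as small predicates. This is a sketch under assumptions: the function names are hypothetical, and "+Δq_t ∈ D_{t-1}" is interpreted here as "every added term occurs in the text of the previously returned documents", which the slides do not spell out.

```python
def observe_relevance(prev_docs, sat_clicked):
    """'Rel' if some document in D_{t-1} was SAT clicked, otherwise 'NonRel'."""
    return "Rel" if any(d in sat_clicked for d in prev_docs) else "NonRel"


def observe_mode(added_terms, removed_terms, prev_docs_text):
    """Apply the Exploration / Exploitation rule to one query change.

    added_terms:    +Δq_t, the terms added to the query
    removed_terms:  -Δq_t, the terms removed from the query
    prev_docs_text: concatenated text of D_{t-1} (assumed representation)
    """
    if added_terms:
        seen = set(prev_docs_text.lower().split())
        # Assumption: all added terms must appear in D_{t-1} to count as "in".
        in_prev = all(term.lower() in seen for term in added_terms)
        return "Exploitation" if in_prev else "Exploration"
    # No added terms: removing terms counts as exploration, no change as exploitation.
    return "Exploration" if removed_terms else "Exploitation"
```

For example, adding "train" after seeing documents that mention trains is treated as exploitation, while adding a term absent from the previous results is treated as exploration.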

BELIEF UPDATES (B)

• At every search iteration the belief state b is updated when a new observation is obtained.

$$
\begin{aligned}
b_{t+1}(s_j) &= P(s_j \mid \omega_t, a_t, b_t) \\
&= \frac{P(\omega_t \mid s_j, a_t, b_t) \sum_{s_i \in S} P(s_j \mid s_i, a_t, b_t)\, b_t(s_i)}{P(\omega_t \mid a_t, b_t)} \\
&= \frac{O(s_j, a_t, \omega_t) \sum_{s_i \in S} P(s_j \mid s_i, a_t, b_t)\, b_t(s_i)}{P(\omega_t \mid a_t, b_t)}
\end{aligned}
$$
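The belief update is a standard Bayes filter step, which can be sketched in a few lines. The function name, the dictionary layouts, and the numbers in the example are assumptions for illustration; the transition term here is not conditioned on the current belief, which simplifies the formula on the slide.

```python
def update_belief(belief, transition, obs_likelihood):
    """One belief update: b_{t+1}(s_j) ∝ O(s_j, a_t, ω_t) * Σ_i P(s_j | s_i, a_t) b_t(s_i).

    belief:         dict state -> b_t(state)
    transition:     dict (s_i, s_j) -> P(s_j | s_i, a_t) for the action taken
    obs_likelihood: dict s_j -> O(s_j, a_t, ω_t) for the observation made
    """
    unnormalized = {
        sj: obs_likelihood[sj] * sum(transition[(si, sj)] * belief[si] for si in belief)
        for sj in belief
    }
    # The normalizer plays the role of P(ω_t | a_t, b_t).
    norm = sum(unnormalized.values())
    if norm == 0:
        raise ValueError("observation has zero probability under the current belief")
    return {s: v / norm for s, v in unnormalized.items()}


states = ["RT", "RR", "NRT", "NRR"]
prior = {s: 0.25 for s in states}                       # made-up uniform prior
uniform_T = {(i, j): 0.25 for i in states for j in states}  # made-up transitions
likelihood = {"RT": 0.1, "RR": 0.2, "NRT": 0.3, "NRR": 0.4}  # made-up O values
posterior = update_belief(prior, uniform_T, likelihood)
```

With a uniform prior and uniform transitions, the posterior is simply the normalized observation likelihood, which makes the example easy to check by hand.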

BELIEF UPDATES (B)

TREC’13 session #87 topic: planning a trip to the United States. You will be there for a month and able to travel within a 150-mile radius of your destination. What are the best cities to visit?

• q1 = “best US destinations”, observation = NRR

| State | Meaning | Belief |
|---|---|---|
| SRT | Relevant & Exploitation | 0.1784 |
| SRR | Relevant & Exploration | 0.1135 |
| SNRT | Non-Relevant & Exploitation | 0.2838 |
| SNRR | Non-Relevant & Exploration | 0.4243 |

BELIEF UPDATES (B)

• q1 = “best US destinations”, observation = NRR
• q2 = “distance New York Boston”, observation = RT

| State | Meaning | Belief |
|---|---|---|
| SRT | Relevant & Exploitation | 0.0005 |
| SRR | Relevant & Exploration | 0.0068 |
| SNRT | Non-Relevant & Exploitation | 0.0715 |
| SNRR | Non-Relevant & Exploration | 0.9212 |

BELIEF UPDATES (B)

• q1 = “best US destinations”, observation = NRR
• q2 = “distance New York Boston”, observation = RT
• q3 = “maps.bing.com”, observation = NRT

| State | Meaning | Belief |
|---|---|---|
| SRT | Relevant & Exploitation | 0.0151 |
| SRR | Relevant & Exploration | 0.4347 |
| SNRT | Non-Relevant & Exploitation | 0.0276 |
| SNRR | Non-Relevant & Exploration | 0.5226 |

BELIEF UPDATES (B)

• q1 = “best US destinations”, observation = NRR
• q2 = “distance New York Boston”, observation = RT
• q3 = “maps.bing.com”, observation = NRT
• ……
• q20 = “Philadelphia NYC train”, observation = NRT

| State | Meaning | Belief |
|---|---|---|
| SRT | Relevant & Exploitation | 0.0291 |
| SRR | Relevant & Exploration | 0.7837 |
| SNRT | Non-Relevant & Exploitation | 0.0081 |
| SNRR | Non-Relevant & Exploration | 0.1790 |

BELIEF UPDATES (B)

• q1 = “best US destinations”, observation = NRR
• q2 = “distance New York Boston”, observation = RT
• q3 = “maps.bing.com”, observation = NRT
• ……
• q20 = “Philadelphia NYC train”, observation = NRT
• q21 = “Philadelphia NYC bus”, observation = NRT

| State | Meaning | Belief |
|---|---|---|
| SRT | Relevant & Exploitation | 0.0304 |
| SRR | Relevant & Exploration | 0.8126 |
| SNRT | Non-Relevant & Exploitation | 0.0066 |
| SNRR | Non-Relevant & Exploration | 0.1505 |

JOINT OPTIMIZATION – WIN-WIN

• The long term reward function for the search engine agent

$$Q_{se}(b, a) = \sum_{s \in S} b(s)\, R(s, a) + \gamma \sum_{\omega \in \Omega} P(\omega \mid b, a_u, \Sigma_{se})\, P(\omega \mid b, \Sigma_u) \max_{a} Q_{se}(b', a)$$

• The long term reward function for the user agent

$$
\begin{aligned}
Q_u(b, a_u) &= R(s, a_u) + \gamma \sum_{a} T(s_t \mid s_{t-1}, D_{t-1}) \max_{s_{t-1}} Q_u(s_{t-1}, a_u) \\
&= P(q_t \mid d) + \gamma \sum_{a} P(q_t \mid q_{t-1}, D_{t-1}, a) \max_{D_{t-1}} P(q_{t-1} \mid D_{t-1})
\end{aligned}
$$

• Joint optimization

$$a_{se} = \arg\max_{a} \left( Q_{se}(b, a) + Q_u(b, a_u) \right)$$
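To make the joint optimization concrete, here is a deliberately simplified action-selection sketch. It keeps only the immediate-reward term of the search engine's long term reward (the sum over states of b(s) R(s, a)) and drops the discounted look-ahead; `q_user` is a hypothetical callable standing in for the user agent's reward evaluated per candidate action. All names and numbers are assumptions, not the authors' implementation.

```python
def expected_reward(belief, reward, action):
    """Immediate-reward term: sum over states of b(s) * R(s, a)."""
    return sum(p * reward[(s, action)] for s, p in belief.items())


def pick_action(belief, actions, reward, q_user):
    """Choose the action maximizing the sum of both agents' (truncated) rewards."""
    return max(actions, key=lambda a: expected_reward(belief, reward, a) + q_user(a))


belief = {"RT": 0.2, "NRR": 0.8}                 # made-up two-state belief
reward = {                                        # made-up R(s, a) values
    ("RT", "increase"): 1.0, ("NRR", "increase"): 0.1,
    ("RT", "decrease"): 0.2, ("NRR", "decrease"): 0.6,
}
best = pick_action(belief, ["increase", "decrease"], reward, q_user=lambda a: 0.0)
```

Here "decrease" scores 0.2 × 0.2 + 0.8 × 0.6 = 0.52 against 0.28 for "increase", so it is selected. A fuller version would add the discounted future terms from the formulas.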

EXPERIMENTS

• Evaluate on TREC 2012 and 2013 Session Tracks
• The session logs contain
    • session topic
    • user queries
    • previously retrieved URLs, snippets
    • user clicks, dwell time, etc.
• Task: retrieve 2,000 documents for the last query in each session
• The evaluation is based on the whole session.
    • A document related to any query in the session is a good document
• Datasets
    • ClueWeb09 CatB
    • ClueWeb12 CatB
    • spam documents are removed
    • duplicated documents are removed

ACTIONS

• increasing weights of the added terms by a factor of x = 1.05, 1.10, 1.15, 1.20, 1.25, 1.5, 1.75 or 2;
• decreasing weights of the added terms by a factor of y = 0.5, 0.57, 0.67, 0.8, 0.83, 0.87, 0.9 or 0.95;
• QCM, proposed in Guan et al. SIGIR’13;
• Pseudo Relevance Feedback, which assumes the top 20 retrieved documents are relevant;
• directly using the query in the current iteration to perform retrieval;
• combining all queries in a session and weighting them equally.
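The two re-weighting actions can be illustrated with a short sketch that scales the weights of the added terms by each listed factor. The factor values come from the slide; the function names and the dictionary representation of a weighted query are assumptions for this example.

```python
INCREASE_FACTORS = [1.05, 1.10, 1.15, 1.20, 1.25, 1.5, 1.75, 2.0]
DECREASE_FACTORS = [0.5, 0.57, 0.67, 0.8, 0.83, 0.87, 0.9, 0.95]


def reweight(term_weights, added_terms, factor):
    """Return a copy of term_weights with the added terms scaled by factor."""
    return {t: w * factor if t in added_terms else w for t, w in term_weights.items()}


def candidate_queries(term_weights, added_terms):
    """Yield (factor, reweighted query) for every increase and decrease factor."""
    for factor in INCREASE_FACTORS + DECREASE_FACTORS:
        yield factor, reweight(term_weights, added_terms, factor)


query = {"dulles": 1.0, "airport": 1.0}   # hypothetical weighted query
candidates = list(candidate_queries(query, {"airport"}))
```

Each candidate would then be scored by the reward functions, and the best-scoring action kept.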

SEARCH ACCURACY

• Search accuracy on TREC 2012 Session Track

[Results: TREC 2012 Session Track]

• Win-win outperforms most retrieval algorithms on TREC 2012.

SEARCH ACCURACY

• Search accuracy on TREC 2013 Session Track

[Results: TREC 2013 Session Track]

• Systems in TREC 2012 perform better than in TREC 2013.
    • Many relevant documents are not included in the ClueWeb12 CatB collection.
• Win-win outperforms all retrieval algorithms on TREC 2013.
    • It is highly effective in Session Search.

IMMEDIATE SEARCH ACCURACY

[Figure panels: TREC 2012 Session Track; TREC 2013 Session Track]

• Original run: top returned documents provided by TREC log data
• Win-win's immediate search accuracy is better than the Original at every iteration
• Win-win's immediate search accuracy increases as the number of search iterations increases

Conclusions

• A novel session search framework
• Model the interactions between user and search engine as a dual-agent stochastic game
• Able to perform efficient optimization
    • a finite discrete set of states and actions
• Jointly search for the goal in a trial-and-error manner


THANK YOU

[email protected]
