Evaluation of Relevance Feedback Algorithms for XML Retrieval Silvana Solomon 27 February 2007 Supervisor: Dr. Ralf Schenkel

Page 1: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

Evaluation of Relevance Feedback Algorithms for XML Retrieval

Silvana Solomon

27 February 2007

Supervisor: Dr. Ralf Schenkel

Page 2: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

Silvana Solomon Evaluation of RF Algorithms for XML Retrieval

27 Feb 2007

Outline

Short introduction

Motivation & Goals

Evaluating retrieval effectiveness

INEX tool

Evaluation methodology

Results

Page 3: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

Introduction

[Figure: an XML document tree rooted at article, with frontmatter, body, and backmatter below it. Elements shown include sec („The IR process is composed…“), subsec („For small collections…“), p („Figure 1 outlines…“), author („Ian Ruthven“), and citation („D. Harman“). Annotations: "Path to the result" and "Content of result".]

Feedback loop between the Feedback component and the XML Search Engine:

(1) query
(2) results
(3) feedback
(4) expanded query
(5) results of expanded query

Page 4: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

Motivation

Best way to compare feedback algorithms?

Cannot use standard evaluation tools on feedback results

Goals:

Analyze evaluation methods

Develop an evaluation tool

Page 5: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

Evaluating Retrieval Effectiveness

Document collection

Topics set

Assessments set

Human assessors

Metrics

INEX: INitiative for the Evaluation of XML Retrieval

2006 document collection: 600,000 Wikipedia documents

Page 6: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

INEX Tool: EvalJ

Tool for evaluation of information retrieval experiments

Implements a set of metrics used for evaluation

Limitations: cannot measure improvement of runs produced with feedback

Page 7: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

RF Evaluation – Ranking Effect

Baseline run:
  doc[1]/bdy[1]
  doc[3]                             (marked relevant)
  doc[2]/bdy[1]
  doc[8]/bdy[1]/article[3]           (marked relevant)
  doc[4]/bdy[1]/article[1]/sec[6]

Mark relevant in top results

Feedback run:
  doc[1]
  doc[3]
  doc[8]/bdy[1]/article[3]
  doc[7]/article[3]
  doc[2]/bdy[1]/article[1]

Feedback algorithms push the known relevant results to the top of the element ranking

This artificially improves recall-precision (RP) figures

Page 8: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

RF Evaluation – Feedback Effect

Measure improvement on unseen relevant elements

Not directly tested

Modify FB run, then evaluate untrained results

Baseline run:
  doc[1]/bdy[1]
  doc[3]
  doc[2]/bdy[1]
  doc[8]/bdy[1]/article[3]
  doc[4]/bdy[1]/article[1]/sec[6]

Mark relevant in top results

Feedback run:
  doc[3]
  doc[8]/bdy[1]/article[3]

Page 9: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

Evaluation Methodology (1)

1. Standard text IR: freezing known results at the top
     independent results assumption

2. New approach: remove known results+X from the collection
     resColl-result: remove results only (~doc retrieval)
     resColl-desc: remove results+descendants
     resColl-anc: remove results+ancestors
     resColl-path: remove results+desc+anc
     resColl-doc: remove whole doc with known results
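
The five residual-collection modes listed above can be sketched as a filter over a ranked list of element paths. This is only an illustrative sketch, not the evaluation tool from the talk: the function names, the mode strings, and the sample paths are hypothetical, and it assumes that an element is an ancestor of another exactly when its path is a prefix ending at a "/" boundary. The same filter would presumably also have to be applied to the assessments.

```python
from typing import List

# Hypothetical mode names mirroring resColl-result, -desc, -anc, -path, -doc.
MODES = ("result", "desc", "anc", "path", "doc")


def is_proper_ancestor(a: str, b: str) -> bool:
    """True if element path a is a proper ancestor of element path b."""
    return b.startswith(a + "/")


def document_of(path: str) -> str:
    """First path step, assumed to identify the document (e.g. 'doc[1]')."""
    return path.split("/", 1)[0]


def residual_run(run: List[str], known: List[str], mode: str) -> List[str]:
    """Drop known results (plus related elements, per mode) from a run."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")

    def removed(path: str) -> bool:
        for k in known:
            if path == k:
                return True  # every mode removes the known result itself
            if mode in ("desc", "path") and is_proper_ancestor(k, path):
                return True  # path is a descendant of a known result
            if mode in ("anc", "path") and is_proper_ancestor(path, k):
                return True  # path is an ancestor of a known result
            if mode == "doc" and document_of(path) == document_of(k):
                return True  # same document as a known result
        return False

    return [p for p in run if not removed(p)]


# Made-up run with one known result, to show how the modes differ.
run = ["doc[1]/bdy[1]/sec[2]", "doc[1]", "doc[1]/bdy[1]",
       "doc[10]/bdy[1]", "doc[5]/bdy[1]"]
known = ["doc[1]/bdy[1]"]
for mode in MODES:
    print(mode, residual_run(run, known, mode))
```

Matching on the prefix plus "/" keeps doc[10] from being mistaken for a descendant of doc[1].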

Page 10: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

Evaluation Methodology (2)

Freezing:

Baseline run:
  doc[7]/bdy[1]
  doc[3]
  doc[2]/bdy[1]
  doc[8]/bdy[1]/article[3]
  doc[4]/bdy[1]/article[1]/sec[6]

Feedback run:
  doc[2]/bdy[1]/article[1]
  doc[9]
  doc[4]/bdy[1]/article[2]
  doc[2]/bdy[1]
  doc[4]/bdy[1]/article[4]

Page 11: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

Evaluation Methodology (2)

Freezing:

Baseline run (block top-3):
  doc[7]/bdy[1]
  doc[3]
  doc[2]/bdy[1]
  doc[8]/bdy[1]/article[3]
  doc[4]/bdy[1]/article[1]/sec[6]

Feedback run:
  doc[7]/bdy[1]
  doc[2]/bdy[1]/article[1]
  doc[9]
  doc[4]/bdy[1]/article[2]
  doc[2]/bdy[1]
  doc[4]/bdy[1]/article[4]

Page 12: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

Evaluation Methodology (2)

Freezing:

Baseline run (block top-3):
  doc[7]/bdy[1]
  doc[3]
  doc[2]/bdy[1]
  doc[8]/bdy[1]/article[3]
  doc[4]/bdy[1]/article[1]/sec[6]

Feedback run:
  doc[7]/bdy[1]
  doc[3]
  doc[2]/bdy[1]/article[1]
  doc[9]
  doc[4]/bdy[1]/article[2]
  doc[2]/bdy[1]
  doc[4]/bdy[1]/article[4]

Page 13: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

Evaluation Methodology (2)

Freezing:

Baseline run (block top-3):
  doc[7]/bdy[1]
  doc[3]
  doc[2]/bdy[1]
  doc[8]/bdy[1]/article[3]
  doc[4]/bdy[1]/article[1]/sec[6]

Feedback run:
  doc[7]/bdy[1]
  doc[3]
  doc[2]/bdy[1]
  doc[2]/bdy[1]/article[1]
  doc[9]
  doc[4]/bdy[1]/article[2]
  doc[2]/bdy[1]
  doc[4]/bdy[1]/article[4]
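
The freezing steps shown on the preceding slides can be sketched in a few lines. This is an illustrative sketch only: the function name freeze is hypothetical, and it assumes that a frozen result which reappears further down the feedback run is dropped from the tail, which the slide text does not state.

```python
from typing import List


def freeze(baseline: List[str], feedback: List[str], k: int) -> List[str]:
    """Block the top-k baseline results at the top of the feedback run."""
    frozen = baseline[:k]
    seen = set(frozen)
    # Assumption: frozen results are not repeated lower in the ranking.
    tail = [p for p in feedback if p not in seen]
    return frozen + tail


# Runs taken from the freezing example on the slides.
baseline = [
    "doc[7]/bdy[1]",
    "doc[3]",
    "doc[2]/bdy[1]",
    "doc[8]/bdy[1]/article[3]",
    "doc[4]/bdy[1]/article[1]/sec[6]",
]
feedback = [
    "doc[2]/bdy[1]/article[1]",
    "doc[9]",
    "doc[4]/bdy[1]/article[2]",
    "doc[2]/bdy[1]",
    "doc[4]/bdy[1]/article[4]",
]

for rank, path in enumerate(freeze(baseline, feedback, k=3), start=1):
    print(rank, path)
```

Because both runs then share the same top-3, any difference in the metric can only come from the results ranked below the frozen block.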

Page 15: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

Evaluation Methodology (3)

resColl-path:

Baseline run:
  doc[7]/bdy[1]
  doc[3]
  doc[2]/bdy[1]
  doc[8]/bdy[1]/article[3]
  doc[4]/bdy[1]/article[1]/sec[6]

Feedback run:
  doc[2]/bdy[1]/article[1]
  doc[9]
  doc[4]/bdy[1]/article[2]
  doc[2]/bdy[1]
  doc[4]/bdy[1]/article[4]

Page 19: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

Evaluation Methodology (3)

resColl-path:

Baseline run:
  doc[7]/bdy[1]
  doc[3]
  doc[2]/bdy[1]
  doc[8]/bdy[1]/article[3]
  doc[4]/bdy[1]/article[1]/sec[6]

Feedback run:
  doc[2]/bdy[1]/article[1]
  doc[9]
  doc[4]/bdy[1]/article[2]
  doc[4]/bdy[1]/article[4]

Page 21: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

Best Evaluation Methodology?

[Figure: the XML document tree from the introduction (article with frontmatter, body, and backmatter; sec, subsec, and p elements; author „Ian Ruthven“; citation „D. Harman“), annotated with the label resColl-path.]

Page 22: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

Testing Evaluated Results

Standard method: average – problems:

Topic-id            205   280   307   325   341   400   Avg.
Baseline            0.2   0.3   0.1   0.1   0.2   0.3   0.2
Modified feedback   0.2   0.2   0.1   0.9   0.2   0.2   0.3

t-test & Wilcoxon signed-rank test: gives probability p that the baseline run is better than the feedback run

experiment significant if p<0.05 or p<0.01
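
The problem with plain averaging can be made concrete with the six topic scores from the table above. In that table the average rises from 0.2 to 0.3 even though only topic 325 improves and topics 280 and 400 get worse. The sketch below computes a paired t statistic over the per-topic differences using only the standard library. The function name is hypothetical, and the critical value 2.015 (one-sided, alpha 0.05, 5 degrees of freedom) is taken from a standard t table rather than from the talk.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def paired_t_statistic(baseline: Sequence[float],
                       feedback: Sequence[float]) -> float:
    """t statistic of the per-topic differences (feedback minus baseline)."""
    if len(baseline) != len(feedback):
        raise ValueError("runs must cover the same topics")
    diffs = [f - b for b, f in zip(baseline, feedback)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical")
    return mean(diffs) / (sd / math.sqrt(len(diffs)))


# Per-topic scores from the slide (topics 205, 280, 307, 325, 341, 400).
baseline = [0.2, 0.3, 0.1, 0.1, 0.2, 0.3]
feedback = [0.2, 0.2, 0.1, 0.9, 0.2, 0.2]

t = paired_t_statistic(baseline, feedback)
T_CRIT = 2.015  # one-sided, alpha = 0.05, df = 5 (standard t table)
print(f"t = {t:.3f}, significant at 0.05: {t > T_CRIT}")
```

The statistic comes out at roughly 0.71, well below the critical value, so the higher average alone would not count as a significant improvement here.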

Page 23: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

Results (1)

Evaluation mode: resColl-path

Feedback file          INEX metric   Abs. improv.   Rel. improv.   T-test   WSR
TopX_CO_Content.xml    0.0185         0.0112         1.5467        0.0001   0.0001
xfirm_r1_cosc3s.xml    0.0028         0.0015         1.0975        0.0003   0.0023
xfirm_r1_cosc5.xml     0.0026         0.0012         0.9222        0.0028   0.0422
xfirm_r1_cosc3.xml     0.0025         0.0012         0.8854        0.0032   0.0441
xfirm_r1_coc3s3.xml    0.0031        -0.0017        -0.3564        0.9301   0.9995
xfirm2_r2_cop4.xml     0.0032        -0.0018        -0.3594        0.8532   0.9732
xfirm2_r2_cot40.xml    0.0025        -0.0024        -0.4863        0.9239   0.9987
xfirm2_r2_cot10.xml    0.0023        -0.0026        -0.5334        0.9429   0.9999
xfirm_r1_coc3.xml      0.0014        -0.0034        -0.7186        0.9993   0.9999
xfirm_r1_coc10.xml     0.0013        -0.0035        -0.7281        0.9989   0.9999
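
The slides do not define the "Abs. improv." and "Rel. improv." columns. A common reading, assumed here, is that the absolute improvement is the feedback score minus the baseline score, and the relative improvement is that difference divided by the baseline score. The first table row is consistent with this reading up to rounding: 0.0112 / (0.0185 - 0.0112) is about 1.53. The function names below are hypothetical.

```python
def absolute_improvement(baseline_score: float, feedback_score: float) -> float:
    """Assumed definition: feedback score minus baseline score."""
    return feedback_score - baseline_score


def relative_improvement(baseline_score: float, feedback_score: float) -> float:
    """Assumed definition: absolute improvement as a fraction of the baseline."""
    if baseline_score == 0:
        raise ValueError("relative improvement is undefined for a zero baseline")
    return absolute_improvement(baseline_score, feedback_score) / baseline_score


# Worked example with the averages from the "Testing Evaluated Results" slide.
abs_imp = absolute_improvement(0.2, 0.3)
rel_imp = relative_improvement(0.2, 0.3)
print(f"absolute: {abs_imp:.2f}, relative: {rel_imp:.2f}")
```

A negative value in either column then means the feedback run scored below its baseline.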

Page 24: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

Results (2)

Comparison of evaluation techniques based on relative improvement w.r.t. baseline run

freezing   resColl-anc   resColl-desc   resColl-doc   resColl-path   resColl-result
c3s        c3s           TopX           TopX          TopX           c3s
TopX       c5            c3s            c3s           c3s            c5
c5         TopX          c5             c5            c5             TopX
c3         c3            c3             c3            c3             c3

TopX = TopX_CO_Content.xml
c3 = xfirm_r1_cosc3.xml
c3s = xfirm_r1_cosc3s.xml
c5 = xfirm_r1_cosc5.xml

Page 25: Evaluation of Relevance Feedback Algorithms  for XML Retrieval

Conclusions & Future Work

Evaluation based on different techniques & metrics

Correct improvement measurement

Not solved: comparing several systems with different output

Maybe a hybrid evaluation mode