Applying the KISS Principle with Prior-Art Patent Search

Walid Magdy, Gareth Jones

Dublin City University

CLEF-IP, 22 Sep 2010

DCU participation in CLEF-IP 2009

The more text, the better the results

Structured search does not help

Filtering helps

Combination of terms and phrases does better

Word matching for search is not the best

Blind relevance feedback is ineffective

Part of the answer is within the question

KISS

Keep It Simple and Straightforward

Three submitted simple runs:
1. IR run (simple search)
2. Cit run (straightforward citation extraction)
3. IR+Cit run (combine IR and Cit runs)

Evaluation results (25 submitted runs):
1. IR run (3rd in recall)
2. Cit run (1st in precision)
3. IR+Cit run (2nd in MAP, recall, and PRES)

IR run

Different document versions of a patent are merged

Only English parts are indexed (title, abstract, description, and claims)

Query is constructed from the same fields as follows:
- unigrams with freq>2 from “description” field
- bigrams with freq>3 from all fields

French and German topics are translated using Google Translate

The first three levels of the classification are used to filter results
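
The query construction and classification filter listed above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the tokenizer, the function names, the choice to count bigrams within each field and sum across fields, and the representation of a classification code as a tuple of levels are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer (lowercased alphanumeric runs); the slides do not specify one.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_query(fields, unigram_min=2, bigram_min=3):
    """Build query terms from a topic's English fields.

    `fields` maps a field name ("title", "abstract", "description", "claims")
    to its text. Unigrams come from "description" only (freq > unigram_min);
    bigrams come from all fields (freq > bigram_min).
    """
    unigram_counts = Counter(tokenize(fields.get("description", "")))
    unigrams = [t for t, c in unigram_counts.items() if c > unigram_min]

    bigram_counts = Counter()
    for text in fields.values():
        tokens = tokenize(text)
        # Count adjacent pairs inside each field so no bigram spans two fields.
        bigram_counts.update(zip(tokens, tokens[1:]))
    bigrams = [" ".join(b) for b, c in bigram_counts.items() if c > bigram_min]
    return unigrams, bigrams


def filter_by_classification(topic_classes, results, levels=3):
    """Keep results that share the first `levels` classification levels with the topic.

    Each classification code is modelled as a tuple of levels (an assumption);
    `results` is a list of (doc_id, score, doc_classes) triples.
    """
    topic_prefixes = {tuple(c[:levels]) for c in topic_classes}
    return [
        (doc_id, score)
        for doc_id, score, doc_classes in results
        if any(tuple(c[:levels]) in topic_prefixes for c in doc_classes)
    ]
```

A topic would first go through `build_query`, the resulting terms and phrases would be sent to whatever search engine is in use, and the ranked list would then be passed through `filter_by_classification`.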

Cit and IR+Cit runs

All patent IDs are extracted from the description section of the patent topics

IDs that do not exist in collection are filtered out

Remaining IDs are considered as relevant documents

Citations could be extracted from the text of only 771 out of 2,005 topics (2,307 citations)

IR run is appended to Cit run after removing duplicates to create IR+Cit run
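
The Cit and IR+Cit steps above can be sketched as follows. This is an illustrative sketch only: the regular expression for patent IDs (a two-letter office code followed by 6 to 8 digits), the normalized ID form, and the function names are assumptions, since the slides do not say how IDs were recognized.

```python
import re

# Hypothetical ID pattern: two-letter office code, optional space or hyphen,
# then 6 to 8 digits (e.g. "EP 0123456", "US-7654321").
PATENT_ID = re.compile(r"\b([A-Z]{2})[\s-]?(\d{6,8})\b")


def extract_citations(description, collection_ids):
    """Return cited patent IDs found in `description` that exist in the collection."""
    seen, cited = set(), []
    for office, number in PATENT_ID.findall(description):
        doc_id = f"{office}{number}"  # assumed normalized form of an ID
        if doc_id in collection_ids and doc_id not in seen:
            seen.add(doc_id)
            cited.append(doc_id)
    return cited


def combine_runs(cit_run, ir_run):
    """Append the IR run to the Cit run, skipping documents already present."""
    combined = list(cit_run)
    seen = set(cit_run)
    for doc_id in ir_run:
        if doc_id not in seen:
            seen.add(doc_id)
            combined.append(doc_id)
    return combined
```

Placing the extracted citations first reflects the order described on the slide (the IR run is appended to the Cit run), so cited documents always rank above the search results.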

Results

Run      MAP    R      R@100  PRES   PRES@100
IR       0.122  0.570  0.304  0.461  0.228
Cit      0.112  0.119  0.119  0.119  0.118
IR+Cit   0.203  0.618  0.385  0.523  0.316

DCU runs among submitted runs (large topics set)

[Figure: PRES@100 of the submitted runs, axis range 0 to 0.4]

Conclusion & Future Work

When simpler approaches achieve better results than sophisticated ones, much research is still needed in this area

Extracted citations can be useful for relevance feedback

Better translations can be used for FR/DE topics

Faster translation techniques can be used to translate FR/DE documents

Simply, this was the KISS principle with patent search

Thank you