Patent Track @ CLEF John Tait, Chief Scientific Officer, IRF

The IRF Mission

• To bridge the gap between information retrieval research and the world of professional search, especially in patents and intellectual property

• To promote open research on very large scale information retrieval

• To make available a facility that enables large-scale information retrieval and in-depth processing of patent and other complex data

Across the world there are about 60 million patents and the number is growing rapidly

Patent documents form the most important shared information pool:

Knowledge and research

Innovative capacity and commercial strength

Legal information

Patents - General

80% of world technical-scientific knowledge can be found in patent documents – in some branches of industry the number is significantly higher still

Intellectual Property (IP):

Innovation improves competitiveness, creates jobs, promotes growth and secures prosperity.

The only valid and binding instrument to protect innovation

An important commercial asset – a monopoly on the use of an invention

Issuing licences has become a significant revenue source for many companies

Patents – Commercial Importance

Intangible Assets:

Distinctive Patent Search Characteristics

• High Recall: a single missed document can invalidate a patent

• Session based: a single search may involve days of cycles of results review and query reformulation

• Defendable: process and results may need to be defended in court

Established in 2005

Headquarters in Vienna

Has over 70 employees, an expert team of software developers, technicians, mathematicians, language experts and other specialists

Field of activity: Information Retrieval in the segment of Intellectual Property

Products: innovative solutions for searching and categorising patent data

Matrixware

Committed to provide:

1. Sample from Alexandria patent database

2. Leonardo

• Eclipse based IR open development platform

• Populated with various tools

• General IR

• NLP

• MT

• UI

But not necessarily in time for CLEF 2009

Patent Retrieval

Distinctive Problems

Patent Process

Submit to Patent Office

Agent

Grant Patent

Defend Patent

Types of Patent Search

1. Patentability

2. Validity

3. Clearance (Freedom to Operate)

4. Infringement

5. State of the Art

6. Patent Landscape

• Types 1-3 depend on prior art search

Very High Recall

• Any prior publication will invalidate a patent

Other patents including lapsed

Scientific Publication

Comics ???!!!
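To make the recall requirement concrete, here is a minimal sketch of how recall could be computed for a single prior art search. The document identifiers and the `recall` helper are hypothetical and serve only as an illustration; they are not taken from any CLEF or IRF tooling.

```python
def recall(retrieved, relevant):
    """Fraction of the relevant documents that appear in the retrieved set."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("recall is undefined without relevant documents")
    return len(relevant & set(retrieved)) / len(relevant)


# Hypothetical identifiers: prior art may be patents (including lapsed ones),
# scientific publications, or any other prior publication.
prior_art = {"patent-A", "lapsed-patent-B", "journal-article-C", "comic-D"}
search_results = ["patent-A", "journal-article-C", "lapsed-patent-B", "patent-X"]

# Relevant documents the search failed to return.
missed = prior_art - set(search_results)

print(recall(search_results, prior_art))
print(sorted(missed))
```

In this made-up example three of the four relevant documents are found, yet the one missed item could still be enough to invalidate the patent, which is why recall matters more here than in typical web search.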

Session Based

• Patent professionals searching

Often spend 2 or more days on one search

May review more than 1000 results

Work with other professionals (lawyers, chemical engineers, chemists, marketing, etc.)

Have to record and defend search process to clients and courts

Classification

• All patents are classified

IPC (International Patent Classification)

• Automatic Classification Possible

• People search for Gaps

Multilingual

1. A Russian patent can invalidate a British patent

2. Complex and changing patterns of filing language

3. Patents come in families

Same idea: different jurisdictions and languages

4. MT already widely used
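Because family members describe the same idea in different jurisdictions and languages, a result list can be collapsed so that each invention is reviewed once. The following is a minimal sketch of that grouping; the record layout (`id`, `family_id`, `lang`) and the sample values are assumptions made for illustration, not the schema of any real patent database.

```python
from collections import defaultdict


def group_by_family(results):
    """Group search results by patent family, preserving result order."""
    families = defaultdict(list)
    for doc in results:
        families[doc["family_id"]].append(doc)
    return dict(families)


# Hypothetical records: two filings of one invention plus an unrelated patent.
results = [
    {"id": "RU-0001", "family_id": "F1", "lang": "ru"},
    {"id": "GB-0002", "family_id": "F1", "lang": "en"},
    {"id": "JP-0003", "family_id": "F2", "lang": "ja"},
]

families = group_by_family(results)
for family_id, members in families.items():
    languages = sorted({doc["lang"] for doc in members})
    print(family_id, [doc["id"] for doc in members], languages)
```

Grouping by a family identifier also shows which languages a single invention has been filed in, which is one way to see where translation is needed.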

Filing Languages

• English continues to be the dominant language

• Chinese is the most rapidly growing language and may surpass English shortly (China now bigger than US)

• Activity in India is growing rapidly but looks set to be English dominated

• Cyrillic Languages especially Russian are also rising rapidly

• Japanese and Korean are very important

• German and French are important but in relative decline

• Spanish is underrepresented relative to its number of speakers worldwide

• “Minor” European Languages are declining rapidly

PAIR 08

CIKM Workshop http://www.ir-facility.org/events/pair08

Includes the proposed TREC Chemistry Track and a proposal from Erik Graf and Leif Azzopardi from Glasgow on automatic test collection creation

Break Out Session

• Meeting Room 1

• Through tunnel at end of corridor

• 118 Mødelokale 1.1

• Areas for Discussion

• Test Collection Creation

• Task(s)

• Evaluation Methodologies

• Organizational Issues

• Future Developments

Thank you for your attention

Any questions?

www.ir-facility.org

www.matrixware.com

Mailing List Subscription: http://tinyurl.com/clef-ip