The IRF Mission
To bridge the gap between information retrieval research and the world of professional search, especially in patents and intellectual property
To promote open research on very large-scale information retrieval
To make available a facility that enables large-scale information retrieval and in-depth processing of patent and other complex data
Across the world there are about 60 million patents and the number is growing rapidly
Patent documents form the most important shared information pool:
Knowledge and research
Innovative capacity and commercial strength
Legal information
Patents - General
80% of the world's technical and scientific knowledge can be found in patent documents – in some branches of industry the share is significantly higher still
Intellectual Property (IP):
Innovation improves competitiveness, creates jobs, promotes growth and secures prosperity.
The only valid and binding instrument to protect innovation
An important commercial asset – a monopoly on the use of an invention
The issuing of licences has become a significant revenue source for many companies
Patents – Commercial importance
Intangible Assets:
Distinctive Patent Search Characteristics
• High Recall: a single missed document can invalidate a patent
• Session based: a single search may involve days of cycles of results review and query reformulation
• Defendable: process and results may need to be defended in court
Established in 2005
Headquarters in Vienna
Has over 70 employees: an expert team of software developers, technicians, mathematicians, language experts and other specialists
Field of activity: Information Retrieval in the segment of Intellectual Property
Products: innovative solutions for searching and categorising patent data
Matrixware
Committed to provide
1. Sample from Alexandria patent database
2. Leonardo
• Eclipse based IR open development platform
• Populated with various tools
• General IR
• NLP
• MT
• UI
But not necessarily in time for CLEF 2009
Types of Patent Search
1. Patentability
2. Validity
3. Clearance (Freedom to Operate)
4. Infringement
5. State of the Art
6. Patent Landscape
• 1-3 depend on prior art search
Very High Recall
• Any prior publication will invalidate a patent
Other patents including lapsed
Scientific Publication
Comics ???!!!
Session Based
• Patent Professionals Searching
Often spend 2 or more days on one search
May review more than 1000 results
Work with other professionals (lawyers, chemical engineers, chemists, marketing, etc.)
Have to record and defend search process to clients and courts
Classification
• All patents are classified
IPC
• Automatic Classification Possible
• People search for Gaps
Multilingual
1. A Russian patent can invalidate a British patent
2. Complex and changing patterns of filing language
3. Patents come in families
Same idea: different jurisdictions and languages
4. MT already widely used
Filing Languages
• English continues to be the dominant language
• Chinese is the most rapidly growing language and may surpass English shortly (China now bigger than US)
• Activity in India is growing rapidly but looks set to be English dominated
• Cyrillic-script languages, especially Russian, are also rising rapidly
• Japanese and Korean are very important
• German and French are important but in relative decline
• Spanish is underrepresented relative to its number of speakers worldwide
• “Minor” European Languages are declining rapidly
PAIR 08
CIKM Workshop http://www.ir-facility.org/events/pair08
Includes proposed TREC Chemistry Track and a proposal from Erik Graf and Leif Azzopardi from Glasgow on automatic Test Collection creation
Break Out Session
• Meeting Room 1
• Through tunnel at end of corridor
• 118 Mødelokale 1.1
• Areas for Discussion
• Test Collection Creation
• Task(s)
• Evaluation Methodologies
• Organizational Issues
• Future Developments
Thank you for your attention
Any questions?
www.ir-facility.org
www.matrixware.com
Mailing List Subscription: http://tinyurl.com/clef-ip