John looks beyond taxonomy as classification and discusses ways of giving systems more data about the information they are processing.
Confidential
Machine Processing of Taxonomies
John Ferrara
Introduction
• Taxonomy as metadata (not necessarily as navigation)
• Machine processing of information
• All about the meanings of word forms
• Smart systems
  – Today, that usually means search
  – Tomorrow, it’ll mean intelligent agents
A Riddle
• 2 Vanguard systems
  – Same search engine
  – Similar content
  – Both use metadata in similar ways
  – Same queries
• One system returns higher quality search results much more reliably than the other. What makes the difference?
A Thesaurus!
Controlled Vocabularies & Thesauri
• Equivalence: These things are the same
  – IRA = individual retirement account
  – redemption = sale
  – 401(k) = 401k = 401 k
  – AKA a “synonym ring”
• Preference: This is the standard term, these are variants
  – ETF over VIPER
  – Electronic Bank Transfer over wire
  – Beneficiary over Beneficary
  – AKA an “authority file”
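The two relationship types above can be sketched as simple lookups. This is a minimal illustration, not the system described in the talk: the terms come from the slide, while the names `SYNONYM_RINGS`, `AUTHORITY_FILE`, `equivalents`, and `preferred` are hypothetical, and lowercasing terms before lookup is an assumption.

```python
# Synonym rings: every term in a ring is treated as equivalent.
# Terms are from the slide; the data structure is an assumption.
SYNONYM_RINGS = [
    {"ira", "individual retirement account"},
    {"redemption", "sale"},
    {"401(k)", "401k", "401 k"},
]

# Authority file: maps each variant to the one standard term.
AUTHORITY_FILE = {
    "viper": "ETF",
    "wire": "Electronic Bank Transfer",
    "beneficary": "Beneficiary",  # common misspelling, kept on purpose
}


def equivalents(term):
    """Return every term in the same synonym ring as `term`."""
    key = term.strip().lower()
    for ring in SYNONYM_RINGS:
        if key in ring:
            return set(ring)
    return {key}  # no ring found: the term is only equivalent to itself


def preferred(term):
    """Return the standard term for a variant, or the term unchanged."""
    return AUTHORITY_FILE.get(term.strip().lower(), term)
```

A search front end could call `equivalents` to expand a query so that "IRA" also matches documents that spell the term out, and `preferred` to normalize what users type before matching it against metadata.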
Controlled Vocabularies & Thesauri
• Classification: This is the parent (or child) of that
  – investment > mutual fund > stock fund > S&P 500 Index
  – “broader terms” and “narrower terms”
  – Similar to (but not the same as) a navigational taxonomy
• Related: This is associated with that
  – distributions & capital gains
  – download & Quicken & tax forms
  – May be used as a “See also” or “Best Bets” function
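A rough sketch of how broader, narrower, and related terms might be stored and queried follows. The terms come from the slide. The names `BROADER`, `RELATED_GROUPS`, `broader_terms`, `narrower_terms`, and `see_also` are hypothetical, and treating each "related" line as a symmetric group is an assumption.

```python
# Each term points to its immediate broader (parent) term.
BROADER = {
    "S&P 500 Index": "stock fund",
    "stock fund": "mutual fund",
    "mutual fund": "investment",
}

# Related terms, stored as groups; membership is assumed to be symmetric.
RELATED_GROUPS = [
    {"distributions", "capital gains"},
    {"download", "Quicken", "tax forms"},
]


def broader_terms(term):
    """Walk up the hierarchy and return all ancestors, nearest first."""
    chain = []
    while term in BROADER:
        term = BROADER[term]
        chain.append(term)
    return chain


def narrower_terms(term):
    """Return the immediate children of `term`."""
    return sorted(child for child, parent in BROADER.items() if parent == term)


def see_also(term):
    """Return the other members of every related group containing `term`."""
    found = set()
    for group in RELATED_GROUPS:
        if term in group:
            found |= group - {term}
    return sorted(found)
```

With this data, `broader_terms("S&P 500 Index")` returns `["stock fund", "mutual fund", "investment"]`, which a search system could use to widen a query that returns too few results, and `see_also("download")` returns `["Quicken", "tax forms"]` for a "See also" list.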
Ontologies
• “A specification of a conceptualization”
• It explains the relationships between concepts
• Languages include RDF, DAML+OIL, and OWL
Subject → Predicate → Object
Example of an Ontology
[Diagram: an ontology drawn as a graph of labeled nodes and arrows.
Node labels: Planet, Star, Satellite, Artificial, Natural, Mercury, Venus, Earth, Mars, Hubble, The moon, The sun, Atmosphere, Crust, Mantle, Core.
Arrow labels: “Is a”, “Type of”, “Goes around”, “Part of”.]
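To make the subject-predicate-object pattern concrete, here is a minimal sketch that stores a few statements as triples and queries them. The node and arrow labels are taken from this slide, but which node each arrow connects is an assumed reading of the diagram, not something the transcript confirms. The "Part of" arrows are left out because their endpoints are unclear. The names `TRIPLES`, `objects`, `subjects`, and `kinds_of` are hypothetical.

```python
# Each statement is a (subject, predicate, object) triple.
# The pairings below are an assumed reading of the slide's diagram.
TRIPLES = [
    ("Earth", "is a", "Planet"),
    ("Mars", "is a", "Planet"),
    ("The sun", "is a", "Star"),
    ("Planet", "goes around", "Star"),
    ("The moon", "is a", "Natural"),
    ("Hubble", "is a", "Artificial"),
    ("Natural", "type of", "Satellite"),
    ("Artificial", "type of", "Satellite"),
    ("Satellite", "goes around", "Planet"),
]


def objects(subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def subjects(predicate, obj):
    """Return every subject linked to `obj` by `predicate`."""
    return [s for s, p, o in TRIPLES if p == predicate and o == obj]


def kinds_of(thing):
    """Follow "is a" once, then "type of" links upward, nearest first."""
    kinds = []
    frontier = objects(thing, "is a")
    while frontier:
        current = frontier.pop(0)
        if current not in kinds:
            kinds.append(current)
            frontier.extend(objects(current, "type of"))
    return kinds
```

Under these assumed triples, `kinds_of("Hubble")` returns `["Artificial", "Satellite"]`. A system can therefore work out that Hubble is a satellite even though no single statement says so, which is the kind of extra knowledge an ontology gives a machine beyond a flat list of terms.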