Barry Smith
August 26, 2013
Ontology: A Basic Introduction
Barry Smith: who am I?
Director: National Center for Ontological Research (Buffalo)
Founder: Ontology for the Intelligence Community (OIC, now STIDS) conference series
Ontology work for
Joint-Forces Command Joint Warfighting Center
Army Net-Centric Data Strategy Center of Excellence
Army Intelligence and Information Warfare Directorate (I2WD)
Biomedical initiatives
• Stanford Medical School
• Mayo Clinic
• University of California at San Francisco
• Cleveland Clinic Semantic Database
• Duke University Health System
• University of Pittsburgh Medical Center
• German Federal Ministry of Health
• European Union eHealth Directorate
• Plant Genome Research Resource
• Protein Information Resource
http://ncor.us
Ontologists at UB (selected)
• Thomas Bittner (Geography, Philosophy)
• David Mark (Geography, NCGIA)
• Randall Dipert (Philosophy)
• Werner Ceusters (IHI, Psychiatry, Bioinformatics)
• Alex Diehl (Neurology)
• Alan Ruttenberg (Director of Institute for Healthcare Informatics (IHI) Data Warehouse)
• Peter Elkin (Chair, Department of Biomedical Informatics)
Ontology
= strong semantic indexing (tagging) system
biology
medicine
government
military? google? commerce
Why do people think they need lexicons?
For people (people need to understand each other)
• Training (Developing doctrine, …)
• Planning (Joint operations, SOPs, …)
• Executing (C2, …)
• Reporting, Outcomes measurement
For machines
• Compiling data (e.g. results of testing …)
• Sharing of data (Compiling lessons learned …)
• Collective inferencing
Approaches to the Construction of Lexicons
• Dictionary
• Thesaurus
• Subject Headings (Library of Congress, National Library of Medicine)
• Ontologies
Dictionary (Merriam-Webster)
Thesaurus
Plan
Planning
• Definition/Scope: (ADP 3-0) Planning is the art and science of understanding a situation, envisioning a desired future, and laying out effective ways of bringing about that future. Planning consists of two separate but closely related components: a conceptual component and a detailed component. Successful planning requires integrating both these components. Army leaders employ three methodologies for planning after determining the appropriate mix based on the scope of the problem, their familiarity with it, and the time available.
Planning
Subject Headings Lists
Broader, Narrower
• Cancer
  – Cancer, Astrology
  – Cancer Documentation
  – Cancer Prevention
  – Cancer, Tropic of
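The broader/narrower structure of a subject heading list can be sketched as a plain mapping. This is a minimal Python illustration that uses only the four narrower headings listed on the slide; the `narrower` dictionary and the `narrower_terms` helper are hypothetical names, not part of any real cataloguing system. It suggests why such a list is weak for reasoning: "narrower" only means "filed under", so a retrieval on "Cancer" returns the zodiac sign and the line of latitude alongside the disease topics.

```python
# Hypothetical subject-heading list: each heading maps to its narrower headings.
# The entries are the ones shown on the slide; nothing else is assumed.
narrower = {
    "Cancer": [
        "Cancer, Astrology",
        "Cancer Documentation",
        "Cancer Prevention",
        "Cancer, Tropic of",
    ],
}


def narrower_terms(heading):
    """Return every heading filed directly or indirectly under `heading`."""
    found = []
    for child in narrower.get(heading, []):
        found.append(child)
        found.extend(narrower_terms(child))
    return found


results = narrower_terms("Cancer")
for term in results:
    print(term)
```

Nothing in the data says whether a narrower heading is a kind of its broader heading, so software cannot separate the disease-related entries from the others.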
Human Disease Ontology
US DoD Civil Affairs strategy for non-classified information sharing
The problem of joint / coalition operations
• Fire Support
• Logistics
• Air Operations
• Intelligence
• Civil-Military Operations
• Targeting
• Maneuver & Blue Force Tracking
The problem with (actually existing) lexicons
• They promote the development of silos (roach motels for data)
• They do not allow us to exploit today’s technologies
• They do not combine natural language understandability with computational adequacy
• They do not scale
The military is 10 years behind the times when it comes to resolving data interoperability problems
– the problems of Big Data in biomedicine were already recognized in 1998
MKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVRSFYEDEKSGLIKVVKFRTGAMDRKRSFEKVVISVMVGKNVKKFLTFVEDEPDFQGGPISKYLIPKKINLMVYTLFQVHTLKFNRKDYDTLSLFYLNRGYYNELSFRVLERCHEIASARPNDSSTMRTFTDFVSGAPIVRSLQKSTIRKYGYNLAPYMFLLLHVDELSIFSAYQASLPGEKKVDTERLKRDLCPRKPIEIKYFSQICNDMMNKKDRLGDILHIILRACALNFGAGPRGGAGDEEDRSITNEEPIIPSVDEHGLKVCKLRSPNTPRRLRKTLDAVKALLVSSCACTARDLDIFDDNNGVAMWKWIKILYHEVAQETTLKDSYRITLVPSSDGISLLAFAGPQRNVYVDDTTRRIQLYTDYNKNGSSEPRLKTLDGLTSDYVFYFVTVLRQMQICALGNSYDAFNHDPWMDVVGFEDPNQVTNRDISRIVLYSYMFLNTAKGCLVEYATFRQYMRELPKNAPQKLNFREMRQGLIALGRHCVGSRFETDLYESATSELMANHSVQTGRNIYGVDFSLTSVSGTTATLLQERASERWIQWLGLESDYHCSFSSTRNAEDVDISRIVLYSYMFLNTAKGCLVEYATFRQYMRELPKNAPQKLNFREMRQGLIALGRHCVGSRFETDLYESATSELMANHSVQTGRNIYGVDFSLTSVSGTTATLLQERASERWIQWLGLESDYHCSFSSTRNAEDV
New biology data
Old biology data
The Gene Ontology
response to the massive opportunities created by the success of the Human Genome Project
• for cross-organism biology
• for intra-organism biology
• for the biology of environments
How to find your data?
How to reason with data when you find it?
How to understand the significance of the data you collected 3 years earlier?
How to integrate with other people’s data?
Part of the solution must involve consensus-based, standardized terminologies and coding schemes
I2WD = Intelligence and Information Warfare Directorate
DCGS-A = Distributed Common Ground System – Army
DSC = DCGS-A Cloud
AIRS Ontology Suite (Ron Rudnicki)
Ontologies
controlled vocabularies (not lexicons)
plus definitions of terms in a logical language
A. for tagging (search, retrieval, …)
B. for reasoning (early warning, analysis …)
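The two uses named above, tagging and reasoning, can be illustrated with a toy example. This is a minimal Python sketch under stated assumptions: the three terms, the `IS_A` links, the document names and the helper functions are all hypothetical, and a plain dictionary of is_a links stands in for definitions written in a real logical language.

```python
# Illustrative controlled vocabulary; terms and is_a links are assumptions.
IS_A = {
    "lung cancer": "cancer",
    "cancer": "disease",
    "disease": None,
}


def ancestors(term):
    """Follow is_a links upward and return every more general term."""
    result = []
    parent = IS_A[term]
    while parent is not None:
        result.append(parent)
        parent = IS_A[parent]
    return result


# A. Tagging: documents are annotated with terms from the vocabulary.
TAGS = {
    "report-1": {"lung cancer"},
    "report-2": {"disease"},
}


def search(term):
    """B. Reasoning: a document matches if a tag is the term or one of its subtypes."""
    return sorted(
        doc
        for doc, tags in TAGS.items()
        if any(tag == term or term in ancestors(tag) for tag in tags)
    )


# report-1 is retrieved although it was tagged only with "lung cancer".
print(search("cancer"))
```

Because the is_a links are explicit, a query for a general term also retrieves data tagged with more specific terms, which a flat keyword list cannot do.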