An Ontology is a description of things that exist and how they relate to each other. Ontologies and Natural Language Processing (NLP) can often be seen as two sides of the same coin.
© 2012 IBM Corporation
Outline
Triples
– Reification
– Confidence Levels
Ontology
– Design
– Architecture (big picture)
– SPARQL
– Inferencing
Methodology
– Creating a Semantic Network
Triples
Subject Predicate Object
“The author of Hamlet is Shakespeare”
Shakespeare authorOf Hamlet
Hamlet hasAuthor Shakespeare
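A minimal sketch of how the two triples above might be held in code, assuming plain Python tuples stand in for a triple store. The `Triple` type, the `INVERSE` table and the `with_inverses` helper are illustrative names, not part of any RDF library.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str

# Assumed inverse pairs, taken from the slide's two phrasings of the same fact.
INVERSE = {"authorOf": "hasAuthor", "hasAuthor": "authorOf"}

def with_inverses(triples):
    """Return the input triples plus the inverse of each one with a known inverse predicate."""
    result = set(triples)
    for t in triples:
        if t.predicate in INVERSE:
            result.add(Triple(t.object, INVERSE[t.predicate], t.subject))
    return result

facts = {Triple("Shakespeare", "authorOf", "Hamlet")}
for t in sorted(with_inverses(facts)):
    print(t.subject, t.predicate, t.object)
```

Storing only one direction and deriving the other keeps the data smaller, at the cost of a lookup step when reading.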
Triples
“Shakespeare wrote Hamlet in 1876”
Shakespeare authorOf Hamlet
Hamlet writtenIn 1876
Triples (Reification)
Wikipedia states “Shakespeare wrote Hamlet in 1876”
Wikipedia states Shakespeare
Shakespeare authorOf Hamlet
Hamlet writtenIn 1876
Triples (Reification)
Wikipedia states “Shakespeare wrote Hamlet in 1876”
Wikipedia states (Hamlet writtenIn 1876)
Shakespeare authorOf Hamlet
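One way to picture reification in code, assuming triples are plain tuples: the object of the `states` triple is itself a triple. The `claims_by` helper and the `asserted` set are hypothetical names used only for this sketch.

```python
# A plain triple: (subject, predicate, object).
claim = ("Hamlet", "writtenIn", 1876)

# A reified triple: the object is itself a triple, so the source
# is recorded as stating the claim; the claim is not asserted as fact.
reified = ("Wikipedia", "states", claim)

# Facts the store itself asserts (not attributed to any source).
asserted = {("Shakespeare", "authorOf", "Hamlet")}

def claims_by(source, statements):
    """Return every triple that `source` is recorded as stating."""
    return [obj for subj, pred, obj in statements
            if subj == source and pred == "states"]

print(claims_by("Wikipedia", [reified]))
print(claim in asserted)  # the dated claim is attributed, not asserted
```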
Triples (Confidence Levels)
ShakespeareOnline states (Hamlet writtenIn 1599)
Wikipedia states (Hamlet writtenIn 1876)
When was Hamlet written?
– 1599
– 1876
Triples (Confidence Levels)
Go from this:
– ShakespeareOnline states (Hamlet writtenIn 1599)
To this:
– (ShakespeareOnline states (Hamlet writtenIn 1599)) hasConfidenceLevel 90
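A small sketch of how confidence levels could resolve the conflicting dates. The value 90 comes from the slide; the 40 assigned to Wikipedia and the `best_answer` helper are assumptions made only for illustration.

```python
# Each entry: (source, claim, confidence). The 90 comes from the slide;
# the 40 for Wikipedia is an assumed value for illustration only.
statements = [
    ("ShakespeareOnline", ("Hamlet", "writtenIn", 1599), 90),
    ("Wikipedia", ("Hamlet", "writtenIn", 1876), 40),
]

def best_answer(subject, predicate, statements):
    """Return the object of the highest-confidence matching claim, or None."""
    candidates = [(conf, claim[2]) for _, claim, conf in statements
                  if claim[0] == subject and claim[1] == predicate]
    if not candidates:
        return None
    return max(candidates)[1]

print(best_answer("Hamlet", "writtenIn", statements))
```

Both claims stay in the store; the confidence level only decides which one is returned when a single answer is needed.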
What is an Ontology?
Description of the kinds of entities there are and how they are related (Chris Welty)
Ontology
“Shakespeare wrote Hamlet in 1876”
How many “types” of things are there in this statement?
– Authors
– Books
– Plays
– Years
– Sources
– Characters
What relationships could exist between these types?
Ontology
Author
– Playwright {Shakespeare, Marlowe}
Book
– Play {Hamlet, Macbeth, Faustus}
RDF:
– Shakespeare a Playwright
– Shakespeare a Author
– Hamlet a Play
– Hamlet a Book
William Shakespeare [en2:Playwright] was an English poet and playwright, widely regarded as the greatest writer in the English language and the world's pre-eminent dramatist.
Semantic Chains
AIX hasCommand topas monitors (process uses (CPU hasComponent resources))
SPARQL
SELECT ?command
WHERE {
  AIX hasCommand ?command .
  ?command monitors/uses CPU .
}
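The query walks a two-step path (`monitors` followed by `uses`) from each AIX command. A rough Python equivalent over an in-memory set of tuples is sketched below; the `objects` and `follow_path` helpers are hypothetical, and the data is the semantic chain from the slides split into individual triples.

```python
triples = {
    ("AIX", "hasCommand", "topas"),
    ("topas", "monitors", "process"),
    ("process", "uses", "CPU"),
    ("CPU", "hasComponent", "resources"),
}

def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the store."""
    return {o for s, p, o in triples if s == subject and p == predicate}

def follow_path(start, path):
    """Follow a sequence of predicates from `start`; return the nodes reached."""
    frontier = {start}
    for predicate in path:
        frontier = {o for node in frontier for o in objects(node, predicate)}
    return frontier

# Commands of AIX whose monitors/uses path reaches CPU.
commands = sorted(c for c in objects("AIX", "hasCommand")
                  if "CPU" in follow_path(c, ["monitors", "uses"]))
print(commands)
```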
Inference
Ontology Model (Classes):
Product
– SupportedProduct (x hasMaker IBM)
Company
– IBM
– NonIBM (disjoint with IBM)
  • { Microsoft, Oracle, Teradata }
Ontology Model (Predicates):
<Product> hasMaker <Company>
Triple Store data:
Rational Software Architect hasMaker IBM
Rational Software Architect a SupportedProduct
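Reading the last triple as the result a reasoner derives from the class definition (an interpretation of the slide, not something it states outright), the rule can be sketched as follows. The `infer_supported_products` function is a hypothetical stand-in for a real OWL reasoner.

```python
triples = {("Rational Software Architect", "hasMaker", "IBM")}

def infer_supported_products(triples):
    """Add `x a SupportedProduct` for every x with `x hasMaker IBM`."""
    inferred = {(s, "a", "SupportedProduct")
                for s, p, o in triples
                if p == "hasMaker" and o == "IBM"}
    return triples | inferred

for t in sorted(infer_supported_products(triples)):
    print(*t)
```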
Tivoli Monitoring hasSynonym ITM
ITM hasComponent ITM Agent
Tivoli Monitoring hasComponent Tivoli Monitoring Agent
Tivoli Monitoring Agent hasSynonym ITM Agent
“Agent” analysis
itm agent 54
db2 agent 32
os agent 32
ul agent 31
monitoring agent 29
oracle agent 22
agent needs 21
itm ul agent 16
windows os agent 15
agent left 14
agent system 14
citrix agent 14
mysap agent 14
unix os agent 13
linux agent 13
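The slide does not say how these counts were produced. One plausible approach, sketched here under that assumption, is to count the word n-grams that contain the target term. The `phrases_with` function and the sample sentences are hypothetical.

```python
from collections import Counter

def phrases_with(term, sentences, max_len=3):
    """Count word n-grams (2..max_len words) that contain `term`."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        for n in range(2, max_len + 1):
            for i in range(len(words) - n + 1):
                gram = words[i:i + n]
                if term in gram:
                    counts[" ".join(gram)] += 1
    return counts

# Hypothetical input sentences; the slide's counts came from a real corpus.
sample = [
    "restart the itm agent",
    "the itm agent needs a restart",
    "install the db2 agent",
]
print(phrases_with("agent", sample).most_common(3))
```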
Proximal Verbs (normalized)
monitor
support
configure
run
start
show
build
appear
Events
Situation Event
Omnibus Event
ITM Event
Minor Event
Triggering Event
Console Event
System Event
TBSM Event
JMX Event
TEC Event
Blank Nodes
Explicit Characterization vs Implicit (Predicate-driven) Identification
Blank Nodes
What are blank nodes?
– A way of profiling entities
– A way of identifying entities without explicit identification
– Implicit identification
– Predicate-driven identification of data (rather than explicit characterization)
Examples:
– “That person has a child”
– “That person has a child and a husband”
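The two examples describe an entity only by the predicates it takes part in. A sketch of that predicate-driven identification, with made-up names and predicates (`alice`, `hasChild`, `hasHusband`) and a hypothetical `having` helper:

```python
triples = {
    ("alice", "hasChild", "carol"),
    ("alice", "hasHusband", "bob"),
    ("dave", "hasChild", "erin"),
}

def having(*predicates):
    """Subjects that have at least one value for every listed predicate."""
    matches = None
    for predicate in predicates:
        subjects = {s for s, p, _ in triples if p == predicate}
        matches = subjects if matches is None else matches & subjects
    return matches or set()

print(sorted(having("hasChild")))                # "has a child"
print(sorted(having("hasChild", "hasHusband")))  # "has a child and a husband"
```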
Anonymous (Anon) Nodes
What is the difference between an Anon Node and a Blank Node?
An “anonymous node” is an existentially quantified variable (∃)
A typical RDF node has an identifier to which it is useful to refer
Appendix A - Resources
Glossary
Books
Common OWL Editors
Triple Stores
Glossary
OWL – Web Ontology Language
RDF – Resource Description Framework
SPARQL – SPARQL Protocol and RDF Query Language
Books
Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL
– Author(s): Dean Allemang and Jim Hendler
– Second Edition
Common OWL Editors
TopBraid Composer (TBC)
Free Edition (also Standard + Maestro Editions) http://www.topquadrant.com/products/TB_Composer.html
Protege
Free, open source ontology editor and knowledge-base framework http://protege.stanford.edu/
Triple Stores
Comparison and links here:
http://www.w3.org/wiki/LargeTripleStores
Sesame - scalable and transactional
– May be more suited to web environments
– Setup slightly more complex than Jena TDB
Jena TDB - scalable and very simple set up
– Code samples and API introduction here: http://cattail.boulder.ibm.com/cattail/#[email protected]/files/53A1E4007F0F3DDB8C12752E093F23B6
– The latest version of Jena TDB (0.90) is transactional. Past versions of TDB were not transactional and may not be suited for web environments.
DB2-RDF – builds on top of the Jena Graph SPI.
https://www.ibm.com/developerworks/mydeveloperworks/blogs/nlp/entry/db2_rdf_nosql_graph_support13
Appendix B - OWL
OWL (Web Ontology Language)
– Built on top of RDF (same syntax as RDF)
Open World vs Closed World assumption
Parts of an Ontology:
– Header
– Classes and Individuals
– Properties
– Annotations
– Datatypes
Instance vs Subclass
OWL – Subclasses and Types
alpha rdfs:subClassOf Thing
– a rdf:type alpha
– b rdf:type alpha
beta rdfs:subClassOf alpha
– c rdf:type beta
– d rdf:type beta
– c rdf:type alpha
– d rdf:type alpha
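The last two triples for `c` and `d` follow from the subclass link rather than being stated directly. A minimal sketch of that inference, assuming each class has at most one direct superclass (a simplification; OWL allows several). The `all_types` helper is hypothetical.

```python
# Direct subclass links and direct type assertions from the slide.
subclass_of = {"beta": "alpha", "alpha": "Thing"}
direct_type = {"a": "alpha", "b": "alpha", "c": "beta", "d": "beta"}

def all_types(individual):
    """Return the direct type of `individual` followed by each superclass in turn."""
    types = []
    cls = direct_type.get(individual)
    while cls is not None:
        types.append(cls)
        cls = subclass_of.get(cls)
    return types

print(all_types("c"))
```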
OWL – Subclasses and Types
President rdfs:subClassOf Dignitary
Dignitary rdfs:subClassOf Person
This model states:
– All dignitaries are people
– All presidents are dignitaries (and thus, people)
John Smith rdf:type Person
Queen Elizabeth rdf:type Dignitary
– Queen Elizabeth rdf:type Person
GW Bush rdf:type President
– GW Bush rdf:type Dignitary
– GW Bush rdf:type Person
Barack Obama rdf:type President
– Barack Obama rdf:type Dignitary
– Barack Obama rdf:type Person
How do we expand this model to classify actively-serving American presidents?
Appendix C – OWL Properties
Transitive Property
Functional Property
Inverse Functional Property
Symmetric Property
Asymmetric Property
Reflexive Property
Irreflexive Property
Property Chains
Putting it all together
Others
Transitive Property
hasVersion rdf:type owl:TransitiveProperty
Windows hasVersion Windows XP
Windows XP hasVersion Windows XP SP2
Windows hasVersion Windows XP SP2
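The third triple is what a reasoner derives once `hasVersion` is declared transitive. A naive fixed-point sketch of that derivation follows; `transitive_closure` is a hypothetical helper, and real triple stores use far more efficient strategies.

```python
def transitive_closure(pairs):
    """Return `pairs` plus every pair implied by transitivity."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

has_version = {
    ("Windows", "Windows XP"),
    ("Windows XP", "Windows XP SP2"),
}
print(("Windows", "Windows XP SP2") in transitive_closure(has_version))
```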
Functional Property
ssn-name rdf:type owl:FunctionalProperty
123-45-6789 ssn-name Bob Smith
123-45-6789 ssn-name Robert Smythe
Bob Smith owl:sameAs Robert Smythe
Inverse Functional Property
hasSpeKey rdf:type owl:InverseFunctionalProperty
File Net Web Services hasSpeKey 5724S03
FN WS hasSpeKey 5724S03
File Net Web Services owl:sameAs FN WS
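Functional and inverse functional properties are mirror images: the first merges objects that share a subject, the second merges subjects that share an object. A sketch of both `owl:sameAs` inferences, using the data from the two slides; the function names are hypothetical.

```python
from collections import defaultdict

def same_as_from_functional(triples, predicate):
    """Functional property: one subject with several objects means the objects are the same."""
    by_subject = defaultdict(set)
    for s, p, o in triples:
        if p == predicate:
            by_subject[s].add(o)
    return {frozenset(objs) for objs in by_subject.values() if len(objs) > 1}

def same_as_from_inverse_functional(triples, predicate):
    """Inverse functional property: one object with several subjects means the subjects are the same."""
    flipped = [(o, p, s) for s, p, o in triples]
    return same_as_from_functional(flipped, predicate)

data = [
    ("123-45-6789", "ssn-name", "Bob Smith"),
    ("123-45-6789", "ssn-name", "Robert Smythe"),
    ("File Net Web Services", "hasSpeKey", "5724S03"),
    ("FN WS", "hasSpeKey", "5724S03"),
]
print(same_as_from_functional(data, "ssn-name"))
print(same_as_from_inverse_functional(data, "hasSpeKey"))
```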
Symmetric Property
siblingOf rdf:type owl:SymmetricProperty
Tim siblingOf Jim
Jim siblingOf Tim
Asymmetric Property
hasParent rdf:type owl:AsymmetricProperty
Stewie hasParent Peter
Peter does not have parent Stewie
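A symmetric property adds the reverse pair, while an asymmetric property forbids it. Both behaviours can be sketched over sets of (subject, object) pairs; the helper names are hypothetical.

```python
def symmetric_closure(pairs):
    """Symmetric property: (a, b) implies (b, a)."""
    return set(pairs) | {(b, a) for a, b in pairs}

def asymmetry_violations(pairs):
    """Asymmetric property: (a, b) and (b, a) must never both hold."""
    return {(a, b) for a, b in pairs if (b, a) in pairs}

sibling_of = {("Tim", "Jim")}
has_parent = {("Stewie", "Peter")}

print(symmetric_closure(sibling_of))
print(asymmetry_violations(has_parent))
```

An empty result from `asymmetry_violations` means the data is consistent with the asymmetric declaration.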
Reflexive Property
Irreflexive Property
Property Chain
hasGrandfather owl:propertyChainAxiom ( hasFather hasFather ) .
John III hasFather John JR
John JR hasFather John SR
John III hasGrandfather John SR
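A property chain is relation composition: following `hasFather` twice yields `hasGrandfather`. A minimal sketch with the slide's data; `compose` is a hypothetical helper.

```python
has_father = {
    ("John III", "John JR"),
    ("John JR", "John SR"),
}

def compose(first, second):
    """Chain two relations: (a, b) in first and (b, c) in second gives (a, c)."""
    return {(a, c) for a, b in first for b2, c in second if b == b2}

has_grandfather = compose(has_father, has_father)
print(has_grandfather)
```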
Putting it all together …
hasSynonym
– Transitive, Symmetric
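Declaring `hasSynonym` both transitive and symmetric means synonyms fall into groups in which every term is a synonym of every other. A sketch of that grouping is below; `synonym_groups` is a hypothetical helper, and "IBM Tivoli Monitoring" is an assumed extra synonym added to show the transitive step.

```python
def synonym_groups(pairs):
    """Group terms connected by hasSynonym, treating it as symmetric and transitive."""
    groups = []
    for a, b in pairs:
        touching = [g for g in groups if a in g or b in g]
        merged = {a, b}.union(*touching)
        groups = [g for g in groups if g not in touching] + [merged]
    return groups

# The first and last pairs are from the slides; the middle pair is assumed.
has_synonym = [
    ("Tivoli Monitoring", "ITM"),
    ("ITM", "IBM Tivoli Monitoring"),
    ("Tivoli Monitoring Agent", "ITM Agent"),
]
for group in synonym_groups(has_synonym):
    print(sorted(group))
```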
Appendix D - Classic Mereology
Transitive Axiom
Reflexive Axiom
Antisymmetric Axiom
Transitive Axiom
parts of parts are parts of the whole
If A is part of B and B is part of C, then A is part of C
Reflexive Axiom
everything is part of itself
– A is part of A
Antisymmetric Axiom
nothing is a part of its parts
– if A is part of B and A != B then B is not part of A
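Writing P(x, y) for "x is part of y" (a notation assumed here, not taken from the slides), the three axioms can be stated compactly. The antisymmetry line is the contrapositive of the slide's wording and says the same thing.

```latex
\begin{align*}
&\text{Reflexivity:}  && \forall x\; P(x, x) \\
&\text{Transitivity:} && \forall x, y, z\; \bigl(P(x, y) \land P(y, z)\bigr) \rightarrow P(x, z) \\
&\text{Antisymmetry:} && \forall x, y\; \bigl(P(x, y) \land P(y, x)\bigr) \rightarrow x = y
\end{align*}
```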
Appendix E - Partonomy
Can you distinguish parts from kinds?
Why is this important?
This is often the difference between a taxonomy and an ontology
– A taxonomy doesn’t need to distinguish between parts and kinds
– An ontology must make this distinction
Vehicle
-Car
--Engine
---Crankcase
----Aluminum Crankcase
Appendix F – Common Predicates
hasPart
– hasPart owl:inverseOf partOf
– hasPart rdf:type owl:TransitiveProperty
– partOf rdf:type owl:TransitiveProperty
hasLocus
Appendix G
Blank nodes
Anonymous (Anon) nodes
Quads
Quads
(Reference Jena Tutorial with TDB.ppt)
Maintenance*
The relational model has relations between entities established through explicit keys (primary, foreign) and associative entities.
– Changing relationships in this case is cumbersome, as it requires changes to the base model structure itself.
– Changes in an RDBMS can be difficult for a populated database.
Hierarchical models have similar limitations.
The graph model (RDF) makes it much easier to maintain the model once it is deployed.
– A critical point is that relations are part of the data, not part of the database structure.
– If a new relationship needs to be added that was not anticipated, a new triple is simply added to the datastore.
– A graph model can be traversed from any perspective. In contrast, other types of database designs might require structural changes to answer new questions that arise after initial implementation.
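The maintenance point can be illustrated with an in-memory set of tuples standing in for a triple store: a new kind of relationship is one more row of data, and the same data can be read from either end. The `hasCharacter` predicate and the `about` helper are hypothetical.

```python
store = {
    ("Hamlet", "hasAuthor", "Shakespeare"),
    ("Hamlet", "writtenIn", 1599),
}

# A relationship nobody planned for: no schema change, just one more triple.
# `hasCharacter` is a hypothetical predicate used only for illustration.
store.add(("Hamlet", "hasCharacter", "Ophelia"))

def about(node):
    """Every triple that mentions `node` as subject or object."""
    return {t for t in store if node in (t[0], t[2])}

print(len(about("Hamlet")))
print(about("Shakespeare"))
```

In a relational design the same change would typically mean a new column or join table before any row could be stored.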
Design Styles
Avoid proliferating owl:inverseOf [1]