SWAP-07 Ontology Engineering with OntoClean Chris Welty IBM Watson Research Center


SWAP-07

Ontology Engineering with OntoClean

Chris Welty
IBM Watson Research Center


Acknowledgements

People

• Nicola Guarino
• Claudio Masolo
• Aldo Gangemi
• Alessandro Oltramari
• Bill Andersen

Organizations

• IBM Research
• Vassar College, USA
• LADSEB-CNR, Padova
• CNR Cognitive Science Institute, Trento
• OntologyWorks, Inc.


Which one is better?

[Figure: two alternative models. First: nodes T-Series, ThinkPad. Second: nodes T Series, ThinkPad Model, ThinkPad; edge label: model.]


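Which one is better?

[Figure: two alternative models. First: nodes Computer, Memory, Disk Drive, Computer Part, Memory Part, Disk Part; edge label: has-part. Second: nodes Computer, Computer Part, Disk Drive, Memory; edge label: has-part.]

Due to: Guizzardi et al., 2004.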


Formal Ontology of Relations

• Subsumption
• Instantiation
• Part/Whole
• Constitution
• Spatial (Cohn)
• Temporal (Allen)


Subsumption

• The most pervasive relationship in ontologies
  – Influence of taxonomies and OO
• AKA: Is-a, a-kind-of, specialization-of, subclass (Brachman, 1983)
  – “horse is a mammal”
• Capitalizes on general knowledge
  – Helps deal with complexity, structure
  – Reduces requirement to acquire and represent redundant specifics
• What does it mean?

□ ∀x (φ(x) → ψ(x)), where φ is the subclass and ψ is the superclass

Every instance of the subclass is necessarily an instance of the superclass
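
As a rough illustration of this reading of subsumption, the sketch below approximates "necessarily" by checking the subset condition in every one of a small list of hypothetical situations ("worlds"). The `subsumes` helper, the world contents, and the instance names are assumptions made for the example, not part of the original slides.

```python
from typing import Dict, List, Set

# A "world" maps a class name to the set of its instances in one situation.
# The worlds and instance names below are illustrative assumptions.
World = Dict[str, Set[str]]


def subsumes(superclass: str, subclass: str, worlds: List[World]) -> bool:
    """True if, in every world, each instance of `subclass` is also an
    instance of `superclass`."""
    return all(
        world.get(subclass, set()) <= world.get(superclass, set())
        for world in worlds
    )


worlds: List[World] = [
    {"Horse": {"secretariat"}, "Mammal": {"secretariat", "flipper"},
     "RaceWinner": {"secretariat"}},
    {"Horse": {"secretariat"}, "Mammal": {"secretariat", "flipper"},
     "RaceWinner": set()},
]

print(subsumes("Mammal", "Horse", worlds))      # True
print(subsumes("RaceWinner", "Horse", worlds))  # False
```

In the first world every horse happens to be a race winner, but that does not hold in the second, so `RaceWinner` does not subsume `Horse` under this check.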


Overloading Subsumption

Common modeling pitfalls

• Instantiation
• Constitution
• Composition
• Disjunction
• Polysemy
• Temporality
• Spatial/Containment


Instantiation Pitfall

[Figure: hierarchy with nodes ThinkPad Model, T21, My ThinkPad (s# xx123).]

Ooops…

Question: What ThinkPad models do you sell?
Answer should NOT include My ThinkPad, nor yours.

Does this ontology mean that My ThinkPad is a ThinkPad Model?


Instantiation

[Figure: nodes Notebook Computer, T Series, My ThinkPad (s# xx123), ThinkPad Model, T 21; edge label: model.]
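
One way to read the repaired model is that a product model and a physical machine are different kinds of things, linked by a `model` relation rather than by subsumption. The following is a minimal sketch under that assumption; the class layout, field names, and the `inventory` list are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThinkPadModel:
    """A product model such as the T 21. Its instances are models, not machines."""
    name: str


@dataclass(frozen=True)
class NotebookComputer:
    """A physical machine, identified by its serial number."""
    serial: str
    model: ThinkPadModel  # the 'model' relation, deliberately not inheritance


t21 = ThinkPadModel(name="T 21")
my_thinkpad = NotebookComputer(serial="xx123", model=t21)

# Hypothetical mixed inventory used to answer "What ThinkPad models do you sell?"
inventory = [t21, my_thinkpad]
models_sold = [item.name for item in inventory if isinstance(item, ThinkPadModel)]
print(models_sold)  # ['T 21']
```

Because `my_thinkpad` is not an instance of `ThinkPadModel`, it cannot leak into the answer to the question about models.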


Composition Pitfall

[Figure: hierarchy with nodes Computer, Memory, Disk Drive, Micro Drive.]

Question: What Computers do you sell?
Answer should NOT include Disk Drives or Memory.


Composition

[Figure: nodes Computer, Memory, Disk Drive, Micro Drive; edge label: part-of.]
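
The difference between subsumption and part-of can be sketched in code by keeping parts in a separate relation instead of in the class hierarchy. This example assumes that Micro Drive is a kind of Disk Drive and that parts are held in a `parts` list; both are assumptions for illustration, as are the labels.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DiskDrive:
    label: str


@dataclass
class MicroDrive(DiskDrive):
    """Assumed here to be a kind of disk drive (genuine subsumption)."""


@dataclass
class Memory:
    label: str


@dataclass
class Computer:
    label: str
    # part-of is modeled as a relation between instances, not as subclassing
    parts: List[object] = field(default_factory=list)


drive = MicroDrive("micro-1")
ram = Memory("ram-1")
pc = Computer("pc-1", parts=[drive, ram])

# Hypothetical catalog used to answer "What Computers do you sell?"
catalog = [drive, ram, pc]
computers = [item.label for item in catalog if isinstance(item, Computer)]
print(computers)  # ['pc-1']
```

The parts are still reachable through `pc.parts`, but they no longer show up when asking for computers.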


Disjunction Pitfall

[Figure: nodes Computer, Computer Part, Memory, Disk Drive, Micro Drive, Flashcard-110, Camera-15; edge label: has-part (appears twice).]

Unintended model: flashcard-110 is a computer-part


Disjunction

[Figure: node Computer; edge label: has-part; nodes Disk Drive, Memory, …]


Polysemy Pitfall (Mikrokosmos)

[Figure: hierarchy with nodes Abstract Entity, Physical Object, Book, …]

Question: How many books do you have on Hemingway?
Answer: 5,000


Polysemy (WordNet)

[Figure: hierarchy with nodes Abstract Entity, Physical Object, Book Sense 1, Book Sense 2, …, Biography of Hemingway.]


Constitution Pitfall (WordNet)

[Figure: hierarchy with nodes Entity, Amount of Matter, Physical Object, Computer, Clay, Metal.]

Question: What types of matter will conduct electricity?
Answer should NOT include computers.


Constitution

[Figure: nodes Entity, Amount of Matter, Physical Object, Computer, Clay, Metal; edge label: constituted.]
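
A small sketch of constitution as a relation: a physical object points at the matter that constitutes it instead of being classified under that matter. For brevity the kinds of matter are represented here as plain records, and the conductivity values, the `constituted_of` field, and the `things` list are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AmountOfMatter:
    name: str
    conducts_electricity: bool  # illustrative value, assumed for the example


@dataclass(frozen=True)
class PhysicalObject:
    name: str
    # constitution is a relation to matter, not a subclass link
    constituted_of: Tuple[AmountOfMatter, ...]


metal = AmountOfMatter("Metal", conducts_electricity=True)
clay = AmountOfMatter("Clay", conducts_electricity=False)
computer = PhysicalObject("Computer", constituted_of=(metal,))

# Hypothetical knowledge base used to answer
# "What types of matter will conduct electricity?"
things = [metal, clay, computer]
conductors = [
    t.name
    for t in things
    if isinstance(t, AmountOfMatter) and t.conducts_electricity
]
print(conductors)  # ['Metal']
```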


Temporality Pitfall (Wikipedia)

[Figure: hierarchy with nodes 1960s, 1963, 1964, Chris.]


Temporality Pitfall (Wikipedia)

[Figure: hierarchy with nodes 1960s births, 1963 births, 1964 births, Chris.]


Temporality

[Figure: classes Decade, Year, Person; instances 1960s, 1963, 1964, Chris; edge labels: contains, bornIn.]
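
The repaired temporal model can be sketched with three unrelated classes connected by `contains` and `bornIn` relations. The sketch assumes a decade contains the ten years starting at its first year and that Chris is linked to 1963; both details, along with the field names, are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Year:
    value: int


@dataclass(frozen=True)
class Decade:
    label: str
    start: int  # first year of the decade

    def contains(self, year: Year) -> bool:
        """The 'contains' relation between a decade and a year."""
        return self.start <= year.value < self.start + 10


@dataclass(frozen=True)
class Person:
    name: str
    born_in: Year  # the 'bornIn' relation


sixties = Decade("1960s", 1960)
chris = Person("Chris", born_in=Year(1963))  # birth year assumed for the example

print(sixties.contains(chris.born_in))  # True
print(isinstance(chris, Year))          # False
```

The person is related to a year, and the year to a decade, without the person ever becoming an instance of either.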


Spatial/Containment Pitfall (OWL Guide)

[Figure: hierarchy with nodes French Region, Alsace Region, Loire Region.]


Spatial/Containment

[Figure: classes Country, Region; instances France, Alsace, Loire; edge label: contains.]


It’s about the instances

• For every class, think about what an instance of it is
  – What is an instance of “Loire Region”?
• Classes do not describe their subclasses
  – “Regions by Country” is a class of classes
• Criteria for individuation must remain constant within a taxonomy
  – An instance of a class is also an instance of every superclass
    • Thus “Chris” is not an instance of “1963 births”
  – Explore the “boundary conditions”
    • E.g. changes in existence, distinctions with similar classes
• “Leaf nodes” of a hierarchy have no special significance
  – Don’t switch to instances


Common Pitfalls

• Composition (part of)
  – Arm subClassOf body
• Constitution
  – Statue subClassOf marble
• Disjunction
  – (class Car partial (all hasPart CarPart))
  – (Engine subClassOf CarPart)
  – (Tire subClassOf CarPart)
• Spatial
  – NewYork subClassOf US
• Polysemy
  – Book subClassOf PhysicalObject
  – Book subClassOf ConceptualCreation
• Arbitrary organizational nodes
  – FictionalBookbyLatinAmericanAuthor subClassOf FictionalBook
• Instance
  – PinotNoir instanceOf Grape
• Temporality
  – YoungElvis instanceOf Elvis


The linguistic tests

• If P subClassOf Q, you should be able to say “P is a kind of Q”
• If a instanceOf P, you should be able to say “a is a P”
• If a instanceOf P and P subClassOf Q, you should be able to say “a is a Q”
• For every instance, there should be a class it is (rigidly) an instance of that is its natural label
• If P subClassOf Q, you should not find it natural to say “P has Q”, “P might be Q”, “P was Q”, “P is in Q”, or “P is part of Q”
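
The first three tests are mechanical enough to generate automatically, leaving a person to judge whether each sentence sounds natural. The sketch below is one possible way to do that. The triple format, the `linguistic_tests` helper, and the "Secretariat" instance are hypothetical, and only direct superclasses are followed.

```python
from typing import List, Tuple

# (subject, relation, object) triples; the format is an assumption for this sketch
Statement = Tuple[str, str, str]


def article(noun: str) -> str:
    """Pick 'a' or 'an' with a simple first-letter rule."""
    return "an" if noun[0].lower() in "aeiou" else "a"


def linguistic_tests(statements: List[Statement]) -> List[str]:
    """Produce the sentences a reviewer should read aloud for each statement."""
    superclasses = [(s, o) for s, rel, o in statements if rel == "subClassOf"]
    sentences: List[str] = []
    for subj, rel, obj in statements:
        if rel == "subClassOf":
            sentences.append(f"{subj} is a kind of {obj}")
        elif rel == "instanceOf":
            sentences.append(f"{subj} is {article(obj)} {obj}")
            # an instance of P is also an instance of each direct superclass Q
            for sub, sup in superclasses:
                if sub == obj:
                    sentences.append(f"{subj} is {article(sup)} {sup}")
    return sentences


statements: List[Statement] = [
    ("Horse", "subClassOf", "Mammal"),
    ("Arm", "subClassOf", "Body"),
    ("Secretariat", "instanceOf", "Horse"),  # hypothetical instance
]

for sentence in linguistic_tests(statements):
    print(sentence)
```

A sentence such as "Arm is a kind of Body" reads wrongly, which is the signal that the subclass link is really a part-of relation.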


What’s in a name

• Don’t argue about what specific terms mean
  – Common software architecture argument: “What is a bridge?”
• Try to find the distinctions that matter
  – Assign them labels later
• Avoid “-ish”, “-thing”, and “other-” classes
  – Find good names that will avoid meaning creep
  – “Other-” classes create a maintenance nightmare
• Classes describe their instances
  – Remember the linguistic tests
• The superclass is not part of the name
  – So don’t assume it is (e.g. Best_Practices subClassOf Document)