Semantic Web
Morteza Amini
Ontology Alignment
Sharif University of Technology Fall 94-95
Outline
The Problem of Ontologies
Ontology Heterogeneity
Ontology Alignment Overall Process
Similarity Methods
Sharif Univ. of Tech. Ontology Alignment - Morteza Amini 2
The Problem
Like the Web, the Semantic Web by design will be distributed and heterogeneous.
Ontologies are used in it to support interoperability and a common understanding between different parties.
Ontologies themselves may be heterogeneous.
Ontology Alignment is needed to find semantic relationships among entities of ontologies.
[Figure: ontologies a, b, c and d, with the question "How should I use them?"]
Need for Ontology Merging
There is significant overlap in existing ontologies:
Yahoo! and DMOZ Open Directory
Product catalogs for similar domains
Terminology (1)
Mapping: a formal expression that states the semantic relationship between two entities belonging to different ontologies.
Given two ontologies O1 and O2, mapping one ontology onto another means that for each entity (concept C, relation R, or instance I) in ontology O1, we try to find a corresponding entity, which has the same intended meaning, in ontology O2:
map(e_{1i}) = e_{2j}
Ontology Alignment: a process of producing a set of correspondences between two or more (in case of multi-alignment) ontologies. These correspondences are expressed as mappings.
Terminology (2)
Ontology Transformation: a general term for any process that leads to a new ontology O’ from an ontology O by using a transformation function T.
Ontology Translation: an ontology transformation function t for translating an ontology O written in some language L into an ontology O’ written in a distinct language L’.
Ontology Merging: the creation of a new ontology from two (possibly overlapping) source ontologies. This concept is closely related to that of integration in the database community.
An Example of Ontology Alignment
[Figure: two ontologies with nodes labeled Object, Thing, Vehicle, Car, Automobile, Boat, Owner, Speed, Fast, Ali's Peugeot, Peugeot 405, Ali, and 250 km/h, connected by "has owner", "has speed", and "has specification" links. Legend: concept, property, instance, type, similarity. Similarity links between the two ontologies are labeled 1.0, 0.6, 0.6, and 0.8.]
Car (Ontology A) is similar to Automobile (Ontology B):
Label Similarity = 0.0
Super Similarity = 1.0
Instance Similarity = 0.6
Relation Similarity = 0.8
Total Similarity = 0.6
An Example of Ontology Merging
[Figure, in three steps: two source ontologies, one rooted at Object and one rooted at Thing, containing Vehicle, Car, Bus, Automobile, Luxury Car, Family Car, Sport Car, Porsche, and BMW, are aligned and then merged. In the merged ontology, Object and Thing form a single node, as do Car and Automobile; Vehicle, Bus, Luxury Car, Family Car, Sport Car, Porsche, and BMW are retained.]
Forms of Heterogeneity in Ontologies (1)
(1) Syntactic: mismatches that depend on the choice of representation language: OWL, RDFS, DAML, N3, DATALOG, PROLOG, …
(2) Terminological: all forms of mismatch related to the process of naming the entities (e.g. individuals, classes, properties, relations) that occur in an ontology.
Typical examples:
different words are used to name the same entity (synonymy);
the same word is used to name different entities (polysemy);
words from different languages (English, French, etc.) are used to name entities;
syntactic variations of the same word (different acceptable spellings, abbreviations, use of optional prefixes or suffixes, etc.).
Mismatches at the terminological level are not as deep as those occurring at the conceptual level. However, most real cases have to do with the terminological level (e.g., with the way different people name the same entities), and therefore this level is at least as crucial as the other one.
Forms of Heterogeneity in Ontologies (2)
(3) Conceptual: we encounter mismatches which have to do with the content of an ontology.
Metaphysical differences: which have to do with how the world is “broken into pieces”.
Coverage: cover different portions – possibly overlapping– of the world.
Granularity: One ontology provides a more (or less) detailed description of the same entities.
Perspective: an ontology may provide a viewpoint, which is different from the viewpoint adopted in another ontology.
Forms of Heterogeneity in Ontologies (3)
Metaphysical differences:
Overcoming Heterogeneity
One common approach to the problems of heterogeneity is the definition of relations (mappings) across the heterogeneous representations.
These relations can be used to transform expressions of one ontology into a form compatible with that of the other.
This may happen at any level:
syntactic: through semantics-preserving transducers;
terminological: through functions mapping lexical information;
conceptual: through general transformation of the representations.
Structure of Mappings
Alignment: a process that starts from two representations O and O’ and produces a set of mappings between pairs of (simple or complex) entities <e, e’> belonging to O and O’ respectively.
Intuitively, we will assume that in general a mapping can be described as a quadruple <e, e’, n, R>, where:
e and e’ are the entities between which a relation is asserted by the mapping;
n is a degree of trust (confidence) in that mapping;
R is the relation associated with the mapping, i.e. the relation holding between e and e’.
Example: (Car, Automobile, 0.6, Equivalent)
In this course we focus on finding “equivalence” or “same as” relations.
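The quadruple can be held in a small record type. The following is only a sketch: the class name `Mapping`, its field names, and the assumption that the confidence n lies in [0, 1] are illustrative choices, not part of the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """One correspondence <e, e', n, R> between entities of two ontologies."""
    source: str        # e, an entity of ontology O
    target: str        # e', an entity of ontology O'
    confidence: float  # n, the degree of trust in the mapping
    relation: str      # R, the relation asserted between e and e'

    def __post_init__(self):
        # Assumption: confidence values are normalized to [0, 1].
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence is assumed to lie in [0, 1]")


# The example from the slide: (Car, Automobile, 0.6, Equivalent)
example = Mapping("Car", "Automobile", 0.6, "Equivalent")
print(example)
```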
Finding Mappings Through Similarity
There are many ways to assess the similarity between two entities. The most common way amounts to defining a measure of this similarity.
Characteristics that can be required of such measures:
Ontology Alignment Process
Input → 1. Feature Extraction → 2. Entity Pair Selection → 3. Similarity → 4. Aggregation → 5. Interpretation → Output
(The steps may be repeated over several iterations.)
1 & 2. Feature Extraction / Pair Selection
Extracting the entities of the two ontologies and their properties or features.
Example Features: name, label, subclassOf, instances
Pair selection
[Figure: example ontology with nodes Object, Vehicle, Car, Boat, Owner, and Speed, relations hasOwner and hasSpeed, and instances Porsche KA-123, Marc, and 250 km/h.]
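Steps 1 and 2 can be sketched as follows. The two toy ontologies, their feature values, and the function name `select_pairs` are hypothetical; the pair selection shown is the naive one that compares every entity of one ontology with every entity of the other.

```python
from itertools import product

# Hypothetical toy ontologies: entity name -> extracted features.
ontology_a = {
    "Car": {"label": "Car", "subclassOf": "Vehicle", "instances": {"Porsche KA-123"}},
    "Boat": {"label": "Boat", "subclassOf": "Vehicle", "instances": set()},
}
ontology_b = {
    "Automobile": {"label": "Automobile", "subclassOf": "Thing", "instances": {"Porsche KA-123"}},
}


def select_pairs(onto1, onto2):
    """Naive pair selection: every entity of onto1 with every entity of onto2."""
    return list(product(onto1, onto2))


pairs = select_pairs(ontology_a, ontology_b)
print(pairs)
```

A real system would usually prune this Cartesian product, since comparing all pairs grows quadratically with ontology size.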
3. Similarity - Measures
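String similarity: string comparisons, e.g. of labels. E.g.,
$$\mathrm{sim}_{String}(s_1, s_2) = \max\left(0, \frac{\min(|s_1|, |s_2|) - ed(s_1, s_2)}{\min(|s_1|, |s_2|)}\right)$$
where ed(s_1, s_2) is the edit distance between the two strings.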
Object similarity: direct object comparisons. Are two objects the same? E.g., for evaluating the similarity of instances.
Set similarity: set comparisons. Are the two sets of objects the same? E.g., for evaluating the similarity of concepts (based on their instances).
Set similarity requires a precalculated similarity of the objects, based on the object similarity method.
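The string similarity measure can be sketched in a few lines. The function names are illustrative, the edit distance is taken to be the Levenshtein distance (all costs equal to 1), and the handling of empty strings is an assumption, since the formula is undefined when the shorter string has length 0.

```python
def edit_distance(s1, s2):
    """Levenshtein distance: insertion, deletion, replacement, each with cost 1."""
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        current = [i]
        for j, c2 in enumerate(s2, start=1):
            current.append(min(
                previous[j] + 1,               # delete c1
                current[j - 1] + 1,            # insert c2
                previous[j - 1] + (c1 != c2),  # replace (free if characters match)
            ))
        previous = current
    return previous[-1]


def sim_string(s1, s2):
    """max(0, (min(|s1|, |s2|) - ed(s1, s2)) / min(|s1|, |s2|))."""
    shortest = min(len(s1), len(s2))
    if shortest == 0:
        # Assumption: two empty strings are identical, otherwise no similarity.
        return 1.0 if s1 == s2 else 0.0
    return max(0.0, (shortest - edit_distance(s1, s2)) / shortest)


print(sim_string("Car", "Automobile"))
print(sim_string("Vehicle", "Vehicles"))
```

For the labels "Car" and "Automobile" the edit distance exceeds the length of the shorter string, so the measure is clipped to 0.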
3. Similarity - Rules
Entity type   Feature      Similarity Measure
Concepts      name         String Similarity
              subclassOf   Object Similarity
              instances    Set Similarity
              …
Relations     instances    Set Similarity
Instances     name         String Similarity
              instanceOf   Object Similarity
4. Aggregation
How are the individual similarity measures combined?
Linearly
Weighted
Special Function
$$\mathrm{sim}(e, f) = \sum_{k} w_k \, \mathrm{sim}_k(e, f)$$
Aggregation methods are in fact global similarity methods.
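A weighted linear aggregation can be sketched as below. The function name `aggregate` and the default of equal weights are assumptions; the input values are the Car / Automobile similarities from the earlier alignment example, for which equal weights give a total of 0.6.

```python
def aggregate(similarities, weights=None):
    """Weighted sum of individual similarity values, sim = sum_k w_k * sim_k.

    similarities: dict mapping a measure name to its value.
    weights: dict with the same keys; defaults to equal weights (an assumption).
    """
    if weights is None:
        weights = {name: 1.0 / len(similarities) for name in similarities}
    return sum(weights[name] * value for name, value in similarities.items())


car_automobile = {"label": 0.0, "super": 1.0, "instance": 0.6, "relation": 0.8}
total = aggregate(car_automobile)
print(round(total, 2))
```

Non-uniform weights let a system trust some features more than others, for example structure over labels when the two ontologies use different naming conventions.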
5. Interpretation
From similarities to mappings.
A threshold can be applied on the similarity (measured in the previous step) to determine the required mapping.
map(e) = f if sim(e, f) > t
The threshold can be determined through test (training) data sets.
Manual interpretation based on the collected information is another approach.
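The threshold rule can be sketched as follows. The function name `interpret` is illustrative, and keeping only the best-scoring candidate f for each e is an added assumption; the rule on the slide only requires sim(e, f) > t.

```python
def interpret(similarities, threshold):
    """Turn aggregated similarities into mappings.

    similarities: dict mapping a pair (e, f) to its similarity value.
    Returns a dict e -> f for pairs with sim(e, f) > threshold.
    Assumption: if several f exceed the threshold for one e, keep the best.
    """
    best = {}
    for (e, f), sim in similarities.items():
        if sim > threshold and sim > best.get(e, (None, threshold))[1]:
            best[e] = (f, sim)
    return {e: f for e, (f, _) in best.items()}


sims = {
    ("Car", "Automobile"): 0.6,
    ("Car", "Thing"): 0.2,
    ("Boat", "Automobile"): 0.1,
}
print(interpret(sims, 0.5))
```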
Similarity Methods
Local Methods
Take a local view when computing similarities.
Global Methods
Take a global view when computing similarities, and merge the computed local similarities.
Similarity – Local Methods
Terminological Methods
  String Based Methods
  Language Based Methods
Structural Methods
  Internal Structure
  External Structure
Extensional (based on instances) Methods
  When the classes share the same instances
  When they do not
Terminological Methods
The main idea in using such measures is the fact that usually similar entities have similar names and descriptions in different ontologies.
Terminological methods compare strings.
Can be applied to: names, labels, comments concerning entities, URIs.
Take advantage of the structure of the string (as a sequence of letters).
Terminological Methods - Normalization
There are a number of normalization procedures that help improve the results of subsequent comparisons:
Case normalization: converting each alphabetic character in the strings to its lowercase counterpart;
Diacritics suppression: replacing characters with diacritic signs with their most frequent replacement (e.g. replacing Montréal with Montreal);
Blank normalization: normalizing all blank characters (blank, tabulation, carriage return) into a single blank character;
Link stripping: normalizing some links between words, e.g., replacing apostrophes and underscores with dashes;
Stopword elimination: eliminating words that can be found in a list (usually words like “to”, “a”, ...).
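The five procedures can be chained in one function. This is a minimal sketch: the function name, the order of the steps, and the tiny default stopword list are assumptions made for illustration.

```python
import re
import unicodedata

# Hypothetical stopword list; a real one would be much longer.
DEFAULT_STOPWORDS = frozenset({"to", "a", "the", "of"})


def normalize(label, stopwords=DEFAULT_STOPWORDS):
    # Case normalization.
    s = label.lower()
    # Diacritics suppression: decompose, then drop combining marks.
    s = unicodedata.normalize("NFD", s)
    s = "".join(ch for ch in s if not unicodedata.combining(ch))
    # Blank normalization: any run of whitespace becomes a single blank.
    s = re.sub(r"\s+", " ", s).strip()
    # Link stripping: apostrophes and underscores become dashes.
    s = re.sub(r"['_]", "-", s)
    # Stopword elimination.
    tokens = [t for t in s.split(" ") if t and t not in stopwords]
    return " ".join(tokens)


print(normalize("Montréal"))
print(normalize("Road to  a\tCity"))
```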
Terminological Methods - String Based
Substring Similarity
Hamming Distance
N-Gram Distance
Edit Distance
Jaro Similarity
Token Based Distances
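Two of the listed measures can be sketched briefly. The exact definitions vary between authors, so the choices here are assumptions: the Hamming distance counts a length difference as extra mismatches, and the n-gram similarity is the number of shared n-grams divided by the size of the smaller n-gram set.

```python
def hamming_distance(s, t):
    """Number of positions at which the strings differ.

    Assumption: a difference in length is counted as additional mismatches.
    """
    mismatches = sum(a != b for a, b in zip(s, t))
    return mismatches + abs(len(s) - len(t))


def ngrams(s, n=3):
    """Set of all substrings of length n."""
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def ngram_similarity(s, t, n=3):
    """Shared n-grams, normalized by the smaller n-gram set (an assumption)."""
    a, b = ngrams(s, n), ngrams(t, n)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


print(hamming_distance("karolin", "kathrin"))
print(ngram_similarity("article", "aricle"))
```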
In string edit distance, the operations usually considered are insertion of a character, replacement of a character by another and deletion of a character.
Levenshtein distance is an edit distance with all costs set to 1.
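A dynamic-programming sketch of an edit distance with configurable operation costs is given below. The function and parameter names are illustrative; with the default costs of 1 it computes the Levenshtein distance.

```python
def weighted_edit_distance(s, t, insert_cost=1, delete_cost=1, replace_cost=1):
    """Minimum total cost of turning s into t by insertions, deletions,
    and replacements of single characters."""
    rows, cols = len(s) + 1, len(t) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * delete_cost      # delete all of s[:i]
    for j in range(1, cols):
        dist[0][j] = j * insert_cost      # insert all of t[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            same = s[i - 1] == t[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + delete_cost,
                dist[i][j - 1] + insert_cost,
                dist[i - 1][j - 1] + (0 if same else replace_cost),
            )
    return dist[-1][-1]


print(weighted_edit_distance("kitten", "sitting"))
print(weighted_edit_distance("kitten", "sitting", replace_cost=2))
```

Raising the replacement cost to 2 makes a replacement no cheaper than a deletion followed by an insertion, which changes the distance between the two example strings.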
Terminological Methods – Language Based
Rely on using NLP techniques to find associations between instances of concepts or classes.
Intrinsic methods: perform the terminological matching with the help of morphological and syntactic analysis to normalize terms, e.g. stemming: going → go.
Extrinsic methods: make use of external resources such as dictionaries and lexicons (e.g. WordNet). Example: Resnik semantic similarity.
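The difference between the two families can be illustrated with a toy sketch. Everything here is hypothetical: the crude suffix stripper stands in for a real stemmer, and the hand-made synonym groups stand in for a lexicon such as WordNet. It does not implement Resnik similarity.

```python
# Hypothetical, hand-made resources standing in for a stemmer and a lexicon.
SUFFIXES = ("ing", "es", "s")
SYNONYMS = [{"car", "automobile"}, {"thing", "object"}]


def stem(word):
    """Intrinsic step: crude suffix stripping (not a real stemming algorithm)."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[: -len(suffix)]
    return word


def forms(word):
    """The lowercased word together with its stem."""
    lowered = word.lower()
    return {lowered, stem(lowered)}


def language_based_match(w1, w2):
    f1, f2 = forms(w1), forms(w2)
    if f1 & f2:
        return True  # intrinsic: same surface form or same stem
    # Extrinsic: both words appear in one synonym group of the lexicon.
    return any(group & f1 and group & f2 for group in SYNONYMS)


print(language_based_match("going", "go"))
print(language_based_match("Cars", "Automobile"))
```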
Structural Methods
The structure of the entities found in an ontology can be compared, instead of comparing their names or identifiers.
Internal Structure: use criteria such as the range of their properties (attributes and relations), their cardinality, and the transitivity and/or symmetry of their properties to calculate the similarity between them.
External Structure: The similarity comparison between two entities from two ontologies can be based on the position of entities within their hierarchies.
Structural Methods – External (1)
If two entities from two ontologies are similar, their neighbors might also be somehow similar.
Criteria for deciding that the two entities are similar include:
Their direct super-entities are already similar.
Their sibling-entities are already similar.
Their direct sub-entities are already similar.
All (or most) of their descendant-entities (entities in the subtree rooted at the entity in question) are already similar.
All (or most) of their leaf-entities are already similar.
All (or most) of the entities in the paths from the root to the entities in question are already similar.
Structural Methods – External (2)
Some existing approaches:
Structural topological dissimilarity on hierarchies
Upward Cotopic Distance
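As the upward cotopic distance is commonly defined in the literature, it is 1 minus the ratio between the shared and the combined ancestors of two concepts, where each concept counts as its own ancestor. The sketch below assumes that definition, a single acyclic hierarchy in which every concept has at most one parent, and a hypothetical toy hierarchy.

```python
def upward_cotopy(concept, parent_of):
    """The concept together with all of its ancestors.

    Assumption: parent_of is acyclic and maps each concept to a single parent.
    """
    cotopy = set()
    while concept is not None:
        cotopy.add(concept)
        concept = parent_of.get(concept)
    return cotopy


def upward_cotopic_distance(c1, c2, parent_of):
    """1 - |UC(c1) & UC(c2)| / |UC(c1) | UC(c2)|."""
    uc1 = upward_cotopy(c1, parent_of)
    uc2 = upward_cotopy(c2, parent_of)
    return 1.0 - len(uc1 & uc2) / len(uc1 | uc2)


# Hypothetical hierarchy: child -> parent.
parent_of = {
    "Vehicle": "Object",
    "Car": "Vehicle",
    "Bus": "Vehicle",
    "Sport Car": "Car",
}
print(upward_cotopic_distance("Car", "Bus", parent_of))
```

Siblings such as Car and Bus share all ancestors except themselves, so their distance is smaller than that between Sport Car and Bus.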
Extensional (based on instances) Methods
Compares the extension of classes, i.e., their set of instances rather than their interpretation.
Conditions in which such techniques can be used: When the classes share the same instances
When they do not
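For the case in which the classes share instances, one common choice is the Jaccard coefficient of the two instance sets. It is used here as an assumed example of an extensional measure; the function name and the handling of two empty extensions are illustrative.

```python
def jaccard_similarity(instances_a, instances_b):
    """Shared instances divided by all instances of the two classes."""
    a, b = set(instances_a), set(instances_b)
    if not a and not b:
        return 0.0  # assumption: two empty extensions give no evidence
    return len(a & b) / len(a | b)


car_instances = {"Porsche", "BMW"}
automobile_instances = {"Porsche"}
print(jaccard_similarity(car_instances, automobile_instances))
```

When the classes do not share instances, a set overlap like this is always 0, which is why that case needs a different technique based on comparing the instances themselves.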
Similarity – Global Methods
After the local similarities have been calculated, it remains to compute the alignment. This involves some more global treatments, including:
aggregating the results of these base methods in order to compute the similarity between compound entities;
organizing the combination of various similarity / alignment algorithms;
involving the user in the loop;
finally extracting the alignments (mappings) from the resulting (dis)similarity.
Compound Similarity
Some existing approaches:
User Feedback
The support of effective interaction of the user with the system components is one concern of ontology alignment.
User input can take place in many areas of alignment:
Assessing initial similarity between some terms;
Invoking and composing alignment methods;
Accepting or refusing similarity or alignment provided by the various methods.
Alignment Extraction
The ultimate alignment goal is a satisfactory set of correspondences (mappings) between ontologies.
Manual Extraction: displaying the entity pairs with their similarity scores and/or ranks and leaving the choice of the appropriate pairs up to the user of the alignment tool.
Automatic Extraction: using thresholds
Hard threshold: retains all the correspondences above a threshold n.
Delta method: uses as a threshold the highest similarity value minus a particular constant value d (max – d).
Proportional method: uses n percent of the highest similarity value as a threshold.
Percentage: retains the top n% of correspondences.
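The four automatic extraction methods can be sketched as filters over a table of pair similarities. The function names and sample values are hypothetical. Whether a value exactly equal to the cutoff is kept is not specified above; the sketch uses a strict comparison for the hard threshold and an inclusive one for the delta and proportional methods, and it assumes a non-empty input.

```python
def hard_threshold(sims, n):
    """Keep correspondences whose similarity is above n."""
    return {pair: s for pair, s in sims.items() if s > n}


def delta_method(sims, d):
    """Keep correspondences at or above (max - d)."""
    cutoff = max(sims.values()) - d
    return {pair: s for pair, s in sims.items() if s >= cutoff}


def proportional_method(sims, n_percent):
    """Keep correspondences at or above n percent of the maximum."""
    cutoff = max(sims.values()) * n_percent / 100.0
    return {pair: s for pair, s in sims.items() if s >= cutoff}


def percentage(sims, n_percent):
    """Keep the top n percent of correspondences, ranked by similarity."""
    ranked = sorted(sims.items(), key=lambda item: item[1], reverse=True)
    keep = round(len(ranked) * n_percent / 100.0)
    return dict(ranked[:keep])


sims = {
    ("Car", "Automobile"): 0.9,
    ("Vehicle", "Thing"): 0.7,
    ("Boat", "Bus"): 0.4,
    ("Owner", "Speed"): 0.1,
}
print(sorted(hard_threshold(sims, 0.5)))
print(sorted(percentage(sims, 50)))
```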
Existing Works
Capability columns in the original table: lexical features (string, semantic), structure features, instance features, and aggregation. A "T" marks a supported capability.

Method      Year  Organization   Project Leader  Automatic  Supported
OntoMorph   1997  S. California  Chalupsky       Semi       T
U.S. Army   1999  DARPA                          Semi       T
Smart       1999  Stanford       Fridman, Noy    Semi       T T
Chimaera    1999  Stanford       McGuinness      Semi       T T T
Prompt      2001  Stanford       Noy, Musen      Semi       T T
InfoSleuth  2001  Amsterdam      Ding            Semi       T T
A. Prompt   2002  Stanford       Noy, Musen      Semi       T T T
Glue        2002  Illinois       Doan            Automatic  T T T T
IF Map      2003  Southampton    Kalfoglou       Automatic  T T
NOM         2003  Karlsruhe      Ehrig           Automatic  T T T T T
QOM         2004  Karlsruhe      Ehrig           Automatic  T T T T
CROSI       2005  Southampton    Kalfoglou       Automatic  T T T
Any questions?