Baoshi Yan, P2PKM 2005 7/17/2005 1 Grass-Roots Class Alignment Baoshi Yan Information Sciences Institute, University of Southern California


Grass-Roots Class Alignment

Baoshi Yan

Information Sciences Institute, University of Southern California


Motivation

Sharing structured data among peers

However, peers might use different terminology (ontologies)

Need ontology alignment


What is Alignment

Correspondence between concepts

PhDStudent <-> DoctoralStudent
Firstname <-> Givenname
Lastname <-> Familyname
major <-> specialization


Alignment: State of the Art

Heuristics-based: Name similarity Structure similarity Instance Constraints Co-occurrence

Domain Expert Centralized Precise Alignment


Our Approach

Cursory alignment by end users: easy to produce

Combining different users' alignments: reuse to reduce effort by each user

Grass-Roots Alignment

Peer-to-Peer Alignment

Alignment Corpus


Grass-roots Alignment Example: WebScripter tool

When a user puts different things into the same column, they mean the same thing.

Inferred Alignment: iswc:phone = isi:phonenumber

Inferred Alignment: iswc:Person = isi:Div2Member
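The same-column heuristic can be illustrated with a minimal sketch. The function name `infer_alignments` and the list-of-columns input layout are hypothetical and are not taken from WebScripter itself; the sketch only shows the idea that terms sharing a report column are paired up as grass-roots alignments.

```python
from itertools import combinations

def infer_alignments(columns):
    """Pair up all distinct terms that a user placed in the same column.

    columns: list of columns, each a list of ontology terms (hypothetical layout).
    Returns a set of (term, term) pairs, each pair sorted alphabetically.
    """
    pairs = set()
    for terms in columns:
        # sorted(set(...)) removes duplicates and fixes the order inside each pair
        for x, y in combinations(sorted(set(terms)), 2):
            pairs.add((x, y))
    return pairs

report = [
    ["iswc:phone", "isi:phonenumber"],   # one column of the user's report
    ["iswc:Person", "isi:Div2Member"],   # another column
]
print(infer_alignments(report))
```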


Properties of Grass-Roots Alignment

Might be:

Approximate

Inconsistent

Intransitive

[Figure: example alignments among classes of four ontologies O1 to O4: Graduate, DoctoralStudent, PhDStudent, GraduateStudent, MSStudent, MasterStudent]


Challenge

How to reuse approximate or inconsistent grass-roots alignments for alignment purposes

Approximation: conservative semantics of alignment

Inconsistency: evidence


Observations & Assumptions

Users tend to pick closest alignment candidate

[Figure: four cases (a) to (d) relating classes A, B and C between ontologies O1 and O2]


Basic Idea:

Class relationships specified in the ontology: definite

Class relationships indicated by previous alignments: indefinite/ambiguous

Inference to get more definite class relationships

Use these class relationships for future alignment


Class Alignment Algorithm: Step 1

Subclass Relationships Specified in the Ontology


Class Alignment Algorithm: Step 2

Class Relationships Implied by Grass-roots Alignments: the Semantics of Grass-roots Alignments

[Figure: alternative class hierarchies over A, B and C in ontologies O1 and O2, combined with OR and NOT]


The Semantics of Grass-roots Alignments (Cont)

[Figure: class hierarchies over A, B and C in ontologies O1 and O2, one of them marked NOT]


The Semantics of Grass-roots Alignments (Cont)

[Figure: classes A, B, C and D across ontologies O1 and O2, with the relations A · D and B · C]


Class Alignment Algorithm: Step 2

Class Relationships Implied by Alignments


Class Alignment Algorithm: Step 3: Forward-chaining Inference
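The inference rules on this slide did not survive extraction, so the following is only a generic sketch of forward chaining. It assumes a single rule, transitivity of the superclass relation (A > B and B > C gives A > C), and a hypothetical representation of facts as (superclass, subclass) pairs. The rule is applied repeatedly until no new fact appears.

```python
def forward_chain(facts):
    """Close a set of (superclass, subclass) facts under transitivity.

    Assumption: transitivity is the only rule; the rule set from the
    slide is not available in the extracted text.
    """
    kb = set(facts)
    changed = True
    while changed:                      # repeat until a fixed point is reached
        changed = False
        for (a, b) in list(kb):
            for (c, d) in list(kb):
                # a > b and b > d  =>  a > d (skip trivial a > a)
                if b == c and a != d and (a, d) not in kb:
                    kb.add((a, d))
                    changed = True
    return kb

facts = {("Student", "Graduate"), ("Graduate", "DoctoralStudent")}
print(sorted(forward_chain(facts)))
```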


Dealing with Evidences

If (f1, e1) AND (f2, e2) AND ... AND (fi, ei) => (f, e), then the evidence of f is e = e1*e2*...*ei.

If the same fact is supported by evidences e1, e2, ..., ei, then e = e1+e2+...+ei.

Also note that the same evidence doesn't count twice, that is, e1 + e1 = e1 and e1 * e1 = e1.

Quantifying evidences: V(e) is a numerical value in (0, 1).

V(e1+e2) = 1-(1-V(e1))*(1-V(e2))

V(e1*e2) = V(e1)*V(e2)
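One way to make "the same evidence doesn't count twice" concrete is to store an evidence expression as a set of terms, each term being a set of atomic evidence names, so that duplicates collapse automatically. This is a sketch under assumptions, not the representation used in the talk: the names `atom`, `plus`, `times` and `value` are hypothetical, and the absorption step (a + a*b = a) is an added simplification that keeps e * e = e for compound expressions.

```python
from itertools import product
from math import prod

def _absorb(terms):
    # Assumption: a term that strictly contains another term adds nothing (a + a*b = a).
    return frozenset(t for t in terms if not any(o < t for o in terms))

def atom(name):
    """A single piece of evidence, e.g. one user's alignment."""
    return frozenset({frozenset({name})})

def plus(e1, e2):
    """Same fact supported by e1 and by e2: e1 + e2."""
    return _absorb(e1 | e2)

def times(e1, e2):
    """Fact derived from premises carrying e1 and e2: e1 * e2."""
    return _absorb(frozenset(a | b for a, b in product(e1, e2)))

def value(e, base):
    """V(e): multiply inside each term, then combine terms as 1 - prod(1 - V(term))."""
    miss = 1.0
    for term in e:
        miss *= 1.0 - prod(base[a] for a in term)
    return 1.0 - miss

a, b = atom("a1"), atom("a2")
base = {"a1": 0.5, "a2": 0.5}          # hypothetical values V(a1), V(a2)
print(value(plus(a, b), base), value(times(a, b), base))
```

Because sets ignore duplicates, `plus(a, a)` and `times(a, a)` both return `a`, which mirrors the idempotence rules on the slide.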


Class Alignment Algorithm: Step 4: Class Alignment Using Facts KB

Sup(A): the set of superclasses of A

Sub(A): the set of subclasses of A

Ind(A): all B such that (A > B OR B > A), but neither A > B nor B > A is in KB, i.e., B and A are indistinguishable according to the facts KB.

Dealing with inconsistencies: for each B from Sup(A), if there is a better-supported fact A > B, NOT(B > A) or BA, remove B from Sup(A). Do the same to Sub(A).
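The three sets can be sketched over a small dictionary-based facts KB. Everything about the layout is an assumption made for illustration: keys are hypothetical (kind, x, y) tuples, where "gt" means x > y, "not_gt" means NOT(x > y) and "either" means (x > y OR y > x), and each value is the numeric support V(e) of that fact. Only the two contradicting fact types that are legible on the slide, A > B and NOT(B > A), are checked.

```python
def sup(kb, a):
    """Superclasses of a, dropping those contradicted by a better-supported fact."""
    out = set()
    for (kind, x, y), v in kb.items():
        if kind == "gt" and y == a:     # fact x > a
            against = max(kb.get(("gt", a, x), 0.0),
                          kb.get(("not_gt", x, a), 0.0))
            if against <= v:
                out.add(x)
    return out

def sub(kb, a):
    """Subclasses of a, filtered the same way as sup()."""
    out = set()
    for (kind, x, y), v in kb.items():
        if kind == "gt" and x == a:     # fact a > y
            against = max(kb.get(("gt", y, a), 0.0),
                          kb.get(("not_gt", a, y), 0.0))
            if against <= v:
                out.add(y)
    return out

def ind(kb, a):
    """Classes related to a by an undirected fact with no definite direction in the KB."""
    out = set()
    for (kind, x, y) in kb:
        if kind == "either" and a in (x, y):
            b = y if x == a else x
            if ("gt", a, b) not in kb and ("gt", b, a) not in kb:
                out.add(b)
    return out

kb = {  # hypothetical facts and support values
    ("gt", "Graduate", "MasterStudent"): 0.9,
    ("gt", "MasterStudent", "Graduate"): 0.3,   # weaker, contradicting fact
    ("gt", "Student", "MasterStudent"): 0.7,
    ("either", "MasterStudent", "MSStudent"): 0.8,
}
print(sup(kb, "MasterStudent"), ind(kb, "MasterStudent"))
```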


Class Alignment Using Facts KB (cont)

Examples:

Ind(MasterStudent)={MSStudent}

Sup(MasterStudent)={Graduate,Student,UnivStudent}

Sub(Graduate)={MasterStudent,MSStudent,DoctoralStudent}


Class Alignment Using Facts KB (cont)

Given A from O1, find the best alignment B in O2 in the following order:

O2 ∩ Ind(A)

O2 ∩ Sup(A): if B, B1 ∈ O2 ∩ Sup(A), pick B if B1 > B

O2 ∩ Sub(A): if B, B1 ∈ O2 ∩ Sub(A), pick B if B > B1

Everything else being equal, pick the better-supported candidate.

Otherwise there is no alignment candidate for A in O2.
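The selection order above can be sketched as a single function. The signature is hypothetical: `ind_a`, `sup_a` and `sub_a` are the precomputed sets Ind(A), Sup(A) and Sub(A), `gt` is a set of (superclass, subclass) pairs from the facts KB, and `support` maps a candidate class to a numeric support value used only for tie-breaking.

```python
def best_alignment(o2, ind_a, sup_a, sub_a, gt, support):
    """Return the best O2 candidate for a class A, or None if there is none."""
    def pick(cands):
        # Ties are broken by support, then alphabetically for determinism.
        return max(sorted(cands), key=lambda b: support.get(b, 0.0))

    # 1. Classes indistinguishable from A.
    cands = o2 & ind_a
    if cands:
        return pick(cands)

    # 2. Superclasses of A: prefer the most specific (no other candidate below it).
    cands = o2 & sup_a
    if cands:
        lowest = {b for b in cands
                  if not any((b, b1) in gt for b1 in cands if b1 != b)}
        return pick(lowest or cands)

    # 3. Subclasses of A: prefer the most general (no other candidate above it).
    cands = o2 & sub_a
    if cands:
        highest = {b for b in cands
                   if not any((b1, b) in gt for b1 in cands if b1 != b)}
        return pick(highest or cands)

    return None

# Assumed split of the example classes: O2 = {UnivStudent, Graduate, MSStudent}.
o2 = {"UnivStudent", "Graduate", "MSStudent"}
gt = {("UnivStudent", "Graduate")}
print(best_alignment(o2, set(), {"Graduate", "Student", "UnivStudent"}, set(), gt, {}))
```

With Sup(DoctoralStudent)={Graduate,Student,UnivStudent} and the assumed fact UnivStudent > Graduate, the sketch picks Graduate, the most specific superclass available in O2.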


Class Alignment Using Facts KB (cont)

Example:

Ind(MasterStudent)={MSStudent}

Sup(DoctoralStudent)={Graduate,Student,UnivStudent}

Ind(Student)={UnivStudent}

[Figure: ontologies O1 and O2 with classes Student, DoctoralStudent, MasterStudent, UnivStudent, Graduate and MSStudent]


Evaluation (qualitative analysis)

In the ideal case: each previous alignment is the best possible. Then: guaranteed correctness in some cases.

In the not-so-ideal case: bad facts are likely filtered out.

[Figure: ontologies O1 and O2 with classes Student, DoctoralStudent, UnivStudent and Graduate]

Sup(DoctoralStudent)={UnivStudent,Graduate}


Evaluation

[Figure: Performance on University Student Ontology Set. x-axis: Number of Alignments (0 to 40); y-axis ticks from 0 to 1.2. Series: Recall-Single-Inheritance, Precision-Single-Inheritance, Recall-Multi-Inheritance, Precision-Multi-Inheritance]

26 ontologies on university student domain

Measure resultant fact KB vs Reference KB
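The slide does not say how the resulting fact KB is scored against the reference KB. A minimal sketch, assuming the standard set-based definitions of precision and recall over facts, could look like this; the function name and the (superclass, subclass) fact layout are hypothetical.

```python
def precision_recall(result_kb, reference_kb):
    """Set-based precision and recall of derived facts against reference facts."""
    result_kb, reference_kb = set(result_kb), set(reference_kb)
    hits = len(result_kb & reference_kb)          # facts found in both KBs
    precision = hits / len(result_kb) if result_kb else 0.0
    recall = hits / len(reference_kb) if reference_kb else 0.0
    return precision, recall

result = {("Graduate", "DoctoralStudent"), ("Student", "DoctoralStudent"),
          ("DoctoralStudent", "Graduate")}        # last fact is wrong
reference = {("Graduate", "DoctoralStudent"), ("Student", "DoctoralStudent"),
             ("Student", "Graduate"), ("UnivStudent", "DoctoralStudent")}
print(precision_recall(result, reference))
```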


Related Work

Schema mediation, schema reconciliation, schema matching, semantic coordination, semantic mapping, and ontology mapping

ONION, PROMPT, LSD, GLUE, Automatch, SemInt, CUPID, COMA, MGS-DCM, HSDM Mediator, MOBS…

Name similarity, Structure similarity, Domain Constraints, Instance Features, Instance similarity, Multi-strategy learning, Statistical analysis, Alignment reuse.

Little work on Peer-to-Peer Alignment


Summary

An Alignment Approach:

Ontology alignment carried out by end users in a peer-to-peer fashion

Peers are both alignment consumers and producers

Future work:

Detailed experiments, theoretical analysis

Property alignment with class as context

Thank You!