Learning to Match Ontologies on the Semantic Web AnHai Doan Jayant Madhavan Robin Dhamankar Pedro Domingos Alon Halevy

Page 1

Learning to Match Ontologies on the Semantic Web

AnHai Doan, Jayant Madhavan, Robin Dhamankar, Pedro Domingos, Alon Halevy

Page 2

GLUE

Identifies Mappings between websites

Uses Machine Learning

Uses Common Sense Knowledge

Domain Constraints

Page 3

Motivation

• Data comes from different ontologies
• Answers come from multiple web pages
• Manual matching: very tedious, error-prone, not very scalable

Page 4

Outline

• Overview of GLUE
• GLUE Architecture
• Case Studies
• CGLUE
• Case Studies
• Conclusion
• Assessment

Page 5

Overview

• Assumes 2 Ontologies
• 1-1 Matching
• Similarity between two Concepts

• Computing Joint Distribution
• P(A,B), P(A,~B), P(~A,B), P(~A,~B)

• Machine Learning
• Multistrategy Learning
• Exploiting Domain Constraints
• Data Instances
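
The four joint probabilities listed above can be estimated by counting instances. What follows is a minimal sketch, assuming every instance in a shared universe is already known to belong or not belong to each concept. The function name `joint_distribution` and the toy instance IDs are illustrative and do not come from the slides.

```python
from fractions import Fraction


def joint_distribution(universe, in_a, in_b):
    """Estimate P(A,B), P(A,~B), P(~A,B), P(~A,~B) by counting instances.

    universe: set of all instance ids (assumed non-empty)
    in_a, in_b: subsets of `universe` that belong to concepts A and B
    """
    n = len(universe)
    return {
        "A,B": Fraction(len(in_a & in_b), n),
        "A,~B": Fraction(len(in_a - in_b), n),
        "~A,B": Fraction(len(in_b - in_a), n),
        "~A,~B": Fraction(len(universe - in_a - in_b), n),
    }


# Toy data (hypothetical): six instances, A and B overlap on two of them.
universe = {"s1", "s2", "s3", "s4", "s5", "s6"}
dist = joint_distribution(
    universe,
    in_a={"s1", "s3", "s5"},
    in_b={"s1", "s2", "s3", "s4"},
)
```

The four values always sum to 1 because the four cases partition the universe. In practice the memberships would come from trained learners rather than being given.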

Page 6

Overview

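[Figure: GLUE architecture. Taxonomy O1 and Taxonomy O2 feed the base learners L1 ... Lk and the Meta Learner M, which produce joint distributions. The Similarity Estimator applies a similarity function to the joint distributions and outputs a similarity matrix. The Relaxation Labeler combines the similarity matrix with common knowledge and domain constraints and outputs mappings for the taxonomies.]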

Page 7

Distribution Estimator

[Figure: Distribution Estimator. Taxonomy O1 and Taxonomy O2 are the inputs; Base Learners L1 ... Lk feed the Meta Learner M; the output is the joint distributions.]

Page 8

Distribution Estimator

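[Figure: a taxonomy with nodes R, A, C, D, E, F and instances t1 to t7. The instances are split into two groups, {t1, t2, t3, t4} and {t5, t6, t7}, which are used to produce the Trained Learner L.]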

Page 9

Distribution Estimator

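[Figure: a taxonomy with nodes G, B, H, I, J and instances s1 to s6. The instances are split into two groups, {s1, s2, s3, s4} and {s5, s6}, and each group is passed to the learner L.]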

Page 10

Distribution Estimator

[Figure: the instances s1 to s6 divided into four groups: {s1, s3}, {s2, s4}, {s5}, {s6}.]

Page 11

Multistrategy Learning

• Base Learners
  • Content Learner
    • Frequency
    • Naïve Bayes
  • Name Learner
    • Full Name
    • Specific and Descriptive Element
• MetaLearner
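
A content learner of the kind named above can be sketched as a word-frequency Naïve Bayes classifier. This is a minimal illustration, assuming instances are plain text split on whitespace and using add-one smoothing. The class name `NaiveBayesContentLearner` and the toy course descriptions are hypothetical and do not come from the slides.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesContentLearner:
    """Multinomial Naive Bayes over word frequencies with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of instances
        self.vocab = set()
        for text, label in zip(texts, labels):
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.vocab.update(words)
        return self

    def log_scores(self, text):
        """Return log P(label) + sum of log P(word | label) for each label."""
        total = sum(self.label_counts.values())
        scores = {}
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)
            for word in text.lower().split():
                # Unseen words fall back to the smoothed count of 1.
                score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


learner = NaiveBayesContentLearner().fit(
    ["intro to databases and sql", "machine learning and data mining",
     "medieval european history", "history of modern art"],
    ["CS", "CS", "Humanities", "Humanities"],
)
prediction = learner.predict("advanced databases")
```

A name learner could reuse the same class by feeding it concept names instead of instance content.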

Page 12

MetaLearner

• Combines the base learners
• Gives each learner a weight
• User Input
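
One simple way to combine base learners is a weighted average of their confidence scores. The sketch below assumes each base learner returns a confidence per label and that the weights are supplied by hand. The function name `meta_learner` and the example scores are illustrative, and the slides do not say that GLUE uses exactly this rule.

```python
def meta_learner(base_predictions, weights):
    """Combine per-learner confidence scores into one score per label.

    base_predictions: {learner_name: {label: confidence in [0, 1]}}
    weights: {learner_name: non-negative weight}, e.g. supplied by the user
    """
    total_weight = sum(weights[name] for name in base_predictions)
    combined = {}
    for name, scores in base_predictions.items():
        for label, score in scores.items():
            combined[label] = combined.get(label, 0.0) + weights[name] * score
    # Normalise so the result stays on the same scale as the inputs.
    return {label: s / total_weight for label, s in combined.items()}


combined = meta_learner(
    {"content": {"A": 0.8, "not A": 0.2}, "name": {"A": 0.4, "not A": 0.6}},
    weights={"content": 0.75, "name": 0.25},
)
```

With these hypothetical weights the content learner dominates, so the combined score for `A` lands close to 0.7.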

Page 13

Similarity Estimator

[Figure: the Similarity Estimator takes the joint distributions and a similarity function as input and outputs a similarity matrix.]

Page 14

Similarity Estimator

• Applies a function supplied by the user
  • Jaccard-sim
• Outputs a matrix of similarities between concepts
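
The Jaccard similarity of two concepts is P(A and B) / P(A or B), which can be computed directly from three of the four joint probabilities. The following is a minimal sketch. The function names and the concept names in the toy data are hypothetical, and passing the similarity function as a parameter is one way to mirror the user-supplied function mentioned above.

```python
def jaccard_sim(p_ab, p_a_notb, p_nota_b):
    """Jaccard similarity P(A and B) / P(A or B) from joint probabilities."""
    union = p_ab + p_a_notb + p_nota_b
    return p_ab / union if union > 0 else 0.0


def similarity_matrix(joint, sim=jaccard_sim):
    """Build {(a, b): similarity} from {(a, b): (P(A,B), P(A,~B), P(~A,B))}."""
    return {pair: sim(*probs) for pair, probs in joint.items()}


# Hypothetical joint distributions for two concept pairs.
joint = {
    ("Faculty", "Academic Staff"): (0.25, 0.0, 0.25),
    ("Faculty", "Courses"): (0.0, 0.25, 0.5),
}
matrix = similarity_matrix(joint)
```

The value is 1 when the two concepts cover exactly the same instances and 0 when they are disjoint.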

Page 15

Where are we?

Find Similarities

Compute Similarities

Satisfy Constraints

Page 16

Relaxation Labeler

[Figure: the Relaxation Labeler takes the similarity matrix, common knowledge, and domain constraints as input and outputs mappings for the taxonomies.]

Page 17

Constraints

• Domain-Independent
  • General Knowledge
• Domain-Dependent
  • Interaction between two nodes
• Model each as a feature f()

Page 18

Domain Independent

Page 19

Relaxation Labeler

Searches for best mapping given constraints

A node's label is influenced by its “neighborhood”

Performs local optimization

Page 20

Local Optimization

1. Assigns initial labels
2. Performs optimization
3. Uses a formula to change a label
4. Repeats steps 2-3
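
The four steps above can be sketched as a simple loop. This is a simplified, greedy variant under stated assumptions: the slides do not give the update formula, so the `compat` function stands in for the constraint features, and the node and label names in the toy data are hypothetical.

```python
def relaxation_label(nodes, labels, neighbors, similarity, compat, max_iters=20):
    """Iteratively relabel nodes until the assignment stops changing.

    nodes: node names from the first taxonomy
    labels: candidate labels from the second taxonomy
    neighbors: {node: list of neighboring nodes}
    similarity: {(node, label): score} from the similarity estimator
    compat: function(label, neighbor_label) -> bonus or penalty; a stand-in
            for the constraint features (assumption, not from the slides)
    """
    # Step 1: assign initial labels from the similarity scores alone.
    assignment = {x: max(labels, key=lambda l: similarity[(x, l)]) for x in nodes}

    for _ in range(max_iters):
        changed = False
        for x in nodes:
            # Steps 2-3: re-score every label given the neighbors' current labels.
            def score(l):
                return similarity[(x, l)] + sum(
                    compat(l, assignment[n]) for n in neighbors[x]
                )

            best = max(labels, key=score)
            if best != assignment[x]:
                assignment[x] = best
                changed = True
        # Step 4: repeat until a full pass changes nothing.
        if not changed:
            break
    return assignment


# Toy example: neighboring nodes are penalised for sharing a label.
result = relaxation_label(
    nodes=["Dept", "Prof"],
    labels=["School", "Staff"],
    neighbors={"Dept": ["Prof"], "Prof": ["Dept"]},
    similarity={("Dept", "School"): 0.9, ("Dept", "Staff"): 0.1,
                ("Prof", "School"): 0.55, ("Prof", "Staff"): 0.45},
    compat=lambda a, b: -0.5 if a == b else 0.0,
)
```

Both nodes start with the label `School`; the penalty then pushes `Prof` to `Staff`. Because each update looks only at a node's neighbors, the loop can settle on a local rather than global optimum.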

Page 21

Local Optimization

[Equation; its terms are annotated as follows]

• Node in taxonomy O1
• Label in taxonomy O2
• Everything we know
• Other label assignments to all nodes besides X

Page 22

Local Optimization

Page 23

Where are we?

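[Figure: the GLUE architecture diagram, repeated. Taxonomy O1 and Taxonomy O2 feed the base learners L1 ... Lk and the Meta Learner M; the Similarity Estimator turns joint distributions into a similarity matrix; the Relaxation Labeler uses common knowledge and domain constraints to output mappings for the taxonomies.]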

Page 24

Case Study

• University Catalogs
• Business Profiles

• For each one
  • Entire set of data instances
  • Cleaned it up

Page 25

Results

Page 26

Improvements

• Insufficient Training Data
• Local Optimization
• Additional Base Learners
• Ambiguous Best Match

Page 27

CGLUE

Page 28

CGLUE

• Beam Search
• Uses structure and data
• No relaxation labeling (no constraints)

Page 29

CGLUE Case Study

Page 30

Improvements

• Incorporate Domain Constraints
• Object Identification

Page 31

Conclusion

• Semantic Similarity
• Multistrategy Learning
• Relaxation Labeling
• CGLUE

Page 32

Assessment

• Data Instances
• Additional Sites?
• CGLUE
• Future Work