
A Publish & Subscribe Architecture for Distributed Metadata Management

Markus Keidl 1, Alexander Kreutz 1, Alfons Kemper 1, Donald Kossmann 2

1 Universität Passau D-94030 Passau <lastname>@db.fmi.uni-passau.de

2 TU München D-81667 München [email protected]


ICDE 2002

Outline
  Motivation
  The MDV system
  The publish & subscribe algorithm
  Conclusion


Motivation
  Resource management in ObjectGlobe
  Requirements:
    Large number of clients -> 3-tier architecture
    Information close to the clients -> caching
    Up-to-date information


RDF and RDF Schema
  RDF = Resource Description Framework
    W3C Recommendation
    Defines resources, properties, and values
  RDF Schema
    W3C Candidate Recommendation
    Defines schema of metadata, similar to a class hierarchy


RDF Example: doc.rdf

<CycleProvider rdf:ID="host1">
  <serverHost>pirates.uni-passau.de</serverHost>
  <serverPort>5874</serverPort>
  <serverInformation>
    <ServerInformation rdf:ID="info1" memory="92" cpu="600" />
  </serverInformation>
</CycleProvider>


The MDV System
  Metadata: RDF and RDF Schema
  3-tier architecture: MDPs, LMRs, and MDV Clients
  Caching on local tier
  Up-to-date metadata by using a publish & subscribe mechanism


Architecture Overview

[Figure: architecture overview. Labels: MDP (three instances, forming the backbone), LMR (two instances), Publish/Subscribe, Optimizer]


Architecture: MDPs
  MDP = Metadata Provider
  Backbone of MDPs
    Sharing the same schema
    Full replication of metadata
  Metadata: globally and publicly available
  Registration, update, deletion of metadata


Architecture: LMRs
  LMR = Local Metadata Repository
  Metadata
    Caching of global metadata
      publish & subscribe
    Storing of local metadata
      only locally accessible, for sensitive data
  Query processing
    Declarative language
    Cached and local metadata


Architecture: MDV Clients
  Pose queries to LMRs
  Browse metadata at MDPs and LMRs
    determine metadata that should be cached
  Modify subscription rules of LMRs


The Publish & Subscribe System
  Based on subscription rules:
    LMRs subscribe to metadata (at MDPs)
    MDPs determine which metadata to publish (to LMRs)
  Insertion, update, or deletion of metadata triggers evaluation of the rules

[Figure: MDP and LMR, connected by arrows labeled "Publish metadata" and "Subscription rules"]


Subscription Rule Language
  Operators: =, !=, <, <=, >, >=, contains
  Example:
    search CycleProvider c
    register c
    where c.serverHost contains 'uni-passau.de'
      and c.serverInformation.memory > 64
  Joins
  Input: document + complete database
  Publish: resources, not documents
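A minimal sketch of how the example rule could be checked against a single resource. It assumes resources are held as nested Python dicts, which is an illustration only and not the MDV storage format; `OPERATORS`, `resolve_path`, and `matches` are hypothetical names.

```python
import operator

# The operators listed on the slide, mapped to Python callables.
OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "contains": lambda value, needle: needle in value,
}


def resolve_path(resource, path):
    """Follow a dotted path such as 'serverInformation.memory'."""
    value = resource
    for step in path.split("."):
        value = value[step]
    return value


def matches(resource, class_name, predicates):
    """True if the resource has the given class and satisfies all predicates."""
    if resource["class"] != class_name:
        return False
    return all(
        OPERATORS[op](resolve_path(resource, path), constant)
        for path, op, constant in predicates
    )


# The resource from doc.rdf, written as a nested dict (assumed representation).
host1 = {
    "class": "CycleProvider",
    "serverHost": "pirates.uni-passau.de",
    "serverPort": 5874,
    "serverInformation": {"class": "ServerInformation", "memory": 92, "cpu": 600},
}

# where c.serverHost contains 'uni-passau.de' and c.serverInformation.memory > 64
rule = [
    ("serverHost", "contains", "uni-passau.de"),
    ("serverInformation.memory", ">", 64),
]

is_match = matches(host1, "CycleProvider", rule)
```

Evaluating every rule this way for every document does not scale to a large rule set, which is the problem the filter algorithm later in the talk addresses.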


References
  Problem: only subscription of CycleProvider resources
  What about ServerInformation resources?

search CycleProvider c
register c
where c.serverHost contains 'uni-passau.de'
  and c.serverInformation.memory > 64

<CycleProvider rdf:ID="host1">
  <serverHost>pirates.uni-passau.de</serverHost>
  <serverPort>5874</serverPort>
  <serverInformation>
    <ServerInformation rdf:ID="info1" memory="92" cpu="600" />
  </serverInformation>
</CycleProvider>


References - Solution
  Augmentation of RDF schema
  "user-defined" dangling references
  Strong references:
    transmitted together with referencing resource
    garbage collector deletes superfluous resources at LMR
  Weak references:
    never transmitted with referencing resource
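The distinction can be sketched as follows. This is an illustration under stated assumptions, not the MDV implementation: strong references are given as a plain dict from resource id to referenced ids, weak references are simply absent from that dict, and following strong references transitively is an assumption the slide does not confirm. The function names are hypothetical.

```python
def resources_to_transmit(matched_id, strong_refs):
    """Return the matched resource plus everything reachable via strong references."""
    to_send, stack = set(), [matched_id]
    while stack:
        rid = stack.pop()
        if rid in to_send:
            continue
        to_send.add(rid)
        stack.extend(strong_refs.get(rid, ()))
    return to_send


def collect_garbage(cached, subscribed, strong_refs):
    """Keep only cached resources that are subscribed or strongly referenced by one."""
    keep = set()
    for rid in subscribed:
        keep |= resources_to_transmit(rid, strong_refs)
    return {rid for rid in cached if rid in keep}


# Assumption: serverInformation is declared as a strong reference in the schema.
strong_refs = {"doc.rdf#host1": ["doc.rdf#info1"]}

sent = resources_to_transmit("doc.rdf#host1", strong_refs)

# "doc.rdf#info2" is a made-up leftover that no subscribed resource references.
remaining = collect_garbage(
    cached={"doc.rdf#host1", "doc.rdf#info1", "doc.rdf#info2"},
    subscribed={"doc.rdf#host1"},
    strong_refs=strong_refs,
)
```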


Publish & Subscribe Algorithm
  Problem: huge set of subscription rules
  Solution: index on complete set of rules
  Goal: evaluation of a subset of all subscription rules
  Based on standard RDBMS technology


Basic Approach

RDF Document:

<CycleProvider rdf:ID="host1">
  <serverHost>pirates.uni-passau.de</serverHost>
  <serverPort>5874</serverPort>
  <serverInformation>
    <ServerInformation rdf:ID="info1" memory="92" cpu="600" />
  </serverInformation>
</CycleProvider>

Subscription Rule:

search CycleProvider c
register c
where c.serverHost contains 'uni-passau.de'
  and c.serverInformation.memory > 64

[Figure labels: RDF Document, Subscription Rule, Set of all subscription rules, Decomposition]


Advantages of the algorithm
  Based on standard RDBMS technology: robustness, scalability, and query abilities
    Usage of tables, SQL, indexes, optimizer, …
  Insertions, updates, and deletions
  Support of joins


The Filter Algorithm
  Decomposition of subscription rules
  Registration of new RDF document:
    Decomposition of the RDF document
    Execution of algorithm:
      Rules that match new metadata + new metadata
      Rules -> LMRs
      Notification of these LMRs
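The last two steps, mapping matching rules to LMRs and notifying them, might look like the sketch below. The `subscriptions` mapping, the LMR names, and the function name are hypothetical; the slides do not describe how this mapping is stored.

```python
from collections import defaultdict


def lmrs_to_notify(matched, subscriptions):
    """Group matched resources by the LMRs whose rules matched them.

    matched:       rule id -> set of resource ids matched by that rule
    subscriptions: rule id -> LMRs that registered the rule (assumed layout)
    """
    notifications = defaultdict(set)
    for rule_id, rids in matched.items():
        for lmr in subscriptions.get(rule_id, ()):
            notifications[lmr] |= rids
    return dict(notifications)


# Hypothetical data: two LMRs share one rule, a third LMR has its own rule.
matched = {"RuleF": {"doc.rdf#host1"}, "RuleA": {"doc.rdf#info1"}}
subscriptions = {"RuleF": ["lmr-1", "lmr-2"], "RuleA": ["lmr-2"], "RuleB": ["lmr-3"]}

notifications = lmrs_to_notify(matched, subscriptions)
```

Each LMR then receives one notification carrying all resources matched by any of its rules, instead of one message per rule.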


Details of the Algorithm
  Decomposition into atoms
    RDF documents
    Rules: triggering and join rules
  Evaluation:
    Determination of affected triggering rules
    Iterative evaluation of join rules
      calculation of transitive closure


Decomposition of Documents

<CycleProvider rdf:ID="host1">
  <serverHost>pirates.uni-passau.de</serverHost>
  <serverPort>5874</serverPort>
  <serverInformation>
    <ServerInformation rdf:ID="info1" memory="92" cpu="600" />
  </serverInformation>
</CycleProvider>


Table: FilterData

rid            class              property           value
doc.rdf#host1  CycleProvider      rdf#subject        doc.rdf#host1
doc.rdf#host1  CycleProvider      serverHost         pirates.uni-passau.de
doc.rdf#host1  CycleProvider      serverPort         5874
doc.rdf#host1  CycleProvider      serverInformation  doc.rdf#info1
doc.rdf#info1  ServerInformation  rdf#subject        doc.rdf#info1
doc.rdf#info1  ServerInformation  memory             92
doc.rdf#info1  ServerInformation  cpu                600
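The decomposition that yields these rows can be sketched with Python's standard XML parser. This is an illustration, not the MDV code: the example document is wrapped in an `rdf:RDF` root so that the `rdf:` prefix is declared (an assumption, since the slide omits namespaces), and `decompose` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_ID = "{%s}ID" % RDF_NS

# The slide's example, wrapped in an rdf:RDF root so the rdf: prefix is declared.
DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <CycleProvider rdf:ID="host1">
    <serverHost>pirates.uni-passau.de</serverHost>
    <serverPort>5874</serverPort>
    <serverInformation>
      <ServerInformation rdf:ID="info1" memory="92" cpu="600" />
    </serverInformation>
  </CycleProvider>
</rdf:RDF>"""


def decompose(xml_text, doc_name):
    """Flatten a document into (rid, class, property, value) atoms."""
    atoms = []
    pending = list(ET.fromstring(xml_text))  # top-level resources
    while pending:
        res = pending.pop(0)
        rid = f"{doc_name}#{res.get(RDF_ID)}"
        cls = res.tag
        atoms.append((rid, cls, "rdf#subject", rid))
        # Literal properties written as XML attributes (memory, cpu).
        for name, value in res.attrib.items():
            if name != RDF_ID:
                atoms.append((rid, cls, name, value))
        # Properties written as child elements.
        for prop in res:
            nested = list(prop)
            if nested:
                # A nested resource becomes a reference; its own atoms follow later.
                target = nested[0]
                atoms.append((rid, cls, prop.tag, f"{doc_name}#{target.get(RDF_ID)}"))
                pending.append(target)
            else:
                atoms.append((rid, cls, prop.tag, (prop.text or "").strip()))
    return atoms


atoms = decompose(DOC, "doc.rdf")
```

All values come out as strings here; a real system would presumably type them according to the RDF schema.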


Normalization
  Path expressions are split up
  Search part contains all classes referenced by the rule
  Example:
    search CycleProvider c, ServerInformation s
    register c
    where c.serverHost contains 'uni-passau.de'
      and c.serverInformation = s
      and s.memory > 64
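A rough sketch of the normalization step, assuming a rule is given as a variable-to-class mapping plus a list of predicates. `PROPERTY_RANGE` is a hypothetical excerpt of the RDF schema, and the generated variable names (`v1`, ...) stand in for the `s` used on the slide.

```python
from itertools import count

# Hypothetical schema excerpt: (class, property) -> class of the referenced resource.
PROPERTY_RANGE = {("CycleProvider", "serverInformation"): "ServerInformation"}


def normalize(search, predicates):
    """Split path expressions so that every predicate uses a single property.

    search:     variable -> class, e.g. {"c": "CycleProvider"}
    predicates: (path, operator, constant) triples,
                e.g. ("c.serverInformation.memory", ">", 64)
    In the emitted join predicates the third element is a variable name.
    """
    search = dict(search)
    normalized = []
    joins = {}  # (variable, property) -> variable bound to the referenced resource
    fresh = (f"v{i}" for i in count(1))
    for path, op, constant in predicates:
        var, *props = path.split(".")
        for prop in props[:-1]:
            key = (var, prop)
            if key not in joins:
                new_var = next(fresh)
                search[new_var] = PROPERTY_RANGE[(search[var], prop)]
                normalized.append((f"{var}.{prop}", "=", new_var))
                joins[key] = new_var
            var = joins[key]
        normalized.append((f"{var}.{props[-1]}", op, constant))
    return search, normalized


new_search, new_predicates = normalize(
    {"c": "CycleProvider"},
    [
        ("c.serverHost", "contains", "uni-passau.de"),
        ("c.serverInformation.memory", ">", 64),
    ],
)
```

Reusing the join variable for repeated path prefixes keeps the search part minimal, so a second predicate on `c.serverInformation.cpu` would not introduce another `ServerInformation` variable.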


Rule Decomposition - Example

Normalized rule:
  search CycleProvider c, ServerInformation s
  register c
  where c.serverHost contains 'uni-passau.de'
    and c.serverInformation = s
    and s.memory > 64 and s.cpu > 500

Triggering rules:
  RuleA: search ServerInformation s register s where s.memory > 64
  RuleB: search ServerInformation s register s where s.cpu > 500
  RuleC: search CycleProvider c register c where c.serverHost contains 'uni-passau.de'

Join rules:
  RuleE: search RuleA a, RuleB b register a where a = b
  RuleF: search RuleE a, RuleC c register c where c.serverInformation = a


Dependency Tree

RuleF: c.serverInformation = a
  RuleE: a = b
    RuleA (ServerInformation): s.memory > 64
    RuleB (ServerInformation): s.cpu > 500
  RuleC (CycleProvider): c.serverHost contains 'uni-passau.de'


Filter Algorithm - Example

FilterRulesGT
rule_id  class              property  value
RuleA    ServerInformation  memory    64
RuleB    ServerInformation  cpu       500

FilterRulesCON
rule_id  class          property    value
RuleC    CycleProvider  serverHost  uni-passau.de

FilterData
oid            class              property           value
doc.rdf#host1  CycleProvider      rdf#subject        doc.rdf#host1
doc.rdf#host1  CycleProvider      serverHost         pirates.uni-passau.de
doc.rdf#host1  CycleProvider      serverPort         5874
doc.rdf#host1  CycleProvider      serverInformation  doc.rdf#info1
doc.rdf#info1  ServerInformation  rdf#subject        doc.rdf#info1
doc.rdf#info1  ServerInformation  memory             92
doc.rdf#info1  ServerInformation  cpu                600

ResultObjects
rid            rule_id
doc.rdf#info1  RuleB
doc.rdf#info1  RuleA
doc.rdf#host1  RuleC
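Because the approach builds on standard RDBMS technology, determining the affected triggering rules can be pictured as a join between the data table and the per-operator rule tables. The sketch below uses SQLite through Python's `sqlite3` module; the table names follow the slide, while the column types, the `rid` column name for FilterData, and the exact SQL are assumptions rather than the authors' queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FilterData     (rid TEXT, class TEXT, property TEXT, value TEXT);
CREATE TABLE FilterRulesGT  (rule_id TEXT, class TEXT, property TEXT, value REAL);
CREATE TABLE FilterRulesCON (rule_id TEXT, class TEXT, property TEXT, value TEXT);
""")

conn.executemany("INSERT INTO FilterData VALUES (?, ?, ?, ?)", [
    ("doc.rdf#host1", "CycleProvider", "rdf#subject", "doc.rdf#host1"),
    ("doc.rdf#host1", "CycleProvider", "serverHost", "pirates.uni-passau.de"),
    ("doc.rdf#host1", "CycleProvider", "serverPort", "5874"),
    ("doc.rdf#host1", "CycleProvider", "serverInformation", "doc.rdf#info1"),
    ("doc.rdf#info1", "ServerInformation", "rdf#subject", "doc.rdf#info1"),
    ("doc.rdf#info1", "ServerInformation", "memory", "92"),
    ("doc.rdf#info1", "ServerInformation", "cpu", "600"),
])
conn.executemany("INSERT INTO FilterRulesGT VALUES (?, ?, ?, ?)", [
    ("RuleA", "ServerInformation", "memory", 64),
    ("RuleB", "ServerInformation", "cpu", 500),
])
conn.executemany("INSERT INTO FilterRulesCON VALUES (?, ?, ?, ?)", [
    ("RuleC", "CycleProvider", "serverHost", "uni-passau.de"),
])

# One join per operator table; each finds the rules whose atom matches a data atom.
result_objects = sorted(conn.execute("""
SELECT d.rid, r.rule_id
FROM FilterData d JOIN FilterRulesGT r
  ON d.class = r.class AND d.property = r.property
WHERE CAST(d.value AS REAL) > r.value
UNION ALL
SELECT d.rid, r.rule_id
FROM FilterData d JOIN FilterRulesCON r
  ON d.class = r.class AND d.property = r.property
WHERE instr(d.value, r.value) > 0
""").fetchall())
```

With indexes on `(class, property)` in the rule tables, the database only touches rules that mention a property present in the new document, which is the "evaluate a subset of all rules" goal stated earlier.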


Iterative Evaluation

RuleF: c.serverInformation = a
  RuleE: a = b
    RuleA (ServerInformation): s.memory > 64
    RuleB (ServerInformation): s.cpu > 500
  RuleC (CycleProvider): c.serverHost contains 'uni-passau.de'

Evaluation order: RuleA, RuleB, RuleC -> RuleE -> RuleF
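A sketch of the bottom-up evaluation in plain Python, starting from the matches of the triggering rules. The data structures and function names are hypothetical, and the actual system evaluates the join rules inside the RDBMS; this only illustrates the order in which results propagate up the dependency tree.

```python
# Matches of the triggering rules (ResultObjects in the filter example).
results = {
    "RuleA": {"doc.rdf#info1"},
    "RuleB": {"doc.rdf#info1"},
    "RuleC": {"doc.rdf#host1"},
}

# Reference atoms from FilterData: (rid, property) -> referenced rid.
references = {("doc.rdf#host1", "serverInformation"): "doc.rdf#info1"}

# Join rules: name -> (left input, right input, join property or None).
# None encodes an identity join (a = b); otherwise the predicate is
# right.<property> = left, and the right-hand resource is registered.
join_rules = {
    "RuleE": ("RuleA", "RuleB", None),
    "RuleF": ("RuleE", "RuleC", "serverInformation"),
}


def evaluate_join(left, right, prop):
    """Evaluate one join rule on the result sets of its two inputs."""
    if prop is None:
        return left & right
    return {rid for rid in right if references.get((rid, prop)) in left}


def evaluate_iteratively(results, join_rules):
    """Evaluate join rules round by round until every rule has a result."""
    results = dict(results)
    pending = dict(join_rules)
    while pending:
        ready = [name for name, (left, right, _) in pending.items()
                 if left in results and right in results]
        if not ready:
            raise ValueError("join rules depend on unknown inputs")
        for name in ready:
            left, right, prop = pending.pop(name)
            results[name] = evaluate_join(results[left], results[right], prop)
    return results


final = evaluate_iteratively(results, join_rules)
```

RuleE becomes evaluable in the first round and RuleF in the second, so `doc.rdf#host1` ends up registered for the original subscription rule.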


Updates and Deletions
  Filter algorithm only works for new metadata
  Solution: execute algorithm 3 times


Related Work - 1
  Metadata:
    Equal Time For Data on the Internet with WebSemantics [Mihaila, Raschid, Tomasic; EDBT '98]
    MOCHA: A Self-Extensible Database Middleware System for Distributed Data Sources [Rodriguez-Martinez, Roussopoulos; SIGMOD '00]
    Universal Description, Discovery, and Integration (UDDI) [Ariba, Inc., IBM, Microsoft; http://www.uddi.org]
  Publish/Subscribe:
    Efficient Matching for Web-Based Publish/Subscribe Systems [Pereira, Fabret, Llirbat, Shasha; CoopIS '00]
    Matching Events in a Content-Based Subscription System [Aguilera, Strom, Sturman, Astley, Chandra; PODC '99]
    The SIFT Information Dissemination System [Yan, Garcia-Molina; TODS '99]
    Efficient Filtering of XML Documents for Selective Dissemination of Information [Altinel, Franklin; VLDB '00]


Related Work - 2
  Continuous Queries:
    NiagaraCQ: A Scalable Continuous Query System for Internet Databases [Chen, DeWitt, Tian, Wang; SIGMOD '00]
    Continual Queries for Internet Scale Event-Driven Information Delivery [Liu, Pu, Tang; IEEE TKDE '99]
  Materialized Views and Semantic Caching:
    Maintaining Views Incrementally [Gupta, Mumick, Subrahmanian; SIGMOD '93]
    Efficiently Updating Materialized Views [Blakeley, Larson, Tompa; SIGMOD '86]
    Semantic Data Caching and Replacement [Dar, Franklin, Jónsson, Srivastava, Tan; VLDB '96]


Conclusion
  The MDV System:
    MDPs, LMRs, and MDV Clients
  The Publish & Subscribe Algorithm:
    Decomposition of documents and rules
    Determination of affected triggering rules
    Iterative evaluation of join rules