Distributed Database Management Systems

Page 1: Distributed Database Management Systems

Distributed Database Management Systems

Page 2: Distributed Database Management Systems

Reading

• Textbook: Ch. 4

Farkas, CSCE 824 - Spring 2011

Page 3: Distributed Database Management Systems

Design Issues

• Placing of data and programs (DBMS and application)
• Network issues

Page 4: Distributed Database Management Systems

Level of Sharing

• No sharing
• Data sharing
• Data and program sharing

Heterogeneous environment!

Page 5: Distributed Database Management Systems

Top-Down Design

• Global conceptual schema distribution
  – Fragmentation
  – Replication
  – Allocation
• Figure 3.2

Page 6: Distributed Database Management Systems

Correctness of Fragmentation

1. Completeness: $F_R = \{R_1, \ldots, R_n\}$ (every data item of $R$ appears in some $R_i$)
2. Reconstruction: $R = \nabla R_i,\ \forall R_i \in F_R$
3. Disjointness:
  – Horizontal: there does not exist $d_j \in R_i$ such that $d_j \in R_k$, where $k \neq i$
  – Vertical: same as horizontal for non-primary-key attributes

1&2: Lossless-join (normalization)
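
The three criteria can be checked mechanically for a horizontal fragmentation. The following is a minimal sketch, not taken from the slides: the relation, its rows, and the selection predicate on the city column are all hypothetical, and set union is assumed as the reconstruction operator.

```python
# Hypothetical relation R: (employee id, name, city). Illustrative data only.
R = {
    (1, "Ann", "Paris"),
    (2, "Bob", "Tokyo"),
    (3, "Cy", "Paris"),
}

def horizontal_fragments(relation):
    """Split by an assumed predicate: city = 'Paris' vs. city <> 'Paris'."""
    paris = {row for row in relation if row[2] == "Paris"}
    other = {row for row in relation if row[2] != "Paris"}
    return [paris, other]

def is_complete(relation, fragments):
    """Completeness: every row of R appears in at least one fragment."""
    return all(any(row in frag for frag in fragments) for row in relation)

def reconstructs(relation, fragments):
    """Reconstruction: for horizontal fragments the operator is set union."""
    return set().union(*fragments) == relation

def is_disjoint(fragments):
    """Disjointness: no row belongs to two different fragments."""
    return all(
        not (fragments[i] & fragments[k])
        for i in range(len(fragments))
        for k in range(i + 1, len(fragments))
    )

fragments = horizontal_fragments(R)
fragmentation_ok = (
    is_complete(R, fragments)
    and reconstructs(R, fragments)
    and is_disjoint(fragments)
)
```

For a vertical fragmentation the reconstruction operator would be a join on the primary key rather than a union, so the key attributes would be exempt from the disjointness check.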

Page 7: Distributed Database Management Systems

Data Directory

• Global vs. local conceptual schemas
  – How to search?
  – Where to store?
  – Single vs. multiple copies?

Page 8: Distributed Database Management Systems

Current Research

• Allocation: new requirements, technology, etc.
• Where to store the fragments?
• Dynamic environment
  – Usage pattern
  – Application characteristics
  – Network changes
  – Security

Page 9: Distributed Database Management Systems

Bottom-Up Approach

• Multi-database systems
• How to integrate them into 1 database?
  – Interoperability

Page 10: Distributed Database Management Systems

Database Integration

• Physical integration
  – Materialized database: data warehouses
  – Extract-transform-load (ETL) tools
• Logical integration
  – Virtual (not materialized) integration
  – Enterprise Information Integration
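
As a rough illustration of physical integration, the sketch below runs a toy extract-transform-load pass into an in-memory SQLite "warehouse". The two sources, their field names, and the `sales` table are assumptions made for the example, not part of the course material.

```python
import sqlite3

# Hypothetical rows from two operational databases with different conventions.
SOURCE_A = [{"cust": "Ann", "amount_usd": 10.0}, {"cust": "Bob", "amount_usd": 5.5}]
SOURCE_B = [{"customer_name": "ann", "amount_cents": 250}]

def extract():
    """Extract: read rows from each local source (in-memory lists here)."""
    return SOURCE_A, SOURCE_B

def transform(rows_a, rows_b):
    """Transform: map both sources to one (customer, amount_usd) layout."""
    unified = [(r["cust"].lower(), r["amount_usd"]) for r in rows_a]
    unified += [(r["customer_name"].lower(), r["amount_cents"] / 100) for r in rows_b]
    return unified

def load(conn, rows):
    """Load: materialize the integrated data in the warehouse table."""
    conn.execute("CREATE TABLE sales (customer TEXT, amount_usd REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(*extract()))

# Queries now run against the materialized copy, not the local sources.
totals = dict(
    warehouse.execute("SELECT customer, SUM(amount_usd) FROM sales GROUP BY customer")
)
```

Logical integration would skip the load step: a global query would instead be translated and sent to the local databases at query time.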

Page 11: Distributed Database Management Systems

Data Warehouses

• On-line Analytical Processing (OLAP) applications:
  – Decision support systems
  – Trend analysis and forecasting
• Complex queries, large databases
• Materialized view maintenance
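
One way to picture materialized view maintenance is an aggregate that is updated from each change (the delta) instead of being recomputed from the base data. This is a minimal sketch under assumed names: the `MaterializedTotals` class and the (customer, amount) rows are hypothetical.

```python
from collections import defaultdict

class MaterializedTotals:
    """Materialized view: total amount per customer over a base table."""

    def __init__(self, base_rows):
        self.totals = defaultdict(int)
        for customer, amount in base_rows:  # initial full computation
            self.totals[customer] += amount

    def on_insert(self, customer, amount):
        """Incremental maintenance: apply the delta, do not recompute."""
        self.totals[customer] += amount

    def on_delete(self, customer, amount):
        """Undo a deleted base row; drop the group when it becomes empty."""
        self.totals[customer] -= amount
        if self.totals[customer] == 0:
            del self.totals[customer]

base = [("ann", 1000), ("bob", 500)]  # amounts in cents
view = MaterializedTotals(base)

# A new base row arrives: update the base table and propagate the delta.
base.append(("ann", 250))
view.on_insert("ann", 250)

# Recomputing from scratch should give the same view contents.
recomputed = MaterializedTotals(base).totals
```

The comparison with `recomputed` is the correctness condition for incremental maintenance: the maintained view must equal the view computed from the current base data.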

Page 12: Distributed Database Management Systems

Logical Integration

• No materialized global database
• Virtual integration: data remains at the local (operational) databases
• Global conceptual schema may not contain everything from local schemas
• Autonomous and heterogeneous local systems

Page 13: Distributed Database Management Systems

Bottom-Up Design

• Global Conceptual Schema (GCS or mediated schema)
  – Defined first: local conceptual schemas (LCSs) are mapped to the GCS
  – Defined during the integration of the LCSs, together with the corresponding mappings from the LCSs to the GCS

Page 14: Distributed Database Management Systems

GCS Defined First

• Local-as-view (LAV) systems
  – Each LCS is treated as a view over the GCS
  – Query results: constrained to the objects in the local DBs, while the GCS definition may be richer
  – Potentially incomplete answers
• Global-as-view (GAV) systems: the GCS is defined as a set of views over the LCSs
  – View definition defines how to derive elements of the GCS
  – Query results: constrained to the GCS, while the local DBs might be richer
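
The global-as-view case maps naturally onto an SQL view. The sketch below is an illustration under stated assumptions: both "local" tables live in one in-memory SQLite connection for brevity, and the table and column names (`lcs1_emp`, `lcs2_staff`, `gcs_employee`) are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two hypothetical local schemas (LCSs), kept in one connection for brevity.
    CREATE TABLE lcs1_emp (eno INTEGER, ename TEXT, salary REAL);
    CREATE TABLE lcs2_staff (id INTEGER, full_name TEXT);
    INSERT INTO lcs1_emp VALUES (1, 'Ann', 100.0);
    INSERT INTO lcs2_staff VALUES (2, 'Bob');

    -- GAV: the global relation is defined as a view over the local relations.
    -- salary exists only in lcs1_emp, so the global relation omits it.
    CREATE VIEW gcs_employee AS
        SELECT eno AS id, ename AS name FROM lcs1_emp
        UNION
        SELECT id, full_name AS name FROM lcs2_staff;
""")

# A global query is answered by unfolding the view into the local tables.
names = [row[0] for row in conn.execute("SELECT name FROM gcs_employee ORDER BY id")]

# Only the attributes exposed by the view are visible at the global level.
columns = [d[0] for d in conn.execute("SELECT * FROM gcs_employee").description]
```

Here the local database is richer than the GCS: `salary` is stored locally but cannot be reached through the global relation. In a LAV system the direction is reversed: each local table would be described as a view over the global relation, and answering a global query requires rewriting it in terms of those views.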

Page 15: Distributed Database Management Systems

Design Tasks

• Schema translation
• Schema generation
• Figure 4.3

Page 16: Distributed Database Management Systems

Intermediate Canonical Representation

• Expressive enough to incorporate all concepts in the local databases
• Simple, intuitive, practical, etc.
• Examples: E/R model, relational model, graph/tree models, etc.
• Tools

Page 17: Distributed Database Management Systems

Schema Generation

• Schema matching: syntax and semantics
• Integration of common schema elements
• Schema mapping
• See Examples 4.1, 4.2

Page 18: Distributed Database Management Systems

Schema Matching

• Defined or discovered (e.g., web data)
• Rules:
  – Correspondence between 2 elements
  – Predicate indicating whether the correspondence holds or not
  – Similarity value between the 2 elements
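
A matching rule with these three parts can be represented directly as a record. The sketch below is one possible reading, not the textbook's algorithm: it assumes a purely syntactic similarity based on `difflib.SequenceMatcher`, a hypothetical threshold of 0.6, and invented attribute names.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher

@dataclass
class Match:
    left: str          # element of the first schema
    right: str         # element of the second schema
    similarity: float  # similarity value in [0, 1]
    holds: bool        # predicate: is the correspondence accepted?

def name_similarity(a: str, b: str) -> float:
    """Syntactic similarity of two element names (an assumed, simplistic measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match_schemas(schema1, schema2, threshold=0.6):
    """Score every pair of elements and keep the pairs at or above the threshold."""
    matches = []
    for a in schema1:
        for b in schema2:
            score = name_similarity(a, b)
            if score >= threshold:
                matches.append(Match(a, b, score, True))
    return matches

matches = match_schemas(["ENO", "ENAME", "TITLE"], ["E#", "NAME", "JOB_TITLE"])
pairs = {(m.left, m.right) for m in matches}
```

With these inputs the name-based measure accepts ENAME/NAME and TITLE/JOB_TITLE but rejects ENO/E#, a pair a designer might well consider equivalent. Gaps like this are why matching has to consider semantics as well as syntax.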

Page 19: Distributed Database Management Systems

Finding Correspondence

• Difficult process due to schema heterogeneity
• Can it be automated?
  – Insufficient schema and instance information
  – Unavailability of schema documentation
  – Subjectivity of matching

Page 20: Distributed Database Management Systems

Matching Algorithm Issues

• Schema vs. instance matching
  – Concept match
  – Data instance: semantic inconsistencies
• Element-level vs. structure-level mapping
  – Element name semantics
  – Multiple attribute mapping?
• Matching cardinality
  – One-to-one, one-to-many, many-to-many

Page 21: Distributed Database Management Systems

Semantic Schema Heterogeneity

• Semantic: meaning, interpretation, and intended use of data
  – Synonyms, homonyms, hypernyms
  – Different ontologies
  – Imprecise wording

Page 22: Distributed Database Management Systems

Structural Schema Heterogeneity

  – Type conflict: attribute vs. entity
  – Dependency conflict: mapping cardinality inconsistencies
  – Key conflict: different primary keys
  – Behavioral conflict: modeling assumptions, e.g., referential integrity, deletion, etc.

Page 23: Distributed Database Management Systems

Schema Integration

• Binary
• N-ary
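
The difference between the two strategies is the order in which schemas are combined. In this deliberately simplified sketch, schemas are reduced to sets of attribute names that are assumed to be already matched (same name, same concept); the names `LCS1` to `LCS3` and the helper functions are hypothetical.

```python
from functools import reduce

# Hypothetical local schemas as sets of already-matched attribute names.
LCS1 = {"id", "name"}
LCS2 = {"id", "title"}
LCS3 = {"id", "name", "salary"}

def integrate_pair(s1, s2):
    """Binary step: merge exactly two schemas into an intermediate schema."""
    return s1 | s2

def integrate_binary(schemas):
    """Binary (stepwise) integration: fold the schemas in two at a time."""
    return reduce(integrate_pair, schemas)

def integrate_nary(schemas):
    """N-ary (one-shot) integration: consider all schemas in a single pass."""
    return set().union(*schemas)

gcs_binary = integrate_binary([LCS1, LCS2, LCS3])
gcs_nary = integrate_nary([LCS1, LCS2, LCS3])
```

With plain set union both strategies give the same result. In practice they differ in cost and difficulty: binary integration resolves conflicts between two schemas at a time and produces intermediate schemas, while n-ary integration has to reconcile all schemas at once.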

Page 24: Distributed Database Management Systems

Schema Mapping

• How the data from local databases can be mapped to the GCS
• Mapping creation
• Mapping maintenance

Page 25: Distributed Database Management Systems

Mapping Creation

• Input: LCS, GCS, M (schema matches)
• Output: $Q = \{Q_1, \ldots, Q_k\}$ such that
  – $DB_{GCS} = Q(DB_{LCS})$
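
To make the input and output concrete, the sketch below builds one mapping query per local relation from a set of schema matches and applies them to local data. Everything here is an assumption for illustration: the local relations `emp` and `pay`, the global attributes, and the choice to merge partial tuples on a global key `id`.

```python
# Hypothetical local databases (instances of the LCSs) as lists of dicts.
DB_LCS = {
    "emp": [{"eno": 1, "ename": "Ann"}, {"eno": 2, "ename": "Bob"}],
    "pay": [{"eno": 1, "sal": 100}, {"eno": 2, "sal": 80}],
}

# Schema matches M: (local relation, local attribute) -> global attribute.
M = {
    ("emp", "eno"): "id",
    ("emp", "ename"): "name",
    ("pay", "eno"): "id",
    ("pay", "sal"): "salary",
}

def make_query(relation):
    """Build one mapping query Q_i that renames a local relation according to M."""
    renames = {attr: g for (rel, attr), g in M.items() if rel == relation}

    def query(db_lcs):
        return [{renames[a]: v for a, v in row.items()} for row in db_lcs[relation]]

    return query

def Q(db_lcs):
    """Apply every Q_i and merge the partial tuples on the global key `id`."""
    merged = {}
    for relation in db_lcs:
        for row in make_query(relation)(db_lcs):
            merged.setdefault(row["id"], {}).update(row)
    return sorted(merged.values(), key=lambda r: r["id"])

DB_GCS = Q(DB_LCS)
```

Mapping maintenance would then amount to regenerating or repairing these queries whenever a local schema or a match in `M` changes.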

Page 26: Distributed Database Management Systems

Security Objectives

• Confidentiality
• Integrity
• Availability

Page 27: Distributed Database Management Systems

Question 1

• How do distributed databases impact the security objectives?
  – Confidentiality in traditional vs. distributed DBs
  – Integrity in traditional vs. distributed DBs
  – Availability in traditional vs. distributed DBs

Page 28: Distributed Database Management Systems

Integrity

• Correctness criteria
  – Top-down design
  – Bottom-up design

Page 29: Distributed Database Management Systems

Availability

• What are the issues related to availability when dealing with
  – Top-down design
  – Bottom-up design

Page 30: Distributed Database Management Systems

Confidentiality

(will be covered in the 2nd part of the semester, but…)

• Centralized vs. distributed security policy
  – Top-down design
  – Bottom-up design

Page 31: Distributed Database Management Systems

Next Class

• Semantics-based Database Integration