Database Systems: Design, Implementation, and Management

CHAPTER 10

Distributed Database Management System

Chapter Objectives

- Understand concepts of distributed DBMS
- Understand various transparency features of distributed databases
- Understand distributed database design issues

What Is A Distributed DBMS?

Decentralization of business operations and globalization of businesses created a demand for distributing the data and processes across multiple locations.

Distributed database management systems (DDBMS) are designed to meet the information requirements of such multi-location organizations.

A DDBMS manages the storage and processing of logically related data over interconnected computer systems in which both data and processing functions are distributed among several sites.

Distributed processing shares the database’s logical processing among two or more physically independent sites that are connected through a network.

DDBMS Advantages

- Data located near site with greatest demand
- Faster data access
- Faster data processing
- Growth facilitation
- Improved communications
- Reduced operating costs
- User-friendly interface
- Less danger of single-point failure
- Processor independence

DDBMS Disadvantages

- Complexity of management and control
- Security
- Lack of standards
- Increased storage requirements
- Greater difficulty in managing data environment
- Increased training costs

Distributed Processing

Figure 10.1 Distributed Processing Environment

Shares the database's logical processing among physically independent, networked sites.

Distributed Database

A distributed database stores a logically related database over two or more physically independent sites connected via a computer network.

Distributed Database

Stores a logically related database over physically independent sites.

Figure 10.2

Distributed Database vs. Distributed Processing

- Distributed processing
  - Does not require a distributed database
  - May be based on a single database on a single computer
  - Copies or parts of database processing functions must be distributed to all data storage sites
- Distributed database
  - Requires distributed processing
- Both
  - Require a network to connect components

Functions of DDBMS

- Application/end user interface
- Validation to analyze data requests
- Transformation to determine request components
- Query optimization to find the best access strategy
- Mapping to determine the data location
- I/O interface to read or write data
- Formatting to prepare the data for presentation
- Security to provide data privacy
- Backup and recovery
- DB administration
- Concurrency control
- Transaction management

What Is A Distributed DBMS?

Figure 10.3 Centralized Database Management System

What Is A Distributed DBMS?

Figure 10.4 Fully Distributed Database Management System

DDBMS Components

- Computer workstations that form the network system.
- Network hardware and software components that reside in each workstation.
- Communications media that carry the data from one workstation to another.
- Transaction processor (TP), which receives and processes the application's data requests.
- Data processor (DP), which stores and retrieves data located at the site. Also known as data manager (DM).
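The division of labor between the TP and the DP can be illustrated with a minimal Python sketch. The class names, the `retrieve` and `request` methods, and the sample rows are all hypothetical and are not taken from the slides; a real DDBMS would route requests over the network rather than through in-process calls.

```python
class DataProcessor:
    """Hypothetical DP: stores and retrieves the rows held at one site."""

    def __init__(self, site, rows):
        self.site = site
        self.rows = list(rows)

    def retrieve(self, predicate):
        # Return only the local rows that satisfy the request.
        return [row for row in self.rows if predicate(row)]


class TransactionProcessor:
    """Hypothetical TP: receives an application's request and forwards it to the DPs."""

    def __init__(self, data_processors):
        self.data_processors = list(data_processors)

    def request(self, predicate):
        # Ask every DP for matching rows and combine the answers.
        results = []
        for dp in self.data_processors:
            results.extend(dp.retrieve(predicate))
        return results


dp_a = DataProcessor("site_a", [{"id": 1, "salary": 60000}])
dp_b = DataProcessor("site_b", [{"id": 2, "salary": 40000}, {"id": 3, "salary": 75000}])
tp = TransactionProcessor([dp_a, dp_b])

high_paid = tp.request(lambda row: row["salary"] > 50000)
print([row["id"] for row in high_paid])
```

The application talks only to the TP; which sites actually hold the rows stays hidden behind it.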

Distributed Database Components

Figure 10.5

Levels of Data & Process Distribution

Depending on the levels of data and process distribution, we can envisage three different configurations:

- SPSD: Single-site processing, single-site data (centralized)
- MPSD: Multiple-site processing, single-site data
- MPMD: Multiple-site processing, multiple-site data (fully distributed)

A fourth combination, SPMD (single-site processing, multiple-site data), is logically unsound.
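As a small illustration, the classification can be written as a function of two counts. This is only a sketch: the function name `classify` and the idea of counting sites are assumptions made here, not something the slides define.

```python
def classify(processing_sites: int, data_sites: int) -> str:
    """Map site counts to the configuration labels used on this slide."""
    if processing_sites < 1 or data_sites < 1:
        raise ValueError("need at least one processing site and one data site")
    if processing_sites == 1 and data_sites == 1:
        return "SPSD"  # centralized
    if processing_sites > 1 and data_sites == 1:
        return "MPSD"
    if processing_sites > 1 and data_sites > 1:
        return "MPMD"  # fully distributed
    return "SPMD"  # single-site processing over multiple data sites: logically unsound


print(classify(1, 1), classify(4, 1), classify(4, 3))
```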

Levels of Data & Process Distribution

Single-Site Processing, Single-Site Data (SPSD)

- All processing is done on a single CPU or host computer.
- All data are stored on the host computer's local disk.
- The DBMS is located on the host computer.
- The DBMS is accessed by dumb terminals.
- This is an example of a centralized DBMS.

Levels of Data & Process Distribution

Figure 10.6 Nondistributed (Centralized) DBMS

Levels of Data & Process Distribution

Multiple-Site Processing, Single-Site Data (MPSD)

- Typically, MPSD requires a network file server on which conventional applications are accessed through a LAN.
- A popular variation of the MPSD approach is known as a client/server architecture.

Levels of Data & Process Distribution

Figure 10.7 Multiple-Site Processing, Single-Site Data

Levels of Data & Process Distribution

Multiple-Site Processing, Multiple-Site Data (MPMD)

- Fully distributed DBMS with support for multiple DPs and TPs at multiple sites.
- Homogeneous DDBMSs integrate only one type of centralized DBMS over the network.
- Heterogeneous DDBMSs integrate different types of centralized DBMSs over a network.

Distributed DB Transparency

A DDBMS ensures that the database operations are transparent to the end user.

The types of transparency are:

- Distribution transparency
- Transaction transparency
- Failure transparency
- Performance transparency
- Heterogeneity transparency

Distribution Transparency

Distribution transparency allows us to manage a physically dispersed database as though it were a centralized database.

Three levels of distribution transparency:

- Fragmentation transparency
- Location transparency
- Local mapping transparency

Distribution Transparency

Example: Employee data (EMPLOYEE) are distributed over three locations: New York, Atlanta, and Miami. Depending on the level of distribution transparency support, three different cases of queries are possible.

Distributed DBMS: Employee Table

Fragment    E1          E2         E3
Location    New York    Atlanta    Miami

Distribution Transparency

When a DBMS supports fragmentation transparency, the user views a single logical database:

    SELECT *
    FROM EMPLOYEE
    WHERE SALARY > 50000;

Distribution Transparency

When the DBMS supports location transparency, the user needs to know the fragment names but need not know the actual location of the fragments:

    SELECT * FROM E1 WHERE SALARY > 50000
    UNION
    SELECT * FROM E2 WHERE SALARY > 50000
    UNION
    SELECT * FROM E3 WHERE SALARY > 50000;

Distribution Transparency

When the DBMS supports local mapping transparency, the user needs to know the fragment names as well as the actual location of the fragments:

    SELECT * FROM E1 NODE NY WHERE SALARY > 50000
    UNION
    SELECT * FROM E2 NODE ATL WHERE SALARY > 50000
    UNION
    SELECT * FROM E3 NODE MIA WHERE SALARY > 50000;
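The three query styles can be tried out in one process with Python's `sqlite3` module. This is a sketch under stated assumptions, not how a DDBMS is implemented: each attached in-memory database stands in for a node, SQLite's `schema.table` qualification stands in for the slide's `NODE` notation (which is not SQLite syntax), and the column names and sample rows are invented for the demonstration.

```python
import sqlite3

# isolation_level=None keeps the connection in autocommit mode, so ATTACH is
# never issued inside an open transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)

# Hypothetical layout: node name -> (fragment name, sample rows).
fragments = {
    "ny": ("E1", [(1, "Ana", 62000), (2, "Ben", 48000)]),
    "atl": ("E2", [(3, "Cy", 51000)]),
    "mia": ("E3", [(4, "Di", 39000), (5, "Eve", 90000)]),
}
for node, (table, rows) in fragments.items():
    conn.execute(f"ATTACH DATABASE ':memory:' AS {node}")
    conn.execute(
        f"CREATE TABLE {node}.{table} "
        "(EMP_ID INTEGER PRIMARY KEY, NAME TEXT, SALARY INTEGER)"
    )
    conn.executemany(f"INSERT INTO {node}.{table} VALUES (?, ?, ?)", rows)

# A TEMP view plays the role of the single logical EMPLOYEE table.
conn.execute(
    "CREATE TEMP VIEW EMPLOYEE AS "
    "SELECT * FROM ny.E1 UNION ALL SELECT * FROM atl.E2 UNION ALL SELECT * FROM mia.E3"
)

queries = {
    # Fragmentation transparency: one logical table, no fragment names.
    "fragmentation": "SELECT EMP_ID FROM EMPLOYEE WHERE SALARY > 50000",
    # Location transparency: fragment names, but no node names.
    "location": (
        "SELECT EMP_ID FROM E1 WHERE SALARY > 50000 "
        "UNION SELECT EMP_ID FROM E2 WHERE SALARY > 50000 "
        "UNION SELECT EMP_ID FROM E3 WHERE SALARY > 50000"
    ),
    # Local mapping transparency: fragment names and node names.
    "local_mapping": (
        "SELECT EMP_ID FROM ny.E1 WHERE SALARY > 50000 "
        "UNION SELECT EMP_ID FROM atl.E2 WHERE SALARY > 50000 "
        "UNION SELECT EMP_ID FROM mia.E3 WHERE SALARY > 50000"
    ),
}
results = {name: sorted(row[0] for row in conn.execute(sql)) for name, sql in queries.items()}
print(results)
```

All three queries return the same employees; what changes is how much of the physical layout the person writing the query has to know.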

Distribution Transparency

Distribution transparency is supported by a distributed data dictionary which captures the distributed global schema.

A local transaction processor uses this global schema to translate user requests into subqueries (remote requests) that will be processed by different data processors.
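A minimal sketch of that translation step, assuming the data dictionary can be modeled as a plain Python mapping from a global table to its fragments and nodes. The structure `DATA_DICTIONARY` and the function `to_subqueries` are hypothetical names introduced for illustration; the fragment and node names reuse the EMPLOYEE example.

```python
# Hypothetical distributed data dictionary: global table -> (fragment, node) pairs.
DATA_DICTIONARY = {
    "EMPLOYEE": [("E1", "NY"), ("E2", "ATL"), ("E3", "MIA")],
}


def to_subqueries(table, where):
    """Return one (node, SQL text) remote request per fragment of the table."""
    return [
        (node, f"SELECT * FROM {fragment} WHERE {where}")
        for fragment, node in DATA_DICTIONARY[table]
    ]


subqueries = to_subqueries("EMPLOYEE", "SALARY > 50000")
for node, sql in subqueries:
    print(node, "->", sql)
```

Each pair would be sent to the data processor at the named node, and the transaction processor would combine the answers.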

Transaction Transparency

A distributed transaction updates and/or requests data from multiple remote sites.

Transaction transparency ensures that the transaction will be completed only if all database sites involved in the transaction complete their part of the transaction.

- It maintains the integrity of a distributed database.
- Giving a 5% raise to all employees in the previous example involves updating the database at multiple locations. If the transaction cannot be committed in one location, it must be rolled back in all locations.
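The all-or-nothing rule can be sketched with several independent `sqlite3` connections standing in for sites. The helper names, the salary cap used to force a failure, and the sample rows are assumptions made for this sketch. It is deliberately simplified: a real DDBMS must also survive a crash between the individual commits, which is the problem that protocols such as two-phase commit address, and the slides do not say which protocol is used.

```python
import sqlite3


def make_site(rows):
    """Create one hypothetical site; the CHECK constraint lets an update fail."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE EMPLOYEE "
        "(EMP_ID INTEGER PRIMARY KEY, SALARY INTEGER CHECK (SALARY <= 100000))"
    )
    conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?)", rows)
    conn.commit()
    return conn


def raise_everywhere(sites, percent):
    """Apply the raise at every site; commit only if every site succeeded."""
    try:
        for conn in sites:
            conn.execute(
                "UPDATE EMPLOYEE SET SALARY = SALARY * (100 + ?) / 100", (percent,)
            )
    except sqlite3.Error:
        # One site could not do its part, so every site undoes its work.
        for conn in sites:
            conn.rollback()
        return False
    for conn in sites:
        conn.commit()
    return True


def salaries(sites):
    return [conn.execute("SELECT SALARY FROM EMPLOYEE").fetchone()[0] for conn in sites]


sites = [make_site([(1, 60000)]), make_site([(2, 98000)])]
print(raise_everywhere(sites, 5), salaries(sites))  # second site would exceed the cap
print(raise_everywhere(sites, 2), salaries(sites))  # both sites can apply the raise
```

In the first call the update at the first site succeeds but is rolled back because the second site fails, so neither salary changes.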

Distributed DB Transparency

Failure transparency ensures that the DDBMS continues to operate when a node fails.

Performance transparency ensures that system performance does not degrade because of the distributed nature of the database. Query optimization becomes very complex in a distributed database because data are fragmented and replicated across multiple remote nodes.

Heterogeneity transparency allows the integration of different types of DBMSs (multi-vendor, multi-model) under a common global schema. The DDBMS transparently translates user requests from one local schema to another.

Distributed Database Design

All design principles and concepts discussed in the context of a centralized database also apply to a distributed database.

Three additional issues are relevant to the design of a distributed database:

- Data fragmentation
- Data replication
- Data allocation

Data Fragmentation

Data fragmentation allows us to break a single object (a database or a table) into two or more fragments.

Three types of fragmentation strategies are available to distribute a table: horizontal, vertical, and mixed.

Horizontal fragmentation divides a table into fragments consisting of sets of tuples.

- Each fragment has unique rows and is stored at a different node.
- Example: A bank may distribute its customer table by location.
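Horizontal fragmentation by location can be sketched in a few lines of Python. The function name, column names, and sample customers are hypothetical; the point is only that every row lands in exactly one fragment.

```python
from collections import defaultdict


def fragment_horizontally(rows, key):
    """Group whole rows into fragments according to the value of one column."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[key]].append(row)
    return dict(fragments)


customers = [
    {"cust_id": 1, "name": "Ana", "location": "New York"},
    {"cust_id": 2, "name": "Ben", "location": "Atlanta"},
    {"cust_id": 3, "name": "Cy", "location": "New York"},
]
by_location = fragment_horizontally(customers, "location")
for location, rows in by_location.items():
    print(location, [row["cust_id"] for row in rows])
```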

Data Fragmentation

Vertical fragmentation divides a table into fragments consisting of sets of columns.

- Each fragment is located at a different node and consists of unique columns, with the exception of the primary key column, which is common to all fragments.
- Example: The Customer table may be divided into two fragments. One fragment, consisting of Cust ID, name, and address, may be located in the Service building; the other, consisting of Cust ID, credit limit, balance, and dues, may be located in the Collection building.
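The role of the shared primary key becomes clear when the fragments are put back together. The sketch below splits a hypothetical customer list into a service fragment and a collection fragment and then rejoins them on the key; the function names, column names, and sample values are illustrative assumptions.

```python
def fragment_vertically(rows, key, column_groups):
    """Build one fragment per column group; every fragment keeps the key column."""
    return [
        [{col: row[col] for col in (key, *columns)} for row in rows]
        for columns in column_groups
    ]


def rejoin(fragments, key):
    """Rebuild full rows by matching fragment rows on the shared key."""
    merged = {}
    for fragment in fragments:
        for row in fragment:
            merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


customers = [
    {"cust_id": 1, "name": "Ana", "address": "12 Elm St", "credit_limit": 5000, "balance": 120},
    {"cust_id": 2, "name": "Ben", "address": "9 Oak Ave", "credit_limit": 2000, "balance": 0},
]
service, collection = fragment_vertically(
    customers, "cust_id", [("name", "address"), ("credit_limit", "balance")]
)
print(service[0])
print(collection[0])
print(rejoin([service, collection], "cust_id") == customers)
```

Without the key column in both fragments there would be no reliable way to tell which service row belongs to which collection row.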

Data Fragmentation

Mixed fragmentation combines the horizontal and vertical strategies.

A fragment may consist of a subset of rows and a subset of columns of the original table.

Example: The Customer table may be divided by state and grouped by columns. The service building in Texas will store customer-service-related information for customers from Texas.
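A mixed fragment is a row filter followed by a column selection. The sketch below builds the Texas service fragment from a hypothetical customer list; the function name, column names, and sample rows are assumptions for illustration.

```python
def mixed_fragment(rows, state, key, columns):
    """Keep only rows for one state, and only the key plus the chosen columns."""
    return [
        {col: row[col] for col in (key, *columns)}
        for row in rows
        if row["state"] == state
    ]


customers = [
    {"cust_id": 1, "name": "Ana", "state": "TX", "balance": 120},
    {"cust_id": 2, "name": "Ben", "state": "FL", "balance": 0},
    {"cust_id": 3, "name": "Cy", "state": "TX", "balance": 75},
]
texas_service = mixed_fragment(customers, "TX", "cust_id", ("name",))
print(texas_service)
```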

Data Replication

Data replication involves storing multiple copies of a fragment in different locations. For example, a copy may be stored in New York and another in San Francisco.

- It improves response time and data availability.
- Data replication requires the DDBMS to maintain data consistency among the replicas.
- A fully replicated database stores multiple copies of each database fragment at multiple sites.
- A partially replicated database stores multiple copies of some database fragments at multiple sites.
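The trade-off can be sketched with a small class that writes every update to all copies and reads from whichever copy is available. The class name, its methods, and the `down` set are hypothetical, and the sketch assumes all sites are up when a write happens; keeping replicas consistent through failures is the hard part that a real DDBMS has to solve.

```python
class ReplicatedFragment:
    """Hypothetical fragment with one copy per site."""

    def __init__(self, sites):
        self.copies = {site: {} for site in sites}
        self.down = set()  # sites currently unavailable

    def write(self, key, value):
        # Consistency: every replica receives the update (all sites assumed up).
        for copy in self.copies.values():
            copy[key] = value

    def read(self, key, preferred_site):
        # Response time: try the preferred (for example, nearest) site first.
        # Availability: fall back to another copy if that site is down.
        order = [preferred_site] + [s for s in self.copies if s != preferred_site]
        for site in order:
            if site not in self.down:
                return site, self.copies[site][key]
        raise RuntimeError("no replica available")


fragment = ReplicatedFragment(["New York", "San Francisco"])
fragment.write("cust_1_balance", 500)
print(fragment.read("cust_1_balance", "San Francisco"))
fragment.down.add("San Francisco")
print(fragment.read("cust_1_balance", "San Francisco"))
```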

Data Allocation

The data allocation decision involves determining the location of the fragments so as to achieve the design goals of cost, response time, and availability.

Three data allocation strategies are: centralized, partitioned and replicated.

A centralized allocation strategy stores the entire database in a single location.

A partitioned strategy divides the database into disjoint parts (fragments) and allocates the fragments to different locations.

In a replicated strategy, copies of one or more database fragments are stored at several sites.
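The three strategies can be compared by the placement each one produces. In this sketch the function name `allocate`, the round-robin rule for the partitioned case, and the choice to copy every fragment to every site in the replicated case are assumptions; the slides do not prescribe how fragments are assigned, and a replicated design may copy only some fragments.

```python
def allocate(fragments, sites, strategy):
    """Return a hypothetical placement: site -> list of fragments stored there."""
    if strategy == "centralized":
        # The entire database lives at a single location.
        return {sites[0]: list(fragments)}
    if strategy == "partitioned":
        # Disjoint fragments are spread over the sites (round-robin is an assumption).
        placement = {site: [] for site in sites}
        for index, fragment in enumerate(fragments):
            placement[sites[index % len(sites)]].append(fragment)
        return placement
    if strategy == "replicated":
        # Copies of the fragments are stored at several sites (here: all of them).
        return {site: list(fragments) for site in sites}
    raise ValueError(f"unknown strategy: {strategy}")


fragments = ["E1", "E2", "E3"]
sites = ["New York", "Atlanta", "Miami"]
for strategy in ("centralized", "partitioned", "replicated"):
    print(strategy, allocate(fragments, sites, strategy))
```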