DISTRIBUTED DATABASE DESIGN

Distributed Database Design 3rd Assignment


7/28/2019 Distributed Database Design 3rd Assignment

http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 1/22



DISTRIBUTED DATABASE DESIGN

The organization of distributed systems can be investigated along

three orthogonal dimensions:

1. Level of sharing

2. Behavior of access patterns 

3. Level of knowledge on access pattern behavior  


Level of sharing 

 – no sharing - each application and its data execute at one site,

 – data sharing - all the programs are replicated at all the sites, but data

files are not,

 – data plus program sharing - both data and programs may be shared.

Behavior of access patterns

 – static - access patterns of user requests do not change over time,

 – dynamic - access patterns of user requests change over time.

Level of knowledge on access pattern behavior

 – complete information - the access patterns can reasonably be predicted and do not deviate significantly from the predictions,

 – partial information - there are deviations from the predictions.


ALTERNATIVE DESIGN STRATEGIES

Two major strategies that have been identified for designing

distributed databases are:

• the top-down approach

• the bottom-up approach 


TOP-DOWN DESIGN PROCESS


• The activity begins with requirements analysis, which defines the environment of the system and elicits both the data and processing needs of database users.

• The requirements study also specifies where the final system is expected to stand with respect to the objectives of the distributed DBMS.

• The requirements document is input to two parallel activities: view design and conceptual design.


• view design - defines the interfaces for end users,

• conceptual design - the process by which the enterprise is examined to determine entity types and the relationships among these entities. This process can be divided into two related activity groups:

 – entity analysis - is concerned with determining the entities, their attributes, and the relationships among these entities,

 – functional analysis - is concerned with determining the fundamental functions with which the modeled enterprise is involved.


 – View integration should be used to ensure that entity and

relationship requirements for all the views are covered in

the conceptual schema.

 – In conceptual design and view design activities the user 

needs to specify the data entities and must determine the

applications that will run on the database as well as

statistical information about these applications.

 – Statistical information includes the frequency of user applications and the volume of various information.


• The global conceptual schema (GCS) and the access pattern information collected as a result of view design are inputs to the distribution design step.

• distribution design - designs the local conceptual schemas by distributing the entities over the sites of the distributed system. The distribution design activity consists of two steps:

 – fragmentation

 – allocation

• physical design - the process that maps the local conceptual schemas to the physical storage devices available at the corresponding sites,

• observation and monitoring - the result is some form of feedback, which may lead to backing up to one of the earlier steps in the design.


BOTTOM-UP DESIGN PROCESS

Top-down design is a suitable approach when a database system is being designed from scratch.

If a number of databases already exist and the design task involves integrating them into one database, the bottom-up approach is suitable. The starting point of bottom-up design is the individual local conceptual schemas. The process consists of integrating the local schemas into the global conceptual schema.


DISTRIBUTION DESIGN ISSUES

REASONS FOR FRAGMENTATION

• The important issue is the appropriate unit of distribution. For a number

of reasons it is only natural to consider subsets of relations as distribution

units.

• If the applications that have views defined on a given relation reside at different sites and the entire relation is the unit of distribution, two alternatives can be followed: either the relation is not replicated and is stored at only one site, or it is replicated at all or some of the sites where the applications reside.

• The fragmentation of relations typically results in the parallel execution of 

a single query by dividing it into a set of  subqueries that operate on

fragments. Thus, fragmentation typically increases the level of 

concurrency and therefore the system throughput.
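The idea of splitting one query into subqueries that run on fragments can be sketched as follows. This is a minimal illustration only: the `fragments` data, the site names, and the `subquery` and `run_query` helpers are hypothetical and not part of the original slides, and threads stand in for separate sites.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical horizontal fragments of an EMPLOYEE relation, one per site.
fragments = {
    "site_1": [{"name": "Ana", "salary": 52000}, {"name": "Bo", "salary": 31000}],
    "site_2": [{"name": "Cy", "salary": 47000}],
    "site_3": [{"name": "Di", "salary": 28000}, {"name": "Ed", "salary": 61000}],
}


def subquery(rows, min_salary):
    """Run the selection on a single fragment."""
    return [r["name"] for r in rows if r["salary"] >= min_salary]


def run_query(fragments, min_salary):
    """Evaluate one subquery per fragment concurrently, then merge the results."""
    with ThreadPoolExecutor() as pool:
        partials = list(
            pool.map(lambda rows: subquery(rows, min_salary), fragments.values())
        )
    # The global answer is the union of the per-fragment answers.
    return sorted(name for part in partials for name in part)


result = run_query(fragments, 45000)
print(result)
```

Each subquery touches only its own fragment, so the subqueries do not contend for the same data.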


• Fragmentation also has disadvantages:

 – if applications have conflicting requirements that prevent decomposition of the relation into mutually exclusive fragments, those applications whose views are defined on more than one fragment may suffer performance degradation,

 – the second problem is related to semantic data control, specifically to integrity checking.


FRAGMENTATION ALTERNATIVES

• There are clearly two alternatives:

 – horizontal fragmentation

 – vertical fragmentation

• The fragmentation may, of course, be nested. If the nestings

are of different types, one gets hybrid fragmentation. 


DEGREE OF FRAGMENTATION

• The extent to which the database should be fragmented is an

important decision that affects the performance of query

execution.

• The degree of fragmentation goes from one extreme, that is, not to fragment at all, to the other extreme, to fragment to the level of individual tuples (in the case of horizontal fragmentation) or to the level of individual attributes (in the case of vertical fragmentation).


CORRECTNESS RULES OF FRAGMENTATION

Completeness

If a relation instance R is decomposed into fragments R1, R2, ..., Rn, each data item that can be found in R can also be found in one or more of the Ri's. This property ensures that the data in a global relation is mapped into fragments without any loss.

Reconstruction

If a relation R is decomposed into fragments R1, R2, ..., Rn, it should be possible to define a relational operator $\nabla$ such that:

$R = \nabla R_i, \quad \forall R_i \in F_R$

where $F_R = \{R_1, R_2, \ldots, R_n\}$ is the set of fragments of R. The reconstructability of the relation from its fragments ensures that constraints defined on the data in the form of dependencies are preserved.
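The relational operator in the reconstruction rule depends on the kind of fragmentation. As a short worked instance (the key symbol $K$ is notation introduced here for illustration), horizontal fragments are typically recombined by union and vertical fragments by a join on the repeated primary key:

```latex
% Horizontal fragmentation: the fragments are subsets of tuples,
% so the relation is rebuilt by union.
\[
R = \bigcup_{i=1}^{n} R_i
\]

% Vertical fragmentation: the fragments are projections that all
% carry the primary key K, so the relation is rebuilt by joining on K.
\[
R = R_1 \bowtie_{K} R_2 \bowtie_{K} \cdots \bowtie_{K} R_n
\]
```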


Disjointness

If a relation R is horizontally decomposed into fragments R1, R2, ..., Rn and a data item d_i is in R_j, it is not in any other fragment R_k (k ≠ j). This criterion ensures that the horizontal fragments are disjoint. If relation R is vertically decomposed, its primary key attributes are typically repeated in all its fragments. Therefore, in the case of vertical partitioning, disjointness is defined only on the nonprimary key attributes of a relation.
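The three correctness rules can be checked mechanically for a horizontal fragmentation. The sketch below is only an illustration under stated assumptions: relations are modeled as Python sets of tuples, the sample data is invented, and the helper names `is_complete`, `reconstruct`, and `is_disjoint` are hypothetical.

```python
def is_complete(relation, fragments):
    """Completeness: every tuple of the relation appears in some fragment."""
    return all(any(t in frag for frag in fragments) for t in relation)


def reconstruct(fragments):
    """Reconstruction for horizontal fragments: the union of all fragments."""
    result = set()
    for frag in fragments:
        result |= frag
    return result


def is_disjoint(fragments):
    """Disjointness: no tuple appears in more than one fragment."""
    seen = set()
    for frag in fragments:
        if seen & frag:
            return False
        seen |= frag
    return True


# Hypothetical relation R(id, name, city), fragmented by city.
R = {(1, "Ana", "Paris"), (2, "Bo", "Oslo"), (3, "Cy", "Paris")}
R1 = {t for t in R if t[2] == "Paris"}
R2 = {t for t in R if t[2] == "Oslo"}

print(is_complete(R, [R1, R2]))    # completeness check
print(reconstruct([R1, R2]) == R)  # reconstruction check
print(is_disjoint([R1, R2]))       # disjointness check
```

For vertical fragments the same checks would apply to the nonprimary key attributes only, with reconstruction done by a join rather than a union.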


ALLOCATION ALTERNATIVES

• The reasons for replication are reliability and efficiency of 

read-only queries.

• Read-only queries that access the same data items can be

executed in parallel since copies exist on multiple sites.

• The execution of update queries causes trouble since the system has to ensure that all the copies of the data are updated properly.

• The decision regarding replication is a trade-off that depends on the ratio of read-only queries to update queries.
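The read/update trade-off can be made concrete with a toy cost model. Everything in this sketch is an assumption made for illustration and does not come from the slides: the unit costs `local` and `remote`, the uniform spread of reads across sites, and the function names `total_cost` and `best_copies`.

```python
def total_cost(reads, updates, copies, sites, local=1.0, remote=5.0):
    """Toy cost of a workload when a fragment has `copies` replicas.

    Assumptions (hypothetical): reads are issued uniformly from all sites,
    a read costs `local` if the issuing site holds a copy and `remote`
    otherwise, and every update must be propagated to every copy at
    `remote` cost each.
    """
    local_fraction = copies / sites
    read_cost = reads * (local_fraction * local + (1 - local_fraction) * remote)
    update_cost = updates * copies * remote
    return read_cost + update_cost


def best_copies(reads, updates, sites):
    """Number of replicas (1..sites) that minimizes the toy cost."""
    return min(
        range(1, sites + 1),
        key=lambda k: total_cost(reads, updates, k, sites),
    )


read_heavy = best_copies(reads=1000, updates=10, sites=4)
update_heavy = best_copies(reads=10, updates=1000, sites=4)
print(read_heavy, update_heavy)
```

Under this model a read-dominated workload favors more copies and an update-dominated workload favors fewer, which is the qualitative point of the trade-off; a real allocation decision would rest on measured costs.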


• A nonreplicated database (commonly called a partitioned database) contains fragments that are allocated to sites, and there is only one copy of any fragment on the network.

• In case of replication, either the database exists in its entirety at each site (fully replicated database), or fragments are distributed to the sites in such a way that copies of a fragment may reside in multiple sites (partially replicated database).
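The three allocation alternatives can be told apart from an allocation map alone. The following sketch assumes a hypothetical representation in which each fragment name maps to the set of sites holding a copy, with every fragment stored at least once; the function name `classify` is also an assumption.

```python
def classify(allocation, sites):
    """Classify an allocation map of fragment -> set of sites holding a copy."""
    all_sites = set(sites)
    if all(len(holders) == 1 for holders in allocation.values()):
        return "partitioned"  # exactly one copy of every fragment
    if all(holders == all_sites for holders in allocation.values()):
        return "fully replicated"  # every fragment at every site
    return "partially replicated"  # some fragments have several copies


sites = ["S1", "S2", "S3"]

partitioned = {"F1": {"S1"}, "F2": {"S2"}, "F3": {"S3"}}
full = {"F1": {"S1", "S2", "S3"}, "F2": {"S1", "S2", "S3"}}
partial = {"F1": {"S1", "S2"}, "F2": {"S3"}}

print(classify(partitioned, sites))
print(classify(full, sites))
print(classify(partial, sites))
```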



INFORMATION REQUIREMENTS

• The information needed for distribution design can be divided

into four categories:

 – database information: local organization of database

 – application information: location of application

 – communication network information: access

characteristics of applications to database

 – computer system information.


FRAGMENTATION

• Horizontal fragmentation partitions a relation along its tuples.

• There are two versions of horizontal fragmentation:

 – Primary horizontal fragmentation of a relation is performed using predicates that are defined on that relation.

 – Derived horizontal fragmentation is the partitioning of a relation that results from predicates being defined on another relation.
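The difference between the two versions can be sketched with two small relations. The `dept` and `emp` data, the predicate names, and the helper functions are all hypothetical; the derived fragments are computed as a semijoin of the member relation with each fragment of the owner relation.

```python
# Hypothetical owner relation DEPT and member relation EMP (linked by dno).
dept = [
    {"dno": 1, "city": "Paris"},
    {"dno": 2, "city": "Oslo"},
    {"dno": 3, "city": "Paris"},
]
emp = [
    {"eno": 10, "dno": 1},
    {"eno": 11, "dno": 2},
    {"eno": 12, "dno": 3},
]


def primary_fragments(relation, predicates):
    """Primary: select tuples using predicates defined on the relation itself."""
    return {
        name: [t for t in relation if pred(t)]
        for name, pred in predicates.items()
    }


def derived_fragments(member, owner_fragments, key):
    """Derived: keep member tuples that join with each owner fragment."""
    result = {}
    for name, owner_rows in owner_fragments.items():
        keys = {t[key] for t in owner_rows}
        result[name] = [t for t in member if t[key] in keys]
    return result


dept_frags = primary_fragments(
    dept,
    {
        "paris": lambda t: t["city"] == "Paris",
        "oslo": lambda t: t["city"] == "Oslo",
    },
)
emp_frags = derived_fragments(emp, dept_frags, key="dno")
print(emp_frags)
```

Here `emp` carries no `city` attribute at all; its fragments follow entirely from the predicates on `dept`, which is what makes the fragmentation derived.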


• Vertical fragmentation partitions a relation into a set of smaller relations so that many user applications will run on only one fragment.

• Vertical fragmentation is inherently more complicated than horizontal partitioning.
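A minimal sketch of vertical fragmentation follows, assuming a hypothetical `emp` relation with primary key `eno` and an attribute grouping chosen by hand. The helper names are assumptions, and choosing good attribute groups (the hard part in practice) is not addressed here.

```python
def vertical_fragment(relation, key, attribute_groups):
    """Project the relation onto each attribute group, repeating the key."""
    return [
        [{attr: t[attr] for attr in (key, *group)} for t in relation]
        for group in attribute_groups
    ]


def reconstruct(fragments, key):
    """Rebuild the relation by joining the fragments on the key."""
    merged = {}
    for frag in fragments:
        for t in frag:
            merged.setdefault(t[key], {}).update(t)
    return list(merged.values())


# Hypothetical relation EMP(eno, name, salary, dept).
emp = [
    {"eno": 1, "name": "Ana", "salary": 52000, "dept": "R&D"},
    {"eno": 2, "name": "Bo", "salary": 31000, "dept": "Sales"},
]

frags = vertical_fragment(emp, "eno", [("name",), ("salary", "dept")])
print(frags[0])                         # fragment holding eno and name
print(reconstruct(frags, "eno") == emp) # join on the key restores EMP
```

An application that only needs employee names would run against the first fragment alone, while the repeated key keeps the original relation recoverable.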