
Database Replication Techniques

by Adriano Caloiaro

Advisor: Daniel Plante

A SENIOR RESEARCH PAPER PRESENTED TO THE DEPARTMENT OF COMPUTER SCIENCE OF STETSON UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF BACHELOR OF SCIENCE IN THE DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE IN THE COLLEGE OF ARTS AND SCIENCE AT STETSON UNIVERSITY, DELAND, FL

Spring 2008


0.1 Acknowledgments

I would like to thank my advisor, Dr. Dan Plante, for always being available for help and an encouraging word whenever it was needed. Dr. Plante has always been able to explain complex problems in the clearest way possible and make them as easy to understand as possible. I believe Dr. Plante will always be an asset to the Mathematics and Computer Science department here at Stetson University.

I would also like to thank my academic advisor Dr. Hala El Aarag for advising me and directing me when I was unsure about my academic decisions. I feel privileged to have studied under her guidance.


TABLE OF CONTENTS

0.1 ACKNOWLEDGMENTS
0.2 ABSTRACT
1. INTRODUCTION
2. REPLICATION OVERVIEW
   2.1. SERVER ARCHITECTURES
        2.1.1. PRIMARY COPY
        2.1.2. UPDATE EVERYWHERE
   2.2. SERVER INTERACTION
        2.2.1. LINEAR
        2.2.2. CONSTANT
   2.3. TRANSACTION TERMINATION
        2.3.1. VOTING
               2.3.1.1. TWO-PHASE COMMIT
               2.3.1.2. PAXOS COMMIT
        2.3.2. NON-VOTING
3. RELATED WORK
4. IMPLEMENTATION
5. CONCLUSION


0.2 Abstract

Many large database systems suffer significantly from replication lag because the database management system cannot efficiently propagate transactions to a large number of replicas (slave servers). Replication lag can result in inaccurate, irrelevant, or simply stale data when queries are executed on lagged servers. Replication lag is commonly the result of lazy replication, in which transactions are propagated asynchronously to other nodes after a transaction has been committed; as a consequence, one or more slaves in a cluster may not hold the most recent version of mission-critical data. Eager replication, on the other hand, propagates transactions atomically: a transaction must be applied to all of the replicas or to none of them, so that data remains consistent after every transaction. Eager replication therefore provides excellent consistency but poor performance. The purpose of this research is to examine the components that make up a replication protocol and to test different replication methods in order to find or devise the most efficient and reliable replication technique.


1. Introduction

In the last decade the World Wide Web has grown from a conglomeration of relatively static web sites intended to serve a small user base into a labyrinth of dynamic, database-driven web applications that support hundreds of thousands of users. Small-scale web sites, which typically serve fewer than 100 concurrent users, can usually manage with a single high-powered database server for all of the site's data needs. However, as a web site's user base grows, so does the need for more highly available and readily accessible database solutions. One way to provide high availability and low database latency is to scale up by purchasing expensive proprietary data solutions on top of mainframe architecture. Scaling out, however, has become a far more common approach: it provides high availability and low database latency by distributing the database load across many cheap commodity servers. In order to distribute the load across many database servers rather than serving data from a single mainframe computer, the database must be replicated, meaning that each server must contain a copy of the data, or of the subset of the data, needed to keep the web site or other database-driven application functioning. Replication in this sense is the copying of data from one server or group of servers to another server or group of servers, the result of which is a theoretically consistent data set.

Database replication is often a trade-off between performance and data consistency, where data consistency is the degree to which data is identical across servers. As data becomes less consistent, it becomes less valuable to the application that relies on it.

In the broadest sense there are two methods of replication: eager replication (synchronous) and lazy replication (asynchronous). Eager replication provides excellent data consistency by synchronizing transactions prior to commitment, but does so at the expense of performance; it is known for deadlocking, its inability to scale well, and excessive communication overhead [1], [2]. Deadlocking is most often the result of multi-site commit protocols, such as two-phase commit, that block until all nodes have committed a transaction [6]. Lazy (asynchronous) replication is known for high scalability and performance but low data consistency, because asynchronous transactions are not ACID (Atomicity, Consistency, Isolation, Durability) compliant and therefore fail to meet the consistency requirement [3]. ACID compliance, while assuring high levels of data consistency, traditionally has not provided high availability, and availability takes precedence over consistency in most user-centric applications, where stale or approximated data is sufficient [8].


2. Replication Overview

A myriad of replication methods are available today, ranging from fully ACID-compliant synchronous methods to partially ACID-compliant asynchronous methods. While most of these methods have their respective uses, the ideal is a single method that remains scalable while ensuring strong data consistency, reliability, and efficiency. Generally, replication methods can be broken down along three components: server architecture, server interaction, and transaction termination. Transaction termination can be further broken down into voting and non-voting termination. Figure 1 depicts this classification of replication methods [1]:

Figure 1

This figure provides a visual breakdown of the components that compose every replication protocol; each box represents a different class of replication protocol.


2.1 Server Architecture

Server architecture is the most definitive characteristic of a replication protocol and is often chosen based on what type of data is being replicated and how important data availability is to the dependent application. Primary Backup (Primary Copy) architecture is often used where lazy replication is employed, while Update Everywhere architecture, by its nature, is typically paired with some form of group communication primitive such as total order broadcast.

2.1.1 Primary Copy

Primary Copy architecture is built on the idea that there is a primary site (master) which makes all decisions regarding serialization order and delegates transactions to the replicas (slaves) after adequately processing each request [7]. In an environment where the database consists of multiple partitions, the primary site delegates a transaction only to the appropriate replicas [1]. Many distributed database systems consist of multiple partitions so that the entire data set is not replicated across all replicas; this cuts down not only on storage requirements but also on inter-server communication, which may take place over a high-latency WAN. Efficient transaction delegation is particularly important when eager (synchronous) replication is employed and server interaction is linear rather than constant. A minimal sketch of partition-aware delegation appears at the end of this section.

The main problem with Primary Copy architecture is that the structure innately forms both a bottleneck and a single point of failure. Because of the bottleneck, Primary Copy is best used in a hierarchical, tree-structured system such as HAMP (Hierarchical Asynchronized Multi-level-consistency Protocol), where each primary site is the head of a small tree or cluster containing a relatively small subset of the data [2]. In a Primary Copy scenario, the only way to remedy the single point of failure is to establish an active backup for the primary site, as proposed in [13]; whenever the primary site fails, an active backup containing all of the site's data is ready to take over the role of primary site.
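The following is a minimal sketch of the partition-aware delegation described above: a primary site forwards each transaction only to the replicas that hold the affected partition. The names used here (PrimarySite, Replica.apply, partition_of, and the transaction format) are illustrative assumptions, not drawn from any particular system.

    class PrimarySite:
        """Minimal sketch of partition-aware delegation in a Primary Copy setup."""

        def __init__(self, replicas_by_partition):
            # e.g. {"accounts": [replica1, replica2], "orders": [replica3]}
            self.replicas_by_partition = replicas_by_partition

        def delegate(self, transaction):
            partition = partition_of(transaction)
            # Forward the transaction only to replicas that hold this partition,
            # keeping inter-server traffic (possibly over a WAN) to a minimum.
            for replica in self.replicas_by_partition.get(partition, []):
                replica.apply(transaction)      # hypothetical replica API

    def partition_of(transaction):
        # Hypothetical routing rule: the transaction carries its target table.
        return transaction["table"]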

2.1.2 Update Everywhere

Update Everywhere architecture allows clients to contact any server within a cluster, regardless of whether the transaction is a simple query or a UDI (Update, Delete, Insert) transaction. The server that is contacted delegates the transaction to all other servers within the cluster, usually using some form of multicast propagation primitive such as Total Order Broadcast [1], [5]. Update Everywhere architecture solves the bottleneck and single-point-of-failure problems of Primary Copy by providing multiple points of data insertion for every cluster. However, implementing Update Everywhere has traditionally been a complex task.

2.2 Server Interaction

Server interaction decisions are vital to the performance of a replication protocol in a broadly distributed environment. Server interaction indicates the degree of communication between servers during the commitment of a transaction. High interaction (linear) typically correlates with higher consistency but lower performance, while low interaction (constant) correlates with lower consistency but better performance [1]. The reason for this correlation is that server interaction is the result of consistency-checking protocols such as two-phase commit, explained later, or 1-safe, a trimmed-down version of two-phase commit. The level of server interaction most often corresponds to the difference between lazy and eager replication, where lazy replication is asynchronous and eager replication is synchronous. It is important to limit communication enough that the replication protocol remains as efficient as possible while maintaining acceptable levels of data consistency and availability [16].

2.2.1 Linear Interaction

Linear server interaction works in an environment with a small number of nodes, but when it is applied to a widely distributed system with potentially thousands of nodes, scalability limits its performance. Linear interaction protocols handle each replicated operation on an individual basis, which means that communication overhead can reach unacceptable levels in an environment where hundreds of UDI transactions occur every second [15]. Even when replicated operations are bundled in a single transaction, each operation results in communication between master and slave. In a situation where only a few nodes are replicated, the communication overhead of linear interaction is negligible, but in a highly scaled environment replica lag results. Replica lag is the lag time between a delegate server and its slaves. High replica lag means low data availability, which can be critical in financial institutions where data such as exchange rates are replicated; for example, financial transactions may go through at exchange rates lower or higher than what the market is trading at because brokers are working with historical exchange rates rather than current market rates.

2.2.2 Constant Interaction

Constant server interaction means that the number of synchronization messages remains proportional to the number of transactions received, usually by means of message grouping [1]. Constant interaction takes advantage of the concept of transactions: a transaction may contain a single operation or multiple operations, but the idea is to produce a single synchronization message for every transaction rather than generating messages on a per-operation basis. By grouping multiple write operations into one transaction, inter-node communication can be limited significantly. A sketch of this idea follows.
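The sketch below illustrates message grouping under assumed names: a delegate applies every operation of a transaction locally and then sends a single synchronization message, however many operations the transaction contains. ConstantInteractionDelegate and send_sync_message are hypothetical, not part of any particular DBMS.

    class ConstantInteractionDelegate:
        """Minimal sketch of constant interaction through message grouping."""

        def __init__(self, replicas):
            self.replicas = replicas            # peer nodes to synchronize with

        def commit_transaction(self, operations):
            # Apply every write operation of the transaction locally first.
            for op in operations:
                self.apply_locally(op)
            # One synchronization message per transaction, regardless of how
            # many operations the transaction contains.
            message = {"type": "sync", "ops": list(operations)}
            for replica in self.replicas:
                replica.send_sync_message(message)   # hypothetical replica API

        def apply_locally(self, op):
            pass                                # placeholder for the storage engine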

2.3 Transaction Termination

When a transaction has been received by a site, it is important to coordinate all replicas for termination in order to ensure atomicity, one of the elements of ACID compliance. Transaction termination coordination can be broken down into two methods: voting and non-voting termination. Transaction termination, while not as influential on performance as server interaction, can still affect replication performance, data availability, and, most importantly, data consistency. The transaction termination protocol is the process by which all servers are coordinated to either abort or commit a particular transaction. Some termination protocols ensure total atomicity while others can offer only partial atomicity. Additionally, some protocols use a single transaction manager to manage transaction termination while others use multiple transaction managers; it simply depends on the implementation [17].

2.3.1 Voting Termination

Voting termination requires a round of messages to coordinate servers in such a way that atomicity is ensured. The complexity of the voting mechanism can vary greatly, with many hybrid voting techniques available, but the most common protocol is the two-phase commit protocol. Voting can, however, be as simple as a single response from a slave to a master server [1]. Figure 2 shows the different states that a transaction traverses in order to be either committed or aborted [17].

Figure 2

As the figure shows, every transaction must first go through a working phase before it can make any progress, regardless of protocol implementation. The goal of the process is for every site to ultimately arrive at either the aborted state or the committed state. The requirements for doing so are as follows.

General requirements:

- Stability: once a site has entered the committed or aborted state, it must remain in that state permanently.

- Consistency: every site ultimately finishes the transaction in either the aborted state or the committed state.

Prepared requirements:

- A transaction can be committed only after all sites have reached the prepared state.

Abortion requirements:

- A transaction is aborted when any site enters the aborted state.

An important element of voting termination is progress. Establishing progress ensures that a transaction cannot block indefinitely. Progress is ensured by the following laws [17] (RMs are resource managers, or sites):

- Nontriviality: If the entire network is nonfaulty throughout the execution of the protocol, then (a) if all RMs reach the prepared state, then all RMs eventually reach the committed state, and (b) if some RM reaches the aborted state, then all RMs eventually reach the aborted state.

- Non-blocking: If, at any time, a sufficiently large network of nodes is non-faulty for long enough, then every RM executed on those nodes will eventually reach either the committed or aborted state.
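To make the preceding state discussion concrete, the following is a minimal sketch of the transaction states and of the transitions permitted by the stability requirement. The enum and the transition table are illustrative assumptions, not taken from [17].

    from enum import Enum

    class TxState(Enum):
        WORKING = "working"
        PREPARED = "prepared"
        COMMITTED = "committed"
        ABORTED = "aborted"

    # Legal transitions: a site leaves WORKING for PREPARED or ABORTED, and once
    # it reaches COMMITTED or ABORTED it stays there (the stability requirement).
    LEGAL_TRANSITIONS = {
        TxState.WORKING:   {TxState.PREPARED, TxState.ABORTED},
        TxState.PREPARED:  {TxState.COMMITTED, TxState.ABORTED},
        TxState.COMMITTED: set(),
        TxState.ABORTED:   set(),
    }

    def transition(current, target):
        # Reject any move the protocol does not allow.
        if target not in LEGAL_TRANSITIONS[current]:
            raise ValueError(f"illegal transition: {current.value} -> {target.value}")
        return target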

2.3.1.1 Two-Phase Commit

The concept behind two-phase commit is simple: a coordinating server coordinates the committing and aborting of all transactions. The coordinator does so by waiting on a response from all servers and then allowing sites to traverse from state to state. For example, when the coordinator receives a message from some site that it has entered the aborted state, it must notify all other sites to enter the aborted state as well so that atomicity is ensured. The essential flaw in the two-phase commit protocol, however, is that there is only a single coordinating site, which means that all sites wait to hear from the coordinator before moving to the next appropriate state. If the coordinator suffers a network failure or becomes overloaded, transactions block indefinitely, regardless of the progress laws, because the coordinator manages all state-transition orders. Many solutions to this problem have been proposed, one of which is a three-phase commit protocol that adopts a transaction manager election policy: if the transaction manager fails in some way, a new transaction manager is elected and the transaction committal process may resume, as proposed in [18], [17]. A minimal sketch of the two-phase decision rule follows.
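The following is a minimal, single-threaded sketch of the decision rule just described: commit only if every site votes prepared, otherwise abort. The Site class and its prepare and finish methods are hypothetical placeholders; a real coordinator would also have to handle timeouts and logging for recovery.

    class Site:
        """Hypothetical participant; a real site would run its own DBMS."""

        def prepare(self, transaction):
            # Return "prepared" if the site can guarantee the commit, else "abort".
            return "prepared"

        def finish(self, transaction, decision):
            pass    # apply the coordinator's commit/abort decision locally

    def two_phase_commit(sites, transaction):
        # Phase 1: ask every site to prepare and collect the votes.
        votes = [site.prepare(transaction) for site in sites]

        # Phase 2: decide, then broadcast the decision to every site.
        decision = "commit" if all(v == "prepared" for v in votes) else "abort"
        for site in sites:
            site.finish(transaction, decision)
        return decision

Note that if the coordinator running two_phase_commit fails between the two phases, the sites are left waiting, which is exactly the blocking flaw discussed above.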

2.3.1.2 PAXOS commit

PAXOS commit is based on the PAXOS algorithm, an algorithm by which groups reach consensus. The algorithm solves two problems for transaction termination. The first is the single-TM (transaction manager) problem, in which a transaction manager fails and a new one must be elected: because the group knows that every cluster requires a transaction manager, a PAXOS cluster begins electing a new TM immediately upon failure of the old one, and the election always results in a new TM being selected. The second problem is simply deciding whether to abort or commit a transaction. PAXOS handles both by acquiring consensus from all replicas on a transaction manager and consensus on the transaction committal action.

What is most important about PAXOS is that it is simply a method for groups to decide on arbitrary values. This means the algorithm can be applied to a wide range of group consensus problems, and therefore to both transaction manager election and transaction committal. PAXOS elects a transaction manager by a two-phase election model initiated by a client, which may be a database client or some other node. The client (or leader) proposes a value to all of the acceptors, which are the individual sites; the value proposed is a unique identifier for the replica that it wishes to become the transaction manager of the cluster. Most commonly, the client immediately considers itself the leader, proposes a ballot to itself, and then sends the ballot out to all the acceptors asking them to vote. If the ballot passes, consensus is reached. If the ballot does not pass, any of the acceptors is free to make its own ballot and send it to the leader, which then sends the new ballot out to all acceptors, until a consensus is eventually reached on some ballot. Once consensus is reached, the problem is solved. This same election pattern is used to reach consensus on whether a transaction should be committed or aborted, but the same rules apply as with two-phase commit: consensus may not be reached for a number of reasons, ranging from network failure to transaction committal failure at one or many nodes to one or more transactions not having been committed within a specified time frame. If full consensus is not reached from every replica, the transaction is aborted. The ballot pattern is sketched below.
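The following is a toy sketch of the ballot pattern just described, not a complete PAXOS implementation: a leader proposes a value (for example, the identifier of the node it wants as transaction manager) and the acceptors vote on it. The Acceptor class and its accepts method are illustrative assumptions.

    class Acceptor:
        """Hypothetical acceptor; a real one would track ballot numbers and promises."""

        def __init__(self, name):
            self.name = name

        def accepts(self, proposed_value):
            # Stand-in voting rule; real acceptors apply the PAXOS promise/accept rules.
            return True

    def run_ballot(proposed_value, acceptors, quorum):
        # Count the acceptors that vote for the leader's ballot.
        votes = sum(1 for acceptor in acceptors if acceptor.accepts(proposed_value))
        return votes >= quorum      # consensus is reached once the quorum agrees

    # For leader (TM) election a majority quorum is typical; for transaction
    # committal the text above requires agreement from every replica, so the
    # quorum would be len(acceptors).
    acceptors = [Acceptor(f"site-{i}") for i in range(5)]
    elected = run_ballot("site-0 as TM", acceptors, quorum=len(acceptors) // 2 + 1)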

2.3.2 Non-voting

Non-voting termination relies on the individual servers to determine whether or not to commit a transaction. Non-voting replication is often called lazy replication. Its main benefit is that it is non-blocking; however, this non-blocking behavior comes at the expense of data consistency. The idea of non-voting replication is simple: a master server writes everything that it believes should be committed by its slaves to a transaction log file, and the slaves read the log file at their convenience (usually immediately, but there is no restriction on how long they can take). Whenever a slave sees that an update has been made to the log file, it reads the file and commits any transactions it has not yet applied to its local copy. No voting on committal, and no interaction beyond the reading of the log file, occurs between master and slaves or among the slaves themselves, hence the name non-voting. A minimal sketch of this log-shipping pattern follows.
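The sketch below shows the log-shipping pattern just described: the master appends committed transactions to a log and each slave replays whatever it has not yet applied, at its own pace and without any voting. The Master and Slave classes and their apply methods are illustrative assumptions.

    class Master:
        def __init__(self):
            self.log = []                       # ordered list of committed transactions

        def commit(self, transaction):
            self.apply(transaction)             # commit locally first
            self.log.append(transaction)        # then expose it to the slaves via the log

        def apply(self, transaction):
            pass                                # placeholder for the local storage engine

    class Slave:
        def __init__(self, master):
            self.master = master
            self.applied = 0                    # position reached in the master's log

        def catch_up(self):
            # Replay every transaction not yet applied locally; no voting occurs.
            while self.applied < len(self.master.log):
                self.apply(self.master.log[self.applied])
                self.applied += 1

        def apply(self, transaction):
            pass                                # placeholder for the local storage engine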

3. Related Work

Given the increased need for highly available data on the back end of today's database-driven applications and web sites, a great amount of research has gone into developing more efficient and more consistent database replication protocols. The research of greatest importance to this issue concerns the two-phase commit protocol for synchronous database replication, given that it is the most time-sensitive component of synchronous replication. While asynchronous replication remains the dominant technique, synchronous replication is the most important to research as data consistency becomes more vital to mission-critical applications, and scalability is its greatest drawback. The driving force behind synchronous replication research is the inadequacy of asynchronous replication and the daunting task of monitoring and maintaining the consistency of large databases that are asynchronously replicated.

One such method for improving the performance of synchronous replication has been devised by researchers at the Swiss Federal Institute of Technology [14]. It promises to “eliminate the possibility of deadlocks, reduce the message overhead and increase performance.” Given the simulation results, it does a very reasonable job of delivering on those promises, but like many other replication techniques under research, it has yet to be fully implemented and used in the real world.


4. Implementation

The implementation of a replication protocol consists of three important characteristics: server architecture, server interaction, and a voting mechanism, all of which are vital to the overall performance and data consistency of the protocol. The protocol implemented in this research is called Constant Broadcast; it is an update everywhere, constant interaction, voting protocol, and the voting mechanism is the most important component of its performance. The voting protocol used by Constant Broadcast is a Two-Phase Commit (2PC) protocol modified so that transaction commits cannot wait indefinitely; the waiting period is bounded by a shared variable, tau (τ). The following is a list of symbols used to describe Constant Broadcast's 2PC protocol.

Symbol   Meaning
τ        Maximum time a transaction may take, per node, to commit
N        Number of servers in the active server group
Nmin     Minimum number of servers in the active node group
G        Active node group
I        Inactive node group

Constant Broadcast relies on three principles: all nodes belong to either group G or group I within their respective cluster; there must exist more nodes than are necessary to carry out the tasks of the cluster; and no node may take longer than τ to commit any transaction while it belongs to G, otherwise it is placed in I. The bookkeeping these principles imply is sketched below.
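As a minimal sketch of that bookkeeping, the class below tracks the active and inactive groups and the Nmin spare-capacity requirement. The names are illustrative assumptions; the protocol logic itself appears in the pseudo-code later in this section.

    class ClusterState:
        """Minimal sketch of Constant Broadcast group bookkeeping (illustrative only)."""

        def __init__(self, nodes, n_min, tau):
            self.active = set(nodes)            # group G
            self.inactive = set()               # group I
            self.n_min = n_min                  # minimum size of G
            self.tau = tau                      # per-node commit time bound

        def record_commit_time(self, node, seconds):
            # Principle 3: a node that exceeds tau is moved from G to I,
            # but only while G stays above its minimum size (principle 2).
            if seconds > self.tau and len(self.active) > self.n_min:
                self.active.discard(node)
                self.inactive.add(node)

        def readmit(self, node):
            # A caught-up node rejoins the active group.
            self.inactive.discard(node)
            self.active.add(node)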

Update everywhere server architecture was chosen for Constant Broadcast because every node in a Constant Broadcast cluster must be considered expendable, including the delegate server. Since any server in an Update Everywhere cluster can be the delegate server, any server can also be considered an expendable node of that cluster, because any other node can take its place as transaction delegate. In Primary Copy, by contrast, the master server acts by definition as the transaction delegate and cannot under any circumstance be considered expendable, because it is the glue that binds the cluster together. Figure 3 visualizes the Primary Copy server architecture.

Figure 3

As the figure makes clear, if the master server were expelled from the cluster, the slave nodes would have no server to delegate transactions to them. All nodes must be expendable in order to adhere to the specification of Constant Broadcast's 2PC protocol. Figure 4 shows how transaction delegation works with an Update Everywhere server architecture.


Figure 4

Figure 4 helps clarify why nodes can be expendable, while at the same time showing why the 2PC protocol must be efficient, given the high degree of server interaction.

The next component of the Constant Broadcast protocol is its node groups, which are vital to putting expendable servers to good use. Whenever a node in G (the active group) fails to commit a transaction within τ, it is added to group I. The pseudo-code for this is quite simple for a node in G:

if (timeToCommit > τ and N > Nmin) {
    expellNode(node)                // move the slow node to the inactive group I
} else {
    notifyDelegate(commitSuccessful)
}

(Nmin is an administrator-defined variable based on the number of expendable servers)

The purpose of this is to eliminate the chance of deadlocking and to increase the overall performance of the active node group by setting a limit on how long a transaction can take to commit. Traditional 2PC protocols allow unrealistically long or even infinite amounts of time for a transaction to commit, which can result in unreliable performance that simply does not compete well in the real world with other replication protocols such as lazy replication. The purpose of the inactive group I is not to disregard its members, but to allow nodes within I to catch up with G asynchronously so that they can once again become participating members of the group. The delegate of any transaction controls who should be placed in I and who should remain in G.

indefiniteLoop {
    if (new transaction arrived) {
        success = commitTransaction(transaction)
        if (success) {
            broadcastTransaction(transaction)       // sent to all nodes, in G and I
            if (numberOfSuccessfulCommits == N - I)
                transactionSuccessful()
            else
                transactionFailed()
        } else {
            abortTransaction()
        }
    }
}

The broadcastTransaction(transaction) function broadcasts the transaction to all nodes, whether they are in G or I. However, as one can see from the pseudo-code, the delegate server does not wait on responses from the nodes in I: they continue to receive the broadcasts, but they commit them asynchronously and monitor their own state to determine whether they have caught up to the nodes in G. Every node has a polling mechanism that can check how many transactions the other nodes have committed. Keep in mind that the nodes in G trust the nodes in I to manage their own group membership once they belong to I, so the following pseudo-code applies to members of I:

indefiniteLoop {
    if (new transaction committed) {
        numTransactionsCommitted = pollActiveServers()
        if (myNumCommitted == numTransactionsCommitted)
            placeInActiveGroup(me)          // caught up: rejoin G
        else
            remainInactive()
    }
}

This allows members of I to manage their own group membership: whenever they have caught up to the nodes in G, they can once again become participating nodes in G.

The most important part of Constant Broadcast is the selection of a suitable base τ value. All transactions that will be run on a cluster should be analyzed by an administrator to determine the longest-running transaction the cluster of servers will ever run. That transaction should then be run thousands of times under normal load, and the maximum observed commit time should be used as the base τ (a sketch of this measurement appears at the end of this section). Once a base τ is set, the Constant Broadcast protocol adjusts τ according to the following pseudo-code:

expellNode(node) {
    placeInInactiveGroup(node)
    N = N - 1
    I = I + 1
    τ = τ + τ/N
    while (N == Nmin) {
        blockWrites()       // wait for a node in I to catch up
    }
}

Constant Broadcast's performance varies greatly with the administrator's ability to set reasonable values for Nmin and τ. If τ is set too low and many transaction commits run longer than τ, then I will fill up and N will reach Nmin very quickly, which means that transactions will block while the protocol waits for the nodes in I to catch up with G. If τ is set too high, then Constant Broadcast will perform the same as any other synchronous replication protocol and wait unnecessarily for slow nodes to process transactions. If too few expendable servers are available and Nmin is close to N, then Constant Broadcast will also perform similarly to other synchronous replication protocols.
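As a worked illustration of the base-τ selection procedure described earlier in this section, the sketch below times repeated runs of the cluster's longest-running transaction and takes the maximum as the base τ. The run_longest_transaction callable and the trial count are illustrative assumptions.

    import time

    def measure_base_tau(run_longest_transaction, trials=1000):
        """Minimal sketch: derive a base τ from repeated timings under normal load."""
        worst = 0.0
        for _ in range(trials):
            start = time.monotonic()
            run_longest_transaction()           # execute the slowest known transaction
            elapsed = time.monotonic() - start
            worst = max(worst, elapsed)         # keep the maximum observed commit time
        return worst                            # the maximum becomes the base τ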

5. Conclusion

Constant Broadcast is a fault-tolerant replication protocol for high-performance synchronous database replication. The shortcomings of most synchronous protocols stem from a single group or cluster membership with no pre-defined time frame in which to commit transactions. Constant Broadcast addresses those shortcomings by allowing nodes to move between an active node group and an inactive node group based on their ability to commit transactions within a pre-defined time frame τ. Allowing nodes to move from the active group to the inactive group whenever they fail to meet the temporal criterion defined by τ prevents transaction commits from deadlocking and from degrading the overall performance of the database system.


References

[1] Wiesmann, M., Pedone, F., Schiper, A., Kemme, B., and Alonso, G. 2000. Database replication techniques: a three parameter classification. In Proceedings of the 19th IEEE Symposium on Reliable Distributed Systems (SRDS 2000), 206-215.

[2] Tao, J. and Williams, J. G. 2001. Concurrency control and data replication strategies for large-scale and wide-distributed databases. In Proceedings of the Seventh International Conference on Database Systems for Advanced Applications, 352-359.

[3] Wiesmann, M. and Schiper, A. 2005. Comparison of database replication techniques based on total order broadcast. IEEE Transactions on Knowledge and Data Engineering 17, 4 (Apr. 2005), 551-566.

[4] Liu, X., Helal, A., and Du, W. 1998. Multiview access protocols for large-scale replication. ACM Trans. Database Syst. 23, 2 (Jun. 1998), 158-198. DOI= http://doi.acm.org/10.1145/292481.277628

[5] Groothuyse, T., Sivasubramanian, S., and Pierre, G. 2007. GlobeTP: template-based database replication for scalable web applications. In Proceedings of the 16th International Conference on World Wide Web (WWW '07). ACM, New York, NY, 301-310. DOI= http://doi.acm.org/10.1145/1242572.1242614

[6] Manassiev, K. and Amza, C. 2005. Scalable database replication through dynamic multiversioning. In Proceedings of the 2005 Conference of the Centre for Advanced Studies on Collaborative Research. IBM Press, 141-154.

[7] Sousa, A., Pereira, J., Soares, L., Correia, A., Jr., Rocha, L., Oliveira, R., and Moura, F. 2005. Testing the dependability and performance of group communication based database replication protocols. In Proceedings of the International Conference on Dependable Systems and Networks (DSN 2005), 792-801.

[8] Saito, Y. and Shapiro, M. 2005. Optimistic replication. ACM Comput. Surv. 37, 1 (Mar. 2005), 42-81. DOI= http://doi.acm.org/10.1145/1057977.1057980

[9] DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., and Vogels, W. 2007. Dynamo: Amazon's highly available key-value store. In Proceedings of the 21st ACM SIGOPS Symposium on Operating Systems Principles (SOSP '07). ACM, New York, NY, 205-220. DOI= http://doi.acm.org/10.1145/1294261.1294281

[10] Breitbart, Y., Komondoor, R., Rastogi, R., Seshadri, S., and Silberschatz, A. 1999. Update propagation protocols for replicated databases. In Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data (SIGMOD '99). ACM, New York, NY, 97-108. DOI= http://doi.acm.org/10.1145/304182.304191

[11] Fox, A., Gribble, S. D., Chawathe, Y., Brewer, E. A., and Gauthier, P. 1997. Cluster-based scalable network services. In Proceedings of the Sixteenth ACM Symposium on Operating Systems Principles (SOSP '97). ACM, New York, NY, 78-91. DOI= http://doi.acm.org/10.1145/268998.266662

[12] Barbara, D., Garcia-Molina, H., and Spauster, A. 1989. Increasing availability under mutual exclusion constraints with dynamic vote reassignment. ACM Trans. Comput. Syst. 7, 4 (Nov. 1989), 394-426. DOI= http://doi.acm.org/10.1145/75104.75107

[13] Bhide, A., Goyal, A., Hsiao, H., and Jhingran, A. 1992. An efficient scheme for providing high availability. In Proceedings of the 1992 ACM SIGMOD International Conference on Management of Data (SIGMOD '92). ACM, New York, NY, 236-245. DOI= http://doi.acm.org/10.1145/130283.130320

[14] Kemme, B. and Alonso, G. 2000. A new approach to developing and implementing eager database replication protocols. ACM Trans. Database Syst. 25, 3 (Sep. 2000), 333-379. DOI= http://doi.acm.org/10.1145/363951.363955

[15] Gifford, D. K. 1979. Weighted voting for replicated data. In Proceedings of the Seventh ACM Symposium on Operating Systems Principles (SOSP '79). ACM, New York, NY, 150-162. DOI= http://doi.acm.org/10.1145/800215.806583

[16] Jiménez-Peris, R., Patiño-Martínez, M., Alonso, G., and Kemme, B. 2003. Are quorums an alternative for data replication? ACM Trans. Database Syst. 28, 3 (Sep. 2003), 257-294. DOI= http://doi.acm.org/10.1145/937598.937601

[17] Gray, J. and Lamport, L. 2006. Consensus on transaction commit. ACM Trans. Database Syst. 31, 1 (Mar. 2006), 133-160. DOI= http://doi.acm.org/10.1145/1132863.1132867

[18] Aguilera, M. K., Delporte-Gallet, C., Fauconnier, H., and Toueg, S. 2001. Stable leader election. In DISC '01: Proceedings of the 15th International Conference on Distributed Computing, J. L. Welch, Ed. Lecture Notes in Computer Science, vol. 2180. Springer-Verlag, Berlin, Germany.