
Intervening Validations for Concurrency Control*

Removed for double blind reviewing

Removed for double blind reviewing

Removed for double blind reviewing

ABSTRACT

This paper discusses intervening validations for concurrency control. Multi-version intervening validation uses three timestamps (Start, Commit and Combined) for transaction validation. The Start timestamp records the time when the transaction reads its first data object. The Commit timestamp records the time when the transaction requests to commit. The Combined timestamp can either be set to the Start timestamp or to the Commit timestamp, depending on the result of a conflict check using the proposed "intervening validation" algorithm. In single version systems, more validation intervals are expected. Multiple intervening validations are necessary when more than one validation interval is introduced due to a single version constraint. Our study shows that applying intervening validation to a multi-version system achieves the same performance as Snapshot Isolation while providing execution correctness (full serializability). In addition, applying intervening validation achieves significant performance gain in single version database systems. Recovery and distributed intervening validations are also discussed.

Keywords

Concurrency control, High performance systems, Database management systems.

1. INTRODUCTION


Many commercial database management systems provide multiple levels of transaction isolation as a compromise between transaction execution correctness and system performance. Most systems set "read committed" as the default isolation level to achieve good performance [1]. Interestingly, whether lowering the isolation level improves performance depends on the system implementation. For example, Shasha et al. [2] reported that the "read committed" isolation level in Microsoft SQL Server performs much better than the "serializable" level, though the error rate can be very high. In contrast, "read committed" in Oracle 8i/9i performs almost the same as "serializable" [2]. The "serializable" isolation in Oracle uses a variant of "Snapshot Isolation" (SI), which has a much lower error rate than "read committed".

Snapshot Isolation has demonstrated good performance in Oracle. To overcome its "write skew" problem [3], A. Fekete et al. proposed "promoting reads to writes" based on careful application analysis [4]. Such application-level solutions may incur serious performance penalties; they also require programmers to write complex code and are error-prone.

In this paper we present a system-level solution to the "write skew" problem in SI. The proposed algorithm uses "intervening validation" on a "conflict-equivalent history". Our performance analysis and simulation show that the solution not only solves the write skew problem and guarantees serializability, but also provides better performance than SI in some application settings. Furthermore, applying intervening validation to single version database systems may produce significant performance gains. Virtually all single version commercial database systems use two phase locking (2PL) at the serializable isolation level; the performance gain of the intervening validation algorithm over 2PL can be as much as 140% [6].

The paper is organized as follows. Section 2 presents the existing problems with SI. Section 3 introduces intervening validation and applies it to solve the problems with SI. Section 4 introduces ROCC, a method that applies intervening validation to single version database systems. Sections 5 and 6 discuss distributed intervening validation and recovery. Section 7 presents experimental results for six different concurrency control methods. We conclude the paper and discuss future work in Section 8.

2. SNAPSHOT ISOLATION PROBLEMS

Snapshot Isolation (SI) is an optimistic multi-version concurrency control method in which the scheduler uses the first-committer-wins policy to avoid lost-update problems. A transaction always reads data from a snapshot of the (committed) data as of the time the transaction starts, called its Start-Timestamp. Reads from a transaction are never blocked, provided the snapshot data can be maintained in the system (an assumed Data Manager function). The transaction's writes (updates, inserts and deletes) are also recorded in this snapshot, so they can be read again if the transaction accesses the data a second time. Updates of other transactions that begin after the transaction's Start-Timestamp are invisible to the transaction.

When a transaction, T1, is ready to commit, it gets a Commit-Timestamp that is larger than any existing Start-Timestamp or Commit-Timestamp. The transaction commits only if no other transaction, T2, with a Commit-Timestamp in T1's interval [Start-Timestamp, Commit-Timestamp] wrote data that T1 also wrote; otherwise T1 aborts. This is the so-called first-committer-wins policy. As pointed out in [5, 7], the first-committer-wins policy allows history H1: r1[x=50] r1[y=50] r2[x=50] r2[y=50] w1[y=-40] w2[x=-40] c1 c2.


* Patents on ROCC and MVROCC are pending


Allowing H1 violates the integrity constraint x+y>0 (x and y could represent bank account balances for a married couple, or quantities of parts in different store branches). Thus systems using Snapshot Isolation have a "write skew" problem. Such a write skew can cause an overdraft (in the bank account case) or an "over-sale" (inadequate parts on hand to deliver on a contract order). The system errors caused by write-skew problems can be "thousands per day" and "quite damaging" [4].

Another problem with SI is "snapshot too old". This problem comes from the assumption that all requested snapshot data are kept in the system. In reality, the system may not be able to maintain snapshot data for all applications because of limited space. For example, the Oracle implementation maintains old versions of data objects in rollback segments, and some old data has to be cleaned out to make room for the updates of newly arriving transactions. When snapshot data are not found, the transaction that requested them has to abort, and the system passes the error message "ORA-01555: Snapshot too old" to the application.

The third problem with SI is that it disallows some histories containing only write-write conflicts. History H2: w1[x1] w2[x2] c2 c1 should be allowed in multi-version systems (x1 and x2 are two versions of data object x), since it is equivalent to the serial order T2, T1 (serializable). SI aborts T1 because T2 commits first (in Oracle, T1 blocks T2 on item x until it commits; unfortunately Oracle deadlocks on the history w1[x1] w2[x2] w2[y2] c2 w1[y1] c1).
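To make the commit rule concrete, the following is a minimal sketch of the first-committer-wins check described above; it is our own illustration, not any vendor's implementation, and the logical clock, the in-memory list of committed transactions, and the function names are assumptions made for the example.

```python
# First-committer-wins commit check (illustrative sketch only).
committed = []   # (start_ts, commit_ts, write_set) of committed transactions
clock = 0        # logical timestamp counter

def next_ts():
    global clock
    clock += 1
    return clock

def try_commit(start_ts, write_set):
    """Commit only if no transaction whose commit timestamp falls inside
    [start_ts, commit_ts] wrote an object this transaction also wrote."""
    commit_ts = next_ts()
    for _, other_commit_ts, other_writes in committed:
        if start_ts < other_commit_ts < commit_ts and write_set & other_writes:
            return False          # another committer got there first: abort
    committed.append((start_ts, commit_ts, write_set))
    return True
```

Under this check, T1 and T2 in H1 write the disjoint sets {y} and {x}, so both commit, which is exactly the write-skew problem above; in H2, the two write sets overlap on x and T2 commits first, so T1 is aborted.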

3. MAKING SI SERIALIZABLE

3.1 Intervening validation

The "write skew" anomaly makes SI non-serializable. We suggest using the techniques of "free move", "two-sided conflict check" and "combined timestamp" to reject H1 and allow H2 above, thus making SI serializable while achieving better performance. Before presenting the "intervening validation" algorithm, which incorporates the three techniques, we define some new terms for convenience of presentation.

Element definitions

Read element: A transaction T's read element is the set of all of T's reads (excluding reads that read what T itself writes), with a start-timestamp recording the time when T reads its first data item. A transaction has only one read element because of the snapshot-read rule (this constraint is removed in the intervening validation for single version systems, discussed in Section 4). As in SI, the snapshot-read rule requires a transaction to read versions that were committed before its start-timestamp.

Commit element: A transaction T's commit element is the set of all of T's writes together with the reads that read what T writes, with a commit-timestamp recording the time of T's commit. A transaction has only one commit element due to the snapshot-write rule imposed on the intervening mechanism (also called the deferred-update rule, since a transaction's updates are invisible until it commits). We follow the SI deferred-write rule, which requires all writes to be invisible until the transaction commits. Reads that read what T writes are included in T's commit element because T's writes are visible to T itself.

Intervening element: A transaction T's intervening element is the set of all reads/writes of other transactions with timestamps within T's timestamp interval [Start-timestamp, Commit-timestamp]. A transaction has only one intervening element because it has only one time interval (this constraint is removed in the intervening validation for single version systems, where multiple intervals exist due to the presence of multiple read elements).

Combined element: A transaction T's combined element is the set of T's reads/writes, with a timestamp determined by the intervening validation algorithm (Figure 1). A transaction has exactly one combined element after it commits; otherwise it has to abort or restart.

With these four element definitions, we give the intervening validation algorithm in Figure 1. The upper-sided and lower-sided conflicts used in the algorithm are defined as follows.

Upper-sided conflict: A transaction T has an upper-sided conflict if operations in its read element conflict with operations in its intervening element. "Upper-sided" reflects the fact that the read element precedes the intervening element when a FIFO queue is used to record the arrival order of elements (see the discussion of the RC queue in the next section).

Lower-sided conflict: A transaction T has a lower-sided conflict if operations in its commit element conflict with operations in its intervening element.

In the intervening validation algorithm, we set T's combined timestamp to its start-timestamp or its commit-timestamp if T passes validation. We do this because, if T is validated, T's read (or commit) element can be freely moved along the time axis to be

When transaction T requests to commit:
    check if T has an upper-sided element conflict;
    if T has an upper-sided element conflict
    {
        check if T has a lower-sided element conflict;
        if T has a lower-sided element conflict
            T aborts;
        else
        {
            T commits;
            set T's combined element's timestamp to its Start-timestamp;
        }
    }
    else
    {
        T commits;
        set T's combined element's timestamp to its Commit-timestamp;
    }

Figure 1 The intervening validation algorithm


merged with its other element without violating any conflict order. Thus we can prove that the intervening validation algorithm guarantees serializable transaction execution histories.

Proof. The proof is by contradiction. Suppose the intervening validation algorithm produces a history that contains a cycle Ti ... Tj ... Ti in which both Ti and Tj are committed. Then either Ti or Tj must have two elements: one element that conflicts with, and precedes, an element of the other transaction, and a second element that conflicts with, and follows, an element of the other transaction. The preceding element has an upper-sided conflict with its intervening element, and the following element has a lower-sided conflict with the intervening element. These conflicts make it impossible to merge the two elements. This contradicts the validation criterion that every committed transaction has only one element with a combined timestamp (all committed transactions are required to shrink into a single combined element).
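As a concrete rendering of the Figure 1 algorithm, the following sketch performs the two-sided conflict check and assigns the combined timestamp. It is a sketch of the idea, not the authors' implementation: representing an element as a pair of read/write sets plus a timestamp, and representing the intervening element as a list of per-transaction elements, are our own illustrative assumptions.

```python
# Sketch of the Figure 1 validation (illustrative representation only).

from dataclasses import dataclass, field

@dataclass
class Element:
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)
    timestamp: int = 0

def conflicts(a: Element, b: Element) -> bool:
    # Elements conflict if one writes an object the other reads or writes.
    return bool(a.writes & (b.reads | b.writes) or b.writes & a.reads)

def merge(a: Element, b: Element) -> Element:
    return Element(a.reads | b.reads, a.writes | b.writes)

def validate(read_el, commit_el, intervening_els):
    """Return T's combined element if T may commit, or None if T must abort."""
    upper = any(conflicts(read_el, iv) for iv in intervening_els)
    if not upper:
        combined = merge(read_el, commit_el)
        combined.timestamp = commit_el.timestamp   # read element moved down
        return combined
    lower = any(conflicts(commit_el, iv) for iv in intervening_els)
    if lower:
        return None                                # two-sided conflict: abort
    combined = merge(read_el, commit_el)
    combined.timestamp = read_el.timestamp         # commit element moved up
    return combined
```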

3.2 Intervening validation eliminates H1

We rewrite H1 with timestamp marks in Figure 2 to explain how the intervening validation described in Figure 1 rejects H1. In Figure 2(a), T1's start-timestamp is 1000 and its commit-timestamp is 1010; T2's start-timestamp is 1001 and its commit-timestamp is 1011. When T1 commits (Figure 2(b)), T1's read element is r1[x=50]r1[y=50] with start-timestamp 1000, and T1's commit element is w1[y=-40]c1 with commit-timestamp 1010. The intervening element of T1 is T2's read element r2[x=50]r2[y=50] with T2's start-timestamp 1001 (1000 < 1001 < 1010, so T2's read element falls into T1's validation interval [1000, 1010]). When T1 commits, its read element has no conflict with its intervening element. Therefore T1's read element can be moved down to merge with its commit element (Figure 2(c)), and T1 commits with the combined element r1[x=50]r1[y=50]w1[y=-40]c1. The timestamp of T1's combined element is set to 1010 (T1's commit-timestamp, because T1's read element is moved down).

When T2 requests to commit, its intervening element is T1's combined element r1[x=50]r1[y=50]w1[y=-40]c1 (1001 < 1010 < 1011, so T1's combined element falls into T2's validation interval [1001, 1011]; see Figure 2(c)). T2 has to abort because its read element r2[x=50]r2[y=50] conflicts with the intervening element on data object y (an upper-sided element conflict) and its commit element w2[x=-40]c2 conflicts with the intervening element on data object x (a lower-sided element conflict). Thus the intervening validation avoids the "write skew" history H1.
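As a quick check, feeding the elements of Figure 2(c) into the validate sketch above rejects T2, matching the discussion (the element values below simply transcribe the figure):

```python
# T2's validation against T1's combined element (Figure 2(c)).
t2_read     = Element(reads={'x', 'y'}, timestamp=1001)
t2_commit   = Element(writes={'x'}, timestamp=1011)
t1_combined = Element(reads={'x', 'y'}, writes={'y'}, timestamp=1010)

assert validate(t2_read, t2_commit, [t1_combined]) is None  # T2 aborts: write skew rejected
```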

3.3 Intervening validation permits H2

Similarly, we rewrite history H2 with timestamp marks in Figure 3 to explain how the intervening validation permits it. x1 and x2 represent different versions of the same data object x. In H2, T1 has no read element, and T2 has no intervening element when it commits. Thus both T1 and T2 can commit under the intervening validation algorithm. SI aborts T1 because

(a) Original history H1:
1000: r1[x=50] r1[y=50]
1001: r2[x=50] r2[y=50]
      w1[y=-40] w2[x=-40]
1010: c1
1011: c2

(b) Validating T1 (equivalent H1):
1000: r1[x=50]r1[y=50] (T1's read element)
1001: r2[x=50]r2[y=50] (T1's intervening element; also T2's read element)
1010: w1[y=-40]c1 (T1's commit element)
1011: w2[x=-40]c2

(c) T1 committed, validating T2 (equivalent H1):
1001: r2[x=50]r2[y=50] (T2's read element)
1010: r1[x=50]r1[y=50]w1[y=-40]c1 (T1's combined element; also T2's intervening element)
1011: w2[x=-40]c2 (T2's commit element)

(d) T2 aborted because of validation failure (equivalent H1):
1010: r1[x=50]r1[y=50]w1[y=-40]c1 (T1's combined element)

Figure 2 Intervening validation in history H1


w1[x1] conflicts with w2[x2] but T2 commits before T1 (first-committer-wins policy).

4. INTERVENING VALIDATION FOR SV-DBMS

Virtually all commercial single version database management systems (SV-DBMSs) use the two phase locking (2PL) mechanism at the "serializable" level of transaction isolation. Unlike multi-version DBMSs such as Oracle 8i/9i, the performance of the "serializable" isolation level in SV-DBMSs differs significantly from that of "read committed". According to the performance tests in [2], Microsoft SQL Server delivers less than half the throughput when "serializable" is chosen instead of "read committed", while 40% of reads may be wrong when "read committed" is used for better performance. Thus users have to compromise between execution correctness and system performance.

A Read-Commit Order Concurrency Control method (ROCC) was introduced in [6] to achieve performance close to that of "read committed" in single version database systems. ROCC is a deadlock-free, serializable concurrency control method based on optimistic mechanisms. It maintains a Read-Commit queue (RC queue) that records the access order of transactions. Along with the RC queue, a variant of the intervening validation algorithm is used to ensure execution correctness.

ROCC uses deferred update to avoid cascading aborts [8]. Unlike in MV-DBMSs, snapshot read does not work in SV-DBMSs: there is no snapshot data, and a transaction may have multiple read elements because only a single version is available for each data object. As in the intervening validation described in the preceding section, the technique of moving elements down or up to merge with neighboring elements is used in the intervening validation for SV-DBMSs. A transaction has to abort (or restart) if its elements cannot be merged into a single combined element.

The intervening validation algorithm for single version systems works as follows. When a commit request message arrives, the scheduler generates a Commit element and posts it to the RC queue, starting the validation process for that transaction. The process starts from the first Read element (called "first" below) of the validating transaction. If "first" conflicts with no elements in the interval between "first" and the transaction's next element in the queue, the validation process combines "first" with that next element and renames the resulting element "first". This is repeated until the commit element is reached. If "first" conflicts with no intervening elements all the way down to the commit element, the validation passes. Upon validation success the scheduler sends a commit request message to the execution queue for the data manager to perform the write operations; otherwise it sends an abort (or restart) message back to the transaction manager.

If a conflict is found, we have an upper-sided conflict. Let "second" point to the Commit element, and let the "first-up-reached element" be the first of the transaction's own elements encountered when moving up along the queue from "second". If "second" conflicts with an intervening element in the interval between "second" and the first-up-reached element, the validation fails, since this is a lower-sided conflict. Otherwise we merge "second" with the first-up-reached element and continue the search for a lower-sided conflict until "second" merges with "first". If "second" cannot merge with "first" because a lower-sided conflict blocks its upward movement, the validation fails; otherwise the validation succeeds. The merger of "first" and "second" becomes the combined element of the transaction. The combined element stays in the RC queue for the commit validation of other transactions, and is removed when it reaches the head of the queue.
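The following sketch mirrors the two-direction merge just described for the single version case. It is illustrative only: the RC queue is simplified to a list scanned in arrival order, and the QElement record and helper names are our assumptions, not the authors' data structures.

```python
# Simplified sketch of RC-queue validation (illustrative only).

from dataclasses import dataclass, field

@dataclass
class QElement:
    tid: int                              # owning transaction
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)

def conflicts(a: QElement, b: QElement) -> bool:
    # Two elements conflict if one writes an object the other reads or writes.
    return bool(a.writes & (b.reads | b.writes) or b.writes & a.reads)

def merge(a: QElement, b: QElement) -> QElement:
    return QElement(a.tid, a.reads | b.reads, a.writes | b.writes)

def validate(rc_queue, tid):
    """Try to merge all of tid's elements into one combined element.
    Returns the combined element on success, or None (abort/restart)."""
    own = [i for i, e in enumerate(rc_queue) if e.tid == tid]
    first_pos, first = own[0], rc_queue[own[0]]
    # Move "first" down, merging with tid's next element when nothing
    # in between conflicts with it.
    for nxt in own[1:]:
        if any(conflicts(first, iv) for iv in rc_queue[first_pos + 1:nxt]):
            break                                  # upper-sided conflict
        first, first_pos = merge(first, rc_queue[nxt]), nxt
    else:
        return first                               # reached the commit element
    # Upper-sided conflict found: move "second" (the commit element) up.
    second_pos, second = own[-1], rc_queue[own[-1]]
    for prev in reversed([p for p in own if first_pos < p < second_pos]):
        if any(conflicts(second, iv) for iv in rc_queue[prev + 1:second_pos]):
            return None                            # lower-sided conflict
        second, second_pos = merge(second, rc_queue[prev]), prev
    if any(conflicts(second, iv) for iv in rc_queue[first_pos + 1:second_pos]):
        return None                                # two-sided conflict: fail
    return merge(first, second)                    # combined element
```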

An example in Figures 4 and 5 shows how ROCC works. Suppose we have a receive queue that stores all the request messages from transaction clients. The first is an access request message for transaction T0, asking for Read access to data objects x1 and x3 (different data objects, not different versions of the same data object x as in MV-DBMSs). Correspondingly, a Read element is posted to the RC queue with Tid=0, V=0 (V: 1 = validated, otherwise 0), C=0 (C: 1 = commit, otherwise 0), R=0 (R: 1 = restarted, otherwise 0) and ReadSet=(1,3). The message is then delivered to an execution queue for the data manager to perform the operations. The second message in the receive queue is from T1, asking for Read access to data objects x1, x2 and x3; a corresponding Read element is posted to the RC queue with Tid=1, VCR=000 and ReadSet=(1,2,3). The third message is for T2 to read data objects x1 and x3 and then write data item x3. The fifth message is a commit request message from T2. Note that no WriteSet is placed in the Read element (the third element) because of the deferred update rule (WriteSet=(3) is placed in T2's commit element, the fifth element, though the writes are still sent to the database sites for data logging [1]). The commit request message is then sent to the execution queue for

(a) Original history H2:
1000: w1[x1]
1001: w2[x2]
1010: c2
1011: c1

(b) Validating T1 (equivalent H2):
1010: w2[x2]c2 (T2's commit element, also its combined element)
1011: w1[x1]c1 (T1's commit element, also its combined element)

Figure 3 Intervening validation permits H2


the data manager to perform the operations (commit and write). The data manager executes the operations in the execution queue in order when conflicts occur (this is enforced by the data manager).

The fourth message is an access request message with a validated mark (V=1), which requests reads of data items x1 and x2 and a write of x3. This mark means the client is certain that the transaction needs only this one request message. Correspondingly, the element representing T3 is marked as validated (V=1), since there is no validation interval for the transaction and thus no validation is needed. When T2's commit request message arrives after T3, its validation fails because T2 has two elements conflicting with an intervening element in the RC queue, namely the element corresponding to T3. Thus T2's elements (the dashed elements) are removed from the RC queue and a Restart element (R=1) is posted to the RC queue. A corresponding restart request message is then put into the execution queue to re-access the data objects requested before. Since T1's commit request message arrives during the validation of T2, and its Commit element is posted before T2's Restart element, ROCC delivers T1's commit request message before T2's restart request message to the database sites, as shown in the execution queue. Note that, even though transaction T3 is committed, its element in the RC queue cannot be removed until it becomes the head of the queue or all the elements before it are marked as validated (see Figure 5(a)), because it may still be involved in a cycle. In this example, T2 and T3 form a cycle T2->T3->T2; if T3's validated element were removed from the RC queue before T2's validation, the cycle would not be detected during T2's validation.

After T2 fails validation, T1's commit message activates the validation process for T1. T1's "first" element in the RC queue tells us it reads x1, x2 and x3, and its intervening element is T3 (see Figure 5(b)). We find that "first" conflicts with its intervening element (an upper-sided conflict), since T3 writes x3 while T1 reads x3 (a read-write conflict). We then check whether "second" (here the commit element) conflicts with its intervening element; T3 is also the "second" element's intervening element in this example. No lower-sided conflict exists, so T1 passes validation (traditional OCC would fail T1 because of the read-write conflict on x3) and "second" is moved up to merge with "first". Correspondingly, V is set to 1 (validated). The resulting RC queue after T1's validation is shown in Figure 5(b), in which T1 is marked as validated (V=1) and T2 is marked as restarted and assumed to have the access invariance property (R=1 and V=1). A transaction has the access invariance property if it accesses the same set of data objects when it restarts after an abort due to data contention.
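Replaying T1's validation from this example through the sketch above gives the same outcome (for brevity, T2's original elements are already removed after its restart, and T3's validated element is the only intervening element):

```python
# T1's validation as in Figure 5(b), using the QElement/validate sketch above.
rc = [
    QElement(tid=0, reads={1, 3}),              # T0 read element
    QElement(tid=1, reads={1, 2, 3}),           # T1 read element ("first")
    QElement(tid=3, reads={1, 2}, writes={3}),  # T3, validated (intervening)
    QElement(tid=1, reads={4}),                 # T1 commit element ("second")
]
combined = validate(rc, tid=1)
assert combined is not None and combined.reads == {1, 2, 3, 4}  # T1 passes
```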

Note that we described an alternative technique, the RC queue, instead of timestamps, to record the conflict order of operations. It is unknown at present which technique works better in a single version system.

Figure 4 An example of the execution flow of ROCC
[Diagram: the receive queue, RC queue and execution queue for transactions T0-T3; each RC-queue element records Tid, its V/C/R bits and its ReadSet/WriteSet.]

Figure 5 RC queues after validating transactions

(a) Validating T2 (T2 is restarted because of validation failure):
Tid  VCR  ReadSet  WriteSet
T3   110  1,2      3
T0   000  1,3
T1   000  1,2,3
T2   101  1,3      3,4
T1   010  4

(b) Validating T1 (T1 passes validation and its elements are merged into one):
Tid  VCR  ReadSet  WriteSet
T0   000  1,3
T1   100  1,2,3,4
T3   110  1,2      3
T2   101  1,3      3,4


5. DISTRIBUTED INTERVENING VALIDATION

In a distributed environment, timeouts are usually used with 2PL to maintain serializability. With intervening validation, a transaction's global execution correctness may not be guaranteed even if all its sub-transactions pass local validation and all participants vote YES to the coordinator, so special care must be taken. In a fully distributed environment (i.e., a coordinator is also a participant), intervening elements should be piggybacked with the YES votes so that they can be used at the coordinator's site to compute the global intervening element of the transaction. After receiving the intervening elements from all participants, the coordinator validates whether the transaction can commit. If so, a commit decision is sent to all participants. After receiving the commit decision, participants commit the transaction by forming a combined element with a combined timestamp and making its writes/updates/deletes visible to other transactions.

The intervening validation requires an atomic commit operation, i.e., commits should be done one by one so that each transaction's intervening element can be computed correctly. A point worth noting is that a transaction's combined element should not be removed immediately; whether a combined element can be removed depends on whether its corresponding transaction is the oldest in the system.
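A minimal sketch of the coordinator-side decision follows, assuming each YES vote carries the participant's local read set, write set and intervening element as plain sets; the message format and field names are our own assumptions for illustration.

```python
# Coordinator-side global intervening validation (illustrative sketch).
def coordinator_decides(votes):
    """votes: one dict per participant, carrying its vote plus the local
    read set, write set and intervening element (as read/write sets)."""
    if not all(v["vote"] == "YES" for v in votes):
        return "ABORT"
    read_set  = set().union(*(v["read_set"] for v in votes))    # global read element
    write_set = set().union(*(v["write_set"] for v in votes))   # global commit element
    iv_reads  = set().union(*(v["iv_reads"] for v in votes))    # global intervening reads
    iv_writes = set().union(*(v["iv_writes"] for v in votes))   # global intervening writes
    upper = bool(iv_writes & read_set)                 # read element vs intervening
    lower = bool(write_set & (iv_reads | iv_writes))   # commit element vs intervening
    return "ABORT" if (upper and lower) else "COMMIT"
```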

6. RECOVERY

The intervening validation requires the calculation of intervening elements. We keep all element information in main memory for fast validation, so it becomes impossible to calculate intervening elements if this information is lost in a system crash. The simplest way to recover a multi-version database system is to remove all data versions installed by active transactions (i.e., abort all active transactions). We can then redo the committed transactions by making their writes/updates/deletes visible if this step had not been completed at the time of the crash and the log records show that the transactions had committed (Write-Ahead Log). In a single version system, we can recover in a similar manner, since we use the deferred update technique to avoid cascading aborts.

7. PERFORMANCE

We use simulation to assess the possible performance gain from intervening validation. Figure 6 shows the closed queuing network we used for modeling database systems. The model is an extended version of the model described in [9, 10], in which three components – the database system model, the user model and the transaction model – are considered fundamental. The database system model captures the relevant characteristics of the physical resources (CPUs and disks) and their associated schedulers, the characteristics of the database (e.g., its size and granularity), the load control mechanism for controlling the number of active transactions, and the concurrency control algorithm itself. We assume a multiprocessor system in which one CPU and two disks form one balanced resource unit ("balanced" means the utilizations of CPUs and disks are about equal, as opposed to strongly CPU-bound or strongly I/O-bound). We implemented six concurrency control methods (2PL, WDL (wait-depth-limited [5]), OCC (optimistic concurrency control), ROCC (read-commit ordering concurrency control [6]), SI [3] and MVROCC (multi-version ROCC)) in the simulator to attain a fairly complete performance comparison under different environment settings. ROCC uses intervening validation for SV-DBMSs; MVROCC uses intervening validation for MV-DBMSs. The user model captures the arrival process for users, assuming a closed system with terminals in our case. The transaction model captures the reference behavior and processing requirements of the transactions in the workload. Transactions originate from terminals and arrive with a Poisson distribution. A transaction can be thought of as being described by two characteristic strings: a logical reference string, which contains concurrency-control-level read and write requests, and a physical string, which contains requests for accesses to physical items on disk and the associated CPU processing time for each item accessed.

Figure 6 A Closed Queuing Model for the Simulation

[Diagram: terminals feed a ready queue; transactions move through the concurrency control (cc) queue, a blocked queue, an object queue and CPU/disk service centers, with decision points for buffer hit ("in buffer?"), update, think, restart and commit.]


Figure 7 shows an interface to the simulator we implemented. The fields on the form are adjustable parameters with which we can change environmental settings to assess the performance of different concurrency control mechanisms. db_size is the database size. Though larger sizes can be set, we usually set db_size to 1000 pages to yield a region of operation in which interesting performance effects can be observed without requiring impossibly long simulations. mpl is the multiprogramming level, the number of active transactions in the system; a transaction is considered active if it is either receiving service or waiting for service inside the database system. Transactions are modeled according to the number of objects they read and write: max_trans is the maximum number of objects a transaction can read or write, and min_trans is the minimum. We assume a uniform distribution between min_trans and max_trans, so the average number of objects a transaction accesses is (max_trans + min_trans)/2.

When a transaction originates from a terminal, if the number of active transactions in the system has reached the limit defined by mpl, it enters the ready queue (see Figure 6), where it waits for a currently active transaction to complete or abort. We assume a transaction may submit multiple requests. The average time interval between adjacent requests of a transaction is given by in_think_time, the intra-transaction thinking time; in_think_time = 0 indicates that the modeled transactions are batch-style, otherwise they are modeled as interactive transactions. ext_think_time is the average time interval between the completion of a transaction and the initiation of a new transaction from a terminal. When a transaction reads (or writes) an object, a CPU spends obj_cpu time processing the object and the data manager uses obj_io time to read (or write) the object from (to) disk if the object is not in memory. hit_ratio is the probability that a requested object is in memory. write_prob is the probability of an operation being a write, and update_write_ratio is the probability of a write being an update.
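For illustration, the parameters above can be gathered into a single configuration record. The class below is our own sketch (the names follow the text; the class itself, the unit conventions, and the single representative mpl value are assumptions), populated with the high-data-contention values of Table 1 below.

```python
# Simulator parameters gathered into one record (illustrative sketch only).

from dataclasses import dataclass

@dataclass
class SimConfig:
    db_size: int = 1000              # database size in pages
    mpl: int = 50                    # multiprogramming level (Table 1 sweeps 5..200)
    max_trans: int = 12              # max objects a transaction reads/writes
    min_trans: int = 4               # min objects a transaction reads/writes
    in_think_time: float = 1.0       # intra-transaction think time (0 = batch)
    ext_think_time: float = 1.0      # think time between transactions
    obj_cpu_ms: float = 150.0        # CPU time per object
    obj_io_read_ms: float = 10.0     # disk read time per object
    obj_io_write_ms: float = 15.0    # disk write time per object
    hit_ratio: float = 0.9           # probability a requested object is in memory
    write_prob: float = 0.25         # probability an operation is a write
    update_write_ratio: float = 0.9  # probability a write is an update

    @property
    def avg_objects_per_transaction(self) -> float:
        # uniform distribution between min_trans and max_trans
        return (self.max_trans + self.min_trans) / 2
```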


Figure 7 The simulator for concurrency control study

Table 1: An experiment setting for high data contention

db_size:             1000
mpl:                 5, 10, 20, 30, 50, 75, 100, 200
max_trans:           12
min_trans:           4
in_think_time:       1
ext_think_time:      1
obj_io:              read 10 ms, write 15 ms
obj_cpu:             150 ms
num_cpus:            5 CPUs
num_disks:           10 disks
hit_ratio:           90%
write_prob:          25%
update_write_ratio:  90%

Table 1 summarizes an experiment setting designed to compare the performance of the different concurrency control mechanisms. In general, three factors may constrain achievable performance: 1) resource contention, 2) data contention, and 3) resource underutilization. We focus on comparing performance under high data contention, because resource contention is less of a concern nowadays, given that CPUs, disks and memory are becoming much faster and cheaper. A high data contention environment can be simulated by setting a relatively high hit ratio, a small database size, long transactions, and long intra-/inter-transaction thinking times. A small database size helps reduce simulation time for the study of high data contention; otherwise, a large database requires a large mpl to attain the same level of data contention, and a large mpl dramatically increases the simulation time needed to obtain performance values within 5% statistical error at a 90% confidence interval [9, 10].

Figure 8 shows the throughput of 2PL, WDL [5], OCC and ROCC in a single version environment. Throughput is defined as the number of transactions successfully completed per second. Consistent with the results discussed in [9, 10], with sufficient resources the lock-based strict 2PL (S2PL) and WDL are significantly outperformed by ROCC and OCC. WDL has higher throughput than S2PL because, instead of waiting until a deadlock is detected, it restarts a transaction when there is a chance of forming a deadlock (wait depth greater than 1). Of the two OCC-based methods, ROCC performs better than OCC because it is more permissive: ROCC only


invalidates a transaction when there are two-sided conflicts, while OCC restarts a transaction whenever there is a single conflict.

Figure 8 The throughput of CC methods in SV systems

Figure 9 The throughput comparison of SI and MVROCC in a multi-version system

Figure 10 The throughput comparison of ROCC, SI and MVROCC

Figure 11 The response time comparison of ROCC, SI and MVROCC

Figure 9 shows the throughputs of SI and MVROCC in a multi-version environment. MVROCC is slightly better than SI (we are only interested in the values at the plateaus, since we can always achieve the best possible performance by adjusting the system's mpl). The major advantage MVROCC holds over SI is that MVROCC guarantees transactional execution correctness.

An interesting comparison appears in Figures 10 and 11. Figure 11 shows no response-time difference between ROCC in a single version system and SI and MVROCC in a multi-version system, while Figure 10 shows a moderate throughput difference. There is a chance that ROCC may beat SI and MVROCC if we consider other factors such as reference locality and memory consumption [1]. Our simulation showed that if the hit ratio is 0.9 for ROCC and 0.7 for SI or MVROCC, ROCC outperforms the other two. Of course, the simulation results may vary when the environment parameters change.

8. CONCLUSIONS AND FUTURE WORK

In this paper we discussed intervening validation and its applications to single version and multi-version systems. Compared to other concurrency control methods, intervening validation guarantees transaction execution correctness while providing good performance. In addition, our simulation results showed that ROCC provides performance comparable to SI, even though ROCC is for single version systems while SI is for multi-version systems. Since the buffer manager is not well modeled in our simulator, we see a chance that ROCC may outperform SI, as [1] pointed out that multi-version systems may not utilize access locality well.

While more and more companies are shifting to SI because of its successful implementation in Oracle, a potential problem is "snapshot too old" and the "dreadful" performance introduced by the rollback segments where old versions of data objects are stored. This paper presented two extreme application cases where intervening validation applies,


i.e., a case in which snapshot data are available and a case in which only one data version is available. A variant of intervening validation to suit the needs of an "available version" system is possible. Whereas in the "snapshot version" system at most one time interval [start-timestamp, commit-timestamp] is used for the execution validation of a transaction, more validation intervals should be expected in an "available version" system. It can be expected that, with the variant of intervening validation for "available version" systems, the number of data versions that must be stored can be adjusted to achieve optimal performance without causing "snapshot too old" errors. Further discussion of the variants of intervening validation for "available version" DBMSs can be found in [11].

In the paper we briefly described the simulator we implemented for assessing the performance gain from intervening validation algorithms. The performance discussion was kept concise because there are many performance studies of concurrency control in the literature; we merely meant to demonstrate the performance gain from the proposed intervening validations and to explain the underlying causes. Interested readers can find detailed performance discussions under various parameter settings in [8]. Our future work is to further evaluate the performance impact of the buffer manager on ROCC, SI and MVROCC, and to explore possible variants of intervening validation for "available version" DBMSs. We distinguish two types of multi-version systems by the terms "snapshot version" and "available version". In snapshot version systems, snapshot data are assumed to be available (otherwise a "snapshot too old" error occurs), and MVROCC can be used for transaction validation. In available version systems, snapshot data are not required; consequently the snapshot read rule and MVROCC do not apply.

9. ACKNOWLEDGMENTS

The simulator for the performance study was mainly written by Dongsheng Yu.

10. REFERENCES

[1] J. Gray and A. Reuter, "Transaction Processing: Concepts and Techniques", Morgan Kaufmann, 1993.

[2] D. Shasha and P. Bonnet, "Database Tuning: Principles, Experiments and Troubleshooting Techniques", ACM SIGMOD Tutorial, 2002, http://www.distlab.dk/dbtune/slides/.

[3] H. Berenson, P. Bernstein, J. Gray, E. O'Neil and P. O'Neil, "A Critique of ANSI SQL Isolation Levels", ACM SIGMOD, 1995, pp. 1-10.

[4] A. Fekete, E. O'Neil, P. O'Neil and D. Shasha, "Making Snapshot Isolation Serializable", in print, http://www.cs.umb.edu/~poneil/publist.html.

[5] P. Franaszek, J. Robinson and A. Thomasian, “Concurrency control for high performance environments”, ACM Transactions on Database Systems, Vol. 17, No.2, 1992, pp.304-345.

[6] Removed for double blind reviewing.

[7] A. Adya, B. Liskov and P. O'Neil, "Generalized Isolation Level Definitions", Proceedings of the IEEE International Conference on Data Engineering, 2000.

[8] Removed for double blind reviewing.

[9] R. Agrawal, M. Carey and M. Livny, "Concurrency Control Performance Modeling: Alternatives and Implications", ACM Transactions on Database Systems, Vol. 12, No. 4, 1987, pp. 609-654.

[10] V. Kumar, “Performance of Concurrency Control Mechanisms in Centralized Database Systems”, Prentice Hall, 1996.

[11] Removed for double blind reviewing.