simon-ouellette
Database Consistency Models
ACID
● Atomicity: each transaction is "all or nothing" (Commit or rollback)
● Consistency: any transaction will bring the database from one valid state to another (Preserves relational integrity)
● Isolation: concurrent execution of transactions results in a system state that would be obtained if transactions were executed serially
● Durability: persistence to disk (rebooting doesn't cause data loss, for example)
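A minimal sketch of atomicity and consistency, using Python's built-in sqlite3 module. The `accounts` table, the `transfer` helper and the `balances` helper are hypothetical names chosen for illustration. The CHECK constraint plays the role of the integrity rule, and the failed transfer leaves no partial update behind.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single all-or-nothing transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            # Credit first, so that a failed debit has something to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit
        # applied above has been rolled back together with the debit.
        return False

def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()
```

Calling `transfer(conn, 'alice', 'bob', 500)` returns False and both balances stay as they were, whereas a transfer of 30 goes through in full.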
Examples
● Traditional relational databases:
● Oracle
● SQL Server
● MySQL
● Etc.
● Some NewSQL databases:
● VoltDB
● AltiBase
Deficiencies of ACID
● Difficult to maintain high availability & fault tolerance in distributed scenarios
● CAP Theorem
● Huge performance overhead in distributed synchronization
● Huge performance overhead to maintain integrity
CAP Theorem (Brewer's conjecture)
● In plain English:
"...during a network partition, a distributed system must choose either Consistency or Availability." -- foundationdb.com
● Assume that you want strong consistency.
● This implies synchronous, blocking updates.
● Assume you also want availability
● This implies multiple nodes with redundancies.
● When you update one node, you need to broadcast the update synchronously to all other nodes and wait for successful confirmations (very slow!)
● So far so good... But now a node failed to connect to the others (network failure)!
● If you don't wait for it to come back, you've sacrificed consistency. If you block on it, you've sacrificed availability.
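The trade-off described above can be sketched as a toy in-memory model. The `Replica` and `Cluster` classes and the "CP"/"AP" mode names are assumptions made for illustration only; no real database is being modelled.

```python
class Replica:
    """One node holding a copy of a single value."""
    def __init__(self):
        self.value = None
        self.reachable = True  # False simulates a network partition


class Cluster:
    """Toy cluster that replicates every write synchronously."""
    def __init__(self, size, mode):
        self.replicas = [Replica() for _ in range(size)]
        self.mode = mode  # "CP": refuse writes on partition, "AP": accept them

    def write(self, value):
        reachable = [r for r in self.replicas if r.reachable]
        if self.mode == "CP" and len(reachable) < len(self.replicas):
            # Consistency is kept by giving up availability.
            raise RuntimeError("unavailable: a replica cannot be reached")
        for replica in reachable:
            # In "AP" mode the unreachable replica silently falls behind.
            replica.value = value

    def read(self, index):
        return self.replicas[index].value


def demo(mode):
    """Write once normally, partition node 2, then try a second write."""
    cluster = Cluster(3, mode)
    cluster.write("v1")
    cluster.replicas[2].reachable = False
    try:
        cluster.write("v2")
        accepted = True
    except RuntimeError:
        accepted = False
    return accepted, [cluster.read(i) for i in range(3)]
```

In this sketch, `demo("CP")` rejects the second write and every replica still holds "v1", while `demo("AP")` accepts it and leaves the partitioned replica holding the stale "v1".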
BASE
● Basically available: there will be a response to any request, but that response could still be ‘failure’ to obtain the requested data or the data may be in an inconsistent or changing state.
● Soft state: even during times without input there may be changes going on due to ‘eventual consistency,’ thus the state of the system is always ‘soft.’
● Eventually consistent: "the storage system guarantees that if no new updates are made to the object, eventually all accesses will return the last updated value." -- Werner Vogels, CTO of Amazon.com
Safety versus Liveness
● Liveness: a value distributed across systems eventually converges to be the same across those same systems (generally the last update value).
● "Something good eventually happens"
● Safety: the system is at all times consistent.
● "Nothing bad ever happens"
● Eventual consistency is purely a liveness guarantee (reads eventually return the same value) and does not make safety guarantees: an eventually consistent system can return any value before it converges.
● To be clear: in eventual consistency, by default, two concurrent read-then-write increments of a plain counter can end up increasing it by only 1 instead of 2.
● The last write wins, but there is no guarantee with regard to what happened in between (both writers may have read the value before it was consistent).
● This is what happens when you don't have any safety guarantee, as in eventual consistency.
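The lost update can be reproduced in a few lines. This is a sketch under stated assumptions: each replica stores a `(timestamp, value)` pair and replicas converge with a last-write-wins rule. The helper name `lww_merge` and the timestamps are made up for the example.

```python
def lww_merge(a, b):
    """Last-write-wins: keep the (timestamp, value) pair with the newer timestamp."""
    return max(a, b)  # tuples compare on the timestamp first


# Both replicas start with the counter at 0, written at time 0.
replica_a = (0, 0)
replica_b = (0, 0)

# Two clients increment concurrently, each reading from a different replica
# before the other client's write has been propagated.
read_a = replica_a[1]
read_b = replica_b[1]
replica_a = (1, read_a + 1)  # client A writes 1 at time 1
replica_b = (2, read_b + 1)  # client B writes 1 at time 2

# The replicas eventually converge, but one increment has been lost.
converged = lww_merge(replica_a, replica_b)
counter_value = converged[1]
```

Both replicas agree in the end (liveness), yet `counter_value` is 1 even though two increments were issued (no safety).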
Examples
● Most big social media websites
● Google Cloud Datastore
● Most NoSQL databases:
● Riak, Redis, Hadoop (without HBase), Couchbase, MongoDB (in some configurations), Cassandra (in some configurations)
● Etc.
● Amazon's DynamoDB
● DNS (Domain Name System)
Deficiencies of BASE
● Delay in convergence
● No safety guarantee
● You don't have the same update semantics as in ACID transactions
Solutions to BASE's Problems
● Application developers can write compensation logic
● Okay in small, simple applications
● Quickly becomes unmanageable in complex applications
● ACID 2.0 design principles that guarantee ACID-like consistency even with an eventual consistency mechanism.
Mutable shared states are the root of all evil.
ACID 2.0
● Associativity & Commutativity: the messages in the queue can be processed in any order.
● Idempotence: the message queue can use at-least-once-delivery guarantees (retry logic). Duplicate processing of the same message doesn't matter.
● Distributed: refers to the fact that ACID 2.0 applies to distributed systems.
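A small sketch of what these properties buy you. It assumes each message carries a unique id (the `(msg_id, amount)` shape and the function names are hypothetical). Because applying a message only records it under its id, reordering and redelivery cannot change the final state.

```python
import random


def apply_message(state, message):
    """Return a new state with the message recorded under its id.

    Recording by id makes the operation idempotent (a duplicate overwrites
    itself) and order-insensitive (distinct ids never interfere).
    """
    msg_id, amount = message
    new_state = dict(state)  # never mutate the shared state in place
    new_state[msg_id] = amount
    return new_state


def process(messages):
    """Fold a stream of messages into a state, in whatever order they arrive."""
    state = {}
    for message in messages:
        state = apply_message(state, message)
    return state


def total(state):
    return sum(state.values())


deposits = [("m1", 10), ("m2", 25), ("m3", 5)]

# Simulate an at-least-once queue: duplicates plus arbitrary ordering.
redelivered = deposits + [("m2", 25), ("m1", 10)]
random.Random(0).shuffle(redelivered)

in_order = process(deposits)
out_of_order = process(redelivered)
```

Both `in_order` and `out_of_order` end up as the same state with a total of 40, which is why retry logic in the queue is safe here.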
What does it mean?
● Unlike ACID and BASE, ACID 2.0 doesn't tell you what the guarantees are; instead, it tells you that certain design principles are immune to transactional integrity issues.
● In particular, immutable data structures that you transform are easier to handle than mutable shared states (as most functional programming languages have understood)
The CALM Theorem
● Consistency as Logical Monotonicity
● Logically monotonic: intuitively, a monotonic program (or data structure) makes forward progress over time: it never "retracts" an earlier conclusion in the face of new information.
● Implementation is usually through a class of data structures referred to as CRDTs (conflict-free replicated data types)
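One of the simplest monotonic structures is a grow-only set. The sketch below is illustrative only (the class name `GSet` and its methods are assumptions, not a library API). Elements can be added but never removed, so a conclusion such as "x is in the set" is never retracted, and merging replicas is a plain set union.

```python
class GSet:
    """Grow-only set: a minimal state-based CRDT."""

    def __init__(self, items=()):
        self.items = frozenset(items)

    def add(self, item):
        """Return a new set that also contains `item` (there is no remove)."""
        return GSet(self.items | {item})

    def merge(self, other):
        """Union is commutative, associative and idempotent."""
        return GSet(self.items | other.items)

    def __contains__(self, item):
        return item in self.items

    def __eq__(self, other):
        return isinstance(other, GSet) and self.items == other.items


# Two replicas receive different updates while disconnected.
replica_a = GSet().add("x").add("y")
replica_b = GSet().add("z")

merged_ab = replica_a.merge(replica_b)
merged_ba = replica_b.merge(replica_a)
```

Whatever order the replicas exchange state in, they reach the same contents, and anything that was ever observed in one replica is still present after the merge.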
Example: the PN-Counter
● Counts the number of increment and decrement calls per transaction (or "actor", or "node")
● When the value is read, it's calculated on the fly by summing up the increment "marks" of all nodes and subtracting the sum of the decrement "marks"
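A minimal state-based PN-Counter sketch along those lines. The class and method names are chosen for illustration, and merging by per-node maximum is the usual textbook formulation rather than something stated on the slide.

```python
class PNCounter:
    """Each node only ever touches its own increment and decrement tallies."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.increments = {}  # node id -> number of increments seen
        self.decrements = {}  # node id -> number of decrements seen

    def increment(self, amount=1):
        self.increments[self.node_id] = self.increments.get(self.node_id, 0) + amount

    def decrement(self, amount=1):
        self.decrements[self.node_id] = self.decrements.get(self.node_id, 0) + amount

    def value(self):
        # Computed on the fly: total increments minus total decrements.
        return sum(self.increments.values()) - sum(self.decrements.values())

    def merge(self, other):
        """Take the per-node maximum; tallies only grow, so the max is the latest."""
        for node, count in other.increments.items():
            self.increments[node] = max(self.increments.get(node, 0), count)
        for node, count in other.decrements.items():
            self.decrements[node] = max(self.decrements.get(node, 0), count)


# Two nodes update their local copies concurrently.
a = PNCounter("a")
b = PNCounter("b")
a.increment()
a.increment()
b.increment()
b.decrement()

# Exchange state in both directions; merging twice changes nothing.
a.merge(b)
b.merge(a)
a.merge(b)
```

After the exchange both nodes report 2 (three increments minus one decrement), so no concurrent update is lost, unlike with a last-write-wins counter.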
Example: Bitcoin
● The bitcoin transaction ledger is a CRDT. It's an append only structure.
● The ledger contains the history of all transactions ever made: and it's a replicated dataset, updated by appending new transactions in a peer-to-peer "eventual consistency" framework.
Example: Apache Spark RDDs
● Spark is a high-performance distributed computing framework
● Big Data analytics
● Machine learning (MLlib)
● Distributed graph processing (GraphX)
● Spark SQL
● It is commonly used as a replacement for Hadoop MapReduce (reported to be about 30 to 100 times faster, mainly for in-memory workloads)
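● The essence of the Spark framework is a type of data structure called a Resilient Distributed Dataset (RDD). It is not a CRDT in the strict sense, but it relies on the same idea of immutable data that is transformed rather than mutated.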
● RDD features:
● Immutable
● Distributed / Replicated
● Expose map(), filter(), reduce(), join() operations to produce new derived RDDs (very "functional" rather than object-oriented – written in Scala)
● Logs "lineage" information (how the RDD was constructed) across partitions, rather than the data itself, for efficiency. If a network fault occurs, it can reconstruct the data through that lineage. This way the cost of data replication isn't generally incurred (only in fault recovery scenarios).
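The lineage idea can be illustrated with a toy, single-partition model in plain Python. This is not Spark's API: `ToyRDD`, `lose_data` and the other names are invented for the sketch, and it assumes the base dataset can always be re-read from durable storage.

```python
class ToyRDD:
    """Immutable dataset that remembers how it was derived, not just its data."""

    def __init__(self, source=None, parent=None, transform=None):
        self._source = source        # set only on the base dataset
        self._parent = parent        # lineage: the dataset this one derives from
        self._transform = transform  # lineage: how to derive it from the parent
        self._materialized = None    # computed rows, may be lost at any time

    def map(self, fn):
        return ToyRDD(parent=self, transform=lambda rows: [fn(r) for r in rows])

    def filter(self, predicate):
        return ToyRDD(parent=self, transform=lambda rows: [r for r in rows if predicate(r)])

    def collect(self):
        """Return the rows, recomputing them from the lineage if necessary."""
        if self._materialized is None:
            if self._parent is None:
                self._materialized = list(self._source)
            else:
                self._materialized = self._transform(self._parent.collect())
        return list(self._materialized)

    def lose_data(self):
        """Simulate a fault that wipes the computed rows on this node."""
        self._materialized = None


numbers = ToyRDD(source=range(1, 7))
even_squares = numbers.filter(lambda n: n % 2 == 0).map(lambda n: n * n)

before_fault = even_squares.collect()
even_squares.lose_data()             # the derived rows are gone...
after_fault = even_squares.collect() # ...and are rebuilt from the lineage
```

Nothing was copied to a second node ahead of time. The derived rows are simply recomputed by replaying the recorded `filter` and `map` steps, which is the trade-off the bullet above describes.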
Other examples
● Apache Kafka message queue
● Riak vector clocks for synchronization
● The game League of Legends uses Riak CRDTs for its in-game chat system
● TreeDoc and Logoot: for collaborative text editing
● SoundCloud uses a CRDT set for streaming, implemented on top of Redis
Deficiencies of CRDTs
● Not a universal solution: doesn't cover all possible applications
● Garbage collection issues (append-only means it consumes increasing amounts of space!)
● Complex to design
Some solutions
● Bloom programming language
● Provides a "framework" for developing in a commutative, order-insensitive way that favors CRDT-style data structures.
● Existing distributed computing platforms do the complicated work for us (Apache Spark, for example)
● We still need to accept locking ACID or weakly consistent BASE for some parts of the system. We can also resort to better "compromises" such as causal consistency.