Megastore - ID2220 Presentation

Paper presentation for Advanced Topics in Distributed Systems. The original paper can be found here: http://research.google.com/pubs/pub36971.html

Megastore

Providing Scalable, Highly Available Storage for Interactive Services

Paper by Jason Baker et al. Presented by Arinto Murdopo

arinto@kth.se

7/11/2012 1

Outline

• Motivation
• Megastore: Features, Scalability, Availability
• Putting them all together
• Observation
• Conclusions

Motivation

Conflicting requirements • RDBMS – easy to use, but does not scale • NoSQL – scales, but is not easy to use

Interactive online services • Need high availability and fast response times

Here comes Megastore

easy to use • ACID semantics

scalable • data partitioning

highly available • synchronous replication through modified Paxos

Easy to use - Features

cost-transparent APIs • No API for joins • Joins are implemented in application code

data model • schema, table (entity), property • entity clustering • indexes: local, global • Bigtable column name == Megastore table name and property name, e.g. User.name
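
The column-naming rule above can be illustrated with a minimal sketch. The helper names (bigtable_column, to_bigtable_row) and the sample values are hypothetical and are not part of any Megastore or Bigtable API; only the "Table.property" naming comes from the slide.

```python
def bigtable_column(table: str, prop: str) -> str:
    """Join a Megastore table name and a property name into one column name."""
    return f"{table}.{prop}"


def to_bigtable_row(table: str, entity: dict) -> dict:
    """Flatten one entity's properties into Bigtable-style column names."""
    return {bigtable_column(table, prop): value for prop, value in entity.items()}


# Hypothetical sample data: entities from two tables share one column namespace
# without clashing, because each column name carries its table name.
user_columns = to_bigtable_row("User", {"name": "John"})
photo_columns = to_bigtable_row("Photo", {"tag": "Dinner", "url": "http://example.com/1"})
```

Because the table name is part of the column name, entities from different Megastore tables can share a Bigtable row without their properties colliding.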

Easy to use - Features

transactions and concurrency control • Bigtable cell timestamps used for multiversion concurrency control (MVCC) • transaction lifecycle: read, application logic, commit, apply, clean up
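
The five lifecycle stages can be sketched as a toy, single-process model. This is only an illustration under stated assumptions: the class EntityGroup and the function run_transaction are hypothetical names, the "commit" here is a plain list append rather than a Paxos round, and a stale writer is simply rejected.

```python
class EntityGroup:
    """Toy, in-memory stand-in for one entity group (assumption, not Megastore code)."""

    def __init__(self):
        self.log = []      # write-ahead log of committed mutation dicts
        self.applied = 0   # number of log entries already applied to data
        self.data = {}     # applied entity data

    def read(self):
        """Read: return the last committed log position and a data snapshot."""
        self.apply()
        return len(self.log), dict(self.data)

    def commit(self, read_position, mutations):
        """Commit: append only if no other writer took the next log position."""
        if read_position != len(self.log):
            return False
        self.log.append(dict(mutations))
        return True

    def apply(self):
        """Apply: replay unapplied log entries onto the entity data."""
        for entry in self.log[self.applied:]:
            self.data.update(entry)
        self.applied = len(self.log)


def run_transaction(group, logic):
    position, snapshot = group.read()          # 1. read
    mutations = logic(snapshot)                # 2. application logic
    if not group.commit(position, mutations):  # 3. commit
        return False
    group.apply()                              # 4. apply
    # 5. clean up: nothing to discard in this in-memory toy
    return True
```

Two writers that read the same log position race for the next one; the first commit succeeds and the second must retry from a fresh read.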

others • backup system of transaction logs • encryption

Scalable

Scale the replication scheme

Data partitioning • Entity group concept

Data locality • Entity group locality • Bigtable instance locality

Entity Groups

An entity is like an instance of a table. An entity group is a group of entities, e.g.:

Email Application • Email account

Blog Application • User Profile • Blog post + metadata • Blog unique name
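
One way to picture the partitioning is to let the first component of each entity key name its entity group. The sketch below assumes that key layout; the key format and the helper names are hypothetical, chosen only to mirror the email example.

```python
from collections import defaultdict


def entity_group_of(key):
    """Assumption: the first key component (the root) identifies the entity group."""
    return key[0]


def partition(keys):
    """Group entity keys by their entity group."""
    groups = defaultdict(list)
    for key in keys:
        groups[entity_group_of(key)].append(key)
    return dict(groups)


def is_single_group(keys):
    """True if all keys belong to one entity group, i.e. one ACID transaction scope."""
    return len({entity_group_of(key) for key in keys}) == 1


# Hypothetical email data: each account is its own entity group.
keys = [
    ("account:alice", "message", 1),
    ("account:alice", "message", 2),
    ("account:bob", "message", 1),
]
groups = partition(keys)
```

An update touching only alice's messages stays inside one group, while an update touching both alice's and bob's data spans groups and needs separate handling.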

Highly Available

Replicate each entity group's write-ahead log mutations across replicas using a modified Paxos. But first, let’s revisit the original Paxos…

Modified Paxos – Fast Reads

Read in original Paxos

Modified Paxos – Fast Reads

Contact the coordinator and read locally if the local replica is up to date

Modified Paxos – Fast Writes

Skip the “prepare” stage for subsequent writes through the same leader, provided no other writer has written in between

Modified Paxos – New Replica Types

Full replicas • all the replicas we have seen until now

Witness replicas • are able to vote • store write-ahead logs but do not apply them • do not store entity data

Read-only replicas • are not able to vote • hold snapshots of entity data
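
The effect of the replica types on voting can be sketched as follows. This is a simplified model, assuming a plain majority over voting replicas; ReplicaType and the helper functions are hypothetical names, and only the two properties listed on the slide (voting, storing entity data) are modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaType:
    votes: bool               # takes part in Paxos voting
    stores_entity_data: bool  # can serve entity data to readers


FULL = ReplicaType(votes=True, stores_entity_data=True)
WITNESS = ReplicaType(votes=True, stores_entity_data=False)
READ_ONLY = ReplicaType(votes=False, stores_entity_data=True)


def quorum_size(replicas):
    """Majority of the voting replicas (full and witness)."""
    voters = sum(1 for kind in replicas.values() if kind.votes)
    return voters // 2 + 1


def has_quorum(replicas, acks):
    """Count only acknowledgements from replicas that are allowed to vote."""
    voting_acks = sum(1 for name in acks if replicas[name].votes)
    return voting_acks >= quorum_size(replicas)


def can_serve_reads(replicas, name):
    return replicas[name].stores_entity_data


# Hypothetical deployment: two full replicas, one witness, one read-only replica.
replicas = {"a": FULL, "b": FULL, "w": WITNESS, "r": READ_ONLY}
```

In this model a witness helps reach a majority without holding data, and a read-only replica serves data without affecting the quorum.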

Putting them all together

Megastore Architecture

Reads

Query Local

Find Position

Catchup

Validate

Query Data
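
The five read steps can be walked through with a toy model. This is a sketch under several assumptions: one entity group, one log whose positions are contiguous, a coordinator reduced to a single up-to-date flag, and a reader that always serves from the local replica. Replica, Coordinator and current_read are hypothetical names.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.log = {}      # log position -> committed mutation dict
        self.applied = 0   # highest log position applied to data
        self.data = {}

    def highest_position(self):
        return max(self.log, default=0)

    def apply_through(self, position):
        for pos in range(self.applied + 1, position + 1):
            self.data.update(self.log[pos])
            self.applied = pos


class Coordinator:
    """Tracks whether the local replica has seen every write (one entity group)."""

    def __init__(self, up_to_date=True):
        self.up_to_date = up_to_date


def current_read(local, coordinator, others, key):
    if coordinator.up_to_date:
        # 1. Query Local: the coordinator says the local replica is current.
        position = local.highest_position()
    else:
        # 2. Find Position: simplified here to asking every replica.
        position = max(r.highest_position() for r in [local, *others])
        # 3. Catchup: copy missing log entries from a replica that has them.
        for pos in range(1, position + 1):
            if pos not in local.log:
                donor = next(r for r in others if pos in r.log)
                local.log[pos] = donor.log[pos]
        # 4. Validate: tell the coordinator the local replica is current again.
        coordinator.up_to_date = True
    local.apply_through(position)
    # 5. Query Data: serve the read from the local replica.
    return local.data.get(key)
```

When the coordinator reports the local replica as current, steps 2 to 4 are skipped, which is the fast local read described earlier.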

Writes

Accept Leader

Prepare

Accept

Invalidate

Apply
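
The write steps can be sketched for a single log position. This is an illustrative toy, not the paper's implementation: Acceptor and write are hypothetical names, message passing is replaced by method calls, retries and back-off are omitted, and a coordinator is invalidated by setting a flag directly.

```python
class Acceptor:
    def __init__(self, name):
        self.name = name
        self.reachable = True
        self.promised = -1             # highest proposal number seen
        self.accepted = None           # (proposal number, value) or None
        self.coordinator_valid = True
        self.applied = None

    def accept_leader(self, value):
        """Leader fast path: the first writer gets proposal number 0."""
        if not self.reachable or self.promised != -1:
            return False
        self.promised, self.accepted = 0, (0, value)
        return True

    def prepare(self, n):
        if not self.reachable or n <= self.promised:
            return False, None
        self.promised = n
        return True, self.accepted

    def accept(self, n, value):
        if not self.reachable or n < self.promised:
            return False
        self.promised, self.accepted = n, (n, value)
        return True


def write(leader, replicas, value, proposal=1):
    majority = len(replicas) // 2 + 1
    acked = set()
    if leader.accept_leader(value):        # 1. Accept Leader
        n = 0
        acked.add(leader.name)
    else:                                  # 2. Prepare (full Paxos fallback)
        n = proposal
        promises = [acc for ok, acc in (r.prepare(n) for r in replicas) if ok]
        if len(promises) < majority:
            return None
        prior = [acc for acc in promises if acc is not None]
        if prior:                          # must adopt an already accepted value
            value = max(prior, key=lambda acc: acc[0])[1]
    for r in replicas:                     # 3. Accept
        if r.name not in acked and r.accept(n, value):
            acked.add(r.name)
    if len(acked) < majority:
        return None
    for r in replicas:                     # 4. Invalidate
        if r.name not in acked:
            r.coordinator_valid = False
    for r in replicas:                     # 5. Apply
        if r.name in acked:
            r.applied = value
    return value
```

The return value is the value chosen for the log position, so a writer that loses the race sees the winner's value instead of its own and has to retry at a later position.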

Observation - Availability

Observation – Latency

Conclusion

Megastore and its motivation

Features of Megastore • It has ACID semantics • But entity groups need to be defined • Inter-group updates need to be handled

Scalability and Availability

More experiments are needed