Upload
konrad-malawski
View
735
Download
4
Tags:
Embed Size (px)
DESCRIPTION
Short lightning talk about the HBase plugin for Akka Persistence and how it's how key design was specifically tuned for increasing numeric sequential idenfitiers, so that the cluster can be utilised properly. https://github.com/ktoso/akka-persistence-hbase
Citation preview
RowKey design in HBaseAkka Persistence Plugin
Konrad 'ktoso' Malawski GeeCON 2014 @ Kraków, PL
Konrad `@ktosopl` Malawski
Konrad `@ktosopl` Malawski
hAkker @
Konrad `@ktosopl` Malawski
typesafe.com geecon.org
Java.pl / KrakowScala.pl sckrk.com / meetup.com/Paper-Cup @ London
GDGKrakow.pl meetup.com/Lambda-Lounge-Krakow
hAkker @
PersistentActor
Akka Persistence ScalaDays 2014
PersistentActor
Compared to “Good ol’ CRUD Model”
state
“Mutable Record”
state =
apply(es)
“Series of Events”
super quick domain modelling!
sealed trait Command!case class GiveMe(geeCoins: Int) extends Command!case class TakeMy(geeCoins: Int) extends Command
Commands - what others “tell” us; not persisted
case class Wallet(geeCoins: Int) {! def updated(diff: Int) = State(geeCoins + diff)!}
State - reflection of a series of events
sealed trait Event!case class BalanceChangedBy(geeCoins: Int) extends Event!
Events - reflect effects, past tense; persisted
var state = S0 !
persistenceId = “a” !
PersistentActor
Command
!
!
Journal
PersistentActor
var state = S0 !
persistenceId = “a” !
!
!
Journal
Generate Events
PersistentActor
var state = S0 !
persistenceId = “a” !
!
!
Journal
Generate Events
E1
PersistentActor
ACK “persisted”
!
!
Journal
E1
var state = S0 !
persistenceId = “a” !
PersistentActor
“Apply” event
!
!
Journal
E1
var state = S0 !
persistenceId = “a” !
E1
PersistentActor
!
!
Journal
E1
var state = S0 !
persistenceId = “a” !
E1
Okey!
PersistentActor
!
!
Journal
E1
var state = S0 !
persistenceId = “a” !
E1
Okey!
PersistentActor
!
!
Journal
E1
var state = S0 !
persistenceId = “a” !
E1
Ok, he got my $.
PersistentActor
class BitCoinWallet extends PersistentActor {!! var state = Wallet(coins = 0)!! def updateState(e: Event): State = {! case BalanceChangedBy(coins) => state.updatedWith(coins)! }! ! // API:!! def receiveCommand = ??? // TODO!! def receiveRecover = ??? // TODO!!}!
persist(e) { e => }
PersistentActor
def receiveCommand = {!! case TakeMy(coins) =>! persist(BalanceChangedBy(coins)) { changed =>! state = updateState(changed) ! }!!!!!!!}
async callback
persist(){} - Ordering guarantees
!
!
Journal
E1
var state = S0 !
persistenceId = “a” !
C1C2
C3
!
!
Journal
E1
var state = S0 !
persistenceId = “a” !
C1C2
C3
Commands get “stashed” until processing C1’s events are acted upon.
persist(){} - Ordering guarantees
!
!
Journal
var state = S0 !
persistenceId = “a” !
C1C2
C3 E1
E2
E2E1
events get applied in-order
persist(){} - Ordering guarantees
C2
!
!
Journal
var state = S0 !
persistenceId = “a” !
C3 E1 E2
E2E1
and the cycle repeats
persist(){} - Ordering guarantees
Recovery
Akka Persistence ScalaDays
Eventsourced, recovery
/** MUST NOT SIDE-EFFECT! */!def receiveRecover = {! case replayedEvent: Event => ! state = updateState(replayedEvent)!}
re-using updateState, as seen in receiveCommand
Akka Persistence ScalaDays
Snapshots
Snapshots (in SnapshotStore)
Eventsourced, snapshots
def receiveCommand = {! case command: Command =>! saveSnapshot(state) // async!!}
/** MUST NOT SIDE-EFFECT! */!def receiveRecover = {! case SnapshotOffer(meta, snapshot: State) => ! this.state = state!! case replayedEvent: Event => ! updateState(replayedEvent)!}
snapshot!? how?
…sum of states…
Snapshots
!
!
Journal
E1 E2 E3 E4
E5 E6 E7 E8
state until [E8]
Snapshots
S8!
!
Journal
E1 E2 E3 E4
E5 E6 E7 E8
!
!
Snapshot Store
snapshot!
state until [E8]
Snapshots
S8!
!
Journal
E1 E2 E3 E4
E5 E6 E7 E8
crash!
!
!
Snapshot Store
snapshot!
S8
Snapshots
!
!
Journal
E1 E2 E3 E4
E5 E6 E7 E8
crash!
!
!
Snapshot Store
S8
“bring me up-to-date!”
Snapshots
!
!
Journal
E1 E2 E3 E4
E5 E6 E7 E8
restart!replay!
!
!
Snapshot Store
S8
“bring me up-to-date!”
Snapshotsrestart!
replay!
S8!
!
Journal
E1 E2 E3 E4
E5 E6 E7 E8
!
!
Snapshot Store
S8
state until [E8]
Snapshotsrestart!
replay!
S8!
!
Journal
E1 E2 E3 E4
E5 E6 E7 E8
!
!
Snapshot Store
S8
state until [E8]
Snapshots
S8!
!
Journal
E1 E2 E3 E4
E5 E6 E7 E8
We could delete these!
!
!
Snapshot Store
S8
trait MySummer extends Processor {! var nums: List[Int]! var total: Int! ! def receive = {! case "snap" => saveSnapshot(total)! case SaveSnapshotSuccess(metadata) => // ...! case SaveSnapshotFailure(metadata, reason) => // ...! }!}!
Snapshots, save
Async!
Snapshot Recovery
class Counter extends Processor {! var total = 0! ! def receive = {! case SnapshotOffer(metadata, snap: Int) => ! total = snap!! case Persistent(payload, sequenceNr) => // ...! }!}
Persistence Plugins
Akka Persistence TCK
class JournalTCKSpec extends JournalSpec {! lazy val config = ConfigFactory.parseString(“").! withFallback(ConfigFactory.load())!}!!!!!class SnapshotStoreTCKSpec extends SnapshotStoreSpec {! lazy val config = ConfigFactory.parseString(“").!! ! ! ! ! ! ! ! ! ! withFallback(ConfigFactory.load())! }!
Journal Plugin API!def asyncWriteMessages(! messages: immutable.Seq[PersistentRepr]!): Future[Unit]!!def asyncDeleteMessagesTo(! persistenceId: String, ! toSequenceNr: Long, ! permanent: Boolean!): Future[Unit]!!@deprecated("writeConfirmations will be removed.", since = "2.3.4")!def asyncWriteConfirmations(! confirmations: immutable.Seq[PersistentConfirmation]!): Future[Unit]!!@deprecated("asyncDeleteMessages will be removed.", since = "2.3.4")!def asyncDeleteMessages(! messageIds: immutable.Seq[PersistentId], ! permanent: Boolean!): Future[Unit]!
Optimising writes
Hbase table data layout
a-0001,a-0002,a-0003 b-0001,b-0002,b-0003
Hbase table data layout
a-0001,a-0002,a-0003 b-0001,b-0002,b-0003
HBase regions
Hbase table data layout
a-0001,a-0002,a-0003 b-0001,b-0002,b-0003
HBase regions
(lexicographical order)
Optimising writes
a-0001,a-0002,a-0003 b-0001,b-0002,b-0003
id = “a”id = “b”
Optimising writes
a-0001,a-0002,a-0003 b-0001,b-0002,b-0003
id = “a”id = “b”
Optimising writes
a-0001,a-0002,a-0003 b-0001,b-0002,b-0003
id = “a”id = “b”
HOT region
Optimising writes
a-0001,a-0002,a-0003 b-0001,b-0002,b-0003
id = “a”id = “b”
HOT region
Optimising writes
a-0001,a-0002,a-0003 b-0001,b-0002,b-0003
id = “a”id = “b”
impossible to spread load!
Optimising writes
hot-spotting
Optimising writes
hot-spotting
utilise entire cluster
Optimising writes
001-a-0001, 001-b-0001, 001-a-0051
id = “a”
002-a-0002, 002-b-0002, 002-a-00052
003-a-0003, 003-b-0003, 003-a-0053
049-a-0049, 049-b-0049, 049-a-0099
…
“partition” seeds
[HMaster]
Optimising writes
001-a-0001, 001-b-0001, 001-a-0051
id = “a”
002-a-0002, 002-b-0002, 002-a-00052
003-a-0003, 003-b-0003, 003-a-0053
049-a-0049, 049-b-0049, 049-a-0099
…
“partition” seeds
[HMaster]
Optimising writes
001-a-0001, 001-a-0051
id = “a”
002-a-0002, 002-a-00052
003-a-0003, 003-a-0053
049-a-0049, 049-a-0099…
[HMaster]
Write load spread to cluster!
Optimising writes
001-a-0001, 001-a-0051
id = “a”
002-a-0002, 002-a-00052
003-a-0003, 003-a-0053
049-a-0049, 049-a-0099…
Writes to different regions! [HMaster]
Optimising writes
Optimising reads
a-0001,a-0002,a-0003 b-0001,b-0002,b-0003
id = “a”
replay!
Optimising reads
a-0001,a-0002,a-0003 b-0001,b-0002,b-0003
id = “a”
replay!
Ordered, batch read, super efficient!!!
Optimising writes
001-a-0001, 001-a-0051
id = “a”
002-a-0002, 002-a-00052
003-a-0003, 003-a-0053
049-a-0049, 049-a-0099…
replay!
Optimising writes
001-a-0001, 001-a-0051
id = “a”
002-a-0002, 002-a-00052
003-a-0003, 003-a-0053
049-a-0049, 049-a-0099…
replay!
Optimising writes
001-a-0001, 001-a-0051
id = “a”
002-a-0002, 002-a-00052
003-a-0003, 003-a-0053
049-a-0049, 049-a-0099…
replay!
Optimising writes
001-a-0001, 001-a-0051
id = “a”
002-a-0002, 002-a-00052
003-a-0003, 003-a-0053
049-a-0049, 049-a-0099…
replay!
Optimising writes
001-a-0001, 001-a-0051
id = “a”
002-a-0002, 002-a-00052
003-a-0003, 003-a-0053
049-a-0049, 049-a-0099…
replay!
Optimising writes
001-a-0001, 001-a-0051
id = “a”
002-a-0002, 002-a-00052
003-a-0003, 003-a-0053
049-a-0049, 049-a-0099…
replay!
Optimising writes
001-a-0001, 001-a-0051
id = “a”
002-a-0002, 002-a-00052
003-a-0003, 003-a-0053
049-a-0049, 049-a-0099…
this is madness!
replay!
Optimising reads
Optimising recovery
001-a-0001, 001-a-0051
id = “a”
002-a-0002, 002-a-00052
003-a-0003, 003-a-0053
049-a-0049, 049-a-0099…
replay!
Optimising recovery
001-a-0001, 001-a-0051
id = “a”
002-a-0002, 002-a-00052
003-a-0003, 003-a-0053
049-a-0049, 049-a-0099…
async!
replay!+ re-sequence
Optimising recovery
001-a-0001, 001-a-0051
id = “a”
002-a-0002, 002-a-00052
003-a-0003, 003-a-0053
049-a-0049, 049-a-0099…
async!
small batches
replay!+ re-sequence
Optimising recovery
001-a-0001, 001-a-0051
id = “a”
002-a-0002, 002-a-00052
003-a-0003, 003-a-0053
049-a-0049, 049-a-0099…
replay!(to seqNr = 2)
Optimising recovery
001-a-0001, 001-a-0051
id = “a”
002-a-0002, 002-a-00052
003-a-0003, 003-a-0053
049-a-0049, 049-a-0099…
replay!(to seqNr = 2)
Optimising recovery
001-a-0001, 001-a-0051
id = “a”
002-a-0002, 002-a-00052
003-a-0003, 003-a-0053
049-a-0049, 049-a-0099…
for short recovery = no need to check all servers!
replay!(to seqNr = 2)
Akka Persistence ScalaDays 2014
Akka Persistence Plugins• Journals / Snapshot Stores (http://akka.io/community/) • Cassandra • HBase • Kafka • DynamoDB • MongoDB • shared LevelDB journal for testing
Akka Persistence ScalaDays 2014
Links• http://akka.io • https://groups.google.com/forum/#!forum/akka-user !
• https://github.com/ktoso/akka-persistence-hbase !
• http://www.slideshare.net/alexbaranau/intro-to-hbase-internals-schema-design-for-hbase-users
• http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
• https://github.com/OpenTSDB/asynchbase
©Typesafe 2014 – All Rights Reserved