How to write your database: the story about Event Store


DESCRIPTION

This story is about a distributed open-source database called Event Store (http://geteventstore.com). Event Store is developed by a distributed team, some of whom are ELEKS employees. I am going to talk about the purpose of Event Store and how it works, share some lessons we learned during development, and describe how it feels to develop a distributed high-performance system of that complexity. The talk will be interesting for technical people: software architects and engineers in general, and .NET developers in particular, as Event Store is written in C#.


HOW TO WRITE YOUR DATABASE

The story about Event Store

By Victor Haydin, Head of R&D, ELEKS

Agenda

1. Problem definition
2. Event Store: introduction
3. Under the hood
4. Lessons learned
5. Q&A

A few notes before we start

1. This story is about the development, technology and lessons learned, not a product
2. Nor is it about Event Sourcing
3. Outsourcing point of view
4. I was there during the pre-sales and initiation phases. When the team was formed I left the project.

Survey

How many of you can understand code?

Event Sourcing?

Event Store?

SEDA?

A 140-character definition:

Event sourcing - persisting entities by appending all business events to a transaction log. To rebuild the state, we replay this log.

Example: BankAccount

GotSalary: +100
GotCashFromAtm: -10
MadeInternetPayment: -15
MadeTerminalPayment: -5
PutCashToAccount: +20

Account balance: 100-10-15-5+20=90
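The bank account example can be sketched in a few lines of Python. This is only an illustration of the replay idea, not Event Store code: the `Event` class and the `replay` function are names made up for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One immutable business event (hypothetical shape for this sketch)."""
    name: str
    amount: int  # signed change to the account balance


# The append-only log for one account; events are never updated or deleted.
log = [
    Event("GotSalary", +100),
    Event("GotCashFromAtm", -10),
    Event("MadeInternetPayment", -15),
    Event("MadeTerminalPayment", -5),
    Event("PutCashToAccount", +20),
]


def replay(events, initial_balance=0):
    """Rebuild the current state by applying every event in log order."""
    balance = initial_balance
    for event in events:
        balance += event.amount
    return balance


print(replay(log))  # 90
```

Note that the balance itself is never stored. It is derived from the log each time, which is what makes the audit trail and the "fixing errors" points possible.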

Event Sourcing

Pros                              Cons
Performance                       Performance
Simplification                    Versioning
Audit Trail                       Querying
Integration with other systems    Not a common approach
Troubleshooting                   Tastes better with CQRS
Fixing errors
Testing
Flexibility

The marketing says

and time-series data

for single-node configuration

AtomPub protocol for HTTP

33000 writes per second on single node

Master-Slave replication for Premium version

I’ll talk more about Mono later

web-based

Client API

1. Append to Stream
2. Read Events
3. Subscribe
4. Transactions
5. Stream Metadata
6. Delete Stream
7. System Settings
8. Connection Lifecycle

Under the hood

Disclaimer

SEDA

Time to take a look at code

SEDA outcome

1. Very few locks in code
2. Simple state management
3. Easy to understand system composition
4. Easy to extend
5. Easy to monitor (DEMO)
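To show why a staged design needs so few locks, here is a minimal SEDA-style sketch in Python. It is not taken from the Event Store code base: the `Stage` class, the stage names and the `None` shutdown marker are all assumptions made for this illustration. Each stage owns a queue and one worker thread, so its state is only ever touched by that thread.

```python
import queue
import threading


class Stage:
    """One stage: a private inbox drained by a single worker thread."""

    def __init__(self, name, handler, next_stage=None):
        self.name = name
        self.handler = handler
        self.next_stage = next_stage
        self.inbox = queue.Queue()   # the only shared object between stages
        self.processed = 0           # written only by this stage's own thread
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def publish(self, message):
        self.inbox.put(message)

    def join(self):
        self._thread.join()

    def _run(self):
        while True:
            message = self.inbox.get()
            if message is None:  # shutdown marker: pass it on and stop
                if self.next_stage is not None:
                    self.next_stage.publish(None)
                return
            result = self.handler(message)
            self.processed += 1
            if self.next_stage is not None:
                self.next_stage.publish(result)


# Compose a two-stage pipeline: validate -> store.
results = []
store = Stage("store", results.append)
validate = Stage("validate", lambda text: text.strip().upper(), next_stage=store)

for stage in (validate, store):
    stage.start()
for raw in [" deposit ", " withdraw "]:
    validate.publish(raw)
validate.publish(None)
validate.join()
store.join()

print(results)  # ['DEPOSIT', 'WITHDRAW']
```

Monitoring falls out of the structure: the length of each `inbox` and the `processed` counter tell you which stage is the bottleneck.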

The license is MIT and the code is available on GitHub. Enjoy!

Storage Engine

All events are stored in one transaction log (split into chunks)

Storage pipeline: Writer, Chaser, Readers, Scavenger

Writer

1. Single-threaded
2. Append-only
3. Writes data in chunks
4. Flushes depending on load and settings
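The writer's behaviour can be sketched roughly as follows. This is a simplified Python illustration, not the real storage engine: the `ChunkedWriter` class, the `chunk-NNNNNN` file names and the tiny chunk size are assumptions for the example, and the flush policy is left entirely to the caller.

```python
import os
import tempfile


class ChunkedWriter:
    """Append-only log writer that rolls over to a new chunk file when full.

    Not thread-safe on purpose: a single writer thread is meant to own it.
    """

    def __init__(self, directory, chunk_size):
        self.directory = directory
        self.chunk_size = chunk_size
        self.chunk_number = 0
        self.used = 0        # bytes written to the current chunk
        self.position = 0    # logical position across the whole log
        self._file = self._open_chunk()

    def _open_chunk(self):
        path = os.path.join(self.directory, f"chunk-{self.chunk_number:06d}")
        return open(path, "ab")

    def append(self, record: bytes) -> int:
        """Append one record and return the log position where it starts."""
        if len(record) > self.chunk_size:
            raise ValueError("record is larger than a chunk")
        if self.used + len(record) > self.chunk_size:
            self._roll_over()
        start = self.position
        self._file.write(record)
        self.used += len(record)
        self.position += len(record)
        return start

    def flush(self):
        """Push buffered bytes to the OS and ask it to write them to disk."""
        self._file.flush()
        os.fsync(self._file.fileno())

    def _roll_over(self):
        self.close()
        self.chunk_number += 1
        self.used = 0
        self._file = self._open_chunk()

    def close(self):
        self.flush()
        self._file.close()


with tempfile.TemporaryDirectory() as directory:
    writer = ChunkedWriter(directory, chunk_size=16)
    positions = [writer.append(b"event-%d;" % i) for i in range(4)]
    writer.close()
    chunk_files = sorted(os.listdir(directory))

print(positions)    # [0, 8, 16, 24]
print(chunk_files)  # ['chunk-000000', 'chunk-000001']
```

Because existing bytes are never rewritten, a completed chunk is immutable and can be read without coordinating with the writer.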

Chaser

1. Checks what the writer has appended to the log and sends acknowledgements
2. Updates the index

Readers

1. Read events from the transaction log with the help of the Read Index

Read Index

1. Maps stream name and version to a position in the transaction log
2. Index = MemTable + PTables
3. MemTable = Dictionary of Lists
4. Index entry: a record of fixed length
5. PTable = file with a sorted sequence of entries and cached midpoints
6. Binary search FTW!
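The MemTable + PTable idea can be sketched like this. It is a toy Python model under several assumptions: the 16-byte entry layout, the CRC32 stream hash and all class names are invented for the example, the PTable lives in memory instead of a file, and cached midpoints are left out. A real index would also have to cope with hash collisions by checking the record in the log.

```python
import struct
import zlib

# Fixed-length entry: stream hash (4 bytes), version (4 bytes), log position (8 bytes).
ENTRY = struct.Struct(">IIQ")


def stream_hash(stream: str) -> int:
    return zlib.crc32(stream.encode("utf-8"))


class PTable:
    """Immutable sorted run of fixed-length entries, searched by bisection."""

    def __init__(self, data: bytes):
        self.data = data
        self.count = len(data) // ENTRY.size

    def _key(self, i):
        h, v, _ = ENTRY.unpack_from(self.data, i * ENTRY.size)
        return (h, v)

    def find(self, stream, version):
        target = (stream_hash(stream), version)
        lo, hi = 0, self.count
        while lo < hi:  # classic lower-bound binary search
            mid = (lo + hi) // 2
            if self._key(mid) < target:
                lo = mid + 1
            else:
                hi = mid
        if lo < self.count and self._key(lo) == target:
            return ENTRY.unpack_from(self.data, lo * ENTRY.size)[2]
        return None


class MemTable:
    """In-memory part of the index: a dictionary of lists."""

    def __init__(self):
        self.entries = {}  # stream hash -> list of (version, position)

    def add(self, stream, version, position):
        self.entries.setdefault(stream_hash(stream), []).append((version, position))

    def find(self, stream, version):
        for v, position in self.entries.get(stream_hash(stream), []):
            if v == version:
                return position
        return None

    def to_ptable(self):
        rows = sorted((h, v, p) for h, items in self.entries.items() for v, p in items)
        return PTable(b"".join(ENTRY.pack(*row) for row in rows))


class ReadIndex:
    def __init__(self):
        self.memtable = MemTable()
        self.ptables = []  # newest first

    def add(self, stream, version, position):
        self.memtable.add(stream, version, position)

    def flush_memtable(self):
        self.ptables.insert(0, self.memtable.to_ptable())
        self.memtable = MemTable()

    def find(self, stream, version):
        position = self.memtable.find(stream, version)
        for table in self.ptables:
            if position is not None:
                break
            position = table.find(stream, version)
        return position


index = ReadIndex()
index.add("account-1", 0, 0)
index.add("account-1", 1, 120)
index.flush_memtable()          # entries now live in a sorted PTable
index.add("account-2", 0, 260)  # still in the MemTable

print(index.find("account-1", 1))  # 120
print(index.find("account-2", 0))  # 260
print(index.find("account-1", 7))  # None
```

Fixed-length entries are what make the binary search cheap: entry `i` always starts at byte `i * ENTRY.size`, so no parsing is needed to jump around the table.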

Scavenger

1. Background process that goes through transaction log chunks and cleans them of obsolete data (e.g. deleted streams, $maxCount, $maxAge etc.)
2. Simply writes data to a new chunk file
3. Last one out shuts the lights off
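A rough Python sketch of what a scavenge pass does to one chunk. The record shape `(stream, version, data)` and the `scavenge` function are assumptions for the illustration. To keep it short, the latest version of each stream is taken from the chunk itself, and `$maxAge` is not modelled.

```python
def scavenge(chunk, deleted_streams, max_count):
    """Return a new chunk holding only the records that are still needed.

    chunk: list of (stream, version, data) records in log order.
    deleted_streams: set of names of streams that were deleted.
    max_count: dict of stream -> how many of its newest events to keep.
    """
    # First pass: find the newest version of every stream in this chunk.
    latest = {}
    for stream, version, _ in chunk:
        latest[stream] = max(version, latest.get(stream, -1))

    # Second pass: copy the surviving records into a fresh chunk.
    new_chunk = []
    for stream, version, data in chunk:
        if stream in deleted_streams:
            continue
        keep = max_count.get(stream)
        if keep is not None and version <= latest[stream] - keep:
            continue
        new_chunk.append((stream, version, data))
    return new_chunk


old_chunk = [
    ("a", 0, "x"), ("b", 0, "y"), ("a", 1, "z"),
    ("c", 0, "p"), ("a", 2, "q"), ("c", 1, "r"),
]
new_chunk = scavenge(old_chunk, deleted_streams={"b"}, max_count={"a": 2})

print(new_chunk)
# [('a', 1, 'z'), ('c', 0, 'p'), ('a', 2, 'q'), ('c', 1, 'r')]
```

The old chunk is never modified. Readers that are still using it can finish, and it is removed only once nobody needs it any more.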

High Availability

1. Master-Slave replication
2. Automatic master elections (N/2 + 1 quorum)
3. The node with the most up-to-date state wins
4. Writes always go to the master; reads go to the master or a slave, depending on the client’s choice
5. Byte stream replication
6. Available in the commercial version
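The quorum and the "most up-to-date state wins" rule can be written down as a small Python sketch. This is not the actual election protocol, only the decision rule from the list above. The function names are made up, and breaking ties by node name is an assumption added to keep the result deterministic.

```python
def quorum(cluster_size: int) -> int:
    """Smallest majority of the cluster: N/2 + 1 with integer division."""
    return cluster_size // 2 + 1


def elect_master(cluster_size, reachable):
    """Pick a master among nodes that can see each other.

    reachable: dict of node name -> last written log position.
    Returns None when there is no majority, so no master can be elected.
    """
    if len(reachable) < quorum(cluster_size):
        return None
    # Highest log position wins; the node name only breaks ties (assumption).
    return max(reachable, key=lambda node: (reachable[node], node))


print(quorum(3), quorum(4), quorum(5))            # 2 3 3
print(elect_master(3, {"a": 100, "b": 250}))      # b
print(elect_master(3, {"a": 100}))                # None
```

With three nodes the cluster survives the loss of one node. A single isolated node can never elect itself, which is what protects against two masters after a network split.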

Projections

Projection – the process of taking an event stream and converting it to some other form (e.g. another event stream or state object)
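As an illustration of this definition, a projection is essentially a fold over the stream. The sketch below is in Python rather than the engine's own query language, and `run_projection`, the handler signature and the event dictionaries are assumptions made for the example.

```python
def run_projection(events, handlers, initial_state):
    """Fold an event stream into a state object.

    handlers maps an event type to a function (state, event) -> new state.
    Events without a handler are skipped.
    """
    state = initial_state
    for event in events:
        handler = handlers.get(event["type"])
        if handler is not None:
            state = handler(state, event)
    return state


events = [
    {"type": "GotSalary", "amount": 100},
    {"type": "GotCashFromAtm", "amount": -10},
    {"type": "MadeInternetPayment", "amount": -15},
    {"type": "GotSalary", "amount": 100},
]


def count_salary(state, event):
    return {
        "payments": state["payments"] + 1,
        "total": state["total"] + event["amount"],
    }


summary = run_projection(
    events,
    handlers={"GotSalary": count_salary},
    initial_state={"payments": 0, "total": 0},
)
print(summary)  # {'payments': 2, 'total': 200}
```

The same loop could append to another stream instead of returning a state object, which covers the other form mentioned in the definition.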

Projections Engine

1. Built-in query language: JavaScript
2. Based on Google’s V8

Demo: Projections Engine

Lessons Learned

Immutability is good

Pure async is fast, although hard to develop

Lockless programming is hard

Distributed programming is even harder

Debugger is almost useless

Unit-testing is a limited tool

Verbose logs rock

Real-life testing rocks like nothing else

Bugs in .NET

[System.Security.SecuritySafeCritical]
public virtual void Flush(Boolean flushToDisk) {
    // This code is duplicated in Dispose
    if (_handle.IsClosed) __Error.FileNotOpen();
    if (_writePos > 0) {
        FlushWrite(false);
        if (flushToDisk) {
            if (!Win32Native.FlushFileBuffers(_handle)) {
                __Error.WinIOError();
            }
        }
    }
    else if (_readPos < _readLen && CanSeek) {
        FlushRead();
    }
    _readPos = 0;
    _readLen = 0;
}

*Appears to be fixed in 4.5

Bugs in Mono

TCP/IP Stack deadlocks

Concurrent Stack/Queue

XML Serializer

File Stream issues

General slowness: 20K w/s

Disk caches: fake flush and reordering

*By default, can be disabled

Mind the HDD/SSD difference

*It is possible to optimize for both

There is a long way to simplicity

Experiments: the way it works

Further plans

1. Horizontal scalability
2. Hosted service
3. Adding other features

Links1. GetEventStore.com – official web site

2. GitHub.com/EventStore/EventStore – source code

3. OreDev.org/2012/sessions/a-deep-look-into-the-event-store – A deep look into The Event Store by Greg Young

4. MartinFowler.com/eaaDev/EventSourcing.html – Event Sourcing overview by Martin Fowler

5. Google <Event Sourcing|Event Store>

Got a question? Ask!

@victor_haydin

linkedin.com/in/victorhaydin

victor.haydin@gmail.com
