2016.03.09
Where does Geode fit in modern system architectures?
2
Eitan Suez
• Eitan Suez
• Pivotal Consultant Instructor
• Teach GemFire, Cloud Native, PCF
• Prior to joining Pivotal, was Principal Consultant with ThoughtWorks
• Long-time software developer, based in Austin, TX
3
About Me
• Over the years, have worked on many enterprise projects for a number of customers
• First hands-on experience with Geode came while consulting at Southwest Airlines..
• ..in the role of technical lead on a multi-team project, where Geode played a prominent role in the system architecture
4
Relationship with Geode
• gfsh
• OQL and the data browser
• PDX serialization
• Spring Data GemFire
• Learn how to do automated functional testing with it
5
My Journey
• At first, we were so focused on building features
• Regions were already defined by solutions architects; we treated them as tables
• Didn’t pay close attention to the fact that we had:
  • near-linear scale-out capabilities built in with partitioned regions
  • fault tolerance with redundant data copies
  • locators adding indirection, with clients isolated from cluster specifics
6
Don’t immediately realize what you’ve got
7
Example: Queries against partitioned regions
[Diagram: a Client sends a query against a partitioned region to the Geode Distributed System; three Servers each host part of the Partitioned Region, and one of them runs the Query Executor]
Can go further with server-side functions
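To make the picture above concrete, here is a minimal sketch of a scatter-gather query over partitioned data. It is not Geode's API: the `servers` list of dicts, the `owner` hash function, and the `put` and `query` helpers are all hypothetical stand-ins, and real Geode assigns entries to buckets with its own partitioning logic.

```python
from zlib import crc32

NUM_SERVERS = 3  # assumed cluster size, for illustration only


def owner(key: str) -> int:
    """Pick the server that holds a key (stable hash, illustrative only)."""
    return crc32(key.encode()) % NUM_SERVERS


# Each dict stands in for one server's slice of a partitioned region.
servers = [dict() for _ in range(NUM_SERVERS)]


def put(key, value):
    """Store the entry on the one server that owns the key."""
    servers[owner(key)][key] = value


def query(predicate):
    """Scatter the predicate to every server, then gather the partial results."""
    partials = [[v for v in part.values() if predicate(v)] for part in servers]
    return [v for partial in partials for v in partial]


for i in range(10):
    put(f"order-{i}", {"id": i, "total": i * 10})

big_orders = query(lambda o: o["total"] >= 70)
print(sorted(o["id"] for o in big_orders))  # → [7, 8, 9]
```

Each server filters only its own slice, so adding servers spreads both the data and the query work. A server-side function takes the same idea further by shipping arbitrary logic to the data instead of a predicate.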
• A Database, but in-memory?
• Can also double as a simple cache?
• A key-value store, but supports queries?
• Supports transactions
• Events?
8
Unique Combination of Features
• Briefly reviewed the traits of Apache Geode
• It takes time to “wrap one’s head around” the whole of this product
9
Impressive feature set
So, what can you do with it?
• Specific to Java stack: O/RM and Hibernate
  • can plug in as Hibernate L2 cache
• Peer-to-peer configuration
10
Use Cases “in the Small”
• Can be an out-of-process cache server, like Redis, or memcached ➡ gemcached
• These are fine, but do not take advantage of the full feature set
11
Use Cases “in the Small”
12
Canonical Architecture
[Diagram: Geode Distributed System with a Locator and Servers hosting Regions and Functions; CacheLoader and AsyncEventListener sit between the servers and a Backing Store; Clients send queries, transactions, and function executions, and receive events and continuous queries]
• On a couple of projects over the last couple of years, have been exposed to CQRS
• At first it seemed strange, or overly complex. Didn't get it
• Kept asking myself:
  • Why not start out simpler?
  • Seems rather complicated
  • It’s more work
13
Switching Gears
• Stands for Command Query Responsibility Segregation
• A Pattern
  • deliberately not prescriptive regarding how you implement this separation
• Separation all the way down to the database
• Germ of the idea came from Bertrand Meyer (of Eiffel fame), with the concept of CQS
• Introduced, proposed by Greg Young
• Active .NET community, Udi Dahan among others
14
What is CQRS?
15
CQRS..
..tells you what, not how, but to answer why, we are asked to look at what happens when you “go there”
16
When reads and writes are separate..
• With a single schema, you’re forced to optimize for one at the expense of the other
• With two schemas, one can be optimized for reads and the other for writes (have your cake and eat it too)
  • relational model for writes
  • denormalized views for reads
17
..can optimize reads and writes
[Diagram: Data-Representation Spectrum, from read-optimized (denormalized views) to write-optimized (3rd normal form)]
18
Reading when your data is normalized
[Diagram: Request → Controller → Services → Repositories → Relational Database; multiple queries with lots of joins, then results transformations and compositions, produce the views]
Constantly reassembling views
• no joins necessary
• no transformations
• no need to reconstruct a view model for each request
19
With a denormalized schema
Apache Geode
Region: Customers
Region: Orders
Region: Products . . .
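The contrast between the last two slides can be sketched in a few lines. This is an illustration only: plain dicts stand in for relational tables and for Geode regions, and the names `assemble_order_view`, `order_views`, and `read_order_view` are hypothetical.

```python
# Write side: normalized records, each fact stored exactly once.
customers = {1: {"name": "Ada"}}
products = {
    10: {"title": "Keyboard", "price": 50},
    11: {"title": "Mouse", "price": 20},
}
orders = {100: {"customer_id": 1, "lines": [(10, 1), (11, 2)]}}


def assemble_order_view(order_id):
    """Rebuild the view by joining orders, customers, and products."""
    order = orders[order_id]
    lines = [
        {
            "title": products[pid]["title"],
            "qty": qty,
            "subtotal": products[pid]["price"] * qty,
        }
        for pid, qty in order["lines"]
    ]
    return {
        "order_id": order_id,
        "customer": customers[order["customer_id"]]["name"],
        "lines": lines,
        "total": sum(line["subtotal"] for line in lines),
    }


# Read side: the assembled view is stored once, keyed for direct lookup.
order_views = {100: assemble_order_view(100)}


def read_order_view(order_id):
    """A read is a single key lookup: no joins, no transformations."""
    return order_views[order_id]


print(read_order_view(100)["total"])  # → 90
```

With normalized data, `assemble_order_view` runs on every request. With the denormalized view, it runs only when the underlying data changes.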
• Can scale reads and writes independently
  • many systems have a profile where reads outnumber writes at a 100:1 ratio
• Read and write sides can be implemented with entirely different tools and technologies
• Read-side can stay up when write-side is temporarily down
20
..more benefits
• Commands are semantic, in the language of the business, not REST CRUD: AddToCart, AddPaymentMethod, ChangeAddress
• Command handling can be asynchronous
  • enqueue commands
  • can scale command handling
21
Command Side
See: Udi Dahan Clarified CQRS
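A rough sketch of the command side, assuming an in-process queue in place of a real message broker. The command names come from the slide; their fields, the `handle` dispatcher, and the `drain` helper are hypothetical.

```python
from dataclasses import dataclass
from queue import Queue


@dataclass(frozen=True)
class AddToCart:
    cart_id: str
    sku: str
    quantity: int


@dataclass(frozen=True)
class ChangeAddress:
    customer_id: str
    new_address: str


command_queue = Queue()  # stand-in for a durable message queue
carts = {}               # write-side state, illustrative only
addresses = {}


def handle(command):
    """Dispatch on the business meaning of the command, not on a CRUD verb."""
    if isinstance(command, AddToCart):
        cart = carts.setdefault(command.cart_id, {})
        cart[command.sku] = cart.get(command.sku, 0) + command.quantity
    elif isinstance(command, ChangeAddress):
        addresses[command.customer_id] = command.new_address
    else:
        raise ValueError(f"unknown command: {command!r}")


def drain(queue):
    """Process everything currently queued; a real system would run workers."""
    while not queue.empty():
        handle(queue.get())


# Callers only enqueue; handling happens later and can be scaled separately.
command_queue.put(AddToCart("cart-1", "sku-42", 2))
command_queue.put(AddToCart("cart-1", "sku-42", 1))
command_queue.put(ChangeAddress("cust-7", "1 Main St, Austin, TX"))
drain(command_queue)

print(carts)  # → {'cart-1': {'sku-42': 3}}
```

Because callers never wait on `handle`, the number of workers draining the queue can grow or shrink without touching the code that issues commands.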
• The LOG
• Append-only, no mutation
• Immutable storage, doesn't destroy history
• Activity just a stream of events
• tables are projections, can be derived entirely from log
• views can be recreated at will
• multiple views
22
Event Sourcing
See: Martin Kleppmann Stream processing, Event sourcing, Reactive, CEP … and making sense of it all
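As a small illustration of "tables are projections", the sketch below keeps an append-only list of events and derives two different views from it. The account example, the event names, and the helper functions are assumptions made for illustration; a production log would be durable storage, not a Python list.

```python
event_log = []  # append-only: events are added, never updated or deleted


def append(event_type, account, amount):
    """Record a new fact at the end of the log."""
    event_log.append(
        {"seq": len(event_log), "type": event_type, "account": account, "amount": amount}
    )


def project_balances(events):
    """One view: current balance per account, derived entirely from the log."""
    balances = {}
    for e in events:
        sign = 1 if e["type"] == "Deposited" else -1
        balances[e["account"]] = balances.get(e["account"], 0) + sign * e["amount"]
    return balances


def project_activity(events):
    """Another view over the same log: number of events per account."""
    counts = {}
    for e in events:
        counts[e["account"]] = counts.get(e["account"], 0) + 1
    return counts


append("Deposited", "a", 100)
append("Withdrew", "a", 30)
append("Deposited", "b", 5)

print(project_balances(event_log))  # → {'a': 70, 'b': 5}
print(project_activity(event_log))  # → {'a': 2, 'b': 1}
```

Either view can be thrown away and rebuilt at any time, because the log, not the view, is the record of what happened.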
• Data in Motion vs Data at Rest
• Entire History vs Snapshot in time
• Source of truth vs derived information
  • materialized views, caches, indexes, aggregations
23
The Log / Table Duality
See: Jay Kreps The Log: What every software engineer should know about real-time data's unifying abstraction
See: Martin Kleppmann Stream processing, Event sourcing, Reactive, CEP … and making sense of it all
• ..in test environments to reproduce bugs
• ..in dev environments to test an upcoming release
• ..in production to “undo” a bug
• ..in production for blue-green type deployments
• Can transition to a new schema/representation of data in your regions because you've come up with a radically different user interface for navigating that information.
24
Replaying the Log
See: Greg Young CQRS and Event Sourcing
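Two of the uses listed above, sketched under simple assumptions: replaying a prefix of the log to see the state as of an earlier point, and replaying the whole log into a differently shaped view. The cart events and the `replay` helper are hypothetical.

```python
events = [
    ("ItemAdded", "cart-1", "book"),
    ("ItemAdded", "cart-1", "pen"),
    ("ItemRemoved", "cart-1", "book"),
    ("ItemAdded", "cart-2", "lamp"),
]


def replay(log, apply, upto=None):
    """Fold events into a fresh state; `upto` stops after that many events."""
    state = {}
    for event in log[:upto]:
        apply(state, event)
    return state


def apply_cart_contents(state, event):
    """Original view shape: items currently in each cart."""
    kind, cart, item = event
    items = state.setdefault(cart, [])
    if kind == "ItemAdded":
        items.append(item)
    elif kind == "ItemRemoved":
        items.remove(item)


def apply_item_popularity(state, event):
    """A new view shape, built from the same history: times each item was added."""
    kind, _cart, item = event
    if kind == "ItemAdded":
        state[item] = state.get(item, 0) + 1


current = replay(events, apply_cart_contents)
before_removal = replay(events, apply_cart_contents, upto=2)
popularity = replay(events, apply_item_popularity)

print(current)         # → {'cart-1': ['pen'], 'cart-2': ['lamp']}
print(before_removal)  # → {'cart-1': ['book', 'pen']}
print(popularity)      # → {'book': 1, 'pen': 1, 'lamp': 1}
```

The second call is the "reproduce a bug" case, and the third is the "new schema" case: nothing in the log changes, only the function that interprets it.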
25
Diagram by “Exploring CQRS and Event Sourcing”, MSDN
26
✓ update caches when new events come in
✓ invalidate caches proactively - ensure data in caches remain fresh
✓ inverts the cache loader concept
✓ serving data from fast, in-memory caches
✓ regions contain “ViewModel” objects
[Diagram: Events → Projection updates views in regions → Geode Distributed System (regions containing view models) → Read Side, which receives events and continuous queries and issues queries]
Read Side - relay view models to UI - little to no transformations
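One way to read "inverts the cache loader concept": a cache loader pulls data on a miss, while a projection pushes the updated view as each event arrives. The sketch below contrasts the two, with plain dicts standing in for regions and the backing store; `load_on_miss`, `on_event`, and the `CustomerMoved` event are hypothetical names, not Geode APIs.

```python
backing_store = {"cust-1": {"name": "Ada", "city": "Austin"}}
loads = []  # records each trip to the backing store


def load_on_miss(cache, key):
    """Cache-loader style: pull from the backing store only on a miss."""
    if key not in cache:
        loads.append(key)
        cache[key] = dict(backing_store[key])
    return cache[key]


view_region = {}  # stands in for a region holding view models


def on_event(event):
    """Projection style: push the updated view model as each event arrives."""
    if event["type"] == "CustomerMoved":
        view = view_region.setdefault(event["id"], {"id": event["id"]})
        view["city"] = event["city"]


# Pull: the cached copy goes stale once the source changes underneath it.
pull_cache = {}
load_on_miss(pull_cache, "cust-1")
backing_store["cust-1"]["city"] = "Dallas"
stale = load_on_miss(pull_cache, "cust-1")["city"]

# Push: the event itself refreshes the view, so reads see the new value.
on_event({"type": "CustomerMoved", "id": "cust-1", "city": "Dallas"})
fresh = view_region["cust-1"]["city"]

print(stale, fresh)  # → Austin Dallas
```

In the pull model the reader has to decide when a cached entry is too old. In the push model the writer of events keeps the view fresh, and the read side only ever does lookups.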
• Apache Geode as the read store in a CQRS system is a particularly good fit:
  • eager cache invalidation
  • scalable and fast reads via..
    • regions store denormalized views
    • partitioned regions enable linear scale-out
    • in-memory data supports low-latency reads
• Curious to learn how Geode is being applied in your work
27
Summary
• Martin Kleppmann Stream processing, Event sourcing, Reactive, CEP … and making sense of it all
• Rx, Erik Meijer Your Mouse is a Database
• Greg Young CQRS and Event Sourcing
• Jay Kreps The Log: What every software engineer should know about real-time data's unifying abstraction
• Udi Dahan Clarified CQRS
• Dominic Betts, Julian Dominguez, Grigori Melnik, Fernando Simonazzi, Mani Subramanian CQRS Journey
• Dannielle Burrow Four Real World Use Cases For An In-Memory Data Grid
28
References & Attributions
29
Join the Apache Geode Community!
• Check out http://geode.incubator.apache.org
• Subscribe: [email protected]
• Download: http://geode.incubator.apache.org/releases/
Thank you!