The slides for my talk on how to implement an event driven architecture on top of the Azure platform
Yours truly
• Yves Goeleven
• Solution Architect @ Capgemini
• MS Community Lead @ Capgemini
• Board member @ Azug
• Contact me @
  – Cloudshaper.wordpress.com
  – Mscop.be.capgemini.com
  – Twitter.com/YvesGoeleven
  – Facebook
  – Linkedin
  – [email protected]
Agenda
• Introduction
• Design challenges
• Event driven architecture
  – Event Flow Layers
  – Event Processing Styles
  – Extreme Loose coupling
  – Relationship to CQRS
• Answering the challenges
• Technology Mapping
  – Demo & Implementation Details
• Questions
The beauty of cloud computing
• Infinite compute and storage capacity
• On demand
• Extreme flexibility
• Without upfront investments
• Reduces overall costs
• Better for the environment
Impact of scale
• Large scale systems suffer from
  – Data latency
  – No distributed transactions
• Global systems
  – No guaranteed instance availability
Latency
• Azure Fabric
  – Data overlay
  – Data replication over peers
  – Multiple failure & upgrade domains
    • Machines in different racks
    • Or even different datacenters
  – Depending on traffic and data size
    • This might lead to latency
No distributed transactions
• The very nature of 2 phase commit
  – Makes each machine confirm twice in the commit process
  – Adding more machines
    • Makes the commit process steadily slower: every extra machine adds two more confirmations, and the slowest machine holds up all the others
  – This drastically limits scalability
    • So distributed transactions are not supported in the cloud!
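
To make the "confirm twice" point concrete, here is a minimal in-memory sketch of two phase commit. The `Participant` class and `two_phase_commit` function are hypothetical names invented for this illustration; a real coordinator also has to deal with timeouts, logging and recovery, which are left out.

```python
class Participant:
    """A machine taking part in the transaction (hypothetical stand-in)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "working"

    def prepare(self):
        # Phase 1: vote on whether the local work can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Run both phases and return (committed, number_of_confirmations)."""
    confirmations = 0
    votes = []
    for participant in participants:      # phase 1: prepare
        votes.append(participant.prepare())
        confirmations += 1
    decision = all(votes)
    for participant in participants:      # phase 2: commit or roll back
        if decision:
            participant.commit()
        else:
            participant.rollback()
        confirmations += 1
    return decision, confirmations


machines = [Participant("web"), Participant("worker"), Participant("storage")]
result = two_phase_commit(machines)

failing = [Participant("web"), Participant("worker", can_commit=False)]
failed_result = two_phase_commit(failing)
```

Every participant is contacted in both phases, and nobody can finish before the last vote is in, so a single slow or unavailable machine stalls the whole transaction.
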
Instance availability
• Azure is a global platform
  – Can be used by anyone from anywhere, 24 / 7
• No windows for maintenance and upgrades
  – The automated update system will take your system down
    • whenever it feels like it!
  – Failure & upgrade domains counteract this
    • Need at least 2 instances of each role
Traditional composite SOA
• Synchronous request reply integration
• It assumes little to no latency
  – It assumes data from queries to be accurate
• It assumes distributed transactions
  – To undo the work of services in case of failure
• It assumes service availability
  – As consumer is waiting for service completion
What’s an event anyway
• A notable, interesting thing that happens in the business
  – State change, opportunity, threshold, deviation
• From a technical perspective
  – Message: header & payload
• Events are completely self descriptive
  – All relevant state is encapsulated
• But who decides what ‘interesting’ means?
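
A minimal sketch of what "message: header & payload" with all relevant state encapsulated could look like. The `Event` class, its field names and the `BlogCommentSubmitted` example are assumptions made for this illustration, not something taken from the talk.

```python
import json
import uuid
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    """A self descriptive event: header fields plus a payload."""

    event_type: str   # header: what happened
    payload: dict     # all state a consumer needs, no call back to the generator
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        # Serialize header and payload together so the event can travel alone.
        return json.dumps(asdict(self), sort_keys=True)

    @staticmethod
    def from_json(raw: str) -> "Event":
        return Event(**json.loads(raw))


submitted = Event(
    event_type="BlogCommentSubmitted",
    payload={"comment_id": 17, "post_id": 3, "author": "alice", "text": "Nice talk!"},
)
wire = submitted.to_json()
restored = Event.from_json(wire)
```

Because the serialized form carries everything, a consumer can rebuild the event from the wire format without asking the generator for anything else.
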
Event flow layers
[Diagram: four layers from left to right: Event generator, Event Channel, Event Processing Engine, Downstream activity. The Event Processing Engine holds a Standing Query and Event Processing Actions. Actions shown: Publish, Notify, Invoke service, Start business process, Capture.]
Simple Event Processing
[Diagram: Event generator, Event Channel, Event Processing Engine, Downstream activity. A Simple Event reaches the Event Processing Engine and triggers Event Processing Actions. Actions shown: Publish, Notify, Invoke service, Start business process, Capture.]
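
As a rough sketch of simple event processing, each incoming event directly triggers the actions registered for its type. `SimpleEventProcessor` and the event shape (a dict with a `type` key) are hypothetical choices for this illustration.

```python
from collections import defaultdict


class SimpleEventProcessor:
    """Maps one incoming event straight to its registered actions."""

    def __init__(self):
        self._actions = defaultdict(list)   # event type -> list of callables

    def on(self, event_type, action):
        self._actions[event_type].append(action)

    def process(self, event):
        """Run every action registered for this event type; return how many ran."""
        actions = self._actions.get(event["type"], [])
        for action in actions:
            action(event)
        return len(actions)


processor = SimpleEventProcessor()
notified, captured = [], []
processor.on("BlogPostSubmitted", notified.append)   # stands in for Notify
processor.on("BlogPostSubmitted", captured.append)   # stands in for Capture

triggered = processor.process({"type": "BlogPostSubmitted", "title": "Hello EDA"})
ignored = processor.process({"type": "SomethingElse"})
```
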
Stream Event Processing
[Diagram: Event generator, Event Channel, Event Processing Engine, Downstream activity. A stream of Simple Events reaches the Event Processing Engine, which triggers Event Processing Actions. Actions shown: Publish, Notify, Invoke service, Start business process, Capture.]
Complex Event Processing
[Diagram: Event generator, Event Channel, Event Processing Engine, Downstream activity. Simple Events and Complex Event Series reach the Event Processing Engine, which triggers Event Processing Actions. Actions shown: Publish, Notify, Invoke service, Start business process, Capture.]
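
A small sketch of the idea behind complex event processing: a pattern over a series of simple events produces a new, derived event. The detector, the `CommentRejected` and `SuspectedSpammer` event names, and the threshold and window values are all invented for this illustration, and it assumes events arrive with non-decreasing timestamps.

```python
from collections import defaultdict, deque


class RepeatedRejectionDetector:
    """Emits a complex event when one author collects several rejections in a window."""

    def __init__(self, threshold=3, window_seconds=60.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)   # author -> timestamps of rejections

    def process(self, event):
        """Return a derived event, or None while the pattern is incomplete."""
        if event["type"] != "CommentRejected":
            return None
        times = self._recent[event["author"]]
        times.append(event["at"])
        # Drop rejections that have fallen out of the sliding window.
        while times and event["at"] - times[0] > self.window_seconds:
            times.popleft()
        if len(times) >= self.threshold:
            times.clear()
            return {"type": "SuspectedSpammer", "author": event["author"], "at": event["at"]}
        return None


detector = RepeatedRejectionDetector()
burst = [detector.process({"type": "CommentRejected", "author": "bob", "at": t})
         for t in (0, 10, 20)]
spread = [detector.process({"type": "CommentRejected", "author": "eve", "at": t})
          for t in (0, 50, 100)]
```
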
Extreme loose coupling
• Decoupling in multiple dimensions
  – Implementation
    • Event generator and consumer’s implementation are not bound
      – They don’t even know about the other’s existence
    • All information encapsulated in event
  – Distribution
    • Event generator and consumer can be physically separated
    • As long as they can access the event channel
Extreme loose coupling
• Decoupling (Continued)
  – Time
    • Events encapsulate all information
    • They can be stored for later processing
  – Evolution
    • Event generators and consumers can evolve independently
    • Usually they are ‘added’ instead of ‘changed’, contributing to ease of management
Brewer’s CAP Theorem
• A distributed system can satisfy any two of these guarantees at the same time, but not all three
  – Consistency
    • All nodes see the same data at the same time
  – Availability
    • Node failures do not prevent survivors from continuing to operate
  – Partition Tolerance
    • The system continues to operate even if it becomes partitioned due to loss of connectivity
Eventual consistency
• EDA sacrifices consistency
  – But will eventually become consistent
• Compensating events
  – Errors and exceptions are events as well
    • And should end up in the event stream
• Compensating transactions
  – Event Consumers should subscribe to compensating events
    • To make things right again
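
The compensating event idea can be sketched with a tiny in-memory event stream. Everything here is an assumption for illustration: the `EventStream` class, the convention of appending `Failed` to the event type, and the blog handlers are not from the talk or from any library.

```python
class EventStream:
    """In-memory publish/subscribe stream where handler errors become events."""

    def __init__(self):
        self._subscribers = {}
        self.log = []

    def subscribe(self, event_type, handler):
        self._subscribers.setdefault(event_type, []).append(handler)

    def publish(self, event):
        self.log.append(event)
        for handler in self._subscribers.get(event["type"], []):
            try:
                handler(event)
            except Exception as error:
                # Errors are events as well: put a compensating event in the stream.
                self.publish({
                    "type": event["type"] + "Failed",
                    "cause": event,
                    "reason": str(error),
                })


featured = set()

def feature_post(event):
    featured.add(event["post_id"])

def check_quota(event):
    if event["author"] == "spammer":
        raise ValueError("quota exceeded")

def unfeature_post(event):
    # Compensating logic: undo the work done for the original event.
    featured.discard(event["cause"]["post_id"])


stream = EventStream()
stream.subscribe("BlogPostSubmitted", feature_post)
stream.subscribe("BlogPostSubmitted", check_quota)
stream.subscribe("BlogPostSubmittedFailed", unfeature_post)

stream.publish({"type": "BlogPostSubmitted", "post_id": 1, "author": "alice"})
stream.publish({"type": "BlogPostSubmitted", "post_id": 2, "author": "spammer"})
logged_types = [event["type"] for event in stream.log]
```

The second post is featured first and then unfeatured once the failure shows up in the stream, so the system passes through an inconsistent state before it settles.
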
Eventual consistency
• Local transactions, or similar mechanics, are still supported
  – So even if a role goes down while processing a compensating event
    • The local transaction will ensure that the event gets processed later
• Event consumers must be completely autonomous
  – They should not rely on other services
  – The long forgotten SOA principle ‘Service Autonomy’ is mandatory in an EDA
Service Autonomy
• How to achieve it?
• Make every event consumer (aggregate root)
  – a stand alone state machine
  – Blog Comment (Submitted, Accepted, Rejected)
• Encapsulate state transitions with
  – Tentative operations
    • Submit / Accept / Reject
  – Internally it uses
    • the currently known state
    • risk mitigation logic
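
The Blog Comment example could be sketched as a stand alone state machine like this. The class layout and the choice to return `True` or `False` from each tentative operation are assumptions; the only risk mitigation shown is refusing transitions that the currently known state does not allow.

```python
from enum import Enum


class CommentState(Enum):
    SUBMITTED = "Submitted"
    ACCEPTED = "Accepted"
    REJECTED = "Rejected"


class BlogComment:
    """Aggregate root that guards its own state transitions."""

    # Current state -> states it may move to (None means not yet submitted).
    _ALLOWED = {
        None: {CommentState.SUBMITTED},
        CommentState.SUBMITTED: {CommentState.ACCEPTED, CommentState.REJECTED},
        CommentState.ACCEPTED: set(),
        CommentState.REJECTED: set(),
    }

    def __init__(self, comment_id):
        self.comment_id = comment_id
        self.state = None

    def _try_transition(self, target):
        # Tentative: the operation may be declined, based on the known state.
        if target in self._ALLOWED[self.state]:
            self.state = target
            return True
        return False

    def submit(self):
        return self._try_transition(CommentState.SUBMITTED)

    def accept(self):
        return self._try_transition(CommentState.ACCEPTED)

    def reject(self):
        return self._try_transition(CommentState.REJECTED)


comment = BlogComment(17)
outcomes = [comment.accept(), comment.submit(), comment.accept(), comment.reject()]
```

Here the early `accept` and the late `reject` are declined without raising, which is one way to keep the consumer running when requests arrive that no longer fit its state.
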
Service Autonomy
• Give it a memory (Partner state machine)
  – The relevant last known states of its remote partners
  – Required to properly implement risk mitigation logic
• This allows for
  – Stale, locally stored, state
  – Out of order events
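
One possible shape for such a memory is sketched below. It assumes that every event carries a per-partner version number, which the slides do not state; `PartnerMemory` and its methods are hypothetical names.

```python
class PartnerMemory:
    """Keeps the last known state of each remote partner."""

    def __init__(self):
        self._last_known = {}   # partner id -> (version, state)

    def observe(self, partner_id, version, state):
        """Record a partner's state; ignore events older than what is known."""
        known = self._last_known.get(partner_id)
        if known is not None and known[0] >= version:
            return False        # stale or duplicate event, arrived out of order
        self._last_known[partner_id] = (version, state)
        return True

    def state_of(self, partner_id, default=None):
        known = self._last_known.get(partner_id)
        return default if known is None else known[1]


memory = PartnerMemory()
applied_new = memory.observe("post-3", version=2, state="Published")
applied_old = memory.observe("post-3", version=1, state="Draft")   # late arrival
current = memory.state_of("post-3")
unknown = memory.state_of("post-9", default="Unknown")
```
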
Apology based computing
• Eventual Consistency presents some challenges to the UI
• Users need to be informed that the system
  – Is working on it, instead of waiting for the result
• The system will get back to them later
  – In case of failure or success
• Task oriented UIs
  – support this better than data oriented UIs
Similar to CQRS*
[Diagram: Browser, Web role and Worker role. Command: Submit New Blog Post (Request/Reply). Query: View Recent Blog Posts (Request/Reply). Publish: Events.]
* Command Query Responsibility Segregation
It happened, live with it
[Diagram: Browser, Web role and Worker role. Event: Blog Post Submitted (One way). Query: View Recent Blog Posts (Request/Reply). Events travel over the Event Stream.]
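
The split in the diagram can be sketched roughly as follows: submitting is a one way event, querying is request/reply against whatever the read side currently knows. The function names, the in-memory `deque` standing in for the event stream, and the placement of the read model are assumptions made for this illustration.

```python
from collections import deque

event_stream = deque()   # stands in for the durable event channel
recent_posts = []        # read model that queries are served from


def submit_blog_post(title):
    """Web role: record that it happened and return immediately (one way)."""
    event_stream.append({"type": "BlogPostSubmitted", "title": title})
    return "Thanks, your post is being processed."


def view_recent_blog_posts(limit=5):
    """Web role: request/reply query against the read model only."""
    return recent_posts[-limit:][::-1]


def run_worker_once():
    """Worker role: drain the stream and update the read model."""
    handled = 0
    while event_stream:
        event = event_stream.popleft()
        if event["type"] == "BlogPostSubmitted":
            recent_posts.append(event["title"])
        handled += 1
    return handled


reply = submit_blog_post("Hello EDA")
before = view_recent_blog_posts()   # the worker has not run yet
handled = run_worker_once()
after = view_recent_blog_posts()
```

The query right after submission still returns the old view, which is exactly the eventual consistency the UI has to apologise for.
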
Open for extension
[Diagram: Browser, Web role and Worker role. Event: Blog Post Submitted (One way). Query: View Recent Blog Posts (Request/Reply). The Event Stream is marked with Extension Points.]
Answering the challenges
• Latency
  – Event consumers listen to what’s happening
  – They react to events when they receive them
  – Time between event occurrence and processing
    • Is probably irrelevant for the event consumer
      – As all information is encapsulated inside the event
• Take into account that events may arrive out of order
  – Risk mitigation strategies required
  – Based on local state and partner state machine
Answering the challenges
• Availability
  – If the event channel technology is durable
    • Like Azure storage queues
  – Event consumers can go down without impact
    • They will receive the events once they come back up again
Answering the challenges
• Availability
  – The absence of events in the stream
    • Can be detected by stream event processing
  – This can result in a new event
    • Triggering a self healing process
    • Or human intervention
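
Detecting the absence of events can be sketched as a check over the last time each source was heard from. `HeartbeatMonitor`, the `SourceSilent` event name and the 30 second limit are assumptions for this illustration; in the talk this role is played by a stream event processing engine.

```python
class HeartbeatMonitor:
    """Turns prolonged silence from a source into a new event."""

    def __init__(self, max_silence_seconds):
        self.max_silence_seconds = max_silence_seconds
        self._last_seen = {}   # source -> timestamp of its latest event

    def observe(self, source, at):
        self._last_seen[source] = at

    def check(self, now):
        """Return one 'SourceSilent' event per source that has been quiet too long."""
        return [
            {"type": "SourceSilent", "source": source, "silent_for": now - last}
            for source, last in sorted(self._last_seen.items())
            if now - last > self.max_silence_seconds
        ]


monitor = HeartbeatMonitor(max_silence_seconds=30)
monitor.observe("worker-1", at=0)
monitor.observe("worker-2", at=25)
alerts = monitor.check(now=40)
```

The resulting event can then be handled like any other, for example by a consumer that restarts the silent role or by a person watching the stream.
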
Answering the challenges
• No distributed transactions
  – Compensating events
    • Compensation logic, local transactions
  – Humans can subscribe to them as well
    • To fix the issues manually
Event channel
• Azure storage queues
  – Durable
    • might lead to dead queues though
    • Queue management required
  – Supports queue-peek-lock
    • Similar to local transactions
• .NET Service Bus message buffers
  – Temporarily durable
    • will disappear if unused
  – Supports queue-peek-lock
    • Similar to local transactions
• WCF relay bindings
  – Can reach outside of the cloud
    • Even across NAT & firewalls
  – No transaction support
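
Why queue-peek-lock is "similar to local transactions" can be shown with a small in-memory model. This is not the Azure storage API: `PeekLockQueue`, its methods and the 30 second lock are hypothetical, and time is passed in explicitly to keep the sketch deterministic.

```python
import itertools


class PeekLockQueue:
    """In-memory model of peek-lock: a message is hidden, not removed, when read."""

    def __init__(self, lock_seconds=30.0):
        self.lock_seconds = lock_seconds
        self._messages = []              # each entry: [id, body, visible_at]
        self._ids = itertools.count(1)

    def put(self, body):
        self._messages.append([next(self._ids), body, 0.0])

    def peek_lock(self, now):
        """Return (id, body) of the first visible message and lock it, else None."""
        for message in self._messages:
            if message[2] <= now:
                message[2] = now + self.lock_seconds
                return message[0], message[1]
        return None

    def complete(self, message_id):
        """Delete the message once it has been processed successfully."""
        self._messages = [m for m in self._messages if m[0] != message_id]


queue = PeekLockQueue(lock_seconds=30.0)
queue.put("BlogPostSubmitted")

first = queue.peek_lock(now=0.0)       # consumer takes the message, then crashes
while_locked = queue.peek_lock(now=10.0)
redelivered = queue.peek_lock(now=31.0)
queue.complete(redelivered[0])
after_complete = queue.peek_lock(now=100.0)
```

If the consumer goes down before calling `complete`, the lock simply expires and the message becomes visible again, which is the behaviour that lets event consumers fail without losing events.
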
Event processing engine
• Microsoft StreamInsight
  – Stream event processing
  – Complex event processing
• .NET Service Bus used to* have routers
  – Ideal for forwarding events
    • Supported any / all forwarding
  – *Rumored to come back soon
Event processing engine
• Open source to the rescue
• NServiceBus
  – Library to implement service buses
  – Routing scenarios
    • Publisher (All)
    • Distributor (Any)
  – Simple event processing
    • Message handlers
  – Complex event processing
    • Saga pattern
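
The difference between the two routing scenarios can be sketched in a few lines. These classes are not the NServiceBus API; they are hypothetical stand-ins, and round robin is used here as a simple substitute for choosing a worker by availability.

```python
import itertools


class Publisher:
    """'All' routing: every subscriber receives a copy of the event."""

    def __init__(self, subscribers):
        self.subscribers = list(subscribers)

    def route(self, event):
        for inbox in self.subscribers:
            inbox.append(event)


class Distributor:
    """'Any' routing: exactly one worker receives each event."""

    def __init__(self, workers):
        # Round robin stands in for picking whichever worker is available.
        self._next_worker = itertools.cycle(list(workers))

    def route(self, event):
        next(self._next_worker).append(event)


audit_inbox, search_inbox = [], []
Publisher([audit_inbox, search_inbox]).route("e1")

worker_1, worker_2 = [], []
distributor = Distributor([worker_1, worker_2])
for event in ("e1", "e2", "e3"):
    distributor.route(event)
```
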
Architecture
[Diagram: multiple Web roles and Worker roles, a Distributor (Any), a Publisher (All) and a CEP component. Legend: Queue Storage, StreamInsight, NServiceBus, Azure Roles, Table Storage.]
NHibernate
• Interaction with Azure is through REST APIs
  – Manually crafting requests
  – Manually parsing responses
  – Is cumbersome
• Co-created an NHibernate driver
  – To do the heavy lifting of interacting with table storage
  – Provides persistence ignorance
NServiceBus
• Performs messaging on Queue Storage
  – Client API ( IBus )
  – Server API ( IHandleMessages<T> )
  – Pub/Sub with durable subscriptions
  – Distribution based on worker availability
• Uses NHibernate and the Azure storage driver internally
  – For all its storage needs
NServiceBus
• Custom extensions
  – Azure Queue Transport
  – Subscription storage
  – Worker availability management
  – Saga support (not yet finished)
Modules
• In order to cut costs
• Modules loaded at runtime
  – From blob storage
  – For all components
  – Allows for dynamic views, message handlers and CEP queries
  – Plan to switch to MEF
• StructureMap as IoC
  – Wires it all together
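
The talk does this with .NET assemblies; as a language-neutral sketch of the same idea, the snippet below builds a module at runtime from source text. The `load_module` helper is hypothetical, and the string `blob_contents` stands in for content that would be downloaded from blob storage.

```python
import types


def load_module(name, source):
    """Create a module object at runtime from source text."""
    module = types.ModuleType(name)
    # Compile and run the source inside the new module's namespace.
    exec(compile(source, "<blob:" + name + ">", "exec"), module.__dict__)
    return module


# Stand-in for a message handler module fetched from blob storage.
blob_contents = '''
def handle(event):
    return "handled " + event["type"]
'''

handlers = load_module("blog_handlers", blob_contents)
outcome = handlers.handle({"type": "BlogPostSubmitted"})
```

Executing downloaded code like this only makes sense when the storage account is fully trusted, since whatever is in the blob runs with the role's privileges.
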
StreamInsight
[Diagram: CEP Application at Runtime. Event generators (Devices, Sensors; Web & Worker Roles; Event stores & Databases) send events through Input Adapters into the CEP Engine, which runs Standing Queries and can use Static reference data (columns C_ID, C_NAME, C_ZIP). Output Adapters pass events on to Event consumers (Event Stores & Databases; Pagers & Monitoring devices; Web & Worker roles; Monitoring Systems).]
Web Role
• ASP.NET MVC
• Heavy use of jQuery
  – Asynchronous semantics
  – WYSIWYG style
  – Extremely apology based
• NHibernate’s second level cache on by default
  – Effectively queries memory most of the time
  – Synchronization through events of course
• Switching to IronRuby as view engine
  – Views stored in blob storage
  – Allows for runtime customization