Windows Azure User Group Event Driven Architecture on the Azure Services Platform 27/05/2010

EDA on the Azure Services Platform


DESCRIPTION

The slides for my talk on how to implement an event driven architecture on top of the Azure platform.


Windows Azure User Group

Event Driven Architecture on the Azure Services Platform

27/05/2010

Yours truly

• Yves Goeleven
• Solution Architect @ Capgemini
• MS Community Lead @ Capgemini
• Board member @ Azug
• Contact me @
  – Cloudshaper.wordpress.com
  – Mscop.be.capgemini.com
  – Twitter.com/YvesGoeleven
  – Facebook
  – LinkedIn
  – [email protected]

INTRODUCTION

What are we going to discuss today?

Agenda

• Introduction
• Design challenges
• Event driven architecture
  – Event Flow Layers
  – Event Processing Styles
  – Extreme Loose coupling
  – Relationship to CQRS
• Answering the challenges
• Technology Mapping
  – Demo & Implementation Details
• Questions

• Questions

[Diagram: the Azure Services Platform. Labels: Windows Azure; .Net Service Bus; SQL Azure (Database, Business Analytics, Reporting, Data Sync)]

The beauty of cloud computing

• Infinite compute and storage capacity
• On demand
• Extreme flexibility
• Without upfront investments
• Reduces overall costs
• Better for the environment

DESIGN CHALLENGES

Aren’t there any downsides to cloud computing then?

Impact of scale

• Large scale systems suffer from
  – Data latency
  – No distributed transactions
• Global systems
  – No guaranteed instance availability

Latency

• Azure Fabric
  – Data overlay
  – Data replication over peers
  – Multiple failure & upgrade domains
    • Machines in different racks
    • Or even different datacenters
  – Depending on traffic and data size
    • This might lead to latency

No distributed transactions

• The very nature of 2 phase commit
  – Makes each machine confirm twice in the commit process
  – Adding more machines
    • Makes the commit process progressively slower
  – This drastically limits scalability
• So distributed transactions are not supported in the cloud!

Instance availability

• Azure is a global platform
  – Can be used by anyone from anywhere, 24 / 7
• No windows for maintenance and upgrades
  – The automated update system will take your system down
    • whenever it feels like it!
  – Failure & upgrade domains counteract this
    • Need at least 2 instances of each role

Traditional composite SOA

• Synchronous request reply integration
• It assumes little to no latency
  – It assumes data from queries to be accurate
• It assumes distributed transactions
  – To undo the work of services in case of failure
• It assumes service availability
  – As the consumer is waiting for service completion

EVENT DRIVEN ARCHITECTURE

If synchronous composite SOA is unsuitable for the cloud, what is?

What’s an event anyway

• Notable, interesting thing that happens in the business
  – State change, opportunity, threshold, deviation
• From a technical perspective
  – Message: header & payload
• Events are completely self descriptive
  – All relevant state is encapsulated
• But who decides what ‘interesting’ means?
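As a rough illustration of "message: header & payload", here is a minimal Python sketch of a self-descriptive event. The class name `Event`, its field names and the JSON wire format are assumptions made for this example; the slides do not prescribe a concrete format.

```python
import json
import uuid
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    """Hypothetical self-descriptive event: header fields plus a payload."""
    event_type: str   # header: what happened
    payload: dict     # body: all state a consumer needs, no callbacks required
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_message(self) -> str:
        """Serialize to a message with explicit header and payload sections."""
        header = asdict(self)
        payload = header.pop("payload")
        return json.dumps({"header": header, "payload": payload})

    @staticmethod
    def from_message(raw: str) -> "Event":
        """Rebuild the event from the wire format produced by to_message."""
        data = json.loads(raw)
        return Event(payload=data["payload"], **data["header"])


submitted = Event("BlogPostSubmitted", {"post_id": 42, "title": "EDA on Azure"})
restored = Event.from_message(submitted.to_message())
```

Because the payload carries everything relevant, a consumer can rebuild the event from the message alone, without calling back to the generator.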

Inversion of communication

[Diagram: Event generators, Event Stream, Event consumers, along a Time axis]

Event flow layers

[Diagram: Event generator, Event Channel, Event Processing Engine, Downstream activity. The Event Processing Engine holds a Standing Query. Event Processing Actions: Publish, Notify, Invoke service, Start business process, Capture]

EVENT PROCESSING STYLES

What does ‘interesting event’ mean?

Simple Event Processing

[Diagram: Event generator, Event Channel, Event Processing Engine, Downstream activity. A Simple Event reaches the Event Processing Engine. Event Processing Actions: Publish, Notify, Invoke service, Start business process, Capture]

Stream Event Processing

[Diagram: Event generator, Event Channel, Event Processing Engine, Downstream activity. A stream of Simple Events reaches the Event Processing Engine. Event Processing Actions: Publish, Notify, Invoke service, Start business process, Capture]

Complex Event Processing

[Diagram: Event generator, Event Channel, Event Processing Engine, Downstream activity. Simple Events and a Complex Event Series reach the Event Processing Engine. Event Processing Actions: Publish, Notify, Invoke service, Start business process, Capture]

EXTREME LOOSE COUPLING

Event driven architecture delivers…

Extreme loose coupling

• Decoupling in multiple dimensions
  – Implementation
    • Event generator and consumer’s implementation are not bound
      – They don’t even know about the other’s existence
    • All information encapsulated in event
  – Distribution
    • Event generator and consumer can be physically separated
    • As long as they can access the event channel

Extreme loose coupling

• Decoupling (Continued)
  – Time
    • Events encapsulate all information
    • They can be stored for later processing
  – Evolution
    • Event generators and consumers can evolve independently
    • Usually they are ‘added’ instead of ‘changed’, contributing to ease of management

Brewer’s CAP Theorem

• A distributed system can satisfy any two of these guarantees at the same time, but not all three
  – Consistency
    • All nodes see the same data at the same time
  – Availability
    • Node failures do not prevent survivors from continuing to operate
  – Partition Tolerance
    • The system continues to operate even if it becomes partitioned due to loss of connectivity

Eventual consistency

• EDA sacrifices consistency
  – But will eventually become consistent
• Compensating events
  – Errors and exceptions are events as well
    • And should end up in the event stream
• Compensating transactions
  – Event Consumers should subscribe to compensating events
    • To make things right again
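The compensation idea can be sketched in a few lines of Python. Everything here is hypothetical: `EventStream` is an in-memory stand-in for a real event channel, and the convention of naming the failure event `<type>Failed` is an assumption of this example, not something the talk specifies.

```python
from collections import defaultdict, deque


class EventStream:
    """In-memory stand-in for an event channel (illustration only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)
        self._pending = deque()

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        self._pending.append((event_type, payload))

    def run(self):
        """Deliver events until the stream is drained."""
        while self._pending:
            event_type, payload = self._pending.popleft()
            for handler in self._subscribers[event_type]:
                try:
                    handler(payload)
                except Exception as error:
                    # Errors are events too: put them back into the stream.
                    failure = {**payload, "reason": str(error)}
                    self.publish(f"{event_type}Failed", failure)


stream = EventStream()
stored_comments = {}

def store_comment(comment):
    stored_comments[comment["comment_id"]] = comment["text"]

def check_for_spam(comment):
    if "buy now" in comment["text"].lower():
        raise ValueError("comment looks like spam")

def remove_comment(failure):
    # Compensating transaction: undo the local effect of the original event.
    stored_comments.pop(failure["comment_id"], None)

stream.subscribe("CommentSubmitted", store_comment)
stream.subscribe("CommentSubmitted", check_for_spam)
stream.subscribe("CommentSubmittedFailed", remove_comment)

stream.publish("CommentSubmitted", {"comment_id": 1, "text": "Nice talk!"})
stream.publish("CommentSubmitted", {"comment_id": 2, "text": "BUY NOW cheap watches"})
stream.run()
```

No consumer rolls back another consumer's work directly; each one subscribes to the failure event and repairs only its own local state.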

Eventual consistency

• Local transactions, or similar mechanics, are still supported
  – So even if a role goes down while processing a compensating event
    • The local transaction will ensure that the event gets processed later
• Event consumers must be completely autonomous
  – They should not rely on other services
  – The long forgotten SOA principle ‘Service Autonomy’ is mandatory in an EDA

Service Autonomy

• How to achieve it?
• Make every event consumer (aggregate root)
  – a stand alone state machine
  – Blog Comment (Submitted, Accepted, Rejected)
• Encapsulate state transitions with
  – Tentative operations
    • Submit / Accept / Reject
  – Internally it uses
    • the currently known state
    • risk mitigation logic

Service Autonomy

• Give it a memory (Partner state machine)
  – The relevant last known states of its remote partners
  – Required to properly implement risk mitigation logic
• This allows for
  – Stale, locally stored, state
  – Out of order events
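A minimal Python sketch of the Blog Comment state machine with a partner memory might look as follows. The version numbers used to discard out-of-order events, the "Closed" post state and all class and method names are assumptions introduced for illustration; the demo code from the talk is not reproduced here.

```python
from enum import Enum


class CommentState(Enum):
    SUBMITTED = "Submitted"
    ACCEPTED = "Accepted"
    REJECTED = "Rejected"


class PartnerMemory:
    """Last known state of remote partners, stored locally (may be stale)."""

    def __init__(self):
        self._known = {}  # partner id -> (version, state)

    def observe(self, partner_id, version, state):
        # Keep only the newest version, so late (out-of-order) events are ignored.
        current = self._known.get(partner_id)
        if current is None or version > current[0]:
            self._known[partner_id] = (version, state)

    def state_of(self, partner_id):
        known = self._known.get(partner_id)
        return known[1] if known else None


class BlogComment:
    """Stand-alone state machine that decides from local and remembered state only."""

    def __init__(self, post_id, partners):
        self.post_id = post_id
        self.partners = partners
        self.state = CommentState.SUBMITTED

    def accept(self):
        """Tentative operation: returns True only if the transition was applied."""
        if self.state is not CommentState.SUBMITTED:
            return False
        # Risk mitigation: refuse comments on a post we believe is closed.
        if self.partners.state_of(self.post_id) == "Closed":
            self.state = CommentState.REJECTED
            return False
        self.state = CommentState.ACCEPTED
        return True

    def reject(self):
        """Tentative operation: only a submitted comment can be rejected."""
        if self.state is not CommentState.SUBMITTED:
            return False
        self.state = CommentState.REJECTED
        return True


memory = PartnerMemory()
memory.observe("post-1", version=2, state="Closed")
memory.observe("post-1", version=1, state="Published")  # arrives late, ignored
comment = BlogComment("post-1", memory)
accepted = comment.accept()
```

The comment never calls the blog post service: it decides from what it last heard, which is exactly what keeps it autonomous when partners are slow or down.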

Apology based computing

• Eventual Consistency presents some challenges to the UI
• Users need to be informed that the system
  – Is working on it, instead of waiting for the result
• The system will get back to them later
  – In case of failure or success
• Task oriented UIs
  – support this better than data oriented UIs

RELATIONSHIP TO CQRS

Haven’t I seen some of this before?

Similar to CQRS*

[Diagram: Browser, Web role, Worker role. Command: Submit New Blog Post (Request/Reply). Query: View Recent Blog Posts (Request/Reply). Other labels: Publish, Events]

* Command Query Responsibility Segregation
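The split between the command and the query in that diagram can be sketched in Python as below. The class names, the direct function call standing in for the event channel, and the `limit` parameter are assumptions for illustration only.

```python
class ReadModel:
    """Query side: a denormalized list of recent posts, updated only by events."""

    def __init__(self, limit=3):
        self._limit = limit
        self._recent = []

    def on_blog_post_submitted(self, event):
        # Newest first; trim anything beyond the configured limit.
        self._recent.insert(0, event["title"])
        del self._recent[self._limit:]

    def view_recent_blog_posts(self):
        return list(self._recent)


class CommandHandler:
    """Command side: validates the command and publishes the resulting event."""

    def __init__(self, publish):
        self._publish = publish

    def submit_new_blog_post(self, title):
        if not title.strip():
            raise ValueError("a blog post needs a title")
        self._publish({"type": "BlogPostSubmitted", "title": title})


read_model = ReadModel(limit=2)
# A direct call stands in for the event channel between the two sides.
commands = CommandHandler(publish=read_model.on_blog_post_submitted)
for title in ["First", "Second", "Third"]:
    commands.submit_new_blog_post(title)
recent = read_model.view_recent_blog_posts()
```

The query never touches the command side; it only reads what earlier events have already written into the read model.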

It happened, live with it

[Diagram: Browser, Web role, Worker role. Event: Blog Post Submitted (One way). Query: View Recent Blog Posts (Request/Reply). Other labels: Event Stream, Events]

Open for extension

[Diagram: Browser, Web role, Worker role. Event: Blog Post Submitted (One way). Query: View Recent Blog Posts (Request/Reply). Other labels: Event Stream, Extension Points]

ANSWERING THE CHALLENGES

How does all of this solve the design challenges in the cloud?

Answering the challenges

• Latency
  – Event consumers listen to what’s happening
  – They react to events when they receive them
  – Time between event occurrence and processing
    • Is probably irrelevant for the event consumer
  – As all information is encapsulated inside the event
• Take into account that events may arrive out of order
  – Risk mitigation strategies required
  – Based on local state and partner state machine

Answering the challenges

• Availability
  – If the event channel technology is durable
    • Like Azure storage queues
  – Event consumers can go down without impact
    • They will receive the events once they come back up again

Answering the challenges

• Availability
  – The absence of events in the stream
    • Can be detected by stream event processing
  – This can result in a new event
    • Triggering a self healing process
    • Or human intervention
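Detecting the absence of events can be sketched as a small standing query in Python. The heartbeat event shape, the `HeartbeatMissing` event name and the function signature are hypothetical; a real deployment would express this as a query in the stream processing engine instead.

```python
def missing_heartbeats(heartbeats, expected_sources, window, now):
    """Report sources that emitted no event within the last `window` time units.

    heartbeats: iterable of (source, timestamp) tuples, in any order.
    Returns a list of new events, one per silent source.
    """
    last_seen = {}
    for source, timestamp in heartbeats:
        last_seen[source] = max(timestamp, last_seen.get(source, timestamp))

    alerts = []
    for source in sorted(expected_sources):
        seen = last_seen.get(source)
        if seen is None or now - seen > window:
            # The silence itself becomes an event that others can subscribe to.
            alerts.append({"type": "HeartbeatMissing", "source": source, "last_seen": seen})
    return alerts


stream = [("worker-1", 100), ("worker-2", 95), ("worker-1", 130), ("worker-2", 98)]
alerts = missing_heartbeats(
    stream, {"worker-1", "worker-2", "worker-3"}, window=30, now=140
)
```

A self healing process or a human operator would then subscribe to `HeartbeatMissing` like any other event.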

Answering the challenges

• No distributed transactions
  – Compensating events
    • Compensation logic, local transactions
  – Humans can subscribe to them as well
    • To fix the issues manually

TECHNOLOGY MAPPING

What technology could I use to implement this…

Event channel

• Azure storage queues
  – Durable
    • might lead to dead queues though
    • Queue management required
  – Supports queue-peek-lock
    • Similar to local transactions
• .NET Service Bus message buffers
  – Temporarily durable
    • will disappear if unused
  – Supports queue-peek-lock
    • Similar to local transactions
• WCF’s relay bindings
  – Can reach outside of the cloud
    • Even across NAT & firewalls
  – No transaction support
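To show why peek-lock behaves "similar to local transactions", here is an in-memory Python sketch of the semantics. It is not the Azure storage API: the class `PeekLockQueue`, its methods and the explicit `now` parameter (used instead of a real clock to keep the example deterministic) are all assumptions.

```python
import itertools


class PeekLockQueue:
    """In-memory sketch of peek-lock semantics (illustration only)."""

    def __init__(self, visibility_timeout=30.0):
        self._timeout = visibility_timeout
        self._messages = {}            # message id -> (body, visible_at)
        self._ids = itertools.count(1)

    def put(self, body, now=0.0):
        self._messages[next(self._ids)] = (body, now)

    def get(self, now):
        """Lock the oldest visible message and hide it until the timeout expires."""
        for message_id, (body, visible_at) in self._messages.items():
            if visible_at <= now:
                self._messages[message_id] = (body, now + self._timeout)
                return message_id, body
        return None

    def delete(self, message_id):
        """Acknowledge successful processing; the message is removed for good."""
        self._messages.pop(message_id, None)


queue = PeekLockQueue(visibility_timeout=30.0)
queue.put("BlogPostSubmitted", now=0.0)
first = queue.get(now=1.0)     # consumer A locks the message, then crashes
hidden = queue.get(now=10.0)   # still locked: no other consumer sees it
retry = queue.get(now=40.0)    # lock expired: the message is visible again
queue.delete(retry[0])         # consumer B finishes and acknowledges
empty = queue.get(now=100.0)
```

A message is only removed once a consumer explicitly deletes it, so a role that goes down mid-processing does not lose the event; it simply reappears after the timeout.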

Event processing engine

• Microsoft StreamInsight
  – Stream event processing
  – Complex event processing
• .NET Service Bus used to* have routers
  – Ideal for forwarding events
    • Supported any / all forwarding
  – *Rumored to come back soon

Event processing engine

• Open source to the rescue
• NServiceBus
  – Library to implement service buses
  – Routing scenarios
    • Publisher (All)
    • Distributor (Any)
  – Simple event processing
    • Message handlers
  – Complex event processing
    • Saga pattern
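The difference between the two routing scenarios, Publisher (All) and Distributor (Any), can be sketched in Python. This is not the NServiceBus API; the class names are hypothetical, and round-robin is a simplifying assumption standing in for distribution based on worker availability.

```python
import itertools


class Publisher:
    """'All' routing: every subscriber receives a copy of each event."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def publish(self, event):
        for handler in self._subscribers:
            handler(event)


class Distributor:
    """'Any' routing: each message goes to exactly one worker (round-robin here)."""

    def __init__(self, workers):
        self._workers = itertools.cycle(workers)

    def dispatch(self, message):
        next(self._workers)(message)


audit_log, search_index = [], []
publisher = Publisher()
publisher.subscribe(audit_log.append)
publisher.subscribe(search_index.append)
publisher.publish("BlogPostSubmitted")

worker_a, worker_b = [], []
distributor = Distributor([worker_a.append, worker_b.append])
for job in ["job-1", "job-2", "job-3"]:
    distributor.dispatch(job)
```

Publishing fans an event out to every interested consumer, while distributing spreads work across identical workers so that each message is handled once.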

DEMO

Can you show me an example?

Architecture

[Diagram: multiple Web and Worker roles (Azure Roles); Distributor (Any) and Publisher (All) (NServiceBus); CEP (StreamInsight); Queue Storage; Table Storage]

IMPLEMENTATION DETAILS

How the hell did you do that!

NHibernate

• Interaction with Azure is through REST APIs
  – Manually crafting requests
  – Manually parsing responses
  – Is cumbersome
• Co-created an NHibernate driver
  – To do the heavy lifting of interacting with table storage
  – Provides persistence ignorance

NHIBERNATE DRIVER

Walking through the code…

NServiceBus

• Performs messaging on Queue Storage
  – Client API ( IBus )
  – Server API ( IHandleMessages<T> )
  – Pub/Sub with durable subscriptions
  – Distribution based on worker availability
• Uses NHibernate and the Azure storage driver internally
  – For all its storage needs

NServiceBus

• Custom extensions
  – Azure Queue Transport
  – Subscription storage
  – Worker availability management
  – Saga support (not yet finished)

NSERVICEBUS EXTENSIONS

Walking through the code…

Modules

• In order to cut costs
• Modules loaded at runtime
  – From blob storage
  – For all components
  – Allows for dynamic views, message handlers and CEP queries
  – Plan to switch to MEF
• StructureMap as IOC
  – Wires it all together

CMS MODULE WORKER

Walking through the code…

StreamInsight

[Diagram: CEP Application at Runtime. Event generators (Devices, Sensors; Web & Worker Roles; Event stores & Databases) send Events through Input Adapters into the CEP Engine, which runs Standing Queries and can use Static reference data (C_ID, C_NAME, C_ZIP). Output Adapters pass Events on to Event consumers (Event Stores & Databases; Pagers & Monitoring devices; Web & Worker roles; Monitoring Systems)]

STREAMINSIGHT EXTENSIONS

Walking through the code…

Web Role

• ASP.NET MVC
• Heavy use of jQuery
  – Asynchronous semantics
  – WYSIWYG style
  – Extremely apology based
• NHibernate’s second level cache on by default
  – Effectively queries memory most of the time
  – Synchronization through events of course
• Switching to IronRuby as view engine
  – Views stored in blob storage
  – Allows for runtime customization

CMS MODULE WEB

Walking through the code…

RUNTIME

Walking through the code…

LET’S RECAP

We’re finally there!

ANY QUESTIONS?

Would this guy still have a social life?

Thank you for coming!

Spread the word! www.azug.be