>>> 5 must-have Patterns for your web-scale Microservices @aliostad Ali Kheyrollahi, ASOS

Voxxed Days Thesaloniki 2016 - 5 must have patterns for your web-scale microservice


> stackoverflow

> £1.5 bln

global fashion destination

> 35% every year

/// ASOS in numbers

2016 Turnover → £1.5 bln

Active Customers → 12 M

New Products / wk → 4 k

Unique Visits / mo → 123 M

Page Views / day → 95 M

Platform Teams → 40

/// Microservices Architecture

/// why microservices

> Scaling people, not the solution

> Decentralising decision centres => Agility

> Frequent deployment => Agility

> Reduced complexity of each ms (Divide and Conquer) => Agility

> Overall solution complex but ...

/// anecdote

Often you can measure your success in implementing Microservice Architecture not by the number of services you build, but by the number you decommission.

/// microservices vs soa

| | SOA | Microservices |
| Main Goal | Architectural Decoupling | Agility |
| Audience | Mainly Architecture | Business (Everyone) |
| Set out to solve | Architectural Coupling | Scaling People, Frequent Deployment |
| Organisational Structure Impact | Minimal | Huge |
| Service Cardinality | Usually up to a dozen | >40 (Commonly >100) |
| When to do | Always | teams > ~5* |

* Debatable. There are articles and discussions on this very topic.

/// microservice challenges

> Very difficult to build a complete mental picture of solution

> When things go wrong, need to know where before why

> Potentially increased latency

> Performance outliers intractable to solve

> A complete mind-shift requiring a new operating model

/// probability distribution

[Figure: probability distribution of response time. x-axis: Response Time, y-axis: Probability]

/// performance outliers

Microservice A: 99th Percentile = 500ms
Microservice B: 99th Percentile = 500ms

| | A | B | Total |
| <1s | 99% | 99% | 98.01% |
| >500ms | 1% | 99% | 0.99% |
| >500ms | 99% | 1% | 0.99% |
| >1s | 1% | 1% | 0.01% |
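
The figures above can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes the two calls are made in sequence and that their latencies are independent, which real services only approximate; the function names are made up for illustration.

```python
def chain_outcomes(p_a_slow: float, p_b_slow: float) -> dict:
    """Joint probabilities for two independent calls made one after the other."""
    p_a_fast = 1 - p_a_slow
    p_b_fast = 1 - p_b_slow
    return {
        "both_fast": p_a_fast * p_b_fast,    # total < 1s
        "only_a_slow": p_a_slow * p_b_fast,  # total > 500ms
        "only_b_slow": p_a_fast * p_b_slow,  # total > 500ms
        "both_slow": p_a_slow * p_b_slow,    # total > 1s
    }


def p_any_slow(p_slow: float, n_services: int) -> float:
    """Chance that at least one of n independent calls is an outlier."""
    return 1 - (1 - p_slow) ** n_services


outcomes = chain_outcomes(0.01, 0.01)
print({name: f"{p:.2%}" for name, p in outcomes.items()})
print(f"{p_any_slow(0.01, 10):.2%}")
```

Under the same independence assumption, a request that fans out to ten services each with a 1% outlier rate hits at least one outlier roughly 9.6% of the time, which is why per-service percentiles understate what the customer sees.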

/// ActivityId Propagator

/// ActivityId

> Every customer request matters

> Every request is unique

> Every request creates a chain (or tree) of calls/events

> Activities are correlated

> You need an ActivityId (or CorrelationId) to link calls/events

/// ActivityId

[Diagram: a Microservice receives a call carrying an Id, keeps the Id in Thread Local Storage, and attaches it to calls to other APIs and to events]

/// ActivityId - HTTP

Request

GET /api/v2/foo HTTP/1.1
host: foo.com
activity-id: 96c5a1f106ce468ebcca8303ed7464bd

Response

200 OK
activity-id: 96c5a1f106ce468ebcca8303ed7464bd
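
One way to propagate the id inside a service is sketched below. It is not the speaker's implementation: it assumes Python, uses `contextvars` as the counterpart of thread-local storage, and the helper names `begin_request` and `outgoing_headers` are hypothetical. Only the `activity-id` header name comes from the slide.

```python
import contextvars
import uuid
from typing import Optional

ACTIVITY_HEADER = "activity-id"  # header name as shown on the slide

# Holds the id for the request currently being processed.
_activity_id = contextvars.ContextVar("activity_id", default=None)


def begin_request(headers: dict) -> str:
    """Reuse the caller's id if present, otherwise mint one at the edge."""
    lowered = {key.lower(): value for key, value in headers.items()}
    activity_id = lowered.get(ACTIVITY_HEADER) or uuid.uuid4().hex
    _activity_id.set(activity_id)
    return activity_id


def outgoing_headers(extra: Optional[dict] = None) -> dict:
    """Headers for a downstream call or event, carrying the current id."""
    headers = dict(extra or {})
    current = _activity_id.get()
    if current is not None:
        headers[ACTIVITY_HEADER] = current
    return headers


begin_request({"Activity-Id": "96c5a1f106ce468ebcca8303ed7464bd"})
print(outgoing_headers({"accept": "application/json"}))
```

The same id would also be written into every log line and event so that the whole chain (or tree) of calls can be joined later.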

/// Retry and Timeout Policy

/// Failure

[Diagram: Microservice A (1% chance of failure) calls Microservice B (1% chance of failure). A failed call (X) is followed by a wait (back-off); a second failed call (X) is followed by a longer wait (back-off longer)]
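
A retry with growing back-off, as in the diagram, might look like the sketch below. The attempt count, base delay and jitter are illustrative assumptions rather than values from the talk, and `call_with_retry` is a hypothetical helper name.

```python
import random
import time


def call_with_retry(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run operation(); on failure wait, then wait longer, then give up."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Exponential back-off: 0.1s, 0.2s, 0.4s, ... plus a little jitter
            # so that many callers do not retry in lock-step.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 10))
```

If failures really were independent at 1% per call, three attempts would all fail about once in a million requests. In practice failures cluster, so retries help with transient faults but not with an outage.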

/// Preemptive Timeout

[Diagram: Microservice A calls Microservice B with a short timeout; when the timeout expires (X), A retries, again with a short timeout]

/// Timeout

[Diagram: services A, B and C]

A > B > C

A > B + C

/// Choosing a timeout?

Static => Based on Server SLO

Dynamic => 95th percentile
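
The dynamic option can be sketched as follows: keep a window of recent latencies for a dependency and use their 95th percentile as the next timeout. The nearest-rank percentile method and the 50 ms floor are assumptions made for this example.

```python
def percentile(samples, pct: int):
    """Nearest-rank percentile of a non-empty list (pct is a whole number)."""
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


def dynamic_timeout(recent_latencies_ms, floor_ms=50.0):
    """Timeout for the next call: the observed p95, but never below a floor."""
    return max(floor_ms, percentile(recent_latencies_ms, 95))


window = [40, 42, 45, 47, 50, 52, 55, 60, 65, 70,
          75, 80, 85, 90, 95, 100, 110, 120, 300, 900]
print(dynamic_timeout(window))
```

With a short timeout set this way, about 5% of calls are cut off and retried instead of waiting on an outlier.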

/// IO Monitor

/// Blame Game

“If there is a single place where you can play blame game, instead of collective responsibility, it is in Microservices troubleshooting”

/// Did you say IO??

[Diagram: a Microservice calling a DB, an API and a Cache]

Measure... every time your code goes out of your process

/// Recording Methods

> Explicitly by calling record()

> Asking the library to record a closure

> Aspect-oriented

Java (spf4j)

private static final MeasurementRecorder recorder =
    RecorderFactory.createScalableCountingRecorder(forWhat, unitOfMeasurement, sampleTimeMillis);

… recorder.record(measurement);

.NET (PerfIt)

var ins = new SimpleInstrumentor(new InstrumentationInfo()
{
    Counters = CounterTypes.StandardCounters,
    Description = "test",
    InstanceName = "Test instance",
    CategoryName = TestCategory
});

ins.Instrument(() => Thread.Sleep(100), "test...");

Java and .NET (aspect-oriented)

@PerformanceMonitor(warnThresholdMillis=1, errorThresholdMillis=100, recorderSource = RecorderSourceInstance.Rs5m.class)

[PerfItFilter("PerfItTests", InstanceName = "Test")]
public string Get()
{
    return Guid.NewGuid().ToString();
}
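
For readers outside Java and .NET, the three recording styles can be put side by side in one small class. This is an illustrative sketch only; `Recorder` and its methods are hypothetical names and do not mirror the spf4j or PerfIt APIs.

```python
import functools
import time


class Recorder:
    """Collects (name, elapsed_ms) samples for calls that leave the process."""

    def __init__(self):
        self.samples = []

    def record(self, name, elapsed_ms):
        """Style 1: the caller measures and records explicitly."""
        self.samples.append((name, elapsed_ms))

    def instrument(self, name, closure):
        """Style 2: the recorder times a closure handed to it."""
        start = time.perf_counter()
        try:
            return closure()
        finally:
            self.record(name, (time.perf_counter() - start) * 1000)

    def monitored(self, name):
        """Style 3: declarative, applied as a decorator."""
        def decorate(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                return self.instrument(name, lambda: func(*args, **kwargs))
            return wrapper
        return decorate


recorder = Recorder()


@recorder.monitored("cache.get")
def read_cache(key):
    return {"greeting": "hello"}.get(key)


read_cache("greeting")
print(recorder.samples)
```

The decorator form keeps timing code out of the business logic, which is the appeal of the aspect-oriented style.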

/// Publishing Methods

> Local file (various to logstash)

> TCP and HTTP (many, to zipkin, influxdb)

> UDP (statsd, collectd to graphite, logstash)

> Raising Kernel-level event (Windows ETW)

> Local communication (statsd)
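
As an example of the UDP route, a timing sample can be sent to a statsd agent as a single datagram in the `name:value|ms` line format. The metric name, host and port below are assumptions (8125 is the port statsd agents conventionally listen on), and the helper names are hypothetical.

```python
import socket


def statsd_timing(name: str, elapsed_ms: float) -> bytes:
    """Format one timing sample in statsd's line protocol."""
    return f"{name}:{elapsed_ms:g}|ms".encode("ascii")


def publish(payload: bytes, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget: UDP does not block the caller or report delivery."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


print(statsd_timing("foo.db.query", 54.2))
```

Because UDP is fire-and-forget, a missing or slow collector cannot slow down the service being measured, at the cost of occasionally losing samples.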

/// Circuit-Breaker

/// tri-state

> Closed: traffic can flow normally

> Open: traffic does not flow

> Half-open: circuit breaker tests the waters again

[State diagram with states Closed, Open and Half-open; transitions labelled Test, Failure and Wait timeout]

/// Netflix Hystrix

RequestVolumeThreshold

ErrorThresholdPercentage

SleepWindowInMilliseconds

TimeInMilliseconds

NumBuckets
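
To make the three states and the first three settings concrete, here is a simplified, single-threaded sketch. It is not Hystrix: the rolling statistics window (TimeInMilliseconds, NumBuckets) is left out and errors are simply counted since the last reset, and the default values are illustrative.

```python
import time


class CircuitOpenError(Exception):
    """Raised instead of calling the dependency while the circuit is open."""


class CircuitBreaker:
    def __init__(self, request_volume_threshold=20, error_threshold_percentage=50,
                 sleep_window_ms=5000, clock=time.monotonic):
        self.request_volume_threshold = request_volume_threshold
        self.error_threshold_percentage = error_threshold_percentage
        self.sleep_window_ms = sleep_window_ms
        self.clock = clock
        self.state = "closed"
        self.requests = 0
        self.errors = 0
        self.opened_at = 0.0

    def call(self, operation):
        if self.state == "open":
            if (self.clock() - self.opened_at) * 1000 < self.sleep_window_ms:
                raise CircuitOpenError("circuit is open")
            self.state = "half-open"  # wait timeout elapsed: allow one test call
        try:
            result = operation()
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_success(self):
        if self.state == "half-open":
            # The test call worked: close the circuit and start counting afresh.
            self.state = "closed"
            self.requests = 0
            self.errors = 0
        else:
            self.requests += 1

    def _on_failure(self):
        self.requests += 1
        self.errors += 1
        error_pct = 100 * self.errors / self.requests
        tripped = (self.requests >= self.request_volume_threshold
                   and error_pct >= self.error_threshold_percentage)
        if self.state == "half-open" or tripped:
            self.state = "open"
            self.opened_at = self.clock()
```

The volume threshold stops a single early failure from opening the circuit, and the sleep window gives the failing dependency time to recover before it is probed again.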

/// Fallback

> Custom: e.g. serve content from a local cache (status 206)

> Silent: return null/no-data/empty (status 200/204)

> Fail-fast: Customer experience is important (status 5xx)
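
The three fallback styles can be expressed as one wrapper that decides what to return when the protected call fails. This is a sketch under assumptions: the strategy names are taken from the bullets above, 204 and 503 are picked from the status ranges the slide gives, and `with_fallback` is a hypothetical helper.

```python
def with_fallback(operation, strategy="fail-fast", cached=None):
    """Return (status, body) for a call protected by a fallback strategy."""
    try:
        return 200, operation()
    except Exception:
        if strategy == "custom" and cached is not None:
            return 206, cached  # serve possibly stale content from a local cache
        if strategy == "silent":
            return 204, None    # pretend there is simply no data
        return 503, None        # fail-fast: let the caller see the failure


def recommendations():
    raise TimeoutError("recommendation service timed out")


print(with_fallback(recommendations, strategy="custom", cached=["bestsellers"]))
print(with_fallback(recommendations, strategy="silent"))
print(with_fallback(recommendations))
```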

/// Canary and Health Endpoint

/// Health Endpoints

Ping returns a success code when invoked

Canary returns a connectivity status and latency on the service and dependencies

“… none of them invoke any application code”

/// Ping

Request

GET /api/health HTTP/1.1
host: foo.com

Response

200 OK

Response

500 Server Error

/// Canary

Request

GET /api/canary HTTP/1.1
host: foo.com

Response

200 OK

{

[Nested Structure]

}

/// ChirpResult

{ "serviceName": "foo", "latency": "00:00:00.0542172", "statusCode": 200, "isCritical": true }

/// ChirpResult - critical failure

[Diagram: the API depends on two non-critical (NC) services, both returning 200, and one critical (C) service returning 500; the API's canary returns 500]

/// ChirpResult - non-critical failure

[Diagram: the API depends on two non-critical (NC) services, one returning 500 and one returning 200, and one critical (C) service returning 200; the API's canary returns 200]
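
The rule shown by the two diagrams (only a failing critical dependency fails the canary) can be sketched as a small aggregation over results shaped like the ChirpResult example. Treating any status outside 2xx as a failure is an assumption made here, and `canary_status` is a hypothetical name.

```python
def canary_status(chirps) -> int:
    """Overall canary status from a list of ChirpResult-like dicts."""
    for chirp in chirps:
        failed = not 200 <= chirp["statusCode"] < 300
        if failed and chirp["isCritical"]:
            return 500  # a critical dependency is down
    return 200  # everything critical is reachable


critical_failure = [
    {"serviceName": "search", "statusCode": 200, "isCritical": False},
    {"serviceName": "reviews", "statusCode": 200, "isCritical": False},
    {"serviceName": "orders-db", "statusCode": 500, "isCritical": True},
]
non_critical_failure = [
    {"serviceName": "search", "statusCode": 500, "isCritical": False},
    {"serviceName": "reviews", "statusCode": 200, "isCritical": False},
    {"serviceName": "orders-db", "statusCode": 200, "isCritical": True},
]
print(canary_status(critical_failure), canary_status(non_critical_failure))
```

The service names in the sample data are invented for the example. A real canary would still report the failing non-critical dependency in its response body, so the problem stays visible without taking the instance out of rotation.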

/// AOP / Declarative (C#)

[AzureStorageCanary("Foo-AzureStorage-BarDatabaseServer", "config-key-for-cn")]
[SqlCanary("SQL-BazActiveDatabase", null, typeof(SqlConnectionFactory))]
[CanaryEndpointCanary("Dependency-Api", "config-key-for-endpoint")]
public class CanaryController : CanaryBaseController
{
    … // some boilerplate code
}

/// Deep vs Shallow

[Diagram: two APIs, contrasting a “Deep” and a “Shallow” canary check]

/api/canary?deep=false

/// Wrap-up

> If you have more than ~5 teams, consider Microservices

> Logging/Monitoring/Alerting: single most important asset

> Use ActivityId Propagator to correlate (consider zipkin)

> Cloud is a jungle™. Without retry/timeout you won’t survive

> Monitor and measure all calls to external services (blame game)

> Protect your systems with circuit-breakers (and isolation)

> Canary helps you detect connectivity from customer view

Thomas Wood: Daisy Picture

Thomas Au: Thermometer Picture

Torbakhopper: Cables Picture

Dam Picture - Japan

Hsiung: Lights Picture

Health Endpoint in API Design