Microservice Architecture at ASOS

Ali Kheyrollahi (@aliostad), Dave Green (@dav3green)

© 2016 ASOS Plc


ASOS

* Established in 2000
* Global fashion destination for “20-somethings”
* Hires +2500 staff & +250 in tech (700)
* Has grown an average of 35% YoY
* 39th biggest global online retailer

In Numbers

* £1.5b: 2016 turnover
* 12M: active customers
* 4k: new products every week
* 123M: unique visits/month (June)
* 95M: page views (non-peak day)
* 10K: RPS (some services @ peak)

Nick Beighton - CEO from “IT Cascade” slides - Oct 2014

ASOS - A Tech Company

ASOS Stock Price: 2009-2013

More than 2000% growth!

ASOS Stock Price: 2009-2014

Multiple IT project failures

Tech issues disrupting promotions

Overdue features were not delivered

Just blame the IT? Quite the opposite…

Microservices… why?

“Can I haz microservicez?”

“It is all a marketing hoax.”

“I dunno…” The sceptic…

Scaling People not the solution

↓ Complexity of each service, at the cost of ↑ complexity of the overall solution

Frequent and independent Deployments

Decentralising decision centres

Governance and EA are different

Auftragstaktik

Doctrine is the conceptual underpinning of HOW to think and operate effectively; teaching leaders WHAT to think is dogma…

Auftragstaktik encourages commanders to exhibit initiative, flexibility and improvisation while in command…

In what may be seen as surprising to some, Auftragstaktik empowers commanders to disobey orders and revise their effect as long as the intent of the commander is maintained…

Domain Modelling

Operating Model

People

Successful Architecture

ASOS Journey

Two-speed IT

Enterprise Domain

• Predominantly “Buy”
• Integrated at top tier
• Minimal Engineering
• Project-centric

Digital Domain

• Predominantly “Build”
• Drives sales and customer touchpoint
• Product-centric
• Downtime unacceptable

Digital Domains & Platforms

[Diagram labels: This is Domain, This is Platform, Logical Services]

• “Domains” are ASOS’ organisational structure for managing Platform Teams
• Platform Teams look after collections of aligned services. They are accountable for full lifecycle management

Operating Model

Indicative Team Numbers

* 6 Digital Domains
* 19 Platforms
* 35 Dev / Scrum teams
* 24 Solution Architects
* 700+ people in Technology (2,500 in ASOS total)
* Avg +20 people per month over the last year

10K foot view of Application Architecture

Governance

Bottom-Up

Top-Down

Engineering Practices

❤ASOS

ASOS Tech Stack

Principles

* All queries and commands through HTTP API (No ESB-like pseudo-Microservices!)
* Microservices can subscribe to events raised by other Microservices
* Each Microservice owns a business responsibility and defines a clear boundary for communication (APIs and Events)
* They own their data (all access to data through API only)
* Each Microservice is realised in one (sometimes more) physical components
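A minimal sketch of these principles, in Python for brevity (the slides point to a .NET/Azure stack, and this is not ASOS code). Every name here (`EventBus`, `BasketService`, `RecommendationService`, the `item_added` event) is hypothetical. One service owns its data and exposes it only through command and query methods that stand in for HTTP endpoints; a second service builds its own data by subscribing to events instead of reading the first service's store.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventBus:
    """In-memory stand-in for whatever transport carries events between services."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_name: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_name].append(handler)

    def publish(self, event_name: str, payload: dict) -> None:
        for handler in self._subscribers[event_name]:
            handler(payload)


class BasketService:
    """Hypothetical service: owns basket data; others reach it only via its API."""

    def __init__(self, bus: EventBus) -> None:
        self._bus = bus
        self._baskets: Dict[str, List[str]] = defaultdict(list)  # private data store

    def add_item(self, customer_id: str, sku: str) -> None:
        """Command (an HTTP POST in a real service): change state, then raise an event."""
        self._baskets[customer_id].append(sku)
        self._bus.publish("item_added", {"customer_id": customer_id, "sku": sku})

    def get_basket(self, customer_id: str) -> List[str]:
        """Query (an HTTP GET in a real service): return a copy, never the store itself."""
        return list(self._baskets.get(customer_id, []))


class RecommendationService:
    """Hypothetical subscriber: keeps its own data, built from the events it receives."""

    def __init__(self, bus: EventBus) -> None:
        self._seen: Dict[str, List[str]] = defaultdict(list)
        bus.subscribe("item_added", self._on_item_added)

    def _on_item_added(self, event: dict) -> None:
        self._seen[event["customer_id"]].append(event["sku"])

    def skus_seen(self, customer_id: str) -> List[str]:
        return list(self._seen.get(customer_id, []))


bus = EventBus()
baskets = BasketService(bus)
recommendations = RecommendationService(bus)
baskets.add_item("c1", "sku-42")
```

Neither service touches the other's dictionary, so either store could be replaced without the other noticing.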

LMA Platform

Logging

* Logging has not been a strength of Azure (just a text and a log level)
* We use strongly-typed logging provided by ETW
* Logs get transferred to Table Storage by the SLAB sink
* In addition, we store a CorrelationId and an optional EventCode
* Custom code to flow CorrelationId across async statements
* Performance of all APIs and external IO calls gets measured

Application Logging: Microservice Component

[Diagram. Components: API, Application Code, ETW, Listener, SLAB Azure Table Storage Sink, IIS Logs. Flows: To other APIs, Raising Events. Labels on flows: cid, EC.]

Thread.CurrentThread.SetLogicalData(…) [EventSource.ActivityId does not flow over async methods]
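The idea of flowing a CorrelationId across async calls can be illustrated in Python, where `contextvars` plays the role of the logical call context. This is an analogy under stated assumptions, not the ASOS implementation: the `log` function, the `LOG` list, the field names and the `EC-EXAMPLE` code are all hypothetical.

```python
import asyncio
import contextvars
import uuid
from typing import List, Optional

# Ambient correlation id; asyncio keeps it across awaits and copies it into new tasks.
correlation_id = contextvars.ContextVar("correlation_id", default="-")

LOG: List[dict] = []  # stand-in for the real log store


def log(level: str, message: str, event_code: Optional[str] = None) -> None:
    """Write a structured entry; the correlation id comes from the ambient context."""
    LOG.append({
        "level": level,
        "message": message,
        "correlationId": correlation_id.get(),
        "eventCode": event_code,  # optional, as on the slide
    })


async def call_downstream() -> None:
    await asyncio.sleep(0)  # a real IO call would go here
    log("Info", "downstream call finished")


async def handle_request(incoming_cid: Optional[str]) -> str:
    # Reuse the caller's id if one was sent, otherwise start a new one.
    cid = incoming_cid or str(uuid.uuid4())
    correlation_id.set(cid)
    log("Info", "request received")
    await call_downstream()
    log("Error", "something failed", event_code="EC-EXAMPLE")
    return cid


async def main() -> List[str]:
    # Two concurrent requests: each task has its own context, so ids do not mix.
    return await asyncio.gather(handle_request("cid-A"), handle_request(None))


cids = asyncio.run(main())
```

No function passes the id explicitly, yet every entry written while serving a request carries that request's id.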

Performance: Microservice Component

[Diagram. Components: API, Application Code, Perfit Library, WAD (Windows Azure Monitoring Agent), SLAB Azure Table Storage Sink. Flow: Call to Data Stores or other services. Labels on flows: Inst, CPC.]

CPC: Custom Performance Counters
Inst: Instrumentation
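Measuring every API and every external IO call comes down to wrapping each call in a timer. The sketch below shows that pattern in Python; it does not reproduce the Perfit library's API, and `measured`, `MEASUREMENTS` and `load_product` are hypothetical names.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator, List

MEASUREMENTS: List[Dict[str, object]] = []  # stand-in for counters / instrumentation logs


@contextmanager
def measured(operation: str) -> Iterator[None]:
    """Time the wrapped block and record the result even if the block raises."""
    start = time.perf_counter()
    succeeded = False
    try:
        yield
        succeeded = True
    finally:
        MEASUREMENTS.append({
            "operation": operation,
            "elapsed_ms": (time.perf_counter() - start) * 1000.0,
            "succeeded": succeeded,
        })


def load_product(product_id: str) -> dict:
    """Hypothetical API handler: the whole call and the inner IO call are both timed."""
    with measured("api.load_product"):
        with measured("io.product_store"):
            time.sleep(0.01)  # placeholder for a data-store or service call
        return {"id": product_id}


product = load_product("p1")
```

Timing the inner call separately makes it possible to tell a slow dependency from slow application code.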

Woodpecker

For `Pull` (rather than `Push`) metrics

* Queue Depth/Size
* Azure SQL Diagnostics
* Canaries and Health Endpoints
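A pull-based collector can be sketched as a set of probes that are polled on a schedule. This Python illustration makes no claim about how Woodpecker itself is built: `PullCollector` and the two probes are hypothetical, and real probes would query a queue, a database or an HTTP endpoint.

```python
import time
from typing import Callable, Dict, List

Probe = Callable[[], float]


class PullCollector:
    """Polls registered probes and records one reading per probe per run."""

    def __init__(self) -> None:
        self._probes: Dict[str, Probe] = {}
        self.readings: List[dict] = []

    def register(self, name: str, probe: Probe) -> None:
        self._probes[name] = probe

    def run_once(self) -> None:
        for name, probe in self._probes.items():
            reading = {"source": name, "timestamp": time.time()}
            try:
                reading["value"] = probe()
            except Exception as exc:  # a failing probe is itself worth recording
                reading["error"] = repr(exc)
            self.readings.append(reading)


# Hypothetical probes standing in for a queue-depth check and a health endpoint.
pending_orders = ["o1", "o2", "o3"]


def queue_depth() -> float:
    return float(len(pending_orders))


def health_endpoint() -> float:
    raise ConnectionError("health endpoint unreachable")


collector = PullCollector()
collector.register("orders.queue_depth", queue_depth)
collector.register("basket.health", health_endpoint)
collector.run_once()
```

A scheduler would call `run_once` at a fixed interval and ship `readings` to the log pipeline.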

Logsink API

For Native and Web Clients

[Diagram labels: API, Logsink, Channels Config, EventHub]

/logchannels/<channel>

GET /logchannels/channelOne?a=b&c=d HTTP/1.1

200 OK
Content-Type: image/gif

POST /logchannels/channelOne?a=b HTTP/1.1
{ "c": "d" }

202 Accepted

POST /logchannels/_bulk HTTP/1.1
[ { "channel": "…", "payload": {...} } ]

202 Accepted
[ { "status": 202 }, ... ]
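The three request shapes above can be modelled with a small in-memory handler. This Python sketch only mirrors the contract shown on the slide; how Logsink is actually implemented is not stated there. The `handle` function and the `forwarded` list are hypothetical, and two behaviours are assumptions: that a GET carries its payload in the query string, and that a POST merges query-string fields with the JSON body.

```python
import json
from typing import List, Optional, Tuple
from urllib.parse import parse_qsl, urlsplit

forwarded: List[dict] = []  # stand-in for the downstream sink (EventHub on the slide)

PREFIX = "/logchannels/"


def handle(method: str, target: str, body: Optional[str] = None) -> Tuple[int, dict, object]:
    """Return (status, headers, body) for the three request shapes on the slide."""
    url = urlsplit(target)
    if not url.path.startswith(PREFIX):
        return 404, {}, None
    channel = url.path[len(PREFIX):]
    query = dict(parse_qsl(url.query))

    if method == "POST" and channel == "_bulk":
        items = json.loads(body or "[]")
        for item in items:
            forwarded.append({"channel": item["channel"], "payload": item["payload"]})
        return 202, {}, [{"status": 202} for _ in items]

    if method == "GET":
        # Assumption: the query string is the payload. The image reply lets a web
        # page log by requesting an <img>; the bytes here are a placeholder only.
        forwarded.append({"channel": channel, "payload": query})
        return 200, {"Content-Type": "image/gif"}, b"GIF89a"

    if method == "POST":
        # Assumption: query-string fields and JSON body fields are merged.
        payload = {**query, **json.loads(body or "{}")}
        forwarded.append({"channel": channel, "payload": payload})
        return 202, {}, None

    return 405, {}, None


get_reply = handle("GET", "/logchannels/channelOne?a=b&c=d")
post_reply = handle("POST", "/logchannels/channelOne?a=b", '{"c": "d"}')
bulk_reply = handle("POST", "/logchannels/_bulk",
                    '[{"channel": "channelTwo", "payload": {"x": 1}}]')
```

Under these assumptions the GET and the single POST forward the same payload, which matches the parallel examples on the slide.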

ConveyorBelt

Highly available headless cluster shovelling data to Elasticsearch (ES)

[Diagram labels: Sources Config, ConveyorBelt, Elasticsearch]

Sources:

* Performance Counters
* Azure WAD logs
* ETW Logs
* Instrumentation Logs
* IIS Logs
* Woodpecker Outputs (Pull Logs)

~100 GB/day
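Shovelling records into Elasticsearch is usually done in batches through its `_bulk` endpoint, which takes newline-delimited JSON: an action line followed by a document line for each record, with a trailing newline. The Python sketch below only builds such a body; it is not ConveyorBelt's code, and the record fields and the per-source, per-day index naming are assumptions made for illustration.

```python
import json
from typing import Iterable, List


def to_bulk_body(records: Iterable[dict]) -> str:
    """Build a newline-delimited body for Elasticsearch's _bulk endpoint."""
    lines: List[str] = []
    for record in records:
        day = record["timestamp"][:10].replace("-", ".")
        index_name = f"{record['source']}-{day}"  # hypothetical naming scheme
        lines.append(json.dumps({"index": {"_index": index_name}}))  # action line
        lines.append(json.dumps(record, sort_keys=True))              # document line
    # The bulk format requires the body to end with a newline.
    return ("\n".join(lines) + "\n") if lines else ""


# Hypothetical records from two of the sources listed above.
records = [
    {"source": "iislogs", "timestamp": "2016-06-01T10:00:00Z", "status": 200},
    {"source": "etw", "timestamp": "2016-06-01T10:00:01Z",
     "level": "Error", "correlationId": "cid-A"},
]
bulk_body = to_bulk_body(records)
```

Batching keeps the number of HTTP round trips low, which matters at the volume quoted on the slide.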

Monitoring (Demo)

Alerting

[Diagram labels: Elasticsearch, Watch, OAT Spec, EC, Platform Team LMA Support, 1st-2nd line Support]

- xxx EC seen more than 10 times over the last minute. Back-off for 15 minutes
- 90th percentile of API response > 100ms over the last hour... Back-off for an hour
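The two example rules share a shape: a condition over a time window plus a back-off period during which the alert stays quiet. A minimal Python sketch of that shape follows. The `Rule` class and the in-memory data are hypothetical; on the slide the data lives in Elasticsearch, and the nearest-rank percentile is just one common definition.

```python
import math
from typing import Callable, List, Optional, Sequence


def percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile; expects a non-empty sequence."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


class Rule:
    """Fires when `condition(now)` is true, then stays quiet for `backoff_s` seconds."""

    def __init__(self, name: str, condition: Callable[[float], bool], backoff_s: float) -> None:
        self.name = name
        self._condition = condition
        self._backoff_s = backoff_s
        self._quiet_until: Optional[float] = None

    def evaluate(self, now: float) -> bool:
        if self._quiet_until is not None and now < self._quiet_until:
            return False  # still backing off
        if not self._condition(now):
            return False
        self._quiet_until = now + self._backoff_s
        return True


# Hypothetical data: 12 occurrences of one event code at t = 100..111 seconds,
# and ten API response times (ms) standing in for the last hour.
error_times: List[float] = [float(t) for t in range(100, 112)]
latencies_ms: List[float] = [20.0] * 8 + [150.0, 300.0]

too_many_errors = Rule(
    "EC seen more than 10 times over the last minute",
    lambda now: sum(1 for t in error_times if now - 60 < t <= now) > 10,
    backoff_s=15 * 60,
)
slow_api = Rule(
    "90th percentile of API response > 100ms over the last hour",
    lambda now: percentile(latencies_ms, 90) > 100.0,
    backoff_s=60 * 60,
)

fired = [
    too_many_errors.evaluate(120.0),            # condition met
    too_many_errors.evaluate(150.0),            # inside the 15-minute back-off
    too_many_errors.evaluate(120.0 + 15 * 60),  # back-off over, window now empty
    slow_api.evaluate(120.0),
]
```

The back-off stops a persistent fault from paging support once per evaluation cycle.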

Lessons Learnt

* Using any cloud technology? Forget the hype and trust no one: test, measure, adopt/drop, monitor, engage with your provider
* In cloud? Design for failure… “Cloud is a Jungle”
* At this scale, owning your LMA data and process is very important
* Distributed computing is hard; making it geo-distributed is even harder
* Managing platform costs can be difficult
* Network latency & failures add up: understand and optimise time from the user to the data
* Expect to roll your sleeves up: maturity in a lot of areas can be low (platform, tooling, skills, supporting technologies) but is changing rapidly

ASOS is Hiring!

If you are intrigued by what we do, we would love to hear from you!

Just get in touch with us on Twitter: @aliostad @dav3green

Q&A

Thank You!
