We're All Distributed Systems Developers Now By Aaron Stannard, Founder & CEO Petabridge


Page 1: We're all distributed systems devs now: a crash course in distributed programming

We're All Distributed Systems Developers Now

By Aaron Stannard, Founder & CEO

Petabridge

Page 4: We're all distributed systems devs now: a crash course in distributed programming

[Diagram: User Requests reach a Load Balancer that fronts five WebServer instances. The PRIVATE ZONE holds SQL (Master), SQL (Slave), and an OLAP store for Internal Admin, Reports, and BI.]

Page 5: We're all distributed systems devs now: a crash course in distributed programming

[Diagram: a Load Balancer fronts five WebServer instances, with OLAP for Internal Admin, Reports, and BI. The diagram now shows two SQL (Master) / SQL (Slave) pairs and the PRIVATE ZONE.]

Page 6: We're all distributed systems devs now: a crash course in distributed programming

Obvious Solution: Sharding

[Diagram: five shards, each with one Master and two Slaves. Three Clients reach the shards through a single Coordinator.]
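A minimal sketch of coordinator-based routing, assuming hash-modulo placement (the slide does not say how keys are mapped to shards). The `Coordinator` class and its methods are hypothetical names, and each shard is an in-memory dict standing in for a master/slave group.

```python
import hashlib


class Coordinator:
    """Routes each key to one of a fixed number of shards (illustrative only)."""

    def __init__(self, shard_count):
        # Each dict stands in for one master/slave group.
        self.shards = [dict() for _ in range(shard_count)]

    def shard_for(self, key):
        # Use a stable hash so the same key always maps to the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, value):
        self.shards[self.shard_for(key)][key] = value

    def get(self, key):
        return self.shards[self.shard_for(key)].get(key)
```

In this sketch every request passes through the coordinator, and changing `shard_count` remaps most keys because placement depends on the modulo.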

Page 7: We're all distributed systems devs now: a crash course in distributed programming

Brittle

[Diagram: the sharded layout again (five shards of one Master and two Slaves, a Coordinator, and three Clients), now with four additional Masters.]

Page 8: We're all distributed systems devs now: a crash course in distributed programming

Scenario 2: Real-time User Interactivity

User generates event

Has the user produced events 0-3?

Yes: send message back to user

No: wait for more events
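The check in this flowchart could be sketched as follows. `EventTracker` and `REQUIRED_EVENTS` are hypothetical names, and the sketch keeps all state in one process, which sidesteps the distribution problems the talk goes on to discuss.

```python
REQUIRED_EVENTS = frozenset({0, 1, 2, 3})


class EventTracker:
    """Remembers which events each user has produced (single-process sketch)."""

    def __init__(self):
        self._seen = {}  # user_id -> set of event ids seen so far

    def record(self, user_id, event_id):
        """Store the event and return True once events 0-3 have all been seen."""
        seen = self._seen.setdefault(user_id, set())
        seen.add(event_id)
        return REQUIRED_EVENTS <= seen
```

A caller would send the message back to the user the first time `record` returns `True`, and otherwise keep waiting for more events.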

Page 9: We're all distributed systems devs now: a crash course in distributed programming

Obvious Solution: Read-after-Write

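[Diagram: the USER sends Events 0...3 through the HTTP LOAD BALANCER to the Web Tier and Data Tier. Each event is written and the history is read back: after event 0 the read shows 0; after event 1 it shows 0,1; after event 2 it shows 0,1,2; after event 3 it shows 0,1,2,3, and the MESSAGE goes back to the user.]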

Page 10: We're all distributed systems devs now: a crash course in distributed programming

Reality

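[Diagram: the USER sends Events 0...3 through the HTTP LOAD BALANCER to the Web Tier and Data Tier. The events are split up: 0 and 3 are grouped together, as are 1 and 2, and the corresponding reads come back uncertain (0?? 3?? and 1?? 2??).]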

Page 11: We're all distributed systems devs now: a crash course in distributed programming

Distributed Systems 101

Distributed Systems Theories and Concepts

Decentralized Architectures

Event & Message Driven Programming

Stateful Applications

CAP Theorem

Fault & Resource Isolation

Page 14: We're all distributed systems devs now: a crash course in distributed programming

Elastic (Leave)

[Diagram: cluster of nodes A, B, C, D, E, and F, with F leaving.]

All other nodes are notified of F leaving

Page 15: We're all distributed systems devs now: a crash course in distributed programming

Recover from Failures

[Diagram: cluster of nodes A, B, C, D, E, and F. F is marked (Critical State) and D is marked (takeover).]

All other nodes are notified of failure

Page 16: We're all distributed systems devs now: a crash course in distributed programming

Availability through Replication

[Diagram: cluster of nodes A through F, with A (replica 1), D (replica 2), and F (replica 3).]

Page 17: We're all distributed systems devs now: a crash course in distributed programming

Event and Message Driven Programming

RPC / Web Service Call

Front End to Application Server: HTTP POST ....

Application Server to Front End: HTTP 201 ....

Message Passing

Front End to Application Server: Serialized Message

Application Server to Front End: 0-N Response Messages

Page 18: We're all distributed systems devs now: a crash course in distributed programming

Properties of Messages

Always comprised of two parts: payload (data) and reply-to address

Always asynchronous

Can be serialized and stored

Can be ordered and re-ordered (deferrals)

Can be forwarded and delegated

Can be received by multiple parties
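As a rough illustration of the first and third properties, a message can be modeled as an immutable value with a payload and a reply-to address that survives a round trip through serialization. The `Message` class, its field names, and the choice of JSON are assumptions made for this sketch, not something the slides specify.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Message:
    payload: dict   # the data being sent
    reply_to: str   # where any response messages should go

    def serialize(self) -> str:
        # A serialized message can be stored, queued, or forwarded as-is.
        return json.dumps(asdict(self), sort_keys=True)

    @staticmethod
    def deserialize(raw: str) -> "Message":
        data = json.loads(raw)
        return Message(payload=data["payload"], reply_to=data["reply_to"])
```

Because the reply-to address travels with the message, an intermediary can forward the serialized form untouched and the final receiver can still answer the original sender.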

Page 19: We're all distributed systems devs now: a crash course in distributed programming

Messaging Patterns

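[Diagram: four messaging patterns between nodes. Broadcast: one node sends to several nodes. Proxy: a node forwards a message on to another node. Pub-sub: nodes subscribe (subs) and then receive messages (msg). One-way: one node sends to another node.]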
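Of these patterns, pub-sub is perhaps the easiest to show in a few lines. The sketch below is an in-process stand-in with hypothetical names (`Bus`, `subscribe`, `publish`); a real system would deliver messages over the network and asynchronously.

```python
from collections import defaultdict


class Bus:
    """In-process publish/subscribe: handlers register per topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of handlers

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        """Deliver the message to every subscriber and return how many got it."""
        handlers = list(self._subscribers[topic])
        for handler in handlers:
            handler(message)
        return len(handlers)
```

The publisher never names its receivers, so the same message can be received by zero, one, or many parties.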

Page 20: We're all distributed systems devs now: a crash course in distributed programming

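Messaging Protocols

[Diagram: cluster of nodes A through F, with replicas B (R1), C (R2), and E (R3). A Write arrives; "Can we accept?" is asked twice and answered "Yes" twice; then COMMIT is shown three times, once per replica.]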

Page 21: We're all distributed systems devs now: a crash course in distributed programming

Gossip: How Nodes Discover Each Other

[Diagram: seed node A with nodes B and C.]

1. Join

2. Share gossip about other nodes

3. Join

4. Share gossip about other nodes

5. A --> C: did you know about B?

5. A --> B: did you know about C?

6. Connect
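The join sequence can be approximated with a toy membership model in which every node keeps a set of known node names. The `Node` class and its methods are hypothetical, and real gossip protocols exchange much richer state than a set of names.

```python
class Node:
    """Toy cluster member that tracks which other nodes it knows about."""

    def __init__(self, name):
        self.name = name
        self.known = {name}

    def join(self, seed):
        # Steps 1 and 3: contact the seed, which learns about the newcomer.
        seed.known.add(self.name)
        # Steps 2 and 4: the seed shares its gossip about other nodes.
        self.known |= seed.known

    def gossip_to(self, other):
        # Step 5: tell another node about every member we know of.
        other.known |= self.known
```

In this model, after B and C both join through seed A, B has not yet heard of C; one round of gossip from A closes that gap, after which B and C can connect directly.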

Page 23: We're all distributed systems devs now: a crash course in distributed programming

Stateful Apps Serve Results from Memory

[Diagram: Web Server (Stateless), App Server (Stateful), and Database Server (Stateful).]

1. Request

2. Get or Update

3. Response

4. Response

Async writes, async reads
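One way to picture a stateful app server is as an in-memory map that answers requests directly and persists changes later. In the sketch below the database is a plain dict, `StatefulAppServer` is a hypothetical name, and the asynchronous write is simulated by an explicit `flush` call rather than a background task.

```python
class StatefulAppServer:
    """Serves reads and updates from memory; persistence happens later."""

    def __init__(self, database):
        self._db = database    # stand-in for the database server
        self._state = {}       # in-memory state used to answer requests
        self._pending = []     # writes that have not been persisted yet

    def get(self, key):
        # Load from the database only the first time a key is needed.
        if key not in self._state and key in self._db:
            self._state[key] = self._db[key]
        return self._state.get(key)

    def update(self, key, value):
        # Respond from memory immediately; persist later.
        self._state[key] = value
        self._pending.append((key, value))

    def flush(self):
        # Simulates the asynchronous write to the database.
        while self._pending:
            key, value = self._pending.pop(0)
            self._db[key] = value
```

The trade-off in this sketch is that any update still sitting in `_pending` is lost if the process dies before `flush` runs.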

Page 24: We're all distributed systems devs now: a crash course in distributed programming

Fastest Response Time?

[Diagram: an App Server (Stateless), an App Server (Stateful) with its State, and a Database Server (Stateful), handling Request / Response.]

Which of these two service architectures will produce the fastest response time for the same data?

Page 26: We're all distributed systems devs now: a crash course in distributed programming

State Makes Protocols Work

At Most Once (Sender to Receiver, no State): message may be lost

At Least Once (Sender to Receiver, State on one side): message may be duplicated, message ordering issues

Exactly Once (Sender to Receiver, State on both sides): message ordering issues, expensive
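To make the role of state concrete, here is a small simulation: the sender keeps unacknowledged messages and retries (at-least-once), while the receiver remembers processed message ids so that duplicates have no effect. `Sender`, `Receiver`, and the `drop_acks` knob for simulating lost acknowledgements are all hypothetical.

```python
class Receiver:
    def __init__(self):
        self.processed_ids = set()  # receiver-side state for de-duplication
        self.log = []

    def receive(self, msg_id, payload):
        if msg_id not in self.processed_ids:
            self.processed_ids.add(msg_id)
            self.log.append(payload)
        return msg_id  # acknowledgement


class Sender:
    def __init__(self, receiver, drop_acks=0):
        self.receiver = receiver
        self.unacked = {}            # sender-side state: messages awaiting an ack
        self._drop_acks = drop_acks  # number of acks to "lose" in this simulation
        self.attempts = 0

    def send(self, msg_id, payload, max_attempts=5):
        self.unacked[msg_id] = payload
        for _ in range(max_attempts):
            self.attempts += 1
            ack = self.receiver.receive(msg_id, payload)
            if self._drop_acks > 0:
                self._drop_acks -= 1  # the ack was lost, so retry
                continue
            if ack == msg_id:
                del self.unacked[msg_id]
                return True
        return False
```

Without `unacked` the sender could not retry, and without `processed_ids` each retry would be applied again as a duplicate.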

Page 27: We're all distributed systems devs now: a crash course in distributed programming

CAP Theorem

C: Consistency

P: Partition Tolerance

A: Availability

CAP Gradient - have to trade off between all three.

Page 28: We're all distributed systems devs now: a crash course in distributed programming

CAP Terminologies

Consistency: All nodes see the same data at the same time

Availability: Guarantee that every request receives an explicit response

Partition Tolerance: System is able to continue despite arbitrary partitioning due to network failures

Page 29: We're all distributed systems devs now: a crash course in distributed programming

CAP Trade-offs

Want higher consistency?

Be less available

More latency

Or run on fewer machines

Want higher availability?

Be less consistent

Potential for stale reads

Run on more machines

Want more partition tolerance?

Design a way for nodes to work when disconnected (This is hard)

Requires consistency / availability compromises

Page 30: We're all distributed systems devs now: a crash course in distributed programming

Highest Consistency?

[Diagram: App Server (Stateful) nodes handling a write in two different ways.]

1. Write {X}

2. Accept {X}? (asked of two other nodes)

3. Agree (from both)

4. Commit {X}

OR

1. Write {X}

2. Commit {X}

3. Notify: {X} committed

The two approaches sit on a scale running from Greater Availability to Greater Consistency.
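The two write paths can be compared in a small simulation. `Replica`, `write_with_agreement`, and `write_then_notify` are hypothetical names, and a replica's `reachable` flag stands in for a network partition.

```python
class Replica:
    def __init__(self, reachable=True):
        self.reachable = reachable
        self.value = None

    def accept(self, value):
        # An unreachable replica cannot agree to anything.
        return self.reachable

    def commit(self, value):
        if self.reachable:
            self.value = value


def write_with_agreement(value, local, peers):
    """Ask every peer to accept first; commit only if all of them agree."""
    if not all(peer.accept(value) for peer in peers):
        return False  # the write is refused: consistent, but not available
    for replica in [local, *peers]:
        replica.commit(value)
    return True


def write_then_notify(value, local, peers):
    """Commit locally right away, then notify the peers."""
    local.commit(value)
    for peer in peers:
        peer.commit(value)  # unreachable peers miss the update and go stale
    return True
```

With one peer unreachable, the first function rejects the write and no replica changes, while the second succeeds but leaves that peer holding a stale value.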

Page 31: We're all distributed systems devs now: a crash course in distributed programming

Consistency vs. Availability

High Availability

Low latency

Multiple nodes all able to handle same request

Less consistent!

High Consistency

Multiple nodes all agree on what the current "state" of something is

More predictable behavior

Less available

Higher latency

Page 32: We're all distributed systems devs now: a crash course in distributed programming

Fault and Resource Isolation with Microservices

Page 33: We're all distributed systems devs now: a crash course in distributed programming

WebCrawler Microservices

[Diagram: a WebCrawler cluster organized by role, with a Lighthouse Role that the services contact to join the cluster and receive gossip.

All Web Roles: two instances of WebCrawler.Web (ASP.NET MVC, SignalR), Cluster Role: Web. Labeled "Run jobs, get progress reports".

All Tracker Roles: two instances of WebCrawler.TrackingService (Windows Service), Cluster Role: Tracker.

All Crawler Roles: two instances of WebCrawler.CrawlService (Windows Service), Cluster Role: Crawler. Labeled "Cluster-deploy processing hierarchies".

The diagram marks three of its elements Stateless and one Stateful.]

Page 34: We're all distributed systems devs now: a crash course in distributed programming

WebCrawler Network Topology

[Diagram: cluster topology with nodes A [Lighthouse], B [Crawler], C [WEB], D [WEB], E [Crawler], and F [Tracker].]

Page 35: We're all distributed systems devs now: a crash course in distributed programming

Try to make CPU / Memory-intensive tasks into stateless services

Page 36: We're all distributed systems devs now: a crash course in distributed programming

Stateful services should increase CPU / memory utilization slowly