Architecting for Failure in a Containerized World Tom Faulhaber Infolace

Architecting for Failure in a Containerized World · 2020. 7. 19. · Architecting for Failure in a Containerized World Tom Faulhaber Infolace. How can container tech help us build


Architecting for Failure in a Containerized World

Tom Faulhaber, Infolace


How can container tech help us build robust systems?


Key takeaway: an architectural toolkit for building robust systems with containers

The Rules

Decomposition

Orchestration and Synchronization

Managing Stateful Apps


Simplicity


Simple means: “Do one thing!”


The opposite of simple is complex

Complexity exists within components

Complexity exists between components

Example: a counter

[Figure: two CounterService instances, each with its own count sequence]

[Figure: the same two CounterService instances behind a Load Balancer]

State + composition = complexity
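The counter example can be sketched in a few lines. This is a minimal illustration, not the talk's own code: `CounterService` keeps its count in process memory, and `RoundRobinBalancer` is a hypothetical stand-in for a load balancer that alternates between instances.

```python
from itertools import cycle


class CounterService:
    """A service that keeps its count in local process memory."""

    def __init__(self) -> None:
        self._count = 0

    def increment(self) -> int:
        self._count += 1
        return self._count


class RoundRobinBalancer:
    """Hypothetical load balancer that alternates between instances."""

    def __init__(self, instances) -> None:
        self._instances = cycle(instances)

    def increment(self) -> int:
        return next(self._instances).increment()


# One instance: callers see a strictly increasing sequence.
single = CounterService()
single_values = [single.increment() for _ in range(6)]

# Two instances behind a balancer: each keeps its own count,
# so callers see repeated values.
balanced = RoundRobinBalancer([CounterService(), CounterService()])
balanced_values = [balanced.increment() for _ in range(6)]

print(single_values)
print(balanced_values)
```

Each instance is simple on its own. The surprise only appears once stateful instances are composed behind a balancer.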


Part 1: Decomposition


Rule: Decompose vertically

[Figure: an App Server with Service #1, Service #2, and Service #3]

[Figure: an App Server on its own]

Rule: Separation of concerns

Example: Logging

[Figure: an App with Core Code, Logging Driver, and Config, and a Logging Server]

[Figure: the same components with a separate Logger and StdOut added]

Aspect-oriented programming
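One reading of the logging example is that the app only writes log lines to standard output, and a separate logger component owns the driver and configuration needed to ship them to the logging server. A minimal sketch of the app side, assuming Python's standard `logging` module; the logger name `app` and the message are illustrative only.

```python
import logging
import sys


def build_logger(name: str = "app") -> logging.Logger:
    """Return a logger that writes only to stdout.

    The app knows nothing about the logging server; shipping the
    lines elsewhere is left to whatever collects the container's stdout.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding duplicate handlers on repeat calls
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
        )
        logger.addHandler(handler)
    logger.propagate = False
    return logger


log = build_logger()
log.info("order accepted id=%s", 42)
```

With this split, changing the logging backend means changing the logger component, not the app.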


Rule: Constrain state


Relational DB

Session Store
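A minimal sketch of constraining state, assuming the request handler holds nothing between calls and all session data lives in a dedicated store. `InMemorySessionStore` is a hypothetical stand-in for an external session store, and `handle_visit` is an illustrative handler, not part of the talk.

```python
from typing import Dict


class InMemorySessionStore:
    """Stand-in for an external session store; for illustration only."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[str, int]] = {}

    def get(self, session_id: str) -> Dict[str, int]:
        # Return a copy so callers cannot mutate the store by accident.
        return dict(self._data.get(session_id, {}))

    def put(self, session_id: str, data: Dict[str, int]) -> None:
        self._data[session_id] = dict(data)


def handle_visit(store: InMemorySessionStore, session_id: str) -> int:
    """Stateless handler: everything it needs comes from the store.

    Note: this read-modify-write is not atomic; a real store would
    need an atomic increment or a transaction.
    """
    session = store.get(session_id)
    session["visits"] = session.get("visits", 0) + 1
    store.put(session_id, session)
    return session["visits"]


store = InMemorySessionStore()
# Any app instance can serve either call, because none of them keeps state.
first = handle_visit(store, "abc")
second = handle_visit(store, "abc")
print(first, second)
```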


Rule: Battle-tested tools


Redis

MySQL

Rule: High code churn → Easy restart


Rule: No start-up order!
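One common way to avoid depending on start-up order is for each component to retry its dependencies until they come up, rather than assuming they are already there. A minimal sketch, assuming the dependency signals "not ready" with `ConnectionError`; `connect_with_retry`, its parameters, and `flaky_connect` are hypothetical names for illustration.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.1,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds, backing off exponentially."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = base_delay
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the orchestrator restart us
            sleep(delay)
            delay = min(delay * 2, max_delay)
    raise AssertionError("unreachable")


# Simulated dependency that is not up for the first two attempts.
calls = {"n": 0}


def flaky_connect() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("dependency not up yet")
    return "connected"


delays: List[float] = []  # record delays instead of really sleeping
result = connect_with_retry(flaky_connect, sleep=delays.append)
print(result, calls["n"], delays)
```

If the retries run out, the exception propagates and the process exits, leaving the restart to the orchestration framework.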

[Figure sequence: components a, b, c, and d plotted against a time axis; some frames mark points with x]

Rule: Consider higher-order failure

The Rules

Decomposition

  • Decompose vertically
  • Separation of concerns
  • Constrain state
  • Battle-tested tools
  • High code churn, easy restart
  • No start-up order!
  • Consider higher-order failure

Orchestration and Synchronization

Managing Stateful Apps

Part 2: Orchestration and Synchronization


Rule: Use Framework Restarts


• Mesos: Marathon always restarts

• Kubernetes: RestartPolicy=Always

• Docker: Swarm always restarts
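For the Kubernetes case, the restart behavior is set with the `restartPolicy` field of the Pod spec (`Always` is also the default). A minimal sketch that builds such a manifest as a Python dictionary and serializes it to JSON; the pod name, container name, and image are placeholders, not values from the talk.

```python
import json

# Hypothetical pod: the names and the image are placeholders.
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "counter-service"},
    "spec": {
        # Ask the kubelet to restart the containers whenever they exit.
        "restartPolicy": "Always",
        "containers": [
            {"name": "counter", "image": "example.com/counter-service:1.0"}
        ],
    },
}

manifest_json = json.dumps(pod_manifest, indent=2)
print(manifest_json)
```

The application then needs no supervision logic of its own: it can exit on an unrecoverable error and rely on the framework to bring it back.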


Rule: Create your own framework

[Figure: a Mesos Master with a Framework Driver, and three Mesos Agents, each with a Framework Executor]

Rule: Use Synchronized State

Synchronized State

Tools:
- zookeeper
- etcd
- consul

Patterns:
- leader election
- shared counters
- peer awareness
- work partitioning
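The leader election pattern can be sketched against an in-process stand-in for a coordination service. `InMemoryCoordinator` and its methods are hypothetical and are not the API of ZooKeeper, etcd, or Consul. The sketch only assumes that the real service offers an atomic "create this key if it does not exist" operation, and that the key goes away when its owner fails (for example through an ephemeral node or an expiring lease), which is simulated here with an explicit delete.

```python
import threading
from typing import Dict, Optional


class InMemoryCoordinator:
    """In-process stand-in for a coordination service (hypothetical API)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._keys: Dict[str, str] = {}

    def create_if_absent(self, key: str, value: str) -> bool:
        """Atomically create key; return False if it already exists."""
        with self._lock:
            if key in self._keys:
                return False
            self._keys[key] = value
            return True

    def get(self, key: str) -> Optional[str]:
        with self._lock:
            return self._keys.get(key)

    def delete_if_owner(self, key: str, value: str) -> bool:
        """Remove key only if it is still held by the given owner."""
        with self._lock:
            if self._keys.get(key) == value:
                del self._keys[key]
                return True
            return False


def try_become_leader(coord: InMemoryCoordinator, node_id: str) -> bool:
    """A node is leader if it is the one that created the leader key."""
    return coord.create_if_absent("/leader", node_id)


coord = InMemoryCoordinator()
outcomes = {n: try_become_leader(coord, n) for n in ("node-a", "node-b", "node-c")}

# Simulate the leader going away, then another node taking over.
coord.delete_if_owner("/leader", "node-a")
reelected = try_become_leader(coord, "node-b")
print(outcomes, reelected, coord.get("/leader"))
```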

Rule: Minimize Synchronized State

Even battle-tested state management is a headache.

(Source: http://blog.cloudera.com/blog/2014/03/zookeeper-resilience-at-pinterest/)

The Rules

Decomposition

  • Decompose vertically
  • Separation of concerns
  • Constrain state
  • Battle-tested tools
  • High code churn, easy restart
  • No start-up order!
  • Consider higher-order failure

Orchestration and Synchronization

  • Use framework restarts
  • Create your own framework
  • Use synchronized state
  • Minimize synchronized state

Managing Stateful Apps

Part 3: Managing Stateful Apps


Rule (repeat!): Always use battle-tested tools!

(State is the weak point)


Rule: Choose the DB architecture

Option 1: External DB

[Figure: an execution cluster and a separate database cluster]

Pros

• Somebody else’s problem!

• Can use a DB designed for clustering directly

• Can use DB as a service

Cons

• Not really somebody else’s problem!

• Higher latency/no reference locality

• Can’t leverage orchestration, etc.

Option 2: Run on Raw HW

[Figure: three nodes, each running HDFS, Mesos, Marathon, and App]

Pros

• Use existing recipes

• Have local data

• Manage a single cluster

Cons

• Orchestration doesn’t help with failure

• Increased management complexity

Option 3: In-memory DB

[Figure: three nodes, each running Mesos, Marathon, App, and MemSQL]

Pros

• No need for volume tracking

• Fast

• Have local data

• Manage a single cluster

Cons

• Bets all machines won’t go down

• Bets on orchestration framework

Option 4: Use Orchestration

[Figure: three nodes, each running Mesos, Marathon, App, and Cassandra]

Pros

• Orchestration manages volumes

• One model for all programs

• Have local data

• Single cluster

Cons

• Currently the least mature

• Not well supported by vendors

Option 5: Roll Your Own

[Figure: three nodes, each running Mesos, Marathon, App, and ImageMgr, plus a Mesos Master with a Framework]

Pros

• Very precise control

• You decide whether to use containers

• Have local data

• Can be system aware

Cons

• You’re on your own!

• Wedded to a single orchestration platform

• Not battle tested


Rule: Have replication

The Rules

Decomposition

  • Decompose vertically
  • Separation of concerns
  • Constrain state
  • Battle-tested tools
  • High code churn, easy restart
  • No start-up order!
  • Consider higher-order failure

Orchestration and Synchronization

  • Use framework restarts
  • Create your own framework
  • Use synchronized state
  • Minimize synchronized state

Managing Stateful Apps

  • Battle-tested tools
  • Choose the DB architecture
  • Have replication

Fin

References

  • Rich Hickey, “Are We There Yet?” (https://www.infoq.com/presentations/Are-We-There-Yet-Rich-Hickey)
  • Rich Hickey, “Simple Made Easy” (https://www.infoq.com/presentations/Simple-Made-Easy-QCon-London-2012)
  • David Greenberg, Building Applications on Mesos, O’Reilly, 2016
  • Joe Johnston, et al., Docker in Production: Lessons from the Trenches, Bleeding Edge Press, 2015
