I volunteer as tribute: the future of oncall (Uptime)

@bridgetkromhout

I volunteer as tribute: the future of oncall

lives: Minneapolis, Minnesota

works: Pivotal

podcasts: Arrested DevOps

organizes: devopsdays

Bridget Kromhout

traded oncall… for more travel (similar effect on sleep)

things fall apart

“In a world that celebrates pioneers— be the settlers instead.”

— Laura Bell (@lady_nerd)

previously, on #opslife…

Image credit: James Ernest

Image credit: 00abstrahiert99 on Flickr

…but #opslife means I’m a cynical realist

Attack Kitten is skeptical about NoOps

Attack Kitten Cat Reality Check

empathy

serverless (in the brave new cloudy-with-a-chance-of-containers world)

two-pizza silo

Image credit: Wikipedia

“Any organization that designs a system… will produce a design whose structure is a copy of the organization's communication structure.” Mel Conway

Image credit: Vasa Museet

probably fine

in a perfect world

for ops, don’t tell devs: gl;hf!

do: automate, document, share

for devs, build for operability:

observability, debuggability, reality

The Wall of Confusion

yolo nope

Image credit: wikimedia

“The past is never dead. It's not even past.” William Faulkner

limited custom dev; network incidents

oncall handled by ops only

Image credit: Wallpaper Up

limited custom dev; colo incidents

oncall handled by ops only

low trust; difficult to grant partial access

oncall handled by ops only

everyone’s on call!!1!

high trust; variable ability

Image credit: Robot Unicorn Attack 2

ops on call; devs available

building trust; variable visibility

shared oncall; branching decision tree

follow-the-sun if possible

oncall investments: architecture, observability, culture

keep on shipping (implementation details vary)

tree failure?!?

architecture: plan for continuous partial failure
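
One way to read "plan for continuous partial failure" in code: assume any dependency call can fail at any time, retry briefly, then degrade instead of paging someone. The sketch below is illustrative only; `call_with_fallback`, its parameters, and the stale-cache fallback are hypothetical names, not something from the talk.

```python
import random
import time


def call_with_fallback(primary, fallback, attempts=3, base_delay=0.01, sleep=time.sleep):
    """Call primary(); on failure retry with jittered exponential backoff.

    If every attempt fails, return fallback() so the caller gets a
    degraded answer instead of an error. All names here are hypothetical.
    """
    for attempt in range(attempts):
        try:
            return primary()
        except Exception:
            # Back off between attempts, but not after the last one.
            if attempt < attempts - 1:
                delay = base_delay * (2 ** attempt) * (1 + random.random())
                sleep(delay)
    return fallback()
```

A caller might pass a live lookup as `primary` and a cached or default value as `fallback`, so a dependency outage shows up as staler data rather than a 3 a.m. page.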

CAP: Consistency, Availability, Partition Tolerance (pairings: CA, CP, AP)

“a partition is a time bound on communication.” Eric Brewer

observability: answering questions we didn’t know to ask
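
A common way to leave room for questions nobody knew to ask is to record one wide, structured event per unit of work, with whatever context is on hand, rather than a few pre-chosen counters. A minimal sketch, assuming JSON events; `build_event` and every field name below are hypothetical:

```python
import json
import time


def build_event(name, duration_ms, **context):
    """Return one JSON-encoded event carrying arbitrary context fields.

    Hypothetical helper: the extra fields (region, customer tier, build id,
    and so on) are what let you slice the data later by dimensions you did
    not anticipate when you wrote the code.
    """
    event = dict(context)
    # Core fields are written last so context cannot overwrite them.
    event.update({
        "event": name,
        "timestamp": time.time(),
        "duration_ms": duration_ms,
    })
    return json.dumps(event, sort_keys=True)
```

For example, `build_event("checkout", 12.5, region="eu", customer_tier="free")` produces a single line that can be shipped to whatever log or event store is already in use.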

observability: understand your environment

monitoring: the old way

monitoring: the new way

monitoring needs of…

The business: UX data for product & engineering; Measure value delivered

Information Technology: Visibility into state and failures; Product & engineering decisions; Measure success of projects

The Art of Monitoring (2016), James Turnbull, artofmonitoring.com

culture of collaboration

a tranquil beach… or is it?

learning culture: be adaptable

Computers are easy; people are hard

Massively scalable fault-tolerant distributed systems require a significant engineering effort to build and operate; complex socio-technical systems are even more challenging.

Who owns your availability? The answer may surprise you!

Image credit: Wikipedia

not actually 20 units of devops

silos are for grain

still computers

oncall blood and tears don’t scale

gif credit: @paddyforan

don’t volunteer as tribute

invest in architecture, observability, culture
