Building Large Scale Services - USENIX

Presented by Jennifer Davis, November 8, 2013

Twitter: @sigje Email: sigje@yahoo.com

SysAdmin Controls all the things

11/11/13

Shared Dependencies

The Reality…

The Dream…

How?

Define Core Principles

• Common
  › Collaboration across teams, companies, industry; define standards
  › Incident, Problem, Change, Config, Release management

• Distinct
  › Specifics to an application or service
  › Availability, Service, Business Continuity, Capacity

Kill the Myths

• Stupid User

Kill the Myths

• Stupid User
• System Admin == Operator

[Word cloud: Failing Gracefully, puppet, ruby, SKILLS, perl, nosql, operability, security, mysql, unix, TCP/IP, bash, CHEF]

Kill the Myths

• Stupid User
• System Admin == Operator
• Words have a common universal implicit meaning

Learn to Modulate your Message

[Figure labels: Team, Manager, Customer]

Team

• People working towards common goal.
• Different roles.
• Different views.
• Same objectives.

Image Credit: Kyle Latino

Team

Suggestion: Don’t talk about the “devs” request, talk about Elaine’s request.
Suggestion: Verify that your team has the same vision.

Understand the vision.

• Are there other options, open source or not, within the company?
• Are there other options outside the company?
• Is EVERYONE on the same page about what the service is?

Vision Statement

• Clear statement about the problem that the service is solving.
  › Direction
  › Identity management
  › Team cohesion

New product? Be part of creating that vision!

Sherpa’s Vision

.. Distributed replicated eventually consistent key value store that had a focus on scalability ..

My Job

• Examine software
• Define risk
• Communicate cost of risks
• Mitigate risks
• Identify events
• Manage events

Fragile Platforms are Bad.

Change is inevitable

• Products pivot based on needs.
• Requirements change and evolve.
• Know core issues.

Know Core Issues

• Limit the scope of focus.
• Focus on the biggest priorities.
  › Understand Development Methodology: Waterfall, Scrum, ?
  › Identify the key “time” elements.
  › Talk to them. Identify their key terms. “Enhancements”, “Defects”
  › Establish the “Top” list.

Create checklists

• Not because people are dumb.
• Not only because of automation.
• When things break, knowing what needs focus.
• During normal maintenance, can identify “not OK”.
  › Audit checklists for deployment through staging environment.
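One way to make a checklist auditable is to pair each item with a small check that either passes or does not. The sketch below is only an illustration: the `run_checklist` helper, the `staging` values, and the two checklist items are hypothetical and do not come from the talk.

```python
from typing import Callable, List, Tuple

# Hypothetical checklist item: a description plus a check returning True when OK.
Check = Tuple[str, Callable[[], bool]]


def run_checklist(items: List[Check]) -> List[str]:
    """Run every check and return the descriptions of the items that are not OK."""
    not_ok = []
    for description, check in items:
        try:
            passed = check()
        except Exception:
            # A check that cannot run at all is treated as not OK.
            passed = False
        if not passed:
            not_ok.append(description)
    return not_ok


# Made-up facts about a staging environment, for the example only.
staging = {"config_version": "1.4", "deployed_version": "1.4", "free_disk_gb": 3}

checklist = [
    ("config matches deployed version",
     lambda: staging["config_version"] == staging["deployed_version"]),
    ("at least 10 GB free disk",
     lambda: staging["free_disk_gb"] >= 10),
]

failures = run_checklist(checklist)
print(failures)
```

The returned list names the items that need focus, which is the "not OK" signal the slide asks for.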

Know Outputs

• Identify components.
• Well defined protocols between components.
• Expected Inputs.
• Expected Outputs.
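Expected inputs and outputs can be written down as a small contract and checked mechanically. This is a minimal sketch under assumptions: the field names, the `EXPECTED_INPUT` and `EXPECTED_OUTPUT` tables, and the `contract_violations` helper are hypothetical, chosen to resemble a key value store request and reply.

```python
from typing import Any, Dict, List

# Hypothetical contract for one component: field name -> expected type.
EXPECTED_INPUT = {"key": str, "value": bytes}
EXPECTED_OUTPUT = {"status": str, "version": int}


def contract_violations(message: Dict[str, Any], expected: Dict[str, type]) -> List[str]:
    """Return the ways `message` departs from the expected fields and types."""
    problems = []
    for field, field_type in expected.items():
        if field not in message:
            problems.append(f"missing field: {field}")
        elif not isinstance(message[field], field_type):
            problems.append(f"wrong type for {field}: {type(message[field]).__name__}")
    for field in message:
        if field not in expected:
            problems.append(f"unexpected field: {field}")
    return problems


request = {"key": "user:1", "value": b"payload"}
good_reply = {"status": "ok", "version": 7}
bad_reply = {"status": "ok", "version": "7", "debug": True}

print(contract_violations(request, EXPECTED_INPUT))
print(contract_violations(good_reply, EXPECTED_OUTPUT))
print(contract_violations(bad_reply, EXPECTED_OUTPUT))
```

An empty list means the message matches the contract; anything else is a concrete description of what a neighbouring component would not expect.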

Know State Transitions Explicitly.

• When component is installed but not ready
• When the colo is going away
• Go through What If Scenarios.
  › Document them.
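Writing the allowed transitions down as a table is one way to make them explicit. The sketch below assumes a hypothetical four-state lifecycle (`installed`, `ready`, `draining`, `removed`); the state names, the `Component` class, and the node name are illustrative, not taken from the talk.

```python
# Hypothetical component lifecycle: each state maps to the states it may move to.
ALLOWED = {
    "installed": {"ready", "removed"},
    "ready": {"draining"},
    "draining": {"ready", "removed"},
    "removed": set(),
}


class Component:
    def __init__(self, name: str) -> None:
        self.name = name
        self.state = "installed"  # installed but not ready to serve

    def transition(self, new_state: str) -> None:
        """Move to `new_state`, refusing any transition not listed in ALLOWED."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.name}: {self.state} -> {new_state} is not allowed")
        self.state = new_state

    def can_serve(self) -> bool:
        return self.state == "ready"


node = Component("store-1")
print(node.can_serve())       # installed only, so not serving yet
node.transition("ready")
node.transition("draining")   # for example, the colo is going away
node.transition("removed")
print(node.state)
```

Each "what if" scenario then becomes a sequence of transitions that either fits the table or shows where the table needs another entry.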

Know choke points explicitly.

• Memory
• Disk
• Bandwidth

Now and in 6 months. JIT?
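A rough way to look at "now and in 6 months" is to project current usage forward. The sketch below assumes linear growth, which is a simplification, and every number in `resources` is invented for illustration.

```python
from typing import Optional


def projected(used: float, growth_per_month: float, months: int = 6) -> float:
    """Usage after `months`, assuming linear growth (an assumption, not a forecast)."""
    return used + growth_per_month * months


def months_until_full(used: float, capacity: float, growth_per_month: float) -> Optional[float]:
    """Months until usage reaches capacity; None when usage is not growing."""
    if growth_per_month <= 0:
        return None
    return max(0.0, (capacity - used) / growth_per_month)


# Illustrative numbers only.
resources = {
    "memory_gb": {"used": 48, "capacity": 64, "growth": 1},
    "disk_tb": {"used": 6, "capacity": 10, "growth": 1},
    "bandwidth_gbps": {"used": 2, "capacity": 10, "growth": 0.5},
}

report = {}
for name, r in resources.items():
    in_six = projected(r["used"], r["growth"])
    # Flag a resource when projected usage meets or exceeds its capacity.
    report[name] = "choke point" if in_six >= r["capacity"] else "ok"
    print(name, in_six, report[name])
```

With these made-up figures, disk is the resource that runs out within the six month window, so it is the one to plan for first.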

Failure will happen.

• There are no 0 failure systems.
• “Give me the brain” documentation so that anyone can be the brain.
• Repeatable/Reliable failure handling.
• Run fire drills. Really.

System Administration is Gardening.

• No guarantee of resources.
• Only guarantee is change.

System Administration is Gardening.

• Nurture relationships.
  › Be authentic.
  › Be trusting and trustworthy.
  › Have integrity.

Success At Scale is Collaboration & Cooperation across Teams.

Decreasing Value

[Charts: # of Support Engineers by month (Jan, Apr, Jul, Oct)]

Documentation is not the cure.

• Documentation doesn’t guarantee understanding.
  › Operations Sandbox Environment
• Don’t spend time at the end documenting.

Summary

Be Expendable. Feed your brain.

Acknowledgements

• http://www.flickr.com/photos/levork
• http://www.flickr.com/photos/puggles
• http://www.flickr.com/photos/byteorder
• http://www.flickr.com/photos/egoant
• http://www.flickr.com/photos/happymonkey
• Kyle Latino
• Greg Connor

Thanks!

sigje@yahoo.com
http://www.slideshare.net/sigje/presentation-lisa
