Here's a 10m lightning talk I gave at the #devops meetup at the @cloudcamp convened during #SxSW 2011.
@RealGeneKim, [email protected]
CloudCamp/#devops Lightning Talks:
Lessons Learned Creating Dev/Ops Super-Tribes
Gene Kim, CISA, TOCICO Jonah
March 14, 2011
My Background
Visible Ops: Playbook of High Performers
• The IT Process Institute has been studying high-performing organizations since 1999
– What is common to all the high performers?
– What is different between them and average and low performers?
– How did they become great?
• Answers have been codified in the Visible Ops Methodology
• Over 180K copies sold since 2006
www.ITPI.org
A Great Moment For Me
Infosec
IT Operations
Developers
Now, More Than Ever, We Need Great IT Operations
• In addition to delivering the online services we promised, when the business needs to take corrective actions to:
– Reduce costs
– Increase efficiencies
– Gain competitive advantage
We are here…
Where we need to be…
IT is always in the way (again…)
@RealGeneKim, [email protected]
Where Did The High Performers Come From?
Higher Performing IT Organizations Are More Stable, Nimble, Compliant And Secure
• High performers maintain a posture of compliance
– Fewest number of repeat audit findings
– One-third the amount of audit preparation effort
• High performers find and fix security breaches faster
– 5 times more likely to detect breaches by automated control
– 5 times less likely to have breaches result in a loss event
• When high performers implement changes…
– 14 times more changes
– One-half the change failure rate
– One-quarter the first fix failure rate
– 10x faster MTTR for Sev 1 outages
• When high performers manage IT resources…
– One-third the amount of unplanned work
– 8 times more projects and IT services
– 6 times more applications
Source: IT Process Institute, 2008
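Two of the measures on this slide, change failure rate and MTTR, can be tracked from ordinary change and outage records. The following is a minimal sketch under stated assumptions: the `Change` and `Outage` record shapes and the `failed` flag are hypothetical, and the definitions are the common ones, not necessarily those used in the ITPI study.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Change:
    id: str
    failed: bool  # hypothetical flag: the change caused an incident or was backed out


@dataclass
class Outage:
    started: datetime
    restored: datetime


def change_failure_rate(changes):
    """Fraction of changes that failed (0.0 when there are no changes)."""
    if not changes:
        return 0.0
    return sum(1 for c in changes if c.failed) / len(changes)


def mean_time_to_restore(outages):
    """Average time from outage start to service restoration."""
    if not outages:
        return timedelta(0)
    total = sum((o.restored - o.started for o in outages), timedelta(0))
    return total / len(outages)
```

Comparing these numbers across teams or over time is one way to see whether a group is moving toward the high-performer profile.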
The Vicious Downward Spiral
Operations Sees…
• Fragile applications are prone to failure
• Long time required to figure out “which bit got flipped”
• Detective control is a salesperson
• Too much time required to restore service
• Too much firefighting and unplanned work
• Planned project work cannot complete
• Frustrated customers leave
• Market share goes down
• Business misses Wall Street commitments
• Business makes even larger promises to Wall Street
Dev Sees…
• More urgent, date-driven projects put into the queue
• Even more fragile code put into production
• More releases have increasingly “turbulent installs”
• Release cycles lengthen to amortize “cost of deployments”
• Failing bigger deployments more difficult to diagnose
• Most senior and constrained IT ops resources have less time to fix underlying process problems
• Ever increasing backlog of infrastructure projects that could fix root cause and reduce costs
• Ever increasing amount of tension between IT Ops and Development
These aren’t IT Operations problems…
These are business problems!
Operations Inside The Dev/Ops Super-Tribe
• Increase flow from Dev to Production
– Increase throughput
– Decrease WIP
• Our goal is to create a system of operations that allows
– Planned work to quickly move to production
– Service to be quickly restored when things go wrong
• How does this relate to Visible Ops?
– We focused much on “unplanned work”
– What’s happening to all the planned work?
– At any given time, what should IT Ops be working on?
– Now we are focusing on the flow of planned work
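One common way to relate throughput and WIP to flow is Little's Law: average cycle time equals average WIP divided by average throughput. The slide does not name this law, so treat the sketch below as an illustration under that assumption; the function name and the sample numbers are hypothetical.

```python
def average_cycle_time(wip, throughput):
    """Little's Law: average cycle time = average WIP / average throughput.

    wip is the number of work items in progress; throughput is items
    completed per unit of time. The result is in that same unit of time.
    """
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return wip / throughput


# 20 changes in flight, 5 reaching production per week: 4 weeks on average.
baseline = average_cycle_time(wip=20, throughput=5)
# Halving WIP or doubling throughput each halve the average cycle time.
less_wip = average_cycle_time(wip=10, throughput=5)
more_throughput = average_cycle_time(wip=20, throughput=10)
```

Under this relation, both levers on the slide shorten the time planned work takes to reach production.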
Zone #1: Decrease Cycle Time Of Releases
• Create determinism in the release process
• Move packaging responsibility to development
• Release early and often
• Decrease release cycle time
– Reduce deployment times from 6 hours to 45 minutes
– Refactor deployment process that had 1300+ steps spanning 4 weeks
• Never again “fix forward,” instead “roll back,” escalating any deviation from plan to Dev
• Verify all handoffs (e.g., correctness, accuracy, timeliness)
• Ensure environments are properly built before deployment begins
• Control code and environments down the preproduction runways
• Hold Dev, QA, Int, and Staging owners accountable for integrity
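Two of the practices above, verifying the environment before deployment begins and rolling back instead of fixing forward, can be sketched as a small deployment driver. This is an illustration only: `deploy`, its parameters, and the callback shapes are hypothetical and do not correspond to any tool mentioned in the talk.

```python
def deploy(steps, preflight_checks, rollback, escalate):
    """Run deployment steps in order; on any deviation, roll back and escalate.

    steps and preflight_checks are lists of (name, callable) pairs.
    rollback receives the names of completed steps, most recent first.
    escalate receives a message intended for Dev.
    Returns True only if every check and every step succeeded.
    """
    # Ensure the environment is properly built before deployment begins.
    failed = [name for name, check in preflight_checks if not check()]
    if failed:
        escalate(f"environment not ready, failed checks: {failed}")
        return False

    completed = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            # Never fix forward: undo what was done, then hand the deviation to Dev.
            rollback(list(reversed(completed)))
            escalate(f"step {name!r} deviated from plan: {exc}")
            return False
        completed.append(name)
    return True
```

Keeping the driver free of ad hoc repair logic is the point of the sketch: any deviation from the plan ends in a known state and a message to the people who own the code.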
Zone #2: Increase Production Rigor
• Define what work is and where work can come from
• Protect the integrity of the work queue (e.g., are checks being written that won’t clear?)
• To preserve and increase throughput, elevate preventive projects and maintenance tasks
• Document all work, changes and outcomes so that it is repeatable
• Ops builds Agile standardized deployment stories, to be completed after Dev sprints are complete
• Maintain adequate situational awareness so that incidents can be quickly detected and corrected
• Standardize unplanned work and escalations
• Always seek to eradicate unplanned work and increase throughput
Lean Principle: “Better -> Faster -> Cheaper”
“When IT Fails: The Novel”
• Steve Masters, CEO
• Bill Palmer, VP IT Operations
• Chris Anderson, VP Development
• Parts Unlimited: $4B revenue/year
Resources
• From the IT Process Institute www.itpi.org
– Both Visible Ops Handbooks
– ITPI IT Controls Performance Study
• “Lean IT” by Orzen and Bell
– Winner of the Shingo Prize 2011
• “Web Operations: Keeping The Data On Time” by Allspaw, Robbins
• “Inspired: How To Create Products That Customers Love” by Cagan
• “Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation” by Humble, Farley
• Follow Gene Kim
– On Twitter: @RealGeneKim
– mailto:[email protected]
– Blog: http://realgenekim.me/blog
• Follow Mike Orzen
– On Twitter: @MikeOrzenLeanIT
– mailto:[email protected]
– http://www.steadyimprovement.com
About Gene Kim
• I’ve spent the last 10 years studying high performing IT organizations, trying to understand:
– What do they have in common?
– What is present in successful transformations, absent in unsuccessful transformations?
– How do we lower the activation energy required to create the transformations?
• Founder and former CTO of Tripwire, Inc., a $100M automated security/compliance software company
• Co-author of Visible Ops Handbook, Security Visible Ops Handbook (over 180K copies sold)
• Active researcher
– Co-founder of IT Process Institute
– Committee member of Institute of Internal Auditors
– Leader of PCI Security Standards Council Scoping SIG